Title: Hybrid-Type Extensions for Actor-Oriented Modeling (a.k.a. Semantic Data-types for Kepler)
1Hybrid-Type Extensions for Actor-Oriented Modeling (a.k.a. Semantic Data-types for Kepler)
- Shawn Bowers, Bertram Ludäscher
- University of California, Davis
- Genome Center, CS Dept.
- May 2005
2Outline
- Hybrid Types
- Hybrid Types and Scientific Workflow Design
- Super Rapid Prototyping: The Sparrow Family of Languages
- Next Steps: Adding a Hybrid-Type System to Kepler
3Hybrid Types
4Hybrid Types Superimposing Semantics
- Separation of Concerns
- Conventional Data Modeling (Structural Data Types)
- E.g., XML Schema / DTD, etc.
- Conceptual Data Modeling (Semantic Types)
- Drawn from ontologies (expressed in Description Logic)
- Capturing domain knowledge (e.g., biodiversity, ecology)
- Explicit (external) linkages (Semantic Annotations)
- Simple links (one concept per item)
- Links expressed as constraints (logical mappings)
5Hybrid Types Superimposing Semantics
Datatypes
- T1: a relational table of measurements; a table of meas records with fields site (string), plot (string), spp (string), bm (double)
- T2: a list of strings; a list of spp (string)
6Hybrid Types Superimposing Semantics
Semtypes
- SpeciesBiomass ⊑ Measurement ⊓ ∃item.Species ⊓ ∃prop.Biomass ⊓ ∃loc.Location
- Species biomass is a measure of the amount of biomass of a particular species within a location.
- SpeciesCommBiomass ⊑ SpeciesBiomass ⊓ ∃loc.Community
- Species community biomass is a species biomass within a community.
T1 annotation:
table/X:meas -> X:SpeciesCommBiomass
Each table/meas instance is a measurement.
(Figure: T1 datatype tree, table of meas with fields site, plot, spp (string) and bm (double), with the concept SpeciesCommBiomass attached to meas.)
7Hybrid Types Superimposing Semantics
T1 annotation:
table/X:meas, X/Y:site, X/Z:plot -> X:SpeciesCommBiomass, C=f(Y,Z), (X,C):loc, C:Community
(Figure: T1 datatype tree with the concepts SpeciesCommBiomass and Community attached.)
8Hybrid Types Superimposing Semantics
T1 annotation:
table/X:meas, X/Y:site, X/Z:plot, X/U:spp -> X:SpeciesCommBiomass, C=f(Y,Z), (X,C):loc, C:Community, (X,U):item, U:Species
(Figure: T1 datatype tree with the concepts SpeciesCommBiomass, Community, and Species attached.)
9Hybrid Types Superimposing Semantics
T1 annotation:
table/X:meas, X/Y:site, X/Z:plot, X/U:spp, X/B:bm -> X:SpeciesCommBiomass, C=f(Y,Z), (X,C):loc, C:Community, (X,U):item, U:Species, (X,B):prop, B:Biomass.
(Figure: T1 datatype tree with the concepts SpeciesCommBiomass, Community, Species, and Biomass attached.)
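As a rough illustration of what the full T1 annotation produces, the following Python sketch applies it to each meas row and emits the resulting concept and role assertions. The tuple encoding of assertions, the name `annotate_meas`, the use of a tuple as the Skolem term f(Y,Z), and the sample rows are all assumptions made for illustration; none of them come from the slides.

```python
from typing import NamedTuple

class Meas(NamedTuple):
    """One meas row of T1 (field names follow the slide)."""
    site: str
    plot: str
    spp: str
    bm: float

def annotate_meas(rows):
    """Return the assertions the T1 annotation yields for the given rows.

    Concept assertions are (individual, concept) pairs and role assertions are
    (subject, role, object) triples; this encoding is hypothetical.
    """
    facts = set()
    for i, row in enumerate(rows):
        x = f"meas{i}"                         # X: the meas instance
        c = ("f", row.site, row.plot)          # C = f(Y, Z): Skolem term for the location
        facts.add((x, "SpeciesCommBiomass"))   # X:SpeciesCommBiomass
        facts.add((x, "loc", c))               # (X,C):loc
        facts.add((c, "Community"))            # C:Community
        facts.add((x, "item", row.spp))        # (X,U):item
        facts.add((row.spp, "Species"))        # U:Species
        facts.add((x, "prop", row.bm))         # (X,B):prop
        facts.add((row.bm, "Biomass"))         # B:Biomass
    return facts

# Made-up sample rows: two species measured at the same site and plot.
rows = [Meas("s1", "p1", "spA", 12.5), Meas("s1", "p1", "spB", 3.0)]
facts = annotate_meas(rows)
```

Because both sample rows share a site and plot, they map to the same Skolem term, so the two measurements end up located in the same community individual.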
10Hybrid Types Superimposing Semantics
- Searching
- Concept-based, e.g., find all datasets containing biomass measurements
- Merging/Integrating
- Combining heterogeneous sources based on annotations
- Concatenate, Union (merge), Join, etc.
- Transforming
- Construct mappings from schema S1 to S2 based on annotations
- Semantic Propagation
- Pushing semantic annotations through transformations/queries
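The concept-based search use case can be sketched in a few lines of Python. This is a minimal sketch, not the Kepler implementation: the is-a table `PARENT`, the `CATALOG` of annotated datasets (including the assumption that T2 is annotated with Species), and the function names are hypothetical.

```python
# Toy is-a hierarchy (child -> parent); hypothetical, loosely following the semtypes.
PARENT = {
    "SpeciesCommBiomass": "SpeciesBiomass",
    "SpeciesBiomass": "Measurement",
}

def is_a(concept, ancestor):
    """True if concept equals ancestor or is a descendant of it in PARENT."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False

# Hypothetical catalog: dataset name -> concepts mentioned by its annotations.
CATALOG = {
    "T1": {"SpeciesCommBiomass", "Community", "Species", "Biomass"},
    "T2": {"Species"},
}

def search(catalog, query):
    """Return names of datasets annotated with the query concept or a subconcept."""
    return sorted(
        name
        for name, concepts in catalog.items()
        if any(is_a(c, query) for c in concepts)
    )

biomass_hits = search(CATALOG, "Biomass")
measurement_hits = search(CATALOG, "Measurement")
species_hits = search(CATALOG, "Species")
```

Searching for Measurement finds T1 even though T1 is annotated only with the more specific SpeciesCommBiomass, which is the point of searching by concept rather than by column name.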
11Semantic Annotation Propagation
- Capture I/O constraints
- Similar to unit type constraints
- Can enable automated metadata creation (annotation propagation)
- Can help refine ontologies and existing annotations
12Semantic Annotation Propagation
Port 1 Annotation (on T1):
table/X:meas, X/Y:site, X/Z:plot, X/U:spp, X/B:bm -> X:SpeciesCommBiomass, C=f(Y,Z), (X,C):loc, C:Community, (X,U):item, U:Species, (X,B):prop, B:Biomass.
(Figure: T1 is a table of meas records with fields site (string), plot (string), spp (string), bm (double).)
Actor I/O constraint (approx.):
p3.obs(site, plot, spp, bm) :- p1.meas(site, plot, spp, bm), p2.spp(spp).
(Figure: actor with ports p1:T1, p2:T2, and p3:T3, labeled "seasonal species".)
Port 2 Chased Annotation (on T3):
table/X:obs, X/Y:site, X/Z:plot, X/U:spp, X/B:bm -> X:SpeciesCommBiomass, C=f(Y,Z), (X,C):loc, C:Community, (X,U):item, U:Species, (X,B):prop, B:Biomass.
(Figure: T3 is a table of obs records with fields site (string), plot (string), spp (string), bm (double).)
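The propagation step on this slide can be approximated in Python. The sketch below is a deliberately crude stand-in for the chase, under stated assumptions: annotations are flattened to a concept per relation plus a label per field position, and an annotation is carried to the rule head only when a single annotated body atom binds every head variable. The encoding and the name `propagate` are hypothetical.

```python
# Rule head and body atoms as (relation, variables); mirrors the I/O constraint
# p3.obs(site, plot, spp, bm) :- p1.meas(site, plot, spp, bm), p2.spp(spp).
HEAD = ("obs", ["site", "plot", "spp", "bm"])
BODY = [("meas", ["site", "plot", "spp", "bm"]), ("spp", ["spp"])]

# Port 1 annotation, flattened (an assumption): relation -> concept and
# field position -> what that field denotes.
ANNOTATION = {
    "meas": {
        "concept": "SpeciesCommBiomass",
        "fields": {0: "loc", 1: "loc", 2: "item:Species", 3: "prop:Biomass"},
    }
}

def propagate(head, body, annotation):
    """Carry an annotation from an annotated body atom to the rule head.

    The annotation is copied only if one annotated atom binds every head
    variable, so each head tuple corresponds to a tuple of that atom.
    """
    head_rel, head_vars = head
    for rel, rel_vars in body:
        ann = annotation.get(rel)
        if ann is None:
            continue
        # Map each variable of the annotated atom to its field label.
        by_var = {v: ann["fields"][i] for i, v in enumerate(rel_vars) if i in ann["fields"]}
        if all(v in by_var for v in head_vars):
            return {
                head_rel: {
                    "concept": ann["concept"],
                    "fields": {i: by_var[v] for i, v in enumerate(head_vars)},
                }
            }
    return {}

chased = propagate(HEAD, BODY, ANNOTATION)
```

For the constraint on the slide, every obs tuple is also a meas tuple, so the result is the meas annotation with the relation renamed to obs, matching the chased annotation shown for T3.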
13Hybrid Types and Scientific Workflow Design
14Workflow Design Primitives
- End-to-End Workflow Design and Implementation
- Viewed as a series of primitive transformations
- Each takes a WF and produces a new WF
- Can be combined to form design strategies
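(Figure: a chain of transformations t leading from W0 through W1, W2, ..., Wm to Wn, spanning Workflow Design to Workflow Implementation. Design strategies shown: Top-Down, Bottom-Up, Task Driven, Data Driven, Structure Driven, Semantic Driven, Input Driven, Output Driven.)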
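The idea that each primitive takes a workflow and produces a new one, and that primitives combine into design strategies, can be sketched as plain function composition. The `Workflow` class and the helpers `introduce_actor`, `connect`, and `strategy` are hypothetical names chosen for this sketch; they are not Kepler or Ptolemy APIs.

```python
from dataclasses import dataclass, replace
from functools import reduce

@dataclass(frozen=True)
class Workflow:
    actors: frozenset = frozenset()
    connections: frozenset = frozenset()   # pairs (source actor, target actor)

def introduce_actor(name):
    """Primitive: add an actor (cf. entity introduction)."""
    return lambda wf: replace(wf, actors=wf.actors | {name})

def connect(src, dst):
    """Primitive: add a data connection between two existing actors."""
    def t(wf):
        if not {src, dst} <= wf.actors:
            raise ValueError("both actors must exist before connecting them")
        return replace(wf, connections=wf.connections | {(src, dst)})
    return t

def strategy(*primitives):
    """Compose primitives left to right into one WF -> WF transformation."""
    return lambda wf: reduce(lambda w, t: t(w), primitives, wf)

w0 = Workflow()
wn = strategy(introduce_actor("a1"), introduce_actor("a2"), connect("a1", "a2"))(w0)
```

Keeping workflows immutable means every intermediate W1, W2, ... remains available, which fits the view of design as a sequence of workflow versions.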
15Workflow Design Primitives
- Semantic types and Actor-Oriented Modeling
- Actors and workflows can have semantic types conceptually describing their function
- Ports can have semantic types conceptually describing what they consume and produce
- I/O Constraints: a general form of constraint between input and output (e.g., like unit constraints) approximating the function of an actor
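A port that carries both a structural and a semantic type can be modeled directly, and a connection can then be checked on both levels. The sketch below assumes exact matching for datatypes and a toy is-a table for semtypes; the `Port` class, `check_connection`, and the port names are hypothetical.

```python
from dataclasses import dataclass

# Toy is-a hierarchy (child -> parent); hypothetical.
PARENT = {"SpeciesCommBiomass": "SpeciesBiomass", "SpeciesBiomass": "Measurement"}

def is_a(concept, ancestor):
    """True if concept equals ancestor or is a descendant of it in PARENT."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False

@dataclass(frozen=True)
class Port:
    name: str
    datatype: str   # structural type, e.g. "table" or "list"
    semtype: str    # semantic type: a concept name

def check_connection(out_port, in_port):
    """Return (structurally_ok, semantically_ok) for a connection out_port -> in_port."""
    structural = out_port.datatype == in_port.datatype   # exact match; a simplification
    semantic = is_a(out_port.semtype, in_port.semtype)   # produced concept must be subsumed
    return structural, semantic

producer = Port("p3", "table", "SpeciesCommBiomass")
ok_consumer = Port("q1", "table", "Measurement")
list_consumer = Port("q2", "list", "Measurement")

both_ok = check_connection(producer, ok_consumer)
structural_gap = check_connection(producer, list_consumer)
```

Reporting the two results separately distinguishes a connection that is semantically sound but structurally mismatched from one that fails on meaning.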
16Basic Design Primitives Inherited from Ptolemy
Basic Transformations (each maps a starting workflow to a resulting workflow)
- t1 Entity Introduction (actor or data connection)
- t2 Port Introduction
- t3 Datatype Refinement (port datatypes s, t refined to s′, t′)
- t4 Hierarchical Abstraction
- t5 Hierarchical Refinement
- t6 Data Connection
- t7 Director Introduction
17Additional (Planned) Design Primitives for
Semantic Types
Extended Transformations (each maps a starting workflow to a resulting workflow)
- t9 Actor Semantic Type Refinement (T refined to T′)
- t10 Port Semantic Type Refinement (C, D refined to C′, D′)
- t11 Annotation Constraint Refinement
- t12 I/O Constraint Strengthening
- t13 Data Connection Refinement
- t14 Adapter Insertion
- t15 Actor Replacement (f replaced by f′)
- t16 Workflow Combination (Map)
18Adapters for Semantic and Structural
Incompatibility
- Adapters may
- be abstract (no impl.)
- be concrete
- bridge a semantic gap
- fix a structural mismatch
- be generated automatically (e.g., Taverna's list mismatch)
- be reused components (based on signatures)
(Figure: adapter insertion examples between ports with semantic types C, D (and C′, D′, C1, C2, ...) and structural types S, S′, T, using adapters f1 and f2 and map.)
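A structural adapter of the list-mismatch kind can be generated mechanically. The sketch below is offered in the spirit of that case and does not reproduce any particular system's behavior: `map_adapter` lifts an element-wise actor to lists, and `wrap_adapter` lets a list-consuming actor accept a single item. Both names are hypothetical.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")
T = TypeVar("T")

def map_adapter(f: Callable[[S], T]) -> Callable[[List[S]], List[T]]:
    """Lift an element-wise actor f: S -> T to an actor over lists, [S] -> [T]."""
    def lifted(items: List[S]) -> List[T]:
        return [f(item) for item in items]
    return lifted

def wrap_adapter(f: Callable[[List[S]], T]) -> Callable[[S], T]:
    """Adapt a list-consuming actor so that it accepts a single item."""
    return lambda item: f([item])

upper_all = map_adapter(str.upper)   # str -> str becomes [str] -> [str]
count_one = wrap_adapter(len)        # [x] -> int becomes x -> int

lifted_result = upper_all(["a", "b"])
wrapped_result = count_one("x")
```

Both adapters depend only on the signatures involved, which is what makes automatic generation and signature-based reuse plausible.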
19Applying the Replacement Primitive
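(Figure: replacing an actor with input semtype C and output semtype D by an actor with input C′ and output D′, in three cases: general replacement, context-sensitive replacement (wiggle room), and unsafe replacement. The conditions relate C to C′ and D to D′, include a case where C, C′ and D, D′ merely overlap, and a case that additionally involves C″ and D″.)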
- General replacement doesn't consider surrounding connections
- Context-sensitive replacement gives more wiggle room by tuning the actor's semtypes based on connections
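One plausible reading of the three replacement cases can be sketched as follows. The conditions here are assumptions, since the slide's exact conditions are not reproduced: general replacement uses the standard function-subtyping rule (the new actor accepts at least as much and produces at most as much), context-sensitive replacement checks against what the neighbors actually produce and accept, and anything that only overlaps is treated as unsafe. Semtypes are modeled as sets of instances, so subsumption is the subset test; the function name and the "incompatible" label are hypothetical.

```python
def classify_replacement(c, d, c_new, d_new, upstream=None, downstream=None):
    """Classify replacing an actor typed (c -> d) by one typed (c_new -> d_new).

    upstream is what the connected producer emits and downstream is what the
    connected consumer accepts; both are optional context.
    """
    if c <= c_new and d_new <= d:
        return "general"              # safe regardless of surrounding connections
    if upstream is not None and downstream is not None:
        if upstream <= c_new and d_new <= downstream:
            return "context-sensitive"   # safe only given these neighbors
    if c & c_new and d & d_new:
        return "unsafe"               # types merely overlap
    return "incompatible"

general = classify_replacement({1, 2}, {1, 2, 3, 4}, {1, 2, 3}, {1, 2})
unsafe = classify_replacement({1, 2, 3}, {5, 6}, {1, 2}, {5, 6, 7})
contextual = classify_replacement(
    {1, 2, 3}, {5, 6}, {1, 2}, {5, 6, 7},
    upstream={1}, downstream={5, 6, 7, 8},
)
```

The second and third calls use the same pair of actors; only the added knowledge of the neighbors turns an unsafe replacement into an acceptable one, which is the wiggle room the bullets refer to.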
20Workflow Elaboration
- Adapter insertion, replacement, and search provide a powerful mechanism for workflow elaboration
- Given an initial, user-specified set of connected abstract actors
- Repeatedly search for replacement concrete actors (atomic and composite)
- At each step, insert adapters when necessary
- Allow user to select returned workflows to be combined
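The elaboration steps above can be sketched as a small loop over a linear chain of actors. This is a simplified sketch under stated assumptions: the repository is a dictionary from abstract actor names to concrete candidates, compatibility is a caller-supplied predicate, the `choose` callback stands in for user selection, and every actor name is made up.

```python
def elaborate(chain, repository, compatible, choose=lambda candidates: candidates[0]):
    """Replace abstract actors by concrete ones and insert adapters where needed."""
    concrete = []
    for abstract in chain:
        candidates = repository.get(abstract, [])
        # Keep the abstract actor when the search returns nothing.
        concrete.append(choose(candidates) if candidates else abstract)

    result = []
    for actor in concrete:
        if result and not compatible(result[-1], actor):
            # Insert an abstract adapter between the two mismatched neighbors.
            result.append(f"adapter({result[-1]}->{actor})")
        result.append(actor)
    return result

# Hypothetical repository and compatibility relation.
repository = {"GetSpecies": ["SpeciesDB"], "Summarize": ["MeanBiomass", "SumBiomass"]}
compatible = lambda a, b: (a, b) != ("SpeciesDB", "MeanBiomass")

first_choice = elaborate(["GetSpecies", "Summarize", "Plot"], repository, compatible)
last_choice = elaborate(
    ["GetSpecies", "Summarize", "Plot"], repository, compatible,
    choose=lambda candidates: candidates[-1],
)
```

The inserted adapter is itself abstract, so a later pass could search the repository for a concrete component to implement it.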
21Super Rapid Prototyping The Sparrow Family of
Languages
22The Sparrow Family of Languages
- Basic Idea: Have both machine- and human-readable syntax
- Sparrow-DL
- Description logic
- Sparrow-DTD
- Datatypes, variant of XML DTDs
- Sparrow-Annotate
- Configuring concepts linking datatypes and ontologies
- Sparrow-SWF
- KSW-Based MoML Metadata
- Sparrow-Rule
- Fancy stuff, like type constraints (a la unit types), function approximation, and misc. other constraints
23The Sparrow Family of Languages
(Figure: examples of Sparrow-DTD, Sparrow-SWF, and Sparrow-DL.)
24Sparrow-Toolkit Operations
(Figure: workflow w1 with actors including a1, a4, and AMSQuote, over Author, ISBN, and ASIN data; port types shown include p1: string, p2: int, and p2: price: string, cond: string, seller.)
- Sparrow-Toolkit (example) Operations
- Is w1 semantically and/or structurally well typed?
- What can be semantically connected to a3?
- Insert abstract adapter between a3 and a4
- What can replace (e.g., implement) the adapter?
25Future Steps Adding a Hybrid-Type System to
Kepler
26Current and Future Implementation
- Concept-based Actor Search
- Implemented as proof-of-concept
- About to undergo major revision
- Additional operations slated for next Kepler release (data search, actor-based port search, etc.)
- Biggest Challenges
- Building/searching a repository
- Making changes to MoML (see KSW)
- GUI changes
- Ontology management
(Figure: Semantic Annotations link Workflow Components (MoML) to Ontologies (OWL; default and other) via instance expressions and urn ids.)