Title: Integrated support for data integration and science portals
1 Integrated support for data integration and
science portals
- Amarnath Gupta
- University of California San Diego
2 Overview
- We will first
- Discuss what cyberinfrastructure for science means
- Situate the business of data integration within the cyberinfrastructure setting
- Then we will briefly describe a few cyberinfrastructure projects in different science disciplines
- Biomedical sciences, geosciences, environmental sciences, marine biology, physical oceanography
- We will examine some dimensions of the data integration problem
- Discuss how they are approached in different projects from a CS/data management perspective
- Discuss common and complementary themes across these approaches
3 Cyberinfrastructure
- Cyberinfrastructure is the organized aggregate of technologies enabling access and coordination of information technology resources to facilitate science, engineering, and societal goals.
- Data access from distributed systems
- Data interoperability and assimilation
- Computation: grid-based and workflows
- Visualization
- Tools
- Information integration: highlighted today
- National Science Foundation's Cyberinfrastructure: the NSF Blue Ribbon Panel (Atkins) Report provided a compelling and comprehensive vision of an integrated cyberinfrastructure
Modified from Berman, SDSC, 2005
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES
A.K.Sinha, Virginia Tech, 2005
4 Source: Mark Ellisman
5 Source: Mark Ellisman
6
- We are here
- Making more general-purpose data integration infrastructure over distributed resources
- Extending to accommodate various scientific applications with stored and streaming data
Source: Mark Ellisman
7 GEONgrid Software Layers
Portal (login, myGEON)
Registration
GEONsearch
Core Grid Services: GT3, OGSA-DAI, GSI, CAS, GridFTP, SRB, PostGIS, MySQL, DB2
Physical Grid: RedHat Linux, ROCKS, Internet, I2, OptIPuter (planned)
GEON Space
8 BIRN: Major System Components
Collaborating Groups of Biomedical Researchers
Registered BIRN Data
9 BIRN: Specific Implementations
Mouse, Function, Morphometry (New Areas and Users)
Pegasus, Kepler, LONI Pipeline, etc.
e.g., AFNI, AIR, 3D Slicer, LONI, ...
BIRN Data Integration Suite
Registered BIRN Data
10 The OntoGrid View
Third-party tools
Taverna e-Science workbench
Applications
Haystack
LSID Launchpad
Web portals
Utopia
e-Science process patterns
LSID support
myGrid information model
e-Science mediator
e-Science coordination
Metadata Management
Data Management
e-Science events
KAVE metadata store
Service workflow discovery
mIR: myGrid information repository
Feta semantic discovery
KAVE provenance capture
Core Services
Pedro semantic publication
Workflow enactment
Freefluo workflow engine
GRIMOIRES federated UDDI registry
Notification service
myGrid ontology
Web Service (Grid Service) communication fabric
External Services
Java applications
Soaplab
AMBIT text extraction service
OGSA-DAI DQP service
Executable codes with an IDL
Gowlab
Legacy applications
Web Services
OGSA-DAI databases
Web Sites
Courtesy Carole Goble
11 A Word about Data in Science: Excerpts from a Report by NSF's Office of Cyberinfrastructure
- Data. Data are any and all complex data entities from observations, experiments, simulations, models, and higher order assemblies, along with the associated documentation needed to describe and interpret the data.
- Metadata. Metadata are a subset of data, and are data about data. Metadata summarize data content, context, structure, inter-relationships, and provenance (information on history and origins). They add relevance and purpose to data, and enable the identification of similar data in different data collections.
- Ontology. An ontology is the systematic description of a given phenomenon. It often includes a controlled vocabulary and relationships, captures nuances in meaning, and enables knowledge sharing and reuse.
12 What is data integration?
- For applications where there are a number of data sources (recall previous slide)
- Geographically distributed
- Having data on different platforms
- Possibly on systems with different query capabilities (e.g., different DBMSs, files, spreadsheets)
- Perhaps even having different data models
- Having different schemas
- BUT about one common, general theme
- One may want to construct
- A general-purpose information system such that
- All these data sources can be co-accessed as if they belong to a single data source
- It can produce combined information objects on demand for ad hoc queries, to facilitate problem-specific analyses performed through other software products (workflows, atlases, statistical packages)
- Data integration refers to a body of techniques to produce such an information system
13 Data Integration vis-à-vis Data Grid
- A different aspect of data management
Inter-organizational Information Storage
Management
Semantic data Organization (with behavior)
Virtual Data Transparency
Data Replica Transparency
Data Identifier Transparency
Storage Location Transparency
Storage Resource Transparency
Courtesy Reagan Moore and Arun Jagatheesan
14 Data Integration in Science Starts with Science Questions
- GeoScience (GEON)
- What is the geologic and geophysical record of Super-Continent assembly and dispersal?
- What are the architectures of terrain boundaries at depth?
- How do composition, temperature and strain fabrics vary within the lithosphere and asthenosphere? Are lithospheric and asthenospheric strain coupled?
- Neuroscience (BIRN)
- Find volumetric data/metadata from MRIs of humans with specific diagnoses
- Which structures are decreased/increased in size relative to normal controls?
- Which structures show structural differences across a variety of diagnoses?
- Given a structure which shows structural differences
- Which other structures are associated with it?
- Do any of these associated structures show structural differences?
- Do these other changed structures have commonalities (i.e., cell types, neurotransmitters, other afferent/efferent connections)?
- Environmental Science (PAKT, CAMERA)
- Explain biodiversity by correlating distribution of a taxonomic group with spatial (temporal) distribution of temperature, dissolved oxygen, salinity.
- What accounts for large-scale genetic variation in microbial genomes that share a very recent common ancestry among coral reef habitats?
DATA NEEDED TO ADDRESS THESE QUESTIONS ARE
DISTRIBUTED ACROSS THE WORLD
15 A Science Question can be Complex
Q1. What is the geologic and geophysical record
of Super-Continent assembly and dispersal?
Needs complex integration of geophysical data
with those associated with sub-crustal
lithosphere ages, its composition and physical
properties (seismic, thermal etc), surface
geology and associated events chronology
Adapted from D.Seber, SDSC
A.K.Sinha, Virginia Tech, 2005
16 Converting Questions to Queries
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES
A.K.Sinha, Virginia Tech, 2005
17 (Some) Dimensions of Information Integration in Cyberinfrastructure Projects
- Source Information Model
- Integration Engine's Information Model
- Specification of semantic correspondences across sources
- The 3-party power play among global schema, local schema, ontology
- Query paradigms over integrated data
- The mechanics of
- query planning
- query execution
18 About Semantic Correspondences
- The general problem
- For any data integration across multiple sources there needs to be a way to
- Specify how two objects from different data sources may correspond
- Specify how the joining of these two objects would create a composite data object
- What's the big deal?
- Identical objects versus equivalent objects
- Complete objects versus partial objects
- Multi-scale representations of the same object
- Handling definitional differences
- Taking into account natural variability
- Contextual correspondence
Are these always specifiable through ontological standards like OWL? Do we need to have correspondence checking services? Listen to Oscar and Carole's session tomorrow for a different angle
19 About the 3-party Power Play
- While we want to create a single (cyber-)infrastructure with a data integration component, different applications have different integration scenarios
- Is there a single global schema?
- Do new applications (and hence global schemata) get added all the time over existing sources and ontologies?
- Are the sources fixed? Do new sources get added all the time? Do sources come and go?
- Are sources added dynamically as data sets that users want to integrate on the fly?
- Do local schemata come with their own ontologies? Is there a global ontology that all local ontologies must map to?
- How does the global schema (if one exists) relate to the global and local ontologies?
- Do new (or modified) ontologies get added all the time?
- Do the local schemata evolve all the time?
Is there a general way to manage this? Do we need to architect any cyberinfrastructure components differently?
20 Source Information Models
- BIRN
- Data Sources
- Relational DBMS
- Standard data types
- Semantic data types (attribute-domain references to ontologies)
- Some data and computation sources expose a set of functions
- Key constraints
- Ontology Sources
- Simplifying assumptions
- Ontologies can be approximated by edge-labeled directed graphs stored in relational systems
- Graph traversal functions can be mimicked as database functions
- BONFIRE
- Glue ontology for simple inter-ontology mappings and extensions
- Image and Spatial Data Sources
- Discussed later
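The simplifying assumption above (an ontology stored as an edge-labeled directed graph in a relational system, with traversal exposed as a database function) can be sketched as follows. This is only an illustration: the table name `onto_edge`, the function `ancestors`, and the sample terms are hypothetical and are not taken from BIRN.

```python
import sqlite3

# Hypothetical edge table: one row per labeled edge of the ontology graph.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE onto_edge (src TEXT, label TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO onto_edge VALUES (?, ?, ?)",
    [
        ("hippocampus", "part_of", "limbic_system"),
        ("limbic_system", "part_of", "cerebrum"),
        ("cerebrum", "part_of", "brain"),
        ("Purkinje_cell", "located_in", "cerebellum"),
    ],
)


def ancestors(term, label):
    """Graph traversal mimicked as a database function.

    Returns every node reachable from `term` by following edges
    carrying the given label, using a recursive SQL query.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT dst FROM onto_edge WHERE src = ? AND label = ?
            UNION
            SELECT e.dst FROM onto_edge e JOIN reach r ON e.src = r.node
            WHERE e.label = ?
        )
        SELECT node FROM reach ORDER BY node
        """,
        (term, label, label),
    ).fetchall()
    return [row[0] for row in rows]


print(ancestors("hippocampus", "part_of"))
```

A mediator could call such a function as an opaque predicate inside a view definition without knowing how the graph is stored.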
21 Source Information Models
- GEON
- Data Sources
- Assumption: all data are in GEON Space
- Items and Item details
- Any relational JDBC data source (e.g., Excel files) is admitted
- Standard relational data types, shapefiles for spatial data
- Semantic data types by connecting to an ontology
- Ontology Sources
- Any OWL-specified ontology
- Registration in GEON
- Level 1: Federation-Based Integration
- Users should know the component database schemata
- Level 2: View-Based Integration
- Same as in BIRN
- Level 3: Ontology-Based Integration
- Preferred method
22 Source Information Models
- PAKT (marine biogeography)
- Data Sources
- Relational
- Spatial (vectors) supported by GIS and spatial DBMS
- Spatial (raster: continuously partitionable arrays)
- ArcGIS (map algebra)
- Nested, non-aligned, multiple resolution
- Spatially-indexed time series
- Function-exposing sources (WSDL)
- Parameter and result data types are interpretable or BLOBs
- Ontology Sources
- Any ontology specified in a subset of OWL
- Any DAG-structured data source
23 Source Information Models
- CAMERA
- PAKT
- Data sources that export annotated sequences as a base data type
- Phylogenetic trees
- XML repositories with XPath/XQuery processor
- RDBMS with XML processing capabilities
- Graphs such as molecular interaction networks (e.g., biological pathways), chemical reaction networks
24 Integration Engine's Information Model
- BIRN
- Sources from the mediator's view
- Base relations may have binding patterns
- Distinction between data and metadata is not strictly observed
- SRB metadata catalog is treated as a relational source with some special functions
- Files are accessed by reference to data-grid URIs (SRB ids)
- Integration Model
- Essentially Global-as-view (GAV) mediation
- Semantic aspect of the mediation executed through opaque functions over ontology sources
- Key constraints not used during standard query processing but are used for keyword queries
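As a rough illustration of Global-as-view mediation, the sketch below defines one global relation as a view (a join) over two source relations and answers a global query by evaluating that view. The relation names, attributes, and sample rows are hypothetical; a real mediator would unfold the view into source queries rather than iterate in memory.

```python
# Hypothetical source relations, as wrappers might return them.
SUBJECTS = [
    {"subject_id": 1, "diagnosis": "schizophrenia"},
    {"subject_id": 2, "diagnosis": "control"},
]
SCANS = [
    {"subject_id": 1, "file_ref": "grid-id-001"},
    {"subject_id": 1, "file_ref": "grid-id-002"},
    {"subject_id": 2, "file_ref": "grid-id-003"},
]


def subject_scans():
    """Global relation defined as a view (join) over the two sources."""
    diagnosis_by_id = {s["subject_id"]: s["diagnosis"] for s in SUBJECTS}
    for scan in SCANS:
        if scan["subject_id"] in diagnosis_by_id:
            yield {
                "subject_id": scan["subject_id"],
                "diagnosis": diagnosis_by_id[scan["subject_id"]],
                "file_ref": scan["file_ref"],
            }


def files_for(diagnosis):
    """A query posed against the global relation only."""
    return [row["file_ref"] for row in subject_scans()
            if row["diagnosis"] == diagnosis]


print(files_for("schizophrenia"))
```

The user of `files_for` never names the sources; only the view definition knows them, which is the essence of the GAV style.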
25 Integration Engine's Information Model
- BIRN (contd.)
- The 3-party power play
- Many integrated views used by several global schemata on a relatively fixed set of sources
- Ontologies are used in two ways
- A global view may be defined using ontology functions
- Keyword queries use simple ontological relationships
- Some terms in the global schema are mapped to ontologies through semantic typing
- Otherwise the global schema and integrated views are independent from the ontology
- Some data are warped to a common atlas coordinate system to enable atlas queries
- Atlas mapping spatial annotation
26 Integration Engine's Information Model
- BIRN Integration architecture
- Gateway
- Has XML API for source registration, source schema update
- Has XML API for queries
- Can be accessed as a web service
- Registry
- API-based access to schema elements and view definitions
- Implemented over MySQL for portability
- Spatial registry for image data
- Planner and Executor
- Described later
- Wrappers
- Local and remote
- OTIS
- Inverted index for ontological terms
Atlas Client
Onto Client
Query Client
Ontological Query Processor
Atlas Query Processor
OTIS
Spatial Registry
Mediator
Data Grid Access
Wrapper Access
27 BIRN Tool: Source Registration
28 Integration Engine's Information Model
- GEON
- Sources from the integration engine's viewpoint
- Metadata (item-level information) maintained in a GEON standard called ADN (Alexandria/DLESE/NASA)
- Item-detail level information is either any relationalizable data or shapefiles
- Any WMS or WFS service is a valid source for map information management
- Does not permit an external ontology source; all ontologies have to be defined in the GEON framework
- Integration Model
- Every source schema is registered to an ontology
29 Integration Engine's Information Model
- 3-party power play
- Several global schemata can be defined
- A global schema IS the OWL-DL compliant ontology
- A couple of consequences
- All transitive closure information is pre-computed after registration
- If a concept class has key constraints, subsumption is NEXPTIME-hard, and undecidable if the key constraint has a complex domain
- Does not matter much in practice because subsumption is rarely computed
- Pragmatics
- As new sources join, or new applications are attempted, the ontology needs to evolve
30 GEON Data Registration
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES
A.K.Sinha, Virginia Tech, 2005
31 Registration of Item Detail
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES
A.K.Sinha, Virginia Tech, 2005
32 ODAL (Ontological Database Annotation Language)
- Creates a partial model of ontologies from a database
- Independent of any GUI
- Independent of any concrete implementation
- Reusable
Example: the values in the column ssID of the tables Samples, RockTexture, RockGeoChemistry, ModalData, MineralChemistry and Images represent instances of RockSample
33 ODAL: Import Ontologies
- The ontologies used for annotating a database can be imported as follows
<?xml version="1.0"?>
<odal:ODAL xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:owl="http://www.w3.org/2002/07/owl#"
           xmlns:odal="http://www.sdsc.edu/odal">
  <odal:Ontology>
    <odal:Imports rdf:resource="http://www.library.org/Book.owl"/>
    <odal:Imports rdf:resource="http://www.writer.org/Writer.owl"/>
  </odal:Ontology>
</odal:ODAL>
34 ODAL: Database Connection Declaration
- The target database for annotation is declared as follows
<?xml version="1.0"?>
<odal:ODAL xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:owl="http://www.w3.org/2002/07/owl#"
           xmlns:odal="http://www.sdsc.edu/odal">
  <odal:Database odal:id="PublicationDatabase">
    <odal:DatabaseProductName>Oracle</odal:DatabaseProductName>
    <odal:DatabaseProductVersion>9.1.21</odal:DatabaseProductVersion>
    <odal:Host>oracle.sdsc.edu</odal:Host>
    <odal:Port>3456</odal:Port>
    <odal:DatabaseName>Publications</odal:DatabaseName>
  </odal:Database>
</odal:ODAL>
35 ODAL: Simple Named Individuals
Suppose the book ontology contains a class Book and the schema Collections contains a table book-price with a column ISBN.
<odal:NamedIndividuals odal:id="BookInTableBookPrice"
                       odal:database="PublicationDatabase">
  <odal:Class odal:resource="http://www.amazon.com/Book.owl#Book"/>
  <odal:Schema>Collections</odal:Schema>
  <odal:Table>book-price</odal:Table>
  <odal:Column>ISBN</odal:Column>
</odal:NamedIndividuals>
The statement says that each value in the column ISBN represents a Book individual.
odal:id gives a name to the declaration, and represents the set of individuals generated by the statement.
36 ODAL: The Names of Individuals
<odal:NamedIndividuals odal:id="BookInTableBookPrice"
                       odal:database="PublicationDatabase">
  <odal:Class odal:resource="http://www.amazon.com/Book.owl#Book"/>
  <odal:Schema>Collections</odal:Schema>
  <odal:Table>book-price</odal:Table>
  <odal:Column>ISBN</odal:Column>
</odal:NamedIndividuals>
ISBN: 0817313478
Individual name: (BookInTableBookPrice, PublicationDatabase.Collections.book-price.ISBN=0817313478)
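The naming scheme on this slide can be sketched in a few lines. This assumes that an individual name pairs the declaration id with a fully qualified `database.schema.table.column=value` string; the `=` separator is a reading of the slide, and the function name `individual_names` is hypothetical.

```python
def individual_names(decl_id, database, schema, table, column, values):
    """Build one (declaration id, qualified column=value) name per value.

    The separator "=" is an assumption about the slide's notation.
    """
    qualified_column = f"{database}.{schema}.{table}.{column}"
    return [(decl_id, f"{qualified_column}={value}") for value in values]


names = individual_names(
    "BookInTableBookPrice",
    "PublicationDatabase",
    "Collections",
    "book-price",
    "ISBN",
    ["0817313478"],
)
print(names)
```

Because the name embeds the full path to the column, two declarations over different tables cannot produce the same individual name by accident.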
37 ODAL: Named Individuals from Multiple Columns
Suppose an ontology contains a class Location and a database table Rock-Sample with two columns, Latitude and Longitude.
<odal:NamedIndividuals odal:id="LocationInTableRockSample">
  <odal:Class odal:resource="http://www.usgs.org/Space.owl#Location"/>
  <odal:Schema>California</odal:Schema>
  <odal:Table>Rock-Sample</odal:Table>
  <odal:Column>Latitude</odal:Column>
  <odal:Column>Longitude</odal:Column>
</odal:NamedIndividuals>
The statement says that a pair of latitude and longitude values gives a location.
38 ODAL: Named Individuals with Conditions
<odal:NamedIndividuals odal:id="MaleEmployeeInTableEmployee">
  <odal:Class odal:resource="http://www.abc.com/Employee.owl#MaleEmployee"/>
  <odal:Table>employee</odal:Table>
  <odal:Column>EmployeeId</odal:Column>
  <odal:Condition><![CDATA[ Gender='M' ]]></odal:Condition>
</odal:NamedIndividuals>
<odal:NamedIndividuals odal:id="FemaleEmployeeInTableEmployee">
  <odal:Class odal:resource="http://www.abc.com/Employee.owl#FemaleEmployee"/>
  <odal:Table>employee</odal:Table>
  <odal:Column>EmployeeId</odal:Column>
  <odal:Condition><![CDATA[ Gender='F' ]]></odal:Condition>
</odal:NamedIndividuals>
A condition in an odal:Condition element should be a Boolean expression that is valid in the WHERE clause of an SQL query.
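One way to see why the condition must be a valid WHERE expression is to translate a declaration into SQL. The sketch below is not the ODAL processor; it assumes a declaration shaped like the ones on this slide, a hypothetical helper `declaration_to_sql`, and a toy `employee` table, and it trusts the declaration text (no escaping is attempted).

```python
import sqlite3
import xml.etree.ElementTree as ET

ODAL_NS = {"odal": "http://www.sdsc.edu/odal"}

DECLARATION = """
<odal:NamedIndividuals xmlns:odal="http://www.sdsc.edu/odal"
                       odal:id="MaleEmployeeInTableEmployee">
  <odal:Table>employee</odal:Table>
  <odal:Column>EmployeeId</odal:Column>
  <odal:Condition><![CDATA[ Gender = 'M' ]]></odal:Condition>
</odal:NamedIndividuals>
"""


def declaration_to_sql(xml_text):
    """Turn a NamedIndividuals declaration into a SELECT statement."""
    root = ET.fromstring(xml_text)
    table = root.findtext("odal:Table", namespaces=ODAL_NS).strip()
    columns = [c.text.strip() for c in root.findall("odal:Column", ODAL_NS)]
    condition = root.findtext(
        "odal:Condition", default="", namespaces=ODAL_NS
    ).strip()
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if condition:
        # The condition is pasted verbatim, so it must be valid SQL.
        sql += f" WHERE {condition}"
    return sql


# Toy database standing in for the annotated source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmployeeId INTEGER, Gender TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, "M"), (2, "F"), (3, "M")])

sql = declaration_to_sql(DECLARATION)
male_ids = [row[0] for row in conn.execute(sql + " ORDER BY EmployeeId")]
print(sql, male_ids)
```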
39 ODAL: Data Type Property Declaration
Table Person: SSN = 123-56-7890, age = 8
Ontology: Person, hasAge, posInt
<odal:NamedIndividuals odal:id="PersonInTablePerson">
  <odal:Class odal:resource="http://www.foo.org/Person.owl#Person"/>
  <odal:Table>Person</odal:Table>
  <odal:Column>ssn</odal:Column>
</odal:NamedIndividuals>
<odal:OntologyProperty>
  <odal:DatatypeProperty odal:resource="http://www.foo.org/Person.owl#hasAge"/>
  <odal:Table>person</odal:Table>
  <odal:Domain odal:resource="PersonInTablePerson"/>
  <odal:Range odal:resource="age"/>
</odal:OntologyProperty>
40 Conditions for Joining Individuals from Different Resources
- Usually we do not join individuals across different resources
- A set of datatype properties can be declared as a key for a class in the ontology. We do join across multiple resources based on keys.
- e.g., hasLatitude and hasLongitude can be declared as a key of Location
- Two locations from different resources are the same if they have the same latitude and longitude
Example: RockSampleID = 10001 in one resource, RockID = 10001 in another.
We do not know whether 10001 represents the same rock in the two resources. By default, we assume it does not.
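The key-based rule can be sketched as follows, assuming individuals are represented as dictionaries of property values. The function `same_individuals` and the sample records are hypothetical; only the key (hasLatitude, hasLongitude) comes from the slide.

```python
from collections import defaultdict

# Key declared for the class Location, as on the slide.
LOCATION_KEY = ("hasLatitude", "hasLongitude")


def same_individuals(resource_a, resource_b, key=LOCATION_KEY):
    """Pair individuals from two resources whose key properties all match."""
    index = defaultdict(list)
    for individual in resource_a:
        index[tuple(individual[p] for p in key)].append(individual)
    return [
        (a, b)
        for b in resource_b
        for a in index.get(tuple(b[p] for p in key), [])
    ]


resource_a = [
    {"id": "A:1", "hasLatitude": 36.5, "hasLongitude": -118.2},
    {"id": "A:2", "hasLatitude": 37.0, "hasLongitude": -119.0},
]
resource_b = [
    {"id": "B:9", "hasLatitude": 36.5, "hasLongitude": -118.2},
]

pairs = [(a["id"], b["id"]) for a, b in same_individuals(resource_a, resource_b)]
print(pairs)
```

Classes without a declared key (the rock example above) simply never enter this function, which matches the default assumption that equal identifiers in different resources are unrelated.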
41 The Architecture of GEON Semantic Mediator
Oracle
DB2
MySQL
SQL Server
PostgreSQL
PostGIS
Query Execution
Query Optimization
Query Planning
Internal Database
SQL Parser
Spatial SQL against federated schemas
Mediator JDBC Driver
SOQL Parser
Semantic Query Rewriter
SOQL
Ontology Reasoner
ODAL Processor
GUI
Portal or Application
OWL
ODAL
SOQL Processor
42 The Map Integration Architecture
43 Map Integration
44 Integration Engine's Information Model
- PAKT (briefly)
- Type extensibility of the mediator
- Nested relational query language extended by tree and a restricted set of graph pattern operations
- Construction operations important
- Passive extensibility
- Source more powerful than the mediator
- Source exports a set of type-based optimization rules to the mediator
- Active extensibility
- Mediator extends its set of interpreted types
- Ontology management
- Ontological queries processed by a separate co-processor that interoperates with the mediator
- Query planner partitions the query between the ontological and mediated query processors
45 Query Paradigms
- What are the different kinds of queries scientists and applications pose to an integrated system?
- Metadata-based file access
- 21,038 raw image files per subject
- 2.4 GB of raw image data per subject
- 25 GB to 40 GB of processed image data per subject
- 10 million slices of functional imaging data in Phase II
- 7 terabytes of image data for all of the Phase II analyses
- (conservative estimate of 25 GB/subject)
- Ontologically supported mediated queries
- Find most recent fMRI data of all patients with low scores in working memory tasks having volumetric changes of hippocampus over 10% in 2 years
- Keyword queries
- fMRI working memory task hippocampus
- Ontologically supported keyword queries
- Associative searches
46 GEON SOQL (Simple Ontology Query Language)
- Query single or integrated resources
- via ontologies (i.e., high-level logical views)
- independent of any physical presentation (i.e., schemas)
47 Question: Find all seismic stations within 1 mile of railroads
SELECT X2.stationcode, X2.lat, X2.lon
FROM stationdatatable X2
WHERE <bounding box condition>
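The slide leaves the bounding box condition unspecified. A minimal sketch of one possible reading is shown below: expand the bounding box of a railroad's vertices by one mile and keep the stations that fall inside it. The 69 miles-per-degree constant is a rough approximation, and the function names, railroad vertices, and station coordinates are all hypothetical.

```python
import math

MILES_PER_DEGREE_LAT = 69.0  # rough approximation, assumed for this sketch


def expanded_bbox(points, miles):
    """Bounding box of (lat, lon) points, grown by `miles` on every side."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    dlat = miles / MILES_PER_DEGREE_LAT
    mid_lat = (min(lats) + max(lats)) / 2
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(mid_lat)))
    return (min(lats) - dlat, max(lats) + dlat,
            min(lons) - dlon, max(lons) + dlon)


def stations_in_bbox(stations, bbox):
    """Return codes of (code, lat, lon) stations inside the box."""
    lat_min, lat_max, lon_min, lon_max = bbox
    return [code for code, lat, lon in stations
            if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max]


railroad = [(34.00, -117.00), (34.00, -116.90)]
stations = [
    ("STA1", 34.01, -116.95),
    ("STA2", 34.05, -116.95),
    ("STA3", 33.99, -117.01),
]
candidates = stations_in_bbox(stations, expanded_bbox(railroad, miles=1.0))
print(candidates)
```

The box only yields candidates: a station in a corner of the box can be slightly farther than one mile from the track, so an exact distance test would normally follow this filter.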
48 BIRN: A Functional View of the Mediation Process
Planner Execution Engine
Query Expression (UCQ Nesting Grouping
Aggregate)
Pre-Executable Plan
Executable Plan
Flattening of Nested Queries
Post-processing aggregate
Execution Control
View Unfolding
Normalization to DNF
Result Building
Predicate Reordering (binding patterns maximal
chunk)
Result Reporting
Maximal Feasible Plan
Algebraic Plan
Cost/Selectivity-based Optimization
Pre-Executable Plan
49 View Definition and Query Language
- Union of conjunctive queries
- May contain function terms
- Expressed in XML Datalog with aggregate functions
- Query: q(X,F(Y)) :- r1(X,Z), r2(Z,Y)
- where F(Y) is an aggregate function applied to the set of Y values and X holds the group-by variables
- Planner and Executor translate this to
- q(X,Y) :- r1(X,Z), r2(Z,Y)
- q(X,W) :- F(gb(q(X,Y)))
- where gb is the group-by function; gb with the aggregate function F is pushed to the data source whenever possible, or evaluated at the mediator
- Query language allows nested queries: inner queries are assigned to intermediate variables that are used by the main query
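The two-step translation can be sketched as a join followed by a group-by with an aggregate, evaluated here at the mediator. The relation contents and function names are hypothetical, and this in-memory version says nothing about how the actual planner decides when to push the aggregate to a source.

```python
from collections import defaultdict


def join_r1_r2(r1, r2):
    """q(X,Y) :- r1(X,Z), r2(Z,Y): join on the shared variable Z."""
    ys_by_z = defaultdict(list)
    for z, y in r2:
        ys_by_z[z].append(y)
    return [(x, y) for x, z in r1 for y in ys_by_z.get(z, [])]


def group_and_aggregate(pairs, aggregate):
    """q(X,W) :- F(gb(q(X,Y))): group Y values by X, then apply F."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: aggregate(ys) for x, ys in groups.items()}


# Hypothetical relations: r1(subject, scan) and r2(scan, volume).
r1 = [("s1", "scanA"), ("s1", "scanB"), ("s2", "scanC")]
r2 = [("scanA", 10), ("scanB", 30), ("scanC", 5)]

flat = join_r1_r2(r1, r2)
max_volume = group_and_aggregate(flat, max)
total_volume = group_and_aggregate(flat, sum)
print(max_volume, total_volume)
```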
50 BIRN Mapping Relations
- Ontology Mapping: maps data values from a source to an ontology term of a known ontology (UMLS)
- Joinable: relation that pairs attributes from different relations
- Value-Map: maps a mediator-supported data value to the source-supported value (for example, gender 0/1 at some source is male/female for the mediator)
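A Value-Map can be pictured as a small two-way lookup table. The sketch below uses the slide's gender example; the dictionary layout and the helper names `to_source` and `to_mediator` are assumptions made for illustration.

```python
# Mediator value -> source value, per attribute (slide example: 0/1 gender).
VALUE_MAP = {"gender": {"male": 0, "female": 1}}


def to_source(attribute, mediator_value):
    """Rewrite a mediator constant into the source's encoding."""
    return VALUE_MAP[attribute][mediator_value]


def to_mediator(attribute, source_value):
    """Rewrite a value returned by the source into the mediator's encoding."""
    inverse = {src: med for med, src in VALUE_MAP[attribute].items()}
    return inverse[source_value]


print(to_source("gender", "female"), to_mediator("gender", 0))
```

The first direction is used when a query constant is shipped to the source, the second when results come back.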
51 Processing Ontological Queries
Courtesy Vadim Astakhov
52 PAKT: Spatial and Taxonomic Queries
53 Example Queries
Sources: OBIS (geo-spatial, biological); WOA (geo-spatial, physiochemical)
Q1: Where is species X found?
OBIS(scientific_name,lat,long)
Q2: For a given polygon, what species are found?
OBIS(scientific_name,m_lat,m_long,m_lat,m_long)
Q3: Where is species X found given a certain physical parameter?
OBIS(scientific_name,lat,long) WOA(physio,lat,long)
Q4: What are the aggregated physical properties of species X?
OBIS(scientific_name,lat,long) WOA(physio,lat,long)
(On the original slide, italics mark inputs and underlines mark outputs.)
Sources (extended): OBIS, WOA, Benth_Hab (BH), CMECS; dimensions: geo-spatial, biological, physiochemical, habitat
Q5: Where is habitat X found?
CMECS(habitat,physio) BH(habitat_grp,shape)
Q6: For a given polygon A, what habitats are found?
CMECS(habitat,physio) BH(habitat_grp,shape) PolygonA
Q7: Where is habitat X found given a certain physical parameter?
BH(habitat_grp,shape) WOA(physio,lat,long) CMECS(habitat,physio)
Q8: What are the aggregated physical properties of habitat X?
BH(habitat_grp,shape) WOA(physio,lat,long) CMECS(habitat,physio)
Q9: What species can be found at habitat X?
CMECS(habitat,physio) BH(habitat_grp,shape) OBIS(scientific_name,lat,long)
Q10: At what habitats is species X found?
CMECS(habitat,physio) BH(habitat_grp,shape) OBIS(scientific_name,lat,long)
54 Frequent Query Patterns
- Example queries are joins of
- Left query patterns: habitat-spatial, and
- Right query patterns: spatial-environmental/species distribution
Diagram labels: BH(..,shape) with WOA(physio,lat,long); BH(..,shape) with PolygonA; BH(..,shape) with OBIS(scientific_name,lat,long); mediator's queries; onto-module's queries; API
55 The Resource Management Aspect of Query Evaluation
- Primarily done by the Manchester group (Watson et al.)
- Polar
- Based on OQL (internally monoid comprehension)
- Multi-node planning
- Plan partitioning
- Exchange operator
- Attribute sensitivity
- Data index repartitioning
- Plan scheduling
- Query execution
Diagram (a DQP service on every node): nodes 1 and 2 run scan (A) and scan (B) over DBMS data exposed through OGSA-DAI; nodes 3 and 4 run join (A1,B1) and join (A2,B2); node 5 runs reduce.
From Amy Krause
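The plan in the diagram (two scans, two partition-wise joins, one reduce) can be imitated in a single process. The sketch below is not the DQP implementation: it assumes hash partitioning on an integer join key, runs the "nodes" sequentially, and uses hypothetical relations A and B.

```python
def partition(rows, n):
    """Exchange step: route each (key, value) row to one of n partitions."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[0]) % n].append(row)
    return parts


def local_join(a_part, b_part):
    """Join evaluated on one node over a pair of matching partitions."""
    b_index = {}
    for key, value in b_part:
        b_index.setdefault(key, []).append(value)
    return [(key, a_val, b_val)
            for key, a_val in a_part
            for b_val in b_index.get(key, [])]


def distributed_join(a, b, n=2):
    """Scan, partition, join per partition, then reduce (merge) results."""
    a_parts, b_parts = partition(a, n), partition(b, n)
    partial = [local_join(ap, bp) for ap, bp in zip(a_parts, b_parts)]
    return sorted(row for part in partial for row in part)


A = [(1, "a1"), (2, "a2"), (3, "a3")]
B = [(1, "b1"), (3, "b3"), (3, "b3x"), (4, "b4")]
result = distributed_join(A, B)
print(result)
```

Because both inputs are partitioned with the same function, matching keys always land in the same partition pair, so the per-node joins need no further communication before the reduce.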
56 The Adaptivity Issue in DQP on a Grid
- Monitoring-Assessment-Response framework of adaptive query processing in a grid (by Gounaris)
- Monitoring
- A separate module that keeps track of information like
- Has a resource (e.g., memory availability) changed by more than 10%?
- Has the data volume changed recently?
- Occurs between operators or within an operator's execution process
- Other modules subscribe to this notification
- Assessment
- Diagnosis is carried out for suboptimal execution, resource shortage, resource idleness, unmet performance requirements, unmet user needs
- Response
- Operator replacement or rescheduling, machine rescheduling, plan re-optimization
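A toy version of the monitoring, assessment, and response loop is sketched below. The 10% threshold follows the slide; the class `Monitor`, the `assess` rule, and the mapping from diagnosis to response are hypothetical simplifications, not the framework's actual policies.

```python
class Monitor:
    """Notifies subscribers when a resource changes by more than a threshold."""

    def __init__(self, threshold=0.10):
        self.threshold = threshold
        self.baseline = {}
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def observe(self, resource, value):
        old = self.baseline.get(resource)
        if old is None:
            self.baseline[resource] = value  # first reading sets the baseline
            return
        if old and abs(value - old) / old > self.threshold:
            self.baseline[resource] = value
            for callback in self.subscribers:
                callback((resource, old, value))


def assess(event):
    """Assessment: a deliberately crude diagnosis of the change."""
    _, old, new = event
    return "resource shortage" if new < old else "resource idleness"


# Response: hypothetical mapping from diagnosis to an action from the slide.
RESPONSES = {
    "resource shortage": "plan re-optimization",
    "resource idleness": "operator rescheduling",
}

decisions = []
monitor = Monitor()
monitor.subscribe(lambda event: decisions.append(RESPONSES[assess(event)]))
monitor.observe("memory_mb", 1000)  # baseline
monitor.observe("memory_mb", 950)   # 5% change: ignored
monitor.observe("memory_mb", 850)   # 15% drop from baseline: notified
print(decisions)
```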
57 Commonalities and Complementarities
- Common themes
- Overall architectural similarity of cyberinfrastructure projects
- Service orientation
- The data integration task is part of a larger scientific computing, exploration and analysis process
- Has impact on integration setting, design decisions and performance expectations
- Mediation with semantic mapping and reasoning seems to be winning
- Complementary approaches
- Details of the architecture
- Relationship with workflows
- Styles of mediation
- Extensibility of mediator
- Adaptivity of query planning and evaluation
58 Thank you!
- Questions? Comments? Integrated Queries?