Title: From Web 1.0 to Web 3.0: Is RDF Access to RDB Enough?
1From Web 1.0 to Web 3.0: Is RDF access to RDB enough?
- Vipul Kashyap
- vkashyap1_at_partners.org
- Senior Medical Informatician, Clinical Informatics R&D, Partners Healthcare System
- Martin Flanagan
- mflanagan_at_insilicodiscovery.com
- CTO, InSilico Discovery
- W3C Workshop on RDF Access to Relational Databases - October 26th, 2007
2Outline
- Position
- Use Case Scenario
- Solution Approach
- A Generalized Framework for RDF Access
- Next Steps
- Proposed Roadmap
- Research Topics
3Position
- There is a need for a generalized framework (format, representation language, algebra?) for RDF access to:
  - Relational Databases
  - Tabular Data Sources, e.g., Excel Spreadsheets
  - Web Services
- Motivation
  - Large amounts of tabular data and an increasing number of web services in the Healthcare and Life Sciences
  - Learn from the relational database success story: declarative query language, algebra, opportunities for optimization
  - Potential for providing incremental value, increasing the adoption and acceptance of the Semantic Web
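The appeal to a relational-style algebra can be illustrated with a minimal Python sketch: a triple pattern is evaluated as a selection over a relation of (subject, predicate, object) rows, and two patterns that share a variable are combined with a natural join. The function names, the `ex:` predicates, and the sample triples are assumptions made up for illustration; they are not taken from the slides or from any particular system.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match_pattern(triples: List[Triple], pattern: Triple) -> List[Binding]:
    """Selection: keep triples that agree with the pattern's constants,
    and return the variable bindings each surviving triple produces."""
    results: List[Binding] = []
    for triple in triples:
        binding: Binding = {}
        matched = True
        for term, value in zip(pattern, triple):
            if is_var(term):
                # A variable repeated in one pattern must bind consistently.
                if binding.get(term, value) != value:
                    matched = False
                    break
                binding[term] = value
            elif term != value:
                matched = False
                break
        if matched:
            results.append(binding)
    return results


def natural_join(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Natural join: merge bindings that agree on all shared variables."""
    joined: List[Binding] = []
    for lb in left:
        for rb in right:
            if all(lb[k] == rb[k] for k in lb.keys() & rb.keys()):
                joined.append({**lb, **rb})
    return joined


# Illustrative triples only (made-up values, not real genomic data).
TRIPLES: List[Triple] = [
    ("snp:rs6060535", "ex:upstreamOf", "gene:CPNE1"),
    ("gene:CPNE1", "ex:locatedIn", "region:R1"),
    ("gene:GENE_X", "ex:locatedIn", "region:R2"),
]

snp_genes = match_pattern(TRIPLES, ("snp:rs6060535", "ex:upstreamOf", "?gene"))
gene_regions = match_pattern(TRIPLES, ("?gene", "ex:locatedIn", "?region"))
answers = natural_join(snp_genes, gene_regions)
```

Because each pattern is an ordinary relational operation, the familiar optimizations (pushing selections down, reordering joins) would in principle carry over, which is the opportunity the motivation bullets point to.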
4Use Case Scenario: Biological Explanations for Statistical Correlations
- What is the location of a given gene, e.g., CPNE1, on the Human Genome?
  - Data Repository: NCBI Entrez
  - Access Mechanism: Web Services
- For what gene(s) is a given SNP, e.g., rs6060535, in the upstream regulatory region?
  - Data Repository: RDBMS containing dbSNP and regulatory region data
  - Access Mechanism: JDBC/SQL
- What genes have been found to be "coexpressed" with CPNE1, and in what study?
  - Data Repository: Excel Spreadsheet containing the co-expression patterns of various genes in various studies
  - Access Mechanism: .NET API, MS Office API
5Solution Approach
- Ontology based RDF query specification
- Mapping Framework
  - Relational Databases
  - Excel Spreadsheets
  - Web Services
- Query Translations and Execution
- Illustrations of a working system based on the Semantic Discovery System by InSilico Discovery (http://www.insilicodiscovery.com)
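As a rough illustration of the mapping step for a relational source, the Python sketch below turns table rows into RDF-style triples using a small declarative mapping. The mapping format, the table and column names, the `ex:` predicates, and the sample row are all hypothetical; this is not the Semantic Discovery System's mapping language, only a minimal sketch of the general idea using an in-memory SQLite database.

```python
import sqlite3
from typing import Any, Dict, Iterator, Tuple

Triple = Tuple[str, str, str]

# Hypothetical mapping: which table to read, which column identifies the
# subject, and which predicate each remaining column maps to.
SNP_MAPPING: Dict[str, Any] = {
    "table": "snp_region",
    "subject_column": "snp_id",
    "subject_prefix": "snp:",
    "predicates": {
        "gene_symbol": "ex:upstreamOf",
        "region_type": "ex:regionType",
    },
}


def rows_to_triples(conn: sqlite3.Connection, mapping: Dict[str, Any]) -> Iterator[Triple]:
    """Yield one triple per non-NULL mapped column of every row."""
    columns = [mapping["subject_column"], *mapping["predicates"]]
    # Identifiers come from the trusted mapping above, not from user input.
    sql = "SELECT " + ", ".join(columns) + " FROM " + mapping["table"]
    for row in conn.execute(sql):
        subject = mapping["subject_prefix"] + str(row[0])
        for column, value in zip(columns[1:], row[1:]):
            if value is not None:
                yield (subject, mapping["predicates"][column], str(value))


# Illustrative data only: a made-up row, not real dbSNP content.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snp_region (snp_id TEXT, gene_symbol TEXT, region_type TEXT)"
)
conn.execute(
    "INSERT INTO snp_region VALUES ('rs6060535', 'CPNE1', 'upstream_regulatory')"
)
triples = list(rows_to_triples(conn, SNP_MAPPING))
```

The same shape of mapping (subject key plus column-to-predicate table) would apply to a spreadsheet by reading rows from a sheet instead of a SQL cursor; a web service would need an extra step to describe its input parameters.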
6Ontology based RDF Query Specification
7Mapping to Relational Databases
8Mapping to Web Services
9Mapping to Excel Spreadsheets
10Query Translation and Execution
This one SPARQL statement joins data from NCBI, Excel, and Oracle: who did what assay matching this sequence data?
Translators
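The idea of one query spanning a web service, a relational database, and a spreadsheet can be sketched in Python as a tiny mediator: each source sits behind a wrapper that returns variable bindings, and the mediator joins them on shared variables. Everything here is an assumption for illustration: the wrapper names, the canned web-service response, the SQLite table standing in for the RDBMS, the CSV text standing in for the Excel sheet, and all data values. It does not reproduce how the system shown in the slides translates SPARQL.

```python
import csv
import io
import sqlite3
from typing import Dict, List

Binding = Dict[str, str]


def gene_location_service(gene: str) -> List[Binding]:
    """Stand-in for a web-service wrapper; returns canned placeholder data."""
    canned = {"CPNE1": "LOCUS_PLACEHOLDER"}
    return [{"gene": gene, "location": canned[gene]}] if gene in canned else []


def snp_genes_rdb(conn: sqlite3.Connection, snp: str) -> List[Binding]:
    """Wrapper over the relational source: translate the lookup into SQL."""
    rows = conn.execute(
        "SELECT gene_symbol FROM snp_region WHERE snp_id = ?", (snp,)
    )
    return [{"snp": snp, "gene": row[0]} for row in rows]


def coexpression_sheet(sheet_text: str) -> List[Binding]:
    """Wrapper over the tabular source (CSV text standing in for a sheet)."""
    reader = csv.DictReader(io.StringIO(sheet_text))
    return [
        {"gene": r["gene"], "partner": r["coexpressed_with"], "study": r["study"]}
        for r in reader
    ]


def join(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Natural join of two binding lists on their shared variables."""
    return [
        {**lb, **rb}
        for lb in left
        for rb in right
        if all(lb[k] == rb[k] for k in lb.keys() & rb.keys())
    ]


def answer(conn: sqlite3.Connection, sheet_text: str, snp: str) -> List[Binding]:
    """Mediator: query each source, then join the partial results."""
    snp_genes = snp_genes_rdb(conn, snp)
    # Feed genes found in the database into the web-service wrapper.
    locations = [b for g in snp_genes for b in gene_location_service(g["gene"])]
    return join(join(snp_genes, locations), coexpression_sheet(sheet_text))


# Illustrative data only (made-up rows and identifiers).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snp_region (snp_id TEXT, gene_symbol TEXT)")
conn.execute("INSERT INTO snp_region VALUES ('rs6060535', 'CPNE1')")
SHEET = (
    "gene,coexpressed_with,study\n"
    "CPNE1,GENE_X,STUDY_1\n"
    "GENE_Y,GENE_Z,STUDY_2\n"
)
results = answer(conn, SHEET, "rs6060535")
```

In this sketch the join order is fixed by hand; a real mediator would presumably choose it from the query, which is where an algebra with known rewrite rules becomes useful.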
11A Generalized Framework for RDF Access
The SDS Platform is based on the Mediator Definition Language work done by Val Tannen and his students at the University of Pennsylvania. It was earlier implemented in the K3 system and was widely used in Pharma.
12Conclusions
- Need to think of various types of structured/semi-structured/tabular data sources in a holistic manner
  - XML Documents (GRDDL Transforms)
  - Relational Databases
  - Web Services
  - Excel Spreadsheets
  - Other Tabular and Tree data sources
- Potential for providing value beyond relational databases
  - Accelerate the transition to the Semantic Web
  - Increase Adoption and Acceptance
13Next Steps: Proposed Roadmap
Diagram labels: RDF; Generalized Transformation Language; Relational Algebra; GRDDL; Relational Databases; WSDL; XML; Excel Spreadsheets
14Next Steps: Research
- Extension of Relational Algebra?
- XQuery
- RDF
- GRDDL Transformations
- WSDL
- Read-only Web Service Choreography/Composition
- What aspects of the above can be webified?
- Access Transformation Languages
- Mapping Languages: Is XQuery or RDF enough?
- Existing efforts in Mediator research
- E.g., Mediator Definition Language (MDL)