The EAV/CR WebDB Toolkit: An open source application framework for building evolvable neuroscience databases

1
  • The EAV/CR WebDB Toolkit: An open source
    application framework for building evolvable
    neuroscience databases
  • Luis Marenco
  • Center for Medical Informatics
  • Yale University School of Medicine
  • 2004

2
Outline
  • Neuroscience knowledge is characterized by
    constant evolution. This is a problem for
    traditional waterfall application design, where
    relational database schemas are designed up front
    and applications are hard-coded to them.
  • One particular approach to this problem is
    reviewed under these topics:
  • Motivation: the SenseLab project
  • Background: issues of traditional applications
  • Evolvable database applications: goals
  • Possible solution scenarios
  • EAV/CR and derived methodologies
  • EAV/CR applications: SenseLab and NDG
  • EAV/CR Solution Framework (EAV/CR WebDB Toolkit)

3
Motivation: The SenseLab Project
  • The SenseLab project is an ongoing effort to
    integrate multidisciplinary sensory data derived
    from the olfactory system.
  • This effort involves the development of
    neuroinformatics databases and tools in support
    of this research.
  • SenseLab currently consists of the following
    Web databases:
  • Neuronal research: NeuronDB, ModelDB, and
    CellPropDB
  • Olfactory research: Olfactory Receptors DB,
    OdorDB, and OdorMapDB

4
Background Issues: Traditional Databases
  • In traditional database applications, code is
    entwined with database schema elements.
  • Research on a poorly understood process such as
    olfaction involves constant revision of variables,
    which affects the DB schema and the code derived
    from it.
  • Following this approach in SenseLab led to:
  • Increased code complexity as new elements were
    added to the DB
  • Limited code reusability (code specific to every
    data category)
  • Lack of robust interoperability (schema
    dependency)
  • Embedding changing knowledge in schemas led to:
  • Downtime and application breakdown
  • Interface redesign (user and interoperability)
  • Introduction of code errors (when updating code)
  • Steeply growing maintenance burden (due to all of
    the above)

5
Background Issues: Traditional Databases (2)
  • Web-database applications additionally involve:
  • Data entry and security: elaborate, expensive, and
    of limited portability
  • Ad hoc searching mechanisms: difficult to
    standardize and expensive to maintain
  • Hard-coded interoperability: cumbersome to
    adapt to new standardized formats
  • At the database metadata level:
  • Built-in data dictionary lacks expressivity
  • Limited schema extensibility
  • Reduced set of data types

6
Evolvable Database Application Goals
  • PRIMARY
  • Create a programmatic approach that allows
    DB structural changes without disrupting the
    existing data and code
  • Minimize code-metadata dependency, focusing on
    automated interface generation (for humans and
    automated agents)
  • Aim to simplify the code as the project
    matures (Extreme Programming principles)
  • SECONDARY
  • Facilitate system integration to a Web platform
  • Allow accessibility from common web browsers
  • Incorporate role-based security for public and
    private data
  • Create generic interfaces and formats for data
    exchange
  • Improve code reusability leveraging interfaces
    and formats
  • Plan for robust interoperability through
    extensible protocols

7
Some Possible Solution Scenarios
  • Object-oriented or object-relational databases:
    at the time, immature and unsupported
  • Leveraging other flexible application
    approaches (e.g. Protégé): lacked needed features
    (e.g. not distributed or web-based)
  • Building a new ground-up solution to provide the
    needed features: the EAV/CR Application
    Framework (data storage and software practices)

8
EAV/CR Storage Approach
  • EAV/CR (Entity-Attribute-Value with Classes and
    Relationships) is a data storage approach derived
    from EAV, a row-based data modeling technique
    widely used in AI, electronic patient record
    systems, the MS Windows Registry, and others.
  • EAV/CR uses a limited number of tables and
    constraints to represent any number of tables,
    fields, and cells from an RDB.
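
The row-based idea can be sketched as follows. This is a minimal illustration, not the toolkit's actual schema: the single `eav` table, its column names, and the sample values are all assumptions made for the example.

```python
import sqlite3

# Hypothetical minimal EAV layout: one row per (entity, attribute, value) fact.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity INTEGER, attribute TEXT, value TEXT)")

# Two "rows" of a conventional table, stored as individual facts.
# The values are illustrative sample data only.
facts = [
    (1, "name", "mitral cell"),
    (1, "region", "olfactory bulb"),
    (2, "name", "granule cell"),
    (2, "region", "olfactory bulb"),
    (2, "transmitter", "GABA"),  # a new "column" needs no ALTER TABLE
]
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", facts)


def get_entity(conn, entity_id):
    """Reassemble one entity's attribute/value rows into a dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity_id,)
    )
    return dict(rows.fetchall())


print(get_entity(conn, 2))
```

Adding the `transmitter` attribute required only a new row, which is the property that lets the schema evolve without structural changes.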

9
EAV/CR Storage Approach
  • Conceptual
  • EAV/CR augments standard EAV by:
  • Allowing unlimited categories, grouping entities
    into Classes (C)

10
EAV/CR Storage Approach (2)
  • EAV/CR augments standard EAV by:
  • Implementing strong data typing for values
  • Extending data types (computed attributes)
  • Allowing entity relationships (R), both
    inter-class and hierarchical
  • Including implicit data and metadata versioning
    and timestamps
  • Including Web-oriented features: enriched
    web-oriented metadata to automate web-interface
    generation (Web forms, XML, ...)
  • Facilitating ontological representation: mapping
    standardized vocabulary and semantic
    relationship identifiers to data and metadata
    elements
  • Providing the ability to create database portals
    that present different subsets of the data to
    users with a particular research focus
  • Centralizing role-based security, with a
    distributed administration model to minimize
    dedicated administration costs
  • Providing monitoring tools
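
Three of the augmentations above (classes, strong typing, and relationships) can be sketched in a few lines. This is an in-memory illustration under stated assumptions: the names `AttributeDef`, `ClassDef`, `Entity`, and `Store` are hypothetical and do not come from the toolkit.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class AttributeDef:
    datatype: type                      # expected Python type of the value
    target_class: Optional[str] = None  # set when the value references another entity


@dataclass
class ClassDef:
    name: str
    attributes: Dict[str, AttributeDef] = field(default_factory=dict)


@dataclass
class Entity:
    entity_id: int
    class_name: str
    values: Dict[str, Any] = field(default_factory=dict)


class Store:
    """Hypothetical metadata-checked store: every value is validated against its class."""

    def __init__(self):
        self.classes: Dict[str, ClassDef] = {}
        self.entities: Dict[int, Entity] = {}

    def define_class(self, class_def):
        self.classes[class_def.name] = class_def

    def add_entity(self, entity_id, class_name):
        if class_name not in self.classes:
            raise KeyError(f"unknown class {class_name!r}")
        self.entities[entity_id] = Entity(entity_id, class_name)

    def set_value(self, entity_id, attribute, value):
        entity = self.entities[entity_id]
        attr = self.classes[entity.class_name].attributes.get(attribute)
        if attr is None:
            raise KeyError(f"{attribute!r} is not defined for {entity.class_name!r}")
        if attr.target_class is not None:
            # Relationship: the value must be the id of an entity of the target class.
            target = self.entities.get(value)
            if target is None or target.class_name != attr.target_class:
                raise ValueError(f"{attribute!r} must reference a {attr.target_class!r}")
        elif not isinstance(value, attr.datatype):
            # Strong typing: reject values of the wrong type.
            raise TypeError(f"{attribute!r} expects {attr.datatype.__name__}")
        entity.values[attribute] = value


store = Store()
store.define_class(ClassDef("Neuron", {"name": AttributeDef(str)}))
store.define_class(
    ClassDef("Receptor", {"expressed_in": AttributeDef(int, target_class="Neuron")})
)
store.add_entity(1, "Neuron")
store.add_entity(2, "Receptor")
store.set_value(1, "name", "mitral cell")
store.set_value(2, "expressed_in", 1)  # relationship to entity 1
```

Because the class definitions are themselves data, adding an attribute or a class is a metadata change, not a code change.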

11
EAV/CR derived methodologies
  • Expandable system architecture: allows parallel
    processing by scaling out. Parallel middle-tier
    servers connect to the same EAV/CR database,
    preserving security and data and metadata
    concurrency.
  • Delegated user profile management: users are
    responsible for their own profiles; administrators
    grant access to, and set restrictions on, specific
    database resources (Web portal model for data
    and metadata).
  • Distributed data: Classes shared among databases
    allow tight data integration, minimizing redundancy.

12
EAV/CR derived methodologies (2)
  • Data services: creation of the EAV/CR Dataset
    Protocol (EDSP), an InfoSet protocol that
    describes database structural ontology,
    metadata, and data in a simple XML format. (It
    brings the EAV/CR approach to the XML world.)
  • The following processes depend on EDSP:
  • Data transfer
  • Middle-tier components
  • Automated ad hoc query interface generation
  • Using EDSP as the source for these
    processes has improved the stability and
    reusability of software components.
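
The idea of carrying metadata and data together in one XML message can be sketched as below. The element and attribute names (`dataset`, `class`, `attribute`, `entities`, `entity`, `value`) are assumptions for illustration only; the real EDSP vocabulary is not given in these slides.

```python
import xml.etree.ElementTree as ET


def to_dataset_xml(class_name, attribute_types, entities):
    """Serialize one class's metadata and its entities into a single XML string.

    The tag names here are hypothetical, not the actual EDSP format.
    """
    root = ET.Element("dataset")

    # Metadata section: the class and the declared type of each attribute.
    meta = ET.SubElement(root, "class", name=class_name)
    for attribute, datatype in attribute_types.items():
        ET.SubElement(meta, "attribute", name=attribute, datatype=datatype)

    # Data section: one element per entity, one child per attribute value.
    data = ET.SubElement(root, "entities")
    for entity_id, values in entities.items():
        node = ET.SubElement(data, "entity", id=str(entity_id))
        for attribute, value in values.items():
            item = ET.SubElement(node, "value", attribute=attribute)
            item.text = str(value)

    return ET.tostring(root, encoding="unicode")


# Illustrative sample data only.
xml_text = to_dataset_xml(
    "Odor",
    {"name": "string", "carbons": "integer"},
    {7: {"name": "hexanal", "carbons": 6}},
)
print(xml_text)
```

A consumer that reads the metadata section first can interpret the data section without prior knowledge of the schema, which is what makes such a message usable for transfer and for interface generation.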

13
The EAV/CR Application Framework
  • Programming model
  • Database component programmer
  • Domain programmer
  • EAV/CR Framework Toolkit (version 1)
  • Database component: encapsulates EAV/CR logic,
    presenting interfaces for domain programmers.
    Created in C# (MS .NET)
  • Plumbing code: generic web scripts for
    metadata-driven navigation and interface
    generation. ASP/VBScript, migrated to C# on
    MS .NET 2.0 (Visual Studio 2005)
  • Domain programmers customize the plumbing code to
    their research goals.

14
EAV/CR Summary
  • EAV/CR and Evolvability
  • High data integration
  • Flexibility in database schema evolution /
    maintenance
  • Code reuse and increased reliability
  • Extensible application architecture
  • Disadvantages
  • Querying complexity
  • Performance penalty for multi-parameter queries
  • Complex programming of EAV/CR components
  • Future directions
  • Address the disadvantages above
  • Serve as a test bed for designing evolvable
    interoperability mechanisms such as the next SOAP
    version, WS-STAR (Microsoft, IBM, Oracle, etc.)
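
The querying complexity and multi-parameter penalty listed above can be illustrated with a plain EAV table: each additional search criterion adds one self-join. This is a sketch under assumptions; the `eav` table, its columns, and `build_match_all_query` are hypothetical and are not the toolkit's query engine.

```python
import sqlite3


def build_match_all_query(criteria):
    """Build SQL that finds entities satisfying every attribute=value pair.

    Each criterion after the first adds one self-join on the EAV table.
    """
    if not criteria:
        raise ValueError("at least one criterion is required")
    n = len(criteria)
    sql = "SELECT t0.entity FROM eav AS t0"
    for i in range(1, n):
        sql += f" JOIN eav AS t{i} ON t{i}.entity = t0.entity"
    sql += " WHERE " + " AND ".join(
        f"t{i}.attribute = ? AND t{i}.value = ?" for i in range(n)
    )
    # Flatten {"a": "x", "b": "y"} into ["a", "x", "b", "y"] to match the placeholders.
    params = [part for pair in criteria.items() for part in pair]
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity INTEGER, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [  # illustrative sample data only
        (1, "region", "olfactory bulb"),
        (2, "region", "olfactory bulb"),
        (2, "transmitter", "GABA"),
    ],
)

sql, params = build_match_all_query(
    {"region": "olfactory bulb", "transmitter": "GABA"}
)
matches = [row[0] for row in conn.execute(sql, params)]
print(sql)
print(matches)
```

In a conventional table the same search is a single `WHERE` clause over two columns, which is why the join count is a reasonable proxy for the penalty named on this slide.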

15
Links / Team
  • SenseLab Project - http://senselab.med.yale.edu
  • SfN - Neuroscience Database Gateway -
    http://big.sfn.org/ndg
  • EAV/CR Web site / WebDB toolkit / EDSP protocol
    - http://ycmi.med.yale.edu/EAVCR
  • Team Members
  • Gordon Shepherd: PI
  • Perry Miller: Project PI
  • Michael Hines: Project PI (ModelDB/Neuron
    design)
  • Luis Marenco: System/DB design
  • Prakash Nadkarni: System/DB design
  • Qin Zhang: EAV/CR WebDB Toolkit developer
  • Chiquito Crasto: OrDB/OdorDB administrator /
    domain programmer
  • Tom Morse: ModelDB/NeuronDB administrator /
    domain programmer
  • Nian Liu: OdorMapDB administrator / domain
    programmer
  • Demo slides follow

16
Centralized Schema Management
17
Centralized Schema Management (2)
18
Metadata extensibility
  • EAV/CR allows global ontological annotation of
    any data or metadata element in the database.

19
Metadata-driven ad hoc interface generation
  • This generic interface is built in real time by
    reading the metadata. Boolean expressions can be
    added for complex associations. Results can be
    retrieved in HTML, XML, text, and other formats.
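
Building an interface by reading metadata can be sketched as follows. This is a minimal illustration, assuming a simple list of attribute descriptions; the metadata keys (`name`, `datatype`, `caption`), the type mapping, and the `/search/...` URL are hypothetical and not taken from the toolkit.

```python
from html import escape

# Assumed mapping from metadata data types to HTML input types.
INPUT_TYPES = {"string": "text", "integer": "number", "date": "date"}


def render_search_form(class_name, attributes):
    """Generate an HTML search form from attribute metadata alone."""
    lines = [f'<form method="get" action="/search/{escape(class_name)}">']
    for attr in attributes:
        name = escape(attr["name"])
        label = escape(attr.get("caption", attr["name"]))
        input_type = INPUT_TYPES.get(attr["datatype"], "text")
        lines.append(
            f'  <label>{label} <input type="{input_type}" name="{name}"></label>'
        )
    lines.append('  <button type="submit">Search</button>')
    lines.append("</form>")
    return "\n".join(lines)


form = render_search_form(
    "Odor",
    [
        {"name": "name", "datatype": "string", "caption": "Odor name"},
        {"name": "carbons", "datatype": "integer"},
    ],
)
print(form)
```

Adding an attribute to the metadata list adds a field to the form with no change to the rendering code, which is the sense in which the interface is generic.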

20
Metadata-driven ad hoc interface generation (2)
  • The same generic code is reused by other
    databases, adding to the value of this
    robust, evolvable design.

21
InfoSets and Evolvable Interoperability
  • The creation of EDSP (the EAV/CR Dataset
    Protocol) allows the transfer of database schema
    and data in a simple, consistent, extensible
    format. This picture shows partial information on
    some olfactory receptor molecules from ORDB.

22
InfoSets and Evolvable Interoperability (2)
  • Data exchange with standardized formats can be
    achieved through XML transformations. Below, the
    previous EDSP message is transformed into Microsoft
    XDR, a format used by the MS Office suite to
    import and export data and metadata to and from
    MS Access and SQL Server databases.

23
Importing EAV/CR database into MS Access
24
Importing EAV/CR database into MS Access (2)
  • http://senselab.med.yale.edu/senselab/site/dbGate/Xtract.asp?o1798xsledsp-officedata

25
Importing EAV/CR database into MS Access (3)
26
Importing EAV/CR database into MS Access (4)
27
Importing EAV/CR database into MS Access (5)
  • relationships, and the data (preserving strong
    data typing)
  • All in one deEAVfication process.
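
The "deEAVfication" step named here, turning attribute-value rows back into conventional columns while restoring typed values, can be sketched as a pivot. The function name `de_eavify`, the row layout, and the sample values are assumptions for illustration, not the toolkit's import routine.

```python
def de_eavify(rows, casters):
    """Pivot (entity, attribute, value_text) rows into one record per entity.

    `casters` maps an attribute name to a function that restores its type;
    attributes without an entry stay as strings. Attributes missing for an
    entity are filled with None, as empty cells would be in a wide table.
    """
    rows = list(rows)
    columns = sorted({attribute for _, attribute, _ in rows})
    records = {}
    for entity, attribute, text in rows:
        cast = casters.get(attribute, str)
        record = records.setdefault(entity, dict.fromkeys(columns))
        record[attribute] = cast(text)
    return columns, records


# Illustrative sample rows only.
rows = [
    (1, "name", "hexanal"),
    (1, "carbons", "6"),
    (2, "name", "limonene"),
]
columns, records = de_eavify(rows, {"carbons": int})
print(columns)
print(records)
```

The resulting columns and records correspond to the table definition and rows that a conventional relational database would expect on import.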

28
Importing EAV/CR database into Protégé ontology
29
EAV/CR Physical DB Diagram
SenseLab physical schema: a mix of both
worlds, EAV/CR and RDB
  • http://senselab.med.yale.edu/senselab/site/dsArch/images/Visio-EAVCR_Physical_Schema_021205.png