PathCase A Web-Based Exploratory Querying and Visualization Tool for Biological Pathways - PowerPoint PPT Presentation

1
PathCase: A Web-Based Exploratory Querying and
Visualization Tool for Biological Pathways
Z. Meral Özsoyoglu
Case Western Reserve University, Cleveland, Ohio 44106
4th International Conference on Pathways, Networks, and Systems, Royal Myconian Conference Center, October 8-13, 2006, Mykonos, Greece
2
Digital Biology
  • Biology and Life Sciences have become
    increasingly data rich over the past decade
  • Rapid growth of biological data (distributed,
    heterogeneous), due to:
      • investments in public and private resources,
      • significant advances in data generation, storage,
        analysis, web-based availability, and sharing
        technologies,
      • emerging large-scale biological data gathering
        technologies

3
More growth in amount and diversity
  • Huge investments in:
      • developing large biological information
        resources,
      • assembling this information in public databases.
  • Many such resources and tools are available:
      • NCBI's GenBank, PubMed, BLAST, MGI's tools and
        databases, etc.
  • Continued explosive growth in the amount and
    diversity of biological and biochemical data is
    expected in the next century.

4
Medical informatics and physiological data
  • Also very large, diverse, non-standard, and
    distributed

5
Biological Data Challenges
  • In addition to being large, diverse and
    distributed, biological data has three important
    characteristics:
      • Complexity,
      • Heterogeneity, and
      • Evolution of both data and the schema.

6
Biological Data is Complex
  • Very rich in metadata; requires metadata
    management techniques.
  • Large, temporal, and historical; requires special
    knowledge warehouse design and management
    techniques.
  • It has inherently deeply nested hierarchical
    structures (e.g., ontologies), or is best modeled as
    graph structures at the conceptual level (e.g.,
    metabolic pathways, or signaling pathways).

7
Biological Data is Heterogeneous
  • It involves a wide array of data types,
    including text, image, and sequence data,
    as well as streaming data (e.g., medical sensor
    data), temporal data, and incomplete and missing
    data.
  • Its sources and formats are also heterogeneous.

8
Biological Data is very Dynamic
  • Data management techniques effectively handle the
    dynamic data content.
  • But dynamic schema evolution poses challenges for
    data management:
      • applications and software tools are based on
        the schema, and need to be updated and changed
        as the schema evolves.

9
Research
  • Using off-the-shelf data management software
    tools will not be sufficient for the data
    management needs of digital biology.
  • Integration of the existing technologies for
    biological data, and development of new data
    management techniques, are needed.
  • NIH BISTI workshop on Digital Biology
      • http://www.bisti.nih.gov/2003meeting/

10
PathCase: Case Pathways DataBase System
  • An integrated software tool for
      • storing,
      • visualizing,
      • querying, and
      • analyzing
    biological pathways at different levels of
    genetic, molecular, biochemical and organismal
    detail.
  • http://nashua.case.edu/pathways

11
Data Model
  • Graph structured database (hypergraph)
      • nodes: substrates and products
      • hyperedges: processes (reactions)
      • represented using a relational database
  • Querying and Visualization
      • based on the graph conceptual view.
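
A minimal sketch of how a hypergraph like the one on this slide could be laid out in relational tables, using Python's built-in sqlite3 module. The table names, column names, and the toy entities A, B, C and process R1 are illustrative assumptions, not PathCase's actual schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding one toy hyperedge (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE molecular_entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE process (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- One row per (hyperedge, node) incidence; role says which end of the edge.
        CREATE TABLE process_entity (
            process_id INTEGER NOT NULL REFERENCES process(id),
            entity_id  INTEGER NOT NULL REFERENCES molecular_entity(id),
            role       TEXT NOT NULL CHECK (role IN ('substrate', 'product'))
        );
    """)
    conn.executemany("INSERT INTO molecular_entity VALUES (?, ?)",
                     [(1, "A"), (2, "B"), (3, "C")])
    conn.execute("INSERT INTO process VALUES (1, 'R1')")
    # R1 consumes A and B and produces C: a single hyperedge with three nodes.
    conn.executemany("INSERT INTO process_entity VALUES (?, ?, ?)",
                     [(1, 1, "substrate"), (1, 2, "substrate"), (1, 3, "product")])
    return conn


def participants(conn, process_name, role):
    """Return the names of the entities playing `role` in the named process."""
    rows = conn.execute("""
        SELECT m.name
        FROM process p
        JOIN process_entity pe ON pe.process_id = p.id
        JOIN molecular_entity m ON m.id = pe.entity_id
        WHERE p.name = ? AND pe.role = ?
        ORDER BY m.name
    """, (process_name, role))
    return [row[0] for row in rows]
```

The incidence table is what lets one reaction connect any number of substrates and products, which an ordinary two-column edge table could not express directly.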

12
Other systems and resources
  • Reactome
  • KEGG
  • BioCyc, Pathway Tools
  • PATIKA
  • Cytoscape
  • BioCarta
  • and others.

13
Data Model
  • Pathway: interconnected arrangements of
    processes
    (representing the functional role of genes in the
    genome).
  • Process: a reaction (or step) in a pathway
    involving one genetically unique gene product
    (substrates, products, co-factors, inhibitors, and
    activators of a reaction are all molecular
    entities in this perspective).
  • Molecular Entity: the general name given to any
    entity participating in a process, such as a
    basic molecule, protein, enzyme, gene, or amino acid.
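
The three definitions above can be sketched as plain Python data classes. This is only an illustration of the conceptual model; the class layout, field names, and the `entities` helper are assumptions made here, not PathCase's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class MolecularEntity:
    """Anything that participates in a process."""
    name: str
    kind: str  # e.g. "basic molecule", "protein", "enzyme", "gene", "amino acid"


@dataclass
class Process:
    """A reaction (or step) in a pathway, with its participants grouped by role."""
    name: str
    substrates: List[MolecularEntity] = field(default_factory=list)
    products: List[MolecularEntity] = field(default_factory=list)
    cofactors: List[MolecularEntity] = field(default_factory=list)
    inhibitors: List[MolecularEntity] = field(default_factory=list)
    activators: List[MolecularEntity] = field(default_factory=list)

    def entities(self) -> Set[MolecularEntity]:
        """All molecular entities taking part in this process, in any role."""
        return set(self.substrates + self.products + self.cofactors
                   + self.inhibitors + self.activators)


@dataclass
class Pathway:
    """An interconnected arrangement of processes."""
    name: str
    processes: List[Process] = field(default_factory=list)

    def entities(self) -> Set[MolecularEntity]:
        """Union of the entities of every process in the pathway."""
        result: Set[MolecularEntity] = set()
        for process in self.processes:
            result |= process.entities()
        return result
```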

14
Browser view
15
PathCase usage statistics: hits from 62
countries.
User statistics
16
Browse Pathways
17
Metabolic Pathway groups
19
Select for pathway details
20
Select for interactive pathway graph
33
Database content
  • Metabolic Pathways (39)
      • 37 from Michal, G., Biochemical Pathways, John
        Wiley & Sons Inc., 1999
      • 2 (Folate and Homocysteine) for human and mouse,
        by Joe Nadeau and Toshimori Kitami
  • 876 processes (for different organisms)
  • Organisms
      • Human, mouse, animals, prokarya, plants, yeasts,
        unspecified

34
Web-based Pathways Query and Visualization
sub-system
Server-Client Architecture
35
Exploratory Querying and Visualization
  • Viewing whole network of pathways
  • Viewing in multiple levels of abstraction
  • Querying specific properties of any pathway
    component in any level of granularity
  • Path queries
  • Neighborhood queries
  • Different forms of queries and output displays
      • textual and graphical queries
      • built-in and parametrized queries
      • tabular and graphical query outputs
      • advanced query interface
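
A neighborhood query of the kind listed above can be sketched as a breadth-first search over processes. The toy processes R1 to R3, the entities A to D, and the function name `neighborhood` are hypothetical; the slides do not describe how PathCase evaluates these queries.

```python
from collections import deque

# Hypothetical toy data: each process maps to (substrates, products).
PROCESSES = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"C"}, {"D"}),
}


def neighborhood(processes, start, max_steps):
    """Entities reachable from `start` within `max_steps` processes, ignoring direction."""
    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        entity = queue.popleft()
        depth = depth_of[entity]
        if depth == max_steps:
            continue  # do not expand beyond the requested radius
        for substrates, products in processes.values():
            members = substrates | products
            if entity in members:
                for other in members:
                    if other not in depth_of:
                        depth_of[other] = depth + 1
                        queue.append(other)
    return {entity for entity in depth_of if entity != start}
```

For example, `neighborhood(PROCESSES, "B", 1)` returns the entities that share a process with B.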

36
Calls the query interface for finding the paths
between two molecular entities
37
Query interface for the "Find paths between two
molecular entities" query
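
As a rough illustration of what a "find paths between two molecular entities" query computes, the sketch below enumerates simple directed paths (substrate to product) with a depth-first search. The toy data, the `max_len` cutoff, and the function name `find_paths` are assumptions for this example, not the algorithm PathCase uses.

```python
# Hypothetical toy data: each process maps to (substrates, products).
DEMO_PROCESSES = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"A"}, {"C"}),
}


def find_paths(processes, source, target, max_len=5):
    """Return simple directed paths as lists: entity, process, entity, ...

    `max_len` bounds the number of processes on a path.
    """
    paths = []

    def extend(path, visited):
        current = path[-1]
        if current == target:
            paths.append(list(path))
            return
        if (len(path) - 1) // 2 >= max_len:
            return  # path already uses max_len processes
        for name in sorted(processes):
            substrates, products = processes[name]
            if current in substrates:
                for nxt in sorted(products):
                    if nxt not in visited:  # keep paths simple (no repeated entity)
                        extend(path + [name, nxt], visited | {nxt})

    extend([source], {source})
    return paths
```

Bounding the path length keeps the enumeration tractable on large networks, since the number of simple paths can grow exponentially.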
57
Visualizing Pathways
Connected pathways are displayed.
64
Data Management and Path Query Evaluation for
Pathways Analysis
  • Relational, Object Relational Representation
  • XML as Storage Format for Pathways Data
  • Encoding Structure of XML documents
  • Efficient Evaluation of Path and Neighborhood
    Queries for Pathways
  • Query Rewriting using Materialized Views
    (XML, XPath queries; Xu & Ozsoyoglu, VLDB
    2005)
  • Caching, indexing, and using SQL templates for
    more scalable querying
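
One way the last bullet could look in practice is sketched below: a small set of parameterized SQL templates, an index on the lookup column, and an in-process result cache. The `step` table, the template names, and the toy rows are assumptions for illustration; the slides do not give PathCase's templates or caching policy.

```python
import sqlite3
from functools import lru_cache

# Hypothetical SQL templates keyed by query name; values are bound as parameters.
SQL_TEMPLATES = {
    "products_of": "SELECT product FROM step WHERE substrate = ? ORDER BY product",
    "substrates_of": "SELECT substrate FROM step WHERE product = ? ORDER BY substrate",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE step (substrate TEXT, product TEXT)")
conn.execute("CREATE INDEX step_by_substrate ON step (substrate)")
conn.executemany("INSERT INTO step VALUES (?, ?)",
                 [("A", "B"), ("A", "C"), ("B", "D")])  # toy rows


@lru_cache(maxsize=256)
def run_template(name, arg):
    """Run a named template with one bound argument; repeated calls hit the cache."""
    rows = conn.execute(SQL_TEMPLATES[name], (arg,)).fetchall()
    return tuple(row[0] for row in rows)
```

A cache like this is only safe while the underlying data is unchanged; after an update, `run_template.cache_clear()` would need to be called.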

84
GO terms to/from Pathways
85
Gene Viewer
All mouse genes in this pathway. Chromosomal
locations are marked underneath
Processes and molecules not available in mouse
are grayed out
Location of the gene responsible for selected
process is highlighted
86
Advanced Query Interface
The Advanced Query Interface is for effectively
generating hierarchical queries involving
pathways, processes, molecules and organisms
Sample query and output for the query result
87
AQI: Creating a Query
  • We want to create a query that finds all
    processes that are
      • in mouse,
      • involved in either the Homocysteine pathway or
        the Folate pathway, and
      • include the Thiopurine molecular entity.
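
The logic of this sample query can be sketched as a filter over process records. Everything below is a toy stand-in: the record layout, the process names p1 to p4, and their organism and entity assignments are made up for illustration and do not reflect PathCase's data or real biochemistry.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ProcessRecord:
    name: str
    pathway: str
    organisms: FrozenSet[str]
    entities: FrozenSet[str]


# Hypothetical records, invented only to exercise the three query conditions.
RECORDS = [
    ProcessRecord("p1", "Homocysteine", frozenset({"mouse", "human"}),
                  frozenset({"Thiopurine", "X"})),
    ProcessRecord("p2", "Folate", frozenset({"human"}), frozenset({"Thiopurine"})),
    ProcessRecord("p3", "Folate", frozenset({"mouse"}), frozenset({"Y"})),
    ProcessRecord("p4", "Other", frozenset({"mouse"}), frozenset({"Thiopurine"})),
]


def find_processes(records, organism, pathways, entity):
    """Names of processes in `organism`, in any of `pathways`, that involve `entity`."""
    return [
        record.name
        for record in records
        if organism in record.organisms
        and record.pathway in pathways
        and entity in record.entities
    ]
```

With the toy records, `find_processes(RECORDS, "mouse", {"Homocysteine", "Folate"}, "Thiopurine")` keeps only the record that satisfies all three conditions.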

88
AQI: Creating a Query
Export the query to a file
Executes the created query
Reset the query to this initial state
Click on the add pathway button
89
AQI: Creating a Query
Pathway attributes
Select Homocysteine from the menu and click or
90
AQI: Creating a Query
Select Folate pathway from the menu
Click on the link Process to add a process to
the query
91
AQI: Creating a Query
92
Ongoing and future work
  • Querying and visualizing pathways given in BioPAX
    form.
  • Challenge: integration of pathways data from
    different sources, and data cleaning.
  • Use PathCase as a tool for pathways from other
    sources, e.g. KEGG, Reactome.
  • Tool for comparative genomics: pathway
    classification, extending known pathways to other
    sequenced organisms (missing enzyme detection),
    protein-protein interaction networks, data mining
    tools.

93
Ongoing and future work (cont.)
  • Signaling Pathways
  • More dimensions and levels of abstractions
  • Pathway dynamics: dynamic pathway models
      • Mapping Between Models for Pathway Dynamics and
        Structural Representations of Biological Pathways
        (Yavas & Ozsoyoglu, BioInfo, 2005)
  • More Integration (other resources and tools)
      • GO search, ontology browser, PubMed, KEGG,
        Swiss-Prot
  • Phenotypes
  • More robust, scalable, easy to use implementation
    with state-of-the-art data management and
    querying techniques
  • Utilization of PathCase as a tool toward early
    disease diagnosis, prevention and treatment, or
    advances in biology and genetics.

94
Acknowledgments
  • Joseph Nadeau, Dept of Genetics, CWRU, Medical
    School
  • Gultekin Ozsoyoglu, Electrical Engineering and
    Computer Science, CWRU, Case School of
    Engineering
  • Graduate students: S. Fatih Akgul, Ali Cakmak,
    Brendan Elliott, Sheila Ernest, Mustafa Kirac,
    Toshimori Kitami, Lakshmi Krishnamurty, Scott
    Newman, Murat Tasan, Wanhong Xu, Gokhan Yavas.
  • Undergraduates: Greg Strnad, Michael Starke,
    Brandon Evans, Marc Reynolds, Greg Schaefer
  • Programmer: Yu Mei
  • Wang Foundation

95
Case Western Reserve University
Cleveland
Thank you! http://nashua.case.edu/pathways