1
Networks and Pathways I

CBW Bioinformatics Workshop, February 24th, 2005, Vancouver
Christopher Hogue, The Blueprint Initiative
2
About this talk
  • The Problem of Choice: Too Many Databases
  • Data Exchange Formats
  • Pathway Resources
    • KEGG
    • EcoCyc
  • Small Molecule Resources
    • PubChem
    • SMID-BLAST

3
Molecular Assembly Data
  • Interaction pair
    • A binds B
  • Database of Interactions
    • Molecule: Vertex
    • Interaction: Edge
  • Tools/Computations
    • Graph Theory
    • Pathway Finding
    • Simulations
    • Cellular CAD
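
The vertex/edge model above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the molecule names A to E and the helpers `build_graph` and `find_path` are hypothetical, and breadth-first search stands in for whatever pathway-finding method a real tool would use.

```python
from collections import defaultdict, deque


def build_graph(pairs):
    """Turn interaction pairs (A binds B) into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: return one shortest chain of interactions, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # molecule -> molecule it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(graph.get(node, ())):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from the goal to the start, then reverse.
                path = [neighbor]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical interaction pairs, for illustration only.
PAIRS = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
GRAPH = build_graph(PAIRS)
```

Calling `find_path(GRAPH, "A", "D")` walks the chain A, B, C, D; a molecule that appears in no interaction pair yields `None`.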

Goodsell
4
Molecular Assembly: What Databases to Use?
  • DNA
  • RNA
  • Proteins
  • Small molecules
  • Complexes

5
The Problem
  • So many assembly databases, all with their own
    data models, formats, and data access methods.

http://cbio.mskcc.org/prl/
6
User Behavior
  • The problem of too much choice.
  • (M. Lepper @ Stanford and S. Iyengar @ Columbia)
  • Two tables in a supermarket
  • 24 jars of jam vs 6 jars of jam.
  • 3% vs 30% of shoppers buying
  • Choice frustration.
  • Leads to incrementalism as essential user
    criticism is withdrawn.
  • Can't debug: "This jam is a little bitter
    compared to..."
  • the other 6?
  • the other 24?
  • A whole lot of bad jam that nobody wants to buy

7
User Behavior
  • The problem of too much choice.
  • (M. Lepper @ Stanford and S. Iyengar @ Columbia)
  • Two tables in a supermarket
  • 24 jars of jam vs 6 jars of jam.
  • 3% vs 30% of shoppers buying
  • Choice frustration.
  • Leads to incrementalism
  • Essential user criticism is withdrawn.
  • Can't debug: "This jam is a little bitter
    compared to..."
  • the other 6?
  • the other 24?
  • A whole lot of bad jam that nobody wants to buy

8
Standards Fatigue
  • Data standards are not an effective goal for
    achieving results in a timely way.
  • Interactions/Pathways: in progress since the NIH
    meeting in Nov 1999, and efforts are still not
    integrated (PSI/IMEX and BioPAX).
  • Information systems are better goals.
  • Wet lab scientists are busy people who are
    (excuse me) trying to write papers.
  • Ongoing wishful thinking about the latest new
    technology.
  • "If only we had the semantic web, it would fix
    everything!"

9
Community Standards
  • IMEX (BIND/DIP/INTACT/MINT/MIPS)
  • BioPAX (pathway databases)
  • SBML (>70 software systems collaborating)
  • Cytoscape (collaborating interface developers)
  • NCBI/Blueprint (architecture)
  • Model Organism Databases (GMOD architecture)
  • Journals and Editors
  • Scientific Societies (FASEB)
  • Member and Non-member Scientists

10
Interaction Standards - PSI
11
BioPAX Pathways/Reactions
12
Exchange Formats in the Pathway Data Space
  • Database Exchange Formats: BioPAX, PSI-MI 2
  • Simulation Model Exchange Formats: SBML, CellML
  • Data types shown (figure labels): Genetic Interactions,
    Biochemical Reactions, Rate Formulas
13
Two Views on Biomolecular Assembly Data Integration
  • View 1: Separate Models
    • Pathways
    • Interactions
    • Separate Databases
    • Multiple DB ontologies
    • Ad-hoc curation standards
    • Ontology Consortia
    • PSI
    • BioPAX
    • APIs: Exchange Only
    • Publish or perish
  • View 2: Unified Model
    • Networks with Interactions and Reactions
    • GenBank-Like Data Archive
    • One Ontology archiving all
    • Professional Curation
    • Single Curation Standard
    • FTP Services
    • APIs: Atomistic Objects
    • Service or perish

14
Where to define data objects? API or Exchange or
Archive?
  • Software Systems Components (OSI Layers)
  • Human Interfaces
  • Application Programming Interfaces
  • Communications Protocols (Exchange)
  • Content Structure (Archive)
  • Database (ODBC/JDBC compliant MySQL)
  • Document Structures (XML)
  • Architectures (Compatible orchestration of the
    above)
  • Platforms (run the above: Windows, Linux, Unix)

Atomistic
All-or-none
15
BioPAX Motivation
A common format will make data more accessible,
promoting data sharing and distributed curation
efforts.
[Figure: connections among User, Application, and Database, shown Before BioPAX and With BioPAX; >150 DBs and tools]
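
One way to read the before/after picture is as a count of format converters. The sketch below is an assumption about what the figure conveys, not something stated on the slide: it supposes that without a shared format every ordered pair of resources needs its own converter, while with a shared format each resource needs only one exporter and one importer. The function names are hypothetical.

```python
def converters_without_common_format(n):
    """Assumed worst case: one converter per ordered pair of distinct resources."""
    return n * (n - 1)


def converters_with_common_format(n):
    """One exporter to and one importer from the shared format per resource."""
    return 2 * n


n = 150  # the slide cites more than 150 DBs and tools
print(converters_without_common_format(n))  # 22350
print(converters_with_common_format(n))     # 300
```

Under these assumptions the shared format turns quadratic growth in converters into linear growth, which is the practical argument for adopting one.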
16
Pathways, Interactions and Signaling
Metabolic Pathways
Molecular Interaction Networks
Signaling Pathways
18
Summary: Working with a spectrum of communities.
  • Identify the communities.
  • Recognize that communities are disjoint.
  • Success will arise from broad collaboration
    across the spectrum of identified communities.
  • Service all communities effectively with a whole
    system.
  • Drive innovation more through applications
    development and use.
  • Gain and effectively incorporate user critique.
  • Understand user needs, behaviors.

19
Pathway Databases
  • KEGG and EcoCyc

20
http://www.genome.jp/kegg/
34
PubChem Small Molecules
35
PubChem
  • Substance
  • Descriptions of chemical samples, from a variety
    of sources, and links to PubMed citations,
    protein 3D structures, and biological screening
    results that are available in PubChem BioAssay.
  • If the contents of a chemical sample are known,
    the description includes links to PubChem
    Compound.
  • Compound
  • Includes mixtures

39
Links: PubChem BioAssay
Similarity Search
Similar Compounds
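
Similarity searches over chemical structures are commonly scored with the Tanimoto coefficient on binary fingerprints. The sketch below shows that general technique under stated assumptions: fingerprints are represented as sets of set-bit positions, and the compound names, fingerprints, and 0.5 threshold are invented for illustration. It does not reproduce PubChem's actual fingerprints or search service.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of set-bit positions."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similar_compounds(query, library, threshold=0.5):
    """Return (name, score) pairs at or above the threshold, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


# Hypothetical fingerprints, for illustration only.
LIBRARY = {
    "cmpd_1": {1, 2, 3, 4},
    "cmpd_2": {1, 2, 5},
    "cmpd_3": {7, 8},
}
QUERY = {1, 2, 3}
```

With these made-up fingerprints, `similar_compounds(QUERY, LIBRARY)` keeps `cmpd_1` and `cmpd_2` and drops `cmpd_3`, which shares no bits with the query.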
45
Small Molecule Interaction DB.
  • SMID-BLAST - for finding small molecule binding
    sites based on 3D structures.

47
What's in SMID? SMID-BLAST?
  • SMID is a derived relational database
  • 3D structures that have small molecule binding
    sites
  • CDD domain regions: families of conserved
    domains
  • Small molecule binding residues are mapped onto
    CDDs.
  • SMID-BLAST enhances domain searching with small
    molecule binding context.
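
The idea of attaching binding-site context to domain hits can be sketched as a simple coordinate lookup. This is a simplified illustration, not SMID's schema or the SMID-BLAST algorithm: the `DomainHit` type, the domain names, and the residue positions are all hypothetical, and binding residues are assumed to be already expressed in query-sequence coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    name: str   # hypothetical conserved-domain identifier
    start: int  # first residue of the hit on the query, 1-based
    end: int    # last residue of the hit, inclusive


def annotate_hits(hits, binding_residues):
    """Map each domain hit to the binding residues that fall inside it."""
    return {
        hit.name: sorted(r for r in binding_residues if hit.start <= r <= hit.end)
        for hit in hits
    }


# Hypothetical domain hits and binding residue positions, for illustration only.
HITS = [DomainHit("domain_X", 10, 120), DomainHit("domain_Y", 150, 300)]
BINDING = {45, 88, 160, 400}
```

Here `annotate_hits(HITS, BINDING)` assigns residues 45 and 88 to `domain_X` and residue 160 to `domain_Y`; residue 400 lies outside both hits and is not reported.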

48
Proteomics: HUPO Poster
  • Proteomics: phenol-upregulated proteins in H.
    salinarium.
  • Spots identified by 2D gels of +/- 1 mM phenol in
    4.5 M NaCl
  • Han, Han, Kim, Joo and Chan-Wha Kim, Korea
    University
  • H. salinarium is not sequenced
  • Mass spec peptide hits to Halobacterium sp.
    NRC-1
  • GI 15791191 (Vng2406c) and
  • GI 15791140 (Vng2339c)
  • Poster authors presented no conclusions other
    than that these were completely unknown proteins.

50
Little information from CDD
53
Completely relaxed Search settings
55
Aromatic ligand binding site phenol
56
Oxygen Reactive site
57
SMID-BLAST
  • Offers small molecule context in addition to CDD
    domain hits
  • With SMID-BLAST we can speculate on how two
    proteins work to utilize Phenol as a carbon
    source
  • Reactive species and loose specificity
    hydrophobic binding sites.

58
SMID-BLAST Standalone
  • Scoring System
  • Distinguishes site specificity
  • Weights substrate/binding site size
  • Generates GenPept Annotation
  • Suitable for use in sequence analysis pipelines