Biomolecular Networks BBSRC Summer School Dr. Charlie Hodgman Leeds Dec. 2002 - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Biomolecular Networks
BBSRC Summer School
Dr. Charlie Hodgman
Leeds - Dec. 2002
  • Biomolecular interactions
  • Information resources
  • Graphs
  • Representation in the computer
  • Dynamics

2
Biomolecular interactions can be used to build
interaction networks
  • Enzyme - ligand (substrate, product, inhibitor,
    activator)
      • metabolic pathways and networks
  • Gene products (RNA, proteins) with each other
      • signal networks, machinery for cellular processes
      • protein(/product) interaction networks
  • Gene - gene products
      • genetic networks

3
The 3 networks are actually interconnected
[Figure: diagram linking the gene product interaction network, the genetic network and the metabolic network; other labels: STIMULUS, mRNA, etc.]
4
Putting the networks together
  • Integrated - combinations of metabolic, genetic
    and product-interaction networks
      • a.k.a. gene networks at the Institute of Cytology
        and Genetics, Novosibirsk
  • Holistic - integrated networks that capture
    (all) interactions within a cell or
    multi-cellular system
      • a.k.a. virtual cells, e-cells

5
Information resources - Enzymes
  • EMP (http://www.empproject.com/): all enzymological data; 1 record = 1 enzyme from 1 paper
  • BRENDA (http://www.brenda.uni-koeln.de/): all enzymological data; 1 record = 1 E.C. code
  • KEGG (http://www.genome.ad.jp/kegg/kegg2.html): enzymes - ligands; 1 record = 1 E.C. code
  • UM-BBD (http://umbbd.ahc.umn.edu/): enzymes - metabolites; 1 record = 1 recorded bio-degradative or xenobiotic reaction
  • ENZYME (http://ca.expasy.org/enzyme/): E.C. reactions; 1 record = 1 E.C. code
6
Problems with these databases
  • Non-standard terminology of metabolite and
    protein names, both within and between databases
      • can't retrieve synonyms
  • EC code assignment is idiosyncratic (e.g. 19 codes
    for ATP hydrolysis) and slow (can take years,
    depending on how often the EC defers a decision),
    with poor representation of allostery
  • EC code structure (i.e. number-based hierarchy)
    is a poor model of classification for enzyme
    activity, but is the ONLY system at present

7
Info resources - Protein interactions
For example:
  • BIND (http://www.bind.ca/)
  • BRITE (http://www.genome.ad.jp/brite/)
  • CSNDB (http://geo.nihs.go.jp/csndb/)
  • DIP (http://dip.doe-mbi.ucla.edu/)
  • MINT (http://cbm.bio.uniroma2.it/mint/): other interactions
  • TRANSPATH (http://193.175.244.148/): duplicates CSNDB data

  • Proteomics Standards Initiative now working on a data
    exchange format
  • EBI is the European coordinator
    (http://www.ebi.ac.uk/intact/)

8
Interactions/associations have physical
and functional attributes
  • Stoichiometric or non-stoichiometric
  • Directed or undirected
  • Examples shown: enzyme/substrate relationship,
    regulatory subunits, multi-subunit complexes,
    filaments
Directed: one molecule exerting a biological
effect upon the other.
Stoichiometry: the numbers of molecules involved
are physically defined.
These attributes need to be assigned for studying
cause-effect relationships.
9
Problems with these databases
  • Non-standard terminology of protein names, though
    not as severe for some because associations to
    Swiss-Prot are available
  • Non-standard (even conflicting) data models, some
    accessed through proprietary interfaces (e.g.
    Curagen, YPD)
  • Often no direction to the biological relationship
    because yeast two-hybrid data are used
  • Cellular location of a complex may determine its
    activity, e.g. AKAPs.
      • Bauman, A.L. & Scott, J.D. (2002) Nature Cell
        Biol. 4, E203-E206

10
Information resources - gene regulation
  • TRANSFAC (http://transfac.gbf.de/TRANSFAC/): accessible only through proprietary web site
  • TRRD (http://www.bionet.nsc.ru/trrd/)
  • EPD (http://www.epd.isb-sib.ch/): focus on human data
  • RegulonDB (http://www.cifn.unam.mx/Computational_Genomics/regulondb/): data mostly pertain to E. coli
11
Problems with these resources
  • Non-standard terminology (especially of protein
    names and binding sites), though TRRD has
    thesaurus close to publication
  • low coverage of what is actually happening in
    cells (or even of what is published)
  • TRANSFAC matrices inconsistent in definition,
    number and often too vague for diagnostic
    purposes, but currently only option
  • EPD and RegulonDB have merits but are virtually
    organism-specific

12
Metabolic pathway databases
  • MPW (http://www.empproject.com/)
  • KEGG (http://www.genome.ad.jp/kegg/kegg2.html)
  • Biocarta (http://www.biocarta.com/): proprietary web interface
  • BioCyc (http://www.biocyc.org/): brings together Ecocyc, Humcyc, Metacyc etc.; based on Lamberts chart
  • Boehringer (http://www.expasy.org/cgi-bin/search-biochem-index): based on Boehringer chart
13
MPW chart
14
A KEGG CHART
15
Other pathway databases
  • CSNDB (http://geo.nihs.go.jp/csndb/)
  • SPAD (http://www.grt.kyushu-u.ac.jp/spad/)
  • Biocarta (http://www.biocarta.com/)
  • Transpath (http://193.175.244.148/)
  • Gene Networks (http://wwwmgs.bionet.nsc.ru/mgs/gnw)
16
GeneNetworks
17
18
Problems with these databases
  • Pathways are arbitrary entities
      • where to start/stop,
      • gaps in picture (KEGG) or pathways shrinking in
        length (MPW)
  • Metabolite-centric, so suboptimal for functional
    genomics and proteomics because enzymes (and
    especially multifunctional gene products) occur
    in multiple places
  • Selective ignorance (only capture isolated
    pathways)
  • Possibly expensive or restricted access (e.g.
    Curagen, Transpath)

19
Graphs in the mathematical sense
20
Graph terminology
Connected
Node/vertex
Edge
Unconnected
21
Graph terminology
Graph
(unconnected) subgraphs, clusters
22
Graph terminology
(connected) subgraphs
strongly connected nodes/clusters
23
Graph terminology
Undirected graph
Directed graph (digraph)
Directed acyclic graph (DAG)
24
Graph terminology
Bridge
Span
Articulation point
25
Graph terminology
26
Graph terminology
Tree
Leaf
Root
Node
27
Graph terminology
Forest
28
Graph terminology
Pruning
  • Different approaches
  • back one node

29
Graph terminology
Pruning
  • Different approaches
  • back one node
  • back to node with >1 edge
  • (trees pruned back to root)

30
Graph terminology
Pruning
  • Different approaches
  • back one node
  • back to node with >1 connection
  • (trees pruned back to root)
  • by given distance from a given node

Distance 2
31
Graph terminology
Pruning
  • Different approaches
  • back one node
  • back to node with >1 connection
  • (trees pruned back to root)
  • by given distance from a given node
  • by given distance up/down digraph

Distance 2
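Two of the pruning approaches listed above can be sketched in Python. This is a minimal illustration, not the presenter's implementation: the graph is assumed to be undirected and stored as a dictionary of neighbour sets, and the six-node example graph and the function names are hypothetical.

```python
from collections import deque

def prune_leaves(graph):
    """Prune back one node: drop every node that has exactly one edge."""
    leaves = {n for n, nbrs in graph.items() if len(nbrs) == 1}
    return {n: nbrs - leaves for n, nbrs in graph.items() if n not in leaves}

def prune_by_distance(graph, start, max_dist):
    """Keep only the nodes within max_dist edges of start (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue  # do not expand beyond the distance limit
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    kept = set(dist)
    return {n: graph[n] & kept for n in kept}

# Hypothetical example graph: A-B-C, with D and E attached to C, and F attached to E.
graph = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "D", "E"},
    "D": {"C"}, "E": {"C", "F"}, "F": {"E"},
}
one_step = prune_leaves(graph)               # removes the leaves A, D and F
near_c = prune_by_distance(graph, "C", 1)    # keeps C and its direct neighbours
```

Calling `prune_leaves` repeatedly until nothing changes would correspond to pruning a tree all the way back, which is why the distance-limited variant is often the more useful one for exploring a neighbourhood.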
32
Graph terminology
In a Petri net, the nodes alternate between
passive (e.g. metabolite) and active (e.g.
enzyme) states. N.B. it is usual to represent
active nodes by squares.
Coloured graphs/Petri nets have nodes and edges
with multiple attributes or properties (e.g. Km,
Vmax, compartment, charge, hydrophobicity).
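The alternation between passive and active nodes can be illustrated with a very small Petri net sketch in Python. Everything here is an assumption made for illustration: the dictionary layout, the names `substrate`, `product` and `enzyme`, and the `attributes` entry (standing in for the "colours" of a coloured net) do not come from the slides.

```python
def enabled(marking, transition):
    """An active node can fire if every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in transition["inputs"].items())

def fire(marking, transition):
    """Return the new marking after the active node consumes and produces tokens."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new = dict(marking)
    for place, n in transition["inputs"].items():
        new[place] = new.get(place, 0) - n
    for place, n in transition["outputs"].items():
        new[place] = new.get(place, 0) + n
    return new

# Hypothetical active node: an enzyme turning one substrate token into one product token.
transition = {
    "name": "enzyme",
    "inputs": {"substrate": 1},
    "outputs": {"product": 1},
    "attributes": {"compartment": "cytosol"},  # illustrative "colour" only
}
marking = {"substrate": 2, "product": 0}  # tokens on the passive nodes
after = fire(marking, transition)
```

Passive nodes (places) only hold tokens and active nodes (transitions) only move them, so every path through the net alternates between the two kinds of node.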
33
Graph terminology
  • Vertex degree: number of edges from a given
    node
  • Min. path length: the least number of edges to
    cross between 2 nodes, also sometimes called the
    degrees of separation
  • Network centre: the node which has the lowest
    average minimum path length
  • Distance matrix: matrix containing the minimum
    path length between every pair of nodes
  • Aver. path length: the average of all the
    minimum path lengths in a (sub)graph
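These definitions translate fairly directly into code. The sketch below is one possible reading in Python, assuming a connected, undirected graph with at least two nodes, stored as a dictionary of neighbour lists; the five-node chain used as an example is hypothetical.

```python
from collections import deque

def min_path_lengths(graph, source):
    """Least number of edges from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def vertex_degree(graph, node):
    """Number of edges from a given node."""
    return len(graph[node])

def distance_matrix(graph):
    """Minimum path length between every pair of nodes, as nested dictionaries."""
    return {n: min_path_lengths(graph, n) for n in graph}

def network_centre(graph):
    """Node with the lowest average minimum path length to all other nodes."""
    dm = distance_matrix(graph)
    def average(n):
        others = [d for m, d in dm[n].items() if m != n]
        return sum(others) / len(others)
    return min(graph, key=average)

def average_path_length(graph):
    """Average of all the minimum path lengths in the graph."""
    dm = distance_matrix(graph)
    lengths = [d for n in dm for m, d in dm[n].items() if m != n]
    return sum(lengths) / len(lengths)

# Hypothetical example: a chain A-B-C-D-E.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
centre = network_centre(chain)
mean_length = average_path_length(chain)
```

For the chain, the middle node is the centre because it minimises the average distance to everything else.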
34
Graph terminology
Networks
  • random: generated by random addition of new
    nodes and edges
  • small-world: has a small number of nodes with
    high vertex degree, resulting in a low average
    path length
  • scale-free: distribution of vertex degrees
    follows a power-law distribution (i.e. the
    distribution follows a straight line on log-log
    plots). These are stable to perturbation, but can
    exhibit complex behaviour.
Jeong, H., et al. (2000) The large scale
organisation of metabolic networks. Nature 407,
651-654.
These nodes of high vertex degree are known as
hubs. In metabolic networks, they correspond to
water, ATP, NADH etc.
Computational navigation of metabolic networks
that include hubs shows that glycolysis is only one
of 500 000 pathways of the same length from glucose
to pyruvate!
Kuffner, R. et al. (2000) Pathway analysis in
metabolic databases via differential metabolic
display. Bioinformatics 16, 825-836.
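The "straight line on log-log plots" criterion can be checked numerically. The following Python sketch counts vertex degrees and fits a least-squares slope to log(count) against log(degree). It is only a rough diagnostic under stated assumptions: the function names, the star-shaped example graph and the idealised counts are all made up for illustration, and a fitted slope on a small histogram is not a rigorous test for a power law.

```python
import math
from collections import Counter

def degree_distribution(graph):
    """Map each vertex degree k to the number of nodes that have that degree."""
    return Counter(len(nbrs) for nbrs in graph.values())

def loglog_slope(distribution):
    """Least-squares slope of log(count) versus log(degree), ignoring degree 0."""
    points = [(math.log(k), math.log(c))
              for k, c in distribution.items() if k > 0 and c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx

# Hypothetical hub with three neighbours (a star graph).
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
star_degrees = degree_distribution(star)

# Idealised counts proportional to k**-2, so the fitted slope should be close to -2.
ideal = {1: 1600, 2: 400, 4: 100, 8: 25}
slope = loglog_slope(ideal)
```

In a scale-free network the few nodes at the high-degree end of this distribution are the hubs.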
35
Computational representation of graphs
Process model
[Figure: process model linking metabolites M1-M5 through enzymes E1, E2 and E3; the E3 reaction is irreversible.]
Adjacency matrix
      M1  M2  M3  M4  M5
M1     0   1   0   1   1
M2     1   0   1   1   0
M3     0   0   0   0   0
M4     1   0   0   0   0
M5     1   0   0   0   0
Distance matrices are calculated from the adjacency matrix
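One way to calculate a distance matrix from an adjacency matrix is the Floyd-Warshall algorithm, sketched below in Python. The slide does not say which method was used, so this choice is an assumption. The matrix values are the ones on the slide, read with rows as the "from" node and columns as the "to" node, which is also an assumption about the slide's convention.

```python
INF = float("inf")  # marks pairs with no connecting path

nodes = ["M1", "M2", "M3", "M4", "M5"]
adjacency = [
    [0, 1, 0, 1, 1],  # M1
    [1, 0, 1, 1, 0],  # M2
    [0, 0, 0, 0, 0],  # M3: no outgoing edges (E3 reaction irreversible)
    [1, 0, 0, 0, 0],  # M4
    [1, 0, 0, 0, 0],  # M5
]

def distance_matrix(adj):
    """Minimum path length between every ordered pair of nodes (Floyd-Warshall)."""
    n = len(adj)
    dist = [[0 if i == j else (1 if adj[i][j] else INF) for j in range(n)]
            for i in range(n)]
    for k in range(n):          # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

distances = distance_matrix(adjacency)
```

Under this reading M3 can be reached from every other metabolite, but nothing can be reached from M3, so its row stays at infinity apart from the diagonal.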
36
Computational representation of graphs
[Figure: process model with enzymes E1, E2 and E3, metabolites M1 and M2, and product pairs M4 + M5 and M3 + M4.]
Stoichiometric matrix: one row per reaction,
minus and plus signs respectively for LHS and RHS of
equations
      M1  M2  M3  M4  M5
R1     1          -1  -1
R2    -1           1   1
R3    -1   1
R4     1  -1
R5        -1   1   1
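A stoichiometric matrix of this kind can be built mechanically from a list of reactions. The Python sketch below assumes one particular reading of the slide: five one-directional reactions, with each reversible step split into two rows (R1/R2 for M1 and M4 + M5, R3/R4 for M1 and M2) and R5 as the irreversible step M2 -> M3 + M4. Only the R1 row is repeated elsewhere in the deck, so the other rows, like the function and variable names, should be treated as assumptions.

```python
metabolites = ["M1", "M2", "M3", "M4", "M5"]

# Each reaction is (left-hand side, right-hand side) with stoichiometric coefficients.
reactions = {
    "R1": ({"M4": 1, "M5": 1}, {"M1": 1}),  # M4 + M5 -> M1
    "R2": ({"M1": 1}, {"M4": 1, "M5": 1}),  # M1 -> M4 + M5
    "R3": ({"M1": 1}, {"M2": 1}),           # M1 -> M2
    "R4": ({"M2": 1}, {"M1": 1}),           # M2 -> M1
    "R5": ({"M2": 1}, {"M3": 1, "M4": 1}),  # M2 -> M3 + M4 (irreversible)
}

def stoichiometric_matrix(reactions, metabolites):
    """One row per reaction: minus for left-hand side, plus for right-hand side."""
    matrix = []
    for lhs, rhs in reactions.values():
        matrix.append([rhs.get(m, 0) - lhs.get(m, 0) for m in metabolites])
    return matrix

S = stoichiometric_matrix(reactions, metabolites)
```

Blank cells on the slide become zeros here, which keeps every row the same length and makes the matrix usable for later numerical work.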
37
Dynamics (a.k.a. Systems Biology)
The stoichiometric matrices can be
  • converted into groups of connected (i.e. systems
    of) differential equations, e.g. for row R1
    (M1: 1, M4: -1, M5: -1):
    dM1/dt = r1 [M4][M5], where r1 = reaction rate
  • converted and subjected to stochastic methods
  • converted and subjected to (probabilistic) logic
    programming
  • subjected to a range of other statistical
    techniques
For holistic virtual cells, this means that we
can model the behaviour of living systems, linking
genotype to phenotype in the computer.
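As a rough sketch of the first conversion, the Python code below turns a stoichiometric matrix into a system of differential equations with mass-action kinetics and integrates it with simple Euler steps. Several things are assumptions rather than content from the slides: the matrix rows other than R1, the rate constants (all set to 1.0), the mass-action rate law, and the choice of Euler integration, which is used only because it needs no external library.

```python
metabolites = ["M1", "M2", "M3", "M4", "M5"]

# Assumed stoichiometric matrix: one row per reaction, one column per metabolite.
S = [
    [ 1,  0, 0, -1, -1],  # R1: M4 + M5 -> M1
    [-1,  0, 0,  1,  1],  # R2: M1 -> M4 + M5
    [-1,  1, 0,  0,  0],  # R3: M1 -> M2
    [ 1, -1, 0,  0,  0],  # R4: M2 -> M1
    [ 0, -1, 1,  1,  0],  # R5: M2 -> M3 + M4
]
k = [1.0, 1.0, 1.0, 1.0, 1.0]  # hypothetical rate constants

def rates(x):
    """Mass-action rate of each reaction: constant times product of its substrates."""
    v = []
    for row, constant in zip(S, k):
        rate = constant
        for coeff, conc in zip(row, x):
            if coeff < 0:  # negative entries are substrates
                rate *= conc ** (-coeff)
        v.append(rate)
    return v

def derivatives(x):
    """dx/dt for every metabolite: sum over reactions of coefficient times rate."""
    v = rates(x)
    return [sum(S[r][m] * v[r] for r in range(len(S))) for m in range(len(x))]

def euler(x, dt, steps):
    """Integrate the system forward with fixed-size explicit Euler steps."""
    for _ in range(steps):
        dx = derivatives(x)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x

start = [1.0, 0.0, 0.0, 0.0, 0.0]  # hypothetical initial concentrations
end = euler(start, dt=0.01, steps=100)
```

With only M4 and M5 present, the sole non-zero rate is R1, so dM1/dt reduces to r1 [M4][M5], matching the example on the slide. For real models a proper ODE solver would be preferable to fixed-step Euler.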

38
Dynamics (a.k.a. Systems Biology)
Systems Biology Mark-up Language (SBML) is the
emerging data standard for such models (Hucka et
al., Bioinformatics, in press).
A broad range of generic and specific modeling
tools are available, including
  • DBsolve
  • Gepasi
  • Metatools
  • Stochsim
  • E-cell
  • Ecocyc
  • GeneNet
See A.P. Arkin (2001) Synthetic cell biology.
Curr. Opin. in Biotech. 12, 638-644.

39
Thank you for listening!
You can give your brain a rest now