The Properties of Protein-Protein Interaction Networks and Its Use in Protein Function and Protein Complex Prediction

1
Course Name: Systems Biology
Conducted by: Shigehiko Kanaya & Md. Altaf-Ul-Amin
2
  • Dates of Lectures
  • October: 4, 6, 13, 20, 25, 27
  • November: 1, 10, 17, 22, 24, 29
  • Time: Mondays 9:20, Wednesdays 11:00
  • Website: http://kanaya.naist.jp/Lecture/systemsbiology_2010/

3
Syllabus
  • Introduction to Graphs/Networks, Different network models, Properties of Protein-Protein Interaction Networks, Different centrality measures
  • Protein Function prediction using network concepts, Application of network concepts in DNA sequencing, Line graphs
  • Hierarchical Clustering, Finding clusters in undirected simple graphs: application to protein complex detection
  • Introduction to KNApSAcK database, Metabolic Reaction system as ordinary differential equations, Metabolic Reaction system as stochastic process
  • Metabolic network and stoichiometric matrix, Information contained in stoichiometric matrix, Elementary flux modes and extreme pathways
  • Graph spectral analysis/Graph spectral clustering and its application to metabolic networks
  • Normalization procedures for gene expression data, Tests for differential expression of genes, Multiple testing and FDR, Reverse Engineering of genetic networks
  • Finding Biclusters in Bipartite Graphs, Properties of transcriptional/gene regulatory networks
  • Introduction to signaling pathways, Selected biological processes: Glycolytic oscillations, Sustained oscillation in signaling cascades
4
Central dogma of molecular biology
5
The crowded environment inside the cell
Some of the physical characteristics are as follows:
  • Viscosity > 100 × µ_H2O
  • Osmotic pressure < 150 atm
  • Electrical gradient 300000 V/cm
  • Near crystalline state
The osmotic pressure of ocean water is about 27 atm and that of blood is 7.7 atm at 25 °C.
Without a complicated regulatory system, the processes inside the cell cannot be controlled properly.
Source: Systems Biology by Bernhard O. Palsson
6
From Genome to Phenome
(Dynamic)
  • Phenome: Phenotype X
  • Metabolome: Metabolites (biochemical molecules)
  • Proteome: Proteins (amino acid sequences)
  • Transcriptome: mRNA and other RNAs (nucleotide sequence, single strand)
  • Genome (gene set): DNA (nucleotide sequence, e.g. ATCTGAT; double helix)
(Static)
As genome projects have progressed, many kinds of omics studies, such as transcriptome analysis, have progressed as well. These provide dynamic information that reflects the phenome.
7
Bioinformatics
[Figure: four layers of biological information. Genome: genes a to m arranged on two strands (5' to 3'). Transcriptome: activation (+) and repression (-) of these genes. Proteome, Interactome: proteins A to M grouped into protein function units. Metabolome (FT-MS): proteins acting on Metabolites 1 to 6 along a metabolic pathway.]
Metabolome: comprehensive and global analysis of diverse metabolites produced in cells and organisms
8
Introduction to Graphs/Networks
Representing a system as a network often helps to understand it
9
Königsberg bridge problem
Königsberg was a city in Prussia (present-day Kaliningrad, Russia) encompassing two islands and the banks of the Pregel River. The city was connected by 7 bridges.
Problem: Start at any point, walk over each bridge exactly once and return to the same point. Possible?
12
Königsberg bridge problem
Problem: Start at any point, walk over each bridge exactly once and return to the same point. Possible?
This problem was solved by Leonhard Euler in 1736 by means of a graph.
14
Königsberg bridge problem
Problem: Start at any point, walk over each bridge exactly once and return to the same point. Possible?
The circles A, B, C, and D represent land masses and each line represents a bridge.
The necessary condition for the existence of the desired route is that each land mass be connected to an even number of bridges.
The graph of the Königsberg bridge problem does not satisfy this necessary condition, and hence the problem has no solution.
This notion has been used in solving the DNA sequencing problem.
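The even-degree condition can be checked mechanically. The following is a minimal sketch, assuming the seven bridges are written as an edge list of a multigraph. The assignment of the labels A to D to particular land masses is an assumption, since the figure is not part of the transcript.

```python
from collections import Counter

# Seven bridges as edges of a multigraph. Which land mass carries which
# label (A-D) is an assumption made for this illustration.
BRIDGES = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]


def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def satisfies_even_degree_condition(edges):
    """True if every vertex has even degree (the necessary condition)."""
    return all(d % 2 == 0 for d in degrees(edges).values())


print(dict(degrees(BRIDGES)))
print(satisfies_even_degree_condition(BRIDGES))
```

With this edge list every land mass touches an odd number of bridges, so the check fails and the desired route cannot exist.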
15
Definition
A graph $G = (V, E)$ consists of a set of vertices $V = \{v_1, v_2, \ldots\}$ and a set of edges $E = \{e_1, e_2, \ldots\}$ such that each edge $e_k$ is identified by a pair of vertices $(v_i, v_j)$, which are called the end vertices of $e_k$.
A graph is an abstract representation of almost any physical situation involving discrete objects and a relationship between them.
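One way to hold such a graph in a program is sketched below. The vertex and edge labels are hypothetical and chosen only for illustration. Each edge name maps to its pair of end vertices, and a neighbor table is derived from that.

```python
# Hypothetical vertex and edge labels, used only to illustrate the definition.
V = {"v1", "v2", "v3", "v4"}
E = {
    "e1": ("v1", "v2"),
    "e2": ("v2", "v3"),
    "e3": ("v3", "v1"),
    "e4": ("v3", "v4"),
}


def adjacency(vertices, edges):
    """Map every vertex to the set of vertices it shares an edge with."""
    adj = {v: set() for v in vertices}
    for vi, vj in edges.values():
        adj[vi].add(vj)
        adj[vj].add(vi)
    return adj


adj = adjacency(V, E)
print(E["e4"])            # end vertices of edge e4
print(sorted(adj["v3"]))  # neighbors of v3
```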
16
It is immaterial whether the vertices are drawn as rectangles or circles, or whether the edges are drawn straight or curved, long or short.
[Figure: two drawings of a graph on vertices A, B, C, D]
Both of these graphs are the same.
17
Many systems in nature can be represented as
networks
18
Many systems in nature can be represented as
networks
Air route network: contains very high degree nodes
Road network: no such node exists
19
Many systems in nature can be represented as
networks
Printed circuit boards are networks
Network theory is extensively used to design the
wiring and placement of components in electronic
circuits
20
Many systems in nature can be represented as
networks
Protein-protein interaction network of E. coli
21
  • Some Basic Concepts regarding networks
  • Average Path length
  • Diameter
  • Eccentricity
  • Clustering Coefficient
  • Degree distribution

25
Average Path length
The distance between nodes u and v, called d(u,v), is the least length of a path from u to v.
d(a,e) = ?
[Figure: example graph on nodes a, b, c, d, e, f]
Length of the path a-b-c-d-f-e is 5
Length of the path a-c-d-f-e is 4
Length of the path a-c-d-e is 3
The minimum length of a path from a to e is 3 and therefore d(a,e) = 3.
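For an unweighted graph, the distance can be found with a breadth-first search. The sketch below assumes an edge list for the example graph that is inferred from the paths listed on the slide (the figure itself is not in the transcript). The function names are illustrative.

```python
from collections import deque

# Edge list inferred from the paths named on the slide (an assumption).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def distance(adj, source, target):
    """Least number of edges on a path from source to target (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return None  # target is not reachable from source


ADJ = build_adjacency(EDGES)
print(distance(ADJ, "a", "e"))
```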
26
Average Path length
The average path length L of a network is defined as the mean distance between all pairs of nodes.
[Figure: example graph on nodes a, b, c, d, e, f]
There are 6 nodes and $\binom{6}{2} = \frac{6!}{2!\,4!} = 15$ distinct pairs, for example (a,b), (a,c), ..., (e,f).
We have to calculate the distance between each of these 15 pairs and average them.
27
Average Path length
The average path length L of a network is defined as the mean distance between all pairs of nodes.
[Figure: example graph on nodes a, b, c, d, e, f]
a to b: 1
a to c: 1
a to d: 2
a to e: 3
a to f: 3
...
15 pairs, total length 27
L = 27/15 = 1.8
The average path length of most real complex networks is small.
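The sum over all 15 pairs can be sketched as follows. The edge list is an assumption inferred from the paths and distances given on the slides, and the graph is assumed to be connected.

```python
from collections import deque
from itertools import combinations

# Edge list inferred from the slides (an assumption; the figure is not in the transcript).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def average_path_length(adj):
    """Return (total length, number of pairs, mean) for a connected graph."""
    all_dist = {u: bfs_distances(adj, u) for u in adj}
    pairs = list(combinations(sorted(adj), 2))
    total = sum(all_dist[u][v] for u, v in pairs)
    return total, len(pairs), total / len(pairs)


total, n_pairs, L = average_path_length(build_adjacency(EDGES))
print(total, n_pairs, L)
```

For this edge list the sketch gives a total length of 27 over 15 pairs, in agreement with the slide.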
28
Average Path length
Finding the average path length is not easy when the network is large. Even finding the shortest path between a single pair of nodes is not easy.
A well-known algorithm is described in: Dijkstra, E.W., "A note on two problems in connexion with graphs", Numerische Mathematik, Vol. 1, 1959, pp. 269-271.
Dijkstra's algorithm can be found in almost every book on graph theory. There are other algorithms for finding shortest paths between all pairs of nodes.
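A common heap-based formulation of Dijkstra's algorithm is sketched below. It is not taken from the slides. It assumes non-negative edge weights, and the small weighted graph is hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """Shortest distances from source; adj maps node -> list of (neighbor, weight)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical weighted, undirected graph used only for illustration.
WEIGHTED = {
    "a": [("b", 4), ("c", 1)],
    "b": [("a", 4), ("c", 2), ("d", 5)],
    "c": [("a", 1), ("b", 2), ("d", 8)],
    "d": [("b", 5), ("c", 8)],
}

print(dijkstra(WEIGHTED, "a"))
```

When every edge has weight 1, as in the example graph of the slides, a plain breadth-first search gives the same distances.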
29
Diameter
The distance between nodes u and v, called d(u,v), is the least length of a path from u to v. The largest of the distances between any two nodes is called the diameter.
[Figure: example graph on nodes a, b, c, d, e, f]
a to b: 1
a to c: 1
a to d: 2
a to e: 3
a to f: 3
...
(15 pairs)
The diameter of this graph is 3.
30
Eccentricity And Radius
The eccentricity of a node u is the maximum of the distances from u to any other node in the graph. The radius of a graph is the minimum of the eccentricity values among all the nodes of the graph.
[Figure: example graph on nodes a, b, c, d, e, f]
a to b: 1
a to c: 1
a to d: 2
a to e: 3
a to f: 3
Therefore the eccentricity of node a is 3.
The radius of this graph is 2.
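Eccentricity, radius, and diameter all follow from the same table of distances. The sketch below assumes the edge list inferred from the earlier slides and a connected graph. The function names are illustrative.

```python
from collections import deque

# Edge list inferred from the slides (an assumption).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def eccentricities(adj):
    """Eccentricity of each node: its largest distance to any other node."""
    return {u: max(bfs_distances(adj, u).values()) for u in adj}


ecc = eccentricities(build_adjacency(EDGES))
radius = min(ecc.values())    # smallest eccentricity
diameter = max(ecc.values())  # largest eccentricity = largest distance
print(ecc, radius, diameter)
```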
31
Degree Distribution
The degree distribution is the probability
distribution function P(k), which shows the
probability that the degree of a randomly
selected node is k.
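P(k) can be estimated from a graph by counting node degrees. The sketch below uses the six-node example graph from the earlier slides, with an edge list that is inferred rather than given. It is not the graph behind the charts on the following slides.

```python
from collections import Counter

# Edge list inferred from the earlier slides (an assumption).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def degree_distribution(edges):
    """Fraction of nodes having each degree k (isolated nodes are not represented)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    counts = Counter(deg.values())  # number of nodes having degree k
    n = len(deg)
    return {k: c / n for k, c in sorted(counts.items())}


P = degree_distribution(EDGES)
print(P)
```

Here four of the six nodes have degree 2 and two have degree 3, so P(2) = 4/6 and P(3) = 2/6.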
32
Degree Distribution
[Figure: number of nodes having degree k plotted against degree (axis ticks 1 to 4), with a peak value of 10]
33
Degree Distribution
[Figure: P(k) plotted against degree (axis ticks 1 to 4), with a single peak at P(k) = 1]
Any randomness in the network will broaden the shape of this peak.
34
Degree Distribution
[Figure: number of nodes having degree k plotted against degree (axis ticks 1 to 4), with values 4 and 2]
35
Degree Distribution
[Figure: P(k) plotted against degree (axis ticks 1 to 4), with values 0.5 and 0.25]
36
Degree Distribution
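Poisson distribution
$P(k) = \frac{e^{-\lambda} \lambda^{k}}{k!}$
where $e = 2.71828\ldots$ is the base of natural logarithms and $\lambda$ is the mean degree.
The degree distribution of random graphs follows a Poisson distribution.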
37
Degree Distribution
$P(k) \sim k^{-\gamma}$
Power-law distribution
The degree distribution of many biological networks follows a power-law distribution.
A power-law distribution is a straight line on a log-log plot.
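The straight line follows from taking logarithms of both sides. In the short derivation below, the proportionality constant $c$ and the exponent symbol $\gamma$ are notation assumed here for illustration.

```latex
\[
P(k) = c\,k^{-\gamma}
\quad\Longrightarrow\quad
\log P(k) = \log c - \gamma \log k
\]
```

Plotting $\log P(k)$ against $\log k$ therefore gives a line with slope $-\gamma$ and intercept $\log c$.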
40
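Clustering coefficient
$C_i = \frac{2 E_i}{k_i (k_i - 1)}$
$k_i$ = number of neighbors of node i
$E_i$ = number of edges among the neighbors of node i
[Figure: example graph on nodes a, b, c, d, e, f]
$C_a = \frac{2 \cdot 1}{2 \cdot 1} = 1$
$C_b = \frac{2 \cdot 1}{2 \cdot 1} = 1$
$C_c = \frac{2 \cdot 1}{3 \cdot 2} = 0.333$
$C_d = \frac{2 \cdot 1}{3 \cdot 2} = 0.333$
$C_e = \frac{2 \cdot 1}{2 \cdot 1} = 1$
$C_f = \frac{2 \cdot 1}{2 \cdot 1} = 1$
Total = 4.667
$C = 4.667 / 6 \approx 0.778$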
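The per-node values can be reproduced with a short sketch. It assumes the formula C_i = 2 E_i / (k_i (k_i - 1)), which is what the worked arithmetic on the slide implies, and an edge list inferred from the earlier slides. Treating nodes with fewer than two neighbors as having coefficient 0 is a convention chosen here.

```python
from itertools import combinations

# Edge list inferred from the slides (an assumption).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering(adj, i):
    """C_i = 2 * E_i / (k_i * (k_i - 1)); taken as 0 when k_i < 2."""
    k = len(adj[i])
    if k < 2:
        return 0.0
    # E_i: number of neighbor pairs of i that are themselves connected
    e = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return 2 * e / (k * (k - 1))


ADJ = build_adjacency(EDGES)
C = {i: clustering(ADJ, i) for i in sorted(ADJ)}
C_mean = sum(C.values()) / len(C)
print(C, C_mean)
```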
41
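Clustering coefficient
By studying the average clustering C(k) of nodes with a given degree k, information about the actual modular organization can be extracted.
[Figure: example graph on nodes a, b, c, d, e, f]
$C_a = \frac{2 \cdot 1}{2 \cdot 1} = 1$
$C_b = \frac{2 \cdot 1}{2 \cdot 1} = 1$
$C_c = \frac{2 \cdot 1}{3 \cdot 2} = 0.333$
$C_d = \frac{2 \cdot 1}{3 \cdot 2} = 0.333$
$C_e = \frac{2 \cdot 1}{2 \cdot 1} = 1$
$C_f = \frac{2 \cdot 1}{2 \cdot 1} = 1$
$C(1) = 0$
$C(2) = (C_a + C_b + C_e + C_f)/4 = 1$
$C(3) = (C_c + C_d)/2 = 0.333$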
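Grouping the per-node coefficients by degree gives C(k). The sketch below assumes the same inferred edge list and the formula C_i = 2 E_i / (k_i (k_i - 1)). Only degrees that occur in the graph appear in the result.

```python
from collections import defaultdict
from itertools import combinations

# Edge list inferred from the slides (an assumption).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering(adj, i):
    """C_i = 2 * E_i / (k_i * (k_i - 1)); taken as 0 when k_i < 2."""
    k = len(adj[i])
    if k < 2:
        return 0.0
    e = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return 2 * e / (k * (k - 1))


def clustering_by_degree(adj):
    """Average clustering coefficient C(k) over the nodes of each degree k."""
    groups = defaultdict(list)
    for i in adj:
        groups[len(adj[i])].append(clustering(adj, i))
    return {k: sum(vals) / len(vals) for k, vals in sorted(groups.items())}


C_k = clustering_by_degree(build_adjacency(EDGES))
print(C_k)
```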
42
Clustering coefficient
By studying the average clustering C(k) of nodes with a given degree k, information about the actual modular organization can be extracted.
For most of the known metabolic networks, the average clustering follows a power law:
$C(k) \sim k^{-\beta}$, with exponent $\beta > 0$
Power-law distribution
43
Subgraphs
Consider a graph $G = (V, E)$. The graph $G' = (V', E')$ is a subgraph of G if V' and E' are subsets of V and E, respectively.
[Figure: graph G on nodes a, b, c, d, e, f, with two subgraphs of G: one on nodes a, b, c and one on nodes c, d, f]
44
Induced Subgraphs
An induced subgraph of a graph G on a subset S of the nodes of G is obtained by taking S and all edges of G having both end-points in S.
[Figure: graph G on nodes a, b, c, d, e, f, with the induced subgraph of G for S = {a, b, c} and the induced subgraph of G for S = {c, d, f}]
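Extracting an induced subgraph amounts to filtering the edge list. The sketch below assumes the edge list of G inferred from the earlier slides. The function name is illustrative.

```python
# Edge list of G inferred from the earlier slides (an assumption).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def induced_subgraph(edges, S):
    """Keep every edge of G whose two end-points both lie in S."""
    S = set(S)
    return [(u, v) for u, v in edges if u in S and v in S]


print(induced_subgraph(EDGES, {"a", "b", "c"}))
print(induced_subgraph(EDGES, {"c", "d", "f"}))
```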
45
Graphlets
Graphlets are non-isomorphic induced subgraphs of large networks.
T. Milenkovic, J. Lai, and N. Przulj, GraphCrunch: A Tool for Large Network Analyses, BMC Bioinformatics, 9:70, January 30, 2008.
46
Partial subgraphs/Motifs
A partial subgraph of a graph G on a subset S of the nodes of G is obtained by taking S and some of the edges of G having both end-points in S. They are sometimes called edge subgraphs.
[Figure: graph G on nodes a, b, c, d, e, f, with a partial subgraph of G for S = {a, b, c}]
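Whether a chosen set of edges forms a partial subgraph on S can be checked as sketched below. The edge list of G is inferred from the earlier slides, and the candidate edge sets are hypothetical, since the transcript does not say which edges the figure keeps.

```python
# Edge list of G inferred from the earlier slides (an assumption).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]


def is_partial_subgraph(sub_edges, S, edges):
    """True if every chosen edge is an edge of G with both end-points in S."""
    S = set(S)
    allowed = {frozenset(e) for e in edges if set(e) <= S}
    return all(frozenset(e) in allowed for e in sub_edges)


# Hypothetical choices of edges on S = {a, b, c}.
print(is_partial_subgraph([("a", "b"), ("b", "c")], {"a", "b", "c"}, EDGES))
print(is_partial_subgraph([("a", "d")], {"a", "b", "c"}, EDGES))
```

The first choice keeps two of the three edges among a, b, c and is a partial subgraph. The second is rejected because a-d is not an edge of G within S.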
47
Partial subgraphs/Motifs
"Genomic analysis of regulatory network dynamics reveals large topological changes", Nicholas M. Luscombe, M. Madan Babu, Haiyuan Yu, Michael Snyder, Sarah A. Teichmann & Mark Gerstein, Nature, Vol. 431, 2004
  • SIM: Single input motif
  • MIM: Multiple input motif
  • FFL: Feed forward loop
This paper searched for these motifs in the transcriptional regulatory network of Saccharomyces cerevisiae.