
Statistical physics of complex networks

- Sergei Maslov
- Brookhaven National Laboratory

Short history: complex systems before and after networks

- Statistical physics of complex systems was active in the 80s-90s (following the chaos boom of the 70s)
- Fractals (Mandelbrot and many others)
- Self-Organized Criticality (Per Bak and co-authors) → sandpiles → granular systems
- Complex = multiple time and length scales (e.g. avalanches) → cult of power laws
- Cellular automata (mostly in real space-time)
- Examples
- earthquakes
- disordered moving interfaces
- (co)-evolution of species
- agent-based modeling (ants)
- By the end of the 90s: breakup of the community and specialization
- Biology
- Economics and finance
- Internet
- Social sciences

Networks in complex systems

- Complex systems
- Large number of components interacting with each other
- All components and/or interactions are different from each other (unlike in traditional physics, where $10^{23}$ electrons are all the same!)
- Paradigms
- $10^4$ types of proteins in an organism
- $10^6$ routers in the Internet
- $10^9$ web pages in the WWW
- $10^{11}$ neurons in a human brain
- The simplest property, who interacts with whom, can be visualized as a network
- Complex networks are just a backbone for complex dynamical processes

Why study the topology of complex networks?

- Lots of easily available data: that's where the state-of-the-art information is (at least in biology)
- Large networks may contain information about basic design principles and/or the evolutionary history of the complex system
- This is similar to paleontology: learning about an animal from its backbone

- Inside single cells

- A small part of a metabolic network: the citric acid cycle

Metabolic pathway chart by ExPASy

Protein binding networks

Baker's yeast S. cerevisiae (only nuclear proteins shown)

Nematode worm C. elegans

Transcription regulatory networks

Single-celled eukaryote S. cerevisiae

Bacterium E. coli

GENOME

protein-gene interactions

PROTEOME

protein-protein interactions

METABOLISM

bio-chemical reactions

slide after Reka Albert

- Between cells in a multi-cellular organism

Sea urchin embryonic development (endomesoderm up to 30 hours) by Davidson's lab

C. elegans neurons

- Between organisms

Freshwater food web by Neo Martinez and Richard Williams

Sexual contacts: M. E. J. Newman, The structure and function of complex networks, SIAM Review 45, 167-256 (2003).

- Social

High school dating: data drawn from Peter S. Bearman, James Moody, and Katherine Stovel, visualized by Mark Newman

Network of actors co-starring in movies

Networks of scientists: co-authorship of papers

Webpages connected by hyperlinks on the AT&T website circa 1996, visualized by Mark Newman

Citation networks are similar to the WWW but time-ordered

- Technological

Internet as measured by Hal Burch and Bill

Cheswick's Internet Mapping Project.


Transportation networks: airlines

Transportation networks: railway maps

Tokyo rail map

- Lecture 1: General introduction to networks
- Node degrees, their distribution, and correlations
- Simple models
- Preferential attachment and the Simon model
- Growth model for protein families
- Percolation transition on networks
- Clustering coefficient
- Lectures 2-3: Biomolecular (mostly protein) networks
- Regulatory and signaling networks
- How many regulators? Bureaucratic collapse
- Network motifs in directed (e.g. regulatory) networks
- Protein binding networks
- Broad degree distributions in protein binding networks and possible explanations
- Evolutionary (duplication-divergence)
- Biophysical (stickiness)
- Functional
- Beyond degree distributions: how is it all wired together? Correlations in degrees
- Randomization of networks
- Law of Mass Action and propagation of perturbations

Degree (or connectivity) of a node: the number of neighbors

Degree $K = 2$

Degree $K = 4$

Directed networks have in- and out-degrees

In-degree $K_{in} = 2$

Out-degree $K_{out} = 5$
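The definitions above can be made concrete with a minimal sketch that counts degrees from an edge list. The function names and the toy edge lists are illustrative assumptions, not part of the lecture.

```python
from collections import Counter

def degrees(edges):
    """Undirected degree of every node that appears in an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def in_out_degrees(directed_edges):
    """In- and out-degree for a list of (source, target) pairs."""
    k_in, k_out = Counter(), Counter()
    for src, dst in directed_edges:
        k_out[src] += 1
        k_in[dst] += 1
    return k_in, k_out

# Hypothetical toy networks
undirected = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
directed = [("x", "y"), ("x", "z"), ("y", "z")]

deg = degrees(undirected)
k_in, k_out = in_out_degrees(directed)
print(dict(deg))
print(dict(k_in), dict(k_out))
```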

- Degree distributions in random and real networks

Degree distribution in a random network

- Randomly throw E edges among N nodes
- Solomonoff, Rapoport, Bull. Math. Biophysics (1951); Erdős-Rényi (1960)
- Degree distribution: binomial → Poisson
- $K$ stays close to its mean $\langle K \rangle$, with no hubs (fast decay of $N(K)$)
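As a rough illustration of the random-network recipe above, the sketch below throws E distinct edges among N nodes and prints the degree histogram next to the Poisson expectation. The sizes, the seed, and the choice to forbid self-loops and repeated edges are assumptions made for this example.

```python
import math
import random
from collections import Counter

def random_graph(n, e, seed=0):
    """Throw e distinct edges at random among n nodes (no self-loops)."""
    rng = random.Random(seed)
    edges = set()
    while len(edges) < e:
        u, v = rng.sample(range(n), 2)
        edges.add((min(u, v), max(u, v)))
    return edges

def degree_histogram(n, edges):
    """N(K): number of nodes with degree K, isolated nodes included."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return Counter(deg)

def poisson(k, mean):
    """Poisson probability of observing k when the mean is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

n, e = 2000, 4000
hist = degree_histogram(n, random_graph(n, e))
mean_k = 2 * e / n
for k in range(max(hist) + 1):
    # columns: K, observed N(K), Poisson expectation
    print(k, hist.get(k, 0), round(n * poisson(k, mean_k), 1))
```

The largest degree stays within a few standard deviations of the mean, which is the "no hubs" statement in numerical form.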

Degree distribution in real protein binding network

- Histogram $N(K)$ is broad: most nodes have low degree ~1, a few nodes have high degree ~100
- Can be approximately fitted with the functional form $N(K) \sim K^{-\gamma}$ with $\gamma \approx 2.5$

Many real-world networks have broad degree distributions

Basic BA-model

- Very simple algorithm to implement
- start with an initial set of $m_0$ fully connected nodes
- e.g. $m_0 = 3$
- now add new vertices one by one, each one with exactly $m$ edges
- each new edge connects to an existing vertex in proportion to the number of edges that vertex already has → preferential attachment
- easiest if you keep track of edge endpoints in one large array and select an element from this array at random
- the probability of selecting any one vertex will be proportional to the number of times it appears in the array, which corresponds to its degree

1 1 2 2 2 3 3 4 5 6 6 7 8 .

Generating BA graphs (cont'd)

- To start, each vertex has an equal number of edges (2)
- the probability of choosing any vertex is 1/3
- We add a new vertex, and it will have $m$ edges; here take $m = 2$
- draw 2 random elements from the array; suppose they are 2 and 3
- Now the probabilities of selecting 1, 2, 3, or 4 are 1/5, 3/10, 3/10, 1/5
- Add a new vertex, draw a vertex for it to connect to from the array
- etc.
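The endpoint-array recipe above can be sketched in a few lines. This is an illustrative implementation under stated assumptions: nodes are numbered from 0 rather than 1, duplicate draws for the same new vertex are simply redrawn to avoid multiple edges, and `m` must not exceed `m0`. The function name and defaults are hypothetical.

```python
import random
from collections import Counter

def barabasi_albert(n, m=2, m0=3, seed=0):
    """Grow a BA graph of n nodes using the edge-endpoint array trick."""
    rng = random.Random(seed)
    # Initial set of m0 fully connected nodes
    edges = [(i, j) for i in range(m0) for j in range(i + 1, m0)]
    # Each node appears in `endpoints` once per edge it has, so a uniform
    # draw from this list picks a node with probability ~ its degree.
    endpoints = [node for edge in edges for node in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:          # redraw duplicates: no multi-edges
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((new, t))
            endpoints += [new, t]
    return edges, endpoints

edges, endpoints = barabasi_albert(1000)
degree = Counter(endpoints)
print(degree.most_common(5))             # the oldest nodes tend to be hubs
```

Keeping one flat list of endpoints makes each preferential draw a constant-time operation, at the cost of storing every edge twice.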

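The tale of linear vs exponential growth

- Linear growth: the Barabási-Albert model with $\gamma = 3$ is a version of Simon's word usage model with $\gamma = 2 + \alpha$
- $dn_k/dt = (k-1)\,n_{k-1}/(t + \alpha t) - k\,n_k/(t + \alpha t)$
- Exponential growth: protein duplication-deletion model with $\gamma = 2 + \alpha/(r_{dup} - r_{del})$
- $dn_k/dt = r_{dup}\,(k-1)\,n_{k-1} - (r_{dup} + r_{del})\,k\,n_k + r_{del}\,(k+1)\,n_{k+1}$
- $N_F = \sum_k n_k$ also grows exponentially: $dN_F/dt = \alpha N_G = \alpha \sum_k k\,n_k$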

Preferential attachment with fitness

- Bianconi-Barabási (2001)
- Attractiveness of a node to new edges is given by $f_i k_i / \sum_r f_r k_r$
- For uniform $\rho(f)$: $P(k) \sim k^{-(1+C)}/\ln(k)$, where $C = 1.255$
- Generally $C$ depends on $\rho(f)$
- Some $\rho(f)$ result in Bose-Einstein condensation, in which super-hubs emerge

- Percolation transition in networks

Why should we care?

- The most important property of a network: it quantifies how broken-up the network is
- Below the percolation threshold: many small components
- At the percolation threshold: scale-free distribution of component sizes, $P(S) \sim S^{-2.5}$
- Above the percolation threshold: a giant connected component and a few small ones
- Determines the propagation of perturbations which affect neighbors with probability $p$ (e.g. infections)

Naïve (and wrong) argument

- An average node has $\langle K \rangle$ first neighbors, $\langle K \rangle \langle K-1 \rangle$ second neighbors, $\langle K \rangle \langle K-1 \rangle \langle K-1 \rangle$ third neighbors
- We neglect overlap between e.g. second and first neighbors: in random networks this is a small effect, ~$1/N$
- If $\langle K-1 \rangle \geq 1$, a single node is connected to a finite fraction of all nodes in the network

Where is it wrong?

- The probability of arriving at a node with $K$ neighbors is proportional to $K$!
- All averages have to be modified: $\langle F(K) \rangle \to \langle F(K)\,K \rangle / \langle K \rangle$
- The right answer: if $\langle K(K-1) \rangle / \langle K \rangle \geq 1$, a perturbation would spread
- In directed networks the condition is $\langle K_{in} K_{out} \rangle / \langle K_{in} \rangle \geq 1$
- Correlations between degrees of neighbors and an abnormally large number of triangles (clustering) would affect the answer

How many clusters?

- If $\langle K(K-1) \rangle / \langle K \rangle \ll 1$ there are only small clusters
- If $\langle K(K-1) \rangle / \langle K \rangle \approx 1$ cluster sizes $S$ have a scale-free distribution $P(S) \sim S^{-2.5}$
- If $\langle K(K-1) \rangle / \langle K \rangle \gg 1$ there is one giant cluster and a few small ones
- A perturbation which affects neighbors with probability $p$ propagates if $p\,\langle K(K-1) \rangle / \langle K \rangle \geq 1$
- For scale-free networks $P(K) \sim K^{-\gamma}$ with $\gamma < 3$, $\langle K^2 \rangle \to \infty$, so a perturbation always spreads in a large enough network
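To see how much the degree-weighted average differs from the naive one, here is a small sketch that evaluates $\langle K(K-1) \rangle / \langle K \rangle$ for two made-up degree sequences. The function names and both sequences are illustrative assumptions.

```python
def branching_ratio(degrees):
    """<K(K-1)>/<K> for a list of node degrees."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_kk1 = sum(k * (k - 1) for k in degrees) / n
    return mean_kk1 / mean_k

def spreads(degrees, p=1.0):
    """True if a perturbation passed on with probability p can spread."""
    return p * branching_ratio(degrees) >= 1

regular = [3] * 100                  # every node has K = 3
hubby = [1] * 90 + [20] * 10         # mostly K = 1 plus a few hubs

print(branching_ratio(regular))      # 2.0, same as the naive <K-1>
print(branching_ratio(hubby))        # far above the naive <K-1> = 1.9
```

In the second sequence the naive estimate $\langle K-1 \rangle$ is 1.9, while the correct ratio is dominated by the hubs.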

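Diameter and mean cluster size are determined by $\langle k(k-1) \rangle / \langle k \rangle$

- Mean diameter $L$: $1 + \langle k \rangle + \langle k \rangle \langle k(k-1) \rangle / \langle k \rangle + \dots + \langle k \rangle \left( \langle k(k-1) \rangle / \langle k \rangle \right)^{L-1} = N \;\Rightarrow\; L \approx \log(N / \langle k \rangle) / \log(\langle k(k-1) \rangle / \langle k \rangle) + 1$
- Mean cluster size below $p_c$: $\langle S \rangle = 1 + \langle k \rangle / (1 - \langle k(k-1) \rangle / \langle k \rangle)$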
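Both results follow from the same geometric series of neighbors counted shell by shell. The short derivation below is a sketch; the shorthand $\kappa$ for the branching ratio is introduced here for convenience and is not used in the lecture.

```latex
\begin{align*}
\kappa &\equiv \frac{\langle k(k-1)\rangle}{\langle k\rangle}, \\
N &= 1 + \langle k\rangle + \langle k\rangle\kappa + \dots + \langle k\rangle\kappa^{L-1}
   \approx \langle k\rangle\,\kappa^{L-1} \quad (\kappa > 1)
   \;\Rightarrow\;
   L \approx \frac{\log\left(N/\langle k\rangle\right)}{\log\kappa} + 1, \\
\langle S\rangle &= 1 + \langle k\rangle\left(1 + \kappa + \kappa^{2} + \dots\right)
   = 1 + \frac{\langle k\rangle}{1-\kappa} \quad (\kappa < 1).
\end{align*}
```

Above the threshold the last shell dominates the sum, which gives the logarithmic diameter; below it the series converges, which gives a finite mean cluster size.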

Amplification ratios

- A(dir): 1.08 (E. coli), 0.58 (yeast)
- A(undir): 10.5 (E. coli), 13.4 (yeast)
- A(PPI): ? (E. coli), 26.3 (yeast)

Clustering coefficient $C_\Delta$

- $C_\Delta = 3 N_\Delta / \sum_k n_k\, k(k-1)/2$
- Could be defined for individual nodes or as a function of $k$: $C_\Delta(k) = 3 N_\Delta(k) / \left[ n_k\, k(k-1)/2 \right]$
- $C_\Delta = 1$ cannot be realized if $k$ is heterogeneous
- Needs to be compared to its value in randomized networks with the same degree sequence
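A minimal sketch of the global clustering coefficient defined above, assuming a simple undirected graph (no self-loops, no repeated edges) given as an edge list. The function name and the toy graph are illustrative.

```python
from itertools import combinations

def clustering(edges):
    """Global clustering: 3 * triangles / connected triples."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    # Each triangle is found once from each of its 3 corners, so this sum
    # already equals 3 * (number of triangles).
    closed = sum(1 for n in nbrs
                 for a, b in combinations(nbrs[n], 2) if b in nbrs[a])
    # Connected triples: sum over nodes of k(k-1)/2
    triples = sum(len(s) * (len(s) - 1) // 2 for s in nbrs.values())
    return closed / triples if triples else 0.0

triangle_with_tail = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(clustering(triangle_with_tail))   # 0.6
```

Here the graph has one triangle and five connected triples, so the coefficient is 3/5.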

End lecture 1

Lecture 2

- Protein networks

Places to learn molecular biology

- Molecular Biology of the Cell. Fourth Edition. Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, Peter Walter. Garland Science. 2002.
- DNA from the beginning. http://www.dnaftb.org/
- Online Biology Book. http://gened.emc.maricopa.edu/bio/bio181/BIOBK/BioBookTOC.html
- Kimball's Biology Pages. http://www.ultranet.com/jkimball/BiologyPages/
- Gene expression. http://vlib.org/Science/Cell_Biology/gene_expression.shtml
- Human Genome Project. http://www.ornl.gov/hgmis/
- Microarrays. http://www.gene-chips.com/

From Prof. Michael Hallett (McGill) online lectures

Protein networks

- Nodes: proteins
- Edges: interactions between proteins
- Metabolic (enzymes sharing common metabolites are connected)
- Physical (binding interactions)
- Regulatory and signaling (transcriptional regulation, protein modifications)
- Co-expression networks from microarray data (connect genes with similar expression (abundance) patterns under many conditions)
- Genetic interactions, e.g. synthetic lethal protein pairs (removal of either one of the two proteins doesn't kill the cell, but removal of both does)
- Etc., etc., etc.

Sources of data on protein networks

- Genome-wide experiments
- Binding: two-hybrid (Y2H) and mass-spec (MS) high-throughput techniques
- Transcriptional regulation: ChIP-on-chip, or ChIP-then-SAGE
- Expression, disruption networks: microarrays
- Lethality of genes (including synthetic lethals)
- Gene knockout: yeast
- RNAi: worm, fly
- Many small or intermediate-scale experiments
- All stored in public databases: BIOGRID, DIP, BIND, YPD (no longer public), SGD, Flybase, Ecocyc, etc.

Pathway → network paradigm shift

Images from ResNet3.0 by Ariadne Genomics

MAPK signaling

Inhibition of apoptosis

- Transcription regulatory networks

Transcription factors bind DNA

Activators and repressors

- Depending on the position of the binding site (operator) with respect to the RNA-polymerase binding site (promoter), transcription factors can either activate or repress the production of mRNA from a given gene (transcription) and thus affect the abundance of its protein product

Transcription regulatory networks

Sea urchin embryonic development (endomesoderm up to 30 hours) by Davidson's lab

- How many transcriptional regulators are out there?

Fraction of transcriptional regulators in bacteria

Figure from Erik van Nimwegen, TIG 2003

Complexity of regulation grows with complexity of organism

- $N_R \langle K_{out} \rangle = N \langle K_{in} \rangle$ = number of edges
- $N_R / N = \langle K_{in} \rangle / \langle K_{out} \rangle$ increases with $N$
- $\langle K_{in} \rangle$ grows with $N$
- In bacteria $N_R \sim N^2$ (Stover et al., 2000)
- In eukaryotes $N_R \sim N^{1.3}$ (van Nimwegen, 2002)
- Networks in more complex organisms are more interconnected than in simpler ones

Complexity is manifested in Kin distribution

E. coli vs H. sapiens

Table from Erik van Nimwegen, TIG 2003

Toolbox model

- $N_{TF} = A N^2 \;\Rightarrow\; dN_{TF} = 2AN\,dN \;\Rightarrow\; dN/dN_{TF} = 1/(2AN)$
- In small genomes 100 genes per TF. In large ones only 4!
- A toolbox (e.g. metabolic network) grows linearly with $N$. To handle a new condition ($N_{TF} \to N_{TF} + 1$) one needs fewer and fewer new tools.
- S. Maslov, S. Krishna, K. Sneppen, in preparation

How is it all connected? (beyond degree distribution)

What is unusual about the topology of a given network?

- Look for the number of occurrences of a certain topological pattern
- Compare with a randomized network
- What patterns to look for?
- Number of edges connecting nodes with given degrees (degree-degree correlations)
- Motifs: small subgraphs of 3-4 nodes (in undirected networks: clustering, or triangles)
- Overrepresentation: nature needs them for some function
- Underrepresentation: they are detrimental and nature avoids them

- How to construct a proper random network?

Randomization of a network

Stub reconnection algorithm

- Break every edge into two halves (stubs)
- Randomly reconnect stubs
- Watch for multiple edges!
- For example, in the AS-Internet the two largest hubs would end up being connected by 50 edges (sic!)
- Not adaptable to conserve other low-level topological properties of the network
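The stub procedure and its multiple-edge problem can be sketched as follows. This is an illustrative version: the helper names, the seed, and the toy two-hub network are assumptions, and no attempt is made to repair the defects it reports.

```python
import random
from collections import Counter

def stub_reconnect(edges, seed=0):
    """Cut every edge into two stubs and pair the stubs up at random."""
    rng = random.Random(seed)
    stubs = [node for edge in edges for node in edge]
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))

def count_defects(edges):
    """Return (self-loops, surplus parallel edges) in an edge list."""
    loops = sum(1 for u, v in edges if u == v)
    pairs = Counter(frozenset(e) for e in edges if e[0] != e[1])
    surplus = sum(c - 1 for c in pairs.values())
    return loops, surplus

# Hypothetical toy network: two hubs (0 and 1) sharing 30 leaf neighbors.
edges = [(hub, leaf) for hub in (0, 1) for leaf in range(2, 32)]
shuffled = stub_reconnect(edges)
print(count_defects(shuffled))   # self-loops and repeated edges, if any
```

Every node keeps its degree exactly, but high-degree nodes hold many stubs and are therefore likely to be paired with each other more than once.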

Local rewiring algorithm

- R. Kannan, P. Tetali, and S. Vempala, Random Structures and Algorithms (1999)
- SM, K. Sneppen, Science (2002)
- Randomly select and rewire two edges
- Repeat many times
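One common way to realize the "select and rewire two edges" step is an edge swap that exchanges partners and rejects any move creating a self-loop or a multiple edge. The sketch below makes that assumption; it is not taken from the cited papers, and the function name, attempt count, and ring example are illustrative.

```python
import random
from collections import Counter

def local_rewire(edges, n_attempts, seed=0):
    """Degree-preserving randomization of an undirected simple graph."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:            # pick one of the two orientations
            c, d = d, c
        # Proposed swap: a-b, c-d  ->  a-d, c-b
        if a == d or c == b:
            continue                      # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue                      # would create a multiple edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

ring = [(i, (i + 1) % 20) for i in range(20)]
rewired = local_rewire(ring, n_attempts=200)
print(rewired[:5])
```

Because each accepted swap only exchanges partners, every node keeps its degree, and the rejection rule keeps the graph simple throughout.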

Metropolis rewiring algorithm

energy $E$

energy $E + \Delta E$

SM, K. Sneppen, cond-mat preprint (2002), Physica A (2004)

- Randomly select two edges
- Calculate the change $\Delta E$ in the energy function $E = (N_{actual} - N_{desired})^2 / N_{desired}$
- Rewire with probability $p = \exp(-\Delta E / T)$
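The three steps above can be sketched for one concrete choice of the conserved quantity. Here $N$ is taken to be the number of triangles, which is an assumption made for illustration; the temperature, attempt count, and ring-lattice example are likewise hypothetical. Triangles are recounted from scratch after every trial move, which is simple but slow.

```python
import math
import random

def count_triangles(adj):
    """Number of triangles in a graph stored as {node: set(neighbors)}."""
    return sum(len(adj[u] & adj[v]) for u in adj for v in adj[u] if u < v) // 3

def toggle(adj, pairs, add):
    """Add or remove the given undirected edges in the adjacency sets."""
    for x, y in pairs:
        if add:
            adj[x].add(y); adj[y].add(x)
        else:
            adj[x].discard(y); adj[y].discard(x)

def metropolis_rewire(edges, n_desired, temperature=0.5,
                      n_attempts=2000, seed=0):
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def energy(n_actual):
        return (n_actual - n_desired) ** 2 / n_desired

    e_now = energy(count_triangles(adj))
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c
        if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
            continue                      # keep the graph simple
        old, new = ((a, b), (c, d)), ((a, d), (c, b))
        toggle(adj, old, add=False)       # tentatively rewire
        toggle(adj, new, add=True)
        delta = energy(count_triangles(adj)) - e_now
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            edges[i], edges[j] = new
            e_now += delta
        else:                             # rejected: undo the rewiring
            toggle(adj, new, add=False)
            toggle(adj, old, add=True)
    return edges, count_triangles(adj)

# Ring lattice: each node linked to its 2 nearest neighbors on either side.
lattice = [(i, (i + s) % 30) for i in range(30) for s in (1, 2)]
rewired, triangles = metropolis_rewire(lattice, n_desired=10)
print(triangles)
```

Moves that lower the energy are always accepted, since $\exp(-\Delta E / T) \geq 1$ for $\Delta E \leq 0$; uphill moves are accepted with the stated probability.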

- Degree-degree correlations

Central vs peripheral network architecture

random

A. Trusina, P. Minnhagen, SM, K. Sneppen, Phys. Rev. Lett. 92, 17870 (2004)

What is the case for the protein interaction network?

SM, K. Sneppen, Science 296, 910 (2002)

Correlation profile

- Count $N(k_0, k_1)$: the number of links between nodes with connectivities $k_0$ and $k_1$
- Compare it to $N_r(k_0, k_1)$: the same property in a random network
- Qualitative features are very noise-tolerant with respect to both false positives and false negatives


Correlation profile of the protein interaction network

$R(k_0, k_1) = N(k_0, k_1) / N_r(k_0, k_1)$

$Z(k_0, k_1) = \left( N(k_0, k_1) - N_r(k_0, k_1) \right) / \Delta N_r(k_0, k_1)$
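The two ratios can be computed directly from edge lists. The sketch below assumes that $N_r$ is the mean and $\Delta N_r$ the (population) standard deviation over an ensemble of randomized networks with the same degrees. The two "randomized" edge lists are written by hand for illustration; in practice they would come from a rewiring procedure.

```python
from collections import Counter
from statistics import mean, pstdev

def pair_counts(edges):
    """N(k0, k1): edges joining nodes of degree k0 and k1 (k0 <= k1)."""
    deg = Counter(node for edge in edges for node in edge)
    return Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)

def correlation_profile(real_edges, randomized_edge_lists):
    """Map each degree pair seen in the real network to (R, Z)."""
    real = pair_counts(real_edges)
    rand = [pair_counts(e) for e in randomized_edge_lists]
    profile = {}
    for pair, n in real.items():
        samples = [r.get(pair, 0) for r in rand]
        n_r, dn_r = mean(samples), pstdev(samples)
        ratio = n / n_r if n_r else float("inf")
        z = (n - n_r) / dn_r if dn_r else float("nan")
        profile[pair] = (ratio, z)
    return profile

# Hypothetical 5-node network and two degree-preserving alternatives
real = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
alt1 = [(0, 1), (0, 2), (0, 3), (1, 4), (3, 2)]
alt2 = [(0, 4), (0, 1), (0, 2), (1, 3), (2, 3)]

for pair, (r, z) in sorted(correlation_profile(real, [alt1, alt2]).items()):
    print(pair, round(r, 2), round(z, 2))
```

With only two randomized samples the Z values are not meaningful statistics; a real analysis would average over many rewired networks.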

A similar profile is seen in the yeast regulatory network

Hubs may act within a module, or connect modules

- Party hub
- simultaneous interactions
- tends to be within the same module
- Date hub
- sequential interactions
- connect different modules

Han et al., Nature 430, 88 (2004)


Correlation profile of the yeast regulatory network

$R(k_{out}, k_{in}) = N(k_{out}, k_{in}) / N_r(k_{out}, k_{in})$

$Z(k_{out}, k_{in}) = \left( N(k_{out}, k_{in}) - N_r(k_{out}, k_{in}) \right) / \Delta N_r(k_{out}, k_{in})$

Some scale-free networks may appear similar

In both networks the degree distribution is scale-free, $P(k) \sim k^{-\gamma}$ with $\gamma \approx 2.2$-$2.5$

But correlation profiles give them unique identities

Internet

Protein interactions

- Small network motifs (Uri Alon and his group)

All 3 node motifs

Motifs can overlap in the network

motif to be found

graph

motif matches in the target graph

http://mavisto.ipk-gatersleben.de/frequency_concepts.html

Detection of important network motifs

- Technique
- construct many random graphs with the same number of nodes and degree distribution
- count the number of motifs in those graphs
- calculate the Z score, which quantifies the probability that the same or larger number of motifs in the real-world network could have occurred in a random one
- Software available
- http://www.weizmann.ac.il/mcb/UriAlon/

What the Z score means

$z_x = (x - \mu_x) / \sigma_x$

- $x$: number of times the motif appeared in the real network
- $\mu_x$: mean number of times the motif appeared in the random graphs
- $\sigma_x$: standard deviation of the number of times the motif appeared in the random graphs
- The probability of observing a Z score of 2 or more is 0.02275
- In the context of motifs:
- $Z > 0$: motif occurs more often than in random graphs
- $Z < 0$: motif occurs less often than in random graphs
- $Z > 1.65$: only a 5% chance of random occurrence
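A short numerical check of the Z score and the quoted tail probabilities, assuming the motif counts in random graphs are approximately normally distributed. The random-graph counts are made-up numbers for illustration, and using the population standard deviation is an assumption.

```python
import math
from statistics import mean, pstdev

def z_score(x, random_counts):
    """z = (x - mean) / std of the motif count over the random graphs."""
    return (x - mean(random_counts)) / pstdev(random_counts)

def upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

random_counts = [8, 10, 12, 9, 11]     # hypothetical counts in 5 random graphs
print(z_score(13, random_counts))      # about 2.12
print(round(upper_tail(2.0), 5))       # 0.02275
print(round(upper_tail(1.65), 3))      # 0.049
```

For small or skewed count distributions the normal approximation can be poor, in which case the empirical fraction of random graphs with at least as many motifs is a safer estimate.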

Examples of network motifs (3 nodes)

- Feed forward loop
- Found in many transcriptional regulatory

networks
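As an illustration of what counting this motif involves, the sketch below counts feed-forward loops in a directed edge list. It counts every triple with the three required edges, whether or not extra edges exist among the same nodes, which is a simplifying assumption. The function name and the toy regulation list are hypothetical.

```python
def count_feed_forward_loops(edges):
    """Count triples (x, y, z) with edges x->y, y->z and x->z."""
    out = {}
    for src, dst in edges:
        if src != dst:                    # ignore self-regulation
            out.setdefault(src, set()).add(dst)
    count = 0
    for x, targets in out.items():
        for y in targets:
            for z in out.get(y, ()):
                if z != x and z in targets:
                    count += 1
    return count

regulation = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(count_feed_forward_loops(regulation))   # 1
```

Running the same count on degree-preserving randomized versions of the network gives the baseline needed for a Z score.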

Possible functional role of a coherent feed-forward loop

- Noise filtering: short pulses in the input do not result in turning on Z
- To function, it needs a time delay (about 0.5 hrs for bacterial transcription)

All 4-node subgraphs (computational expense increases with the size of the graph!)

Higher-order motifs

- 4-node motifs contain some 3-node motifs
- One needs to be careful when calculating over-representation
- Alon and co-authors use our Metropolis algorithm to generate networks with a given number of low-level motifs

Table 1 from R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, U. Alon, Network Motifs: Simple Building Blocks of Complex Networks, Science 298, 824-827 (2002)

Examples of network motifs (4 nodes)

- Parallel paths are overrepresented
- Neural networks
- Food webs

Finding classes of graphs based on their motif profiles

THE END