ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING - PowerPoint PPT Presentation

Slides: 37
Provided by: Yase7
Learn more at: https://cs.nyu.edu
Transcript and Presenter's Notes

1
ANALYSIS OF GENETIC NETWORKS USING ATTRIBUTED GRAPH MATCHING
2
BACKGROUND
  • Completion of sequencing projects
  • Need for functional discovery
  • Emerging area of study: large-scale genomic analysis
  • Similarity of living systems

3
GENETIC NETWORKS
  • Modelling genetic networks
  • Interaction of genes and proteins
  • Relationship between topology and function

4
MOTIVATION
  • Common biological processes
  • Comparison of networks
  • Discovering missing interactions
  • Discovering missing genes

5
GRAPH MATCHING
[Figure: graphs G1 and G2 are matched by a search-based algorithm with pruning techniques]
6
ROADMAP
  • Scale-Free Networks
  • Modelling Genetic Networks
  • Graph Matching
  • Algorithm
  • Results

7
SCALE-FREE NETWORKS
8
COMPLEX NETWORKS
  • Small-world model
  • WWW
  • Human acquaintances network
  • Citation networks
  • Biological networks

9
SMALL-WORLD
  • Features
  • Characteristic path length
  • Clustering coefficient
  • Sparseness
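
The first two features above can be computed directly from an adjacency list. The following is a minimal sketch, not the presenter's code: the function names and the toy graph are illustrative assumptions, and the graph is taken to be undirected and unweighted.

```python
from collections import deque

def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than the source
    return total / pairs if pairs else 0.0

def clustering_coefficient(adj):
    """Average, over nodes, of the fraction of neighbor pairs that are linked."""
    coeffs = []
    for u, nbrs in adj.items():
        nbrs = list(nbrs)
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # convention: nodes with < 2 neighbors count as 0
            continue
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
path_length = characteristic_path_length(toy)
clustering = clustering_coefficient(toy)
```

A small-world network would show a path length close to that of a random graph of the same size together with a much higher clustering coefficient.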

10
SMALL-WORLD
  • Somewhere in between regular and random graphs

11
SMALL-WORLD
  • Highly clustered
  • Short diameter

12
SCALE-FREE NETWORKS
  • Complex networks: biological, social, WWW, power grid, citation, etc.
  • Power-law connectivity
  • P(k) ~ k^(-a)
  • Hubs - authorities

13
SCALE-FREE NETWORKS
  • Application for testing scale-free behavior
  • Yeast
  • Helicobacter pylori
  • Mycoplasma pneumoniae
  • Mycoplasma genitalium
  • Linear log-log graph
  • Slope gives the exponent a

14
SCALE-FREE NETWORKS
  • Slope is calculated by the least-squares method
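
As a rough illustration of the slope calculation, the sketch below fits a straight line to log P(k) versus log k by ordinary least squares and reports the exponent a as the negated slope. The function names are assumptions made for this example, and the slides do not say whether any binning or cutoff was applied, so none is used here.

```python
import math
from collections import Counter

def degree_distribution(degrees):
    """Return {k: P(k)} for node degrees k > 0."""
    counts = Counter(d for d in degrees if d > 0)
    n = sum(counts.values())
    return {k: c / n for k, c in counts.items()}

def fit_power_law_exponent(pk):
    """Least-squares slope of log10 P(k) against log10 k, returned as a = -slope.

    Needs at least two distinct degrees k, otherwise the slope is undefined.
    """
    xs = [math.log10(k) for k in pk]
    ys = [math.log10(p) for p in pk.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -(sxy / sxx)

# Synthetic check: an exact power law with a = 2.5 should be recovered.
synthetic = {k: k ** -2.5 for k in range(1, 50)}
estimated_a = fit_power_law_exponent(synthetic)
```

On real interaction data the points scatter around the line, so the fitted exponent is only an estimate.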

15
TOPOLOGY AND FUNCTIONALITY
  • Small diameter
    • Ease of dissemination of information
    • Ease of restoring after disturbance
  • Cliquishness
    • Alternate paths are found
  • Heterogeneity
    • Random removal does not affect the network
    • Hubs are vulnerable to attack

16
BIOLOGICAL ASPECTS
  • Multifunctionality
  • Grouped into functional units
  • Stability
  • Reason: most of the interactions are between hubs and authorities

17
MODELLING GENETIC NETWORKS
18
TYPES OF GENETIC NETWORKS
  • Categorized by data sources
  • Metabolic pathways
  • Gene expression arrays
  • Protein interactions
  • Gene interactions

19
INTERACTION MAPS
  • High level perspective
  • Nodes: genes or proteins
  • Edges: presence of an interaction
  • Data sources
  • Two-hybrid analysis
  • Fusion analysis
  • Chromosomal proximity
  • Phylogenetic analysis

20
GRAPH MATCHING
21
PROBLEM DEFINITION
  • Attributed Relational Graph (ARG)
  • G = (V, E, X)
  • V = {v1, v2, ..., vn}: nodes
  • E = {e1, e2, ..., em}: edges
  • X = {x1, x2, ..., xn}: attributes
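
One way to hold such a graph in memory is sketched below. The class name, method names, node names, and attribute values are hypothetical and chosen only for illustration; the definition above says only that each of the n nodes carries an attribute.

```python
from dataclasses import dataclass, field

@dataclass
class AttributedGraph:
    """Minimal ARG container: G = (V, E, X), undirected."""
    nodes: list                                 # V
    edges: set = field(default_factory=set)     # E, stored as frozenset pairs
    attrs: dict = field(default_factory=dict)   # X, node -> attribute vector

    def add_edge(self, u, v):
        self.edges.add(frozenset((u, v)))

    def neighbors(self, u):
        return {w for e in self.edges if u in e for w in e if w != u}

    def degree(self, u):
        return len(self.neighbors(u))

# Hypothetical three-node chain g1 - g2 - g3 with placeholder attributes.
g = AttributedGraph(nodes=["g1", "g2", "g3"])
g.add_edge("g1", "g2")
g.add_edge("g2", "g3")
g.attrs = {"g1": [0.1, 0.9], "g2": [0.5, 0.5], "g3": [0.8, 0.2]}
```

Storing edges as unordered pairs keeps the structure symmetric, which suits interaction maps where an edge only records the presence of an interaction.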

22
INEXACT SUBGRAPH MATCHING
  • Allow for
  • Mismatching attribute values
  • Missing nodes
  • Missing links
  • Also called error-correcting subgraph isomorphism
  • NP-Complete

23
SEARCH TECHNIQUES
  • Cost function
  • Pruning (Structure Constraints)
  • Backtracking
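
These three ingredients can be combined as in the following branch-and-bound sketch. It is an illustrative assumption, not the tool's actual algorithm: every node of G1 is mapped to a distinct node of G2 or to None (a missing node), the cost adds attribute mismatch and missing-link penalties, and a branch is pruned once its partial cost reaches the best complete cost found so far. The penalty values and the `attr_cost` callback are hypothetical, and `attr_cost` must be non-negative for the pruning to be safe.

```python
def inexact_match(g1_nodes, g1_edges, g2_nodes, g2_edges, attr_cost,
                  missing_node=1.0, missing_edge=1.0):
    """Return (mapping, cost) for the cheapest mapping of G1 into G2."""
    e1 = {frozenset(e) for e in g1_edges}
    e2 = {frozenset(e) for e in g2_edges}
    best = {"cost": float("inf"), "mapping": None}

    def edge_penalty(u, target, mapping):
        # G1 edges from u to already-mapped nodes that have no counterpart in G2.
        penalty = 0.0
        for w, tw in mapping.items():
            if frozenset((u, w)) in e1:
                if target is None or tw is None or frozenset((target, tw)) not in e2:
                    penalty += missing_edge
        return penalty

    def search(i, mapping, used, cost):
        if cost >= best["cost"]:
            return  # pruning: this branch cannot beat the best known match
        if i == len(g1_nodes):
            best["cost"], best["mapping"] = cost, dict(mapping)
            return
        u = g1_nodes[i]
        for target in list(g2_nodes) + [None]:
            if target is not None and target in used:
                continue
            step = missing_node if target is None else attr_cost(u, target)
            step += edge_penalty(u, target, mapping)
            mapping[u] = target
            if target is not None:
                used.add(target)
            search(i + 1, mapping, used, cost + step)
            # backtracking: undo the assignment and try the next candidate
            del mapping[u]
            if target is not None:
                used.discard(target)

    search(0, {}, set(), 0.0)
    return best["mapping"], best["cost"]

# Hypothetical toy input: scalar attributes, cost = absolute difference.
x1 = {"a": 0, "b": 1, "c": 2}
x2 = {"x": 0, "y": 1, "z": 2, "w": 5}
cost_fn = lambda u, t: abs(x1[u] - x2[t])
mapping, cost = inexact_match(
    ["a", "b", "c"], [("a", "b"), ("b", "c")],
    ["x", "y", "z", "w"], [("x", "y"), ("y", "z"), ("z", "w")],
    cost_fn)
```

The search is still exponential in the worst case, which is consistent with the problem being NP-complete; pruning only reduces the part of the search tree that is actually visited.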

24
ATTRIBUTED GRAPH MATCHING TOOL
25
ATTRIBUTE MATCHING
  • Amino acid sequence content (composition)
  • Array of 20: percentage of each amino acid
  • Amino acids grouped into classes: array of 6
  • Amino acid triples grouped into classes: array of 216 (6 x 6 x 6)
  • Example sequence: MKVLNKNEL
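
The three attribute arrays can be sketched as follows. The slide does not say which six classes were used, whether triples overlap, or how two arrays are compared, so the grouping in `CLASSES`, the sliding window of overlapping triples, and the L1 distance are all assumptions made for this example. Sequences are assumed to be non-empty and to contain only the 20 standard amino acids.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical six-class grouping (not taken from the slides).
CLASSES = ["AVLIM", "FWY", "STNQ", "KRH", "DE", "CGP"]
CLASS_OF = {aa: i for i, group in enumerate(CLASSES) for aa in group}

def composition(seq):
    """Array of 20: percentage of each amino acid in the sequence."""
    return [100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

def class_composition(seq):
    """Array of 6: percentage of residues falling in each class."""
    vec = [0.0] * 6
    for aa in seq:
        vec[CLASS_OF[aa]] += 100.0 / len(seq)
    return vec

def triple_composition(seq):
    """Array of 216 (6 x 6 x 6): percentage of each class triple.

    Assumes overlapping triples read with a sliding window of width 3.
    """
    vec = [0.0] * 216
    n = len(seq) - 2
    for i in range(n):
        a, b, c = (CLASS_OF[x] for x in seq[i:i + 3])
        vec[36 * a + 6 * b + c] += 100.0 / n
    return vec

def attribute_distance(x, y):
    """L1 distance between two attribute arrays (an assumed measure)."""
    return sum(abs(p - q) for p, q in zip(x, y))

seq = "MKVLNKNEL"  # example sequence shown on the slide
comp20 = composition(seq)
comp6 = class_composition(seq)
comp216 = triple_composition(seq)
```

A smaller distance between the arrays of two genes would then suggest a closer attribute match, to be weighed together with the structural constraints.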
26
ATTRIBUTE MATCHING
[Figure: difference in amino acid composition values of gene pairs for M. genitalium and M. pneumoniae; axes: score, observations]
27
STRUCTURAL CONSTRAINTS
  • Effect of scale-free behaviour
  • Connectivity information: highly heterogeneous, thus start with the most connected node and work around it
  • Pruning strategy: comparability is determined by the power law

28
STRUCTURAL CONSTRAINTS
  • Neighborhood connectivity
  • Choose the neighbor at the next stage
  • Backtracking
  • Component by component
  • Go back to the neighbor with the most
    connectivity within the component
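
A possible reading of this strategy is the greedy visiting order sketched below: each component is entered at its most connected node, and the next node is always the most connected unvisited neighbor of the part visited so far. The function name, the tie-breaking by node name, and the toy graph are assumptions for illustration; the actual tool may order candidates differently.

```python
def visiting_order(adj):
    """Order nodes hub-first, component by component."""
    def rank(n):
        return (-len(adj[n]), str(n))  # higher degree first, then by name

    order, seen = [], set()
    for seed in sorted(adj, key=rank):
        if seed in seen:
            continue  # this node's component was already explored
        frontier = {seed}
        while frontier:
            # Most connected unvisited node adjacent to the visited region.
            node = min(frontier, key=rank)
            frontier.remove(node)
            seen.add(node)
            order.append(node)
            frontier.update(v for v in adj[node] if v not in seen)
    return order

# Hypothetical toy graph: hub h with neighbors a, b, c; a also links to d;
# a separate two-node component x - y.
toy = {
    "h": {"a", "b", "c"}, "a": {"h", "d"}, "b": {"h"}, "c": {"h"},
    "d": {"a"}, "x": {"y"}, "y": {"x"},
}
order = visiting_order(toy)
```

Matching nodes in this order means the rare, highly connected hubs are fixed first, so an incompatible choice is detected early and fewer candidates remain for the many low-degree nodes.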

29
TEST CASE
  • Mycoplasma genitalium
  • Smallest genome (470 ORFs)
  • Mycoplasma pneumoniae
  • Very similar, superset (688 ORFs)

30
TEST CASE...
  • Mycoplasma genitalium
  • 232 nodes
  • 211 links
  • Mycoplasma pneumoniae
  • 267 nodes
  • 257 links
  • Inputs
  • MGE links
  • MPN links
  • MGE synonyms
  • MPN synonyms
  • MGE amino acid sequence
  • MPN amino acid sequence

31
RESULTS
[Figure: result networks for MGE and MPN]
32
DISCOVERY OF MISSING DATA
  • Missing link
  • Link between MPN632 and MPN637 is missing in our data but exists in the literature

33
DISCOVERY OF MISSING DATA
  • Missing node with known COG
  • MPN236 --- MPN237 --- MPN238 --- MPN678
  • MG098 --- MG099 --- MG100 --- MG459
  • MG459 is the ortholog of MPN678

34
DISCOVERY OF MISSING DATA
  • Missing node without known ortholog

35
CONCLUSION
  • Large-scale genomics
  • Interaction data captures system structure and
    dynamics
  • Graph matching exploits the scale-free
    characteristics
  • Novel interactions and genes can be identified

36
ACKNOWLEDGEMENT
  • YASEMIN TÜRKELI