Comparative Genome Maps

Provided by: vicgol
1
Comparative Genome Maps
  • CSCI 7000-005 Computational Genomics
  • Debra Goldberg
  • debg_at_hms.harvard.edu

2
What is a comparative map?
3
Why construct comparative maps?
  • Identify and isolate genes
    • Crops: drought resistance, yield, nutrition, ...
    • Human: disease genes, drug response, ...
  • Infer ancestral relationships
  • Discover principles of evolution
    • Chromosome
    • Gene family
  • "Key to understanding the human genome"

4
Why automate?
  • Time consuming, laborious
  • Needs to be redone frequently
  • Codify a common set of principles
  • Nadeau and Sankoff warn of the arbitrary nature of
    comparative map construction

5
Definitions
  • Marker: identifiable chromosomal locus
  • Homology: genes with a common ancestor
  • Homeology: chromosomal regions derived from a
    common ancestral linkage group
  • Synteny: loci on the same chromosome
  • Colinearity: syntenic regions with conserved gene
    order

6
Input/Output
  • Input
  • genetic maps of 2 species
  • marker/gene correspondences (homologs)
  • Output
  • a comparative map
  • homeologies identified

7
Map construction
Go from this ... to this
[Figure: Maize 1 (target), Rice (base). Wilson et al., Genetics 1999]
8
Chromosome labeling
[Figure: Maize 1 (target), Rice (base). Wilson et al., Genetics 1999]
9
A natural model?
[Figure: Maize 1 (target), Rice (base). Wilson et al., Genetics 1999]
10
Scoring
[Figure: scoring example; labels shown: 10L, 3L]
11
Assumptions
  • Accept published marker order
  • All linkage groups of base are unique
  • Simplistic homeology criteria
  • At least one homeologous region

12
A natural model?
13
A natural model?
14
A natural model?
15
A natural model?
16
Dynamic programming
  • $l_i$: location of the homolog to marker $i$
  • $S_{i,a}$: penalty (score) for an optimal labeling
    of the submap from marker $i$ to the end, when
    labeling begins with label $a$

[Diagram: markers 1 ... i ... n, with label a at marker i]
17
Recurrence relation
  • $S_{n,a} = m\,\delta(a, l_n)$
  • $S_{i,a} = m\,\delta(a, l_i) + \min_{b \in L} \big( S_{i+1,b} + s\,\delta(a,b) \big)$
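
The linear recurrence on this slide can be sketched in Python as below. This is an illustrative reading of the slide, not the presenter's implementation. It assumes that `m` is the penalty per mismatched marker, `s` is the penalty per segment boundary, and the distance between two labels is 0 when they are equal and 1 otherwise. The function name and the example data are hypothetical.

```python
def linear_labeling(homolog_locs, labels, m=1, s=2):
    """Label each marker with a base chromosome so that mismatches (cost m)
    plus segment boundaries (cost s) are minimal, per the linear recurrence.

    homolog_locs[i] is the base chromosome holding the homolog of marker i.
    Returns (best_score, labeling).
    """
    n = len(homolog_locs)

    def delta(a, b):
        # Assumed distance between two labels: 0 if equal, 1 otherwise.
        return 0 if a == b else 1

    # S[i][a]: best score for markers i..n-1 when marker i is labeled a.
    S = [dict() for _ in range(n)]
    nxt = [dict() for _ in range(n)]  # best label for marker i+1, for traceback
    for a in labels:
        S[n - 1][a] = m * delta(a, homolog_locs[n - 1])
    for i in range(n - 2, -1, -1):
        for a in labels:
            b = min(labels, key=lambda b: S[i + 1][b] + s * delta(a, b))
            S[i][a] = m * delta(a, homolog_locs[i]) + S[i + 1][b] + s * delta(a, b)
            nxt[i][a] = b

    # Trace back from the best starting label.
    a = min(labels, key=lambda a: S[0][a])
    best = S[0][a]
    labeling = [a]
    for i in range(n - 1):
        a = nxt[i][a]
        labeling.append(a)
    return best, labeling


# Hypothetical example: one stray marker inside a run of "3" homologs.
score, labeling = linear_labeling(["3", "3", "10", "3", "3"], ["3", "10"])
```

With `s = 2` a single stray marker is cheaper to treat as a mismatch than to open and close a new segment, so the whole submap keeps one label.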
18
Problem with linear model
  • $s = 2$

19
The stack model
[Figure: stack diagrams with segment labels a, b, c, d, e, f]
  • Segment at top of the stack can be:
    • pushed (remembered), later popped
    • replaced
  • Push and replace cost $s$; pop is free.

20
Scoring
21
Dynamic programming
  • $S_{i,j,a}$: score for an optimal labeling of the
    submap from marker $i$ to marker $j$
    when labeling begins with label $a$, i.e.,
    marker $i$ is labeled $a$

[Diagram: markers 1 ... i ... j ... n, with label a at marker i]
22
Recurrence relation
  • $S_{i,i,a} = m\,\delta(a, l_i)$
  • $S_{i,j,a} = \min \Big\{ m\,\delta(a, l_i) + \min_{b \in L} \big( S_{i+1,j,b} + s\,\delta(a,b) \big),\ \min_{i<k<j} \big( S_{i,k,a} + S_{k+1,j,a} \big) \Big\}$
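
A minimal sketch of the stack-model recurrence, written as a memoized interval recursion. As with any reconstruction from slides, it rests on assumptions: `m` is the mismatched-marker penalty, `s` is the push/replace penalty, and the label distance is 0 for equal labels and 1 otherwise. The function name and test data are hypothetical.

```python
from functools import lru_cache


def stack_labeling_score(homolog_locs, labels, m=1, s=2):
    """Score of an optimal labeling under the stack model.

    S(i, j, a) is the best score for markers i..j when marker i is labeled a.
    Either marker i is followed by a labeling of i+1..j that starts with some
    label b (paying s if b differs from a), or the submap splits at k so that
    both i..k and k+1..j start with label a (returning to a is a free pop).
    """
    n = len(homolog_locs)

    def delta(a, b):
        # Assumed distance between two labels: 0 if equal, 1 otherwise.
        return 0 if a == b else 1

    @lru_cache(maxsize=None)
    def S(i, j, a):
        mismatch = m * delta(a, homolog_locs[i])
        if i == j:
            return mismatch
        best = mismatch + min(S(i + 1, j, b) + s * delta(a, b) for b in labels)
        for k in range(i + 1, j):  # split points with i < k < j
            best = min(best, S(i, k, a) + S(k + 1, j, a))
        return best

    return min(S(0, n - 1, a) for a in labels)


# Hypothetical example: a run of "b" homologs inserted inside an "a" segment.
score = stack_labeling_score(list("aabbbaa"), ["a", "b"], m=1, s=2)
```

In this example the inserted run costs one push (`s = 2`) and a free pop, whereas a linear model would pay either two boundaries or three mismatches.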
23
Results: infers evolutionary events
[Figure: Maize 1 (target), Rice (base). Wilson et al.]
24
Problem: incomplete input
  • Gene order not always fully resolved.
  • Co-located genes can be ordered to give most
    parsimonious labeling.

25
The reordering algorithm
  • Uses a compression scheme
  • Within a megalocus, group genes by location of
    related gene.
  • Order these groups
  • First, last groups interact with nearby genes
  • Any ordering of internal groups is equally
    parsimonious
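
The grouping step can be illustrated with a short Python sketch. It only shows the compression idea from the bullets above: genes in a megalocus are grouped by the location of their related gene, and only the first and last groups matter to the neighbors. The greedy choice of first and last group here is a simplifying assumption for illustration; the slides describe choosing the ordering within the dynamic program. All names and data are hypothetical.

```python
from collections import defaultdict


def group_megalocus(markers):
    """Group co-located markers by the location of their homolog.

    markers: list of (marker_name, homolog_location) pairs in one megalocus.
    """
    groups = defaultdict(list)
    for name, loc in markers:
        groups[loc].append(name)
    return dict(groups)


def order_groups(groups, left_label=None, right_label=None):
    """Put the group matching the left neighbor first and the group matching
    the right neighbor last. Internal groups are sorted only to make the
    output deterministic; any internal order is treated as equivalent."""
    first = left_label if left_label in groups else None
    last = right_label if (right_label in groups and right_label != first) else None
    middle = [loc for loc in sorted(groups) if loc not in (first, last)]
    ordered = ([first] if first is not None else []) + middle
    ordered += [last] if last is not None else []
    return [(loc, groups[loc]) for loc in ordered]


# Hypothetical megalocus with four unordered genes.
groups = group_megalocus([("g1", "3"), ("g2", "10"), ("g3", "3"), ("g4", "7")])
ordering = order_groups(groups, left_label="10", right_label="3")
```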

26
The reordering algorithm
27
The reordering algorithm
28
Definitions
  • $\delta$ extended to the distance to a set $A$ of labels:
    $\delta(a, A) = 0$ if $a \in A$, $1$ otherwise
  • $S$: the set of indices of supernode start
    elements
  • For simplicity, refer to a supernode by its index $i \in S$
29
Definitions
  • For $i \in S$:
  • $n_i$: number of markers in $i$
  • $n_i(a)$: number of markers in $i$ with a homolog on $a$
  • $l_i$: set of labels matching markers in $i$
  • $l_i = \{ a \in L : n_i(a) \ge 1 \}$
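
As a small illustration of these per-supernode quantities, the sketch below computes the marker count, the per-label counts, and the set of represented labels from a list of homolog locations, along with the label-to-set distance (0 if the label is in the set, 1 otherwise). Function and variable names are hypothetical.

```python
from collections import Counter


def delta_set(a, label_set):
    """Distance from label a to a set of labels: 0 if a is in the set, else 1."""
    return 0 if a in label_set else 1


def supernode_stats(homolog_locs):
    """Summarize one supernode given the homolog location of each marker in it.

    Returns (n, counts, label_set):
      n         -- number of markers in the supernode
      counts    -- counts[a] = number of markers with a homolog on a
      label_set -- labels a with at least one matching marker
    """
    counts = Counter(homolog_locs)
    n = len(homolog_locs)
    label_set = {a for a, c in counts.items() if c >= 1}
    return n, counts, label_set


# Hypothetical supernode with four markers.
n, counts, label_set = supernode_stats(["3", "10", "3", "7"])
```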

30
Definitions
  • $p_i(c)$ gives the mismatched marker and segment
    boundary penalties for label $c$

31
Definitions
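  • $p(i,a,b)$ gives the total mismatched marker and
    segment boundary penalties attributed to hidden
    markers

$$p(i,a,b) = \begin{cases}
\sum_{c \notin \{a,b\}} p_i(c) + m\,\varepsilon_i(a,b) & \text{for } i \in S,\ a \ne b \\
\sum_{c \ne a} m\,n_i(c) + m\,\varepsilon_i(a,b) & \text{for } i \in S,\ a = b \\
0 & \text{otherwise}
\end{cases}$$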
32
Definitions
  • For $i \in S$:
  • $\Delta_i(a,b)$: number of labels in $\{a,b\}$ without a matching
    marker in $i$
  • $\Delta_i(a,b) = \delta(a, l_i) + \delta(b, l_i)$
  • $\Delta_i(a,b) \in \{0, 1, 2\}$
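
The count defined on this slide is the sum of two label-to-set distances, which a few lines of Python make concrete. This sketch assumes the set distance is 0 when the label is in the set and 1 otherwise; the function names are hypothetical.

```python
def delta_set(a, label_set):
    """Distance from label a to a set of labels: 0 if a is in the set, else 1."""
    return 0 if a in label_set else 1


def unmatched_label_count(a, b, label_set):
    """Number of the labels a and b that have no matching marker in the
    supernode whose represented labels are label_set. Always 0, 1, or 2."""
    return delta_set(a, label_set) + delta_set(b, label_set)


# Hypothetical supernode whose markers have homologs on "3" and "7".
represented = {"3", "7"}
both_present = unmatched_label_count("3", "7", represented)
one_missing = unmatched_label_count("3", "10", represented)
same_label_missing = unmatched_label_count("5", "5", represented)
```

Note that when both arguments are the same label, the count can only be 0 or 2.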

33
Definitions
  • $\varepsilon_i(a,b)$ corrects for mismatched marker penalties
    assigned twice for the same marker, in the recurrence
    and in $p(i,a,b)$
  • For example:
  • $\varepsilon_i(a,b) = 0$ if $\Delta_i(a,b) = 0$ (if $a$, $b$ are both
    represented in the supernode)
  • $\varepsilon_i(a,a) = -2$ if $\Delta_i(a,a) > 0$ (if $a$ is not
    represented in the supernode)

34
Recurrence relation
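  • $S_{i,i,a} = m\,\delta(a, l_i)$
  • $S_{i,j,a} = \min \Big\{ m\,\delta(a, l_i) + \min_{b \in L} \big( S_{i+1,j,b} + s\,\delta(a,b) + p(i,a,b) \big),\ \min_{i<k<j,\ k \notin S} \big( S_{i,k,a} + S_{k+1,j,a} \big) \Big\}$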
35
Results: fewer mismatches
  • stack reordering

Mouse 5 (target) Human (base)
36
Results: mismatches placed between segments
  • stack reordering

Mouse 8 (target) Human (base)
37
Results: detects new segments
  • stack reordering

Mouse 13 (target) Human (base)
38
Summary
  • Finds optimal comparative map
  • Arranges markers in most parsimonious way
  • First algorithm to use megalocus data
  • Fast, objective, simple to use
  • Biologically meaningful results

39
Summary
  • Global view
  • Biologically meaningful results
  • Provides testable hypotheses
  • Robust
  • not species-specific
  • high/low resolution, genetic/physical maps
  • stable to errors in marker order

40
Future Directions
  • Algorithmic extensions
  • 3rd species
  • polyploidy
  • search for ancient duplications
  • Deduce history of evolutionary events
  • makes genome rearrangement measures tractable and
    robust
  • infer common ancestor

41
Future Directions
  • Block-segmental sequence comparisons
  • non-local sequence alignment
  • protein domains
  • 2D block-segmental comparisons
  • comparison of regulatory networks
  • image processing

42
Acknowledgments
  • NSF
  • AAUW
  • David and Lucile Packard Foundation
  • USDA
  • Cooperative State Research Education and
    Extension Service
  • ONR
  • Jon Kleinberg
  • Susan McCouch
  • Chris Pelkie
  • Sandra Harrington
  • Sam Cartinhour
  • Dave Schneider