Efficient Exact p-Value Computation and Applications to Biosequence Analysis

http://cs273a.stanford.edu [Bejerano Fall09/10]

Slides: 26
Provided by: GB298
Learn more at: http://web.stanford.edu

1
Milestones due today. Anything to report?
2
Lecture 17
  • Ultraconservation evolutionary data
  • Finish early; come hear the talk with us?

3
Sequence Conservation implies Function
Comparative Genomics of Distantly related species
functional region!
...CTTTGCGA-TGAGTAGCATCTACTATTT...
human
mammalian ancestor
...ACGTGGGACTGACTA-CATCGACTACGA...
mouse
  • (but which function/s?...)

4
Human Genome full of Conserved Non-Coding Elements
Human Genome: 3 × 10^9 letters
1.5% known function
compare to other species
>50% junk
>5% of human genome functional
3x more functional DNA than known!
10^6 substrings do not code for protein
What do they do then?
5
Conserved elements in the Human Genome
all human-mouse alignments
human-mouse ancestral repeats alignment
selection
Difference: 5% of Human Genome
85% id on average
Mouse consortium, Nature 2002
6
Conserved elements in the Human Genome
all human-mouse alignments
human-mouse ancestral repeats alignment
Simple but Unexpected (the lure of Bioinformatics)
selection
Difference: 5% of Human Genome
Ultraconservation
85% id on average
Mouse consortium, Nature 2002
7
Typical DNA Conservation levels
(dot base identical to human)
Conserved elements between human and mouse are
on average 85% identical. Mouse consortium, 2002
8
Ultraconserved Elements
fish
481 elements perfectly conserved (100% id) over
200 bp or more between human, mouse and rat.
Bejerano et al., Science 2004
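
The definition on this slide (100% identity over 200 bp or more in all three species) can be illustrated with a minimal sketch. It assumes the input is a set of equal-length rows from a multiple alignment with '-' marking gaps; the function name and the half-open column ranges are illustrative choices, not the pipeline used in the cited paper.

```python
def ultraconserved_runs(aligned_seqs, min_len=200):
    """Return half-open (start, end) column ranges that are identical in every row.

    aligned_seqs: equal-length strings taken from a multiple alignment,
    with '-' marking a gap. A column counts as identical only if all rows
    carry the same base and none of them is a gap.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    run_start = None
    for col in range(length):
        bases = {s[col].upper() for s in aligned_seqs}
        identical = len(bases) == 1 and "-" not in bases
        if identical:
            if run_start is None:
                run_start = col  # a new identical run begins here
        else:
            # The run (if any) ended at the previous column; keep it if long enough.
            if run_start is not None and col - run_start >= min_len:
                runs.append((run_start, col))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and length - run_start >= min_len:
        runs.append((run_start, length))
    return runs


# Toy usage with a short threshold so the example stays readable.
human = "ACGTACGTAC"
mouse = "ACGTACCTAC"
rat = "ACGTACGTAC"
print(ultraconserved_runs([human, mouse, rat], min_len=4))
```

With real data, `min_len` would stay at 200 and the rows would come from a human-mouse-rat alignment.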
9
Ultraconserved Elements: Why?
Hundreds of long substrings identical between
amniotes? They must have rejected many different
changes. But... all functions we understand in
our genome are encoded using redundant codes.
CDS
ncRNA
TFBS
seq.
10
Ultras are Functional
Back in 2004 we hypothesized
481 ultraconserved elements
nonexonic subset: transcriptional regulators
exonic subset: post-transcriptional regulation
Pennacchio et al., Nature, 2006
Ni et al., Genes Dev., 2007; Lareau et al., Nature, 2007
11
Genomic Distribution of Ultraconserved Elements
  • exonic
  • nonexonic
  • possibly exonic

Origins?
12
UC.338 comes from an ancient repeat
ultraconserved exon
novel coelacanth repeat
enhancer
LF-SINE
Bejerano et al., Nature, 2006
13
Ultras are Under Strong Human Selection
Mutational cold spots? NO. Rare (new) mutations
are introduced to the population. Fierce
purifying selection? YES. Very few of these get
anywhere near fixation.
A
A A G A
chimp
humans
NonSyn DAF
Ultra DAF
Katzman et al., Science, 2007
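
The DAF (derived allele frequency) comparison on this slide can be sketched as follows. This is only an illustration: it assumes the chimp allele stands in for the ancestral state, as the slide's figure labels suggest, and the function name is hypothetical.

```python
def derived_allele_frequency(human_alleles, ancestral_allele):
    """Fraction of sampled human alleles that differ from the ancestral allele.

    human_alleles: iterable of single-base strings, one per sampled chromosome.
    ancestral_allele: the base assumed ancestral (here, the chimp base).
    """
    alleles = list(human_alleles)
    if not alleles:
        raise ValueError("need at least one sampled allele")
    ancestral = ancestral_allele.upper()
    derived = sum(1 for a in alleles if a.upper() != ancestral)
    return derived / len(alleles)


# The slide's toy picture: chimp carries A, four human samples carry A, A, G, A.
print(derived_allele_frequency("AAGA", "A"))  # 0.25
```

Under strong purifying selection, most sites would be expected to show low values like this one, with few derived alleles approaching a frequency of 1 (fixation).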
14
Touch an Ultra And You - DIY
Ahituv et al., PLoS Biology, 2007
15
What can't we measure in the lab?
Ne is the population size, s the selective
dis/advantage. Both are VERY wrong in
the lab.
16
So it can happen but does it FIX?
DNA element
t
mouse
17
Count Fraction Lost, Binned by % id
bin by % id
count_all
t
human macaque dog mouse rat
count_hole
100bp sliding window
dog
rat
mouse
human
macaque
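
One way to read this slide is: slide a 100 bp window along the alignment, bin each window by its percent identity among the species that retain it, and report count_hole / count_all per bin. The sketch below follows that reading. It assumes a window counts as a "hole" when the rodent row is entirely gaps there; that criterion, the bin width, and the function names are assumptions for illustration, not details given on the slide.

```python
from collections import defaultdict


def percent_identity(seqs):
    """Percent of columns where every row has the same non-gap base."""
    length = len(seqs[0])
    same = sum(
        1
        for col in range(length)
        if len({s[col].upper() for s in seqs}) == 1 and seqs[0][col] != "-"
    )
    return 100.0 * same / length


def fraction_lost_by_bin(retained_seqs, rodent_seq, window=100, bin_width=10):
    """Return {bin_lower_edge: count_hole / count_all} over sliding windows.

    retained_seqs: aligned rows used to compute % id (e.g. human, macaque, dog).
    rodent_seq: aligned rodent row; an all-gap window is counted as lost.
    Bins are labelled by their lower edge; 100% id falls in the top bin.
    """
    count_all = defaultdict(int)
    count_hole = defaultdict(int)
    for start in range(0, len(rodent_seq) - window + 1):
        end = start + window
        pid = percent_identity([s[start:end] for s in retained_seqs])
        bin_lo = min(int(pid // bin_width) * bin_width, 100 - bin_width)
        count_all[bin_lo] += 1
        if set(rodent_seq[start:end]) == {"-"}:
            count_hole[bin_lo] += 1
    return {b: count_hole[b] / count_all[b] for b in sorted(count_all)}


# Toy usage with a 4-column window instead of 100 bp.
retained = ["ACGTACGT", "ACGTACGT"]
rodent = "ACGT----"
print(fraction_lost_by_bin(retained, rodent, window=4))
```

A step of one column makes neighbouring windows overlap heavily, so the counts are not independent; a real analysis would have to account for that.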
18
Quite Some Time Later
19
Pragmatic Genomics
define goal
run sensible approach
while (results full of artefacts)
    characterize artefact
    write handler into code
    rerun
bio
cs
bio
cs
eg sequencing errors, assembly errors
contaminating sequence, ambiguous situations, etc.
20
Ultras are Fiercely Retained through Evolution
No Apparent Phenotype
100% id primates-dog: 1,691,090 bp
rodents deleted: 1,447 bp (0.086%)
Ultras are >300-fold more persistent than neutral DNA
But Doomed ...
the genomic deletion is
(25% deleted)
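
The arithmetic behind this slide can be checked with a short sketch. The two base-pair counts are taken from the slide; reading the slide's "25 deleted" as a 25% deletion rate for neutral DNA is an assumption, and the function name is illustrative.

```python
def fold_persistence(conserved_bp, deleted_bp, neutral_deleted_fraction):
    """Return (fraction of conserved bp deleted, fold difference vs. neutral DNA)."""
    ultra_deleted_fraction = deleted_bp / conserved_bp
    fold = neutral_deleted_fraction / ultra_deleted_fraction
    return ultra_deleted_fraction, fold


conserved_bp = 1_691_090         # bp at 100% id between primates and dog (from the slide)
deleted_bp = 1_447               # of those, bp deleted in rodents (from the slide)
neutral_deleted_fraction = 0.25  # ASSUMPTION: the slide's "25 deleted" read as 25%

fraction, fold = fold_persistence(conserved_bp, deleted_bp, neutral_deleted_fraction)
print(f"deleted fraction: {fraction:.3%}")
print(f"fold difference:  {fold:.0f}")
```

With the rounded 25% figure the ratio comes out at roughly 290-fold, so the slide's ">300" presumably rests on an unrounded neutral rate slightly above 25%.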
21
How special are the Ultras?
selection
Ultraconservation
22
Adding More Species
Aha!!
23
Adding More Species
More and more species
Few species
Hmmm.
24
Most Non-Coding Elements likely work in cis
IRX1 is a member of the Iroquois homeobox gene
family. Members of this family appear to play
multiple roles during pattern formation of
vertebrate embryos.
gene deserts
regulatory jungles
9Mb
25
and Ultras are the tip of a functional iceberg
gene deserts
regulatory jungles
9Mb
This dense regulatory jungle contains a single
ultra