THE PROMISE OF 454 SEQUENCING IN CHARACTERIZING NATURAL DIVERSITY: why sequencing centers will be outmoded in 6 months

Transcript and Presenter's Notes

4
THE PROMISE OF 454 SEQUENCING IN CHARACTERIZING
NATURAL DIVERSITY: why sequencing centers will be
outmoded in 6 months
GSC, Cambridge, Sept 2006
  • Rob Edwards

Dept. Biology, SDSU, San Diego, CA
Computational Sciences Research Center, SDSU, San Diego, CA
Center for Microbial Sciences, San Diego, CA
Fellowship for Interpretation of Genomes, Chicago, IL
The Burnham Inst. for Medical Research, San Diego, CA
IMEC, LLC, San Diego, CA
5
Outline
  • Fabulous four-five-four for facile functional
    findings
  • Is community structure antiestablishment?
  • Functional analysis is a blast
  • Why people suck
  • Why we're screwed and what we've done

6
Metagenomics
200 liters water
5-500 g fresh fecal matter
Concentrate and purify viruses
Epifluorescent Microscopy
Extract nucleic acids
DNA/RNA LASL
Sequence
Breitbart et al., multiple papers
7
Pyrosequencing
whole genome amplification
5-100 ng DNA
2-5 µg DNA
www.454.com
8
454 Sequence Data (in one year plus a bit)
  • 71 libraries
  • 40 microbial, 31 viral (many partial plates)
  • 1,309,019,537 bp total
  • 45% of the human genome
  • More than all complete and partial bacterial
    genomes
  • >10% of community sequencing of JGI per year
  • 12,632,567 sequences
  • Average 177,923 per library
  • Average read length 103.5 bp
  • Av. read length has not increased
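
The per-library and per-read averages on this slide follow from the totals by simple division. The sketch below redoes that arithmetic as a sanity check. The human genome size is an assumption (roughly 3.0 Gbp) and is not stated on the slide, so the last ratio is only approximate. The read-length quotient from the rounded totals comes out slightly above the 103.5 bp shown, which may simply reflect how the slide's average was computed.

```python
# Totals copied from the slide.
total_bp = 1_309_019_537
total_reads = 12_632_567
libraries = 71

# Assumption (not on the slide): a haploid human genome of about 3.0e9 bp.
HUMAN_GENOME_BP = 3.0e9

reads_per_library = total_reads / libraries    # mean reads per library
mean_read_length = total_bp / total_reads      # mean bases per read
fraction_of_human = total_bp / HUMAN_GENOME_BP # share of one human genome

print(f"reads per library:     {reads_per_library:,.0f}")
print(f"mean read length:      {mean_read_length:.1f} bp")
print(f"share of human genome: {fraction_of_human:.0%}")
```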

9
Sampling Sites
  • Metazoan associated: Corals, Fish, Human blood, Human stool, Human other
  • Freshwater: Aquifer, Glacial lake
  • Marine: Near-shore water, Off-shore water, Near- and off-shore sediments
  • Extreme: Hot springs (84°C, 78°C), Soda lake (pH 13), Solar saltern (>35% salt)
  • Terrestrial/Soil: Amazon rainforest, Konza prairie, Joshua Tree desert
  • Air
10
Can you assemble (100 bp) 454 sequences?
Thanks: Lutz Krause
11
Community structure
Community structure based on frequency of
finding overlapping fragments from the sequences
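
One way to read this slide: the more often random reads overlap each other, the fewer distinct genomes the sample is likely to contain. The toy sketch below groups reads that share a suffix-to-prefix overlap and reports how many groups of each size were found. It is an illustration only, not the method used for the talk: the function names, the `min_overlap` threshold, and the example reads are all hypothetical, and real data would also need reverse complements and mismatch tolerance.

```python
from collections import Counter
from itertools import permutations


def overlaps(a: str, b: str, min_overlap: int) -> bool:
    """True if a suffix of `a` (length >= min_overlap) equals a prefix of `b`."""
    longest = min(len(a), len(b))
    return any(a[-k:] == b[:k] for k in range(min_overlap, longest + 1))


def contig_spectrum(reads, min_overlap=20):
    """Map group size -> number of groups of overlapping reads (union-find)."""
    parent = list(range(len(reads)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Check every ordered pair, since overlap is directional.
    for i, j in permutations(range(len(reads)), 2):
        if overlaps(reads[i], reads[j], min_overlap):
            parent[find(i)] = find(j)

    group_sizes = Counter(find(i) for i in range(len(reads)))
    return dict(sorted(Counter(group_sizes.values()).items()))


# Hypothetical toy reads: three chain together, one stands alone.
reads = ["ACGTACGTAA", "CGTAAGGCTT", "TTTTTTGGGG", "GGCTTACCAG"]
spectrum = contig_spectrum(reads, min_overlap=5)
print(spectrum)
```

The all-pairs comparison is quadratic in the number of reads, so this only works for toy inputs; it is meant to show the idea of an overlap frequency, not to scale to a 454 library.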
12
Functional Analysis Using The SEED
Database. Developed By FIG
  • http://www.theseed.org/

Current version:
  • 661 Bacteria (396 complete)
  • 38 Archaea (26 complete)
  • 562 Eukarya (29 complete)
  • 82 Metagenomes
13
Functional analysis using the SEED
14
Heat Map for Comparing Frequencies
15
Phages In The World's Oceans
16
SAR Aligned Against the Chlamydia φ4
Individual sequence reads
Coverage
Concatenated hits
Chlamydia phi 4 genome
Chl4 ORF calls
12,297 sequence fragments hit using TBLASTX over
a 4.5 kb genome
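
The coverage track on this slide can be thought of as a per-base count of how many hits span each genome position. The sketch below computes such a depth profile from hit intervals. The function name, the interval convention (0-based start, exclusive end), and the toy hits are assumptions for illustration, not taken from the talk. As a rough guide, if every one of the 12,297 hits covered about 100 bp (an assumption), the mean depth over a 4.5 kb genome would be on the order of a few hundred fold.

```python
def coverage_depth(hits, genome_length):
    """Per-base depth from (start, end) intervals; 0-based, end exclusive."""
    # Difference array: +1 where a hit begins, -1 just past where it ends.
    delta = [0] * (genome_length + 1)
    for start, end in hits:
        delta[min(max(start, 0), genome_length)] += 1
        delta[min(max(end, 0), genome_length)] -= 1

    depth, running = [], 0
    for change in delta[:genome_length]:
        running += change
        depth.append(running)
    return depth


# Hypothetical toy hits on a 10 bp "genome".
hits = [(0, 4), (2, 6), (5, 10)]
depth = coverage_depth(hits, genome_length=10)
print(depth)
print(f"mean depth: {sum(depth) / len(depth):.1f}")
```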
17
Outline
  • Fabulous four-five-four for facile functional
    findings
  • Is community structure antiestablishment?
  • Functional analysis is a blast
  • Why people suck
  • Why we're screwed and what we've done

18
Phages, Reefs, and Human Disturbance
19
Phages, Reefs, and Human Disturbance
Palmyra
Washington
Fanning
The Northern Line Islands Expedition, 2005
20
16S rDNA at each island
21
Christmas to Kingman: Bias in No. Phage Hosts
Negative numbers mean relatively more phage
hosts at Kingman
22
Outline
  • Fabulous four-five-four for facile functional
    findings
  • Is community structure antiestablishment?
  • Functional analysis is a blast
  • Why people suck
  • Why we're screwed and what we've done

23
Computational Challenges
  • Sequence annotations and analysis
      • What is there?
      • What is it doing?
      • How is it doing it?
  • Gene predictions in unknowns
      • Lutz Krause
  • Sequence comparisons
      • BLAST
      • Other ways to rapidly compare short sequences
  • What happens when everyone is using 454
    sequencing?
  • Metadata or just data?

24
Sequence data from 21 libraries
600 million bp
6 million sequences
  • Each BLASTX search takes 1,000 CPU hours
  • 71 libraries: 71,000 CPU hours, or 8.1 CPU years
  • Users want
      • repeat runs,
      • TBLASTX,
      • more analysis
      • more data
      • more, more, more, more
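
The CPU-year figure on this slide is a direct conversion from CPU hours. The sketch below reproduces it and then shows how the wall-clock time shrinks when the work is spread across many processors. The 8,760-hour year, the helper name `wall_clock_days`, and the worker count of 100 are assumptions for illustration, and the estimate ignores scheduling and I/O overhead.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: a 365-day year

cpu_hours_per_library = 1_000  # from the slide
libraries = 71                 # from the slide

total_cpu_hours = cpu_hours_per_library * libraries
cpu_years = total_cpu_hours / HOURS_PER_YEAR


def wall_clock_days(cpu_hours: float, workers: int) -> float:
    """Idealized elapsed days if the work splits evenly across `workers` CPUs."""
    return cpu_hours / workers / 24


print(f"total: {total_cpu_hours:,} CPU hours = {cpu_years:.1f} CPU years")
# Hypothetical cluster size, not a figure from the talk.
print(f"on 100 CPUs: about {wall_clock_days(total_cpu_hours, 100):.0f} days")
```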

25
Life Sciences Gateway
SOAP interface for job submission and control
26
TeraGrid Resources
27
SDSU: Forest Rohwer, Liz Dinsdale, Beltran Rodriguez-Brito
USF: Mya Breitbart
Rohwer Lab: Linda Wegley, Florent Angly, Matt Haynes
FIG: Veronika Vonstein, Ross Overbeek, Annotators
ANL: Rick Stevens, Bob Olsen
Also at SDSU: Anca Segall, Stanley Maloy
Math Guys at SDSU: Peter Salamon, Steve Rayhawk
Bielefeld: Lutz Krause
MIT: Ed DeLong
SIO: Stuart Sandin