The Promise of Metagenomics - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: The Promise of Metagenomics


1
The Promise of Metagenomics
Robert A. Feldman, SymBio Corporation
Cambridge University, Sept. 6, 2005
2
  • Outline
  • SymBio Corporation background information
  • The Amersham Genomics Dev. Group
  • Environmental Genomics (Proteorhodopsin)
  • Microbial genomic diversity (Crenarchaeal
    sponge symbiont)
  • Human microbiomics
  • Symbiont genomics (Riftia symbiont genome)
  • Deep-sea epibiont metagenomics
  • Habitat
  • Experimental design
  • Database
  • Metagenomics results
  • New vent chemoautotrophic pathway
  • Summary

3
SymBio Corporation: Specialists in Genomics
Partnerships and Collaborations
  • Project based custom genomics
  • Medium to high throughput projects
  • Capacity to sequence 300,000 clones/month
  • Environmental, ecological and evolutionary
    genomics
  • Projects from preliminary data to genome
    completion and publication
  • Bioinformatics, annotation, phylogenetic and
    diversity analysis
  • Human microbiome R&D

Feldman (left) and Craig Cary on board the RV
Atlantis performing the first ever DNA sequencing
at sea (Oct. 2001)
4
(No Transcript)
5
  • SymBio Corporation History
  • Amersham Biosciences Genomics Development Group
    (1999-2003)
  • Incorporated July 2003 as a California S-Corp
  • Operational Oct. 2003
  • Robert A. Feldman, Ph.D., Founder, President
    and CEO
  • Sequencing capacity 300,000 reads per month
  • Located in Menlo Park, CA
  • www.sym-bio.com

Apapane, Hawaiian Honeycreeper
6
  • Core technologies and expertise
  • Bioinformatics, genomic and diversity analysis
  • Microbial genomes
  • Uncultivated microbes
  • Environmental genomics
  • Eukaryotic genomics
  • High throughput DNA sequencing
  • Microarray support
  • Academic/biotech collaborations

Iiwi, Hawaiian Honeycreeper
7
  • Current SymBio Projects
  • Microbial metagenomics
  • Microbial genomes
  • Environmental phage genomes
  • Human resequencing
  • rDNA survey studies
  • Crop cDNA genomics
  • Uncultivated marine virus genomics
  • Human microbiomics

8
SymBio is helping apply genomics to biodiversity
The Tree of Life (16S rRNA sequence)
  • 1. Bacteria (the "prokaryotes") 10,000 species
  • 2. Archaea 500 species
  • 3. Eukarya (the eukaryotes) 1,500,000 species
  • protists (single celled protozoans, algae,
    amoeba)
  • plants
  • fungi
  • animals
  • mammals
  • reptiles
  • birds
  • amphibians
  • fish
  • invertebrate (insects, worms, crustaceans
    etc...)

The Three Domains of Life
9
De novo DNA sequencing continues to grow exponentially
400,000 beetle species
10
  • Amersham Projects
  • Protocol development and validation
  • Alpha test new instrumentation, bioinformatics
  • Alpha test new reagents, lab methods
  • Uncultivated marine microbes
  • Environmental genomics, uncultivated Archaea
  • Complete microbial genomes, ecology and
    evolution
  • Deep-sea hydrothermal vent genomics
  • Endangered species, mitochondrial genomes
  • Human cSNP discovery
  • Human and NHP phylogenomics

11
SymBio personnel helped develop the MegaBACE
platform
  • High-throughput CAE DNA sequencing
  • 384 capillary instrument
  • 270,000 bp/run; 2,100,000 bp/day/instrument
  • The workhorse of genomics

  • confocal optical scanning
  • high resolution LPA matrix
  • 4-color detection, 2-laser scanning
  • capillary lifetime >>1000 runs
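The throughput figures quoted for the instrument can be cross-checked with a little arithmetic. The sketch below is only an illustration: it assumes one read per capillary per run and round-the-clock operation, neither of which is stated on the slide, and the function names are invented for this example.

```python
# Figures quoted on the slide for one 384-capillary instrument.
CAPILLARIES = 384
BP_PER_RUN = 270_000
BP_PER_DAY = 2_100_000


def bases_per_capillary(bp_per_run: int, capillaries: int) -> float:
    """Average bases per capillary per run (assumes one read per capillary)."""
    return bp_per_run / capillaries


def runs_per_day(bp_per_day: int, bp_per_run: int) -> float:
    """Number of runs per day implied by the daily and per-run totals."""
    return bp_per_day / bp_per_run


def minutes_per_run(bp_per_day: int, bp_per_run: int) -> float:
    """Cycle time per run, assuming the instrument runs 24 hours a day."""
    return 24 * 60 / runs_per_day(bp_per_day, bp_per_run)


if __name__ == "__main__":
    print(f"{bases_per_capillary(BP_PER_RUN, CAPILLARIES):.0f} bases per capillary")
    print(f"{runs_per_day(BP_PER_DAY, BP_PER_RUN):.1f} runs per day")
    print(f"{minutes_per_run(BP_PER_DAY, BP_PER_RUN):.0f} minutes per run")
```

Under these assumptions the quoted numbers correspond to roughly 700 bases per capillary and just under eight runs a day.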
12
SymBio Personnel Helped Develop The Scierra
Laboratory Workflow System
  • Manages and tracks workflow resources such as
    samples, reagents, and instruments
  • Manages and tracks workflow activities such as
    sample ordering, registration and preparation,
    instrument usage, and data analysis
  • Allows scientists to submit samples for
    analysis through a web based interface
  • Stores and retrieves information associated
    with the workflow and the outputs from the
    various laboratory activities
  • Includes a flexible reporting facility that
    will query across each of the LWS products
  • Increases laboratory efficiency and ensures
    higher quality data

13
SymBio personnel helped develop TempliPhi: φ29
DNA polymerase-based rolling circle amplification
of templates for DNA sequencing
Phi29 DNA Polymerase
  • Single subunit
  • Proofreading
  • Highly processive
  • >70,000 nt/binding event
  • Turnover 25-50 nt/s

Multiple Primers (random hexamer)
  • Stable > 12 hours at 30 °C
  • Strong strand displacement activity

14
  • Exponential kinetics
  • Isothermal reaction
  • Up to 10^7-fold amplification
  • Products generated
  • Double-stranded
  • Ready for sequencing
  • Restriction digest
  • Transformation
  • Cloning
  • Archiving

15
Conventional Methods - Extensive time, labor and
associated costs
[Figure: conventional template preparation workflow; labels: G/STE, NaOH/SDS, KOAc, supernate, EtOH, Sequence]
Conventional method
  • Total time: >20 hours
  • Total manipulations: 11
  • Plasticware for 384: 4 x 96-well plates
16
TempliPhi generates lots of DNA
In routine use at SymBio
17
Products of the TempliPhi reaction sequence well
Clean sequence
Long reads
18
Current SymBio Projects: Environmental Genomics of
Uncultivated Microbes
rRNA in situ hybridizations of Oceanic Sample
from Monterey Bay, CA
  • 99% of Earth's life forms are uncultivated
  • Ubiquitous in nature, found in ocean waters
    (Pacific, Antarctic), soils, lake sediments,
    and as symbionts of sponges
  • Huge global biomass, approx. 16% of total oceanic
    picoplankton

Green: Archaea; Blue: Bacteria
19
Phylogeny of Uncultivated Archaea
C. symbiosum (green); sponge nuclei (red)
Uncultivated Archaea are found in both the
Crenarchaeota and Euryarchaeota groups. These
microbes share a common ancestry with
hyperthermophiles yet live in mesophilic to
psychrophilic environments. Not yet isolated in
the laboratory, they have an unknown physiology.
Large insert cloning (BACs > 300 kb),
phylogenetic anchoring, DNA sequencing and
bioinformatic analysis are allowing a fine-scale
genomic dissection of these recently discovered,
globally significant groups of microbes.
20
  • BACs from Uncultivated Marine Microbes
  • Phylogenetic anchor with rRNA gene (Stein et al,
    1996)
  • Sequence BAC
  • BLAST gene identification
  • Monterey Bay sample
  • 135 kb BAC clone
  • gamma proteobacterial 16S rRNA gene (all other
    genes too)
  • Except an archaeal-type rhodopsin gene
  • Proteorhodopsin - new rhodopsin in gamma
    proteobacteria

21
Proteorhodopsin binds retinal and pumps protons
across a membrane in response to light
22
(No Transcript)
23
(No Transcript)
24
(No Transcript)
25
(No Transcript)
26
Uncultivated Medical Microbes: The microbiome
  • The other human genome
  • Complete ignorance of normal composition of
    human microflora
  • 500 - 1000 species in every person
  • 50% cannot be cultivated
  • 5 Mbp, 4000 genes/genome: 2.5 - 5.0 Gbp, 2 - 4
    million genes in total
  • 100x as many genes in the microbiome as in the
    human genome
  • a gene is a gene, makes proteins - cellular
    physiology, metabolic byproducts, toxins
  • role in normal human physiology?
  • Specific association of microbes and disease?
    (H. pylori and ulcers!)
  • Implications in several diseases
  • cardiovascular disease
  • prostatitis
  • Crohn's disease
  • kidney stones
  • Tourette's syndrome
  • Combines genomics, diagnostics, bioinformatics
    into discovery opportunities
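The size estimates on this slide follow from multiplying the species range by the per-genome figures. A minimal sketch of that arithmetic is below; the human gene count used for the comparison is an assumption added here (about 25,000, a commonly cited estimate at the time), not a number from the slide.

```python
# Per-genome figures quoted on the slide.
GENOME_MBP = 5.0
GENES_PER_GENOME = 4000

# Assumption for illustration only: approximate human gene count.
ASSUMED_HUMAN_GENES = 25_000


def microbiome_totals(n_species: int,
                      genome_mbp: float = GENOME_MBP,
                      genes_per_genome: int = GENES_PER_GENOME):
    """Return (total Gbp, total genes) for a community of n_species genomes."""
    total_gbp = n_species * genome_mbp / 1000.0
    total_genes = n_species * genes_per_genome
    return total_gbp, total_genes


def gene_ratio(microbial_genes: int, human_genes: int = ASSUMED_HUMAN_GENES) -> float:
    """How many times more microbial genes than human genes."""
    return microbial_genes / human_genes


if __name__ == "__main__":
    for n in (500, 1000):
        gbp, genes = microbiome_totals(n)
        print(f"{n} species: {gbp} Gbp, {genes:,} genes, "
              f"{gene_ratio(genes):.0f}x the assumed human gene count")
```

With 500 to 1000 species this reproduces the 2.5 to 5.0 Gbp and 2 to 4 million gene range, and a gene ratio on the order of 100.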

27
SymBio Microbiomics Projects
  • Active
  • Crohn's Disease survey
  • Dental survey
  • Vaginal microflora survey
  • Chronic wound bugs survey
  • Planning
  • Expanded wound bugs (metagenomes)
  • Global HIV genomes
  • Tissue archived HIV genomes

28
Correspondence Between Wound rRNA Sequences and
Sequenced Public Genomes
1 Same species as rRNA clone
2 Same genus as rRNA clone
3 No genome in genus of rRNA clone
4 Frequency of rRNA clones in 36 chronic wound samples
5 Clones with frequency < 1
29
(No Transcript)
30
  • The Mid-ocean Ridge
  • single largest geologic feature on earth
    (40,000 miles)
  • occurs at continental margins
  • spreading centers (new crust formed, hot vents)
  • subduction zones (crust reclaimed, cold seeps)

Jupiter's Moon Europa
31
  • Unique ecosystem discovered 1977
  • giant tube worms, clams, mussels
  • chemoautotrophic based food production
  • transient, ephemeral deep-sea oases,
    colonization?

Tube worm wall, 9°N, East Pacific Rise
32
The Riftia Endosymbiont Genome Project
  • East Pacific deep-sea hydrothermal vents
  • Host is completely dependent on the symbiont
    for nutrition.
  • Symbionts show low levels of genetic variability
    across geographic and host species ranges and may
    exist as monocultures within a given host animal.
  • Bacterial DNA extracted from a given host
    animal should show little to no genetic
    variability

With Jeff Stein (Quorex), Horst Felbeck
(UCSD-SIO)
DSV Alvin
33
A meta-genome level analysis of an extreme
microbial symbiosis
  • A comprehensive analysis of a complex
    microbial community from an extreme environment.
  • A multi-institutional, interdisciplinary
    study coupling environmental genomics with
    geochemistry
  • Craig Cary, Ph.D. (Univ. of Del.): Molecular
    microbial ecology
  • Barbara Campbell, Ph.D. (Univ. of Del.): Library
    construction and analysis
  • Robert Feldman, Ph.D. (SymBio Corporation):
    Sequencing/annotation
  • Guang R. Gao, Ph.D. (Univ. of Del.): Bioinformatics
  • Roy Daniel, Ph.D. (Univ. of Waikato): Proteomics
    and protein kinetics
  • Jeff Stein, Ph.D. (Quorex Pharm.): Quorum sensing
  • Javier Garcia, Ph.D. (Univ. of Del.): Metabolic
    modeling
  • Alison Murray, Ph.D. (DRI): Microarrays
  • George Luther (Univ. of Del.): Geochemistry
Funded through the National Science Foundation
GenEN Biocomplexity Initiative
34
  • The deep-sea hydrothermal vent metagenome project
  • microbial community genome scanning
  • Common on the East Pacific Rise
  • Complex microbial community
  • Epsilon proteobacteria
  • Related to Helicobacter (mammalian pathogens)
  • 40-50 different microbes
  • Hottest known metazoan
  • 81 °C, transient at 102 °C
  • 500,000 clones
  • Sequence and BLAST ID
  • Microarray
  • Dissect microhabitat heterogeneity

Alvinella pompejana
Deep sea Alvinellid community
35
Pompeii Worm Habitat Characterization
[Figure labels: 20 °C; T 80 °C; 42.3 °C - 94.0 °C]
  • In-tube temperature 42 - 94 °C
  • Major sulfur species: FeS(aq), Fe2+
  • Free H2S/HS- was not detected
  • O2 not detected
  • high heavy metal concentrations

Most thermotolerant metazoan on the planet
36
Microbial community coat
The Pompeii Worm (Alvinella pompejana)
  • Colonies on side of chimneys in diffuse flow
  • Outside temp 30-40 °C
  • No endosymbionts

37
Community Composition Analysis
  • Episymbiont 16S rDNA Diversity
  • Form single clade within the
    epsilon Proteobacteria
  • Other minor members include
    delta Proteobacteria and
    spirochetes
  • Epsilon Proteobacteria are
    sulfur reducers and oxidizers
  • Dominant epibionts occupy broad
    thermal and geochemical environments
  • Dominant episymbiont culturing
    attempts have been negative

Need for environmental and functional genomics
(large insert and meta-genome approaches) to
define role of symbionts
38
Metagenome Comparison of 16S rRNA
DGGE Ap201 (Bio9 2002)
NJ Distance Tree based on shotgun 16S sequences
(338-519)
39
Research Hypotheses
  • Under the constraints imposed by the geochemical
    environment the epsilon episymbiotic consortia
    employ a core basic metabolic strategy.
  • The episymbiotic association can be defined as a
    mutualistic partnership enabling survival in this
    hostile environment.
  • The episymbiont community employs an extensive
    range of sensing capabilities to change and
    respond to their dynamic environment.
  • The episymbiont community detoxifies the harsh
    chemical environment experienced by the worm.
  • Protein eurythermalism is a common adaptation of
    the Alvinella symbiont proteome.

40
  • Objectives
  • Perform in situ chemical analysis and collect
    Alvinella pompejana from a variety of habitats
  • Construct a metagenomic library of the
    representative symbiont population from
    geographically distinct habitats
  • Sequence 300,000 clones, or until frequency
    analysis of the metagenome demonstrates
    saturation
  • Analyze these sequences and populate a database
    with gene information, structural RNA
    classification, or by a number of other variables
    (i.e. metabolic pathway or functional category)
  • Develop microarrays to test use of the core
    metabolism or specific genes in the environment
  • Test eurythermal capabilities on enzymes present
    in core metabolism

41
A meta-genome approach
Central hypothesis: Intense constraints imposed by
the geochemical environment on the Alvinella
episymbiotic consortia selected a basic metabolic
strategy (core metabolism).
  • Approach
  • 2 oceanographic cruises (2003, 2004) -
    geochemistry - collection
  • Extensive sequencing effort (300,000 shotgun
    clones)
  • Intensive bioinformatic stream - statistical
    analysis and modeling
  • Microarray development - in situ testing of
    hypotheses
  • Protein thermal stability characterization -
    Eurythermalism
  • Quorum sensing - microbial consortia maintenance

42
Metagenome cloning and sequencing strategy
  • Collect A. pompejana worms from 9°N and 13°N EPR
    (cruises 2003, 2004). Measure associated chemistry
    and temperature.
  • Extract DNA and RNA from the episymbiont community
  • Tested for eukaryotic contamination (18S and
    beta-actin qPCR)
  • Generate shotgun libraries of 1.5-3 kb inserts
    (TOPO and pSMART)
  • Array clones and bi-directional sequencing
    (SymBio)
  • Trimming and quality assessment (Cut 15/window 11)
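The trimming step can be illustrated with a sliding-window quality trim. The slide does not name the software or define "Cut 15/window 11", so the sketch below rests on an assumption: a read is kept from the first to the last 11-base window whose mean Phred quality is at least 15. The function name and the example scores are hypothetical.

```python
from typing import List, Tuple


def window_trim(quals: List[int], cutoff: float = 15.0,
                window: int = 11) -> Tuple[int, int]:
    """Return half-open (start, end) coordinates of the retained region.

    The region runs from the first to the last window whose mean quality
    is >= cutoff. Returns (0, 0) if no window passes or the read is
    shorter than one window.
    """
    n = len(quals)
    if n < window:
        return (0, 0)
    passing = []
    running = sum(quals[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one base to the right.
            running += quals[start + window - 1] - quals[start - 1]
        if running / window >= cutoff:
            passing.append(start)
    if not passing:
        return (0, 0)
    return (passing[0], passing[-1] + window)


if __name__ == "__main__":
    # Hypothetical read: poor ends, good middle.
    quals = [5] * 10 + [30] * 50 + [5] * 10
    print(window_trim(quals))
```

Because the test is on the window mean, a few low-quality bases at each edge survive the trim; a stricter per-base rule would be a different design choice.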
43
Library Construction
  • Collect A. pompejana worms from 9°N and 13°N
    EPR.
  • Measure associated chemistry and temperature.
  • Extract DNA from the episymbiont community.
  • Generate shotgun libraries of 1.5-3 kb inserts
    (TOPO and pSMART).
  • Array clones and sequence (Amersham
    Biosciences/SymBio).

44
Metagenome sequencing results
45
Raw Data (500,000 nucleotide sequences in FASTA
format)
  • using phred20 quality data for input
  • sequences 100 bases minimum

  • tBlastX: translates in 6 frames and
    searches against NR DB
  • RBS finder: provides support for functional protein
  • Softberry FGenesB Annotator: translation, gene
    finding and annotation
  • Redundancy analysis of the metagenome
    (NRMG): internal BLASTP of all against all
    sequences; the minimal metagenome will be
    available as a searchable data set and also used
    for frequency analysis
[Flowchart labels: Coding genes/gene fragments; pFAM; Structural RNAs; BlastN; KEGG]
Metagenome Annotation Pipeline (MAP)
  • Semi-automated
  • Retain all outputs
  • Direct output into customized database
  • >150 outputs
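The six-frame translation that tBlastX performs on each read can be sketched in a few lines. This is an illustration, not the pipeline's code: it uses the standard genetic code, marks codons containing ambiguous bases as "X", and the function names are chosen for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate_frame(seq: str, offset: int) -> str:
    """Translate one reading frame; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(offset, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Translate all three forward and three reverse frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate_frame(seq, offset)
        frames[f"-{offset + 1}"] = translate_frame(rc, offset)
    return frames


if __name__ == "__main__":
    for name, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
        print(name, protein)
```

Each translated frame would then be searched against the protein database; stop codons appear as "*" and are left in place here.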

46
Informational System - design and development
Database constraints
  • Handle large dataset of non-contiguous DNA
  • Strict compliance with 21 use cases
  • Cross linking of database tools with universal
    output
  • Web-based interface to all levels of database
  • An Entity Relational Model
  • Allows semi-automated annotation
  • Storage of all data - raw sequence to annotation
  • Reduced redundancy to increase efficiency for
    queries
  • 700 GB currently in 25 tables
  • User designed database - designed intuitive set
    of rules
  • Intuitive web-based interface
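To make the entity-relational idea concrete, the sketch below links raw reads to predicted ORFs and to their annotations. It is a hypothetical three-table miniature using SQLite; the actual database is described as having 25 tables, and none of the table or column names here come from the presentation.

```python
import sqlite3

# Hypothetical miniature schema: reads -> ORFs -> annotations.
SCHEMA = """
CREATE TABLE reads (
    read_id  TEXT PRIMARY KEY,
    library  TEXT NOT NULL,
    sequence TEXT NOT NULL
);
CREATE TABLE orfs (
    orf_id    INTEGER PRIMARY KEY,
    read_id   TEXT NOT NULL REFERENCES reads(read_id),
    orf_start INTEGER NOT NULL,
    orf_end   INTEGER NOT NULL,
    strand    TEXT NOT NULL CHECK (strand IN ('+', '-'))
);
CREATE TABLE annotations (
    annotation_id INTEGER PRIMARY KEY,
    orf_id        INTEGER NOT NULL REFERENCES orfs(orf_id),
    source        TEXT NOT NULL,
    accession     TEXT NOT NULL,
    description   TEXT
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding one made-up read."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.execute("INSERT INTO reads VALUES (?, ?, ?)", ("r1", "LB", "ATGGCC"))
    cur = con.execute(
        "INSERT INTO orfs (read_id, orf_start, orf_end, strand) VALUES (?, ?, ?, ?)",
        ("r1", 1, 6, "+"),
    )
    con.execute(
        "INSERT INTO annotations (orf_id, source, accession, description) "
        "VALUES (?, ?, ?, ?)",
        (cur.lastrowid, "COG", "COG0000", "placeholder annotation"),
    )
    return con


def reads_with_source(con: sqlite3.Connection, source: str) -> list:
    """Read IDs that carry at least one annotation from the given source."""
    rows = con.execute(
        "SELECT DISTINCT r.read_id FROM reads r "
        "JOIN orfs o ON o.read_id = r.read_id "
        "JOIN annotations a ON a.orf_id = o.orf_id "
        "WHERE a.source = ? ORDER BY r.read_id",
        (source,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(reads_with_source(build_demo_db(), "COG"))
```

Keeping each annotation in its own row, keyed to an ORF, is one way to store every tool's output without duplicating the sequence data.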

47
rTCA cycle genes
48
Frequency analysis of top 12,000 metagenome genes
by NCBI number
Most genes are sequenced less than 5X
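A frequency analysis of this kind amounts to counting how many reads have each database entry as their best hit. The sketch below is a minimal illustration with made-up read and accession identifiers; the input layout (read ID, best-hit accession) is an assumption about how the hits might be tabulated.

```python
from collections import Counter
from typing import Iterable, Tuple


def hit_frequencies(best_hits: Iterable[Tuple[str, str]]) -> Counter:
    """Count reads per best-hit accession from (read_id, accession) pairs."""
    return Counter(accession for _, accession in best_hits)


def fraction_below(freqs: Counter, depth: int = 5) -> float:
    """Fraction of distinct genes seen fewer than `depth` times."""
    if not freqs:
        return 0.0
    return sum(1 for count in freqs.values() if count < depth) / len(freqs)


if __name__ == "__main__":
    # Hypothetical best hits: gene A is hit five times, B and C once each.
    hits = [("r1", "A"), ("r2", "A"), ("r3", "B"), ("r4", "A"),
            ("r5", "A"), ("r6", "A"), ("r7", "C")]
    freqs = hit_frequencies(hits)
    print(freqs.most_common())
    print(f"{fraction_below(freqs):.2f} of genes sequenced less than 5X")
```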
49
Alvinella Epibiont Metagenome Frequency Analysis
Results of ALL vs. ALL BlastN search
(min 100 bases overlapping, Cut20 sequence)
  • To observe the distribution of BLAST hits at
    different levels of identity, regardless of
    repeats
  • Overlapping of the three curves indicates that
    the community is very similar
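The distribution of all-against-all hits by identity can be tallied from BLAST output. The sketch below assumes the hits are in BLAST's 12-column tabular format (query, subject, percent identity, alignment length, ...), which may not be the format used in the original analysis; the identity thresholds are illustrative, and only the 100-base minimum overlap comes from the slide.

```python
from typing import Dict, Iterable, Sequence


def identity_histogram(lines: Iterable[str],
                       min_overlap: int = 100,
                       thresholds: Sequence[int] = (80, 90, 95, 99)) -> Dict[int, int]:
    """Count non-self hits at or above each percent-identity threshold."""
    counts = {t: 0 for t in thresholds}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        pident, length = float(fields[2]), int(fields[3])
        # Skip self hits and alignments shorter than the minimum overlap.
        if query == subject or length < min_overlap:
            continue
        for t in thresholds:
            if pident >= t:
                counts[t] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical hits in 12-column tabular format.
    demo = [
        "r1\tr1\t100.0\t500\t0\t0\t1\t500\t1\t500\t0.0\t900",
        "r1\tr2\t99.5\t300\t1\t0\t1\t300\t5\t304\t1e-150\t540",
        "r1\tr3\t92.0\t150\t12\t0\t1\t150\t9\t158\t1e-50\t200",
        "r2\tr3\t97.0\t80\t2\t0\t1\t80\t1\t80\t1e-30\t130",
    ]
    print(identity_histogram(demo))
```

Running this per library and plotting the counts against the thresholds gives curves that can be compared between libraries.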
50
Core Metabolic Genes
As an initial test on coverage of the metagenome,
the LB library was assembled. The average
coverage for the core genes was 3, suggesting
the presence of 3 genome equivalents in the
library. Many of these genes were the same as
seen above.
51
Analysis of COG functional groups distribution
Out of 72K sequences FgenesB placed 28,850 into
COGs
52
Comparison of COG functional groups distribution
Out of 72K sequences 28,850 were placed into COGs
53
Metagenome ORFs of Interest
CHO, N, S Metabolism
  • ATP citrate lyase, other rTCA enzymes
  • Nitrate reductase, nitrite reductase, nitric and
    nitrous oxide reductases
  • Heterodisulfide reductase
  • Sulfite and sulfide dehydrogenases
  • Polysulfide reductase
  • Sulfur oxidation protein
Symbiosis/Pathogen related genes
  • Symbiosis island integrase
  • Toxin secretion ATP-binding protein
  • Cell surface adhesion protein
  • Fibronectin/fibrinogen-binding protein
  • Ser-Asp rich fibrinogen-binding, bone
    sialoprotein-binding protein
  • Invasion antigen B
Metal/Drug/Other Resistance
  • Cation efflux system protein CzcA (Cd, Zn, Co)
  • Ni-Co-Cd resistance protein
  • Multidrug resistance protein
  • Lysis tolerance protein (penicillin tolerance
    protein)
  • Toluene resistance protein
54
Functional Genomics Reveals Presence of the
Reductive TCA Cycle
Two 40 kb clones containing dominant 16S rDNA
phylotypes were sequenced from a fosmid library
of the A. pompejana symbiont community. Common
ORFs include the key gene in the reductive TCA
cycle, ATP citrate lyase (aclAB).
[Figure: Fosmid analysis of A. pompejana symbionts. Clones 7G3 and 6C6; gene labels: 16S rDNA, ATP citrate lyase; match categories: C. tepidum, Vibrio cholerae, C. jejuni, Aquifex, Other, No significant matches]
55
Reductive Tricarboxylic Acid Cycle
  • KEY ENZYMES
  • ATP citrate lyase (aclB)
  • 2-oxoglutarate oxidoreductase (oor)

56
Presence of rTCA enzymes in the A. pompejana
episymbiont metagenome
Over 9900 ORFs generated significant hits after
BLAST analysis of 18818 clones from shotgun
libraries prepared from the episymbiont
communities of worms from 9°N and 13°N EPR. The
rTCA enzymes were found multiple times in these
libraries.
57
Levels of genomic and expressed Alvinella beta
subunit ATP citrate lyase clones
A paradigm shift
  • Two key genes in rTCA (acl and oor) are diverse
    and a consistent component of the epibiont
    community, and are expressed at an equivalent
    level of diversity
  • The metagenome shows that all other rTCA genes
    are present

Recent results for Alvinella and environmental
samples indicate
  • Diversity of epsilons and aclB and oorA greater
    than gammas and RuBisCO (I, II)
  • Diversity of expressed aclB and oorA more
    ubiquitous and greater than RuBisCO (I, II)

Reverse TCA may be as important as, or more important
than, Calvin-Benson for carbon fixation at vents
58
Summary (Alvinella metagenome)
  • Shotgun library sequencing effort - 300K
    (keeping 85%).
  • 150 Mbp analyzed
  • 50-60 genome equivalents
  • Developed semi-automated annotation pipeline and
    web-based biologist designed database.
  • Over 40% of acceptable sequences assigned COG
    function.
  • COG representation in line with other epsilons.
  • Many ORFs associated with chemoautotrophy,
    denitrification, sulfur metabolism, symbiosis
    (important/expressed).
  • Several genes identified that are linked to metal
    detoxification, drug resistance.
  • rTCA appears to be a functional/important
    pathway for the epibionts.
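The summary figures can be checked against one another with simple arithmetic. The sketch below reads "keeping 85" as 85% of sequences retained and treats each retained sequence as one read; both are assumptions, as are the function names. The COG counts (28,850 of 72K) are taken from the COG distribution slides.

```python
# Figures quoted in the presentation.
SEQUENCES = 300_000
KEPT_PERCENT = 85            # assumption: "keeping 85" means 85%
ANALYZED_BP = 150_000_000
GENOME_EQUIVALENTS = (50, 60)
COG_ASSIGNED, COG_TOTAL = 28_850, 72_000


def kept_sequences(total: int, kept_percent: int) -> int:
    """Number of sequences retained after quality filtering."""
    return total * kept_percent // 100


def mean_length_bp(total_bp: int, n_sequences: int) -> float:
    """Average retained length per sequence."""
    return total_bp / n_sequences


def implied_genome_size_mbp(total_bp: int, genome_equivalents: int) -> float:
    """Average genome size implied by the stated genome equivalents."""
    return total_bp / genome_equivalents / 1e6


def fraction(part: int, whole: int) -> float:
    """Simple ratio, e.g. the share of sequences placed into COGs."""
    return part / whole


if __name__ == "__main__":
    kept = kept_sequences(SEQUENCES, KEPT_PERCENT)
    print(f"{kept:,} sequences kept, "
          f"{mean_length_bp(ANALYZED_BP, kept):.0f} bp each on average")
    for ge in GENOME_EQUIVALENTS:
        print(f"{ge} genome equivalents -> "
              f"{implied_genome_size_mbp(ANALYZED_BP, ge)} Mbp per genome")
    print(f"{fraction(COG_ASSIGNED, COG_TOTAL):.1%} of sequences placed into COGs")
```

On these assumptions the numbers are mutually consistent: about 590 bp per retained sequence, an average genome of 2.5 to 3.0 Mbp, and just over 40% of sequences with a COG assignment.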

59
Summary (SymBio Corporation)
  • SymBio exists to enable academic genomics
    projects
  • Flexible, efficient solution (we do this
    everyday)
  • Alternative to genome centers, DNA sequencing
    houses
  • Medium to high throughput projects
  • Environmental, ecological, evolutionary,
    microbial, uncultivated, eukaryotic, cDNA
    genomics
  • Project contributions from preliminary data
    generation to genome completion and publication
  • Bioinformatics, annotation, phylogenetic and
    diversity analysis

60
  • Acknowledgements
  • David Mead, Lucigen Corporation
  • DSV Alvin and RV Atlantis Crew
  • NSF GenEN Biocomplexity Program
  • Amersham Biosciences

61
Acknowledgements
  • SymBio Corporation: Mojgan Amjadi, Sasha Lazetic,
    Amir Ghadiri, Roger Bhan, Danielle Hern, Danny
    Yung, Jeff Omega, Meredith Chabrier, Chingying
    Huang, Will McElroy, Alec Manoukian, Gerry
    Deckert, John Edwards, Kevin Owyang, Charito
    Ventura, Wensheng Chen
  • Amersham Biosciences Genomics Applications: Chris
    Gates, Kathy Dains, Shellie Bench, George Zhang,
    Antares Pham, Lisa Han, Kristy Ramsey, Kathy
    Caldwell, Jinsoo Kim, Rachel Villacorte,
    Gavriella Krause, Audrey Shuster
  • Monterey Bay Aquarium Research Institute: Ed
    DeLong, Jose de la Torre, Oded Beja
  • Diversa Corp. / Desert Research Inst.: Ron
    Swanson, Alison Murray, Dan Bensen, Joe
    Gczyminski
  • Univ. of Illinois: Carl Woese, Gary Olsen, Brenda
    Wilson
  • Quorex Pharmaceuticals: Jeff Stein
  • Univ. of Delaware: Craig Cary, Barb Campbell
  • UCSD-SIO: Horst Felbeck, Terry Gaasterland
  • San Diego Zoo/CRES: Ollie Ryder, Leona Chemnick,
    Ya-Ping Zhang
  • Univ. of Colorado: Norm Pace, Dan Frank
62
Recent Publications
  • Feldman R.A. (2004) Coevolution of microbial
    pathogens and their hosts, in Microbial Genomes,
    ed. by C. Fraser, T. Read and K.E. Nelson, Humana
    Press, Totowa NJ.
  • Feldman R.A., S. Bench, D. Bensen, M. Amjadi,
    A. Ghadiri, H. Felbeck, J. Stein (in prep) Genome
    of the uncultivated endosymbiont of the deep-sea
    hydrothermal vent tubeworm Riftia pachyptila.
  • N. Saunders, T. Thomas, P. Curmi, J. Mattick,
    E. Kuczek, R. Slade, J. Davis, P. Franzmann,
    D. Boone, K. Rusterholtz, R. Feldman, C. Gates,
    S. Bench, K. Sowers, K. Kadner, A. Aerts,
    P. Dehal, C. Detter, T. Glavina, S. Lucas,
    P. Richardson, F. Larimer, L. Hauser, M. Land and
    R. Cavicchioli (2004) Mechanisms of thermal
    adaptation revealed from the genomes of the
    Antarctic Archaea Methanogenium frigidum and
    Methanococcoides burtonii, Genome Research.
  • Beja O., E. V. Koonin, L. Aravind, L. T. Taylor,
    H. Seitz, J. L. Stein, D. C. Bensen,
    R. A. Feldman, R. V. Swanson, and E. F. DeLong
    (2002) Comparative genomic analysis of archaeal
    genotypic variants in a single population, and in
    two different oceanic provinces, Appl Environ
    Microbiol, 68(1):335-345.
  • Feldman R.A. and D. Harris (2000) Beyond the
    human genome: fine-scale, high-throughput
    dissection of Earth's microbial biodiversity.
    J. of Clinical Ligand Assay, 23(4):256-261.
  • Beja O., Aravind L., Koonin E.V., Suzuki M.T.,
    Hadd A., Nguyen L.P., Jovanovich S.B., Gates C.,
    Feldman R.A., Spudich J.L., Spudich E.N., and
    E. F. DeLong (2000) Bacterial rhodopsin: evidence
    for a new type of phototrophy in the sea,
    Science, 289:1902-1906.
  • Beja O., Suzuki M.T., Koonin E.V., Aravind L.,
    Hadd A., Nguyen L.P., Villacorta R., Amjadi M.,
    Garrigues C., Jovanovich S.B., Feldman R.A.,
    E. F. DeLong (2000) Construction and analysis of
    bacterial artificial chromosome libraries from a
    microbial assemblage. Environ. Microbiol.,
    2:516-529.