A polyphasic approach for identification of bacteria: the Burkholderia cepacia example - PowerPoint PPT Presentation

Slides: 75
Provided by: LMG

1
Microbial genomics
Tom Coenye
2
  • Overview
  • The sequencing of prokaryotic genomes
  • Analysis of prokaryotic genome sequences
  • Diversity of prokaryotic genome sequences
  • Use of genome sequences in biodiversity studies
  • Case studies

3
  • The sequencing of prokaryotic genomes
  • Growing the organism and DNA isolation
  • Approach 1: map-based cloning and sequencing
  • Approach 2: shotgun sequencing
  • Annotation

4
  • The sequencing of prokaryotic genomes
  • Growing the organism and DNA isolation
  • Isolating DNA is normally not a problem
  • Biggest challenge: isolating DNA from
    unculturable organisms
  • E.g. Blochmannia floridanus: abdomens of 100
    Camponotus floridanus ants were crushed, cell
    debris was removed by filtration, bacterial
    cells were collected and treated with DNase I to
    remove ant DNA, followed by a normal DNA preparation

5
Approach 1: map-based cloning and sequencing
Order of fragments is known before start of
sequencing
6
Approach 2: shotgun sequencing
Order of fragments is not known before start of sequencing
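Because the order of the fragments is unknown, it has to be inferred from the overlaps between reads. The following is a minimal sketch of that idea, assuming error-free reads from one strand and no repeats; the function names and the `min_len` parameter are illustrative and do not correspond to any assembler mentioned in the slides.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    contigs = [r for r in contigs
               if not any(r != o and r in o for o in contigs)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]  # toy reads in arbitrary order
print(greedy_assemble(reads))
```

Real genomes contain repeats and sequencing errors, which is why a greedy merge like this one is only a teaching aid.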
7
Approach 2: shotgun sequencing
8
Approach 2: shotgun sequencing
9
Approach 2: shotgun sequencing
10
Approach 2: shotgun sequencing
11
  • Annotation: the search for genes
  • Identification of genes is of utmost importance
    but not straightforward
  • rRNA genes: through similarity with known rRNA
    genes
  • tRNA genes: through similarity with known tRNA
    genes and/or use of statistical models (e.g. HMMs)
  • Protein-coding genes: computer models

12
(No Transcript)
13
  • Annotation: the search for genes
  • Protein-coding genes: computer models

Gene model (P, SD, start codon, coding region, stop codon):
P ... AGGA (SD) ........ ATG (start) (XXXXXX)n TAA (stop)
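The gene model on this slide can be turned into a very simple search. The sketch below scans the forward strand only for ATG, followed by in-frame codons and a stop codon, and flags whether an AGGA motif lies shortly upstream. The 20-nt upstream window, the minimum length and the use of all three standard stop codons are assumptions made for illustration, and real gene finders rely on statistical models rather than this pattern match.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3, sd_motif="AGGA", sd_window=20):
    """Return (start, end, has_sd) for simple ORFs on the forward strand.

    start/end are 0-based, end-exclusive and include the stop codon.
    has_sd is True when sd_motif occurs within sd_window nt upstream.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                j = i + 3
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                    j += 3
                if j + 3 <= len(seq):  # an in-frame stop codon was found
                    n_codons = (j - i) // 3  # codons before the stop
                    if n_codons >= min_codons:
                        upstream = seq[max(0, i - sd_window):i]
                        orfs.append((i, j + 3, sd_motif in upstream))
                    i = j + 3  # continue after this ORF
                    continue
            i += 3
    return orfs


toy = "CCAGGACCTTCCATGAAACCCGGGTAACC"  # made-up sequence, not real data
for start, end, has_sd in find_orfs(toy):
    print(start, end, toy[start:end], has_sd)
```

A pattern search of this kind finds many open reading frames that are not genes, so candidates still need biological confirmation.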
14
  • Annotation: the search for genes
  • Biological confirmation remains necessary!
  • E.g. over-annotation is a common problem

Skovgaard et al., 2001
15
  • Sequencing centers
  • Sanger Centre (UK) has sequenced more than 2
    billion bases between 4 October 1993 and now
  • Equipment includes 250 ABI 3700 and MegaBACE
    capillary sequencers

16
Evolution of sequencing
(Figure: chart with a vertical axis running from 0 to 100,000,000 and the years 77, 95, 97 and 03 marked on the time axis)
17
Evolution of sequencing
45,861,651,747 bases in 31,904,910 records
18
  • Analysis of prokaryotic genomes
  • In silico analysis
  • Microarrays
  • Subtractive hybridisation
  • Proteomics
  • Making knock-out mutants

19
  • In silico analysis of prokaryotic genomes
  • Comparison of genome sequences on the computer,
    using specialised bioinformatics tools
  • Comparison of gene content
  • Alignment of genes for phylogenetic purposes
  • Looking for gene duplications
  • ...

20
(No Transcript)
21
  • Microarrays
  • DNA microarray: miniaturised version of a
    dot-blot hybridisation
  • Immobilised DNA is hybridised with labelled
    probe DNA or RNA
  • High throughput (up to 300,000 spots/array)

22
  • Microarrays
  • Different technologies
  • PCR product arrays (amplification of ORFs and
    spotting, or in situ PCR)
  • Oligonucleotide arrays (immobilisation of
    pre-synthesised oligos or in situ synthesis)

23
E.g. XeoChips
24
  • Microarrays
  • Study of gene expression (are these ORFs
    expressed under certain conditions?)
  • Comparative genomics
  • Comparing gene content
  • All genes or subset (e.g. virulence genes)
  • Allows organisms to be included in comparative
    genomics research without a complete genome
    sequence being available

25
  • Microarrays
  • E.g. identification of new genes in Klebsiella
    pneumoniae by means of an E. coli microarray

26
Microarrays
Green: E. coli specific
Red: K. pneumoniae specific
Yellow: common genes
27
Subtractive hybridisation
PCR-based way to isolate sequences present in genome 1 but not in
genome 2
28
Subtractive hybridisation
29
  • Proteomics
  • Study of the expressed protein complement of the
    genome, expressed by specific cells at a
    specific moment under specific conditions
  • Typical approach: separation of proteins by
    2D gel electrophoresis (IEF and SDS-PAGE) and
    identification of differentially expressed
    proteins by means of mass spectrometry

30
Proteomics
E.g. induction of heat-shock proteins by
elevated temperature in Bradyrhizobium japonicum
(Munchbach et al., 1999)
31
  • Knock-out mutants
  • Final proof for biological function of gene
    product
  • Time consuming
  • Analysis not straightforward, for example:
  • Duplicated genes
  • No phenotype
  • ...

32
  • Diversity of prokaryotic genomes
  • Which organisms have been sequenced?
  • Genome size and organisation
  • Number of genes, functional groups
  • Visualisation of properties of prokaryotic genomes

33
Which organisms have been sequenced?
http://igweb.integratedgenomics.com/GOLD/
34
Which organisms have been sequenced?
http://www.ncbi.nlm.nih.gov/genomes/MICROBES/Complete.html
http://www.tigr.org
http://www.jgi.doe.gov/JGI_microbial/html/index.html
http://www.sanger.ac.uk/
35
Which organisms have been sequenced?
            Finished   In progress   Total
Bacteria    128        391           519
Archaea     17         22            39
Total       145        413           558
36
  • Which organisms have been sequenced?
  • Available genomes are NOT representative of total
    biodiversity
  • Emphasis on:
  • Medically important organisms (humans and
    animals)
  • Organisms with biotechnological applications

37
  • Which organisms have been sequenced?
  • Example 1: Escherichia coli
  • 4 completed genomes available
  • 7 additional genomes in progress
  • Strains K12 and O157 have been sequenced twice
  • Example 2: Agrobacterium tumefaciens C58 has
    been sequenced twice

38
  • Genome size and organisation
  • Genome size: 580,074 bp (Mycoplasma genitalium)
    to 9,105,828 bp (Bradyrhizobium japonicum)

39
  • Genome size and organisation
  • Genome organisation
  • 1 circular chromosome (e.g. Escherichia coli: 4.6
    to 5.4 Mbp)
  • Multiple circular chromosomes (e.g. Ralstonia
    solanacearum: 3.7 Mbp and 2.1 Mbp; Burkholderia
    cenocepacia: 3.8 Mbp, 3.2 Mbp and 0.9 Mbp)
  • 1 linear chromosome (e.g. Borrelia burgdorferi: 0.9
    Mbp)
  • 1 linear and 1 circular chromosome (e.g.
    Agrobacterium tumefaciens: 2.8 and 2.1 Mbp)

40
  • Genome size and organisation
  • Plasmids can also be present
  • Borrelia burgdorferi: 1 linear chromosome (0.91
    Mbp)
  • Also contains 21 plasmids, 12 linear and 9
    circular (total 0.61 Mbp)

41
  • Genome size and organisation
  • GC content: 22.5% (Wigglesworthia brevipalpis) to
    72.1% (Streptomyces coelicolor)

42
  • Genome size and organisation
  • Number of genes: 480 (Mycoplasma genitalium) to
    8,317 (Bradyrhizobium japonicum)

43
  • Genome size and organisation
  • Number of genes: 480 (Mycoplasma genitalium) to
    8,317 (Bradyrhizobium japonicum)
  • Coding density: 75% (Rickettsia prowazekii) to
    94% (Campylobacter jejuni)
  • Very compact when compared to eukaryote genomes
    (e.g. Fugu rubripes: only 25% of genome is
    coding)
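To make the compactness concrete, the two extremes quoted on these slides can be converted into gene densities. The sketch below uses only the genome sizes and gene counts given above; the helper names are illustrative.

```python
# Genome size (bp) and number of genes, as quoted on the slides.
genomes = {
    "Mycoplasma genitalium": (580_074, 480),
    "Bradyrhizobium japonicum": (9_105_828, 8_317),
}


def genes_per_kb(size_bp, n_genes):
    """Average number of genes per 1,000 bp of genome."""
    return n_genes / (size_bp / 1000)


def bp_per_gene(size_bp, n_genes):
    """Average stretch of genome (bp) available per gene."""
    return size_bp / n_genes


for name, (size_bp, n_genes) in genomes.items():
    print(f"{name}: {genes_per_kb(size_bp, n_genes):.2f} genes/kb, "
          f"{bp_per_gene(size_bp, n_genes):.0f} bp per gene")
```

Both the smallest and the largest genome come out at roughly one gene per kilobase, even though their sizes differ more than fifteen-fold.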

44
  • Relationship genome size/number of genes
  • (Konstantinidis & Tiedje, 2003)
  • Linear relationship between genome size and
    number of genes (r² = 0.97)
  • Linear relationship between genome size and
    coding density (r² = 0.72)

Larger genomes have more genes but not more junk
DNA!
45
  • Relationship genome size/number of genes
  • (Konstantinidis & Tiedje, 2003)
  • Regulatory genes (signal transduction and
    transcriptional control) overrepresented in
    larger genomes
  • Genes involved in metabolism and transport also
    overrepresented in larger genomes
  • Genes involved in DNA metabolism are
    underrepresented in larger genomes

46
Visualisation
AGACCGAAATTTACGCACCTGTGGACAATCTGGGGAGAATTTTGAACAGT
TCCGTCTTATTCCAGTAAT TCACAGGCGTCTCGAAGACGAGAGACGCCA
CTTGCGGATTGTGGAAAAACACCACCTTATTCACCCCGCG
GCTCGGCCCGTCGGACAATTCAGAGATTTGTCCCGGTTTATCAACAGGGG
GAGAAAAACAGCGTGGAGAA CAAAAAAAGCTTCTTCCATCTGCACCTGA
TTTCGGACTCGACGGGAGAGACTCTGATGTCGGCCGGCCGC
GCCGTCTCGGCGCAGTTTCATACATCCATGCCGGTGGAACATGTCTATCC
GATGATCCGCAACCAGAAGC AGCTCGCGCAGGTCATCGATCTCATCGAC
AAGGAGCCCGGCATTGTTCTTTATACAATCGTTGATCAGCA
GCTGGCGGAATTCCTGGATCTGCGCTGCCATGCGATTGGCGTGCCCTGCG
TCAACGTTCTCGAACCGATC ATCGGCATTTTCCAGACCTATCTCGGCGC
GCCGTCCAGGCGGCGGGTGGGTGCGCAACACGCGCTGAATG
CCGATTATTTCGCGCGGATCGAAGCACTCAATTTCGCCATGGACCATGAT
GACGGGCAGATGCCGGAGAC CTATGACGATGCGGATGTCGTCATCATCG
GCATCAGCCGCACGTCGAAAACACCAACCAGCATCTATCTT
GCTAACAGGGGCATAAAGACTGCCAATATCCCGGTCGTTCCCAATGTGCC
TTTGCCCGAAAGCCTATATG CCGCGACCCGGCCGTTGATCGTCGGTCTC
GTCGCGACATCGGATCGCATATCGCAGGTTCGTGAGAACAG
GGATCTGGGTACAACCGGCGGATTTGACGGTGGCCGTTACACGGATCGCG
CCACCATCATGGAAGAGCTG AAATATGCGCGTGCGCTCTGCGCCCGCAA
CAATTGGCCGCTGATCGACGTCACACGCCGTTCCATCGAGG
AAACGGCCGCGGCGATCCTTGCCCTGCGCCCGAGGACGCGATAATCCGAA
TCGCATCATCAGGAGCAGAC AGTCGATGAAACAAGAGTTGATCCTCGCC
TCATCCAGCGCATCCCGGCAGATGCTGATGCGCAATGCGGG
GCTGACATTTTCGGCAATACCCGCGGATATTGATGAGCGTGCGCTTGATG
AGCAACTGGAACGGGACGGC GCCAGCCCCGAAGAGGTTGCGCTGGAACT
TGCGCGGGCGAAGGCTCTTGCAGTCAGTGCGCTCCATCCAG
AAGCACTGGTTCTTGGCTGCGACCAGACCATGGCGCTCGGCACACGCGTT
TATCACAAGCCAAAAAACAT GGCGGAAGCCGCGACGCATCTGCTGTCGT
TGTCCGGCAAGGTCCACCGCCTGAACAGCGCGGCTGTTCTC
GTTCACAACGGAAAGGTGGTGTGGCAGACCGTTTCCAGTGCAGAGCTTGC
CGTTCGAACCTTGAGCGCTG AGTTTGTGTCCCGCCACCTGCAGCGGGTG
GGAGAAAAGGCGCTCAGCAGCGTCGGCGCTTACCAGCTTGA
GAGGGAAGGAATCCAGCTATTCACCTCCATAGAGGGGGATTATTTCACGA
TCCTCGGTTTGCCGCTTCTG CCTCTTTTATCGAAACTACGCGACATGGA
TGTCATCGATGGCTGATTCACGTGAAACATTAACTATAAAT
GCCTTCGTTGTCGGTTACCCGATCAAACATTCCCGGTCGCCGATCATCCA
TTCCTATTGGCTGAAAAAAT TCGGTATCGCCGGTTCCTATACGGCAGTT
GAGGTCTCCCCAGACGATTTCCCGAAGTTCATTGCAACGCT
GAAGGAAGGCAAGCCGGGTGCAGCGGTGGGCGGTAACGCCACCATTCCGC
ACAAGGAAGCGGCTTACCGG TTGGCCGATCATCCCGATGCCTTGGCGGA
AGAACTCGGCGCCGCCAACACCATCTGGATGGAGGAGGGTA
AACTCCACGCGACCAACACGGATGGTTACGGTTTCGTCTCGAACCTGGAC
GAGCGGCATCCGGGCTGGGA TAAGACCCAGCGCGCGGTGGTGTTCGGCG
CCGGCGGTGCAAGCCGGGCCGTCATTCAGTCGCTGCGTGAT
CGGGATGTTGCGGAAATTCACGTCGTGAACCGTACGGTCGAGCGCGCTCG
CGAACTGGCCGACCGCTTTG GCCCACGGGTCTTTTCCCATCCCCAGGCA
GCGCTTCAGGAGGTCATGCACGGCGCGGGGTTGTTCGTGAA
47
  • Visualisation
  • Genome Atlas (http://www.cbs.dtu.dk/services/GenomeAtlas/)
  • TIGR CMR (http://www.tigr.org/tigr-scripts/CMR2/choose_genome.spl)
  • Artemis (http://www.sanger.ac.uk/Software/Artemis/)
  • ACT (http://www.sanger.ac.uk/Software/ACT/)
  • Apollo
  • SeqVISTA
  • EMBOSS, JEMBOSS, EMBASSY, ...

48
(No Transcript)
49
(No Transcript)
50
(No Transcript)
51
(No Transcript)
52
Novel approaches in taxonomy based on
whole-genome sequences
  • Supertree approach
  • Presence/absence of characteristics
    (genes/gene families/protein folds/conserved
    indels)
  • Differences in gene content
  • Gene order / synteny
  • Differences in nucleotide composition

53
Supertrees
  • Availability of genome sequences makes it possible to
    determine which genes are shared between a
    number of organisms
  • These genes can then be used for phylogenetic
    analysis
  • Analysis of protein-coding genes can deliver
    important additional information compared to 16S
    rRNA gene sequences
  • Combined alignments of conserved proteins

54
Presence/absence analysis
55
Presence/absence analysis
  • Genes
  • Gene families

56
Presence/absence analysis
  • Genes
  • Gene families
  • Protein folds (β-propeller, TIM-barrel,
    Zn-β-lactamases, ...)
  • Conserved insertions and deletions (indels)
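A presence/absence analysis boils down to a binary profile per genome, one position per characteristic, which can then be compared pairwise. The sketch below assumes the characteristics have already been identified; the genome and family names are hypothetical placeholders, and a plain count of differing positions stands in for whatever distance measure a real study would use.

```python
def presence_absence(genomes):
    """Build one 0/1 profile per genome over all observed characteristics.

    genomes maps a genome name to the set of characteristics it contains.
    """
    features = sorted(set().union(*genomes.values()))
    profiles = {name: [int(f in content) for f in features]
                for name, content in genomes.items()}
    return features, profiles


def profile_distance(p, q):
    """Number of characteristics present in one genome but not the other."""
    return sum(a != b for a, b in zip(p, q))


# Hypothetical genomes and gene families, for illustration only.
genomes = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam4", "fam5"},
}
features, profiles = presence_absence(genomes)
for name, profile in profiles.items():
    print(name, profile)
print(profile_distance(profiles["genome_A"], profiles["genome_B"]))
```

The resulting pairwise distances can be fed into any standard clustering or tree-building method.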

57
Presence/absence analysis
glmU (UDP-N-acetylglucosamine pyrophosphorylase)
shows a 17 aa insertion in Archaea and Chlamydiae
58
Presence/absence analysis
59
Differences in gene content and gene order
  • Simple approach: consider 2 genomes as bags
    filled with genes and compare content of both
  • Extension: if the same genes are present, are
    they present in the same order?
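Both questions can be sketched in a few lines once each genome is written as an ordered list of gene identifiers. The example below assumes orthologous genes share the same identifier and treats genomes as linear; the measure of conserved gene order (fraction of neighbouring shared-gene pairs kept) is one simple choice among many, not the method used in the slides.

```python
def content_similarity(g1, g2):
    """Shared genes divided by all genes seen in either genome."""
    s1, s2 = set(g1), set(g2)
    return len(s1 & s2) / len(s1 | s2)


def adjacent_pairs(order):
    """Unordered neighbouring gene pairs, so an inverted pair still counts."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def conserved_adjacency(g1, g2):
    """Fraction of neighbouring shared-gene pairs in g1 also adjacent in g2."""
    shared = set(g1) & set(g2)
    o1 = [g for g in g1 if g in shared]
    o2 = [g for g in g2 if g in shared]
    p1, p2 = adjacent_pairs(o1), adjacent_pairs(o2)
    return len(p1 & p2) / len(p1) if p1 else 0.0


# Hypothetical gene orders, for illustration only.
genome_1 = ["a", "b", "c", "d", "e"]
genome_2 = ["a", "b", "x", "d", "c", "e"]
print(content_similarity(genome_1, genome_2))
print(conserved_adjacency(genome_1, genome_2))
```

In this toy case all five genes of the first genome are shared, yet only half of the neighbouring pairs are preserved, which illustrates how gene order can diverge faster than gene content.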

60
Differences in gene content and gene order
  • Synteny is less conserved than gene content
  • Synteny < shared genes < identity between shared
    genes

61
Differences in nucleotide composition
  • %GC (%A, %C, %G, %T)
  • Relative abundance of di/tri/tetra nucleotides
    (AA, AC, AG, AT, CA, ...)
  • Mathematical
  • $\rho_{XY} = f_{XY} / (f_X f_Y)$ (X, Y = A, C, G, T; XY = AA,
    AC, AG, AT, ..., TT)
  • $\delta(f,g) = \frac{1}{16} \sum_{XY} |\rho_{XY}(f) - \rho_{XY}(g)|$ (within species
    < 20)
  • Similar statistics apply to higher-order
    nucleotides
  • Codon statistics (codon usage, GC content of
    synonymous 3rd position)
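The dinucleotide statistics above translate directly into code. The sketch below computes the relative abundance of each dinucleotide as its observed frequency divided by the product of the two mononucleotide frequencies, and the distance between two sequences as the mean absolute difference over the 16 dinucleotides. It works on a single strand as given; the symmetrised variant that also counts the reverse complement is not implemented. The "within species < 20" threshold on the slide presumably refers to the distance multiplied by 1000, which is an assumption here.

```python
from itertools import product

BASES = "ACGT"


def gc_content(seq):
    """Fraction of G and C in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def relative_abundance(seq):
    """Relative abundance f_XY / (f_X * f_Y) for all 16 dinucleotides."""
    seq = seq.upper()
    n = len(seq)
    f_mono = {b: seq.count(b) / n for b in BASES}
    n_di = n - 1  # number of overlapping dinucleotide positions
    counts = {x + y: 0 for x, y in product(BASES, repeat=2)}
    for i in range(n_di):
        di = seq[i:i + 2]
        if di in counts:  # skip positions with ambiguous bases
            counts[di] += 1
    rho = {}
    for di, c in counts.items():
        denom = f_mono[di[0]] * f_mono[di[1]]
        rho[di] = (c / n_di) / denom if denom else 0.0
    return rho


def delta(seq_f, seq_g):
    """Mean absolute difference in relative abundance over 16 dinucleotides."""
    rho_f, rho_g = relative_abundance(seq_f), relative_abundance(seq_g)
    return sum(abs(rho_f[d] - rho_g[d]) for d in rho_f) / 16


# Made-up toy sequences, far too short for a meaningful genomic signature.
print(gc_content("ACGT" * 10))
print(delta("ACGT" * 10, "AACC" * 10))
```

Meaningful signatures need long stretches of sequence, so in practice these functions would be applied to whole genomes or large fragments.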

62
Examples
  • Genome-scale metabolic model of Helicobacter
    pylori 26695. Schilling et al., J. Bacteriol.
    184:4582-4593 (2002)
  • Phylogenetic position of the Aquificales
  • Comparative analysis of the genome sequences of
    Bordetella pertussis, Bordetella parapertussis
    and Bordetella bronchiseptica. Parkhill et al.,
    Nature Genetics (2003) (can be downloaded at
    http://www.sanger.ac.uk/Projects/B_pertussis/Bordetella_genomes.pdf)

63
  • Goal
  • Thorough study of the metabolic potential of
    Helicobacter pylori based on genome sequence
  • Calculate the growth demands

64
(No Transcript)
65
  • Several key enzymes from glycolysis (PFK, PK) are
    absent: deficient in hexose metabolism
  • All pentose phosphate reactions are present
    except G6P-DH: oxidative branch of pentose
    phosphate pathway incomplete
  • Entner-Doudoroff pathway is present: C can be
    transferred from pentose phosphate pathway to
    G3P and pyruvate
  • No PEP carboxylase: PEP cannot be transformed
    to oxaloacetate
  • PEP cannot be transformed to TCA intermediate
  • TCA cycle is complete but in a slightly modified
    form

66
(No Transcript)
67
  • Required components: Ala, Arg, His, Ile, Leu,
    Met, Phe, Val, thiamine, phosphate, oxygen and
    sulphate/cysteine
  • No requirement for purines (de novo synthesis is
    possible)
  • Ala and Arg are important C sources; use of AA
    as C sources is coupled to production of ammonia
  • 14 alternative C sources were identified
  • Oxygen is necessary for production of NAD, NADP,
    CTP, UTP, dCTP and dUTP: an e- acceptor is
    needed to allow oxidation of FADH to FAD

68
  • 9 essential AA for humans, 8 for H. pylori; 6 in
    common
  • Strategy of metabolic design seems ideal and
    could be coupled to co-evolution of this
    organism with its host (i.e. humans)
  • Elimination of genes required for energetically
    unfavourable AA biosynthesis (proteolysis in
    stomach releases unlimited supply of AA)
  • Use of AA as C source results in production of
    ammonia, which helps to neutralise the acid pH in
    the stomach
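The "6 in common" figure can be checked with a set intersection. The H. pylori list below is the set of required amino acids given on the earlier slide; the human list is the standard textbook set of nine essential amino acids, which is not taken from the slides.

```python
# Amino acids required by H. pylori, as listed on the growth-requirements slide.
h_pylori_required = {"Ala", "Arg", "His", "Ile", "Leu", "Met", "Phe", "Val"}

# Standard textbook list of essential amino acids for humans (assumption:
# not part of the original slides).
human_essential = {"His", "Ile", "Leu", "Lys", "Met", "Phe", "Thr", "Trp", "Val"}

common = human_essential & h_pylori_required
only_h_pylori = h_pylori_required - human_essential
only_human = human_essential - h_pylori_required

print("In common:", sorted(common))
print("Only H. pylori:", sorted(only_h_pylori))
print("Only humans:", sorted(only_human))
```

The two amino acids required only by H. pylori are Ala and Arg, the same two that the model identifies as important carbon sources.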

69
A genomic perspective on the taxonomic
position of Aquifex aeolicus
  • Genus Aquifex: marine, hyperthermophilic,
    microaerophilic, hydrogen-oxidising, recovered
    from marine sediments near Iceland
  • Based on 16S rRNA sequence data: one of the
    deepest branching lineages
  • But considerable debate regarding its true
    taxonomic position
  • Deep (type of ribosomes, EF-G and EF-Tu
    sequences, ribosomal proteins, rpoB and rpoC
    sequences)
  • Not deep but close to ε-Proteobacteria
    (Helicobacter, Campylobacter, Wolinella) (cytochrome
    bc, four amino-acid insert in
    Ala-tRNA synthetase, ultrastructure)

70
16S rRNA gene
71
Differences in gene content
72
Dinucleotide relative abundance
73
The supertree
74
Conclusions (?)
  • Coming to a consistent picture is not
    straightforward
  • No particular relationship between Aquifex and
    ε-Proteobacteria
  • Aquifex can be considered as a primitive species
    with primitive genes