Don Seto - PowerPoint PPT Presentation

About This Presentation
Slides: 152
Provided by: chris82
Learn more at: http://www.binf.gmu.edu

Transcript and Presenter's Notes

Title: Don Seto


1
Don Seto
Dept of Bioinformatics and Computational Biology
dseto@gmu.edu
Sept 8, 2008
2
Binf 732 Genomics: DNA sequencing and analysis
applications I
  • Historical perspectives
  • DNA chemistry and biochemistry (basic research)
  • Molecular biology and the Central Dogma
  • Recombinant DNA technology (applied research)
  • DNA sequencing methodology
  • Why genomics?
  • Genome sequencing strategies
  • Instrumentation
  • Good data/bad data
  • Applications of sequence data
  • Instrumentation (technology development)
  • Data processing: signal resolution, signal to
    noise, base-calling, sequence assembly, QC
  • Viruses and genomics (new insights)
3
Binf 732 Genomics: DNA sequencing and analysis
applications II
  • Genome annotation
  • Applications of genome methodology: small scale
  • Model organisms, surrogate for human biology?
  • Big science and large-scale, high-throughput
    sequencing
  • Industrial-strength: Human Genome Project ()
  • Next generation DNA analysis technology: two
    examples
  • Cancer genomics
4
Genomics: what is it?
  • It's a new, and changing/adapting, field
  • ornl.gov: Genomics is the study of genes and
    their function
  • Answers.com: The study of all of the nucleotide
    sequences, including structural genes, regulatory
    sequences and noncoding DNA segments, in the
    chromosomes of an organism
  • Wiki (genomics): The study of an organism's
    entire genome; in contrast, the investigation of
    single genes, their functions and roles does
    not fall into the definition of genomics
  • Cites US EPA's definition: study of all the genes
    of a cell or tissue at the DNA (genotype), mRNA
    (transcriptome) or protein (proteome) levels;
    also ???-ome
  • F Sanger, et al. sequenced complete genomes of a
    bacteriophage, ΦX174 (5,386 bp; 1977), and a
    human mitochondrion (about 16,500 bp; 1981)
  • Mitochondrial sequence now known as the Cambridge
    Reference Sequence (CRS)
  • Reference for studies on human evolution,
    population genetics and mito disease
  • Established techniques of sequencing, genome
    mapping, data storage and bioinformatics analyses

5
This is Genomics!
  • NYT Jan 22, 2007: Close-ups of the Genome,
    Species by Species by Species
  • Circos: interactive site/resource
  • Outer band represents each species' first
    chromosome
  • Numbers represent millions of bp on the
    chromosome
  • Bar charts tell how many bp, 0-1M, match part of
    a human chromosome
  • Line charts show what is similar to each of the
    other five genomes
  • Lines join the 200 regions on each chromosome
    most similar to human; thicker = more similar
  • OTHER types of comparisons, e.g., BRCA1: green
    lines represent protein similarity

6
DNA and recombinant DNA, short history of....
  • Miescher 1868. First isolation of nucleic acids,
    from salmon, as "nuclein" (R Altmann 1889)
  • Levene 1919. Identified four bases, sugar and
    phosphate as components; P-S-B order
  • Griffith 1928. S. pneumoniae: an inert factor
    transforms a harmless strain into an infectious
    one (-> death), so the factor is functional
  • Chargaff 1940s. Base ratios: A=T, G=C; Pur=Pyr
  • Avery, MacLeod, McCarty 1944. S. pneumoniae
    follow-up: the factor is nuclease sensitive
  • Lederberg 1945; Wollman and Jacob 1955.
    Conjugation/bacterial sex
  • Hershey, Chase 1952. T2: DNA injected, protein
    coat stays outside
  • Watson, Crick, Wilkins, Franklin 1953. Double
    helix structure, implications for replication
  • Kornberg, et al. 1958. Biochemical basis of DNA
    replication
  • Meselson and Stahl 1958. Semi-conservative
    replication
  • Matthaei and Nirenberg; Khorana 1961-1965.
    Genetic code
  • 1971. Specific cleavage of SV40 by RE. Danna
    and Nathans, (71) PNAS 68:2913
  • Kelly and Smith, (70) JMB 51:393
  • 1972. DNA cloning
  • 1975. Asilomar Conference on Recombinant DNA
    (organized by Paul Berg)
  • (Ref: H. Judson. Eighth Day of Creation. 1979)

7
Nucleic acids chemistry
  • Biochemistry: as mRNA, tRNA and rRNA (sn, iRNA);
    DNA
  • Roles: information, structure, mediators
    (regulatory, signal, energy)

8
(No Transcript)
9
Monomers linked
  • Properties and constraints
  • Physical
  • Chemical
  • Biological

10
Chargaff's Rules
11
Bridging chemistry to biology: Structure of the
DNA double helix
James Watson, 1928-pres
Francis Crick, 1916-2004
  • http://www.achievement.org/autodoc/photocredit/achievers/wat0-001
  • DNA photo: wiki

12
DNA: very stable, and contains highly specific
conserved information
  • How to ensure fidelity?
  • How to access it without losing it?
13
RNA secondary structures
14
Double helix is a problem, so are secondary
structures
15
Importance of secondary and/or local structures
as biochemistry
  • AAV genome is 4680 nucleotides
  • Integration into chr 19 unique site
  • DNA and secondary structure
  • Rep proteins bind to integration site
  • R Owens et al, 1994

16
Physical chemistry of DNA replication, RNA
transcription
17
Mediation of proteins emulates physical
chemistry (proteins interact with nucleic acids)
18
Bridging chemistry to biology
  • Buchner
  • biology (extracts) to synthesize biochemicals
  • Approach to research
  • Freezer of proteins, as reagents

19
Bridging chemistry to biology
  • Cell-free extract of yeast cells, as
    "press juice", ferments sugar
  • -> living yeast cells not needed for
    fermentation

Buchner, 1860-1917
20
Proteins structure and function, roles in the
cell
  • Paradigm change- proteins not as research topic
    but
  • As reagents and tools, and
  • Processes as well

21
Metabolic pathways within cells
22
How is the genome information accessed?
- recognition sites for gene expression
- same for replication?
  • Proteins bridge DNA and molecular biology of the
    cell

23
Initiation of bacterial chromosome replication
24
Genomics of initiation of bacterial chromosome
replication
  • Replication initiation of broad host range
    plasmid RK2, oriV
  • Amino acid comparison of DnaA proteins, 47.1% to
    83.7% similarity
  • Most conserved regions: domain III (nucleotide
    binding region) and domain IV (DNA binding site)
  • Domain II longer in S. coelicolor than E. coli:
    added role?
  • Arrows indicate protein-protein interacting
    domains
  • http://www.jbc.org/cgi/content/full/275/24/18454/F1

25
Proteins interact very specifically with DNA
sequences
26
DNA-binding proteins have several roles
  • Modifiers
  • Activators
  • Gene expression
  • Other processes
  • Repressors
  • ditto
  • Recruiters
  • Stabilizers

27
Replication proteins as a macromolecular
assembly, or, today, the Replisome,
involved in the central processes of the
cell, noted as the Central Dogma of Molecular
Biology
28
Biochemistry inside the cell
29
Central Dogma of Molecular Biology seemingly
often refuted by new understandings
  • Describes information flow inside the cell
  • How can this be the Central Dogma if it does
    not hold up?
  • L Moran blog: http://sandwalk.blogspot.com/2007/01/central-dogma-of-molecular-biology.html

30
Central Dogma of Molecular Biology seemingly
often refuted
  • Many younger scientists of the Recombinant DNA,
    Molecular Biology and Genomics era
    read/studied Watson's Molecular Biology of the
    Gene
  • Not many have read the original papers
  • L Moran blog: http://sandwalk.blogspot.com/2007/01/central-dogma-of-molecular-biology.html

31
Central Dogma of Molecular Biology perception
corrected
  • http://sandwalk.blogspot.com/2007/01/central-dogma-of-molecular-biology.html

32
Central Dogma of Molecular Biology perception
corrected
  • Still, prions?
  • http://sandwalk.blogspot.com/2007/01/central-dogma-of-molecular-biology.html

33
Elongation
  • Components, e.g., proteins, nucleic acids and
    small molecules, may be used as reagents
    once their biochemical roles are understood,
    and once they can be isolated and stabilized
  • Molecular biological processes, e.g., replication
    and transcription, may be used as reagents
  • Both may be modified, once understood; e.g.,
    Mn2+ for Mg2+, or ddNTP for dNTP in replication
  • Along with the same from Genetics, Cell biology,
    etc

34
DNA and recombinant DNA, short history of....
  • Miescher 1868. First isolation of nucleic acids,
    from salmon, as "nuclein" (R Altmann 1889)
  • Levene 1919. Identified four bases, sugar and
    phosphate as components; P-S-B order
  • Griffith 1928. S. pneumoniae: an inert factor
    transforms a harmless strain into an infectious
    one (-> death), so the factor is functional
  • Chargaff 1940s. Base ratios: A=T, G=C; Pur=Pyr
  • Avery, MacLeod, McCarty 1944. S. pneumoniae
    follow-up: the factor is nuclease sensitive
  • Lederberg 1945; Wollman and Jacob 1955.
    Conjugation/bacterial sex
  • Hershey, Chase 1952. T2: DNA injected, protein
    coat stays outside
  • Watson, Crick, Wilkins, Franklin 1953. Double
    helix structure, implications for replication
  • Kornberg, et al. 1958. Biochemical basis of DNA
    replication
  • Meselson and Stahl 1958. Semi-conservative
    replication
  • Matthaei and Nirenberg; Khorana 1961-1965.
    Genetic code
  • 1971. Specific cleavage of SV40 by RE. Danna
    and Nathans, (71) PNAS 68:2913
  • Kelly and Smith, (70) JMB 51:393
  • 1972. DNA cloning
  • 1975. Asilomar Conference on Recombinant DNA
    (organized by Paul Berg)
  • (Ref: H. Judson. Eighth Day of Creation. 1979)

35
Basic science to applied science, laboratory to
industry: Bacteriophage host range restriction
system
  • Or restriction-modification system
  • 1950s: bacteriophage can infect and replicate in
    one strain of bacterium and not another
  • i.e., the infected strain inhibited, or
    restricted, the growth of viruses first grown in
    another strain
  • Due to sequence-specific restriction enzyme
  • Recognition site 4-6 bp long and often a
    palindromic sequence
  • Paired with its own modification enzyme to
    protect the host's own DNA
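
A minimal sketch of what "palindromic" means for a recognition site: the site reads the same 5'->3' on both strands, so it equals its own reverse complement. The EcoRI (GAATTC) and BamHI (GGATCC) sites are standard textbook examples, not taken from the slide; the example DNA string and the function names are made up for illustration.

```python
# Illustrative sketch only: function names and the example DNA are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it equals its own reverse complement."""
    site = site.upper()
    return site == reverse_complement(site)


def find_sites(dna: str, site: str) -> list:
    """Return 0-based start positions of every occurrence of site in dna."""
    dna, site = dna.upper(), site.upper()
    width = len(site)
    return [i for i in range(len(dna) - width + 1) if dna[i:i + width] == site]


example_dna = "TTGAATTCAAGGATCCTTGAATTCA"  # made-up sequence
ecori_positions = find_sites(example_dna, "GAATTC")
print(is_palindromic("GAATTC"), ecori_positions)
```

A real restriction digest also depends on where within the site the enzyme cuts and on methylation by the paired modification enzyme; neither is modeled here.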

36
Where did molecular biotechnology begin? Basic
science to applied science laboratory to industry
  • With H Smith and W Arber, Nobel Prize '78
  • Molecular scissors as phage host range
    restriction
  • Fundamental tool for recombinant DNA technology

37
Where did molecular biotechnology begin? Basic
science to applied science laboratory to industry
  • Invented a method of cloning genetically
    engineered molecules in foreign cells
  • initiated what is now the multi-billion-dollar
    biotechnology industry
  • Collaboration began at a conference (on bacterial
    plasmids) in Hawaii in 1972
  • rock, paper scissors
  • http://web.mit.edu/invent/iow/boyercohen.html

38
Application of basic science discovery ->
recombinant DNA technology
  • Nov 1972: Honolulu meeting on plasmids
  • Collaboration
  • HW Boyer: isolated an enzyme which cut DNA
    at specific sites
  • S Cohen: method to introduce
    antibiotic-resistance-carrying plasmid into
    bacteria; method of isolating and cloning genes
    carried by plasmids
  • 1973: series of expts resulting in method to
    select and replicate specific foreign genes in
    bacteria
  • Feb 1975: Asilomar in Pacific Grove, CA; goal to
    estimate risk of biohazard and formulate
    guidelines
  • Dec 1980: First of three patents on gene cloning
    to Stanford and U Calif
  • April 1976: Genentech incorporated (Boyer,
    RA Swanson)
  • 1977: W Rutter et al cloned rat insulin gene
  • 1981: Founded Chiron
  • 1986: First recomb vaccine to receive FDA
    approval
  • Chiron-Merck hepB vaccine
  • In retrospect, first cancer vaccine
  • 29 yo venture capitalist
  • http://bancroft.berkeley.edu/Exhibits/Biotech/25

39
What does this represent? (Impact beyond
laboratory bench, applied clinical applications)
  • 1980: founded as AMGen (Applied Molecular
    GENetics)
  • Based on recombinant DNA and molecular biology
  • 1983: Amgen
  • 1983: F-K Lin clones human erythropoietin;
    recombinant as Epogen (epoetin alfa)
  • 1985: LM Souza clones human granulocyte
    colony-stimulating factor (G-CSF);
    recombinant as Neupogen (filgrastim)
  • 1987: First epo patent; 1989: First neupo patent
  • 1992: Sales > $1B; 1996: Sales > $2B; 1999:
    Sales > $3B
  • 2006: Stock falls; patent issues, pipeline

40
What does this represent? (An example of the
integration of molecular biotechnology and
society) (Impact beyond clinical applications
economy and financials)
  • 07/26/08, NYT: Amgen's experimental bone drug,
    widely considered to be crucial to the company's
    future, has succeeded in its most important
    clinical trial, sending the company's shares up
    sharply, from $53.92 to above $61
  • Sales up to several billion dollars per year
  • 44M Americans over 50 have osteoporosis
  • Previously, stock lost 50% due to falling sales
    of its anemia drugs, after some studies linked
    the drugs to worsening of cancer, death and
    cardiovascular problems
  • http://seekingalpha.com/article/87572-amgen-gets-much-needed-denosumab-boost
  • http://blog.seattlepi.nwsource.com/thelifesciencesblog/archives/141562.asp

41
What does this represent? (Impact from
recombinant DNA and genomics) (Applied science)
  • "Targeting the RANK/RANKL/OPG signaling pathway:
    A novel approach in the management of
    osteoporosis" (NAT Hamdy, Curr Opin Invest
    Drugs 8:299 (07))
  • RANK, RANKL and OPG are members of TNF receptor
    superfamily
  • Amgen's experimental bone drug, denosumab
  • Three-year study with 7,800 postmenopausal women
    with osteoporosis
  • Reduced risks of spine and hip fractures,
    compared with placebo
  • Smaller studies had shown it builds bone mineral
    density, but with questions of whether this
    translates to reduction in risks of fractures
  • See surprising reduction of hip fractures, rarer
    than spine fractures (thus, harder to show
    statistically significant effect)
  • Hip fractures are costly and potentially lethal
    medical problems
  • Earlier studies: higher rate of serious
    infection and cancer; no mention in news release
  • Denosumab is a mAb blocking action of RANK
    ligand, a protein involved in bone equilibrium
  • RANK: receptor activator of nuclear factor
    kappa B; RANKL is its ligand
  • Not initiated from academia, but from internal
    studies of genes in mice with particularly
    dense bones
  • Made mice based on superfamily data
  • http://seekingalpha.com/article/87572-amgen-gets-much-needed-denosumab-boost
  • http://blog.seattlepi.nwsource.com/thelifesciencesblog/archives/141562.asp
  • http://www.the-scientist.com/article/display/54849/

42
What does this represent? (Impact from
recombinant DNA and genomics) (Basic science)
  • Not initiated from academia, but from internal
    studies of genes in mice with particularly dense
    bones
  • 1994: S Simonet (Thousand Oaks, CA) engineered
    five transgenic mice overexpressing a previously
    unknown protein, osteoprotegerin (OPG)
  • Looked and behaved normally, but x-rays showed
    thicker pelvic and vertebral bones
  • Used this protein because its DNA sequence
    matched a family of cytokine receptors
    involved in cell death (TNFR)
  • But differs in that it is secreted, i.e., missing
    the transmembrane-spanning sequence
  • 1998: Snow Brand Milk Co (Japan) independently
    identified OPG
  • 1998: Both groups also discovered the binding
    partner
  • Similar to TNFR; called RANK; discovered at
    Immunex (Seattle) in 1997
  • Phase 1 trials with OPG, then switched to RANKL
  • OPG prevents RANKL from binding RANK; denosumab
    binds RANKL directly
  • Silver bullet
  • Immunex (WC Dougall): 13 years on RANKL project
  • Now, studying its applications in bone cancer;
    giant cell tumor of the bone dramatically shrunk
  • http://seekingalpha.com/article/87572-amgen-gets-much-needed-denosumab-boost
  • http://blog.seattlepi.nwsource.com/thelifesciencesblog/archives/141562.asp
  • http://www.the-scientist.com/article/display/54849/

43
What does this represent? (Impact from
recombinant DNA and genomics) (Applied science)
  • "Targeting the RANK/RANKL/OPG signaling pathway:
    A novel approach in the management of
    osteoporosis" (NAT Hamdy, Curr Opin Invest
    Drugs 8:299 (07))
  • RANK, RANKL and OPG are members of TNF receptor
    superfamily
  • Made transgenic mice based on superfamily data
  • What's a superfamily? TNF to bone growth??? An
    orphan receptor? Mice to man???
  • -> Genomics and Bioinformatics

44
Addendum: DNA sequencing (applied biology)
Basic research versus applied research
Technology (Instrumentation)
Example: translation of DNA replication and DNA
synthesis to DNA sequencing..... molecular
biology, genomics, bioinformatics
- signal transduction field
45
  • Originally and Simply, The Cell
  • Cells come in all shapes, sizes, functions
  • So, if we understand the cell, we can use it

46
Huge range of cells given one genome blueprint
47
Across all life forms, Diversity and commonality
48
What's in that blueprint? How do we get to it,
read it? Use it???
  • What's a superfamily? TNF to bone growth??? An
    orphan receptor? Mice to man???

49
What's in that blueprint? Once we read it, we
find You and the Fly are very similar in genes
  • What's a superfamily? TNF to bone growth??? An
    orphan receptor? Mice to man???
  • www.imdb.com and wiki

50
Model organisms value
51
Where did genomics begin? Basic science to
applied science laboratory to industry
  • Protein sequencing
  • DNA sequencing, also Maxam and Gilbert
  • Automation of DNA sequencing also et al

52
DNA sequencing methodologies, ca. 1977!
The Chemistries
  • Maxam-Gilbert
  • General
  • base modification by general and specific
    chemicals
  • depurination or depyrimidination
  • single-strand excision
  • not amenable to automation
  • Sanger
  • Specific
  • DNA replication-based
  • substitution of substrate with chain-terminator
    version
  • more efficient
  • automation

53
DNA sequencing Maxam-Gilbert
54
DNA sequencing Maxam-Gilbert Close-up
Chemistry of reactions
55
DNA sequencing Maxam-Gilbert Close-up
Chemistry of reactions
Note: 4 tubes -> 4 lanes
56
versus bio based methods
  • Sanger method or
  • dideoxynucleotide chain chemistry

57
DNA biochemistry Biochemistry of replication
fork
58
DNA replication Chemistry of replication fork
59
DNA replication Chemistry of replication fork
Problems with chemistry
60
Modify DNA replication biochemistry with
nucleotide analogs
dideoxycytidine triphosphate (ddCTP), one of the
dideoxynucleotides (ddNTPs)
61
If last base added is dideoxy, no extension
[Figure: chemical structure of a growing chain
(purine or pyrimidine bases, sugars, phosphate)
ending in a dideoxynucleotide, with H in place of
the OH]
  • Dideoxy chain termination method
  • Sanger method
  • The bio approach

62
DNA sequencing: replication reaction
terminations give ladders, with labels at
fixed primer end
63
DNA sequence analysis protocol: The bench
  • de novo vs
  • re-sequencing

64
Shotgun cloning and DNA sequencing method vs
Primer walking strategies (tiled, etc)
  • Reduction
  • Chromosome (Mb) to
  • YAC (>100 kb), BAC (100 kb) to
  • cosmid (40 kb)
  • To M13 (1 kb)
  • Wiki: DNA sequencing

65
DNA re-sequencing method
  • Wiki DNA sequencing

66
Before whole-genome sequencing analyses:
Candidate gene hunting (also, one strategy for
genome sequencing)
  • Mapmaking: Chromosome mapping
  • Chromosome walks
  • Isolation of candidate disease gene (early
    strategy)
  • Clone and sequence; bioinformatics

67
Resequencing DNA methodology: the mitochondria
  • Applied Biosys, Innovations, July 08
  • mitoSEQr System
  • PCR-based resequencing system
  • Identification of sequence variations
  • entire mito genome
  • And control region
  • Methodology
  • Overlapping regions amplified with specific
    primer pairs
  • Tailed with universal M13 sequences
  • Generates resequencing amplicons
  • Identifying mitochondrial mutations
  • Heteroplasmic mutations in affected tissues
  • C Franceschi, G Romeo, E Bonora, G Gasparre
  • Role of mitochondria in diseases, including
    cancer and Alzheimer's
  • Oncocytoma characterized by proliferation of
    mitochondria
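
The "overlapping regions amplified with specific primer pairs" idea can be sketched as a simple tiling calculation. This is only an illustration: the amplicon length (600 bp) and overlap (100 bp) are invented, not the real design of the commercial kit, and the circular mitochondrial genome (16,569 bp in the revised reference sequence) is treated as a linear coordinate range for simplicity.

```python
# Hypothetical tiling sketch; amplicon length and overlap are assumed values.
def tile_amplicons(genome_len: int, amplicon_len: int, overlap: int) -> list:
    """Return half-open (start, end) coordinates of overlapping amplicons
    that together cover positions 0..genome_len."""
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be smaller than the amplicon length")
    step = amplicon_len - overlap
    tiles = []
    start = 0
    while True:
        end = min(start + amplicon_len, genome_len)
        tiles.append((start, end))
        if end == genome_len:
            break
        start += step
    return tiles


MITO_GENOME_BP = 16569  # revised Cambridge Reference Sequence length
tiles = tile_amplicons(MITO_GENOME_BP, amplicon_len=600, overlap=100)
print(len(tiles), tiles[0], tiles[-1])
```

Each tile would correspond to one primer pair; the overlap is what lets neighboring resequencing amplicons confirm each other at their boundaries.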

68
Resequencing DNA methodology: the mitochondria
  • Applied Biosys, Innovations, July 08
  • mitoSEQr System
  • PCR-based resequencing system
  • Identification of sequence variations
  • entire mito genome
  • And control region
  • Methodology
  • Overlapping regions amplified
  • Tailed with universal M13 sequences
  • Generates resequencing amplicons
  • Identifying mitochondrial mutations
  • Heteroplasmic mutations in affected tissues
  • C Franceschi, G Romeo, E Bonora, G Gasparre
  • Role of mitochondria in diseases, including
    cancer and Alzheimer's
  • Oncocytoma characterized by proliferation of
    mitochondria

69
Tracking the Woolly Mammoth: "Out of America:
Ancient DNA Evidence for a New World Origin of
Late Quaternary Woolly Mammoths"
  • R Debruyne, HN Poinar, et al. Current Biology,
    Sept 08
  • (NYT 09/04/08)
  • Siberian woolly mammoth wasn't really Siberian
  • Origins to 6M yrs ago, with common ancestor to
    African elephant
  • Sequenced mitoDNA from 160 mammoth samples from
    across Eurasia and North America
  • Identified several clades, some endemic to
    Siberia and other parts of Asia, others to NAm
  • Separated by 1.5 Myrs of eastward migration to
    NAm

70
Tracking the Woolly Mammoth: "Out of America:
Ancient DNA Evidence for a New World Origin of
Late Quaternary Woolly Mammoths"
  • R Debruyne, HN Poinar, et al. Current Biology,
    Sept 08
  • At some point in past 150,000 yrs, NAm mammoths
    migrated back to Siberia over Bering Strait
  • At reverse migration time, endemic Siberian
    population was crashing
  • 40,000 yrs ago, NAm mammoths dominated Siberia
  • Siberian lineage died out on its own (genetic
    drift) or was out-competed?
  • The mammoth that went extinct in Siberia about
    10,000 yrs ago was not of Siberian lineage
  • Common to think of Bering Strait as a one-way
    route
  • Camels went NAm to Asia
  • http://mutex.gmu.edu:2119/cgi/content/full/2008/904/2

71
Tracking the Woolly Mammoth: "Mammoth Sequences:
A Hunt for DNA from the Extinct Titans of the
Klondike"
  • Sci Am, Sept 08: A different type of scientific
    research tool
  • Core drill designed for punching holes in
    concrete
  • Used to dig into ice, dated 100,000 yo
  • Retrieve frozen soil from Pleistocene
  • Paleomammalogist R MacPhee, AMNH, NYC
  • Water leaking into crater and freezing, remaining
    frozen, might hold DNA from mammoths, flora and
    fauna
  • May answer long-standing question of whether two
    species of mammoth, rather than just one,
    roamed the Americas at the end of last ice
    age

72
Evolution of simple technology: Throughput
considerations
  • Microbiology and molecular biology
  • Insert sizes and vectors
  • Plasmid
  • Plasmid-derived RE-defined fragments
  • 100- 500 base inserts
  • M13-based: 1 µg template, linear amp
  • 1 kb inserts
  • M13 or any: 0.1 µg (100 ng) template, cycle amp
  • Now: ng quantities, cycle amp plus detection
    technology
  • Labware
  • eppy tubes/test tubes/ 50cc conical tubes
  • microtiter plates/deep well plates
  • 48 to 96 to 384 to
  • stacked plates
  • Automation, robotics

73
The Biology: Preparation of sequence template,
M13 replication
http://www.biochem.arizona.edu/classes/bioc471/pages/Lecture5/Lecture5.html
74
Preparation of sequence template M13
modification
  • Cloning sites
  • Universal primers

http://wine1.sb.fsu.edu/bch5425/lect33/lect33.htm
75
M13 DNA prep
76
DNA sequencing: Ladders terminate
randomly, allowing reads
77
DNA sequencing: In practice, resolution of
ladders
  • Each reaction: template, polymerase, primer and
    all four dNTPs (dCTP, dTTP, dGTP, dATP), plus
    one ddNTP
  • Reaction 1: ddATP
  • Reaction 2: ddGTP
  • Reaction 3: ddTTP
  • Reaction 4: ddCTP
  • Extension, then electrophoresis
  • Base pairs shown: AT GC AT TA CG TA GC GC AT GC
    TA TA CG TA GC AT
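
As a rough illustration of how four terminated reactions turn into a readable ladder, the sketch below idealizes the chemistry: it assumes every position of the newly synthesized strand yields a terminated fragment in the lane for its base, and that the gel separates fragments perfectly by length. The function names and the example strand are made up.

```python
# Idealized sketch of four-lane Sanger ladders; names and data are hypothetical.
def sanger_ladders(new_strand: str) -> dict:
    """For each ddNTP lane (A, C, G, T), list the lengths of fragments
    that end in that base, measured from the fixed primer end."""
    ladders = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand.upper(), start=1):
        ladders[base].append(length)
    return ladders


def read_gel(ladders: dict) -> str:
    """Read all four lanes together from shortest to longest fragment."""
    base_at_length = {
        length: base for base, lengths in ladders.items() for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))


lanes = sanger_ladders("GATTACA")
print(lanes)            # fragment lengths per lane
print(read_gel(lanes))  # sequence recovered from the ladder
```

Real ladders are noisier: termination is random, band intensities vary, and resolution falls off with fragment length, which is why the later slides spend so much time on signal processing and base-calling.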
78
Manual radioactive sequencing (high resolution
denaturing PAGE)
  • Steps
  • Remove sandwich from hot chambers
  • Separate top plate of 1-3mm gel
  • Denature with acetic acid mix
  • Bind to old film
  • Wrap in saran
  • Assemble sandwich with intensifier screen
  • and x-ray cassette box
  • Expose overnight at -20 °C
  • Develop film
  • Assess
  • (Repeat?)
  • Read film
  • Enter data
  • Discard wastes
  • Other problems....

79
  • Wiki
  • http://dnasequencing.wordpress.com/2007/10/26/chain-termination-methods/

80
(No Transcript)
81
Semi-automated fluorescent DNA sequencing
primer label
  • Fred Sanger et al., 1977
  • Maxam and Gilbert, 1977
  • Leroy Hood et al., 1986
  • Applied Biosystems Inc., 1987
  • JM Prober et al., at DuPont, 2000
  • H Swerdlow et al., 1990, 1991
  • BL Karger et al., 1993

82
DNA sequencing: Upgrade, second iteration, dye
terminator label
  • Disadvantages of primer labels
  • Four separate sequencing reactions
  • Tedious manually
  • Limited to certain regions (custom oligos), or
    limited to cloned inserts behind universal
    priming sites
  • Advantages: TBD
  • Solution: Dye terminators
  • DuPont Company sold the technology to ABI as
    being of limited use

83
Semi-automated fluorescent DNA sequencing
Terminator label
Note excitation vs emission
84
Semi-automated fluorescent DNA sequencing
Terminator label: Sequencing chemistry
  • modification of the biochemistry to accommodate
  • Pre-PCR-based

85
Semi-automated fluorescent DNA sequencing
Terminator label
Note: 1 tube -> 1 lane. Also, 4x increase in
throughput
  • One reaction: template, polymerase, all four
    dNTPs (dCTP, dTTP, dGTP, dATP) and all four
    ddNTPs (ddATP, ddGTP, ddTTP, ddCTP)
  • Extension, then electrophoresis
  • Base pairs shown: AT GC AT TA CG TA GC GC AT GC
    TA TA CG TA GC AT
86
DNA sequencing instrumentation
Equipment/automation
87
Biotech Generation 1 Auto Sequencer Value
of Instrumentation
  • ca. 1986

88
ABI series 370, 373 and 377
  • semi-automated
  • ca. 1989
  • higher throughput operations
  • bioinformatics limitations -> opportunities

89
ABI 377 April 06 retirement planned
  • technology moves on
  • new Big Science (paradigm shift)
  • capabilities vs costs

90
Second generation: Capillary electrophoresis,
Sanger-based chemistry
  • ABI/Applied Biosystems
  • 1-cap 310
  • 4-cap 3100, 3130
  • 16-cap 3100, 3130xl
  • 48-cap 3730
  • 96-cap 3730xl
  • Amersham
  • MegaBACE 96-cap
  • Beckman
  • CEQ 8000 16-cap

91
Cap array screen dump
92
Multi-capillary array The Skin
93
Third generation: e.g., Shimadzu, Ltd. (DNA
sequencing technology)
  • NEW ORLEANS, March 19, 2002. PittCon
  • Shimadzu Ltd. Faster and more economical DNA
    Sequencer
  • 10 times faster and 90 percent cheaper to run
    than
  • current state-of-the-art
  • GenoMEMS, MA spinoff that has developed a
    microfabrication technology, based on Whitehead
    Inst. technology
  • Microelectromechanical system, or MEMS,
    technology microfabricated electrical and
    mechanical components
  • Five million bases per day
  • Read lengths of 800 bases
  • Target release date 2003
  • Still, Sanger chemistry-based fluorescent
  • TODAY (2005): Solexa, 454, etc. Looking for the
    $1,000 genome to the $100,000 genome
  • Archon X Prize for Genomics (X Prize Foundation,
    10/4/06)
  • $10M prize for the first team to successfully
    sequence 100 human genomes in 10 days, with an
    accuracy of < 1 error per 100,000 bases, at a
    recurring cost of no more than $10,000 per genome
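
A back-of-the-envelope sketch puts the slide's throughput figure (five million bases per day) next to the X Prize target (100 genomes in 10 days). The haploid human genome size of roughly 3 billion bp is an assumed round number, and redundancy (fold coverage) defaults to 1x here, so real requirements would be several times higher.

```python
import math

# Assumed round figure for the haploid human genome; not from the slide.
HUMAN_GENOME_BP = 3_000_000_000
# Throughput quoted on the slide for the microfabricated sequencer.
BASES_PER_DAY = 5_000_000


def days_per_genome(coverage: float = 1.0) -> float:
    """Days for one instrument to read one genome at the given fold coverage."""
    return HUMAN_GENOME_BP * coverage / BASES_PER_DAY


def instruments_needed(genomes: int, days: int, coverage: float = 1.0) -> int:
    """Instruments running in parallel to finish the genomes in time."""
    total_bases = genomes * HUMAN_GENOME_BP * coverage
    return math.ceil(total_bases / (days * BASES_PER_DAY))


# Error budget implied by "< 1 error per 100,000 bases".
max_errors_per_genome = HUMAN_GENOME_BP // 100_000

print(days_per_genome())            # one instrument, 1x coverage
print(instruments_needed(100, 10))  # X Prize pace, 1x coverage
print(max_errors_per_genome)
```

Even at an unrealistically low 1x coverage, the numbers show why the prize implied a change of technology and not just more of the same instruments.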

94
Signal capture, signal-noise, resolution,
de-convolution, sequence data assembly
95
Sequencing artifacts (Difficult templates vs
signal/noise)
  • Hardware remedy
  • Gel length, thickness
  • Gel composition
  • Bioware remedy
  • Primer
  • Vector
  • Polymerase
  • Reagents and additives
  • Radioisotope
  • Reaction time
  • Reaction temperature
  • Modify physical conditions
  • Run time
  • Temperature
  • Film exposure conditions
  • Some are unavoidable
  • GC-rich
  • Repetitive sequences

96
Sequencing artifacts
  • http://www.nshtvn.org/ebook/molbio/Current%20Protocols/CPMB/mb0704a.pdf

97
Sequencing artifacts
  • http://www.nshtvn.org/ebook/molbio/Current%20Protocols/CPMB/mb0704a.pdf

98

Good data/bad data
  • High quality
  • Good spacing
  • Good heights
  • Symmetrical peaks
  • No or low background

99

Good data/bad data
  • Good quality
  • Good spacing
  • Good heights
  • Symmetrical peaks
  • Low but more background

100

Good data/bad data
  • Poor quality (physical)
  • Poor spacing
  • Poor heights
  • Asymmetrical peaks
  • More background?

101

Good data/bad data
  • Sudden drop
  • Template folding (chemistry)
  • (local sequence)

102

Good data/bad data
  • Sudden drop
  • Template folding
  • (local sequence)
  • Resolution through better chemistry,
    biochemistry,
  • instrumentation, conditions

103

Good data/bad data
  • Stutters (biochemistry)
  • Template folding or
  • GC-rich or polyN runs
  • (local sequence)

104
Local sequence and effects (not all sequences
look/act alike)
  • Resolution through better chemistry,
    biochemistry,
  • instrumentation, conditions

105
Difficult templates
  • Resolution through better chemistry,
    biochemistry,
  • instrumentation, conditions

106
Difficult templates
  • Resolution through better chemistry,
    biochemistry,
  • instrumentation, conditions

107
Difficult templates versus Real data
microsatellites
108
Difficult templates versus Real data SNP
109
Difficult templates versus Real data
Coinfections of viruses
  • Interpretation
  • Experience
  • Keep sending out for re-sequencing due to
  • contamination of reaction

110
Difficult templates versus Real data SNP (GTHR)
111
Difficult templates versus Real data: SNP
  • F Umehara, et al. Am J Hum Genet. Nov 00
  • Desert hedgehog mutation, patient with 46,XY
  • Yp to Xp
  • Male phenotype, female karyotype
  • Partial gonadal dysgenesis (PGD) with
    polyneuropathy
  • CGD, Swyer syndrome: sex reversal in XY female
  • Premature female genitalia, blinded vagina
    and immature uterus, plus testis on one side and
    a streak gonad on other
  • Homozygous missense ATG->ACG at initiating Met
    of exon 1, DHH gene
112
DNA Sequencing Applications heterogeneity
  • Unbiased molecular genomic/genetic diagnostics
  • Cystic Fibrosis
  • 24 most common mutations, screening 43,849
    chromosomes
  • 66% at one site
  • Of remaining 23 mutations, next highest number is
    2.4% at one site
  • Ranging down to 0.1%, accounting for 10 of 24
    sites
  • Generalized Thyroid Hormone Resistance (GTHR)
  • Dominant negative mutation
  • ADHD?

113
Applications: molecular diagnostics (localized
mutations)
114
  • Wiki
  • http://dnasequencing.wordpress.com/2007/10/26/chain-termination-methods/

115
DNA sequencing: Photochemistry
Fluorescence-based labels as alternatives (UV,
IR, etc)
116
Optimization of dyes
117
ABI 370s-series screen dump
118
Bioinformatics part one pixel refinement, lane
bleeding
119
ABI 377 envelope 96 lanes
120
DNA sequencing Computation
  • Input from sequencer
  • peak intensities
  1. normalize intensities
  2. apply mobility corrections
  3. predict bands
  4. call bases
  • Output to user
  • DNA sequence
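
The four numbered steps above can be sketched as a toy base-caller. This is a deliberately simplified illustration, not how any ABI or Phred software works: traces are short lists of intensities, the "mobility correction" is just a whole-sample shift per dye channel, and bands are plain local maxima above a threshold. All names, traces and parameters are invented.

```python
# Toy base-caller sketch; every name, trace and parameter here is hypothetical.
def normalize(trace):
    """Step 1: scale one dye channel so its tallest peak is 1.0."""
    peak = max(trace)
    return [v / peak if peak else 0.0 for v in trace]


def shift(trace, offset):
    """Step 2: crude mobility correction; move a channel by `offset`
    samples (negative = earlier), padding with zeros.
    Assumes abs(offset) is smaller than the trace length."""
    n = len(trace)
    if offset >= 0:
        return [0.0] * offset + list(trace[: n - offset])
    return list(trace[-offset:]) + [0.0] * (-offset)


def call_bases(traces, mobility_shift=None, threshold=0.3):
    """Steps 3 and 4: find bands as local maxima of the strongest signal,
    then call the base whose channel is highest at each band."""
    mobility_shift = mobility_shift or {}
    channels = {
        base: shift(normalize(trace), mobility_shift.get(base, 0))
        for base, trace in traces.items()
    }
    n = len(next(iter(channels.values())))
    envelope = [max(ch[i] for ch in channels.values()) for i in range(n)]
    calls = []
    for i in range(1, n - 1):
        is_band = envelope[i - 1] < envelope[i] >= envelope[i + 1]
        if is_band and envelope[i] >= threshold:
            calls.append(max(channels, key=lambda b: channels[b][i]))
    return "".join(calls)


raw = {
    "A": [0, 2, 8, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "C": [0, 0, 0, 0, 1, 4, 1, 0, 0, 0, 0, 0, 0, 0],
    "G": [0, 0, 0, 0, 0, 0, 0, 3, 6, 3, 0, 0, 0, 0],
    "T": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 10, 2],  # dye runs one sample late
}
called = call_bases(raw, mobility_shift={"T": -1})
print(called)
```

Production base-callers also model peak spacing and shape and attach a quality value to every call, which is where most of the difficulty lies.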

121
Base-calling issues
122
ABI 377 data
123
To get to nice data -> Signal de-convolution and
processing
124
DNA sequencing Computation
125
Signal de-convolution and processing
126
Base-calling and matrix issues
  • POP 6 vs POP 4 misapplication of resin, 50 cm vs
    80 cm capillaries.
  • Base-calling issues.
  • SNP issues.

127
Post-processing raw data
  • One fragment, two fragments, three
  • Now, have handful plus fragments
  • Now what?
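
One naive answer to "now what?" is to merge the fragments wherever the end of one matches the start of another. The sketch below is a toy greedy overlap merger, not the algorithm of GAP4, Phrap or Sequencher: it needs exact matches, ignores reverse-complement reads and quality scores, and can be fooled by repeats. The function names, the fragments and the minimum overlap are invented.

```python
# Toy greedy assembler; names, fragments and min_len are hypothetical.
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is a prefix of b
    (at least min_len), or 0 if there is none."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join: remaining contigs are islands
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [
            c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)
        ] + [merged]
    return contigs


reads = ["TACGGAT", "ATGCCGAT", "CGATTACG"]
contigs = greedy_assemble(reads)
print(contigs)
```

If the loop stops with more than one contig, the leftovers are the "orphans/islands" that a finished assembly should not have.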

128
Assembling sequence data
129
DNA sequence assembly Software
  • GCG (Wisconsin Pkg/ Genetics Computer Group)
  • DNAstar
  • GAP4 (Genome Assembly Program)/ Staden Pkg
  • ABI versions
  • Phred/Phrap
  • DNA Sequencher
  • 2008 more?

130
Assemblers, a snapshot: GAP4
131
Quality issues in base-calling
  • Base-calling software, with quality scores
  • Phred
  • TraceTuner
  • QV = -10 x log10(P), where P is the probability
    that the base call is wrong
  • e.g., if want 1/1000 error (0.1%)
  • QV = 30 = (-10) x (-3)
  • A score of < 10 means a base-calling error rate
    of > 10%
  • A score of 20 is considered good, at 1%
  • A score of > 30 is considered excellent, at
    < 0.1%
  • Bermuda Stds, 1/10,000; GenBank now 2006
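
The quality-value arithmetic above can be checked with a short sketch. The function names are made up for illustration; only the formula QV = -10 x log10(P) comes from the slide.

```python
import math


def error_prob_to_qv(p: float) -> float:
    """Quality value for a base-call error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def qv_to_error_prob(qv: float) -> float:
    """Error probability implied by a quality value."""
    return 10 ** (-qv / 10)


for qv in (10, 20, 30, 40):
    print(qv, qv_to_error_prob(qv))
```

On this scale, the 1/10,000 error rate quoted for the Bermuda standards corresponds to a quality value of 40.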

132
Assemblers, a snapshot Phred/Phrap QA/QC
133
Assemblers, a snapshot: Sequencher
  • Mac and icon-based
  • final screen

134
DNA sequence assembly: Assembly of fragments
  • Toggle up

135
Ad 1 assembly: Collection of fragments
  • Getting there

136
Ad 1 assembly at 98%
137
Done! Consensus
  • Joined contigs, no orphans/islands

138
DNA sequence assembly: Editing
  • QA

139
Sequence assembly: Overlapping fragments for
contigs
  • x-Fold redundancy for accuracy
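
As a rough illustration of overlapping fragments and x-fold redundancy, the sketch below joins toy reads by exact suffix/prefix overlap and reports the average fold coverage. It is not how GAP4, Phrap or Sequencher work internally: the reads, the function names and the minimum overlap of 3 bases are assumptions for the example, and real assemblers tolerate mismatches and use quality scores.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def merge(left, right, min_overlap=3):
    """Join two fragments into one contig if they overlap, else return None."""
    size = overlap_length(left, right, min_overlap)
    return left + right[size:] if size else None

def fold_redundancy(reads, contig_length):
    """Average number of reads covering each position of the contig."""
    return sum(len(r) for r in reads) / contig_length

# Toy fragments, already in left-to-right order (hypothetical data).
reads = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]
contig = reads[0]
for read in reads[1:]:
    contig = merge(contig, read)
print(contig, fold_redundancy(reads, len(contig)))
```

Here 24 bases of reads cover a 16-base contig, i.e. 1.5-fold redundancy; higher redundancy gives more independent reads per position on which to base the consensus.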

140
Sequence assembly: 2:1 rule
  • For accuracy, local seq considerations

141
The destination or the journey?
to be continued (preview)
142
Viruses Rule the Deep Sea (The Butterfly
Effect)
  • Phrase for the more technical "sensitive
    dependence on initial conditions" in chaos theory
  • Edward Lorenz, 1917-2008
  • 1963 NYAcadSci paper: as a shortcut in a
    computer model for weather prediction,
    used 0.506 instead of 0.506127, giving a completely
    different weather scenario
  • Viruses good, viruses bad?
  • Proposed earlier: a Jekyll and Hyde role,
    killing biomass and sustaining it
  • Viruses and significance in marine systems: 15-20
    years old
  • Proof: Nature, Aug 28, 2008. R. Danovaro, et al.
  • Viruses in the deepest ocean environments are
    strong regulators of the deep sea biosphere
  • Infecting and killing bacteria and other
    prokaryotes
  • Main producers of organic material that sustains
    life at 1,000 meters
  • Viruses are by far the most abundant life form
    in the ocean, this study
  • Generating biomass, as major contribution to
    carbon cycle and other geochemical processes
  • Virus-induced deaths as 80% of bacterial deaths
  • Very large amount of carbon reaching sea floor
    through pathways that were thought to be minor
  • 232 samples of sediment from deep sea
  • Viruses surprisingly abundant and reproducing
    locally rather than migrating from surface
  • 65% of earth is dominated by deep sea or
    benthic ecosystems
  • Estimate: 0.37-0.63 gigatons of C per year;
    oceans absorb billions of tons of atmospheric CO2
    per year
  • Viral shunt: killed organism eaten by another
  • The Scientist, Aug 08
  • http://www.terradaily.com/reports/Viruses_are_hidden_drivers_of_oceans_nutrient_cycle_999.html

143
Human genome, follow-up (The past is never dead.
It's not even past)
  • Aug 08: 15 years of The Human Genome Project
  • 8% of human genome comprises cryptic viral
    genomes (06 Aug08, molec fingerprint of inactiv)
  • molecular equivalents of mounted trophies,
    insects preserved in genomic amber, DNA
    fossils
  • Human endogenous retroviruses (HERVs) during 550M
    years of vertebrate evolution
  • HERVs attack germline cells, become integrated
    into genome
  • http://www.washingtonpost.com/wp-dyn/content/article/2008/08/31/AR2008083101759.html

144
Human genome, follow-up (The past is never dead.
It's not even past)
  • Unlike HIV, HERVs outlive the infected organism:
    endogenous
  • Best-preserved: HERV-K113, ca. 200,000 years ago,
    long after human and chimp divergence
  • Parts of a few have become incorporated into
    human genes, taking on new roles
  • Proteins helped mold the immune system
  • Syncytin, protein that helps cells fuse together
    in placenta, from envelope gene from a HERV
  • In past two years, labs in France and US
    independently reconstructed a functioning HERV-K
    from pieces in the human genome (P. D. Bieniasz
    et al.)
  • This summer 08, both showed the gene sequences
    carry similar fingerprints of APOBEC3, a human
    enzyme that mutated them into submission
  • http://www.washingtonpost.com/wp-dyn/content/article/2008/08/31/AR2008083101759.html

145
Human genome, follow-up (The past is never dead.
It's not even past)
  • HERV as junk DNA? e.g., served no function but
    remnants of past infections
  • M. Dewannieux, et al. Genome Research, Oct 06
    (T. Heidmann)
  • Reconstructed an infectious version that
    incorporated into genome 5M years ago
  • Named Phoenix, as ancestral (?) version of
    HERV-K
  • HERV-K is a young virus, <5M years, and contains
    complete set of genes
  • Proposed roles: control gene expression, found
    near genes; immune system and disease;
    and linked to cancers and male
    infertility (N. Bannert and R. Kurth, PNAS 04)
  • (also, sheep pregnancy and placenta development
    (M. Palmarini, T. Spencer, et al., PNAS 06))
  • Originally from Genome Research and via
    http://www.washingtonpost.com/wp-dyn/content/article/2008/08/31/AR2008083101759.html

146
Old Viruses Resurrected Through DNA (ELSI)
  • NYT, Nov 06
  • Reconstruction of extinct, lost viruses
  • 02: Chem-synthesized polio genome (Cello, Science
    02)
  • 05: US govt scientists reconstructed 1918
    influenza virus genome
  • 06: French scientists reconstructed virus that
    infected primate ancestors: Phoenix virus
  • T. Heidmann, et al. (Genome Research)
  • Built and re-inserted into human cells,
    (some) infectious particles out
  • Plan to study HERV role in cancer
  • Alternate view: brought back to life
  • don't know what this class of viruses do..
    It's a dangerous thing, and a potent biological
    weapon.
  • Systematic crippling of all future
    generations
  • http://www.nytimes.com/2006/11/07/science/07virus.
  • Photo: http://elliottback.com/wp/archives/2006/11/08/the-phoenix-virus-resurrected-rna-retroviruses/

147
Human genome, follow-up (The past is never dead.
It's not even past)
  • HERVs attack germline cells, become integrated
    into genome
  • Parts of a few have become incorporated into
    human genes, taking on new roles
  • Syncytin, protein that helps cells fuse together
    in placenta
  • From envelope gene from a HERV
  • Jan 08: tissue from women with preeclampsia or
    intrauterine growth restriction (both threaten
    fetal health) had abnormal amts of syncytin
  • Proteins derived from HERV genes, or antibodies
    against these proteins, are common in
    testicular tumors, breast cancer tissue and
    melanomas
  • Does HERV cause cancer, or is it an effect of it,
    or both, or neither?
  • Mice, chicken: remnant retrovirus env proteins;
    equivalent proteins are made and attach to/block
    receptors that are used by retroviruses for
    binding
  • Sheep: lung or nasal tumors caused by
    retroviruses; ancestors entered genome before
    sheep/goat divergence,
    5M years ago
  • http://www.washingtonpost.com/wp-dyn/content/article/2008/08/31/AR2008083101759.html

148
Human genome, follow-up (The past is never dead.
It's not even past)
  • Jaagsiekte sheep retrovirus (JSRV) causes contagious
    lung cancer in sheep, ovine pulmonary
    adenocarcinoma
  • Retroviruses have played a critical role in
    understanding oncogenes
  • Distinct from classical mechanisms of retroviral
    oncogenesis by insertional activation of,
    or virus capture of, a host oncogene: native
    envelope (Env) structural protein is itself the
    oncogene
  • M. Palmarini, et al.: wild species had versions of
    two retroviruses differing from domesticated
    versions
  • The domesticated versions have a mutation that
    impedes infection by cancer-causing viruses
  • Argue domestication of wild sheep 9,000 years
    ago, with cancer-causing virus,
    selected for mutant non-cancerous
  • M. De las Heras, J. M. Sharp. Eur Resp J, Dec 01
  • Evidence for a protein related immunologically to
    the JSRV in some human lung tumors
  • Review: M. Palmarini and H. Fan. JNCI 01
  • Review: S.-L. Liu and A. D. Miller, http://www.nature.com/onc/journal/v26/n6/abs/1209850a.html

149
Mini-evolution of a protein: syncytin (Fear
of the unknown)
  • A. Smallwood, et al. BioOne: Maternally imprinted
    PEG10 and SGCE, separated from Syncytin (HERV-W)
    gene at 7q21.3, are implicated in
    choriocarcinoma and Silver-Russell syndrome
  • A. Malassine, T. Heidmann. Placenta 07: Expression
    of human endogenous retrovirus HERV-FRD, which
    encodes fusogenic envelope proteins (syncytin 2),
    observed in human placenta
  • A. Muir, A. Moffett. JGV 06: Human endogenous
    retrovirus-W envelope (syncytin) expressed in
    trophoblast
  • Placenta is unique amongst normal tissues in
    transcribing numerous different
    human endogenous retroviruses at high levels
  • Syncytin expressed widely in normal cells as well
    as choriocarcinoma cell lines
  • HERVs arose from ancient germ-cell infections by
    exogenous retroviruses
  • Most HERVs inactivated due to accumulated
    mutations; small number of HERV genes retained
    ORFs
  • I. Knerr, W. Rascher. Mol Hum Reprod, Jun 04:
    placental syncytin first described in 2000 as a
    fusogenic glycoprotein
    derived from a human endogenous
    retroviral envelope gene
  • Stably integrated retroviral elements within
    human genome known for many years, biological
    significance obscure,
    usually designated as irrelevant or even
    harmful
  • Syncytin, however, demonstrates tissue-specific
    expression and distinctive receptor interaction
    during trophoblast cell
    differentiation and syncytium formation
  • http://www.washingtonpost.com/wp-dyn/content/article/2008/08/31/AR2008083101759.html

150
(No Transcript)
151
PCR linear amplification DNA sequencing
  • http://www3.appliedbiosystems.com/cms/groups/portal/documents/web_content/cms_051956.gif