Optional Reading:

Provided by: jeffy8
1
Optional Reading: Yeast as a Genomic Model
2
Midterm III: 5/24
  • Closed Books, Closed Papers,
  • Open Note Page
  • one 8.5 x 11 inch page,
  • two-sided OK,
  • handwritten or typed (computer OK) text,
  • NO CUT AND PASTE FIGURES,
  • hand drawn figures are OK,
  • Handed in with exam.
  • Exceptions from these rules will not be
    tolerated.

3
Complete Genomic Sequences
DNA: Reagent for the 21st Century
  • 2001
  • 9 ARCHAEAL
  • 36 BACTERIAL
  • 6 EUKARYAL

  • 2004 (May)
  • 17 ARCHAEAL
  • 144 BACTERIAL
  • 39 EUKARYAL (rough and finished)

2004: 1,286 Viral Genomes, 547 Organelles, others
4
Public Data Set
GBREL.TXT Genetic Sequence Data Bank
February 15 2001
NCBI-GenBank Flat File Release 122.0
Distribution Release Notes
10,896,781 loci, 11,720,120,326 bases, from 10,896,781 reported sequences

GBREL.TXT Genetic Sequence Data Bank
April 15 2004
NCBI-GenBank Flat File Release 141.0
Distribution Release Notes
33,676,218 loci, 38,989,342,565 bases, from 33,676,218 reported sequences
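The two release notes above can be compared directly. The following is a minimal sketch that computes the fold increase in bases between Release 122.0 and Release 141.0; the doubling time it reports assumes steady exponential growth between the two dates, which is an assumption made here and not something the slide states.

```python
import math
from datetime import date

# Figures copied from the two release notes quoted on the slide.
releases = {
    "122.0": {"date": date(2001, 2, 15), "loci": 10_896_781, "bases": 11_720_120_326},
    "141.0": {"date": date(2004, 4, 15), "loci": 33_676_218, "bases": 38_989_342_565},
}

def growth_summary(old, new):
    """Return (fold increase in bases, doubling time in days).

    The doubling time assumes exponential growth between the two releases
    (an assumption for illustration only).
    """
    fold = new["bases"] / old["bases"]
    days = (new["date"] - old["date"]).days
    doubling_days = days * math.log(2) / math.log(fold)
    return fold, doubling_days

fold, doubling_days = growth_summary(releases["122.0"], releases["141.0"])
print(f"bases grew {fold:.2f}-fold; implied doubling time about {doubling_days:.0f} days")
```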
5
Genome Project Goals (General, From the Outset)
  • Establish an integrated WEB-based database and
    research interface,
  • Assemble physical and genetic maps,
  • Generate genomic and expressed (mRNA) gene
    sequences,
  • Identify and annotate the complete set of genes
    encoded within a genome,
  • Compile atlases of gene expression,
  • Accumulate functional data (functional genomics:
    reverse genetics, proteomics, etc.),
  • Characterize sequence diversity between and among
    organisms.

6
Post-Genomic Goals: 10 Year Schedule, Arabidopsis
(2000 - 2010)
1- to 3-Year Goals
Develop essential genetic tools, including the
following:
  • comprehensive sets of sequence-indexed mutants,
    accessible via database search,
  • whole-genome mapping and gene expression DNA
    chips,
  • facile gene expression systems,
  • antibodies against, or epitope tags on,
    all deduced proteins.

all done.
7
Post-Genomic Goals: 10 Year Schedule, Arabidopsis
(2000 - 2010)
3- to 6-Year Goals
  • Create a complete library of full-length cDNAs
    (cloned mRNAs).
  • Construct defined deletions of linked,
    duplicated genes.
  • Develop methods for directed mutations and
    site-specific recombination.
  • Describe global mRNA expression profiles at
    organ, cellular, and subcellular levels under
    various environmental conditions.
  • Develop global understanding of
    post-translational modification.
  • Undertake global metabolic profiling at organ,
    cellular, and subcellular levels under various
    environmental conditions.

progressing.
8
Post-Genomic Goals: 10 Year Schedule, Arabidopsis
(2000 - 2010)
10-Year Goals
  • Artificial chromosomes.
  • Identify cis regulatory sequences of all genes.
  • Identify regulatory circuits controlled by each
    transcription factor.
  • Determine biochemical function for every protein.
  • Describe three-dimensional structures of members
    of every plant-specific protein family.
  • Undertake systems analysis of the uptake,
    transport, and storage of ions and metabolites.
  • Describe globally protein-protein,
    protein-nucleic acid, and protein-other
    interactions at organ, cellular, and subcellular
    levels under various environmental conditions.
  • Survey genomic sequencing and deep EST sampling
    from phylogenetic node species.
  • Define a predictive basis for conservation
    versus diversification of gene function.
  • Compare genomic sequences within species.

progressing.
9
Disclaimer: this review is heavily biased toward
the public sequencing consortium.
10
Map first, then sequence
Sequence first, then map
11
Genome Sequencing Strategy 1
  • Clone-by-Clone Approach
  • Order clones along the genome, then sequence,
  • not dependent on acceleration of sequencing
    capacity,
  • not dependent on advanced computer analysis,
  • not dependent on as-of-yet undeveloped sequencing
    technologies.
  • heavy up-front demand for human labor.

12
Clone-by-Clone Ordered Approach
Online Primer mapping.html
13
Genomic Libraries
how many clones to cover a genome?
14
Vectors (carry insert DNA)
Vector: Host, Inserts
  • Plasmid: E. coli, up to 15 kb,
  • Phage: E. coli, up to 25 kb,
  • Cosmid (plasmid/phage hybrid): E. coli, up to 45 kb,
  • BAC: E. coli, 100-500 kb,
  • YAC: Yeast, 250-1000 kb.

15
Genomic Sequences and Coverage
  • N = ln(1 - 0.9999) / ln(1 - v/2,900,000,000)
  • N = number of clones needed,
  • v = average vector insert size (bp).

  • plasmid (5 kb): 5.3 x 10^6 clones,
  • phage (20 kb): 1.3 x 10^6 clones,
  • BAC (125 kb): 2.2 x 10^5 clones,
  • YAC (500 kb): 27,000 clones.
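The coverage formula on this slide can be evaluated directly. The sketch below is one way to do it; the function name and parameter names are illustrative, and it assumes that 2,900,000,000 (taken from the slide's denominator) is the genome size in bp and that 0.9999 is the desired probability that a given locus is present in the library.

```python
import math

def clones_needed(insert_bp, genome_bp=2_900_000_000, probability=0.9999):
    """Number of clones N = ln(1 - P) / ln(1 - v/G).

    insert_bp   : v, the average vector insert size in bp
    genome_bp   : G, assumed genome size in bp (the slide's denominator)
    probability : P, assumed chance that a given locus is in the library
    """
    fraction = insert_bp / genome_bp
    # log1p(-x) computes ln(1 - x) accurately when x is very small
    return math.log(1 - probability) / math.log1p(-fraction)

for name, insert_bp in [("plasmid", 5_000), ("phage", 20_000)]:
    print(f"{name:8s} {insert_bp:>7,d} bp  N ~ {clones_needed(insert_bp):.2e}")
```

For the 5 kb plasmid and 20 kb phage cases this gives roughly 5.3 x 10^6 and 1.3 x 10^6 clones, in line with the figures on the slide. Larger inserts reduce N roughly in proportion to insert size.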
16
Bacterial Artificial Chromosomes (BACs)
  • Universal Priming Sites,
  • On the vector, flanking the genomic insert.

17
Clone-by-Clone Ordered Approach
18
Contigs (Contiguous Sequences)
Find overlapping ends
Clone 1
Sequence,
Restriction Fragment Length Polymorphisms
(RFLPs).
19
Sequence Contig
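Building a sequence contig from overlapping ends can be sketched as follows. This is a toy illustration only: it uses exact string matching and hypothetical function names, whereas real assemblers have to tolerate sequencing errors and use base quality data.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def merge(left, right, min_overlap=3):
    """Join two reads into one contig if their ends overlap, else return None."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]

# Two short toy reads (not real data) that share the end "tgaatt".
read_a = "atatgtatattgaatt"
read_b = "tgaattacatac"
print(overlap_length(read_a, read_b))
print(merge(read_a, read_b))
```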
20
RFLP
Restriction enzymes cut DNA at specific
sequences,
Fragment lengths provide clone identification
data.
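How fragment lengths identify a clone can be illustrated with a small in-silico digest. The sketch below assumes a linear sequence and a single enzyme; the function name is hypothetical and the toy sequence is invented. EcoRI (site GAATTC, cutting after the G) is used as the example enzyme, and because its site is palindromic, searching one strand is enough.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every occurrence of `site`.

    cut_offset is the position of the cut within the recognition site
    (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.lower()
    site = site.lower()
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Invented toy clone with two EcoRI sites.
toy_clone = "aaaa" + "gaattc" + "cccccc" + "gaattc" + "tt"
fingerprint = digest(toy_clone, "GAATTC", 1)
print(fingerprint)
```

Two clones whose fingerprints share several fragment lengths are candidates for overlapping clones.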
21
(No Transcript)
22
Contigs (Contiguous Sequences)
Find overlapping ends
Merge good pairs of reads into longer contigs
  • Find the minimal Tiling Path:
  • minimum set of overlapping clones that cover
    the genome.
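Choosing a minimal tiling path can be sketched as a greedy interval-cover problem. The sketch assumes each clone already has known start and end coordinates on the map (in practice these come from the mapping step); the clone names and coordinates below are invented, and the function name is hypothetical.

```python
def minimal_tiling_path(clones, genome_length):
    """Greedy minimum set of clones covering positions 0..genome_length.

    clones: list of (name, start, end) tuples with hypothetical map coordinates.
    Raises ValueError if the clones leave a gap.
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    path = []
    covered = 0   # everything before this coordinate is already covered
    i = 0
    while covered < genome_length:
        best = None
        # Among clones starting at or before the covered edge,
        # keep the one that reaches furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best)
        covered = best[2]
    return path

clones = [("A", 0, 100), ("B", 50, 180), ("C", 90, 200),
          ("D", 150, 300), ("E", 190, 320), ("F", 280, 400)]
print([name for name, _, _ in minimal_tiling_path(clones, 400)])
```

This version accepts clones that merely abut; a stricter variant would require a minimum overlap between neighbours so that each join can be confirmed.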

23
Minimal Tiling Path
Shotgun Sequence Each Clone
24
Shotgun (self-quiz)
8x - 10x coverage: To shotgun sequence 10,000
bp, you'd need 80k - 100k bp of sequence, or 160
- 200 sequencing reactions.
But 10,000 bp, at 500 bp per sequencing
reaction, could be done in as few as 20 sequencing
reactions.
Why Shotgun?
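The arithmetic in this self-quiz can be checked with a few lines. This is a minimal sketch with hypothetical function names, assuming every sequencing reaction yields a full 500 bp read.

```python
import math

def reads_for_coverage(target_bp, coverage, read_bp=500):
    """Reads needed to sample `coverage` times the target length at random."""
    return math.ceil(target_bp * coverage / read_bp)

def minimum_reads(target_bp, read_bp=500):
    """Reads needed if they could be placed end to end with no overlap."""
    return math.ceil(target_bp / read_bp)

target_bp = 10_000
for coverage in (8, 10):
    total_bp = target_bp * coverage
    print(f"{coverage}x: {total_bp:,} bp of sequence, "
          f"{reads_for_coverage(target_bp, coverage)} reactions")
print(f"perfectly tiled: {minimum_reads(target_bp)} reactions")
```

At 500 bp per read, 8x coverage comes to 160 reactions and 10x to 200, against 20 for a perfect end-to-end tiling.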
25
Contigs
QC
26
(No Transcript)
27
Structural Genomic Strategies 2
  • Whole Genome Assembly Approach
  • Sequence first, then order,
  • dependent on advances in computer analysis and
    sequencing technologies,
  • dependent on automated labor.

28
WGA
29
Read Pairs: Mate End Pairs
  • Paired End Sequencing,
  • sequence both ends of the vector insert, using
    vector derived primers,
  • Maintain mate pair data.

30
Example Sequence Output (example 5 kb insert)
5' read (543 bp) - atatgtatattgaattacatacatattattaatg
cacatttttatccggagttgtggaccatagaaagacatattgactcctca
aagtaaattctgcatgttacattgaaatcataggctaaatttgagatgca
ctatttttagaaagtgtagagaaaaggacaggaagaaataagcgaaagct
ttggtaagccaccaaacctgattactggaagaaaagaaaaaagttccgag
aatagagttagatcgctggtgagggttttaaatggaacacaacaatggtt
gttttagagtgtgttattcttttgtatttataccttctcataggtttctt
gtaatacacgcttcttcctctctctccctctctcttatggcctcgtcttg
aaagcgtcttgcatgctaagagaaggctttagagcaaggagagaagggag
aagttgatttatacgtccatcggatatatcttctttttatatctgtctct
cttttaaggaagaaaaatggcgactgaattctcgtgggatgaaatcaaga
aagaaaatg...
- rest of insert (unsequenced, 3.9 kb) -
...ggcttgaaatatttggggcaaacaagcttgaagagaaatcagagaac
aagtttttgaaattcttggggttcatgtggaatcctctctcatgggttat
ggagtctgctgcaatcatggctattgttttagctaatggaggaggaaagg
cgccggattggcaagattttatcggtattatggtgttgcttatcatcaac
tccaccataagtttcatcgaggagaacaatgctggcaatgccgctgctgc
tctcatggcaaatcttgcaccaaagactaaggtatgcaaatttctcaata
catatatataggtatgtattttctaaaaaggagagttatataacctatgt
gtgaatgtaggtgttgagagatggtaaatggggggagcaagaggcttcaa
tcttggttccgggtgatttgataagcatcaaattgggtgacattgttcct
gctgatgctcgtctcctcgaaggagatcctttaaaaattgaccaatctgc
tcttactggtgaatcccttccaaccaccaaacacccaggagat - 3' read (540 bp)
plus trace data files associated with these
sequence runs.
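Maintaining mate pair data amounts to keeping the two end reads linked to the insert they came from. The record below is a hypothetical layout, not the format of any particular sequencing pipeline; the numbers are the ones from the example above.

```python
from dataclasses import dataclass

@dataclass
class MatePair:
    """Two end reads from one vector insert (hypothetical record layout)."""
    insert_id: str
    insert_bp: int            # approximate insert size
    five_prime_read_bp: int   # length of the read from the 5' end
    three_prime_read_bp: int  # length of the read from the 3' end

    def unsequenced_bp(self):
        """Bases between the two reads that were never sequenced."""
        return self.insert_bp - self.five_prime_read_bp - self.three_prime_read_bp

example = MatePair("example-insert", 5_000, 543, 540)
print(example.unsequenced_bp())
```

The result, 3,917 bp, is the "rest of insert" of about 3.9 kb shown in the example. Because the insert size is known, the two reads constrain each other's distance and orientation even though the middle is unsequenced.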
31
WGA
32
Structural Genomic Strategies 3 (Hybrid)
33
Monday
  • WGA,
  • Shotgun Sequencing,
  • Hybrid Approach.
  • Compartmentalized Shotgun Approach.
  • Please read: Science 291: 1304-1315