Title: Tomato Project Group


1
Second Tomato Finishing Workshop Chromosome 4
  • Tomato Project Group
  • Wellcome Trust Sanger Institute
  • 25th April 2008

2
Chromosome 4 Introduction
  • Data Flow at WTSI
  • Sequencing Method Used
  • Finishing Strategies
  • Use of Overlapping Data
  • Chr4 Sequence Update
  • Discussion points for Workshop
  • Unmapped BACs
  • Examples of Problem Clones
  • Dealing with Large Repeats

3
UK - Chromosome 4
  • Gene space estimate for Chromosome 4 is 19Mb
  • Mapping, sequencing and finishing at Wellcome
    Trust Sanger Institute (WTSI)
  • BAC by BAC sequencing approach
  • Approximately 200 BACs
  • Funding at WTSI ends October 31st 2008

4
Overview of WTSI Clone Pipeline
Mapping
  • Clone Selection and Verification
  • Clones entered into pipeline
  • BACs assigned to chr4 sequencing project on SGN BAC registry

Subcloning
  • Clone DNA Prep
  • Digest Confirmation
  • Library Construction (plasmid)

Shotgun Sequencing (HTGS Phase 1, Sequencing in Progress)
  • Plasmid Prep
  • Sequencing Processing
  • Sequence contigs >2Kb available on Sanger FTP site and public
    databases

Finishing (HTGS Phase 2)
  • Sequence Improvement
  • Contig Orientation and Gap Closure
  • Confirmation of Assembly (QC)
  • Sequences uploaded to SGN
  • BAC Registry updated

Finished Sequence (HTGS Phase 3, Complete Sequence)
  • Final EMBL submission
5
Clone Selection and Verification
  • BACs selected primarily from the HindIII (LE_HBa) and MboI
    (SL_MboI) libraries
  • Using seed BACs from SGN, end sequence alignment and FPC analysis
  • New BACs selected from in-house overgos for markers
  • Selected 5 clones from the fosmid library based on end sequence
    alignments and fingerprints

6
Plasmid Prep and Shotgun Sequencing
  • Optimised for 384 well prep and sequencing
  • Capillary Sequencing
  • AB3730s with AB Big Dye Terminator
  • pUC118 Double Stranded Sequencing Vector
  • 4-6Kb inserts, double end sequenced

BACs
  • Aim for 6x-8x coverage
  • Average insert 100-150Kb (LE_HBa and SL_MboI libraries)
  • 2x or 3x 384 plates per BAC
  • 750 paired end reads (1500 reads in total)
  • Average 10-15 contigs

Fosmids
  • Average insert 35Kb
  • 1x 384 plates
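
The coverage figures above follow from simple arithmetic: total sequenced bases divided by insert size. The sketch below is illustrative only. The usable read length of 600 bp and the 125Kb midpoint insert size are assumptions, not values from the slides, and the function name is hypothetical.

```python
def shotgun_coverage(plates, insert_size_bp=125_000, read_length_bp=600,
                     wells_per_plate=384, reads_per_clone=2):
    """Estimate fold coverage of a clone from a plate-based shotgun run.

    read_length_bp (600) and insert_size_bp (125 kb) are assumed values;
    reads_per_clone=2 reflects double end sequencing of each subclone.
    """
    total_reads = plates * wells_per_plate * reads_per_clone
    return total_reads * read_length_bp / insert_size_bp


# Two 384-well plates give 1536 reads, close to the 1500 quoted above.
print(round(shotgun_coverage(plates=2), 2))
# A 35 kb fosmid sequenced from a single plate.
print(round(shotgun_coverage(plates=1, insert_size_bp=35_000), 2))
```

Under these assumptions two plates on a 125Kb BAC land inside the 6x-8x target, while failed reads or a larger insert would push towards the third plate.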
7
Clone Finishing
Gap4 (Staden) used to view and manipulate
sequence data
  • Sequence Improvement

Manual Finishing
QC Checking
8
Manual Finishing of BACs: BACs viewed in relation
to map
  • BACs are viewed in relation to the mapped minimal
    tile path
  • Use in-house tpf visualisation tool, e.g. ctg503

9
Use of Overlapping Sequences
  • From Minimal Tile Path the region finished in
    each clone depends on the order the clones enter
    finishing
  • Finish unique sequence with a 2000bp overlap
    between clones

[Figure: overlapping clones BAC1, BAC2, BAC3 and BAC4 (gap closure), showing the total BAC insert and the finished region of each]
Final order and orientation of finished BACs are
given in the AGP file, e.g. BAC1-BAC2-BAC4-BAC3
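
The dependence of the finished region on finishing order can be sketched as follows. This is a simplified illustration, not the WTSI tooling: it assumes each clone has known start and end coordinates on a shared tile path coordinate system and that a clone overlaps at most one already-finished neighbour on each side. The function name and the example coordinates are hypothetical.

```python
OVERLAP_BP = 2000  # overlap retained between neighbouring finished clones


def finished_regions(clones, finishing_order, overlap=OVERLAP_BP):
    """Return the region each clone must finish, given the finishing order.

    clones: dict of clone name -> (start, end) on the tile path.
    A clone entering finishing later only finishes the sequence not already
    covered by a finished neighbour, plus the retained overlap.
    """
    finished = {}
    for name in finishing_order:
        start, end = clones[name]
        for other_start, other_end in finished.values():
            # A finished neighbour already covers this clone's left end.
            if other_start <= start < other_end:
                start = max(start, min(other_end - overlap, end))
            # A finished neighbour already covers this clone's right end.
            if other_start < end <= other_end:
                end = min(end, max(other_start + overlap, start))
        finished[name] = (start, end)
    return finished


tile_path = {"BAC1": (0, 100_000), "BAC2": (80_000, 180_000)}
print(finished_regions(tile_path, ["BAC1", "BAC2"]))
print(finished_regions(tile_path, ["BAC2", "BAC1"]))
```

Whichever clone enters finishing first is finished in full, and the later clone is trimmed back to its unique sequence plus the 2000bp overlap.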
10
Summary of Clone Gap Closure Strategies
  • Make use of paired ends to order and orientate
    contigs
  • Identify whether gaps are spanned or unspanned
    (Orchid example)
  • Identify any repeats associated with gaps
    (Dotter example)
  • Estimate gap sizes using restriction digest data
  • This will determine the appropriate strategy for gap
    closure, e.g.
  • primer/oligo walking into regions of low quality
    or gaps spanned by paired end reads
  • PCR and direct walking on BAC DNA into regions of
    low quality and unspanned gaps (also attempted on
    unresolved spanned gaps)
  • Use of alternative chemistries where appropriate,
    e.g. structural problems, mono- and di-nucleotide runs
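
The decision logic in the list above can be written down as a small sketch. It is an illustration of the stated rules only, not a WTSI tool: the `Gap` class, its field names and the problem labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    spanned: bool             # True if paired end reads bridge the gap
    unresolved: bool = False  # True if an earlier walking attempt failed
    problem: str = ""         # hypothetical labels: "structural", "mono_run", "di_run"


def choose_strategy(gap):
    """Map a gap description onto the closure strategies listed above."""
    steps = []
    if gap.spanned and not gap.unresolved:
        # Spanned gaps: walk on the subclones whose read pairs bridge the gap.
        steps.append("primer/oligo walking on spanning subclones")
    else:
        # Unspanned gaps, and spanned gaps that resisted walking.
        steps.append("PCR and direct walking on BAC DNA")
    if gap.problem in {"structural", "mono_run", "di_run"}:
        steps.append("alternative chemistry")
    return steps


print(choose_strategy(Gap(spanned=True)))
print(choose_strategy(Gap(spanned=False, problem="di_run")))
```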

11
Orchid: Read Pair Visualisation Tool
Contiguous sequence with good read pair coverage
12
Visualising Repeats associated with gaps
Inverted Repeat
Direct Repeat

13
Restriction Digests
  • Minimum of three restriction enzymes used to
    confirm the assembly
  • Selection depends on organism and the nature of
    the sequence
  • S. lycopersicum BACs are digested with
  • BamHI
  • EcoRI
  • HindIII
  • Comparison of real and virtual digest of entire
    BAC sequence
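
A virtual digest of this kind can be sketched in a few lines. This is not the in-house tool: it treats the sequence as linear, ignores vector fragments and co-migrating bands, and uses an assumed 5% size tolerance. The recognition sites and cut positions for BamHI (G^GATCC), EcoRI (G^AATTC) and HindIII (A^AGCTT) are the standard ones; the function names are hypothetical.

```python
# Recognition site and cut offset (bases from the start of the site).
SITES = {"BamHI": ("GGATCC", 1), "EcoRI": ("GAATTC", 1), "HindIII": ("AAGCTT", 1)}


def virtual_digest(seq, enzyme):
    """Return predicted fragment sizes (largest first) for a linear sequence."""
    site, cut_offset = SITES[enzyme]
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


def digests_agree(predicted, observed, tolerance=0.05):
    """Compare predicted and observed fragment sizes within a relative tolerance."""
    if len(predicted) != len(observed):
        return False
    return all(abs(p - o) <= tolerance * p
               for p, o in zip(sorted(predicted), sorted(observed)))


toy = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
print(virtual_digest(toy, "EcoRI"))  # [14, 7, 5]
```

A mismatch in fragment count or size between the real and virtual digest points to a misassembly or an unresolved gap.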

14
Confirm: WTSI In-house digest visualisation tool
15
In-house digest visualisation tool
16
Clone Gap Closure Strategies
  • Make use of paired ends to order and orientate
    contigs
  • Identify whether gaps are spanned or unspanned
    (Orchid)
  • Identify any repeats associated with gaps
    (Dotter)
  • Estimate gap sizes using restriction digest
  • This will determine the appropriate strategy for gap
    closure, e.g.
  • primer/oligo walking into regions of low quality
    or gaps spanned by paired end reads
  • PCR and direct walking on BAC DNA into regions of
    low quality and unspanned gaps (also attempted on
    unresolved spanned gaps)
  • Use of alternative chemistries where appropriate,
    e.g. structural problems, mono- and di-nucleotide runs

17
Sequencing Chemistries and Additives used in
Finishing
  • 4:1 mix ratio of AB Big Dye Terminator : AB dGTP
    Terminator
  • used for general finishing reactions, not
    problem specific
  • AB dGTP Terminator
  • used for di-nucleotide runs and inverted repeats
  • Additive A (SequenceRx Enhancer Solution A -
    Invitrogen)
  • Dimethyl sulfoxide (DMSO)
  • Additive A + DMSO + dGTP
  • used for mono-nucleotide runs, inverted repeats
  • Sequence Finishing Kit (SFK) (TempliPhi -
    Amersham)
  • used to increase DNA yield
  • useful for structural problems caused by
    inverted repeats

18
Alternative Gap Closure Strategies
  • Specialist Subcloning
  • Small Insert Libraries (SIL)
  • Double Stranded pUC or Single Stranded M13
  • Large Insert Libraries (LIL)
  • Transposon Libraries (TIL)
  • Restriction Fragment SIL (RFSIL)
  • Alternative Strategies for dealing with large
    repeats
  • - points for further discussion on Tuesday
  • - what repeats have other chromosomes found?

19
Clone Gap Closure Strategies
  • Make use of paired ends to order and orientate
    contigs
  • Identify whether gaps are spanned or unspanned
    (Orchid)
  • Identify any repeats associated with gaps
    (Dotter)
  • Estimate gap sizes using restriction digest
  • This will determine the appropriate strategy for gap
    closure, e.g.
  • primer/oligo walking into regions of low quality
    or gaps spanned by paired end reads
  • PCR and direct walking on BAC DNA into regions of
    low quality and unspanned gaps (also attempted on
    unresolved spanned gaps)
  • Use of alternative chemistries where appropriate,
    e.g. structural problems, mono- and di-nucleotide runs

20
Use of Misc_Feature Tags in EMBL/GenBank/DDBJ
  • Used regularly on finished sequence to identify
    regions of
  • uni-directional chemistry when dGTP only
  • single subclone regions
  • including SIL and TIL only regions
  • PCR only
  • Single reads from direct walks on BAC DNA
  • data only from overlapping BACs
  • E. coli transposon insertion sites
  • tag SP6 and T7 ends of overlaps (tomato)
  • gap sizes of force joins in tandem repeats

21
Misc_Feature Tag Example: Clone End Tags
Accession
Length of sequence
Whole Clone Finished
Both ends of clone cited
22
Misc_Feature Tag Example
23
QC Check of Clone Assembly
  • Before submission to public databases as HTGS
    phase 3 complete, all assembled BACs undergo
    several QC checks
  • all reasonable chemistry attempts have been made
    for any specific problem types
  • all bases are above phred30
  • orientation of paired end reads checked across
    assembly
  • assembly is confirmed by restriction digest data
  • correct misc_feature tags have been used to
    identify any regions where appropriate

Ensures high quality contiguous sequence with
low error rate
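
The phred30 check above corresponds to an expected error rate of at most 1 in 1000 per base, since a phred score Q maps to an error probability of 10^(-Q/10). The sketch below is a minimal illustration with hypothetical function names. It assumes per-base consensus qualities are available as a list of integers and that a score of exactly 30 passes.

```python
def error_probability(phred):
    """Convert a phred quality score to the probability that the base is wrong."""
    return 10 ** (-phred / 10)


def bases_below_threshold(qualities, threshold=30):
    """Return 1-based positions whose quality falls below the threshold.

    Assumption: a base with quality equal to the threshold passes.
    """
    return [i for i, q in enumerate(qualities, start=1) if q < threshold]


consensus_quals = [40, 29, 30, 12]  # made-up example values
print(bases_below_threshold(consensus_quals))  # [2, 4]
print(error_probability(30))
```

Any positions returned would need further finishing reads before the clone could pass QC.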
24
Chromosome 4 Clone Pipeline
Additional 15 BACs finished - not on chromosome 4
from FISH
25
Unmapped BACs moved from chr4
  • bTH82D4 (LE_HBa082D04) moved to chr7 (on FISH map)
  • bTH91D14 (LE_HBa091D14) moved to chr5 (on FISH map)
26
Points for Discussion at Workshop
  • What problematic sequence have other groups
    encountered?
  • Strategies for finishing repeats used by other
    chromosome groups?
  • Unmapped BACs: any from other chromosomes?

27
Acknowledgements
  • Cornell University
  • Lukas Mueller
  • Robert Buels
  • Jim Giovannoni
  • Steve Tanksley
  • Colorado State University
  • Stephen Stack
  • Suzanne Royer
  • Song-Bin Chang
  • Arizona Genomics Institute
  • Rod Wing
  • Seunghee Lee
  • MIPS/IBI Institute for Bioinformatics
  • Klaus Mayer
  • Remy Bruggmann
  • Wageningen University
  • Rene Klein Lankhorst
  • Hans de Jong
  • Dora Szinay
  • Wellcome Trust Sanger Institute
  • Karen McLaren
  • Clare Riddle
  • Sean Humphray
  • Christine Nicholson
  • Carol Scott
  • Stuart McLaren
  • Matt Jones
  • Christine Lloyd
  • Sarah Sims
  • Karen Oliver
  • Jane Rogers
  • Imperial College London
  • Gerard Bishop
  • Daniel Buchan
  • James Abbott

FUNDING