Title: Locating markers from the genetic to the physical rice map
1. Locating markers from the genetic to the physical rice map
2. Review: Linkage map of molecular markers (genetic maps)
- Shows the order and map distance of various molecular markers along a linkage group (chromosome)
- Generated by linkage analysis
[Figure labels: SSR marker; RFLP marker]
3. A physical map of the rice genome that is aligned with genetic maps is available
[Figure labels: Genetic map (Cornell); Physical map]
4. Terms and facts about the physical map
- Chromosome: assembly of contigs
- Contig: assembly of clones
- Clone: the only physical entity (BAC or PAC clone)
  - A very small, sequenced piece of a chromosome
  - About 100 kb of DNA
  - Assigned a GenBank accession ID
- Gap: introduced by
  - Recombination
  - No data (assembly unfinished or data not yet available)
- Anchor marker: a molecular marker that
  - is physically located on a clone (sequence is there)
  - has an assigned cM location
- 1 cM ≈ 247 kb (from 420 Mb japonica genome / 1700 cM total)
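The conversion factor above is simple arithmetic and can be sketched in Python. The genome size and map length come from the slide; the function name `cm_to_kb` is made up for illustration, and the result is only a genome-wide average, since the real kb-per-cM ratio varies from region to region.

```python
# Figures taken from the slide.
GENOME_SIZE_KB = 420_000   # 420 Mb japonica genome, expressed in kb
MAP_LENGTH_CM = 1700       # total length of the genetic map in cM

# Genome-wide average physical distance per unit of genetic distance.
KB_PER_CM = GENOME_SIZE_KB / MAP_LENGTH_CM


def cm_to_kb(distance_cm: float) -> float:
    """Rough physical distance (kb) for a genetic distance (cM).

    Assumes recombination is uniform along the genome, which is
    only an approximation.
    """
    return distance_cm * KB_PER_CM


print(round(KB_PER_CM))      # 247
print(round(cm_to_kb(5)))    # 1235
```

With these figures, a 5 cM interval corresponds to roughly 1.2 Mb, or about a dozen clones of 100 kb each.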
5. Molecular markers from the genetic map can be located on the physical map
[Figure labels: Genetic map (Cornell); Clone AL772426; Anchor marker C460]
6. Locating a marker in the physical map enables you to...
- Determine gene(s) in silico that may be responsible for the phenotypic effect of a marker
  - Gene structures in the region of the clone where the marker sequence can be found
- Identify more markers that can be used for fine mapping
  - Use existing markers
  - Design STS primers for new markers
7. Where are these map databases?
- Gramene
  - http://www.gramene.org/
- TIGR rice genome map
  - http://www.tigr.org/tdb/e2k1/osa1/BACmapping/description.shtml
- Arizona Genomics Institute rice physical map
  - http://www.genome.arizona.edu/fpc/rice/
- RGP (Japan)
  - http://rgp.dna.affrc.go.jp/
8. Sample problem 1
- You mapped a gene for salt tolerance (SALTY) 5 cM away from marker RZ649 on chromosome 5. What SSR marker(s) can you use near this region?
- Use the Gramene comparative maps resource: Cornell SSR map
9. Gramene output
- Further exercise
  - Which clones contain these SSR markers?
  - How far apart (in cM) are these clones?
10. Sample problem 2
- You have a clone of a putative salt-tolerance gene, sequence known (SALTY)
  - Candidate gene approach
- You want to map it by RFLP/SSR combination
- Strategy
  - Locate the sequence in the rice genome in silico for better targeting (BLAST is a good tool)
    - For sequences > 50 nt, 50% identity gives a hybridization signal
  - Locate SSR/RFLP markers flanking the location
    - Gramene or other databases mentioned
- Additional tips
  - Also use markers anchored to clones
  - Use standard-spaced markers for good genome coverage
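The first strategy step can be sketched as a filter over BLAST hits. This is a minimal sketch that assumes the BLAST results were saved in the standard 12-column tab-delimited format (query, subject, % identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E value, bit score). The thresholds are one reading of the rule of thumb above and are exposed as parameters; the names `Hit`, `parse_tabular`, and `usable_hits`, and the two example rows with subjects `CLONE_A` and `CLONE_B`, are invented for illustration and are not real search results.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    subject: str      # accession of the clone that was hit
    identity: float   # percent identity of the alignment
    length: int       # alignment length in nt
    s_start: int      # start of the hit on the clone
    s_end: int        # end of the hit on the clone
    evalue: float


def parse_tabular(text: str) -> List[Hit]:
    """Parse 12-column tab-delimited BLAST output into Hit records."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hits.append(Hit(f[1], float(f[2]), int(f[3]),
                        int(f[8]), int(f[9]), float(f[10])))
    return hits


def usable_hits(hits: List[Hit], min_length: int = 50,
                min_identity: float = 50.0) -> List[Hit]:
    """Keep hits longer than min_length nt with at least min_identity %."""
    return [h for h in hits
            if h.length > min_length and h.identity >= min_identity]


# Invented example rows (NOT real BLAST results).
example = "\n".join([
    "\t".join(["SALTY", "CLONE_A", "98.5", "420", "5", "1",
               "1", "420", "10001", "10420", "1e-150", "750"]),
    "\t".join(["SALTY", "CLONE_B", "92.0", "35", "3", "0",
               "10", "44", "500", "534", "0.5", "40"]),
])

for hit in usable_hits(parse_tabular(example)):
    print(hit.subject, hit.s_start, hit.s_end)
```

The subject coordinates of the surviving hits give the position on the clone around which flanking SSR/RFLP markers can then be looked up.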
11. Step 1: Locating the clone
- Find similar sequences in the rice genome (IRGSP) using BLAST
- Predict the genes in the clone, in the region where there is a significant BLAST hit, using FGENESH
  - Ensures that you are hitting a functional gene
- Identify the gene
  - Search the Protein Family database for proteins homologous to the predicted gene using HMMPFAM
  - Ensures you are working with the target gene of interest
12. Gene as a data text file
- FASTA format: the most common format
  - Loosely formatted text file containing a descriptor line and the sequence data
- Saved as a text-only file
  - Best to use Notepad or a text-only editor
- Most sequence database centers offer this option
13. Sample FASTA file
>SALTY gene Oryza sativa putative salt tolerance gene mRNA, complete cds
TTCTCTCTCTCTCTCTTCTTCTTCTTCTTCTTCTCCATATCTCCTACTCCTCGTGAAGATCGATCGACCATCGGCAATTT
CATTCGGTAATAGTTAAGCTAAGATCAAATCAAGATTGGCGAAACGATGGAGATGGTGCTGCAGAGGACGAGCCACCACC
CGGTGCCCGGGGAGCAGCAGGAGGCGGCGGCGGAGCTGTCGTCGGCTGAGCTCCGGCGAGGGCCGTGGACCGTCGACGAG
GACCTCACCCTCATCAATTACATCTCTGATCACGGCGAGGGCCGCTGGAACGCACTCGCACGCGCCGCCGGTCTGAAGAG
GACTGGGAAGAGCTGCCGGCTCCGGTGGCTGAACTATCTCCGGCCGGATGTGAAGCGCGGCAACTTCACCGCAGAGGAGC
AGCTGCTCATCCTCGACCTCCACTCCCGATGGGGCAACCGATGGTCCAAGATAGCACAACATTTGCCTGGGAGGACCGAC
AACGAGATCAAGAACTACTGGAGGACCAGAGTGCAAAAGCATGCCAAGCAACTCAATTGTGATGTCAACAGCAAGAGGTT
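Because FASTA is just a descriptor line starting with ">" followed by sequence lines, it is easy to read with a few lines of Python. The sketch below is a minimal reader, not a full parser; the function name `read_fasta` is made up for illustration, and the example holds only the descriptor and the first sequence line of the SALTY record.

```python
from typing import Dict, List


def read_fasta(text: str) -> Dict[str, str]:
    """Return {descriptor: sequence} for every record in a FASTA text."""
    parts: Dict[str, List[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                 # ignore blank lines
        if line.startswith(">"):
            header = line[1:]        # descriptor without the ">" sign
            parts[header] = []
        elif header is not None:
            parts[header].append(line)   # sequence data may span lines
    # Join the wrapped sequence lines into one continuous string.
    return {h: "".join(chunks) for h, chunks in parts.items()}


example = """>SALTY gene Oryza sativa putative salt tolerance gene mRNA, complete cds
TTCTCTCTCTCTCTCTTCTTCTTCTTCTTCTTCTCCATATCTCCTACTCCTCGTGAAGATCGATCGACCATCGGCAATTT
"""

for descriptor, sequence in read_fasta(example).items():
    print(descriptor.split()[0], len(sequence), "nt")
```

Saving the file as text-only matters here: a word processor's formatting codes would end up inside the sequence.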
14. Database search using Basic Local Alignment Search Tool (BLAST)
- Most popular sequence alignment tool available
- Similarity/homology alignment
- BLAST hit significance is quantified by various parameters
- Alignments for the maximal-scoring segment pairs (MSPs) are reported
- Different BLAST programs from NCBI, including
  - BLASTN, BLASTP, BLASTX, TBLASTN, TBLASTX
  - Usually we use BLASTN
15. BLAST score and its statistical significance
- Each alignment has an associated score, S
- Scores of local random alignments follow a probability density function called the extreme value distribution (EVD)
- When you relate an observed alignment score (S) to the EVD, you can calculate a statistical significance known as the E value
  - The E value is the number of alignments with scores ≥ S that would be expected by chance alone
- The lower the E value, the more significant the MSP match
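The link between S and the E value is commonly written with the Karlin-Altschul formula, E = K·m·n·e^(−λS), where m and n are the query and database lengths and K and λ depend on the scoring system. The symbols m, n, K, and λ are not on the slide, and the sketch below ignores the length corrections that BLAST applies, so treat it as an illustration of the trend rather than a reproduction of BLAST's reported numbers. The values of λ, K, and the lengths in the example are illustrative placeholders; BLAST prints the parameters it actually used in its output.

```python
import math


def expected_hits(raw_score: float, query_len: int, db_len: int,
                  lam: float, k: float) -> float:
    """Karlin-Altschul estimate E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Illustrative placeholder parameters (assumptions, not taken from the slides).
LAMBDA = 1.374
K = 0.711
QUERY_LEN = 500            # hypothetical query of 500 nt
DB_LEN = 420_000_000       # a 420 Mb genome used as the database

for score in (20, 30, 40, 50):
    e = expected_hits(score, QUERY_LEN, DB_LEN, LAMBDA, K)
    print(f"S = {score}: E = {e:.3g}")
```

Running it shows E dropping exponentially as S rises, which is why a lower E value indicates a more significant match.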
16. Where to BLAST
- Web-based: one to tens of sequences
  - NCBI
  - TIGR http://tigrblast.tigr.org/euk-blast/index.cgi?project=osa1 (my favorite)
- IRRI local computer, via command line
  - Palay Alphaserver
  - Good for heavy-duty searches
17. Gene prediction
- FGENESH predicts multiple genes in genomic DNA sequences (Solovyev 2001)
- Used by the rice genome sequence authors (BGI, TMRI)
- Available as a web server (http://www.softberry.com/berry.phtml?topic=gfind&prg=FGENESH) and as a command-line version (Palay)
18. FGENESH command line
- fgenesh /usr/local/fgenesh/Monocot seqfilename > outfilename
19. Identify the gene: search the established protein database for homology
- PFAM database HMM search
- Web-based through the TIGR site
  - http://tigrblast.tigr.org/web-hmm/
20. Locate the surrounding SSR markers in the region
- BLAST reports the accession ID where you have significant hit(s)
- Get more information on this clone from Gramene or AGI