Practice retrieving data and running stand-alone BLAST.

Slides: 35
Provided by: davidf152

Transcript and Presenter's Notes


1
Practice retrieving data and running stand-alone BLAST.
Step 1. Identify genes in the ABA biosynthesis pathway from the Arabidopsis Cyc (AraCyc) database: http://www.arabidopsis.org/biocyc/index.jsp
Step 2. Identify the subject databases: Vitis vinifera (nucleotide) and Solanum pennellii (EST).
2
(No Transcript)
3
Query: Select Pathway by name. Enter: Abscisic Acid. Submit.
4
(No Transcript)
5
(No Transcript)
6
(No Transcript)
7
Now what?
8
Filter for unique sequences (EXCEL: Data, Filter, Advanced Filter)
9
Notepad: EDIT, LINE OPERATIONS, JOIN LINES. SEARCH, REPLACE: space with space-OR-space (" OR "). Paste into ENTREZ Nucleotide search.
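The Excel filter and Notepad steps on slides 8 and 9 can also be done on the Unix command line. This is a minimal sketch, assuming the IDs sit one per line in plain text. The function name join_ids_with_or and the placeholder IDs are hypothetical, not from the slides.

```shell
# Hypothetical helper: read IDs on standard input, one per line, and
# print a single search string with the unique IDs joined by " OR ".
join_ids_with_or() {
  # sort -u drops duplicate IDs; awk skips blank lines and joins the
  # remaining IDs on one line, separated by " OR ".
  sort -u | awk 'NF { printf "%s%s", sep, $0; sep = " OR " } END { print "" }'
}

# Made-up placeholder IDs, one of them repeated.
printf 'ID_B\nID_A\nID_B\n' | join_ids_with_or
# prints: ID_A OR ID_B
```

The resulting line can be pasted into the ENTREZ Nucleotide search box in the same way as the Notepad output.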
10
(No Transcript)
11
PERL:
chomp;
next if /^\s/;   # skip if there is a space at the start of the line
next if /^Gene/;   # if the line starts with Gene, skip
my @temp = split /\t/;   # data set is tab delimited
$hash{$temp[0]} = 1;   # unique sequence i.d.; 0 is the first element of the array
Then invoke BioPerl to query NCBI with the search string TAIRAT AND complete cds, where AT are the unique accession numbers from AraCyc and complete cds eliminates genomic sequence (e.g. complete Ath chrom 4). See complete script on class site.
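For comparison, the same filtering logic can be sketched in awk. This is not the class script, only a rough equivalent under the assumption that the AraCyc export is tab delimited with the ID in the first column. The function name unique_ids and the sample rows are hypothetical.

```shell
# Hypothetical helper: print each unique first-column ID from tab-delimited
# input (a file argument, or standard input when no file is given).
unique_ids() {
  awk -F '\t' '
    /^[ \t]/    { next }       # skip lines that start with a space or tab
    /^Gene/     { next }       # skip the header line that starts with Gene
    NF == 0     { next }       # skip empty lines (an addition to the Perl logic)
    !seen[$1]++ { print $1 }   # print an ID only the first time it appears
  ' "$@"
}

# Made-up placeholder rows, not real AraCyc data.
printf 'Gene\tEnzyme\nID_1\tenzyme a\nID_2\tenzyme b\nID_1\tenzyme c\n' | unique_ids
# prints ID_1 and ID_2, one per line
```

The seen array plays the role of the Perl hash: an ID is used as a key, so repeats are ignored.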
12
(No Transcript)
13
Do we want this much sequence?
14
Use the push pin to highlight all boxes for mRNA (22 sequences) so we don't get chromosome 4 genomic sequences.
15
(No Transcript)
16
(No Transcript)
17
(No Transcript)
18
Try: Use Unix to verify that the file contains all the sequences. Q: What command would you use? A: grep -c ">" filename
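A small sketch of that check, using a throwaway FASTA file with made-up records. The function name count_fasta_records is hypothetical. Anchoring the pattern with ^ is an addition here, so that only header lines are counted. The quotes around the pattern matter: an unquoted > would be read by the shell as output redirection and could overwrite the file.

```shell
# Hypothetical helper: count FASTA records in the file given as $1.
# Each record has exactly one header line that starts with ">".
count_fasta_records() {
  grep -c '^>' "$1"
}

# Build a temporary file with two made-up records, then count them.
tmp=$(mktemp)
printf '>seq1\nACGT\n>seq2\nGGCC\nTTAA\n' > "$tmp"
count_fasta_records "$tmp"   # prints 2
rm -f "$tmp"
```

Compare the count with the number of sequences reported by the search (for example, the 22 mRNA sequences selected earlier).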
19
(No Transcript)
20
(No Transcript)
21
(lycopersicum[ORGN] AND EST) AND "Solanum pennellii"[porgn:__txid28526]
22
(No Transcript)
23
(No Transcript)
24
(No Transcript)
25
(No Transcript)
26
Try: Use Unix to verify that the file contains all the sequences.
27
Nucleotide
Vitis[ORGN] AND EST
28
(No Transcript)
29
Note the syntax of the ENTREZ search invoked by the organism tree link.
30
For class, I recommend downloading the smaller
Nucleotide data set
31
Try: Use Unix to verify that the file contains all the sequences.
32
Now what? Which file needs to be formatted for
BLAST (formatdb)? Which file will be the query
file? What is the syntax for the BLAST (including
PATH)?
33
(No Transcript)
34
Formatdb: /path/formatdb -i /path/filename -p F
Run nucleotide BLAST (blastn): /path/blastall -p blastn -d /path/filename -i /path/filename -o filename -e 0.01