BLAST Sequence Searching in Registry - PowerPoint PPT Presentation

1
BLAST Sequence Searching in Registry
  • Soichi Tokizane
  • November 2002

2
You will learn
  • How sequences are represented in the Registry
    file today
  • How to use BLAST for similarity searching
  • Techniques for finding references to BLAST results

3
Sequence Information from CAS
4
CAS creates the Registry database
[Chart: CAS Registry growth since 1965; axis: Substances Registered (millions)]
5
CA has covered biochemistry journals and patents
since 1907
Oxidizing enzymes - (III) Specific nature of
tyrosinase and its action on products of
disintegration of protein compds. Arch.
Sci. phys. nat. gen. 24 1907
6
Today, the CA database contains a very complete
bioscience collection
It covers
  • Journals and patents from
  • More than 3,000 bioscience titles
  • Patents from 33 countries plus EP and WO
  • Over 500 books and over 300 book series
  • Conference proceedings
  • Dissertations

7
Over 40% of the 21.5 million bibliographic
records in CA cover bioscience information
8
Biomolecules (sequences) are a major substance
class in REGISTRY
36 million substances
9
Virtually all types of sequences are covered in
Registry
  • Sequences from earlier literature
  • Novel nucleic acid primers and probes
  • Protein sequences deduced from gene translation
    and ESTs
  • Sequences with uncommon or non-natural residues
  • Chemically modified sequences
  • Fusion proteins
  • Genetically engineered sequences
  • Protein nucleic acids (PNAs)

10
BLAST Sequence Similarity Searching
11
Registry offers several sequence search techniques
  • BLAST similarity (homology) searching
  • Similarity searching is the retrieval of sequence
    matches based on identity, conservation, and gaps
  • Sequence code match: exact, family, motif,
    pattern
  • Sequence name search
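
The three criteria named above (identity, conservation, and gaps) can be made concrete with a small Python sketch. It counts them for two sequences that are already aligned. The function name and the tiny set of "conservative" residue pairs are illustrative assumptions; BLAST itself scores conservation with a full substitution matrix.

```python
# Illustrative assumption: a tiny set of residue pairs treated as
# conservative substitutions. Real tools use a full substitution matrix.
EXAMPLE_CONSERVATIVE_PAIRS = frozenset(
    frozenset(pair) for pair in (("I", "L"), ("K", "R"), ("D", "E"))
)


def alignment_stats(aligned_query, aligned_subject,
                    conservative_pairs=EXAMPLE_CONSERVATIVE_PAIRS):
    """Count identities, positives, and gaps for two aligned strings.

    Both strings must have the same length; '-' marks a gap.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")

    identities = positives = gaps = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q == "-" or s == "-":
            gaps += 1
        elif q == s:
            identities += 1
            positives += 1          # an identity also counts as a positive
        elif frozenset((q, s)) in conservative_pairs:
            positives += 1          # conserved but not identical

    length = len(aligned_query)
    return {
        "length": length,
        "identities": identities,
        "positives": positives,
        "gaps": gaps,
        "percent_identity": 100.0 * identities / length if length else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical aligned pair, not taken from the Registry.
    print(alignment_stats("MKT-AIL", "MRTGALL"))
```

Counting positives separately from identities mirrors how alignment reports usually distinguish exact matches from conserved substitutions.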

12
BLAST is a similarity matching algorithm
  • BLAST stands for Basic Local Alignment Search
    Tool
  • Produced and offered by the U.S. National Center
    for Biotechnology Information (NCBI)
  • Designed to quickly compare nucleic acid and amino
    acid sequences against desired databases
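
To illustrate what "local alignment" means, here is a minimal Python sketch of the Smith-Waterman dynamic-programming score. This is not BLAST's algorithm: BLAST uses a faster heuristic and substitution matrices, whereas this sketch fills the full scoring table with simple match, mismatch, and gap values that are assumptions chosen for readability.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Smith-Waterman with a linear gap penalty. Scores never drop below
    zero, which lets the best-scoring region start and end anywhere.
    """
    previous_row = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current_row = [0]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                                  # start a new local alignment
                previous_row[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous_row[j] + gap,               # gap in b
                current_row[j - 1] + gap,            # gap in a
            )
            current_row.append(score)
            best = max(best, score)
        previous_row = current_row
    return best


if __name__ == "__main__":
    # Hypothetical sequences that share the region GYLGG.
    print(local_alignment_score("AAAGYLGGFLL", "TTGYLGGCC"))
```

Only the shared region contributes to the score, so unrelated flanking residues do not penalize the match. That is the property that makes local alignment suitable for finding similar segments inside longer sequences.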

13
Search Application
Find patent references for sequences similar to
the following recombinant human collagen. Conduct
a comprehensive search in Registry on STN.
MRAWIFFLLCLAGRALAAPLADYKDDDDKP GYLGGFLLVLHSQTDQEP
TCPLGMPRLWTG YSLLYLEGQEKAHNQDLGLAGSCLPVFSTL
HQVCHYAQRNDRSYWLASAAPLPRAWIFF MMPLSEEAIRPYVSRCAVC
EAPAQAVAVHS QDQSIPPCPQTWRSLWIGYSFLMHTGAGDQ
GGGQALMSPRAAPFLECQGRQGTLADY CHFFANKYSFWLTTVKADLQ
FSSAPAPDTL KESQAISRCQVCVKYS
14
CAS Registry BLAST via STN on the Web is easy to
use
  • 1. Install sequence plug-in
  • 2. Conduct Registry BLAST similarity search
  • 3. Search selected BLAST answers in STN to
    get the literature references

15
BLAST is available via STN on the Web
  • A plug-in must be downloaded and installed before
    using the BLAST module
  • It is a one-time only requirement
  • The plug-in is free
  • Clicking Get Sequence Plug-in takes you to
    easy-to-use instructions

16
Plug-in instruction page
17
Conduct Registry BLAST Similarity Search
18
Follow these steps for Registry BLAST searching
  • Launch CAS Registry BLAST
  • Submit sequence query
  • Examine results and return to STN
  • Continue searching in STN on the Web

19
Log on to STN on the Web and select the Sequence
Assistant
20
Select from one of three STN online options
before launch
Click the Launch button
21
The main and new search windows appear

22
Submit sequence query
  • In a new session, the only available option is
    Similar Sequences
  • Fast BLAST is available after the first search
  • Click on the Similar Sequences button to open the
    Search by Sequence query page

Search by Sequence
23
The Search by Sequence screen is easy to use
24
Type in a result name
  • Type desired name for sequence search
  • Alpha or numeric
  • Spaces and punctuation allowed
  • STN will assign sequential number if you do not
    name the search
  • The name can also be changed later in the Main
    Menu

25
Recall Sequence is useful for re-submitting the
same query with different settings
  • The most recently searched sequence is stored in
    a buffer that can be retrieved using this
    function
  • This function is grayed out when you first begin

26
Read from File allows you to upload directly from
a file
  • The file can be:
      • A text file (e.g. .txt)
      • In GCG or FASTA format
      • An STN record (SQIDE display)
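
For readers preparing a query file, the following Python sketch shows how a FASTA-format text file is structured and can be parsed: each record starts with a ">" header line, followed by one or more sequence lines. The function name and the sample header are hypothetical; this is not how the plug-in itself reads files.

```python
from typing import List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into a list of (header, sequence) tuples.

    Sequence lines belonging to one record are joined into one string.
    """
    records = []
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(">"):
            if header is not None:        # close the previous record
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    # Hypothetical header; the residues are the start of the example query.
    sample = ">query1 example protein\nMRAWIFFLLCLAGRA\nLAAPLADYKDDDDKP\n"
    for name, sequence in parse_fasta(sample):
        print(name, len(sequence))
```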

27
The sequence query must be in 1-letter code
  • The sequence query can be:
      • Copied and pasted
      • Read from File
      • Typed directly
      • A recalled sequence
  • The sequence length limit is 50,000 characters
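
Before pasting or uploading, a query can be checked locally against the two constraints above. The Python sketch below is a hypothetical pre-check and not part of the STN software. It assumes the 50,000-character limit applies after whitespace is removed and that only the 20 standard amino acid letters are allowed; the actual tool may accept additional codes.

```python
import re

MAX_QUERY_LENGTH = 50_000  # limit stated on the slide

# Assumption: the 20 standard amino acid 1-letter codes only.
PROTEIN_LETTERS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def normalize_query(raw, allowed=PROTEIN_LETTERS, max_length=MAX_QUERY_LENGTH):
    """Strip whitespace, upper-case, and validate a 1-letter-code query."""
    sequence = re.sub(r"\s+", "", raw).upper()
    if not sequence:
        raise ValueError("query is empty")
    unexpected = sorted(set(sequence) - allowed)
    if unexpected:
        raise ValueError("unexpected characters: " + "".join(unexpected))
    if len(sequence) > max_length:
        raise ValueError(
            "query has %d characters; the limit is %d" % (len(sequence), max_length)
        )
    return sequence


if __name__ == "__main__":
    # Spaces and line breaks, as often left by copy and paste, are removed.
    print(normalize_query("MRAWIFF LLCLA\nGRALA"))
```

For nucleic acid queries, a different `allowed` set (for example `frozenset("ACGTUN")`) could be passed in; that set is likewise an assumption.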

28
This screen is for inserting a sequence query
from file
29
The BLAST program to be used is selected next
30
Searches can be run on a subset of the Registry
File
  • For proteins, there are three options
  • The default is all CA sequences

Other options are available for nucleic acids,
such as including or excluding GenBank records.
31
BLAST default settings are optimized
  • Parameters can be modified
  • Search sensitivity
  • Low complexity filtering
  • Maximum number of answers
  • Show advanced options

32
Advanced functions should only be modified with a
thorough understanding of BLAST principles
  • Users are encouraged to contact bioinformatics
    departments for details, advice, and
    recommendations
  • Additional information is also available at the
    NCBI Web page: http://www.ncbi.nlm.nih.gov/

33
The Main Window is for managing results
  • The Main Window has columns for:
      • Assigned name
      • Type of search
      • Time created
      • Status
      • Results
      • Reviewed status

34
Results can be viewed once the search is complete
  • The results are permanently stored on STN, until
    deleted by the user
  • Old results can be reviewed when desired
  • Up to 50 result sets can be stored

Highlight a result, then view it
35
(No Transcript)
36
Alignments can be viewed individually
37
Alignments can be saved or printed
38
The saved file has a summary of all the hits
and scores
39
Select desired alignments for transfer to STN
  • Check boxes
  • Select by score category
  • Select all

40
Transfer RNs to STN
  • Select Transfer RNs to STN
  • Message indicates when the transfer is complete
  • Log off the BLAST system: select Exit from the
    File menu or close the browser

41
Retrieve RNs from BLAST
  • The Sequence Assistant page appears after you
    exit BLAST
  • Select the Retrieve RNs from BLAST option

42
Return to STN on the Web
  • STN will indicate if the session is logged off
  • If so, log on to STN on the Web
  • Select Sequence Assistant
  • Retrieve RNs from BLAST

To obtain a transcript of your session, you must
log in again.
Back to the STN on the Web login page
43
Continue STN Searching
44
L-Numbers are created from the automatic transfer
45
L-Numbers are used for reference searches
These search results can optionally be combined
with DGENE results through routine use of STN's
multifile searching.
46
STN Express with Discover! 6.01 is now available
for Sequence Searching
http://www.cas.org/ONLINE/STN/interact/express.html
47
CAS REGISTRY BLAST is now searchable from Express
48
Transferring BLAST data into an STN session is
seamlessly integrated into the software
49
A report merges an STN transcript and BLAST
alignment data
50
A report merges an STN transcript and BLAST
alignment data
51
CAS REGISTRY BLAST will offer enhancements that
are in great demand by customers
  • BLAST Alerts
  • 1000 answers (increased from 200)
  • Searching on <50 residues
  • BLAST version 2.2.3 from NCBI

These BLAST enhancements are also available
through STN on the Web (new plug-in required).
52
Setting up and managing CAS REGISTRY BLAST alerts
is easy
53
Searches can now be set to retrieve only
sequences that have 50 residues or fewer, a big
help for primers and drug targets
54
Summary
55
In conclusion, CAS Registry BLAST is necessary
for comprehensive sequence searching
  • The Registry file is a key resource for
    biotechnology information
  • CAS Registry BLAST provides a powerful and
    easy-to-use search engine
  • BLAST RNs can be searched using STN on the Web or
    STN Express to get related patent and journal
    references
  • Similarity searches in Registry can be combined
    with results from DGENE

56
The End