GenBank - PowerPoint PPT Presentation

Slides: 32
Provided by: tech165

Transcript and Presenter's Notes

Title: GenBank


1
GenBank
  • Nucleotide-only sequence database
  • Archival in nature
  • Data shared nightly among three collaborating
    databases:
  • GenBank at NCBI
  • DNA Database of Japan (DDBJ)
  • EMBL at EBI

2
The International Sequence Database Collaboration
Source: NCBI
3
NCBI site map: a good place to find resources
http://www.ncbi.nlm.nih.gov/Sitemap/index.html
4
GenBank Release 131.0, December 15, 2003
  • 30,968,418 sequences
  • 36,553,368,485 bases
  • Full release every two months
  • Incremental and cumulative updates daily
  • Available only through the Internet

ftp://ftp.ncbi.nih.gov/genbank/
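The two totals on this slide give a rough sense of how long a typical record is. A minimal sketch of that arithmetic, using only the figures quoted above (the variable names are illustrative):

```python
# Totals as quoted on the slide for this release.
total_sequences = 30_968_418
total_bases = 36_553_368_485

# Mean record length in bases; a crude summary, since real
# record lengths vary widely.
average_length = total_bases / total_sequences

print(f"Average sequence length: {average_length:.0f} bases")
```

The result is a little under 1,200 bases per record, which says nothing about the spread between short and very long entries.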
5
GenBank Record
  • Header: information that applies to the whole
    record
  • Features: annotations on the record
  • Sequence
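The three-part layout can be illustrated with a short sketch. It assumes the usual flat-file conventions, in which a line starting with FEATURES opens the feature table, a line starting with ORIGIN opens the sequence, and // ends the record. The sample record is invented for illustration (its accession, names, and bases are not real data), and split_record is a hypothetical helper, not part of any NCBI tool.

```python
# Invented, heavily abbreviated record used only for illustration.
SAMPLE_RECORD = """\
LOCUS       EXAMPLE01     24 bp    mRNA    linear   PRI 15-DEC-2003
DEFINITION  Made-up example record.
ACCESSION   EXAMPLE01
FEATURES             Location/Qualifiers
     source          1..24
                     /organism="Homo sapiens"
ORIGIN
        1 atggcgacct tgaccgatta gcat
//
"""


def split_record(text):
    """Split one flat-file record into header, features, and sequence."""
    header, features, sequence_lines = [], [], []
    current = header
    for line in text.splitlines():
        if line.startswith("FEATURES"):
            current = features          # feature table starts here
            continue
        if line.startswith("ORIGIN"):
            current = sequence_lines    # sequence starts here
            continue
        if line.startswith("//"):
            break                       # end of record
        current.append(line)
    # Drop position numbers and spaces, keeping only the bases.
    bases = "".join(ch for line in sequence_lines for ch in line if ch.isalpha())
    return header, features, bases


header, features, bases = split_record(SAMPLE_RECORD)
print(len(header), "header lines,", len(features), "feature lines,", len(bases), "bases")
```

For real work an established parser is a better choice; this sketch only shows where the three sections begin and end.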

6
GenBank Record
Header
  • Molecule Type
  • Locus Name
  • Sequence Length
  • Modification Date
  • Accession Number
  • Version Number
  • GenBank Division
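Several of these header fields sit on the first line of a record, the LOCUS line. A minimal sketch of pulling them out, assuming a whitespace-separated LOCUS line with eight fields (the topology field, "linear" here, is not mentioned on the slide). The sample line is invented, and parse_locus is a hypothetical helper. The accession and version numbers live on their own lines and are not handled here.

```python
from datetime import date

MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def parse_locus(line):
    """Parse a LOCUS line into the header fields named on the slide."""
    fields = line.split()
    if len(fields) != 8 or fields[0] != "LOCUS":
        raise ValueError("unexpected LOCUS line layout")
    _, name, length, _unit, molecule, _topology, division, stamp = fields
    day, month, year = stamp.split("-")
    return {
        "locus_name": name,
        "sequence_length": int(length),
        "molecule_type": molecule,
        "division": division,
        "modification_date": date(int(year), MONTHS.index(month.upper()) + 1, int(day)),
    }


# Invented example line; the locus name is not a real accession.
sample = "LOCUS       EXAMPLE01     24 bp    mRNA    linear   PRI 15-DEC-2003"
info = parse_locus(sample)
print(info)
```

Older records may lay out the LOCUS line differently, so the strict eight-field check is deliberate: it fails loudly rather than guessing.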
7
GenBank Record
Features
Link to sequence
8
GenBank Record
Sequence
9
Entrez
10
Entrez
http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi
Select GenBank
11
Find mRNA sequence for human epidermal growth
factor receptor
12
Specify human as the organism
Click Preview/Index
Specify human by selecting Organisms from the
All Fields drop-down menu
13
14
Limit your search
Exclude all technology-generated records
Select mRNA in the Molecule list
Select RefSeq in the database list
15
RefSeq
  • Database of reference sequences
  • Curated
  • Non-redundant: one record for each gene, or each
    splice variant, from each organism represented
  • Each record is intended to present an
    encapsulation of the current understanding of a
    gene or protein, similar to a review article

RefSeq FAQ
16
Molecular databases
17
Find Gene Name by searching LocusLink
http://www.ncbi.nlm.nih.gov/LocusLink/
Select organism
18
LocusLink
19
Find mRNA sequence for epidermal growth factor
receptor (EGFR)
Start with the gene name EGFR
  • Limit the search to Gene Name
  • Exclude all technology-generated records
  • Select mRNA as Molecule
  • Select RefSeq as source database
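The same limits can also be written as a single query string instead of being picked from menus. The sketch below assumes the Entrez field tags [Gene Name] and [Organism] and the property terms biomol_mrna[PROP] and srcdb_refseq[PROP], and it targets the E-utilities esearch endpoint; check these against the current Entrez help before relying on them. The helper names are hypothetical, the step that excludes technology-generated records is not encoded, and nothing is sent over the network.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(gene, organism):
    """Combine the limits from the slide into one Entrez query string."""
    terms = [
        f"{gene}[Gene Name]",      # limit the search to the Gene Name field
        f"{organism}[Organism]",   # restrict to one organism
        "biomol_mrna[PROP]",       # molecule type: mRNA
        "srcdb_refseq[PROP]",      # source database: RefSeq
    ]
    return " AND ".join(terms)


def esearch_url(query, db="nucleotide"):
    """Build (but do not fetch) an esearch URL for the query."""
    return ESEARCH + "?" + urlencode({"db": db, "term": query})


query = build_query("EGFR", "human")
print(query)
print(esearch_url(query))
```

Building the URL separately from fetching it makes the query easy to inspect, or to paste into the Entrez search box, before any request is made.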

20
Entrez Neighbors and Hard Links
Figure: Entrez neighbors and hard links. Labels: word weight, 3-D structure, VAST, phylogeny, protein sequences, BLAST. Source: NCBI
21
SRS List of Public SRS Servers
22
SRS List of Public SRS Servers
23
SRS Tutorial
24
http://srs.ebi.ac.uk
Database information: which databases are present
and when they were indexed
25
What is SRS?
  • Central resource for molecular biology data
  • Data retrieval system
  • - More than 250 databanks have been indexed
  • - More than 35 SRS servers on the WWW
  • Data analysis applications server
  • - 11 protein applications
  • - 6 nucleic acid applications
  • Uniform query interface on the web

26
History of SRS
  • 1990: development started at EMBL, Heidelberg
    (main author: Dr. Thure Etzold)
  • 1997: moved to EBI in Cambridge. Development
    work was supported by various grants, among
    others from EMBnet.
  • 1998: Etzold and his group join LION Bioscience

27
Why SRS?
  • Information retrieval
  • Easy way to retrieve information from sequence
    and sequence-related databases
  • Possibility to search for multiple words/other
    criteria
  • Linkage between different databases
  • E.g. Find all primary structures with known
    three-dimensional structure
  • ... and much more

28
Philosophy of SRS
Original database file: plain text, HTML, XML
29
The Library Select Page
30
SRS main toolbar tabs
  • Top Page displays databases in different
    database groups
  • Query displays either the standard or extended
    query form
  • Results or the query manager maintains a
    history of all the results obtained during a
    session
  • Projects or the project manager maintains a
    history of all queries and views used during a
    session
  • Views allows a user to define a user specific
    view for one or more databases
  • Databanks contains a list and some facts about
    the databases available in the system

31
Search terms in SRS
  • SRS indexed fields can be searched using any of
    the following:
  • Single word search
  • Multiple word phrases
  • Numbers and dates
  • Regular expressions
  • Wildcards
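To make the difference between these search styles concrete, the sketch below applies four of them to a small list of description strings. It is a generic illustration using Python's standard fnmatch and re modules, not SRS query syntax; the descriptions and function names are made up for the example.

```python
import fnmatch
import re

# Made-up description fields standing in for indexed entries.
DESCRIPTIONS = [
    "epidermal growth factor receptor",
    "fibroblast growth factor receptor 2",
    "insulin receptor",
    "epidermal growth factor",
]


def single_word(entries, word):
    """Entries containing the word as a whole token."""
    return [e for e in entries if word in e.split()]


def phrase(entries, text):
    """Entries containing the exact multi-word phrase."""
    return [e for e in entries if text in e]


def wildcard(entries, pattern):
    """Entries with at least one token matching a shell-style wildcard."""
    return [e for e in entries
            if any(fnmatch.fnmatchcase(w, pattern) for w in e.split())]


def regex(entries, pattern):
    """Entries matching a regular expression anywhere in the text."""
    compiled = re.compile(pattern)
    return [e for e in entries if compiled.search(e)]


print(single_word(DESCRIPTIONS, "insulin"))
print(phrase(DESCRIPTIONS, "growth factor receptor"))
print(wildcard(DESCRIPTIONS, "epi*"))
print(regex(DESCRIPTIONS, r"receptor( \d+)?$"))
```

A wildcard such as epi* matches any token with that prefix, while the regular expression can state more, here that the text ends in "receptor" optionally followed by a number.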