1 Selection of Resources for the Development of an Information Service Program in Molecular Biology and Genetics
- Ansuman Chattopadhyay, PhD
- Information Specialist in Molecular Biology and Genetics
- Health Sciences Library System
- University of Pittsburgh
2 Topics
- Multi-Step Life Sciences Research
- Literature Retrieval
- Sequence Analysis
- Laboratory Resources
- University of Pittsburgh HSLS Molecular Biology Information Service Program
3 Life Sciences Research: A Multi-Step Process
Hypothesis Generation
Knowledge Mining
Sequence Analysis
Laboratory Bench Work
Mol Biol Information Service
4 Literature Retrieval Resources
Hypothesis Generation
Knowledge Mining
Laboratory Bench Work
Sequence Analysis
- PubMed
- CellSpace Knowledge Miner
- PubGene
- Genomatix BiblioSphere
5 Too much information
83,130
31,596
6 Literature Retrieval Resources
http://www.pubgene.org
http://www.genomatix.de/
7 What is CellSpace?
CellSpace is a bioinformatics tool: a knowledge mining system that automatically detects, analyzes, and reports the logical relationships between four types of terms found in the research literature:
- molecule: proteins, genes, drugs
- function: biological processes and disease states
- cell type
- organism
8 What is CellSpace?
CellSpace Knowledge Miner: Literature Association
- Molecules: Drugs, Genes, Proteins
- Functions: Biological Functions, Disease States
- Cells Systems: Cells, Sub-cellular Components, Tissues and Organs
- Organisms
9 What can you do with CellSpace?
- Start with a single protein (or other molecule) and find its functions, the diseases in which it is implicated, and related molecules.
- Start with a disease or biological function and find related molecules or related functions.
- Start with two or more functions and find the related molecules that they have in common.
10 What can you do with CellSpace?
- Start with results from a high-throughput experiment (such as a cluster of co-regulated genes from microarray analysis) and easily find the functions that they share.
- Start with the results of proteomics experiments and quickly screen the data to distinguish published interactions from novel ones.
- View the literature that supports the connections found in CellSpace.
11 Start with a disease or biological function and find related molecules or related functions
CellSpace Knowledge Miner
- Find molecules related to apoptosis
12 [Screenshot with numbered callouts 1-5, including "Drag and drop" and "Click to select"]
13 Find molecules associated with apoptosis
Get references
Results are presented with a statistical likelihood value
14 CellSpace Knowledge Miner
15 How Does CellSpace Work?
CellSpace computers analyze the National Library of Medicine's MEDLINE database, performing proprietary statistical correlation analyses of the organisms, cell types, biological processes, and molecules reported in 655 selected life science research journals. The molecular relationships extracted from the literature are then stored in the CellSpace database, which can be queried via the CellSpace user interface. The information is updated every two weeks.
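The CellSpace correlation analysis is proprietary, so its actual method is not known here. As a rough illustration of the general idea of literature association, the sketch below scores term pairs by pointwise mutual information (PMI) over a handful of made-up abstracts. The function name, the toy data, and the choice of PMI are all assumptions for illustration, not the CellSpace algorithm.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_scores(abstract_terms):
    """Score each term pair by PMI across a list of per-abstract term lists.

    PMI > 0 means the two terms appear together more often than expected
    if they were independent; PMI < 0 means less often.
    """
    n_docs = len(abstract_terms)
    term_counts = Counter()
    pair_counts = Counter()
    for terms in abstract_terms:
        unique = sorted(set(terms))          # count each term once per abstract
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n_docs
        p_a = term_counts[a] / n_docs
        p_b = term_counts[b] / n_docs
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Hypothetical, hand-made term lists standing in for indexed abstracts.
toy_abstracts = [
    ["apoptosis", "caspase-3", "human"],
    ["apoptosis", "caspase-3", "mouse"],
    ["apoptosis", "p53", "human"],
    ["glycolysis", "hexokinase", "human"],
]
scores = cooccurrence_scores(toy_abstracts)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

Pairs are keyed in alphabetical order, so the score for a molecule and a function is looked up as, for example, `scores[("apoptosis", "caspase-3")]`. A production system would also need term normalization (synonyms, gene aliases) and a significance test, both omitted here.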
16 PubGene
The Network Browser tool displays literature association networks for a gene. The Set Cover Article Search tool lets you search the literature using a set covering algorithm, which is particularly useful for finding literature references for large sets of terms.
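The slide does not say which set covering algorithm PubGene uses. A common textbook approach is the greedy heuristic: repeatedly pick the article that mentions the most still-uncovered query terms. The sketch below shows that heuristic on made-up data; the function name, article IDs, and term sets are hypothetical.

```python
def greedy_set_cover(query_terms, article_terms):
    """Pick a small set of articles that together mention the query terms.

    article_terms maps an article ID to the set of terms it mentions.
    Returns (chosen article IDs, query terms no article covers).
    """
    uncovered = set(query_terms)
    chosen = []
    while uncovered:
        best_id, best_hit = None, set()
        # Sorted iteration makes tie-breaking deterministic.
        for article_id in sorted(article_terms):
            hit = uncovered & article_terms[article_id]
            if len(hit) > len(best_hit):
                best_id, best_hit = article_id, hit
        if best_id is None:      # no article covers any remaining term
            break
        chosen.append(best_id)
        uncovered -= best_hit
    return chosen, uncovered


# Hypothetical index of which gene symbols each article mentions.
toy_articles = {
    "article_A": {"BRCA1", "TP53", "ATM"},
    "article_B": {"BRCA1", "BRCA2"},
    "article_C": {"CHEK2"},
    "article_D": {"TP53"},
}
chosen, missing = greedy_set_cover(
    {"BRCA1", "BRCA2", "TP53", "ATM", "CHEK2", "XYZ1"}, toy_articles
)
print(chosen, missing)
```

The greedy choice does not guarantee the smallest possible set of articles, but it is simple and scales to large term sets, which is the use case the slide highlights.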
17 PubGene
18 PubGene
The query gene is shown in a bright red font in the graph, its direct neighbors in a darker red font, and neighbors of neighbors in a black font.
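The color scheme above corresponds to graph distance from the query gene. A minimal sketch of that idea, assuming the network is available as an adjacency list, uses breadth-first search limited to depth 2. The network below is toy data, and the function and variable names are hypothetical rather than part of PubGene.

```python
from collections import deque

FONT_BY_DEPTH = {0: "bright red", 1: "darker red", 2: "black"}


def neighbor_levels(graph, query_gene, max_depth=2):
    """Return {gene: distance from query_gene} for genes within max_depth."""
    depth = {query_gene: 0}
    queue = deque([query_gene])
    while queue:
        gene = queue.popleft()
        if depth[gene] == max_depth:
            continue                      # do not expand beyond the outer ring
        for neighbor in graph.get(gene, ()):
            if neighbor not in depth:     # first visit gives the shortest distance
                depth[neighbor] = depth[gene] + 1
                queue.append(neighbor)
    return depth


# Toy literature-association network (illustrative edges only).
toy_network = {
    "BRCA1": ["BARD1", "TP53"],
    "BARD1": ["BRCA1"],
    "TP53": ["BRCA1", "MDM2"],
    "MDM2": ["TP53", "MDM4"],
    "MDM4": ["MDM2"],
}
levels = neighbor_levels(toy_network, "BRCA1")
fonts = {gene: FONT_BY_DEPTH[d] for gene, d in levels.items()}
print(fonts)
```

Genes more than two steps from the query (here `MDM4`) are left out, matching a display that shows only neighbors and neighbors of neighbors.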
19-24 BiblioSphere
25-26 Resources Comparison
27 Information Hubs
Hypothesis Generation
Knowledge Mining
Sequence Analysis
Laboratory Bench Work
Information hubs are molecular biology and genetics resources that serve as access points for retrieving a broad range of information through a small number of selected web-based public databases.
28 Information Hubs
- UCSC Genome Bioinformatics Resources
  - Genes Detail Page
  - Genome Browser
  - Family Browser
  - Proteome Browser
- SwissProt
- LocusLink / Entrez Gene
- GeneCards
- GeneLynx
- Incyte Proteome BioKnowledge Library
- Human Protein Reference Database
- Organism Genome Consortium sites
29 Information Hubs
Gene Expression Data
Sequence: Genomic, mRNA, Protein
LocusLink
UCSC Family browser
RNA Structure
SwissProt
Protein Structure
OMIM
GeneCards
CGAP
UCSC Genes Detail Page
GO Annotations: Molecular function, Bio pathways, Cellular component
Other Species
GeneLynx
PubMed
Mouse Genome Informatics
AceView
UCSC genome browser
UCSC Proteome browser
30 Information Hubs
http://genome.ucsc.edu/cgi-bin/hgGene?hgsid=31408663&db=hg16&hgg_gene=U14680&hgg_chrom=chr17&hgg_start=41570859&hgg_end=41650551
31 Information Hubs
http://www.ncbi.nlm.nih.gov/LocusLink/LocRpt.cgi?l=672
32 Information Hubs
33 Information Hubs
http://bioinfo.weizmann.ac.il/cards-bin/carddisp?BRCA1&search=BRCA1&suff=txt
34 Information Hubs
http://www.hprd.org/protein/00218
35 Information Hubs
36 Information Hubs
Sequence
Disease
Expression in Organ/Tissue, Cell Type, Tumor Type
Gene Ontology terms
Proteome BioKnowledge Library
Protein Interactions
Gene Regulation
Protein Modifications
Literature Excerpts
37 Resources Comparison
38 Genome Browsers
39 Molecular Database Catalog
http://nar.oupjournals.org/
- Nucleic Acids Research Database Issue
40 Growth of Molecular Databases
41 Database Catalog
http://www.infobiogen.fr/services/dbcat/
42 Sequence Analysis
Hypothesis Generation
Knowledge Mining
Sequence Analysis
Laboratory Bench Work
MolBiol Tools: Restriction mapping, PCR primer design
Sequence Search
Sequence Alignment
Sequence Manipulation
43 Web Server Catalog
http://nar.oupjournals.org/
- Nucleic Acids Research Database Issue
44 Sequence Analysis
http://www.bioinformatics.vg/
http://healthlinks.washington.edu/index.cfm?id=210BCCB7-511A-4C6B-8B40-DFC47AABEA7F
http://www.hsls.pitt.edu/guides/genetics
45 Sequence Analysis
http://www.bioinformatics.vg/
46-48 Sequence Analysis
49 Sequence Analysis
DNAStar
LaserGene
PC/Mac
PC/Mac
50 Sequence Analysis
Vector NTI
- Software: Vector NTI core, AlignX, ContigExpress, GenomBench, BioAnnotator
- Database: DNA/RNA, Protein, Oligo, Enzyme, Gel Marker, Blast Result, Analysis Result
51 Sequence Analysis
The VectorNTI Advanced software suite consists of five independent yet interconnected components:
- Vector NTI core: the cornerstone application of the Vector NTI suite; provides tools for sequence analysis and molecule manipulation
- AlignX: a multiple sequence alignment tool
- ContigExpress: a DNA sequence assembly and sequencing project management tool
- GenomBench: a tool for genomic DNA sequence analysis and annotation
- BioAnnotator: a tool for functional annotation of DNAs and proteins
52 Sequence Analysis
Using Vector NTI, molecular biologists can:
- Perform routine sequence analysis tasks such as restriction mapping, identifying protein-coding regions, finding sequence motifs, and carrying out sequence similarity searches
- Generate recombinant cloning strategies and protocols
- Design and analyze PCR primers
- Catalog a growing number of plasmids and PCR primers in order to track the origin and lineage of recombinant molecules
- Run in silico gel electrophoresis
- Perform and edit multiple sequence alignments on proteins and nucleic acids
- Create publication-quality graphics, and more
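To make the restriction mapping task in the list above concrete, here is a minimal sketch that locates recognition sites by plain string search. It is not how Vector NTI works internally; the function name and the toy sequence are hypothetical. The three recognition sequences are the standard ones for EcoRI, BamHI, and HindIII, and because each is palindromic, scanning one strand is enough.

```python
# Standard recognition sequences (5' to 3') for three common enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def restriction_map(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [1-based start positions of each recognition site]}."""
    sequence = sequence.upper()
    result = {}
    for enzyme, site in sites.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start + 1)               # report 1-based coordinates
            start = sequence.find(site, start + 1)    # allow overlapping matches
        result[enzyme] = positions
    return result


# Made-up sequence with two EcoRI sites and one BamHI site.
toy_sequence = "ttGAATTCacgtGGATCCaaGAATTCgg"
site_map = restriction_map(toy_sequence)
print(site_map)
```

The positions mark where each recognition site starts, not the exact cut position within the site; a fuller tool would add cut offsets, circular sequences, and degenerate recognition sequences.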
53 Laboratory Resources
Hypothesis Generation
Knowledge Mining
Sequence Analysis
Laboratory Bench Work
Protocols
Useful Laboratory Resources
54 Laboratory Resources
http://www.interscience.wiley.com/c_p/index.htm
- Basic Protocol
- Alternate Protocol
- Commentary
- Critical Parameters
- Troubleshooting
- Time Considerations
- Key References
- Internet Resources
55 Laboratory Resources
http://researchlink.labvelocity.com/
56 HSLS Mol Biol Information Service
57 HSLS Mol Biol Information Service
http://www.hsls.pitt.edu/guides/genetics
58 Website Usage Report
http://www.hsls.pitt.edu/guides/genetics
59 Workshops
May 2003-April 2004
- Workshop 1: Information Hubs
- Workshop 2: Sequence Similarity Searching
- Workshop 3: DNA Protein Analysis Tools
- Workshop 4: CellSpace Knowledge Miner
- Workshop 5: VectorNTI
60 One-on-one Consultation
Total: 70 (2003-2004)
61 "...only half of biomedical researchers using genome databases are familiar with the tools that can be used to actually access the data. ... all scientists on the planet must be empowered to use these powerful databases to unravel longstanding scientific mysteries."
Andreas D. Baxevanis and Francis S. Collins, Nature Genetics, September 2002, Vol 32