PGA Education Module: Bioinformatics Tools, ELXR and eTBLAST

Slides: 19
Provided by: simmo5
1
PGA Education Module: Bioinformatics Tools, ELXR
and eTBLAST
UTSW PGA Education Module: Bioinformatics Tools,
ELXR and eTBLAST. Webex, 12/17/03
Presented by Skip Garner. This reflects the work
of many researchers and staff.
2
Agenda for module - Today
  • Welcome PGAers and others
  • Interested persons should have visited our
    web-site education module:
  • http://pga.swmed.edu/new_pga/Dreamweaver/
  • Brief introduction
  • Other tools to be addressed in future Webex demos
  • Intro to hardware and use tracking
  • Operations and use of the ELXR code
  • Utility, use
  • Examples
  • Operations and use of the eTBLAST code
  • Utility, use
  • Examples
  • Questions and Answers

3
A family of bioinformatics tools and their
computed databases has been developed.
Genomic Annotation
Text data mining
Gene collection identification and analysis
Polymorphism Prediction
4
Our Computational Biology / Bioinformatics
toolset is very applied.
  • ELXR: Exon locator and extractor for
    resequencing.
  • POMOUS/Rep-X and SNIDE: polymorphism prediction
    software.
  • eTBLAST, FRISC, TRITE, IRIDESCENT: Text data
    mining and knowledge discovery tools.
  • PANORAMA: A DNA/Protein sequence analysis and
    visualization tool.
  • ARROGANT: A gene/clone collection analysis tool.
  • Local BLAST Server: UTSW BLAST utility for
    comparison against EST/cDNA/RefSeq sequences from
    UTSW microarrays, specialized collections and
    BioThreat work.
  • MarC-V, Signal, SNPCEQer.

5
Hardware and Databases
We have established hardware and databases for
this effort.
  • Linux, UNIX, Solaris and Windows servers
  • >10 TB primary storage
  • All major languages, scripts and databases
6
For our applications that have web interfaces, we
monitor their usage and can estimate their
utility to the wider research community.
These numbers are only for users external to
UTSW, and they may contain web-bot hits that we
estimate from their origin to be about 50% of the
total.
7
Primers for all human, mouse and rat exons, etc.
have been computed and experimentally verified.
ELXR components:
  • fastacmd (NCBI): genomic and RefSeq mRNA
    sequence retrieval
  • blastn (NCBI): local genomic sequence alignment
    to EST/cDNA sequence input
  • sim4 (PSU): alignment of transcribed and spliced
    DNA sequence to genomic sequence containing that
    gene while predicting donor/acceptor sites
    between introns and exons
  • primer3 (WI): designs PCR/sequencing primer
    pairs
[Figure labels: H2AFY2; ELXR; Exon 1 primers]
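
The final ELXR step, placing a primer pair around an exon so the whole exon falls inside the amplicon, can be illustrated with a toy sketch. ELXR delegates this step to primer3; the code below does not call primer3 or reproduce its scoring. It assumes exon coordinates are already known (in ELXR they would come from the sim4 alignment), picks the 20-mer closest to 50% GC in each flank, and estimates melting temperature with the Wallace rule. All function names and default values are hypothetical.

```python
from typing import NamedTuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Wallace-rule melting temperature estimate: 2*(A+T) + 4*(G+C)."""
    gc = primer.count("G") + primer.count("C")
    at = primer.count("A") + primer.count("T")
    return 2 * at + 4 * gc


def best_kmer(region: str, k: int) -> tuple[int, str]:
    """Return (offset, k-mer) whose GC fraction is closest to 50%.

    Ties go to the leftmost candidate. This is a stand-in for real
    primer scoring, which weighs many more factors.
    """
    candidates = [(i, region[i:i + k]) for i in range(len(region) - k + 1)]
    return min(
        candidates,
        key=lambda c: abs((c[1].count("G") + c[1].count("C")) / k - 0.5),
    )


class PrimerPair(NamedTuple):
    forward: str
    reverse: str
    product_size: int


def flanking_primers(genomic: str, exon_start: int, exon_end: int,
                     flank: int = 60, k: int = 20) -> PrimerPair:
    """Pick one primer in each flank of an exon.

    exon_start and exon_end are 0-based, end-exclusive coordinates
    on `genomic`.
    """
    left_lo = max(0, exon_start - flank)
    right_hi = min(len(genomic), exon_end + flank)
    left = genomic[left_lo:exon_start]
    right = genomic[exon_end:right_hi]
    if len(left) < k or len(right) < k:
        raise ValueError("flanking sequence shorter than primer length")
    f_off, fwd = best_kmer(left, k)
    r_off, rev_top = best_kmer(right, k)
    # Amplicon runs from the first base of the forward primer to the
    # last base of the reverse primer's binding site.
    product = (exon_end + r_off + k) - (left_lo + f_off)
    return PrimerPair(fwd, reverse_complement(rev_top), product)
```

Because both primers are taken from the flanks, the amplicon always spans the full exon, which is the property a resequencing pipeline needs.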
8
ELXR data sets have been verified, and ELXR is now
used in the NHLBI Program in Genomics Applications
(PGA) SNP discovery pipeline.
Latest numbers: >6,000 primer pairs in use for
the PGA project.
9
Pathogene
  • Primers for every ORF for every microorganism
    computed
  • From genome annotation
  • From GLIMMER output (overestimates ORFs)
  • Primer pairs in experimental validation
  • Soon to be available via our www page at
    rce.swmed.edu (along with our dedicated
    microorganism BLAST server)

[Figure: Pathogene interface, showing an example ORF search and the resulting primers]
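
To make "primers for every ORF" concrete, here is a minimal sketch of enumerating candidate ORFs whose coordinates could then be passed to primer design. This is not GLIMMER, which uses statistical models to call genes; it is a naive forward-strand scan for ATG...stop runs, and the function name and the `min_codons` threshold are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) pairs for ATG...stop runs on the forward strand.

    Coordinates are 0-based and end-exclusive, and the stop codon is
    included in the span. Only the first ATG before each stop is used.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

A scan like this reports every open frame above the length cutoff, whether or not it encodes a real gene, which gives some intuition for why purely computational ORF calls tend to overestimate.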
10
eTBLAST: electronic Text Basic Local Alignment
and Similarity Tool. For document clustering, a
new/better way for us to access the literature.
11
eTBLAST is an automated document similarity search
and retrieval tool.
  • Input is text (paragraphs, proposals, abstracts,
    documents, sentences) via a www browser.
  • Stop words eliminated, keywords extracted (with
    lexical variation and synonyms expanded in query)
    and weighted
  • An indexed database of documents (Medline with
    13,000,000 abstracts and some book chapters
    currently implemented) is searched and ranked for
    similarity. Alternate similarity algorithms
    planned (grammar induction).
  • Top 200 hits re-ranked using a dynamic
    programming algorithm.
  • Variety of data outputs in a browser.
  • This is a work in progress, but has already found
    a number of users (and has made me look smarter
    than I really am).
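
The first stages listed above (stop-word removal, keyword weighting, and ranking an indexed collection by similarity) can be sketched as follows. The slides do not say which weighting or similarity measure eTBLAST uses, so the TF-IDF weights and cosine similarity here are assumptions, as are the tiny stop-word list and all function names. Lexical variation and synonym expansion are not implemented.

```python
import math
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "with", "by"}


def keywords(text: str) -> list[str]:
    """Lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def idf_weights(docs: list[str]) -> dict[str, float]:
    """Smoothed inverse document frequency for every keyword in docs."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(keywords(doc)))
    return {term: math.log((n + 1) / (count + 1)) + 1.0
            for term, count in df.items()}


def weighted_vector(text: str, idf: dict[str, float]) -> dict[str, float]:
    """Term frequency times IDF; terms unseen in the collection are dropped."""
    tf = Counter(keywords(text))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def rank(query: str, docs: list[str], top_n: int = 200) -> list[tuple[float, int]]:
    """Return up to top_n (score, doc_index) pairs, best first."""
    idf = idf_weights(docs)
    q = weighted_vector(query, idf)
    scored = [(cosine(q, weighted_vector(d, idf)), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored[:top_n]
```

The `top_n=200` default mirrors the "top 200 hits" figure from the slide; those hits would then go to the re-ranking stage.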

12
The eTBLAST algorithm, with its natural query
input, filtering, and output, is philosophically
similar to BLAST.
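
One way to picture the BLAST analogy is to treat words as the alphabet and score a local alignment between the query and a candidate document. The slides only say that top hits are re-ranked with "a dynamic programming algorithm", so the Smith-Waterman-style scoring below is an assumed stand-in, not the documented eTBLAST method. The scoring values and function names are hypothetical.

```python
def local_alignment_score(query_words: list[str], doc_words: list[str],
                          match: int = 2, mismatch: int = -1,
                          gap: int = -1) -> int:
    """Best Smith-Waterman local alignment score over word tokens.

    Only two rows of the dynamic programming table are kept, since
    the score alone is needed, not the alignment itself.
    """
    prev = [0] * (len(doc_words) + 1)
    best = 0
    for qw in query_words:
        curr = [0]
        for j, dw in enumerate(doc_words, start=1):
            diag = prev[j - 1] + (match if qw == dw else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def rerank(query: str, hits: list[str]) -> list[str]:
    """Order candidate texts by local alignment score, best first."""
    q = query.lower().split()
    return sorted(hits, key=lambda h: -local_alignment_score(q, h.lower().split()))
```

Unlike a bag-of-keywords score, an alignment score rewards documents that preserve the word order of the query, which is why it is plausible as a second-pass filter on a short list of hits.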
13
Where eTBLAST has advantages
  • First, eTBLAST is no substitute for traditional
    search engines
  • eTBLAST is particularly valuable when entering a
    new area of research for which selection of
    keywords may be difficult
  • eTBLAST can be applied to bulk text that one
    often has on hand:
  • Abstracts while reviewing papers
  • Grant proposal abstracts can be directly
    submitted
  • Student proposals
  • Reference finding while writing papers or
    proposals
  • Uniqueness searching for proposed manuscripts or
    patentable ideas
  • General text that defines a new area you are
    beginning to study

14
Using eTBLAST
  • Go to
    http://innovation.swmed.edu/Biocomputing/Computing.htm,
    select eTBLAST and follow the directions
    (page 1) (page 2)
  • Those directions include
  • Pasting in or entering text
  • Enter the email address where you want the
    results sent
  • Go to the next page and refine your search
    parameters or go with the default
  • Submit the search
  • Wait, and the results will be sent to you
  • Inspect the results, interact with the returned
    links
  • Refine your search and/or iterate
  • Save the link or save the page locally;
    increased usage may require results to be purged
    roughly monthly

15
This is how you receive your results
[Screenshot annotations: "Click here and your results will come up in your browser"; "We monitor our email"; "Raw results (not user friendly)"]
16
eTBLAST by extension has led to other
opportunistic applications: FRISC, TRITE
  • eTBLAST: similarity comparison engine for
    electronic text using weighted keywords, concepts
    and grammar induction. Other types of literature
    have begun to be available (Book: Cancer
    Medicine).
  • FRISC: using eTBLAST, a UTSW faculty research
    interests page is checked regularly against new
    biomedical abstracts from Medline, which are
    ranked and clustered to surface the information
    that best fits the researcher's interests.
    (UTSW, Brown) A PGA-specific User Profile
    Builder will be available soon!
  • TRITE: using eTBLAST, topical interests will be
    searched regularly against new biomedical
    abstracts in Medline.

17
eTBLAST is still experimental
  • eTBLAST is offered as a free service of the
    Garner Lab
  • eTBLAST is not funded, but we continue to develop
    and extend it, and new features and increased
    speed are coming
  • eTBLAST is experimental, and as such there will
    be bugs and occasional temporary problems with
    availability
  • Email us with problems, suggestions, comments

18
http://pga.swmed.edu/