Legume Information Network: A Component of the Virtual Plant Information Network - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Legume Information Network: A Component of the
Virtual Plant Information Network
  • National Center for Genome Resources
  • University of Minnesota Center for
    Computational Genomics and Bioinformatics
  • United States Department of Agriculture
    Agricultural Research Service

Gregory D. May, Atlanta, October 2007
2
Current State of Bioinformatics Resources
  • Hundreds of Project web-sites and DBs
  • Project DBs are distributed, autonomous and
    ephemeral
  • Inconsistent user interfaces

TIGR Gene Indices
  • Stein et al. (2006) Plant Biology Databases: A
    Needs Assessment by the NSF-USDA Working Group on
    Long-Lived Databases.

3
The promises of 30 high throughput omics
technologies
  • Improved crops
    • nutrition, novel traits, resistance, yield, sustainability
  • Improved animal production
  • Improved human health
    • biomarker diagnostics
    • personalized medicines and therapies
  • Improved environment
    • bioremediation
    • carbon sequestration
    • energy independence

4
The need
  • The legume biologist still must navigate multiple
    information resources for many research questions
  • Develop a virtual, easy-to-navigate one-stop
    legume information network. By one-stop we
    refer by analogy to Google and how it can be seen
    as a single, yet non-exclusive, information
    resource.
  • Gepts et al., Report from the CATG meeting.
    Plant Physiology (2005) 137:1228.

5
(No Transcript)
6
Virtual Plant Information Network
  • Establish an architecture based on semantic web
    technologies to support an interoperable (database)
    network
  • Standardize data formats and user-interfaces to
    support machine readable representation of
    genomes, genetic maps, polymorphisms, QTL,
    expression, proteins, metabolites and phenotypes.
  • Develop breeders' toolboxes with visual
    interfaces similar to the one depicted in GEYSIR

7
Goals
  • Design a solution for integrating disparate data
    sources
  • Develop a prototype, Legume Information Network,
    demonstrating the capabilities of semantic web
    technologies
  • Have the legume community take a leadership role
    in data and tool integration using semantic-MOBY

8
The Requirements
  • Devise a way in which resources can be described,
    discovered, and invoked on the web using:
    • a common syntax so machines can parse the data
      and services of each other
    • a public semantic so machines can make
      determinations on suitability-for-purpose
    • a discovery service so machines can find data
      and services across the web based on the
      semantics of the resources being offered and the
      needs of the task at hand
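
The three requirements above can be pictured with a toy registry. This is a minimal sketch and not the Semantic MOBY protocol: the service names are borrowed from the Services slide, but the class names (`Sequence`, `NucleotideSequence`, and so on), the `accepts`/`returns` fields, and the `discover` function are all hypothetical, chosen only to show a common syntax, a shared vocabulary, and a discovery step working together.

```python
# Toy illustration only: field and class names are hypothetical,
# not part of any real Semantic MOBY API.

# Public semantics: a shared vocabulary mapping each class to its parent.
SUBCLASS_OF = {
    "NucleotideSequence": "Sequence",
    "ProteinSequence": "Sequence",
    "Sequence": None,
}

# Common syntax: every provider describes its service with the same fields.
SERVICES = [
    {"name": "BlastN Legume BACs", "provider": "LIS",
     "accepts": "NucleotideSequence", "returns": "Alignment"},
    {"name": "Clustalw Multiple Sequence Alignment", "provider": "CCGB",
     "accepts": "Sequence", "returns": "Alignment"},
    {"name": "GO Annotations Retrieval", "provider": "LIS",
     "accepts": "GeneIdentifier", "returns": "GOAnnotation"},
]


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or descends from it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def discover(have, want):
    """Find services that accept the data we have and return what we want."""
    return [s["name"] for s in SERVICES
            if is_a(have, s["accepts"]) and s["returns"] == want]
```

Calling `discover("ProteinSequence", "Alignment")` would skip the nucleotide-only service and keep the one that accepts any `Sequence`, which is the kind of suitability-for-purpose decision the shared semantics is meant to enable.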

9
The Approach: Keep it simple
Clients, Providers, and even Discovery Servers
all read and contribute to the same set of
statements.
All actors understand a single, mutable graph
which embeds an explicit logic necessary and
sufficient to describe, query, discover, invoke,
and satisfy resources and requests.
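
One way to picture the single shared graph is as a set of subject-predicate-object statements that every actor can both add to and query. The sketch below is an illustration in plain Python under that assumption, not the project's implementation: the predicate names (`type`, `accepts`, `providedBy`) and the `contribute` and `match` helpers are invented for the example.

```python
# Hypothetical sketch: the graph is a set of (subject, predicate, object) triples.
graph = set()


def contribute(statements):
    """Any actor (client, provider, discovery server) adds statements."""
    graph.update(statements)


def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(t for t in graph
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


# A provider describes a resource.
contribute({
    ("blastn_bacs", "type", "Service"),
    ("blastn_bacs", "accepts", "NucleotideSequence"),
    ("blastn_bacs", "providedBy", "LIS"),
})

# A client states what it holds, in the same graph and the same vocabulary.
contribute({("my_query", "type", "NucleotideSequence")})

# Discovery: which resources accept the type of data the client holds?
held = match(s="my_query", p="type")[0][2]
candidates = [t[0] for t in match(p="accepts", o=held)]
```

Because description, request, and lookup all use the same statements, no separate registry format is needed in this sketch; that is the sense in which the approach stays simple.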
10
(No Transcript)
11
(No Transcript)
12
(No Transcript)
13
(No Transcript)
14
(No Transcript)
15
(No Transcript)
16
(No Transcript)
17
Services
Data Provider Services
Service Description                         Provider
GO Annotated Transcript Sequences           LIS
Medicago IMGAG Annotations                  CCGB
Precomputed BlastX against NCBI's NR        LIS
Blocks precomputed analysis retrieval       LIS
GenScan precomputed gene predictions        LIS
Sequence Text Retrieval                     LIS
GO Annotations Retrieval                    LIS
InterPro precomputed analysis retrieval     LIS

Visualization Services
Service Description                         Provider
Comparative Map and Trait Viewer            LIS
ISYS TableViewer                            LIS
Alignment visualization using PFAAT         CCGB

Analysis Services
Service Description                         Provider
Clustalw Multiple Sequence Alignment        CCGB
BlastN LIS Transcript Contigs               LIS
Blast sequences against Kegg Genes          CCGB
Blast sequences against TIGR TOG Sequence   CCGB
BlastN Legume BACs                          LIS
BlastN Lotus finished BACs                  LIS
18
LIN partners
19
Resources
  • A running Discovery Server: www.semanticMoby.org
  • The project web site: vpin.ncgr.org
  • Discussion forum: vpin.ncgr.org/mvnforum/forum
  • Collection of ontologies: ontologies.ncgr.org
  • Protocol documentation: ontologies.ncgr.org/OWLDocs/moby
  • Publications and other docs: vpin.ncgr.org/links.shtml
  • Developers resources: www.semanticmoby.org/developer/index.jsp
  • Provider Developer Kit: vpin.ncgr.org/provider.shtml
  • Client Developer Kit: vpin.ncgr.org/client.shtml
20
Generation of DNA Sequence Data
Year    Cost/1000 bp
1990    10.00
2000    3.00
2005    1.00
2006    0.10
2007    0.03
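
To put these figures in perspective, the fold reduction between any two years can be computed directly. The short script below is only a worked calculation on the values listed on the slide; the slide does not state a currency, so the numbers are treated as unitless cost per 1000 bp, and the `fold_reduction` helper is a name introduced here for illustration.

```python
# Cost per 1000 bp by year, as listed on the slide (currency not stated there).
cost_per_kb = {1990: 10.00, 2000: 3.00, 2005: 1.00, 2006: 0.10, 2007: 0.03}


def fold_reduction(start_year, end_year):
    """How many times cheaper sequencing was in end_year than in start_year."""
    return cost_per_kb[start_year] / cost_per_kb[end_year]


for year in sorted(cost_per_kb):
    if year != 1990:
        print(f"1990 -> {year}: {fold_reduction(1990, year):.1f}x cheaper")
```

On these numbers the steepest single step is between 2005 and 2006, a tenfold drop in one year.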
21
Sequencing Platform Comparison
22
Alpheus: Cyberinfrastructure for medical and
agricultural resequencing
  • Nucleotide variant and splice isoform detection
  • 100s Gb-scale resequencing projects
  • Short reads (454, Solexa, SOLiD plus Sanger)
  • Paired and unpaired
  • Alignments to genomic and transcriptomic
    references
  • Greek mythology: cleansed the Augean stables and
    restored life to the soil

23
Pileup Visualization
Slidable window
Overview of transcript
Coding domain
nsSNP SNP in/del
454 reads
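
A pileup view is built by stacking aligned reads over a reference and tallying the bases observed at each position. The sketch below illustrates that idea in a deliberately simplified form; it is not the Alpheus pipeline. It assumes ungapped alignments supplied as (start, sequence) pairs, and the `min_depth` and `min_fraction` thresholds are hypothetical stand-ins for the kind of filter a user might adjust.

```python
from collections import Counter, defaultdict


def pileup(reads):
    """Tally observed bases per reference position from (start, sequence) reads."""
    columns = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            columns[start + offset][base] += 1
    return columns


def call_snps(reference, reads, min_depth=3, min_fraction=0.8):
    """Report (position, ref_base, alt_base) where a non-reference base dominates."""
    snps = []
    for pos, counts in sorted(pileup(reads).items()):
        depth = sum(counts.values())
        alt, alt_count = counts.most_common(1)[0]
        if (alt != reference[pos] and depth >= min_depth
                and alt_count / depth >= min_fraction):
            snps.append((pos, reference[pos], alt))
    return snps


# Made-up example data: three of the four reads covering position 4 carry T.
reference = "ACGTACGTAC"
reads = [(0, "ACGTTCG"), (2, "GTTCGTA"), (3, "TTCGTAC"), (4, "ACGTAC")]

strict = call_snps(reference, reads)                     # default thresholds
relaxed = call_snps(reference, reads, min_fraction=0.7)  # looser filter
```

Changing `min_fraction` changes which candidate positions survive, which is the basic effect an interactive filter would expose.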
24
Dynamic Filtering
25
(No Transcript)
26
(No Transcript)
27
(No Transcript)
28
(No Transcript)
29
(No Transcript)
30
Summary of Medicago ecotype F83005.5 Solexa
resequencing
  • With 1x coverage of a 540 Mb genome
  • One SNP per 600 bp (no filtering)
  • 45,000 high-stringency SNPs
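
For a rough sense of scale, the unfiltered rate and the filtered count can be compared. The calculation below is a back-of-the-envelope sketch: it takes the unfiltered rate to be one SNP per 600 bp and assumes that rate applies uniformly across the whole 540 Mb genome, which the slide does not claim.

```python
# Figures taken from the slide.
genome_size_bp = 540_000_000      # 540 Mb genome
bp_per_snp_unfiltered = 600       # one SNP per 600 bp, no filtering
high_stringency_snps = 45_000

# Assumption: the unfiltered rate holds uniformly across the genome.
unfiltered_estimate = genome_size_bp // bp_per_snp_unfiltered
retained_fraction = high_stringency_snps / unfiltered_estimate

print(f"Estimated unfiltered SNPs: {unfiltered_estimate:,}")
print(f"Fraction passing stringent filters: {retained_fraction:.1%}")
```

Under that assumption, the stringent filters would keep only a small share of the raw candidates.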

31
(No Transcript)
32
Application of Next-Generation Sequencing
Technologies for Variant Detection in Crop Plants
and Pathogens
  • Whole transcriptome shotgun re-sequencing
  • Expressed portions (or gene space) of the genome
    across populations in the absence of a reference
    genome
  • Whole genome shotgun re-sequencing
  • Sequence across populations with available
    reference genomes
  • WGS skimming of transformation events
  • Target genome re-sequencing across populations
  • Area under the QTL
  • Pooled long-PCR products to walk between markers
  • Restriction enzyme-anchored

33
GEYSIR (Genomic Explorer y Survey of Immune
Response)
geysir.ncgr.org
Clickable LOD scores move selection windows
Map region selection windows (grab slide)
Marker on linkage map (cM)
Zoom and pan buttons
View Selected Studies (across all chromosomes)
Sample study 1
Marker on physical map (Mb)
Chromosome Map
Marker titles visible in this 1.5 Mb region
Candidate genes in blue
CTRL-left mouse click takes you to Gene detail
page
Slide-able feature neighborhood window
Nucleotide slider window
Exons in green
Click on chromosome 22
SNP markers
Clickable SNP bubbles take you to dbSNP
Nucleotide slider window View
34
Acknowledgements
  • NCGR LIS
    • Greg May
    • Kamal Gajendran
    • Andrew Farmer
    • Michael Gonzales
    • Selene Virk
    • Bill Beavis
  • USDA-ARS LIS
    • Randy Shoemaker
    • David Grant
    • Rich Wilson
  • NCGR GEYSIR
    • Susan Baxter
    • Faye Schilkey
    • Neil Miller
    • Dan Weems
    • Lar Mader
  • USDA-ARS LIN
    • Randy Shoemaker
    • Michelle Graham
  • CCGB/U. Minn LIN
    • Ernest Retzel
    • Jim Johnson
    • Michael Heuer
    • John Crow
  • NCGR VPIN/LIN
    • Damian Gessler
    • Gary Schiltz
    • Bill Beavis
    • Andrew Farmer
    • S. Knapp
    • N. Young
  • Funding
    • LIS/LIN: USDA-ARS SCA 3625-21000-038-01
    • GEYSIR: NIH-NIAID HHSN266200400064C
    • VPIN: NSF-BDI 0516487
  • LIS Steering Committee
    • Mark Burow
    • Doug Cook
    • Perry Cregan
    • Rebecca Dickstein
    • David Grant
    • Randy Shoemaker
    • Michael Udvardi
    • Nevin Young