GreenPhylDB: Large-scale phylogenomic analyses for gene function prediction in GCP crops

1
GreenPhylDB: Large-scale phylogenomic analyses for
gene function prediction in GCP crops
Matthieu CONTE
Christophe PERIN
Marie-angélique LAPORTE
Mathieu ROUARD (PI)
2
Objectives: plant comparative genomics
  • Why?
  • Many sequencing projects are under way. Bioinformatics
    databases and tools are needed to drive experiments in
    gene functional genomics.
  • How?
  • Using phylogenomic methods, the only way to
    identify orthologous genes (which probably share the
    same function).
  • What data?
  • Use model plants with complete genomes.
  • This is the only way to obtain complete and correct
    ortholog predictions (or the absence of a prediction).
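
As a rough illustration of how a phylogenomic approach can separate orthologs from paralogs, the sketch below applies the species-overlap rule to a toy gene tree. The slides do not say which inference method GreenPhylDB uses, so the rule itself, the nested-tuple tree encoding, the function names and the gene labels are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy gene tree: leaves are (species, gene) pairs,
# internal nodes are 2-tuples of subtrees. All labels are invented.
TREE = (
    (("rice", "Os_a"), ("arabidopsis", "At_a")),
    (("rice", "Os_b"), ("arabidopsis", "At_b")),
)

def is_leaf(node):
    # A leaf is a (species, gene) pair of strings.
    return isinstance(node[0], str)

def leaves(node):
    # Collect all (species, gene) leaves below a node.
    if is_leaf(node):
        return [node]
    return leaves(node[0]) + leaves(node[1])

def ortholog_pairs(node):
    """Return gene pairs whose last common ancestor is a speciation node.

    Species-overlap rule (an assumption here): a node counts as a
    duplication when its two subtrees share at least one species,
    otherwise as a speciation.
    """
    if is_leaf(node):
        return set()
    left, right = node
    pairs = ortholog_pairs(left) | ortholog_pairs(right)
    left_leaves, right_leaves = leaves(left), leaves(right)
    left_species = {species for species, _ in left_leaves}
    right_species = {species for species, _ in right_leaves}
    if not (left_species & right_species):  # speciation node
        for (_, g1), (_, g2) in product(left_leaves, right_leaves):
            pairs.add(frozenset((g1, g2)))
    return pairs

pairs = ortholog_pairs(TREE)
```

In this toy tree the root is treated as a duplication because both subtrees contain rice and Arabidopsis, so only the within-subtree pairs are reported as orthologs.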

3
GreenPhylDB V1.0: http://greenphyl.cines.fr/
  • Oryza sativa and Arabidopsis thaliana as model
    plants
  • Full genomes available.
  • High-quality gene annotation (TAIR release 8, TIGR
    release 5).
  • Most of the available functional evidence.

4
GreenPhylDB V1.0: Statistics
  • 81,000 genes
  • 6,420 manually annotated gene families
  • 4,400 phylogenetically analysed gene families
  • 24,000 ortholog relationships between rice and
    Arabidopsis
  • (confidence score > 90)
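
A minimal sketch of the kind of filtering implied by the confidence threshold above. The record layout, the gene identifiers and the scores are hypothetical; the slides only state that the reported relationships have a confidence score above 90.

```python
# Hypothetical records: (rice gene, Arabidopsis gene, confidence score).
# Identifiers and scores are invented for illustration.
predictions = [
    ("Os_gene_1", "At_gene_1", 97.0),
    ("Os_gene_2", "At_gene_2", 90.0),
    ("Os_gene_3", "At_gene_3", 64.5),
]

def confident_orthologs(records, threshold=90.0):
    """Keep pairs whose score is strictly greater than the threshold."""
    return [(rice, ath) for rice, ath, score in records if score > threshold]

kept = confident_orthologs(predictions)
```

The comparison is strict, so a pair scoring exactly 90 is dropped; whether the database treats the boundary this way is not stated in the slides.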

5
GreenPhylDB V1.0: Easy to use and to find
information
Gene ID (TAIR, TIGR)
Family name
Gene name (alias)
Gene annotation
KEGG ID
InterPro ID
6
GreenPhylDB V1.0: If you have only the sequence.
7
GreenPhylDB V1.0: If you have only a sequence.
8
GreenPhylDB V1.0: Ortholog prediction
9
GOST (GreenPhyl Ortholog Search Tool): How to
study a gene from another species?
10
GOST (GreenPhyl Ortholog Search Tool): 2 objectives
11
GOST (GreenPhyl Ortholog Search Tool): Paste ONE
complete protein sequence
12
GOST (GreenPhyl Ortholog Search Tool): Output
13
GOST (GreenPhyl Ortholog Search Tool): Output
Your gene
14
i-GOST beta (Iterative GreenPhyl Ortholog Search
Tool): How to integrate more genes?
15
i-GOST beta (Iterative GreenPhyl Ortholog Search
Tool): Paste at most 20 complete protein sequences
16
i-GOST beta (Iterative GreenPhyl Ortholog Search
Tool): Step 1, gene classification and species
selection
17
i-GOST beta (Iterative GreenPhyl Ortholog Search
Tool): Step 2, phylogenomic predictions
18
i-GOST beta (Iterative GreenPhyl Ortholog Search
Tool): Usual errors
  • Integration of genes from different families.
  • Integration of incomplete sequences.
  • We noticed some problems during analysis of some
    sequences
  • Please note that i-GOST is a BETA version
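
Two of the input constraints mentioned for i-GOST (at most 20 sequences, complete protein sequences only) can be checked before submission. The sketch below is one possible pre-check, not the tool's own validation: the function names are hypothetical, and "starts with methionine" is only an assumed heuristic for completeness. Detecting genes from different families would require the database itself and is not attempted here.

```python
MAX_SEQUENCES = 20  # limit stated on the i-GOST input slide
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard residue codes

def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def check_input(text):
    """Return a list of problems found; an empty list means none were found."""
    records = parse_fasta(text)
    problems = []
    if not records:
        problems.append("no FASTA records found")
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"{len(records)} sequences given, maximum is {MAX_SEQUENCES}"
        )
    for header, seq in records:
        seq = seq.rstrip("*")  # tolerate a trailing stop symbol
        if not seq.startswith("M"):
            # Assumed heuristic: a complete protein begins with methionine.
            problems.append(f"{header}: does not start with M, possibly incomplete")
        if set(seq) - AMINO_ACIDS:
            problems.append(f"{header}: contains non-standard residue codes")
    return problems

example = ">geneA\nMKTLLVAG\n>geneB\nKTAXLV\n"
report = check_input(example)
```

Here `geneA` passes both checks, while `geneB` is flagged twice: it does not start with M and it contains the ambiguous code X.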

19
GreenPhylDB V2.0 (in progress): Objectives
  • 10 new fully sequenced genomes are now available
  • (Populus alba, Glycine max, Sorghum bicolor,
    Medicago truncatula, Vitis vinifera, Selaginella
    moellendorffii, Physcomitrella patens,
    Ostreococcus tauri, Chlamydomonas reinhardtii,
    Cyanidioschyzon merolae)
  • Why integrate these species and not GCP
    crops?
  • Complete sequencing and gene prediction
  • Will provide the complete list of plant gene
    families!
  • Use the functional information available for these
    species
  • Reinforce the phylogenomic signal and thus ortholog
    predictions
  • Have a taxonomic sampling close to the target GCP crops

20
Taxonomy sampling: GCP crops
21
GreenPhylDB V2.0: Step 1, gene clustering
  • GreenPhyl Database V1.0: 2 species, 81,000 genes, 21,400 clusters, 6,400
    gene families
  • Added: 10 new species, 300,000 sequences
  • GreenPhyl Database V2.0: 390,000 sequences, 25,000 clusters, then
    family assignment
22
GreenPhylDB V2.0: Family annotation platform
1. An essential step for optimal phylogenomic
analysis
  • Development of a family annotation platform with:
  • Annotator registration system
  • Annotation and standardisation rules
  • Statistics on the data available for each gene member
    of the group
  • User-friendly annotation procedure

2. Useful information for future gene
annotation
23
GreenPhylDB - i-GOST
  • A database and tool developed for your comparative
    genomic analyses
  • We need your feedback and comments.

Matthieu CONTE M.CONTE_at_CGIAR.ORG
Mathieu ROUARD M.ROUARD_at_CGIAR.ORG