FunCat™, a controlled vocabulary encompassing the biology of prokaryotes, plants and animals from cellular to systemic level


1
FunCat™, a controlled vocabulary encompassing
the biology of prokaryotes, plants and animals
from cellular to systemic level
  • Dr. Dieter Maier
  • Manchester Ontologies Workshop 23/24.3.02
  • Biomax Informatics AG, Lochhamer Str. 11, 82152
    Martinsried, Germany

2
Outline
  • Objectives
  • Structure
  • Content
  • Development
  • Use

3
Objectives
  • Automatic data management
  • No prior knowledge of vocabulary required
  • Group genes by functional categories
  • Extensible
  • Organism independent
  • Compatible with other ontologies

4
Disclaimer
  • What the FunCat is not:
  • A tool for the complete description of functions
    at the single-gene level

5
Structure
  • Organized hierarchically
  • Related functions grouped on different levels
  • Internally consistent

=> Provides a data warehouse
- overview of the available selection
- progress from general to specific
- infer from specific to general
6
Hierarchical structure
rRNA-processing
tRNA-transcription
  • 5-end processing
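
The two directions of travel through the hierarchy (from general to specific, and from specific back to general) can be sketched with dotted codes. This is only an illustration: the codes "T", "T.1" and so on are hypothetical, and placing "5-end processing" under "rRNA-processing" is an assumption, since the slide does not show the actual FunCat identifiers or the exact nesting.

```python
# Hypothetical codes and nesting; the real FunCat identifiers may differ.
CATEGORIES = {
    "T": "Transcription",
    "T.1": "rRNA-processing",
    "T.1.1": "5-end processing",  # placement under T.1 is an assumption
    "T.2": "tRNA-transcription",
}


def ancestors(code):
    """Infer from specific to general: all parent codes, most general first."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def descendants(code, categories):
    """Progress from general to specific: all codes below the given one."""
    prefix = code + "."
    return sorted(c for c in categories if c.startswith(prefix))


if __name__ == "__main__":
    print([CATEGORIES[c] for c in ancestors("T.1.1")])
    print([CATEGORIES[c] for c in descendants("T", CATEGORIES)])
```

A gene annotated with a specific category is therefore implicitly a member of every category returned by `ancestors`.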

7
Content
  • Covers cellular processes, systemic physiology,
    development and anatomy
  • From prokaryotes to humans
  • 25 main categories with 1500 sub-categories
  • Categories are independent of organism
  • Genes can belong to multiple categories

8
  • Metabolism 247
  • Energy 60
  • Cell cycle and DNA processing 54
  • Transcription 31
  • Protein synthesis (Translation) 11
  • Protein fate (folding, modification,
    destination) 25
  • Cellular transport 32
  • Cellular communication 47
  • Cell rescue, defense and virulence 50
  • Regulation / interaction with cellular
    environment 45
  • Cell fate 54
  • Systemic regulation / interaction with
    environment 89
  • Development (systemic) 51
  • Transposable Elements, viral and plasmid
    proteins 8
  • Control of cellular organisation 57
  • Cell type differentiation 69
  • Tissue differentiation 40
  • Organ differentiation 91

Biological process 1061
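
The figure 1061 equals the sum of the 18 counts listed on this slide. A short check, with the numbers copied from the list above (the variable names are illustrative):

```python
# Counts copied from the slide; keys are the category names as listed.
COUNTS = {
    "Metabolism": 247,
    "Energy": 60,
    "Cell cycle and DNA processing": 54,
    "Transcription": 31,
    "Protein synthesis (Translation)": 11,
    "Protein fate (folding, modification, destination)": 25,
    "Cellular transport": 32,
    "Cellular communication": 47,
    "Cell rescue, defense and virulence": 50,
    "Regulation / interaction with cellular environment": 45,
    "Cell fate": 54,
    "Systemic regulation / interaction with environment": 89,
    "Development (systemic)": 51,
    "Transposable Elements, viral and plasmid proteins": 8,
    "Control of cellular organisation": 57,
    "Cell type differentiation": 69,
    "Tissue differentiation": 40,
    "Organ differentiation": 91,
}

TOTAL = sum(COUNTS.values())

if __name__ == "__main__":
    print(len(COUNTS), "categories listed, total", TOTAL)
```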
9
Development
  • Historical
  • Pathways
  • Thesaurus
  • Complex relations

10
Structural development
  • Proven flexibility: easy to extend
  • Stable overall structure
  • Compatible with other ontologies such as
  • Enzyme Catalogue
  • Gene Ontology
  • EcoCyc

11
Development in numbers
Organism                              Year  Main categories  Depth  Total
S. cerevisiae                         1996  16               4      182
Plant (A. thaliana) and prokaryotes   1998  20               6      528
Animals (human)                       2001  25               6      1448
12
Integrating Pathways into processes
  • Hierarchical structure allows
  • Univocal attribution
  • Test for completeness
  • Test for consistency
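
The three checks named on this slide could look roughly as follows. The slide does not define them, so the readings used here are assumptions: "consistency" is taken to mean that every sub-category has its parent in the vocabulary, "completeness" that every pathway step is assigned to at least one category, and "univocal attribution" that no step is assigned to more than one. Codes and step names are hypothetical.

```python
def missing_parents(codes):
    """Consistency (assumed meaning): parents that are absent from the vocabulary."""
    codes = set(codes)
    missing = set()
    for code in codes:
        if "." in code:
            parent = code.rsplit(".", 1)[0]
            if parent not in codes:
                missing.add(parent)
    return sorted(missing)


def unassigned_steps(pathway_steps, assignment):
    """Completeness (assumed meaning): pathway steps without any category."""
    return [step for step in pathway_steps if not assignment.get(step)]


def ambiguous_steps(assignment):
    """Univocal attribution (assumed meaning): steps with more than one category."""
    return sorted(step for step, cats in assignment.items() if len(cats) > 1)


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    print(missing_parents(["01", "01.01", "02.03.01"]))
    steps = ["step_a", "step_b", "step_c"]
    assignment = {"step_a": ["01.01"], "step_b": ["01.01", "02"]}
    print(unassigned_steps(steps, assignment))
    print(ambiguous_steps(assignment))
```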

13
Integrating additional information
  • Create a dynamic ontology from existing
    ontologies, keywords and linguistic extraction
    of descriptors from the literature
  • Semiautomatic mapping of the dynamic ontology to
    FunCat
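
A semiautomatic mapping step might propose a candidate category for each extracted descriptor and leave the decision to a curator. The sketch below uses plain string similarity from Python's standard `difflib` module as a stand-in; the slides do not say which matching method was actually used, and the short category list and the cutoff value are illustrative assumptions.

```python
import difflib

# A few main category names taken from the content slide; not the full catalogue.
FUNCAT_NAMES = [
    "Metabolism",
    "Energy",
    "Transcription",
    "Cellular transport",
    "Cell fate",
]


def suggest_category(term, names=FUNCAT_NAMES, cutoff=0.6):
    """Return the closest category name for a descriptor, or None.

    The result is only a suggestion to be confirmed or rejected manually.
    """
    lowered = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(term.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


if __name__ == "__main__":
    for descriptor in ["Transcripton", "cell transport", "xyzzy"]:
        print(descriptor, "->", suggest_category(descriptor))
```

Descriptors with no sufficiently similar category return `None` and would go to manual review.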

14
Enabling complex relations
  • Intensify multidimensionality
  • Enable if ... then ... relations

15
Use
  • Manual annotation
  • Automatic annotation
  • Data mining

16
Manual annotation
  • multidimensional
  • stepwise

17
Manual annotation
  • 17 manually annotated genomes (5 eukaryotes, 12
    prokaryotes)
  • Eukaryotes: H. sapiens, A. thaliana, S. cerevisiae,
    N. crassa, proprietary A. niger
  • Prokaryotes: B. subtilis, T. acidophilum, Listeria,
    6 public prokaryotes in progress, proprietary
    C. glutamicum, C. pneumoniae, 1 undisclosed
  • Used for annotation of transcriptomes

18
Automatic Annotation
  • Sequence similarity to manually annotated
    proteins (distinguish experimentally verified
    and similarity-associated function)
  • H. sapiens
  • A. thaliana
  • S. cerevisiae
  • B. subtilis
  • T. acidophilum
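
Annotation transfer by sequence similarity, with the evidence type kept explicit, can be sketched as below. The similarity search itself is assumed to have been run elsewhere. The `Annotation` type, the identity threshold of 0.4 and the protein identifiers are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    category: str
    evidence: str  # "experimental" or "similarity"


def transfer(hits, reference, min_identity=0.4, experimental_only=False):
    """Transfer categories from reference proteins to a query protein.

    hits: list of (reference_id, identity) pairs from an external search.
    reference: dict mapping reference_id to a list of Annotation objects.
    Transferred annotations are always labelled "similarity".
    """
    transferred = set()
    for ref_id, identity in hits:
        if identity < min_identity:
            continue  # hit too weak to support a transfer
        for ann in reference.get(ref_id, []):
            if experimental_only and ann.evidence != "experimental":
                continue  # avoid chaining similarity-based annotations
            transferred.add(Annotation(ann.category, "similarity"))
    return transferred


if __name__ == "__main__":
    reference = {
        "P1": [Annotation("Metabolism", "experimental")],
        "P2": [Annotation("Energy", "similarity")],
    }
    hits = [("P1", 0.9), ("P2", 0.8)]
    print(transfer(hits, reference))
    print(transfer(hits, reference, experimental_only=True))
```

The `experimental_only` switch shows one reason to keep the two evidence types apart: it prevents annotations that were themselves inferred by similarity from being propagated further.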

19
PEDANT Genome Database
Currently more than 170 genomes (600 000 ORFs)
20
Data mining
  • Retrieval
  • Visualisation
  • Mining
  • Integration

21
Queries using the FunCat: group level
  • Looking for groups of genes
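
A group-level query can be sketched as a lookup that returns every gene annotated with a category or any of its sub-categories. Because genes can belong to multiple categories, each gene maps to a list of codes. Gene names and codes below are hypothetical.

```python
def genes_in_category(code, gene_categories):
    """Return all genes annotated with the code or any category below it."""
    prefix = code + "."
    return sorted(
        gene
        for gene, cats in gene_categories.items()
        if any(c == code or c.startswith(prefix) for c in cats)
    )


if __name__ == "__main__":
    # Hypothetical annotations; one gene may carry several categories.
    gene_categories = {
        "geneA": ["01.02", "04"],
        "geneB": ["01"],
        "geneC": ["10.01"],
    }
    print(genes_in_category("01", gene_categories))
    print(genes_in_category("04", gene_categories))
```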

22
Single molecule level
  • Retrieving protein entries

23
The human FunCat
Translation
cell cycle
Transcription
Protein fate
Energy
Intracellular Transport
Metabolism
Signalling
Unclassified
Defense
Cell physiology
24
Comparing genomes
Sequence similarity ? functional homology
Identification of organism specific functions
25
Comparing H. sapiens and B. subtilis
Protein fate
Cellular communication
Interaction with cellular environment
Metabolism
26
Integrative analysis
Protein expression data
Protein-protein interaction data
Gene expression data
Functional catalogue
27
Topological clustering (SOM)
28
Distribution of the genes
29
Limitations
  • Co-expression is no proof of functional
    association.
  • => Integrate evidence from multiple sources.

30
Integration with annotation
  • Analyse gene expression data using integration
    with annotation catalogues.
  • Functional catalogue
  • Phenotypes
  • Interaction

31
Functional projection
32
Looking at the gene lists
33
FunCat
  • Tool to structure information
  • Tool to connect information

34
Thank you!