Title: FunCat™, a controlled vocabulary encompassing the biology of prokaryotes, plants and animals from cellular to systemic level
1FunCat™, a controlled vocabulary encompassing
the biology of prokaryotes, plants and animals
from cellular to systemic level
- Dr. Dieter Maier
- Manchester Ontologies Workshop 23/24.3.02
- Biomax Informatics AG, Lochhamer Str. 11, 82152
Martinsried, Germany
2Outline
- Objectives
- Structure
- Content
- Development
- Use
3Objectives
- Automatic data management
- No prior knowledge of vocabulary required
- Group genes by functional categories
- Extensible
- Organism independent
- Compatible with other ontologies
4Disclaimer
- What the FunCat is not:
  - A tool for the complete description of functions at the single-gene level
5Structure
- Organized hierarchically
- Related functions grouped on different levels
- Internally consistent
=> Provides a data warehouse
- Overview of the available selection
- Progress from general to specific
- Infer from specific to general
6Hierarchical structure
rRNA-processing
tRNA-transcription
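One way to picture the hierarchy is as dotted numeric codes, where each extra level narrows the function. The sketch below assumes that representation. The codes in `CATALOGUE` and the intermediate label "rRNA-transcription" are hypothetical placeholders for illustration, not actual FunCat identifiers. It shows how an annotation at a specific level implies all of its more general parents.

```python
# Hypothetical catalogue fragment: the dotted codes are illustrative only.
CATALOGUE = {
    "04": "Transcription",
    "04.01": "rRNA-transcription",
    "04.01.04": "rRNA-processing",
    "04.03": "tRNA-transcription",
}


def ancestors(code):
    """Return the parents of a code, from most general to most specific."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def lineage(code):
    """Labels from the main category down to the given category."""
    return [CATALOGUE[c] for c in ancestors(code) + [code]]


if __name__ == "__main__":
    # Inferring from specific to general: the leaf implies every parent.
    print(" > ".join(lineage("04.01.04")))
```

Because the parent codes are derived from the code string itself, no separate parent table has to be maintained in this sketch.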
7Content
- Covers cellular processes, systemic physiology, development and anatomy, from prokaryotes to humans
- 25 main categories with 1500 sub-categories
- Categories are independent of organism
- Genes can belong to multiple categories
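The last two points can be sketched as a many-to-many mapping between genes and categories. The example assumes dotted category codes. The gene names and codes in `ANNOTATIONS` are made-up placeholders, and `build_index` and `genes_in` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical annotations: gene names and category codes are placeholders.
ANNOTATIONS = {
    "geneA": {"01.05", "02.10"},   # one gene, two unrelated categories
    "geneB": {"01.05.01"},
    "geneC": {"11.02"},
}


def build_index(annotations):
    """Invert gene -> categories into category -> genes."""
    index = defaultdict(set)
    for gene, codes in annotations.items():
        for code in codes:
            index[code].add(gene)
    return index


def genes_in(index, category):
    """Genes annotated with the category or any of its sub-categories."""
    prefix = category + "."
    found = set()
    for code, genes in index.items():
        if code == category or code.startswith(prefix):
            found |= genes
    return found


if __name__ == "__main__":
    index = build_index(ANNOTATIONS)
    print(sorted(genes_in(index, "01")))
```

Matching on `category + "."` rather than on the bare prefix keeps category "1" from accidentally matching "11.02".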
8
- Metabolism 247
- Energy 60
- Cell cycle and DNA processing 54
- Transcription 31
- Protein synthesis (Translation) 11
- Protein fate (folding, modification, destination) 25
- Cellular transport 32
- Cellular communication 47
- Cell rescue, defense and virulence 50
- Regulation / interaction with cellular environment 45
- Cell fate 54
- Systemic regulation / interaction with environment 89
- Development (systemic) 51
- Transposable Elements, viral and plasmid proteins 8
- Control of cellular organisation 57
- Cell type differentiation 69
- Tissue differentiation 40
- Organ differentiation 91
Biological process 1061
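The per-category counts listed on this slide add up to the 1061 given for "Biological process". A short tally makes that check explicit. The dictionary name `SUBCATEGORY_COUNTS` is only an illustrative choice, and the numbers are copied from the slide.

```python
# Counts per main category, copied from the slide.
SUBCATEGORY_COUNTS = {
    "Metabolism": 247,
    "Energy": 60,
    "Cell cycle and DNA processing": 54,
    "Transcription": 31,
    "Protein synthesis (Translation)": 11,
    "Protein fate (folding, modification, destination)": 25,
    "Cellular transport": 32,
    "Cellular communication": 47,
    "Cell rescue, defense and virulence": 50,
    "Regulation / interaction with cellular environment": 45,
    "Cell fate": 54,
    "Systemic regulation / interaction with environment": 89,
    "Development (systemic)": 51,
    "Transposable Elements, viral and plasmid proteins": 8,
    "Control of cellular organisation": 57,
    "Cell type differentiation": 69,
    "Tissue differentiation": 40,
    "Organ differentiation": 91,
}

total = sum(SUBCATEGORY_COUNTS.values())
largest = max(SUBCATEGORY_COUNTS, key=SUBCATEGORY_COUNTS.get)

if __name__ == "__main__":
    print(total)    # 1061
    print(largest)  # Metabolism
```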
9Development
- Historical
- Pathways
- Thesaurus
- Complex relations
10Structural development
- Proven flexibility: easy to extend
- Stable overall structure
- Compatible with other ontologies such as
  - Enzyme Catalogue
  - Gene Ontology
  - EcoCyc
11Development in numbers
Organism                             Year  Main categories  Depth  Total
S. cerevisiae                        1996  16               4      182
Plant (A. thaliana) and prokaryotes  1998  20               6      528
Animals (Human)                      2001  25               6      1448
12Integrating Pathways into processes
- Hierarchical structure allows
  - Univocal attribution
  - Test for completeness
  - Test for consistency
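The slide does not say how these tests are carried out. As one possible reading, assuming dotted category codes and a pathway-to-category table, the three properties could be checked as below. All function names and the data layout are hypothetical.

```python
def missing_parents(codes):
    """Consistency: every sub-category should have its parent in the catalogue."""
    known = set(codes)
    return sorted(
        c for c in known
        if "." in c and c.rsplit(".", 1)[0] not in known
    )


def non_univocal(attribution):
    """Univocal attribution: flag pathways assigned to more than one category."""
    return sorted(p for p, cats in attribution.items() if len(cats) > 1)


def unattributed(pathways, attribution):
    """Completeness: pathways that have not been assigned to any category."""
    return sorted(p for p in pathways if not attribution.get(p))


if __name__ == "__main__":
    # Placeholder data: codes and pathway names are invented for the example.
    codes = ["01", "01.05", "02.10"]            # "02" itself is missing
    attribution = {"glycolysis": {"01.05"}, "tca": {"01.05", "02.10"}}
    pathways = ["glycolysis", "tca", "urea cycle"]
    print(missing_parents(codes))
    print(non_univocal(attribution))
    print(unattributed(pathways, attribution))
```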
13Integrating additional information
- Create a dynamic ontology from existing ontologies, keywords and linguistic extraction of descriptors from the literature
- Semiautomatic mapping of the dynamic ontology to FunCat
14Enabling complex relations
- Intensify multidimensionality
- Enable if ... then ... relations
15Use
- Manual annotation
- Automatic annotation
- Data mining
16Manual annotation
- multidimensional
- stepwise
17Manual annotation
- 17 manually annotated genomes (5 eukaryotes, 12 prokaryotes)
  - H. sapiens, A. thaliana, S. cerevisiae, N. crassa, proprietary A. niger
  - B. subtilis, T. acidophilum, Listeria, 6 public prokaryotes in progress, proprietary C. glutamicum, C. pneumoniae, 1 undisclosed
- Used for annotation of transcriptomes
18Automatic Annotation
- Sequence similarity to manually annotated proteins (distinguish experimentally verified and similarity-associated function)
- H. sapiens
- A. thaliana
- S. cerevisiae
- B. subtilis
- T. acidophilum
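A minimal sketch of similarity-based annotation transfer follows, keeping the distinction between experimentally verified and similarity-associated function. The `Hit` record, the 40% identity cut-off, the protein names and the category codes are all assumptions for illustration. The slides do not describe the actual pipeline or its thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str       # protein without annotation
    subject: str     # manually annotated protein
    identity: float  # percent identity of the alignment


# Hypothetical reference annotations: (category code, evidence type).
REFERENCE = {
    "refP1": [("01.05", "experimental")],
    "refP2": [("11.02", "similarity")],
}


def transfer(hits, reference, min_identity=40.0):
    """Copy categories from the best hit above an assumed identity cut-off."""
    best = {}
    for hit in hits:
        if hit.identity < min_identity or hit.subject not in reference:
            continue
        if hit.query not in best or hit.identity > best[hit.query].identity:
            best[hit.query] = hit
    result = {}
    for query, hit in best.items():
        # Transferred annotations are always marked as similarity-based;
        # the evidence type of the source annotation is kept alongside.
        result[query] = [
            {"category": code, "evidence": "similarity",
             "source": hit.subject, "source_evidence": evidence}
            for code, evidence in reference[hit.subject]
        ]
    return result
```

Recording `source_evidence` lets a later step treat annotations inherited from experimentally verified proteins differently from those inherited from other similarity-based annotations.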
19PEDANT Genome Database
Currently more than 170 genomes (600 000 ORFs)
20Data mining
- Retrieval
- Visualisation
- Mining
- Integration
21Queries using the FunCat: group level
- Looking for groups of genes
22Single molecule level
- Retrieving protein entries
23The human FunCat
Translation
cell cycle
Transcription
Protein fate
Energy
Intracellular Transport
Metabolism
Signalling
Unclassified
Defense
Cell physiology
24Comparing genomes
Sequence similarity ? functional homology
Identification of organism specific functions
25Comparing H. sapiens and B. subtilis
Protein fate
Cellular communication
Interaction with cellular environment
Metabolism
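A comparison of this kind can be sketched by normalising the number of genes per main category in each genome and looking at the differences. The function names are hypothetical and the gene counts in the usage example are invented placeholders, not figures from the talk.

```python
def category_fractions(counts):
    """Share of annotated genes falling into each category."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def compare(counts_a, counts_b):
    """Difference in category share (genome A minus genome B)."""
    frac_a = category_fractions(counts_a)
    frac_b = category_fractions(counts_b)
    categories = sorted(set(frac_a) | set(frac_b))
    return {c: frac_a.get(c, 0.0) - frac_b.get(c, 0.0) for c in categories}


def specific_to(counts_a, counts_b):
    """Categories populated in genome A but empty in genome B."""
    return sorted(
        c for c, n in counts_a.items() if n > 0 and counts_b.get(c, 0) == 0
    )


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    genome_a = {"Metabolism": 50, "Cellular communication": 50}
    genome_b = {"Metabolism": 100}
    print(compare(genome_a, genome_b))
    print(specific_to(genome_a, genome_b))
```

Working with fractions rather than raw counts keeps the comparison meaningful when the two genomes differ greatly in size.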
26Integrative analysis
Protein expression data
Protein-protein interaction data
Gene expression data
Functional catalogue
27Topological clustering (SOM)
28Distribution of the genes
29Limitations
- Co-expression is no proof of functional association.
=> Integrate evidence from multiple sources.
30Integration with annotation
- Analyse gene expression data using integration with annotation catalogues:
- Functional catalogue
- Phenotypes
- Interaction
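The slides do not state which statistic links expression clusters to catalogue entries. A common choice, used here purely as an illustration, is a hypergeometric over-representation test: given a cluster of co-expressed genes, how likely is it to contain at least the observed number of genes from one functional category by chance? The function name and parameters are hypothetical.

```python
from math import comb


def enrichment_p(genome_size, category_size, cluster_size, overlap):
    """P(X >= overlap) when cluster_size genes are drawn at random from a
    genome of genome_size genes, category_size of which are in the category."""
    upper = min(category_size, cluster_size)
    total = comb(genome_size, cluster_size)
    tail = sum(
        comb(category_size, k)
        * comb(genome_size - category_size, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Placeholder numbers: 6000 genes, 250 in the category,
    # a cluster of 40 genes of which 8 carry the category.
    print(enrichment_p(6000, 250, 40, 8))
```

A small p-value only says that the category is over-represented in the cluster. In line with the limitation noted earlier, it is not proof of functional association for any individual gene.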
31Functional projection
32Looking at the gene lists
33FunCat
- Tool to structure information
- Tool to connect information
34Thank you!