'A bioinformatic Problem Solving Environment in the e-BioLab' VL-e Sub Program 1.5: Bioinformatics

1
'A bioinformatic Problem Solving Environment in the e-BioLab'
VL-e Sub Program 1.5: Bioinformatics
  • Timo Breit
  • Micro-Array Department
  • Integrative Bioinformatics Unit
  • Faculty of Science,
  • University of Amsterdam

2
Where in the Virtual Laboratory for e-Science?
BioInformatics Problem Solving Environment
Data-intensive science
Bioinformatics
BI-PSE
Application Layer
Generic Virtual Laboratory e-science layer
Grid Layer
3
Why in the VL-e? Data explosion in life sciences research.
RNA analysis by Northern blot: 1-15 genes
[Figure: Northern blot, lanes A to T. Labels: analyzed genes; samples of cellular experiments]
4
Life sciences research today: whole-system omics data.
Biology
Biotechnology
Bioinformatics
Biologist
DNA
Genomics
Data storage, data handling, data preprocessing, data analysis, data integration, data interpretation
Experiment
Transcriptomics
RNA
protein
Proteomics
metabolite
Metabolomics
Results
Integrative biology or Systems biology
5
How in VL-e? A bioinformatics problem solving environment (BI-PSE)
a.o. domain knowledge, domain information, domain data
Life sciences domain
Hypothesis generation
Decision process
Experiment design
Wet-lab experiment
Enhancing knowledge model
In-silico experiment
e-bio science
a.o. semantic modeling
Problem solving environment
Generic virtual laboratory
a.o. analysis methods, information management, semantic modeling, adaptive information disclosure
a.o. security (AAA), ICT infrastructure
Grid layer
Result: Rauwerda et al. The promise of a virtual lab. Drug Discov Today. 2006 Mar;11(5-6):228-36.
6
Parts of the BI-PSE we work on
Biological use case: Huntington Disease
Biological use case: Toxicogenomics
VL-e
Grid computing
7
Basic configuration of e-BioLab
VL-e use case: SigWin finder
Goal: A workflow to find significant windows in data related to a given sequence (of any type).
Motivation: Find sets of genes (windows) with increased overall gene expression (significance) in expression data ordered by gene location on the chromosomes (sequence).
8
SigWin: Significant Windows
Márcia Alves de Inda, Dimitri, Frans Verster, Marco Roos
Given a data set, we compute Sliding Window (SW) Medians for a given window size.
Using the SW Medians data, we compute a False Discovery Rate (FDR) threshold.
Windows with values above the FDR threshold are called significant windows (or Windows Beyond the Threshold).
R. Versteeg et al. Genome Res 2003;13:1998-2004.
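The three steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say how the FDR threshold is derived, so the sketch assumes a null distribution built by randomly permuting the data, and it treats a window as significant when its median is at or above the returned threshold. All function names and parameters (`sliding_window_medians`, `permutation_threshold`, `significant_windows`, `n_perm`, `seed`) are hypothetical and are not taken from the SigWin implementation.

```python
import random
from statistics import median


def sliding_window_medians(values, window_size):
    """Median of every run of `window_size` consecutive values."""
    if window_size < 1 or window_size > len(values):
        raise ValueError("window_size must be between 1 and len(values)")
    return [
        median(values[i:i + window_size])
        for i in range(len(values) - window_size + 1)
    ]


def permutation_threshold(values, window_size, fdr=0.05, n_perm=200, seed=0):
    """Smallest observed median whose estimated FDR is <= `fdr`, or None.

    Assumption (not from the slides): the null distribution of window
    medians is estimated by shuffling the data `n_perm` times.
    """
    rng = random.Random(seed)
    observed = sliding_window_medians(values, window_size)
    null = []
    shuffled = list(values)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.extend(sliding_window_medians(shuffled, window_size))
    for m in sorted(set(observed)):
        # Observed windows that would be called significant at cut-off m.
        n_obs = sum(1 for x in observed if x >= m)
        # Average number of such windows per shuffled data set.
        expected_false = sum(1 for x in null if x >= m) / n_perm
        if expected_false / n_obs <= fdr:
            return m
    return None


def significant_windows(values, window_size, threshold):
    """Start indices of windows whose median is at or above `threshold`."""
    medians = sliding_window_medians(values, window_size)
    return [i for i, m in enumerate(medians) if m >= threshold]
```

For expression data, `values` would be the per-gene expression levels ordered by chromosomal position, and the returned indices mark where each significant window starts.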
9
VLAM SigWin-finder workflow
Modules
1) Read sequence
2) Rank sequence
3) SW Medians
4) Sample to Frequency
5) SW Medians Prob
6) FDR Threshold
7) WinBeTs
8) GnuPlot
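The module list reads as a linear chain in which each module consumes the output of the previous one. A minimal sketch of that idea follows, using plain Python functions as stand-ins for the first two modules. It does not use the VLAM API, and what each module does internally is an assumption: `read_sequence` is assumed to parse whitespace-separated numbers, and `rank_sequence` is assumed to replace values by their ranks, with ties sharing the average rank. The helper `run_workflow` is hypothetical.

```python
from functools import reduce


def read_sequence(text):
    """Stand-in for module 1: parse whitespace-separated numbers."""
    return [float(token) for token in text.split()]


def rank_sequence(values):
    """Stand-in for module 2: 1-based ranks, ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of equal values starting at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def run_workflow(data, modules):
    """Feed the output of each module into the next one, in order."""
    return reduce(lambda current, module: module(current), modules, data)


ranked = run_workflow("3.2 1.5 1.5 9.0", [read_sequence, rank_sequence])
```

Keeping every step as a function with one input and one output is what makes the chain easy to reorder or extend, which is the property a workflow environment relies on.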
10
SigWins and periodic data
11
Example of periodic data: temperature in Amsterdam
12
Integration of genomic and transcriptomics data
13
Integration of genomic and transcriptomics data (zoom)
14
VL-e use case: Histone code and semantic modeling
Lennart Post, Scott Marshall, Marco Roos
Hypothesis: A relationship exists between histone modification and transcription factor binding sites.
15
Design: myModel
Protégé OWL plug-in: http://protege.stanford.edu
16
Data integration through semantic modeling
17
Result: data integration via semantic modeling
UCSC genome browser snapshot
Overlap
etc.
Result: Correlation between histone modification and transcription factor binding sites
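To make the notion of "overlap" concrete, the sketch below computes what fraction of one set of genomic intervals overlaps at least one interval in another set. It is not the semantic-model-based integration used in this use case. It is a plain interval comparison, assuming half-open `(start, end)` coordinates on a single chromosome. The function names and the example coordinates are made up for illustration.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b share a position."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_overlapping(regions, sites):
    """Fraction of `regions` that overlap at least one interval in `sites`."""
    if not regions:
        return 0.0
    hits = sum(1 for r in regions if any(overlaps(r, s) for s in sites))
    return hits / len(regions)


# Hypothetical coordinates, for illustration only.
histone_regions = [(100, 200), (500, 650), (900, 950)]
tf_binding_sites = [(150, 160), (640, 700)]
fraction = fraction_overlapping(histone_regions, tf_binding_sites)  # 2 of 3 overlap
```

A correlation claim would additionally need a comparison against the overlap expected by chance, which this sketch does not attempt.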
18
Domain interaction: basic concept of an e-BioScience Laboratory (e-BioLab)
Bioinformatics Problem Solving Environment
Tools
Grid
Methods
Workflows
19
Basic set-up of the e-BioLab
20
Anticipated tiled display in e-BioLab
[Figure: tiled display. Panels: gene lists, SOM, hierarchical clustering, video remote collaboration, remote whiteboard, pathways displayed, clusters 1-3 for each of P1, P2 and P3, chromosome maps 1-3]
21
Acknowledgements
Within SP1.5:
  • Marco Roos, Molecular biologist
  • Han Rauwerda, Bioinformatician
  • Roel van Driel, Biochemist
  • Christiaan Henkel, Molecular biologist
  • Lennart Post, AIO (vDriel)
  • Martijs Jonker, Bioinformatician
  • Márcia Alves de Inda, Computational scientist
  • Oskar Brunning, Bioinformatician
  • Scott Marshall, Informatician
  • Tessa Pronk, Molecular biologist
  • Frans Verster, Scientific programmer
  • Ramin Monajemi, Informatician
  • Timo Breit, Molecular biologist
Within VL-e:
  • SP1.2: use of ontologies in semantic modeling
  • SP1.4: use case R on Grid, e-bioscience
  • SP2.2: AID ontologies and semantic modeling
  • SP2.4: information management
  • SP2.5: workflow methods and tools
  • SP3.3: e-BioLab
  • SP4.1: VLEIT team
Vacancies @ IBU:
  • Bioinformatician, micro-array data analysis (HBO/WO, 2 years)
  • Scientific programmers, building the e-BioLab
Outside VL-e:
  • BioRange, NBIC (Dutch bioinformatics)
  • Content-driven data modeling (Kok-LUMC, Adriaans-UvA, etc.)
  • Test case systems biology (RUG, CMBI, TNO, UvA, etc.)
  • SigWin (vKampen-AMC, etc.)
  • e-BioLab (vdVeer-VU, vd Vet-UT, Nikhef, SARA, etc.)
  • BioAssist
    - Microarray workflow (many)
    - Re-annotation (Leunissen-WU, Neerincx-WU, etc.)

More information: www.micro-array.nl
22
Where in the Virtual Laboratory for e-Science?
Integrative Bioinformatics Problem Solving Environment
Data-intensive science
Bioinformatics ASP
IB-PSE
Application Layer
Generic Virtual Laboratory e-science layer
Grid Layer
23
Subprograms and research themes in the national bioinformatics initiative BioRange.


SP1. Bioinformatics for Microarray Technology
1. Experimental design
2. Understanding biological processes
3. Genotype-phenotype analysis
4. Dissemination of bioinformatics tools and expertise, and education
SP2. Bioinformatics for Proteomics and Metabolomics
5. Preprocessing and identification tools
6. Analysis and modeling tools
7. Molecular interactions tools
SP3. Integrative Bioinformatics.
8. Structural genomics
9. Comparative genomics
10. Phenotype-genotype modelling
11. Pathway modelling and visualisation
12. Content driven data modelling
13. Content driven text mining
SP4. VL-E Informatics for Bioinformatics Applications.
14. Adaptive information disclosure
15. User interface and visualization
16. Collaborative information management
SP5. Test bed with Real-Life Applications.
17. Selection of bioinformatics applications, scaling approach, real-life test applications
18. Dedicated scaling and validation approach
19. Integrated scaling and validation approach
Dissemination

Bioinformatics
Informatics
ICT infrastructure
24
Use cases (user scenarios)
  • R on grid (IUC1.5.1) (finished)
  • Creation of a web service that executes an
    R-script that invokes a LAM-MPI distributed
    calculation on the grid on a number of nodes that
    can be chosen by the user.
  • R in workflows (IUC1.5.4) (started)
  • Proof of principle of a micro-array analysis
    workflow by invocation of web services.
    Requirements are visualization of intermediate
    results and enabling human interaction.
  • Re-annotation of micro-array libraries (IUC1.5.5)
    (started, with J. Leunissen WU)
  • Re-annotation from sequence by invocation of
    remotely hosted web services in a workflow
    environment.
  • SigWin (IUC1.5.3): Significant Window Finder
    (proof of principle given)
  • Generalization of method that finds Regions of
    IncreaseD Gene Expression (RIDGEs) into workflow
    in VLAM environment that finds significant
    windows in sequences of values.
  • Histone Code case 1 (IUC1.5.2) (proof of
    principle given)
  • Proof-of-concept data integration via semantic
    models
  • Scaling problems in semantic data integration
    (RUC1.5.1) (finished, led to 2 new IUCs)
  • Provide guidelines for the infrastructure to use
    for semantic data integration

25
A view on bioinformatics research and IBU
IBU
Bio-
-informatics
Informatics research
Bioinformatics research
Applied bioinformatics
Biology research
26
Outline of presentation.
  • Where are we positioned in the VL-e project?
  • Why do we need an Integrative Bioinformatics
    Problem Solving Environment?
  • What do we want to do with an IB-PSE?
  • How do we plan to create a functional IB-PSE?
  • Who are we?
  • Where do we start?
  • When do we think we will have a functional
    IB-PSE?
  • Who are our collaborators?

27
What do we want to do with an IB-PSE? Concept of integrative bioinformatics
Biological research domain
e-bioscience core domain
Enabling science domain
Model
Analysis methods
Omics data
Data-driven hypothesis
Integrative computational bioinformatics experiment
Experiment design
VL-e
ICT infrastructure
Biological knowledge
Problem-driven hypothesis
biological problem
Visualization
Biological solutions
Biological phenomena
28
Computational experimentation through advanced
data integration.
Data source A
29
Bioinformatics in the Netherlands
BioRange Proof-of-concept Environment
VL-E Experimental (rapid prototyping) Environment
VL-E Proof-of-concept Environment
VL-E Exploitation Environment (SARA)
University of Amsterdam
30
Data integration: basic concept of any cell
Assumption: the complexity of life is organized via a limited number of general cellular mechanisms.