Master - PowerPoint PPT Presentation

About This Presentation
Slides: 45
Provided by: math79

Transcript and Presenter's Notes


1
Master's course: Bioinformatics Data Analysis
and Tools
  • Lecture 1: Introduction
  • Centre for Integrative Bioinformatics
  • FEW/FALW
  • heringa_at_few.vu.nl

2
Course objectives
  • There are two extremes in bioinformatics work:
  • Tool users (biologists) know how to press the
    buttons and know the biology, but have no clue
    what happens inside the program
  • Tool shapers (informaticians) know the
    algorithms and how the tool works, but have no
    clue about the biology
  • Both extremes are dangerous; we need a breed that
    can do both

3
Course objectives
  • How do you become a good bioinformatics problem
    solver?
  • You need to know basic analysis and data mining
    modes
  • You need to know some important backgrounds of
    analysis and prediction techniques (e.g.
    statistical thermodynamics)
  • You need to have knowledge of what has been done
    and what can be done (and what not)
  • Is this enough to become a creative tool
    developer?
  • Need to like doing it
  • Experience helps

4
Contents (tentative dates)
 #  Week   Date      Lecture title                                                    Lecturer
 1  wk 19  07/05/07  Introduction                                                     Jaap Heringa
 2  wk 19  10/05/07  Microarray data analysis                                         Jaap Heringa
 3  wk 20  14/05/07  Molecular simulations and sampling techniques                    Anton Feenstra
 4  wk 21  22/05/07  Introduction to Statistical Thermodynamics I                     Anton Feenstra
 5  wk 21  24/05/07  Introduction to Statistical Thermodynamics II                    Anton Feenstra
 6  wk 23  05/06/07  Machine learning                                                 Elena Marchiori
 7  wk 23  07/06/07  Clustering algorithms                                            Bart van Houte
 8  wk 24  11/06/07  Support vector machines and feature selection in bioinformatics  Elena Marchiori
 9  wk 24  12/06/07  Databases and parsing                                            Sandra Smit
10  wk 24  14/06/07  Ontologies                                                       Frank van Harmelen
11  wk 25  18/06/07  Benchmarking, parallelisation and grid computing                 Thilo Kielmann
12  wk 25  19/06/07  Method development I: Protein domain prediction                  Jaap Heringa
13  wk 25  21/06/07  Method development II                                            Jaap Heringa

5
At the end of this course
  • You will have seen a couple of algorithmic
    examples
  • You will have got an idea about methods used in
    the field
  • You will have a firm basis in the physics and
    thermodynamics behind many processes and
    methods
  • You will have an idea of and some experience as
    to what it takes to shape a bioinformatics tool

6
Bioinformatics
  • Studying informatic processes in biological
    systems (Hogeweg)
  • Information technology applied to the management
    and analysis of biological data (Attwood and
    Parry-Smith)
  • Applying algorithms and mathematical formalisms
    to biology (genomics)

7
This course
  • General theory of crucial algorithms (GA, NN,
    HMM, SVM, etc.)
  • Method examples
  • Research projects within own group:
    repeats, domain boundary prediction
  • Physical basis of biological processes and of
    (stochastic) tools

8
Bioinformatics
Science, from large/external (integrative) to
small/internal (individual):
Human, Planetary Science, Cultural Anthropology,
Population Biology, Sociology, Sociobiology,
Psychology, Systems Biology, Biology, Medicine,
Molecular Biology, Chemistry, Physics
9
Genomic Data Sources
  • DNA/protein sequence
  • Expression (microarray)
  • Proteome (X-ray, NMR, mass spectrometry, PPI)
  • Metabolome
  • Physiome (spatial, temporal)

Integrative bioinformatics
10
Protein structural data explosion
Protein Data Bank (PDB): 14500 structures (6
March 2001), of which 10900 X-ray crystallography,
1810 NMR, 278 theoretical models, others...
11
Bioinformatics inspiration and cross-fertilisation
Bioinformatics (centre), surrounded by:
Chemistry, Biology, Molecular biology,
Mathematics, Statistics, Computer Science,
Informatics, Medicine, Physics
12
Algorithms in bioinformatics
  • string algorithms
  • dynamic programming
  • machine learning (NN, k-NN, SVM, GA, ..)
  • Markov chain models
  • hidden Markov models
  • Markov Chain Monte Carlo (MCMC) algorithms
  • stochastic context free grammars
  • EM algorithms
  • Gibbs sampling
  • clustering
  • tree algorithms (suffix trees)
  • graph algorithms
  • text analysis
  • hybrid/combinatorial techniques and more
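
Dynamic programming, one of the entries in the list above, is the basis of classical pairwise sequence alignment. The following is a minimal sketch, not taken from the lecture, of the Needleman-Wunsch recurrence for the score of the best global alignment. The function name and the scoring values (match, mismatch, linear gap penalty) are illustrative assumptions; real tools use substitution matrices and usually affine gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (Needleman-Wunsch).

    The scoring values are illustrative assumptions, not values from the course.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    # Row 0: the empty prefix of a aligned with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: a[:i] aligned with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


# Example: one gap is needed to align a 4-letter with a 3-letter sequence.
score = global_alignment_score("ACGT", "AGT")
```

Only two rows of the table are kept in memory, which is enough for the score; recovering the alignment itself would need the full table or a traceback structure.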

13
Joint international programming initiatives
  • Bioperl
  • http://www.bioperl.org/wiki/Main_Page
  • http://bioperl.org/wiki/How_Perl_saved_human_genome
  • Biopython
  • http://www.biopython.org/
  • BioTcl
  • http://wiki.tcl.tk/12367
  • BioJava
  • www.biojava.org/wiki/Main_Page

14
Integrative bioinformatics @ VU
  • Studying informational processes at biological
    system level
  • From gene sequence to intercellular processes
  • Computers necessary
  • We have biology, statistics, computational
    intelligence (AI), HTC, ..
  • VUMC microarray facility, cancer centre,
    translational medicine
  • Enabling technology: new glue to integrate
  • New integrative algorithms
  • Goals: understanding cellular networks in terms
    of genomes; fighting disease (VUMC)

15
Bioinformatics @ VU
  • Progression
  • DNA: gene prediction, predicting regulatory
    elements, alternative splicing
  • mRNA expression
  • Proteins: (multiple) sequence alignment, docking,
    domain prediction, PPI
  • Metabolic pathways: metabolic control
  • Cell-cell communication

16
Fold recognition by threading: THREADER and
GenTHREADER
Query sequence
Fold library: Fold 1, Fold 2, Fold 3, ..., Fold N
Compatibility scores
17
Pollutant recognition by microarray mapping
Query array
Conditions: Cond. 1, Cond. 2, Cond. 3, ..., Cond. N
Reference profiles: Contaminant 1, Contaminant 2,
Contaminant 3, ..., Contaminant N
Compatibility scores
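
The two slides above share one pattern: a query is scored against every entry in a reference library and the entries are ranked by compatibility. The following is a minimal sketch of that ranking step for the microarray case, assuming that each contaminant is represented by an expression profile over the same conditions as the query array. The use of Pearson correlation as the compatibility score, the function names and the numbers are all illustrative assumptions; the slides do not say which score is used, and threading methods score sequence-structure fit rather than correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_profiles(query, references):
    """Return (name, score) pairs sorted from most to least compatible."""
    scores = {name: pearson(query, profile) for name, profile in references.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical reference profiles over four conditions (made-up values).
references = {
    "contaminant_1": [1.0, 2.0, 3.0, 4.0],
    "contaminant_2": [4.0, 3.0, 2.0, 1.0],
    "contaminant_3": [1.0, 3.0, 2.0, 4.0],
}
query = [0.9, 2.1, 2.9, 4.2]
ranking = rank_profiles(query, references)
best_match = ranking[0][0]
```

Any other similarity or distance measure could be substituted for `pearson` without changing the ranking logic.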
18
ENFIN WP4
  • Functional threading
  • From sequence to function
  • Multiple alignment
  • Secondary structure prediction, Solvation
    prediction, Conservation patterns, Loop
    enumeration

19
ENFIN WP4
  • Functional threading
  • From sequence to function
  • Multiple alignment
  • Secondary structure prediction, Solvation
    prediction, Conservation patterns, Loop
    enumeration

Figure labels: Struct, Func, DB of active site
descriptors, H, DHS, S, D
20
ENFIN WP5 - BioRange (Anton Feenstra)
  • Protein-protein interaction prediction
  • Mesoscopic modelling
  • Soft-core Molecular Dynamics (MD)
  • Fuzzy residues
  • Fuzzy (surface) locations

21
ENFIN WP6
  • Silicon Cell
  • Database of fully parametrized pathway models
    (differential equations) and solver
  • Jacky Snoep (Stellenbosch, VU/IBIVU)
  • Hans Westerhoff (VU, Manchester)

22
Where are important new questions?
23
New neighbouring disciplines
  • Translational Medicine
  • A branch of medical research that attempts to
    more directly connect basic research to patient
    care. Translational medicine is growing in
    importance in the healthcare industry, and is a
    term whose precise definition is in flux. In
    particular, in drug discovery and development,
    translational medicine typically refers to the
    "translation" of basic research into real
    therapies for real patients. The emphasis is on
    the linkage between the laboratory and the
    patient's bedside, without a real disconnect.
    This is often called the "bench to bedside"
    definition.
  • Computational Systems Biology
  • Computational systems biology aims to develop
    and use efficient algorithms, data structures and
    communication tools to orchestrate the
    integration of large quantities of biological
    data with the goal of modeling dynamic
    characteristics of a biological system. Modeled
    quantities may include steady-state metabolic
    flux or the time-dependent response of signaling
    networks. Algorithmic methods used include
    related topics such as optimization, network
    analysis, graph theory, linear programming, grid
    computing, flux balance analysis, sensitivity
    analysis, dynamic modeling, and others.
  • Neuroinformatics
  • Neuroinformatics combines neuroscience and
    informatics research to develop and apply the
    advanced tools and approaches that are essential
    for major advances in understanding the structure
    and function of the brain.

24
Translational Medicine
  • From bench to bedside
  • Genomics data to patient data
  • Integration

25
Natural progression of a gene
29
Systems Biology
  • is the study of the interactions between the
    components of a biological system, and how these
    interactions give rise to the function and
    behaviour of that system (for example, the
    enzymes and metabolites in a metabolic pathway).
    The aim is to understand the system
    quantitatively and to be able to predict the
    system's processes over time
  • the interactions are nonlinear
  • the interactions give rise to emergent
    properties, i.e. properties that cannot be
    explained by the individual components of the
    system

30
Systems Biology
  • understanding is often achieved through modeling
    and simulation of the system's components and
    interactions
  • Often the four Ms cycle is adopted:
  • Measuring
  • Mining
  • Modeling
  • Manipulating

33
A system response
  • Apoptosis: programmed cell death
  • Necrosis: accidental cell death
34
Neuroinformatics
  • Understanding the human nervous system is one of
    the greatest challenges of 21st century science.
  • Its abilities dwarf any man-made system -
    perception, decision-making, cognition and
    reasoning.
  • Neuroinformatics spans many scientific
    disciplines - from molecular biology to
    anthropology.

35
Neuroinformatics
  • Main research question: How do the brain and
    nervous system work?
  • Main research activity: gathering neuroscience
    data and knowledge, and developing computational
    models and analytical tools for the integration
    and analysis of experimental data, leading to
    improvements in existing theories about the
    nervous system and brain.
  • Results for the clinic: Neuroinformatics provides
    tools, databases, models and network technologies
    for clinical and research purposes in
    the neuroscience community and related fields.

37
Bioinformatics @ VU
  • Qualitative challenges
  • High quality alignments (alternative splicing)
  • In-silico structural genomics
  • In-silico functional genomics: reliable
    annotation
  • Protein-protein interactions
  • Metabolic pathways: assign the edges in the
    networks
  • Fluxomics: quantitative description (through
    time) of fluxes through metabolic networks
  • New algorithms

38
Bioinformatics @ VU
  • Quantitative challenges
  • Understanding mRNA expression levels
  • Understanding resulting protein activity
  • Time dependencies
  • Spatial constraints, compartmentalisation
  • Are classical differential equation models
    adequate, or do we need more individual modeling
    (e.g. macromolecular crowding and activity at
    the oligomolecular level)?
  • Metabolic pathways: calculate fluxes through time
  • Cell-cell communication: tissues, hormones,
    innervations

Need complete experimental data for a good
biological model system to learn to integrate
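
As an illustration of what a classical differential equation model of a metabolic pathway looks like, here is a minimal sketch, not taken from the lecture. It simulates a hypothetical two-step pathway S → I → P with Michaelis-Menten kinetics and integrates it with the forward Euler method. The pathway, the parameter values and the function names are assumptions chosen for illustration; a production model would use a proper ODE solver and measured parameters.

```python
def michaelis_menten(conc, vmax, km):
    """Reaction rate for a substrate concentration under Michaelis-Menten kinetics."""
    return vmax * conc / (km + conc)


def simulate_pathway(s0, steps, dt, vmax1=1.0, km1=0.5, vmax2=0.8, km2=0.3):
    """Forward-Euler simulation of the hypothetical pathway S -> I -> P.

    Returns a list of (time, S, I, P) tuples. All parameter values are
    illustrative assumptions.
    """
    s, i, p = s0, 0.0, 0.0
    trajectory = [(0.0, s, i, p)]
    for n in range(1, steps + 1):
        v1 = michaelis_menten(s, vmax1, km1)  # flux S -> I
        v2 = michaelis_menten(i, vmax2, km2)  # flux I -> P
        # dS/dt = -v1, dI/dt = v1 - v2, dP/dt = v2
        s += -v1 * dt
        i += (v1 - v2) * dt
        p += v2 * dt
        trajectory.append((n * dt, s, i, p))
    return trajectory


# Simulate 10 time units with a small step size.
trajectory = simulate_pathway(s0=2.0, steps=1000, dt=0.01)
final_time, final_s, final_i, final_p = trajectory[-1]
```

The fluxes `v1` and `v2` at each step are the "fluxes through time" mentioned above. Such a model treats concentrations as continuous, which is exactly the assumption the slide questions for crowded or low-copy-number conditions.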
39
Bioinformatics @ VU
  • VUMC
  • Neuropeptide: addiction
  • Oncogenes: disease patterns
  • Rheumatic diseases

40
Bioinformatics @ VU
  • Quantitative challenges
  • How much protein is produced from a single gene?
  • What time dependencies?
  • What spatial constraints (compartmentalisation)?
  • Metabolic pathways: assign the edges in the
    networks
  • Cell-cell communication: find membrane-associated
    components

41
Integrative bioinformatics
  • Integrate data sources
  • Integrate methods
  • Integrate data through method integration
    (biological model)

42
Integrative bioinformatics: Data integration
Algorithm
Data
tool
Biological Interpretation (model)
43
Integrative bioinformatics: Data integration
Data 1
Data 2
Data 3
44
Integrative bioinformatics: Data integration
Data 1
Data 2
Data 3
Algorithm 1
Algorithm 2
Algorithm 3
tool
Biological Interpretation (model) 1
Biological Interpretation (model) 2
Biological Interpretation (model) 3
45
Bioinformatics
  • Nothing in Biology makes sense except in the
    light of evolution (Theodosius Dobzhansky
    (1900-1975))
  • Nothing in Bioinformatics makes sense except in
    the light of Biology