Biology as an Information Science, Rob Rutherford, St. Olaf College



1
Biology as an Information Science
Rob Rutherford
St. Olaf College
2
1. The Biologist's perspective
2. Genome Biology
3. A survey of Challenges
3
  • If the biota, in the course of eons, has built
    something ... who but a fool would discard
    seemingly useless parts? To keep every cog and
    wheel is the first precaution of intelligent
    tinkering.
  • -Aldo Leopold (1887 - 1948)

4
Figure 1.18 Careful observation and measurement
provide the raw data for science
5
PubMed had 400,000 new research articles entered
in 2002. (NCBI-NLM, 2003)
Productive Tinkerers
6
NIH-NLM 2003
7
NIH-NLM 2003
8
Historically diverse opinions on math and biology
9
If your experiment needs statistics, you ought
to have done a better experiment.
-Rutherford (the other one)
10
To consult the statistician after an experiment
is finished is to ask him to conduct a post
mortem examination. He can perhaps say what the
experiment died of.
RA Fisher 1956, University of Adelaide Archives
11
Part 2: Genome Biology
12
The human genome
  • 3.2 x 10^9 bp
  • If each base were one mm long
  • 2000 miles, across the center of Africa
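The length comparison above can be checked with a few lines of arithmetic. This is a minimal sketch; the constant names are illustrative and not from the slides, and it assumes exactly one millimeter per base pair as the slide does.

```python
# Back-of-the-envelope check of the "one mm per base" analogy.
GENOME_BP = 3.2e9        # human genome size in base pairs (from the slide)
MM_PER_BASE = 1.0        # assumption stated on the slide
MM_PER_KM = 1_000_000    # 1 km = 1,000,000 mm
KM_PER_MILE = 1.609344   # international mile

length_km = GENOME_BP * MM_PER_BASE / MM_PER_KM
length_miles = length_km / KM_PER_MILE

print(f"{length_km:.0f} km, about {length_miles:.0f} miles")
```

The result is 3200 km, which is just under 2000 miles, consistent with the rounded figure on the slide.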

13
Human Genes
Rutherford
14
Imagine we found an ancient document 1,000x
longer than the King James Bible, and you can
read 10-20% of the words.
15
Things we don't know
  • The sequence(s) of some of the genes
  • What many/most of them do at a basic level
  • How they work together dynamically
  • Very little about areas outside genes which
    control them and show what is most important
Each organism has its own genome, and they work
together at many levels to build an ecosystem.
16
(No Transcript)
17
Part 3: A Survey of Problems/Opportunities
18
The Central Dogma
  • DNA: Information Warehouse
    (4 nucleotide letters: A, T, G, C)
  • RNA: Temporary copy of a gene
  • Protein: Working Cellular Machine
    (20 amino acid letters)
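The flow of information in the list above can be sketched in code. This is an illustrative toy, not a bioinformatics tool: the function names are made up for this example, the input is assumed to be the coding strand starting at the first codon, and the standard genetic code is built from the conventional TCAG ordering.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """DNA -> RNA: copy the coding strand, writing U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """RNA -> protein: read three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTAA"            # hypothetical 3-codon gene
mrna = transcribe(dna)
protein = translate(mrna)
print(dna, "->", mrna, "->", protein)
```

Four DNA letters are read in groups of three, giving 64 codons that map onto 20 amino acids plus a stop signal.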

RNA polymerase, PDB
19
A Survey of Problems
  • Finding and Understanding Genes
  • Molecular Structure and Function
  • Evolution
  • Gene Expression
  • Networks
  • Other areas
20
Finding and Understanding Genes
21
(No Transcript)
22
Finding Conserved Regions/Domains
HIV protein
Comparing your sequence versus models derived
from curated known protein families
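One simple kind of family-derived model is a position-specific scoring matrix (PSSM). The sketch below is an assumption-laden illustration, not the method used by any particular domain database: the four aligned sequences are made up, the alignment has no gaps, the background is uniform over the 20 amino acids, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_seqs, pseudocount=1.0):
    """Log-odds score per position and residue from an ungapped alignment."""
    background = 1.0 / len(AMINO_ACIDS)          # uniform background (assumption)
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(aligned_seqs[0])):
        counts = Counter(seq[pos] for seq in aligned_seqs)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_match(pssm, query):
    """Slide the model along the query; return (start, score) of the best window."""
    width = len(pssm)
    best = None
    for start in range(len(query) - width + 1):
        score = sum(pssm[i][query[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


FAMILY = ["DTGA", "DTGS", "DSGA", "DTGA"]   # made-up aligned family members
model = build_pssm(FAMILY)
hit = best_match(model, "MKLDTGAWW")         # made-up query sequence
print(hit)
```

A positive score means the window looks more like the family than like background; real tools use richer models (for example profile hidden Markov models) and handle gaps.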
23
Determining Molecular Structure
Imaging Experimental data Predicting structure
in silico from sequence
24
HIV reverse transcriptase
Structure is Function
  • DNA (human genome)
  • RNA (HIV virus)
  • Protein

Goodsell, PDB
25
Goodsell, PDB
26
Experimental structures in the Protein Data Bank
27
Evolution and Phylogenetics
Thanks to Porterfield
28
Gene Expression
Which gene products (RNA/protein) are being used when?
29
Microarray
One spot for each gene
30
Comprehensive Experiments: Data Explosion
  • 30,000 Genes on one microarray
  • x 100 Microarray Experiments
  • = 3,000,000 Data-points

How to mine this Data for biologically
important trends?
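One common first step in mining expression data is to look for genes whose levels rise and fall together across experiments. The following is a minimal sketch with made-up numbers and hypothetical gene names; it uses plain Pearson correlation, which is only one of many possible similarity measures.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Made-up expression levels: one row per gene, one column per experiment.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}


def coexpressed_pairs(profiles, threshold=0.9):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(profiles)
    pairs = []
    for i, first in enumerate(genes):
        for second in genes[i + 1:]:
            r = pearson(profiles[first], profiles[second])
            if r >= threshold:
                pairs.append((first, second, round(r, 3)))
    return pairs


pairs = coexpressed_pairs(expression)
print(pairs)
```

With 30,000 genes there are roughly 450 million gene pairs, so real analyses rely on clustering methods and careful statistics rather than an all-pairs loop like this one.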
31
[Figure: 4000 genes vs. experimental conditions; legend: gene turned on / gene turned off]
32
Figure 1.23x1 Biotechnology laboratory
33
Networks
  • Biochemical Pathways
  • Signaling Networks
  • Transcriptional Networks
  • Computational Neuroscience
34
Metabolic Pathway Map
35
  • If the biota, in the course of eons, has built
    something ... who but a fool would discard
    seemingly useless parts? To keep every cog and
    wheel is the first precaution of intelligent
    tinkering.
  • -Aldo Leopold (1887 - 1948)

36
Other Opportunities
  • Organismal Physiology
  • Populations
  • Communities
  • Ecosystems
37
Figure 1.3 Some properties of life
38
  • Same issues in Macro Biology
  • Long history of mathematical modeling
  • Huge datasets from GPS/GIS and remote sensing

39
Where is all this leading?
40
Part 3: How do we prepare our students for this
future?
41
The systems biologist.
  • Biologist who is an intelligent and skeptical
    consumer of large data sets
  • Probability and Statistics
  • SQL and database basics
  • Equilibrium and rates of change (Calculus)
  • Exposure to system level data
  • And who knows how and when to collaborate(!)
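As an illustration of the "SQL and database basics" item, here is a small sketch using Python's built-in sqlite3 module. The table name, column names, and values are all made up for this example.

```python
import sqlite3

# In-memory database with a hypothetical table of expression measurements.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expression (gene TEXT, treatment TEXT, level REAL)")
conn.executemany(
    "INSERT INTO expression VALUES (?, ?, ?)",
    [
        ("geneA", "control", 2.0), ("geneA", "heat", 10.0),
        ("geneB", "control", 3.0), ("geneB", "heat", 4.0),
        ("geneC", "control", 8.0), ("geneC", "heat", 9.0),
    ],
)

# Which genes have a mean expression level above a cutoff?
query = """
    SELECT gene, AVG(level) AS mean_level
    FROM expression
    GROUP BY gene
    HAVING AVG(level) > ?
    ORDER BY mean_level DESC
"""
result = conn.execute(query, (5.0,)).fetchall()
print(result)
conn.close()
```

Grouping, filtering, and sorting in the query itself is the kind of skill that lets a biologist question a large data set directly instead of scrolling through a spreadsheet.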

42
The Tool Builders
  • Excellent mathematical skills
  • (algorithms, linear algebra, data structures)
  • Be comfortable in a Linux/Unix environment, and
    know Perl and C/C++.
  • A deep background in 2 advanced areas of biology
    with chemistry prerequisites.
  • Graduate training

43
end
44
(No Transcript)
45
2 Wings of Bioinformatics
  • Housekeeping Bioinformatics: representation,
    storage, and distribution of data
  • Analytical Bioinformatics: new tools for the
    discovery of knowledge in data

46
CASP: Critical Assessment of Techniques for
Protein Structure Prediction, a community-wide assessment
An example
  • Every two years, a contest to test protein
    structure prediction from primary sequence
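To give a feel for how a predicted structure might be compared with an experimental one, here is a minimal root-mean-square deviation (RMSD) sketch. It is not CASP's own scoring procedure: it assumes the two structures are already superimposed, uses one point per residue, and the coordinates are invented.

```python
import math


def rmsd(predicted, experimental):
    """Root-mean-square deviation between paired 3D coordinates.

    Assumes both structures are already superimposed and list the
    same atoms in the same order.
    """
    if len(predicted) != len(experimental):
        raise ValueError("structures must have the same number of atoms")
    squared = [math.dist(p, q) ** 2 for p, q in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Invented coordinates (in angstroms) for a three-residue fragment.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

print(rmsd(predicted, experimental))
```

A lower value means the prediction sits closer to the experimental structure; a real assessment would first superimpose the two structures and typically reports several complementary scores.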