Title: Bioinformatics: One Minute and One Hour at a Time
1Bioinformatics: One Minute and One Hour at a Time
- Laurie J. Heyer
- L.R. King Asst. Professor of Mathematics
- Davidson College
- laheyer_at_davidson.edu
2What is Bioinformatics?
Diagram labels: Computer Science, Mathematics, Biology, Bioinformatics
3Genomics, Proteomics and Systems Biology
- Primary audience
- Junior bio majors
- Prerequisites
- Bioinformatics and intro molecular biology or
- One of several 300-level biology courses
- Course home page
- http://www.bio.davidson.edu/genomics
- Math Minutes
- Taught by A. Malcolm Campbell (Biology)
4Sample Topic: DNA Microarrays
5Plotting Expression Data
- One highlighted gene is induced 16 fold
- One highlighted gene is repressed 16 fold
- But induction looks much more dramatic
6Log Transformation
- Calculate log2 of each ratio
- Ratio of 16 becomes value of 4
- Ratio of 0.0625 (1/16) becomes value of -4
- Induction and repression look equal in magnitude, but opposite in sign
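A minimal sketch of the log transformation, using hypothetical ratios chosen only for illustration (16-fold induced, unchanged, 16-fold repressed). The function name `log2_ratios` is not from the course materials.

```python
import math

def log2_ratios(ratios):
    """Return the base-2 logarithm of each expression ratio."""
    return [math.log2(r) for r in ratios]

# Hypothetical ratios: 16-fold induction, no change, 16-fold repression.
ratios = [16.0, 1.0, 1 / 16]
print(log2_ratios(ratios))  # [4.0, 0.0, -4.0]
```

After the transformation, a ratio and its reciprocal sit at the same distance from zero, so induced and repressed genes plot symmetrically.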
7Hierarchical Clustering
- Join two most similar genes
- Join next two most similar objects (genes or clusters of genes)
- Distance from one gene to a set of genes is minimum of all distances from the gene to the individual members (Single Linkage)
- Repeat until all genes have been joined
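The joining procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the course's implementation: the slide does not name a similarity measure, so Euclidean distance between expression profiles is assumed here, and the gene names and values are hypothetical.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length profiles (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def single_linkage(profiles):
    """Join the two closest clusters until one remains; return merges in order."""
    clusters = [[name] for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the minimum pairwise distance.
                d = min(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

# Hypothetical log2 expression profiles for four genes.
profiles = {"A": [0.0, 1.0], "B": [0.0, 1.5], "C": [3.0, 3.0], "D": [3.0, 4.0]}
for left, right, d in single_linkage(profiles):
    print(left, "+", right, "at distance", round(d, 3))
```

The order of the merges is what a dendrogram draws: A and B join first, then C and D, and finally the two pairs.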
8Genome Consortium for Active Teaching (GCAT)
http://www.bio.davidson.edu/GCAT
9High School Chips
See Kathy Gabric's page: http://cstaff.hinsdale86.org/kgabric/honorscalendar.html
10Bioinformatics Course
- Prerequisites
- Genomics or experience with modeling and algorithmic thinking
- Goals
- To understand and apply various algorithms and statistical tests for analyzing DNA, RNA and protein sequences, and DNA microarray data.
- To gain practical experience with Perl, a programming language widely used in molecular biology, web design, and text processing.
- Course home page
- http://gcat.davidson.edu/bioinformatics/bioinf.html
11Bioinformatics Topics
- Determining sequences
- Comparing sequences
- Finding genes
- Predicting structure
- Comparing genomes
- Inferring phylogenies
- Analyzing images
- Clustering gene expression patterns
- Designing experiments
12Bioinformatics Projects
13Image Segmentation
- Locate spot (signal) pixels
- Measure intensity of signal and background in each channel
- Compute ratio
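One way to sketch the measurement step, assuming the spot pixels have already been located and are given as a boolean mask. Subtracting the mean background from the mean signal before taking the ratio is an assumption made here (the slide only says to measure both), and the function name `spot_ratio` and the pixel values are hypothetical.

```python
from statistics import mean

def spot_ratio(red, green, mask):
    """Background-corrected red/green ratio for one spot.

    red, green: 2-D lists of pixel intensities for the two channels.
    mask: 2-D list of booleans, True where a pixel belongs to the spot.
    """
    def corrected(channel):
        signal, background = [], []
        for row, mask_row in zip(channel, mask):
            for value, on in zip(row, mask_row):
                (signal if on else background).append(value)
        # Assumed correction: mean signal minus mean background.
        return mean(signal) - mean(background)
    # Raises ZeroDivisionError if the corrected green signal is zero.
    return corrected(red) / corrected(green)

# Hypothetical 3x3 window with a single spot pixel in the center.
mask = [[False, False, False], [False, True, False], [False, False, False]]
red = [[10.0, 10.0, 10.0], [10.0, 170.0, 10.0], [10.0, 10.0, 10.0]]
green = [[20.0, 20.0, 20.0], [20.0, 30.0, 20.0], [20.0, 20.0, 20.0]]
print(spot_ratio(red, green, mask))  # 16.0
```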
14Adaptive Circle Algorithm
- Specify threshold between darkest and lightest pixel
- Pixels above threshold are on, others are off
- Combine two binary images: if pixel is on in either image, it is on in combined image
- Search for radius and center that maximize percent of on pixels
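The steps above can be sketched as follows. Several details are assumptions, since the slide leaves them open: the two binary images are taken to be the two color channels, the threshold is a fraction of the way from darkest to lightest pixel, the search is exhaustive over integer centers and a supplied list of radii, and ties in the percent of on pixels are broken in favor of the larger radius (otherwise a one-pixel circle would always win). All function names are hypothetical.

```python
def binarize(image, fraction=0.5):
    """Turn pixels on if they exceed a threshold between darkest and lightest."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    threshold = lo + fraction * (hi - lo)
    return [[value > threshold for value in row] for row in image]

def combine(a, b):
    """A pixel is on in the combined image if it is on in either input."""
    return [[x or y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def best_circle(on, radii):
    """Return (score, radius, row, col) maximizing the fraction of on pixels."""
    height, width = len(on), len(on[0])
    best = None
    for r in radii:
        for cy in range(r, height - r):
            for cx in range(r, width - r):
                inside = [on[y][x]
                          for y in range(cy - r, cy + r + 1)
                          for x in range(cx - r, cx + r + 1)
                          if (y - cy) ** 2 + (x - cx) ** 2 <= r * r]
                score = sum(inside) / len(inside)
                # Assumed tie-break: prefer the larger radius at equal score.
                if best is None or (score, r) > best[:2]:
                    best = (score, r, cy, cx)
    return best

def disc_image(size, cy, cx, r, inside, outside):
    """Synthetic test image: a bright disc on a dark background."""
    return [[inside if (y - cy) ** 2 + (x - cx) ** 2 <= r * r else outside
             for x in range(size)] for y in range(size)]

# Hypothetical two-channel spot of radius 2 centered at row 3, column 3.
red = disc_image(7, 3, 3, 2, 200, 20)
green = disc_image(7, 3, 3, 2, 90, 10)
on = combine(binarize(red), binarize(green))
print(best_circle(on, radii=[1, 2, 3]))  # (1.0, 2, 3, 3)
```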
15Adaptive Circle V2 (Dapple)
- Compute 4-neighbor second-difference approximation to the Laplacian
- Find sharply defined upper edge by convolving Laplacian with annular filters
From "Dapple: Improved Techniques for Finding Spots on DNA Microarrays," UW CSE Technical Report UWTR-2000-08-05
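A minimal sketch of the first step only, the 4-neighbor second-difference approximation to the Laplacian: each interior pixel's value is the sum of its four neighbors minus four times the pixel itself. Leaving border pixels at zero is an assumption made here for simplicity, and the annular-filter convolution is not shown. The function name `laplacian4` is hypothetical and is not taken from the Dapple report.

```python
def laplacian4(image):
    """4-neighbor second-difference Laplacian; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = (image[y - 1][x] + image[y + 1][x]
                         + image[y][x - 1] + image[y][x + 1]
                         - 4 * image[y][x])
    return out

# A single bright pixel gives a strongly negative response at its center.
spike = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(laplacian4(spike)[1][1])  # -4
```

On an image whose intensity is x squared, the interior response is the constant 2, matching the second derivative of that function.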
16Quality Clustering: QT Clust
1. Each gene builds a supervised cluster
2. Gene with best list, and genes in its list, becomes next cluster
3. Remove these genes from consideration, and repeat
4. Stop when all genes are clustered, or largest cluster is smaller than user specified threshold
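The four steps can be sketched as follows, under assumptions the slide does not state: Euclidean distance stands in for whatever similarity measure is used, the "best list" is taken to be the largest candidate cluster, and each candidate grows while its diameter stays within a hypothetical `max_diameter` parameter. Gene names and values are made up for illustration.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length profiles (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def qt_clust(profiles, max_diameter, min_size=1):
    """Return (clusters, unclustered genes) from a QT-style clustering sketch."""
    remaining = list(profiles)
    clusters = []
    while remaining:
        best = []
        # Step 1: every remaining gene builds its own candidate cluster.
        for seed in remaining:
            candidate = [seed]
            others = [g for g in remaining if g != seed]
            while others:
                def farthest(g):
                    # Largest distance from g to any current member.
                    return max(euclidean(profiles[g], profiles[m])
                               for m in candidate)
                nxt = min(others, key=farthest)
                if farthest(nxt) > max_diameter:
                    break
                candidate.append(nxt)
                others.remove(nxt)
            if len(candidate) > len(best):
                best = candidate
        # Step 4: stop if the largest candidate is below the size threshold.
        if len(best) < min_size:
            break
        # Steps 2 and 3: keep the best candidate, remove its genes, repeat.
        clusters.append(best)
        remaining = [g for g in remaining if g not in best]
    return clusters, remaining

# Hypothetical profiles: two tight groups and one outlier.
profiles = {"A": [0, 0], "B": [0, 1], "C": [1, 0],
            "D": [10, 10], "E": [10, 11], "F": [50, 50]}
print(qt_clust(profiles, max_diameter=2, min_size=2))
# ([['A', 'B', 'C'], ['D', 'E']], ['F'])
```

Unlike hierarchical clustering, this approach can leave genes unclustered, as the outlier F shows.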
17Why teach Bioinformatics?
- Critical thinking
- Interdisciplinary
- Integrative
- Modeling
- Data analysis
- Computational science
- Discrete math
- Probability and statistics
- Student research opportunities