Title: Biology and Bioinformatics
1. Biology and Bioinformatics
BI820 Seminar in Quantitative and Computational Problems in Genomics
Gabor T. Marth
Department of Biology, Boston College, marth_at_bc.edu
2. The animal cell
3. DNA: the carrier of the genetic code
4. DNA organization: chromosomes
5. Translation of genetic information
6. DNA sequencing informatics
7. DNA organization
8. Genome annotation
9. De novo gene prediction
10. Similarity-based gene prediction
11. Gene localization
12. Genetic mapping
13. Gene function
14. Expression analysis
15. Protein structure
16. RNA structure
17. Protein structure prediction
18. RNA structure prediction
19. DNA evolution
20. Evolution of chromosome organization
21. Evolution of gene structure
22. Evolution of DNA sequence
23. Comparative genomics
24. Phylogenetics
25. Mechanisms of molecular evolution
26. Sequence variations
- The Human Genome Project produced a reference genome sequence that is about 99.9% identical between any two human beings
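As a rough back-of-the-envelope check of what 99.9% identity implies, the sketch below estimates how many positions would differ between two genomes. The genome length of about 3.1 billion base pairs is an assumption that does not appear on the slide, and the function name is purely illustrative.

```python
# Assumption (not from the slide): haploid human genome length of ~3.1e9 bp.
GENOME_LENGTH_BP = 3_100_000_000
IDENTITY = 0.999  # fraction of positions shared, as stated on the slide


def expected_differences(genome_length_bp: int, identity: float) -> int:
    """Return the expected number of differing positions between two genomes."""
    return round(genome_length_bp * (1.0 - identity))


if __name__ == "__main__":
    # Roughly three million differing positions under these assumptions.
    print(expected_differences(GENOME_LENGTH_BP, IDENTITY))
```

Even a 0.1% difference therefore leaves millions of variable sites, which is why the remaining slides focus on sequence variation.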
27. Why do we care about variations?
phenotypic differences
28. How do we find polymorphisms?
- look at multiple sequences from the same genome region
29. SNP discovery: methods
30. SNP discovery: computer tools
31. SNP discovery: mining projects
30,000 clones
>CloneX ACGTTGCAACGT GTCAATGCTGCA
>CloneY ACGTTGCAACGT GTCAATGCTGCA
25,901 clones (7,122 finished, 18,779 draft with base quality values)
21,020 clone overlaps (124,356 fragment overlaps)
ACCTAGGAGACTGAACTTACTG
ACCTAGGAGACCGAACTTACTG
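The idea of comparing overlapping sequences from the same region can be sketched as follows. This is a minimal illustration, not the mining pipeline used in the projects above: it assumes the two fragments are already aligned and of equal length, and it ignores base quality values, which a real SNP caller would weigh before reporting a candidate. The function name `find_mismatches` is hypothetical.

```python
from typing import List, Tuple


def find_mismatches(seq_a: str, seq_b: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if a != b
    ]


if __name__ == "__main__":
    # The two overlapping fragments shown on the slide.
    fragment_1 = "ACCTAGGAGACTGAACTTACTG"
    fragment_2 = "ACCTAGGAGACCGAACTTACTG"
    print(find_mismatches(fragment_1, fragment_2))  # [(12, 'T', 'C')]
```

The single T/C mismatch at position 12 is the candidate SNP; whether it is a true polymorphism or a sequencing error is what the base quality values help decide.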
32. SNP databases and characteristics
- access to variation data
- SNP properties
- reliability of information
33. Where do variations come from?
- sequence variations are the result of mutation events
TAAAAAT
34. Mutation rate
- a higher mutation rate (µ) gives rise to more SNPs
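One standard way to quantify this relationship, which the slide itself does not state, is the expected number of segregating sites under the neutral, constant-size model with the infinite-sites assumption: E[S] = 4Nµ · L · (1 + 1/2 + ... + 1/(n-1)) for n sampled chromosomes over L sites in a diploid population of effective size N. The sketch below simply evaluates that formula; all parameter values are illustrative assumptions.

```python
def expected_segregating_sites(
    n_samples: int, n_effective: float, mu_per_site: float, n_sites: int
) -> float:
    """Expected number of segregating sites, E[S] = theta * sum_{i=1}^{n-1} 1/i.

    theta = 4 * N * mu * L for a diploid population (neutral, constant size,
    infinite-sites model). These modeling choices are assumptions, not slide content.
    """
    if n_samples < 2:
        raise ValueError("need at least two sampled chromosomes")
    theta = 4.0 * n_effective * mu_per_site * n_sites
    harmonic = sum(1.0 / i for i in range(1, n_samples))
    return theta * harmonic


if __name__ == "__main__":
    # Illustrative values only: N = 10,000, L = 1,000 sites, 10 chromosomes.
    for mu in (1e-8, 2e-8):
        print(mu, expected_segregating_sites(10, 10_000, mu, 1_000))
```

In this model the expectation is linear in µ, so doubling the mutation rate doubles the expected number of SNPs in the sample.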
35. Recombination
accgttatgtaga
accgttatgtaga
accgttatgtaga
36. Demographic history
small (effective) population size N
- different world populations have varying long-term effective population sizes (e.g. African N is larger than European N)
37. Modeling
Demographic models: stationary, expansion, collapse, bottleneck
Time axis: past, history, present
MD (simulation)
AFS (direct form)
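Assuming AFS here stands for the allele frequency spectrum, the stationary (constant-size) model has a simple closed form under neutrality: the expected number of sites where the derived allele appears i times in a sample of n chromosomes is proportional to 1/i. The sketch below computes those expected proportions. It covers only the stationary case; the expansion, collapse, and bottleneck models need other formulas or simulation and are not attempted here.

```python
from typing import List


def expected_afs_proportions(n_samples: int) -> List[float]:
    """Expected fraction of SNPs with derived-allele count i = 1 .. n-1.

    Neutral, constant-size model: E[xi_i] is proportional to 1/i, so the
    proportions are (1/i) / sum_{j=1}^{n-1} 1/j.
    """
    if n_samples < 2:
        raise ValueError("need at least two sampled chromosomes")
    weights = [1.0 / i for i in range(1, n_samples)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    for count, fraction in enumerate(expected_afs_proportions(8), start=1):
        print(f"derived allele seen {count} times: {fraction:.3f}")
```

Departures of an observed spectrum from these proportions, such as an excess of rare alleles, are the kind of signal that demographic models try to explain.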
38. Ancestral inference
modest but uninterrupted expansion
bottleneck
39. The signatures of selection
- selective mutations influence the genealogy itself; in the case of neutral mutations, the processes of mutation and genealogy are decoupled
40. Association and haplotype structure
linkage disequilibrium
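Linkage disequilibrium between two biallelic sites is commonly summarized by D = p_AB - p_A p_B and by r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). The sketch below computes both from a list of phased haplotypes. The 0/1 allele coding and the function name are assumptions made for illustration; the slide does not specify a particular measure.

```python
from typing import Sequence, Tuple


def linkage_disequilibrium(haplotypes: Sequence[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (D, r_squared) for two biallelic sites.

    Each haplotype is a pair (allele at site A, allele at site B), coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    p_a = sum(1 for h in haplotypes if h[0] == 1) / n
    p_b = sum(1 for h in haplotypes if h[1] == 1) / n
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d, (d * d) / denominator


if __name__ == "__main__":
    # Alleles always travel together: complete association.
    print(linkage_disequilibrium([(1, 1), (1, 1), (0, 0), (0, 0)]))
    # All four haplotypes equally common: no association.
    print(linkage_disequilibrium([(1, 1), (1, 0), (0, 1), (0, 0)]))
```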
41. Computer simulations: the coalescent
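A minimal sketch of the core of a coalescent simulation, assuming the standard neutral, constant-size model (Kingman's coalescent): while k lineages remain, the waiting time to the next coalescence is exponential with rate k(k-1)/2, measured in units of 2N generations for a diploid population of size N. The function names are illustrative, and the sketch draws only the waiting times, not the tree topology or the mutations placed on it.

```python
import random
from typing import List


def simulate_coalescent_times(n_samples: int, rng: random.Random) -> List[float]:
    """Draw the waiting times while k = n, n-1, ..., 2 lineages remain.

    Times are in units of 2N generations (diploid population of size N).
    """
    if n_samples < 2:
        raise ValueError("need at least two sampled chromosomes")
    times = []
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2.0  # number of lineage pairs that could coalesce
        times.append(rng.expovariate(rate))
    return times


def time_to_mrca(times: List[float]) -> float:
    """Time back to the most recent common ancestor of the whole sample."""
    return sum(times)


if __name__ == "__main__":
    rng = random.Random(42)
    n = 10
    replicates = [time_to_mrca(simulate_coalescent_times(n, rng)) for _ in range(20_000)]
    mean_tmrca = sum(replicates) / len(replicates)
    # Theory for this model: E[T_MRCA] = 2 * (1 - 1/n) in units of 2N generations.
    print(mean_tmrca, 2 * (1 - 1 / n))
```

Comparing the simulated mean with the theoretical value is a quick sanity check before layering mutation, recombination, or demographic changes on top of the genealogy.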
42. Medical utility?
?
clinical phenotype
molecular markers
43. Mapping disease-causing loci
genetic linkage
44. Forensic applications