Algorithms for Biological Sequence Analysis

1
Algorithms for Biological Sequence Analysis
  • Kun-Mao Chao
  • Department of Computer Science and Information
    Engineering
  • National Taiwan University, Taiwan
  • Date: October 2, 2007
  • WWW: http://www.csie.ntu.edu.tw/~kmchao

2
About this course
  • Course: Algorithms for biological sequence
    analysis
  • We will focus on sequence-related algorithmic
    problems. Genomic sequences are our main
    target.
  • The oldest language
  • The largest program
  • Fall semester, 2007
  • Tuesday 9:10-12:10, 107 CSIE Building.
  • 3 credits
  • Web site: http://www.csie.ntu.edu.tw/~kmchao/seq07fall

3
Coursework
  • Homework assignments and class participation
    (15%)
  • Two midterm exams (60%; 30% each)
  • November 6, 2007 (tentatively)
  • December 18, 2007 (tentatively)
  • Oral presentation of selected papers (25%)

4
Outline
  • Part I: Sequence Homology
  • Introduction to genomes
  • Dynamic programming strategy revisited
  • Pairwise sequence alignment
  • Multiple sequence alignment
  • Chaining algorithms for genomic sequence analysis
  • Suboptimal alignment
  • Comparative genomics
  • Hidden Markov models (the Viterbi algorithm,
    etc.)
  • Part II: Sequence Composition
  • Maximum-sum and maximum-density segments
  • SNP and haplotype data analysis
  • Genome annotation
  • Other advanced topics
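
A minimal sketch of the dynamic programming idea behind pairwise sequence alignment, as listed in Part I. It computes a global alignment score in the style of Needleman-Wunsch with a linear gap penalty. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration; they are not taken from the course material.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not course settings.
    """
    # prev[j] = best score for aligning the prefix of a seen so far with b[:j].
    # Before reading any character of a, b[:j] can only be aligned to gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap.
            up = prev[j] + gap
            # Align b[j-1] with a gap.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

Only two rows of the table are kept, so the sketch uses space linear in the length of the second sequence; recovering the alignment itself would need the full table or a divide-and-conquer traceback.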

5
A Brief History of Genetics
  • 1859 Darwin publishes The Origin of Species
  • 1865 Genes are particulate factors
  • 1871 Discovery of nucleic acid
  • 1903 Chromosomes are hereditary units
  • 1910 Genes lie on chromosomes
  • 1913 Chromosomes are linear arrays of genes
  • 1931 Recombination occurs by crossing over

6
A Brief History of Genetics (cont'd)
  • 1944 DNA is the genetic material
  • 1945 A gene codes for protein
  • 1951 First protein sequence
  • 1953 DNA is a double helix
  • 1961 Genetic code is triplet
  • 1977 Eukaryotic genes are interrupted
  • 1977 DNA can be sequenced
  • 21st century: Many genomes completely sequenced

7
Milestones of Bioinformatics
  • 1962 Pauling's theory of molecular evolution
  • 1965 Margaret Dayhoff's Atlas of Protein
    Sequences
  • 1970 Needleman-Wunsch algorithm
  • 1977 DNA sequencing and software to analyze it
    (Staden)
  • 1981 Smith-Waterman algorithm developed
  • 1981 The concept of a sequence motif (Doolittle)
  • 1982 GenBank Release 3 made public
  • 1982 Phage lambda genome sequenced

8
Milestones of Bioinformatics (cont'd)
  • 1983 Sequence database searching algorithm
    (Wilbur-Lipman)
  • 1985 FASTP/FASTN: fast sequence similarity
    searching
  • 1988 National Center for Biotechnology
    Information (NCBI) created at NIH/NLM
  • 1988 EMBnet: network for database distribution
  • 1990 BLAST: fast sequence similarity searching
  • 1991 EST: expressed sequence tag sequencing
  • 1993 Sanger Centre, Hinxton, UK
  • 1994 EMBL European Bioinformatics Institute,
    Hinxton, UK

9
Milestones of Bioinformatics (cont'd)
  • 1995 First bacterial genomes completely sequenced
  • 1996 Yeast genome completely sequenced
  • 1997 PSI-BLAST
  • 1998 Worm (multicellular) genome completely
    sequenced
  • 1999 Fly genome completely sequenced

10
Milestones of Bioinformatics (cont'd)
  • Human Genome Project (1990-2003)
  • Mouse 2002
  • Rat 2004
  • Chimpanzee 2005
  • Completed Genomes

11
Chimpanzee Genome

12
The Primate Family Tree
Source: Nature