Biology 301 Computational Biology Phylogenetic Trees Molecular Data and Applications Relevant Dutton - PowerPoint PPT Presentation


Slides: 19
Provided by: sarahb73


Biology 301 - Computational Biology
Phylogenetic Trees - Molecular Data and Applications
Relevant: Dutton Lectures, Page Sections, BFD Ch. 9 (PJ)
In advance for next time and final Sauter paper (for Wednesday) and Golding paper (for
  • Molecular Data Assumptions
  • Sequence information is correct/accurate
  • Homologous - sequences, aligned regions
  • Dataset - adequate representation
  • Dataset characters adequately variable
  • All sequences evolved via a single model
  • - all positions evolved independently
  • - substitution rates equal
  • - base composition is the same

  • The danger of generating incorrect results is
    inherently greater in computational phylogenetics
    than in many other fields of science.
  • Authors, Bioinformatics - A Practical Guide to
    the Analysis of Genes and Proteins

  • Review Dutton's Stuff
  • Structure and Function of Phylogenetic Trees
  • Using legends to calculate genetic change
  • Will briefly discuss Distance, Maximum Parsimony (MP), and Maximum Likelihood (ML)
  • BOOTSTRAP: same
  • Testing of phylogenetics programs

  • A Few Emphatics
  • Distance: alignments converted to distances
  • Discrete: aligned position data retained
  • - Parsimony: lowest tree score wins
  • - Likelihood: most probable based on the model
  • BOTH can invoke substitution rates and weights
  • Read ML - we will not be doing it in lab.

  • Distance Critiques
  • GOOD: fast and cheap
  • BAD: LOSS of aligned character data
  • BAD: no way to apply weights to regions
  • BAD: strong stochastic assumptions
  • In tests: poor with similar sequences, slightly
    better with long-branched, less similar sequences.
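
A minimal sketch of the "LOSS of aligned character data" point, assuming the simplest possible measure (the uncorrected p-distance). The function names and the toy alignment are hypothetical, not from the lecture; real distance programs apply a model-based correction on top of this.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are ignored in this sketch
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment):
    """Collapse an alignment (name -> sequence) into one number per pair of taxa."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTACGTAT",
    "taxonC": "ACGAACCTAT",
}
matrix = distance_matrix(toy_alignment)
for pair, value in matrix.items():
    print(pair, value)
```

Once the matrix is built, the tree method never sees which columns differed, only how many, which is why region weights cannot be applied afterwards.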

  • MP Critiques
  • GOOD: moderately fast and cheap
  • BETTER: fewer stochastic assumptions
  • YOUR CALL: is life/evolution parsimonious?
  • In tests: excellent to good with similar
    sequences, moderately poor with long-branched,
    less similar sequences.
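
To make "low tree score wins" concrete, here is a small sketch that scores two candidate trees with Fitch's counting method for unordered characters. The nested-tuple tree encoding, the function names, and the four-taxon data are assumptions made for illustration; lab software searches many trees rather than comparing two by hand.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one alignment column.

    tree is either a leaf name (str) or a 2-tuple of subtrees (rooted, binary).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # Children can agree on a state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: one change is required on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over every column of the alignment."""
    names = list(alignment)
    length = len(alignment[names[0]])
    total = 0
    for i in range(length):
        column = {name: alignment[name][i] for name in names}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical toy data, for illustration only.
toy = {"A": "AAGT", "B": "AAGC", "C": "CTGT", "D": "CTGC"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
score_1 = parsimony_score(tree_1, toy)
score_2 = parsimony_score(tree_2, toy)
print(score_1, score_2)
```

With these data the first tree needs fewer changes than the second, so parsimony prefers it. Unlike the distance approach, every column is still visible to the method, so columns could be weighted individually.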

  • ML Critiques
  • BAD: EXPENSIVE and slow
  • DIFFICULT: all data must share the same model
  • DIFFICULT: how many models are there?
  • In tests: top performer with similar and
    dissimilar data using one model.

  • Evolution Models
  • Carefully review Page 5.14
  • JC (Jukes-Cantor): = BF and = SR (Ts = Tv)
  • Kimura: = BF but ≠ SR (Ts ≠ Tv)
  • Felsenstein: ≠ BF, = SR (Ts = Tv)
  • Hasegawa: ≠ BF, ≠ SR (Ts ≠ Tv)
  • Page 5.20: Think About
  • BF = base frequency, SR = substitution rate
  • Ts = transitions, Tv = transversions
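
A short sketch of how the model choice changes a distance estimate, using the standard Jukes-Cantor and Kimura two-parameter correction formulas. The function names and the example sequences are hypothetical; the Felsenstein and Hasegawa models, which also allow unequal base frequencies, are not covered here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def ts_tv_proportions(seq_a, seq_b):
    """Return (P, Q): proportions of sites showing a transition or a transversion."""
    sites = ts = tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    return ts / sites, tv / sites


def jc69_distance(p):
    """Jukes-Cantor: d = -3/4 * ln(1 - 4p/3), p = proportion of differing sites."""
    inner = 1.0 - 4.0 * p / 3.0
    if inner <= 0:
        raise ValueError("sequences too divergent for the JC correction")
    return -0.75 * math.log(inner)


def k80_distance(P, Q):
    """Kimura two-parameter: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))."""
    a = 1.0 - 2.0 * P - Q
    b = 1.0 - 2.0 * Q
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for the K80 correction")
    return -0.5 * math.log(a * math.sqrt(b))


# Hypothetical pair: one transition (A/G) and one transversion (A/C) in ten sites.
ts_prop, tv_prop = ts_tv_proportions("AAAAAAAAAA", "GAAAAAAAAC")
print(jc69_distance(ts_prop + tv_prop), k80_distance(ts_prop, tv_prop))
```

The two corrections agree only when the observed transversions are twice as frequent as the transitions (P = Q/2), which is what equal substitution rates predict; any excess of transitions makes the Kimura estimate larger than the Jukes-Cantor one.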

  • Schools of Evolution
  • Trait-Based (pre-1960s)
  • Classical: natural selection purified/stabilized
  • Balance: natural selection drove diversity
  • Molecular-Based (post-1960s)
  • Neutralists: more mutations neutral
  • Selectionists: more mutations advantageous
  • Are non-adaptive traits parsimonious?

  • Agreeable Molecular Concepts
  • More functionally constrained, lower SR
  • Less constrained/non-coding, high SR
  • Third codon position, high SR
  • Different GC ratios - Why?
  • More G/C in early monomer pools
  • Hydrogen bond stability and hot earth
  • Associated with higher recombination
  • Shorter generation time in prokaryotes
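
Two of the concepts above (GC ratio and the fast-changing third codon position) can be checked directly on an alignment. This is a minimal sketch; the function names and the in-frame toy sequences are hypothetical, and it assumes an ungapped coding alignment that starts at codon position 1.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


def variable_fraction_by_codon_position(sequences):
    """Fraction of variable columns at codon positions 1, 2 and 3.

    sequences: list of equal-length, in-frame coding sequences.
    """
    variable = [0, 0, 0]
    totals = [0, 0, 0]
    for i, column in enumerate(zip(*sequences)):
        position = i % 3  # 0, 1, 2 correspond to codon positions 1, 2, 3
        totals[position] += 1
        if len(set(column)) > 1:
            variable[position] += 1
    return [v / t if t else 0.0 for v, t in zip(variable, totals)]


# Hypothetical toy alignment: three codons that differ only at the third position.
toy_coding = ["GCTGAAACC", "GCCGAGACA", "GCAGAAACG"]
fractions = variable_fraction_by_codon_position(toy_coding)
print(gc_content(toy_coding[0]), fractions)
```

In real data the contrast is rarely this clean, but a higher variable fraction at the third position than at the first two is the usual pattern for a functionally constrained coding gene.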

Applications Beyond Organismal Phylogeny
  • Basic Identification
  • Typified by your 16S lab project
  • Yellowstone unknowns - you are solving
  • Compare with Noah's Ark dataset
  • Assess location to infer phylogeny
  • Nowadays, BLAST is acceptable for ID
  • Identification applications include research,
    epidemiology, and even forensics.

  • Assessing Horizontal/Lateral Gene Transfer
  • Acquiring new genes
  • MUST compare with known molecular clock
  • If topologies differ, assume the gene was acquired
  • BACTERIAL PROJECTS: 16S gene as the clock
  • Compare your genes to 16S trees
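
One simple way to compare a gene tree against the 16S tree is to list the clades each tree contains and see which are not shared. This is a sketch under stated assumptions: rooted trees written as nested tuples, hypothetical taxon names, and hypothetical function names. A disagreement only flags a candidate for transfer; it does not prove one.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a rooted nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)  # every internal node defines one clade
        return members

    leaves(tree)
    return found


def conflicting_clades(reference_tree, gene_tree):
    """Return (clades only in the reference tree, clades only in the gene tree)."""
    reference = clades(reference_tree)
    gene = clades(gene_tree)
    return reference - gene, gene - reference


# Hypothetical trees, for illustration only.
tree_16s = ((("A", "B"), "C"), "D")
tree_gene = ((("A", "D"), "B"), "C")
only_16s, only_gene = conflicting_clades(tree_16s, tree_gene)
print(sorted(sorted(c) for c in only_gene))
```

Here the gene tree groups taxon A with taxon D, a pairing the 16S tree does not support, which is the kind of topology difference the slide says to look for. The order in which children are written does not matter, only the groupings.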

  • Gene Duplication and Gene Family History
  • Typical for eukaryotic studies
  • Compare molecular clock species tree
  • With tree of all gene family members
  • Look For: widely distributed species genes
  • DIFFICULT: why I don't study eukaryotes!
  • None of you should encounter these in your
    projects so it is best to avoid making these

  • Host-Parasite Co-Evolution
  • Page, Figure 8.10
  • Compares genes from host to clock
  • And genes from parasite to clock
  • AND host gene tree to parasite gene tree
  • Look For: host and parasite ≠ either clock
  • AND LOOK FOR: host = parasite!
  • Hydrogenase - compare eukaryotic to prokaryotic
    to guess bacterial symbiont.

  • Addressing Time (and place)
  • Extinct organism DNA (on NCBI)
  • Calibration of trees to fossil record
  • Calibrating mutation rates: controversial!
  • Some less controversial applications
  • - when pathogens invaded given hosts
  • - origin of epidemics in time and place
  • Several projects may want to consider this.
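
The arithmetic behind calibrating a tree can be sketched in a few lines, assuming a strict molecular clock (a constant rate on every lineage), which is exactly the assumption that makes calibration controversial. All numbers and function names below are hypothetical.

```python
def calibrate_rate(calibration_distance, calibration_age_years):
    """Per-lineage rate (substitutions/site/year) from one dated split.

    The pairwise distance accumulates along two lineages, hence the factor of 2.
    """
    if calibration_age_years <= 0:
        raise ValueError("calibration age must be positive")
    return calibration_distance / (2.0 * calibration_age_years)


def divergence_time(distance, rate):
    """Estimated years since two sequences diverged, given a per-lineage rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Hypothetical calibration: a pair 0.02 substitutions/site apart,
# with a fossil-dated split 1,000,000 years ago.
rate = calibrate_rate(0.02, 1_000_000)

# Hypothetical query: another pair 0.05 substitutions/site apart.
estimated_age = divergence_time(0.05, rate)
print(rate, estimated_age)
```

Any error in the fossil date or any rate variation among lineages carries straight through to the estimate, so results are best reported as rough ranges.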

  • Molecule or Drug Design
  • Class Brainstorm - More On Final