Molecular Evolution and Phylogeny - PowerPoint PPT Presentation

Slides: 54
Provided by: fenBilk
1
Molecular Evolution and Phylogeny
  • Examples

2
Weakly deleterious mutations
  • Weakly deleterious mutations can reach high
    frequencies in local populations and, thus, may
    contribute significantly to genetic variance in
    disease susceptibility.

3
Sequencing of human polymorphisms
  • Using exon-specific polymerase chain reaction
    (PCR) amplification, a team at Celera Genomics
    sequenced 20,362 loci in 20 European Americans,
    19 African Americans and one male chimpanzee.
  • The initial intention was to find novel
    nonsynonymous single nucleotide polymorphisms
    (SNPs) based on their 2001 build of the human
    genome.

4
Divergence between human and chimpanzee
  • A total of 34,099 fixed synonymous differences
    between 39 humans and the chimpanzee yield a
    genomic average synonymous divergence of dS =
    1.02%.
  • There are 20,467 fixed non-synonymous
    differences (dN = 0.242%) across 11.81 megabases
    (Mb) of aligned coding DNA.

5
Polymorphisms
  • 15,750 synonymous and 14,311 non-synonymous SNPs
    were found among the human subjects, yielding
    average synonymous and non-synonymous SNP
    densities of pS = 0.470% and pN = 0.169%.

6
Amino acid polymorphism exceeds divergence
  • There is a highly significant excess of amino
    acid polymorphism relative to divergence: pN/pS =
    0.169/0.470 ≈ 0.36, whereas dN/dS = 0.242/1.02 ≈
    0.24.
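
The excess can be checked directly from the four counts quoted on the previous slides. The sketch below arranges them as a 2x2 table (synonymous vs. non-synonymous, fixed vs. polymorphic) in the style of a McDonald-Kreitman comparison and computes a Pearson chi-square statistic. This is an illustrative calculation in Python, not necessarily the analysis the Celera team performed, and the function name is made up for this example.

```python
# Counts quoted on the slides: fixed human-chimp differences and human SNPs.
fixed_syn, fixed_nonsyn = 34_099, 20_467
poly_syn, poly_nonsyn = 15_750, 14_311


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Non-synonymous changes per synonymous change, for each class of site.
ratio_fixed = fixed_nonsyn / fixed_syn   # divergence
ratio_poly = poly_nonsyn / poly_syn      # polymorphism

chi2 = chi_square_2x2(fixed_syn, fixed_nonsyn, poly_syn, poly_nonsyn)

print(f"divergence   N/S = {ratio_fixed:.3f}")
print(f"polymorphism N/S = {ratio_poly:.3f}")
print(f"chi-square (1 d.f.) = {chi2:.1f}  (5% critical value: 3.84)")
```

A polymorphism ratio larger than the divergence ratio, with a chi-square far above 3.84, is what "excess of amino acid variation relative to divergence" means in terms of counts.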

7
Can you comment on the following?
  • Evolution of human populations since sharing a
    last common ancestor with chimps
  • Type of nonsynonymous mutations (very deleterious
    or mildly deleterious) in human populations
  • Positive selection
  • Negative selection
  • Disease associations?

8
Non-neutral evolution
  • dN/dS = 1: neutral evolution
  • dN/dS > 1: positive selection
  • dN/dS < 1: negative selection
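
As a small illustration of this rule, the Python sketch below computes the ratio for the genome-wide human-chimpanzee values quoted earlier (dN = 0.242, dS = 1.02, both in percent). The function name and the tolerance band around 1 are assumptions made for this example; a real analysis would use a statistical test rather than a fixed cutoff.

```python
def classify_dn_ds(dn, ds, tol=0.05):
    """Return (omega, label) for a dN/dS ratio.

    `tol` is an arbitrary illustrative band around 1, not a standard threshold.
    """
    omega = dn / ds
    if omega > 1 + tol:
        label = "positive selection"
    elif omega < 1 - tol:
        label = "negative selection"
    else:
        label = "approximately neutral"
    return omega, label


# Genome-wide averages from the divergence slide (units cancel in the ratio).
omega, label = classify_dn_ds(dn=0.242, ds=1.02)
print(f"dN/dS = {omega:.3f}: {label}")
```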

9
Accelerated evolution of genes
10
What makes us a vertebrate?
  • Neural crest?
  • Highly sophisticated nervous system?
  • Bones/cartilage?
  • Vertebrate specific genes?

11
Origin of bilateria
  • Some vertebrate genes date prior to the origin of
    bilateria

12
Bilateria
  • Bilateria: a monophyletic group of metazoan
    animals characterized by bilateral symmetry.

13
Radial symmetry
  • Bilateria excludes the Cnidaria, Ctenophora (sea
    gooseberries), Porifera (sponges) and Placozoa.

14
A little taxonomy
15
Cnidaria
  • Cnidaria: a basal phylum with two body layers
    and radial symmetry, at the tissue grade of
    morphological organization.
  • There are two basic morphologies: the sessile
    polyp and the swimming medusa or jellyfish.
  • The phylum contains four classes; examples
    include jellyfish, sea anemones and hydra.

16
Body Axis
  • Oral-aboral axis: the single obvious body axis
    of the two radiate phyla (Cnidaria and
    Ctenophora), marked at one end by the mouth or
    oral pore.

17
Wnt signaling
http://www.stanford.edu/~rnusse/reviews/NaVReviewFinal438747a.pdf
18
Wnt Signaling
  • In the Wnt signalling pathway, ligand binding
    triggers the formation of a receptor complex, and
    protein kinases modify the receptor tails,
    leading to recruitment of cytoplasmic factors.
  • In other signalling pathways, receptor-induced
    protein phosphorylation amplifies the signal, and
    the receptor-associated kinase acts as a catalyst
    for the modification of many substrate molecules.

19
Wnt genes
  • Mammals have 19 Wnt genes
  • The sea anemone Nematostella vectensis, a
    diploblast, has 12

Kusserow A, Pang K, Sturm C, Hrouda M, Lentfer J,
Schmidt HA, Technau U, von Haeseler A, Hobmayer B,
Martindale MQ, Holstein TW (2005) Unexpected
complexity of the Wnt gene family in a sea
anemone. Nature 433:156-160.
20
Nematostella vectensis
http://www.nematostella.org/
21
Phylogenetic tree of wnts
22
Expression of wnts
The original bilaterian was equipped with a
fairly elaborate set of molecular tools.
23
Endoderm, ectoderm, mesoderm
  • For example, the Nematostella ectodermal genes,
    NvWnt1, NvWnt2, NvWnt4 and NvWnt7 correspond to
    the neuroectodermal Wnt genes in the higher
    Bilateria.
  • NvWnt5, NvWnt6 and NvWnt8 are expressed in the
    endoderm, whereas the corresponding genes in
    deuterostomes are all expressed in the mesoderm.

24
Collagen
  • Bone is closely linked to cartilage, both in
    development and in evolution: earlier forms had
    a cartilaginous skeleton that is replaced by
    bone. In vertebrates, cartilage also contains
    threads of collagen running through it.

25
Collagen
  • Bone is a living tissue that continually
    remodels its mineral matrix. The matrix is
    threaded with fibers of a protein, type II
    collagen, which gives it strength.

26
Collagen
  • Collagen is an ancient protein (dating back some
    800 million years?).
  • There are about 27 different types of collagen
    in at least a dozen different classes.
  • http://web.indstate.edu/thcme/mwking/extracellularmatrix.html
  • One particular type, type II collagen, is an
    essential part of the matrix of bones and
    cartilages.

27
A primitive jawless fish from the late Devonian,
around 370 million years ago. Do lampreys have
collagen?
28
Initially it was thought that lampreys don't have
collagen
  • Zhang et al. screened a library of lamprey
    sequences and isolated two forms of collagen II,
    Col2a1a and Col2a1b.
  • The presence of a collagen homolog related to
    human collagen II indicates that the gene arose
    before the split between lampreys (jawless) and
    gnathostomes (true jaws).
  • Col2a1 is used in the developing branchial
    cartilaginous skeleton.

Zhang G, Miyamoto MM, Cohn MJ (2006) Lamprey
type II collagen and Sox9 reveal an ancient
origin of the vertebrate collagenous
skeleton. Proc Natl Acad Sci U S A, 2006 Feb 21.
29
but they do!
30
Collagen phylogeny
31
Bootstrapping
  • The bootstrap is a procedure that involves
    choosing random samples with replacement from a
    data set and analyzing each sample the same way.

32
Bootstrapping
  • Sampling with replacement means that every sample
    is returned to the data set after sampling. So a
    particular data point from the original data set
    could appear multiple times in a given bootstrap
    sample.

33
Bootstrapping
  • The number of elements in each bootstrap sample
    equals the number of elements in the original
    data set. The range of sample estimates we obtain
    allows us to establish the uncertainty of the
    quantity we are estimating.
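
The three bootstrapping slides above can be condensed into a few lines. The sketch below is written in Python rather than MATLAB, and the measurements are hypothetical values chosen only for illustration. Each bootstrap sample has the same size as the original data and is drawn with replacement; the spread of the resampled means serves as the uncertainty estimate.

```python
import random
import statistics


def bootstrap_estimates(data, statistic, n_boot=1000, seed=0):
    """Apply `statistic` to `n_boot` resamples of `data` drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    # random.choices samples with replacement, so a value can repeat in a resample.
    return [statistic(rng.choices(data, k=n)) for _ in range(n_boot)]


# Hypothetical measurements, not taken from the slides.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]

boot_means = bootstrap_estimates(sample, statistics.mean)
spread = statistics.stdev(boot_means)  # bootstrap standard error of the mean

print(f"sample mean = {statistics.mean(sample):.3f}")
print(f"bootstrap standard error = {spread:.3f}")
```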

34
Reliability of a tree
  • One way to assess the reliability of an
    estimated tree is to examine the reliability of
    each interior branch.

35
Bootstrap
  • The reliability of an inferred tree is examined
    by using Efron's bootstrap resampling technique.
  • A set of nucleotide sites is randomly sampled
    with replacement from the original set, and this
    random set is used for constructing a new
    phylogenetic tree.
  • This process is repeated many times, and the
    proportion of replications in which a given
    sequence cluster appears is computed.
  • If this proportion (PB) is high (say, PB > 0.95)
    for a sequence cluster, this cluster is
    considered to be statistically significant.
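
The procedure above can be sketched on a toy example. The Python code below resamples alignment columns with replacement and counts how often two sequences still group together. Two things are assumptions made for this sketch: the four-sequence alignment is invented, and "grouping together" is approximated by a simple distance criterion (the two sequences are strictly closer to each other than either is to any other sequence) instead of building a full tree for every replicate.

```python
import random

# Hypothetical toy alignment (equal-length sequences); not data from the slides.
alignment = {
    "A": "ACGTACGTACGTACGTACGT",
    "B": "ACGTACGTACGAACGTACGT",
    "C": "ACTTACCTACGTTCGTACAT",
    "D": "ACTTACCTACGTTCGAACAT",
}


def p_distance(s, t):
    """Proportion of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(s, t)) / len(s)


def is_cluster(seqs, x, y):
    """True if x and y are strictly closer to each other than to any other sequence."""
    d_xy = p_distance(seqs[x], seqs[y])
    for name, seq in seqs.items():
        if name in (x, y):
            continue
        if d_xy >= p_distance(seqs[x], seq) or d_xy >= p_distance(seqs[y], seq):
            return False
    return True


def bootstrap_support(seqs, x, y, n_boot=1000, seed=0):
    """Proportion of column-resampled alignments in which x and y still cluster."""
    rng = random.Random(seed)
    length = len(next(iter(seqs.values())))
    hits = 0
    for _ in range(n_boot):
        cols = rng.choices(range(length), k=length)  # sites sampled with replacement
        resampled = {name: "".join(seq[i] for i in cols) for name, seq in seqs.items()}
        hits += is_cluster(resampled, x, y)
    return hits / n_boot


support_ab = bootstrap_support(alignment, "A", "B")
print(f"bootstrap proportion for cluster (A, B) = {support_ab:.3f}")
```

In a real analysis the clustering step would be replaced by the tree-building method in use (distance, parsimony or likelihood), and the proportion would be reported for every interior branch.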

36
Bootstrap values
37
Bootstrapping
  • Open Matlab
  • Open Help
  • Type bootstrap and read

38
Example
  • >> load lawdata
  • >> plot(lsat,gpa,'+')
  • >> lsline

39
Plot of lsat vs. gpa
40
Calculate correlation between lsat and gpa
  • >> rhohat = corrcoef(lsat,gpa)
  • rhohat =
  •     1.0000 0.7764
  •     0.7764 1.0000

41
Is 0.78 significant?
  • Now we have a number, 0.7764, describing the
    positive connection between LSAT and GPA, but
    though 0.7764 may seem large, we still do not
    know if it is statistically significant.

42
Bootstrp function
  • Using the bootstrp function we can resample the
    lsat and gpa vectors as many times as we like and
    consider the variation in the resulting
    correlation coefficients.

43
Generate 1000 lsat and gpa vectors by resampling
from the original vectors
  • >> rhos1000 = bootstrp(1000,'corrcoef',lsat,gpa);
  • >> hist(rhos1000(:,2),30)

44
What is the uncertainty associated with the
observed correlation?
  • >> mean(rhos1000(:,2))
  • ans =
  •     0.7711
  • >> std(rhos1000(:,2))
  • ans =
  •     0.1350
  • >> 0.1350*1.96
  • ans =
  •     0.2646
  • Mean +/- 1.96*std
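
For readers without MATLAB, the same analysis can be sketched in Python. The values below are Efron's 15-school law data, assumed here to be what MATLAB's lawdata file contains; the function names are invented for this example. Because the random numbers differ from MATLAB's, the bootstrap mean and standard deviation will be close to, but not identical with, the figures on the slide.

```python
import math
import random
import statistics

# Efron's law school data (15 schools); assumed to match MATLAB's `lawdata`.
lsat = [576, 635, 558, 578, 666, 580, 555, 661, 651, 605, 653, 575, 545, 572, 594]
gpa = [3.39, 3.30, 2.81, 3.03, 3.44, 3.07, 3.00, 3.43, 3.36, 3.13,
       3.12, 2.74, 2.76, 2.88, 2.96]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_corr(x, y, n_boot=1000, seed=0):
    """Correlations from `n_boot` resamples of (x, y) pairs drawn with replacement."""
    rng = random.Random(seed)
    n = len(x)
    rhos = []
    for _ in range(n_boot):
        idx = rng.choices(range(n), k=n)  # keep each school's lsat and gpa together
        rhos.append(pearson([x[i] for i in idx], [y[i] for i in idx]))
    return rhos


rhohat = pearson(lsat, gpa)
rhos1000 = bootstrap_corr(lsat, gpa)
boot_mean = statistics.mean(rhos1000)
boot_std = statistics.stdev(rhos1000)
half_width = 1.96 * boot_std  # normal-approximation 95% interval

print(f"rhohat = {rhohat:.4f}")
print(f"bootstrap mean +/- 1.96*std = {boot_mean:.4f} +/- {half_width:.4f}")
```

Resampling index positions rather than the two vectors separately is the key design choice: it preserves the pairing between each school's LSAT and GPA values.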

45
You have data on the expression pattern of two
genes
  • HOXA1 and CDK6 expression values in different
    tissues are collected.
  • Open the excel file named data.xls
  • Copy and paste the numerical data columns (two
    of them) into the workspace as follows, naming
    the data as a
  • >> a = [ paste here and close the bracket ]

46
Calculate the uncertainty associated with the
correlation between HOXA1 and CDK6 genes
  • Plot the expression values (x, HOXA1 and y,
    CDK6).
  • Place a lsline on the data
  • Calculate the correlation coefficient between the
    genes
  • Generate 1000 bootstrapped samples to estimate
    the sample correlation coefficient.
  • Determine the 95% confidence interval around the
    bootstrapped correlation coefficient.

47
Bootstrap of align2.m
  • Generate 1000 bootstrapped samples of the
    alignment score and its 95% confidence interval
    using the bootstrp function.

48
Bayesian Inference
  • Three basic methods have been used to estimate
    phylogeny: distance, maximum parsimony (MP), and
    maximum likelihood (ML).
  • Bayesian statistics differs in that in addition
    to the current data, prior knowledge is included
    in the testing of the hypothesis.

49
Medical tests and Bayesian Stats
  • Assume that previous studies have evaluated the
    accuracy of this test and have shown that, if you
    are in fact ill, there is a 99% likelihood that
    the test will give a true positive result (and
    thus, a 1% likelihood that the test will give a
    false negative).

50
Medical tests and Bayesian Stats
  • It was also found that if you are healthy, there
    is a 0.1% likelihood of a false positive result
    from the test. If we were simply using the data
    (i.e., the test result), we would then conclude
    that a positive test result had approximately a
    99% chance of being correct.

51
Medical tests and Bayesian Stats
  • If we were to examine this question in a Bayesian
    framework, we could incorporate prior knowledge,
    in this case that other studies have shown that
    the base rate of this illness is 0.1% in the
    population.
  • Thus, of a population of 100,000 individuals, 100
    would be ill and 99,900 would be healthy.

52
Medical tests and Bayesian Stats
  • Using the likelihood values mentioned above, we
    could conclude that a positive test result would
    be seen in 99% of the ill individuals (99 true
    positives) and 0.1% of the healthy individuals
    (approximately 100 false positives).

53
Medical tests and Bayesian Stats
  • This leaves us with a conclusion that if a person
    has a positive test result, there is a 99/199 or
    approximately 50% chance that the test is correct
    and this person is actually ill. Therefore, by
    including prior knowledge of the base rate of the
    illness in the population, the perceived chance
    that a positive result indicates that an
    individual actually has the illness drops from
    99% to 50%.
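
The arithmetic of this example is an application of Bayes' rule, and can be checked with a short Python sketch. The function and parameter names are chosen for this illustration; the three input values are the ones given on the slides (base rate 0.1%, true positive rate 99%, false positive rate 0.1%).

```python
def posterior_ill_given_positive(prior, sensitivity, false_positive_rate):
    """P(ill | positive test) by Bayes' rule."""
    true_pos = sensitivity * prior                  # ill and tests positive
    false_pos = false_positive_rate * (1 - prior)   # healthy but tests positive
    return true_pos / (true_pos + false_pos)


p = posterior_ill_given_positive(prior=0.001, sensitivity=0.99,
                                 false_positive_rate=0.001)
print(f"P(ill | positive) = {p:.3f}")

# Same calculation as head counts in a population of 100,000.
ill = 100_000 * 0.001
true_positives = 0.99 * ill
false_positives = 0.001 * (100_000 - ill)
print(f"{true_positives:.0f} true positives, {false_positives:.1f} false positives")
```

The exact result is slightly below one half because the 99.9 false positives are rounded up to 100 on the slide.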