Quantitative Trait Loci (QTL): An introduction to quantitative genetics and common methods for mapping of loci underlying continuous traits

Slides: 58
Provided by: son117
1
Quantitative Trait Loci, QTL: An introduction to
quantitative genetics and common methods for
mapping of loci underlying continuous traits
2
Why study quantitative traits?
  • Many (most) human traits/disorders are complex in
    the sense that they are governed by several
    genetic loci as well as being influenced by
    environmental agents
  • Many of these traits are intrinsically
    continuously varying and need specialized
    statistical models/methods for the localization
    and estimation of genetic contributions
  • In addition, in several cases there are potential
    benefits from studying continuously varying
    quantities as opposed to a binary
    affected/unaffected response

3
For example
  • in a study of risk factors the underlying
    quantitative phenotypes that predispose to disease
    may be more etiologically homogeneous than the
    disease phenotype itself
  • some qualitative phenotypes occur once a
    threshold for susceptibility has been exceeded,
    e.g. type 2 diabetes, obesity, etc.
  • in such a case the binary phenotype
    (affected/unaffected) is not as informative as
    the actual phenotypic measurements

4
A pedigree representation
5
Variance and variability
  • methods for linkage analysis of QTL in humans
    rely on a partitioning of the total variability
    of trait values
  • in statistical theory, the variance is the
    expected squared deviation around the mean value,
    $V(Y) = E[(Y - \mu)^2]$
  • it can be estimated from data as
    $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$
  • the square root of the variance is called the
    standard deviation
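
A minimal sketch of the variance estimate in Python. The divisor n - 1 (the unbiased estimator) is an assumption, the function names are illustrative, and the trait values are made up.

```python
from math import sqrt

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def sample_sd(values):
    """Standard deviation: the square root of the variance."""
    return sqrt(sample_variance(values))

trait_values = [6.0, 12.0, 14.0, 12.0]  # made-up trait values
print(sample_variance(trait_values))     # 12.0
print(sample_sd(trait_values))
```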

6
A simple model for the phenotype
  • $Y = X + e$
  • where
  • Y is the phenotypic value, i.e. the trait value
  • X is the genotypic value, i.e. the mean or
    expected phenotypic value given the genotype
  • e is the environmental deviation with mean 0.
  • We assume that the total phenotypic variance is
    the sum of the genotypic variance and the
    environmental variance, $V(Y) = V(X) + V(e)$,
    i.e. the environmental contribution is assumed
    independent of the genotype of the individual

7
Distribution of Y: a single biallelic locus
8
A single biallelic locus: genetic effects
Genotype          A1A1    A1A2      A2A2
Genotypic value   0       (1+k)a    2a
  • a is the homozygous effect,
  • k is the dominance coefficient
  • k = 0 means complete additivity,
  • k = 1 means complete dominance (of A2),
  • k > 1 if A2 is overdominant.

9
Example: The pygmy gene, pg
  • From data we have the following mean values of
    weight
  • $X_{++} = 14$ g, $X_{+pg} = 12$ g, $X_{pgpg} = 6$ g,
  • $2a = 14 - 6 = 8$ implies $a = 4$,
  • $(1 + k)a = 12 - 6 = 6$ implies $k = 0.5$.
  • Data suggest recessivity (although not complete)
    of the pygmy gene.
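
The arithmetic of the pygmy example can be sketched as below, assuming the genotypic values are parameterized as 0, (1+k)a and 2a above the lower homozygote. The function name and argument names are illustrative, not from the slides.

```python
def genetic_effects(x11, x12, x22):
    """Return (a, k) from the three genotypic means.

    Assumes X12 - X11 = (1 + k) * a and X22 - X11 = 2 * a,
    with x11 the mean of the lower homozygote (here pg/pg).
    Not defined when the two homozygotes have equal means (a = 0).
    """
    a = (x22 - x11) / 2
    k = (x12 - x11) / a - 1
    return a, k

a, k = genetic_effects(x11=6.0, x12=12.0, x22=14.0)
print(a, k)  # 4.0 0.5
```

A value of k between 0 and 1 means the heterozygote lies closer to the heavier homozygote, which is the partial recessivity of pg noted on the slide.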

10
Decomposition of the genotypic value, X
  • Xij is the mean of Y for AiAj-individuals
  • when $k = 0$ the two alleles of a biallelic locus
    behave in a completely additive fashion: X is a
    linear function of the number of A2-alleles
  • we can then think of each allele contributing a
    purely additive effect to X
  • this can be generalized to $k \neq 0$ by decomposition
    of X into additive contributions of alleles
    together with deviations resulting from
    dominance
  • the generalization is accomplished using
    least-squares regression of X on the gene content

11
Least-squares linear regression
12
Model 1
13
(No Transcript)
14
Interpretations
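  • in the linear regression
    $X_{ij} = \mu + \alpha_i + \alpha_j + \delta_{ij}$,
  • $\mu + \alpha_i + \alpha_j$ is the heritable component of the
    genotype,
  • $\delta_{ij}$ is the non-heritable part
  • the sum of an individual's additive allelic
    effects, $\alpha_i + \alpha_j$, is called the breeding value and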
    is denoted ?ij
  • under random mating $\alpha_i$ can be interpreted as the
    average excess of allele $A_i$
  • this is defined as the difference between the
    expected phenotypic value when one allele (e.g.
    the paternally transmitted) is fixed at Ai and
    the population average, µ

15
Linear Regression
16
Graphically
17
Linear Regression Model solving

X         prob.       gene content (number of A2 alleles)
0         p1^2        0
a(1+k)    2 p1 p2     1
2a        p2^2        2
18
(No Transcript)
19
(No Transcript)
20
average excesses
21
Interpretations under random mating
  • $\alpha = a[1 + k(p_1 - p_2)]$
  • $\alpha_1 = -p_2 \alpha$
  • $\alpha_2 = p_1 \alpha$,
  • Population parameters for $k \neq 0$
  • $\alpha$ is called the average effect of allelic
    substitution
  • substitute A1 → A2 for a randomly chosen
    A1 allele
  • then the expected change in X is,
  • $(X_{12} - X_{11}) p_1 + (X_{22} - X_{12}) p_2$
  • which equals $\alpha$ (by a simple calculation).
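
The identities on this slide can be checked numerically. The sketch below assumes random mating (Hardy-Weinberg genotype frequencies) and genotypic values 0, (1+k)a, 2a. It computes the least-squares slope of X on the number of A2 alleles and compares it with the closed form a[1 + k(p1 - p2)] and with the substitution argument. All function names are illustrative.

```python
def average_effect_by_regression(a, k, p2):
    """Weighted least-squares slope of X on gene content, for 0 < p2 < 1."""
    p1 = 1.0 - p2
    # (number of A2 alleles, genotypic value, Hardy-Weinberg probability)
    genotypes = [
        (0, 0.0, p1 * p1),
        (1, a * (1 + k), 2 * p1 * p2),
        (2, 2 * a, p2 * p2),
    ]
    mean_n = sum(n * w for n, _, w in genotypes)
    mean_x = sum(x * w for _, x, w in genotypes)
    cov_nx = sum((n - mean_n) * (x - mean_x) * w for n, x, w in genotypes)
    var_n = sum((n - mean_n) ** 2 * w for n, _, w in genotypes)
    return cov_nx / var_n

def average_effect_closed_form(a, k, p2):
    """alpha = a * [1 + k * (p1 - p2)]."""
    p1 = 1.0 - p2
    return a * (1 + k * (p1 - p2))

def average_effect_by_substitution(a, k, p2):
    """Expected change in X when a random A1 allele is replaced by A2."""
    p1 = 1.0 - p2
    x11, x12, x22 = 0.0, a * (1 + k), 2 * a
    return (x12 - x11) * p1 + (x22 - x12) * p2

for f in (average_effect_by_regression,
          average_effect_closed_form,
          average_effect_by_substitution):
    print(f.__name__, f(4.0, 0.5, 0.3))
```

With a = 4, k = 0.5 and p2 = 0.3 (arbitrary illustrative values) all three routes give approximately 4.8.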

22
Average effect of allelic substitution
23
$\alpha$ is a function of $p_2$ and $k$
24
Partitioning the genetic variance
  • the variance, $V(X)$, of the genotypic values in
    a population is called the genetic variance
  • it can be partitioned as $V(X) = V_A + V_D$, where
  • $V_A$ is the
    additive
    genetic variance, i.e. variance associated
    with additive allelic effects
  • $V_D$ is the dominance
    genetic variance, i.e. due to dominance
    deviations
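
A minimal numerical check of the partition, assuming random mating and the standard single-locus expressions VA = 2 p1 p2 α² and VD = (2 p1 p2 a k)², with α = a[1 + k(p1 - p2)]. These expressions and the function names are assumptions brought in for illustration.

```python
def variance_components(a, k, p2):
    """Return (VA, VD) for a biallelic locus under random mating."""
    p1 = 1.0 - p2
    alpha = a * (1 + k * (p1 - p2))          # average effect of substitution
    va = 2 * p1 * p2 * alpha ** 2            # additive genetic variance
    vd = (2 * p1 * p2 * a * k) ** 2          # dominance genetic variance
    return va, vd

def genotypic_variance(a, k, p2):
    """V(X) computed directly from genotypic values and frequencies."""
    p1 = 1.0 - p2
    values = [0.0, a * (1 + k), 2 * a]
    probs = [p1 * p1, 2 * p1 * p2, p2 * p2]
    mean = sum(x * w for x, w in zip(values, probs))
    return sum(w * (x - mean) ** 2 for x, w in zip(values, probs))

va, vd = variance_components(a=4.0, k=0.5, p2=0.3)
print(va, vd, va + vd, genotypic_variance(4.0, 0.5, 0.3))
```

The last two printed numbers agree up to rounding, which is the statement V(X) = VA + VD for this parameter choice.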

25
VA
26
$V(X)$, $V_A$ and $V_D$ are functions of $p_2$ and $k$

27
Example: The Booroola gene (Lynch and Walsh,
1998)
28
In summary
  • The homozygous effect a, and the dominance
    coefficient k are intrinsic properties of
    allelic products.
  • The additive effect $\alpha_i$, and the average excess
    $\alpha_i^*$ are properties of alleles in a particular
    population.
  • The breeding value is a property of a particular
    individual in reference to a particular
    population. It is the sum of the additive effects
    of an individual's alleles.
  • The additive genetic variance, VA, is a
    property of a particular population. It is the
    variance of the breeding values of individuals in
    the population.

29
Multilocus traits
  • Do the separate locus effects combine in an
    additive way, or do there exist non-linear
    interactions between different loci (epistasis)?
  • Do the genes at different loci segregate
    independently?
  • Does gene expression vary with the
    environmental context (gene by environment
    interaction)?
  • Are specific genotypes associated with particular
    environments (covariation of genotypic values and
    environmental effects)?

30
Example: epistasis
Average length of vegetative internodes in the
lateral branch (in mm) of teosinte. Table from
Lynch and Walsh (1998).
31
Two independently segregating loci
  • Extending the least-squares decomposition of X
  • ?k is the breeding value of the k'th locus,
  • $\delta_k$ is the dominance deviation of the k'th
    locus,
  • e is a residual term due to epistasis
  • if the loci are independently segregating

32
Neglecting V (e)
  • the epistatic variance components contributing to
    V (e) are often small compared to VA and VD
  • in linkage analysis it is thus often assumed that
    V(e) = 0
  • note however that the relative magnitudes of the
    variance components provide only limited insight
    into the physiological mode of gene action
  • epistatic interactions can greatly inflate the
    additive and/or dominance components of variance

33
Resemblance between relatives
  • A model for the trait values of two relatives
  • $Y_k = X_k + e_k$, $k = 1, 2$,
  • where for the kth relative
  • $Y_k$ is the phenotypic value,
  • $X_k$ is the genotypic value,
  • $e_k$ is the mean zero environmental deviation.
  • the $e_k$'s are assumed to be mutually independent
    and also independent of the $X_k$. Hence, the covariance
    of the trait values of two relatives is given by
    the genetic covariance, $C(X_1, X_2)$, i.e.
  • $C(Y_1, Y_2) = C(X_1, X_2)$

34
A (preliminary) formula for C(X1, X2)
  • For a single locus trait
  • $C(X_1, X_2) = c_1 V_A + c_2 V_D$
  • $c_1$ and $c_2$ are constants determined by the type of
    relationship between the two relatives.
  • the same formula applies for multilocus traits if no
    epistatic variance components are included in the
    model, i.e. V(e) = 0.
  • in this latter case $V_A$ and $V_D$ are given by summation of
    the corresponding locus-specific contributions.

35
Joint distribution of sibling trait values
Single biallelic, dominant (k = 1) model.
Correlation = 0.46.
36
Measures of relatedness
  • N is the number of alleles shared IBD by two
    relatives at a given locus
  • the kinship coefficient is defined so that
  • 2 × (kinship coefficient) = E(N) / 2
  • i.e. twice the kinship coefficient equals the
    expected proportion of alleles shared IBD at the
    locus.
  • The coefficient of fraternity is defined as
  • P(N = 2).

37
Some examples
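  • Siblings
  • $(z_0, z_1, z_2) = (1/4, 1/2, 1/4)$ implying $E(N) = 1$.
  • Thus the kinship coefficient is 1/4 and the
    coefficient of fraternity is 1/4
  • Parent-offspring
  • $(z_0, z_1, z_2) = (0, 1, 0)$ implying $E(N) = 1$.
  • Thus the kinship coefficient is 1/4 and the
    coefficient of fraternity is 0
  • Grandparent - grandchild
  • $(z_0, z_1, z_2) = (1/2, 1/2, 0)$ implying $E(N) = 1/2$.
  • Thus the kinship coefficient is 1/8 and the
    coefficient of fraternity is 0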

38
Covariance formula for a single locus
Under the assumed model
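
A minimal sketch of the relatedness measures and the resulting single-locus covariance. It assumes that z0, z1, z2 are the probabilities of sharing 0, 1 or 2 alleles IBD, and that the constants in the covariance formula are c1 = 2 × (kinship coefficient) and c2 = (coefficient of fraternity), which is the standard single-locus result. The function names and the variance values are illustrative.

```python
from fractions import Fraction as F

def relatedness(z0, z1, z2):
    """Return (E(N), kinship coefficient, coefficient of fraternity)."""
    expected_n = z1 + 2 * z2       # E(N) = 0*z0 + 1*z1 + 2*z2
    kinship = expected_n / 4       # from 2 * kinship = E(N) / 2
    fraternity = z2                # P(N = 2)
    return expected_n, kinship, fraternity

def genetic_covariance(z, va, vd):
    """C(X1, X2) = 2 * kinship * VA + fraternity * VD (assumed form)."""
    _, kinship, fraternity = relatedness(*z)
    return 2 * kinship * va + fraternity * vd

SIBS = (F(1, 4), F(1, 2), F(1, 4))
PARENT_OFFSPRING = (F(0), F(1), F(0))
GRANDPARENT_GRANDCHILD = (F(1, 2), F(1, 2), F(0))

for name, z in [("siblings", SIBS),
                ("parent-offspring", PARENT_OFFSPRING),
                ("grandparent-grandchild", GRANDPARENT_GRANDCHILD)]:
    print(name, relatedness(*z), genetic_covariance(z, va=8, vd=4))
```

Exact fractions are used so the output can be compared directly with the values 1/4, 1/4 and 1/8 listed on the examples slide.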
39
A single locus: perfect marker data
40
Covariance formula for multiple loci
n independently segregating loci assuming
no epistatic interaction, i.e. putting V(e) = 0
41
Covariance formula for multiple loci
n independently segregating loci assuming
no epistatic interaction, i.e. putting V(e) = 0
42
Covariance... continued
Define for every pair of relatives
For two related individuals we then have,
43
Haseman-Elston method
  • Uses pairs of relatives of the same type, most
    often sib pairs
  • for each relative pair calculate the squared
    phenotypic difference $Z = (Y_1 - Y_2)^2$
  • given the marker data $MD_x$, regress the Z's on the expected
    proportion of alleles IBD, $\pi(x) = E[N_x \mid MD_x]/2$,
    at the test locus
  • a slope coefficient $\beta < 0$, if statistically
    significant, is considered as evidence for
    linkage
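
The regression step can be sketched as ordinary least squares of the squared differences on the estimated IBD proportions. This is only an illustration: the input format (a list of `(y1, y2, pi_hat)` tuples) and the function name are assumptions, the data are made up, and the one-sided significance test for the slope is left out.

```python
def haseman_elston_fit(pairs):
    """Regress Z = (y1 - y2)^2 on pi_hat; return (intercept, slope).

    pairs: list of (y1, y2, pi_hat), where pi_hat is the estimated
    proportion of alleles shared IBD at the test locus.
    """
    z = [(y1 - y2) ** 2 for y1, y2, _ in pairs]
    pi = [p for _, _, p in pairs]
    n = len(pairs)
    mean_z = sum(z) / n
    mean_pi = sum(pi) / n
    sxx = sum((p - mean_pi) ** 2 for p in pi)
    if sxx == 0:
        raise ValueError("pi_hat does not vary across pairs")
    sxy = sum((p - mean_pi) * (zi - mean_z) for p, zi in zip(pi, z))
    slope = sxy / sxx
    intercept = mean_z - slope * mean_pi
    return intercept, slope

# Made-up sib pairs: pairs sharing more alleles IBD differ less.
toy_pairs = [(3.0, 0.0, 0.0), (2.0, 0.0, 0.5), (1.0, 0.0, 1.0)]
intercept, slope = haseman_elston_fit(toy_pairs)
print(intercept, slope)  # slope is negative for these data
```

A negative slope alone is not evidence for linkage. As the slide states, it has to be statistically significant.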

44
HE: an example
Figure: x-axis, proportion of marker alleles identical by descent.
The solid line is the fitted regression line; the dotted
line indicates the true underlying relationship.
45
HE: motivation
Assume strictly additive gene action at each
locus, i.e. $V_D = 0$. Then, for a putative QTL at x,
46
HE: linkage test
47
HE: examples with simulated data
Simulated data from n = 200 sib pairs; top to
bottom: h² = 0.50, 0.33, 0.25.
48
Heritability and power
  • for a given locus we may define the
    locus-specific heritability as the proportion of
    the total variance 'explained' by that particular
    site, e.g. (in the narrow sense), $h_x^2 = V_{Ax} / V(Y)$
  • the locus-specific heritability is the single
    most important parameter for the power of QTL
    linkage methods
  • heritabilities below 10% lead, in general, to
    unrealistically large sample sizes.

49
HE: two-point analysis
  • in the two-point regression of Z on $\pi$, $\pi$ is the
    expected proportion of marker alleles shared IBD.
  • the slope $\beta$ depends on the type of relatives considered
  • for sib pairs, the recombination fraction ($\theta$) and
    effect size (VAl) are confounded and cannot be separately
    estimated
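
The confounding can be illustrated with the classical Haseman-Elston slope for sib pairs under strictly additive gene action, β = -2(1 - 2θ)² VA. The formula is quoted here as an assumption rather than taken from the slide, and the function name and parameter values are illustrative.

```python
from math import isclose

def sib_pair_slope(theta, va):
    """Expected HE slope for sib pairs: -2 * (1 - 2*theta)^2 * VA."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return -2.0 * (1.0 - 2.0 * theta) ** 2 * va

# A tightly linked QTL with a small effect ...
tight_small = sib_pair_slope(theta=0.0, va=1.0)
# ... and a loosely linked QTL with a large effect.
loose_large = sib_pair_slope(theta=0.25, va=4.0)

print(tight_small, loose_large)
print(isclose(tight_small, loose_large))  # True: same slope, different (theta, VA)
```

Because only the product (1 - 2θ)² VA enters the slope, a single marker cannot separate distance from effect size.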

50
HE: in summary
  • Simple, transparent and comparatively robust but
  • poor statistical power in many settings
  • different types of relatives cannot be mixed
  • parents and their offspring cannot be used in HE
  • assumptions of the statistical model not
    generally satisfied
  • Remedy
  • use one of several suggested extensions of HE
  • alternatively, use VCA instead

51
VCA
Mathematically, $Y_i = \mu + \beta^T a_i + g_i + q_i + e_i$, where $\mu$ is
the population mean, $a_i$ are the environmental
predictor variables, $q_i$ is the effect of the major trait locus,
$g_i$ is the polygenic effect, and $e_i$ is the residual
error.
52
VCA: an additive model
53
VCA: major assumption
  • The joint distribution of the phenotypic values
    in a pedigree is assumed to be multivariate
    normal with the given mean values, variances and
    covariances
  • the multivariate normal distribution is
    completely specified by the mean values,
    variances and covariances
  • the likelihood, L, of data can be calculated and
    we can estimate the variance components
    VAx, VDx, VAR, VDR

54
VCA: linkage test
  • The linkage test of
  • H0: VAx = VDx = 0
  • uses the LOD score statistic

When the position of the test locus, x, is varied
over a chromosomal region the result can be
summarized in a LOD score curve.
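
A minimal sketch of the LOD score computation, assuming the usual definition as the base-10 logarithm of the ratio between the maximized likelihood under the alternative and the maximized likelihood under H0. The function name is illustrative and the log-likelihood values are made up. A real analysis would obtain them by fitting the variance components model at each position x.

```python
from math import log

def lod_score(loglik_alt, loglik_null):
    """LOD = log10(L_alt / L_null), from natural log-likelihoods."""
    return (loglik_alt - loglik_null) / log(10)

# Made-up maximized log-likelihoods at three test positions (in cM).
loglik_null = -100.0
loglik_by_position = {10: -99.2, 20: -96.5, 30: -98.1}

lod_curve = {x: lod_score(ll, loglik_null)
             for x, ll in loglik_by_position.items()}
best_position = max(lod_curve, key=lod_curve.get)
print(lod_curve)
print(best_position)  # 20
```

Evaluating the statistic over a grid of positions, as here, is what produces the LOD score curve described above.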
55
VCA vs HE: LOD score profiles
From Pratt et al. Am. J. Hum. Genet.
66:1153-1157 (2000)
56
Linkage methods for QTL
  • Fully parametric linkage approach is difficult
  • Model-free tests comprise the alternative choice
  • We will discuss
  • Haseman-Elston Regression (HE)
  • Variance Components Analysis (VCA)
  • Both can be viewed as two-step procedures
  • 1. use polymorphic molecular markers to extract
    information on inheritance patterns
  • 2. evaluate evidence for a trait-influencing
    locus at specified locations

57
Similarities and differences
  • HE and VCA are based on estimated IBD-sharing
    given marker data
  • both methods require specification of a
    statistical model!
  • ('model-free' means 'does not require
    specification of genetic model')
  • similarity in IBD-sharing is used to evaluate
    trait similarity using either
  • linear regression (HE) or
  • variance components analysis (VCA)