Title: Modifying the Schwarz Bayesian Information Criterion to locate multiple interacting Quantitative Trait Loci
Slide 1: Modifying the Schwarz Bayesian Information Criterion to locate multiple interacting Quantitative Trait Loci
1. M. Bogdan, J.K. Ghosh and R.W. Doerge, Genetics 2004, 167: 989-999.
2. M. Bogdan and R.W. Doerge, Mapping multiple interacting QTL by multidimensional genome searches
Slide 2:
- $X_{ia}$ - genotype of the i-th individual at locus a
- $X_{ia} = 1/2$ - individual is heterozygous at locus a
- $X_{ia} = -1/2$ - individual is homozygous at locus a
- for $d_{ab} = 10$ cM, the correlation $\rho(X_{ia}, X_{ib}) = 0.81$
Data for QTL mapping
- $Y = (Y_1, \ldots, Y_n)$ - vector of trait values for n backcross individuals
- $X = [X_{ij}]$, $1 \le i \le n$, $1 \le j \le m$ - genotypes of m markers
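The link between map distance and genotype correlation can be checked numerically. The sketch below assumes the Haldane map function (no crossover interference), which the slide does not name; the function names are illustrative. For genotypes coded as +1/2 and -1/2 in a backcross, the correlation between two loci with recombination fraction r is 1 - 2r.

```python
import math


def haldane_recombination(d_cm: float) -> float:
    """Recombination fraction for a map distance in cM (Haldane map function, assumed)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def backcross_genotype_correlation(d_cm: float) -> float:
    """Correlation between +/-1/2 genotype codes at two loci d_cm apart.

    Genotypes agree with probability 1 - r and differ with probability r,
    so the correlation is (1 - r) - r = 1 - 2r.
    """
    r = haldane_recombination(d_cm)
    return 1.0 - 2.0 * r


print(round(backcross_genotype_correlation(10.0), 3))
```

Under this assumption, a 10 cM distance gives a correlation of roughly 0.82, close to the 0.81 quoted on the slide.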
Slide 3: Standard methods of QTL mapping
One QTL model
1. Search over markers - fit model (1) at each marker and choose markers for which the likelihood exceeds a preestablished threshold value as candidate QTL locations.
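A marker-by-marker search of this kind might look like the sketch below. Model (1) is not reproduced in the transcript, so the code assumes the usual single-QTL regression $Y_i = \mu + \beta X_{ij} + \varepsilon_i$ with normal errors. It also compares each marker to the no-QTL model through the gain in maximized log-likelihood, which is one common way to apply a threshold. The function names and the threshold value are hypothetical.

```python
import numpy as np


def single_marker_loglik(y, x):
    """Maximized Gaussian log-likelihood of y = mu + beta * x + noise (assumed form of model (1))."""
    n = len(y)
    design = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return -0.5 * n * (np.log(2 * np.pi * rss / n) + 1.0)


def marker_scan(y, X, threshold):
    """Return indices of markers whose log-likelihood gain over the no-QTL model exceeds threshold."""
    n = len(y)
    # Log-likelihood of the intercept-only model (np.var uses the ML variance estimate).
    null_loglik = -0.5 * n * (np.log(2 * np.pi * np.var(y)) + 1.0)
    gains = np.array(
        [single_marker_loglik(y, X[:, j]) - null_loglik for j in range(X.shape[1])]
    )
    return np.flatnonzero(gains > threshold), gains
```

Here `X` is the n-by-m matrix of marker genotypes coded as +1/2 and -1/2, and `threshold` would in practice be chosen to control the genome-wide error rate.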
Slide 4: Interval mapping - Lander and Botstein (1989)
- Consider a fixed position between markers
Slide 5:
- Estimate $\mu$, $\beta$, and $\sigma$ by the EM algorithm and compute the corresponding likelihood.
- Repeat this procedure for a new possible QTL location.
- Plot the resulting likelihoods as a function of the assumed QTL position.
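The EM step at one fixed position can be sketched as follows. This is not the authors' implementation. It assumes a backcross with genotypes coded +1/2 and -1/2, no crossover interference, recombination fractions strictly between 0 and 1/2, and the single-QTL model $Y_i = \mu + \beta q_i + \varepsilon_i$, where $q_i$ is the unobserved QTL genotype. All function names are illustrative.

```python
import numpy as np


def qtl_prob_given_flanking(left, right, r_left, r_right):
    """P(QTL genotype = +1/2 | flanking markers), for 0 < r_left, r_right < 1/2."""
    def match(marker, q, r):
        # Probability of the observed marker genotype given QTL genotype q.
        return np.where(marker == q, 1.0 - r, r)

    hi = match(left, 0.5, r_left) * match(right, 0.5, r_right)
    lo = match(left, -0.5, r_left) * match(right, -0.5, r_right)
    return hi / (hi + lo)


def normal_pdf(y, mean, sigma):
    return np.exp(-0.5 * ((y - mean) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


def mixture_loglik(y, p_hi, mu, beta, sigma):
    """Log-likelihood of the two-component mixture at the given parameters."""
    mix = p_hi * normal_pdf(y, mu + beta / 2, sigma) + (1 - p_hi) * normal_pdf(
        y, mu - beta / 2, sigma
    )
    return float(np.sum(np.log(mix)))


def em_interval_fit(y, p_hi, n_iter=100):
    """Run a fixed number of EM iterations; return (mu, beta, sigma, loglik)."""
    mu, beta, sigma = float(np.mean(y)), 0.0, float(np.std(y))
    for _ in range(n_iter):
        # E-step: posterior probability that individual i carries genotype +1/2.
        f_hi = p_hi * normal_pdf(y, mu + beta / 2, sigma)
        f_lo = (1 - p_hi) * normal_pdf(y, mu - beta / 2, sigma)
        w = f_hi / (f_hi + f_lo)
        # M-step: weighted group means and pooled variance.
        m_hi = np.sum(w * y) / np.sum(w)
        m_lo = np.sum((1 - w) * y) / np.sum(1 - w)
        resid2 = w * (y - m_hi) ** 2 + (1 - w) * (y - m_lo) ** 2
        sigma = float(np.sqrt(np.sum(resid2) / len(y)))
        mu, beta = float((m_hi + m_lo) / 2), float(m_hi - m_lo)
    return mu, beta, sigma, mixture_loglik(y, p_hi, mu, beta, sigma)
```

Repeating `em_interval_fit` over a grid of positions, with `r_left` and `r_right` recomputed for each one, and plotting the returned log-likelihoods gives the profile described on the slide.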
Slide 7: Problems with interval mapping
- a) Not able to distinguish closely linked QTL
- b) Not able to detect epistatic QTL (involved only in interactions)
Solution
- Estimate the location of several QTL at once using a multiple regression model (Kao et al. 1999)
Slide 12: Problem - estimation of the number of additive and interaction terms
- $X_{ij}$ - genotype of j-th marker
- average number of markers - (200, 400)
Slide 13: Bayesian Information Criterion
- Choose the model which maximizes $\log L - \frac{1}{2} k \log n$
- L - likelihood of the data for a given model
- k - number of parameters in the model
- n - sample size
- Broman (1997) and Broman and Speed (2002): BIC overestimates QTL number
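For a Gaussian linear model the criterion can be computed directly from the residual sum of squares. The sketch below is one possible way to do this, with illustrative function names. It counts k as the number of regression coefficients in the design matrix; the error variance is common to all models and only shifts every score by the same constant.

```python
import numpy as np


def gaussian_max_loglik(y, design):
    """Maximized log-likelihood of a linear model with normal errors."""
    n = len(y)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return -0.5 * n * (np.log(2 * np.pi * rss / n) + 1.0)


def bic_score(y, design):
    """log L - (1/2) k log n, in the form to be maximized."""
    n, k = design.shape  # k counts the columns of the design, intercept included
    return gaussian_max_loglik(y, design) - 0.5 * k * np.log(n)
```

Each extra regressor has to raise the log-likelihood by more than (1/2) log n to be kept, which is about 2.65 for n = 200 regardless of how many candidate markers are available.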
Slide 14: How to modify BIC?
- $M_i$ - i-th linear model (specifies which markers are included in regression)
- $\theta = (\mu, \beta_1, \ldots, \beta_p, \gamma_1, \ldots, \gamma_r, \sigma)$ - vector of parameters for $M_i$
- $f_i(\theta)$ - density of the prior distribution for $\theta$
- $\pi(i)$ - prior probability of $M_i$
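Slide 15:
- $L(Y \mid \theta)$ - likelihood of the data given the vector of parameters $\theta$
- $m_i(Y) = \int L(Y \mid \theta) f_i(\theta)\, d\theta$ - likelihood of the data given the model $M_i$
- $P(M_i \mid Y) \propto \pi(i)\, m_i(Y)$
- BIC neglects $\pi(i)$ and uses an asymptotic approximation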
Slide 16: Neglecting $\pi(i)$ = assigning the same prior probability to all models = assigning high prior probability to the event that there are many regressors
Example: 200 markers
- 200 models with one additive term
- 19,900 models with one interaction or with two additive terms
- $9.05 \times 10^{58}$ models with 100 additive terms
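The counts in this example are binomial coefficients and can be reproduced in a few lines. The variable names are illustrative.

```python
from math import comb

M = 200  # number of markers in the example

one_additive = comb(M, 1)        # models with a single additive term
two_additive = comb(M, 2)        # models with two additive terms
one_interaction = comb(M, 2)     # one interaction = one unordered pair of markers
hundred_additive = comb(M, 100)  # models with 100 additive terms

print(one_additive, two_additive, one_interaction)
print(f"{hundred_additive:.2e}")
```

A uniform prior over models therefore puts almost all of its mass on models with very many terms, simply because there are so many more of them.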
Slide 17: Idea - supplement BIC with a more realistic prior distribution $\pi$
Slide 18: Choice of $\pi$ (George and McCulloch, 1993)
- M - number of markers
- $N = M(M-1)/2$ - number of potential interactions
- $\alpha$ - the probability that the i-th additive term appears in the model
- $\nu$ - the probability that the j-th interaction term appears in the model
- M - model with p additive terms and r interactions
- $\pi(M) = \alpha^p \nu^r (1-\alpha)^{M-p} (1-\nu)^{N-r}$
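Slide 19:
- Prior distribution on the number of additive terms: $p \sim \mathrm{Binomial}(M, \alpha)$
- Prior distribution on the number of interactions: $r \sim \mathrm{Binomial}(N, \nu)$
- We choose $\alpha = 1/l$ and $\nu = 1/u$, so that
- $\log \pi(M) = C(M,N,l,u) - p \log(l-1) - r \log(u-1)$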
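The penalty form of the log-prior follows from the product prior if the inclusion probabilities are taken as $\alpha = 1/l$ and $\nu = 1/u$. That substitution is inferred from the formula rather than stated in the transcript. Since $\alpha/(1-\alpha) = 1/(l-1)$, the constant is $C = M \log(1 - 1/l) + N \log(1 - 1/u)$. A small numerical check, with illustrative function names:

```python
import math


def log_prior_product_form(p, r, M, N, alpha, nu):
    """log of alpha^p * nu^r * (1 - alpha)^(M - p) * (1 - nu)^(N - r)."""
    return (
        p * math.log(alpha)
        + r * math.log(nu)
        + (M - p) * math.log(1 - alpha)
        + (N - r) * math.log(1 - nu)
    )


def log_prior_penalty_form(p, r, M, N, l, u):
    """C(M, N, l, u) - p log(l - 1) - r log(u - 1), assuming alpha = 1/l and nu = 1/u."""
    const = M * math.log(1 - 1 / l) + N * math.log(1 - 1 / u)
    return const - p * math.log(l - 1) - r * math.log(u - 1)


M = 200
N = M * (M - 1) // 2
l, u = M / 2.2, N / 2.2  # illustrative values
for p, r in [(0, 0), (1, 0), (2, 1), (5, 3)]:
    a = log_prior_product_form(p, r, M, N, 1 / l, 1 / u)
    b = log_prior_penalty_form(p, r, M, N, l, u)
    assert math.isclose(a, b, rel_tol=1e-12)
```

The constant C does not depend on p or r, so it plays no role when models are compared.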
Slide 20:
- Choice of l and u should depend on the prior knowledge on the number of QTL.
- Our choice: for the sample size 200, the probability of wrongly detecting QTL (when there are none) is approximately 0.05.
- We keep E(p) and E(r) equal to 2.2.
- The choice is supported by a theoretical bound on the type I error based on the Bonferroni inequality.
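Putting the pieces together, the modified criterion can be sketched as BIC plus the prior penalty. The sketch assumes $\alpha = 1/l$ and $\nu = 1/u$, so that E(p) = M/l = 2.2 and E(r) = N/u = 2.2 give l = M/2.2 and u = N/2.2. These relations are inferred from the slides rather than quoted from them. Function names are illustrative, and k counts only the p + r regression terms because the intercept and error variance are shared by all models.

```python
import numpy as np


def mbic_penalty(p, r, n, M, expected_terms=2.2):
    """(1/2)(p + r) log n + p log(l - 1) + r log(u - 1) with l = M/2.2, u = N/2.2 (assumed)."""
    N = M * (M - 1) // 2
    l = M / expected_terms
    u = N / expected_terms
    return 0.5 * (p + r) * np.log(n) + p * np.log(l - 1) + r * np.log(u - 1)


def mbic_score(y, X, additive=(), interactions=()):
    """Penalized log-likelihood (to be maximized) for the chosen additive and interaction terms.

    additive: iterable of marker indices; interactions: iterable of (a, b) index pairs.
    """
    n, M = X.shape
    columns = [np.ones(n)]
    columns += [X[:, j] for j in additive]
    columns += [X[:, a] * X[:, b] for a, b in interactions]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    loglik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1.0)
    return loglik - mbic_penalty(len(additive), len(interactions), n, M)
```

Unlike the plain BIC penalty, the extra terms grow with the number of markers M, and an interaction is penalized more heavily than an additive term because there are many more candidate interactions.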
Slide 21: Additional penalty similar to the Risk Inflation Criterion of Foster and George ($2k \log t$, where t is the total number of available regressors) and to the modification of BIC proposed by Siegmund (2004).
Slide 22: Search over 12 chromosomes; markers spaced every 10 cM
Slide 24:
- The criterion adjusts well to the number of available markers
- For n = 200 the criterion detects almost all additive QTL with individual $h^2 \ge 0.13$ and interactions with $h^2 \ge 0.2$.
- For n = 500 the criterion detects almost all additive QTL with individual $h^2 \ge 0.06$ and interactions with $h^2 \ge 0.12$.
Slide 25: Bound for the type I error
Slide 28: For n = 200 and typical values of M this yields values in the range between 0.057 and 0.08.