Title: Use of Mixture Model in a genome-wide DNA microarray-based genetic screen for components of the NHEJ Pathway in Yeast
1Use of Mixture Model in a genome-wide DNA
microarray-based genetic screen for components of
the NHEJ Pathway in Yeast
- Rafael A. Irizarry
- Department of Biostatistics, JHU
- rafa_at_jhu.edu
3Damaged DNA
- DNA end binding: Yku70p/Yku80p (DNA-PK)
- Nucleolytic processing: Rad50p/Mre11p/Xrs2p
- Ligation: Lig4p/Lif1p
- Repaired DNA
4A: kanR, UPTAG, DOWNTAG
B: CEN/ARS, URA3, MCS
- Circular pRS416
- EcoRI-linearized pRS416
- Transformation into deletion pool
- Select for Ura+ transformants
- Genomic DNA preparation
- PCR
- Cy5-labeled PCR products
- Cy3-labeled PCR products
- Oligonucleotide array hybridization
5Data
- 5718 mutants
- 3 replicates on each slide
- 5 Haploid slides, 4 Diploid slides
- Haploids are divided into 2 downtags and 3 uptags (2 of which are replicate uptags)
- Diploids are divided into 3 uptags (2 of which are replicates) and 2 downtags
6Which mutants are NHEJ defective?
- Find mutants defective for transformation with linear DNA
- Dead in linear transformation (green)
- Alive in circular transformation (red)
- Look for spots with large log(R/G)
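The ranking by log(R/G) can be sketched as follows. This is a minimal illustration, not the analysis used in the talk: the base-2 logarithm, the intensity floor, and the mutant names and intensities are all assumptions made for the example.

```python
import math

def log_ratio(red, green, floor=1.0):
    """Return log2(R/G); intensities are floored to avoid taking log of zero.

    Base 2 and the floor value are assumptions; the slide only says log(R/G).
    """
    r = max(red, floor)
    g = max(green, floor)
    return math.log2(r / g)

# Hypothetical spot intensities: (mutant, R = circular, G = linear)
spots = [
    ("mutantA", 8000.0, 500.0),   # strong in circular, weak in linear
    ("mutantB", 6000.0, 6000.0),  # similar in both channels
    ("mutantC", 300.0, 250.0),    # weak in both channels
]

# Rank spots from largest to smallest log ratio
ranked = sorted(spots, key=lambda s: log_ratio(s[1], s[2]), reverse=True)
for name, r, g in ranked:
    print(name, round(log_ratio(r, g), 2))
```

Note that mutantC has a positive log ratio even though both channels are weak, which is the kind of information a ratio alone hides.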
12Improvement to usual approach
- Take into account that some mutants are dead and some alive
- Use a statistical model to represent this
- Mixture model?
- With ratios we lose information about R and G separately
- Look at them separately (absolute analysis)
18Warning
- Absolute analyses can be dangerous for competitive hybridization slides
- We must be careful about spot effect
- A big R or G may only mean that the spot they were on had large amounts of cDNA
- Look at some facts that make us feel safer
19Correlation between replicates
     R1   R2   R3   G1   G2   G3
R1 1.00 0.95 0.95 0.94 0.90 0.90
R2 0.95 1.00 0.96 0.90 0.95 0.91
R3 0.95 0.96 1.00 0.91 0.92 0.95
G1 0.94 0.90 0.91 1.00 0.96 0.96
G2 0.90 0.95 0.92 0.96 1.00 0.97
G3 0.90 0.91 0.95 0.96 0.97 1.00
20Correlation between red, green, haploid, diploid, uptag, downtag
     RHD  RHU  RDD  RDU  GHD  GHU  GDD  GDU
RHD 1.00 0.59 0.56 0.32 0.95 0.58 0.54 0.37
RHU 0.59 1.00 0.38 0.56 0.58 0.95 0.40 0.58
RDD 0.56 0.38 1.00 0.58 0.54 0.39 0.92 0.64
RDU 0.32 0.56 0.58 1.00 0.33 0.53 0.58 0.89
GHD 0.95 0.58 0.54 0.33 1.00 0.62 0.56 0.39
GHU 0.58 0.95 0.39 0.53 0.62 1.00 0.41 0.58
GDD 0.54 0.40 0.92 0.58 0.56 0.41 1.00 0.73
GDU 0.37 0.58 0.64 0.89 0.39 0.58 0.73 1.00
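Correlation tables like these can be computed along the following lines. This is a sketch under assumptions: it uses the Pearson correlation (the slides do not say which correlation was used), and the column names and values are made-up log intensities, not data from the screen.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical log intensities for five mutants (one value per mutant)
columns = {
    "R1": [6.1, 9.0, 12.2, 11.8, 6.4],
    "R2": [6.3, 8.7, 12.0, 12.1, 6.0],
    "G1": [6.0, 9.2, 6.5, 11.9, 6.2],
}

matrix = correlation_matrix(columns)
for a in columns:
    print(a, " ".join(f"{matrix[a][b]:.2f}" for b in columns))
```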
21BTW
- The mean squared error across slides is about 3
times bigger than the mean squared error within
slides
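One way to compare within-slide and across-slide variation is sketched below. The slide does not define the two mean squared errors, so the definitions here (replicates around their slide mean, slide means around the grand mean) and the example values are assumptions for illustration only.

```python
def mean(values):
    return sum(values) / len(values)

def within_and_across_mse(slides):
    """slides: one inner list of replicate values per slide, for one mutant.

    within: mean squared deviation of replicates from their own slide mean
    across: mean squared deviation of slide means from the grand mean
    """
    slide_means = [mean(s) for s in slides]
    grand = mean(slide_means)
    within = mean([(v - m) ** 2 for s, m in zip(slides, slide_means) for v in s])
    across = mean([(m - grand) ** 2 for m in slide_means])
    return within, across

# Hypothetical log intensities: 3 slides with 3 replicates each
slides = [[10.0, 10.2, 9.8], [11.0, 11.2, 10.8], [9.0, 9.2, 8.8]]
within, across = within_and_across_mse(slides)
print(within, across)
```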
22Mixture Model
- We use a mixture model that assumes
- There are three classes
- Dead
- Marginal
- Alive
- Normally distributed with same correlation
structure from gene to gene
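A three-class normal mixture can be fit with the EM algorithm, as in the sketch below. This is a deliberate simplification and not the model from the talk: it is univariate (one intensity per mutant) and ignores the shared correlation structure. The simulated data, starting values, and the function name `fit_mixture` are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_mixture(data, means, sds, weights, n_iter=50):
    """EM for a univariate Gaussian mixture.

    Returns (weights, means, sds, resp), where resp holds the class
    probabilities for each observation from the final E-step.
    """
    k = len(means)
    means, sds, weights = list(means), list(sds), list(weights)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each observation
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], sds[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and standard deviation of each class
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), 1e-3)  # guard against a collapsing class
    return weights, means, sds, resp

# Simulated log intensities for dead, marginal and alive mutants (made-up values)
random.seed(0)
data = ([random.gauss(6.0, 0.5) for _ in range(200)]
        + [random.gauss(9.0, 0.5) for _ in range(200)]
        + [random.gauss(12.0, 0.5) for _ in range(200)])

weights, means, sds, resp = fit_mixture(
    data, means=[5.0, 9.0, 13.0], sds=[1.0, 1.0, 1.0], weights=[1 / 3, 1 / 3, 1 / 3]
)
print([round(m, 2) for m in means], [round(w, 2) for w in weights])
```

The rows of `resp` give, for each mutant, the estimated probability of belonging to each class, which is what allows a call of dead, marginal, or alive.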
23Random effect justification
- Each x = (r1,...,r5,g1,...,g5) will have the following effects
- Individual effect: same mutant, same expression (replicates are alike)
- Genetic effect: same genetics, same expression
- PCR effect: expect difference in uptag, downtag
24Does it fit?
25Does it fit?
26What can we do now that we couldn't do before?
- Define a t-test that takes into account if mutants are dead or not when computing variance
- For each gene compute likelihood ratios comparing two hypotheses
- alive/dead vs. dead/dead or alive/alive
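The likelihood ratio comparison can be sketched as follows. This is an illustration under strong assumptions, not the computation from the talk: replicates are treated as independent normals (the talk's model has a correlation structure), the class means and standard deviations in `CLASSES` are made up, the marginal class is left out, and the null hypothesis is handled by taking the better of dead/dead and alive/alive.

```python
import math

def log_normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))

# Hypothetical class parameters (mean, sd) on the log-intensity scale
CLASSES = {"dead": (6.0, 0.5), "alive": (12.0, 0.5)}

def log_lik(values, cls):
    """Log likelihood of replicate values under one class, assuming independence."""
    mean, sd = CLASSES[cls]
    return sum(log_normal_pdf(v, mean, sd) for v in values)

def log_lr(red, green):
    """Log likelihood ratio: alive/dead versus the better of dead/dead, alive/alive."""
    h1 = log_lik(red, "alive") + log_lik(green, "dead")
    h0 = max(log_lik(red, c) + log_lik(green, c) for c in CLASSES)
    return h1 - h0

# Made-up replicates: alive with circular DNA (red), dead with linear DNA (green)
candidate = log_lr([11.8, 12.1, 12.0], [6.2, 5.9, 6.1])
# Made-up replicates: alive in both transformations
unaffected = log_lr([12.1, 11.9, 12.0], [11.8, 12.2, 12.0])
print(round(candidate, 1), round(unaffected, 1))
```

A large positive value favors the alive/dead hypothesis, so mutants can be ranked by this quantity.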
27QQ-plot for new t-test
28Better looking than others
32
 1 YMR106C  9.5 47   69.2 a a 100
 2 YOR005C 19.7 35   44.9 a d 100
 3 YLR265C  6.1 32   35.8 a m 100
 4 YDL041W 10.4 32   35.6 a m 100
 5 YIL012W 12.2 31   21.7 a a 100
 6 YIL093C  4.8 29   30.8 a a 100
 7 YIL009W  5.6 29  -23.5 a a 100
 8 YDL042C 12.9 29   32.1 a d 100
 9 YIL154C  1.8 28   91.3 m m  82
10 YNL149C  1.7 27   93.4 m d  71
11 YBR085W  2.5 26  -15.8 a a  84
12 YBR234C  1.7 26   87.5 m d  75
13 YLR442C  6.1 26 -100.0 a a 100
33Acknowledgements
- Siew Loon Ooi
- Jef Boeke
- Forrest Spencer
- Jean Yang