Title: A Microarray-Based Screening Procedure for Detecting Differentially Represented Yeast Mutants
1A Microarray-Based Screening Procedure for Detecting Differentially Represented Yeast Mutants
- Rafael A. Irizarry
- Department of Biostatistics, JHU
- rafa_at_jhu.edu
- http://biostat.jhsph.edu/~ririzarr
3[Figure, panels A and B. Labels: kanR, DOWNTAG, UPTAG, CEN/ARS, URA3, MCS. Workflow: circular pRS416 and EcoRI-linearized pRS416 -> transformation into deletion pool -> select for Ura+ transformants -> genomic DNA preparation -> PCR -> Cy5-labeled and Cy3-labeled PCR products -> oligonucleotide array hybridization]
4Which mutants are NHEJ defective?
- Find mutants defective for transformation with linear DNA
- Dead in linear transformation (green)
- Alive in circular transformation (red)
- Look for spots with large log(R/G)
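As a minimal sketch of the screen described above, the log-ratio of each spot can be averaged over replicates and the mutants ranked from largest to smallest. The function names, the base-2 logarithm, and the toy intensities are assumptions made for illustration; they are not taken from the original analysis.

```python
import math

def mean_log_ratio(red, green):
    """Average log2(R/G) over the replicate spots of one mutant."""
    ratios = [math.log2(r / g) for r, g in zip(red, green)]
    return sum(ratios) / len(ratios)

def rank_by_log_ratio(intensities):
    """Sort mutants from largest to smallest mean log2(R/G).

    `intensities` maps a mutant name to (red replicates, green replicates).
    """
    scores = {name: mean_log_ratio(r, g) for name, (r, g) in intensities.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (hypothetical intensities, not from the experiment).
toy = {
    "mutant_A": ([800.0, 820.0, 790.0], [100.0, 110.0, 95.0]),   # red >> green
    "mutant_B": ([500.0, 480.0, 510.0], [490.0, 500.0, 505.0]),  # red ~ green
}
ranking = rank_by_log_ratio(toy)
```

Here `mutant_A` ranks first because its red (circular) signal is far above its green (linear) signal, which is the pattern the screen looks for.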
5 6Data
- 5718 mutants
- 3 replicates on each slide
- 5 Haploid slides, 4 Diploid slides
- Arrays are divided into 2 downtags and 3 uptags (2 of which are replicate uptags)
7Average Red and Green Scatter Plot
8Average Red and Green MVA plot
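For reference, an MVA plot shows M (the log-ratio) against A (the average log-intensity). A minimal sketch of the two coordinates for one spot follows; the function name and the base-2 logarithm are assumptions for illustration.

```python
import math

def mva(red, green):
    """Return (M, A) for one spot.

    M = log2(R) - log2(G)        (log-ratio)
    A = (log2(R) + log2(G)) / 2  (average log-intensity)
    """
    log_r, log_g = math.log2(red), math.log2(green)
    return log_r - log_g, (log_r + log_g) / 2.0

# Hypothetical intensities chosen as powers of two so the result is exact.
m, a = mva(1024.0, 256.0)
```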
10Improvement to usual approach
- Take into account that some mutants are dead and some alive
- Use a statistical model to represent this
- Mixture model?
- With ratios we lose information about R and G separately
- Look at them separately (absolute analysis)
11Histograms
12Using model we can attach uncertainty to tests
- For example posterior z-test,
- weighted average of z-tests with weights obtained using the posterior probability (obtained from EM)
- Is Normal(0,1)
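One possible reading of the posterior z-test, sketched under loud assumptions: a z-statistic is computed under each class using a class-specific standard deviation, and the statistics are averaged with the posterior class probabilities as weights. The slide does not give the exact formula, so the function, its arguments, and the numbers below are hypothetical.

```python
import math

def posterior_weighted_z(diff, n, class_sds, posteriors):
    """Weighted average of per-class z-statistics (an assumed form).

    diff       : mean red-minus-green log intensity for one mutant
    n          : number of replicates behind `diff`
    class_sds  : assumed standard deviation under each class
    posteriors : posterior class probabilities (e.g. from EM), summing to 1
    """
    if abs(sum(posteriors) - 1.0) > 1e-8:
        raise ValueError("posterior probabilities must sum to 1")
    # z-statistic the mutant would get if it belonged to each class in turn.
    z_by_class = [diff / (sd / math.sqrt(n)) for sd in class_sds]
    return sum(p * z for p, z in zip(posteriors, z_by_class))

# Hypothetical inputs: three classes with different spreads.
z = posterior_weighted_z(diff=1.0, n=4, class_sds=[0.5, 1.0, 2.0],
                         posteriors=[0.2, 0.3, 0.5])
```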
13QQ-Plot
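A normal QQ-plot pairs the sorted test statistics with standard normal quantiles; points near the diagonal suggest the statistics behave like Normal(0,1). The sketch below computes the point pairs only (no plotting). The plotting positions (i + 0.5)/n and the example z-scores are assumptions for illustration.

```python
from statistics import NormalDist

def normal_qq_points(z_scores):
    """Pair standard normal quantiles with the sorted z-scores."""
    n = len(z_scores)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(z_scores)))

# Hypothetical z-scores.
points = normal_qq_points([0.3, -1.2, 0.0, 1.1, -0.4])
```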
14Uptag/Downtag Z-Scores
15Average Red and Green MVA Plot
16Average Red and Green Scatter Plot
22ResultsTable
- 1 YMR106C 9.5 47 69.2 a a 100
- 2 YOR005C 19.7 35 44.9 a d 100
- 3 YLR265C 6.1 32 35.8 a m 100
- 4 YDL041W 10.4 32 35.6 a m 100
- 5 YIL012W 12.2 31 21.7 a a 100
- 6 YIL093C 4.8 29 30.8 a a 100
- 7 YIL009W 5.6 29 -23.5 a a 100
- 8 YDL042C 12.9 29 32.1 a d 100
- 9 YIL154C 1.8 28 91.3 m m 82
- 10 YNL149C 1.7 27 93.4 m d 71
- 11 YBR085W 2.5 26 -15.8 a a 84
- 12 YBR234C 1.7 26 87.5 m d 75
- 13 YLR442C 6.1 26 -100.0 a a 100
23Acknowledgements
- Siew Loon Ooi
- Jef Boeke
- Forrest Spencer
- Jean Yang
24END
25Summary
- Simple data exploration is a useful tool for quality assessment
- Statistical thinking is helpful for interpretation
- Statistical models may help find signals in noise
26Acknowledgements
- MBG (SOM): Jef Boeke, Siew-Loon Ooi, Marina Lee, Forrest Spencer
- Biostatistics: Karl Broman, Leslie Cope, Carlo Colantuoni, Giovanni Parmigiani, Scott Zeger
- PGA: Tom Cappola, Skip Garcia, Joshua Hare
- UC Berkeley Stat: Ben Bolstad, Sandrine Dudoit, Terry Speed, Jean Yang
- Gene Logic: Francois Colin, Uwe Scherf's Group
- WEHI: Bridget Hobbs, Natalie Thorne
31Warning
- Absolute analyses can be dangerous for competitive hybridization slides
- We must be careful about spot effect
- Big R or G may only mean the spot they were on had large amounts of cDNA
- Look at some facts that make us feel safer
32Correlation between replicates
- R1 R2 R3 G1 G2 G3
- R1 1.00 0.95 0.95 0.94 0.90 0.90
- R2 0.95 1.00 0.96 0.90 0.95 0.91
- R3 0.95 0.96 1.00 0.91 0.92 0.95
- G1 0.94 0.90 0.91 1.00 0.96 0.96
- G2 0.90 0.95 0.92 0.96 1.00 0.97
- G3 0.90 0.91 0.95 0.96 0.97 1.00
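A correlation table like the one above can be computed from the per-mutant log intensities of each replicate channel. The sketch below uses plain Pearson correlation; the column names mirror the table, but the values are made up for illustration and the helper names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical log intensities for four mutants in three channels.
toy = {
    "R1": [7.1, 9.4, 8.2, 10.5],
    "R2": [7.3, 9.1, 8.4, 10.2],
    "G1": [6.8, 9.9, 8.0, 10.1],
}
corr = correlation_matrix(toy)
```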
33Correlation between red, green, haploid, diploid, uptag, downtag
- RHD RHU RDD RDU GHD GHU GDD GDU
- RHD 1.00 0.59 0.56 0.32 0.95 0.58 0.54 0.37
- RHU 0.59 1.00 0.38 0.56 0.58 0.95 0.40 0.58
- RDD 0.56 0.38 1.00 0.58 0.54 0.39 0.92 0.64
- RDU 0.32 0.56 0.58 1.00 0.33 0.53 0.58 0.89
- GHD 0.95 0.58 0.54 0.33 1.00 0.62 0.56 0.39
- GHU 0.58 0.95 0.39 0.53 0.62 1.00 0.41 0.58
- GDD 0.54 0.40 0.92 0.58 0.56 0.41 1.00 0.73
- GDU 0.37 0.58 0.64 0.89 0.39 0.58 0.73 1.00
34BTW
- The mean squared error across slides is about 3
times bigger than the mean squared error within
slides
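A rough sketch of how within-slide and across-slide mean squared errors could be compared for a single mutant. The slide does not say how the errors were computed, so the simple divisors used here (no degrees-of-freedom correction), the function name, and the toy values are all assumptions.

```python
def within_and_across_mse(slides):
    """Return (within-slide MSE, across-slide MSE) for one mutant.

    `slides` is a list of slides, each a list of replicate log intensities.
    Within : mean squared deviation of replicates from their slide mean.
    Across : mean squared deviation of slide means from the grand mean.
    """
    slide_means = [sum(s) / len(s) for s in slides]
    grand_mean = sum(slide_means) / len(slide_means)
    within_sq = [(v - m) ** 2 for s, m in zip(slides, slide_means) for v in s]
    within = sum(within_sq) / len(within_sq)
    across = sum((m - grand_mean) ** 2 for m in slide_means) / len(slide_means)
    return within, across

# Hypothetical data: three slides with three replicate spots each.
toy = [[9.0, 9.2, 8.8], [10.0, 10.2, 9.8], [8.0, 8.2, 7.8]]
within, across = within_and_across_mse(toy)
```

In this toy example the replicates on a slide agree closely while the slide means differ, so the across-slide error dominates.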
35Mixture Model
- We use a mixture model that assumes
- There are three classes
- Dead
- Marginal
- Alive
- Normally distributed with same correlation
structure from gene to gene
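To make the three-class idea concrete, here is a deliberately simplified sketch: EM for a one-dimensional, three-component Gaussian mixture. The model on the slide is multivariate with a shared correlation structure; the univariate version, the initialization, the fixed iteration count, and the simulated data are all assumptions made to keep the example short.

```python
import math
import random

def normal_pdf(x, mu, sd):
    """Density of Normal(mu, sd^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def em_three_classes(values, n_iter=100):
    """Fit a 3-component 1-D Gaussian mixture by EM.

    Components start at the low, middle, and high end of the data and are
    read here as dead, marginal, and alive.
    Returns (weights, means, sds, posteriors).
    """
    lo, hi = min(values), max(values)
    means = [lo, (lo + hi) / 2.0, hi]
    sds = [(hi - lo) / 6.0] * 3
    weights = [1.0 / 3.0] * 3
    post = []
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every value.
        post = []
        for x in values:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            post.append([d / total for d in dens])
        # M-step: update weight, mean, and standard deviation of each class.
        for k in range(3):
            nk = max(sum(p[k] for p in post), 1e-12)
            weights[k] = nk / len(values)
            means[k] = sum(p[k] * x for p, x in zip(post, values)) / nk
            var = sum(p[k] * (x - means[k]) ** 2 for p, x in zip(post, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsing component
    return weights, means, sds, post

# Simulated log intensities (hypothetical): 100 dead, 50 marginal, 150 alive.
rng = random.Random(0)
toy = ([rng.gauss(6.0, 0.3) for _ in range(100)]
       + [rng.gauss(8.0, 0.3) for _ in range(50)]
       + [rng.gauss(10.0, 0.3) for _ in range(150)])
weights, means, sds, posteriors = em_three_classes(toy)
```

The rows of `posteriors` are the per-mutant class probabilities that a posterior-weighted test would use as weights.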
36Random effect justification
- Each x = (r1, ..., r5, g1, ..., g5) will have the following effects
- Individual effect: same mutant, same expression (replicates are alike)
- Genetic effect: same genetics, same expression
- PCR effect: expect difference in uptag, downtag
37Does it fit?
38Does it fit?
39What can we do now that we couldn't do before?
- Define a t-test that takes into account if mutants are dead or not when computing variance
- For each gene compute likelihood ratios comparing two hypotheses
- alive/dead vs. dead/dead or alive/alive
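A minimal sketch of such a likelihood ratio, under strong simplifying assumptions: each channel is normal within a class, red and green are treated as independent (the model on the slides allows correlation), and the alternative "alive in red, dead in green" is compared with the better-fitting of the two same-state configurations. The class parameters and function name are hypothetical.

```python
import math
from statistics import NormalDist

def log_lr_alive_dead(red, green, alive, dead):
    """Log likelihood ratio of (red alive, green dead) versus the better of
    (both dead) and (both alive), treating the two channels as independent.

    Note: math.log fails if a density underflows to zero for extreme values.
    """
    def loglik(dist_red, dist_green):
        return math.log(dist_red.pdf(red)) + math.log(dist_green.pdf(green))

    h1 = loglik(alive, dead)
    h0 = max(loglik(dead, dead), loglik(alive, alive))
    return h1 - h0

# Hypothetical class distributions on the log-intensity scale.
alive = NormalDist(mu=10.0, sigma=1.0)
dead = NormalDist(mu=6.0, sigma=1.0)

lr_hit = log_lr_alive_dead(10.0, 6.0, alive, dead)    # red high, green low
lr_null = log_lr_alive_dead(10.0, 10.0, alive, dead)  # both channels high
```

A large positive value favors alive/dead; a large negative value favors the same state in both channels.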
40QQ-plot for new t-test
41Better looking than others