Bias, Variance, and Fit for Three Measures of Expression: AvDiff, Li & Wong's, and AvLog(PM-BG)

Slides: 29

Provided by: iriz

Learn more at: https://biosun01.biostat.jhsph.edu

Transcript and Presenter's Notes

1
Bias, Variance, and Fit for Three Measures of
Expression: AvDiff, Li & Wong's, and AvLog(PM-BG)

Rafael A. Irizarry
Department of Biostatistics, JHU
(joint work with Bridget Hobbs and Terry Speed,
Walter and Eliza Hall Institute of Medical Research)

2
Summary

Summarize the expression level of a probe set by
Average Log2(PM-BG)
PMs need to be normalized
Background estimate makes no use of probe-specific MM
Evaluate through bias, variance, and model fit, and
compare with AvDiff and the Li & Wong algorithm
Use the Gene Logic spike-in and dilution studies
All three expression measures performed well
AvLog(PM-BG) is arguably the best of the three

3
SD vs. Avg of Defective Probes
4
Normalization at Probe Level
5
Spike-In Experiments

Add concentrations (0.5 pM to 100 pM) of 11 foreign
species cRNAs to the hybridization mixture
Set A: 11 control cRNAs were spiked in, all at
the same concentration, which varied across
chips.
Set B: 11 control cRNAs were spiked in, all at
different concentrations, which varied across
chips. The concentrations were arranged in a 12x12
cyclic Latin square (with 3 replicates)
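
As an illustration of the Set B layout, a cyclic Latin square can be built by rotating one list of levels by one position per row, so that each level occurs exactly once in every row and every column. The sketch below uses placeholder level labels 0 to 11; the actual concentrations, and how the 11 cRNAs were assigned to the 12 columns, are not given on the slide and are not assumed here.

```python
def cyclic_latin_square(levels):
    """Return an n x n square where row r is `levels` rotated left by r places."""
    n = len(levels)
    return [[levels[(r + c) % n] for c in range(n)] for r in range(n)]

# Placeholder level labels (hypothetical), not real concentrations.
square = cyclic_latin_square(list(range(12)))

# Print the first three rows to show the rotation pattern.
for row in square[:3]:
    print(row)
```

Reading rows as chip groups and columns as spiked-in transcripts is one natural interpretation, but that mapping is an assumption of this sketch.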

6
Set A Probe Level Data (12 chips)
7
What Did We Learn?

Don't subtract or divide by MM
Probe effect is additive on log scale
Take logs

8
Why Remove Background?
9
Background Distribution
10
Average Log2(PM-BG)

Normalize probe level data
Compute BG, the background mean, by estimating the
mode of the MM distribution
Subtract BG from each PM
If PM-BG < 0, use the minimum of the positive values
divided by 2
Take the average of Log2(PM-BG) over the probe set
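
The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the histogram-based mode estimate, the `bin_width` parameter, the function names, and the intensities are all assumptions made for the example, and the probe-level normalization step is taken as already done. Values with PM-BG equal to zero are treated like negative ones here, since their logarithm is undefined.

```python
import math
from collections import Counter

def estimate_mode(values, bin_width=10.0):
    """Crude mode estimate: midpoint of the most populated histogram bin.
    (The slide does not say how the mode is estimated; this is an assumption.)"""
    counts = Counter(math.floor(v / bin_width) for v in values)
    # Ties are broken in favour of the lower bin.
    best_bin = max(counts, key=lambda b: (counts[b], -b))
    return (best_bin + 0.5) * bin_width

def avlog_pm_bg(pm, bg):
    """Average of log2(PM - BG) over one probe set."""
    diffs = [p - bg for p in pm]
    positives = [d for d in diffs if d > 0]
    if not positives:
        raise ValueError("no PM value exceeds the background estimate")
    floor = min(positives) / 2  # replacement for non-positive differences
    adjusted = [d if d > 0 else floor for d in diffs]
    return sum(math.log2(d) for d in adjusted) / len(adjusted)

# Hypothetical intensities for illustration only.
mm = [95, 102, 104, 108, 109, 131, 250, 400]
pm = [113, 121, 169, 361, 101]

bg = estimate_mode(mm)
expression = avlog_pm_bg(pm, bg)
print(bg, expression)
```

Whether the minimum of the positive differences is taken within a probe set or across the whole array is not stated on the slide; the sketch uses the values passed in.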

11
Expression after Normalization
12
Expression Level Comparison
13
Spike-In B
Probe Set   Conc 1 (pM)   Conc 2 (pM)   Rank
BioB-5            100           0.5        1
BioB-3              0.5        25.0        2
BioC-5              2.0        75.0        4
BioB-M              1.0        37.5        4
BioDn-3             1.5        50.0        5
DapX-3             35.7         3.0        6
CreX-3             50.0         5.0        7
CreX-5             12.5         2.0        8
BioC-3             25.0       100          9
DapX-5              5.0         1.5       10
DapX-M              3.0         1.0       11
Later we consider 23 different combinations of
concentrations
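
The rank column appears consistent with ordering the probe sets by the size of the true fold change between the two concentrations, regardless of direction. The sketch below checks this using the concentrations from the table; the use of absolute log2 ratios is an assumption about how the ranks were derived, not something the slide states.

```python
import math

# Concentrations (Conc 1, Conc 2) copied from the Spike-In B table.
spike_in_b = {
    "BioB-5": (100.0, 0.5),
    "BioB-3": (0.5, 25.0),
    "BioC-5": (2.0, 75.0),
    "BioB-M": (1.0, 37.5),
    "BioDn-3": (1.5, 50.0),
    "DapX-3": (35.7, 3.0),
    "CreX-3": (50.0, 5.0),
    "CreX-5": (12.5, 2.0),
    "BioC-3": (25.0, 100.0),
    "DapX-5": (5.0, 1.5),
    "DapX-M": (3.0, 1.0),
}

# Absolute log2 fold change: large values mean a large change in either direction.
abs_log_fc = {name: abs(math.log2(c1 / c2)) for name, (c1, c2) in spike_in_b.items()}

# Sort from largest to smallest change; ties keep their table order.
ordered = sorted(abs_log_fc, key=abs_log_fc.get, reverse=True)
for name in ordered:
    print(f"{name:8s} {abs_log_fc[name]:.2f}")
```

BioC-5 and BioB-M have the same fold change (37.5), which may be why both carry rank 4 in the table.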
14
Differential Expression
15
Differential Expression
16
Differential Expression
17
Differential Expression
18
Observed Ranks
Gene      AvDiff   MAS 5.0   Li & Wong   AvLog(PM-BG)
BioB-5         6         2           1              1
BioB-3        16         1           3              2
BioC-5        74         6           2              5
BioB-M        30         3           7              3
BioDn-3       44         5           6              4
DapX-3       239        24          24              7
CreX-3       333        73          36              9
CreX-5      3276        33        3128              8
BioC-3      2709      8572         681           6431
DapX-5      2709       102       12203             10
DapX-M       165        19          13              6
Top 15         1         5           6             10
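
The "Top 15" row appears to count, for each measure, how many of the 11 spiked-in probe sets received an observed rank of 15 or better. A short check using the ranks from the table (the dictionary layout and variable names are just for illustration):

```python
# Observed ranks copied from the table, one list per expression measure,
# in the same gene order as the table rows.
observed_ranks = {
    "AvDiff": [6, 16, 74, 30, 44, 239, 333, 3276, 2709, 2709, 165],
    "MAS 5.0": [2, 1, 6, 3, 5, 24, 73, 33, 8572, 102, 19],
    "Li & Wong": [1, 3, 2, 7, 6, 24, 36, 3128, 681, 12203, 13],
    "AvLog(PM-BG)": [1, 2, 5, 3, 4, 7, 9, 8, 6431, 10, 6],
}

# Count spiked-in probe sets ranked within the top 15 by each measure.
top15 = {
    measure: sum(1 for rank in ranks if rank <= 15)
    for measure, ranks in observed_ranks.items()
}
print(top15)
```

The counts agree with the last row of the table: 1, 5, 6, and 10.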
19
Observed vs True Ratio
20
Dilution Experiment

cRNA hybridized to human chip (HGU95) in a range of
proportions and dilutions
Dilution series begins at 1.25 µg cRNA per
GeneChip array, and rises through 2.5, 5.0, 7.5,
10.0, to 20.0 µg per array. 5 replicate chips
were used at each dilution
Normalize just within each set of 5 replicates
For each probe set compute expression, average
and SD over replicates, and fit a line to
log expression vs. log concentration
Regression line should have slope 1 and high R²
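
The line fit described here can be sketched with ordinary least squares on the log scale. The cRNA amounts are those listed on the slide; the expression values are hypothetical and only show the mechanics. If expression is proportional to the amount of cRNA, log expression is linear in log concentration with slope 1.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# cRNA amounts per array (micrograms), from the dilution series.
amounts_ug = [1.25, 2.5, 5.0, 7.5, 10.0, 20.0]
# Hypothetical average expression of one probe set at each dilution.
expression = [52.0, 101.0, 208.0, 296.0, 410.0, 795.0]

log_amount = [math.log2(a) for a in amounts_ug]
log_expr = [math.log2(e) for e in expression]

slope, intercept, r_squared = fit_line(log_amount, log_expr)
print(f"slope={slope:.3f}  R^2={r_squared:.4f}")
```

The choice of base 2 for the logarithm is an assumption; the slope and R² do not depend on the base as long as both axes use the same one.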

21
Dilution Experiment Data
22
Expression and SD
23
Slope Estimates and R²
24
Model check

Compute observed SD of 5 replicate expression
estimates
Compute RMS of 5 nominal SDs
Compare by taking the log ratio
Closeness of observed and nominal SD taken as a
measure of goodness of fit of the model
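
A minimal sketch of this check for a single probe set follows. The function name, the base-2 logarithm, the direction of the ratio (observed over nominal), and the numbers are all assumptions for illustration; the slide specifies only the three steps.

```python
import math
import statistics

def model_check(estimates, nominal_sds):
    """Log ratio of the observed SD of replicate estimates to the RMS nominal SD."""
    observed_sd = statistics.stdev(estimates)  # sample SD over the replicates
    rms_nominal = math.sqrt(sum(s * s for s in nominal_sds) / len(nominal_sds))
    return math.log2(observed_sd / rms_nominal)

# Hypothetical values for one probe set: 5 replicate expression estimates
# and the SD that the model reports for each of them.
estimates = [7.9, 8.1, 8.0, 8.3, 7.8]
nominal_sds = [0.18, 0.20, 0.19, 0.22, 0.17]

log_ratio = model_check(estimates, nominal_sds)
print(round(log_ratio, 3))
```

A log ratio near zero means the model's stated uncertainty matches the variation actually seen across replicates.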

25
Observed vs. Model SE
26
Observed vs. Model SE
27
Conclusion

Take logs
PMs need to be normalized
Using global background improves on use of
probe-specific MM
Gene Logic spike-in and dilution studies show that all
three expression measures performed very well
AvLog(PM-BG) is arguably the best in terms of
bias, variance and model fit
Future work: better BG; robust/resistant summaries

28
Acknowledgements

Gene Brown's group at Wyeth/Genetics Institute,
and Uwe Scherf's Genomics Research & Development
Group at Gene Logic, for generating the spike-in
and dilution data
Gene Logic for permission to use these data
Francois Collin (Gene Logic)
Ben Bolstad (UC Berkeley)
Magnus Åstrand (AstraZeneca, Mölndal)
