Bias, Variance, and Fit for Three Measures of Expression: AvDiff, Li

Transcript and Presenter's Notes

1
Bias, Variance, and Fit for Three Measures of
Expression: AvDiff, Li & Wong's, and AvLog(PM-BG)
  • Rafael A. Irizarry
  • Department of Biostatistics, JHU
  • (joint work with Bridget Hobbs and Terry Speed,
    Walter & Eliza Hall Institute of Medical Research)

2
Summary
  • Summarize the expression level of a probe set by
    Average Log2 (PM-BG)
  • PMs need to be normalized
  • Background makes no use of probe-specific MM
  • Evaluate by bias, variance and model fit, comparing
    against AvDiff and the Li & Wong algorithm
  • Use Gene Logic spike-in and dilution study
  • All three expression measures performed well
  • AvLog(PM-BG) is arguably the best of the three

3
SD vs. Avg of Defective Probes
4
Normalization at Probe Level
5
Spike-In Experiments
  • Add concentrations (0.5 pM to 100 pM) of 11 foreign
    species cRNAs to hybridization mixture
  • Set A: 11 control cRNAs were spiked in, all at
    the same concentration, which varied across
    chips.
  • Set B: 11 control cRNAs were spiked in, all at
    different concentrations, which varied across
    chips. The concentrations were arranged in a 12x12
    cyclic Latin square (with 3 replicates)
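
The cyclic Latin square layout mentioned for Set B can be illustrated with a minimal sketch. The function name, the use of integer level indices in place of actual concentrations, and the reading of rows and columns as chip groups and transcript positions are all assumptions for illustration; the slides do not give the full design.

```python
def cyclic_latin_square(levels):
    """Build an n x n cyclic Latin square from n distinct levels.

    Row r is the list of levels rotated left by r positions, so every
    level appears exactly once in each row and once in each column.
    """
    n = len(levels)
    return [[levels[(row + col) % n] for col in range(n)] for row in range(n)]


# Placeholder level indices 0..11 (hypothetical stand-ins for the
# 12 concentration levels, whose values are not listed on the slide).
square = cyclic_latin_square(list(range(12)))
for row in square[:3]:
    print(row)
```

Each row is the previous row shifted by one position, which is what makes the square cyclic.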

6
Set A Probe Level Data (12 chips)
7
What Did We Learn?
  • Don't subtract or divide by MM
  • Probe effect is additive on log scale
  • Take logs

8
Why Remove Background?
9
Background Distribution
10
Average Log2(PM-BG)
  • Normalize probe level data
  • Compute BG (background mean) by estimating the
    mode of the MM distribution
  • Subtract BG from each PM
  • If PM-BG < 0, use minimum of positives divided by
    2
  • Take log2 and average over the probe set
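
The steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the histogram-based `estimate_mode` is a crude stand-in (the slides do not say how the mode is estimated), the `bin_width` parameter is hypothetical, normalization is assumed to have been done already, and "minimum of positives" is assumed to mean the smallest positive PM-BG within the values supplied.

```python
import math
from collections import Counter


def estimate_mode(values, bin_width=10.0):
    """Crude mode estimate: midpoint of the most populated histogram bin.

    Assumption: a simple fixed-width histogram; ties go to the lower bin.
    """
    counts = Counter(math.floor(v / bin_width) for v in values)
    best_bin = max(counts, key=lambda b: (counts[b], -b))
    return (best_bin + 0.5) * bin_width


def avlog_pm_bg(pm, bg):
    """Average log2(PM - BG) over the PM intensities of one probe set.

    Non-positive differences are replaced by half the smallest positive
    difference (zero is included here so the logarithm stays defined).
    """
    diffs = [p - bg for p in pm]
    positives = [d for d in diffs if d > 0]
    if not positives:
        raise ValueError("no PM value exceeds the background estimate")
    floor = min(positives) / 2.0
    adjusted = [d if d > 0 else floor for d in diffs]
    return sum(math.log2(d) for d in adjusted) / len(adjusted)


# Hypothetical intensities, for illustration only.
mm_values = [95, 102, 104, 108, 250, 400]
pm_values = [109, 113, 137, 100]
bg = estimate_mode(mm_values)
print(bg, avlog_pm_bg(pm_values, bg))
```

A smoother density-based mode estimate would likely behave better on real MM data than the fixed-width histogram used here.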

11
Expression after Normalization
12
Expression Level Comparison
13
Spike-In B
Probe Set   Conc 1 (pM)   Conc 2 (pM)   Rank
BioB-5      100           0.5           1
BioB-3      0.5           25.0          2
BioC-5      2.0           75.0          4
BioB-M      1.0           37.5          4
BioDn-3     1.5           50.0          5
DapX-3      35.7          3.0           6
CreX-3      50.0          5.0           7
CreX-5      12.5          2.0           8
BioC-3      25.0          100           9
DapX-5      5.0           1.5           10
DapX-M      3.0           1.0           11
Later we consider 23 different combinations of
concentrations
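
The Rank column in the table above appears to order the probe sets by the size of the expected fold change between the two concentrations; that reading is an assumption, not something the slide states. Under that assumption, a short sketch that recomputes the ordering from the tabulated concentrations:

```python
import math

# Concentrations (Conc 1, Conc 2) copied from the Spike-In B table.
spike_in_b = {
    "BioB-5": (100.0, 0.5),
    "BioB-3": (0.5, 25.0),
    "BioC-5": (2.0, 75.0),
    "BioB-M": (1.0, 37.5),
    "BioDn-3": (1.5, 50.0),
    "DapX-3": (35.7, 3.0),
    "CreX-3": (50.0, 5.0),
    "CreX-5": (12.5, 2.0),
    "BioC-3": (25.0, 100.0),
    "DapX-5": (5.0, 1.5),
    "DapX-M": (3.0, 1.0),
}


def abs_log2_fold_change(conc1, conc2):
    """Magnitude of the log2 ratio between the two concentrations."""
    return abs(math.log2(conc2 / conc1))


# Sort from largest to smallest expected fold change (stable for ties).
ranked = sorted(
    spike_in_b,
    key=lambda name: abs_log2_fold_change(*spike_in_b[name]),
    reverse=True,
)
for name in ranked:
    print(name, round(abs_log2_fold_change(*spike_in_b[name]), 2))
```

BioC-5 and BioB-M have the same ratio (37.5-fold), so their relative order depends only on how ties are broken.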
14
Differential Expression
15
Differential Expression
16
Differential Expression
17
Differential Expression
18
Observed Ranks
Gene      AvDiff   MAS 5.0   Li & Wong   AvLog(PM-BG)
BioB-5    6        2         1           1
BioB-3    16       1         3           2
BioC-5    74       6         2           5
BioB-M    30       3         7           3
BioDn-3   44       5         6           4
DapX-3    239      24        24          7
CreX-3    333      73        36          9
CreX-5    3276     33        3128        8
BioC-3    2709     8572      681         6431
DapX-5    2709     102       12203       10
DapX-M    165      19        13          6
Top 15    1        5         6           10
19
Observed vs True Ratio
20
Dilution Experiment
  • cRNA hybridized to human chip (HGU95) in range of
    proportions and dilutions
  • Dilution series begins at 1.25 µg cRNA per
    GeneChip array, and rises through 2.5, 5.0, 7.5,
    10.0, to 20.0 µg per array. 5 replicate chips
    were used at each dilution
  • Normalize just within each set of 5 replicates
  • For each probe set compute expression, average
    and SD over replicates, and fit a line to
    log expression vs. log concentration
  • Regression line should have slope 1 and high R²
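
The regression check in the last two bullets can be sketched with ordinary least squares on the log-log scale. The dilution amounts are those listed on the slide; the expression values are hypothetical (chosen to be exactly proportional to concentration), and the sketch assumes the inputs are replicate-averaged expression on the raw scale, so an expression measure that is already on the log scale would skip the log step for `y`.

```python
import math


def fit_loglog(conc, expr):
    """Least-squares fit of log2(expr) on log2(conc).

    Returns (slope, intercept, r_squared). Assumes at least two distinct
    concentrations and non-constant expression values.
    """
    x = [math.log2(c) for c in conc]
    y = [math.log2(e) for e in expr]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# cRNA amounts per array, as listed on the slide.
concentrations = [1.25, 2.5, 5.0, 7.5, 10.0, 20.0]
# Hypothetical averaged expression values, exactly proportional to amount.
expression = [40.0 * c for c in concentrations]

slope, intercept, r_squared = fit_loglog(concentrations, expression)
print(slope, intercept, r_squared)
```

With perfectly proportional data the slope and R² both come out at 1 up to rounding; real probe sets would be judged by how far they fall from that ideal.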

21
Dilution Experiment Data
22
Expression and SD
23
Slope Estimates and R²
24
Model check
  • Compute observed SD of 5 replicate expression
    estimates
  • Compute RMS of 5 nominal SDs
  • Compare by taking the log ratio
  • Closeness of observed and nominal SD taken as a
    measure of goodness of fit of the model
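
The model check above can be written out as a small sketch. The function name is hypothetical, the sample standard deviation (n - 1 denominator) is an assumption, and so is the base-2 logarithm, since the slide does not specify either; the input numbers are made up for illustration.

```python
import math
import statistics


def sd_log_ratio(estimates, nominal_sds):
    """log2 of observed SD over the RMS of the model-based (nominal) SDs.

    A value near 0 means the model's SDs agree with the replicate scatter;
    positive values mean the model understates the observed variability.
    """
    observed_sd = statistics.stdev(estimates)  # sample SD of the replicates
    rms_nominal = math.sqrt(sum(s * s for s in nominal_sds) / len(nominal_sds))
    return math.log2(observed_sd / rms_nominal)


# Hypothetical expression estimates from 5 replicate chips and the
# nominal SD reported for each estimate.
replicate_estimates = [8.0, 8.5, 9.0, 9.5, 10.0]
nominal = [0.7, 0.8, 0.8, 0.9, 0.8]
print(sd_log_ratio(replicate_estimates, nominal))
```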

25
Observed vs. Model SE
26
Observed vs. Model SE
27
Conclusion
  • Take logs
  • PMs need to be normalized
  • Using global background improves on use of
    probe-specific MM
  • Gene Logic spike-in and dilution study show all
    three expression measures performed very well
  • AvLog(PM-BG) is arguably the best in terms of
    bias, variance and model fit
  • Future: better BG; robust/resistant summaries

28
Acknowledgements
  • Gene Brown's group at Wyeth/Genetics Institute,
    and Uwe Scherf's Genomics Research & Development
    Group at Gene Logic, for generating the spike-in
    and dilution data
  • Gene Logic for permission to use these data
  • Francois Collin (Gene Logic)
  • Ben Bolstad (UC Berkeley)
  • Magnus Åstrand (Astra Zeneca Mölndal)