Short Course in Biostatistics 2007, Lesson 5: Introduction to ANOVA

Provided by: lmag6
1
Short Course in Biostatistics 2007, Lesson 5
Introduction to ANOVA
2
Review
  • Many scientific questions can be viewed as
    questions about the unknown probability
    distributions of variables.
  • These distributions can often be characterized by
    the value of one or two parameters.
  • Therefore many scientific questions can be
    specified in terms of the value of parameters.

3
Review (continued)
  • Statistical methods provide a formal approach for
    doing two things:
  • 1. Determine what the data say regarding the
    value of these parameters (estimation and
    confidence intervals).
  • 2. Quantify the strength of the evidence
    against specified hypotheses about the unknown
    parameters (p-values).

4
ANOVA
  • A general set of methods to be used when there is
    a quantitative outcome and one or more
    categorical predictors.

5
One-way ANOVA
  • Often we might be interested in the distribution
    of a quantitative outcome in 3 or more groups.
  • Example research question:
  • What is the effect of various medications on the
    change in blood pressure in lupus patients with
    hypertension?
  • Quantitative outcome: change in systolic blood
    pressure (SBP)
  • Categorical predictor: type of medication

6
Parameterizing one-way ANOVA
  • Statistical methods primarily focus on the means
    of the distributions.
  • Lupus example:
  • Suppose there are three medications
    (ACE, CCB, or DI).
  • Then, if we focus on the means, there are three
    parameters:
  • $\mu_{ACE}$, $\mu_{CCB}$, $\mu_{DI}$

7
Parameterizing one-way ANOVA
  • There is scientific interest in estimating these
    means and their differences (mean differences, MD), e.g.
  • $MD_{ACE-CCB}$, $MD_{ACE-DI}$, $MD_{CCB-DI}$
  • We may also be interested in assessing hypotheses.
    Here are a few we might be interested in:
  • $H_{01}: \mu_{ACE} = \mu_{CCB}$
  • $H_{02}: \mu_{ACE} = \mu_{DI}$
  • $H_{03}: \mu_{CCB} = \mu_{DI}$
  • $H_{0G}: \mu_{ACE} = \mu_{CCB} = \mu_{DI}$ (Global Null)

8
Parameterizing one-way ANOVA
  • We also have to consider the variances of the
    distributions. These can be denoted with three
    parameters:
  • $\sigma^2_{ACE}$, $\sigma^2_{CCB}$, $\sigma^2_{DI}$

9
Additional assumptions of ANOVA
  • In ANOVA models we usually assume that the
    variances in each group are the same, i.e.
  • $\sigma^2_{ACE} = \sigma^2_{CCB} = \sigma^2_{DI}$
  • In addition, we assume that the distributions of
    interest are normal.

10
Graphical Representation of One-Way ANOVA with 3 groups
11
How do we estimate the parameters?
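  • Data: independent realizations of the random
    variable in each group.
  • $Y_{group1,i}$, $i = 1, \dots, n_{group1}$
  • $Y_{group2,i}$, $i = 1, \dots, n_{group2}$
  • $Y_{group3,i}$, $i = 1, \dots, n_{group3}$
  • Estimating the means: use the sample mean of each
    group, e.g. $\bar{Y}_{group1} = \frac{1}{n_{group1}} \sum_{i=1}^{n_{group1}} Y_{group1,i}$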

12
How do we estimate the parameters?
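  • Estimating the common variance: use the average
    squared distance from the sample means, i.e.
    $s^2 = \frac{1}{N - 3} \sum_{j=1}^{3} \sum_{i=1}^{n_{groupj}} (Y_{groupj,i} - \bar{Y}_{groupj})^2$,
    where $N = n_{group1} + n_{group2} + n_{group3}$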
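
The two estimation steps above (a sample mean per group, then one common variance) can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the data values are hypothetical, and the divisor N minus the number of groups is the usual pooled-variance convention, assumed here.

```python
from statistics import mean

# HYPOTHETICAL changes in SBP for three small groups (illustration only).
groups = {
    "ACE": [-20.0, -15.0, -25.0, -10.0],
    "CCB": [-12.0, -8.0, -16.0],
    "DI": [-22.0, -18.0, -26.0, -30.0, -14.0],
}

# The sample mean of each group estimates that group's population mean.
group_means = {name: mean(values) for name, values in groups.items()}

# Common variance: squared distances from each group's own sample mean,
# summed over all groups and divided by (total n - number of groups).
sse = sum(
    (y - group_means[name]) ** 2
    for name, values in groups.items()
    for y in values
)
n_total = sum(len(values) for values in groups.values())
pooled_variance = sse / (n_total - len(groups))

print(group_means)
print(f"pooled variance = {pooled_variance:.3f}")
```

Each observation is compared with its own group's mean, never with the overall mean, so differences between groups do not inflate the variance estimate.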

13
Confidence interval for the means
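  • Formula for a 95% confidence interval for $\mu_{Group1}$:
    $\bar{Y}_{Group1} \pm t^{*} \, s / \sqrt{n_{Group1}}$,
    where $t^{*}$ is the 97.5th percentile of the
    relevant t distribution (about 1.96 in large samples)
  • Note: $s / \sqrt{n_{Group1}}$ is referred to as the
    Standard Error of the Mean.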
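
As a rough sketch of the interval calculation, the function below computes a standard error and a 95% interval from summary statistics. The function name is hypothetical, the critical value 1.96 is the large-sample normal approximation (an assumption, a t quantile would be more exact), and the inputs are the group 1 figures from the summary table later in this deck (n = 110, matched to ACE by group size), with that group's own standard deviation used as s.

```python
import math


def mean_confidence_interval(sample_mean, s, n, critical_value=1.96):
    """Return (lower, upper, standard_error) for a group mean.

    critical_value=1.96 is the large-sample normal value for 95% coverage;
    a t quantile is more accurate for small samples.
    """
    standard_error = s / math.sqrt(n)
    margin = critical_value * standard_error
    return sample_mean - margin, sample_mean + margin, standard_error


# Group 1 summary: mean change in SBP, standard deviation, sample size.
# ASSUMPTION: the group's own standard deviation is used as s.
lower, upper, se = mean_confidence_interval(-18.3727273, 23.8319264, 110)
print(f"SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```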

14
Assessing simple pair-wise hypotheses
  • To assess
  • $H_{01}: \mu_{ACE} = \mu_{CCB}$
  • simply perform a two-sample t-test.
  • Nothing new so far!
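
For reference, an equal-variance two-sample t statistic can be computed directly from summary statistics. This is a sketch under assumptions: the function name is hypothetical, the equal-variance (pooled) form is assumed, and the inputs are the group 1 and group 3 rows of the summary table later in this deck, matched to ACE and CCB by their sample sizes (110 and 50). Turning the statistic into a p-value needs the t distribution, which is left to statistical software.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# ACE (n = 110) versus CCB (n = 50): mean and SD of the change in SBP.
t_stat, df = pooled_t_statistic(
    -18.3727273, 23.8319264, 110,
    -13.5200000, 19.6263463, 50,
)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```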

15
But how to quantify the evidence against the Global Null Hypothesis?
  • Global Null:
  • $H_{0G}: \mu_{ACE} = \mu_{CCB} = \mu_{DI}$
  • Use an F-test based on an ANOVA table
    (inference based on sums of squares).

16
Sums of Squares
  • Total Sum of Squares (SSTO):
  • Sum of squared deviations between the observed
    values and the overall sample mean,
    ignoring group membership.
  • Error Sum of Squares (SSE):
  • Sum of squared deviations between the observed
    values and the group-specific means.
  • SSTR = SSTO - SSE is the Treatment Sum of Squares.
  • The bigger the SSTR, the stronger the
    relationship between group and the outcome.
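
The three sums of squares can be computed by hand on a toy data set. The sketch below uses hypothetical values and also checks the standard identity that SSTO - SSE equals the sum over groups of n_j times the squared distance between the group mean and the overall mean.

```python
from statistics import mean

# HYPOTHETICAL changes in SBP for three small groups (illustration only).
groups = {
    "ACE": [-20.0, -15.0, -25.0, -10.0],
    "CCB": [-12.0, -8.0, -16.0],
    "DI": [-22.0, -18.0, -26.0, -30.0, -14.0],
}

all_values = [y for values in groups.values() for y in values]
overall_mean = mean(all_values)

# SSTO: deviations from the overall mean, ignoring group membership.
ssto = sum((y - overall_mean) ** 2 for y in all_values)

# SSE: deviations from each observation's own group mean.
sse = sum((y - mean(values)) ** 2 for values in groups.values() for y in values)

# SSTR: what is left over, i.e. the part explained by group membership.
sstr = ssto - sse

# Same quantity computed directly from the group means.
sstr_direct = sum(
    len(values) * (mean(values) - overall_mean) ** 2 for values in groups.values()
)

print(f"SSTO = {ssto}, SSE = {sse}, SSTR = {sstr}")
```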

17
Illustration to explain the Sums of Squares
18
Sums of Squares are summarized on an Analysis of Variance Table
  • Under the Global Null, the F-statistic has an F
    distribution.
  • Strategy: calculate F and compare it to the F
    distribution.
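
The table itself did not survive in this transcript. As a reminder, the standard one-way layout is sketched below, assuming $k$ groups and $N$ observations in total (here $k = 3$); this is the textbook form, not necessarily the exact table shown on the slide.

```latex
\[
\begin{array}{lccc}
\text{Source} & \text{SS} & \text{df} & \text{MS} \\
\hline
\text{Treatment} & SSTR & k - 1 & MSTR = SSTR / (k - 1) \\
\text{Error}     & SSE  & N - k & MSE = SSE / (N - k) \\
\text{Total}     & SSTO & N - 1 & \\
\end{array}
\qquad
F = \frac{MSTR}{MSE} \sim F_{k-1,\, N-k} \quad \text{under } H_{0G}
\]
```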

19
Quantitative Outcome, Three Groups: Example
  • Example data:
  • 110 lupus patients started on ACE inhibitors
  • 82 lupus patients started on diuretics
  • 50 lupus patients started on calcium channel
    blockers
  • All were followed for 90 days and their change in
    SBP was measured.

20
Always look at your data before using fancy methods!
  • Box plot of the changes in SBP by treatment group

[Figure: box plots for the ACE, DI, and CCB groups]
21
Sample Means and Standard Deviations
------------chsbp------------
group    N      Mean           Std Dev
1        110    -18.3727273    23.8319264
2        82     -21.2195122    24.0092700
3        50     -13.5200000    19.6263463
22
Analysis of Variance Table
P = .1801 for the global null hypothesis: not strong
evidence against it.
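
The reported p-value can be reproduced from the sample sizes, means, and standard deviations on the previous slide. The sketch below is an illustration rather than the software output shown in the talk. It relies on one mathematical fact: when the numerator has exactly 2 degrees of freedom, the upper tail of the F distribution has the closed form (1 + 2F/df_error)^(-df_error/2), so no statistics library is needed.

```python
# (n, mean, standard deviation) of the change in SBP for groups 1, 2, 3,
# copied from the "Sample Means and Standard Deviations" slide.
summary = [
    (110, -18.3727273, 23.8319264),
    (82, -21.2195122, 24.0092700),
    (50, -13.5200000, 19.6263463),
]

n_total = sum(n for n, _, _ in summary)
k = len(summary)
grand_mean = sum(n * m for n, m, _ in summary) / n_total

# Treatment and error sums of squares rebuilt from the summary statistics.
sstr = sum(n * (m - grand_mean) ** 2 for n, m, _ in summary)
sse = sum((n - 1) * sd ** 2 for n, _, sd in summary)

df_treatment = k - 1
df_error = n_total - k
f_stat = (sstr / df_treatment) / (sse / df_error)

# Closed-form upper tail of F, valid ONLY for 2 numerator degrees of freedom.
if df_treatment != 2:
    raise ValueError("closed-form p-value requires exactly 3 groups")
p_value = (1 + 2 * f_stat / df_error) ** (-df_error / 2)

print(f"F = {f_stat:.3f} on ({df_treatment}, {df_error}) df, p = {p_value:.4f}")
```

With more than three groups the closed form no longer applies, and the tail probability would come from an F-distribution routine in statistical software.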
23
Multi-way ANOVA
  • Often we might be interested in the distribution
    of a quantitative outcome in groups
    cross-classified by two or more categorical
    variables.
  • Example research question:
  • What is the effect of various medications and an
    exercise regimen on change in SBP among patients
    with hypertension?
  • Quantitative outcome: change in SBP
  • Categorical predictors: type of medication,
    exercise

24
Parameterizing Multi-Way ANOVA
  • We can parameterize Multi-Way ANOVA using a mean
    for each group. For 2-way ANOVA, these means can
    be concisely displayed on a table.
  • Example

25
Alternative Ways to Parameterize ANOVA Models
  • First, let's revisit the one-way ANOVA setting.
  • Previously, we parameterized one-way ANOVA using
  • $\mu_{ACE}$, $\mu_{CCB}$, $\mu_{DI}$

26
Alternative parameterization for one-way ANOVA
  • An alternative way to parameterize it would be
  • $\mu_{ACE} = \mu_{overall} + \alpha_{ACE}$
  • $\mu_{DI} = \mu_{overall} + \alpha_{DI}$
  • $\mu_{CCB} = \mu_{overall} + \alpha_{CCB}$
  • where
  • $\mu_{overall}$ = the overall mean, and
  • $\alpha_{ACE}$ = the effect of ACE relative to
    the other groups, etc.
  • Or more generally,
  • $\mu_j = \mu_{overall} + \alpha_j$, $j = 1, 2, 3$
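
To make the overall-mean-plus-effect form concrete, the sketch below plugs in the three sample means from the summary table as estimates. One point is an assumption rather than something the slides state: the overall mean is taken to be the unweighted average of the group means, a common convention that forces the effects to sum to zero. The group labels are matched to the table rows by sample size.

```python
# Sample means of the change in SBP (groups matched to the table by size).
group_means = {"ACE": -18.3727273, "DI": -21.2195122, "CCB": -13.5200000}

# ASSUMPTION: the overall mean is the unweighted average of the group
# means, so the effects are constrained to sum to zero.
mu_overall = sum(group_means.values()) / len(group_means)

# Each effect is the distance of a group mean from the overall mean.
effects = {name: m - mu_overall for name, m in group_means.items()}

for name, effect in effects.items():
    print(f"{name}: overall {mu_overall:.3f} + effect {effect:+.3f}")
```

Other constraints (for example, setting one group's effect to zero) give different numbers for the effects but exactly the same group means.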

27
Alternative parameterization for Multi-way ANOVA
  • Using this general approach, we might
    parameterize the two-way ANOVA described above as
    follows:
  • $\mu_{ij} = \mu_{overall} + \alpha_j + \beta_i$, $j = 1, 2, 3$
    and $i = 1, 2$
  • where
  • $\alpha_j$, $j = 1, 2, 3$, stand for the effects of ACE, DI,
    CCB
  • and
  • $\beta_i$, $i = 1, 2$, stand for the effects of
    exercise (yes/no)

28
Alternative multi-way parameterization
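  • $\mu_{ij} = \mu_{overall} + \alpha_j + \beta_i$, $j = 1, 2, 3$
    and $i = 1, 2$
  • Note that this model implicitly entails the
    assumption that the effect of exercise is the
    same for each type of medication. (No effect
    modification on the MD scale.)
  • To see this, note that for every medication $j$,
    $\mu_{1j} - \mu_{2j} = (\mu_{overall} + \alpha_j + \beta_1) - (\mu_{overall} + \alpha_j + \beta_2) = \beta_1 - \beta_2$,
    which does not depend on $j$.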

29
Alternative multi-way parameterization
  • The model again:
  • $\mu_{ij} = \mu_{overall} + \alpha_j + \beta_i$, $j = 1, 2, 3$
    and $i = 1, 2$
  • Similarly, this model entails the assumption that
    the effect of each medication is the same among
    those who do or do not exercise.
  • This is an additive model. The effects are
    simply additive.
  • This is referred to as a Main Effects model.

30
Alternative multi-way parameterization
  • We can allow for effect modification by including
    more parameters. This is usually denoted as
    follows:
  • $\mu_{ij} = \mu_{overall} + \alpha_j + \beta_i + (\alpha\beta)_{ij}$,
    $j = 1, 2, 3$ and $i = 1, 2$
  • These latter parameters, $(\alpha\beta)_{ij}$, are interaction effects.

31
How do we estimate the parameters?
  • Data: independent realizations of the random
    variable in each group.
  • For the main-effects model, we cannot estimate
    the parameters simply by calculating the sample
    means, because the sample means may not satisfy
    the assumption of no effect modification.
  • The computer finds the best estimates of the main
    effects using Least Squares Estimation.
  • The computer will also provide standard errors,
    confidence intervals, etc.
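
What the computer does can be sketched for a two-medication, two-exercise-level design. The code below is an illustration under several assumptions: the function names are hypothetical, the model is written in reference-cell form (intercept = ACE without exercise, plus a DI effect and an exercise effect) rather than the overall-mean form on the slides, the cell counts are taken from the 2-way example in this deck, and the cell means are invented. Fitting the cell means weighted by their counts gives the same coefficients as least squares on the individual subjects.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_main_effects(cells):
    """Least-squares fit of mean = b0 + b1 * is_di + b2 * exercises.

    cells: list of (is_di, exercises, n, cell_mean) tuples.
    Builds and solves the weighted normal equations (X'WX) b = X'Wy.
    """
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for is_di, exercises, n, cell_mean in cells:
        x = [1.0, float(is_di), float(exercises)]
        for i in range(3):
            xty[i] += n * x[i] * cell_mean
            for j in range(3):
                xtx[i][j] += n * x[i] * x[j]
    return solve(xtx, xty)


# Counts are from the 2-way example; the cell means are HYPOTHETICAL and
# deliberately non-additive (exercise effect -7 with ACE, -2 with DI).
cells = [
    (0, 0, 60, -15.0),  # ACE, no exercise
    (0, 1, 50, -22.0),  # ACE, exercise
    (1, 0, 43, -19.0),  # DI, no exercise
    (1, 1, 37, -21.0),  # DI, exercise
]
b0, b_di, b_exercise = fit_main_effects(cells)
fitted = [b0 + b_di * d + b_exercise * e for d, e, _, _ in cells]
print(f"intercept {b0:.3f}, DI effect {b_di:.3f}, exercise effect {b_exercise:.3f}")
print("fitted cell means:", [round(f, 3) for f in fitted])
```

Because the invented cell means are not additive, the fitted values differ from them: the single exercise effect is a compromise between the two medication-specific differences, which is why simple sample means cannot serve as main-effects estimates.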

32
Possible Hypotheses of Interest
  • There is a semi-global null hypothesis
    corresponding to each predictor.
  • For example:
  • $H_{0,treatment}: \alpha_1 = \alpha_2 = \alpha_3 = 0$
  • $H_{0,exercise}: \beta_1 = \beta_2 = 0$
  • And the true Global Null:
  • $H_{0,global}: \alpha_1 = \alpha_2 = \alpha_3 = \beta_1 = \beta_2 = 0$
  • Each can be assessed with an F test.

33
2-way ANOVA, Example
  • Example data:
  • 60 with ACE and no exercise
  • 50 with ACE and exercise
  • 43 with DI and no exercise
  • 37 with DI and exercise

34
Looking at the Data
[Figure: plot of the data for four groups: ACE, Exer; ACE, No Exer; DI, Exer; DI, No Exer]
35
Analysis of Variance Results (Main Effects Model)
36
Analysis of Variance Results (Including Interaction Terms)
37
ANOVA has lots of jargon.
  • The categorical predictors are referred to as
    Treatments or Factors.
  • The categories within each treatment are referred
    to as Levels.
  • If some subjects are observed in every
    cross-classification of all factors, we have a
    factorial design.
  • If there are the same number of subjects in each
    level of each factor, we have a balanced
    design.
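
The last two definitions can be checked mechanically from a table of cell counts. The sketch below applies them, as worded on this slide, to the counts from the 2-way example in this deck; the variable names are illustrative.

```python
from collections import defaultdict

# Cell counts from the 2-way example: (medication, exercise) -> subjects.
counts = {
    ("ACE", "no exercise"): 60,
    ("ACE", "exercise"): 50,
    ("DI", "no exercise"): 43,
    ("DI", "exercise"): 37,
}

medications = {med for med, _ in counts}
exercise_levels = {ex for _, ex in counts}

# Factorial: every combination of levels has at least one subject.
is_factorial = all(
    counts.get((med, ex), 0) > 0 for med in medications for ex in exercise_levels
)

# Balanced (as defined on this slide): every level of a given factor
# contains the same number of subjects.
per_medication = defaultdict(int)
per_exercise = defaultdict(int)
for (med, ex), n in counts.items():
    per_medication[med] += n
    per_exercise[ex] += n
is_balanced = (
    len(set(per_medication.values())) == 1
    and len(set(per_exercise.values())) == 1
)

print(f"factorial: {is_factorial}, balanced: {is_balanced}")
```

For these counts the design is factorial but not balanced, since 110 subjects received ACE and 80 received DI.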

38
Extensions and Complications.
  • Repeated Measures ANOVA
  • Incomplete Designs
  • MANOVA
  • Random Effects Models
  • Adjustments for multiple comparisons and post-hoc
    comparisons.