STATISTICAL INFERENCE PART IX

Provided by: Ozlem1

1
STATISTICAL INFERENCE PART IX
  • HYPOTHESIS TESTING - APPLICATIONS: MORE THAN TWO
    POPULATIONS

2
INFERENCES ABOUT POPULATION MEANS
  • Example
  • H0: μ1 = μ2 = μ3
  • where
  • μ1 = population mean for group 1
  • μ2 = population mean for group 2
  • μ3 = population mean for group 3
  • H1: Not all of the means are equal.

3
Assumptions
  • Each population is normally distributed (or the
    sample sizes are large enough to invoke the CLT),
    and all populations have equal variances
  • Populations are independent
  • Cases within each sample are independent
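
A minimal sketch of how the first assumption might be checked in R, using the sales figures from the Kenton Food example later in this presentation. The choice of tests (Shapiro-Wilk on the residuals, Bartlett across groups) and the object names are assumptions, not part of the original slides. With samples this small, both tests have little power, so a non-significant result is weak reassurance at best. Independence cannot be tested this way; it has to come from the study design.

```r
# Sales by package design (Kenton Food example from this presentation)
sales  <- c(11, 17, 16, 14, 15,   # design 1
            12, 10, 15, 19, 11,   # design 2
            23, 20, 18, 17,       # design 3 (one market lost)
            27, 33, 22, 26, 28)   # design 4
design <- factor(rep(1:4, times = c(5, 5, 4, 5)))

fit <- aov(sales ~ design)

# Normality: Shapiro-Wilk test on the pooled residuals
norm_check <- shapiro.test(residuals(fit))

# Equal variances: Bartlett's test across the four groups
var_check <- bartlett.test(sales ~ design)

print(norm_check$p.value)
print(var_check$p.value)
```

Small p-values would cast doubt on the corresponding assumption.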

4
INFERENCES ABOUT POPULATION MEANS - ANOVA
  • Difference in means small relative to overall
    variability → F tends to be small
  • Difference in means large relative to overall
    variability → F tends to be large
Larger F values give stronger evidence against H0.
How large is large enough? Compare the observed F
with the tabulated critical value of the F
distribution.
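
For reference, the F ratio described above can be written out as follows. The notation is an assumption, since the slides do not define it: k groups, n_i observations in group i, N observations in total, group means \bar{y}_{i\cdot} and grand mean \bar{y}_{\cdot\cdot}.

```latex
F \;=\; \frac{MSTR}{MSE}
  \;=\; \frac{\displaystyle \sum_{i=1}^{k} n_i \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2 \Big/ (k-1)}
             {\displaystyle \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_{i\cdot}\right)^2 \Big/ (N-k)},
\qquad
\text{reject } H_0 \text{ at level } \alpha \text{ if } F > F_{\alpha;\,k-1,\,N-k}.
```

The numerator measures the spread of the group means around the grand mean, and the denominator measures the variability within groups.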
5
INFERENCES ABOUT POPULATION MEANS
  • If the F test shows significant differences
    among the means, apply pairwise t-tests to see
    which one(s) differ.
  • Apply a multiple testing correction to control
    the overall Type I error rate.
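
One possible way to carry out this two-step follow-up in R is pairwise.t.test from the stats package, sketched here on the Kenton Food sales figures from the example later in this presentation. The Bonferroni correction is just one choice of adjustment and is an assumption here; the slides themselves go on to use Tukey contrasts.

```r
# Sales by package design (Kenton Food example from this presentation)
sales  <- c(11, 17, 16, 14, 15,
            12, 10, 15, 19, 11,
            23, 20, 18, 17,
            27, 33, 22, 26, 28)
design <- factor(rep(1:4, times = c(5, 5, 4, 5)))

# All pairwise t-tests with a pooled standard deviation;
# p-values are Bonferroni-adjusted for the 6 comparisons
pw <- pairwise.t.test(sales, design, p.adjust.method = "bonferroni")
print(pw)
```

The result holds a lower-triangular matrix of adjusted p-values in pw$p.value, with one entry per pair of designs.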

6
Example
  • Kenton Food Company wants to test 4 different
    package designs for a new product. The designs
    are introduced in 20 randomly selected markets
    that are similar in location and sales records.
    Because of a fire, one of these markets was
    dropped from the study, leaving an unbalanced
    design.

Example taken from Neter, J., Kutner, M.H.,
Nachtsheim, C.J., Wasserman, W. (1996) Applied
Linear Statistical Models, 4th edition, Irwin.
7
Example
                     Market (j)
Design (i)     1     2     3     4     5
    1         11    17    16    14    15
    2         12    10    15    19    11
    3         23    20    18    17
    4         27    33    22    26    28
Is there a difference among designs in terms of
their average sales?
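
If the data file is not at hand, the table above can be typed in directly. This is a sketch only: the column names follow the R output shown in this presentation, while numbering the four remaining markets of design 3 as 1 to 4 is an assumption, since the slides do not say which market was lost.

```r
# Kenton Food data entered by hand; Design is left as an integer,
# as it would be after reading the values from a plain text file
va1 <- data.frame(
  Case   = 1:19,
  Design = rep(1:4, times = c(5, 5, 4, 5)),
  Market = c(1:5, 1:5, 1:4, 1:5),   # assumed numbering for design 3
  Sales  = c(11, 17, 16, 14, 15,
             12, 10, 15, 19, 11,
             23, 20, 18, 17,
             27, 33, 22, 26, 28)
)

# Average sales per design
group_means <- tapply(va1$Sales, va1$Design, mean)
print(group_means)
#    1    2    3    4
# 14.6 13.4 19.5 27.2
```

The sample means already suggest that design 4 sells best; the ANOVA asks whether such differences exceed what chance alone would produce.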
8
Example
  > va1 <- read.table("VAT1.txt", header = TRUE)
  > head(va1, 3)
    Case Design Market Sales
  1    1      1      1    11
  2    2      1      2    17
  3    3      1      3    16
  > aov1 <- aov(Sales ~ Design, data = va1)
  > summary(aov1)
              Df Sum Sq Mean Sq F value    Pr(>F)
  Design       1 483.08  483.08  31.186 3.289e-05 ***
  Residuals   17 263.34   15.49
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The degrees of freedom are wrong! With 4 different
designs, the d.f. for Design should be 4 - 1 = 3.
R has treated Design as a numeric variable rather
than as a factor.
9
Example
  > class(va1[, 2])
  [1] "integer"
  > va1[, 2] <- as.factor(va1[, 2])
  > aov1 <- aov(Sales ~ Design, data = va1)
  > summary(aov1)
              Df Sum Sq Mean Sq F value    Pr(>F)
  Design       3 588.22 196.074  18.591 2.585e-05 ***
  Residuals   15 158.20  10.547
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

  or, alternatively:
  > aov1 <- aov(Sales ~ factor(Design), data = va1)

The 4 designs do not all have the same mean sales.
But which one(s) differ?
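
Before turning to that question, the F value in the ANOVA table can be reproduced by hand and compared with the tabulated critical value. The sketch below assumes a 5% significance level, which the slides do not state, and the variable names are illustrative only.

```r
# Sales by package design (Kenton Food example)
sales  <- c(11, 17, 16, 14, 15,
            12, 10, 15, 19, 11,
            23, 20, 18, 17,
            27, 33, 22, 26, 28)
design <- factor(rep(1:4, times = c(5, 5, 4, 5)))

k <- nlevels(design)   # number of designs
N <- length(sales)     # total number of markets

n_i         <- tapply(sales, design, length)
group_means <- tapply(sales, design, mean)

# Between-design and within-design sums of squares
SSTR <- sum(n_i * (group_means - mean(sales))^2)
SSE  <- sum((sales - ave(sales, design))^2)

F_obs  <- (SSTR / (k - 1)) / (SSE / (N - k))
F_crit <- qf(0.95, df1 = k - 1, df2 = N - k)   # assumed alpha = 0.05
p_val  <- pf(F_obs, k - 1, N - k, lower.tail = FALSE)

print(c(SSTR = SSTR, SSE = SSE, F = F_obs, F_crit = F_crit, p = p_val))
```

SSTR, SSE and F should agree with the Sum Sq and F value columns of the summary(aov1) output (588.22, 158.20 and 18.591), and F_obs lies well above the critical value.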
10
Example
  > library(multcomp)
  > c1 <- glht(aov1, linfct = mcp(Design = "Tukey"))
  > summary(c1)

    Simultaneous Tests for General Linear Hypotheses

  Multiple Comparisons of Means: Tukey Contrasts

  Fit: aov(formula = Sales ~ Design, data = va1)

  Linear Hypotheses:
             Estimate Std. Error t value Pr(>|t|)
  2 - 1 == 0   -1.200      2.054  -0.584   0.9352
  3 - 1 == 0    4.900      2.179   2.249   0.1545
  4 - 1 == 0   12.600      2.054   6.135   <0.001 ***
  3 - 2 == 0    6.100      2.179   2.800   0.0584 .
  4 - 2 == 0   13.800      2.054   6.719   <0.001 ***
  4 - 3 == 0    7.700      2.179   3.534   0.0141 *
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  (Adjusted p values reported -- single-step method)

The 4th design has significantly higher average
sales than each of the other designs.
The 3rd design is only marginally better than the
2nd (adjusted p = 0.0584: significant at the 10%
level but not at the 5% level).
11
Example
  • or, alternatively:
  > TukeyHSD(aov1, "Design", conf.level = 0.9)
  • R has many functions for multiple testing
    correction. For instance, the p.adjust function
    in the stats package offers other types of
    correction (e.g. Bonferroni). Supply raw
    p-values → obtain adjusted p-values.
  • For other ANOVA types in R (e.g. two-factor,
    repeated measures), see Ilk, O. (2011) R
    Yazilimina Giris, ODTU, Ch. 7.
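
As a brief illustration of the p.adjust workflow mentioned above, the sketch below adjusts a small vector of raw p-values. The p-values are hypothetical, made up for illustration only, and do not come from the Kenton Food example.

```r
# Hypothetical raw p-values from four separate tests
raw_p <- c(0.001, 0.012, 0.030, 0.200)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_bonf <- p.adjust(raw_p, method = "bonferroni")
print(p_bonf)
# 0.004 0.048 0.120 0.800

# Holm: a step-down version that is never more conservative than Bonferroni
p_holm <- p.adjust(raw_p, method = "holm")
print(p_holm)
# 0.004 0.036 0.060 0.200
```

Each adjusted p-value is then compared directly with the chosen significance level.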