Lecture 8: Hypothesis formulation and testing (contd.)

Provided by: stweb
Transcript and Presenter's Notes

1
Lecture 8: Hypothesis formulation and testing (contd.)
2
Test for normality: normal distribution vs. not normal distribution
3
  • Non-parametric tests
  • Distribution-free tests
  • Data are far from normal or do not follow any
    distribution pattern such as normal, linear,
    binomial, exponential etc., e.g. number of insects,
    bacterial count, disease incidence, salary of
    staff etc.
  • Few samples/replications can cause non-normality
  • Problems in measurement, e.g. GPA, which does not
    exactly measure the intelligence of students
  • Means, SD, SE and variance do not represent such
    data well

4
  • Why non-parametric tests?
  • Observations are independent of each other
  • Scale of measurement is rank
  • They have lower power than parametric tests, so
    they should be applied only when parametric tests
    are not applicable
  • These methods have recently become popular because
    data that follow no particular distribution are
    quite common

5
  • Steps in non-parametric tests - Ranking
  • Example 1
  • Female heights (cm): 193, 170, 188, 178, 183, 180, 185
  • Male heights (cm): 175, 173, 163, 168, 165
  • Methods
  • Step 1: Sort in ascending or descending order
  • Step 2: Rank the data from all the groups (this is
    the basic principle)
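
The two ranking steps can be sketched in a few lines of Python. This is only an illustration: the helper name `rank_pooled` is made up here, and descending order is chosen because it gives the female group the rank sum of 30 that the Mann-Whitney example uses for these heights.

```python
# Heights (cm) from Example 1.
female = [193, 170, 188, 178, 183, 180, 185]
male = [175, 173, 163, 168, 165]


def rank_pooled(groups, descending=True):
    """Pool all groups, sort once, and return one list of ranks per group.

    Tied values share the mean of the rank positions they occupy.
    """
    pooled = sorted((v for g in groups for v in g), reverse=descending)
    positions = {}
    for pos, value in enumerate(pooled, start=1):
        positions.setdefault(value, []).append(pos)
    mean_rank = {v: sum(p) / len(p) for v, p in positions.items()}
    return [[mean_rank[v] for v in g] for g in groups]


female_ranks, male_ranks = rank_pooled([female, male])
print(female_ranks)                         # ranks in the original data order
print(sum(female_ranks), sum(male_ranks))   # 30.0 48.0
```

The two rank sums add up to 78 = 12 × 13 / 2, which is a quick check that every observation received exactly one rank.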

6
  • Ranking
  • Example 2
  • Tied ranks
  • There are two values of 32; they occupy ranks 3 and
    4, therefore the averaged rank for each is 3.5
  • Similarly, three values of 44 occupy ranks 8, 9 and
    10, therefore they all get the mean rank, i.e. 9
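
A short sketch of tied-rank averaging follows. The data table for Example 2 is not part of this transcript, so the `scores` list is hypothetical; it is built only so that two values of 32 fall on ranks 3 and 4 and three values of 44 fall on ranks 8 to 10. The function name `average_ranks` is also an assumption.

```python
def average_ranks(values):
    """Return the ranks of the sorted values, averaging the ranks of ties."""
    ordered = sorted(values)
    positions = {}
    for pos, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in ordered]


# Hypothetical scores, not the values from the slide.
scores = [25, 28, 32, 32, 35, 38, 41, 44, 44, 44]
print(average_ranks(scores))
# [1.0, 2.0, 3.5, 3.5, 5.0, 6.0, 7.0, 9.0, 9.0, 9.0]
```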

7
  • Mann-Whitney test (U-test)
  • Two groups (k = 2), i.e. similar to the t-test but for
    non-normal distributions, i.e. non-parametric
  • $U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$
  • Where,
  • $n_1$ is the number of samples in the first group
  • $n_2$ is the number of samples in the second group
  • $R_1$ is the sum of the ranks of the first group
  • $R_2$ is the sum of the ranks of the second group
  • Here, the assumption is $n_1 > n_2$, but if $n_2 > n_1$ then
    the equation should be
  • $U = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$

8
  • Mann-Whitney test
  • Example 1
  • H0: Males = females
  • $n_1 = 7$ and $n_2 = 5$
  • $U = 7 \times 5 + \frac{7 \times 8}{2} - 30 = 33$
  • $U_{0.05,\,5,\,7} = 30$ (from table)
  • Reject H0

9
  • Mann-Whitney test
  • Example 2: Ordinal data
  • H0: Males = females
  • $n_1 = 9$ and $n_2 = 8$
  • $U = 9 \times 8 + \frac{9 \times 10}{2} - 69.5 = 47.5$
  • $U_{0.05,\,8,\,9} = 57$ (from table)
  • Do not reject (accept) H0
  • There is no significant difference between the grades
    obtained by male and female students
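
Both worked examples can be reproduced with a small Python sketch. The function names are made up for illustration. The decision rule is an assumption about the table's convention: H0 is rejected when the larger of U and its complement U' = n1 n2 - U reaches the tabulated value, which agrees with the conclusions drawn in both examples.

```python
def mann_whitney_u(n1, n2, r1):
    """Return U for group 1 and its complement U' = n1*n2 - U."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    return u, n1 * n2 - u


def reject_h0(u, u_prime, critical):
    # Assumed table convention: reject H0 when the larger of U and U'
    # reaches the tabulated critical value.
    return max(u, u_prime) >= critical


u1, u1_prime = mann_whitney_u(7, 5, 30)      # Example 1, R1 = 30
u2, u2_prime = mann_whitney_u(9, 8, 69.5)    # Example 2, R1 = 69.5

print(u1, u1_prime, reject_h0(u1, u1_prime, 30))   # 33.0 2.0 True
print(u2, u2_prime, reject_h0(u2, u2_prime, 57))   # 47.5 24.5 False
```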

10
  • Wilcoxon's test (paired samples)
  • Also called the matched-pairs or signed-rank test
    (the Wilcoxon rank-sum test is a different test, for
    two independent samples)
  • Analogous to the paired t-test (but with lower power)
  • Example: test whether the new breed of goat has
    longer hind legs than forelegs.

11
Example
H0: Hindleg = foreleg
Here, T+ = 4.5 + 4.5 + 7 + 7 + 9.5 + 7 + 9.5 + 2 = 51 and T- = 3 + 1 = 4
T 0.05, 10 = 8 (table)
The smaller sum, T- = 4, is below 8, so P < 0.05: reject H0
(hindleg is longer than foreleg)
Note: if a difference is zero, it is discarded
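
The signed-rank procedure can be sketched as follows. The leg measurements behind the slide are not in this transcript, so the `hind` and `fore` lists are hypothetical, and the function name `signed_rank_sums` is an assumption. The sketch shows only the mechanics: take differences, discard zeros, rank the absolute differences with averaged ties, and sum the ranks by sign.

```python
def signed_rank_sums(first, second):
    """Return (T_plus, T_minus, n) for paired samples."""
    # Pairwise differences; zero differences are discarded.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    # Rank the absolute differences, giving tied values their mean rank.
    ordered = sorted(abs(d) for d in diffs)
    positions = {}
    for pos, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(pos)
    rank = {v: sum(p) / len(p) for v, p in positions.items()}
    t_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    t_minus = sum(rank[abs(d)] for d in diffs if d < 0)
    return t_plus, t_minus, len(diffs)


# Hypothetical leg lengths (cm), not the data behind the slide.
hind = [142, 140, 144, 144, 142, 146]
fore = [138, 136, 147, 139, 142, 141]

t_plus, t_minus, n = signed_rank_sums(hind, fore)
test_statistic = min(t_plus, t_minus)   # compared with the tabulated T
print(t_plus, t_minus, n)               # 14.0 1.0 5
```

As a check, T+ and T- always add up to n(n + 1)/2 for the n non-zero differences.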
12
Analysis of Variance (ANOVA)
  • (Parametric test)
  • Two means are compared with a t-test; more than
    two means need ANOVA
  • H0: there are no differences among the means
  • The comparison depends on the purpose and objective,
    or on the experimental design

13
  • Comparisons of five means

[Figure: frequency (Freq.) plotted against values for five means, A, B, C, D and E]
14
  • Experimental designs
  • Completely Randomized Design (CRD)
  • Randomized Complete Block Design (RCBD)
  • Latin Square Design (LSD)
  • Factorial Design
  • One factor
  • Two factors
  • Multi-factors

15
  • Experimental designs
  • 1. Completely Randomized Design (CRD)
  • Assumptions
  • all the experimental units are considered uniform
    or identical
  • treatment allocation into experimental units is
    completely random

Experimental units
16
  • The hypothesis is tested by comparing the variation
    between treatments with the variation within
    treatments; the method is therefore called Analysis
    of Variance (ANOVA)
  • If the variation between treatments (treatment
    effect) is higher than the variation within
    treatments (i.e. random error), there is a
    significant difference
  • Model

$Y_i = \mu + T_i + R_i$
17
Separation of variation
$Y_i = \mu + T_i + R_i$, where $T_i$ = treatment effects and $R_i$ = random errors
If $T_i > R_i$, the treatment effect is significant
18
  • Also called a single-factor experiment
  • For example
  • Fertilization trials
  • - Organic, inorganic and combination
  • - 0, 40 and 60 kg N/ha/week etc.
  • Crop/vegetable/fruits varieties
  • Animal breeds
  • Drug efficacy etc.

19
  • Randomization and layout
  • Allocation of the treatments and replications is
    done by lottery or using random numbers/table
  • Determine the total number of experimental units
    (n) = t × r, e.g. to test 6 varieties with 4
    replications, you will need 24 plots
  • Assign a plot number to each plot (1 to n)
  • Assign treatments to the experimental plots by
    lottery or by using a random number table

20
  • Randomization

1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18 ... 24
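
The lottery can be imitated with a pseudo-random shuffle. The sketch below assumes 6 treatments with 4 replications each, as in the varieties example; the labels `V1` to `V6`, the function name `crd_layout` and the seed are made up for illustration.

```python
import random


def crd_layout(treatments, replications, seed=None):
    """Randomly assign each treatment to `replications` plots (CRD)."""
    rng = random.Random(seed)
    # One label per experimental unit: t x r labels in total.
    labels = [t for t in treatments for _ in range(replications)]
    rng.shuffle(labels)
    # Plot numbers run from 1 to n.
    return {plot: label for plot, label in enumerate(labels, start=1)}


varieties = ["V1", "V2", "V3", "V4", "V5", "V6"]   # hypothetical names
layout = crd_layout(varieties, replications=4, seed=1)

# Print the field plan in rows of six plots, like the grid above.
for row_start in range(1, len(layout) + 1, 6):
    print("  ".join(f"{p:>2}:{layout[p]}" for p in range(row_start, row_start + 6)))
```

Fixing the seed makes the layout reproducible; leaving it as `None` gives a fresh randomization on every run.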
21
Data analysis
  • 1. Group the data by treatments and calculate the
    treatment totals (T), the grand total (G), the grand
    mean, the coefficient of variation (c.v.) etc.
  • 2. Using the number of treatments (t) and the number
    of replications (r), determine the degrees of
    freedom (d.f.) for each source of variation
  • 3. Construct an outline/table (next slide) of the
    analysis of variance
22
ANOVA table of a CRD experiment
t = number of treatments, r = number of replicates per treatment
23
  • 4. Using $X_i$ to represent the measurement of the
    ith plot, $T_i$ as the total of the ith treatment,
    and n as the total number of experimental plots,
    i.e. n = (r)(t), calculate the correction
    factor (CF) and the various sums of squares (SS)
  • 5. Calculate the mean square (MS) for each source
    of variation by dividing each SS by its
    corresponding d.f.
  • 6. Calculate the F-value (R.A. Fisher) for
    testing the significance of the treatment difference
    (F = MST/MSE)
  • 7. Enter all the computed values in the ANOVA
    table
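
Written out, steps 4 to 6 correspond to the standard CRD formulas below. This is a sketch assuming equal replication r, so that n = rt; with unequal replication each squared treatment total is divided by its own number of replicates and the error d.f. become n - t. G is the grand total from the data-analysis step.

```latex
\begin{aligned}
CF &= \frac{G^2}{n} \\
\text{Total SS} &= \sum_{i=1}^{n} X_i^2 - CF \\
\text{Treatment SS} &= \frac{\sum_{i=1}^{t} T_i^2}{r} - CF \\
\text{Error SS} &= \text{Total SS} - \text{Treatment SS} \\
MST &= \frac{\text{Treatment SS}}{t - 1}, \qquad
MSE = \frac{\text{Error SS}}{t(r - 1)}, \qquad
F = \frac{MST}{MSE}
\end{aligned}
```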

24
  • 8. Obtain the tabular F values with f1 =
    treatment d.f. (t - 1) and f2 = error d.f.
    t(r - 1), and compare as follows

Statistical inference
25
Example: Four different feeds were tested on 20
pigs. The following were the mean final weights (kg)
of 19 pigs (1 pig died). Here, H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
26
Step 1: Calculate the sums of squares
Correction factor (C) = (Grand total)² / n = (1482.2)² / 19 = 115,627
Total SS = (60.8)² + (57.0)² + ... + (90.3)² - C = 119,982 - 115,627 = 4,355
Treatment SS = Σ (Treatment total)² / n - C
= (303.1)²/5 + (346.5)²/5 + (401.4)²/4 + (431.2)²/5 - 115,627 = 4,226
Error SS = Total SS - Treatment SS = 4,355 - 4,226 = 128
(128 comes from the unrounded sums, 4,354.8 - 4,226.3)
27
Step 2: Prepare an ANOVA table
Note: numerator d.f. = 3, denominator d.f. = 15
Reject H0, which means that treatment (feed) has an
effect on pig growth; to compare among feeds, a test
for multiple comparisons is needed
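
The ANOVA table itself did not survive in this transcript, but its entries can be recomputed from the totals quoted in Step 1. The sketch below uses only those figures (treatment totals, group sizes and the raw sum of squares 119,982). The tabular value F(0.05; 3, 15) = 3.29 is taken from a standard F table, not from the slides, and the variable names are made up.

```python
# Figures quoted in Step 1 of the pig-feed example.
treatment_totals = [303.1, 346.5, 401.4, 431.2]   # kg, one total per feed
pigs_per_feed = [5, 5, 4, 5]                      # 19 pigs in all
sum_of_squared_weights = 119_982                  # sum of X^2 over all pigs

n = sum(pigs_per_feed)
grand_total = sum(treatment_totals)
correction = grand_total ** 2 / n
total_ss = sum_of_squared_weights - correction
treatment_ss = sum(t ** 2 / k for t, k in zip(treatment_totals, pigs_per_feed)) - correction
error_ss = total_ss - treatment_ss

df_treatment = len(treatment_totals) - 1          # t - 1
df_error = n - len(treatment_totals)              # n - t with unequal replication
ms_treatment = treatment_ss / df_treatment
ms_error = error_ss / df_error
f_value = ms_treatment / ms_error

F_CRIT_05 = 3.29   # assumed: standard table value for F(0.05; 3, 15)

print(f"{'Source':<10}{'d.f.':>6}{'SS':>10}{'MS':>10}{'F':>8}")
print(f"{'Treatment':<10}{df_treatment:>6}{treatment_ss:>10.1f}{ms_treatment:>10.1f}{f_value:>8.1f}")
print(f"{'Error':<10}{df_error:>6}{error_ss:>10.1f}{ms_error:>10.2f}")
print(f"{'Total':<10}{n - 1:>6}{total_ss:>10.1f}")
print("Reject H0" if f_value > F_CRIT_05 else "Do not reject H0")
```

The computed F is far above the tabular value, which matches the decision to reject H0.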
28
  • If ANOVA shows a significant difference, we need an
    a posteriori (post hoc) test such as
  • 1. Comparison between two means, e.g. control
    versus others
  • - Student's t-test (as before)
  • 2. Multiple comparisons or pair-wise comparisons
    (compare all the possible combinations
    simultaneously, or where ranking is possible)
  • - LSD (Least Significant Difference)
  • - DMRT (Duncan's Multiple Range Test)
  • - Tukey's HSD (Tukey's Honestly Significant
    Difference test)
  • Note: If ANOVA shows no significant difference,
    then multiple range tests are not necessary

29
  • 2. Multiple comparisons or pair-wise comparisons
  • Calculate the common value for the difference using
    the pooled variance, such as
  • $SE(\bar{X}_1 - \bar{X}_2) = \sqrt{S^2 (1/N_1 + 1/N_2)}$
  • $= \sqrt{8.557\,(1/5 + 1/5)} = 1.85$ kg
  • $t_{0.05,\,15\ \text{df}} = 2.131$; 95% CI = 1.85 × 2.131 =
    3.94 kg

Reject H0 - all means are different
Results: $\mu_1 < \mu_2 < \mu_4 < \mu_3$
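
The pair-wise comparison can be sketched as below, using the treatment totals from the pig-feed example together with the pooled variance (8.557) and t value (2.131) quoted above. The labels `Feed 1` to `Feed 4` and the function name `lsd` are made up. Unlike the single common value on the slide, the sketch uses the actual group sizes, so comparisons involving the four-pig group get a slightly larger threshold.

```python
from itertools import combinations
from math import sqrt

# (treatment total in kg, number of pigs) per feed; labels are hypothetical.
totals = {
    "Feed 1": (303.1, 5),
    "Feed 2": (346.5, 5),
    "Feed 3": (401.4, 4),
    "Feed 4": (431.2, 5),
}
MSE = 8.557      # pooled variance S^2 quoted on the slide
T_CRIT = 2.131   # t(0.05, 15 d.f.) quoted on the slide

means = {feed: total / n for feed, (total, n) in totals.items()}


def lsd(n_a, n_b):
    """Least significant difference for two groups of sizes n_a and n_b."""
    return T_CRIT * sqrt(MSE * (1 / n_a + 1 / n_b))


results = []
for a, b in combinations(totals, 2):
    diff = abs(means[a] - means[b])
    threshold = lsd(totals[a][1], totals[b][1])
    results.append((a, b, diff, threshold, diff > threshold))
    print(f"{a} vs {b}: difference {diff:.2f} kg, LSD {threshold:.2f} kg, "
          f"{'significant' if diff > threshold else 'not significant'}")

print(sorted(means, key=means.get))   # feeds ordered from lowest to highest mean
```

For two groups of five pigs the threshold is the 3.94 kg worked out above.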
30
  • 2. Multiple comparisons or Post Hoc Test

[Figure: list of post hoc tests, with some marked "Widely accepted" and others "Not suggested"]
31
Homogeneous Subsets
Non-significant means are shown in the same
column.
Widely accepted
32
  • 2. Final result presentation (Tabular)
  • Table no.: Mean weights of pigs fed with 4 diets
    during the trial.

Values with the same superscripts are not
significantly different at the 0.05 level
33
  • 2. Final result presentation (Graphical)
  • Figure no.: Mean weights (kg ± 95% confidence
    intervals) of pigs fed with 4 diets during the
    trial.

[Figure: the diet means carry the significance letters d, c, b and a]
34
  • CRD ANOVA vs multiple range tests
  • Advantages
  • A high proportion of the degrees of freedom, thus it
    is suitable for smaller experiments with fewer
    experimental units.
  • It is stronger than multiple range tests,
    therefore it is done before multiple range tests
  • Disadvantages
  • If the experimental units are not homogeneous, there
    will be an increased experimental error
  • It does not compare among the means, i.e. it does
    not locate the differences

35
Some useful websites related to ANOVA
http://www.physics.csbsju.edu/stats/anova.html
http://www.psychstat.smsu.edu/introbook/sbk27.htm
  • Thank you!