Basic Experimental Design - PowerPoint PPT Presentation

About This Presentation
Title:

Basic Experimental Design


Slides: 158
Provided by: larry288
Transcript and Presenter's Notes

Title: Basic Experimental Design


1
Basic Experimental Design
  • Larry V. Hedges
  • Northwestern University
  • Prepared for the IES Summer Research Training
    Institute July 8, 2008

2
What is Experimental Design?
  • Experimental design includes both
  • Strategies for organizing data collection
  • Data analysis procedures matched to those data
    collection strategies
  • Classical treatments of design stress analysis
    procedures based on the analysis of variance
    (ANOVA)
  • Other analysis procedures such as those based on
    hierarchical linear models or analysis of
    aggregates (e.g., class or school means) are also
    appropriate

3
Why Do We Need Experimental Design?
  • Because of variability
  • We wouldn't need a science of experimental design
    if
  • All units (students, teachers, schools) were
    identical
  • and
  • All units responded identically to treatments
  • We need experimental design to control
    variability so that treatment effects can be
    identified

4
A Little History
  • The idea of controlling variability through
    design has a long history
  • In 1747, Sir James Lind's studies of scurvy
  • "Their cases were as similar as I could have
    them. They all in general had putrid gums, spots
    and lassitude, with weakness of their knees.
    They lay together in one place … and had one diet
    common to all" (Lind, 1753, p. 149)
  • Lind then assigned six different treatments to
    groups of patients

5
A Little History
  • The idea of random assignment was not obvious and
    took time to catch on
  • In 1648 van Helmont carried out one randomization
    in a trial of bloodletting for fevers
  • In 1904 Karl Pearson suggested matching and
    alternation in typhoid trials
  • Amberson, et al. (1931) carried out a trial with
    one randomization
  • In 1937 Sir Bradford Hill advocated alternation
    of patients in trials rather than randomization
  • Diehl, et al. (1938) carried out a trial that is
    sometimes referred to as randomized, but it
    actually used alternation

6
A Little History
  • The first modern randomized clinical trial in
    medicine is usually considered to be the trial of
    streptomycin for treating tuberculosis
  • It was conducted by the British Medical Research
    Council in 1946 and reported in 1948

7
A Little History
  • Experiments have been used longer in the
    behavioral sciences (e.g., psychophysics: Peirce
    and Jastrow, 1885)
  • Experiments conducted in laboratory settings were
    widely used in educational psychology (e.g.,
    McCall, 1923)
  • Thorndike (early 1900s)
  • Lindquist (1953)
  • Gage field experiments on teaching (1978–1984)

8
A Little History
  • Studies in Crop Variation I–VI (1921–1929)
  • In 1919 a statistician named Fisher was hired at
    Rothamsted agricultural station
  • They had a lot of observational data on crop
    yields and hoped a statistician could analyze it
    to find effects of various treatments
  • All he had to do was sort out the effects of
    confounding variables

9
Studies in Crop Variation I (1921)
  • Fisher does regression analyses (lots of them) to
    study (and get rid of) the effects of confounders
  • soil fertility gradients
  • drainage
  • effects of rainfall
  • effects of temperature and weather, etc.
  • Fisher does qualitative work to sort out
    anomalies
  • Conclusion
  • The effects of confounders are typically larger
    than those of the systematic effects we want to
    study

10
Studies in Crop Variation II (1923)
  • Fisher invents
  • Basic principles of experimental design
  • Control of variation by randomization
  • Analysis of variance

11
Studies in Crop Variation IV and VI
  • Studies in Crop variation IV (1927)
  • Fisher invents analysis of covariance to combine
    statistical control and control by randomization
  • Studies in crop variation VI (1929)
  • Fisher refines the theory of experimental
    design, introducing most other key concepts known
    today

12
Our Hero in 1929
13
Principles of Experimental Design
  • Experimental design controls background
    variability so that systematic effects of
    treatments can be observed
  • Three basic principles
  • Control by matching
  • Control by randomization
  • Control by statistical adjustment
  • Their importance is in that order

14
Control by Matching
  • Known sources of variation may be eliminated by
    matching
  • Eliminating genetic variation
  • Compare animals from the same litter of mice
  • Eliminating district or school effects
  • Compare students within districts or schools
  • However matching is limited
  • matching is only possible on observable
    characteristics
  • perfect matching is not always possible
  • matching inherently limits generalizability by
    removing (possibly desired) variation

15
Control by Matching
  • Matching ensures that groups compared are alike
    on specific known and observable characteristics
    (in principle, everything we have thought of)
  • Wouldn't it be great if there were a method of
    making groups alike on not only everything we
    have thought of, but everything we didn't think
    of too?
  • There is such a method

16
Control by Randomization
  • Matching controls for the effects of variation
    due to specific observable characteristics
  • Randomization controls for the effects of all
    (observable or non-observable, known or unknown)
    characteristics
  • Randomization makes groups equivalent (on
    average) on all variables (known and unknown,
    observable or not)
  • Randomization also gives us a way to assess
    whether differences after treatment are larger
    than would be expected due to chance.
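
The last point can be illustrated with a re-randomization (permutation) test. The following is a minimal sketch, not part of the original slides: the function name `permutation_p_value`, its parameters, and the example numbers are all hypothetical. It repeatedly reshuffles the group labels and counts how often a mean difference at least as large as the observed one arises by chance alone.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_perm=10000, seed=None):
    """Approximate two-sided p-value for a difference in means.

    Group labels are reshuffled n_perm times, mimicking what random
    assignment alone would produce if the treatment had no effect.
    """
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    k = len(treated)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:k]) - mean(pooled[k:]))
        # Small tolerance guards against floating point ties.
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / n_perm

# Made-up outcome scores, for illustration only.
p = permutation_p_value([10, 11, 12, 13], [1, 2, 3, 4], n_perm=2000, seed=1)
print(p)
```

A small p-value says the observed difference would be rare under random assignment alone. The validity of this reasoning rests on the assignment actually having been random.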

17
Control by Randomization
  • Random assignment is not assignment with no
    particular rule. It is a purposeful process
  • "Assignment is made at random. This does not
    mean that the experimenter writes down the names
    of the varieties in any order that occurs to him,
    but that he carries out a physical experimental
    process of randomization, using means which shall
    ensure that each variety will have an equal
    chance of being tested on any particular plot of
    ground" (Fisher, 1935, p. 51)

18
Control by Randomization
  • Random assignment of schools or classrooms is not
    assignment with no particular rule. It is a
    purposeful process
  • Assignment of schools to treatments is made at
    random. This does not mean that the experimenter
    assigns schools to treatments in any order that
    occurs to her, but that she carries out a
    physical experimental process of randomization,
    using means which shall ensure that each
    treatment will have an equal chance of being
    tested in any particular school (Hedges, 2007)

19
Control by Statistical Adjustment
  • Control by statistical adjustment is a form of
    pseudo-matching
  • It uses statistical relations to simulate
    matching
  • Statistical control is important for increasing
    precision but should not be relied upon to
    control biases that may exist prior to assignment
  • Statistical control is the weakest of the three
    experimental design principles because its
    validity depends on knowing a statistical model
    for responses

20
Using Principles of Experimental Design
  • You have to know a lot (be smart) to use matching
    and statistical control effectively
  • You do not have to be smart to use randomization
    effectively
  • But
  • Where all are possible, randomization is not as
    efficient (requires larger sample sizes for the
    same power) as matching or statistical control

21
Basic Ideas of Design: Independent Variables
(Factors)
  • The values of independent variables are called
    levels
  • Some independent variables can be manipulated,
    others can't
  • Treatments are independent variables that can be
    manipulated
  • Blocks and covariates are independent variables
    that cannot be manipulated
  • These concepts are simple, but are often confused
  • Remember
  • You can randomly assign treatment levels but not
    blocks

22
Basic Ideas of Design (Crossing)
  • Relations between independent variables
  • Factors (treatments or blocks) are crossed if
    every level of one factor occurs with every level
    of another factor
  • Example
  • The Tennessee class size experiment assigned
    students to one of three class size conditions.
    All three treatment conditions occurred within
    each of the participating schools
  • Thus treatment was crossed with schools

23
Basic Ideas of Design (Nesting)
  • Factor B is nested in factor A if every level of
    factor B occurs within only one level of factor A
  • Example
  • The Tennessee class size experiment actually
    assigned classrooms to one of three class size
    conditions. Each classroom occurred in only one
    treatment condition
  • Thus classrooms were nested within treatments
  • (But treatment was crossed with schools)
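
The definitions of crossing and nesting can be checked mechanically from a list of which factor levels occur together. This is a minimal sketch under assumed names (`is_crossed`, `is_nested`, and the toy level labels are hypothetical, not taken from the Tennessee study data).

```python
def is_crossed(pairs):
    """True if every observed level of A occurs with every observed level of B.

    pairs is an iterable of (level_of_A, level_of_B) combinations.
    """
    pairs = set(pairs)
    a_levels = {a for a, _ in pairs}
    b_levels = {b for _, b in pairs}
    return len(pairs) == len(a_levels) * len(b_levels)

def is_nested(pairs):
    """True if every level of B occurs within only one level of A."""
    owner = {}
    for a, b in set(pairs):
        # Remember the first A level seen for this B level; any other is a violation.
        if owner.setdefault(b, a) != a:
            return False
    return True

# Treatment (A) by school (B): both conditions occur in both schools.
treatment_by_school = [("small", "s1"), ("regular", "s1"),
                       ("small", "s2"), ("regular", "s2")]
# Treatment (A) by classroom (B): each classroom gets exactly one condition.
treatment_by_class = [("small", "c1"), ("regular", "c2"), ("small", "c3")]

print(is_crossed(treatment_by_school), is_nested(treatment_by_school))
print(is_crossed(treatment_by_class), is_nested(treatment_by_class))
```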

24
Where Do These Terms Come From? (Nesting)
  • An agricultural experiment where blocks are
    literally blocks or plots of land
  • Here each block is literally nested within a
    treatment condition

25
Where Do These Terms Come From? (Crossing)
  • An agricultural experiment
  • Blocks were literally blocks of land and plots of
    land within blocks were assigned different
    treatments

26
Where Do These Terms Come From? (Crossing)
  • Blocks were literally blocks of land and plots of
    land within blocks were assigned different
    treatments.
  • Here treatment literally crosses the blocks

27
Where Do These Terms Come From? (Crossing)
  • The experiment is often depicted like this. What
    is wrong with this as a field layout?
  • Consider possible sources of bias

28
Think About These Designs
  • A study assigns a reading treatment (or control)
    to children in 20 schools. Each child is
    classified into one of three groups with
    different risk of reading failure.
  • A study assigns T or C to 20 teachers. The
    teachers are in five schools, and each teacher
    teaches 4 science classes
  • Two schools in each district are picked to
    participate. Each school has two grade 4
    teachers. One of them is assigned to T, the other
    to C.

29
Three Basic Designs
  • The completely randomized design
  • Treatments are assigned to individuals
  • The randomized block design
  • Treatments are assigned to individuals within
    blocks
  • (This is sometimes called the matched design,
    because individuals are matched within blocks)
  • The hierarchical design
  • Treatments are assigned to blocks, the same
    treatment is assigned to all individuals in the
    block

30
The Completely Randomized Design
  • Individuals are randomly assigned to one of two
    treatments

31
The Randomized Block Design
32
The Hierarchical Design
33
Randomization Procedures
  • Randomization has to be done as an explicit
    process devised by the experimenter
  • Haphazard is not the same as random
  • Unknown assignment is not the same as random
  • Essentially random is technically meaningless
  • Alternation is not random, even if you alternate
    from a random start
  • This is why R.A. Fisher was so explicit about
    randomization processes

34
Randomization Procedures
  • R.A. Fisher on how to randomize an experiment
    with small sample size and 5 treatments
  • A satisfactory method is to use a pack of cards
    numbered from 1 to 100, and to arrange them in
    random order by repeated shuffling. The
    varieties treatments are numbered from 1 to 5,
    and any card such as the number 33, for example
    is deemed to correspond to variety treatment
    number 3, because on dividing by 5 this number is
    found as the remainder. (Fisher, 1935, p.51)
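
Fisher's card procedure translates directly into a shuffle followed by a remainder. The sketch below is illustrative only: the function names are hypothetical, and mapping a remainder of 0 to treatment 5 is an assumption, since the quoted passage does not say how cards divisible by 5 are handled.

```python
import random
from collections import Counter

def treatment_for_card(card, p=5):
    """Map a card number to a treatment using the remainder on division by p."""
    remainder = card % p
    # Assumption: a remainder of 0 is taken to mean treatment p.
    return remainder if remainder != 0 else p

def shuffled_treatments(n_cards=100, p=5, seed=None):
    """Shuffle cards 1..n_cards and return the treatment order they imply."""
    rng = random.Random(seed)
    deck = list(range(1, n_cards + 1))
    rng.shuffle(deck)
    return [treatment_for_card(card, p) for card in deck]

sequence = shuffled_treatments(seed=1935)
print(treatment_for_card(33))   # card 33 leaves remainder 3
print(Counter(sequence))        # each treatment appears equally often
```

Because 100 is a multiple of 5, every treatment appears exactly 20 times in the shuffled deck, so the procedure is balanced by construction.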

35
Randomization Procedures
  • You may want to use a table of random numbers,
    but be sure to pick an arbitrary start point!
  • Beware of random number generators: they typically
    depend on seed values, so be sure to vary the seed
    value (if they do not do it automatically)
  • Otherwise you can reliably generate the same
    sequence of random numbers every time
  • It is no different than starting in the same
    place in a table of random numbers
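
The seed issue is easy to demonstrate with Python's standard `random` module. This is a small sketch for illustration; the helper `first_draws` and the seed value are hypothetical.

```python
import random

def first_draws(seed, k=5):
    """Return the first k draws from a generator started at the given seed."""
    rng = random.Random(seed)
    return [rng.randint(1, 100) for _ in range(k)]

repeat_a = first_draws(seed=2008)
repeat_b = first_draws(seed=2008)
print(repeat_a == repeat_b)  # True: same seed, same sequence

# random.SystemRandom draws on the operating system's entropy source,
# so it cannot be replayed by reusing a seed.
unseeded = random.SystemRandom()
print(unseeded.randint(1, 100))
```

A fixed seed is still useful when the assignment must be auditable. In that case the seed should be chosen once, in advance, and recorded, rather than left at a software default.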

36
Randomization Procedures
  • Completely Randomized Design
  • (2 treatments, 2n individuals)
  • Make a list of all individuals
  • For each individual, pick a random number from 1
    to 2 (odd or even)
  • Assign the individual to treatment 1 if even, 2
    if odd
  • When one treatment is assigned n individuals,
    stop assigning more individuals to that treatment

37
Randomization Procedures
  • Completely Randomized Design (pn
    individuals, p treatments)
  • Make a list of all individuals
  • For each individual, pick a random number from 1
    to p
  • One way to do this is to get a random number of
    any size and divide by p; the remainder R is between
    0 and (p − 1), so add 1 to the remainder to get
    R + 1
  • Assign the individual to treatment R + 1
  • Stop assigning individuals to any treatment after
    it gets n individuals
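
The steps above can be sketched as follows. This is one possible reading of the procedure, not the author's code: the function name is hypothetical, and redrawing when a treatment is already full is an assumed interpretation of "stop assigning".

```python
import random

def completely_randomized(individuals, p, seed=None):
    """Assign each individual to one of p treatments, n per treatment."""
    rng = random.Random(seed)
    n, leftover = divmod(len(individuals), p)
    if leftover:
        raise ValueError("number of individuals must be a multiple of p")
    counts = {t: 0 for t in range(1, p + 1)}
    assignment = {}
    for person in individuals:
        while True:
            # Random number of "any size", reduced to 1..p via the remainder.
            # (The remainder is very slightly uneven unless p divides 10**6.)
            treatment = rng.randrange(10**6) % p + 1
            if counts[treatment] < n:  # skip treatments that are already full
                break
        counts[treatment] += 1
        assignment[person] = treatment
    return assignment

groups = completely_randomized(list(range(12)), p=3, seed=1)
print(groups)
```

A common alternative that yields the same balanced group sizes is to shuffle the whole list once and cut it into p consecutive groups of n.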

38
Randomization Procedures
  • Randomized Block Design with 2 Treatments
  • (m blocks per treatment, 2n individuals per
    block)
  • Make a list of all individuals in the first block
  • For each individual, pick a random number from 1
    to 2 (odd or even)
  • Assign the individual to treatment 1 if even, 2
    if odd
  • Stop assigning a treatment after it is assigned n
    individuals in the block
  • Repeat the same process with every block

39
Randomization Procedures
  • Randomized Block Design with p Treatments
  • (m blocks per treatment, pn individuals per
    block)
  • Make a list of all individuals in the first block
  • For each individual, pick a random number from 1
    to p
  • Assign the individual to the treatment corresponding
    to the number
  • Stop assigning a treatment after it is assigned n
    individuals in the block
  • Repeat the same process with every block
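
For the randomized block design the same quota-based draw is simply repeated inside each block. A minimal sketch, with a hypothetical function name and made-up school and student labels:

```python
import random

def assign_within_blocks(blocks, p, seed=None):
    """Randomize separately inside every block.

    blocks maps a block name to its list of individuals (length p * n).
    Returns a dict keyed by (block, individual) with the treatment number.
    """
    rng = random.Random(seed)
    assignment = {}
    for block, members in blocks.items():
        n, leftover = divmod(len(members), p)
        if leftover:
            raise ValueError(f"block {block!r} size is not a multiple of p")
        counts = [0] * (p + 1)  # index 0 is unused
        for person in members:
            treatment = rng.randint(1, p)
            while counts[treatment] >= n:  # that treatment is full in this block
                treatment = rng.randint(1, p)
            counts[treatment] += 1
            assignment[(block, person)] = treatment
    return assignment

schools = {"school A": ["a1", "a2", "a3", "a4"],
           "school B": ["b1", "b2", "b3", "b4"]}
print(assign_within_blocks(schools, p=2, seed=3))
```

Every block ends up with n individuals in each treatment, which is what makes treatment crossed with blocks.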

40
Randomization Procedures
  • Hierarchical Design with 2 Treatments
  • (m blocks per treatment, n individuals per
    block)
  • Make a list of all blocks
  • For each block, pick a random number from 1 to 2
  • Assign the block to treatment 1 if even,
    treatment 2 if odd
  • Stop assigning a treatment after it is assigned m
    blocks
  • Every individual in a block is assigned to the
    same treatment

41
Randomization Procedures
  • Hierarchical Design with p Treatments
  • (m blocks per treatment, n individuals per
    block)
  • Make a list of all blocks
  • For each block, pick a random number from 1 to p
  • Assign the block to treatment corresponding to
    the number
  • Stop assigning a treatment after it is assigned m
    blocks
  • Every individual in a block is assigned to the
    same treatment
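
In the hierarchical design the unit of randomization is the block, and individuals inherit their block's treatment. The sketch below assumes hypothetical names and toy data; it is not drawn from the slides.

```python
import random

def assign_clusters(blocks, p, seed=None):
    """Assign whole blocks to treatments, m blocks per treatment.

    blocks maps a block name to its list of individuals.
    Returns (block -> treatment, individual -> treatment).
    """
    rng = random.Random(seed)
    names = list(blocks)
    m, leftover = divmod(len(names), p)
    if leftover:
        raise ValueError("number of blocks must be a multiple of p")
    counts = [0] * (p + 1)  # index 0 is unused
    block_treatment = {}
    for name in names:
        treatment = rng.randint(1, p)
        while counts[treatment] >= m:  # treatment already has m blocks
            treatment = rng.randint(1, p)
        counts[treatment] += 1
        block_treatment[name] = treatment
    # Every individual receives the treatment of the block they belong to.
    individual_treatment = {person: block_treatment[name]
                            for name, members in blocks.items()
                            for person in members}
    return block_treatment, individual_treatment

schools = {"s1": ["a", "b"], "s2": ["c", "d"], "s3": ["e", "f"], "s4": ["g", "h"]}
by_school, by_student = assign_clusters(schools, p=2, seed=5)
print(by_school)
```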

42
Sampling Models
43
Sampling Models in Educational Research
  • Sampling models are often ignored in educational
    research
  • But
  • Sampling is where the randomness comes from in
    social research
  • Sampling therefore has profound consequences for
    statistical analysis and research designs

44
Sampling Models in Educational Research
  • Simple random samples are rare in field research
  • Educational populations are hierarchically
    nested
  • Students in classrooms in schools
  • Schools in districts in states
  • We usually exploit the population structure to
    sample students by first sampling schools
  • Even then, most samples are not probability
    samples, but they are intended to be
    representative (of some population)

45
Sampling Models in Educational Research
  • Survey research calls this strategy multistage
    (multilevel) clustered sampling
  • We often sample clusters (schools) first then
    individuals within clusters (students within
    schools)
  • This is a two-stage (two-level) cluster sample
  • We might sample schools, then classrooms, then
    students
  • This is a three-stage (three-level) cluster
    sample

46
Precision of Estimates Depends on the Sampling
Model
  • Suppose the total population variance is $\sigma_T^2$ and
    the ICC is $\rho$
  • Consider two samples of size $N = mn$
  • A simple random sample or stratified sample
  • The variance of the mean is $\sigma_T^2/mn$
  • A clustered sample of n students from each of m
    schools
  • The variance of the mean is $(\sigma_T^2/mn)[1 + (n - 1)\rho]$
  • The inflation factor $[1 + (n - 1)\rho]$ is called the
    design effect
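
A short numerical sketch shows how quickly the design effect grows. The function names and the example values (20 students per school, an intraclass correlation of 0.2) are illustrative assumptions, not figures from the slides.

```python
def design_effect(n, icc):
    """Two-level design effect: 1 + (n - 1) * icc."""
    return 1 + (n - 1) * icc

def variance_of_mean(total_variance, m, n, icc=0.0):
    """Variance of the mean for m clusters of n; icc=0 gives the simple random sample case."""
    return total_variance / (m * n) * design_effect(n, icc)

srs = variance_of_mean(1.0, m=50, n=20)
clustered = variance_of_mean(1.0, m=50, n=20, icc=0.2)
print(design_effect(20, 0.2))   # roughly 4.8
print(clustered / srs)          # the same inflation factor
```

With these assumed values the clustered sample of 1,000 students carries about as much information about the mean as a simple random sample of roughly 1000 / 4.8, or about 208 students.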

47
Precision of Estimates Depends on the Sampling
Model
  • Suppose the population variance is $\sigma_T^2$
  • School level ICC is $\rho_S$, class level ICC is $\rho_C$
  • Consider two samples of size $N = mpn$
  • A simple random sample or stratified sample
  • The variance of the mean is $\sigma_T^2/mpn$
  • A clustered sample of n students from p classes
    in m schools
  • The variance is $(\sigma_T^2/mpn)[1 + (pn - 1)\rho_S + (n - 1)\rho_C]$
  • The three level design effect is $[1 + (pn - 1)\rho_S + (n - 1)\rho_C]$
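
The three-level design effect can be computed the same way. In this sketch the function name and the example intraclass correlations are assumptions chosen only for illustration.

```python
def design_effect_3level(p, n, icc_school, icc_class):
    """Three-level design effect: 1 + (p*n - 1)*icc_school + (n - 1)*icc_class.

    p is the number of classes per school, n the number of students per class.
    """
    return 1 + (p * n - 1) * icc_school + (n - 1) * icc_class

# Assumed values: 2 classes of 10 students per school.
deff = design_effect_3level(p=2, n=10, icc_school=0.10, icc_class=0.05)
print(deff)  # roughly 3.35

# With one class per school and no class-level clustering, this reduces
# to the two-level design effect 1 + (n - 1) * icc_school.
print(design_effect_3level(p=1, n=10, icc_school=0.10, icc_class=0.0))
```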

48
Precision of Estimates Depends on the Sampling
Model
  • Treatment effects in experiments and
    quasi-experiments are mean differences
  • Therefore precision of treatment effects and
    statistical power will depend on the sampling
    model

49
Sampling Models in Educational Research
  • The fact that the population is structured does
    not mean the sample must be a clustered sample
  • Whether it is a clustered sample depends on
  • How the sample is drawn (e.g., are schools
    sampled first then individuals randomly within
    schools)
  • What the inferential population is (e.g., is the
    inference to these schools studied or a larger
    population of schools)

50
Sampling Models in Educational Research
  • A necessary condition for a clustered sample is
    that it is drawn in stages using population
    subdivisions
  • schools then students within schools
  • schools then classrooms then students
  • However, if all subdivisions in a population are
    present in the sample, the sample is not
    clustered, but stratified
  • Stratification has different implications than
    clustering
  • Whether there is stratification or clustering
    depends on the definition of the population to
    which we draw inferences (the inferential
    population)

51
Sampling Models in Educational Research
  • The clustered/stratified distinction matters
    because it influences the precision of statistics
    estimated from the sample
  • If all population subdivisions are included in
    every sample, there is no sampling (or
    exhaustive sampling) of subdivisions
  • therefore differences between subdivisions add no
    uncertainty to estimates
  • If only some population subdivisions are included
    in the sample, it matters which ones you happen
    to sample
  • thus differences between subdivisions add to
    uncertainty

52
Inferential Population and Inference Models
  • The inferential population or inference model has
    implications for analysis and therefore for the
    design of experiments
  • Do we make inferences to the schools in this
    sample or to a larger population of schools?
  • Inferences to the schools or classes in the
    sample are called conditional inferences
  • Inferences to a larger population of schools or
    classes are called unconditional inferences

53
Inferential Population and Inference Models
  • Note that the inferences (what we are estimating)
    are different in conditional versus unconditional
    inference models
  • In a conditional inference, we are estimating the
    mean (or treatment effect) in the observed
    schools
  • In unconditional inference we are estimating the
    mean (or treatment effect) in the population of
    schools from which the observed schools are
    sampled
  • We are still estimating a mean (or a treatment
    effect) but they are different parameters with
    different uncertainties

54
Fixed and Random Effects
  • When the levels of a factor (e.g., particular
    blocks included) in a study are sampled and the
    inference model is unconditional, that factor is
    called random and its effects are called random
    effects
  • When the levels of a factor (e.g., particular
    blocks included) in a study constitute the entire
    inference population and the inference model is
    conditional, that factor is called fixed and its
    effects are called fixed effects

55
Applications to Experimental Design
  • We will look in detail at the two most widely
    used experimental designs in education
  • Randomized blocks designs
  • Hierarchical designs

56
Experimental Designs
  • For each design we will look at
  • Structural Model for data (and what it means)
  • Two inference models
  • What does treatment effect mean in principle
  • What is the estimate of treatment effect
  • How do we deal with context effects
  • Two statistical analysis procedures
  • How do we estimate and test treatment effects
  • How do we estimate and test context effects
  • What is the sensitivity of the tests

57
The Randomized Block Design
  • The population (the sampling frame)
  • We wish to compare two treatments
  • We assign treatments within schools
  • Many schools with 2n students in each
  • Assign n students to each treatment in each school

58
The Randomized Block Design
  • The experiment
  • Compare two treatments in an experiment
  • We assign treatments within schools
  • With m schools with 2n students in each
  • Assign n students to each treatment in each school

59
The Randomized Block Design
  • Diagram of the design

60
The Randomized Block Design
  • School 1

61
The Conceptual Model
  • The statistical model for the observation on the
    kth person in the jth school in the ith treatment
    is
  • $Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{ijk}$
  • where
  • $\mu$ is the grand mean,
  • $\alpha_i$ is the average effect of being in treatment i,
  • $\beta_j$ is the average effect of being in school j,
  • $\alpha\beta_{ij}$ is the difference between the average effect
    of treatment i and the effect of that treatment
    in school j,
  • $\varepsilon_{ijk}$ is a residual

62
Effect of Context
Context Effect
63
Two-level Randomized Block Design With No
Covariates (HLM Notation)
  • Level 1 (individual level)
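  • $Y_{ijk} = \beta_{0j} + \beta_{1j}T_{ijk} + \varepsilon_{ijk}$, with $\varepsilon_{ijk} \sim N(0, \sigma_W^2)$
  • Level 2 (school level)
  • $\beta_{0j} = \pi_{00} + \xi_{0j}$, with $\xi_{0j} \sim N(0, \sigma_S^2)$
  • $\beta_{1j} = \pi_{10} + \xi_{1j}$, with $\xi_{1j} \sim N(0, \sigma_{T \times S}^2)$
  • If we code the treatment $T_{ijk} = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then
    the parameters are identical to those in standard
    ANOVA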

64
Effects and Estimates
  • The population mean of treatment 1 in school j
    is
  • $\alpha_1 + \alpha\beta_{1j}$
  • The population mean of treatment 2 in school j is
  • $\alpha_2 + \alpha\beta_{2j}$
  • The estimate of the mean of treatment 1 in school
    j is
  • $\alpha_1 + \alpha\beta_{1j} + \varepsilon_{1j\bullet}$
  • The estimate of the mean of treatment 2 in school
    j is
  • $\alpha_2 + \alpha\beta_{2j} + \varepsilon_{2j\bullet}$

65
Effects and Estimates
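  • The comparative treatment effect in any given
    school j is
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1j} - \alpha\beta_{2j})$
  • The estimate of comparative treatment effect in
    school j is
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1j} - \alpha\beta_{2j}) + (\varepsilon_{1j\bullet} - \varepsilon_{2j\bullet})$
  • The mean treatment effect in the experiment is
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})$
  • The estimate of the mean treatment effect in the
    experiment is
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet}) + (\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$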
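
In data terms, the school-level estimate is the difference between the two treatment means within a school, and the experiment-level estimate is the average of those differences across schools (with equal cell sizes). A minimal sketch with hypothetical function names and made-up scores:

```python
from statistics import mean

def school_effects(data):
    """Estimated treatment effect in each school: mean(T1) - mean(T2).

    data maps a school name to a pair (treatment-1 outcomes, treatment-2 outcomes).
    """
    return {school: mean(t1) - mean(t2) for school, (t1, t2) in data.items()}

def mean_treatment_effect(data):
    """Estimated mean treatment effect in the experiment (balanced design assumed)."""
    return mean(list(school_effects(data).values()))

# Made-up outcomes for two schools, two students per treatment in each.
scores = {"A": ([3, 5], [1, 3]),
          "B": ([6, 6], [2, 4])}
print(school_effects(scores))         # {'A': 2, 'B': 3}
print(mean_treatment_effect(scores))  # 2.5
```

The spread of the per-school effects around their mean is what the treatment-by-school interaction (the context effect) describes.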

66
Inference Models
  • Two different kinds of inferences about effects
  • Unconditional Inference (Schools Random)
  • Inference to the whole universe of schools
  • (requires a representative sample of schools)
  • Conditional Inference (Schools Fixed)
  • Inference to the schools in the experiment
  • (no sampling requirement on schools)

67
Statistical Analysis Procedures
  • Two kinds of statistical analysis procedures
  • Mixed Effects Procedures (Schools Random)
  • Treat schools in the experiment as a sample from
    a population of schools
  • (only strictly correct if schools are a sample)
  • Fixed Effects Procedures (Schools Fixed)
  • Treat schools in the experiment as a population

68
Unconditional Inference (Schools Random)
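  • The estimate of the mean treatment effect in the
    experiment is
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet}) + (\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$
  • The average treatment effect we want to estimate
    is
  • $(\alpha_1 - \alpha_2)$
  • The term $(\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$ depends on the students in
    the schools in the sample
  • The term $(\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})$ depends on the schools in
    the sample
  • Both $(\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$ and $(\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})$ are random
    and average to 0 across students and schools,
    respectively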

69
Conditional Inference (Schools Fixed)
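  • The estimate of the mean treatment effect in the
    experiment is still
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet}) + (\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$
  • Now the average treatment effect we want to
    estimate is
  • $(\alpha_1 + \alpha\beta_{1\bullet}) - (\alpha_2 + \alpha\beta_{2\bullet}) = (\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})$
  • The term $(\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$ depends on the students in
    the schools in the sample
  • The term $(\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})$ depends on the schools in
    the sample, but the treatment effect in the sample of
    schools is the effect we want to estimate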

70
Expected Mean Squares Randomized Block
Design (Two Levels, Schools Random)
71
Mixed Effects Procedures (Schools Random)
  • The test for treatment effects has
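  • $H_0$: $(\alpha_1 - \alpha_2) = 0$
  • Estimated mean treatment effect in the experiment
    is
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet}) + (\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$
  • The variance of the estimated treatment effect is
  • $2[\sigma_W^2 + n\sigma_{T \times S}^2]/mn = 2[1 + (n\omega_S - 1)\rho]\sigma^2/mn$
  • Here $\omega_S = \sigma_{T \times S}^2/\sigma_S^2$ and $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$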
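
The two forms of the variance can be checked against each other numerically. In this sketch `rho` stands for the intraclass correlation and `omega` for the ratio of treatment-by-school variance to school variance; the function names and all numerical values are assumptions for illustration. The fixed-effects variance, twice the within-school variance divided by mn, is included for comparison.

```python
import math

def var_block_mixed(m, n, var_within, var_txs):
    """Schools random: 2 * (var_within + n * var_txs) / (m * n)."""
    return 2 * (var_within + n * var_txs) / (m * n)

def var_block_mixed_icc(m, n, total_var, rho, omega):
    """Same quantity written with rho and omega: 2 * [1 + (n*omega - 1)*rho] * total_var / (m*n)."""
    return 2 * (1 + (n * omega - 1) * rho) * total_var / (m * n)

def var_block_fixed(m, n, var_within):
    """Schools fixed: 2 * var_within / (m * n)."""
    return 2 * var_within / (m * n)

# Assumed values: total variance 1, rho = 0.2, omega = 0.5, 20 schools, n = 10.
total_var, rho, omega, m, n = 1.0, 0.2, 0.5, 20, 10
var_school = rho * total_var          # 0.2
var_within = total_var - var_school   # 0.8
var_txs = omega * var_school          # 0.1

mixed = var_block_mixed(m, n, var_within, var_txs)
print(mixed, var_block_mixed_icc(m, n, total_var, rho, omega))
print(var_block_fixed(m, n, var_within))
```

Under these assumed values the mixed-effects variance is more than twice the fixed-effects variance, which is the price of generalizing to a larger population of schools.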

72
Mixed Effects Procedures
  • The test for treatment effects
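  • $F_T = MS_T/MS_{T \times S}$ with $(m - 1)$ df
  • The test for context effects (treatment by
    schools interaction) is
  • $F_{T \times S} = MS_{T \times S}/MS_{WS}$ with $2m(n - 1)$ df
  • Power is determined by the operational effect
    size
  • where $\omega_S = \sigma_{T \times S}^2/\sigma_S^2$ and $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$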

73
Expected Mean Squares Randomized Block
Design (Two Levels, Schools Fixed)
74
Fixed Effects Procedures
  • The test for treatment effects has
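  • $H_0$: $(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet}) = 0$
  • Estimated mean treatment effect in the experiment
    is
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet}) + (\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$
  • The variance of the estimated treatment effect is
  • $2\sigma_W^2/mn$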

75
Fixed Effects Procedures
  • The test for treatment effects
  • $F_T = MS_T/MS_{WS}$ with $m(n - 1)$ df
  • The test for context effects (treatment by
    schools interaction) is
  • $F_C = MS_{T \times S}/MS_{WS}$ with $2m(n - 1)$ df
  • Power is determined by the operational effect
    size
  • with $m(n - 1)$ df

76
Comparing Fixed and Mixed Effects Statistical
Procedures (Randomized Block Design)
77
Comparing Fixed and Mixed Effects
Procedures (Randomized Block Design)
  • Conditional and unconditional inference models
  • estimate different treatment effects
  • have different contaminating factors that add
    uncertainty
  • Mixed procedures are good for unconditional
    inference
  • The fixed procedures are good for conditional
    inference
  • The fixed procedures have higher power

78
The Hierarchical Design
  • The universe (the sampling frame)
  • We wish to compare two treatments
  • We assign treatments to whole schools
  • Many schools with n students in each
  • Assign all students in each school to the same
    treatment

79
The Hierarchical Design
  • The experiment
  • We wish to compare two treatments
  • We assign treatments to whole schools
  • With 2m schools with n students in each
  • Assign all students in each school to the same
    treatment

80
The Hierarchical Design
  • Diagram of the experiment

81
The Hierarchical Design
  • Treatment 1 schools

82
The Hierarchical Design
  • Treatment 2 schools

83
The Conceptual Model
  • The statistical model for the observation on the
    kth person in the jth school in the ith treatment
    is
  • $Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{jk(i)} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{jk(i)}$
  • $\mu$ is the grand mean,
  • $\alpha_i$ is the average effect of being in treatment i,
  • $\beta_j$ is the average effect of being in school j,
  • $\alpha\beta_{ij}$ is the difference between the average effect
    of treatment i and the effect of that treatment
    in school j,
  • $\varepsilon_{jk(i)}$ is a residual
  • Or $\beta_{j(i)} = \beta_j + \alpha\beta_{ij}$ is a term for the combined
    effect of schools within treatments

84
The Conceptual Model
  • The statistical model for the observation on the
    kth person in the jth school in the ith treatment
    is
  • $Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{jk(i)} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{jk(i)}$
  • $\mu$ is the grand mean,
  • $\alpha_i$ is the average effect of being in treatment i,
  • $\beta_j$ is the average effect of being in school j,
  • $\alpha\beta_{ij}$ is the difference between the average effect
    of treatment i and the effect of that treatment
    in school j,
  • $\varepsilon_{jk(i)}$ is a residual
  • or $\beta_{j(i)} = \beta_j + \alpha\beta_{ij}$ is a term for the combined
    effect of schools within treatments

Context Effects
85
Two-level Hierarchical Design With No Covariates
(HLM Notation)
  • Level 1 (individual level)
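  • $Y_{ijk} = \beta_{0j} + \varepsilon_{ijk}$, with $\varepsilon_{ijk} \sim N(0, \sigma_W^2)$
  • Level 2 (school level)
  • $\beta_{0j} = \pi_{00} + \pi_{01}T_j + \xi_{0j}$, with $\xi_{0j} \sim N(0, \sigma_S^2)$
  • If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then
  • $\pi_{00} = \mu$, $\pi_{01} = \alpha_1$, $\xi_{0j} = \beta_{j(i)}$
  • The intraclass correlation is $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$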

86
Effects and Estimates
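  • The comparative treatment effect in any given
    school j is still
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1j} - \alpha\beta_{2j})$
  • But we cannot estimate the treatment effect in a
    single school because each school gets only one
    treatment
  • The mean treatment effect in the experiment is
  • $(\alpha_1 - \alpha_2) + (\beta_{\bullet(1)} - \beta_{\bullet(2)})$
  • $= (\alpha_1 - \alpha_2) + (\beta_{1\bullet} - \beta_{2\bullet}) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})$
  • The estimate of the mean treatment effect in the
    experiment is
  • $(\alpha_1 - \alpha_2) + (\beta_{\bullet(1)} - \beta_{\bullet(2)}) + (\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$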

87
Inference Models
  • Two different kinds of inferences about effects
    (as in the randomized block design)
  • Unconditional Inference (schools random)
  • Inference to the whole universe of schools
  • (requires a representative sample of schools)
  • Conditional Inference (schools fixed)
  • Inference to the schools in the experiment
  • (no sampling requirement on schools)

88
Unconditional Inference (Schools Random)
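  • The average treatment effect we want to estimate
    is
  • $(\alpha_1 - \alpha_2)$
  • The term $(\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$ depends on the students in
    the schools in the sample
  • The term $(\beta_{\bullet(1)} - \beta_{\bullet(2)})$ depends on the schools
    in the sample
  • Both $(\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$ and $(\beta_{\bullet(1)} - \beta_{\bullet(2)})$ are random
    and average to 0 across students and schools,
    respectively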

89
Conditional Inference (Schools Fixed)
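  • The average treatment effect we want to (can)
    estimate is
  • $(\alpha_1 + \beta_{\bullet(1)}) - (\alpha_2 + \beta_{\bullet(2)}) = (\alpha_1 - \alpha_2) + (\beta_{\bullet(1)} - \beta_{\bullet(2)})$
  • $= (\alpha_1 - \alpha_2) + (\beta_{1\bullet} - \beta_{2\bullet}) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})$
  • The term $(\beta_{\bullet(1)} - \beta_{\bullet(2)})$ depends on the schools
    in the sample, but we want to estimate the effect of
    treatment in the schools in the sample
  • Note that this treatment effect is not quite the
    same as in the randomized block design, where we
    estimate
  • $(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})$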

90
Statistical Analysis Procedures
  • Two kinds of statistical analysis procedures
    (as in the randomized block design)
  • Mixed Effects Procedures
  • Treat schools in the experiment as a sample from
    a universe
  • Fixed Effects Procedures
  • Treat schools in the experiment as a universe

91
Expected Mean Squares Hierarchical Design (Two
Levels, Schools Random)
92
Mixed Effects Procedures (Schools Random)
  • The test for treatment effects has
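  • $H_0$: $(\alpha_1 - \alpha_2) = 0$
  • Estimated mean treatment effect in the experiment
    is
  • $(\alpha_1 - \alpha_2) + (\beta_{\bullet(1)} - \beta_{\bullet(2)}) + (\varepsilon_{1\bullet\bullet} - \varepsilon_{2\bullet\bullet})$
  • The variance of the estimated treatment effect is
  • $2[\sigma_W^2 + n\sigma_S^2]/mn = 2[1 + (n - 1)\rho]\sigma^2/mn$
  • where $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$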
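
A short numerical check of this variance, using assumed values (total variance 1, an intraclass correlation `rho` of 0.2, m = 20 schools per treatment, n = 10 students per school); the function names are hypothetical.

```python
import math

def var_hier_mixed(m, n, var_within, var_school):
    """Schools random: 2 * (var_within + n * var_school) / (m * n)."""
    return 2 * (var_within + n * var_school) / (m * n)

def var_hier_mixed_icc(m, n, total_var, rho):
    """Same quantity written with the ICC: 2 * [1 + (n - 1) * rho] * total_var / (m * n)."""
    return 2 * (1 + (n - 1) * rho) * total_var / (m * n)

total_var, rho, m, n = 1.0, 0.2, 20, 10
var_school = rho * total_var          # 0.2
var_within = total_var - var_school   # 0.8

hier = var_hier_mixed(m, n, var_within, var_school)
print(hier, var_hier_mixed_icc(m, n, total_var, rho))

# For comparison, ignoring clustering altogether would give 2 * total_var / (m * n).
naive = 2 * total_var / (m * n)
print(hier / naive)  # equals the design effect 1 + (n - 1) * rho
```

The ratio in the last line is the two-level design effect, so the loss of precision from assigning whole schools grows with both the number of students per school and the intraclass correlation.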

93
Mixed Effects Procedures (Schools Random)
  • The test for treatment effects
  • FT MST/MSBS with (m 2) df
  • There is no omnibus test for context effects
  • Power is determined by the operational effect
    size
  • where $\rho = \sigma_S^2/(\sigma_S^2 + \sigma_W^2) = \sigma_S^2/\sigma^2$

94
Expected Mean Squares Hierarchical Design (Two
Levels, Schools Fixed)
95
Fixed Effects Procedures (Schools Fixed)
  • The test for treatment effects has
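  • $H_0: (\alpha_1 - \alpha_2) + (\bar{\beta}_{\bullet(1)} - \bar{\beta}_{\bullet(2)}) = 0$
  • Note that the school effects are confounded with
    treatment effects
  • Estimated mean treatment effect in the experiment
    is
  • $(\alpha_1 - \alpha_2) + (\bar{\beta}_{\bullet(1)} - \bar{\beta}_{\bullet(2)}) + (\bar{\varepsilon}_{1\bullet\bullet} - \bar{\varepsilon}_{2\bullet\bullet})$
  • The variance of the estimated treatment effect is
  • $2\sigma_W^2/mn$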

96
Fixed Effects Procedures (Schools Fixed)
  • The test for treatment effects
  • $F_T = MS_T/MS_{WS}$ with $m(n - 1)$ df
  • There is no omnibus test for context effects,
    because each school gets only one treatment
  • Power is determined by the operational effect
    size
  • and $m(n - 1)$ df

97
Comparing Fixed and Mixed Effects
Procedures (Hierarchical Design)
98
Comparing Fixed and Mixed Effects Statistical
Procedures (Hierarchical Design)
  • Conditional and unconditional inference models
  • estimate different treatment effects
  • have different contaminating factors that add
    uncertainty
  • Mixed procedures are good for unconditional
    inference
  • The fixed procedures are not generally
    recommended
  • The fixed procedures have higher power
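
The higher power of the fixed procedures follows from the two variance expressions given earlier: the schools-fixed variance omits the between-school component. A minimal sketch of that comparison, assuming m schools per treatment and n students per school; the function names and values are hypothetical.

```python
def variance_fixed(m, n, sigma_w2):
    """Variance of the estimated treatment effect when schools are treated as fixed."""
    return 2.0 * sigma_w2 / (m * n)


def variance_mixed(m, n, sigma_w2, sigma_s2):
    """Variance of the estimated treatment effect when schools are treated as random."""
    return 2.0 * (sigma_w2 + n * sigma_s2) / (m * n)


def variance_ratio(n, sigma_w2, sigma_s2):
    """Ratio mixed / fixed, which simplifies to 1 + n * sigma_s2 / sigma_w2."""
    return 1.0 + n * sigma_s2 / sigma_w2


# Hypothetical values, for illustration only.
m, n, sigma_w2, sigma_s2 = 10, 20, 0.8, 0.2
v_fixed = variance_fixed(m, n, sigma_w2)
v_mixed = variance_mixed(m, n, sigma_w2, sigma_s2)
print(v_fixed, v_mixed, v_mixed / v_fixed, variance_ratio(n, sigma_w2, sigma_s2))
```

The smaller variance comes at a price: the fixed analysis supports inference only to the schools in the experiment, and its treatment effect is confounded with school effects.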

99
Comparing Hierarchical Designs to Randomized
Block Designs
  • Randomized block designs usually have higher
    power, but assignment of different treatments
    within schools or classes may be
  • practically difficult
  • politically infeasible
  • theoretically impossible
  • It may be methodologically unwise because of
    potential for
  • Contamination or diffusion of treatments
  • compensatory rivalry or demoralization

100
Applications to Experimental Design
  • We will address the two most widely used
    experimental designs in education
  • Randomized block designs with 2 levels
  • Randomized block designs with 3 levels
  • Hierarchical designs with 2 levels
  • Hierarchical designs with 3 levels
  • We also examine the effect of covariates
  • Hereafter, we generally take schools to be random

101
Complications
  • Which matchings do we have to take into account
    in design (e.g., schools, districts, regions,
    states, regions of the country, country)?
  • Ignore some, control for effects of others as
    fixed blocking factors
  • Justify this as part of the population definition
  • For example, we define the inference population
    as these five districts within these two states
  • But, doing so obviously constrains
    generalizability

102
Precision of the Estimated Treatment Effect
  • Precision is the standard error of the estimated
    treatment effect
  • Precision in simple (simple random sample)
    designs depends on
  • Standard deviation in the population $\sigma$
  • Total sample size N
  • The precision is

103
Precision of the Estimated Treatment Effect
  • Precision in complex (clustered sample) designs
    depends on
  • The (total) standard deviation $\sigma_T$
  • Sample size at each level of sampling
  • (e.g., m clusters, n individuals per cluster)
  • Intraclass correlation structure
  • It is a little harder to compute than in simple
    designs, but important because it helps you see
    what matters in design

104
Intraclass Correlations in Two-level Designs
  • In two-level designs the intraclass correlation
    structure is determined by a single intraclass
    correlation
  • This intraclass correlation is the proportion of
    the total variance that is between schools
    (clusters)

105
Precision in Two-level Hierarchical Design With
No Covariates
  • The standard error of the treatment effect is
  • SE decreases as m (number of schools) increases
  • SE decreases as n increases, but only up to a point
  • SE increases as $\rho$ increases
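
The standard error formula itself did not survive in this transcript. The sketch below assumes it is the square root of the schools-random variance given earlier, 2[1 + (n - 1)rho]sigma^2/mn, with m schools per treatment and n students per school. Under that assumption the three bullets can be seen directly; the function names are hypothetical.

```python
import math


def se_treatment_effect(m, n, rho, sigma=1.0):
    """Assumed SE: sqrt(2 * [1 + (n - 1) * rho] * sigma^2 / (m * n))."""
    return sigma * math.sqrt(2.0 * (1.0 + (n - 1) * rho) / (m * n))


def se_floor(m, rho, sigma=1.0):
    """Limit of the assumed SE as n grows without bound: sigma * sqrt(2 * rho / m)."""
    return sigma * math.sqrt(2.0 * rho / m)


# Hypothetical design: 20 schools per treatment, intraclass correlation 0.2.
for n in (5, 20, 80, 320):
    print(n, round(se_treatment_effect(20, n, 0.2), 4))
print("floor", round(se_floor(20, 0.2), 4))
```

Adding students per school moves the SE toward the floor but never below it, whereas adding schools lowers the floor itself.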

106
Statistical Power
  • Power in simple (simple random sample) designs
    depends on
  • Significance level
  • Effect size
  • Sample size
  • Look power up in a table for sample size and
    effect size

107
Fragment of Cohen's Table 2.3.5
108
Computing Statistical Power
  • Power in complex (clustered sample) designs
    depends on
  • Significance level
  • Effect size d
  • Sample size at each level of sampling
  • (e.g., m clusters, n individuals per cluster)
  • Intraclass correlation structure
  • This makes it seem a lot harder to compute

109
Computing Statistical Power
  • Computing statistical power in complex designs is
    only a little harder than computing it for simple
    designs
  • Compute operational effect size (incorporates
    sample design information) $\Delta_T$
  • Look power up in a table for operational sample
    size and operational effect size
  • This is the same table that you use for simple
    designs

110
Power in Two-level Hierarchical Design With No
Covariates
  • Basic Idea
  • Operational Effect Size = (Effect Size) x (Design
    Effect)
  • $\Delta_T = d \times (\text{Design Effect})$
  • For the two-level hierarchical design with no
    covariates
  • Operational sample size is number of schools
    (clusters)

111
Power in Two-level Hierarchical Design With No
Covariates
  • As m (number of schools) increases, power
    increases
  • As effect size increases, power increases
  • Other influences occur through the design effect
  • As $\rho$ increases the design effect (and power)
    decreases
  • No matter how large n gets the maximum design
    effect is
  • Thus power only increases up to some limit as n
    increases
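
The design-effect formula and its maximum are missing from this transcript. The sketch below is one reconstruction under stated assumptions, not necessarily the slide's own expression: it takes the schools-random variance given earlier, treats d as the treatment difference in total standard deviation units, and uses a normal approximation in place of the power table. Under those assumptions the design effect is sqrt(n / (1 + (n - 1)rho)), which approaches 1/sqrt(rho) as n grows. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def design_effect(n, rho):
    """Assumed design effect for the two-level hierarchical design, no covariates."""
    return math.sqrt(n / (1.0 + (n - 1) * rho))


def operational_effect_size(d, n, rho):
    """Operational effect size = (effect size) x (design effect)."""
    return d * design_effect(n, rho)


def approx_power(d, m, n, rho, alpha=0.05):
    """Two-sided power by normal approximation; m schools per treatment.

    The approximation ignores the limited degrees of freedom of the F test,
    so it is somewhat optimistic when m is small.
    """
    z = NormalDist()
    ncp = operational_effect_size(d, n, rho) * math.sqrt(m / 2.0)
    crit = z.inv_cdf(1.0 - alpha / 2.0)
    return z.cdf(ncp - crit) + z.cdf(-ncp - crit)


# Hypothetical planning values: d = 0.4, rho = 0.2, 25 students per school.
for m in (10, 20, 40):
    print(m, round(approx_power(0.4, m, 25, 0.2), 3))
print("max design effect", round(1.0 / math.sqrt(0.2), 3))
```

Because the design effect is bounded, very large n per school cannot compensate for a small number of schools; m enters through the operational sample size instead.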

112
Two-level Hierarchical Design With Covariates
(HLM Notation)
  • Level 1 (individual level)
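  • $Y_{ijk} = \beta_{0j} + \beta_{1j}X_{ijk} + \varepsilon_{ijk}$, $\varepsilon \sim N(0, \sigma_{AW}^2)$
  • Level 2 (school Level)
  • $\beta_{0j} = \pi_{00} + \pi_{01}T_j + \pi_{02}W_j + \xi_{0j}$, $\xi \sim N(0, \sigma_{AS}^2)$
  • $\beta_{1j} = \pi_{10}$
  • Note that the covariate effect $\beta_{1j} = \pi_{10}$ is a
    fixed effect
  • If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then the
    parameters are identical to those in standard
    ANCOVA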
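
To make the two-level model concrete, data can be generated from it. A minimal sketch, assuming normally distributed covariates and a balanced design with m schools per treatment; the parameter values, the function name, and the record layout are all hypothetical.

```python
import random


def simulate_two_level(m, n, pi00=0.0, pi01=0.5, pi02=0.3, pi10=0.4,
                       sd_school=0.5, sd_within=1.0, seed=0):
    """Simulate one data set from the two-level model with covariates.

    m schools per treatment, n students per school. Treatment is assigned
    at the school level and coded +1/2 or -1/2.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(2 * m):
        t = 0.5 if j < m else -0.5          # school-level treatment code
        w = rng.gauss(0.0, 1.0)             # school-level covariate
        xi = rng.gauss(0.0, sd_school)      # school random effect
        beta0 = pi00 + pi01 * t + pi02 * w + xi
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)         # student-level covariate
            e = rng.gauss(0.0, sd_within)   # student residual
            rows.append({"school": j, "T": t, "W": w, "X": x,
                         "Y": beta0 + pi10 * x + e})
    return rows


data = simulate_two_level(m=3, n=4)
print(len(data), data[0])
```

The covariate slope is the same in every school, which mirrors the statement that the covariate effect is a fixed effect.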

113
Precision in Two-level Hierarchical Design With
Covariates
  • The standard error of the treatment effect
  • SE decreases as m increases
  • SE decreases as n increases, but only up to a point
  • SE increases as $\rho$ increases
  • SE decreases as $R_W^2$ and $R_S^2$ increase

114
Power in Two-level Hierarchical Design With
Covariates
  • Basic Idea
  • Operational Effect Size = (Effect Size) x (Design
    Effect)
  • $\Delta_T = d \times (\text{Design Effect})$
  • For the two-level hierarchical design with
    covariates
  • The covariates increase the design effect

115
Power in Two-level Hierarchical Design With
Covariates
  • As m and effect size increase, power increases
  • Other influences occur through the design effect
  • As $\rho$ increases the design effect (and power)
    decreases
  • Now the maximum design effect as n gets large
    is
  • As the covariate-outcome correlations $R_W^2$ and $R_S^2$
    increase the design effect (and power) increases

116
Three-level Hierarchical Design
  • Here there are three factors
  • Treatment
  • Schools (clusters) nested in treatments
  • Classes (subclusters) nested in schools
  • Suppose there are
  • m schools (clusters) per treatment
  • p classes (subclusters) per school (cluster)
  • n students (individuals) per class (subcluster)

117
Three-level Hierarchical Design With No Covariates
  • The statistical model for the observation on the
    lth person in the kth class in the jth school in
    the ith treatment is
  • $Y_{ijkl} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \varepsilon_{ijkl}$
  • where
  • $\mu$ is the grand mean,
  • $\alpha_i$ is the average effect of being in treatment i,
  • $\beta_{j(i)}$ is the average effect of being in school j,
    in treatment i
  • $\gamma_{k(ij)}$ is the average effect of being in class k
    in treatment i, in school j,
  • $\varepsilon_{ijkl}$ is a residual

118
Three-level Hierarchical Design With No
Covariates (HLM Notation)
  • Level 1 (individual level)
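  • $Y_{ijkl} = \beta_{0jk} + \varepsilon_{ijkl}$, $\varepsilon \sim N(0, \sigma_W^2)$
  • Level 2 (classroom level)
  • $\beta_{0jk} = \gamma_{0j} + \eta_{0jk}$, $\eta \sim N(0, \sigma_C^2)$
  • Level 3 (school Level)
  • $\gamma_{0j} = \pi_{00} + \pi_{01}T_j + \xi_{0j}$, $\xi \sim N(0, \sigma_S^2)$
  • If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then
  • $\pi_{00} = \mu$, $\pi_{01} = \alpha_1$, $\xi_{0j} = \beta_{j(i)}$, $\eta_{0jk} = \gamma_{k(ij)}$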

119
Three-level Hierarchical Design Intraclass
Correlations
  • In three-level designs there are two levels of
    clustering and two intraclass correlations
  • At the school (cluster) level
  • At the classroom (subcluster) level

120
Precision in Three-level Hierarchical Design With
No Covariates
  • The standard error of the treatment effect
  • SE decreases as m increases
  • SE decreases as p and n increase, but only up to
    a point
  • SE increases as $\rho_S$ and $\rho_C$ increase

121
Power in Three-level Hierarchical Design With No
Covariates
  • Basic Idea
  • Operational Effect Size = (Effect Size) x (Design
    Effect)
  • $\Delta_T = d \times (\text{Design Effect})$
  • For the three-level hierarchical design with no
    covariates
  • The operational sample size is the number of
    schools

122
Power in Three-level Hierarchical Design With No
Covariates
  • As m and the effect size increase, power
    increases
  • Other influences occur through the design effect
  • As $\rho_S$ or $\rho_C$ increases the design effect decreases
  • No matter how large n gets the maximum design
    effect is
  • Thus power only increases up to some limit as n
    increases

123
Three-level Hierarchical Design With Covariates
(HLM Notation)
  • Level 1 (individual level)
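  • $Y_{ijkl} = \beta_{0jk} + \beta_{1jk}X_{ijkl} + \varepsilon_{ijkl}$, $\varepsilon \sim N(0, \sigma_{AW}^2)$
  • Level 2 (classroom level)
  • $\beta_{0jk} = \gamma_{00j} + \gamma_{01j}Z_{jk} + \eta_{0jk}$, $\eta \sim N(0, \sigma_{AC}^2)$
  • $\beta_{1jk} = \gamma_{10j}$
  • Level 3 (school Level)
  • $\gamma_{00j} = \pi_{00} + \pi_{01}T_j + \pi_{02}W_j + \xi_{0j}$, $\xi \sim N(0, \sigma_{AS}^2)$
  • $\gamma_{01j} = \pi_{01}$
  • $\gamma_{10j} = \pi_{10}$
  • The covariate effects $\beta_{1jk} = \gamma_{10j} = \pi_{10}$ and $\gamma_{01j} = \pi_{01}$
    are fixed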

124
Precision in Three-level Hierarchical Design With
Covariates
  • SE decreases as m increases
  • SE decreases as p and n increase, but only up to
    a point
  • SE increases as $\rho$ increases
  • SE decreases as $R_W^2$, $R_C^2$, and $R_S^2$ increase

125
Power in Three-level Hierarchical Design With
Covariates
  • Basic Idea
  • Operational Effect Size = (Effect Size) x (Design
    Effect)
  • $\Delta_T = d \times (\text{Design Effect})$
  • For the three-level hierarchical design with
    covariates
  • The operational sample size is the number of
    schools

126
Power in Three-level Hierarchical Design With
Covariates
  • As m and the effect size increase, power
    increases
  • Other influences occur through the design effect
  • As $\rho_S$ or $\rho_C$ increases the design effect decreases
  • No matter how large n gets the maximum design
    effect is
  • Thus power only increases up to some limit as n
    increases

127
Randomized Block Designs

128
Two-level Randomized Block Design With No
Covariates (HLM Notation)
  • Level 1 (individual level)
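  • $Y_{ijk} = \beta_{0j} + \beta_{1j}T_{ijk} + \varepsilon_{ijk}$, $\varepsilon \sim N(0, \sigma_W^2)$
  • Level 2 (school Level)
  • $\beta_{0j} = \pi_{00} + \xi_{0j}$, $\xi_{0j} \sim N(0, \sigma_S^2)$
  • $\beta_{1j} = \pi_{10} + \xi_{1j}$, $\xi_{1j} \sim N(0, \sigma_{T \times S}^2)$
  • If we code the treatment $T_{ijk} = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then
    the parameters are identical to those in standard
    ANOVA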

129
Randomized Block Designs
  • In randomized block designs, as in hierarchical
    designs, the intraclass correlation has an impact
    on precision and power
  • However, in randomized block designs
    there is also a parameter reflecting the degree
    of heterogeneity of treatment effects across
    schools
  • We define this heterogeneity parameter $\omega_S$ in
    terms of the amount of heterogeneity of treatment
    effects relative to the heterogeneity of school
    means
  • Thus
  • $\omega_S = \sigma_{T \times S}^2/\sigma_S^2$

130
Precision in Two-level Randomized Block
Design With No Covariates
  • The standard error of the treatment effect
  • SE decreases as m (number of schools) increases
  • SE decreases as n and p increase, but only up to
    a point
  • SE increases as $\rho$ increases
  • SE increases as $\omega_S = \sigma_{T \times S}^2/\sigma_S^2$ increases

131
Power in Two-level Randomized Block Design With
No Covariates
  • Basic Idea
  • Operational Effect Size = (Effect Size) x (Design
    Effect)
  • $\Delta_T = d \times (\text{Design Effect})$
  • For the two-level randomized block design with no
    covariates
  • Operational sample size is number of schools
    (clusters)

132
Precision in Two-level Randomized Block
Design With Covariates
  • The standard error of the treatment effect
  • SE decreases as m increases
  • SE decreases as n increases, but only up to a point
  • SE increases as $\rho$ increases
  • SE increases as $\omega_S = \sigma_{T \times S}^2/\sigma_S^2$ increases
  • SE (generally) decreases as $R_W^2$ and $R_S^2$ increase

133
Power in Two-level Randomized Block Design With
Covariates
  • Basic Idea
  • Operational Effect Size = (Effect Size) x (Design
    Effect)
  • $\Delta_T = d \times (\text{Design Effect})$
  • For the two-level randomized block design with
    covariates
  • The covariates increase the design effect

134
Three-level Randomized Block Designs

135
Three-level Randomized Block Design With No
Covariates
  • Here there are three factors
  • Treatment
  • Schools (clusters) nested in treatments
  • Classes (subclusters) nested in schools
  • Suppose there are
  • m schools (clusters) per treatment
  • 2p classes (subclusters) per school (cluster)
  • n students (individuals) per class (subcluster)

136
Three-level Randomized Block Design With No
Covariates
  • The statistical model for the observation on the
    lth person in the kth class in the ith treatment
    in the jth school is
  • $Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \alpha\beta_{ij} + \varepsilon_{ijkl}$
  • where
  • $\mu$ is the grand mean,
  • $\alpha_i$ is the average effect of being in treatment i,
  • $\beta_j$ is the average effect of being in school j,
  • $\gamma_k$ is the effect of being in the kth class,
  • $\alpha\beta_{ij}$ is the difference between the average effect
    of treatment i and the effect of that treatment
    in school j,
  • $\varepsilon_{ijkl}$ is a residual

137
Three-level Randomized Block Design With No
Covariates (HLM Notation)
  • Level 1 (individual level)
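  • $Y_{ijkl} = \beta_{0jk} + \varepsilon_{ijkl}$, $\varepsilon \sim N(0, \sigma_W^2)$
  • Level 2 (classroom level)
  • $\beta_{0jk} = \gamma_{00j} + \gamma_{01j}T_j + \eta_{0jk}$, $\eta \sim N(0, \sigma_C^2)$
  • Level 3 (school Level)
  • $\gamma_{00j} = \pi_{00} + \xi_{0j}$, $\xi_{0j} \sim N(0, \sigma_S^2)$
  • $\gamma_{01j} = \pi_{10} + \xi_{1j}$, $\xi_{1j} \sim N(0, \sigma_{T \times S}^2)$
  • If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then
  • $\pi_{00} = \mu$, $\pi_{10} = \alpha_1$, $\xi_{0j} = \beta_j$, $\xi_{1j} = \alpha\beta_{ij}$, $\eta_{0jk} = \gamma_k$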

138
Three-level Randomized Block Design Intraclass
Correlations
  • In three-level designs there are two levels of
    clustering and two intraclass correlations
  • At the school (cluster) level
  • At the classroom (subcluster) level

139
Three-level Randomized Block Design
Heterogeneity Parameters
  • In three-level designs, as in two-level
    randomized block designs, there is also a
    parameter reflecting the degree of heterogeneity
    of treatment effects across schools
  • We define this parameter $\omega_S$ in terms of the
    amount of heterogeneity of treatment effects
    relative to the heterogeneity of school means
    (just like in two-level designs)
  • Thus
  • $\omega_S = \sigma_{T \times S}^2/\sigma_S^2$

140
Precision in Three-level Randomized Block
Design With No Covariates
  • The standard error of the treatment effect
  • SE decreases as m increases
  • SE decreases as p and n increase, but only up to
    a point
  • SE increases as $\omega_S$ increases
  • SE increases as $\rho_S$ and $\rho_C$ increase

141
Power in Three-level Randomized Block Design With
No Covariates
  • Basic Idea
  • Operational Effect Size = (Effect Size) x (Design
    Effect)
  • $\Delta_T = d \times (\text{Design Effect})$
  • For the three-level randomized block design with no
    covariates
  • The operational sample size is the number of
    schools

142
Power in Three-level Randomized Block Design With
No Covariates
  • As m and the effect size increase, power
    increases
  • Other influences occur through the design effect
  • As $\rho_S$ or $\rho_C$ increases the design effect decreases
  • No matter how large n gets the maximum design
    effect is
  • Thus power only increases up to some limit as n
    increases

143
Precision in Three-level Randomized Block Design With
Covariates
  • SE decreases as m increases
  • SE decreases as p and n increase, but only up to
    a point
  • SE increases as $\rho$ and $\omega_S$ increase
  • SE decreases as $R_W^2$, $R_C^2$, and $R_S^2$ increase

144
Power in Three-level Randomized Block Design With
Covariates
  • Basic Idea
  • Operational Effect Size = (Effect Size) x (Design
    Effect)
  • $\Delta_T = d \times (\text{Design Effect})$
  • For the three-level randomized block design with
    covariates
  • The operational sample size is the number of
    schools

145
Power in Three-level Randomized Block Design With
Covariates
  • As m and the effect size increase, power
    increases
  • Other influences occur through the design effect
  • As $\rho_S$ or $\rho_C$ increases the design effect decreases
  • No matter how large n gets the maximum design
    effect is
  • Thus power only increases up to some limit as n
    increases

146
What Unit Should Be Randomized? (Schools,
Classrooms, or Students)
  • Experiments cannot estimate the causal effect on
    any individual
  • Experiments estimate average causal effects on
    the units that have been randomized
  • If you randomize schools the (average) causal
    effects are effects on schools
  • If you randomize classes, the (average) causal
    effects are on classes
  • If you randomize individuals, the (average)
    causal effects estimated are on individuals

147
What Unit Should Be Randomized? (Schools,
Classrooms, or Students)
  • Theoretical Considerations
  • Decide what level you care about, then randomize
    at that level
  • Randomization at lower levels may impact
    generalizability of the causal inference (and it
    is generally a lot more trouble)
  • Suppose you randomize classrooms, should you also
    randomly assign students to classes?
  • It depends: are you interested in the average
    causal effect of treatment on naturally occurring
    classes or on randomly assembled ones?

148
What Unit Should Be Randomized? (Schools,
Classrooms, or Students)
  • Relative power/precision of treatment effect
  • Assign Schools
  • (Hierarchical Design)
  • Assign Classrooms
  • (Randomized Block)
  • Assign Students
  • (Randomized Block)

149
What Unit Should Be Randomized? (Schools,
Classrooms, or Students)
  • Precision of estimates or statistical power
    dictate assigning the lowest level possible
  • But the individual (or even classroom) level will
    not always be feasible or even theoretically
    desirable

150
Questions and Answers About Design

151
Questions and Answers About Design
  • Is it ok to match my schools (or classes) before
    I randomize to decrease variation?
  • I assigned treatments to schools and am not using
    classes in the analysis. Do I have to take them
    into account in the design?
  • I am assigning schools, and using every class in
    the school. Do I have to include classes as a
    nested factor?
  • My schools all come from two districts, but I am
    randomly assigning the schools. Do I have to
    take district into account some way?

152
Questions and Answers About Design
  • I didn't really sample the schools in my
    experiment (who does?). Do I still have to treat
    schools as random effects?
  • I didn't really sample my schools, so what
    population can I generalize to anyway?
  • I am using a randomized block design with
    fixed effects. Do you really mean I can't say
    anything about effects in schools that are not in
    the sample?

153
Questions and Answers About Design
  • We randomly assigned, but our assignment was
    corrupted by treatment switchers. What do we do?
  • We randomly assigned, but our assignment was
    corrupted by attrition. What do we do?
  • We randomly assigned but got a big imbalance on
    characteristics we care about (gender, race,
    language, SES). What do we do?
  • We randomly assigned but when we looked at the
    pretest scores, we see that we got a big
    imbalance (a bad randomization). What do we do?

154
Questions and Answers About Design
  • We care about treatment effects, but we really
    want to know about mechanism. How do we find out
    if implementation impacts treatment effects?
  • We want to know where (under what conditions) the
    treatment works. Can we analyze the relation
    between conditions and treatment effect to find
    this out?
  • We have a randomized block design and find
    heterogeneous treatment effects. What can we say
    about the main effect of treatment in the
    presence of interactions?

155
Questions and Answers About Design
  • I prefer to use regression and I know that
    regression and ANOVA are equivalent. Why do I
    need all this ANOVA stuff to design and analyze
    experiments?
  • Don't robust standard errors in regression solve
    all these problems?
  • I have heard of using school fixed effects to
    analyze a randomized block design. Is this a good
    alternative to ANOVA or HLM?
  • Can I use school fixed effects in a hierarchical
    design?

156
Questions and Answers About Design
  • We want to use covariates to improve precision,
    but we find that they act somewhat differently in
    different groups (have different slopes). What
    do we do?
  • We get somewhat different variances in different
    groups. Should we use robust standard errors?
  • We get somewhat different answers with different
    analyses. What do we do?

157
  • Thank You !