LECTURE NOTES Repeated Measures Analysis: MANOVA and Covariance Pattern models

Slides: 63

Provided by: JohnLa150

1
LECTURE NOTES: Repeated Measures Analysis
MANOVA and Covariance Pattern models
2
Day 16: Basic Repeated Measures Design

Data are collected at a sequence of points in time
(not necessarily equally spaced)
Treatments are assigned to experimental units,
i.e., subjects
Two factors:
Treatment: between-subjects factor
Time: within-subjects factor

3
Hypotheses

How do treatment differences change over time?
Is there a Treatment × Time interaction?
How do response means change by trt?
Is there a Trt main effect?
How do response means change over time?
Is there a Time main effect?

4
Example: Two groups
id  group  time1  time2  time3  time4
1   A      31     29     15     26
2   A      24     28     20     32
3   A      14     20     28     30
4   B      38     34     30     34
5   B      25     29     25     29
6   B      30     28     16     34

Preliminary analysis includes:
Profile plots
Mean plots
Correlation between repeated measurements

5
Profile plots by group

differences at baseline among subjects
different trends for different subjects
Variability is higher at time 1 and lower at time 4

(Figure: subject profile plots over time, groups A and B)
6
Mean plots by group

differences at baseline between group means
non-linear trends
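
The group means behind a mean plot can be computed directly from the six-subject example table. The following is a minimal Python sketch (standard library only); the function name `group_means` is an illustrative choice, not part of the lecture.

```python
# Wide-format example data from the two-group slide: (id, group, time1..time4).
rows = [
    (1, "A", 31, 29, 15, 26),
    (2, "A", 24, 28, 20, 32),
    (3, "A", 14, 20, 28, 30),
    (4, "B", 38, 34, 30, 34),
    (5, "B", 25, 29, 25, 29),
    (6, "B", 30, 28, 16, 34),
]


def group_means(rows):
    """Return {group: [mean at time1, ..., mean at time4]}."""
    by_group = {}
    for _id, group, *times in rows:
        by_group.setdefault(group, []).append(times)
    # zip(*subjects) turns the per-subject rows into per-time columns.
    return {
        g: [sum(col) / len(col) for col in zip(*subjects)]
        for g, subjects in by_group.items()
    }


means = group_means(rows)
for g in sorted(means):
    print(g, [round(m, 2) for m in means[g]])
```

Group A starts at a mean of 23 and group B at 31, and neither sequence of means rises or falls steadily, which is what the two bullet points above describe.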

(Figure: group mean plots over time, groups A and B)
7
(b) Correlation (covariance) across time points
        time1     time2     time3     time4
time1   1.00000   0.94035  -0.14150   0.28445
time2   0.94035   1.00000  -0.02819   0.26921
time3  -0.14150  -0.02819   1.00000   0.27844
time4   0.28445   0.26921   0.27844   1.00000
The correlations are certainly NOT equal (so is CS plausible?)!
Time1 and time2 are highly correlated, but time1
and time3 are inversely correlated!
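
These correlations can be reproduced from the six-subject example data. The sketch below is plain Python with a hand-written Pearson correlation; the helper name `pearson` is illustrative, and in SAS the same numbers would come from PROC CORR.

```python
from math import sqrt

# time1..time4 for the six subjects in the two-group example table.
data = [
    [31, 29, 15, 26],
    [24, 28, 20, 32],
    [14, 20, 28, 30],
    [38, 34, 30, 34],
    [25, 29, 25, 29],
    [30, 28, 16, 34],
]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


cols = list(zip(*data))  # one tuple per time point
corr = [[pearson(a, b) for b in cols] for a in cols]
for row in corr:
    print("  ".join(f"{r:8.5f}" for r in row))
```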
8
Statistical analysis strategies

Strategy 1: ANCOVA on the final measurement,
adjusting for baseline differences (end-point
analysis)
Strategy 2: repeated-measures ANOVA
(univariate approach)
Strategy 3: Summary approach
Strategy 4: Multivariate ANOVA approach
Strategy 5: GEE
Strategy 6: Mixed Models

9
Comparison of traditional and new methods
From: Ralitza Gueorguieva, PhD; John H. Krystal,
MD. Move Over ANOVA: Progress in Analyzing
Repeated-Measures Data and Its Reflection in
Papers Published in the Archives of General
Psychiatry. Arch Gen Psychiatry. 2004;61:310-317.
10
General syntax in SAS's GLM procedure

Syntax for MANOVA and rANOVA
PROC GLM DATA=sas-dataset-name;
  CLASS factor1 factor2 ... factork;
  MODEL y1 y2 ... yk = factor1 ... factork;
  REPEATED repeated-factor-name k / PRINTE;
  LSMEANS factor1 factor2 ... factork;
RUN;
Output can be restricted to rANOVA only by using the NOM
option, and to the MANOVA analysis only by using NOU, in
the REPEATED statement

11
Strategy 1. End-point analysis
ANCOVA asks whether or not the two group means
differ at the final time point, adjusting for
differences at baseline.
proc glm data=horizontal;
  class group;
  model time4 = time1 group;
run;
- Comparing groups at every follow-up time point
in this way would greatly inflate your type I
error.
12
Strategy 2: univariate repeated measures ANOVA
(rANOVA)
Explain away some error variability by accounting
for differences between subjects - requires
sphericity
proc glm data=horizontal;
  class group;
  model time1-time4 = group;
  repeated time 4 / printe;
run;
quit;
13
Strategy 3: Summary analysis

One way to overcome the problem of correlated
observations over time within each subject is to
summarize the observations over time by their
mean or some other function and use ANOVA on the summary
This summary analysis leads to a conservative
test
Example: avetime = mean(time1, time2, time3, time4)

proc glm data=horizontal;
  class group;
  model avetime = group;
run;
quit;
- A special application of this is pre-post
analysis
14
Strategy 4: MANOVA Approach

Successive response measurements made over time
are considered correlated dependent variables
That is, the response at each level of the
within-subject factor is treated as a
different dependent variable
MANOVA assumes an unstructured
covariance matrix for the dependent variables

15
Why MANOVA

You do a MANOVA instead of a series of
one-at-a-time ANOVAs for two main reasons:
To reduce the experiment-wise level of Type I
error.
None of the individual ANOVAs may produce a
significant main effect on the response, but in
combination they might, which suggests that the
variables are more meaningful taken together than
considered separately.
In addition, MANOVA takes into account the
inter-correlations among the response variables

16
MANOVA

If the multivariate test is
not significant, report no group differences
among the mean vectors
significant, perform univariate ANOVA and
relevant contrasts
Contrasts (similar to contrasts we considered
previously)
Prior (planned)
Post hoc (unplanned)

17
MANOVA Test Statistics

SAS reports four tests
Wilks' Lambda
Pillai's trace (good for smaller sample sizes)
Hotelling-Lawley trace
Roy's greatest root
These are covered in the Multivariate class
We will use results from Wilks' Lambda

18
MANOVA Test Statistics

Wilks' Lambda (Λ) was the first MANOVA test
statistic developed and is very important for
several multivariate procedures in addition to
MANOVA.
Wilks' Lambda (Λ) is the determinant of the error SSCP
matrix (E) divided by the determinant of the sum of the
effect SSCP matrix (H) and the error SSCP matrix:
Λ = |E| / |H + E|.
The quantity (1 - Λ) is often interpreted as the
proportion of variance in the dependent variables
explained by the model effect. However, this
quantity is not unbiased and can be quite
misleading in small samples.
A transformation of Λ (a multiple of -ln Λ) is
approximately chi-square distributed

19
rANOVA vs. MANOVA

For tests that involve only between-subjects
effects, both MANOVA and rANOVA give rise to
the same tests.
For within-subject effects they yield different
tests.
In Proc GLM, rANOVA results are in a table labeled
"Univariate Tests of Hypotheses for Within Subject Effects."
Results for MANOVA are displayed under the heading
"Repeated Measures Analysis of Variance," in tables
labeled "MANOVA Test Criteria and Exact F Statistics."
The multivariate tests are Wilks' lambda,
Pillai's trace, Hotelling-Lawley trace, and Roy's
greatest root.
The only assumption required for valid multivariate tests is
that the dependent variables in the model have a
multivariate normal distribution with a common
covariance matrix across the between-subject
effects.

20
Box's test of equal covariances

Box's M test can be used to check whether there are
significant differences among the covariance matrices by
group.
When Box's test finds that the covariance
matrices are significantly different across
groups, that may indicate an increased possibility
of Type I error, so you might want to use a
smaller rejection region (alpha = 0.001).
If you redo the analysis with a significance level
of .001, you should report the results of
Box's M.

Box's M test for equality of covariance matrices:
proc discrim data=exercise method=normal pool=test wcov;
  class diet;
  var time1 time2 time3;
run;
21

Example 3: Suppose 24 subjects are randomly
assigned to two groups (Control and Treatment)
and their responses are measured at 4 times.
These times are labeled as 0 (baseline), 1 (after
one month, posttest), 3 (after 3 months of
follow-up) and 6 (after 6 months of follow-up).
Time is the within-subjects factor in this design
Treatment is the between-subjects (grouping)
factor

Some of the data points (subjects 10 to 22 are not shown)
data short;
  input Group Subj y0 y1 y3 y6;
  datalines;
1 1 296 175 187 242
1 2 376 329 236 126
1 3 309 238 150 173
1 4 222 60 82 135
1 5 150 271 250 266
1 6 316 291 238 194
1 7 321 364 270 358
1 8 447 402 294 266
1 9 220 70 95 137
2 23 319 68 67 12
2 24 300 138 114 12
;
run;

Hypotheses:
H01: no trt effect
H02: no time effect
H03: no interaction
Sphericity is violated
23

Results of GLM Analysis
24

Results of GLM Analysis
25

Results of GLM Analysis

The test of sphericity, when requested,
immediately precedes both sets of within-subjects
tests.
Although the output shows two separate tests of
sphericity, the only one of interest is the
second test, which is the test of sphericity
applied to the common covariance matrix of the
transformed within-subject variables.
If the Chi-square approximation has an associated
p value less than your alpha level, the
sphericity assumption has been violated

Sphericity Tests
Variables              DF  Mauchly's Criterion  Chi-Square  Pr > ChiSq
Transformed Variates    5  0.462959             15.95853    0.0070
Orthogonal Components   5  0.462959             15.95853    0.0070
26

Example 3 continued rMANOVA

The first multivariate test of a within-subjects
effect is the within-subjects main effect test.
It examines changes in response as a function of
time.
The null hypothesis is that the mean response
does not change over time.

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time Effect
H = Type III SSCP Matrix for time
E = Error SSCP Matrix
S=1 M=0.5 N=9
Statistic               Value       F Value  Num DF  Den DF  Pr > F
Wilks' Lambda           0.19328615  27.82    3       20      <.0001
Pillai's Trace          0.80671385  27.82    3       20      <.0001
Hotelling-Lawley Trace  4.17367645  27.82    3       20      <.0001
Roy's Greatest Root     4.17367645  27.82    3       20      <.0001
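
All four criteria give the same F here because S=1: the effect has a single nonzero eigenvalue of E⁻¹H, and every statistic is a function of that one root. The sketch below is a plain Python illustration of that relationship (the function name is illustrative); it takes the root from the Hotelling-Lawley row and the degrees of freedom from the table.

```python
def manova_stats_single_root(root, num_df, den_df):
    """Four MANOVA criteria when the effect has one nonzero eigenvalue (s = 1)."""
    return {
        "wilks": 1.0 / (1.0 + root),        # Lambda = 1 / (1 + root)
        "pillai": root / (1.0 + root),      # equals 1 - Lambda when s = 1
        "hotelling_lawley": root,
        "roy": root,
        "F": root * den_df / num_df,        # exact F when s = 1
    }


# Root and degrees of freedom read from the "no time Effect" table.
time_effect = manova_stats_single_root(4.17367645, num_df=3, den_df=20)
for name, value in time_effect.items():
    print(f"{name:17s} {value:.8f}")
```

With the reported root of 4.17367645 this reproduces Wilks' Lambda 0.19328615, Pillai's trace 0.80671385 and F = 27.82.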
27

Next SAS tests the hypothesis that treatment
interacts with time.
In this instance, the F value associated with
these multivariate tests of the interaction is
high; therefore, the associated p value is low:
F(3, 20) = 6.73, p = .0025.

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time*Group Effect
H = Type III SSCP Matrix for time*Group
E = Error SSCP Matrix
S=1 M=0.5 N=9
Statistic               Value       F Value  Num DF  Den DF  Pr > F
Wilks' Lambda           0.49748100  6.73     3       20      0.0025
Pillai's Trace          0.50251900  6.73     3       20      0.0025
Hotelling-Lawley Trace  1.01012703  6.73     3       20      0.0025
Roy's Greatest Root     1.01012703  6.73     3       20      0.0025
28

Between-Subjects Tests
Following the MANOVA for multivariate tests of
significance for within-subjects effects, SAS
prints tests of the between-subjects effects.
There is only one approach to testing these
effects.

The GLM Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects
Source  DF  Type III SS  Mean Square  F Value  Pr > F
Group    1  248677.0417  248677.0417  19.64    0.0002
Error   22  278540.9583   12660.9527

Univariate Tests of Hypotheses for Within Subject Effects
Source       DF  Type III SS  Mean Square  F Value  Pr > F  Adj Pr > F (G - G)  Adj Pr > F (H-F-L)
time          3  326635.5833  108878.5278  37.80    <.0001  <.0001              <.0001
time*Group    3   59461.8750   19820.6250   6.88    0.0004  0.0019              0.0012
Error(time)  66  190098.5417    2880.2809
Greenhouse-Geisser Epsilon    0.7204
Huynh-Feldt-Lecoutre Epsilon  0.8016
29

Observations
The sphericity assumption was violated
With nonspherical data, either use the corrected
univariate tests that we described earlier or use
the results from the MANOVA test.
The corrected univariate p values appear under
the G - G and H-F-L headers in the output shown
above.
Note that in this case, rANOVA agrees with the
MANOVA that there is a statistically significant
within-subjects main effect for time, as well as
an interaction between treatment and time.
Further polynomial contrast analysis can be made
on time
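
The corrected univariate tests keep the same F ratios and multiply both the numerator and the denominator degrees of freedom by the estimated epsilon. The following plain Python sketch, using the mean squares and epsilons reported in the within-subject output, shows the arithmetic; turning the adjusted degrees of freedom into a p value needs an F distribution function, which is left out here.

```python
# Values copied from the within-subject effects output for Example 3.
ms_time, ms_inter, ms_error = 108878.5278, 19820.6250, 2880.2809
df_effect, df_error = 3, 66
gg_eps, hfl_eps = 0.7204, 0.8016

# The F ratios are unchanged by the sphericity correction.
f_time = ms_time / ms_error
f_inter = ms_inter / ms_error


def adjusted_df(eps):
    """Numerator and denominator df after scaling by epsilon."""
    return eps * df_effect, eps * df_error


print(f"F(time) = {f_time:.2f}, F(time*Group) = {f_inter:.2f}")
print("G-G df:  ", adjusted_df(gg_eps))
print("H-F-L df:", adjusted_df(hfl_eps))
```

Under the Greenhouse-Geisser correction the reference distribution moves from F(3, 66) to roughly F(2.16, 47.5), which is why the adjusted p values for the interaction are larger than the unadjusted one.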

30
Analysis of Variance of Contrast Variables
time_N represents the nth degree polynomial contrast for
time

Contrast Variable: time_1
Source  DF  Type III SS  Mean Square  F Value  Pr > F
Mean     1  89560.72781  89560.72781  44.23    <.0001
Group    1  20747.22078  20747.22078  10.24    0.0041
Error   22  44552.41504   2025.10977

Contrast Variable: time_2
Source  DF  Type III SS  Mean Square  F Value  Pr > F
Mean     1  186802.0020  186802.0020  37.82    <.0001
Group    1    4428.6429    4428.6429   0.90    0.3539
Error   22  108650.6885    4938.6677

Contrast Variable: time_3
Source  DF  Type III SS  Mean Square  F Value  Pr > F
Mean     1  50272.85354  50272.85354  29.98    <.0001
Group    1  34286.01136  34286.01136  20.44    0.0002
Error   22  36895.43813   1677.06537
31
More on Orthogonal Contrasts

proc glm data=short;
  class group;
  model y0 y1 y3 y6 = group / nouni;
  repeated time 4 (0 1 3 6) profile / summary printm nom;
run;
PROFILE generates contrasts between adjacent
levels of the factor.

proc glm data=short;
  class group;
  model y0 y1 y3 y6 = group / nouni;
  repeated time 4 (0 1 3 6) helmert / summary printm nom;
run;
HELMERT generates contrasts between
each level of the factor and the mean of
subsequent levels.

32
Day 17: Strategy 6, Mixed Model Approach

Models with both fixed and random effects are mixed
models
Treatment is usually considered a fixed
effect
The subject factor is a random effect
Analysis can follow:
Linear mixed models
Covariance pattern models: the user specifies the
covariance structure
Random coefficient models: the random effects induce the
covariance structure

33
SAS Mixed Repeated Measures Syntax
34
SAS Mixed Model

PROC MIXED cl;
CLASS <class variables>;
MODEL <dependent variable> = <fixed sources>;

PROC MIXED cl: requests confidence limits for the
variance-covariance parameter estimates
CLASS: identifies the variables used as sources of variation
and in the subject= option of the REPEATED statement
MODEL: specifies the dependent variable and all fixed
sources of variation (includes treatment, time
and their interaction). The ddfm= option controls how
the denominator degrees of freedom for the various
terms are computed.

35
SAS Mixed Model

REPEATED / subject=<EU id> type=<covariance structure> r rcorr;

subject= identifies the experimental unit in
the data set on which the repeated
measures are taken. It identifies the units that are
independent.
type= identifies the covariance structure
r requests printing of the covariance matrix for
the repeated measures
rcorr requests printing of the correlation matrix
for the repeated measures

36
Covariance Structures: Independent with common
variance

Equal variances along the main diagonal
Zero covariances off the diagonal
Variances constant and residuals independent
across time.
The standard ANOVA model
Simple, because a single parameter is estimated:
the pooled variance

37
Covariance Structures: Unstructured

Separate variances on the diagonal
Separate covariances off the diagonal
Most complex structure
Variance estimated for each time, covariance for
each pair of times
With 4 time points, need to estimate 10 parameters: 10 = 4(4+1)/2
Leads to less precise parameter estimation
(degrees of freedom problem)

38
Covariance Structures: Compound symmetry

Equal variances on the diagonal
Equal covariances off the diagonal (equal
correlation)
Simplest structure for fitting repeated measures
Split-plot in time analysis
Used for the past 50 years
Requires estimation of 2 parameters

39
Covariance Structures: First-order
Autoregressive, AR(1)

Equal variances on the main diagonal
Off-diagonal entries are the variance multiplied by
the correlation raised to increasing powers as
the observations become increasingly separated in
time.
Increasing power means decreasing covariances.
Times must be ordered and equally spaced.
Estimates 2 parameters

40
First-order Autoregressive Heterogeneous, ARH(1)

Unequal variances on the main diagonal
Off-diagonal entries are the product of the two standard
deviations multiplied by the correlation raised to
increasing powers as the observations become
increasingly separated in time.
Increasing power means decreasing covariances.
Times must be ordered and equally spaced.
With 4 time points, estimates 5 parameters
ARH(1)
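
To make the structures concrete, the sketch below builds the compound symmetry, AR(1) and ARH(1) covariance matrices for 4 time points and counts the parameters each structure needs. It is plain Python; the numeric values for the variance, standard deviations and correlation are made-up illustrations, not estimates from any data set in these notes.

```python
def cs(n, sigma2, rho):
    """Compound symmetry: equal variances, equal covariances sigma2 * rho."""
    return [[sigma2 if i == j else sigma2 * rho for j in range(n)]
            for i in range(n)]


def ar1(n, sigma2, rho):
    """AR(1): covariance sigma2 * rho**lag, where lag = |i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def arh1(sds, rho):
    """ARH(1): sd_i * sd_j * rho**lag, with a separate sd per time point."""
    n = len(sds)
    return [[sds[i] * sds[j] * rho ** abs(i - j) for j in range(n)]
            for i in range(n)]


def n_params(structure, n):
    """Number of covariance parameters for n time points."""
    return {
        "simple": 1,
        "cs": 2,
        "ar1": 2,
        "arh1": n + 1,
        "un": n * (n + 1) // 2,
    }[structure]


# Hypothetical values chosen only for illustration.
for row in ar1(4, sigma2=100.0, rho=0.5):
    print(row)
print({s: n_params(s, 4) for s in ("simple", "cs", "ar1", "arh1", "un")})
```

For 4 time points the counts are 1, 2, 2, 5 and 10, matching the slides; for the 3 time points of the exercise example, ARH(1) needs 4 and unstructured needs 6.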

41
Strategies for Finding suitable covariance
structures

Run unstructured first
Next run compound symmetry, the simplest repeated
measures structure
Next try other covariance structures that best
fit the experimental design

42
Criteria for Selecting the Best Covariance Structure

Need to use model fitting statistics
AIC: Akaike's Information Criterion
BIC: Schwarz's Bayesian Criterion
Let q = number of covariance parameters, p = number of fixed
effect parameters in the model and n = number of
observations; then
AIC = -2 log(L) + 2q
BIC = -2 log(L) + q log(n)
CAIC = -2 log(L) + q (log(n) + 1)
The smaller the number, the better
Goal: a covariance structure that is better than
compound symmetry
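
A short Python sketch of these criteria, applied to the -2 Res Log Likelihood values and parameter counts reported for the exercise example later in these notes. One assumption is flagged in the code: the BIC values printed by PROC MIXED in those outputs are matched when n is taken as the number of subjects (30), not the number of individual measurements.

```python
from math import log


def aic(neg2ll, q):
    """AIC = -2 log L + 2q."""
    return neg2ll + 2 * q


def bic(neg2ll, q, n):
    """BIC = -2 log L + q log(n)."""
    return neg2ll + q * log(n)


def caic(neg2ll, q, n):
    """CAIC = -2 log L + q (log(n) + 1)."""
    return neg2ll + q * (log(n) + 1)


# (-2 Res Log Likelihood, number of covariance parameters) per structure.
fits = {"CS": (590.8, 2), "UN": (577.7, 6), "AR(1)": (590.1, 2), "ARH(1)": (579.8, 4)}
n_subjects = 30  # assumption: n = number of subjects, which matches the printed BICs

for name, (neg2ll, q) in fits.items():
    print(f"{name:7s} AIC={aic(neg2ll, q):6.1f}  BIC={bic(neg2ll, q, n_subjects):6.1f}")
```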

Example 3: Suppose 24 subjects are randomly
assigned to two groups (Control and Treatment)
and their responses are measured at 4 times.
These times are labeled as 0 (baseline), 1 (after
one month, posttest), 3 (after 3 months of
follow-up) and 6 (after 6 months of follow-up).

proc corr data=short cov;
  var y0 y1 y3 y6;
run;
Pearson Correlation Coefficients, N = 24
     y0    y1    y3    y6
y0   1.00  0.51  0.50  0.07
y1         1.00  0.93  0.67
y3               1.00  0.65
y6                     1.00

- What type of correlation structure do you think
is right?
Variances are 5456, 13505, 7881, 6929
Exercise: compare covariance models for these data

44
Example 4: exercise pulse study

Exercise data example
The data consist of people who were randomly
assigned to two different diets (low-fat and not
low-fat) and three different types of exercise (at
rest, walking leisurely and running). Their pulse
rate was measured at three different time points
during their assigned exercise: at 1 minute, 15
minutes and 30 minutes.
First 10 of the 30 subjects:
data exercise;
  input id exertype diet time1 time2 time3;
  cards;
1 1 1 85 85 88
2 1 1 90 92 93
3 1 1 97 97 94
4 1 1 80 82 83
5 1 1 91 92 91
6 1 2 83 83 84
7 1 2 87 88 90
8 1 2 92 94 95
9 1 2 97 99 96
10 1 2 100 97 100
;
run;

45
Example

Let's look at the correlations, variances and
covariances for the exercise data.
Since we cannot use these kinds of covariance
structures in a traditional repeated measures
analysis, we will use SAS PROC MIXED for the
analysis.
proc corr data=exercise cov;
  var time1 time2 time3;
run;
Pearson Correlation Coefficients, N = 30
        time1    time2    time3
time1   1.00000  0.54454  0.51915
time2   0.54454  1.00000  0.85028
time3   0.51915  0.85028  1.00000
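
PROC MIXED needs the data in long format, one row per subject per time point, whereas the exercise data were entered in wide format. The notes do not show that reshaping step, so the following is only a minimal Python sketch of the idea; the column names `time` and `pulse` are assumed to match the variables used in the PROC MIXED calls on the data set `long`.

```python
# First three subjects of the wide exercise data:
# (id, exertype, diet, time1, time2, time3).
wide = [
    (1, 1, 1, 85, 85, 88),
    (2, 1, 1, 90, 92, 93),
    (3, 1, 1, 97, 97, 94),
]


def wide_to_long(rows):
    """Turn one row per subject into one row per subject per time point."""
    long_rows = []
    for subj_id, exertype, diet, *pulses in rows:
        for time, pulse in enumerate(pulses, start=1):
            long_rows.append({
                "id": subj_id,
                "exertype": exertype,
                "diet": diet,
                "time": time,    # 1, 2, 3 for time1, time2, time3
                "pulse": pulse,
            })
    return long_rows


long_rows = wide_to_long(wide)
for row in long_rows[:3]:
    print(row)
```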

46
Example: compound symmetry

proc mixed data=long;
  class exertype time;
  model pulse = exertype time exertype*time;
  repeated time / subject=id type=cs;
run;
Fit Statistics
-2 Res Log Likelihood     590.8
AIC (smaller is better)   594.8
AICC (smaller is better)  595.0
BIC (smaller is better)   597.6
Null Model Likelihood Ratio Test
DF  Chi-Square  Pr > ChiSq
1   15.36       <.0001
Type 3 Tests of Fixed Effects
Effect    Num DF  Den DF  F Value  Pr > F
exertype  2       27      27.00    <.0001
time      2       54      23.54    <.0001

47
Example: unstructured

proc mixed data=long;
  class exertype time;
  model pulse = exertype time exertype*time;
  repeated time / subject=id type=un;
run;
Fit Statistics
-2 Res Log Likelihood     577.7
AIC (smaller is better)   589.7
AICC (smaller is better)  590.9
BIC (smaller is better)   598.1
Null Model Likelihood Ratio Test
DF  Chi-Square  Pr > ChiSq
5   28.46       <.0001
Type 3 Tests of Fixed Effects
Effect    Num DF  Den DF  F Value  Pr > F
exertype  2       27      27.00    <.0001
time      2       27      22.32    <.0001

48
Example: AR(1)

proc mixed data=long;
  class exertype time;
  model pulse = exertype time exertype*time;
  repeated time / subject=id type=ar(1);
run;
Fit Statistics
-2 Res Log Likelihood     590.1
AIC (smaller is better)   594.1
AICC (smaller is better)  594.3
BIC (smaller is better)   596.9
Null Model Likelihood Ratio Test
DF  Chi-Square  Pr > ChiSq
1   16.08       <.0001
Type 3 Tests of Fixed Effects
Effect         Num DF  Den DF  F Value  Pr > F
exertype       2       27      28.39    <.0001
time           2       54      18.20    <.0001
exertype*time  4       54      11.73    <.0001

49
Example: ARH(1)

proc mixed data=long;
  class exertype time;
  model pulse = exertype time exertype*time;
  repeated time / subject=id type=arh(1);
run;
Covariance Parameter Estimates
Cov Parm  Subject  Estimate
Var(1)    id       35.7683
Var(2)    id       87.1927
Var(3)    id       115.50
ARH(1)    id       0.5101
Fit Statistics
-2 Res Log Likelihood     579.8
AIC (smaller is better)   587.8
AICC (smaller is better)  588.3
BIC (smaller is better)   593.4
Null Model Likelihood Ratio Test
DF  Chi-Square  Pr > ChiSq

50
Example: model comparison
Model                                  AIC    -2RLL  Parms  Diff -2RLL (vs. CS)  Diff in df (vs. CS)  p value for Diff (chi-square)
Compound Symmetry                      594.8  590.8  2
Unstructured                           589.7  577.7  6      13.1                 4                    .01
Autoregressive                         594.1  590.1  2      .7                   0                    na
Autoregressive Heterogeneous Variances 587.8  579.8  4      11                   2                    .004
The two most promising structures are
Autoregressive Heterogeneous Variances and
Unstructured, since these two models have the
smallest AIC values and their -2 Log Likelihood
scores are significantly smaller than the -2 Log
Likelihood scores of the other models.
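
The p values for the differences in -2 Res Log Likelihood come from the upper tail of a chi-square distribution with degrees of freedom equal to the difference in the number of covariance parameters. A minimal Python sketch follows; it uses the closed form that holds for even degrees of freedom only, which covers the two comparisons against compound symmetry here (df = 4 and df = 2). The function name is illustrative.

```python
from math import exp, factorial


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for chi-square with even df.

    For df = 2m: exp(-x/2) * sum_{j=0}^{m-1} (x/2)**j / j!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))


# -2 Res Log Likelihood values from the fitted models.
cs_fit, un_fit, arh_fit = 590.8, 577.7, 579.8

p_un = chi2_sf_even_df(cs_fit - un_fit, df=4)    # 6 - 2 parameters
p_arh = chi2_sf_even_df(cs_fit - arh_fit, df=2)  # 4 - 2 parameters
print(f"UN vs CS:     p = {p_un:.4f}")
print(f"ARH(1) vs CS: p = {p_arh:.4f}")
```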
51
RM with two group factors

Looking at models including only diet or exertype
separately does not answer all our questions. We
would also like to know if the people on the
low-fat diet who engage in running have lower
pulse rates than the people on the
not low-fat diet who are not running. In order to
address these types of questions we need to look
at a model that includes the interaction of diet
and exertype.
proc mixed data=long;
  class diet exertype time;
  model pulse = exertype|diet|time;
  repeated time / subject=id type=arh(1);
run;
quit;
proc glm data=exercise;
  class diet exertype;
  model time1 time2 time3 = diet|exertype;
  repeated time 3;
run;
quit;

52
Group comparison in Proc Mixed

If we would like to look at the differences among
groups at each level of another variable, we have
to use the lsmeans statement with the slice
option.
For example, we could test for differences among
the exertype groups at each level of diet across
all levels of time, or we could test for
differences among the exertype groups at each time
point across both levels of diet. We could also
test for differences among the exertype groups for
each combination of time and diet levels.
proc mixed data=long;
  class diet exertype time;
  model pulse = exertype|diet|time;
  repeated time / subject=id type=arh(1);
  /* testing for differences among exertype for each level of diet */
  lsmeans diet*exertype / slice=diet;
  /* differences in exertype for each time point */
  lsmeans exertype*time / slice=time;
  lsmeans exertype*diet*time / slice=time*diet;
run;
quit;

53
Worked Example from JL Text Book (Self Read)
54
Subject  time 1  time 2  time 3  time 4  time 5  summary
1        y11     y12     y13     y14     y15     f(y11, ..., y15)
2        y21     y22     y23     y24     y25     f(y21, ..., y25)
...
n        yn1     yn2     yn3     yn4     yn5     f(yn1, ..., yn5)
Summarizing with a function over time removes the
correlation
Growth Curve Approach
55
2^3 factorial design with temp, moisture and soil
type
Each combination of factor levels was
randomly assigned to two pots of soil
Samples of soil were taken on days 0, 7, 14, 30 and
60
Concentration of herbicide was measured for
each sample
The sphericity condition does not hold
56
(No Transcript)
57

58
xi day, n 5 yi log(concentration) f k
slope
sums
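
The slide's formula did not survive transcription, so the following is only a sketch of one natural reading: the summary function f is the least-squares slope of y = log(concentration) on x = day for each pot, computed from the usual sums of squares and cross-products. The code is plain Python, and the concentration values are hypothetical, generated from an exact line so that the expected slope is known.

```python
def ls_slope(x, y):
    """Least-squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


days = [0, 7, 14, 30, 60]                  # sampling days from the design
log_conc = [2.0 - 0.03 * d for d in days]  # hypothetical log(concentration)

slope = ls_slope(days, log_conc)
print(f"slope = {slope:.4f}")
```

Each pot then contributes a single number, its slope, and those slopes can be analyzed with an ordinary factorial ANOVA.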
59
(No Transcript)
60
(No Transcript)
61
If you don't have a summary function, proc glm
can summarize with orthogonal polynomials over
time.
Linear orthogonal polynomial over time
62
Quadratic orthogonal polynomial over time
