Chapter 18: The Regression Approach to ANOVA

Handling groups with regression

- t-tests and ANOVAs are used to test differences between groups.
- Regression can also be used.
- The key is dummy coding.
- Cases
  - 2 groups
  - More than 2 groups
- Order issues
- Meaning in nominal vs. ratio IVs

How to deal with more than 2 groups

- The trick
  - Multiple regression
  - Dichotomous IVs
- Example
  - IV is severity of disorder (SOD)
  - ANOVA-type levels
    - Normal
    - Neurotic
    - Psychotic
  - Regression-type predictors
    - X1 = presence of neurosis
    - X2 = presence of psychosis
- X2 Presence of psychosis

How to deal with more than 2 groups

- Regression-type predictors
  - X1 = presence of neurosis
  - X2 = presence of psychosis
- Possible combinations (X1, X2)
  - (0, 0) = Normal
  - (1, 0) = Neurotic
  - (0, 1) = Psychotic
- Not all combinations of (X1, X2) are used.
  - (1, 1) = Borderline?

Regression predicting group means

- Regression-type predictors
  - X1 = presence of neurosis
  - X2 = presence of psychosis
- Suppose the mean Ys are
  - 3 for normals
  - 7 for neurotics
  - 1 for psychotics
- Running a multiple regression on the 2 dichotomous predictors results in the following raw-score regression equation: $\hat{Y} = 3 + 4X_1 - 2X_2$
- Plugging in X values returns these means. Why?
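
The claim that plugging in the X values returns the group means can be checked numerically. The sketch below is only an illustration: the scores are made up (chosen so the group means are 3, 7 and 1), and `ols` is a hypothetical helper that solves the normal equations with exact fractions. In practice a statistics package would do the fit.

```python
from fractions import Fraction


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    Hypothetical helper for illustration only: exact fractions plus
    Gauss-Jordan elimination, suitable for tiny full-rank examples.
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [Fraction(sum(row[i] * row[j] for row in X)) for j in range(p)]
        + [Fraction(sum(row[i] * yi for row, yi in zip(X, y)))]
        for i in range(p)
    ]
    for c in range(p):
        # Bring a row with a non-zero entry in column c into position c.
        pivot = next(r for r in range(c, p) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        d = A[c][c]
        A[c] = [v / d for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[p] for row in A]


# Made-up scores (assumption): group means are 3, 7 and 1.
scores = {
    "normal": [2, 3, 4],
    "neurotic": [6, 7, 8],
    "psychotic": [0, 1, 2],
}
# Dummy codes (X1, X2) for each group.
codes = {"normal": (0, 0), "neurotic": (1, 0), "psychotic": (0, 1)}

X, y = [], []
for group, values in scores.items():
    for value in values:
        X.append([1, *codes[group]])  # the leading 1 is the intercept column
        y.append(value)

b0, b1, b2 = ols(X, y)
print(f"Y-hat = {b0} + ({b1})*X1 + ({b2})*X2")

# Plugging each group's codes into the equation returns that group's mean.
predicted = {g: b0 + b1 * x1 + b2 * x2 for g, (x1, x2) in codes.items()}
print(predicted)
```

The intercept is the mean of the group coded (0, 0), and each slope is the difference between that predictor's group mean and the (0, 0) group mean, which is why the equation reproduces all three means.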

Regression helps explain df

- What does df really mean?
- $df_{\text{between}}$ in a one-way ANOVA = k - 1.
  - k = 3, the number of groups.
- In the regression, we have k - 1 predictors.
- Thus, Y can vary along 2 dimensions, i.e. it has 2 degrees of freedom.
- Normals don't give us a third degree of freedom because that is what everything is compared to.
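
One way to see why the comparison group adds no third degree of freedom is to give it an indicator of its own and notice that it is fully determined by the other two. A minimal sketch, using made-up group labels for six hypothetical subjects:

```python
# Group membership for six hypothetical subjects (made-up labels).
groups = ["normal", "neurotic", "psychotic", "neurotic", "normal", "psychotic"]

# One indicator column per group: X0 = normal, X1 = neurotic, X2 = psychotic.
x0 = [int(g == "normal") for g in groups]
x1 = [int(g == "neurotic") for g in groups]
x2 = [int(g == "psychotic") for g in groups]

# Every subject belongs to exactly one group, so X0 = 1 - X1 - X2:
# the third indicator carries no information the other two lack.
redundant = all(a == 1 - b - c for a, b, c in zip(x0, x1, x2))
print(redundant)
```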

The regression plane

- We had trouble plotting three groups on a line.
- 2 points define a line.
- However, 3 points define a plane!
- So, a multiple regression plane can be used to predict, based on two predictors (3 groups).

The regression plane

- What happens with k = 4 groups,
  - i.e. k - 1 = 3 df,
  - i.e. 3 predictors?
- We end up with a 3D hyperplane.
- That's why we have algebra.
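
The same coding extends to any number of groups. The sketch below uses a hypothetical helper, `dummy_code`, and placeholder labels "a" to "d"; it only illustrates that k groups always produce k - 1 predictors.

```python
def dummy_code(labels, base):
    """Return (predictor_names, rows) with one 0/1 column per non-base group."""
    # dict.fromkeys keeps each label once, in first-seen order.
    levels = [g for g in dict.fromkeys(labels) if g != base]
    rows = [[int(label == g) for g in levels] for label in labels]
    return levels, rows


# Hypothetical labels with "a" as the comparison group.
labels = ["a", "b", "c", "d", "b", "a"]
names, rows = dummy_code(labels, base="a")

print(names)  # one predictor per non-base group
for label, row in zip(labels, rows):
    print(label, row)
```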

The case of no control group

- In the previous example the control group was coded as (X1, X2) = (0, 0).
- But what if we have no control group?
- Example
  - Groups
    - Obsessives
    - Phobics
    - Depressives
    - Hysterics
  - Encode the first 3 groups as before.
  - Let hysterics be (arbitrarily) the base group, coded -1 on every predictor.

The case of no control group

- This is called effect coding.
- Advantages of effect coding
  - The intercept turns out to be the grand mean (assuming equal-sized groups).
  - The slope for each predictor equals the mean for that group minus the grand mean.
  - The mean of the base group is the grand mean minus the sum of the slopes of the other predictors.
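
These three properties can be checked on a small example. The sketch below uses made-up scores with two subjects per group (equal group sizes are required) and a hypothetical helper, `ols`, that solves the normal equations exactly. The base group, hysterics, is assumed to be coded -1 on every predictor.

```python
from fractions import Fraction


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    Hypothetical helper for illustration only: exact fractions plus
    Gauss-Jordan elimination, suitable for tiny full-rank examples.
    """
    p = len(X[0])
    A = [
        [Fraction(sum(row[i] * row[j] for row in X)) for j in range(p)]
        + [Fraction(sum(row[i] * yi for row, yi in zip(X, y)))]
        for i in range(p)
    ]
    for c in range(p):
        pivot = next(r for r in range(c, p) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        d = A[c][c]
        A[c] = [v / d for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[p] for row in A]


# Made-up scores (assumption), two subjects per group.
scores = {
    "obsessive": [7, 9],
    "phobic": [9, 11],
    "depressive": [1, 3],
    "hysteric": [3, 5],
}
# Effect codes: the base group (hysterics) is -1 on every predictor.
codes = {
    "obsessive": (1, 0, 0),
    "phobic": (0, 1, 0),
    "depressive": (0, 0, 1),
    "hysteric": (-1, -1, -1),
}

X = [[1, *codes[g]] for g, ys in scores.items() for _ in ys]
y = [v for ys in scores.values() for v in ys]

b0, b1, b2, b3 = ols(X, y)

group_means = {g: Fraction(sum(ys), len(ys)) for g, ys in scores.items()}
grand_mean = Fraction(sum(y), len(y))

print("intercept:", b0, "grand mean:", grand_mean)
print("slopes:", b1, b2, b3)
print("base group mean from the equation:", b0 - (b1 + b2 + b3))
```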

The case of no control group

- The regression equation for such an effect coding is $\hat{Y} = \mu + b_1 X_1 + b_2 X_2 + b_3 X_3$
- The effect of each group is defined as the group mean minus the grand mean.
- Where
  - $\mu$ is the grand mean.
  - Each $b$ is the effect of the corresponding group.

The General Linear Model

- We often hear the term or see it in SPSS, but what is it?
- Consider the case of being in a particular group, say the obsessives (X1).
- Then X1 = 1 and all other Xi = 0.
- Then the predicted Y is $\hat{Y} = \mu + \tau_1$
- Where $\tau_1$, estimated by the slope b1, is the effect of being obsessive and $\hat{Y}$ is now the mean for the obsessives.
- Likewise for the other predictors except for the base.
- Why?
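
One way to answer the closing question, sketched under the usual effect-coding assumptions (four equal-sized groups, base group coded -1 on every predictor). The symbols $\mu$ for the grand mean and $\tau_j$ for the effect of group $j$ are notation assumed here. The base group has no predictor of its own, so its predicted value comes from setting all three predictors to -1:

```latex
\begin{align*}
\hat{Y}_{\text{base}} &= \mu + \tau_1(-1) + \tau_2(-1) + \tau_3(-1) \\
                      &= \mu - (\tau_1 + \tau_2 + \tau_3) \\
                      &= \mu + \tau_4,
  \qquad \text{since } \tau_1 + \tau_2 + \tau_3 + \tau_4 = 0 .
\end{align*}
```

Because the effects are deviations from the grand mean, they sum to zero, so the base group's effect is the negative of the sum of the other three rather than a slope of its own.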

The General Linear Model

- Predicting scores for an individual.
- Each individual has some unexplained deviation from his/her group mean.
- This is called an error or residual term, $e_{ij}$.
- $Y_{ij} = \mu + \tau_j + e_{ij}$
- j is the group number and i is the subject number within that group.
- This is the General Linear Model.
- When testing the null hypothesis we are testing $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
- Or equivalently $H_0: \tau_1 = \tau_2 = \cdots = \tau_k = 0$

The General Linear Model

- Or equivalently $H_0: \tau_1 = \tau_2 = \cdots = \tau_k = 0$
- Why are these equivalent?
- So, it is starting to look like ANOVA is simply a special case of multiple regression.
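
A short sketch of why a null hypothesis about equal group means and one about zero group effects say the same thing, assuming equal-sized groups and writing $\mu_j$ for the mean of group $j$, $\mu$ for the grand mean and $\tau_j$ for the group effect (notation assumed here):

```latex
\begin{align*}
\mu_j &= \mu + \tau_j \qquad (j = 1, \dots, k) \\
\mu_1 = \mu_2 = \cdots = \mu_k
  &\iff \tau_1 = \tau_2 = \cdots = \tau_k \\
  &\iff \tau_j = 0 \text{ for all } j,
  \qquad \text{because } \sum_{j=1}^{k} \tau_j = 0 .
\end{align*}
```

The last step holds because effects that are all equal and sum to zero must each be zero.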

The General Linear Model

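- If this is true, then the F test for multiple R will be related to the F test for an ANOVA.
- Let $R^2$ be the proportion of the total SS accounted for by the independent variables.
- So $R^2 = SS_{\text{between}} / SS_{\text{total}}$
- And $SS_{\text{total}} = SS_{\text{between}} + SS_{\text{within}}$
- Therefore $SS_{\text{between}} = R^2 \, SS_{\text{total}}$
- By substitution $SS_{\text{within}} = (1 - R^2) \, SS_{\text{total}}$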

The General Linear Model

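- Recall our old formula for F in an ANOVA: $F = \dfrac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}$
- Substituting from the previous slide: $F = \dfrac{R^2 \, SS_{\text{total}} / (k - 1)}{(1 - R^2) \, SS_{\text{total}} / (N - k)} = \dfrac{R^2 / (k - 1)}{(1 - R^2) / (N - k)}$
- Which looks a lot like the F for $R^2$: $F = \dfrac{R^2 / p}{(1 - R^2) / (N - p - 1)}$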

The General Linear Model

- Which it is if k = p + 1,
- i.e. p = k - 1, where p is the number of predictors.
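
The equality of the two F ratios can be verified on a small data set. The sketch below uses made-up scores for k = 3 groups and exact fractions so the comparison is not blurred by rounding. It relies on the fact that a regression on k - 1 dummy predictors predicts each group's mean, so its R squared is the between-groups share of the total sum of squares.

```python
from fractions import Fraction

# Made-up scores (assumption) for k = 3 groups of 3 subjects each.
groups = [[2, 3, 4], [6, 7, 8], [0, 1, 2]]

k = len(groups)
N = sum(len(g) for g in groups)
grand_mean = Fraction(sum(sum(g) for g in groups), N)
means = [Fraction(sum(g), len(g)) for g in groups]

# ANOVA sums of squares.
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
ss_total = ss_between + ss_within

# F from the ANOVA formula.
f_anova = (ss_between / (k - 1)) / (ss_within / (N - k))

# F from R^2 with p = k - 1 predictors.
p = k - 1
r_squared = ss_between / ss_total
f_regression = (r_squared / p) / ((1 - r_squared) / (N - p - 1))

print(f_anova, f_regression)
```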

The two-way ANOVA as regression

- Example: Suppose we have a 2-way dichotomous ANOVA.
- Say a researcher wishes to show that gender and breastfeeding can affect height.
- We have then 2 IVs, gender and breastfeeding, each with 2 levels.
- Height is the dependent variable.
- For the regression we can code X1 = gender: female = -1, male = 1.
- We can code X2 = breastfed: no = -1, yes = 1.
- Our GLM is similar to before except that now we have an interaction term: $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$
- Where i is the level for gender and j is the level for breastfed.
- k is the index for the individual.

The two-way ANOVA as regression

- The interaction X1X2 is treated as an additional predictor.
- If we run a regression on some height data, we get a fitted equation of the form $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_1 X_2$
- What happened to the residual?
- We can test the significance of each regression slope by performing an F test on the corresponding semipartial correlation.
- It turns out this is equivalent to determining the significance of either the main effects or the interaction in a 2-way ANOVA.
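
A small numerical sketch of this coding, using made-up heights in centimetres with two subjects per cell (the data are an assumption, not from the chapter). In a balanced design the intercept, X1, X2 and X1X2 columns are mutually orthogonal, so each least-squares coefficient reduces to the sum of x times y divided by N.

```python
# Made-up heights in cm (assumption), two subjects per cell.
# Codes: gender female = -1, male = 1; breastfed no = -1, yes = 1.
cells = {
    (-1, -1): [159, 161],  # female, not breastfed
    (-1, 1): [163, 165],   # female, breastfed
    (1, -1): [171, 173],   # male, not breastfed
    (1, 1): [179, 181],    # male, breastfed
}

# One row per subject: intercept, X1, X2, the product X1*X2, and height.
rows = [(1, x1, x2, x1 * x2, y) for (x1, x2), ys in cells.items() for y in ys]
N = len(rows)

# Orthogonal +/-1 columns: each coefficient is sum(x * y) / N.
b = [sum(r[j] * r[4] for r in rows) / N for j in range(4)]
b0, b1, b2, b3 = b
print(b)

# The fitted equation returns each cell's mean; the residual is whatever
# is left over for each individual.
residuals = [y - (b0 + b1 * x1 + b2 * x2 + b3 * x12) for _, x1, x2, x12, y in rows]
print(residuals)
```

Here b0 is the grand mean, b1 and b2 carry the two main effects, and b3 carries the interaction. The fitted equation has no residual term because it predicts the cell mean rather than an individual's score.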

The two-way ANOVA as regression

- Given the following equation from the regression,

- Can you fill in the table below as we would when running an ANOVA?

Exercises

- Page 576
- 1, 2 (assume equal sized groups in c), 5

