Dia 1 - PowerPoint PPT Presentation

About This Presentation
Title:

Dia 1

Description:

4th meeting: Multilevel modeling: logistic regression. Subjects for today: What is logistic regression? Logistic regression in MLwiN. Event history analysis: an ... – PowerPoint PPT presentation

Provided by: Manfredte7

Transcript and Presenter's Notes


1
  • 4th meeting
  • Multilevel modeling: logistic regression
  • Subjects for today:
  • What is logistic regression?
  • Logistic regression in MLwiN
  • Event history analysis: an introduction

2

Up till now we discussed linear regression: it
was assumed that there is a linear relationship
between X and Y. In logistic regression this
assumption is replaced by a curvilinear (S-shaped)
relationship.
This logistic curve is very interesting: it is
never below 0 and never larger than 1. This
makes it suitable for the analysis of a dependent
variable that has only the scores 0 and 1.
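As a small illustration of why the curve stays between 0 and 1, here is a minimal Python sketch of the standard logistic function. The function name `logistic` and the chosen input values are only for illustration and are not part of the original slides.

```python
import math

def logistic(z):
    """Logistic curve: maps any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Evaluate the curve over a range of (illustrative) values
zs = [-6, -3, -1, 0, 1, 3, 6]
curve = [logistic(z) for z in zs]

for z, p in zip(zs, curve):
    print(f"z = {z:+d}  p = {p:.3f}")
```

The values flatten out near 0 at the low end and near 1 at the high end, which is the S-shape described on this slide.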
3
Take for instance a test in Math: you may pass
the test (1) or you may fail the test (0)
(data: LOGSCHOOL.sav). Now we may want to
know whether white students pass the test more
frequently than non-whites.
Of the white students about 44% pass the
test; of the non-white students about 25% pass.
4
There is more to it: for non-whites there is a
chance of .25 to pass the test (p1), and a chance
of .75 to fail (p0). The odds then are p1/p0 =
.25/.75, or 1/3: out of every 4, 1 will pass and 3
will fail. For whites the odds are .45/.55,
about 8/10: out of every 18, 8 will pass and 10
will fail. Now the ratio between the odds tells
us something about the relationship between
passing the test and race: (8/10) / (1/3) = 2.4
(2.33 when estimated by the model, see slide 8).
This is called an odds ratio; it tells us how
many times a particular odds (a) differs from
another odds (b). When the odds ratio is 1 there
is no relationship. Odds ratios range between 0
and infinity.
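The odds arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the helper name `odds` is made up for illustration, and the result depends on rounding (the rounded fractions 8/10 and 1/3 give 2.4, the proportions .45 and .25 give a slightly higher value, and 44% gives a slightly lower one).

```python
def odds(p1):
    """Odds of passing: p1 / p0, with p0 = 1 - p1."""
    return p1 / (1.0 - p1)

odds_nonwhite = odds(0.25)   # .25 / .75 = 1/3
odds_white = odds(0.45)      # .45 / .55, about 8/10

# Odds ratio from the proportions and from the rounded fractions on the slide
odds_ratio = odds_white / odds_nonwhite
odds_ratio_rounded = (8 / 10) / (1 / 3)

print(round(odds_nonwhite, 3), round(odds_white, 3))
print(round(odds_ratio, 2), round(odds_ratio_rounded, 2))
```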
5
No relationship between passing the math test and
race:
Of the white students about 40% pass the
test; of the non-white students about 40% pass.
Odds are 4/6 for both, so the odds ratio is 1!
6
Now, if we expand our analyses to x-variables
with more than 2 categories and/or use interval
variables, cross-tables are poor instruments.
Instead we use logistic regression
analysis. Equation: log(p1/p0) = a + b X (where
a and b are logit parameters), or p1/p0 = e^a * e^(b X),
where e^a is the odds (for X = 0) and e^b is the odds ratio.
Table: output of logistic regression; dependent
variable: pass test (1 = passed, 0 = failed);
x-variable: race (white = 1, other = 0)
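To make the link between the cross table and the logit parameters concrete, here is a hedged Python sketch. It assumes the rounded pass rates from slide 3 (25% and 44%) rather than the actual LOGSCHOOL.sav data, so the numbers will differ slightly from any real model output; the function names are illustrative only.

```python
import math

# Assumed pass proportions (rounded values from the cross table, not the raw data)
p_other, p_white = 0.25, 0.44

# With a single 0/1 x-variable, a is the log odds for X = 0
# and b is the log of the odds ratio between X = 1 and X = 0.
a = math.log(p_other / (1 - p_other))
b = math.log(p_white / (1 - p_white)) - a

def predicted_odds(x):
    """Odds p1/p0 implied by log(p1/p0) = a + b*x."""
    return math.exp(a + b * x)

def predicted_p1(x):
    """Probability of passing implied by those odds."""
    o = predicted_odds(x)
    return o / (1 + o)

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"odds = {math.exp(a):.3f}, odds ratio = {math.exp(b):.3f}")
print(f"p1(other) = {predicted_p1(0):.2f}, p1(white) = {predicted_p1(1):.2f}")
```

Going back from the logit parameters to probabilities recovers the two pass rates, which shows that for a binary x-variable the regression and the cross table carry the same information.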
7
With log(p1/p0) (with p1 the probability that Y = 1
and p0 the probability that Y = 0) we get the
logistic curve for p1.
[Figure: logistic curve of p1 (vertical axis) against X (horizontal axis)]
This is statistically a good idea because it
takes into account floor and ceiling effects, which
occur quite easily with 0/1 data.
8
Logistic regression in MLwiN
One level only:
Odds = e^-1.075 = 0.34 and odds ratio = e^.847 =
2.33 (where e = 2.7182)
Two levels:
ICC = .658 / (.658 + 3.29) = 0.17 (3.29 is the
variance of e). For testing use t = .658 / .061 = 10.8,
with df = number of level 2 units.
Alternatively use the macro 'vpc.txt' in MLwiN. See A
User's Guide to MLwiN 2.0, p. 131-134
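The numbers on this slide can be reproduced with a short Python sketch. The estimates are the ones quoted on the slide; treating .061 as the standard error of the school variance is an assumption based on how the t ratio is formed, and 3.29 is taken as pi^2 / 3, the variance of the standard logistic distribution. Variable names are illustrative.

```python
import math

a, b = -1.075, 0.847              # logit estimates quoted on the slide
odds_ref = math.exp(a)            # odds for the reference group (X = 0)
odds_ratio = math.exp(b)          # odds ratio for X = 1 versus X = 0

var_school = 0.658                # level 2 (school) variance from the slide
se_var_school = 0.061             # assumed to be its standard error
var_logistic = math.pi ** 2 / 3   # level 1 variance on the logit scale, about 3.29

icc = var_school / (var_school + var_logistic)
t_ratio = var_school / se_var_school

print(f"odds = {odds_ref:.2f}, odds ratio = {odds_ratio:.2f}")
print(f"ICC = {icc:.2f}, t = {t_ratio:.1f}")
```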
9
Adding another level 1 variable, this time an
interval variable. The parameter .299 says that
the log odds, log(p1/p0), increases by .299 for
every extra hour spent on homework. This
actually means that the chances to pass (p1) increase
when more time is spent on homework. In terms of
odds ratios: the odds to pass the test are
multiplied by e^.299 = 1.35 for every extra
hour.
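A brief Python sketch of what "multiplied for every extra hour" means in practice. The coefficient .299 comes from the slide; the starting odds of 1/3 are a hypothetical value chosen only to have something to multiply, not an estimate from the model.

```python
import math

b_homework = 0.299                   # logit estimate quoted on the slide
or_per_hour = math.exp(b_homework)   # multiplicative change in the odds per extra hour

def p_from_odds(o):
    """Convert odds p1/p0 back to the probability p1."""
    return o / (1 + o)

base_odds = 1 / 3   # hypothetical odds at 0 hours, for illustration only
probs = []
for hours in range(0, 5):
    o = base_odds * or_per_hour ** hours
    probs.append(p_from_odds(o))
    print(hours, round(o, 3), round(probs[-1], 3))
```

The odds grow by a constant factor per hour, while the probability rises in ever smaller steps as it approaches 1.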
10
Now maybe we want to include the level 2 variable
PUBLIC to test whether public schools do worse on
passing the Math test.
-0.6 tells us that the chances to pass go down when
the student is at a public school (odds ratio e^-0.6 = 0.55!)
11
Now maybe we want to test whether the effect of
homework on passing the test varies across schools.
The first step is to set the homework coefficient as
random (do not worry about the significance of the
(co)variance). The test of the interaction is on
the next slide.
12
We continue by including an interaction between
homework and public. It turns out that the
effect of homework is lower when the student is
at a private school (estimate .246) than at a
public school (effect .246 + .142 = .388).
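The interaction arithmetic can be sketched in Python as follows. The two estimates are the ones quoted on the slide; the coding PUBLIC = 1 for public and 0 for private schools is assumed from the way the effects are added, and the function name is illustrative.

```python
import math

b_homework = 0.246      # homework effect when PUBLIC = 0 (private school), from the slide
b_interaction = 0.142   # homework x PUBLIC interaction, from the slide

def homework_slope(public):
    """Effect of one extra hour of homework on the log odds, by school type."""
    return b_homework + b_interaction * public

for public in (0, 1):
    slope = homework_slope(public)
    print(public, round(slope, 3), round(math.exp(slope), 2))
```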
13
Introduction to event history analysis. Example:
the Spanish Flu, which took many casualties
worldwide just after 1918. Maybe we record the data
like this (I've made the data up, based on
http://en.wikipedia.org/wiki/1918_flu_pandemic;
the fake data are in FLU.sav, the SPSS syntax file to
create the person period file is in flu.sps).
Every row is an individual, and it is recorded
whether the person died from the flu during a certain
period.
14
Cross table of this data set:
We have 200 people in the period 1-7 and 100 died
within that period, but the effect of period is
totally wrong here.
15
Now we turn this into a person period file
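The conversion itself is done with flu.sps in the original material. As a rough illustration of the idea only, here is a Python sketch with three made-up persons; the record layout and the function name `to_person_period` are hypothetical and do not mirror FLU.sav.

```python
# Hypothetical person-level records: (person id, last period observed, died in that period?)
persons = [
    (1, 3, True),    # died in period 3
    (2, 7, False),   # survived all 7 periods
    (3, 1, True),    # died in period 1
]

def to_person_period(records):
    """Expand one row per person into one row per person per period at risk."""
    rows = []
    for pid, last_period, died in records:
        for period in range(1, last_period + 1):
            # The event is coded 1 only in the period in which the person died
            event = 1 if (died and period == last_period) else 0
            rows.append({"id": pid, "period": period, "event": event})
    return rows

person_period = to_person_period(persons)
for row in person_period:
    print(row)
```

Each person contributes one row for every period in which they were still at risk, so a person who died in period 3 appears three times.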
16
A person period file has an observation for every
period; this means that individuals are in the
file more than once. In case we have an event
that can occur only once, we have relatively
few observations coded 1 on the dependent variable.
This means that p1 (the chance to encounter
the event), which is called the hazard rate, is
rather low, while p0 (1 - p1) is rather high.
Recall that in logistic regression the odds are
p1/p0; because p0 is rather close to one, the odds
are approximately p1, so the odds ratio e^b is
approximately the number of times the
hazard rate increases.
A life table:
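A life table of this kind can be computed directly from a person period file. The sketch below uses a small made-up set of rows (not the FLU.sav data) and also prints the odds next to the hazard, to show that the two are close when the hazard is low.

```python
from collections import Counter

# Hypothetical person-period rows as (period, event) pairs; not the FLU.sav data
rows = (
    [(1, 0)] * 9 + [(1, 1)] * 1 +
    [(2, 0)] * 7 + [(2, 1)] * 2 +
    [(3, 0)] * 6 + [(3, 1)] * 1
)

at_risk = Counter(period for period, _ in rows)
events = Counter(period for period, event in rows if event == 1)

life_table = {}
for period in sorted(at_risk):
    hazard = events[period] / at_risk[period]   # p1: chance of the event, given at risk
    life_table[period] = {
        "at_risk": at_risk[period],
        "events": events[period],
        "hazard": hazard,
        "odds": hazard / (1 - hazard),          # p1 / p0
    }

for period, entry in life_table.items():
    print(period, entry["at_risk"], entry["events"],
          round(entry["hazard"], 3), round(entry["odds"], 3))
```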
17
The life table again, but now as a graph.
We now get a clear view of the period effect: in
period 4 the chances to die from the Spanish flu
were about .18!
Please note that this hazard rate is conditional
upon the period: it says that IF you survived all
periods before t = x, then the chance to die in
period t = x is p1(t).
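Because each hazard is conditional on having survived the earlier periods, the chance of surviving up to and including a period is the product of (1 - hazard) over all periods so far. The sketch below illustrates this with hypothetical hazards; only the value .18 for period 4 is taken from the slide, the other values are invented for illustration.

```python
# Hypothetical hazard rates per period; only .18 for period 4 comes from the slide
hazards = {1: 0.05, 2: 0.10, 3: 0.15, 4: 0.18, 5: 0.10, 6: 0.05, 7: 0.02}

survival = {}
still_alive = 1.0
for period in sorted(hazards):
    # Surviving this period requires surviving all earlier ones as well
    still_alive *= 1 - hazards[period]
    survival[period] = still_alive

for period in sorted(survival):
    print(period, hazards[period], round(survival[period], 3))
```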
18
A person period file can be analyzed with
multilevel models in at least 3 cases:
a) When you've got individuals nested in higher
levels like districts and your dependent variable
is a one-shot event (happens only once) → assignment 4.
b) In case you've got individuals and your
dependent variable is a recurrent event (happens
more than once), for instance unemployment: level
1 is individuals, level 2 is period.
c) In case you've got individuals nested in higher
levels like districts and your dependent variable
is a recurrent event (happens more than once): level 1
is individuals, level 2 is period and level 3 is
district.