Title: Missing income data in the millennium cohort study
1 Missing income data in the millennium cohort study: Evidence from the first two sweeps
- Authors: Denise Hawkes and Ian Plewis
- Discussant: Nicholas Biddle
- nicholas.biddle_at_anu.edu.au
2 Introduction and overview
- Data: Millennium Cohort Study
- Research questions: What are the factors associated with non-response? More specifically:
  - Are there within-household and individual correlations for missing income data?
  - Is the sex of the interviewer an important explanatory variable?
  - How is missing data in sweep one related to missing data in sweep two?
  - Is attrition at sweep two related to the level of household income or the failure to provide data in sweep one?
- Method
  - Descriptive analysis
  - Binary and multinomial logit models with non-response as the dependent variable
  - Binary logit with attrition between sweep one and sweep two as the dependent variable
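
As a minimal sketch of the binary logit named under Method, assuming a single yes/no covariate: with one dummy variable the maximum-likelihood fit has a closed form, and the exponentiated slope equals the odds ratio of the 2x2 table. The counts below are invented for illustration and are not taken from the Millennium Cohort Study; the function name is hypothetical.

```python
import math

# Hypothetical 2x2 counts (NOT from the study):
# first key = employment status, second key = income item outcome.
counts = {
    ("self_employed", "missing"): 30,
    ("self_employed", "reported"): 170,
    ("employee", "missing"): 40,
    ("employee", "reported"): 1760,
}


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fit_single_dummy_logit(counts):
    """Closed-form MLE for logit(P(missing)) = b0 + b1 * self_employed.

    With one binary covariate the model is saturated, so the fitted
    probabilities are just the observed proportions in each group.
    """
    p_ref = counts[("employee", "missing")] / (
        counts[("employee", "missing")] + counts[("employee", "reported")]
    )
    p_se = counts[("self_employed", "missing")] / (
        counts[("self_employed", "missing")] + counts[("self_employed", "reported")]
    )
    b0 = logit(p_ref)
    b1 = logit(p_se) - b0
    return b0, b1


b0, b1 = fit_single_dummy_logit(counts)
odds_ratio = math.exp(b1)
print(f"intercept={b0:.3f}, slope={b1:.3f}, odds ratio={odds_ratio:.2f}")
```

With several covariates there is no closed form and the model would normally be fitted iteratively by a statistics package; the interpretation of each exponentiated coefficient as an odds ratio is unchanged.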
3 Data
- Millennium Cohort Study
  - First sweep: 18,819 babies born in the UK from 1st September 2000 (from 18,552 families). Interviewed when baby was 9 months old
  - Second sweep: 14,898 families from original sample and 692 new families. Interviewed when children were around 3 years old
  - Information from main respondent (usually mother) and partner of respondent (usually father)
- Incomplete information on income through
  - Unit non-response (response rate 72% in first sweep)
  - Partner non-response (88% of families with partners responded)
  - Item non-response for income (6% of main respondents and partners did not provide income data)
  - Attrition between sweeps (79% of eligible families responded in sweep two)
- Income information
  - Collected from those currently doing paid work, those who have a paid job but are on leave, and those who have worked in the past but have no current job
  - For employees: total take-home pay and gross pay
  - For self-employed: amount you personally took out of the business after all taxes and costs
6 Patterns of income response
- Original sample (paper has information on new families and proxies)

                         Sweep one           Sweep two
                         Main     Partner    Main     Partner
Income response (%)      45.9     64.7       50.6     62.9
Don't know (%)           1.8      2.1
Refusal (%)              0.9      2.1
Total non-response (%)   2.7      4.3        4.4      8.7
Not applicable (%)       51.5     31.0       45.1     28.4
Sample                   18,552   18,552     14,898   14,898
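
The table can be checked with a little arithmetic. The sketch below assumes that the don't know and refusal rows apply to sweep one only (the reading that makes each column add up) and that "not applicable" respondents were never asked the income question. It verifies that each column sums to roughly 100% and re-expresses non-response as a share of those to whom the question applied. The dictionary keys and function names are illustrative.

```python
# Percentages transcribed from the table above (original sample).
table = {
    "sweep one, main": {"income": 45.9, "total_nr": 2.7, "na": 51.5},
    "sweep one, partner": {"income": 64.7, "total_nr": 4.3, "na": 31.0},
    "sweep two, main": {"income": 50.6, "total_nr": 4.4, "na": 45.1},
    "sweep two, partner": {"income": 62.9, "total_nr": 8.7, "na": 28.4},
}


def column_total(col):
    """Sum of the three mutually exclusive categories, in percent."""
    return col["income"] + col["total_nr"] + col["na"]


def nonresponse_among_applicable(col):
    """Non-response as a percentage of respondents asked about income."""
    return 100.0 * col["total_nr"] / (col["income"] + col["total_nr"])


for name, col in table.items():
    print(
        f"{name}: column total = {column_total(col):.1f}, "
        f"non-response among applicable = {nonresponse_among_applicable(col):.1f}%"
    )
```

Conditioning on applicability matters because roughly half of main respondents fall in the "not applicable" category, so the raw column percentages understate how often an applicable respondent fails to report income.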
8 Partner and main respondent income response
Sweep one (row %)

                            Partner respondent
Main respondent (n)         Don't know/refusal   Not applicable   Income response
Don't know/refusal (464)    26.6                 27.4             45.9
Not applicable (10,264)     3.9                  42.7             53.4
Income response (7,824)     3.5                  18.0             78.5
9 Partner and main respondent income response
Sweep two (row %)

                            Partner respondent
Main respondent (n)         Don't know/refusal   Not applicable   Income response
Don't know/refusal (614)    26.7                 29.0             44.3
Not applicable (7,190)      9.6                  36.5             54.0
Income response (7,094)     6.5                  21.0             72.5
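
One way to summarise the within-household association in the two cross-tabulations above is an odds ratio: the odds that the partner fails to report income when the main respondent also failed, relative to the odds when the main respondent reported. This is a descriptive sketch only. It drops the "not applicable" partner column, which is an analytical choice made here and not one stated in the slides, and it is not the logit model from the paper.

```python
# Row percentages from the cross-tabulations above, ordered as
# (partner don't know/refusal, partner not applicable, partner income response).
rows = {
    "sweep one": {
        "main_nonresponse": (26.6, 27.4, 45.9),
        "main_response": (3.5, 18.0, 78.5),
    },
    "sweep two": {
        "main_nonresponse": (26.7, 29.0, 44.3),
        "main_response": (6.5, 21.0, 72.5),
    },
}


def partner_nonresponse_odds(row):
    """Odds of partner non-response vs. response, ignoring 'not applicable'."""
    nonresponse, _not_applicable, response = row
    return nonresponse / response


def odds_ratio(sweep):
    """Odds ratio comparing main non-responders with main responders."""
    return partner_nonresponse_odds(sweep["main_nonresponse"]) / partner_nonresponse_odds(
        sweep["main_response"]
    )


for name, sweep in rows.items():
    print(f"{name}: odds ratio = {odds_ratio(sweep):.1f}")
```

An odds ratio well above 1 in both sweeps is what a within-household correlation in missing income data would look like in these tables.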
10 Sweep one and sweep two income response: Main respondent
Row %

                            Sweep two
Sweep one (n)               Don't know/refusal   Not applicable   Income response
Don't know/refusal (357)    17.9                 26.7             55.4
Not applicable (7,733)      2.9                  74.4             22.8
Income response (6,504)     5.3                  14.8             79.9
11 Sweep one and sweep two income response: Partner
Row %

                            Sweep two
Sweep one (n)               Don't know/refusal   Not applicable   Income response
Don't know/refusal (501)    35.2                 0.4              64.4
Not applicable (1,778)      22.9                 2.4              74.7
Income response (7,433)     8.7                  0.1              91.2
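
The persistence of non-response across sweeps can be read off the two transition tables as a simple ratio: the sweep-two non-response rate among those who did not report income in sweep one, divided by the rate among those who did. The sketch below uses the tabulated row percentages directly; the names are illustrative, and the ratio is a raw comparison, not an adjusted model estimate.

```python
# Percentage giving don't know/refusal in sweep two, by sweep-one status,
# taken from the main respondent and partner transition tables above.
sweep_two_nonresponse = {
    "main": {"nonresponse_s1": 17.9, "response_s1": 5.3},
    "partner": {"nonresponse_s1": 35.2, "response_s1": 8.7},
}


def persistence_ratio(group):
    """Ratio of sweep-two non-response rates: sweep-one non-responders vs. responders."""
    return group["nonresponse_s1"] / group["response_s1"]


for name, group in sweep_two_nonresponse.items():
    print(f"{name}: persistence ratio = {persistence_ratio(group):.2f}")
```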
12 Modelling non-response: Main respondent
Columns: Sweep one | Sweep two Spec. (I) | Spec. (II) | Spec. (III)
Rows with fewer than four entries have blank cells in the remaining columns.
- Self employed: 6.4, 6.8, 6.6, 6.7
- Has a partner: 0.58, 0.57, 0.56
- Social class (reference: managerial and professional)
  - Intermediate: 1.6
  - Small employers and self employment: 1.8
  - Lower supervisors and technical
  - Semi routine and routine
- Ethnicity (reference: white)
  - Mixed
  - Indian: 2.4, 2.3, 2.3
  - Pakistani and Bangladeshi
  - Black or Black British: 1.6
  - Other ethnic group: 2.3
- Country (reference: England)
  - Wales
  - Scotland
  - Northern Ireland: 1.7, 1.5
- Respondent did not respond in sweep one: -, -, 3.0, 3.0
- Respondent same in sweep one and two: -, -, -, 5.3
- Sample size: 8,190, 5,800, 5,800, 5,800
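
The slides do not say how the entries in this table are scaled. Assuming they are odds ratios from the logit models, the sketch below shows how such a value translates into a change in probability. The 5% baseline non-response rate is a hypothetical figure chosen for illustration, not a model estimate, and the function name is illustrative.

```python
def shifted_probability(baseline_prob, odds_ratio):
    """Probability after multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


BASELINE = 0.05  # hypothetical reference-group non-response probability

# Values taken from the table above, treated here as odds ratios (assumption).
for label, ratio in [("self employed", 6.4), ("has a partner", 0.58)]:
    prob = shifted_probability(BASELINE, ratio)
    print(f"{label}: odds ratio {ratio} -> probability {prob:.3f}")
```

Because the mapping from odds to probability is non-linear, the same odds ratio implies a different change in percentage points at a different baseline.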
14 Modelling non-response: Partner (I)
Columns: Sweep one | Sweep two Spec. (I) | Spec. (II) | Spec. (III)
Rows with fewer than four entries have blank cells in the remaining columns.
- Self employed: 1.7, 3.6, 3.6, 3.6
- Social class (reference: managerial and professional)
  - Intermediate
  - Small employers and self employment: 3.0
  - Lower supervisors and technical: 0.68
  - Semi routine and routine: 0.66
- NVQ levels (reference: none)
  - NVQ Level 1
  - NVQ Level 2: 0.63
  - NVQ Level 3: 0.59
  - NVQ Level 4: 0.47
  - NVQ Level 5: 0.34
  - Other/overseas qual only
- Ethnicity (reference: white)
  - Mixed: 2.3, 2.4, 2.5
  - Indian: 1.8, 2.5, 2.3, 2.3
  - Pakistani and Bangladeshi: 2.2, 2.4, 2.2, 2.2
  - Black or Black British
  - Other ethnic group: 2.0
- Owner occupier: 0.76, 0.76, 0.77
16 Modelling non-response: Partner (II)
Columns: Sweep one | Sweep two Spec. (I) | Spec. (II) | Spec. (III)
- Country (reference: England)
  - Wales
  - Scotland
  - Northern Ireland: 1.9, 1.5, 1.6, 1.6
- Respondent did not respond in sweep one: -, -, 4.6, 4.5
- Respondent same in sweep one and two: -, -, -, 0.39
- Sample size: 10,754, 7,893, 7,893, 7,893
17 Other modelling: Multinomial logit and attrition
- Multinomial logit: response vs. don't know vs. refusal
  - Main respondent
    - Self-employed significantly more likely to answer don't know only, not to refuse
    - Same with social class variables
    - Black or Black British as well as Northern Ireland more likely to refuse
  - Partner respondent
    - Self-employed significantly more likely both to refuse and to answer don't know
    - NVQ levels and ethnicity both associated with refusal
- Attrition at sweep two
  - Higher income in sweep one associated with lower odds of attrition between sweep one and sweep two
  - Main income and partner income non-response in sweep one associated with higher odds of attrition between sweep one and sweep two
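
To make the three-way comparison concrete, here is a minimal sketch of how a multinomial logit turns covariates into probabilities for response, don't know and refusal, with response as the reference outcome. All coefficients are hypothetical and chosen only so that self-employment raises the don't know probability more than the refusal probability; they are not estimates from the paper.

```python
import math

OUTCOMES = ("response", "dont_know", "refusal")

# Hypothetical coefficients. The reference outcome ("response") has all zeros,
# which is the usual identification constraint in a multinomial logit.
COEFS = {
    "response": {"intercept": 0.0, "self_employed": 0.0},
    "dont_know": {"intercept": -3.5, "self_employed": 1.8},
    "refusal": {"intercept": -4.0, "self_employed": 0.2},
}


def outcome_probabilities(covariates):
    """Softmax over the linear predictors of each outcome."""
    scores = {}
    for outcome in OUTCOMES:
        beta = COEFS[outcome]
        scores[outcome] = beta["intercept"] + sum(
            beta[name] * value for name, value in covariates.items()
        )
    denominator = sum(math.exp(score) for score in scores.values())
    return {outcome: math.exp(score) / denominator for outcome, score in scores.items()}


employee = outcome_probabilities({"self_employed": 0})
self_employed = outcome_probabilities({"self_employed": 1})
for outcome in OUTCOMES:
    print(f"{outcome}: employee={employee[outcome]:.3f}, self employed={self_employed[outcome]:.3f}")
```

Separating the two kinds of non-response in this way is what allows a covariate to be associated with one kind and not the other, which a single binary logit would average together.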
19 Summary
- Household and individual correlations for missing income data
  - Self employment, some ethnic groups (though not consistent), Northern Ireland
- The sex of the interviewer is not an important explanatory variable for income non-response
- Some variables associated only with don't know or only with refusal
- Missing data in sweep one associated with higher odds of missing data in sweep two
  - Especially amongst partner respondents
- Higher household income in sweep one associated with lower attrition in sweep two
- Missing data in sweep one associated with higher attrition in sweep two
20 Suggested further work and information
- Models for non-response
  - More diagnostic information (e.g. tests of group significance)
  - Information on the child?
- Interviewer bias
  - Multilevel model?
  - Interactions or other information on the interviewer
- Implications for survey design
  - Difference between don't know and refusal
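
On the multilevel suggestion: if respondents were nested within interviewers in a random-intercept logit model, one common summary of interviewer clustering is the latent-variable intraclass correlation, which sets the individual-level variance to that of the standard logistic distribution, pi^2/3. The sketch below assumes that formulation. The interviewer variances are hypothetical, since the slides report no multilevel estimates.

```python
import math


def latent_icc(interviewer_variance):
    """Latent-variable intraclass correlation for a random-intercept logit.

    The individual-level residual variance is fixed at pi^2 / 3, the variance
    of the standard logistic distribution.
    """
    residual_variance = math.pi ** 2 / 3.0
    return interviewer_variance / (interviewer_variance + residual_variance)


# Hypothetical interviewer-level variances, for illustration only.
for variance in (0.1, 0.5, 1.0):
    print(f"interviewer variance {variance}: ICC = {latent_icc(variance):.3f}")
```

A value near zero would suggest that interviewers contribute little to the variation in income non-response once respondent characteristics are accounted for.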