Winning the War of Attrition? Sampling, response analysis and weighting using the National Pupil Database

1
Winning the War of Attrition? Sampling, response
analysis and weighting using the National Pupil
Database
  • James Halse
  • Young People Analysis, DCSF
  • james.halse_at_dcsf.gsi.gov.uk

2
Overview
  • The way we were sampling from school records
    for the Youth Cohort Studies (YCS)
  • A new way of sampling for the Longitudinal Study
    of Young People in England (LSYPE)
  • Analysis of response rates and non-response bias
    using the NPD
  • Weighting for non-response on LSYPE
  • Applying the lessons learned to the next cohort
    of the YCS

3
The way we were - the YCS
  • Youth Cohort Studies were a multimode panel study
    of young people starting in the spring after year
    11 and following these young people 1, 2 and 3
    years later
  • In theory a simple random sample - the Department
    wrote to all schools and asked for names and
    addresses of pupils born on 3 dates within any
    month (e.g. 5th, 15th, 25th)
  • Issued sample drawn from information provided by
    schools
  • Some attempt to correct for school non-response
  • For cohorts 11 and 12, attempt to increase the
    number of young people from ethnic minorities by
    oversampling in LAs with a high proportion of
    pupils from minority ethnic groups

4
YCS response
  • Non-response and attrition are a big problem
  • Attempts to deal with this by increasing the
    sample size

5
Non-response bias
  • But the real concern is differential
    non-response, especially over 4 sweeps

YCS cohort 11 respondents at each sweep, by year
11 attainment

6
Achieved sample sizes by selected characteristics
and sweep, YCS cohort 12

7
YCS Weighting for non-response
  • Cell weighting at sweep 1 (attainment, region,
    school type and sex)
  • CHAID for sweep 2 onwards using information
    collected at previous sweeps
  • Lowest response rate is at initial sweep, but
    this is the stage at which we have least
    information for non-response weighting
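
A minimal sketch of cell weighting as described above, assuming the weight in each cell is the population count divided by the respondent count. The cells, counts and the function name `cell_weights` are made up for illustration and are not taken from the YCS.

```python
from collections import Counter


def cell_weights(population_cells, respondent_cells):
    """Return one weight per cell: population count / respondent count."""
    pop = Counter(population_cells)
    resp = Counter(respondent_cells)
    return {cell: pop[cell] / n_resp for cell, n_resp in resp.items()}


# Illustrative (made-up) data: each cell is (attainment band, sex).
population = (
    [("high", "F")] * 40 + [("high", "M")] * 40
    + [("low", "F")] * 10 + [("low", "M")] * 10
)
respondents = (
    [("high", "F")] * 20 + [("high", "M")] * 16
    + [("low", "F")] * 4 + [("low", "M")] * 2
)

weights = cell_weights(population, respondents)

# Summing the weights over respondents recovers the population size.
weighted_total = sum(weights[cell] for cell in respondents)
print(weights, weighted_total)
```

Cells with the lowest response (here low-attaining males) receive the largest weights, which is where large weight differentials come from.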

8
Problems with the YCS
  • Burden on schools to provide details for sample
    frame
  • Boosting number of sample members from LAs or
    schools with high proportion of minority ethnic
    pupils was inefficient
  • Declining response rates and differential
    non-response led to very small sample sizes for
    some groups by 3rd or 4th sweep
  • Little information for sweep 1 non-response
    weighting
  • Large differentials in non-response weights,
    leading to large design effects and reduced
    sample efficiency (55% efficient at 11.4)
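
To illustrate how unequal weights reduce sample efficiency, the sketch below uses Kish's approximation for the effective sample size, (sum of w)^2 / (sum of w^2). That the efficiency figure on the slide was computed this way is an assumption, and the weights are made up.

```python
def weighting_efficiency(weights):
    """Kish's approximation: effective sample size divided by actual size."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    n_eff = total * total / total_sq
    return n_eff / n


# Made-up weights: equal weights lose nothing, unequal weights do.
equal = [1.0] * 100
unequal = [1.0] * 50 + [3.0] * 50

eff_equal = weighting_efficiency(equal)
eff_unequal = weighting_efficiency(unequal)

# The design effect due to weighting is the reciprocal of the efficiency.
deff_unequal = 1.0 / eff_unequal
print(eff_equal, eff_unequal, deff_unequal)
```

Under this approximation, an efficiency of 0.8 means the weighted sample of 100 carries roughly the precision of an equally weighted sample of 80.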

9
Things can only get better: the Longitudinal
Study of Young People in England (LSYPE)
  • Similar to YCS in that it is a study of
    transitions from compulsory education, but
  • Face to face
  • Started when pupils were in year 9 (age 13/14)
  • Plan to continue till young people are aged 25
  • Includes interviews with parents
  • Much more detailed (e.g. attitudes to school,
    bullying, parental employment histories)
  • Used incentives (conditional at wave 1,
    unconditional thereafter)
  • For LSYPE use a 2 stage Probability Proportional
    to Size (PPS) design with schools as PSUs
  • Sample drawn directly from PLASC
  • But had to approach schools for contact details
    so drew a large enough sample to allow for some
    non-cooperation from schools

10
LSYPE: Sampling schools
  • Maintained schools stratified into
    deprived/non-deprived
  • Deprived schools sampled at 1.5 times the
    sampling fraction of non-deprived schools
  • Within each stratum, a size measure was
    calculated dependent on number of pupils from
    major ethnic minority groups (Indian, Pakistani,
    Bangladeshi, Black African, Black Caribbean,
    Mixed) in year 8 at that school
  • A small sample of independent schools also
    selected

11
Sampling pupils
  • Within each school, selection probabilities were
    calculated for pupils to achieve issued sample
    targets of 1,000 from each of the main ethnic
    minority groups
  • Importantly, the way ethnic minorities were
    boosted means that all pupils within an ethnic
    group and within a school deprivation stratum
    were sampled with the same probability as one
    another

12
LSYPE response
  • About three quarters of schools sampled cooperated
  • Of the issued sample, the overall response rate
    was 74% (including partial responses)
  • Some evidence of response bias

13
Analysis of LSYPE response
  • Use NPD to analyse school non-response and
    pupil-level non-response separately
  • Run logistic regression models to find variables
    associated with propensity to respond
  • Start with variables in sample frame and add
    attainment variables
  • For school non-response, significant terms in the
    model were deprivation strata and whether or not
    the school was in London
  • For pupil non-response, significant terms were
    attainment, ethnicity and region, plus an
    interaction between White ethnicity and region

14
LSYPE non-response weighting: wave 1
  • School non-response and pupil non-response
    treated separately
  • Logistic regression model used to estimate
    probability of response p
  • To create weights, take reciprocal of p (i.e.
    1/p) and rescale by dividing by mean of 1/p
  • School non-response and pupil level non-response
    weights combined with design weights to create
    final weight
  • Generally speaking, non-response weights are
    inversely correlated with design weights, so
    there is only a small loss of efficiency
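
The weight construction above can be sketched as follows. The response probabilities stand in for fitted values from the logistic regression models and are made up. Combining the school, pupil and design weights by multiplication is an assumption, since the slide only says they are "combined".

```python
def nonresponse_weights(p_response):
    """Take 1/p for each case, then rescale by dividing by the mean of 1/p."""
    raw = [1.0 / p for p in p_response]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


# Made-up estimated response probabilities for four responding pupils.
p_school = [0.8, 0.8, 0.6, 0.6]  # probability that the pupil's school cooperated
p_pupil = [0.5, 0.8, 0.8, 0.4]   # probability that the pupil responded
design = [1.0, 1.0, 2.0, 0.5]    # design weights (made up)

w_school = nonresponse_weights(p_school)
w_pupil = nonresponse_weights(p_pupil)

# Assumed combination rule: multiply the three components.
final = [d * ws * wp for d, ws, wp in zip(design, w_school, w_pupil)]
print(w_pupil, final)
```

After rescaling, each set of non-response weights has mean 1, so the weights redistribute influence towards cases with low response propensity without changing the overall scale.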

15
LSYPE waves 2 and 3: response
  • Good response rates (89% at wave 2, 93% at wave 3)
  • Model response using both NPD variables and
    information collected at earlier sweeps
  • NPD variables had a stronger association with
    propensity to respond at wave 2 than at wave 1
  • Adding survey variables to the model explains
    only a little more than the NPD variables alone

16
YCS 13
  • Similar sample design to LSYPE
  • Face to face
  • 2 stage PPS design
  • Oversample ethnic minorities using school census
  • But
  • Oversample low attainers (defined as those with
    no A-Cs and fewer than 5 D-Gs) by a factor of 2
  • Postcode sectors are PSUs as opposed to schools
    (smaller design effects)
  • Full address collected through school census,
    bypassing the need to go through schools

17
YCS 13 response (maintained sector)
  • Note the high proportion of movers and address
    problems

18
YCS 13 response by selected characteristics

19
Benefits of sampling from the NPD
  • Wealth of information from which to design your
    sample
  • Run simulations to help decide on the optimum
    design for your requirements and budget
  • Easy to oversample key groups of interest and/or
    those least likely to respond
  • Lots of information to use for non-response
    weighting
  • Now that addresses are collected through school
    census, school non-cooperation is not an issue
  • Can follow up drop-outs longitudinally through
    the admin data

20
Drawbacks of sampling from the NPD
  • Address information missing or not up to date,
    but 2006 was the first year in which schools were
    required to supply addresses in the school census,
    so this should improve
  • Data quality in school census is a potential
    problem, e.g. discrepancies between
    census-reported and self-reported ethnicity

21
Any questions?
  • For more information on LSYPE see our page at
    ESDS longitudinal:
    http://www.esds.ac.uk/longitudinal/access/lsype/L5545.asp
  • YCS downloads and documentation:
  • http://www.esds.ac.uk/search/indexSearch.asp?ctxmlSnq133233

22
LSYPE sampling technical slides
  • Taken from "A new method for sample designs with
    disproportionate stratification", a paper given
    to the AAPOR annual conference 2005 by Peter Lynn,
    Patten Smith and Iain Noble

23
Sampling Method for LSYPE
  • Construct size measure $S_i$ in each PSU (school)
  • $S_i = \sum_k N_{ik} (n_k / N_k)$
  • Where
  • $S_i$ = the size measure for PSU $i$
  • $N_{ik}$ = the number in sub-population group $k$
    in PSU $i$
  • $n_k$ = number required in issued sample in
    sub-population group $k$
  • $N_k$ = number in sub-population group $k$ in the
    population.
  • Select $m$ PSUs with probability proportional to
    $S_i$
  • $P(\mathrm{PSU}_i) = m S_i / \sum_i S_i$

24
Method
  • Within each PSU, select 2nd stage units with
    probability $P_{jki}$
  • $P_{jki} = (n(s) / S_i) (n_k / N_k)$
  • Where
  • $P_{jki}$ = conditional probability of selecting
    2nd stage unit $j$ in sub-population group $k$ in
    PSU $i$.
  • $n(s)$ = total number to be selected in each PSU

25
Result
  • Overall probability of selection of 2nd stage
    unit, $P_{jk}$, is constant within sub-population
    $k$
  • $P_{jk} = n_k / N_k$
  • Total number selected in each PSU is fixed at
    n(s)
  • Therefore avoid precision losses through
    corrective (design) weighting and excessive
    variation in cluster sizes
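
The result above can be checked numerically. The sketch below uses a made-up frame of four schools and two sub-population groups; all counts, the targets and the value of m are illustrative assumptions. Multiplying the two stage probabilities gives m n(s) / (sum of S_i) times n_k/N_k, and the S_i sum to the total issued sample, so the sketch sets n(s) to the total issued sample divided by m.

```python
# Hypothetical frame: pupils[i][k] is N_ik, the number in group k at school i.
pupils = [
    {"A": 120, "B": 30},
    {"A": 60, "B": 90},
    {"A": 200, "B": 10},
    {"A": 20, "B": 70},
]
targets = {"A": 40, "B": 40}  # n_k: issued sample required per group (made up)
m = 2                         # number of PSUs (schools) to select (made up)

# N_k (population total per group) and the target fraction n_k / N_k.
totals = {k: sum(school[k] for school in pupils) for k in targets}
fraction = {k: targets[k] / totals[k] for k in targets}

# Size measure: S_i = sum over k of N_ik * (n_k / N_k).
S = [sum(school[k] * fraction[k] for k in targets) for school in pupils]

# Fixed take per school: n(s) = (sum of n_k) / m.
n_s = sum(targets.values()) / m

# Stage 1: P(PSU_i) = m * S_i / sum of S_i.
p_psu = [m * s / sum(S) for s in S]

# Stage 2: P_jki = (n(s) / S_i) * (n_k / N_k).
p_within = [{k: (n_s / s) * fraction[k] for k in targets} for s in S]

# Overall selection probability and expected number selected per school.
schools = range(len(pupils))
p_overall = [{k: p_psu[i] * p_within[i][k] for k in targets} for i in schools]
expected_take = [
    sum(pupils[i][k] * p_within[i][k] for k in targets) for i in schools
]
print(p_overall)
print(expected_take)
```

In this example every pupil in group A has overall probability 0.1 and every pupil in group B has 0.2, whichever school they attend, and each selected school yields an expected 40 pupils.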

26
LSYPE: some complications
  • Sample deprived schools (top quintile in
    students entitled to free school meals) at 1.5
    times the rate of other schools
  • Calculations resulted in P > 1 for some schools
  • Calculations resulted in P > 1 for students in some
    small schools (happens when $S_i < (n_k / N_k)\, n(s)$)
  • Small schools covering a small proportion of the
    student population: fieldwork inefficiencies
  • No data on current number of year 9 students

27
Dealing with the complications
  • Deprived schools: separate stratum with higher
    sampling fraction
  • Schools for which calculations give P > 1: sample
    with certainty and select pupils with appropriate
    sampling fraction for ethnic group
  • Small schools where calculations give P > 1 for
    students in a group: select all pupils in the
    group and apply weight
  • Small schools: for fieldwork efficiency reasons,
    omit schools for which the number of students
    selected would be less than 12
  • No information on number of year 9s: use previous
    number of year 8s as proxy, and then select new
    year 9 pupils during interviewer school visits
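
A rough sketch of the take-all rule for small schools, assuming the corrective weight is the uncapped probability (intended probability divided by the actual probability of 1). The slides do not give the weight formula, so this rule, the function name and the example values are illustrative assumptions.

```python
def within_school_selection(n_s, S_i, fraction_k):
    """Return (selection probability, corrective weight) for one group.

    n_s is the fixed take per school, S_i the school's size measure and
    fraction_k the target fraction n_k / N_k for the group.
    """
    p = (n_s / S_i) * fraction_k
    if p > 1.0:
        # Select every pupil in the group; weight up by the shortfall
        # (assumed rule: intended probability / actual probability of 1).
        return 1.0, p
    return p, 1.0


# Ordinary school: probability below 1, no corrective weight needed.
p_ok, w_ok = within_school_selection(n_s=40, S_i=16.0, fraction_k=0.2)

# Small school: S_i < (n_k / N_k) * n(s), so the raw probability exceeds 1.
p_small, w_small = within_school_selection(n_s=40, S_i=5.0, fraction_k=0.2)
print(p_ok, w_ok, p_small, w_small)
```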