Estimating Phone Service and Usage Percentages: How to Weight the Data from a Local, Dual-Frame Sample Survey of Cellphone and Landline Telephone Users in the United States

1
Estimating Phone Service and Usage Percentages:
How to Weight the Data from a Local, Dual-Frame
Sample Survey of Cellphone and Landline Telephone
Users in the United States
Thomas M. Guterbock TomG_at_virignia.edu
  • Presented at AAPOR 2009, Hollywood, FL,
    May 14, 2009

2
The Problem
  • Dual-frame telephone surveys are becoming more
    prevalent in U.S. survey research
  • The rising percentages and distinctive
    demographics of cellphone-only (CPO) households
    make it imperative that sample designs cover
    them.
  • Landline RDD + cellphone RDD sample frames
  • Result: sample data for 3 phone-service segments
      • CPO, overlap (dual-phone), landline-only (LLO)
  • Problem: what is the correct population
    distribution across the 3 phone-service segments?

3
National data? No problem
  • National Health Interview Survey (NHIS) data are
    the gold standard
  • Uses a very large N, continuous sampling,
    in-person mode to establish household phone
    service.
  • NHIS provides fairly current data on cellphone
    coverage, percent CPO, phone segment
    distributions
  • NHIS data are available for the U.S. and for four
    census regions
  • State estimates released in 2009 using CPS and
    NHIS
  • SOLUTION: Weight phone-service segments in the
    national sample to NHIS percents for U.S.

4
What about local studies?
  • We cannot assume that the local phone-service
    segment distribution is the same as national or
    regional averages.
  • Cellphone penetration and CPO lifestyle adoption
    vary considerably across areas.
  • Cell penetration is higher in high density areas,
    metro areas, high-income areas, flat terrain,
    near interstates
  • CPO percentage varies with age, ethnicity,
    urbanicity, landline phone costs
  • NHIS: strong phone-service variation across
    regions, states
  • Variation within states is probably similar in
    magnitude

5
Why not use percents from the local sample data?
  • In a local dual-frame sample, we will directly
    observe CPO in the cell sample, LLO in the
    landline sample.
  • But estimation from these observed percents is
    problematic for several reasons
  • If we just combine the two samples, we overlook
    the fact that overlap households are
    double-sampled.
  • It's not intuitively obvious how to calculate the
    percentages for the combined sample from the
    split sample results.

6
Why not use percents from the local sample data?
  • Cellphone-only cases are substantially
    overcounted in a cellphone sample.
  • CPOs have different telephone behaviors. More
    likely than dual-phone users . . .
      • To have phone with them
      • To have phone turned on
      • To accept calls from unknown numbers
  • Cellphone samples are usually kept small because
    of higher per-completion cost
  • So we can't just add up the segment counts from
    the two samples.

7
Can we use the local sample data?
  • Collected data from the two realized, local
    samples surely contain useful information about
    local phone-service segments
  • Overcounts of CPO and LLO distort these data
  • We have to do the math correctly
  • IDEA: Estimate the amount of CPO and LLO
    overcount in national dual-frame studies, and
    then apply an adjustment to the local sample data
    to arrive at local estimates for CPO and LLO

8
Overview: A proposed solution
  • Develop algebraic solution for combining the two
    sample results from a dual-frame design into an
    overall phone service segment distribution,
    assuming equal response rates.
  • Develop algebraic solution for combining the two
    samples when response rates are NOT equal
      • Higher response rates (overcounts) are assumed
        for CPO and LLO (compared to overlap)
  • Compare 2007 CHIS to 2007 NHIS (West region) to
    estimate response rate ratios that correspond
    to the observed overcount
  • Apply these ratios to newly collected dual-frame
    survey data from three counties in Virginia
  • Result: plausible, locality-specific estimates of
    phone segments

9
Key assumptions
  • Local phone-service segment distributions vary
      • Forcing NHIS segment distributions onto local
        data would distort results
  • Response rate ratios (rates of overcount) are
    constant across surveys
      • If fielding and screening procedures are
        similar
  • Sampling variability is ignorable
      • In comparison of NHIS to CHIS
      • In projection from the local samples to local
        population

10
How to combine dual-frame sample results
(equal response rates)
11
The universe of telephone households
  • All telephone households: 100%
12
Cell phone samples include some that are also in
the RDD frame
  • Cell phones (Frame 1): 81.1% of telephone
    households
  • Landline-only households are excluded
13
RDD samples cover all landline households
  • RDD (Frame 2): 86.8% of telephone households
  • Cell-phone-only households are excluded
14
RDD and cell samples overlap, yield complete
coverage
  • a: CPO (cell only), 13.2%, PaT = .132
  • ab: OVERLAP (cell + landline), 67.9%, PabT = .679
  • b: LLO (landline only), 18.9%, PbT = .189
  • These proportions define the population
    distribution of segments
  • All percentages are from 2007 NHIS data (West
    region).
15
With equal response rates, cell sample would
show . . .
  • Cell phones (Frame 1): 81.1% of telephone
    households (CPO PaT = .132, OVERLAP PabT = .679)
  • OVERLAP as percent of Frame 1:
    Pab' = .679/.811 = .837
  • CPO as percent of Frame 1:
    Pa' = .132/.811 = .163
  • All percentages are from 2007 NHIS data (West
    region).
16
With equal response rates, RDD sample would
show . . .
  • RDD (Frame 2): 86.8% of telephone households
    (LLO PbT = .189, OVERLAP PabT = .679)
  • OVERLAP as percent of Frame 2:
    Pab'' = .679/.868 = .783
  • LLO as percent of Frame 2:
    Pb'' = .189/.868 = .218
  • All percentages are from 2007 NHIS data (West
    region).
17
So, if response rates were equal, we would have
. . .
Segment   True values (NHIS West 2007)   Observed thru cell sample   Observed thru RDD sample
CPO       PaT     13.2                   Pa'     16.3
Overlap   PabT    67.9                   Pab'    83.7                Pab''   78.3
LLO       PbT     18.9                                               Pb''    21.7
Total            100.0                          100.0                       100.0
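The observed columns in this table can be reproduced from the NHIS West proportions with a few lines of Python. This is only an illustrative sketch: the function name `observed_shares` and the dictionary layout are assumptions made here, not part of the presentation.

```python
def observed_shares(p_cpo, p_overlap, p_llo):
    """Shares each frame would show if every segment responded at the same rate."""
    cell_frame = p_cpo + p_overlap   # Frame 1: all households with a cellphone
    rdd_frame = p_overlap + p_llo    # Frame 2: all households with a landline
    return {
        "cell": {"cpo": p_cpo / cell_frame, "overlap": p_overlap / cell_frame},
        "rdd": {"overlap": p_overlap / rdd_frame, "llo": p_llo / rdd_frame},
    }


# 2007 NHIS West-region proportions quoted on the slides.
shares = observed_shares(p_cpo=0.132, p_overlap=0.679, p_llo=0.189)
for frame, segments in shares.items():
    for segment, share in segments.items():
        print(f"{frame:>4} sample, {segment:<7}: {share:.3f}")
```

The results agree with the slide values to within rounding (the slides appear to have been computed from unrounded inputs).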
18
How do we get from observed percentages to
population percents?
Segment   True values (NHIS West 2007)   Observed thru cell sample   Observed thru RDD sample
CPO       PaT       ??                   Pa'     16.3
Overlap   PabT      ??                   Pab'    83.7                Pab''   78.3
LLO       PbT       ??                                               Pb''    21.7
Total            100.0                          100.0                       100.0
19
Formulas for calculating underlying population
distribution
With PabT and PaT evaluated, PbT follows as the
remainder, since the three segments sum to 100%.
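One way to write the relationship, reconstructed from the definitions on the preceding slides (a sketch, not necessarily the presenter's exact notation). Here P_a^T, P_{ab}^T, P_b^T stand for PaT, PabT, PbT; a single prime marks shares observed in the cell sample and a double prime marks shares observed in the RDD sample.

```latex
\begin{align*}
\frac{P_a'}{P_{ab}'} &= \frac{P_a^T}{P_{ab}^T},
&
\frac{P_b''}{P_{ab}''} &= \frac{P_b^T}{P_{ab}^T},
&
P_a^T + P_{ab}^T + P_b^T &= 1
\\[6pt]
P_{ab}^T &= \frac{1}{1 + \dfrac{P_a'}{P_{ab}'} + \dfrac{P_b''}{P_{ab}''}},
&
P_a^T &= P_{ab}^T \,\frac{P_a'}{P_{ab}'},
&
P_b^T &= 1 - P_{ab}^T - P_a^T
\end{align*}
```

With the observed values from the table (16.3/83.7 for the cell sample and 21.7/78.3 for the RDD sample), this gives an overlap share of about .679 and a CPO share of about .132, matching the NHIS West figures.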
20
Combining dual-frame sample results when
response rates are not equal
21
Three segments, four response rates
  • ra: cell sample response rate for CPOs
  • rab': cell sample response rate for overlap
  • rab'': RDD sample response rate for overlap
  • rb: RDD sample response rate for LLOs
22
4 response rates, 2 response rate ratios
  • Reduction in base response for dual-phone in the
    cell sample is r1 = rab' / ra
  • This is the response rate ratio that applies to
    the cellphone sample.
  • Reduction in base response for dual-phone in the
    RDD sample is r2 = rab'' / rb
  • This is the response rate ratio for the RDD
    sample.


23
It follows that . . .
  • And our expressions for calculating true
    population phone service segments are modified by
    incorporating the response rate ratios
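A sketch of how the modified expressions can be written. It assumes r_1 = r_{ab}'/r_a for the cell sample and r_2 = r_{ab}''/r_b for the RDD sample, a reading consistent with the later statement that the overlap response rate is 37% of the CPO rate in the cellphone sample. The notation (P_a^T for PaT, and so on) is chosen here for illustration.

```latex
\begin{align*}
\frac{P_a'}{P_{ab}'} &= \frac{r_a\,P_a^T}{r_{ab}'\,P_{ab}^T}
                      = \frac{P_a^T}{r_1\,P_{ab}^T},
&
\frac{P_b''}{P_{ab}''} &= \frac{r_b\,P_b^T}{r_{ab}''\,P_{ab}^T}
                        = \frac{P_b^T}{r_2\,P_{ab}^T}
\\[6pt]
P_{ab}^T &= \frac{1}{1 + r_1\,\dfrac{P_a'}{P_{ab}'} + r_2\,\dfrac{P_b''}{P_{ab}''}},
&
P_a^T &= r_1\,P_{ab}^T\,\frac{P_a'}{P_{ab}'},
\qquad
P_b^T = r_2\,P_{ab}^T\,\frac{P_b''}{P_{ab}''}
\end{align*}
```

Setting r_1 = r_2 = 1 recovers the equal-response-rate case.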

24
How to calculate response rate ratios
  • Now assume that we have observed results from a
    dual-frame phone survey.
  • We also know the true population distribution.
  • We can calculate the response rate ratios
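When both the observed shares and the true population shares are known, each ratio can be isolated directly. The expressions below are a reconstruction under the assumption that r_1 is the overlap-to-CPO response rate ratio in the cell sample and r_2 the overlap-to-LLO ratio in the RDD sample; primes mark cell-sample shares and double primes RDD-sample shares.

```latex
\begin{align*}
r_1 &= \frac{P_a^T}{P_{ab}^T}\cdot\frac{P_{ab}'}{P_a'},
&
r_2 &= \frac{P_b^T}{P_{ab}^T}\cdot\frac{P_{ab}''}{P_b''}
\end{align*}
```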

25
Deriving response rate ratios by comparing CHIS
2007 to NHIS
26
CHIS 2007: California Health Interview Survey
Segment   True values (NHIS West 2007)   Observed thru cell sample   Observed thru RDD sample
CPO       PaT     13.2                   Pa'     34.6
Overlap   PabT    67.9                   Pab'    65.4                Pab''   68.3
LLO       PbT     18.9                                               Pb''    32.7
Total            100.0                          100.0                       100.0
With equal response rates, the cell sample would
show Pa' = 16.3 and the RDD sample Pb'' = 21.7.
27
From these data we can evaluate r1 and r2
In the cellphone sample, overlap response rate is
only 37% of CPO rate (r1 = .368).
In the RDD sample, overlap response rate is about
60% of LLO rate (r2 = .598).
  • Overcount of CPOs is greater than overcount of
    LLOs.
  • This shows many dual-phone users still use
    cellphone as a secondary device.
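The cell-sample ratio can be checked against the CHIS and NHIS figures in the table above. This is a minimal sketch that assumes r1 is defined as the overlap response rate divided by the CPO response rate; the function name `response_rate_ratio` and its parameter names are hypothetical.

```python
def response_rate_ratio(true_exclusive, true_overlap, obs_exclusive, obs_overlap):
    """Overlap response rate relative to the single-service segment in one sample.

    Rearranges obs_exclusive / obs_overlap = true_exclusive / (r * true_overlap).
    """
    return (true_exclusive / true_overlap) * (obs_overlap / obs_exclusive)


# Cell sample: NHIS West 2007 population shares vs. CHIS 2007 observed shares.
r1 = response_rate_ratio(
    true_exclusive=0.132,  # PaT, CPO share in the population
    true_overlap=0.679,    # PabT, overlap share in the population
    obs_exclusive=0.346,   # Pa', CPO share observed in the cell sample
    obs_overlap=0.654,     # Pab', overlap share observed in the cell sample
)
print(f"r1 = {r1:.3f}")
```

The result is close to 0.37, in line with the 37% quoted for the cellphone sample. The same function applies to the RDD sample by passing the LLO and overlap shares instead.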

28
Calculating local area estimates of population
phone-service segment distributions
29
2008 Prince William County Survey
  • Citizen satisfaction survey in large, suburban
    county in Northern Virginia
  • N = 1,666
  • Triple-frame design: cellphone, landline RDD, and
    directory-listed sample
  • Here we combine the landline samples and treat as
    a dual-frame design
  • Screening questions patterned after those on CHIS

30
2008 Results for Prince William County, VA
Segment   Population                     Observed thru cell sample   Observed thru RDD sample
CPO       PaT                            Pa'     40.6                         0.7
Overlap   PabT                           Pab'    59.4                Pab''   88.5
LLO       PbT                                                        Pb''    10.5
Total            100.0                          100.0                       100.0
31
2008 Results for Prince William County, VA
Segment   True values for PWC            Observed thru cell sample   Observed thru RDD sample
CPO       PaT       ??                   Pa'     40.6                         0.7
Overlap   PabT      ??                   Pab'    59.4                Pab''   88.5
LLO       PbT       ??                                               Pb''    10.5
Total            100.0                          100.0                       100.0
32
Apply formulas given above
Calculations based on r1 = .368, r2 = .598
33
2008 Results for Prince William County, VA
Segment   True values for PWC            Observed thru cell sample   Observed thru RDD sample
CPO       PaT     19.0                   Pa'     40.6                         0.7
Overlap   PabT    75.3                   Pab'    59.4                Pab''   88.5
LLO       PbT      5.7                                               Pb''    10.5
Total            100.0                          100.0                       100.0
34
2008 Albemarle County Survey
  • Citizen satisfaction survey
  • Suburban and rural county surrounding City of
    Charlottesville, VA
  • Triple-frame design similar to the PWC survey
  • Smaller sample size: n = 700

35
2008 Results for Albemarle County, VA
Segment   Population                     Observed thru cell sample   Observed thru RDD sample
CPO       PaT                            Pa'     21.9                         0.2
Overlap   PabT                           Pab'    78.1                Pab''   82.7
LLO       PbT                                                        Pb''    17.2
Total            100.0                          100.0                       100.0
36
2008 Results for Albemarle County, VA
Segment   True values for Albemarle      Observed thru cell sample   Observed thru RDD sample
CPO       PaT      8.4                   Pa'     21.9                         0.2
Overlap   PabT    81.4                   Pab'    78.1                Pab''   82.7
LLO       PbT     10.2                                               Pb''    17.2
Total            100.0                          100.0                       100.0
37
2008 Chesterfield County Survey
  • Citizen satisfaction survey
  • Suburban county adjacent to Richmond, VA
  • Triple-frame design similar to the PWC survey
  • Treated as dual frame here
  • n = 1,600

38
2008 Results for Chesterfield County, VA
Segment   Population                     Observed thru cell sample   Observed thru RDD sample
CPO       PaT                            Pa'     20.4                         0.1
Overlap   PabT                           Pab'    79.6                Pab''   87.6
LLO       PbT                                                        Pb''    12.4
Total            100.0                          100.0                       100.0
39
2008 Results for Chesterfield County, VA
Segment   True values for Chesterfield   Observed thru cell sample   Observed thru RDD sample
CPO       PaT      8.0                   Pa'     20.4                         0.1
Overlap   PabT    84.8                   Pab'    79.6                Pab''   87.6
LLO       PbT      7.2                                               Pb''    12.4
Total            100.0                          100.0                       100.0
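The Chesterfield estimates can be approximately reproduced with the short sketch below. It assumes the adjustment takes the form PabT = 1 / (1 + r1·Pa'/Pab' + r2·Pb''/Pab''), which is an inference from the response-rate-ratio definitions rather than a formula shown in the transcript, and it ignores the small share of CPO cases reached through the RDD sample. The function name `segment_distribution` is hypothetical.

```python
def segment_distribution(cell_cpo, cell_overlap, rdd_overlap, rdd_llo, r1=1.0, r2=1.0):
    """Estimate population segment shares from the two observed sample splits.

    r1 and r2 are the overlap response rates relative to CPO (cell sample)
    and LLO (RDD sample); 1.0 means equal response rates.
    """
    cpo_per_overlap = r1 * cell_cpo / cell_overlap   # estimated PaT / PabT
    llo_per_overlap = r2 * rdd_llo / rdd_overlap     # estimated PbT / PabT
    overlap = 1.0 / (1.0 + cpo_per_overlap + llo_per_overlap)
    return {
        "cpo": overlap * cpo_per_overlap,
        "overlap": overlap,
        "llo": overlap * llo_per_overlap,
    }


# Chesterfield County 2008 observed percentages, with the CHIS-derived ratios.
estimate = segment_distribution(
    cell_cpo=20.4, cell_overlap=79.6,
    rdd_overlap=87.6, rdd_llo=12.4,
    r1=0.368, r2=0.598,
)
for segment, share in estimate.items():
    print(f"{segment:<7}: {100 * share:.1f}%")
```

Under these assumptions the output rounds to the 8.0 / 84.8 / 7.2 split shown in the table.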
40
Contrasting results
Segment              NHIS   CHIS (NHIS)   Prince William   Albemarle   Chesterfield
CPO (PaT)            13.2          13.2             19.0         8.4            8.0
Overlap (PabT)       67.9          67.9             75.3        81.4           84.8
LLO (PbT)            18.9          18.9              5.7        10.2            7.2
Total               100.0         100.0            100.0       100.0          100.0
41
Using the estimated segment distribution to
weight the sample data
42
Example: PWC 2008
                  Observed thru     Observed thru     Combined sample
                  cell sample       RDD sample        unweighted
Segment                n      %          n      %          n      %
CPO                   76   40.6         11    0.7         87    5.3
Overlap              111   59.4       1303   88.5       1414   85.4
LLO                                    154   10.5        154    9.3
Total                187  100.0       1468  100.0       1655  100.0
43
3-segment weights: PWC 2008
                  Combined sample   True values             Weighted N
                  unweighted        for PWC       Weight
Segment                n      %           %                      n      %
CPO                   87    5.3        19.0         3.61       314   19.0
Overlap             1414   85.4        75.3          .88      1247   75.3
LLO                  154    9.3         5.7          .61        94    5.7
Total               1655  100.0       100.0                   1655  100.0
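Each weight in this table is the target share divided by the unweighted sample share. A minimal sketch of that arithmetic, with the function name `segment_weights` and the dictionary keys chosen here for illustration:

```python
def segment_weights(counts, target_shares):
    """Weight per case = (target share * total n) / (unweighted n in the segment)."""
    total = sum(counts.values())
    return {seg: target_shares[seg] * total / n for seg, n in counts.items()}


# PWC 2008: combined unweighted counts and the estimated population shares.
counts = {"cpo": 87, "overlap": 1414, "llo": 154}
targets = {"cpo": 0.190, "overlap": 0.753, "llo": 0.057}

weights = segment_weights(counts, targets)
for seg, w in weights.items():
    print(f"{seg:<7}: weight {w:.2f}, weighted n {w * counts[seg]:.0f}")
```

Because the target shares sum to 1, the weighted counts add back up to the original total of 1,655.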
44
But wait . . . We have 4 segments
                  Observed thru     Observed thru     Combined sample
                  cell sample       RDD sample        unweighted
Segment                n      %          n      %          n      %
CPO                   76   40.6         11    0.7         87    5.3
Overlap via cell     111   59.4                          111    6.7
Overlap via RDD                       1303   88.5       1303   78.7
LLO                                    154   10.5        154    9.3
Total                187  100.0       1468  100.0       1655  100.0
45
If 2 frames split the overlap equally
                  Combined sample   True values             Weighted N
                  unweighted        for PWC       Weight
Segment                n      %           %                      n      %
CPO                   87    5.3        19.0         3.61       314   19.0
Overlap via cell     111    6.7        37.7         5.62       623   37.7
Overlap via RDD     1303   78.7        37.7          .48       623   37.7
LLO                  154    9.3         5.7          .61        94    5.7
Total               1655  100.0       100.0                   1655  100.0
46
If overlap-cell segment gets weight 2
                  Combined sample   True values             Weighted N
                  unweighted        for PWC       Weight
Segment                n      %           %                      n      %
CPO                   87    5.3        19.0         3.61       314   19.0
Overlap via cell     111    6.7        75.3         2.00       222   13.4
Overlap via RDD     1303   78.7        75.3          .79      1025   61.9
LLO                  154    9.3         5.7          .61        94    5.7
Total               1655  100.0       100.0                   1655  100.0
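The two ways of handling the overlap segment can be compared side by side. The sketch below is an illustration only: `four_segment_weights` and its `cell_overlap_weight` parameter are hypothetical names, and the logic simply allocates the overlap target between the two frames either equally or by fixing the weight of the overlap cases reached by cellphone.

```python
def four_segment_weights(counts, targets, cell_overlap_weight=None):
    """Weights for CPO, overlap-via-cell, overlap-via-RDD, and LLO cases.

    cell_overlap_weight=None splits the overlap target equally between frames;
    a number fixes the overlap-via-cell weight and gives the rest to RDD cases.
    """
    total = sum(counts.values())
    overlap_n = targets["overlap"] * total            # weighted overlap cases to allocate
    if cell_overlap_weight is None:
        cell_n = overlap_n / 2
    else:
        cell_n = cell_overlap_weight * counts["overlap_cell"]
    rdd_n = overlap_n - cell_n
    return {
        "cpo": targets["cpo"] * total / counts["cpo"],
        "overlap_cell": cell_n / counts["overlap_cell"],
        "overlap_rdd": rdd_n / counts["overlap_rdd"],
        "llo": targets["llo"] * total / counts["llo"],
    }


# PWC 2008 unweighted counts and estimated population shares.
counts = {"cpo": 87, "overlap_cell": 111, "overlap_rdd": 1303, "llo": 154}
targets = {"cpo": 0.190, "overlap": 0.753, "llo": 0.057}

equal_split = four_segment_weights(counts, targets)
capped = four_segment_weights(counts, targets, cell_overlap_weight=2.0)
for name, w in (("equal split", equal_split), ("cell overlap weight 2", capped)):
    print(name, {seg: round(value, 2) for seg, value in w.items()})
```

Fixing the overlap-via-cell weight keeps the small cellphone sample from being weighted up by a factor of more than five, at the cost of letting the RDD cases represent most of the overlap segment.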
47
In Summary . . .
48
Problem and solution
  • We don't have gold standard data by which to
    weight the results of a dual-frame telephone
    survey in a local area
  • Weighting to national or state averages might not
    be accurate
  • We developed needed formulas that relate observed
    percentages to underlying population phone
    segment distributions
  • We calculated response rate ratios by comparing
    CHIS 2007 to regional NHIS 2007 results.
  • We applied these ratios to calculate underlying
    distributions in three local telephone surveys

49
Results
  • The estimates for three suburban counties in
    Virginia are quite different from national
    phone-segment distributions, and from each other
  • Cellphone penetration is higher in Northern
    Virginia than in downstate suburbs, or in
    national estimates
  • CPO lifestyle has been adopted by fewer people in
    the downstate suburbs
  • The estimates can guide weighting of sample data
  • But we must use caution in weighting our
    cellphone samples up too much
  • Larger cellphone samples needed in the future

50
Future research
  • This is a time of rapid change in the telephone
    system
  • We are just learning how to deal with the
    weighting issues in cellphone surveys
  • We need to look at optimization of our dual-frame
    designs (cf. Hartley 1962)
  • Estimates of response rate ratios can be updated
    using more current national phone surveys
    compared to NHIS
  • Results would be strengthened if external local
    data were available to validate the estimates
