Data Quality in Nationwide German Social Surveys

1
Data Quality in Nationwide (German) Social
Surveys
  • Michael Blohm
  • Centre for Survey Research and Methodology (ZUMA)
    German General Social Survey (ALLBUS)
  • Mannheim, Germany
  • European Conference on Quality in Survey
    Statistics, 24-26 April 2006, Cardiff, South
    Wales

European Centre for Cross-Cultural Surveys
2
Outline
  • Background
  • Indicator for Data Quality
  • Sampling Design and Data Quality
  • Changes in Data Quality during the Fielding Period

3
Background I
  • Survey costs → increasing
  • Response rates → decreasing

4
Background II
  • Questions
  • Is data quality in expensive sampling designs
    higher?
  • Is data quality higher when response rates
    are higher?

5
A) Indicator for Data Quality I
  • Research strategy
  • Compare the net samples with the German
    Microcensus
  • 7 Socio-demographic variables
  • Age, sex, level of education, marital status,
    size of household, nationality, employment
    status

6
Indicator for Data Quality II
  • For distributions: Index of Dissimilarity

Example: Legal marital status
7
Indicator for Data Quality III
  • Index of Dissimilarity
  • Pro: easy to interpret, takes all cases of a
    sample into account
  • Con: does not consider the relative size of
    categories or the direction of deviations
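
The transcript does not preserve the formula shown on the slide. As a minimal sketch, assuming the conventional definition of the index of dissimilarity (half the sum of absolute differences between two percentage distributions), the comparison of a net sample with the Microcensus could look like this. The function name and all figures are hypothetical and chosen only for illustration.

```python
def index_of_dissimilarity(sample_counts, reference_pct):
    """Half the sum of absolute differences between the sample's
    percentage distribution and a reference percentage distribution."""
    total = sum(sample_counts.values())
    categories = set(sample_counts) | set(reference_pct)
    diff = 0.0
    for category in categories:
        sample_pct = 100.0 * sample_counts.get(category, 0) / total
        diff += abs(sample_pct - reference_pct.get(category, 0.0))
    return diff / 2.0


# Hypothetical figures for legal marital status (not taken from the slides).
survey_counts = {"married": 600, "single": 250, "widowed": 80, "divorced": 70}
microcensus_pct = {"married": 55.0, "single": 28.0, "widowed": 9.0, "divorced": 8.0}

d = index_of_dissimilarity(survey_counts, microcensus_pct)
print(f"Index of dissimilarity: {d:.1f} percentage points")
```

Under this definition the result reads as the share of respondents who would have to change category for the sample to match the reference distribution. Because only absolute differences enter the sum, the value does not show which categories are over- or under-represented, which is the limitation the slide lists under "con".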

8
B) Sampling Design and Data Quality I
9
Sampling Design and Data Quality II
  • A) Random-Route (ADM-Design)
    (3 stages) Constituencies-Households-Individuals
    → definition of target household and
    target person by interviewer
    (according to rules)
  • B) Address-Random (ADM-Design) (3 stages)
    Constituencies-Households-Individuals
    → definition of target household by
    field organization, definition of
    target persons by interviewer (according to
    rules)
  • C) Sample with named individuals/Register
    (2 stages) Municipalities-Individuals
    → definition of target persons by
    field organization/researcher

10
Sampling Design and Data Quality III
  • Hypotheses
  • The greater the leeway for interviewers with
    regard to the selection of target persons, the
    greater the differences from the Microcensus

11
Mean of Summed Index of Dissimilarity and Mean
Response Rate by Sampling Design
[Chart: Mean Index of Dissimilarity (axis 0 to 10) and Response Rate in % (axis 0 to 80) for the sampling designs SR / RR, AR, and Named Indi.]
12
Sampling Design and Data Quality IV
  • The more expensive the sampling design ...
  • the higher the data quality
  • the lower the response rate

13
C) Data Quality During the Course of Fielding
Period I
  • Research strategy
  • Analyses of the deviations between the net
    samples and Microcensus during the course
    of fielding period
  • for distributions, bivariate and multivariate
    associations
  • Only for samples with named individuals
    (ALLBUS and ESS German Part)
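
The research strategy above can be sketched as follows: interviews are processed in order of completion, and at regular intervals the cumulative response rate is paired with the deviation of the sample achieved so far from the Microcensus. This is only an illustration under stated assumptions. The response rate is simplified to completed interviews divided by gross sample size, the deviation measure is the conventional index of dissimilarity, and all names and data are hypothetical.

```python
from collections import Counter


def dissimilarity(counts, reference_pct):
    """Half the sum of absolute percentage-point differences."""
    total = sum(counts.values())
    categories = set(counts) | set(reference_pct)
    return sum(
        abs(100.0 * counts.get(c, 0) / total - reference_pct.get(c, 0.0))
        for c in categories
    ) / 2.0


def quality_over_fieldwork(interviews, gross_sample_size, reference_pct, step):
    """Return (response rate in %, index) pairs after every `step` interviews.

    `interviews` holds one category value per completed interview,
    listed in order of completion during the fielding period.
    """
    counts = Counter()
    trajectory = []
    for n, category in enumerate(interviews, start=1):
        counts[category] += 1
        if n % step == 0 or n == len(interviews):
            response_rate = 100.0 * n / gross_sample_size
            trajectory.append((response_rate, dissimilarity(counts, reference_pct)))
    return trajectory


# Hypothetical data: employment status of six respondents in completion order.
completed = ["not in work", "not in work", "in paid work",
             "not in work", "in paid work", "in paid work"]
reference = {"in paid work": 50.0, "not in work": 50.0}

trajectory = quality_over_fieldwork(completed, gross_sample_size=10,
                                    reference_pct=reference, step=2)
for rate, index in trajectory:
    print(f"response rate {rate:.0f}%: index {index:.1f}")
```

Plotting such pairs for each survey gives curves of deviation against response rate, which is the form the later slide titles ("by Response Rate, by Survey") suggest.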

14
Data Quality During the Course of Fielding Period
II
  • Questions
  • The higher the response rates, the higher the data
    quality?
  • typical sequences?
  • differences between surveys?

15
Data Quality During the Course of Fielding Period
III
  • Hypotheses
  • For distributions: during the fielding period,
    deviations from the Microcensus should decrease
  • For (multivariate) associations: during
    fieldwork, no change (slight improvement)

16
Distributions: Mean of Index of Dissimilarity for
7 Socio-demographic Variables, by Response
Rate, by Survey
17
Distributions: Employment status, Index of
Dissimilarity, by Response Rate, by Survey
18
Distributions: Level of education, Index of
Dissimilarity, by Response Rate, by Survey
19
Bivariate Associations
  • Deviations between correlation coefficients for
    surveys and for German Microcensus
  • N = 15 correlations
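
The slides do not say how the deviations between correlation coefficients are aggregated. One plausible reading, offered only as an assumption, is the mean absolute difference between each survey correlation and the corresponding Microcensus correlation. The sketch below uses Pearson correlations; the variable names, data, and reference values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(data):
    """Correlation for every unordered pair of variables in `data`."""
    return {
        (a, b): pearson(data[a], data[b])
        for a, b in combinations(sorted(data), 2)
    }


def mean_absolute_deviation(survey_r, reference_r):
    """Mean absolute difference over the pairs listed in `reference_r`."""
    return sum(abs(survey_r[k] - reference_r[k]) for k in reference_r) / len(reference_r)


# Hypothetical coded survey data and hypothetical reference correlations.
survey_data = {
    "age": [1, 2, 3, 4],
    "education": [2, 1, 4, 3],
    "sex": [0, 1, 0, 1],
}
microcensus_r = {
    ("age", "education"): 0.5,
    ("age", "sex"): 0.3,
    ("education", "sex"): 0.1,
}

survey_r = pairwise_correlations(survey_data)
mad = mean_absolute_deviation(survey_r, microcensus_r)
print(f"Mean absolute deviation of correlations: {mad:.3f}")
```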

20
Mean Deviations of Correlation Coefficients
(N = 15): ALLBUS and ESS (German Part) vs.
Microcensus
21
Multivariate Associations
  • Deviations between β-coefficients and constant
    for surveys and for Microcensus
  • Logistic regression models
  • Dependent variable
  • - Employment status (in paid work / not
    in work)
  • Independent variables
  • - Sex,
  • - Age (4 cat.),
  • - Education (3 cat.)
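
A minimal sketch of how such a model could be parameterised, assuming standard dummy coding with one reference category per categorical predictor: sex contributes one coefficient, age three, education two, and the constant one more, giving seven parameters. That count is consistent with the N of 7 in the next slide's title, but the slides do not confirm the coding scheme. The category labels, reference categories, and function names below are hypothetical.

```python
from math import exp

# Hypothetical category labels; the first entry of each list is the reference category.
AGE_CATEGORIES = ["18-29", "30-44", "45-59", "60+"]
EDUCATION_CATEGORIES = ["low", "medium", "high"]


def design_row(sex, age, education):
    """Constant, one sex dummy, three age dummies, two education dummies."""
    row = [1.0]                                   # constant
    row.append(1.0 if sex == "female" else 0.0)   # reference: male (assumption)
    row += [1.0 if age == c else 0.0 for c in AGE_CATEGORIES[1:]]
    row += [1.0 if education == c else 0.0 for c in EDUCATION_CATEGORIES[1:]]
    return row


def predicted_probability(coefficients, row):
    """Logistic model: probability of being in paid work."""
    z = sum(b * x for b, x in zip(coefficients, row))
    return 1.0 / (1.0 + exp(-z))


def mean_absolute_deviation(survey_coefs, reference_coefs):
    """Mean absolute difference between two coefficient vectors."""
    pairs = list(zip(survey_coefs, reference_coefs))
    return sum(abs(s - r) for s, r in pairs) / len(pairs)


row = design_row("female", "30-44", "high")
print(len(row), row)
```

Fitting the same specification to the survey data and to the Microcensus would yield two vectors of seven estimates each, which `mean_absolute_deviation` can then compare. Whether the presentation averaged absolute or signed differences is not stated.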

22
Mean Deviations of β-Coefficients (N = 7): ALLBUS
and ESS (German Part) vs. Microcensus
23
Data Quality During the Course of Fielding Period
IV
  • Association between response rates and data
    quality is not clear-cut
  • Higher response rates tend to result in
    higher data quality, but
  • in the case of distributions, improvements after
    a response rate of 30 to 35% has been
    achieved are marginal
  • strong effect of field institute on the level
    of bias/deviation

24
Conclusions
  • Sampling design does matter
  • Response rates are not a good indicator for data
    quality

25
  • Thank you for your attention!
  • http://www.gesis.org/dauerbeobachtung/Allbus/index.htm

26
Index of Dissimilarity: Mean for 7
socio-demographic variables, by response rate,
by survey
29
Level of education: Index of dissimilarity, by
response rate, by survey (age < 50)
30
Mean of Summed Index of Dissimilarity and
Response Rate by Sampling Design