Combining Cross Sectional Market Research Surveys: An Application of Statistical Matching

1
Combining Cross Sectional Market Research
Surveys: An Application of Statistical Matching
  • Catherine Frethey-Bentham

2
  • Most Quantitative Market Research in New Zealand
    uses Cross-sectional Data

Useful because
3
  • We get a snapshot of a given situation at any
    point in time

  • Relatively simple, easy and timely to conduct
  • Allows estimation of net changes, that is,
    changes at the aggregate level
  • E.g., change in the proportion of mobile phone
    users from one year to the next.
  • Representative of population

4
  • But cross-sectional data is not always ideal.
  • Consider this situation

5
[Figure: two panels, Time 1 and Time 2]
6
Limitations of Cross-Sectional Surveys
  • Net changes only
  • Which segment did segment five gain most of
    its members from?
  • Are there other changes between segments that
    are not obvious on the surface?
  • No past history (memory) to show what happens
    to individuals
  • We require data that provide more detail about
    how people are changing
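
To make the net-versus-gross distinction concrete, here is a minimal sketch in Python. The two transition tables and their counts are invented for illustration and are not taken from the study. Both tables give identical segment sizes at each time point, so two cross-sectional surveys could not tell them apart, yet they describe very different movement of individuals.

```python
# Hypothetical counts: rows = segment at time 1, columns = segment at time 2.
stable = [
    [40, 0],
    [10, 50],
]
churning = [
    [20, 20],
    [30, 30],
]


def margins(table):
    """Return (segment sizes at time 1, segment sizes at time 2)."""
    time1 = [sum(row) for row in table]
    time2 = [sum(col) for col in zip(*table)]
    return time1, time2


def gross_change(table):
    """Count individuals who changed segment (the off-diagonal cells)."""
    return sum(
        cell
        for i, row in enumerate(table)
        for j, cell in enumerate(row)
        if i != j
    )


# Same margins (what two cross-sections see), different gross change.
print(margins(stable), gross_change(stable))
print(margins(churning), gross_change(churning))
```

Only the margins are observable from independent cross-sections; the cells inside the table are what panel data, or a fused pseudo panel, would have to supply.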

7
  • Now, consider this situation

8
[Figure: two panels, Time 1 and Time 2]
9
Example: Profiling Change
[Table: Market Segments - time 1 cross-tabulated with Market Segments - time 2 (column headings 1 2 3 4 5)]
Row 1: 95
Row 2: 73, 25
Row 3: 96
Row 4: 62, 37
10
  • Longitudinal data has many benefits, including

11
  • Some phenomena are inherently longitudinal in
    nature
  • As researchers we should be able to capture this
  • E.g., Links between current events and outcomes
    and past history
  • Such as current brand choice and past brand
    choice

12
Longitudinal Data Permit
  • Tracing the dynamics of behaviours
  • Can observe how circumstances change with time
    spent in state
  • Identifying the influence of past behaviours on
    current behaviours
  • Ability to make causal inference enhanced by
    temporal ordering
  • Repeated observations on individuals allow for
    possibility of controlling for unobserved
    individual characteristics (measurement error)

13
  • But longitudinal data is not without its
    limitations

14
  • Time
  • Cost
  • Panel Attrition
  • Attrition rates estimated at 6-50% between two
    survey rounds
  • Panel Conditioning

15
A Little Background to my Problem
  • Otago Lifestyles Study
  • Four cross-sectional consumer lifestyles studies
    undertaken by The University of Otago.
  • Exploring psychographics, consumer lifestyles and
    demographics of New Zealanders at a point in time
  • Initial desire was to explore change in
    lifestyle segments over time, but repeated
    cross-sectional studies pose problems

Study 1: 1989
Study 2: 1995/96
Study 3: 2000
Study 4: 2005
Note: Samples are independent but drawn from the same
population.
16
  • There must be another way
  • Data Fusion???

17
Data Fusion
  • Data Fusion (Statistical Matching)
  • Typically involves matching each unit in one
    database with similar (but not identical) units
    in another database
  • Traditional Uses
  • Typically used to explore dependency
    relationships when one data set contains
    independent variables and another contains
    dependent variables
  • E.g., Media research - to merge dependent and
    independent variables into one data set
  • Media consumption and product purchased

18
Research Objective
  • To use data fusion (statistical matching) methods
    to develop a methodology capable of modelling
    gross change using repeated cross-sectional
    surveys
  • The intention is to create pseudo panel data
    that depicts change in consumer lifestyles over
    time

[Figure: at time one, 45 purchase brand X; at time two, 35 still purchase brand X and 10 don't]
19
Research Summary
[Diagram: Respondent Set A (Time 1) and Respondent Set B (Time 2), with the labels Ageing Characteristics, Common Variables, Matching Cohorts and Population Weightings]
Note: Respondent Set A and Respondent Set B are
independent of one another
20
Design over Multiple Time Periods
21
Research Summary
Respondent Set A (Time 1)
  Matching Characteristics (t1)    Changeable Characteristics (t1)    Cluster Membership (t1)
  E.g., Age, Gender                E.g., TV Viewing                   4
  ...
Respondent Set B (Time 2)
  Matching Characteristics (t2)    Changeable Characteristics (t2)    Cluster Membership (t2)
  E.g., Age, Gender                E.g., TV Viewing                   6
  ...
Match (t1) with Match (t2)
  Match      Behavioural (t1)    Behavioural (t2)    Cluster (t1)    Cluster (t2)
  Match 1    ...                 ...                 4               6
  Match 2    ...                 ...                 ...             ...
22
Example: Profiling Change
Creating a single merged database allows the
exploration of gross change across time
[Table: Market Segments - time 1 cross-tabulated with Market Segments - time 2 (column headings 1 2 3 4 5)]
Row 1: 95
Row 2: 73, 25
Row 3: 96
Row 4: 62, 37
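
As an illustration of how a segment-by-segment change table could be derived from a merged file, the sketch below tabulates matched pairs into row percentages. The function name `transition_table` and the example pairs are hypothetical; the actual data and tabulation procedure used in the study are not shown on the slides.

```python
from collections import Counter


def transition_table(pairs, segments):
    """Row percentages: share of each time-1 segment found in each time-2 segment.

    pairs is a list of (cluster at t1, cluster at t2), one tuple per match.
    """
    counts = Counter(pairs)
    table = {}
    for s1 in segments:
        row_total = sum(counts[(s1, s2)] for s2 in segments)
        table[s1] = {
            s2: (100.0 * counts[(s1, s2)] / row_total if row_total else 0.0)
            for s2 in segments
        }
    return table


# Hypothetical fused records, not data from the study.
pairs = [(1, 1), (1, 1), (2, 2), (2, 5), (4, 4), (4, 5), (4, 4), (4, 4)]
table = transition_table(pairs, segments=[1, 2, 3, 4, 5])

for segment, row in table.items():
    print(segment, row)
```

Each row sums to 100 when the time-1 segment has at least one match, so the diagonal can be read as the share of a segment that stayed put and the off-diagonal cells as gross flows between segments.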
23
Procedure
  • Step One
  • Validation Study
  • To test how accurate the results are compared to
    panel data
  • Using ACNielsen's PanelViews (Australia) panel
    data

24
  • Initial Results

25
Consider the question.
Do individuals report buying more or fewer
environmentally friendly products than before?
26
Initial Results
Example: "I try to buy environmentally friendly
products" (ENVIRON)
[Chart: Year 2000 and Year 2001]
27-31
Initial Results
[Charts built up over five slides: Response 2001]
32
Example: "I try to buy environmentally friendly
products" (ENVIRON)
33
So, does it always work this well?
  • No such luck!

[Chart: Response 2001]
34
Main Assumption of Matching
  • Conditional Independence Assumption (CIA)
  • Y1 and Y2 are conditionally independent given X
  • I.e., the common variables contain all the
    information about the relationship between Y1 and
    Y2

[Diagram: Time 1: Behavioural Traits (Y1), e.g. purchase habits; Time 2: Behavioural Traits (Y2), e.g. purchase habits; Common Variables (Xi), e.g. income, materialism]
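
In symbols, the assumption can be written as follows. This is the standard statement of conditional independence using the slide's Y1, Y2 and X; the density notation f is an assumption of this sketch rather than something shown on the slides.

```latex
f(y_1, y_2 \mid x) = f(y_1 \mid x)\, f(y_2 \mid x)
\quad\Longrightarrow\quad
f(y_1, y_2, x) = f(y_1 \mid x)\, f(y_2 \mid x)\, f(x)
```

Each factor on the right involves only one of Y1 and Y2 together with X, so each can be estimated from a separate sample that observes X. That is why, if the assumption holds, the joint distribution can be assembled without observing Y1 and Y2 on the same respondent.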
35
Overcoming the CIA Problem
  • Can utilise multiple regression as a tool to help
    choose common variables
  • A large R-square when a Y variable is regressed
    on common (X) variables (at any given time
    period) is a necessary, but not sufficient,
    condition for a good match
  • A good match is also dependent on the patterns of
    the residuals from regression analyses
  • We must use the panel data to obtain these
    residuals
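
A minimal sketch of this check, assuming a single common variable and an invented six-person panel. Reading "pattern in the residuals" as correlation between the time-1 and time-2 residuals is an interpretation made for this example; the slides show only plots. All function names and data are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residuals(x, y):
    """Observed y minus the fitted value from the regression of y on x."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def r_squared(x, y):
    """Share of the variance in y explained by the regression on x."""
    my = sum(y) / len(y)
    ss_res = sum(e ** 2 for e in residuals(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def correlation(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / sqrt(
        sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)
    )


# Hypothetical panel: common variable x, the same item y at two waves.
x = [1, 2, 3, 4, 5, 6]
y1 = [1.2, 1.9, 3.3, 3.8, 5.2, 5.9]
y2 = [1.3, 1.8, 3.4, 3.7, 5.3, 5.8]

r2_wave1 = r_squared(x, y1)
resid_corr = correlation(residuals(x, y1), residuals(x, y2))
print(round(r2_wave1, 3), round(resid_corr, 3))
```

In this invented data the R-square is large, yet the residuals from the two waves move together, which is the situation the slide warns about: the common variable explains each wave well but leaves a link between Y1 and Y2 that it does not capture.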

36
Patterns in the residuals
  • Looking at the residuals from the panel data

[Figure: two panels, Variable ENVIRON and Variable BUDGET, each with axes Residuals - 2000 and Residuals - 2001]
Panel annotations:
  • There is still some pattern here that our common
    variables are unable to explain
  • Good outcome: no/little pattern in the residuals
37
Procedure
  • Step Two
  • Simulation Study
  • Undertaken to simulate, and understand the
    effects of, different scenarios that might occur
  • How does the method perform when non-matching
    variables change slowly, systematically,
    rapidly/instantaneously?
  • How does the method perform with different sample
    sizes?

38
Procedure
  • Step Three
  • Apply Method
  • Using TNS New Zealand Lifestyle and Opinions
    Survey
  • Series of cross-sectional studies collected
    annually
  • Same lifestyles and demographic questions
    collected at each phase
  • Data collected between 1998 and 2005
  • Sample of approximately 8000 individuals at each
    period, randomly selected from New Zealand
    population

39
  • Thank You!

40
Issues for Further Investigation
  • Must have same variables across studies
  • Cannot account for new (and different) members of
    the population
  • Error introduced between time periods due to
    matching

41
Technical Notes
  • Preparation of the Matching Framework
  • Population weightings
  • All data weighted to be representative of the
    Australian population
  • Matching cohorts based on age groups
  • 16-34, 35-44, 45-54, 55+
  • Choice of common variables
  • Between seven and ten common variables used -
    combination of demographic and psychographic
    variables
  • Data adjusted for change over time using
    secondary data from The Australian Bureau of
    Statistics
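
The cohort and weighting steps might look something like the sketch below. It assumes the last cohort is open-ended (55 and over) and that each respondent carries a single population weight; the function names and the example records are hypothetical.

```python
def age_cohort(age):
    """Map an age in years to one of the four matching cohorts.

    Assumption: the final cohort covers everyone aged 55 and over.
    """
    if age < 16:
        raise ValueError("respondents under 16 are outside the cohorts")
    if age <= 34:
        return "16-34"
    if age <= 44:
        return "35-44"
    if age <= 54:
        return "45-54"
    return "55+"


def weighted_share(records, cohort):
    """Weighted proportion of respondents who fall in the given cohort.

    records is a list of (age, population weight) tuples.
    """
    total = sum(weight for _, weight in records)
    in_cohort = sum(w for age, w in records if age_cohort(age) == cohort)
    return in_cohort / total


# Hypothetical respondents: (age, weight).
records = [(20, 1.5), (40, 0.5), (60, 1.0), (30, 1.0)]
print(age_cohort(47), weighted_share(records, "16-34"))
```

Restricting candidate matches to respondents in the same cohort keeps the search small and ensures matched pairs agree on age group by construction.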

42
Technical Notes
  • Use of unconstrained matching algorithm
  • Nearest neighbour method - achieves best matches
    possible at an individual unit level
  • Results from survey at time one (donors) fused
    onto those at time two (recipients)
  • Minimum distance between matched elements is good
  • Using Gower's distance
  • Initially a few elements matched too many times
  • Constraints on the number of matches imposed
    where necessary
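
The matching step can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: it uses the usual form of Gower's distance (range-scaled absolute difference for numeric variables, simple mismatch for categorical ones, averaged with equal weights), and it enforces the cap on donor reuse greedily in recipient order. The names `gower_distance`, `match`, and `max_uses`, and the example records, are hypothetical.

```python
def gower_distance(a, b, ranges):
    """Gower's distance between two records (dicts of common variables).

    ranges maps each numeric variable to its range (max - min) in the data;
    variables absent from ranges are treated as categorical.
    """
    total = 0.0
    for key in a:
        if key in ranges:
            total += abs(a[key] - b[key]) / ranges[key] if ranges[key] else 0.0
        else:
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)


def match(recipients, donors, ranges, max_uses=None):
    """Return, for each recipient, the index of its nearest donor.

    max_uses=None is unconstrained matching; an integer caps how many
    times any one donor may be used (greedy, in recipient order).
    """
    uses = [0] * len(donors)
    result = []
    for rec in recipients:
        candidates = [
            i for i in range(len(donors))
            if max_uses is None or uses[i] < max_uses
        ]
        if not candidates:
            raise ValueError("no donors left under the max_uses cap")
        best = min(
            candidates,
            key=lambda i: gower_distance(rec, donors[i], ranges),
        )
        uses[best] += 1
        result.append(best)
    return result


# Hypothetical donors (time one) and recipients (time two).
donors = [
    {"age": 30, "gender": "F"},
    {"age": 50, "gender": "M"},
    {"age": 35, "gender": "F"},
]
recipients = [
    {"age": 31, "gender": "F"},
    {"age": 32, "gender": "F"},
    {"age": 49, "gender": "M"},
]
ranges = {"age": 20}

print(match(recipients, donors, ranges))
print(match(recipients, donors, ranges, max_uses=1))
```

Without a cap, the first donor is matched twice; with `max_uses=1` the second recipient falls back to its next-nearest donor. A greedy cap depends on the order of recipients, so a real application might prefer an optimisation-based constrained match.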