Title: Impact of using fiscal data on the imputation strategy of the Unified Enterprise Survey of Statistics Canada
1. Impact of using fiscal data on the imputation strategy of the Unified Enterprise Survey of Statistics Canada
Ryan Chepita, Yi Li, Jean-Sébastien Provençal, Chi Wai Yeung
Statistics Canada
ICES III, Montréal, June 2007
2. Goals
- To illustrate the challenges of applying a centralized edit and imputation (E and I) strategy to a broad range of industrial sectors
- To discuss the changes put in place due to the increasing use of fiscal data
- To discuss one approach used to quantify the overall E and I effect
3. Outline
- Overview of the Unified Enterprise Survey (UES)
- Survey content
- Imputation strategy
- Use of fiscal data
- Challenges
- Diagnostic tool
- Conclusion
4. Overview of the UES
- Annual business survey
- Initiated with 7 industries in 1997
- Presently integrates over 40 industries covering the major sectors of the economy
- 950K establishments in the population
- 127K establishments in the sample
5. Overview of the UES
- Stratified sampling design
- NAICS, province, and size in terms of revenue
- Data collection
- Mail-out survey, fax and phone follow-up
- Edit and imputation
- Estimation
- Horvitz-Thompson (H.-T.) estimator for totals and for provincial and industrial breakdowns
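The H.-T. estimation step can be illustrated with a minimal sketch. It assumes simple random sampling without replacement within each stratum, so that each sampled unit carries the weight N_h / n_h; the slides do not state the within-stratum design, and the function name, stratum labels, and figures below are hypothetical.

```python
from collections import defaultdict

def horvitz_thompson_total(sample, stratum_sizes):
    """Estimate a population total from a stratified sample.

    sample: list of (stratum, value) pairs for the sampled units.
    stratum_sizes: dict mapping stratum -> population count N_h.
    Assumes simple random sampling within each stratum (an assumption,
    not something stated in the slides).
    """
    # Count the sampled units n_h in each stratum.
    n_h = defaultdict(int)
    for stratum, _ in sample:
        n_h[stratum] += 1
    # Design weight = 1 / inclusion probability = N_h / n_h.
    return sum(value * stratum_sizes[stratum] / n_h[stratum]
               for stratum, value in sample)

# Hypothetical data: two sampled units from a stratum of 100 small
# establishments, plus one large establishment sampled with certainty.
sample = [("small", 10.0), ("small", 14.0), ("large", 200.0)]
sizes = {"small": 100, "large": 1}
estimate = horvitz_thompson_total(sample, sizes)
print(estimate)
```

Provincial or industrial breakdowns follow the same pattern: sum the weighted values over the sampled units that belong to the domain of interest.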
6. Survey content
- 2 or 3 key variables
- Total revenue and total expenses
- Similar concepts from one industry to another
- A lot of details (over 50 variables)
- Breakdowns of totals
- By province, type of expenses or source of revenue
- Industry specific
- Can be revised from year to year
7. Survey content
- Example: manufacturing sector
VARIABLES
Sales of other goods and services produced
Total sales of goods purchased for resale
Amount received for custom work
Amount received for repair work
Stumpage sales
Total sales of goods and services produced
Sales of logs and wood residue
Total sales
Details
Key
8. Imputation Strategy
- Categories of non-response
- Category 1: Partial response with at least 1 key variable reported
- Category 2: Total non-response with historical data
- Category 3: Total non-response without historical data
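The three categories can be expressed as a small decision rule. This is a sketch only: the function and argument names are hypothetical, the label 0 for complete responses is our own convention, and treating a partial response with no key variable reported as total non-response is an assumption the slides do not address.

```python
def nonresponse_category(is_complete, key_vars_reported, has_historical_data):
    """Return 0 for a complete response, otherwise the non-response category."""
    if is_complete:
        return 0  # Nothing to impute (label 0 is our own convention).
    if key_vars_reported >= 1:
        return 1  # Partial response with at least 1 key variable reported.
    # Assumption: no key variable reported is handled as total non-response.
    return 2 if has_historical_data else 3

print(nonresponse_category(False, 1, True))   # category 1
print(nonresponse_category(False, 0, True))   # category 2
print(nonresponse_category(False, 0, False))  # category 3
```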
9. Imputation Strategy
- Historical data for some records
- Records sampled the year before
- Same questionnaire
- Administrative data for all records
- Stratification information
- NAICS, province, size in terms of revenue
10. Imputation Strategy
- Category 1 and category 2 non-response
- Missing key variables
- Historical trend
- Ratio using current survey information
- Missing details
- Historical distribution
- Distribution from all respondents within a homogeneous group
- Distribution from a single donor
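The methods listed above can be sketched as follows. The exact formulas used in the UES are not given in the slides; the versions below (a trend computed as the ratio of current to previous totals among respondents, a ratio between two key variables, and pro-rating a total across details) are common textbook forms, and every name and number is hypothetical.

```python
def respondent_trend(pairs):
    """Growth factor: sum of current values over sum of previous values."""
    return sum(cur for _, cur in pairs) / sum(prev for prev, _ in pairs)

def impute_by_trend(previous_value, trend):
    """Historical trend: carry last year's value forward by a growth factor."""
    return previous_value * trend

def impute_by_ratio(reported_key, ratio):
    """Ratio: derive a missing key variable from one reported this year."""
    return reported_key * ratio

def prorate_details(total, distribution):
    """Split a total across detail variables in proportion to given shares."""
    share_sum = sum(distribution.values())
    return {name: total * share / share_sum
            for name, share in distribution.items()}

# Hypothetical respondents in one imputation group: (previous, current) revenue.
trend = respondent_trend([(100.0, 150.0), (200.0, 300.0)])
revenue = impute_by_trend(80.0, trend)      # missing key variable, history available
expenses = impute_by_ratio(revenue, 0.75)   # assumed expenses-to-revenue ratio
details = prorate_details(expenses, {"salaries": 0.5, "rent": 0.25, "other": 0.25})
print(trend, revenue, expenses, details)
```

The shares passed to `prorate_details` can come from the unit's own historical data, from all respondents in a homogeneous group, or from a single donor; only the source of the distribution changes.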
11. Imputation Strategy
- Category 3 non-response
- Donor imputation
- Closest neighbour based on administrative data
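A minimal sketch of closest-neighbour donor imputation follows. It assumes a single administrative matching variable (revenue) and copies the donor's survey values unchanged; the distance measure, the matching variables, and any rescaling used in production are not described in the slides, and all field names are hypothetical.

```python
def nearest_donor(recipient_admin_revenue, donors):
    """Return the donor whose administrative revenue is closest to the recipient's."""
    return min(donors,
               key=lambda d: abs(d["admin_revenue"] - recipient_admin_revenue))

def impute_from_donor(recipient_admin_revenue, donors, variables):
    """Copy the listed survey variables from the closest donor."""
    donor = nearest_donor(recipient_admin_revenue, donors)
    return {name: donor[name] for name in variables}

# Hypothetical respondents assumed to share the recipient's NAICS and province.
donors = [
    {"id": "A", "admin_revenue": 100.0, "total_revenue": 95.0, "total_expenses": 80.0},
    {"id": "B", "admin_revenue": 500.0, "total_revenue": 520.0, "total_expenses": 410.0},
]
imputed = impute_from_donor(140.0, donors, ["total_revenue", "total_expenses"])
print(imputed)
```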
12. Use of fiscal data
- Use fiscal data as a proxy value for total non-response
- Use fiscal data as a proxy value for simple units randomly selected at the sampling stage
- Use to update the initial size in terms of revenue
- Number of survey variables for which we use fiscal data as a proxy ranges from 7 to 25
13. Challenges
- Conceptual differences
- Questionnaire content review
- Variables for which there is no proxy value in the fiscal database
- Modeling
- Industry specific needs
- Tailored strategy
14. Challenges
- Monitoring the effect
- Creation of a distinct path for records where we used fiscal data (category 4 of non-response)
- Creation of a diagnostic tool
15. Diagnostic tool
- Identification section
- Industry, province, variable description
- Weighted sums, share and percentages by category of non-response

Variable X           Resp.   Cat.2   Cat.3   Cat.4   Total
Sums                 30M     5M      5M      10M     50M
Share                60      10      10      20      100
Percentages          20      20      25      18      20
Variable Y (Total)   150M    25M     20M     55M     250M
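The figures in the table are consistent with the following reading, which is inferred from the numbers rather than stated in the slides: "Share" is each category's weighted sum of Variable X as a percentage of the total for X, and "Percentages" is Variable X as a percentage of Variable Y within each category. A minimal sketch under that assumption, with hypothetical function and variable names:

```python
def diagnostic_rows(x_sums, y_sums):
    """Compute the share and percentage rows from weighted sums by category."""
    total_x = sum(x_sums.values())
    total_y = sum(y_sums.values())
    # Share: contribution of each category to the total of variable X.
    share = {cat: 100 * value / total_x for cat, value in x_sums.items()}
    share["Total"] = 100.0
    # Percentages: variable X relative to variable Y within each category.
    pct = {cat: 100 * x_sums[cat] / y_sums[cat] for cat in x_sums}
    pct["Total"] = 100 * total_x / total_y
    return share, pct

# Weighted sums from the slide, in millions.
x_sums = {"Resp.": 30.0, "Cat.2": 5.0, "Cat.3": 5.0, "Cat.4": 10.0}
y_sums = {"Resp.": 150.0, "Cat.2": 25.0, "Cat.3": 20.0, "Cat.4": 55.0}
share, pct = diagnostic_rows(x_sums, y_sums)
for cat in ["Resp.", "Cat.2", "Cat.3", "Cat.4", "Total"]:
    print(cat, round(share[cat]), round(pct[cat]))
```

Read this way, a category whose percentage departs from that of the respondents (here Cat.3 at 25 against 20) flags records whose imputed detail may not be in line with reported data.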
16. Conclusion
- Centralized E and I strategy vs industry specific needs
- Diagnostic tool
- Modeling
17. Thank you! Questions?