Title: A selective editing method considering both suspicion and potential impact, developed and applied to the Swedish foreign trade statistics Topic (ii), WP 12
1 A selective editing method considering both
suspicion and potential impact, developed and
applied to the Swedish foreign trade
statistics. Topic (ii), WP 12
Anders Jäder and Anders Norberg, Statistics Sweden
2 The data
- Main variables collected monthly:
  - Commodity code (8-digit CN codes)
  - Country of dispatch/arrival
  - Quantity (weight and supplementary unit)
  - Invoiced Value
- 350,000 observations per month
3 Score function
- Computed as a weighted geometric mean of measures
of Suspicion and Potential impact
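A minimal sketch of such a score, assuming a single hypothetical weight parameter `weight` between 0 and 1 applied to Suspicion (the slides do not state the weights actually used):

```python
def score(suspicion: float, impact: float, weight: float = 0.5) -> float:
    """Weighted geometric mean of two non-negative measures.

    `weight` is a hypothetical tuning parameter: the exponent on suspicion.
    The exponent on impact is 1 - weight.
    """
    if suspicion < 0 or impact < 0:
        raise ValueError("both measures must be non-negative")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return suspicion ** weight * impact ** (1.0 - weight)
```

With a geometric mean, an observation scores high only if it is both suspicious and potentially influential; if either measure is zero (and its exponent is positive), the score is zero.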
4 Selective editing
- The 1,500 observations with the highest scores
are flagged
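The flagging step can be sketched as a top-n selection. The function name `flag_top` and the dictionary layout (observation id mapped to score) are assumptions made for illustration only:

```python
import heapq


def flag_top(scores: dict[str, float], n: int = 1500) -> set[str]:
    """Return the ids of the n observations with the highest scores."""
    # heapq.nlargest iterates over the dict keys and ranks them by score.
    return set(heapq.nlargest(n, scores, key=scores.get))
```

For example, `flag_top({"a": 0.1, "b": 5.0, "c": 2.0}, n=2)` selects observations `b` and `c`.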
5 Suspicion
- The difference between the Unit price and the
lower/upper quartile, divided by the interquartile
range, on a logarithmic scale (Euro/kg)
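One way to read this definition is sketched below. It assumes that the quartiles are those of the unit prices in a comparison group, and that Suspicion is zero for a unit price between the quartiles; neither assumption is stated on the slide.

```python
import math


def suspicion(unit_price: float, q1: float, q3: float) -> float:
    """Distance of log unit price outside [q1, q3], in interquartile units.

    q1 and q3 are the lower and upper quartiles of unit price (Euro/kg).
    Returning 0 inside the quartile range is an assumption of this sketch.
    """
    log_price, log_q1, log_q3 = math.log(unit_price), math.log(q1), math.log(q3)
    iqr = log_q3 - log_q1
    if iqr <= 0:
        raise ValueError("upper quartile must exceed lower quartile")
    if log_price < log_q1:
        return (log_q1 - log_price) / iqr
    if log_price > log_q3:
        return (log_price - log_q3) / iqr
    return 0.0
```

Because the measure is a ratio of log differences, the base of the logarithm does not affect the result.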
6 Potential Impact
- The difference between the Invoiced Value and the
median Unit price multiplied by Quantity (Euro)
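As a rough sketch, the median unit price times the reported quantity acts as an expected value for the observation. Taking the absolute difference is an assumption here, since the slide only says "difference":

```python
def potential_impact(invoiced_value: float, quantity: float,
                     median_unit_price: float) -> float:
    """Absolute gap (Euro) between reported and expected invoiced value."""
    # Expected value if the goods had been traded at the median unit price.
    expected_value = median_unit_price * quantity
    return abs(invoiced_value - expected_value)
```

For example, 100 kg reported at 10,000 Euro against a median unit price of 2.5 Euro/kg gives an expected value of 250 Euro and a potential impact of 9,750 Euro.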
13 Hit rate 30
14 Hit rate 46
Impact 65
15 Hit rate 30
Impact 80
16 Hit rate 34, Impact 81. Best!
17 Potential impact
The 8-digit commodity codes can be aggregated to
6-, 4- and 2-digit commodity codes (CN6, CN4, CN2)
and to other classifications, e.g. the SITC
classification. This gives over 10,000 estimates
to be computed.
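Because the CN code is hierarchical, aggregation to CN6, CN4 and CN2 amounts to truncating the 8-digit code. The sketch below illustrates this with a hypothetical helper `aggregate_by_prefix`; mapping to SITC would instead need a correspondence table, which is not shown.

```python
from collections import defaultdict


def aggregate_by_prefix(values_by_cn8: dict[str, float],
                        digits: int) -> dict[str, float]:
    """Sum values over the first `digits` digits of each 8-digit CN code."""
    totals: dict[str, float] = defaultdict(float)
    for code, value in values_by_cn8.items():
        if len(code) != 8 or not code.isdigit():
            raise ValueError(f"not an 8-digit code: {code!r}")
        totals[code[:digits]] += value
    return dict(totals)
```

Calling it with `digits=6`, `4` and `2` on the same data yields the three aggregation levels, so an error in one observation affects one estimate on each level.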
18 Potential impact
- We have developed a formula with which the impact
of an error on the statistics on all aggregation
levels and sizes of estimates can be expressed in
one single variable.
19 Potential impact
20 Potential impact
21 Strategy
- SCB has saved raw and corrected data for all
months since 2000. We analyzed them
- New system with parameters
- Produce monthly process data for a continuous
search of best parameter values
Will we be misled when we analyze data that has
been flagged by the old method?
22 Study
- We need many months of historical data; current
data is not enough
- Homogeneous groups: modest demand on number of
observations
- Computation of median and quartiles weighted by
Quantity
- Suspicion versus probability of error:
transformation of Suspicion
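A quantity-weighted median and quartiles could be computed along the following lines. The definition used here (the smallest value whose cumulative weight reaches the fraction p of the total, without interpolation) is one common choice and is an assumption, not necessarily the one used in the study:

```python
def weighted_quantile(values: list[float], weights: list[float],
                      p: float) -> float:
    """Smallest value whose cumulative weight reaches p of the total weight."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    if total <= 0:
        raise ValueError("total weight must be positive")
    threshold = p * total
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]
```

Here `values` would be the unit prices in a group and `weights` the corresponding quantities, so that `p = 0.25`, `0.5` and `0.75` give the lower quartile, median and upper quartile. Weighting by quantity lets large consignments dominate the reference price.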
23 Suspicion versus probability of error
(Figure; axis label: Suspicion)
24 Experiences from production
Hit rate by variable
25 Experiences from production
Impact by variable
26 Experiences from production
- Impact on variable invoiced value
27 Thank You!