1
Variance estimation for Generalized Entropy and
Atkinson inequality indices: the complex survey
data case
  • Martin Biewen (Goethe University Frankfurt)
  • Stephen Jenkins (University of Essex)

Presentation at 4th German Stata User Group
Meeting, Mannheim, 31 March 2006
2
Inequality indices: measures of the dispersion of
a distribution
  • Imposition of a small number of axioms
    substantially restricts functional form that
    indices may have
  • Axioms:
  • Anonymity
  • Scale invariance
  • Replication invariance
  • Normalization
  • Principle of Transfers: a mean-preserving spread
    increases inequality

3
Classes of inequality measures satisfying the
axioms
  • Generalized Entropy, GE(α)
  • Advantage: subgroup decomposability
  • Parameter α: transfer sensitivity
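For reference, the Generalized Entropy class is usually written as below. This is the standard textbook form, not a formula recovered from the slide, so the notation is an assumption: y_i are incomes, μ is their mean, n is the number of units, and α is the transfer-sensitivity parameter.

```latex
\[
GE(\alpha) = \frac{1}{\alpha(\alpha-1)}
  \left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_i}{\mu}\right)^{\alpha} - 1\right],
  \qquad \alpha \neq 0, 1
\]
\[
GE(0) = \frac{1}{n}\sum_{i=1}^{n}\log\frac{\mu}{y_i} \ (\text{MLD}),
\qquad
GE(1) = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{\mu}\log\frac{y_i}{\mu} \ (\text{Theil})
\]
```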
4
Classes of inequality measures satisfying the
axioms
  • Atkinson index, A(ε)
  • Advantage: welfare interpretation
  • Parameter ε: inequality aversion
  • Gini coefficient
  • Advantage: most well-known inequality index
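For reference, the Atkinson class is usually written as below. This is the standard textbook form rather than the slide's own formula, so the notation is an assumption: y_i are incomes, μ is their mean, n is the number of units, and ε is the inequality-aversion parameter.

```latex
\[
A(\varepsilon) = 1 -
  \left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_i}{\mu}\right)^{1-\varepsilon}\right]^{\frac{1}{1-\varepsilon}},
  \qquad \varepsilon > 0,\ \varepsilon \neq 1
\]
\[
A(1) = 1 - \exp\left[\frac{1}{n}\sum_{i=1}^{n}\log\frac{y_i}{\mu}\right]
\]
```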
5
Estimation of inequality indices
  • These indices are routinely calculated by many
    analysts
  • The most commonly-used programs among Stata users
    are ineqdeco and inequal7 (available using ssc)
  • But only rarely do analysts report estimates of
    the associated sampling variances (or SEs) of the
    estimates!

6
Estimation of inequality indices
  • Analytical derivations to date have omitted some
    important situations (and indices)
  • Most derivations assume i.i.d. observations (cf.
    survey clustering or other sample dependencies!),
    and don't consider probability weighting (cf.
    stratification!)
  • The methods that do exist are not well known
  • Lack of available software
  • But cf. geivars (Cowell (1989); linearization
    methods, i.i.d. assumptions) and ineqerr
    (bootstrap), both available using ssc

7
What we provide
  • Estimates of indices and associated sampling
    variances for all members of the GE and Atkinson
    classes, while also
  • Accounting for clustering and stratification, and
    for the i.i.d. case
  • Analytical results (see our paper) and new Stata
    programs (version 8.2): svygei and svyatk
  • Based on Taylor-series linearization methods
    combined with a result from Woodruff (JASA,
    1971).
  • Results don't apply to the Gini coefficient.

8
Overview of analytical derivation
  • Write the estimator of each index as a function of
    population totals (involves sums over clusters,
    weights etc.)
  • (Taylor-series approximation) Variance of each
    estimator can be approximated by the variance of
    the 1st-order residual
  • As is, each expression is not easily calculated
  • But (Woodruff) reversing the order of summation in
    the residual → estimation is equivalent to
    derivation of the sampling variance of a total
    estimator, for which one can apply standard svy
    methods
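The four steps above can be sketched for one index, the MLD (GE(0)). This is a minimal illustration under stated assumptions, not the svygei implementation: the function name, the toy data and the with-replacement between-PSU variance formula are choices made here, and the three totals (sum of weights, weighted income, weighted log income) are the ones the MLD needs.

```python
import math
from collections import defaultdict


def mld_with_linearized_variance(y, w, psu, stratum):
    """Weighted MLD, i.e. GE(0), plus a first-order variance estimate."""
    # Step 1: the index as a function of three estimated population totals.
    u0 = sum(w)                                          # sum of weights
    u1 = sum(wi * yi for wi, yi in zip(w, y))            # weighted income total
    t0 = sum(wi * math.log(yi) for wi, yi in zip(w, y))  # weighted log-income total
    mld = math.log(u1 / u0) - t0 / u0

    # Step 2: first-order residual per observation, i.e. the gradient of the
    # index with respect to (u0, u1, t0) applied to (1, y_i, log y_i).
    z = [(t0 / u0 ** 2 - 1.0 / u0) + yi / u1 - math.log(yi) / u0 for yi in y]

    # Step 3 (Woodruff): sum the weighted residuals within each PSU, so the
    # problem becomes the variance of an estimated total.
    totals = defaultdict(lambda: defaultdict(float))
    for zi, wi, p, h in zip(z, w, psu, stratum):
        totals[h][p] += wi * zi

    # Step 4: between-PSU variance within strata (with-replacement formula,
    # an assumption of this sketch).
    variance = 0.0
    for psu_totals in totals.values():
        n_h = len(psu_totals)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        mean_h = sum(psu_totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum(
            (t - mean_h) ** 2 for t in psu_totals.values()
        )
    return mld, variance


# Toy data, invented for illustration: 8 persons, 4 PSUs, 2 strata.
incomes = [10.0, 10.0, 20.0, 40.0, 15.0, 30.0, 25.0, 60.0]
weights = [1.0, 1.0, 2.0, 2.0, 1.5, 1.5, 1.0, 1.0]
psus = [1, 1, 2, 2, 3, 3, 4, 4]
strata = [1, 1, 1, 1, 2, 2, 2, 2]
mld, var = mld_with_linearized_variance(incomes, weights, psus, strata)
print(f"MLD = {mld:.4f}, SE = {math.sqrt(var):.4f}")
```

Because the index is scale invariant, the weighted residuals sum to zero over the whole sample; the variance comes only from how the PSU totals differ within strata.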

9
The programs svygei and svyatk
  • svygei varname [if exp] [in range] [, alpha(#)
    subpop(varname) level(#)]
  • svyatk varname [if exp] [in range] [, epsilon(#)
    subpop(varname) level(#)]
  • Where, of course, the data have first been
    svyset.
  • How the data are organised and described using
    svyset is of crucial importance

svygei: calculations for α = -1, 0, 1, 2
(use the alpha() option to choose one other than
these)
svyatk: calculations for ε = 0.5, 1, 1.5, 2, 2.5
(use the epsilon() option to choose one other than
these)
10
Survey data set-up for estimation of inequality
among individuals
  • 1) Observation unit is the person; sampling unit
    is the household. All persons in each household
    are attributed the equivalised income of the
    household to which they belong. Individual
    sample weight available (xwgt), but no
    information about PSU or strata
  • 2) As 1), except PSU and strata information is
    also known (includes allowance for
    within-household correlation)
  • 3) Observation unit is the household; sampling
    unit is the household
  • weight (xhhwgt) = household sample weight ×
    household size
  • no information about PSU or strata

1) svyset [pweight=xwgt], psu(hh_id)
2) svyset [pweight=xwgt], psu(PSU_id) strata(STRATA_id)
3) svyset [pweight=xhhwgt]
   → i.i.d. case
11
Illustration
  • German Socio-Economic Panel (GSOEP), wave 18 data
    (2001) used as a cross-section
  • 12,939 individuals in 5,195 households; 1,004 PSUs
    (psu), 169 strata (strata)
  • Equivalized (square-root equivalence scale)
    post-tax post-benefit household income (eq)
  • Each individual attributed with the equivalised
    income of her household (→ clustering within
    households)
  • Even if the survey does not include PSU and strata
    identifiers, you should account for this (use the
    household identifier as the PSU variable)

12
Generalized Entropy indices
  • . ssc install svygei_svyatk
  • . version 8.2
  • . svyset [pweight=xwgt], psu(psu) strata(strata)
  • . svygei eq

Complex survey estimates of Generalized Entropy inequality indices

pweight: xwgt                          Number of obs    = 12939
Strata:  strata                        Number of strata = 169
PSU:     psu                           Number of PSUs   = 1004
                                       Population size  = 31487411

---------------------------------------------------------------------------
Index        Estimate    Std. Err.       z     P>|z|   [95% Conf. Interval]
---------------------------------------------------------------------------
GE(-1)       .1179647    .00614786     19.19   0.000   .1059151   .1300143
MLD          .1020797    .00495919     20.58   0.000   .0923599   .1117996
Theil        .1027892    .0058706      17.51   0.000   .091283    .1142954
GE(2)        .1201693    .00962991     12.48   0.000   .101295    .1390436
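The z and confidence-interval columns follow from the estimate and its standard error. The sketch below assumes a normal approximation (z = estimate / SE, interval = estimate ± critical value × SE), which is consistent with the z column shown but is not taken from the svygei source code; the function name is illustrative.

```python
from statistics import NormalDist


def z_and_interval(estimate, std_err, level=0.95):
    """Return the z-ratio and a normal-approximation confidence interval."""
    z_ratio = estimate / std_err
    critical = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return z_ratio, estimate - critical * std_err, estimate + critical * std_err


# GE(-1) row of the svygei output: estimate and standard error
z_ratio, lower, upper = z_and_interval(0.1179647, 0.00614786)
print(f"z = {z_ratio:.2f}, CI = [{lower:.7f}, {upper:.7f}]")
```

Applied to the GE(-1) row, this should reproduce the reported z-ratio and interval up to rounding.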

13
Atkinson indices
  • . svyset [pweight=xwgt], psu(psu) strata(strata)
  • . svyatk eq

Complex survey estimates of Atkinson inequality indices

pweight: xwgt                          Number of obs    = 12939
Strata:  strata                        Number of strata = 169
PSU:     psu                           Number of PSUs   = 1004
                                       Population size  = 31487411

---------------------------------------------------------------------------
Index        Estimate    Std. Err.       z     P>|z|   [95% Conf. Interval]
---------------------------------------------------------------------------
A(0.5)       .0496963    .0025263      19.67   0.000   .0447448   .0546477
A(1)         .0970424    .00447794     21.67   0.000   .0882658   .105819
A(1.5)       .1434968    .00616915     23.26   0.000   .1314055   .1555881
A(2)         .1908923    .00804946     23.71   0.000   .1751157   .206669
A(2.5)       .2432834    .01237288     19.66   0.000   .219033    .2675338
---------------------------------------------------------------------------
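Two of the Atkinson estimates can be cross-checked against the Generalized Entropy estimates from svygei, because under the standard textbook definitions A(ε) is a monotone transformation of GE(α) with α = 1 - ε. The sketch below assumes those definitions (they are not quoted from the slides), and the function name is illustrative.

```python
import math


def atkinson_from_ge(ge_value, alpha):
    """Atkinson index A(epsilon) implied by GE(alpha), with epsilon = 1 - alpha."""
    if alpha >= 1:
        raise ValueError("need alpha < 1, i.e. epsilon = 1 - alpha > 0")
    if alpha == 0:
        return 1.0 - math.exp(-ge_value)              # A(1) = 1 - exp(-MLD)
    moment = alpha * (alpha - 1.0) * ge_value + 1.0   # mean of (y / mu) ** alpha
    return 1.0 - moment ** (1.0 / alpha)


# MLD and GE(-1) estimates reported by svygei for the same data
print(f"A(1) = {atkinson_from_ge(0.1020797, 0):.7f}")
print(f"A(2) = {atkinson_from_ge(0.1179647, -1):.7f}")
```

The results should agree with the A(1) and A(2) rows of the svyatk output up to rounding. The standard errors do not carry over so directly, since they involve the derivative of the transformation.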

14
Subpopulation option
  • . gen female = sex==2
  • . svygei eq, subpop(female)

Complex survey estimates of Generalized Entropy inequality indices

pweight: xwgt                          Number of obs    = 12939
Strata:  strata                        Number of strata = 169
PSU:     psu                           Number of PSUs   = 1004
                                       Population size  = 31487411
Subpop.: female                        Subpop. size     = 16499055

---------------------------------------------------------------------------
Index        Estimate    Std. Err.       z     P>|z|   [95% Conf. Interval]
---------------------------------------------------------------------------
GE(-1)       .112828     .00573308     19.68   0.000   .1015914   .1240646
MLD          .0994741    .00471331     21.10   0.000   .0902362   .1087121
Theil        .0998958    .00543287     18.39   0.000   .0892476   .110544
GE(2)        .1151464    .00877057     13.13   0.000   .0979564   .1323364

15
Empirical illustration in our paper
  • GSOEP income data for 2001 (same as used here)
  • British Household Panel Survey for 2001 (9,979
    individuals in 4,058 households; 250 PSUs, 75
    strata)
  • Results
  • Inequality larger in Britain than in Germany, for
    all indices, and difference is statistically
    significant
  • z-ratios (index ÷ SE) vary from 7.5 to 23.9 (DE)
    and 5.1 to 31.9 (GB), being smallest for
    top-sensitive indices and largest for
    middle-sensitive indices
  • Although the sample is larger in Germany, z-ratios
    are not always larger (→ different sample designs)

16
Empirical illustration (ctd.)
Index     Germany                       Great Britain
          Est.     Std.     z-rat.      Est.     Std.     z-rat.
GE(-1)    .11796   .00614   19.19       .31329   .03751    8.35
MLD       .10207   .00496   20.58       .17420   .00608   28.64
Theil     .10278   .00587   17.51       .16769   .00755   22.19
GE(2)     .12016   .00963   12.48       .21164   .01868   11.33

Hypothesis of equal index values in the two countries: reject
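One way to see why equality is rejected is a z-test on the difference between the two countries' estimates. The sketch below assumes the German and British samples are independent and that a normal approximation applies; the paper's own test may differ, and the function name is illustrative.

```python
import math


def difference_z(est_a, se_a, est_b, se_b):
    """z-statistic for est_b - est_a, assuming independent samples."""
    return (est_b - est_a) / math.hypot(se_a, se_b)


# (estimate, std. error) pairs copied from the table: Germany, Great Britain
table = {
    "GE(-1)": ((0.11796, 0.00614), (0.31329, 0.03751)),
    "MLD": ((0.10207, 0.00496), (0.17420, 0.00608)),
    "Theil": ((0.10278, 0.00587), (0.16769, 0.00755)),
    "GE(2)": ((0.12016, 0.00963), (0.21164, 0.01868)),
}
z_stats = {
    index: difference_z(de, se_de, gb, se_gb)
    for index, ((de, se_de), (gb, se_gb)) in table.items()
}
for index, z_stat in z_stats.items():
    print(f"{index}: z = {z_stat:.2f}")
```

Every statistic exceeds the two-sided 5% critical value of about 1.96, in line with the statement that inequality is significantly larger in Britain for all indices.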
17
Empirical illustration (ctd.)
  • Effects of different assumptions about survey
    design on sampling variance estimates?
  • For each index, the estimated standard error is
    larger if one accounts for survey clustering and
    stratification (unsurprising), but
  • Results suggest that accounting for survey design
    features per se has little (additional) effect
    on variance estimates as long as the replication
    of incomes within multi-person households is
    accounted for

18
Conclusions
  • Researchers now have the means to estimate
    sampling variances for most of the inequality
    indices in common use, accommodating a range of
    potential assumptions about design effects
  • Topics for future research
  • GE indices are additively decomposable by
    population subgroup (→ ineqdeco): extend results
    here to the components of decompositions
  • Extend results to Gini coefficient and other
    measures based on order-statistics (Lorenz curves
    etc.)

19
Selected references
  • Biewen, M. and Jenkins, S.P. (2006) Estimation of
    Generalized Entropy and Atkinson indices from
    complex survey data, forthcoming in Oxford
    Bulletin of Economics and Statistics
  • Cowell, F.A. (2000) Measurement of inequality,
    in A.B. Atkinson and F. Bourguignon (eds),
    Handbook of Income Distribution, Vol. 1,
    Elsevier, Amsterdam
  • Woodruff, R.S. (1971) A simple method for
    approximating the variance of a complicated
    estimate, Journal of the American Statistical
    Association, 66, 411-414