Title: A method for estimating the age-specific mortality pattern in limited populations of small areas
1A method for estimating the
age-specific mortality pattern in limited
populations of small areas
- Anastasia Kostaki
- Athens University of Economics and Business
- email kostaki_at_aueb.gr
- Byron Kotzamanis
- University of Thessaly, Greece
- email bkotz_at_prd.uth.gr
2Need for analytical and reliable mortality data
- e.g.
- for providing population projections
- for constructing complete life tables
- for calculating net reproduction rates
- for age-specific mortality comparisons
- for constructing complete multiple decrement
tables
3Problems in mortality data of small population
areas
- A. Limited information
- Aggregated data or
- Incomplete data sets
- B. Misleading information (data of low quality)
- Data affected by sources of systematic errors
(age misstatements, heaping, under-registration
of deaths, etc.).
- C. Unstable documentation (small exposed-to-risk
population and low age-specific death counts)
4Limited information
- Incomplete data sets
- POSSIBLE WAYS OUT
- Fit a parametric model to the existing one-year
values in order to estimate the missing values
and also smooth the rates, giving estimates
closer to the true probabilities underlying the
empirical rates.
- e.g. 1992 Kostaki, A. "A Nine-Parameter
Version of the Heligman-Pollard Formula".
Mathematical Population Studies, Vol. 3, No. 4,
pp. 277-288.
- or
- Apply a nonparametric graduation technique to
the existing one-year values.
- e.g. 2005 Kostaki, A., Peristera, P.
"Graduating mortality data using Kernel
techniques: Evaluation and comparisons". Journal
of Population Research, Vol. 22(2), pp. 185-197.
- 2009 Kostaki, A., Moguerza, M.J.,
Olivares, A., Psarakis, S. "Graduating the
age-specific fertility pattern using Support
Vector Machines". Demographic Research, Vol.
20(25), pp. 599-622.
- 2010 Kostaki, A., Moguerza, M.J.,
Olivares, A., Psarakis, S. "Support Vector Machines
as tools for Mortality Graduations", to appear in
Canadian Studies in Population, Vol. 38, No. 3-4, pp.
37-58.
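As an illustration of the nonparametric route, here is a minimal Python sketch of a kernel-weighted average of the observed one-year rates. It is an assumption-laden example (a plain Gaussian-kernel weighted mean with a hypothetical bandwidth and made-up rates), not the specific estimator evaluated in the cited papers. It smooths the observed rates and also returns values at ages where no rate is available.

```python
import math

def kernel_graduate(rates_by_age, ages, bandwidth=2.0):
    """Gaussian-kernel weighted average of observed rates at each requested age.

    rates_by_age: dict mapping age -> observed one-year rate (ages may be missing).
    bandwidth: hypothetical smoothing parameter, in years of age.
    """
    smoothed = {}
    for target in ages:
        weights = {
            age: math.exp(-0.5 * ((target - age) / bandwidth) ** 2)
            for age in rates_by_age
        }
        total = sum(weights.values())
        smoothed[target] = sum(weights[a] * rates_by_age[a] for a in rates_by_age) / total
    return smoothed

# Hypothetical one-year rates with ages 52 and 55 missing.
observed = {50: 0.0041, 51: 0.0047, 53: 0.0052, 54: 0.0061, 56: 0.0070}
graduated = kernel_graduate(observed, ages=range(50, 57))
print(graduated[52], graduated[55])
```

The bandwidth controls the trade-off between smoothness and fidelity to the observed rates; in practice it would have to be chosen from the data.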
5B. Misleading information (data of low quality)
- A typical problem
- HEAPING: when declaring an age, respondents tend
to round it off to a multiple of five.
6- A way out
- Group death counts in five-year groups
whose central ages are the multiples of five. Form
the five-year death rates for these groups. Then
apply an expanding technique in order to estimate
the correct rates and/or counts.
- TECHNIQUES FOR EXPANSION
- 1991 Kostaki, A. "The Heligman-Pollard
Formula as a Tool for Expanding an Abridged Life
Table". Journal of Official Statistics, Vol. 7, No.
3, pp. 311-323.
- 2000 Kostaki, A., Lanke, J. "Degrouping mortality
data for the elderly". Mathematical Population
Studies, Vol. 7(4), pp. 331-341.
- 2000 Kostaki, A. "A relational technique for
estimating the age-specific mortality pattern
from grouped data". Mathematical Population
Studies, Vol. 9(1), pp. 83-95.
- 2001 Kostaki, A., Panousis, E. "Methods of
expanding abridged life tables: Evaluation and
Comparisons". Demographic Research, Vol. 5(1), pp.
1-15. http://www.demographic-research.org/volumes/vol5/1/
- These techniques can also be used if the
data are provided in five-year age groups.
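The grouping step against heaping can be sketched in Python as follows. This is only an illustration: the function name and the death counts are hypothetical, and each group is assumed to cover the two ages below and the two ages above its central age.

```python
def group_deaths_centered(deaths_by_age, max_center):
    """Sum single-year death counts into five-year groups centred on multiples of five.

    deaths_by_age: dict mapping single age -> reported death count.
    Returns a dict mapping central age c (5, 10, ...) -> deaths at ages c-2 .. c+2.
    """
    grouped = {}
    for center in range(5, max_center + 1, 5):
        grouped[center] = sum(
            deaths_by_age.get(age, 0) for age in range(center - 2, center + 3)
        )
    return grouped

# Hypothetical counts with visible heaping at ages 60 and 65.
reported = {58: 3, 59: 2, 60: 9, 61: 2, 62: 3, 63: 4, 64: 3, 65: 11, 66: 3, 67: 4}
grouped = group_deaths_centered(reported, max_center=65)
print(grouped[60], grouped[65])
```

Centring the groups on the preferred digits keeps each heaped count inside the group it was most likely drawn from.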
7- C. Small exposed-to-risk populations, low death
counts (a sneaky problem)
- In contrast to problems of types A and B
(incompleteness, bad quality), problems of type C
are easy for the researcher to neglect, though
their impact can lead to seriously misleading
results.
- Let us consider this problem, which is highly
relevant when we deal with spatial (small-area)
population analysis or limited population samples.
8- Denote by ${}_nD_x$ the observed death count in
the age interval $[x, x+n)$.
- ${}_nD_x$ is a binomially distributed random variable
with
$E({}_nD_x) = E_x \cdot {}_nq_x$ and $\mathrm{Var}({}_nD_x) = E_x \cdot {}_nq_x \cdot {}_np_x$,
- where $E_x$ is the exposed-to-risk population at
age $x$,
- and ${}_nq_x$ is the unknown probability of dying in
$[x, x+n)$, with ${}_np_x = 1 - {}_nq_x$.
9- Let us now consider the observed death rate,
${}_nQ_x = {}_nD_x / E_x$.
- Since ${}_nD_x \sim \mathrm{Bin}(E_x, {}_nq_x)$, while $E_x$ is large and ${}_nq_x$
very small,
- ${}_nD_x$ can also be considered
as approximately $\mathrm{Po}(E_x \cdot {}_nq_x)$, with
$E({}_nD_x) = \mathrm{Var}({}_nD_x) = E_x \cdot {}_nq_x$,
- and also as asymptotically normally distributed.
Therefore ${}_nQ_x$ can also be considered as
asymptotically normally distributed, with
$E({}_nQ_x) = {}_nq_x$ and $\mathrm{Var}({}_nQ_x) = {}_nq_x (1 - {}_nq_x)/E_x$.
10- Hence, the unknown probability of dying in the
age interval $[x, x+n)$, ${}_nq_x$, is expected to have a
value in the interval
${}_nQ_x \pm z_{1-\alpha/2} \sqrt{{}_nQ_x (1 - {}_nQ_x)/E_x}$
- or simpler
11- What does all this mean in practice?
- Consider a real example: in Eurytania,
Greece, the exposed population at age 10 is $E_{10} = 1888$,
while the death count at age 10 is ${}_5D_{10} = 0$.
Thus the observed death rate ${}_5Q_{10}$ at age
10 is zero!
- Does this mean that ${}_5q_{10} = 0$? Unfortunately
not!
- => We are not able to provide estimates for the
value of ${}_5q_{10}$.
- Another example from the same population:
- The death count at age 55 is $D_{55} = 6$ and the
exposed population is $E_{55} = 1464$.
- Thus $Q_{55} = 6/1464 = 0.0041$.
- Calculating the confidence interval, we
conclude that $-0.0009 < q_{55} < 0.0091$ !!!
- $Q_x$ can be a highly inaccurate estimator of
$q_x$ when $D_x$ is small.
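The two Eurytania examples can be checked with a short Python sketch of the normal-approximation interval $Q \pm z \sqrt{Q(1-Q)/E}$. The function name is hypothetical. The multiplier z = 3 is inferred from the quoted limits rather than stated in the slides: it reproduces $-0.0009 < q_{55} < 0.0091$, while z = 1.96 gives roughly (0.0008, 0.0074).

```python
import math

def simple_interval(deaths, exposed, z):
    """Normal-approximation interval Q +/- z * sqrt(Q(1-Q)/E) for a death probability."""
    rate = deaths / exposed
    half_width = z * math.sqrt(rate * (1.0 - rate) / exposed)
    return rate - half_width, rate + half_width

# Age 55: 6 deaths among 1464 exposed.
low3, high3 = simple_interval(6, 1464, z=3.0)
low95, high95 = simple_interval(6, 1464, z=1.96)
print(f"z=3:    ({low3:.4f}, {high3:.4f})")
print(f"z=1.96: ({low95:.4f}, {high95:.4f})")

# Age 10: zero deaths among 1888 exposed collapses the interval to the single point 0.
zero_case = simple_interval(0, 1888, z=3.0)
print(zero_case)
```

The negative lower limit and the degenerate interval at zero deaths are the two symptoms described above.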
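12Alternatively, a more properly defined CI might
be derived from the following:
$P\left( -z \le \frac{{}_nQ_x - {}_nq_x}{\sqrt{{}_nq_x (1 - {}_nq_x)/E_x}} \le z \right) \approx 1 - \alpha$, with $z = z_{1-\alpha/2}$;
this is equivalent to
$P\left( ({}_nQ_x - {}_nq_x)^2 \le z^2 \, {}_nq_x (1 - {}_nq_x)/E_x \right) \approx 1 - \alpha$.
The inequality in the probability statement is
satisfied for those values of ${}_nq_x$ which lie
between the roots of the
quadratic equation
$(1 + z^2/E_x) \, {}_nq_x^2 - (2 \, {}_nQ_x + z^2/E_x) \, {}_nq_x + {}_nQ_x^2 = 0$.
13We therefore have
$P(q_L \le {}_nq_x \le q_U) \approx 1 - \alpha$,
and $[q_L, q_U]$ forms a CI for ${}_nq_x$,
where $q_L, q_U$ are given by
$q_{L,U} = \dfrac{{}_nQ_x + \frac{z^2}{2E_x} \mp z \sqrt{\frac{{}_nQ_x (1 - {}_nQ_x)}{E_x} + \frac{z^2}{4E_x^2}}}{1 + \frac{z^2}{E_x}}$
(2002 Garthwaite, P.H., Jolliffe, I.T., Jones, B.,
Statistical Inference, 2nd edition, Oxford Science
Publications).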
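A minimal Python sketch of this interval follows. It assumes the quadratic is the usual score-type one, $(1 + z^2/E) q^2 - (2Q + z^2/E) q + Q^2 = 0$ with $Q = D/E$, and solves it directly. The function name and the choice z = 3 are illustrative assumptions. For 6 deaths among 1464 exposed it gives limits of about 0.0013 and 0.0130.

```python
import math

def score_interval(deaths, exposed, z):
    """Roots of (1 + z^2/E) q^2 - (2Q + z^2/E) q + Q^2 = 0, where Q = deaths/exposed."""
    rate = deaths / exposed
    a = 1.0 + z * z / exposed
    b = -(2.0 * rate + z * z / exposed)
    c = rate * rate
    root = math.sqrt(b * b - 4.0 * a * c)
    return (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)

# Age 55 example: 6 deaths among 1464 exposed.
low, high = score_interval(6, 1464, z=3.0)
print(f"({low:.4f}, {high:.4f})")

# With zero deaths the lower limit is 0 but the upper limit stays positive.
low0, high0 = score_interval(0, 1888, z=3.0)
print(low0, high0)
```

Unlike the simple interval, the limits cannot fall below zero, and a zero death count still yields a non-degenerate upper limit.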
14Considering the previous example
- The count of deaths at age 55, $D_{55}$,
is 6 and the exposed population, $E_{55}$, is equal
to 1464.
- $Q_{55} = 6/1464 = 0.0041$
- Putting these values in $Q_{55} \pm z_{1-\alpha/2} \sqrt{Q_{55}(1 - Q_{55})/E_{55}}$
- we obtained $-0.0009 < q_{55} < 0.0091$ (95% CI:
0.0008, 0.0074),
- while now, using the CI defined above,
- we obtain $0.0012 < q_{55} < 0.0130$ (95% CI:
0.0014, 0.0112),
- which is better in the sense that the limits
are always positive, but still too wide to be
useful!
15Possible ways out
- Use wide age intervals (five-year or ten-year
ones).
- Consider wide periods of investigation (up to
ten-year periods).
- Utilize an expansion technique for estimating
the unknown age-specific probabilities from
grouped data.
- The latter can be a cure for problems of
types A and B too.
- To begin with, a technique for estimating the
age-specific death counts from data given in
age groups is presented.
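Why wider age intervals and longer periods help can be seen from the Poisson approximation: $\mathrm{Var}(Q) \approx q/E$, so the relative standard error of $Q$ is about $1/\sqrt{D}$. The Python sketch below illustrates this with the six deaths of the age-55 example and a hypothetical pooled count ten times larger; the function name is an assumption.

```python
import math

def relative_standard_error(deaths):
    """Approximate SE(Q)/Q under the Poisson approximation, which equals 1/sqrt(D)."""
    return 1.0 / math.sqrt(deaths)

# Six deaths at one age in one year versus a hypothetical pooled count of sixty.
single = relative_standard_error(6)
pooled = relative_standard_error(60)
print(f"{single:.3f} {pooled:.3f}")
```

Pooling ages or years raises the death count and so shrinks the relative error, at the price of losing age or time detail, which is what the expansion technique then tries to recover.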
16A technique for estimating age-specific death
counts from data given in age groups
- Consider the empirical death count for the
five-year age interval $[x, x+5)$, ${}_5D_x$, $x = 0, 5, 10, \ldots, w-5$.
- Then the five-year death rates ${}_5q_x$, $x = 0, 5, 10, \ldots$,
are calculated by
${}_5q_x = {}_5D_x \Big/ \sum_{y \ge x} {}_5D_y$,
where the summation is restricted to multiples
of five. Obviously, treating the
exposed-to-risk population at a given age as the
sum of deaths from that age onwards is strictly valid only
when the data concern a closed cohort. However,
as will be demonstrated, this procedure
produces excellent results.
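The calculation of the five-year rates can be sketched in Python as below, assuming the denominator for each group is the sum of deaths in that group and in all later ones. The function name and the death counts are hypothetical.

```python
def five_year_rates(grouped_deaths):
    """Return 5qx for x = 0, 5, 10, ... from five-year death counts.

    grouped_deaths[k] is the death count in the age interval [5k, 5k+5).
    The denominator is the sum of deaths in that group and all later groups,
    which treats the data as if they came from a closed cohort.
    """
    rates = []
    remaining = sum(grouped_deaths)
    for count in grouped_deaths:
        rates.append(count / remaining if remaining > 0 else float("nan"))
        remaining -= count
    return rates

# Hypothetical counts for four consecutive five-year age groups.
deaths = [10, 5, 15, 20]
rates = five_year_rates(deaths)
print(rates)
```

By construction the last age group always gets a rate of 1, since nobody survives beyond it in this closed-cohort view.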
17- Let us consider the set of abridged
${}_5q_x$-values as calculated before. The next step
is to expand them. To do so:
- Let us also consider a set of one-year
probabilities, $q_x^{(S)}$ (S for Standard), of a
standard complete life table.
- Under the assumption that the force of
mortality, $\mu(x)$, underlying the target abridged
life table is, at each age of the 5-year age
interval $[x, x+5)$, a constant multiple of the
one underlying the standard life table in the
same age interval, $\mu^{(S)}(x)$, i.e.
$\mu(t) = {}_5K_x \cdot \mu^{(S)}(t)$ for $x \le t < x+5$,
18- the one-year probabilities $q_{x+i}$, $i = 0, 1, \ldots, 4$,
for each age in the five-year age interval can
be calculated using
$q_{x+i} = 1 - \left(1 - q_{x+i}^{(S)}\right)^{{}_5K_x}$   (1)
where
${}_5K_x = \ln(1 - {}_5q_x) \Big/ \sum_{i=0}^{4} \ln\left(1 - q_{x+i}^{(S)}\right)$.
An inherent property of the new technique is that
its results fulfill the desired relation
$\prod_{i=0}^{4} (1 - q_{x+i}) = 1 - {}_5q_x$.
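The expansion step can be sketched in Python as follows. The sketch assumes the form implied by the proportional-force assumption, $q_{x+i} = 1 - (1 - q_{x+i}^{(S)})^{K}$ with $K = \ln(1 - {}_5q_x) / \sum_i \ln(1 - q_{x+i}^{(S)})$. The function name, the standard-table values and the target ${}_5q_x$ are all hypothetical.

```python
import math

def expand_interval(q5, standard_q):
    """Split one five-year probability into five one-year probabilities.

    q5: abridged probability of dying in [x, x+5); must be strictly below 1.
    standard_q: the five one-year probabilities of a standard table for the same ages.
    """
    k = math.log(1.0 - q5) / sum(math.log(1.0 - qs) for qs in standard_q)
    return [1.0 - (1.0 - qs) ** k for qs in standard_q]

# Hypothetical standard-table values for ages 55..59 and a hypothetical target 5q55.
standard = [0.0080, 0.0088, 0.0097, 0.0107, 0.0118]
target_q5 = 0.0600
one_year = expand_interval(target_q5, standard)

# The one-year survival probabilities multiply back to the five-year survival probability.
survival = math.prod(1.0 - q for q in one_year)
print(one_year, 1.0 - survival)
```

The product check at the end is the relation $\prod_{i=0}^{4} (1 - q_{x+i}) = 1 - {}_5q_x$, which holds for any standard table because $K$ is chosen to enforce it.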
19Italy, Males 1990-91
20Norway males 1951-55
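21- Now, for the ages that are multiples of five,
the required values are calculated using
[formula missing in transcript],
- while for the rest of the ages we calculate them
using [formula missing in transcript].
22- It is interesting to observe that the resulting
values fulfill the desirable property
[formula missing in transcript].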
27Comments and guidelines
- The results are good in the sense that they are
very close to the true values and also fulfill
desirable properties.
- The choice of the standard table does not affect
the results. Use any complete life table you
have at hand!
- Easy to apply, even with the simplest software.
28 29- References
- 2002 Garthwaite, P.H., Jolliffe, I.T.,
Jones, B. Statistical Inference, 2nd edition,
Oxford Science Publications.
- 2001 Karlis, D., Kostaki, A. "Bootstrap
techniques for mortality models". Biometrical
Journal, Vol. 44(7), pp. 850-866.
- 1992 Kostaki, A. "A Nine-Parameter
Version of the Heligman-Pollard Formula".
Mathematical Population Studies, Vol. 3, No. 4,
pp. 277-288.
- 2005 Kostaki, A., Peristera, P.
"Graduating mortality data using Kernel
techniques: Evaluation and comparisons". Journal
of Population Research, Vol. 22(2), pp. 185-197.
- 2009 Kostaki, A., Moguerza, M.J.,
Olivares, A., Psarakis, S. "Graduating the
age-specific fertility pattern using Support
Vector Machines". Demographic Research, Vol.
20(25), pp. 599-622.
- 2010 Kostaki, A., Moguerza, M.J.,
Olivares, A., Psarakis, S. "Support Vector Machines
as tools for Mortality Graduations", to appear in
Canadian Studies in Population, Vol. 38, No. 3-4, pp.
37-58.