Models for cost analysis in health care: a critical and selective review

About This Presentation

Title:

Models for cost analysis in health care: a critical and selective review

Description:

Ljubljana, 1st April 2005 Models for cost analysis in health care: a critical and selective review Dario Gregori Department of Public Health and Microbiology ... – PowerPoint PPT presentation

Slides: 60

Provided by: DarioG5

Transcript and Presenter's Notes

Title: Models for cost analysis in health care: a critical and selective review

1
Models for cost analysis in health care: a
critical and selective review
Department of Bioinformatics, Ljubljana, 1st April
2005

Dario Gregori
Department of Public Health and Microbiology,
University of Torino
Giulia Zigon, Department of Statistics,
University of Firenze
Rosalba Rosato, Eva Pagano, Servizio di
Epidemiologia dei Tumori, Università di Torino,
CPO Piemonte
Simona Bo, Gianfranco Pagano, Dipartimento di
Medicina Interna, Università di Torino
Alessandro Desideri, Service of Cardiology,
Castelfranco Veneto Hospital

University of Torino, Department of Public Health
and Microbiology
2
Outline

Cost-effectiveness and cost-analysis
Problems in cost-analysis of clinical data
zero costs
skewness
censoring
Models for cost data
Two case studies
Diabetes costs in the Molinette cohort
COSTAMI trial

3
The Molinette Diabetes Cohort

3892 subjects, including all type 2 diabetic
patients, resident in region Piedmont, attending
the Diabetic Clinic of the San Giovanni Battista
Hospital of the city of Torino (region Piedmont,
Italy) during 1995 and alive at 1st January 1996.
A mortality and hospitalization follow-up was
carried out up to 30th June 2000.
A sub-cohort of 2550 patients having at least one
hospitalization in the subsequent years was also
identified.
Demographic data (age, sex) and clinical data
relative to the year 1995 (duration of disease,
i.e. years of diabetes, and number of other
co-morbidities) were recorded.
Costs (in euros) for day-hospital and ordinary
hospitalizations were calculated with reference
to the Italian DRG system.

4
The COSTAMI study

487 patients with uncomplicated AMI were randomly
assigned to three different strategies:
1. (132 patients) early (days 3-5) use of
pharmacological stress echocardiography and
discharge on days 7-9 in case of a negative test
result
2. (130 patients) pre-discharge exercise ECG, that
is a maximal, symptom-limited test on days 7-9,
followed by discharge in case of a negative test
result
3. (225 patients) clinical evaluation and hospital
discharge on days 7-9.
The suggested strategy in case of a positive test
for strategies 1 and 2 was coronary angiography
followed by ischaemia-guided revascularisation
(Desideri et al., 2003).
A 1-year follow-up for medical costs was
carried out. The cost of hospitalization was
estimated from the mean reimbursement for the
diagnosis-related groups (DRG).

5
The CE Incremental Ratio

Goal is to compare efficacy with costs
T1, T2 treatment-groups of patients
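
The ratio itself is not part of the transcript. As a reminder only, the usual textbook form of the incremental cost-effectiveness ratio is sketched below. The symbols \bar{C}_j and \bar{E}_j (mean cost and mean efficacy in group T_j) are assumed notation, not taken from the slide.

```latex
\mathrm{ICER}
  = \frac{\Delta C}{\Delta E}
  = \frac{\bar{C}_1 - \bar{C}_2}{\bar{E}_1 - \bar{E}_2}
```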

6
The Cost-Efficacy plane
[Figure: cost-efficacy plane with axes ΔC and ΔE, an upper and a lower threshold line, and regions R1, R1A, R1B, R1c, R2, R2A, R2B, R2c]
7
Dominance

Laska Wakker work (late 80s)
ΔC < 0, ΔE > 0: T1 is dominant
ΔC > 0, ΔE < 0: T2 is dominant
ΔC > 0, ΔE > 0: T1 is more effective and more costly
ΔC < 0, ΔE < 0: T1 is less costly but less effective

If effects are equivalent or of no interest, then
the approach is the analysis of costs alone
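
The four quadrant rules of this slide can be written as a small helper. This is only an illustrative sketch: the function name `classify` and the handling of points lying exactly on an axis are assumptions, not part of the presentation.

```python
def classify(delta_c: float, delta_e: float) -> str:
    """Label T1 versus T2 from the cost and efficacy differences (T1 minus T2)."""
    if delta_c < 0 and delta_e > 0:
        return "T1 dominant"
    if delta_c > 0 and delta_e < 0:
        return "T2 dominant"
    if delta_c > 0 and delta_e > 0:
        return "T1 more effective and more costly"
    if delta_c < 0 and delta_e < 0:
        return "T1 less costly but less effective"
    # Assumed convention: a zero difference gives no strict ordering
    return "no strict ordering"
```

Only the two mixed quadrants (both differences of the same sign) call for a trade-off between costs and effects.
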
8
Typical goals in cost-analysis

To get an estimate of the mean cost of treating
the disease
In experimental settings, to test for differences
between two or more groups
In observational settings, to identify
patient or structure characteristics influencing
costs
To get an estimate of the expected costs, at a
fixed time point, for specific types of patients
(cost profiling)

9
Typical problems in cost-analysis

A possibly large mass of observations with
zero cost
The asymmetry of the distribution: a minority of
individuals has high medical costs compared to
the rest of the population
Possible presence of censoring
Right censoring due to loss to follow-up or to an
administrative rule (O'Hagan, 2002)
Death censoring: dead patients are treated as lost
to follow-up, to compensate for higher/earlier
mortality at lower costs (Dudley et al., 1993)
General requirements are:
the censoring must be independent or
non-informative. This condition is needed because the
individuals still under observation must be
representative of the population at risk in each
group; otherwise the observed failure rate in
each group will be biased
the proportional hazards assumption must hold, yet
it may be violated by medical costs, which
accumulate at different rates

10
Proportionality on cost accumulation and censoring
Etzioni, 1999
11
Accumulation under alternatives (without
covariates)
12
Censoring: some conflicting definitions
Analysis | Censoring definition | Caveats
Administrative | Cost till death (O'Hagan, 2003) | Only dead patients have complete follow-up history; cost and survival are closely related
Loss at follow-up | Cost till death | Only dead patients have complete follow-up history; possible informative censoring
Death censoring | Cost up to a pre-specified time (Harrell, 1993) | Only patients alive at the end of follow-up are uncensored; informative censoring
No-censoring (actual data) | Observed costs | Downward bias in cost estimation
13
Cost distribution
Zero-cost patients: 2226
Min | 1st Q | Median | Mean | 3rd Q | Max
99.42 | 1938 | 3913 | 7278 | 9014 | 89650
14
Accumulation of costs over time
15
Studies with no-zero mass

OLS on untransformed use or expenditures
OLS for log(y) to deal with skewness
Box-Cox generalization
Gamma regression model with log link
Generalized Linear Models (GLM)
Robustness to skewness
Reduce influence of extreme cases
Good forecast performance
No systematic misfit over range of predictions
Efficiency of estimator

16
Linear models

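The Ordinary Least Squares (OLS) model assumes the
following linear form for the costs:
ci = xi'β + εi
It is estimated via Gauss-Markov (least squares) or
ML, in the latter case requiring normality and
constant variance of the residuals. To reduce
skewness in the residuals, the Box-Cox transform of
ci can be used:
ci(λ) = (ci^λ - 1)/λ for λ ≠ 0, and ln(ci) for λ = 0

Problems:
normality is still assumed
back-transformation to the original cost scale is
biased, and the bias depends on the residual variance;
thus, heteroscedasticity, if present, raises
additional efficiency and inference problems on
the transformed scale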
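
A minimal sketch of the Box-Cox transform, assuming the standard one-parameter form with power λ. The function name `box_cox` and the restriction to strictly positive costs are assumptions made for illustration.

```python
import math

def box_cox(cost: float, lam: float) -> float:
    """Standard one-parameter Box-Cox transform of a strictly positive cost."""
    if cost <= 0:
        raise ValueError("the Box-Cox transform requires a positive cost")
    if lam == 0:
        return math.log(cost)          # limiting case as lam -> 0
    return (cost ** lam - 1.0) / lam   # lam = 1 leaves costs unchanged up to a shift
```

λ = 1 corresponds to OLS on untransformed costs and λ = 0 to the log-normal model, so the power indexes a family of models between the two.
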

17
Log-normal models

A particular case of transformation is the log-normal model:
ln(Cij) ~ N(μj, σj²) for two treatments j = 1, 2.
In this case, E(Cij) = exp(μj + 0.5 σj²), and a test
of H0: μ1 - μ2 = 0 is a test for the geometric
means. This was argued to be less interesting for
policy makers, but observe that
H0: exp(μ1 + 0.5 σ1²) = exp(μ2 + 0.5 σ2²) implies
H0: μ1 - μ2 = 0 iff σ1² = σ2²,
making a test for the geometric means
equivalent to one on arithmetic means only in
case of homogeneity of variances in the treatment
groups
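
The point about geometric versus arithmetic means can be checked numerically. The sketch below uses made-up parameter values (log-scale mean 8.0, variances 0.5 and 1.5) that are not taken from the case studies.

```python
import math

def lognormal_mean(mu: float, sigma2: float) -> float:
    """Arithmetic mean of a log-normal cost: exp(mu + 0.5 * sigma^2)."""
    return math.exp(mu + 0.5 * sigma2)

def geometric_mean(mu: float) -> float:
    """Geometric mean of a log-normal cost: exp(mu)."""
    return math.exp(mu)

# Hypothetical arms with the same log-scale mean but different variances
arm1 = {"mu": 8.0, "sigma2": 0.5}
arm2 = {"mu": 8.0, "sigma2": 1.5}

same_geometric = geometric_mean(arm1["mu"]) == geometric_mean(arm2["mu"])
ratio_of_means = lognormal_mean(**arm2) / lognormal_mean(**arm1)
```

Here the geometric means coincide while the arithmetic means differ by a factor exp(0.5 × (1.5 - 0.5)), so a test on μ alone would miss a real difference in mean costs.
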

18
Box-Cox transform: varying λ
19
The threshold-logit model

Used to model the probability of having costs
in excess of a given threshold, usually chosen as
the median q2 or the third quartile q3 of the
cost distribution

It does not require normality, and also works
for very skewed cost distributions.
Problems:
it does not give an estimate of the mean costs,
although it estimates the covariate effects on
costs
conclusions are sensitive to the threshold
chosen, which, in addition, is sample-based
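
The model equation is not in the transcript. A plausible reading, assuming an ordinary logistic regression on the indicator of exceeding the threshold q, is sketched below; the notation (c_i, x_i, β) is assumed.

```latex
\operatorname{logit}\,\Pr(c_i > q \mid x_i)
  = \log\frac{\Pr(c_i > q \mid x_i)}{1 - \Pr(c_i > q \mid x_i)}
  = x_i^{\top}\beta,
  \qquad q \in \{q_2, q_3\}
```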

20
GLM models

To avoid the bias that arises when transforming the
costs directly, since the expectation of the
transformed costs is not the transform of the
expected costs,
the idea is to model the transformation of the
expectation, g(E(c|x)) = xβ,

where the distribution for the response is
usually taken to be Gamma and the link function g is
for additive effects, the identity function
I()
for multiplicative models, the log(),
allowing in this case back-transformation
without retransformation bias
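
The reason for modelling the transformed expectation rather than the expectation of the transform can be seen with a few made-up cost values (hypothetical figures, not from either case study): exponentiating the mean of the logs does not recover the mean cost.

```python
import math

# Hypothetical skewed costs in euros (illustrative only)
costs = [100.0, 2000.0, 4000.0, 9000.0, 90000.0]

arithmetic_mean = sum(costs) / len(costs)

# Naive back-transformation of the mean of log-costs (this is the geometric mean)
mean_log = sum(math.log(c) for c in costs) / len(costs)
naive_back_transform = math.exp(mean_log)

# By Jensen's inequality the naive value underestimates the arithmetic mean
shortfall = arithmetic_mean - naive_back_transform
```

A GLM with log link models log E(c|x) directly, so exponentiating its linear predictor targets the arithmetic mean.
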

21
GLM and QL/GEE estimate

Use data to find distributional family and link
Family down-weights noisy high-mean cases
Link can handle linearity
Note the difference in roles from Box-Cox:
Box-Cox power addresses mostly symmetry in error.
GLM with power function addresses linearity of
response on scale to be chosen
GLM/GEE/GMM modeling approaches: estimating
equations

Given correct specification of E[y|x] = μ(xβ),
key issues relate to second-order or efficiency
effects. This requires consideration of the
structure of v(y|x)
22
Variance determination

Accommodates skewness-related issues via
variance weighting rather than transform/retransform
methods
Assumes Var[y|x] = α [E(y|x)]^λ
= α [exp(xβ)]^λ
For GLM, solutions are:
Adopt alternative "standard" parametric
distributional assumptions:
λ = 0 (e.g. Gaussian NLLS)
λ = 1 (e.g. Poisson)
λ = 2 (e.g. Gamma)
λ = 3 (e.g. Wald or inverse Gaussian)
Estimate λ via
linear regression of log((y - μ)²) on 1, log(μ)
(modified "Park test" by least squares)
gamma regression of (y - μ)² on 1, log(μ)
(modified "Park test" estimated by GLM)
nonlinear regression of (y - μ)² on αμ^λ
Given the choice of λ, one can form V(x) and conduct
(more efficient) second-round estimation and
inference
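
The least-squares version of the modified Park test reduces to the slope of a simple regression. The sketch below assumes fitted means are already available from a first-round fit; the function name `park_test_lambda` and the toy data are hypothetical.

```python
import math

def park_test_lambda(y, mu):
    """Least-squares slope of log((y - mu)^2) on log(mu), an estimate of the variance power."""
    u = [math.log(m) for m in mu]
    v = [math.log((yi - mi) ** 2) for yi, mi in zip(y, mu)]
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    sxy = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    return sxy / sxx

# Toy data built so that (y - mu)^2 = 0.25 * mu^2, i.e. variance proportional to mu^2
mu_hat = [1000.0, 2000.0, 4000.0, 8000.0]
y_obs = [1.5 * m for m in mu_hat]
lam = park_test_lambda(y_obs, mu_hat)
```

A slope close to 2 would point to a Gamma-type variance function, a slope close to 1 to a Poisson-type one.
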

23
Monte Carlo Simulation (Mannings, 2000)

Data Generation
Skewness in dependent measure
Log normal with variance 0.5, 1.0, 1.5, 2.0
Heavier tailed than normal on the log scale
Mixture of log normals
Heteroscedastic responses
Std. dev. proportional to x
Variance proportional to x
Alternative pdf shapes
monotonically declining or bell-shaped
Gamma with shapes 0.5, 1.0, 4.0
Estimators considered
Log-OLS with
homoscedastic retransformation
heteroscedastic retransformation
Generalized Linear Models (GLM), log link
Nonlinear Least Squares (NLS)
Poisson
Gamma

24
Effect of skewness on the raw scale
25
Effects of heavy tails on the log scale
26
Effects of shape for Gamma
27
Effect of heteroscedasticity on the log scale
28
Simulation summary

All estimators are consistent, except Log-OLS with
homoscedastic retransformation if the log-scale
error is actually heteroscedastic
GLM models suffer substantial precision losses in
the face of a heavy-tailed (log) error term. If
kurtosis > 3, substantial gains from least
squares or robust regression.
Substantial gains in precision from an estimator
that matches the data generating mechanism

29
The zero problem

Problems with standard model
OLS may predict negative values
Zero mass may respond differently to covariates
These problems may be bigger with a higher mass at
0
Alternative estimators
Ignore the problem
ln(c + k)
Tobit and Adjusted Tobit models (Heckman-type
model)
Two-part models

30
The log(c + k) solution

Solution: add a positive constant k to costs
Advantages
Easy
Log addresses skewness, constant deals with ln(0)
Disadvantages
Zero mass may respond differently to covariates
Many set k = 1 arbitrarily
Value of k matters, need grid search for optimum
Poorly behaved (Duan 1983)
Retransformation problem aggravated at low end
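
That the value of k matters can be illustrated with a naive back-transformation on made-up data containing zeros. The data and the function name are hypothetical, and the naive retransformation used here (exponentiate the mean log, subtract k) is a deliberate simplification.

```python
import math

def naive_mean_after_shift(costs, k):
    """Back-transform the mean of log(c + k) naively: exp(mean log) - k."""
    mean_log = sum(math.log(c + k) for c in costs) / len(costs)
    return math.exp(mean_log) - k

# Hypothetical costs with a mass at zero (illustrative only)
costs = [0.0, 0.0, 100.0, 1000.0, 10000.0]

# The same data give very different answers depending on the arbitrary constant
results = {k: naive_mean_after_shift(costs, k) for k in (1.0, 10.0, 100.0)}
```
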

31
Latent Variables

Sometimes binary dependent variable models are
motivated through a latent variable model
The idea is that there is an underlying variable
y*, that can be modeled as
y* = β0 + xβ + ε, but we only observe
y = 1 if y* > 0, and y = 0 if y* ≤ 0

32
The Tobit Model

Can also have latent variable models that don't
involve binary dependent variables
Say y* = xβ + u, u|x ~ Normal(0, σ²)
But we only observe y = max(0, y*)
The Tobit model uses MLE to estimate both β and σ
for this model
Important to realize that β estimates the effect
of x on y*, the latent variable, not y

33
Interpretation of the Tobit Model

Unless the latent variable y* is what's of
interest, can't just interpret the coefficient
E(y|x) = Φ(xβ/σ)xβ + σφ(xβ/σ), so
∂E(y|x)/∂xj = βj Φ(xβ/σ)
If normality or homoskedasticity fail to hold,
the Tobit model may be meaningless
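
The expected value and marginal effect under the Tobit model can be evaluated with the standard normal cdf and pdf. The sketch below is a direct transcription of the two expressions on this slide; the function names and the numbers used to exercise them are assumptions.

```python
import math

def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_mean(xb: float, sigma: float) -> float:
    """E(y|x) for y = max(0, y*), with y* normal with mean xb and s.d. sigma."""
    z = xb / sigma
    return norm_cdf(z) * xb + sigma * norm_pdf(z)

def tobit_marginal_effect(beta_j: float, xb: float, sigma: float) -> float:
    """Effect of x_j on E(y|x): the latent coefficient scaled by P(y* > 0)."""
    return beta_j * norm_cdf(xb / sigma)
```

Because the scaling factor lies between 0 and 1, the effect on observed costs is always smaller in absolute value than the latent coefficient.
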

34
Tobit fit to diabetes data
35
Tobit: some notes

Only works well if dependent variable is censored
Normal
Places many restrictions on parameters, error
term
Hypersensitive to minor departures from normality
(Almost) never recommended for health economics

36
Mixed models

From the basic rule of expectation, one
can partition E(c) = Pr(c > 0) × E(c | c > 0)

Thus, the expectation is split in two parts:
1. Pr(any use or expenditures)
Full sample
Use logit or probit regression
2. Level of use or expenditures
Conditional on c > 0 (subsample with c > 0)
Use appropriate continuous model
Estimates of mean costs are obtained using
Duan's (1983) smearing estimator (mean of the
exponentiated residuals)
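
A minimal sketch of how the two parts combine, assuming the second part is an OLS fit on log-costs so that the smearing factor applies. Function names and numbers are hypothetical.

```python
import math

def smearing_factor(log_residuals):
    """Duan's smearing factor: mean of the exponentiated log-scale residuals."""
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)

def two_part_mean(p_positive, log_prediction, smear):
    """Expected cost = Pr(c > 0) * exp(predicted log-cost) * smearing factor."""
    return p_positive * math.exp(log_prediction) * smear

# Hypothetical log-scale residuals from the second-part regression
residuals = [-0.5, 0.0, 0.5]
smear = smearing_factor(residuals)

# Hypothetical patient: 50% chance of any cost, predicted log-cost of log(1000)
expected_cost = two_part_mean(0.5, math.log(1000.0), smear)
```

The factor exceeds 1 whenever the residuals are not all zero, which is why skipping it biases mean costs downwards.
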

37
Diabetes: two-part model
38
Marginal effect in the two-part model
Continuous variable x
P(y > 0) = 0.54, E(Y | Y > 0) = 7509.82. For years of
diabetes, the logit-part estimate is 0.025 and the
OLS-part estimate is 49.83. The marginal effect is
208 per year of diabetes
39
Weighted-regression models

To adjust for censoring, the basic idea is to
weight the costs by the inverse of the
probability of still being under observation
(uncensored), mimicking the basic
Horvitz-Thompson estimator.
Thus, the Bang-Tsiatis (2000) basic estimator is
μ̂ = (1/n) Σi δi Mi / K(Ti)
where δi is the censoring indicator (1 if the cost
history of subject i is complete), Mi = M(Ti) is the
cumulative cost up to the observed time Ti, and K()
is the Kaplan-Meier estimate of the probability of
remaining uncensored
Bang-Tsiatis (2000) also proposed an improved version
accounting for the cost history lost due to
censoring, allowing the cost function M() and the
Kaplan-Meier estimate to be computed in each of the K
intervals, defined optimally according to Lin
(1993)
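
A small sketch of the simple weighted estimator, assuming the usual form in which each complete cost history is divided by the Kaplan-Meier probability of being uncensored just before the observed time. Function names, the tie convention (deaths precede censorings) and the toy data are assumptions.

```python
def censoring_km(times, delta):
    """Return K(t): Kaplan-Meier probability of being uncensored just before t."""
    cens_times = sorted({t for t, d in zip(times, delta) if d == 0})
    steps = []  # (censoring time, K just after that time)
    surv = 1.0
    for u in cens_times:
        # Subjects who die at u are assumed to leave the risk set before censoring at u
        at_risk = sum(1 for t, d in zip(times, delta) if t > u or (t == u and d == 0))
        n_cens = sum(1 for t, d in zip(times, delta) if t == u and d == 0)
        surv *= 1.0 - n_cens / at_risk
        steps.append((u, surv))

    def K(t):
        value = 1.0
        for u, s in steps:
            if u < t:
                value = s
        return value

    return K

def weighted_mean_cost(times, delta, costs):
    """Inverse-probability-weighted mean cost using complete cases only."""
    K = censoring_km(times, delta)
    n = len(times)
    return sum(d * m / K(t) for t, d, m in zip(times, delta, costs)) / n

# Toy data: the second subject is censored (delta = 0) at time 2
times = [1.0, 2.0, 3.0, 4.0]
delta = [1, 0, 1, 1]
costs = [100.0, 200.0, 300.0, 400.0]
estimate = weighted_mean_cost(times, delta, costs)
```

In the toy data the two complete cases observed after the censoring time are each weighted by 1/(2/3), so they stand in for the subject whose cost history was cut short.
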
40
Improving estimation (Jiang, 2004)

The bootstrap confidence interval had much better
coverage accuracy than the normal approximation
one when medical costs had a skewed distribution.
When there is light censoring on medical costs
(< 25%), the bootstrap confidence interval based on
the simple weighted estimator is preferred due to
its simplicity and good coverage accuracy.
For heavily censored cost data (censoring rate
> 30%) with larger sample sizes (n > 200), the
bootstrap confidence interval based on the
partitioned estimator has superior performance in
terms of both efficiency and coverage accuracy
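
For orientation, a generic percentile bootstrap for a mean cost is sketched below. It is applied to simulated, uncensored log-normal costs and to the plain sample mean; with censored data the statistic inside the loop would be replaced by a weighted or partitioned estimator. All names, the seed and the simulation settings are assumptions.

```python
import random

def bootstrap_ci_mean(costs, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `costs`."""
    rng = random.Random(seed)
    n = len(costs)
    # Resample with replacement and recompute the mean each time
    means = sorted(sum(rng.choices(costs, k=n)) / n for _ in range(n_boot))
    lo_idx = int(n_boot * (1 - level) / 2)
    hi_idx = int(n_boot * (1 + level) / 2) - 1
    return means[lo_idx], means[hi_idx]

# Simulated skewed costs (hypothetical, not from the case studies)
_rng = random.Random(1)
sim_costs = [_rng.lognormvariate(8.0, 1.0) for _ in range(200)]
ci_low, ci_high = bootstrap_ci_mean(sim_costs)
```

Unlike the normal approximation, the percentile interval is not forced to be symmetric around the estimate, which suits skewed cost data.
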

41
Censored estimation (diabetes cohort)
Method | Mean estimate | SE
Lin estimate (administrative censoring) | 5856 | 249
Cox estimate (death censoring at 4 years) | 33896 | 1249
No-censoring estimate | 4488.18 | 129.44
42
Survival models

The cost function is defined as

and the hazard of having an excess of costs is
modeled either avoiding (Cox's model) or not (Weibull
model) the full specification of the baseline hazard λ0.
To avoid the assumption of proportional accumulation
over time (Etzioni, 1999), an alternative model
can be the Aalen additive regression (Zigon, 2005),
where the hazard rate is a linear combination of
the variables x, and the coefficients α(c) are
functions estimated from the data
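
The model equations are not in the transcript. Assuming the usual survival formulations with cumulative cost c playing the role of time, the two hazard specifications contrasted here would read as follows; the notation is assumed.

```latex
\begin{align*}
  \text{Cox (proportional):}\quad
    & \lambda(c \mid x) = \lambda_0(c)\,\exp\!\left(x^{\top}\beta\right) \\
  \text{Aalen (additive):}\quad
    & \lambda(c \mid x) = \alpha_0(c) + \sum_{j=1}^{p} \alpha_j(c)\,x_j
\end{align*}
```

In the additive form each covariate effect α_j(c) may change along the cost axis, so no proportionality assumption is needed.
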
43
Survival approach: some notes

Coefficients are interpretable as the risk of
having costs greater than actual ones
If proportionality does not hold, then
Baseline cost-hazard with strata
Partition of the costs axis
Model non-proportionality by cost-dependent
covariates: β(c)X = βX(c)
Refer to other models (accelerated failure or
additive hazards)

44
Diabetes: full cohort
45
Issues and models in cost-analysis
X = satisfied, o = partially satisfied
46
Estimates on the Molinette Cohort

We compared the performance of the survival models
with two benchmarks widely (and often
inappropriately) used in the literature, OLS and
the threshold-logit model, using the non-zero costs
cohort

Both the normality (Shapiro-Wilk test, p < 0.0001)
and the proportional hazards (Grambsch-Therneau
test, p < 0.001) assumptions were rejected
47
Covariates effects
48
Estimates of the mean
49
Cost profiling
50
Effect of covariates (Aalen model) on ?(c)
51
One-year cost distribution
52
Cost distribution
53
Cost accumulation over time
54
Model coefficients
Significant coefficients in italic
55
Mean cost estimates
56
Patient profiling
57
Relative accuracy
Deviation (%) of the fitted model from the
observed data
58
Remarks - I

The first papers appeared in the late 1980s in the
medical literature, and a decade earlier in the
econometric literature
Censored-cost estimators appeared in Lin, 1997,
and research is still growing (Bang, 2002, 2003)
There is still high interest in the statistical aspects
of no-censoring fitting approaches (Basu, HE,
2004, Etzioni, HE, 2005)
Need for a comprehensive simulation study under
complex situations (censoring and non-proportional
accumulation in particular)

59
Remarks - II

Modeling costs is basically an exercise in
fitting adequacy
and
bias reduction
however, it also has a strong impact on
public health aspects, like economic planning and
resource allocation, based on optimal prediction
of future costs (patient profiling).
Nevertheless, caution is needed in choosing
the model and interpreting results, which can be
an artifact of a distorted representation of the
real cost process, a consequence of
inappropriate assumptions made on the data