Title: New interval estimating procedures for the disease transmission probability in multiple-vector transfer designs
1New interval estimating procedures for the
disease transmission probability in
multiple-vector transfer designs
- Joshua M. Tebbs and Christopher R. Bilder
- Department of Statistics
- Oklahoma State University
- tebbs_at_okstate.edu and chris_at_chrisbilder.com
2Introduction
- Plant disease is responsible for major losses in
agriculture throughout the world
- Diseases are often spread by insect vectors
(e.g., aphids, leafhoppers, planthoppers)
- Example: www.knowledgebank.irri.org/ricedoctor_mx/Fact_Sheets/Pests/Planthopper.htm
3Example
- Ornaghi et al. (1999) study the effects of the
Mal Rio Cuarto (MRC) virus and its spread by
the Delphacodes kuscheli planthopper
- The MRC virus is the most damaging maize virus in
Argentina
- It was desired to estimate p, the probability of
disease transmission for a single vector
- Vector transfers are often used by plant
pathologists wanting to estimate p
- In such experiments, insects are moved from an
infected source to the test plants
4Single-vector transfers
- The most straightforward way to estimate p is by
using a single-vector transfer
- Each test plant contains one vector, and test
plants must be individually caged
- Under the binomial model, the proportion of
infected test plants gives the maximum likelihood
estimate of p
- Disadvantages of a single-vector transfer:
- Requires a large amount of space (since insects
must be individually isolated)
- Is a costly design since one needs a large number
of test plants and individual cages
5Multiple-vector transfers
- A group of s > 1 insect vectors is allocated to
each test plant
- Even though test plants are occupied by multiple
insects, the goal is still to estimate p, the
probability of disease transmission for a single
vector
6Multiple-vector transfers
- Advantages of a multiple-vector versus
single-vector transfer:
- Potential savings in time, cost, and space
- Statistical properties of estimators are much
better (for a fixed number of test plants)
- A multiple-vector transfer is an application of
the group-testing experimental design
- Other applications of group testing:
- Infectious disease seroprevalence estimation in
human populations
- Disease transmission in animal studies
- Drug discovery applications
7Notation and assumptions
- Define:
- n = number of test plants
- s = number of insects per plant (group size)
- Y = 1: infected test plant (a plant for which at
least one vector, out of s, infects)
- Y = 0: uninfected test plant (a plant for which no
vectors, out of s, infect)
- Assumptions:
- Common group size s
- The statuses of individual vectors are iid
Bernoulli random variables with mean p
- The statuses of test plants are independent
- Test plants are not misclassified
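Under these assumptions a test plant is infected when at least one of its s independent vectors transmits, so P(Y = 1) = 1 - (1 - p)^s. The Python sketch below checks that identity by simulation. The function names and the values p = 0.02 and s = 7 are illustrative assumptions, not part of the talk.

```python
import random


def plant_infection_prob(p, s):
    """P(Y = 1): at least one of s iid Bernoulli(p) vectors infects the plant."""
    return 1.0 - (1.0 - p) ** s


def simulate_plant_infection_rate(p, s, n_plants, seed=0):
    """Monte Carlo estimate of P(Y = 1) from simulated multiple-vector transfers."""
    rng = random.Random(seed)
    infected = 0
    for _ in range(n_plants):
        # A plant is infected if any one of its s vectors transmits the disease.
        if any(rng.random() < p for _ in range(s)):
            infected += 1
    return infected / n_plants


if __name__ == "__main__":
    p, s = 0.02, 7  # illustrative values only
    print(plant_infection_prob(p, s))
    print(simulate_plant_infection_rate(p, s, n_plants=100_000))
```

With s = 1 the formula reduces to p, which is the single-vector case.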
8Maximum likelihood estimator for p
- Let $T = \sum Y$ denote the number of infected test
plants. Under our design assumptions, T has a
binomial distribution with parameters n and $\theta = 1 - (1 - p)^s$
- The maximum likelihood estimator of p is given by
$\hat{p} = 1 - (1 - \hat{\theta})^{1/s}$,
where $\hat{\theta} = T/n$ (the proportion of infected
test plants)
- Estimates of p are computed by only examining the
test plants (and not the individual vectors
themselves)
- The binomial model is only appropriate if test
plants do not differ materially in their
resistance to pathogen transmission
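As a small illustration, the estimator can be computed directly from the plant counts by inverting θ = 1 - (1 - p)^s at the observed proportion T/n. This is a minimal sketch; the function name is an assumption, and the counts in the usage lines are those of the Ornaghi et al. (1999) example in this talk.

```python
def mle_transmission_prob(t, n, s):
    """MLE of p from t infected plants out of n, with s vectors per plant."""
    if not 0 <= t <= n:
        raise ValueError("t must lie between 0 and n")
    theta_hat = t / n  # proportion of infected test plants
    # Invert theta = 1 - (1 - p)^s at theta_hat.
    return 1.0 - (1.0 - theta_hat) ** (1.0 / s)


if __name__ == "__main__":
    # n = 24 plants, s = 7 planthoppers per plant, t = 3 infected plants.
    print(mle_transmission_prob(t=3, n=24, s=7))
```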
9Properties of the MLE and the Wald CI
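- The statistic $\hat{p}$ has the following properties:
- Consistent as n gets large
- Approximately normally distributed; more
precisely, $\sqrt{n}\,(\hat{p} - p) \to N(0, \sigma^2(p))$ in distribution, where
$\sigma^2(p) = \frac{1 - (1 - p)^s}{s^2 (1 - p)^{s-2}}$
- A 100(1-α) percent Wald confidence interval is
given by $\hat{p} \pm z_{\alpha/2} \sqrt{\widehat{\mathrm{var}}(\hat{p})}$, where
$\widehat{\mathrm{var}}(\hat{p}) = \frac{1 - (1 - \hat{p})^s}{n s^2 (1 - \hat{p})^{s-2}}$
and $z_{\alpha/2}$ is the upper α/2 quantile of the standard normal distribution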
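A minimal Python sketch of a Wald interval for p follows. It assumes the delta-method variance estimate [1 - (1 - p̂)^s] / [n s^2 (1 - p̂)^(s-2)], which follows from T being binomial with success probability 1 - (1 - p)^s; the function name is illustrative, and the counts in the usage lines are those of the Ornaghi et al. (1999) example in this talk.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(t, n, s, alpha=0.05):
    """Wald interval for p; requires t < n so the variance estimate is finite."""
    if not 0 <= t < n:
        raise ValueError("need 0 <= t < n")
    theta_hat = t / n
    p_hat = 1.0 - (1.0 - theta_hat) ** (1.0 / s)
    # Delta-method variance estimate (assumed form, see lead-in).
    var_hat = (1.0 - (1.0 - p_hat) ** s) / (n * s**2 * (1.0 - p_hat) ** (s - 2))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sqrt(var_hat)
    # No truncation to [0, 1]: the raw Wald limits are returned.
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    print(wald_interval(t=3, n=24, s=7))
```

For these counts the raw lower limit falls slightly below zero, since the interval is symmetric about p̂ and p̂ is close to the boundary.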
10Variance stabilizing interval (VSI)
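- Goal: Find a function $g(\hat{p})$ whose variance is free of the
parameter p
- Solve the following differential equation:
$g'(p) = c_0 \, s (1 - p)^{(s-2)/2} / \sqrt{1 - (1 - p)^s}$
- With $c_0 = 1$, a solution is given by
$g(p) = 2 \arcsin \sqrt{1 - (1 - p)^s}$
- It follows that
$1 - \{1 - \sin^2[\arcsin(\sqrt{\hat{\theta}}) \mp \eta]\}^{1/s}$
is a 100(1-α) percent confidence interval for p.
Here, $\hat{\theta} = T/n$ and $\eta = z_{\alpha/2} / (2\sqrt{n})$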
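The variance-stabilizing idea can be sketched in Python as below. The sketch assumes the arcsine transformation g(p) = 2 arcsin sqrt(1 - (1 - p)^s), which is what the differential equation yields for a binomial T with success probability θ = 1 - (1 - p)^s; the interval is formed on the arcsin scale and mapped back to p. The function name is illustrative, and the counts in the usage lines are those of the Ornaghi et al. (1999) example in this talk.

```python
from math import asin, sin, sqrt
from statistics import NormalDist


def vsi_interval(t, n, s, alpha=0.05):
    """Variance-stabilizing (arcsine) interval for p, assuming 0 <= t <= n."""
    theta_hat = t / n
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    eta = z / (2.0 * sqrt(n))          # half-width on the arcsin-sqrt scale
    centre = asin(sqrt(theta_hat))

    def back_transform(angle):
        # Map an angle back to theta = sin^2(angle), then to the p scale.
        theta = sin(angle) ** 2
        return 1.0 - (1.0 - theta) ** (1.0 / s)

    # No clipping of the angles is applied; sin^2 keeps both limits in [0, 1].
    return back_transform(centre - eta), back_transform(centre + eta)


if __name__ == "__main__":
    print(vsi_interval(t=3, n=24, s=7))
```

Unlike the Wald limits, these limits are not symmetric about p̂, and the lower limit cannot be negative.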
11Modified Clopper-Pearson (CP) interval
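- The number of infected test plants, T, has a
binomial distribution with parameters n and
$\theta = 1 - (1 - p)^s$
- One can obtain an exact Clopper-Pearson interval
for θ and then transform back to the p scale
(Chiang and Reeves, 1962)
- Exact 100(1-α) percent confidence limits for p
are given by
$p_L = 1 - \left\{ \frac{(n-t+1) F_{1-\alpha/2,\,2(n-t+1),\,2t}}{t + (n-t+1) F_{1-\alpha/2,\,2(n-t+1),\,2t}} \right\}^{1/s}$
and
$p_U = 1 - \left\{ \frac{n-t}{n - t + (t+1) F_{1-\alpha/2,\,2(t+1),\,2(n-t)}} \right\}^{1/s}$
- where $F_{1-\alpha,a,b}$ denotes the 1-α quantile of the
central F distribution with a (numerator) and b
(denominator) degrees of freedom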
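The modified Clopper-Pearson limits can also be obtained without F quantiles. The sketch below swaps in an equivalent technique: it finds the Clopper-Pearson limits for θ by bisection on the binomial tail probabilities (P(T >= t) = α/2 for the lower limit, P(T <= t) = α/2 for the upper limit) and then maps them to the p scale with p = 1 - (1 - θ)^(1/s). Function names are illustrative assumptions.

```python
from math import comb


def binom_cdf(k, n, theta):
    """P(T <= k) for T ~ Binomial(n, theta)."""
    return sum(comb(n, j) * theta**j * (1.0 - theta) ** (n - j) for j in range(k + 1))


def _bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson_p(t, n, s, alpha=0.05):
    """Exact limits for theta, transformed back to the p scale."""
    if t == 0:
        theta_lo = 0.0
    else:
        # Lower limit solves P(T >= t | theta) = alpha / 2.
        theta_lo = _bisect(lambda th: 1.0 - binom_cdf(t - 1, n, th) - alpha / 2.0, 0.0, 1.0)
    if t == n:
        theta_hi = 1.0
    else:
        # Upper limit solves P(T <= t | theta) = alpha / 2.
        theta_hi = _bisect(lambda th: binom_cdf(t, n, th) - alpha / 2.0, 0.0, 1.0)

    def to_p(theta):
        return 1.0 - (1.0 - theta) ** (1.0 / s)

    return to_p(theta_lo), to_p(theta_hi)


if __name__ == "__main__":
    # Counts from the Ornaghi et al. (1999) example in this talk.
    print(clopper_pearson_p(t=3, n=24, s=7))
```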
12Comparing the Wald, VSI, and CP
- The Wald interval is simple and easy to compute.
However, it has three main drawbacks:
- Provides symmetric confidence intervals even
though the distribution of $\hat{p}$ may be very skewed
- Often produces negative lower limits when p is
small!
- The VSI handles each of these drawbacks:
- Not symmetric
- Always produces lower limits within the parameter
space (i.e., strictly larger than zero)
- The CP interval's main advantage is that its
coverage probability is always greater than or
equal to 1-α. However, such intervals can be
wastefully wide, especially if n is small.
13Bayesian estimation
- Prior distribution for p:
- One-parameter Beta distribution
- $f(p) = \beta (1 - p)^{\beta - 1}$, 0 < p < 1, for a known value of β
- Takes into account that p is small
- Example: when β = 52.4
14Bayesian estimation
- Prior distribution for p
- Why use a one-parameter instead of a two-parameter
Beta?
- Sensible model acknowledging that p is small
- Bayes and empirical Bayes estimators are simpler
- The resulting estimator using squared error loss with
a two-parameter Beta is a ratio of complicated
alternating sums
- See Chaubey and Li (Journal of Official
Statistics, 1995) for Bayes estimators
15Bayesian estimation
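- Posterior distribution, for 0 < p < 1:
$f(p \mid t) = \frac{s}{B(t+1,\, n-t+\beta/s)} \, [1 - (1 - p)^s]^t \, (1 - p)^{s(n-t)+\beta-1}$
- Note: $U = 1 - (1 - P)^s \sim \mathrm{beta}(t + 1,\, n - t + \beta/s)$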
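The Beta relationship makes the posterior mean of p (the Bayes estimator under squared error loss) available in closed form. With U ~ beta(a, b), a = t + 1 and b = n - t + β/s, one has E[(1 - U)^(1/s)] = B(a, b + 1/s) / B(a, b), so E[p | t] = 1 - B(a, b + 1/s) / B(a, b). The sketch below evaluates this with log-gamma functions; the formula is a derivation from the stated Beta relationship rather than one quoted from the talk, and the function name is an assumption.

```python
from math import exp, lgamma


def bayes_posterior_mean(t, n, s, beta):
    """Posterior mean of p under the one-parameter Beta prior with parameter beta."""
    a = t + 1.0
    b = n - t + beta / s
    c = 1.0 / s
    # log of B(a, b + c) / B(a, b) = Gamma(b+c) Gamma(a+b) / (Gamma(b) Gamma(a+b+c))
    log_ratio = lgamma(b + c) + lgamma(a + b) - lgamma(b) - lgamma(a + b + c)
    return 1.0 - exp(log_ratio)


if __name__ == "__main__":
    # Counts from the Ornaghi et al. (1999) example, with beta = 52.4 as in the prior example.
    print(bayes_posterior_mean(t=3, n=24, s=7, beta=52.4))
```

As a sanity check, s = 1 and β = 1 give a uniform prior, and the expression reduces to the familiar (t + 1) / (n + 2).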
16Empirical Bayesian estimation
- Use the marginal distribution for T to derive an
estimate for β
- Why?
- Avoid a possibly poor choice for β
- n is often small in multiple-vector transfer
experiments
- Posterior may be adversely affected by the prior
- Marginal distribution of T, for t = 0, 1, ..., n:
$f_T(t \mid \beta) = \binom{n}{t} \frac{\beta}{s} B(t + 1,\, n - t + \beta/s)$
- Maximize $f_T(t \mid \beta)$ as a function of β to obtain the
marginal maximum likelihood estimate, $\hat{\beta}$
- Iteratively solve for β in
$s/\beta + \psi(n - t + \beta/s) - \psi(n + \beta/s + 1) = 0$,
where ψ(·) is the digamma function
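The marginal maximum likelihood step can be sketched in Python without a digamma routine. Assuming the prior f(p) = β(1 - p)^(β-1), the marginal is proportional to β B(t + 1, n - t + β/s), and the recurrence ψ(x + 1) = ψ(x) + 1/x turns the score equation into the sum over k = 0, ..., t of β / (s(n - t + k) + β) = 1. The left side increases in β, so bisection applies. The function name is an assumption, and the sketch is restricted to 1 <= t <= n - 1 (for t = 0 the equation has no finite root).

```python
def marginal_mle_beta(t, n, s, tol=1e-10):
    """Marginal ML estimate of beta from t infected plants out of n (group size s)."""
    if not 1 <= t <= n - 1:
        raise ValueError("this sketch requires 1 <= t <= n - 1")

    def score(beta):
        # Increasing in beta; equals zero at the marginal MLE.
        return sum(beta / (s * (n - t + k) + beta) for k in range(t + 1)) - 1.0

    lo, hi = 0.0, 1.0
    while score(hi) < 0.0:      # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Counts from the Ornaghi et al. (1999) example in this talk.
    print(marginal_mle_beta(t=3, n=24, s=7))
```

For n = 24, t = 3, and s = 7 the root is close to 52.4, the β value used in the prior example earlier in the talk.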
17Credible intervals
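- (1 - α)100% equal-tail
- $p_L$, $p_U$ satisfy
$\int_0^{p_L} f(p \mid t)\, dp = \alpha/2$
and $\int_{p_U}^1 f(p \mid t)\, dp = \alpha/2$
- Use relationship with Beta distribution,
$U = 1 - (1 - p)^s \sim \mathrm{beta}(t + 1,\, n - t + \beta/s)$
- Interval:
$p_L = 1 - (1 - B_{\alpha/2,\, t+1,\, n-t+\beta/s})^{1/s}$,
$p_U = 1 - (1 - B_{1-\alpha/2,\, t+1,\, n-t+\beta/s})^{1/s}$
- where $B_{\alpha,a,b}$ is the α quantile of a Beta(a,b)
distribution
- Remember that $\theta = 1 - (1 - p)^s$ implies $p = 1 - (1 - \theta)^{1/s}$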
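An equal-tail interval needs the α/2 and 1 - α/2 quantiles of beta(t + 1, n - t + β/s), mapped back with p = 1 - (1 - u)^(1/s). The standard-library sketch below is one way to get them: because the first Beta parameter t + 1 is an integer, the Beta CDF has the finite-sum form 1 - (1 - x)^b Σ_{j=0..t} [Γ(b + j) / (j! Γ(b))] x^j, and the quantiles follow by bisection. Function names are assumptions, and β is treated as a known (or previously estimated) value.

```python
from math import exp, lgamma, log


def beta_cdf_int_a(x, t, b):
    """CDF at x of Beta(t + 1, b) for integer t >= 0 and real b > 0."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    upper_tail = 0.0
    for j in range(t + 1):
        log_term = (lgamma(b + j) - lgamma(b) - lgamma(j + 1)
                    + j * log(x) + b * log(1.0 - x))
        upper_tail += exp(log_term)
    return 1.0 - upper_tail


def beta_quantile_int_a(q, t, b, iters=200):
    """q-quantile of Beta(t + 1, b) by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if beta_cdf_int_a(mid, t, b) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equal_tail_interval(t, n, s, beta, alpha=0.05):
    """Equal-tail credible interval for p via the Beta relationship."""
    b = n - t + beta / s

    def to_p(u):
        return 1.0 - (1.0 - u) ** (1.0 / s)

    return (to_p(beta_quantile_int_a(alpha / 2.0, t, b)),
            to_p(beta_quantile_int_a(1.0 - alpha / 2.0, t, b)))


if __name__ == "__main__":
    print(equal_tail_interval(t=3, n=24, s=7, beta=52.4))
```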
18Credible intervals
- (1 - α)100% highest posterior density (HPD)
regions
- Posterior is unimodal and right skewed
- Find $p_L$, $p_U$ such that (1 - α)100% of the area of the
posterior density is included and $p_U - p_L$ is as
small as possible
- See Tanner (1996, p. 103-4)
- Key is to sample from the posterior distribution
- Use the $U = 1 - (1 - p)^s \sim \mathrm{beta}(t + 1,\, n - t + \beta/s)$
relationship
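A sampling-based approximation to the HPD interval can be sketched as follows: draw U from beta(t + 1, n - t + β/s), transform each draw to p = 1 - (1 - U)^(1/s), sort, and take the shortest window that contains a fraction 1 - α of the draws. This shortest-window rule is a common approximation for unimodal posteriors and is not necessarily the exact algorithm described by Tanner (1996); the function name, number of draws, and seed are assumptions.

```python
import random
from math import ceil


def hpd_interval_by_sampling(t, n, s, beta, alpha=0.05, draws=20000, seed=1):
    """Approximate HPD interval for p from posterior draws (shortest window)."""
    rng = random.Random(seed)
    a, b = t + 1.0, n - t + beta / s
    # Sample U from its Beta posterior, then map each draw to the p scale.
    sample = sorted(1.0 - (1.0 - rng.betavariate(a, b)) ** (1.0 / s)
                    for _ in range(draws))
    k = ceil((1.0 - alpha) * draws)   # number of draws each window must contain
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    best = min(range(draws - k + 1), key=lambda i: sample[i + k - 1] - sample[i])
    return sample[best], sample[best + k - 1]


if __name__ == "__main__":
    print(hpd_interval_by_sampling(t=3, n=24, s=7, beta=52.4))
```

Because the posterior is right skewed, the window found this way typically sits to the left of the equal-tail interval and is no wider than it.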
19Example - Ornaghi et al. (1999)
- Data
- s = 7 planthoppers per plant
- n = 24 plants
- t = 3 infected plants observed
- 95% interval estimates for p
20Interval comparisons
- Coverage:
$C(p) = \sum_{t} I(n,t,s) \binom{n}{t} \theta^t (1 - \theta)^{n-t}$,
where $\theta = 1 - (1 - p)^s$, I(n,t,s) = 1 if the interval
contains p, and I(n,t,s) = 0 otherwise
- Do not consider the t = 0 and t = n cases:
- Poor multiple-vector transfer experimental design
- See Swallow (1985, Phytopathology) for guidance
in choosing s
- Brown, Cai, and DasGupta (2001, Statistical
Science)
- Frequentist evaluation similar to how Carlin and
Louis (2000) approach evaluating confidence and
credible intervals
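The coverage calculation can be sketched in Python by summing binomial probabilities over the outcomes t whose interval contains p. The sketch makes two assumptions that the talk does not spell out: the t = 0 and t = n cases are simply left out of the sum (no renormalization), and the Wald interval uses the delta-method variance [1 - (1 - p̂)^s] / [n s^2 (1 - p̂)^(s-2)]. Any function with the same signature can be passed in place of wald_interval.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_interval(t, n, s, alpha=0.05):
    """Wald interval for p with a delta-method variance estimate (assumed form)."""
    p_hat = 1.0 - (1.0 - t / n) ** (1.0 / s)
    var_hat = (1.0 - (1.0 - p_hat) ** s) / (n * s**2 * (1.0 - p_hat) ** (s - 2))
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * sqrt(var_hat)
    return p_hat - half_width, p_hat + half_width


def coverage(p, n, s, interval, alpha=0.05):
    """Sum of P(T = t) over t = 1, ..., n - 1 for which the interval contains p."""
    theta = 1.0 - (1.0 - p) ** s
    total = 0.0
    for t in range(1, n):  # t = 0 and t = n are excluded, as in the talk
        lower, upper = interval(t, n, s, alpha)
        if lower <= p <= upper:
            total += comb(n, t) * theta**t * (1.0 - theta) ** (n - t)
    return total


if __name__ == "__main__":
    # Settings used for the comparison plots in this talk: alpha = 0.05, n = 40, s = 10.
    for p in (0.01, 0.02, 0.05):
        print(p, coverage(p, n=40, s=10, interval=wald_interval))
```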
21Interval comparisons
- α = 0.05, n = 40, and s = 10
- Black line denotes the Wald interval; bold line denotes
the interval named in the plot title
22Summary
- Best interval: VSI or modified Clopper-Pearson
- Credible intervals may be improved by taking into
account the variability of the β estimators
- Bootstrap intervals mentioned in the abstract: VSI
and Clopper-Pearson perform better
- Many other intervals could be investigated!
- Website:
- www.chrisbilder.com/bilder_tebbs
- Contains R programs for examining the interval
estimation properties
- Different values of p, n, and s can be used
- Also calculates empirical Bayes estimators
- Program for the Ornaghi et al. (1999) data example
23New interval estimating procedures for the
disease transmission probability in
multiple-vector transfer designs
- Joshua M. Tebbs and Christopher R. Bilder
- Department of Statistics
- Oklahoma State University
- tebbs_at_okstate.edu and chris_at_chrisbilder.com
- Contact addresses starting Fall 2003:
- Joshua M. Tebbs, Department of Statistics, Kansas State
University
- Christopher R. Bilder, Department of Statistics,
University of Nebraska-Lincoln, chris_at_chrisbilder.com