Detecting Multi-Item Associations and Temporal Trends Using the WebVDME/MGPS Application

Slides: 34

Provided by: richardfer

Learn more at: http://archive.dimacs.rutgers.edu

1
Detecting Multi-Item Associations and Temporal
Trends Using the WebVDME/MGPS Application

DIMACS Tutorial on Statistical and Other Analytic
Health Surveillance Methods
18 June 2003
Richard Ferris

2
Pharmaceutical post-marketing surveillance

Companies and regulatory agencies collect
databases of spontaneous adverse reaction reports
Relevant exposure data not readily available (the
denominator problem)
Can drug-event combinations of potential interest
be identified from internal evidence alone?
Approach
Use an internally defined denominator
Construct set of expected counts using a
stratified independence model

3
Computation of Expected Counts

The expected count for a given drug-event
combination is determined by the overall count
for the particular drug (across all events) and
the overall count of the particular event (across
all drugs)
For example, if 2% of all reports have PROZAC as
a drug, and 3% of all reports have RASH as an
event, then one would expect that 0.06%
(0.02 × 0.03 = 0.0006) of the reports will include this
combination (PROZAC in combination with RASH)
(MGPS carries out this computation separately for
each distinct stratum and sums the
strata-specific expected counts to obtain an
overall expected count)
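
The stratified computation described above can be sketched as follows. This is a minimal illustration, not the MGPS implementation: the function name `expected_count`, the tuple layout, and the stratum counts are all hypothetical, chosen so that the pooled totals match the PROZAC/RASH example (2% and 3% of 100,000 reports).

```python
def expected_count(strata):
    """Sum stratum-specific expected counts under independence.

    `strata` is an iterable of (n_reports, n_drug, n_event) tuples, one per
    stratum (hypothetical layout). Within a stratum the expected count is
    n_reports * (n_drug / n_reports) * (n_event / n_reports).
    """
    return sum(
        n_drug * n_event / n_reports
        for n_reports, n_drug, n_event in strata
        if n_reports > 0
    )


# One pooled stratum: 100,000 reports, 2% with the drug, 3% with the event.
pooled = expected_count([(100_000, 2_000, 3_000)])

# Same totals split into two hypothetical strata with different mixes.
stratified = expected_count([(60_000, 1_800, 1_200), (40_000, 200, 1_800)])

print(pooled, stratified)
```

With these made-up counts the pooled and stratified expected counts differ, which is the reason for computing within strata before summing.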

4
Comparing Observed and Expected Counts: Relative
Reporting Rate

Relative Reporting Rate (RR): $RR_{ij} = N_{ij} / E_{ij}$
Easy to interpret, easy to compute
Statistically unstable if N is small or E is very
small
The following all have RR = 100:
N = 1000, E = 10
N = 100, E = 1
N = 10, E = 0.1
N = 1, E = 0.01

5
Comparing Observed and Expected Counts:
Statistical Significance

What is the probability that $N_{ij}$ would be
observed by chance (sampling error) when the
expected value is $E_{ij}$? (p-value for testing a
null hypothesis)
Harder to interpret (not expressed in same units
as RR)
Results in computation of absurdly small
probabilities that have no meaning
N = 100, E = 1 produces $10^{-158}$!
A small RR can be very significant (small p-value)
when the sample size is very large
N = 2000, E = 1000, RR = 2 is more
significant than
N = 10, E = 0.1, RR = 100
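
The p-values quoted above can be checked with a short sketch. It assumes the p-value is the Poisson upper-tail probability P(N >= n) with mean E, which the slide does not state explicitly, and it works in log space because the values underflow ordinary floating point. The function name is hypothetical.

```python
import math


def log10_poisson_upper_tail(n, expected, tol=1e-17):
    """Return log10 of P(N >= n) for N ~ Poisson(expected), requiring n > expected."""
    if n <= expected:
        raise ValueError("this sketch only handles the upper tail (n > expected)")

    def log_pmf(k):
        return -expected + k * math.log(expected) - math.lgamma(k + 1)

    first = log_pmf(n)
    # Sum pmf(k) / pmf(n) for k = n, n+1, ...; terms shrink because expected / k < 1.
    total, k, ratio = 1.0, n, 1.0
    while ratio > tol:
        k += 1
        ratio = math.exp(log_pmf(k) - first)
        total += ratio
    return (first + math.log(total)) / math.log(10)


for n, e in [(100, 1), (2000, 1000), (10, 0.1)]:
    print(f"N={n}, E={e}, RR={n / e:g}: log10(p) = {log10_poisson_upper_tail(n, e):.1f}")
```

Under this assumption the RR = 2 case has a far smaller p-value than the RR = 100 case, which is the ordering problem the slide points out.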

6
Comparing Observed and Expected Counts: Empirical
Bayes Multi-Item Gamma Poisson Shrinker

Try for best of both previous approaches:
interpretability of relative rate
adjust properly for sampling variation
Focus on the distribution, across the set of
drug-event combinations, of the ratios $\lambda_{ij}$
Estimate $\lambda_{ij} = \mu_{ij} / E_{ij}$, where
$N_{ij} \sim \mathrm{Poisson}(\mu_{ij})$
Fit a parameterized prior distribution function
(mixture of two gamma distributions) to the empirical
distribution of the $\lambda$s
Find the posterior distribution of $\lambda$ after observing
$N = n$
Use this to obtain a posterior estimate of
$\lambda$ given the observed $N_{ij}$
This posterior estimate is what we call EBGM
(Empirical Bayes Geometric Mean), the exponential of the
posterior mean of $\log \lambda$; also get lower
and upper 95% confidence bounds (EB05, EB95).
EBGM is termed the shrinkage estimate for RR
7
Multi-Item Associations vs. Pairwise Associations

Consider the case of an item triplet, e.g. 2
drugs and an event
$RR_{ijk} = N_{ijk} / E_{ijk}$, where $E_{ijk}$ is based on the
independence model
$EBGM_{ijk}$ = shrinkage estimate of $RR_{ijk}$
Suppose a particular itemset (drug A, drug B,
event C = kidney failure) is unusually frequent
(EBGM for the triplet is >> 2)
Important to ask:
Is this merely the result of one or more of the
pairs (AB, AC, BC) being unusually frequent? OR
Is this a drug-drug interaction?
Compare the Empirical Bayes estimate of the frequency
count of the triplet to the prediction from the
all-2-factor log-linear model
$EXCESS2 = (EBGM \times E) - E_{All2F}$
E is the expected count from independence
Computation of $E_{All2F}$ uses shrinkage estimates of
pairwise counts
EXCESS2 is an estimate of how many extra cases
were observed over what was expected using the
all-2-factor model
Alternate approach: define $E_{ijk}$ from the predictions
of the all-2-factor model, in which case the resulting
EBGM directly measures divergence of the observed
count from the all-2-factor prediction
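
To make the all-2-factor comparison concrete, here is a rough sketch that fits the no-three-way-interaction log-linear model to a 2x2x2 table by iterative proportional fitting. It departs from the slide in one labeled way: it fits raw observed pairwise margins, whereas MGPS is described as using shrinkage estimates of the pairwise counts. The table values, function names, and the EBGM and E inputs to `excess2` are hypothetical.

```python
from itertools import product


def fit_all_two_factor(table, iterations=500):
    """Fit the all-2-factor model to a 2x2x2 table keyed by (a, b, c) in {0, 1}."""
    cells = list(product((0, 1), repeat=3))
    fitted = {cell: 1.0 for cell in cells}
    for _ in range(iterations):
        for i, j in ((0, 1), (0, 2), (1, 2)):  # match AB, AC, BC margins in turn
            for u, v in product((0, 1), repeat=2):
                group = [c for c in cells if c[i] == u and c[j] == v]
                observed = sum(table[c] for c in group)
                current = sum(fitted[c] for c in group)
                scale = observed / current if current > 0 else 0.0
                for c in group:
                    fitted[c] *= scale
    return fitted


def excess2(ebgm, e_independence, e_all2f):
    """EXCESS2 = (EBGM x E) - E_All2F, as written on the slide."""
    return ebgm * e_independence - e_all2f


# Hypothetical report counts: index 1 means drug A / drug B / event C is present.
counts = {
    (1, 1, 1): 30, (1, 1, 0): 70, (1, 0, 1): 20, (1, 0, 0): 380,
    (0, 1, 1): 15, (0, 1, 0): 485, (0, 0, 1): 100, (0, 0, 0): 8900,
}
e_all2f = fit_all_two_factor(counts)[(1, 1, 1)]
print("all-2-factor prediction for the triplet:", round(e_all2f, 2))
```

A positive EXCESS2 would indicate more triplet reports than the pairwise associations alone predict.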

8
Health Authority Adoption of Signal Detection
Technologies

FDA
CDER
Experimented in Office of Biostatistics with GPS
for several years
Validated GPS
Moving to production
Have published data mining results on internal
web for almost all products
CBER
initial GPS implementation (VAERS)
CRADA between Lincoln and FDA to further develop
methodology and tools
CDC
Collaborative GPS methodology development with
FDA
Includes simulation capability
WHO Uppsala Monitoring Centre
Production safety signal generation mechanism
using BCPNN

9
FDA/GPS Validation Activities

Positive controls
Examine data mining results for drug-event
combinations corresponding to known labeled
adverse reactions
Negative controls
Examine data mining results for several drugs
(with differing safety profiles) given for the
same indication
Roll back database in time to determine when
method would have provided first signal

10
Databases of Spontaneous AE Reports

FDA Spontaneous Report System (SRS)
Post-Marketing Surveillance of all Drugs since
1969
Dates from mid-60s thru 1997
1.5 Million Reports
Encoded in COSTART
FDA Adverse Event Reporting System (AERS)
US cases, serious unlabeled events from all
manufacturers.
All products sold in the US 5000 Rxs
Replaced SRS in 1997
Reactions coded as MedDRA PTs
Quarterly Updates, 4-6 month delay
Drugs are Verbatim
Includes initial and some follow-up reports
Includes Demographics, Reactions, Drugs,
Outcomes, etc.
FDA/CDC Vaccine Adverse Events (VAERS)
Stricter Laws for Vaccine Adverse Event Reporting

11
Signal Detection Demonstration Using VAERS Data
12
Significant EBGM and even extremely
conservative EB05 with small N
13
Simple Rankings by Signal Strength
14
Evolution of Signals Over Time
15
Multi-Symptom Syndromes (Higher Order
Associations)
16
The Serotonin Syndrome

Could MGPS be used to identify unknown syndromes?
Try mining the AERS data for significant event
triples using a known syndrome.
"The symptoms of the serotonin syndrome are
euphoria, drowsiness, sustained rapid eye
movement, overreaction of the reflexes, rapid
muscle contraction and relaxation in the ankle
causing abnormal movements of the foot,
clumsiness, restlessness, feeling drunk and
dizzy, muscle contraction and relaxation in the
jaw, sweating, intoxication, muscle twitching,
rigidity, high body temperature, mental status
changes were frequent (including confusion and
hypomania - a "happy drunk" state), shivering,
diarrhea, loss of consciousness and death." (The
Serotonin Syndrome, Am J Psychiatry, June 1991)

18
Using Simulation to Test the Signal Detection
Process
19
Interpreting Simulation Parameters

                  Outcome
                  Yes       No           Total
Exposure   Yes    R         P-R          P
           No     Q-R       1-P-Q+R      1-P
Total             Q         1-Q          1

As R → P and (Q-R) → (1-P) => No Signal
As R → P and (Q-R) << (1-P) => Strong Signal
When R << P and (Q-R) → (1-P) => No Signal
When R << P and (Q-R) << (1-P) => Rare event
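
The relationship between the three simulation parameters and the four cells of the exposure-by-outcome table can be sketched as below. The function names and cell labels are hypothetical; the relative reporting rate is taken as R / (P × Q), that is, the joint probability divided by its value under independence.

```python
def cell_probabilities(p, q, r):
    """Cell probabilities of the 2x2 table given P(exposure), P(outcome), P(both)."""
    cells = {
        ("exposed", "outcome"): r,
        ("exposed", "no outcome"): p - r,
        ("unexposed", "outcome"): q - r,
        ("unexposed", "no outcome"): 1 - p - q + r,
    }
    if min(cells.values()) < 0:
        raise ValueError("P, Q and R do not define a valid table")
    return cells


def implied_rr(p, q, r):
    """Joint probability relative to the independence value P x Q."""
    return r / (p * q)


cells = cell_probabilities(0.003, 0.001, 0.00003)
for key, value in cells.items():
    print(key, value)
print("implied RR:", implied_rr(0.003, 0.001, 0.00003))
```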

20
Using Simulation to Create a Receiver Operating
Characteristic (ROC) Curve for EBGM

An ROC curve displays the true-positive rate
(sensitivity) versus the false-positive rate
(1 − specificity) for a statistic
Ran a 20-iteration simulation using P = 0.003,
Q = 0.001 and R = 0.00003 (RR = 10) to check the
true-positive rate
Ran a 20-iteration simulation using P = 0.003,
Q = 0.001 and R = 0.000003 (RR = 1) to check the
false-positive rate
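
As a sketch of how such a curve can be assembled, the snippet below sweeps a threshold over two sets of scores: one from runs with an injected signal and one from runs without. The scores are made-up numbers for illustration only, not simulation results from the presentation, and the function names are hypothetical.

```python
def roc_points(signal_scores, null_scores):
    """Return (false-positive rate, true-positive rate) pairs, one per threshold."""
    thresholds = sorted(set(signal_scores) | set(null_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in signal_scores) / len(signal_scores)
        fpr = sum(s >= t for s in null_scores) / len(null_scores)
        points.append((fpr, tpr))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical EBGM values: injected-signal runs versus no-signal runs.
signal_scores = [9.5, 4.2, 3.1, 1.8]
null_scores = [2.0, 1.1, 0.9, 0.7]

points = roc_points(signal_scores, null_scores)
print(points)
print("AUC:", area_under_curve(points))
```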

21
ROC Curve Based on Simulated Injection of Signals
22
Simulating a Rare Event

Sample 100,000 records from VAERS data
Set P = 0.003, Q = 0.001, R = 0.00003
Iterate 20 Monte Carlo simulations
Expect (on average):
0.003 × 100,000 = 300 Rare Exposures
0.001 × 100,000 = 100 Rare Outcomes
0.00003 × 100,000 = 3 Rare Exposure / Rare
Outcome combinations
E = (300 × 100) / 100,000 = 0.3
RR = 3 / 0.3 = 10
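
The expected values above can be checked with a small Monte Carlo sketch. It only simulates the injected exposure and outcome flags for each record; it does not sample real VAERS records, and the function name, cell labels, and seed are assumptions made for illustration.

```python
import random
from collections import Counter


def simulate(n_records=100_000, p=0.003, q=0.001, r=0.00003, iterations=20, seed=0):
    """Return one (n_exposed, n_outcome, n_both, expected) tuple per iteration."""
    rng = random.Random(seed)
    cells = ["both", "exposure_only", "outcome_only", "neither"]
    weights = [r, p - r, q - r, 1 - p - q + r]
    results = []
    for _ in range(iterations):
        counts = Counter(rng.choices(cells, weights=weights, k=n_records))
        n_exposed = counts["both"] + counts["exposure_only"]
        n_outcome = counts["both"] + counts["outcome_only"]
        expected = n_exposed * n_outcome / n_records  # independence expectation
        results.append((n_exposed, n_outcome, counts["both"], expected))
    return results


runs = simulate()
mean = lambda values: sum(values) / len(values)
print("mean exposures:", mean([run[0] for run in runs]))
print("mean outcomes:", mean([run[1] for run in runs]))
print("mean joint count:", mean([run[2] for run in runs]))
print("mean expected count:", mean([run[3] for run in runs]))
```

Individual iterations vary considerably around a joint count of 3, which is why the averages are taken over all 20 runs.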

23
Base Simulation on VAERS Data
24
Sample Cases From VAERS
25
Sample 100,000 Cases
26
P = 0.003, Q = 0.001, R = 0.00003
27
20 Monte Carlo Iterations
28
RareExposure: Expected N = 300
29
RareOutcome: Expected N = 100
30
RareExposure + RareOutcome: Expected N = 3,
Expected RR = 10
31
Technical Details

William DuMouchel. Bayesian Data Mining in Large
Frequency Tables (with Discussion). The American
Statistician (1999) pp 177-190.
William DuMouchel and Daryl Pregibon. Empirical
Bayes Screening for Multi-Item Associations.
Proceedings of KDD 2001.

32
Methodology History and Key Contributors

Stephen Evans
MCA, UK
Proportional reporting ratio (PRR) with chi-squared
analyses
Simple, highly intuitive, can be calculated by
hand
Bate, Lindquist, Edwards et al.
WHO Uppsala Monitoring Centre
Bayesian neural network method for adverse drug
reaction signal generation
Ana Szarfman, FDA (CDER) and Bill DuMouchel (AT&T)
Empirical Bayes, more robust than PRR for small n
MGPS method statistical parameter is EBGM
William DuMouchel. Bayesian Data Mining in Large
Frequency Tables (with Discussion). The American
Statistician (1999) pp 177-190.
William DuMouchel and Daryl Pregibon. Empirical
Bayes Screening for Multi-Item Associations.
Proceedings of KDD 2001.
Multidimensional analyses possible
Interactions, gender and other demographic
associates, syndrome identification
Can directly compare EBGM values of different
drugs, as well as for a specific drug
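
For comparison with EBGM, the proportional reporting ratio listed above can be sketched from a 2x2 table of report counts. The cell naming (a, b, c, d) and the example counts are hypothetical, and the Yates continuity correction in the chi-squared statistic is one common choice assumed here rather than something the slide specifies.

```python
def prr_with_chi2(a, b, c, d):
    """PRR and Yates-corrected chi-squared for a 2x2 table of report counts.

    a: drug of interest with the event      b: drug of interest, other events
    c: all other drugs with the event       d: all other drugs, other events
    """
    prr = (a / (a + b)) / (c / (c + d))
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    chi2 = numerator / ((a + b) * (c + d) * (a + c) * (b + d))
    return prr, chi2


# Hypothetical counts: 30 of 100 reports for the drug mention the event,
# versus 100 of 9,900 reports for all other drugs.
prr, chi2 = prr_with_chi2(30, 70, 100, 9800)
print(f"PRR = {prr:.1f}, chi-squared = {chi2:.1f}")
```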

33
Key Contributors (continued)

WHO Collaborating Center for International Drug
Monitoring: M. Lindquist, M. Stahl, A. Bate, R.
Edwards, R.H. Meyboom.
Bayesian confidence propagation neural network
(BCPNN). Information Component (IC) statistic is
the measure of the strength of the drug-event (DE) relationship
Iterative approach
L. Gould. Comparison and refinement of Bayesian
approaches for evaluating spontaneous reports of
ADRs. DIA Annual Meeting, July 2001 (Denver)
EB vs BCPNN similar results
Thakrar, BT, Blesch, KS, Sacks, ST, Wilcock, K
(2001)
(ISPE, Pharmacoepid. Drug Safety 10),
PRR vs. EB similar sensitivity, EB better at
ranking events based on small N.