Safety Data Mining: Background and Current Issues - PowerPoint PPT Presentation




Safety Data Mining: Background and Current Issues Ramin Arani, PhD Safety Data Mining Global Biometric Science Bristol-Myers Squibb Company SAMSI: July, 2006 – PowerPoint PPT presentation

Transcript and Presenter's Notes

Title: Safety Data Mining: Background and Current Issues

Safety Data Mining Background and Current Issues
  • Ramin Arani, PhD
  • Safety Data Mining
  • Global Biometric Science
  • Bristol-Myers Squibb Company
  • SAMSI July, 2006

  • Rationale for Pharmacovigilance
  • AERS Data Base
  • Data base issues
  • Methodologies
  • BCPNN (WHO)
  • MGPS (FDA)
  • Summary
  • Challenges and Opportunities

Pharmacovigilance - Rationale
  • Information obtained prior to first marketing is
    inadequate to cover all aspects of drug safety:
  • tests in animals are insufficiently
    predictive of human safety,
  • in clinical trials patients are selected and
    limited in number,
  • conditions of use in trials differ from those
    in clinical practice,
  • duration of trials is limited,
  • information about rare but serious adverse
    reactions, chronic toxicity, use in special
    groups or drug interactions is often not
    available.

Pharmacovigilance - Rationale
Spontaneous AE Reports
  • Safety information from clinical trials is
    limited
  • Few patients -- rare events likely to be missed
  • Not necessarily real world
  • Need info from post-marketing surveillance &
    spontaneous reports
  • Pharmacovigilance carried out by regulatory
    agencies & manufacturers
  • Long history of research on issue
  • Finney (MIMed1974, SM1982), Royall (Bcs1971)
  • Inman (BMedBull1970), Napke (CanPhJ1970)

  • Incomplete reports of events, not necessarily
  • How to compute effect magnitude
  • Many events reported, many drugs reported
  • Bias & noise in system
  • Difficult to estimate incidence because no. of
    patients at risk & duration of exposure are
    seldom known
  • Appropriate use of computerized methods, e.g.,
    supplementing standard pharmacovigilance to
    identify possible signals sooner -- early warning

Pharmacovigilance - Definition
Safety Signal: Reported information on a
possible causal relationship between an adverse
event and a drug.
Pharmacovigilance: Set of methods that aim at
identifying and quantitatively assessing the risks
related to the use of drugs in the entire
population, or in specific population subgroups.
Adverse Drug Reaction: A response to a drug which
is harmful and unintended, and which occurs at
doses normally used.
AERS Database
  • Database Origin: 1969
  • SRS until 11/1/97; changed to AERS
  • 3.0 million reports in database
  • All SRS data migrated into AERS
  • Contains Drug and "Therapeutic" Biologic Reports
  • Exception: vaccines (VAERS)

Source of AERS Reports
  • Health Professionals, Consumers / Patients
  • Voluntary Direct to FDA and/or to
  • Manufacturers Regulations for Postmarketing

AERS Limitations
  • Different populations, Co-morbidities,
    Co-prescribing, Off-label use, Rare events
  • Report volume for a drug is affected by volume
    of use, publicity, type and severity of the
    event, and other factors; therefore the
    reporting rate is not a true measure of the rate
    or the risk
  • An observed event may be due to the indication
    for therapy rather than the therapy itself;
    therefore observed associations should be viewed
    as signals, and causal conclusions drawn with
    caution
  • Claritin and arrhythmias (channeling and need for
    detailed data not in data base): increased number
    of reports due to preexisting condition;
    selection of high risk patients for the drug
    deemed safest for them.
  • Prozac and suicide (confounding by indication):
    large increase in reports following publicity and
    stimulated reporting.

The Pharmacovigilance Process
[Flow diagram; recoverable box labels: Traditional Methods; Data Mining; Detect Signals; Generate Hypotheses; Insight from Outliers; Public Health Impact, Benefit/Risk; Type A (Mechanism-based); Estimate Incidence; Type B (Idiosyncratic); Restrict use / withdraw; Change Label]
Finding Interestingly Large Cell Counts in a
Massive Frequency Table
No. Reports   AE 1   ...   AE n   Total
Drug 1        N11    ...   N1n    N1.
...
Drug m        Nm1    ...   Nmn    Nm.
Total         N.1    ...   N.n    N
  • Rows and Columns May Have Thousands of Categories
  • Most Cells Are Empty, even though N Is very
    Large
  • Only 386K out of 1331K Cells Have Nij > 0
  • 174 Drug-Event Combinations Have Nij > 1000
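A minimal sketch of how such a table might be held in memory, assuming each report is a (drugs, events) pair. Because most cells are empty, only non-zero cells are stored. The function name `count_pairs`, the input format, and the toy reports are hypothetical illustrations, not taken from the presentation or from AERS.

```python
from collections import Counter
from itertools import product


def count_pairs(reports):
    """Count drug-event co-mentions, storing only non-empty cells.

    `reports` is an iterable of (drugs, events) pairs, each a collection
    of names mentioned on one report (hypothetical input format).
    """
    cells = Counter()         # Nij, keyed by (drug, event)
    drug_totals = Counter()   # reports mentioning drug i
    event_totals = Counter()  # reports mentioning event j
    n_reports = 0
    for drugs, events in reports:
        n_reports += 1
        drug_totals.update(set(drugs))
        event_totals.update(set(events))
        cells.update(product(set(drugs), set(events)))
    return cells, drug_totals, event_totals, n_reports


# Toy data, invented for illustration only.
reports = [
    (["drugA"], ["rash"]),
    (["drugA"], ["rash", "nausea"]),
    (["drugB"], ["nausea"]),
    (["drugC"], ["headache"]),
]
cells, drug_totals, event_totals, n_reports = count_pairs(reports)
possible = len(drug_totals) * len(event_totals)
print(f"{len(cells)} of {possible} cells are non-empty")
```

Keying a dictionary by (drug, event) keeps memory proportional to the number of non-empty cells rather than to rows times columns.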

Method - Basics
  • Endpoint: No. of AEs
  • Most use variations of 2-way table statistics

No. Reports    Target AE   Other AE   Total
Target Drug    a           b          a+b
Other Drug     c           d          c+d
Total          a+c         b+d        n
Basic idea: Flag when $R = a / E(a)$ is large
  • Some possibilities:
  • Reporting Ratio: $E(a) = (a+b) \times (a+c) / n$
  • Proportional Reporting Ratio: $E(a) = (a+b) \times
    c / (c+d)$
  • Odds Ratio: $E(a) = b \times c / d$
  • $OR > PRR > RR$ when $a > E(a)$
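The three ratios above can be sketched directly from the 2x2 cell counts. This is an illustrative sketch, assuming all of b, c, and d are non-zero; the function name and the example counts are hypothetical.

```python
def disproportionality(a, b, c, d):
    """Return (RR, PRR, OR) for a 2x2 table of report counts.

    a: target drug & target AE    b: target drug & other AE
    c: other drugs & target AE    d: other drugs & other AE
    Each measure is a / E(a) with a different expected count E(a).
    """
    n = a + b + c + d
    expected_rr = (a + b) * (a + c) / n   # reporting ratio
    expected_prr = (a + b) * c / (c + d)  # proportional reporting ratio
    expected_or = b * c / d               # odds ratio
    return a / expected_rr, a / expected_prr, a / expected_or


# Invented counts for illustration.
rr, prr, odds_ratio = disproportionality(a=20, b=80, c=100, d=9800)
print(f"RR = {rr:.2f}, PRR = {prr:.2f}, OR = {odds_ratio:.2f}")
```

With these counts a exceeds each expected count, so the ordering OR > PRR > RR holds.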

Bayesian Approaches
  • Two current approaches: DuMouchel & WHO
  • Both use ratio $n_{ij} / E_{ij}$ where
  • $n_{ij}$ = no. of reports mentioning both drug i &
    event j
  • $E_{ij}$ = expected no. of reports of drug i &
    event j
  • Both report features of posterior distn of
    information criterion
  • $IC_{ij} = \log_2 (n_{ij} / E_{ij}) = \log_2 PRR_{ij}$
  • $E_{ij}$ usually computed assuming drug i & event j
    are mentioned independently
  • Ratio > 1 (IC > 0) $\Rightarrow$ combination mentioned
    more often than expected if independent
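As a point-estimate illustration of the ratio and its log, the sketch below computes the expected count under independence and the resulting IC. It ignores the posterior distribution that both Bayesian approaches actually report, and it assumes the observed count is at least 1. The function name and counts are hypothetical.

```python
import math


def information_component(n_ij, n_i, n_j, n_total):
    """Plug-in IC = log2(n_ij / E_ij), with E_ij = n_i * n_j / n_total.

    E_ij is the count expected if drug i and event j were mentioned
    independently. Requires n_ij > 0.
    """
    expected = n_i * n_j / n_total
    return math.log2(n_ij / expected)


# Invented counts: 20 joint reports where 1.2 would be expected.
ic = information_component(n_ij=20, n_i=100, n_j=120, n_total=10000)
print(f"IC = {ic:.2f}")

# When observed equals expected, the ratio is 1 and IC is 0.
ic_null = information_component(n_ij=12, n_i=100, n_j=1200, n_total=10000)
```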

WHO (Bate et al, EurJClPhrm1998)
  • Bayesian Confidence Propagation Neural Network
    (BCPNN)
  • $n_{ij}$ = no. reports mentioning both drug i &
    event j
  • $n_i$ = no. reports mentioning drug i
  • $n_j$ = no. reports mentioning event j
  • Usual Bayesian inferential setup:
  • Binomial likelihoods for $n_{ij}$, $n_i$, $n_j$
  • Beta priors for the rate parameters ($r_{ij}$,
    $p_i$, $q_j$)

WHO, contd
  • Uses delta method to approximate variance of
    $Q_{ij} = \ln (r_{ij} / (p_i q_j)) = \ln 2 \times IC_{ij}$
  • However, can calculate exact mean and variance
    of $Q_{ij}$
  • WHO measure of importance = $E(IC_{ij}) - 2\,SD(IC_{ij})$
  • Test of signal detection predictive value by
    analysis of signals 1993-2000 (Drug Safety 2000)
  • 84% Negative Pred Val, 44% Positive Pred Val
  • Good filtering strategy for clinical assessment
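To make the "mean minus two standard deviations" measure concrete, here is a rough sketch that swaps in plain Monte Carlo sampling for the delta method or exact formulas mentioned above. It assumes uniform Beta(1, 1) priors and treats the three posteriors as independent; both are simplifications of this sketch and are not the WHO's actual prior choices. The function name and counts are hypothetical.

```python
import math
import random
import statistics


def ic_posterior_summary(n_ij, n_i, n_j, n_total, draws=20000, seed=0):
    """Monte Carlo mean, SD and (mean - 2 SD) of IC = log2(r / (p * q)).

    Assumes independent Beta(1 + successes, 1 + failures) posteriors,
    i.e. uniform Beta(1, 1) priors (an assumption of this sketch).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        r = rng.betavariate(1 + n_ij, 1 + n_total - n_ij)  # joint rate
        p = rng.betavariate(1 + n_i, 1 + n_total - n_i)    # drug rate
        q = rng.betavariate(1 + n_j, 1 + n_total - n_j)    # event rate
        samples.append(math.log2(r / (p * q)))
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)
    return mean, sd, mean - 2 * sd


# Invented counts for illustration.
mean_ic, sd_ic, lower = ic_posterior_summary(20, 100, 120, 10000)
print(f"E(IC) = {mean_ic:.2f}, SD(IC) = {sd_ic:.2f}, lower = {lower:.2f}")
```

A combination would be flagged when the lower value stays above 0, so sparse cells with wide posteriors are penalized.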

WHO, contd
  • WHO. (Orre et al 2000)

WHO, contd
Let A denote adverse events and D denote the drug.
Mutual information I(A,D) is a measure of
DuMouchel (AmStat1999)
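  • $E_{ij}$ known, computed using stratification of
    database --
  • $n_i^{(k)}$ = no. reports of drug i in stratum k
  • $n_j^{(k)}$ = no. reports of event j in stratum k
  • $N^{(k)}$ = total reports in stratum k
  • $E_{ij} = \sum_k n_i^{(k)} n_j^{(k)} / N^{(k)}$
    ($E(n_{ij})$ under independence)
  • $n_{ij} \sim \text{Poisson}(\mu_{ij})$ -- interested in
    $\lambda_{ij} = \mu_{ij} / E_{ij}$
  • Prior distn for $\lambda$ = mixture of gamma distns:
  • $f(\lambda; a_1, b_1, a_2, b_2, \pi) = \pi\, g(\lambda; a_1, b_1)
    + (1 - \pi)\, g(\lambda; a_2, b_2)$
  • where $g(\lambda; a, b) = b (b\lambda)^{a-1} e^{-b\lambda} / \Gamma(a)$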
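The stratified expected count is a sum of per-stratum independence expectations. A minimal sketch, assuming the per-stratum counts are already tabulated; the function name, the tuple layout, and the two strata are hypothetical.

```python
def stratified_expected(strata):
    """Sum n_i(k) * n_j(k) / N(k) over strata k.

    `strata` is an iterable of (n_i_k, n_j_k, n_total_k) tuples;
    empty strata are skipped.
    """
    return sum(
        n_i * n_j / n_total
        for n_i, n_j, n_total in strata
        if n_total > 0
    )


# Two invented strata (for example, two age groups).
strata = [(30, 40, 2000), (70, 80, 8000)]
e_stratified = stratified_expected(strata)

# Pooling the same counts without stratification gives a different value.
e_pooled = stratified_expected([(100, 120, 10000)])
print(f"stratified E = {e_stratified:.2f}, pooled E = {e_pooled:.2f}")
```

The gap between the two values shows how stratification adjusts the baseline when drug use and event reporting both vary across strata.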

DuMouchel, contd
  • Estimate $\pi$, $a_1$, $b_1$, $a_2$, $b_2$ using Empirical
    Bayes -- marginal distn of $n_{ij}$ is mixture of
    negative binomials
  • Posterior density of $\lambda_{ij}$ also is mixture of
    gammas
  • $\log_2 \lambda_{ij} = IC_{ij}$
  • Easy to get 5% lower bound (i.e. $E(IC_{ij}) - 2\,
    SD(IC_{ij})$)
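A sketch of the posterior update for one cell under the two-component gamma mixture prior, given fixed hyperparameters. In practice the hyperparameters would be estimated by maximizing the negative binomial mixture likelihood over all cells; the values used here are invented for illustration. The sketch reports the posterior mean of the ratio rather than a log-scale summary, and all function names are hypothetical.

```python
import math


def log_neg_binomial(n, expected, a, b):
    """Log marginal probability of n when lambda ~ Gamma(a, rate b)
    and n | lambda ~ Poisson(lambda * expected)."""
    return (
        math.lgamma(a + n) - math.lgamma(a) - math.lgamma(n + 1)
        + a * math.log(b / (b + expected))
        + n * math.log(expected / (b + expected))
    )


def posterior_summary(n, expected, a1, b1, a2, b2, pi):
    """Posterior weight of component 1 and posterior mean of lambda.

    Each component updates to Gamma(a + n, rate b + expected); the
    mixing weight updates in proportion to each marginal likelihood.
    """
    w1 = pi * math.exp(log_neg_binomial(n, expected, a1, b1))
    w2 = (1 - pi) * math.exp(log_neg_binomial(n, expected, a2, b2))
    q = w1 / (w1 + w2)
    mean = (q * (a1 + n) / (b1 + expected)
            + (1 - q) * (a2 + n) / (b2 + expected))
    return q, mean


# Invented hyperparameters and counts, for illustration only.
q, post_mean = posterior_summary(
    n=20, expected=1.2, a1=0.2, b1=0.1, a2=2.0, b2=4.0, pi=1 / 3
)
print(f"weight = {q:.3f}, posterior mean = {post_mean:.2f}, raw = {20 / 1.2:.2f}")
```

The posterior mean sits below the raw ratio n/E, which is the shrinkage that makes small-count cells less likely to be flagged.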

The control group and the issue of "compared to what?"
  • Signal strategies compare:
  • a drug with itself from prior time periods
  • with other drugs and events
  • with external data sources of relative drug usage
    and exposure
  • Total frequency count for a drug is used as a
    relative surrogate for an external denominator of
    exposure: easy to use, quick, and efficient
  • Analogy to case-control design where cases are
    specific AE term, controls are other terms, and
    outcomes are presence or absence of exposure to a
    specific drug.

Other useful metrics and methods
  • Chi-square statistics
  • P-value type metric -- overly influenced by sample
    size
  • Modeling association directly through a
    multivariate Poisson distribution
  • Incorporation of a prior distribution on some
    drugs and/or events for which previous
    information is available - e.g. Liver events or
    pre-market signals
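The sensitivity of a chi-square statistic to sample size can be seen with a small sketch. It uses the Pearson statistic for a 2x2 table without continuity correction, which is one common choice and an assumption here; the function name and counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table, no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Same proportions, ten times as many reports (invented counts).
chi_small = chi_square_2x2(20, 80, 100, 9800)
chi_large = chi_square_2x2(200, 800, 1000, 98000)
print(f"chi-square: {chi_small:.1f} vs {chi_large:.1f}")
```

Multiplying every cell by ten leaves the reporting ratios unchanged but multiplies the statistic by ten, so a p-value-type metric ranks large-volume drugs higher for the same strength of association.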

Interpreting the Signal Through the Role of
Visual Graphics
  • Four examples of spatial maps that reduce the
    scores to patterns and user friendly graphs and
    help to interpret many signals collectively

Example 1: A spatial map showing the signal
scores for the most frequently reported events
(rows) and drugs (columns) in the database by the
intensity of the empirical Bayes signal score
(blue color is a stronger signal than purple)

Example 2: Spatial map showing fingerprints of
signal scores allowing one to visually compare
the complexity of patterns for different drugs
and events and to identify positive or negative

Example 3: Cumulative scores and numbers of
reports according to the year when the signal was
first detected for selected drugs

Example 4: Differences in paired male-female
signal scores for a specific adverse event across
drugs with events reported (red means females
greater, green means males greater)
Summary
  • There is NO Gold Standard method for signal
    detection
  • The signals become more stable over time; however,
    there is a limited time window of opportunity for
    signal detection.
  • Use time-slice evolution of signal: fluctuation
    might reveal external risk factors, and robustness
    can be assessed.
  • Consider other endpoints such as time to onset,
    duration of event, etc.
  • For spontaneous case reports, the means to
    improve content is to standardize and improve
  • Data mining likely will generate many false
    positives and affirmations of what was previously
    known
  • Causality assessments should largely be reserved
    for refining important signals
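The time-slice idea can be sketched by recomputing a signal score on counts accumulated up to each year. This sketch uses the plug-in IC with an independence-based expected count as the score, which is a simplification; the function name, the tuple layout, and the yearly counts are hypothetical.

```python
import math


def cumulative_ic(yearly_counts):
    """Return [(year, IC)] using counts accumulated through each year.

    `yearly_counts` holds (year, n_ij, n_i, n_j, n_total) tuples with
    the counts received in that year alone. IC is None while the
    cumulative joint count is still zero.
    """
    n_ij = n_i = n_j = n_total = 0
    trend = []
    for year, d_ij, d_i, d_j, d_total in yearly_counts:
        n_ij += d_ij
        n_i += d_i
        n_j += d_j
        n_total += d_total
        if n_ij == 0:
            trend.append((year, None))
            continue
        expected = n_i * n_j / n_total
        trend.append((year, math.log2(n_ij / expected)))
    return trend


# Invented yearly counts for one drug-event pair.
yearly = [
    (2001, 2, 20, 30, 2000),
    (2002, 5, 40, 50, 3000),
    (2003, 13, 40, 40, 5000),
]
for year, ic in cumulative_ic(yearly):
    print(year, None if ic is None else round(ic, 2))
```

A jump between consecutive slices, as in the last year of this toy series, is the kind of fluctuation that might prompt a look for external factors such as publicity.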

Challenges in the future
  • More real time data analysis
  • More interactivity (visual data mining, e.g.
    ggobi)
  • Linkage with other data bases to control the bias
    inherent in data base
  • Quality control strategies (e.g. Identifying
  • Methods to reduce the false positive and