Some Ideas for Detecting Spurious Observations Based on Mixture Models

1
Some Ideas for Detecting Spurious Observations
Based on Mixture Models
  • Jim Lynch
  • NISS/SAMSI, University of South Carolina

2
Some Ideas for Detecting Spurious Observations
  • Work with Dave Dickey and Francisco Vera
  • Very Preliminary Ideas
  • Primarily Motivated by Dave's American Airlines
    Data and Proschan's (1963) paper on pooling to
    explain a decreasing failure rate and, to a
    lesser extent, M. J. Bayarri's talk on Multiple
    Testing

3
Outline
  • 1. Introduction
  • 2. Mixture Models
  • 3. Some Ideas
  • 4. Simulations
  • 5. The American Airlines Data

4
Introduction: Some Motivation, AA Data (Largest
Log Vol Removed)
  • Some Time Series Diagnostics Suggest That Log
    Volume Ratio is an MA(1)
  • Fit an MA(1) to the Log Vol Ratio of the AA Data
  • Look At The Residuals
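
The fit-then-inspect-residuals step can be sketched as follows. This is a minimal illustration only: the AA data are not in this transcript, so a synthetic MA(1) series stands in for the log volume ratio, and a simple conditional-sum-of-squares grid search is swapped in for whatever fitting routine was actually used. All function names are hypothetical.

```python
import random

def ma1_residuals(x, theta, mu):
    """Recover innovations recursively: e_t = x_t - mu - theta * e_{t-1}, with e_0 = 0."""
    resid, prev = [], 0.0
    for value in x:
        prev = value - mu - theta * prev
        resid.append(prev)
    return resid

def fit_ma1(x, grid_size=199):
    """Conditional-sum-of-squares fit of theta over an even grid inside (-1, 1)."""
    mu = sum(x) / len(x)
    best_theta, best_sse = 0.0, float("inf")
    for k in range(1, grid_size + 1):
        theta = -1.0 + 2.0 * k / (grid_size + 1)
        sse = sum(e * e for e in ma1_residuals(x, theta, mu))
        if sse < best_sse:
            best_theta, best_sse = theta, sse
    return mu, best_theta

# Synthetic stand-in for the log volume ratio (assumption: true theta = 0.6).
rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(501)]
series = [shocks[t] + 0.6 * shocks[t - 1] for t in range(1, 501)]

mu_hat, theta_hat = fit_ma1(series)
residuals = ma1_residuals(series, theta_hat, mu_hat)
print("theta_hat:", round(theta_hat, 2))
```

The `residuals` list is what the later mixture diagnostics would be applied to.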

5
Introduction
  • Detecting spurious observations is an important
    area of research and has implications for anomaly
    detection (AD).
  • The term spurious observation is used to
    distinguish it from an outlier, since outliers
    are usually extreme observations in the data
    while a spurious observation need not be.
  • E.g., one could imagine that sophisticated
    intruders into computer systems would make
    sporadic intrusions and try to mimic normal
    behavior as closely as possible.

6
Introduction
  • Goal
  • To develop approaches to detect very transient
    spurious events where the objectives are
  • To detect when there are spurious events present
    and, if possible,
  • To identify them

7
Introduction
  • The Basic Data Analytic Model
  • $X_1, \ldots, X_n$ iid $f_p = (1-p) f_0 + p f_1$
  • $f_0$ is the background model
  • $f_1$ models the spurious behavior
  • The likelihood is then
    $L(p) = \prod_{i=1}^{n} \left[ (1-p) f_0(X_i) + p f_1(X_i) \right]$
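
As a small illustration of the basic model, the log of the mixture likelihood can be evaluated directly. The sketch below assumes unit-variance normal components, N(0,1) for the background and N(1,1) for the spurious behavior, as in the simulation slides. The data values and function names are made up for the example.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def mixture_loglik(xs, p, f0, f1):
    """Log likelihood of the two-component mixture (1 - p) * f0 + p * f1."""
    return sum(math.log((1.0 - p) * f0(x) + p * f1(x)) for x in xs)

def f0(x):
    return normal_pdf(x, 0.0)   # background model (assumed N(0,1))

def f1(x):
    return normal_pdf(x, 1.0)   # spurious model (assumed N(1,1))

xs = [-0.3, 0.1, 1.4, 2.2, -1.0]   # hypothetical observations
for p in (0.0, 0.1, 0.5):
    print(p, mixture_loglik(xs, p, f0, f1))
```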

8
Introduction
  • A More Realistic Model
  • Generate a configuration C with probability p(C)
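  • Given $C$, for $i \in C$, the $X_i$ are iid $f_0$
    and, for $i \in C^c$, the $X_i$ are iid $f_1$
  • $C$ and $C^c$ model a spatial or temporal (e.g., a
    change-point) pattern
  • You are pooling observations based on the
    configuration $C$
  • The likelihood is then
    $L = \sum_{C} p(C) \prod_{i \in C} f_0(X_i) \prod_{i \in C^c} f_1(X_i)$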

9
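Introduction: Some Approaches for Analyzing the
More Realistic (MR) Model
  • Envision that the data are the effects of pooling
    observations from $f_0$ and $f_1$.
  • Treat the data as if they are from a mixture model
    and use the mixture model to determine the mle,
    $\hat{p}$, of the mixing proportion.
  • Use $\hat{p}$ to test $H_0: p = 0$ versus
    $H_1: p > 0$. (Under $H_0$ and the mixture model,
    $n^{1/2} \hat{p}$ converges in distribution to $X$,
    where $X = 0$ with probability .5 and $X$ follows
    $N(0, I_0^{-1})$ restricted to positive values
    with probability .5.)
  • If $H_0$ is rejected, see if the mixture model can
    give insights into the configuration $C_j$
  • E.g., do an empirical Bayes analysis with prior
    $p(C_j) = (1-\hat{p})^{j} \hat{p}^{\,n-j}$. Then the
    posterior probability of $C_j$ is proportional to
    $p(C_j) \prod_{i \in C_j} f_0(X_i) \prod_{i \in C_j^c} f_1(X_i)$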
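
A rough sketch of the mle-and-test step, assuming both components are known: $f_0$ = N(0,1) and $f_1$ = N(1,1). With known components the mle of the mixing proportion can be found by a simple EM iteration, and for this particular pair the null information works out to $I_0 = E_0[(f_1/f_0 - 1)^2] = e - 1$. The function names are hypothetical, the data are the n = 10 simulated values listed on the simulation slide, and the asymptotic p-value is only a crude guide for so small a sample.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def f0(x):
    return normal_pdf(x, 0.0)

def f1(x):
    return normal_pdf(x, 1.0)

def mle_mixing_proportion(xs, iters=2000):
    """EM for p with both components held fixed; p may converge to the boundary 0."""
    p = 0.5
    for _ in range(iters):
        # E-step: posterior probability that each observation came from f1.
        post = [p * f1(x) / ((1.0 - p) * f0(x) + p * f1(x)) for x in xs]
        # M-step: the new p is the average posterior probability.
        p = sum(post) / len(xs)
    return p

def boundary_p_value(p_hat, n, info0):
    """Asymptotic P(X >= observed) when X is 0 w.p. .5 and half-normal w.p. .5."""
    z = math.sqrt(n * info0) * p_hat
    if z <= 0.0:
        return 1.0
    return 0.5 * math.erfc(z / math.sqrt(2.0))   # equals 1 - Phi(z)

xs = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
      2.51937, 0.59549, 1.16238, 0.76632, 1.57752]
info0 = math.e - 1.0   # I_0 for N(0,1) versus N(1,1)
p_hat = mle_mixing_proportion(xs)
p_value = boundary_p_value(p_hat, len(xs), info0)
print("p_hat:", p_hat, "asymptotic p-value:", p_value)
```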

10
Introduction: Another Approach
  • Since $f_1$ models the spurious behavior, $p \approx 0$
  • $p \approx 0$ suggests using the locally most
    powerful (LMP) test statistic for testing
    $H_0: p = 0$ versus $H_1: p > 0$ as the basis of
    discovering if there are spurious observations
    present
  • The test statistic is essentially related to the
    gradient plot introduced by Lindsay (1983) to
    determine when a finite mixture mle is the global
    mixture mle in the mixed distribution model

11
Introduction: Another Approach
  • The basis of this approach
  • Use the gradient plot to determine if the one
    point mixture mle is the global mixture mle
  • When it isn't, this suggests that some spurious
    behavior is present
  • One can then use the components in the mle mixed
    distribution to calculate assignment
    probabilities for the data to indicate which
    observations might be considered spurious
  • The examples indicate that detecting the presence
    of spurious observations seems to be considerably
    simpler than identifying which ones they are

12
Introduction: Mining Data Graphs
  • Data (Maguire, Pearson and Wynn, 1952): Time
    Between Accidents with 10 or more fatalities
  • The slide's figures show the gradient plots for
    the 2- and 3-point mixture mles and the assignment
    function for the 3-point mle (mixing over
    exponentials)
  • The 2- and 3-point mixture mles:
  • m = (592.9, 166.2), p = (.175, .825)
  • m = (595.5, 171.6, 29.1), p = (.171, .806, .023)
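
To make the reported fits concrete, the sketch below computes assignment probabilities for a few inter-accident times under the 2-point fit. It assumes the m values are exponential means (not rates) and that the assignment function is the usual posterior component probability $p_j f_{m_j}(x) / \sum_k p_k f_{m_k}(x)$. The three example gap lengths are hypothetical.

```python
import math

def exponential_pdf(x, mean):
    """Density of an exponential distribution with the given mean."""
    return math.exp(-x / mean) / mean

def assignment_probabilities(x, means, weights):
    """Posterior probability of each mixture component given the observation x."""
    joint = [w * exponential_pdf(x, m) for m, w in zip(means, weights)]
    total = sum(joint)
    return [j / total for j in joint]

# 2-point mixture mle reported on the slide (m assumed to be means).
means_2pt = [592.9, 166.2]
weights_2pt = [0.175, 0.825]

for gap in (10.0, 500.0, 2000.0):   # hypothetical times between accidents
    probs = assignment_probabilities(gap, means_2pt, weights_2pt)
    print(gap, [round(q, 3) for q in probs])
```

Short gaps are attributed mostly to the component with the smaller mean and very long gaps to the component with the larger mean.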

13
Mixture Models
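  • $X_1, \ldots, X_n$ iid $f_p = (1-p) f_0 + p f_1$
  • $f_0$ is the background model
  • $f_1$ models the spurious behavior
  • Since the spurious observations are
    sporadic/transient, $p \approx 0$
  • Denote the log likelihood by
    $\phi(f(X_1), \ldots, f(X_n)) = \phi(f) = \log \prod_i f(X_i)$
  • Denote the gradient function of $\phi$ by
    $\Phi(f_1; f_0) = \sum_{i=1}^{n} \frac{f_1(X_i) - f_0(X_i)}{f_0(X_i)}$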
14
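Mixture Models: LMP
  • Lemma: The locally most powerful test for testing
    $H_0: p = 0$ versus $H_1: p > 0$ is based on
    $\Phi_0(f_1; f_0)$.
  • Proof: The LMP test for testing $H_0: p = p_0$
    versus $H_1: p > p_0$ is based on the statistic
    $\frac{\partial}{\partial p} \log L(p) \Big|_{p = p_0} = \sum_{i=1}^{n} \frac{f_1(X_i) - f_0(X_i)}{(1-p_0) f_0(X_i) + p_0 f_1(X_i)}$
  • For $p_0 = 0$ this reduces to
    $\sum_{i=1}^{n} \frac{f_1(X_i) - f_0(X_i)}{f_0(X_i)}$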
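
The lemma can be checked numerically: the LMP statistic is the derivative of the mixture log likelihood with respect to p at $p_0$. The sketch below assumes N(0,1) and N(1,1) components and uses hypothetical data. It compares the closed form with a central finite difference.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def f0(x):
    return normal_pdf(x, 0.0)

def f1(x):
    return normal_pdf(x, 1.0)

def mixture_loglik(xs, p):
    """Log likelihood of (1 - p) * f0 + p * f1."""
    return sum(math.log((1.0 - p) * f0(x) + p * f1(x)) for x in xs)

def lmp_statistic(xs, p0=0.0):
    """Score of the mixture log likelihood in p, evaluated at p0."""
    return sum((f1(x) - f0(x)) / ((1.0 - p0) * f0(x) + p0 * f1(x)) for x in xs)

xs = [-0.8, 0.2, 0.9, 1.7, -0.1, 2.4]   # hypothetical observations
h = 1e-5
numeric = (mixture_loglik(xs, 0.2 + h) - mixture_loglik(xs, 0.2 - h)) / (2.0 * h)
print("closed form:", lmp_statistic(xs, 0.2), "finite difference:", numeric)
print("statistic at p0 = 0:", lmp_statistic(xs))
```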

15
Mixture Model
  • The Function $\Phi(f_1; f_0)$
  • Plays a prominent role in the analysis of data
    from mixture models, where it is essentially the
    gradient function.
  • Introduced by Lindsay (1983a, 1983b, 1995) to
    determine when the mle for the mixing
    distribution with a finite number of points is
    the global mixture mle.

16
Mixture Model: Framework
  • Family of densities $\{f_\theta : \theta \in \Theta\}$.
  • $M$ is the set of probability measures on $\Theta$.
  • Denote the mixed distribution over the family with
    mixing distribution $Q$ by
    $f_Q(x) = \int f_\theta(x) \, dQ(\theta)$
  • For $X_1, \ldots, X_n$ iid from $f_Q$, the likelihood
    and log likelihood are given by
  • $L(Q) = \prod_i f_Q(X_i)$ and $\phi(f_Q) = \log \prod_i f_Q(X_i)$
  • $f_Q = (f_Q(X_1), \ldots, f_Q(X_n))$.

17
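Mixture Model: Framework
  • The Directional Derivative of the log likelihood
    at $f_Q$ in the direction of $f_\theta$:
    $D(\theta; Q) = \sum_{i=1}^{n} \frac{f_\theta(X_i) - f_Q(X_i)}{f_Q(X_i)}$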

18
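Mixture Model: A Diagnostic
  • Theorem 4.1 of Lindsay (1983a)
  • A. The following three conditions are equivalent:
  • $\hat{Q}$ maximizes $L(Q)$
  • $\hat{Q}$ minimizes $\sup_\theta D(\theta; Q)$
  • $\sup_\theta D(\theta; \hat{Q}) = 0$.
  • B. Let $\hat{f} = f_{\hat{Q}}$. The point
    $(\hat{f}, \hat{f})$ is a saddle point of $\Phi$, i.e.,
  • $\Phi(f_{Q'}; \hat{f}) \le 0 = \Phi(\hat{f}; \hat{f}) \le \Phi(\hat{f}; f_{Q''})$
    for $Q', Q'' \in M$.
  • C. The support of $\hat{Q}$ is contained in the set
    of $\theta$ for which $D(\theta; \hat{Q}) = 0$.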
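
A gradient plot amounts to evaluating $D(\theta; Q)$ over a grid of $\theta$ for a candidate mixing distribution. The sketch below does this for the n = 10 simulated data and the two-point fit reported on the simulation slides, assuming the components are N(m, 1) densities mixed over the mean. The grid range and function names are choices made for the example.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def gradient_function(theta, xs, support, weights):
    """D(theta; Q) = sum_i [f_theta(x_i) / f_Q(x_i) - 1] for a finite mixing distribution Q."""
    total = 0.0
    for x in xs:
        f_q = sum(w * normal_pdf(x, m) for m, w in zip(support, weights))
        total += normal_pdf(x, theta) / f_q - 1.0
    return total

xs = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
      2.51937, 0.59549, 1.16238, 0.76632, 1.57752]
support = [-0.487880, 0.929969]   # two-point fit from the simulation slide
weights = [0.388813, 0.611187]

grid = [-3.0 + 0.05 * k for k in range(141)]   # theta from -3 to 4
values = [gradient_function(t, xs, support, weights) for t in grid]
at_support = [gradient_function(m, xs, support, weights) for m in support]
print("max of D over the grid:", max(values))
print("D at the support points:", at_support)
```

If the candidate is the global mixture mle, part A of the theorem says the maximum over the grid should not exceed 0 (up to rounding), and part C says D should vanish at the support points.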

19
Mixture Model: The Assignment/Membership Function

20
Simulations: n = 10, 5 points N(0,1), 5 points
N(1,1)
  • Component label (0 or 1), observation
  • 0 -0.34964
  • 0 -1.77582
  • 0 -0.92900
  • 0 0.58061
  • 0 -0.36032
  • 1 2.51937
  • 1 0.59549
  • 1 1.16238
  • 1 0.76632
  • 1 1.57752

21
Simulations: n = 10, 5 points N(0,1), 5 points
N(1,1)
  • m p
  • -.487880 .388813
  • .929969 .611187
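
Given this fit, an assignment (membership) probability can be computed for each of the ten observations. The sketch assumes the assignment function is the posterior component probability $p_j f_{m_j}(x) / f_Q(x)$ with N(m, 1) components, which the transcript does not spell out. The 0.5 threshold for flagging is likewise only an illustrative choice.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def assignment(x, support, weights):
    """Posterior probability of each fitted component given the observation x."""
    joint = [w * normal_pdf(x, m) for m, w in zip(support, weights)]
    total = sum(joint)
    return [j / total for j in joint]

labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]   # true components from the data slide
xs = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
      2.51937, 0.59549, 1.16238, 0.76632, 1.57752]
support = [-0.487880, 0.929969]   # fitted m
weights = [0.388813, 0.611187]    # fitted p

for label, x in zip(labels, xs):
    upper = assignment(x, support, weights)[1]
    flag = "upper" if upper > 0.5 else "lower"
    print(label, x, round(upper, 3), flag)
```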

22
Simulations: The Assignment Function

23
Simulations: n = 30, 25 points N(0,1), 5 points
N(1,1)
  • m p
  • -0.05537 0.867670
  • 2.05801 0.132330

24
Simulations: n = 30, 25 points N(0,1), 5 points
N(1,1)

25
Simulations: Another n = 30, 25 points N(0,1), 5
points N(1,1)
  • m p
  • 0.78767 0.921009
  • 3.30559 0.078991

26
Simulations: Another n = 30, 25 points N(0,1), 5
points N(1,1)

27
AA Data
  • Francisco will discuss this and some other
    simulations in a moment.

28
Closing Comments
  • Is there an analogue (or alternative) of these
    ideas for the SCAN (or for the SCAN framework)?
  • As an alternative, view the problem as having
    several (two) mechanisms creating observations
  • background
  • infectious material is present.
  • Just consider that the data are a pooling from
    all these sites. See if the data are a
    2-component mixture. If they are, try to assign
    the sites to these components. (You might use a
    thresholding of the assignment function to do
    this, or $\hat{p}$ in the LMP test statistic.)
  • Instead of the assignment function, consider the
    following based on the LMP test statistic.
    Define $L_i = (f_1(X_i) - f_0(X_i))/f_0(X_i)$. Let
    $L_{(1)} \le L_{(2)} \le \cdots \le L_{(n)}$ and let
    $j(i)$ denote the inverse rank, i.e.,
    $L_{(i)} = L_{j(i)}$. For mixture or scanning
    purposes, consider the sets
    $C_i = \{j(n), \ldots, j(n-i+1)\} = \{k : L_{(n-i+1)} \le L_k\}$.
    For mixtures with mle $\hat{p}$, assign $C_i$ to
    $f_1$ and $C_i^c$ to $f_0$, where $n\hat{p} \approx i$.
    For scanning purposes, look through the
    increasing sequence of sets $C_i$ for a spatial
    pattern to emerge.
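
The ranking construction in the last bullet can be sketched in a few lines. The example reuses the n = 10 simulated data, takes $f_0$ = N(0,1) and $f_1$ = N(1,1) as known, and uses a hypothetical $\hat{p}$ = 0.5 to pick the set size. Indices are zero-based and the function names are made up for the illustration.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def lmp_scores(xs, f0, f1):
    """Per-observation terms L_i = (f1(x_i) - f0(x_i)) / f0(x_i)."""
    return [(f1(x) - f0(x)) / f0(x) for x in xs]

def nested_sets(scores):
    """C_1, C_2, ..., C_n: indices of the i largest scores, for i = 1..n."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return [sorted(order[:i]) for i in range(1, len(order) + 1)]

xs = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
      2.51937, 0.59549, 1.16238, 0.76632, 1.57752]
scores = lmp_scores(xs, lambda x: normal_pdf(x, 0.0), lambda x: normal_pdf(x, 1.0))
sets = nested_sets(scores)

p_hat = 0.5                        # hypothetical mixing proportion
i = round(len(xs) * p_hat)         # choose i so that n * p_hat is about i
assigned_to_f1 = sets[i - 1]
print(assigned_to_f1)              # → [5, 6, 7, 8, 9]
```

For scanning, one would step through `sets` in order and look for a spatial or temporal pattern in the indices as the sets grow.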

29
REFERENCES
  • Ferguson, T. S. (1967). Mathematical Statistics: A
    Decision Theoretic Approach. Academic Press,
    NY.
  • Grego, J., Hsi, Hsiu-Li, and Lynch, J. D. (1990).
    A strategy for analyzing mixed and pooled
    exponentials. Applied Stochastic Models and Data
    Analysis, 6, 59-70.
  • Lindsay, B. G. (1983a). The geometry of mixture
    likelihoods: a general theory. Ann. Statist.,
    11, 86-94.
  • Lindsay, B. G. (1983b). The geometry of mixture
    likelihoods, Part II: the exponential family.
    Ann. Statist., 11, 783-792.
  • Lindsay, B. G. (1995). Mixture Models: Theory,
    Geometry and Applications. NSF-CBMS lecture
    series, IMS/ASA.
  • Maguire, B. A., Pearson, E. S., and Wynn, A. H. A.
    (1952). The time interval between industrial
    accidents. Biometrika, 39, 168-180.
  • Proschan, F. (1963). Theoretical explanation of
    decreasing failure rate. Technometrics, 5,
    375-383.