Variance Estimation over Sliding Windows

1
Variance Estimation over Sliding Windows
Linfeng Zhang and Yong Guan
Department of Electrical and Computer Engineering
Information Assurance Center
Iowa State University, Ames, Iowa, USA
June 13, 2007, Beijing, China
2
Outline
  • Motivation
  • Problem Definition
  • Related Work
  • Our Algorithm
  • Contribution
  • Optimal in space requirement.
  • Optimal in worst case running time.
  • Summary & Future Work

3
Motivation
  • Advanced Attack Traceback Project
  • Goal
  • Trace origin of the attack through the Internet.
  • Capture the statistics of large network data.
  • Challenging Issues
  • Huge and continuous data vs. Limited Memory
  • Only one pass to process data
  • Monitor/detect anomalies in the network data.

4
Motivation (cont.)
  • Variance Estimation over Sliding Windows
  • Variance is often related to anomalies and status
    changes.
  • Our Approach achieves
  • Optimal space requirement
  • Optimal worst case running time
  • Applications of Our Approach
  • Network monitoring
  • Intrusion detection
  • Financial analysis
  • Weather forecast
  • Disaster forecast

5
Problem Definition
  • Sliding Window Model
  • First proposed by Datar, Gionis, Indyk and
    Motwani
  • Example: window size N = 8

Time:   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Stream: 0  1  0  0  1  1  1  0  1  1  0  0  0  1  0  1

6
Problem Definition (cont.)
  • Problem Definition
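  • Maintain an ε-approximate variance of an integer
    stream over sliding windows of size N in one
    pass.
  • Variance: $V = \sum_{i=1}^{N} (x_i - \mu)^2$
    ($x_1, \dots, x_N$: a series of N integers)
  • $\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$ (mean)
  • ε-Approximation: $|\hat{V} - V| \le \epsilon V$
    ($\hat{V}$: variance estimation)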
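
As a point of reference, here is a minimal Python sketch of the exact (non-streaming) baseline. It assumes that variance means the sum of squared deviations from the window mean and that an ε-approximation means a relative error of at most ε. The function names are illustrative and do not come from the slides.

```python
from collections import deque


def window_variance(window):
    """Sum of squared deviations from the mean (assumed definition)."""
    n = len(window)
    mean = sum(window) / n
    return sum((x - mean) ** 2 for x in window)


def is_eps_approximation(estimate, exact, eps):
    """True if the estimate is within relative error eps of the exact value."""
    return abs(estimate - exact) <= eps * exact


def exact_sliding_variance(stream, n):
    """Yield the exact variance of the last n elements after each arrival.

    This keeps all n elements in memory, which is precisely the cost
    that one-pass sliding-window algorithms try to avoid.
    """
    window = deque(maxlen=n)  # the oldest element is dropped automatically
    for x in stream:
        window.append(x)
        yield window_variance(window)
```

The baseline needs O(N) memory per window, so it is useful only as ground truth when checking an approximate estimator.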

7
EH Algorithm
  • Exponential Histograms (EH)
  • Datar, Gionis, Indyk and Motwani (SODA 2002)
  • Bit Counting
  • Space Requirement
  • How EH works
  • Example

[Figure: example of EH buckets being created and merged as bits arrive]
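
The bucket mechanics for bit counting can be sketched in Python as follows. This is a simplified illustration, not the exact procedure from the EH paper: the class name, the parameter `k` (maximum number of buckets kept per size) and the linear scan used to find buckets of equal size are all assumptions made for brevity. In the original scheme the per-size limit is tied to 1/ε.

```python
from collections import deque


class ExponentialHistogram:
    """Simplified EH that estimates the number of 1s among the last n bits."""

    def __init__(self, n, k):
        self.n = n              # window size
        self.k = k              # assumed knob: max buckets per size (k >= 2)
        self.t = 0              # current time step
        self.buckets = deque()  # newest first: [time of most recent 1, size]

    def add(self, bit):
        self.t += 1
        # Drop the oldest bucket once its most recent 1 has left the window.
        if self.buckets and self.buckets[-1][0] <= self.t - self.n:
            self.buckets.pop()
        if bit != 1:
            return
        self.buckets.appendleft([self.t, 1])
        # Cascade: whenever a size has more than k buckets, merge its two oldest.
        size = 1
        while True:
            idx = [i for i, b in enumerate(self.buckets) if b[1] == size]
            if len(idx) <= self.k:
                break
            older, newer = idx[-1], idx[-2]
            self.buckets[newer][1] = 2 * size  # merged bucket keeps the newer timestamp
            del self.buckets[older]
            size *= 2

    def estimate(self):
        """Total of all bucket sizes minus half of the oldest bucket."""
        if not self.buckets:
            return 0
        total = sum(size for _, size in self.buckets)
        return total - self.buckets[-1][1] // 2
```

Only the oldest bucket can straddle the window boundary, so it is the only source of error. A larger `k` keeps more small buckets ahead of it and therefore reduces the relative error, at the cost of more memory.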
8
EH Algorithm (cont.)
  • Can be applied to any function f satisfying the
    following properties:
  • 1. $f(X) \ge 0$.
  • 2. $f(X) \le \mathrm{poly}(|X|)$.
  • 3. $f(X \cup Y) \ge f(X) + f(Y)$.
  • 4. $f(X \cup Y) \le C\,(f(X) + f(Y))$, where the constant $C \ge 1$.
  • However, variance does not satisfy the last
    property.
  • Example

[Figure: two buckets X and Y with means µX, µY and variances VX, VY; their union X∪Y has mean µX∪Y and variance VX∪Y]
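
A small numeric illustration of why variance breaks the last property, written as a Python sketch. It assumes variance is the sum of squared deviations from the mean; the helper name and the sample values are made up for the example.

```python
def variance(xs):
    """Sum of squared deviations from the mean (assumed definition)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


# Two buckets that are each constant, so each has zero variance,
# but whose means are d apart.
for d in (10, 100, 1000):
    X = [0, 0]
    Y = [d, d]
    separate = variance(X) + variance(Y)  # always 0
    combined = variance(X + Y)            # grows like d squared
    print(d, separate, combined)
```

Because the sum of the separate variances stays at 0 while the variance of the union grows with the gap between the means, no constant C can bound one by the other.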
9
BDMO Algorithm
  • Babcock, Datar, Motwani and O'Callaghan (PODS
    2003)
  • Keep each new element in a single bucket.
  • Each bucket Bi maintains three variables (ni, µi,
    Vi)
  • ni: number of elements in the bucket
  • µi: mean of elements in the bucket
  • Vi: variance of elements in the bucket
  • Merge adjacent buckets if
  • Summary information of the combination of two
    buckets
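
The summary of two combined buckets can be computed from the two (n, µ, V) triples alone, using the standard pooled-variance identity. The Python sketch below assumes V is the sum of squared deviations from the bucket mean; the `Bucket` type and function names are illustrative and not taken from the slides.

```python
from typing import NamedTuple


class Bucket(NamedTuple):
    n: int       # number of elements
    mean: float  # mean of the elements
    var: float   # sum of squared deviations from the mean (assumed definition)


def from_element(x):
    """A new element starts as its own bucket with zero variance."""
    return Bucket(1, float(x), 0.0)


def merge(a, b):
    """Combine two bucket summaries without revisiting the raw elements."""
    n = a.n + b.n
    mean = (a.n * a.mean + b.n * b.mean) / n
    # The extra term accounts for the distance between the two bucket means.
    var = a.var + b.var + (a.n * b.n / n) * (a.mean - b.mean) ** 2
    return Bucket(n, mean, var)
```

For instance, merging the single-element buckets for 1, 2, 3 and 4 one after another gives n = 4, mean 2.5 and variance 5 (up to floating-point rounding), which matches a direct computation over the four values.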

10
BDMO Algorithm (cont.)
  • Space
  • However optimal bound is
  • Running Time
  • Amortized
  • Worst Case

Open Problem
11
Our Algorithm
  • Step 1: Insert New Element xt
  • Create a new bucket B1 for xt with (n1, µ1, V1) =
    (1, xt, 0)
  • Step 2: Delete Expired Bucket
  • Step 3: Merge Adjacent Buckets
  • Rule 1
  • Rule 2
  • Rule 3

[Figure: bucket list ordered from oldest to newest]
12
Correctness
  • Why can such a merging rule set bound the error to
    O(ε)?
  • Rule 1 guarantees
  • Case 1: µC is close to µB
  • Rules 1 and 2 guarantee
  • Case 2: µC is far away from µB
O(1)
13
Space Requirement
  • Invariant: Any adjacent bucket pair except B2,1
    within the right-half window W1 has either Property 1
    or Property 2.
  • Property 1
  • Variance doubles for every 5/ε bucket pairs.
  • Property 2
  • Size doubles for every 10/ε bucket pairs.
  • Space Requirement

Optimal!
14
Worst Case Running Time
  • Basic Mechanism
  • Scan all buckets each time a new element
    arrives.
  • Worst Case Running Time is
  • Advanced Mechanism
  • Each time, check only C (C > 1) pairs of adjacent
    buckets in the merging step.
  • Worst Case Time is
  • Only slightly increase required memory by a
    factor of .
  • Example: C = 2.

15
Summary & Future Plan
  • Summary
  • Algorithm for variance estimation over sliding
    windows.
  • Optimal Space Requirement
  • Optimal Worst Case Time
  • Future Work
  • How to estimate higher moments over sliding
    windows.
  • Acknowledgments
  • Supporters: NSF, DTO/ARDA and Carver Trust
    Foundation.

16
Thanks
  • Questions and Suggestions