Title: Monitoring and Early Warning for Internet Worms
1. Monitoring and Early Warning for Internet Worms
- Cliff C. Zou, Lixin Gao, Weibo Gong, Don Towsley
- Univ. Massachusetts, Amherst
2. How to detect an unknown worm at its early stage?
- Monitoring
  - Monitor worm scan traffic (non-legitimate traffic).
    - Connections to nonexistent IP addresses.
    - Connections to unused ports.
  - Observation data is very noisy.
    - Scans from old worms.
    - Port scans by hacking toolkits.
- Detecting
  - Anomaly detection for unknown worms.
  - Traditional anomaly detection: threshold-based.
    - Check for traffic bursts (short-term or long-term).
    - Difficulties: false alarms, threshold tuning.
3. Trend Detection: detect traffic trend, not burst
- Trend: a worm shows an exponential growth trend at the beginning.
- Detection: the estimated exponential rate should be a positive, constant value.
[Figure labels: monitored illegitimate traffic rate; on-line estimation of exponential rate $\alpha$; non-worm traffic burst]
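The trend-versus-burst idea can be illustrated with a minimal sketch. It is a simplified stand-in rather than the on-line estimator used in the slides: the rate is estimated from plain ratios of successive counts, and the decision rule, the function names `growth_rates` and `looks_like_worm`, and the parameters `window` and `tolerance` are all hypothetical choices made for illustration.

```python
def growth_rates(counts):
    """Per-interval growth rate estimates: counts[t] / counts[t-1] - 1."""
    return [b / a - 1.0 for a, b in zip(counts, counts[1:]) if a > 0]


def looks_like_worm(counts, window=5, tolerance=0.2):
    """Hypothetical rule: the last `window` rate estimates are all positive
    and each lies within `tolerance` (relative) of their mean."""
    rates = growth_rates(counts)[-window:]
    if len(rates) < window or min(rates) <= 0:
        return False
    mean = sum(rates) / len(rates)
    return all(abs(r - mean) <= tolerance * mean for r in rates)


# Steady exponential growth (about 10% per interval): a trend.
worm_like = [100 * 1.1 ** t for t in range(10)]
# A single short-lived spike on flat background traffic: a burst, not a trend.
burst = [100, 100, 100, 400, 100, 100, 100, 100, 100, 100]

print(looks_like_worm(worm_like), looks_like_worm(burst))
```

A fixed threshold on the raw counts would fire on the spike in `burst`; the rate-based rule ignores it because the estimated rate does not settle at a positive constant.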
4. Why exponential growth at the beginning?
- The law of natural growth: reproduction.
- Exponential growth is the fastest growth pattern when:
  - Interference is negligible (beginning phase).
  - All objects have similar reproductive capability.
  - The system is large-scale (law of large numbers).
- A fast worm has an exponential growth pattern.
  - Attacker's incentive: infect as many hosts as possible before counteractions.
  - If not, the worm does not reach its spreading speed limit.
- Slow-spreading worms can be detected in other ways.
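5. Worm modeling: simple epidemic model
- $S_t$: # of susceptible hosts
- $I_t$: # of infectious hosts
- $N$: total # of hosts ($N = S_t + I_t$)
- $\beta$: infectious ability
- # of contacts $\propto I_t \times S_t$
Simple epidemic model:
$$\frac{dI_t}{dt} = \beta I_t (N - I_t)$$
Discrete model:
$$I_t = (1 + \alpha) I_{t-1} - \beta I_{t-1}^2$$
with exponential rate $\alpha = \beta N$.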
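As a small illustration, the sketch below iterates a discrete-time simple epidemic model, assuming the form $I_t = (1 + \alpha) I_{t-1} - \beta I_{t-1}^2$ with $\alpha = \beta N$. The function name `simulate_epidemic` and the parameter values are illustrative only.

```python
def simulate_epidemic(n_hosts, alpha, i0, steps):
    """Iterate I_t = (1 + alpha) * I_{t-1} - beta * I_{t-1}**2, beta = alpha / N."""
    beta = alpha / n_hosts
    infected = [float(i0)]
    for _ in range(steps):
        prev = infected[-1]
        infected.append((1.0 + alpha) * prev - beta * prev * prev)
    return infected


# Illustrative values, not taken from the slides.
curve = simulate_epidemic(n_hosts=1000, alpha=0.5, i0=1, steps=40)
early_ratio = curve[1] / curve[0]   # close to 1 + alpha while I_t << N
print(early_ratio, curve[-1])
```

While $I_t \ll N$ the quadratic term is negligible and the curve grows by a factor of roughly $1 + \alpha$ per step; later it levels off at $N$.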
6. Why use simple epidemic model?
- It can model most scan-based worms.
- Other worm models (such as the exponential model) can be used as well with minor modifications.
Figures from D. Moore, V. Paxson, S. Savage, C. Shannon, S. Staniford, N. Weaver, "Inside the Slammer Worm," IEEE Security & Privacy, July 2003.
7. Kalman Filter Estimation
- Equivalent to a recursive least squares estimator.
  - Gives an estimate at each discrete time.
  - Robust to noise.
- System: discrete-time simple epidemic model.
- System state:
  - Worm infection rate $\alpha$ ($\alpha = \beta N$, the exponential growth rate at the beginning).
  - Epidemic parameter $\beta$ (worm infectious ability).
- Measurement from monitors:
  - $C_i$: cumulative # of observed infected hosts; $Z_i$: # of scans at time $i$.
8. Kalman Filter Estimation
- System and state definition (equations not recoverable from the extraction)
- Kalman filter for estimation of $X_t$
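The following is a minimal sketch of how a Kalman filter with a constant state can estimate the epidemic parameters recursively. It is not the slide's own formulation, which was lost in extraction. It assumes the discrete model $I_t = (1 + \alpha) I_{t-1} - \beta I_{t-1}^2$, a state $X = [1 + \alpha, -\beta \cdot \text{scale}]$, and a measurement row $H_t = [x_{t-1}, x_{t-1}^2]$ with $x = I / \text{scale}$. The function name `kalman_estimate` and the parameters `scale`, `p0`, and `r` are hypothetical.

```python
def kalman_estimate(observed, scale=1.0, p0=1e4, r=1.0):
    """Return a list of (alpha, beta) estimates, one per time step."""
    x = [0.0, 0.0]                       # state estimate [1 + alpha, -beta * scale]
    p = [[p0, 0.0], [0.0, p0]]           # estimate covariance
    history = []
    xs = [v / scale for v in observed]   # rescale to keep the regressors O(1)
    for prev, cur in zip(xs, xs[1:]):
        h = [prev, prev * prev]
        ph = [p[0][0] * h[0] + p[0][1] * h[1],
              p[1][0] * h[0] + p[1][1] * h[1]]        # P H^T
        s = h[0] * ph[0] + h[1] * ph[1] + r           # innovation variance
        k = [ph[0] / s, ph[1] / s]                    # Kalman gain
        innovation = cur - (h[0] * x[0] + h[1] * x[1])
        x = [x[0] + k[0] * innovation, x[1] + k[1] * innovation]
        # P <- P - K (P H^T)^T, valid because P is symmetric
        p = [[p[i][j] - k[i] * ph[j] for j in range(2)] for i in range(2)]
        history.append((x[0] - 1.0, -x[1] / scale))
    return history


# Noise-free synthetic data from the same discrete model (illustrative values).
N_TRUE, ALPHA_TRUE = 1000.0, 0.5
series = [1.0]
for _ in range(40):
    series.append((1 + ALPHA_TRUE) * series[-1]
                  - (ALPHA_TRUE / N_TRUE) * series[-1] ** 2)

alpha_hat, beta_hat = kalman_estimate(series, scale=1000.0)[-1]
print(alpha_hat, beta_hat, alpha_hat / beta_hat)
```

Because the state is modeled as constant, the recursion coincides with recursive least squares, and an estimate is available at every step, so its stabilization can be watched on-line.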
9. Code Red simulation experiments
- Population $N$ = 360,000; infection rate $\alpha$ = 1.8/hour
- Scan rate $\eta \sim \mathcal{N}(358/\text{min}, 100^2)$; initially infected $I_0$ = 10
- Monitored IP space $2^{20}$; monitoring interval $\Delta$ = 1 minute
- Consider background noise
Before 2% of the population is infected (223 min), the estimate has already stabilized, oscillating slightly around a positive constant value.
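As a rough consistency check on these numbers, the sketch below iterates a deterministic discrete epidemic model, $I_t = (1 + \alpha) I_{t-1} - \beta I_{t-1}^2$ with $\beta = \alpha / N$, using the Code Red values above and a one-minute step. It ignores scan-rate variance and background noise, so it is only an approximation of the simulation described on the slide. The function name `minutes_to_fraction` is hypothetical.

```python
def minutes_to_fraction(n_hosts, alpha_per_min, i0, fraction):
    """First minute t at which I_t >= fraction * N under the discrete model."""
    beta = alpha_per_min / n_hosts
    infected, t = float(i0), 0
    while infected < fraction * n_hosts:
        infected = (1.0 + alpha_per_min) * infected - beta * infected ** 2
        t += 1
    return t


# Slide values: N = 360,000, alpha = 1.8/hour = 0.03/minute, I_0 = 10.
t_two_percent = minutes_to_fraction(360_000, 1.8 / 60.0, 10, 0.02)
print(t_two_percent)
```

Under these assumptions the 2% level is reached within a few minutes of the 223 min mark quoted above.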
10. SQL Slammer simulation experiments
- Population $N$ = 100,000; monitored IP space $2^{20}$
- Scan rate $\eta \sim \mathcal{N}(4000/\text{sec}, 2000^2)$; initially infected $I_0$ = 10
- Monitoring interval $\Delta$ = 1 second
- Consider background noise
Before 1% of the population is infected (45 sec), the estimate has already stabilized, oscillating around a positive constant value.
11. Early detection of Blaster
- Blaster scans sequentially from a starting IP address:
  - 40%: from the local Class C address.
  - 60%: from a random IP address.
- It follows the simple epidemic model.
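A hedged sketch of the scanning behavior described above: pick a starting address (40% local Class C, 60% random), then scan sequentially from it. The function names are hypothetical, addresses are handled as 32-bit integers, and starting at the base of the local /24 is an assumption made for illustration, not a detail stated in the slides.

```python
import random


def blaster_start_address(local_ip, rng):
    """Pick a starting address: 40% local Class C network, 60% random.

    `local_ip` is the infected host's own address as a 32-bit integer.
    """
    if rng.random() < 0.4:
        return local_ip & 0xFFFFFF00      # base of the local /24 (assumed)
    return rng.randrange(2 ** 32)


def sequential_scan(start, count):
    """Yield `count` consecutive targets from `start`, wrapping at 2**32."""
    for offset in range(count):
        yield (start + offset) % 2 ** 32


rng = random.Random(0)
local = 0xC0A80137                        # 192.168.1.55, an arbitrary example
start = blaster_start_address(local, rng)
targets = list(sequential_scan(start, 5))
print([hex(t) for t in targets])
```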
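12. Bias correction for uniform-scan worms
- Each scan is a Bernoulli trial for a worm to hit the monitors (hitting prob. $p$).
- Bias correction, with $\eta$ the average scan rate (equation not recoverable from the extraction).
[Figure panels: monitoring $2^{14}$ IP space; monitoring $2^{17}$ IP space]
- Bias correction can provide an unbiased estimate of $I_t$.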
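One way to see why a correction is needed, sketched under stated assumptions rather than copied from the slide: if each infected host sends $\eta$ uniform scans per interval and each scan hits the monitors with probability $p$, a host not yet observed is first seen during an interval with probability $1 - (1 - p)^{\eta} \approx \eta p$. The expected number of newly observed hosts is then $(I_t - C_t)\,[1 - (1 - p)^{\eta}]$, where $C_t$ denotes the cumulative number of observed infected hosts, and this can be inverted for $I_t$. The function name `bias_corrected_infected` is hypothetical.

```python
def bias_corrected_infected(c_t, c_next, eta, p):
    """Estimate I_t from cumulative observed counts C_t and C_{t+1}.

    Assumes `eta` uniform scans per host per interval, each hitting the
    monitored space with probability `p`.
    """
    p_seen = 1.0 - (1.0 - p) ** eta       # chance an unseen host gets observed
    return c_t + (c_next - c_t) / p_seen


# Illustrative check using expected values: 10,000 infected hosts, 2,000
# already observed, eta = 358 scans per interval, monitors covering 2**20
# of the 2**32 IPv4 addresses.
eta, p = 358, 2 ** 20 / 2 ** 32
p_seen = 1.0 - (1.0 - p) ** eta
c_t = 2000.0
c_next = c_t + (10_000 - c_t) * p_seen    # expected newly observed hosts
estimate = bias_corrected_infected(c_t, c_next, eta, p)
print(estimate)
```

The raw cumulative count `c_next` stays far below the true number of infected hosts when the monitored space is small, which is the bias the correction removes.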
13. Prediction of vulnerable population size N
- Direct from Kalman filter: $N = \alpha / \beta$ (since $\alpha = \beta N$)
- Alternative method:
  - $\eta$: a worm sends out $\eta$ scans per $\Delta$ time (derived from egress scan monitor)
[Figure: Estimation of population $N$]
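Both routes to $N$ can be sketched in a few lines. The first uses $N = \alpha / \beta$, which follows from $\alpha = \beta N$. The second assumes uniform scanning of the full IPv4 space, so that a single scan reaches a given host with probability $1 / 2^{32}$ and $\beta = \eta / 2^{32}$; that relation is an assumption of this sketch, and the function names and the `address_space` parameter are hypothetical.

```python
def population_from_kalman(alpha, beta):
    """N = alpha / beta, from alpha = beta * N."""
    return alpha / beta


def population_from_scan_rate(alpha, eta, address_space=2 ** 32):
    """Assumes uniform scanning: beta = eta / address_space, so
    N = address_space * alpha / eta."""
    return address_space * alpha / eta


# Code Red values quoted in the slides: alpha = 1.8/hour, eta = 358 scans/min.
n_estimate = population_from_scan_rate(alpha=1.8 / 60.0, eta=358)
print(round(n_estimate))
```

With the Code Red parameters the second route gives a value close to the 360,000 hosts used in the simulation, which suggests the assumed relation is consistent with the slide's numbers.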
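14. Use exponential growth model
- At the early stage, $I_t \ll N$, so the quadratic term is negligible and $I_t \approx (1 + \alpha) I_{t-1}$ (exponential growth).
[Figure: Early stage of worm propagation]
- Model 2: Autoregressive (AR) model
- Model 3: Transformed linear model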
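To make the two exponential-model variants concrete, here is a minimal sketch using batch least squares in place of a Kalman filter. It assumes the AR form $y_t = (1 + \alpha) y_{t-1}$ and, for the transformed linear model, a log transform that makes the data linear in time, $\ln y_t = \ln y_0 + t \ln(1 + \alpha)$. The exact form used in the slides is not recoverable, so the second form in particular is an assumption, and the function names are hypothetical.

```python
import math
import random


def alpha_ar(series):
    """Least-squares fit of y_t = (1 + alpha) * y_{t-1} (AR form)."""
    num = sum(a * b for a, b in zip(series, series[1:]))
    den = sum(a * a for a in series[:-1])
    return num / den - 1.0


def alpha_log_linear(series):
    """Least-squares slope of ln y_t against t; slope = ln(1 + alpha)."""
    logs = [math.log(v) for v in series]
    n = len(logs)
    t_mean = (n - 1) / 2.0
    y_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(logs))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return math.exp(slope) - 1.0


rng = random.Random(1)
clean = [10.0 * 1.1 ** t for t in range(60)]                  # alpha = 0.1
noisy = [v * (1.0 + rng.uniform(-0.2, 0.2)) for v in clean]   # stays positive
estimates = (alpha_ar(noisy), alpha_log_linear(noisy))
print(estimates)
```

In the AR fit the largest, most recent observations dominate the sums, so their noise dominates the estimate; after the log transform every observation carries comparable weight.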
15. Comparison between three estimation models
[Figure panels: epidemic model; AR exponential model; transformed linear model]
- Observations:
  - The AR exponential model is smoother than the epidemic model.
  - The transformed linear model gives the best results.
  - A worm is detected when it has infected about 0.5% of the population.
16. Simple analysis of three estimation models
- Why is the AR exponential model smoother than the epidemic model?
  - Errors introduced from measurement data:
    - Epidemic model
    - AR exponential model
- Why is the transformed linear model better than the AR model?
  - AR exponential model
  - Transformed linear model
[Equations for this slide were not recoverable from the extraction]
17. Summary
- Trend detection: a non-threshold-based methodology
  - Principle: detect traffic trend, not burst.
  - Pros: robust to background noise, giving a low false alarm rate.
  - Cons: relies on the worm model and on the representation of measurement data.
    - Epidemic model, exponential model
    - Using a low-pass filter on noisy observation data
- For uniform-scan worms:
  - Bias correction
  - Forecasting $N$
[Equation residue: forecasting $N$ for IPv4 and for a routing worm. Legend: average scan rate $\eta$; infection rate $\alpha$; scanning IP space; cumulative # of observed infectious hosts; scan hitting prob. $p$]