Learning Rules for Anomaly Detection of Hostile Network Traffic

About This Presentation

Title:

Learning Rules for Anomaly Detection of Hostile Network Traffic

Description:

Anomaly Detection (eBayes, ADAM, SPADE) ... Anomaly score = tn/r summed over violated rules, t ... Anomalies are due to bugs and idiosyncrasies in hostile code ... – PowerPoint PPT presentation

Number of Views:108

Avg rating:3.0/5.0

Slides: 17

Provided by: mattma

Learn more at: https://cs.fit.edu

Category:

more less

Transcript and Presenter's Notes

Title: Learning Rules for Anomaly Detection of Hostile Network Traffic

1
Learning Rules for Anomaly Detection of Hostile
Network Traffic

Matthew V. Mahoney and Philip K. Chan
Florida Institute of Technology

2
Problem How to detect novel intrusions in
network traffic given only a model of normal
traffic

Normal web server request
GET /index.html HTTP/1.0
Code Red II worm
GET /default.ida?NNNNNNNNN

3
What has been done

Firewalls
Cant block attacks on open ports (web, mail,
DNS)
Signature Detection (SNORT, BRO)
Hand coded rules (search for default.ida?NNN)
Cant detect new attacks
Anomaly Detection (eBayes, ADAM, SPADE)
Learn rules from normal traffic for low-level
protocols (IP, TCP, ICMP)
But application protocols (HTTP, mail) are too
hard to model

4
Learning Rules for Anomaly Detection (LERAD)

Associative mining (APRIORI, etc.) learns rules
with high support and confidence for one value
LERAD learns rules with high support (n) and a
small set of allowed values (r)
Any value seen at least once in training is
allowed

If port 80 and word1 GET then word3 ?
HTTP/1.0, HTTP/1.1 (r 2)
5
LERAD Steps

Generate candidate rules
Remove redundant rules
Remove poorly trained rules
LERAD is fast because steps 1-2 can be done on a
small random sample (100 tuples)

6
Step 1. Generate Candidate RulesSuggested by
matching attribute values
Sample Port Word1 Word2 Word3
S1 80 GET /index.html HTTP/1.0
S2 80 GET /banner.gif HTTP/1.0
S3 25 HELO pascal MAIL

S1 and S2 suggest
port 80
if port 80 then word1 GET
if word3 HTTP/1.0 and word1 GET then port
80
S2 and S3 suggest no rules

7
Step 2. Remove Redundant RulesFavor rules with
higher score n/r
Sample Port Word1 Word2 Word3
S1 80 GET /index.html HTTP/1.0
S2 80 GET /banner.gif HTTP/1.0
S3 25 HELO pascal MAIL

Rule 1 if port 80 then word1 GET (n/r
2/1)
Rule 2 if word2 /index.html then word1
GET (n/r 1/1)
Rule 2 has lower score and covers no new values,
so it is redundant

8
Step 3. Remove Poorly Trained RulesRules with
violations in a validation set will probably
generate false alarms
r (number of allowed values)
Incompletely trained rule (removed)
Fully trained rule (kept)
Train Validate Test
9
Attribute Sets

Inbound client packets (PKT)
IP packet cut into 24 16-bit fields

Inbound client TCP streams
Date, time
Source, destination IP addresses and ports
Length, duration
TCP flags
First 8 application words

Anomaly score tn/r summed over violated rules,
t time since previous violation
10
Experimental Evaluation

1999 DARPA/Lincoln Laboratory Intrusion Detection
Evaluation (IDEVAL)
Train on week 3 (no attacks)
Test on inside sniffer weeks 4-5 (148 simulated
probes, DOS, and R2L attacks)
Top participants in 1999 detected 40-55 of
attacks at 10 false alarms per day
2002 university departmental server traffic
(UNIV)
623 hours over 10 weeks
Train and test on adjacent weeks (some unlabeled
attacks in training data)
6 known real attacks (some multiple instances)

11
Experimental ResultsPercent of attacks detected
at 10 false alarms per day
12
UNIV Detection/False Alarm TradeoffPercent of
attacks detected at 0 to 40 false alarms per day
13
Run Time Performance(750 MHz PC Windows Me)

Preprocess 9 GB IDEVAL traffic 7 min.
Train test lt 2 min. (all systems)

14
Anomalies are due to bugs and idiosyncrasies in
hostile codeNo obvious way to distinguish from
benign events
UNIV attack How detected
Inside port scan HEAD / HTTP\1.0 (backslash)
Code Red II worm TCP segmentation after GET
Nimda worm host www
Scalper worm host unknown
Proxy scan host www.yahoo.com
DNS version probe (not detected)
15
Contributions

LERAD differs from association mining in that the
goal is to find rules for anomaly detection a
small set of allowed values
LERAD is fast because rules are generated from a
small sample
Testing is fast (50-75 rules)
LERAD improves intrusion detection
Models application protocols
Detects more attacks

16
Thank you

Write a Comment

User Comments (0)