A Framework for Constructing Features and Models for Intrusion Detection Systems. Authors: Wenke Lee & Salvatore J. Stolfo. Published in ACM Transactions on Information ...
1
A Framework for Constructing Features and Models
for Intrusion Detection Systems
Authors: Wenke Lee, Salvatore J. Stolfo

Published in ACM Transactions on Information and
System Security, Volume 3, Number 4, 2000.
Presented by Suchandra Goswami

2
Contributions of this work

- Development of automated intrusion detection systems (IDSs) rather than pure knowledge encoding and engineering approaches
- Provides a novel framework called MADAM ID (Mining Audit Data for Automated Models for Intrusion Detection)
- First work applying data mining and machine learning algorithms to IDSs

3
Introduction

- Intrusion detection (ID) is the art of detecting inappropriate, incorrect, or anomalous activity.
- ID systems that operate on a host to detect malicious activity on that host are called host-based ID systems.
- ID systems that operate on network data flows are called network-based ID systems.
- Statistical anomaly detection and pattern matching are the techniques most commonly used for ID.

Two main ID techniques: misuse detection and anomaly detection
- Misuse detection: use patterns of well-known attacks or weak spots of the system to match and identify known intrusions.
  E.g., a signature rule for a password-guessing attack would be "there are more than 4 failed login attempts within 2 minutes".
- Anomaly detection: flag observed activities that deviate significantly from the established normal usage profiles.
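
A minimal sketch of the misuse-detection signature above (more than 4 failed logins within 2 minutes), written as a sliding-window check. The function names and the use of plain second-valued timestamps are assumptions for illustration, not part of the presented system.

```python
from collections import deque


def make_guess_detector(max_failures=4, window_seconds=120):
    """Return a callback that flags a password-guessing attack.

    The callback takes the timestamp (in seconds) of one failed login and
    returns True once more than `max_failures` failures fall inside the
    sliding window of `window_seconds`.
    """
    recent = deque()  # timestamps of recent failed logins, oldest first

    def on_failed_login(ts):
        recent.append(ts)
        # Drop failures that are older than the window.
        while recent and ts - recent[0] > window_seconds:
            recent.popleft()
        return len(recent) > max_failures

    return on_failed_login


# Five failures in 40 seconds: the fifth one trips the rule.
detector = make_guess_detector()
alerts = [detector(t) for t in (0, 10, 20, 30, 40)]

# Five failures spread over several minutes never trip it.
slow_detector = make_guess_detector()
slow_alerts = [slow_detector(t) for t in (0, 100, 200, 300, 400)]
```

A rule like this only catches the attack it was written for, which is the limitation that motivates learning rules from audit data instead.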

- This research takes a data-centric point of view: ID is treated as a data analysis process.
- Data mining programs are used to compute models that accurately capture actual behavior (patterns).
- This eliminates the need to manually analyze and encode intrusion patterns.
- The approach was validated with large amounts of audit data.

6
Some data mining techniques and their application
to IDSs

- Classification maps data items into one of several predefined categories. Algorithms generally output classifiers such as decision trees or rules.
  E.g., gather sufficient normal and abnormal audit data, then apply a classification algorithm to learn a classifier that can label unseen audit data as normal or abnormal.
- Link analysis determines relations/correlations between fields in the database records. E.g., emacs may be highly associated with C files.
- Sequence analysis models sequential patterns. Algorithms discover which time-based sequences of audit events frequently occur together.
- Frequent event patterns provide guidelines for incorporating temporal statistical measures into intrusion detection models.
  E.g., patterns from audit data containing network-based DoS attacks suggest that several per-host and per-service measures should be included.

8
Data mining process of building ID models

[Figure: pipeline from raw audit data to packets/events (ASCII), connection/session records, patterns, features, and models, with evaluation feedback]

9
Data Mining techniques used in MADAM ID

- Audit data consist of preprocessed, time-stamped audit records, each with a number of features (fields).
- ID is treated as a classification problem.
- Given a set of records in which one of the features is a class label, classification algorithms compute a model that uses the most discriminating feature values to describe each concept.

10
Telnet Records

label     service  flag  hot  failed_logins  compromised  root_shell  su  duration  ...
normal    telnet   SF    0    0              0            0           0   10.2      ...
normal    telnet   SF    0    0              0            3           1   2.1       ...
guess     telnet   SF    0    6              0            0           0   26.2      ...
normal    telnet   SF    0    0              0            0           0   126.2     ...
overflow  telnet   SF    3    0              2            1           0   92.5      ...
normal    telnet   SF    0    0              0            0           0   2.1       ...
guess     telnet   SF    0    5              0            0           0   13.9      ...
overflow  telnet   SF    3    0              2            1           0   92.5      ...
normal    telnet   SF    0    0              0            0           0   1248      ...
...

11
Rule Learning

- RIPPER is a classification rule learning program that generates rules.
- The accuracy of a classification model depends on the set of features provided in the training set.
- The classification algorithm looks for features with large information gain.
- Adding per-host and per-service temporal statistical features resulted in a significant improvement in accuracy.

RIPPER rules and their meaning:

guess :- failed_logins >= 4.
  If the number of failed logins is at least 4, then this telnet connection is "guess", a password-guessing attack.
overflow :- hot >= 3, compromised >= 2, root_shell = 1.
  If the number of hot indicators is at least 3, the number of compromised conditions is at least 2, and a root shell is obtained, then this telnet connection is a buffer overflow attack.
...
normal :- true.
  If none of the above, then this connection is normal.

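
A minimal sketch of how the ordered rules above could be applied to a few of the telnet records from slide 10. The comparison operators follow the Meaning column; the function name `classify_telnet` and the dictionary layout are assumptions for illustration. RIPPER itself learns such rules from labeled data, whereas here they are simply hand-coded.

```python
def classify_telnet(record):
    """Apply the rules in order; the first rule that fires decides the label."""
    if record["failed_logins"] >= 4:
        return "guess"
    if (record["hot"] >= 3
            and record["compromised"] >= 2
            and record["root_shell"] == 1):
        return "overflow"
    # Default rule: nothing above matched.
    return "normal"


# A subset of the telnet records shown on slide 10.
records = [
    {"label": "normal", "hot": 0, "failed_logins": 0, "compromised": 0, "root_shell": 0},
    {"label": "normal", "hot": 0, "failed_logins": 0, "compromised": 0, "root_shell": 3},
    {"label": "guess", "hot": 0, "failed_logins": 6, "compromised": 0, "root_shell": 0},
    {"label": "overflow", "hot": 3, "failed_logins": 0, "compromised": 2, "root_shell": 1},
    {"label": "guess", "hot": 0, "failed_logins": 5, "compromised": 0, "root_shell": 0},
]

predictions = [classify_telnet(r) for r in records]
```

On these rows the predictions agree with the recorded labels, which is expected because the rules were derived from this kind of data.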
12
Meta-classification

- Meta-learning is a mechanism for inductively learning the correlation of predictions made by a number of base classifiers.
- Each record in the training data has the true class label and the predictions made by the base classifiers.
- The meta-classifier combines the base models to make a final prediction.
- An IDS should consist of multiple cooperative, lightweight subsystems that monitor separate parts of the network environment.
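
One way to picture the meta-level training records described above is a lookup-table combiner: for each combination of base predictions, remember the true label seen most often with it. This is only an illustrative stand-in, not the meta-learner used in the presented work, and the two base classifiers and the toy training data below are made up for the example.

```python
from collections import Counter, defaultdict


def train_meta_classifier(meta_records):
    """meta_records: list of (base_predictions, true_label) pairs."""
    votes = defaultdict(Counter)
    for base_preds, true_label in meta_records:
        votes[tuple(base_preds)][true_label] += 1
    # For each combination of base predictions, keep the most frequent true label.
    table = {preds: counts.most_common(1)[0][0] for preds, counts in votes.items()}

    def predict(base_preds):
        base_preds = tuple(base_preds)
        if base_preds in table:
            return table[base_preds]
        # Unseen combination: fall back to a majority vote of the base classifiers.
        return Counter(base_preds).most_common(1)[0][0]

    return predict


# Hypothetical data: (prediction of base model 1, prediction of base model 2), true label.
meta_records = [
    (("normal", "normal"), "normal"),
    (("normal", "intrusion"), "intrusion"),
    (("normal", "intrusion"), "intrusion"),
    (("intrusion", "normal"), "normal"),
    (("intrusion", "intrusion"), "intrusion"),
]

meta_predict = train_meta_classifier(meta_records)
final_a = meta_predict(("normal", "intrusion"))
final_b = meta_predict(("intrusion", "normal"))
```

In this toy data the combiner learns to trust the second base model when the two disagree, which is the kind of correlation a meta-classifier is meant to pick up.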

13
Association Rules

- Program executions and user activities exhibit frequent correlations among system features.
- Goal of mining association rules: derive multi-feature correlations from a database table.
- $\mathrm{support}(X)$ is the percentage of records that contain itemset $X$, where each record is a set of items.
- An association rule is an expression of the form $X \rightarrow Y,\ [c, s]$, where $X$ and $Y$ are itemsets, $X \cap Y = \emptyset$, $s = \mathrm{support}(X \cup Y)$ is the support, and $c = \mathrm{support}(X \cup Y) / \mathrm{support}(X)$ is the confidence.

14
time  hostname  command  arg1      arg2
am    pascal    mkdir    dir1
am    pascal    cd       dir1
am    pascal    vi       text
am    pascal    tex      vi
am    pascal    subject  progress
am    pascal    vi       text
am    pascal    vi       text
am    pascal    subject  progress
am    pascal    vi       text

vi → time = am, hostname = pascal, arg1 = text, [1.0, 44.4%]
support(vi) = 44.4%

When using vi to edit a file, the user is always (i.e., 100% of the time) editing a text file, in the morning, at host pascal; 44.4% of the command data matches this pattern.

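
The confidence and support quoted for the vi rule can be checked directly against the nine command records in the table. The sketch below represents each record as a dictionary and each itemset as a set of field/value pairs; the helper names `support` and `confidence` are chosen here for illustration.

```python
# The nine shell command records from the table (arg2 is empty throughout).
rows = [
    ("am", "pascal", "mkdir", "dir1"),
    ("am", "pascal", "cd", "dir1"),
    ("am", "pascal", "vi", "text"),
    ("am", "pascal", "tex", "vi"),
    ("am", "pascal", "subject", "progress"),
    ("am", "pascal", "vi", "text"),
    ("am", "pascal", "vi", "text"),
    ("am", "pascal", "subject", "progress"),
    ("am", "pascal", "vi", "text"),
]
records = [dict(zip(("time", "hostname", "command", "arg1"), r)) for r in rows]


def support(itemset, records):
    """Fraction of records that contain every (field, value) pair in itemset."""
    hits = sum(all(r.get(k) == v for k, v in itemset.items()) for r in records)
    return hits / len(records)


def confidence(lhs, rhs, records):
    """support(X u Y) / support(X) for the rule X -> Y."""
    return support({**lhs, **rhs}, records) / support(lhs, records)


lhs = {"command": "vi"}
rhs = {"time": "am", "hostname": "pascal", "arg1": "text"}

s = support({**lhs, **rhs}, records)   # 4 of 9 records
c = confidence(lhs, rhs, records)      # every vi record matches the right-hand side
```

Four of the nine records use vi, and all four edit `text` in the morning on pascal, which gives the 44.4% support and 1.0 confidence shown on the slide.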
15
Frequent Episodes

- A frequent episode rule is an expression of the form $X, Y \rightarrow Z,\ [c, s, w]$, where $w$ is the width of the time interval $[t_1, t_2]$ during which the episode occurs.
- Frequent episodes mined from audit data contain associations among features; these are used to construct temporal statistical features for building classifiers.

16
Network Connection Records

timestamp  duration  service  src_host   dst_host  src_bytes  dst_bytes  flag
1.1        0         http     spoofed_1  victim    0          0          S0
1.1        0         http     spoofed_2  victim    0          0          S0
1.1        0         http     spoofed_3  victim    0          0          S0
1.1        0         http     spoofed_4  victim    0          0          S0
1.1        0         http     spoofed_5  victim    0          0          S0
1.1        0         http     spoofed_6  victim    0          0          S0
1.1        0         http     spoofed_7  victim    0          0          S0
...        ...       ...      ...        ...       ...        ...        ...
10.1       2         ftp      A          B         200        300        SF

The items flag = S0, service = http, and dst_host = victim are used to describe the SYN flood attack:
dst_host = victim → service = http, src_bytes = 0, dst_bytes = 0, flag = S0, [1.0, 0.7, 0]

17
Constructing features from intrusion patterns

- Parse frequent episodes and use three operators (count, percent, and average) to construct statistical features.
- Procedure:
  - E.g., assume F0 (say dst_host) is a reference feature and the width of the episode is w seconds.
  - Add the following features, which examine only the connections in the past w seconds that share the same value in dst_host as the current connection:
    - Add a feature that computes the count of these connections.
    - Let F1 be service, src_host, or dst_host, other than F0. If the same value of F1 is in all itemsets of the episode, add a feature that computes the percentage of connections having the same F1 value as the current connection.
    - Let V2 be a value (e.g., S0) of a feature F2 (say flag). If V2 is in all the itemsets of the episode, add a feature that computes the percentage of connections having the same V2; otherwise, if F2 is a numerical feature, add a feature that computes the average of the F2 values.

18
Example to illustrate feature construction

- Suppose record 7 of the network connection records (slide 16) is our current connection (F0 value = victim, F1 value = http).
- Assume w = 0 and F0 is the feature dst_host.
  - Count the number of connections in the past w = 0 time units having the same value for feature F0, i.e., dst_host = victim (7 in this example).
  - Create a new feature count_F0 (containing the value 7 in this example).
- Assume F1 is the feature service.
  - Compute the percentage of connections having service = http in the past w = 0 time units for the given F0, i.e., dst_host = victim (100% in this example).
  - Create a new feature pcnt_F1_F0 (containing the value 100 in this example).
- Assume V2 = S0.
  - Compute the percentage of connections having V2 = S0 in the past w = 0 time units for the given F0, i.e., dst_host = victim.
  - Create a new feature pcnt_V2_F0 (containing the value 100 in this example).
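
The worked example above can be reproduced with a short sketch over the connection records shown on the Network Connection Records slide. It assumes that records are sorted by timestamp, that the current connection counts itself (which matches the count of 7), and that the elided rows of the table are simply left out; the function name `construct_features` is hypothetical.

```python
FIELDS = ("timestamp", "duration", "service", "src_host",
          "dst_host", "src_bytes", "dst_bytes", "flag")

# The seven SYN flood records plus the final ftp record; elided rows are omitted.
rows = [(1.1, 0, "http", "spoofed_%d" % i, "victim", 0, 0, "S0") for i in range(1, 8)]
rows.append((10.1, 2, "ftp", "A", "B", 200, 300, "SF"))
records = [dict(zip(FIELDS, r)) for r in rows]


def construct_features(records, current, w, f0="dst_host", f1="service",
                       f2="flag", v2="S0"):
    """Compute the count and percent features for records[current]."""
    cur = records[current]
    # Connections up to and including the current one, within the past w time
    # units, that share the current connection's value of the reference feature F0.
    window = [r for r in records[:current + 1]
              if cur["timestamp"] - r["timestamp"] <= w and r[f0] == cur[f0]]
    count = len(window)
    pcnt_f1 = 100.0 * sum(r[f1] == cur[f1] for r in window) / count
    pcnt_v2 = 100.0 * sum(r[f2] == v2 for r in window) / count
    return {"count_F0": count, "pcnt_F1_F0": pcnt_f1, "pcnt_V2_F0": pcnt_v2}


# Record 7 (index 6) is the current connection, with w = 0.
features = construct_features(records, current=6, w=0)

# The ordinary ftp connection looks very different under the same features.
ftp_features = construct_features(records, current=7, w=0)
```

The SYN flood record gets a count of 7 with both percentages at 100, while the ftp record gets a count of 1 and an S0 percentage of 0, which is what makes these features useful to a classifier.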

19
Experimentation

- Experiments were conducted on the 1998 DARPA Intrusion Detection Evaluation Program data and DARPA BSM data.
- The algorithms and tools of MADAM ID were used to process audit data, mine patterns, construct features, and build RIPPER classifiers.
- DARPA data: 4 gigabytes of compressed tcpdump data covering 7 weeks of network traffic.
- The data was processed into 5 million connection records of about 100 bytes each.
- The training data contains DoS, R2L, U2R, and probing attacks.

20
Model Complexities

Model         # of features in records  # of rules  # of features used in rules
content       22                        55          11
traffic       20                        26          4 + 9
host traffic  14                        8           1 + 5

25
User Anomaly Detection

- The goal is to determine whether the behavior of a user is normal or not.
- It is difficult to classify a single event by a user as normal or abnormal.
- A user's actions during a login session need to be studied as a whole to determine whether he/she is behaving normally.
- Approach:
  - Mine the frequent patterns from command data
  - Form the normal usage profile of the user
  - Analyze a login session by comparing its similarity to the profile
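
A rough sketch of the three-step approach above. As a stand-in for mined frequent patterns it uses adjacent command pairs that occur in at least half of a user's normal sessions, and it scores a new session by the fraction of its pairs found in that profile. The pattern definition, the 0.5 threshold, the similarity measure, and the sample sessions are all assumptions made for illustration, not the method of the presented work.

```python
from collections import Counter


def command_pairs(session):
    """Adjacent command pairs, used here as a simple stand-in for mined patterns."""
    return list(zip(session, session[1:]))


def build_profile(sessions, min_support=0.5):
    """Keep the pairs that occur in at least min_support of the normal sessions."""
    counts = Counter()
    for session in sessions:
        counts.update(set(command_pairs(session)))
    return {pair for pair, n in counts.items() if n / len(sessions) >= min_support}


def similarity(session, profile):
    """Fraction of the session's distinct pairs that also appear in the profile."""
    pairs = set(command_pairs(session))
    if not pairs:
        return 0.0
    return len(pairs & profile) / len(pairs)


# Hypothetical normal sessions for one user.
normal_sessions = [
    ["mkdir", "cd", "vi", "tex"],
    ["cd", "vi", "tex", "vi"],
    ["cd", "vi", "tex"],
]
profile = build_profile(normal_sessions)

typical_score = similarity(["cd", "vi", "tex"], profile)
unusual_score = similarity(["su", "cat", "rm"], profile)
```

A session scoring far below the user's usual similarity would then be flagged for review; where to set that cut-off is left open here.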

26
User Descriptions
User Anomaly Description
27
Strengths

- The paper is very well written.
- Exhaustive experimentation with real-world data.
- Developed a simple, intuitive, yet powerful method for feature construction.
- First attempt to incorporate data mining algorithms in IDSs.
- Experimented with both misuse detection and user anomaly detection.
- The models performed better than systems built with knowledge engineering approaches.
- The authors critique their own work.

28
Weaknesses

- Results show that the tools are not effective for attacks having large variance in behavior (like DoS and R2L).
- Results depend on the quality and quantity of the training data; this may lead to overtraining.
- Network anomaly detection for detecting new attacks is not implemented.
- Computationally expensive.

29
Future Improvements

- Develop algorithms to learn network anomaly detection models.
- ID models should be sensitive to cost factors such as development cost, operational cost (i.e., needed resources), cost of damages from an intrusion, and cost of detecting and responding to a potential intrusion.
- Algorithms should incorporate user-defined factors and policies to compute cost-sensitive ID models.