Title: Anomaly/Intrusion Detection and Prevention in Challenging Network Environments
1Anomaly/Intrusion Detection and Prevention in
Challenging Network Environments
- Yan Chen
- Department of Electrical Engineering and Computer
Science - Northwestern University
- Lab for Internet Security Technology (LIST)
- http://list.cs.northwestern.edu
3The Spread of Sapphire/Slammer Worms
4Current Intrusion Detection Systems (IDS)
- Mostly host-based and not scalable to high-speed networks
- Slammer worm infected 75,000 machines in <10 mins
- Host-based schemes inefficient and user dependent
- Have to install IDS on all user machines!
- Mostly simple signature-based
- Inaccurate, e.g., with polymorphism
- Cannot recognize unknown anomalies/intrusions
5Current Intrusion Detection Systems (II)
- Cannot provide quality info for forensics or situational-aware analysis
- Hard to differentiate malicious events from unintentional anomalies
- Anomalies can be caused by network element faults (e.g., router misconfiguration, link failures) or application (such as P2P) misconfiguration
- Cannot tell the situational-aware info: attack scope/target/strategy, attacker (botnet) size, etc.
6Network-based Intrusion Detection, Prevention,
and Forensics System
- Online traffic recording
- SIGCOMM IMC 2004, INFOCOM 2006, ToN 2007, INFOCOM 2008
- Reversible sketch for data streaming computation
- Record millions of flows (GB traffic) in a few hundred KB
- Small # of memory accesses per packet
- Scalable to large key space size (2^32 or 2^64)
- Online sketch-based flow-level anomaly detection
- IEEE ICDCS 2006, IEEE CGA, Security Visualization 2006
- Adaptively learn the traffic pattern changes
- As a first step, detect TCP SYN flooding, horizontal and vertical scans even when mixed
- Online stealthy spreader (botnet scan) detection
- IEEE IWQoS 2007
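The idea of recording many flows in a small, fixed amount of memory can be illustrated with a plain count-min sketch. This is a simpler stand-in, not the reversible sketch named above (it cannot recover the heavy keys), and the class name, table sizes and hashing scheme are illustrative assumptions rather than anything taken from the cited papers.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table; estimates never under-count a key."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Salting with the row number gives one hash function per row.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        # A constant number of memory accesses per packet: one per row.
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key):
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(
            self.rows[r][self._index(r, key)] for r in range(self.depth)
        )


sketch = CountMinSketch()
for _ in range(5):
    sketch.add("10.0.0.1->10.0.0.2:80")   # hypothetical flow key
print(sketch.estimate("10.0.0.1->10.0.0.2:80"))  # at least 5
```

Memory use depends only on depth and width, not on the number of distinct flows, which is the property the slide relies on.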
7Network-based Intrusion Detection, Prevention,
and Forensics System (II)
- Polymorphic worm signature generation and detection
- IEEE Symposium on Security and Privacy 2006, IEEE ICNP 2007
- Accurate network diagnostics
- SIGCOMM IMC 2003, SIGCOMM 2004, ToN 2007, SIGCOMM 2006, INFOCOM 2007 (2)
- Scalable distributed intrusion alert fusion w/ DHT
- SIGCOMM Workshop on Large Scale Attack Defense 2006
8Network-based Intrusion Detection, Prevention,
and Forensics System (III)
- Large-scale botnet and P2P misconfiguration event situational-aware forensics (work under submission)
- Botnet attack target/strategy inference
- Root cause analysis of the P2P misconfiguration/poisoning traffic
- NetShield: vulnerability signature based NIDS for high performance network defense (work in progress)
- Vulnerability analysis of wireless network protocols and its defense (work in progress)
9System Deployment
- Attached to a router/switch as a black box
- Edge network detection particularly powerful
[Figure: deployment configurations: original configuration; monitor each port separately; monitor aggregated traffic from all ports]
10NetShield: Matching with a Large Vulnerability Signature Ruleset for High Performance Network Defense
11Outline
- Motivation
- Feasibility Study: a Measurement Approach
- High Speed Parsing
- High Speed Matching for Large Rulesets.
- Evaluation
- Conclusions
12Motivation
- Desired Features for Signature-based NIDS/NIPS
- Accuracy (especially for IPS)
- Speed
- Coverage: large ruleset
                  Regular Expression    Vulnerability
Accuracy          Relatively poor       Much better
Speed             Good                  ??
Memory            OK                    ??
Coverage          Good                  ??
Regular expression: cannot capture vulnerability condition well!
Vulnerability: Shield (SIGCOMM '04)
13Vision of NetShield
14Research Challenges
- Background
- Use protocol semantics to express vulnerability
- Protocol state machine and predicates for each state
- Example: ver==1 && method==put && len(buf)>300
- Challenges
- Matching thousands of vulnerability signatures simultaneously
- Sequential matching → algorithmic parallel matching
- High speed parsing
- Applicability for large NIDS/NIPS rulesets
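A minimal sketch of how the example signature above could be evaluated on parsed protocol fields. The operators are read here as ver == 1, method == "put" and len(buf) > 300, which is an assumed reading of the slide; the `Pdu` class and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pdu:
    """Hypothetical parsed protocol data unit with the three fields used."""
    ver: int
    method: str
    buf: bytes


def matches_example_signature(pdu):
    # Assumed reading of the slide: ver == 1 && method == "put" && len(buf) > 300
    return pdu.ver == 1 and pdu.method == "put" and len(pdu.buf) > 300


# Two payloads with different bytes but the same overlong buffer both match,
# because the predicate tests the vulnerability condition, not the bytes.
print(matches_example_signature(Pdu(1, "put", b"A" * 301)))
print(matches_example_signature(Pdu(1, "put", b"\x90" * 400)))
```

Because the predicate is defined on field values rather than payload bytes, changing the exploit content does not evade it as long as the overlong buffer is still present.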
15Outline
- Motivation
- Feasibility Study: a Measurement Approach
- Given a large NIDS/NIPS ruleset, what percentage of the rules can be improved with protocol semantic vulnerability signatures?
- High Speed Parsing
- High Speed Matching for Large Rulesets.
- Evaluation
- Conclusions
16Measure Snort Rules
- Semi-manually classify the rules.
- Group by CVE-ID
- Manually look at each vulnerability
- Results
- 86.7% of rules can be improved by protocol semantic vulnerability signatures.
- Most of the remaining rules (9.9%) are web DHTML and script related, which are not suitable for a signature based approach.
- On average, 4.5 Snort rules are reduced to one vulnerability signature.
- For binary protocols the reduction ratio is much higher than for text based ones.
- For netbios.rules the ratio is 67.6.
17Outline
- Motivation
- Feasibility Study: a Measurement Approach
- High Speed Parsing
- High Speed Matching for Large Rulesets.
- Evaluation
- Conclusions
18Observations
- PDU → parse tree
- Leaf nodes are integers or strings
- Vulnerability signatures mostly based on leaf nodes
- Observation 1: Only need to parse the fields related to signatures.
- Observation 2: Traditional recursive descent parsers, which need one function call per node, are too expensive.
19Efficient Parsing with State Machines
- Pre-construct parsing state machines based on parsing trees and vulnerability signatures.
- Studied eight protocols (HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP and DNS) as well as their vulnerability signatures.
- Common relationship among leaf nodes.
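The two ideas above (parse only the fields that signatures need, and avoid one function call per parse-tree node) can be sketched on a toy format. The type/length/value layout below is a hypothetical protocol invented for illustration, not one of the eight protocols studied, and the function is not the NetShield parser.

```python
def parse_selected(pdu, wanted):
    """Extract only the wanted field types from a type/length/value byte string.

    Hypothetical layout: each field is 1 byte type, 1 byte length, then value.
    """
    fields = {}
    state = "TYPE"
    ftype = 0
    i = 0
    while i < len(pdu):          # one flat loop, no call per node
        if state == "TYPE":
            ftype = pdu[i]
            i += 1
            state = "LEN"
        else:                    # state == "LEN"
            length = pdu[i]
            i += 1
            if ftype in wanted:  # keep only fields used by some signature
                fields[ftype] = pdu[i:i + length]
            i += length          # unwanted values are skipped, not decoded
            state = "TYPE"
    return fields


pdu = bytes([1, 2]) + b"hi" + bytes([2, 3]) + b"abc" + bytes([3, 1]) + b"z"
print(parse_selected(pdu, {1, 3}))
```

Field 2 is never copied or decoded; the parser only advances past it, which is where the savings over a full parse would come from.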
20Example for WINRPC
- Rectangles are states
- Parsing variables R0 .. R4
- 0.61 instruction/byte for BIND PDU
21Outline
- Motivation
- Feasibility Study: a Measurement Approach
- High Speed Parsing
- High Speed Matching for Large Rulesets.
- Evaluation
- Conclusions
22 A Matching Problem Example
- Data representations
- For all the vulnerability signatures we studied, we only need integers and strings
- Integer operators: ==, >, <
- String operators: ==, match_re(.,.), len(.).
- Example signature for Blaster worm
23Matching Problem Formulation
- Suppose we have n signatures, each defined on k matching dimensions (matchers)
- A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements.
- Challenges for the Single PDU Matching problem (SPM)
- Large number of signatures n
- Large number of matchers k
- Large number of "don't cares"
- Cannot reorder matchers arbitrarily (buffering constraint)
- Field dependency
- Arrays, associative arrays
- Mutually exclusive fields.
24Observations
- Observation 1: Most matchers are good.
- After matching against them, only a small number of signatures can pass (candidates).
- String matchers are all good, and most integer matchers are good.
- We can buffer bad matchers to change the matching order.
- Observation 2: Real world traffic mostly does not match any signature. Actually even stronger: in most traffic, no matcher is met.
- Observation 3: NIDS/NIPS will report all the matched rules regardless of the ordering, unlike firewall rules.
25Matching Algorithms
- Two steps
- Pre-computation: decides the rule order and matcher order
- For each matcher m, compare traffic w/ all the rules that involve m and filter/combine the candidate matching rules iteratively.
- Matcher Implementation
- Integer range checking: binary search tree
- String exact matching: trie
- String regular expression: DFA, XFA, etc.
- String length checking: binary search tree
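As an illustration of the integer range matcher, the sketch below finds every rule whose range contains a value with one binary search. It uses Python's `bisect` over sorted cut points in place of the binary search tree named above, and the function names and rule IDs are hypothetical.

```python
import bisect


def build_range_index(ranges):
    """ranges: {rule_id: (lo, hi)} with inclusive integer bounds."""
    # Every lo and every hi + 1 starts a new elementary interval.
    cuts = sorted(
        {lo for lo, _ in ranges.values()}
        | {hi + 1 for _, hi in ranges.values()}
    )
    # Pre-compute the set of rules covering each elementary interval.
    buckets = [
        frozenset(r for r, (lo, hi) in ranges.items() if lo <= c <= hi)
        for c in cuts
    ]
    return cuts, buckets


def lookup(index, value):
    """Return the IDs of all rules whose range contains value."""
    cuts, buckets = index
    pos = bisect.bisect_right(cuts, value) - 1
    return buckets[pos] if pos >= 0 else frozenset()


# Hypothetical length conditions: r1 is "len > 300" capped at 65535.
index = build_range_index({"r1": (301, 65535), "r2": (80, 80), "r3": (0, 9)})
print(lookup(index, 500))
```

The per-value cost is logarithmic in the number of cut points and independent of how many rules share the matcher, which is the property the matching step needs.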
26Step 1 Pre-Computation
- Put the selective matchers earlier
- Observe buffering constraint field arrival
order
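One possible way to combine the two rules above is sketched here. It is an assumed heuristic, not the published pre-computation: matchers that leave few candidates ("good") are kept in field arrival order, and the remaining ("bad") matchers are buffered and evaluated last, most selective first. The threshold and the tuple layout are hypothetical.

```python
def order_matchers(matchers, good_threshold=0.1):
    """matchers: list of (name, arrival_pos, pass_fraction).

    pass_fraction is the assumed share of signatures that survive the matcher;
    smaller means more selective.
    """
    good = [m for m in matchers if m[2] <= good_threshold]
    bad = [m for m in matchers if m[2] > good_threshold]
    good.sort(key=lambda m: m[1])  # respect field arrival order
    bad.sort(key=lambda m: m[2])   # buffered, so free to reorder
    return [m[0] for m in good + bad]


order = order_matchers([
    ("ver", 0, 0.9),
    ("method", 1, 0.05),
    ("uri", 2, 0.01),
    ("len", 3, 0.5),
])
print(order)  # ['method', 'uri', 'len', 'ver']
```

Only the fields of bad matchers need buffering under this scheme, which keeps the buffering cost bounded by the number of unselective matchers.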
27Step 2 Iterative Matching
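The iterative step can be sketched as a candidate set that shrinks as each matcher is applied. This is a simplified illustration under assumed data structures: signatures are dictionaries keyed by (field, operation), a missing key is a "don't care", and each matcher scans the current candidates instead of using the indexed matchers described earlier.

```python
import operator

OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt}


def match_pdu(signatures, matcher_order, pdu):
    """Return the names of all signatures whose conditions all hold for pdu."""
    candidates = set(signatures)
    for field, op in matcher_order:
        value = pdu.get(field)
        survivors = set()
        for name in candidates:
            cond = signatures[name].get((field, op))
            if cond is None:
                survivors.add(name)          # don't care: stays a candidate
            elif value is not None and OPS[op](value, cond):
                survivors.add(name)
        candidates = survivors
        if not candidates:                   # common case: nothing matches
            break
    return candidates


# Hypothetical signatures and fields for illustration only.
signatures = {
    "sig_a": {("ver", "=="): 1, ("method", "=="): "put", ("buf_len", ">"): 300},
    "sig_b": {("method", "=="): "get"},
}
order = [("method", "=="), ("ver", "=="), ("buf_len", ">")]
print(match_pdu(signatures, order, {"ver": 1, "method": "put", "buf_len": 400}))
```

All surviving signatures are returned rather than the first hit, matching the observation that a NIDS/NIPS reports every matched rule.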
28Refinement and Extension
- SPM improvement
- Allow negative conditions
- Handle array case
- Handle associative array case
- Handle mutually exclusive case
- Report the matched rules as early as possible
- Extend to Multiple PDU Matching (MPM)
- Allow checkpoints.
29Outline
- Motivation
- Feasibility Study: a Measurement Approach
- Problem Statement
- High Speed Parsing
- High Speed Matching for Large Rulesets.
- Evaluation
- Conclusions
30Evaluation Methodology
- Fully implemented and deployed to sniff a campus router hosting university Web servers and several labs.
- Run on a P4 3.8 GHz single-core PC w/ 4 GB memory.
- Much smaller memory usage. E.g., HTTP: 791 vulnerability sigs from 941 Snort rules
- DFA 5.29 GB vs. NetShield 1.08 MB
31Stress Test Results
- Traces from Tsinghua Univ. (TH) and Northwestern Univ. (NU)
- After TCP reassembly, preload the PDUs in memory
- For DNS we only evaluate parsing.
- For WINRPC we have 45 vulnerability signatures which cover 3,519 Snort rules
- For HTTP we have 791 vulnerability signatures which cover 941 Snort rules.
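The rule and signature counts above imply the following reduction ratios. The short calculation only restates the slide's own numbers; the variable and function names are illustrative.

```python
# (Snort rules covered, vulnerability signatures) as quoted on the slide.
coverage = {"WINRPC": (3519, 45), "HTTP": (941, 791)}


def reduction_ratio(rules, signatures):
    """Average number of Snort rules replaced by one vulnerability signature."""
    return rules / signatures


ratios = {
    proto: round(reduction_ratio(rules, sigs), 2)
    for proto, (rules, sigs) in coverage.items()
}
print(ratios)  # {'WINRPC': 78.2, 'HTTP': 1.19}
```

The binary protocol (WINRPC) shows a far larger reduction than the text-based one (HTTP).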
32Conclusions
- A novel network-based vulnerability signature matching engine
- Through a measurement study on the Snort ruleset, show that vulnerability signatures can improve most of the signatures in NIDS/IPS.
- Proposed a parsing state machine for fast parsing
- Proposed a candidate selection algorithm for matching a large number of vulnerability signatures simultaneously
33With Our Solutions
                  Regular Expression    Vulnerability
Accuracy          Relatively poor       Much better
Speed             Good                  Even faster
Memory            OK                    Better
Coverage          Good                  Similar
- Ongoing work: apply NetShield on Cisco signature ruleset
Build a better Snort alternative
34Backup
35Observation
- PDU → parse tree
- Leaf nodes are integers or strings
- Vulnerability signatures mostly based on leaf nodes
- Only need to parse the fields related to signatures
- Traditional recursive descent parsers (BINPAC), which need one function call per node, are too expensive.
36Limitations of Regular Expression Signatures
[Figure: traffic filtering between the Internet and our network using signature 10.01, with two flows marked X]
- Polymorphism! A polymorphic attack (worm/botnet) might not have an exact regular expression based signature
37Reason
[Figure: RE cannot express the exact condition; Shield can express the exact condition]
- Regular expressions are not powerful enough to capture the exact vulnerability condition!
38Outline
- Motivation
- Feasibility Study: a Measurement Approach
- Problem Statement
- High Speed Parsing
- High Speed Matching for Massive Vulnerability Signatures.
- Evaluation
- Conclusions
39What Do We Do?
- Build a NIDS/NIPS with much better accuracy and similar speed compared with regular expression based approaches
- Feasibility: in the Snort ruleset (6,735 signatures), 86.7% can be improved by vulnerability signatures.
- High speed parsing: 2.712 Gbps
- High speed matching
- Efficient algorithm for matching a large number of vulnerability rules
- HTTP: 791 vulnerability signatures at 1 Gbps
40Network based IDS/IPS
- Accuracy (especially for IPS)
- False positive
- False negative
- Speed
- Coverage: large ruleset
                  Regular Expression    Vulnerability
Accuracy          Poor                  Much better
Speed             Good                  Good
Coverage          Good                  Good
Regular expressions are not powerful enough to capture the exact vulnerability condition!