Title: Anomaly/Intrusion Detection and Prevention in Challenging Network Environments
1 Anomaly/Intrusion Detection and Prevention in Challenging Network Environments
- Yan Chen
- Department of Electrical Engineering and Computer Science
- Northwestern University
- Lab for Internet Security Technology (LIST)
- http://list.cs.northwestern.edu
2 The Spread of Sapphire/Slammer Worms
3 Current Intrusion Detection Systems (IDS)
- Mostly host-based and not scalable to high-speed networks
- Slammer worm infected 75,000 machines in <10 mins
- Host-based schemes inefficient and user dependent
- Have to install IDS on all user machines!
- Mostly simple signature-based
- Inaccurate, e.g., with polymorphism
- Cannot recognize unknown anomalies/intrusions
4 Current Intrusion Detection Systems (II)
- Cannot provide quality info for forensics or situational-aware analysis
- Hard to differentiate malicious events from unintentional anomalies
- Anomalies can be caused by network element faults (e.g., router misconfiguration, link failures) or application (such as P2P) misconfiguration
- Cannot tell the situational-aware info: attack scope/target/strategy, attacker (botnet) size, etc.
5 Network-based Intrusion Detection, Prevention, and Forensics System
- Online traffic recording
- SIGCOMM IMC 2004, INFOCOM 2006, ToN 2007, INFOCOM 2008
- Reversible sketch for data streaming computation
- Record millions of flows (GB traffic) in a few hundred KB
- Small number of memory accesses per packet
- Scalable to large key space size (2^32 or 2^64)
- Online sketch-based flow-level anomaly detection
- IEEE ICDCS 2006; IEEE CGA, Security Visualization 2006
- Adaptively learn the traffic pattern changes
- As a first step, detect TCP SYN flooding, horizontal and vertical scans even when mixed
- Online stealthy spreader (botnet scan) detection
- IEEE IWQoS 2007
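To illustrate how a sketch can summarize many flows in fixed memory with a handful of memory accesses per packet, here is a minimal count-min-style sketch. It is a simplification and not the reversible sketch from the cited papers (it cannot recover keys from counters); the class name, hash construction, and table sizes are illustrative assumptions.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table. NOT the reversible sketch from the talk."""

    def __init__(self, rows=4, cols=1024):
        self.rows = rows
        self.cols = cols
        self.table = [[0] * cols for _ in range(rows)]

    def _index(self, row, key):
        # One hash function per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([row])).digest()
        return int.from_bytes(digest, "big") % self.cols

    def update(self, key, value=1):
        # Exactly `rows` counter updates per packet, independent of flow count.
        for r in range(self.rows):
            self.table[r][self._index(r, key)] += value

    def estimate(self, key):
        # Collisions can only inflate counters, so take the minimum over rows.
        return min(self.table[r][self._index(r, key)] for r in range(self.rows))


sketch = CountMinSketch()
for _ in range(5):
    sketch.update(b"10.0.0.1->10.0.0.2:80")   # hypothetical flow key
print(sketch.estimate(b"10.0.0.1->10.0.0.2:80"))  # never below the true count
```

Memory is `rows * cols` counters no matter how many distinct flows are seen, which is the property the slide highlights.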
6 Network-based Intrusion Detection, Prevention, and Forensics System (II)
- Polymorphic worm signature generation and detection
- IEEE Symposium on Security and Privacy 2006; IEEE ICNP 2007
- Accurate network diagnostics
- SIGCOMM IMC 2003, SIGCOMM 2004, ToN 2007, SIGCOMM 2006, INFOCOM 2007 (2)
- Scalable distributed intrusion alert fusion w/ DHT
- SIGCOMM Workshop on Large Scale Attack Defense 2006
7 Network-based Intrusion Detection, Prevention, and Forensics System (III)
- Large-scale botnet and P2P misconfiguration event situational-aware forensics (work under submission)
- Botnet attack target/strategy inference
- Root cause analysis of the P2P misconfiguration/poisoning traffic
- NetShield: vulnerability signature based NIDS for high performance network defense (work in progress)
- Vulnerability analysis of wireless network protocols and its defense (work in progress)
8 System Deployment
- Attached to a router/switch as a black box
- Edge network detection particularly powerful
[Figure: deployment diagrams labeled "Original configuration", "Monitor each port separately", and "Monitor aggregated traffic from all ports"]
9 NetShield: Matching with a Large Vulnerability Signature Ruleset for High Performance Network Defense
10 Outline
- Motivation
- Feasibility Study: a Measurement Approach
- High Speed Parsing
- High Speed Matching for Large Rulesets
- Evaluation
- Conclusions
11 Motivation
- Desired Features for Signature-based NIDS/NIPS
- Accuracy (especially for IPS)
- Speed
- Coverage: large ruleset
Regular expression signatures: cannot capture vulnerability condition well!
Vulnerability signatures: Shield (SIGCOMM 2004)
            Regular Expression    Vulnerability
Accuracy    Relatively poor       Much better
Speed       Good                  ??
Memory      OK                    ??
Coverage    Good                  ??
12 Vision of NetShield
13 Research Challenges
- Background
- Use protocol semantics to express vulnerability
- Protocol state machine and predicates for each state
- Example: ver == 1 && method == put && len(buf) > 300
- Challenges
- Matching thousands of vulnerability signatures simultaneously
- Sequential matching → parallel matching
- High speed parsing
- Applicability for large NIDS/NIPS rulesets
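The example predicate on this slide can be sketched as a function over already-parsed protocol fields. This is a minimal illustration, assuming the slide's example reads as `ver == 1 && method == "put" && len(buf) > 300`; the `Pdu` type, its field names, and the helper functions are hypothetical and not NetShield's API.

```python
from dataclasses import dataclass


@dataclass
class Pdu:
    """Hypothetical parsed protocol data unit; field names are illustrative."""
    ver: int
    method: str
    buf: bytes


def matches_signature(pdu: Pdu) -> bool:
    # The condition is stated on protocol fields rather than on a byte
    # pattern, so it does not depend on the exact exploit bytes.
    return pdu.ver == 1 and pdu.method == "put" and len(pdu.buf) > 300


def match_sequentially(pdu, signatures):
    # Baseline named on the slide: test every signature in turn, O(n) per PDU.
    return [name for name, predicate in signatures.items() if predicate(pdu)]


signatures = {"long-put-buffer": matches_signature}
print(match_sequentially(Pdu(1, "put", b"A" * 301), signatures))
```

The sequential loop is the approach the slide wants to replace: with thousands of signatures, evaluating each predicate separately for every PDU becomes the bottleneck.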
14 Outline
- Motivation
- Feasibility Study: a Measurement Approach
- Given a large NIDS/NIPS ruleset, what percentage of the rules can be improved with protocol semantic vulnerability signatures?
- High Speed Parsing
- High Speed Matching for Large Rulesets
- Evaluation
- Conclusions
15 Measure Snort Rules
- Semi-manually classify the rules
- Group by CVE-ID
- Manually look at each vulnerability
- Results
- 86.7% of rules can be improved by protocol semantic vulnerability signatures
- Most of the remaining rules (9.9%) are web DHTML and script related, which are not suitable for a signature based approach
- On average 4.5 Snort rules are reduced to one vulnerability signature
- For binary protocols the reduction ratio is much higher than for text based ones
- For netbios.rules the ratio is 67.6
16 Outline
- Motivation
- Feasibility Study: a Measurement Approach
- High Speed Parsing
- High Speed Matching for Large Rulesets
- Evaluation
- Conclusions
17 Observation
- PDU → parse tree
- Leaf nodes are integers or strings
- Vulnerability signatures are mostly based on leaf nodes
Only need to parse the fields related to signatures
- Traditional recursive descent parsers (BINPAC), which need one function call per node, are too expensive
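The observation that only signature-related fields need parsing can be sketched with a flat loop that extracts wanted fields and skips the rest by advancing an offset. This is an illustration only: the type-length-value layout below is a made-up toy format, and `parse_wanted_fields` is a hypothetical helper, not the parsing state machine from the talk.

```python
import struct


def parse_wanted_fields(pdu: bytes, wanted: set) -> dict:
    """Extract only the fields a signature refers to from a toy TLV layout.

    Assumed layout (hypothetical, not a real protocol): repeated records of
    a 1-byte type, a 2-byte big-endian length, then `length` bytes of value.
    """
    fields = {}
    offset = 0
    while offset + 3 <= len(pdu):
        ftype, length = struct.unpack_from(">BH", pdu, offset)
        offset += 3
        if offset + length > len(pdu):
            raise ValueError("truncated field")
        if ftype in wanted:
            fields[ftype] = pdu[offset:offset + length]
        # Unwanted fields are skipped by moving the offset; nothing is copied
        # and no per-node function call is made.
        offset += length
    return fields


pdu = (struct.pack(">BH", 1, 3) + b"put"
       + struct.pack(">BH", 2, 4) + b"skip"
       + struct.pack(">BH", 3, 2) + b"ok")
print(parse_wanted_fields(pdu, {1, 3}))  # {1: b'put', 3: b'ok'}
```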
18 Outline
- Motivation
- Feasibility Study: a Measurement Approach
- High Speed Parsing
- High Speed Matching for Large Rulesets
- Evaluation
- Conclusions
19 Problem Formulation
- Data representations
- For all the vulnerability signatures we studied, we only need integers and strings
- Integer operators: ==, >, <
- String operators: ==, match_re(.,.), len(.)
- Buffer constraint
- The string fields could be too long to buffer
- Field dependency
- Array
- Associative array
- Mutually exclusive fields
- PDU level protocol state machine
20 Matching Problems (cont.)
- Example signature for Blaster worm
- Single PDU matching problem (SPM)
- Multiple PDU matching problem (MPM)
21 Requirement of Matching
- Suppose we have n signatures, each defined on k matching dimensions (matchers)
- A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements
- Challenges for SPM
- Large number of signatures n
- Large number of matchers k
- Large number of "don't cares"
- Cannot reorder the matchers arbitrarily (buffer constraint)
- Field dependency
- Array
- Associative array
- Mutually exclusive fields
22 Observations
- Observation 1: Most matchers are good
- After matching against them, only a small number of signatures can pass (candidates)
- String matchers are all good, and most integer matchers are good
- We can buffer bad matchers to change the matching order
- Observation 2: Real world traffic mostly does not match any signature. An even stronger statement holds: in most traffic, no matcher is met
- Observation 3: NIDS/NIPS will report all the matched rules regardless of the ordering, unlike firewall rules
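These observations suggest processing matchers one at a time while carrying only a small candidate set forward. The sketch below is one simplified reading of that idea, restricted to equality matchers; it is not the authors' algorithm, and the data layout (`signatures` as a dict of per-matcher requirements, a missing entry meaning "don't care") is an assumption made for illustration.

```python
def precompute(signatures, num_matchers):
    """signatures: {sig_id: {matcher_index: required_value}}.

    A missing matcher index is a "don't care". Every signature is assumed
    to constrain at least one matcher.
    """
    by_value = [dict() for _ in range(num_matchers)]  # matcher -> value -> sigs
    first = {}                                        # sig -> first matcher it uses
    for sig, conds in signatures.items():
        first[sig] = min(conds)
        for m, value in conds.items():
            by_value[m].setdefault(value, set()).add(sig)
    return by_value, first


def select_candidates(field_values, signatures, by_value, first):
    """field_values[m] is the parsed value that matcher m sees for this PDU."""
    candidates = set()
    for m, value in enumerate(field_values):
        passed = by_value[m].get(value, set())   # sigs satisfied at matcher m
        # Keep earlier candidates that do not care about m or that pass it.
        candidates = {s for s in candidates
                      if m not in signatures[s] or s in passed}
        # Admit signatures whose first constrained matcher is m.
        candidates |= {s for s in passed if first[s] == m}
    return candidates


signatures = {"A": {0: 1, 2: "put"}, "B": {1: 80}, "C": {0: 1, 1: 443}}
by_value, first = precompute(signatures, 3)
print(sorted(select_candidates([1, 80, "put"], signatures, by_value, first)))
# ['A', 'B']
```

When no matcher is met, which Observation 2 says is the common case, `passed` is empty at every step and the candidate set never grows, so the per-PDU cost does not scale with the number of signatures.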
23 Outline
- Motivation
- Feasibility Study: a Measurement Approach
- Problem Statement
- High Speed Parsing
- High Speed Matching for Large Rulesets
- Evaluation
- Conclusions
24 Evaluation Methodology
- Fully implemented and deployed to sniff a campus router hosting university Web servers and several labs
- Run on a P4 3.8 GHz single core PC w/ 4 GB memory
- Much smaller memory usage, e.g., for HTTP (791 vulnerability sigs from 941 Snort rules):
- DFA 5.29 GB vs. NetShield 1.08 MB
25 Stress Test Results
- Traces from Tsinghua Univ. (TH) and Northwestern Univ. (NU)
- Measured after TCP reassembly, with the PDUs preloaded in memory
- For DNS we only evaluate parsing
- For WINRPC we have 45 vulnerability signatures, which cover 3,519 Snort rules
- For HTTP we have 791 vulnerability signatures, which cover 941 Snort rules
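The coverage figures on this slide imply very different reduction ratios for the two protocols. The short calculation below only divides the numbers quoted above; the variable and function names are illustrative.

```python
# Figures quoted on the slide: (vulnerability signatures, Snort rules covered).
coverage = {"WINRPC": (45, 3519), "HTTP": (791, 941)}


def rules_per_signature(sigs, rules):
    # Average number of Snort rules folded into one vulnerability signature.
    return rules / sigs


ratios = {proto: round(rules_per_signature(*counts), 2)
          for proto, counts in coverage.items()}
print(ratios)  # {'WINRPC': 78.2, 'HTTP': 1.19}
```

This is in line with the earlier remark that binary protocols such as WINRPC see a much higher reduction ratio than text based ones such as HTTP.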
26 Conclusions
- A novel network-based vulnerability signature matching engine
- Through a measurement study on the Snort ruleset, show that vulnerability signatures can improve most of the signatures in NIDS/IPS
- Propose a parsing state machine for fast parsing
- Propose a candidate selection algorithm for matching a large number of vulnerability signatures simultaneously
27 With Our Solutions
            Regular Expression    Vulnerability
Accuracy    Relatively poor       Much better
Speed       Good                  Even faster
Memory      OK                    Better
Coverage    Good                  Similar
- Ongoing work: apply NetShield on Cisco signature ruleset
Build a better Snort alternative
28 Backup
29 Parsing State Machine
- Studied eight popular protocols (HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP, and DNS) and their vulnerability signatures
- Protocol semantics are context sensitive
- Common relationships among leaf nodes
30 Example for WINRPC
- Rectangles are states
- Parsing variables R0 .. R4
- 0.61 instructions/byte for BIND PDU
31 Matching Algorithm
- Match each matcher against all the rules and combine the results together
- Match single matcher
- Integer range checking: binary search tree
- String exact matching: trie
- String regular expression matching: DFA, XFA, etc.
- String length checking: binary search tree
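Two of the single-matcher structures listed here can be sketched briefly. In this illustration a sorted array searched with `bisect` stands in for the binary search tree, and the trie is built from nested dicts; the class names and the signature ids are hypothetical.

```python
from bisect import bisect_right


class IntRangeMatcher:
    """Maps an integer to the signatures whose inclusive [lo, hi] contains it."""

    def __init__(self, ranges):
        # ranges: {sig_id: (lo, hi)}. Membership only changes at lo and hi + 1.
        self.points = sorted({lo for lo, _ in ranges.values()}
                             | {hi + 1 for _, hi in ranges.values()})
        # Segment i covers [points[i], points[i + 1]); precompute its answer.
        self.segments = [{s for s, (lo, hi) in ranges.items() if lo <= p <= hi}
                         for p in self.points]

    def match(self, value):
        i = bisect_right(self.points, value) - 1   # binary search for segment
        return self.segments[i] if i >= 0 else set()


class Trie:
    """Exact string matching: one dict lookup per character."""

    def __init__(self):
        self.root = {}

    def add(self, word, sig_id):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node.setdefault(None, set()).add(sig_id)   # None key marks a word end

    def match(self, word):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return set()
        return node.get(None, set())


lengths = IntRangeMatcher({"A": (0, 100), "B": (50, 200)})
print(sorted(lengths.match(75)))   # ['A', 'B']
methods = Trie()
methods.add("put", "A")
print(methods.match("put"))        # {'A'}
```

Each matcher returns the set of signatures satisfied on its own field, which is the per-matcher result that the first bullet says is then combined across matchers.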
32 Candidate Selection for SPM
- Basic algorithm pre-computation
33 Matching Illustration
34 Refinement
- SPM improvement
- Allow negative conditions
- Handle array case
- Handle associative array case
- Handle mutually exclusive case
- Report the matched rules as early as possible
- Extend to MPM
- Allow checkpoints
35 Limitations of Regular Expression Signatures
[Figure: traffic filtering between the Internet and our network with signature 10.01; labels "X", "X", and "Polymorphism!"]
Polymorphic attack (worm/botnet) might not have an exact regular expression based signature
36 Reason
[Figure: comparison of Shield and RE; labels "Cannot express exact condition" and "Can express exact condition"]
- Regular expressions are not powerful enough to capture the exact vulnerability condition!
37 Outline
- Motivation
- Feasibility Study: a Measurement Approach
- Problem Statement
- High Speed Parsing
- High Speed Matching for Massive Vulnerability Signatures
- Evaluation
- Conclusions
38 What Do We Do?
- Build a NIDS/NIPS with much better accuracy and similar speed compared with regular expression based approaches
- Feasibility: in the Snort ruleset (6,735 signatures), 86.7% can be improved by vulnerability signatures
- High speed parsing: 2.712 Gbps
- High speed matching
- Efficient algorithm for matching a large number of vulnerability rules
- HTTP, 791 vulnerability signatures at 1 Gbps
39 Network based IDS/IPS
- Accuracy (especially for IPS)
- False positive
- False negative
- Speed
- Coverage: large ruleset
            Regular Expression    Vulnerability
Accuracy    Poor                  Much better
Speed       Good                  Good
Coverage    Good                  Good
Regular expressions are not powerful enough to capture the exact vulnerability condition!