Title: Towards a High-speed Router-based Anomaly/Intrusion Detection System (HRAID). Zhichun Li, Yan Gao, Yan Chen
Towards a High-speed Router-based Anomaly/Intrusion Detection System (HRAID)
Zhichun Li, Yan Gao, Yan Chen ({lizc, ygao, ychen}@cs.northwestern.edu)
Northwestern Lab for Internet and Security Technology (LIST)
Department of Computer Science, Northwestern University
http://list.cs.northwestern.edu/hpnaidm.html
Current Intrusion Detection Systems and Shortcomings
- Mostly host-based and not scalable to high-speed networks
  - The Slammer worm infected 75,000 machines in <10 minutes
  - High-speed gateways and backbones are vantage points for detecting attacks
- Mostly signature-based; cannot recognize unknown anomalies/intrusions
  - Existing flow-level approaches do not scale to high-volume traffic
  - Detection based on overall traffic cannot cooperate with attack mitigation

Features of HRAID System
- Online traffic recording
  - Compact data structure: reversible k-ary sketch
  - Small memory usage (fits in SRAM)
  - Few memory accesses per packet
- Online flow-level anomaly/intrusion detection and mitigation
  - Detects TCP SYN flooding and horizontal and vertical scans, even when mixed
  - Infers key characteristics of malicious flows for mitigation
HRAID: the first flow-level intrusion detection system that can sustain tens of Gbps of bandwidth, even for worst-case traffic consisting of 40-byte packet streams
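
To make the worst-case claim concrete, the short calculation below converts a link speed into a packet rate and a per-packet time budget for back-to-back 40-byte packets. The 10 Gbps figure is an illustrative assumption (the poster only says "tens of Gbps"), and link-layer framing overhead is ignored, so the numbers are a rough bound rather than a measurement.

```python
def packets_per_second(link_gbps: float, packet_bytes: int = 40) -> float:
    """Packet arrival rate when a link is saturated with fixed-size packets.

    Assumption: no link-layer framing overhead is counted.
    """
    return link_gbps * 1e9 / (packet_bytes * 8)


def time_budget_ns(link_gbps: float, packet_bytes: int = 40) -> float:
    """Time available to process one packet, in nanoseconds."""
    return 1e9 / packets_per_second(link_gbps, packet_bytes)


if __name__ == "__main__":
    rate = packets_per_second(10.0)   # illustrative 10 Gbps link
    budget = time_budget_ns(10.0)
    print(f"{rate / 1e6:.2f} Mpps, {budget:.0f} ns per packet")
    # prints "31.25 Mpps, 32 ns per packet"
```

A budget of a few tens of nanoseconds per packet is why the number of memory accesses per packet and an SRAM-sized footprint matter for this design.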
The design of a HRAID system
[Figure: HRAID system design. Recoverable labels: EWMA; reversible sketches RS((DIP, Dport), SYN-SYN/ACK), RS((SIP, DIP), SYN-SYN/ACK), RS((SIP, Dport), SYN-SYN/ACK)]
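
The design labels name three reversible sketches, each keyed on a pair of header fields and fed the value SYN minus SYN/ACK. The sketch below illustrates that bookkeeping under stated assumptions: exact `Counter` tables stand in for the reversible sketches, the `TcpPacket` type and `record` function are hypothetical names, and a SYN/ACK is credited back to the connection it answers by swapping its source and destination fields.

```python
from collections import Counter
from typing import NamedTuple


class TcpPacket(NamedTuple):
    sip: str
    dip: str
    sport: int
    dport: int
    syn: bool
    ack: bool


def record(pkt: TcpPacket, by_dip_dport: Counter,
           by_sip_dip: Counter, by_sip_dport: Counter) -> None:
    """Apply one packet to the three keyed tables (+1 for SYN, -1 for SYN/ACK)."""
    if pkt.syn and not pkt.ack:
        client, server, port, u = pkt.sip, pkt.dip, pkt.dport, 1
    elif pkt.syn and pkt.ack:
        # Assumption: a SYN/ACK travels server -> client, so swap the
        # fields to hit the same keys as the SYN it answers.
        client, server, port, u = pkt.dip, pkt.sip, pkt.sport, -1
    else:
        return  # other packets do not change SYN - SYN/ACK
    by_dip_dport[(server, port)] += u   # key (DIP, Dport)
    by_sip_dip[(client, server)] += u   # key (SIP, DIP)
    by_sip_dport[(client, port)] += u   # key (SIP, Dport)


if __name__ == "__main__":
    t1, t2, t3 = Counter(), Counter(), Counter()
    # A completed handshake cancels out.
    record(TcpPacket("10.0.0.1", "10.0.0.2", 5000, 80, True, False), t1, t2, t3)
    record(TcpPacket("10.0.0.2", "10.0.0.1", 80, 5000, True, True), t1, t2, t3)
    # Unanswered SYNs from one source to many hosts on one port accumulate.
    for host in ("10.0.1.1", "10.0.1.2", "10.0.1.3"):
        record(TcpPacket("10.0.0.9", host, 6000, 22, True, False), t1, t2, t3)
    print(t3[("10.0.0.9", 22)], t1[("10.0.0.2", 80)])  # prints "3 0"
```

One plausible reading, not spelled out on the poster, is that a large unanswered count under (SIP, Dport) points to a horizontal scan, under (SIP, DIP) to a vertical scan, and under (DIP, Dport) to SYN flooding.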
Attaching HRAID black boxes to high-speed routers: (a) original configuration; (b) distributed configuration, in which each port is monitored separately; (c) aggregate configuration, in which a splitter aggregates the traffic from all ports of a router.
[Figure: the two-phase heavy-key inference of the reversible k-ary sketch]

k-ary Sketch: Update, Combine and Estimate
- Array of hash tables: $T_j[K]$ ($j = 1, \dots, H$)
  - Similar to count sketch, counting Bloom filter, multi-stage filter, ...
- Update $(k, u)$: $T_j[h_j(k)] \mathrel{+}= u$ (for all $j$)
- Estimate $v(S, k)$: sum of updates for key $k$
- Combine: can aggregate data from different times, locations, and sources
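
The update, estimate, and combine operations listed for the k-ary sketch can be illustrated with a small Python sketch. Several details here are assumptions not stated on the poster: the class name `KarySketch`, the BLAKE2-based bucket hash (a stand-in for whatever hash family the authors use), the default sizes H = 5 and K = 1024, and the median-of-unbiased-estimates formula, which follows the commonly published k-ary sketch estimator.

```python
import hashlib
from statistics import median


class KarySketch:
    """H hash tables of K counters each; a sketch under stated assumptions."""

    def __init__(self, H: int = 5, K: int = 1024, seed: int = 0):
        self.H, self.K, self.seed = H, K, seed
        self.tables = [[0.0] * K for _ in range(H)]
        self.total = 0.0  # sum of all updates, used by estimate()

    def _bucket(self, j: int, key) -> int:
        # Assumption: a seeded cryptographic hash stands in for h_j.
        data = repr((self.seed, j, key)).encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.K

    def update(self, key, u: float) -> None:
        """T_j[h_j(k)] += u for all j."""
        for j in range(self.H):
            self.tables[j][self._bucket(j, key)] += u
        self.total += u

    def estimate(self, key) -> float:
        """Median over tables of a per-table estimate of the key's sum."""
        K = self.K
        per_table = [
            (self.tables[j][self._bucket(j, key)] - self.total / K) / (1 - 1 / K)
            for j in range(self.H)
        ]
        return median(per_table)

    @staticmethod
    def combine(weighted):
        """Linear combination sum(c_i * S_i) of sketches with equal parameters."""
        first = weighted[0][1]
        out = KarySketch(first.H, first.K, first.seed)
        for c, s in weighted:
            if (s.H, s.K, s.seed) != (out.H, out.K, out.seed):
                raise ValueError("sketches must share H, K and hash seed")
            for j in range(out.H):
                row = out.tables[j]
                for i, v in enumerate(s.tables[j]):
                    row[i] += c * v
            out.total += c * s.total
        return out


if __name__ == "__main__":
    today, yesterday = KarySketch(), KarySketch()
    today.update(("10.0.0.9", 22), 400)
    yesterday.update(("10.0.0.9", 22), 10)
    change = KarySketch.combine([(1, today), (-1, yesterday)])
    print(round(change.estimate(("10.0.0.9", 22))))
```

Because every operation is linear in the counters, sketches recorded at different times, locations, or sources can be added or subtracted counter by counter, as long as they share the same hash functions.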
Preliminary Evaluation
Evaluated with NU traces (239M flows, 1.8 TB traffic/day)
- Scalable
  - Can handle hundreds of millions of time series
- Accurate anomaly detection with reversible sketch
  - Compared with detection using complete flow-level tables
  - Provable probabilistic accuracy guarantees
  - Even more accurate on real Internet traces
- Efficient
  - For the worst-case traffic (all 40-byte packets): 16 Gbps on a single FPGA board; 526 Mbps on a Pentium 4 3.2 GHz PC
  - Less than 3 MB of memory used
  - Only 15 memory accesses per packet for 48-bit reversible sketches and 16 per packet for 64-bit reversible sketches
- 25 SYN flooding attacks, 936 horizontal scans, and 19 vertical scans detected (after sketch-based false positive reduction)
- 18 out of 25 SYN flooding attacks verified with backscatter
- Scans verified (all for vertical scans; top and bottom 10 for horizontal scans)

[Table: Top 10 horizontal scans]
[Table: Bottom 10 horizontal scans]

Reversible Sketch Based Anomaly Detection
- Input stream: (key, update), e.g., (SIP, SYN-SYN/ACK)
- Summarize the input stream using sketches
- Build forecast models on top of the sketches
- Report flows with large forecast errors
- Infer the key (characteristics) for mitigation
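
The sketch-based detection steps (summarize the stream, forecast, report keys with large forecast errors) can be illustrated as follows. This is a simplified sketch under stated assumptions: exact dictionaries replace the sketches, the forecast model is a plain EWMA with a hypothetical smoothing factor `alpha`, and a fixed `threshold` stands in for whatever alarm criterion the authors use.

```python
def ewma_detect(intervals, alpha=0.5, threshold=50.0):
    """Return (interval index, key, forecast error) for each large error.

    intervals: list of dicts mapping key -> observed value per interval,
    e.g. key = source IP, value = SYN minus SYN/ACK count.
    """
    forecast = {}
    alarms = []
    for t, observed in enumerate(intervals):
        keys = set(observed) | set(forecast)
        if t > 0:
            # Forecast error = observed value - EWMA forecast.
            for key in keys:
                err = observed.get(key, 0.0) - forecast.get(key, 0.0)
                if abs(err) > threshold:
                    alarms.append((t, key, err))
        if t == 0:
            # Bootstrap: the first interval seeds the forecast.
            forecast = {key: float(v) for key, v in observed.items()}
        else:
            forecast = {
                key: alpha * observed.get(key, 0.0)
                + (1 - alpha) * forecast.get(key, 0.0)
                for key in keys
            }
    return alarms


if __name__ == "__main__":
    intervals = [
        {"10.0.0.5": 2},
        {"10.0.0.5": 3},
        {"10.0.0.5": 2, "10.0.0.9": 400},  # sudden burst of unanswered SYNs
    ]
    print(ewma_detect(intervals))
    # prints [(2, '10.0.0.9', 400.0)]
```

With exact dictionaries the offending key is available directly. With sketches the keys are not stored, which is why the last step, inferring the key from the error sketch, relies on the sketch being reversible.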
Acknowledgment: We thank Yin Zhang for the original k-ary sketch slides.