Gnort: High Performance Intrusion Detection Using Graphics Processors

1
Gnort: High Performance Intrusion Detection Using
Graphics Processors

Giorgos Vasiliadis, Spiros Antonatos, Michalis
Polychronakis, Evangelos Markatos, Sotiris
Ioannidis
Institute of Computer Science
Foundation for Research and Technology Hellas

2
General Idea

How to speed up the processing throughput of
intrusion detection systems by offloading the
pattern matching operations to the GPU.

3
Introduction

The problem
Network Intrusion Detection Systems (NIDS) rely on
string matching to detect and prevent well-known
attacks
String matching accounts for up to 75% of the
total CPU processing
String Matching Algorithms
Aho-Corasick
Specialized hardware devices (NP, FPGAs, ASICs)
Complex to modify and program
Poor flexibility
Graphics Cards
Easy to program
Powerful and ubiquitous
Researchers have begun exploring ways to tap their
power for non-graphics applications

4
Why use the GPU?

The GPU is specialized for compute-intensive,
highly parallel computation

5
NVIDIA GeForce SIMD Architecture

Many Multiprocessors
Each multiprocessor contains many Stream
Processors
Memory model
Shared On-Chip Memory
1 cycle
Constant Memory
400-600 cycles (1 cycle if cached)
Texture Memory
400-600 cycles (1 cycle if cached)
Global Device Memory
400-600 cycles

The GPU can be used as a general-purpose processor,
capable of executing many threads in parallel

6
The Aho-Corasick Algorithm

Used in most modern NIDSes
Scans for multiple patterns simultaneously
Preprocesses all patterns to build a state machine
The state machine is used to scan for multiple
patterns simultaneously in linear time
Scanning complexity is independent of the number of
patterns

Example: P = {he, she, his, hers}
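
The two phases described on this slide (preprocessing the patterns into a state machine, then scanning the input once) can be sketched in Python. This is a minimal textbook-style illustration, not Snort's or Gnort's implementation; the function names `build_automaton` and `search` are made up for this example.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure and output tables for the given patterns."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # fail[state] -> state to fall back to on a mismatch
    out = [set()]  # out[state] -> patterns that end in this state

    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pat)

    # Phase 2: breadth-first pass that computes the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the patterns recognized by the fallback state.
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Scan text once and return (start_offset, pattern) for every match."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches
```

With the example pattern set, `sorted(search("ushers", build_automaton(["he", "she", "his", "hers"])))` yields `[(1, 'she'), (2, 'he'), (2, 'hers')]`: all three matches are found in a single pass over the input.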

7
Mapping Aho-Corasick on GPU

How to represent the state machine?
Snort represents each state as an array of
pointers
These are difficult to map onto GPU memory
Transform the state machine into a 2D array
Can easily be bound to texture memory
Texture fetches are cached
Aho-Corasick exhibits strong locality of
reference
Random-access memory reads
Using texture memory improves GPU execution
time by about 19%
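
The 2D-array representation can be illustrated as a dense table with one row per state and one column per byte value, so that scanning needs exactly one lookup per input byte and no pointer chasing. The sketch below uses plain Python lists as a stand-in for the array that would be bound to texture memory; `build_table`, `count_match_positions` and the 256-column layout are assumptions made for this example, not details taken from the slides.

```python
from collections import deque

ALPHABET = 256  # one column per possible byte value (assumed layout)


def build_table(patterns):
    """Return (table, accepting); table[state][byte] is the next state."""
    table = [[-1] * ALPHABET]  # row 0 is the initial state
    accepting = [False]

    # Insert the patterns as a trie; -1 marks a missing edge for now.
    for pat in patterns:
        state = 0
        for byte in pat:
            if table[state][byte] == -1:
                table.append([-1] * ALPHABET)
                accepting.append(False)
                table[state][byte] = len(table) - 1
            state = table[state][byte]
        accepting[state] = True

    # Resolve every missing edge through the failure links, so the
    # finished table needs no fallback logic at scan time.
    fail = [0] * len(table)
    queue = deque()
    for byte in range(ALPHABET):
        if table[0][byte] == -1:
            table[0][byte] = 0
        else:
            queue.append(table[0][byte])
    while queue:
        state = queue.popleft()
        accepting[state] = accepting[state] or accepting[fail[state]]
        for byte in range(ALPHABET):
            nxt = table[state][byte]
            if nxt == -1:
                table[state][byte] = table[fail[state]][byte]
            else:
                fail[nxt] = table[fail[state]][byte]
                queue.append(nxt)
    return table, accepting


def count_match_positions(payload, table, accepting):
    """Count payload offsets at which at least one pattern ends."""
    state = 0
    hits = 0
    for byte in payload:
        state = table[state][byte]  # a single array lookup per byte
        if accepting[state]:
            hits += 1
    return hits
```

For the patterns `he`, `she`, `his` and `hers` this produces a table of 10 rows by 256 columns. The access pattern in `count_match_positions` is a data-dependent read into that table, which is the kind of random-access read the slide says benefits from the texture cache.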

8
Parallelizing Packet Searching (1/2)

Assigning a Single Packet to each Multiprocessor

Each packet is copied to the shared memory of the
Multiprocessor
Stream Processors search different parts of the
packet concurrently
Overlapping computation
Matching patterns may span consecutive chunks of
the packet
Same amount of work per Stream Processor
Stream Processors will be synchronized
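
The reason for the overlapping computation can be shown with a short sketch: if each thread scans only its own chunk, a pattern that straddles a chunk boundary is missed, so every chunk is extended by the length of the longest pattern minus one byte. The helper names below are hypothetical, and a naive `bytes.find` stands in for the Aho-Corasick scan to keep the example short.

```python
def split_with_overlap(payload_len, num_threads, max_pattern_len):
    """Return one (start, end) byte range per thread."""
    chunk = -(-payload_len // num_threads)  # ceiling division
    overlap = max_pattern_len - 1           # extra bytes read past the chunk
    ranges = []
    for t in range(num_threads):
        start = t * chunk
        if start >= payload_len:
            break
        end = min(start + chunk + overlap, payload_len)
        ranges.append((start, end))
    return ranges


def find_in_chunks(payload, patterns, num_threads):
    """Search each overlapping chunk independently and merge the results."""
    max_len = max(len(p) for p in patterns)
    found = set()  # a set drops matches reported by two neighbouring chunks
    for start, end in split_with_overlap(len(payload), num_threads, max_len):
        window = payload[start:end]
        for pat in patterns:
            pos = window.find(pat)
            while pos != -1:
                found.add((start + pos, pat))
                pos = window.find(pat, pos + 1)
    return found
```

For a 16-byte payload split across 4 threads with a longest pattern of 4 bytes, the ranges are `(0, 7)`, `(4, 11)`, `(8, 15)` and `(12, 16)`. A pattern occupying bytes 3 to 6 crosses the boundary between the first two chunks and is still found by the first thread.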

9
Parallelizing Packet Searching (2/2)

Assigning a Single Packet to each Stream Processor

Each packet is processed by a different Stream
Processor
No overlapping computation
Different amount of work per Stream Processor
Stream processors of the same Multiprocessor will
have to wait until all have finished
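
The cost of the waiting described here can be estimated with simple arithmetic. The sketch below assumes that scan time is proportional to payload length (one state transition per byte) and that every processor in a group is held until the longest packet in that group is finished; `group_cost` and the group size of 4 are illustrative choices, not figures from the slides.

```python
def group_cost(packet_lengths, group_size):
    """Return (useful_bytes, consumed_byte_slots) for a batch of packets."""
    useful = sum(packet_lengths)
    consumed = 0
    for i in range(0, len(packet_lengths), group_size):
        group = packet_lengths[i:i + group_size]
        # All processors in the group stay busy or idle until the
        # longest packet has been scanned.
        consumed += max(group) * group_size
    return useful, consumed


useful, consumed = group_cost([64, 1500, 200, 800], group_size=4)
utilization = useful / consumed
```

In this example the four packets contain 2564 bytes of useful work but occupy 6000 byte-slots, a utilization of about 43%. Batches with similar packet sizes keep this figure close to 100%.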

10
Software Mapping

Packets are transferred to the GPU in batches
Performs much better than transferring each packet
separately
Packets are stored in a buffer that is copied to
the GPU when it gets full
Use page-locked memory to store the packets
Higher transfer throughput from host to device
Copies are performed using DMA, without occupying
the CPU
CPU and GPU execution can overlap
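
The batching scheme can be sketched as a fixed-size buffer that accumulates packet payloads and is handed off in one piece when the next packet would not fit. This is a host-side illustration only: the `PacketBatcher` class and its `flush` callback are hypothetical, and in a real system the buffer would live in page-locked memory and the callback would start the DMA copy to the device.

```python
class PacketBatcher:
    """Collect payloads in one buffer and hand them off as a batch."""

    def __init__(self, capacity, flush):
        self.buffer = bytearray(capacity)  # stand-in for a pinned buffer
        self.used = 0
        self.offsets = []                  # (offset, length) per packet
        self.flush = flush                 # called with (data, offsets)

    def add(self, payload):
        if len(payload) > len(self.buffer):
            raise ValueError("payload larger than batch buffer")
        # Send the current batch first if this packet would not fit.
        if self.used + len(payload) > len(self.buffer):
            self.send()
        self.buffer[self.used:self.used + len(payload)] = payload
        self.offsets.append((self.used, len(payload)))
        self.used += len(payload)

    def send(self):
        """Hand off whatever has been collected and reset the buffer."""
        if self.offsets:
            self.flush(bytes(self.buffer[:self.used]), list(self.offsets))
        self.used = 0
        self.offsets.clear()


batches = []
batcher = PacketBatcher(10, lambda data, offs: batches.append((data, offs)))
for pkt in (b"aaaa", b"bbbb", b"cccc"):
    batcher.add(pkt)
batcher.send()  # push out the final, partially filled batch
```

Here the third packet does not fit into the 10-byte buffer, so the first two are sent together as one batch and the third follows in a second batch. Keeping the per-packet offsets alongside the data lets the receiving side locate each payload inside the batch.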

11
Evaluation (1/2)

Scalability as a function of the number of
patterns

We ran Snort using randomly generated patterns
All patterns are matched against every packet
The payload trace contained 800-byte UDP packets with
random payloads
Throughput remains constant as the number of patterns
increases
2.4x faster than the CPU

12
Evaluation (2/2)

Throughput as a function of the packet size

Ran Snort using 1000 random patterns
All patterns are matched against every packet
2.3 Gbit/s for full packets
3.2x faster compared to the CPU
The two GPU implementations show no significant
difference in performance

13
Evaluation with real input and rules

Experimental setup
Two PCs connected via a 1 Gbit/s Ethernet switch
To compare directly with the prior work of Jacob et
al., we re-implemented the Knuth-Morris-Pratt
(KMP) and Boyer-Moore (BM) algorithms on the GPU.

14
Evaluation with real input and rules

Snort loaded about 8000 patterns.
Preprocessors and PCRE were disabled
Original Snort (AC) cannot process all packets at
rates higher than 300 Mbit/s
GPU-assisted Snort (AC1, AC2) begins to lose
packets at 600 Mbit/s
200% improvement
The KMP and BM algorithms of Jacob et al.
perform worse in all cases

15
Conclusion

Graphics cards can be used effectively to speed
up Network Intrusion Detection Systems.
Low-cost
Easy programming
Future work includes
Transfer the packets directly from the NIC to the
GPU
Utilize multiple GPUs on multi-slot motherboards