Gnort: High Performance Intrusion Detection Using Graphics Processors

1
Gnort: High Performance Intrusion Detection Using
Graphics Processors
  • Giorgos Vasiliadis, Spiros Antonatos, Michalis
    Polychronakis, Evangelos Markatos, Sotiris
    Ioannidis
  • Institute of Computer Science
  • Foundation for Research and Technology Hellas

2
General Idea
  • How to speed up the processing throughput of
    intrusion detection systems by offloading the
    pattern matching operations to the GPU.

3
Introduction
  • The problem
  • Network Intrusion Detection Systems (NIDS) rely
    on string matching to detect and prevent
    well-known attacks
  • String matching accounts for up to 75% of the
    total CPU processing
  • String Matching Algorithms
  • Aho-Corasick
  • Specialized hardware devices (NP, FPGAs, ASICs)
  • Complex to modify and program
  • Poor flexibility
  • Graphics Cards
  • Easy to program
  • Powerful and ubiquitous
  • Researchers have begun exploring ways to tap their
    power for non-graphics applications

4
Why use the GPU?
  • The GPU is specialized for compute-intensive,
    highly parallel computation

5
NVIDIA GeForce SIMD Architecture
  • Many Multiprocessors
  • Each multiprocessor contains many Stream
    Processors
  • Memory model
  • Shared On-Chip Memory
  • 1 cycle
  • Constant Memory
  • 400-600 cycles, or 1 cycle if cached
  • Texture Memory
  • 400-600 cycles, or 1 cycle if cached
  • Global Device Memory
  • 400-600 cycles

The GPU can be used as a general-purpose processor,
capable of executing many threads in parallel

6
The Aho-Corasick Algorithm
  • Used in most modern NIDSes
  • Scans for multiple patterns simultaneously
  • Preprocess all patterns to build a state machine
  • The state machine is used to scan for multiple
    patterns simultaneously in linear time
  • Complexity is independent of the number of
    patterns

Example: P = {he, she, his, hers}
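
The preprocessing and scanning steps on this slide can be sketched in a few lines of Python. This is a minimal textbook version of Aho-Corasick, not Snort's or Gnort's implementation; the names `build_automaton` and `search` are illustrative. It builds a trie from the patterns, adds failure links with a breadth-first pass, and then scans the input once.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto, fail and output tables for a set of string patterns."""
    goto = [{}]     # goto[state][char] -> next state
    fail = [0]      # fail[state] -> state to fall back to on a mismatch
    out = [set()]   # out[state] -> patterns that end when this state is reached

    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pat)

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())  # depth-1 states fall back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]  # inherit matches that are suffixes
    return goto, fail, out

def search(text, automaton):
    """Scan text once and return sorted (start_offset, pattern) pairs."""
    goto, fail, out = automaton
    state, matches = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return sorted(matches)

automaton = build_automaton(["he", "she", "his", "hers"])
print(search("ushers", automaton))  # [(1, 'she'), (2, 'he'), (2, 'hers')]
```

The scan loop touches each input character once, which is why the cost of matching does not grow with the number of patterns.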
7
Mapping Aho-Corasick on GPU
  • How to represent the state machine?
  • Snort represents each state as an array of
    pointers
  • These are difficult to map onto GPU memory
  • Transform to a 2D array
  • Can easily bind to Texture Memory
  • Texture fetches are cached
  • Aho-Corasick exhibits strong locality of
    reference
  • Random access memory read
  • Using Texture Memory improves GPU execution
    time by about 19%
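
As a rough illustration of the 2D-array representation, the Python sketch below flattens an Aho-Corasick automaton into a dense table with one row per state and one column per byte value, so that each input byte costs a single lookup. It runs on the CPU only and says nothing about how Gnort lays the table out in texture memory; `build_dfa_table` and `scan` are hypothetical names.

```python
from collections import deque

ALPHABET = 256  # one column per possible byte value

def build_dfa_table(patterns):
    """Return (table, matches) where table[state][byte] is the next state."""
    table = [[-1] * ALPHABET]   # -1 marks a transition not yet defined
    matches = [[]]              # matches[state] -> patterns ending at state

    # Insert every pattern (a bytes object) into a trie stored in the table.
    for pat in patterns:
        state = 0
        for b in pat:
            if table[state][b] == -1:
                table.append([-1] * ALPHABET)
                matches.append([])
                table[state][b] = len(table) - 1
            state = table[state][b]
        matches[state].append(pat)

    # Breadth-first pass: resolve failure links into explicit transitions,
    # so the finished table needs no pointer chasing at scan time.
    fail = [0] * len(table)
    queue = deque()
    for b in range(ALPHABET):
        if table[0][b] == -1:
            table[0][b] = 0           # unmatched bytes stay at the root
        else:
            queue.append(table[0][b])
    while queue:
        state = queue.popleft()
        matches[state] = matches[state] + matches[fail[state]]
        for b in range(ALPHABET):
            nxt = table[state][b]
            if nxt == -1:
                table[state][b] = table[fail[state]][b]
            else:
                fail[nxt] = table[fail[state]][b]
                queue.append(nxt)
    return table, matches

def scan(payload, table, matches):
    """Scan a bytes payload with one table lookup per input byte."""
    state, found = 0, []
    for i, b in enumerate(payload):
        state = table[state][b]
        for pat in matches[state]:
            found.append((i - len(pat) + 1, pat))
    return found

table, matches = build_dfa_table([b"he", b"she", b"his", b"hers"])
print(len(table), "states x", ALPHABET, "columns")
print(scan(b"ushers", table, matches))
```

The dense table trades memory for simplicity: every row has the same width, so the whole automaton is one rectangular array indexed by (state, byte).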

8
Parallelizing Packet Searching (1/2)
  • Assigning a Single Packet to each Multiprocessor
  • Each packet is copied to the shared memory of the
    Multiprocessor
  • Stream Processors search different parts of the
    packet concurrently
  • Overlapping computation
  • Matching patterns may span consecutive chunks of
    the packet
  • Same amount of work per Stream Processor
  • Stream Processors will be synchronized
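
The overlap between chunks can be illustrated with a small Python sketch. It is a sequential stand-in for the parallel search and makes two assumptions that are not from the slides: each chunk is extended by the longest pattern length minus one byte, and `bytes.find` replaces the Aho-Corasick scan for brevity. The function name `search_chunked` is hypothetical.

```python
def search_chunked(payload, patterns, num_chunks):
    """Search a payload chunk by chunk, as if each chunk had its own worker.

    Each chunk is extended past its end by (longest pattern - 1) bytes so
    that a pattern spanning two consecutive chunks is still found.
    """
    if not payload or not patterns:
        return []
    chunk_size = -(-len(payload) // num_chunks)      # ceiling division
    overlap = max(len(p) for p in patterns) - 1
    found = []
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size + overlap]
        for pat in patterns:
            pos = chunk.find(pat)
            # Report only matches that begin inside this chunk's own region;
            # matches beginning in the overlap belong to the next chunk.
            while pos != -1 and pos < chunk_size:
                found.append((start + pos, pat))
                pos = chunk.find(pat, pos + 1)
    return sorted(found)

packet = b"xxattackxx"
# With 5 chunks of 2 bytes each, "attack" spans several chunks.
print(search_chunked(packet, [b"attack"], num_chunks=5))  # [(2, b'attack')]
```

The overlapping bytes are scanned twice, which is the redundant computation the slide refers to; in exchange, every chunk is roughly the same size.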

9
Parallelizing Packet Searching (2/2)
  • Assigning a Single Packet to each Stream Processor
  • Each packet is processed by a different Stream
    Processor
  • No overlapping computation
  • Different amount of work per Stream Processor
  • Stream processors of the same Multiprocessor will
    have to wait until all have finished
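
The waiting cost of this scheme can be estimated with a toy model in Python. It assumes, purely for illustration, that scan time is proportional to packet length and that a group of processors finishes only when its longest packet is done; `group_utilization` is a hypothetical helper, not something measured in the talk.

```python
def group_utilization(packet_lengths):
    """Fraction of processor time spent scanning rather than waiting.

    Toy model: one packet per processor, cost proportional to packet
    length, and the whole group waits for the longest packet.
    """
    if not packet_lengths:
        return 0.0
    longest = max(packet_lengths)
    if longest == 0:
        return 0.0
    busy = sum(packet_lengths)                 # useful work, in bytes scanned
    total = longest * len(packet_lengths)      # time slots the group occupies
    return busy / total

print(group_utilization([1500] * 8))           # equal-sized packets: 1.0
print(group_utilization([64, 64, 64, 1500]))   # one large packet dominates
```

Under this model, batches of similar-sized packets keep the processors busy, while a mix of small and full-sized packets leaves most of them idle for part of the time.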

10
Software Mapping
  • Packets are transferred to the GPU in batches
  • Performs much better than making each transfer
    separately
  • Packets are stored in a buffer that is copied to
    the GPU when it gets full
  • Use page-locked memory to store the packets
  • Higher transfer throughput from host to device
  • Copies are performed using DMA, without occupying
    the CPU
  • CPU and GPU execution can overlap
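
The batching scheme can be sketched in Python as a buffer that collects packets and hands them over in one transfer once it is full. The class `PacketBatcher` and its `transfer` callback are hypothetical; the callback stands in for the host-to-device copy, and page-locked memory and DMA are outside what this sketch models.

```python
class PacketBatcher:
    """Collect packets in one buffer and transfer them as a single batch."""

    def __init__(self, capacity, transfer):
        self.capacity = capacity    # buffer size in bytes
        self.transfer = transfer    # stand-in for the copy to the GPU
        self.buffer = bytearray()
        self.offsets = []           # (start, length) of each packet

    def add(self, packet):
        if len(packet) > self.capacity:
            raise ValueError("packet larger than batch buffer")
        # Flush first if this packet would not fit in the remaining space.
        if len(self.buffer) + len(packet) > self.capacity:
            self.flush()
        self.offsets.append((len(self.buffer), len(packet)))
        self.buffer += packet

    def flush(self):
        """Hand the whole batch to the transfer callback, then reset."""
        if self.offsets:
            self.transfer(bytes(self.buffer), list(self.offsets))
            self.buffer.clear()
            self.offsets.clear()

sent = []
batcher = PacketBatcher(capacity=8, transfer=lambda buf, offs: sent.append((buf, offs)))
for pkt in (b"aaaa", b"bbb", b"cc"):
    batcher.add(pkt)
batcher.flush()   # push out the final, partially filled batch
print(sent)
```

Keeping the per-packet offsets alongside the buffer lets the receiving side locate each packet inside the batch without any per-packet transfer.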

11
Evaluation (1/2)
  • Scalability as a function of the number of
    patterns
  • We ran Snort using randomly generated patterns
  • All patterns are matched against every packet
  • Payload trace contained 800-byte UDP packets with
    random payloads
  • Throughput remains constant as the number of
    patterns increases
  • 2.4x faster than the CPU

12
Evaluation (2/2)
  • Throughput as a function of packet size
  • Ran Snort using 1000 random patterns
  • All patterns are matched against every packet
  • 2.3 Gbit/s for full packets
  • 3.2x faster compared to the CPU
  • The two GPU implementations show no
    significant difference in performance

13
Evaluation with real input and rules
  • Experimental setup
  • Two PCs connected via a 1 Gbit/s Ethernet switch
  • To compare directly with prior work (Jacob et
    al.), we re-implemented the Knuth-Morris-Pratt
    (KMP) and Boyer-Moore (BM) algorithms on the GPU.

14
Evaluation with real input and rules
  • Snort loaded about 8000 patterns.
  • Preprocessors and PCRE were disabled
  • Original Snort (AC) cannot process all packets at
    rates higher than 300 Mbit/s
  • GPU-assisted Snort (AC1, AC2) begins to lose
    packets at 600 Mbit/s
  • 200% improvement
  • The KMP and BM algorithms from Jacob et al.
    perform worse in all cases

15
Conclusion
  • Graphics cards can be used effectively to speed
    up Network Intrusion Detection Systems.
  • Low-cost
  • Easy programming
  • Future work includes
  • Transfer the packets directly from the NIC to the
    GPU
  • Utilize multiple GPUs on multi-slot motherboards

16
Thank you
  • Any questions?
  • gvasil_at_ics.forth.gr