Use of Measurements in Anomaly Detection - PowerPoint PPT Presentation

About This Presentation

Title:

Use of Measurements in Anomaly Detection

Description:

Use of Measurements in Anomaly Detection. CS 8803: Network Measurements Seminar. Instructor: Constantinos Dovrolis. Fall 2003. Presenter: Bugra Gedik

Learn more at: https://faculty.cc.gatech.edu

Transcript and Presenter's Notes

Title: Use of Measurements in Anomaly Detection

1
Use of Measurements in Anomaly Detection

CS 8803 Network Measurements Seminar
Instructor Constantinos Dovrolis
Fall 2003
Presenter Bugra Gedik

2
Outline

We'll be discussing 3 papers
Topic Detail: Inferring DoS Activity
Paper: D. Moore, G. M. Voelker, and S. Savage.
Inferring Internet Denial-of-Service Activity. In
Proceedings of the USENIX Annual Technical
Conference (USENIX 2001).
Topic Detail: Code-Red Worm
Paper: D. Moore, C. Shannon, and J. Brown.
Code-Red: A Case Study on the Spread and Victims
of an Internet Worm. In Proceedings of the ACM
Internet Measurement Workshop (IMW 2002).
Topic Detail: DoS Attacks and Flash Crowds
Paper: J. Jung, B. Krishnamurthy, and M.
Rabinovich. Flash Crowds and Denial of Service
Attacks: Characterization and Implications for
CDNs and Web Sites. In Proceedings of the
International World Wide Web Conference (WWW
2002).

Inferring Internet Denial-of-Service Activity
David Moore
Geoffrey M. Voelker
Stefan Savage
In Proceedings of the USENIX Annual Technical
Conference (USENIX 2001).

4
Problem Statement & Solution Overview

Problem
How prevalent are denial-of-service attacks in
the Internet today?
This paper considers only flooding-type attacks
Technique
Use backscatter analysis for estimating the
worldwide prevalence of DoS attacks

5
Backscatter Analysis
6
Some Limiting Assumptions

Address uniformity: Attackers spoof source
addresses at random.
Reliable delivery: Attack traffic is delivered
reliably to the victim and backscatter is
delivered reliably to the monitor.
Backscatter hypothesis: Unsolicited packets
observed by the monitor represent backscatter.

7
Address uniformity

May not hold because:
Some ISPs employ ingress filtering; as a result,
the attacker may be forced to restrict its
spoofed address space
Reflector attacks: a different kind of flooding
attack that is not captured by backscatter analysis,
e.g. Smurf or Fraggle attacks

The main motivation of the assumption:
Many direct DoS attack tools use random address
spoofing, e.g. Shaft, TFN, TFN2k, trinoo,
Stacheldraht, mstream, Trinity
It is possible to use tests such as the
Anderson-Darling A2 test to check uniformity

8
Reliable delivery

May not hold because:
During the attack, packets may be dropped due to
congestion
An IDS may filter the packets
Some types of attacks may not produce
backscatter
Many attacks generate backscatter:
Most types of flooding attacks do generate a
response

9
Backscatter hypothesis

May not hold because:
Any host on the Internet can send unsolicited
packets to the monitored network
Motivation of the assumption:
Packets that are consistently targeted to a
specific address in the monitored network can be
filtered easily
Although a concerted effort by a third party can
bias the results, this is quite unlikely

10
Extrapolating Backscatter Analysis Results

Let n be the number of monitored IP addresses
and consider an attack with m packets
Then the expected number of backscatter packets
observed from the attack is
E(X) = nm / 2^32
Similarly, if the observed rate of an attack is
R', then a lower bound on the real rate R is
R > R' * 2^32 / n
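
The extrapolation on this slide can be tried out with a few lines of Python. This is only a sketch of the arithmetic: the function and constant names are made up for illustration, and it assumes the address uniformity and reliable delivery assumptions hold.

```python
ADDRESS_SPACE = 2 ** 32  # size of the IPv4 address space


def expected_backscatter(n_monitored: int, m_packets: int) -> float:
    """Expected number of backscatter packets seen by the monitor: n * m / 2^32."""
    return n_monitored * m_packets / ADDRESS_SPACE


def extrapolated_rate(observed_rate: float, n_monitored: int) -> float:
    """Scale the rate observed at the monitor up to an estimate for the whole attack."""
    return observed_rate * ADDRESS_SPACE / n_monitored
```

For a /8 monitor (n = 2^24), an attack of 1,000,000 packets is expected to leave about 3,906 backscatter packets at the monitor, and an observed rate of 2 packets per second extrapolates to 512 packets per second.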

11
Attack Classification

Two types of classification are done
Flow-based classification
Used to classify individual attacks
Answering the questions:
how many
how long
what kind
Event-based classification
Used to analyze the severity of attacks on short
time scales

12
Flow-based classification

A flow is defined as a series of consecutive
packets sharing the same target (the victim's
address) and the same IP protocol
If no more packets are observed from a flow for 5
minutes, the flow is assumed to have ended
All flows that do not have more than 100 packets
or that last less than 60 seconds are discarded
Flows that are backscattered to only a single IP
address in the monitored range are discarded
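
A rough Python sketch of these flow rules follows. It is not the authors' implementation: the record layout (timestamp, victim, protocol, monitored destination), the names, and the exact boundary handling of the thresholds are assumptions made for illustration.

```python
from dataclasses import dataclass

FLOW_TIMEOUT = 300  # seconds of silence that end a flow (5 minutes)
MIN_PACKETS = 100   # a flow must have more than this many packets
MIN_DURATION = 60   # a flow must last at least this many seconds


@dataclass
class Flow:
    victim: str
    protocol: str
    start: float
    end: float
    packets: int
    monitored_dsts: set


def build_flows(packets):
    """Group time-sorted (timestamp, victim, protocol, monitored_dst) records into flows."""
    open_flows = {}
    finished = []
    for ts, victim, proto, dst in packets:
        key = (victim, proto)
        flow = open_flows.get(key)
        # A gap longer than the timeout closes the previous flow for this key.
        if flow is not None and ts - flow.end > FLOW_TIMEOUT:
            finished.append(flow)
            flow = None
        if flow is None:
            flow = Flow(victim, proto, ts, ts, 0, set())
            open_flows[key] = flow
        flow.end = ts
        flow.packets += 1
        flow.monitored_dsts.add(dst)
    finished.extend(open_flows.values())
    return finished


def keep_flow(flow):
    """Apply the three discard rules from the slide."""
    return (flow.packets > MIN_PACKETS
            and flow.end - flow.start >= MIN_DURATION
            and len(flow.monitored_dsts) > 1)
```

Filtering with `[f for f in build_flows(records) if keep_flow(f)]` leaves only the flows that would be analyzed as attacks under these rules.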

13
Examining the Flows

Determine the type of attack by examining
TCP flag settings
ICMP packets
Look at the distributions of
IP addresses: use the A2 uniformity test to validate
the address uniformity assumption at a significance
level of 0.05
port addresses
Classify the victim by examining
DNS information of the victim
AS level information of the victim from BGP tables
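
As an illustration of the uniformity check, here is a minimal sketch of an A2 (Anderson-Darling) statistic against a uniform distribution. It is an assumption-laden stand-in rather than the paper's exact procedure: the names are hypothetical, the addresses are treated as continuous values, and 2.492 is the standard 5% critical value for a fully specified distribution.

```python
import math

CRITICAL_VALUE_5_PERCENT = 2.492  # A^2 critical value for a fully specified distribution


def anderson_darling_uniform(samples, low, high):
    """A^2 statistic for the hypothesis that samples are uniform on [low, high]."""
    n = len(samples)
    eps = 1e-12
    # Map each sample through the uniform CDF; clamp to avoid log(0).
    u = sorted(min(max((x - low) / (high - low), eps), 1 - eps) for x in samples)
    total = 0.0
    for i in range(1, n + 1):
        total += (2 * i - 1) * (math.log(u[i - 1]) + math.log(1 - u[n - i]))
    return -n - total / n


def looks_uniform(samples, low, high):
    """True if uniformity is not rejected at the 0.05 significance level."""
    return anderson_darling_uniform(samples, low, high) < CRITICAL_VALUE_5_PERCENT
```

Addresses spread evenly over the range pass the check, while addresses clustered in a small part of the range produce a very large statistic and are rejected.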

14
Event-based Classification

An attack event is defined by a victim emitting
at least 10 backscatter packets during a one
minute period
Attacks are not classified based on type; the only
criterion is the victim's IP address
For each minute, the victims that are under
attack and the intensity of each attack are
determined and recorded
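
The event definition can be sketched as a per-minute count per victim. The sketch assumes fixed, non-overlapping one-minute buckets (the slide does not say whether the window slides), and the names and record layout are illustrative.

```python
from collections import Counter

EVENT_THRESHOLD = 10  # backscatter packets per victim per minute


def attack_events(packets):
    """Map (minute_index, victim) to packet count for minutes that reach the threshold.

    packets: iterable of (timestamp_seconds, victim_ip) backscatter records.
    """
    counts = Counter((int(ts // 60), victim) for ts, victim in packets)
    return {key: c for key, c in counts.items() if c >= EVENT_THRESHOLD}
```

The count stored for each (minute, victim) pair serves as the intensity of the attack in that minute.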

15
Experimental Setup

The monitored /8 network represents 1/256 of the
total Internet address space
From February 1st to February 25th, Ethernet traffic
was captured using a hub shared with the ingress
router

16
Summary of Observed Attacks

5000 distinct victim IP addresses in more than
2000 distinct DNS domains

17
Attack/Response Protocols

50% of the attacks generate TCP (RST ACK),
suggesting they are TCP flood attacks destined to
closed ports
15% of the attacks generate ICMP host
unreachable containing a TCP header including the
victim's IP, again suggesting a TCP flood
12% of the attacks generate ICMP (TTL Exceeded)
Strange! These were caused by attacks with a very
high rate, and they correspond to around 50% of
all backscatter packets observed
8% of the attacks generate TCP (SYN ACK),
suggesting SYN floods

18
Attack Rate

Uniform Random Attacks are the ones whose source
IP addresses satisfy the A2 test
500 SYN packets per second are enough to
overwhelm a server (40% of attacks satisfy this)
14,000 SYN packets per second are enough to
overwhelm a server with specialized firewalls
(2.5% of attacks satisfy this)

19
Attack Duration

50% of the attacks are less than 10 minutes
80% of the attacks are less than 30 minutes
90% of the attacks are less than 60 minutes

20
Victim Classification

A significant fraction of attacks target home
machines, either dial-up or broadband
Among home users, cable-modem users have
experienced some intense attacks, with rates going
up to 1,000 packets per second.
A significant number of attacks target IRC servers

21
Victim Classification

No single AS or small set of ASes is a major
target
65% of the victims were attacked once and 18%
twice

22
Validation

98% of the packets attributed to backscatter do
not themselves provoke a response, so they cannot
be packets used to probe the monitored network
98% of the victim IP addresses are also
encountered in other traces extracted from
different datasets collected during the same period

Code-Red: A Case Study on the Spread and Victims
of an Internet Worm
David Moore
Colleen Shannon
Jeffery Brown
In Proceedings of the ACM Internet Measurement
Workshop (IMW 2002)

24
Analysis of the Code-Red Worm

Worms: Self-replicating viruses
Code-Red worm classification
Code-RedI-v1: memory-resident, static seed,
infect/spread/attack
Code-RedI-v2: memory-resident, random seed,
infect/spread/attack
Code-RedII: disk-resident, intelligent,
infect/backdoor/spread
Data Sets
Packet header trace of hosts sending unsolicited
TCP SYN packets to a /8 (class A) network and two
/16 networks, from July 4 to August 21
July 12, 2001 - Code-RedI-v1 set loose
July 19, 2001 - Code-RedI-v2 set loose
August 4, 2001 - Code-RedII set loose
Hosts that have sent at least two unsolicited TCP
SYN packets (on port 80) to the /8 network are
suspected to be infected
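
The infection heuristic in the last bullet is simple enough to sketch directly. The record layout (source IP, destination port) and the function name are assumptions for illustration; the input is taken to contain only unsolicited TCP SYN packets seen in the /8 network.

```python
from collections import Counter


def suspected_infected(syn_packets, min_probes=2, port=80):
    """Return source IPs that sent at least min_probes unsolicited SYNs to the given port.

    syn_packets: iterable of (src_ip, dst_port) for unsolicited TCP SYN packets.
    """
    probes = Counter(src for src, dst_port in syn_packets if dst_port == port)
    return {src for src, count in probes.items() if count >= min_probes}
```

Requiring two probes rather than one filters out sources that hit the monitored range only once, for example through a single stray or misdirected connection attempt.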

25
Code-RedI Worms
Infection Phase: from the beginning to the end of
the 19th of the month
Attack Phase: from the beginning of the 20th to the
end of the month
26
Unsolicited SYN probes, Code-RedI-v1

The trace includes a large number of probes to 23
IP addresses within the monitored /8 network
Using the same static seed, the first 1 million IP
addresses are generated by reverse engineering
the worm code
Those 23 addresses indeed appear in the
generated sequence
3 source addresses in the trace do not belong to
the generated IP addresses, so they must be the
initial hosts infected manually
Atlanta, USA
Cambridge, USA
GuangDong, China
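
The reasoning on this slide relies on a statically seeded generator always producing the same target sequence. The sketch below illustrates that idea only: Python's `random.Random` is a stand-in and is NOT the worm's actual generator, and the function names and integer addresses are hypothetical.

```python
import random


def generated_targets(seed, count):
    """First `count` 32-bit targets from a statically seeded stand-in generator."""
    rng = random.Random(seed)  # same seed, same sequence, on every infected host
    return [rng.getrandbits(32) for _ in range(count)]


def outside_generated_sequence(sources, seed, count):
    """Sources that the generator never targets, so they cannot have been infected by it."""
    reachable = set(generated_targets(seed, count))
    return [s for s in sources if s not in reachable]
```

Any infected source that falls outside the generated sequence must have been infected some other way, which is how the three manually infected hosts are singled out.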

27
Host Infection Rate, Code-Redv2

More than 359,000 unique IP addresses are
infected with the Code-RedI worm within a day
between midnight of July 19 and July 20.

28
Deactivation rate for Code-Redv1

A clear time-of-day effect is seen in the
figure
Many machines are shut down during the night
This is an indication that many home and office
users are affected by the worm
The worm is programmed to switch to its attack
phase on July 20, thus we have a sudden increase
in deactivation rate at midnight

29
Host Classification

Reverse DNS lookups are used to characterize the
hosts
It is clear that a surprisingly large number of
hosts are dial-up and broadband users
Diurnal variations are observed, which suggests
that a majority of the infected hosts are not
production web servers

30
Investigating time of day effect

Find the location of hosts using the IxMapping
(http://www.ipmapper.com) service
Convert UTC time to local time for each host and
plot active hosts as function of time
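
A minimal sketch of the UTC-to-local conversion is given below. It assumes each observation already carries a UTC offset in hours derived from the host's estimated location (that lookup is not shown), that timestamps are timezone-aware UTC datetimes, and that the names are illustrative.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def active_hosts_by_local_hour(observations):
    """Count distinct active hosts per local hour of day (0-23).

    observations: iterable of (host_id, utc_datetime, utc_offset_hours),
    where utc_datetime is timezone-aware and in UTC.
    """
    seen = set()
    for host, utc_time, offset_hours in observations:
        local = utc_time.astimezone(timezone(timedelta(hours=offset_hours)))
        seen.add((local.hour, host))  # each host counts once per local hour
    return Counter(hour for hour, _ in seen)
```

Plotting the resulting counts against the hour of day gives the kind of diurnal curve the slide refers to.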

31
The Effect of DHCP

Between August 2 and August 16, 2 million
infected addresses are observed
However only 143,000 hosts were active in the
most active 10 minute period
This can be attributed to DHCP

DHCP inflates the infected host number
However NAT usage may deflate the number
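
The gap between the two numbers comes from counting addresses over different time spans. A small sketch of both counts follows, assuming fixed 10-minute buckets and (timestamp, source IP) probe records; the names are illustrative.

```python
from collections import defaultdict

WINDOW = 600  # seconds, i.e. 10 minutes


def address_counts(probes):
    """Return (distinct addresses overall, most distinct addresses in any 10-minute window).

    probes: iterable of (timestamp_seconds, src_ip).
    """
    all_addresses = set()
    per_window = defaultdict(set)
    for ts, src in probes:
        all_addresses.add(src)
        per_window[int(ts // WINDOW)].add(src)
    peak = max((len(s) for s in per_window.values()), default=0)
    return len(all_addresses), peak
```

Over a short window a host is unlikely to change its DHCP lease, so the peak count is a better estimate of the number of infected hosts than the total number of addresses seen.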

Flash Crowds and Denial of Service Attacks:
Characterization and Implications for CDNs and
Web Sites
J. Jung
B. Krishnamurthy
M. Rabinovich
In Proceedings of the International World Wide
Web Conference (WWW 2002)

33
Definitions & Problem Statement

Definitions
Flash Event (FE): A FE is a large surge in
traffic to a particular Web site causing dramatic
increase in server load and putting severe strain
on the network links.
Denial of Service Attack (DoS): A DoS is an
explicit attempt by attackers to prevent
legitimate users of a service from using that
service.
Problem
How to differentiate DoS attacks from Flash
Events?
How to improve CDN performance for handling FEs?

34
Some Example DoS Attacks

TCP SYN Attack: spoofed SYN packets
UDP Attacks: connect chargen-echo
Ping of Death: oversized ICMP packets cause crash
Smurf Attack: ping various hosts with the victim's
address
Fraggle and Snork Attacks: echo and WinNT RPC
Flooding Attack: flood network with useless
packets
DDoS Attacks !!!

35
Example Flash Events

Popular Events, like
Elections
Olympics
Catastrophic events, like
Sept. 11
Popular Webcasts
Play-along Web Sites (for TV shows)

36
Dimensions of the Comparison

The comparison between DoS and FE is done along
the following dimensions
Traffic Patterns
Client Characteristics
File Reference Characteristics

37
Flash Events

Datasets Studied
Play-along: Play-along Web site for a popular TV
show
Chile: The Chile Web site that hosted continuously
updated results of the 1999 election

38
Traffic Volume

Request rate grows dramatically during the FE
But the duration of the FE is relatively short

39
Traffic Volume

Request rates increase rapidly during the
initial period of the attack
But the increase is far from instantaneous,
leaving enough room for adaptation

40
Characterizing Clients

Number of clients in a FE is commensurate with
the request rate

41
Characterizing Clients

There is no clear increase in per-client request
rates

42
Old and New clusters

Old clusters: clusters that have been seen before
the FE
New clusters: clusters that have been seen during
the FE but not before
The percentage of old clusters during the FE is
42.7% for Play-along and 82.9% for Chile

A significant proportion of the clusters seen
during the FE are old clusters
Request distribution over clusters is highly
skewed
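
The old-cluster percentage can be sketched as a set intersection over client clusters. As a loud assumption, clients are grouped here by fixed /24 prefix purely for illustration; that grouping, the function names, and the input lists are stand-ins, not the clustering method used in the paper.

```python
import ipaddress


def cluster_of(ip, prefix_len=24):
    """Assign a client IP to a cluster, here simply its fixed-length prefix."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def old_cluster_percentage(before_ips, during_ips, prefix_len=24):
    """Percentage of clusters seen during the event that were also seen before it."""
    before = {cluster_of(ip, prefix_len) for ip in before_ips}
    during = {cluster_of(ip, prefix_len) for ip in during_ips}
    if not during:
        return 0.0
    return 100.0 * len(during & before) / len(during)
```

A high value suggests returning visitors, as in a flash event, while a low value means most traffic comes from clusters never seen before.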

43
File Reference Characteristics

Over 60% of documents are accessed only during
flash events
Less than 10% of documents account for more than
90% of the requests
File reference distribution is highly Zipf-like
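
The skew in file references can be measured as the share of requests that go to the most popular documents. This is a small illustrative sketch; the function name and the input format (one document identifier per request) are assumptions.

```python
from collections import Counter


def top_share(requested_docs, top_fraction=0.10):
    """Fraction of all requests that go to the most popular top_fraction of documents."""
    counts = Counter(requested_docs)
    if not counts:
        return 0.0
    k = max(1, int(len(counts) * top_fraction))  # number of documents in the top group
    top = sum(c for _, c in counts.most_common(k))
    return top / sum(counts.values())
```

A result above 0.9 with the default `top_fraction` would match the pattern reported on this slide.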

44
DoS Attacks

Datasets studied
esg and ol: Log files that recorded more than 1
million requests within 60 days. A password
cracking attack was performed during this period.
bit.nl, creighton, fullnote, rellim,
sptcccxus: Collection of 5 traces that recorded
requests to Web servers from machines infected by
the Code-Red worm.

45
Traffic Volume & Client Characteristics
(Code-Red)

The surge occurred because of new clusters
joining the attack
For traces that contain both infected and
non-infected client requests, less than 14.3% of
the clusters during the attack were old clusters
(even fewer for password cracking)

46
Client Characteristics (Code-Red)

Request rates per client do not change during
the attack
Requests are spread more broadly across clusters
than in a flash event

47
Comparison of FE and DoS
48
Implications to CDNs

How can we handle FEs more effectively using
CDNs?
We have seen that most requests during a FE are
to documents that were not accessed before the FE
This causes a lot of cache misses, which
overload the origin server
One solution is to use cooperative caches, but
this introduces high delays
The authors propose an alternative approach that
does not incur a high delay yet decreases the load
on the origin server