Securing Web Service by Automatic Robot Detection

1
Securing Web Service by Automatic Robot Detection
  • KyoungSoo Park, Vivek S. Pai
  • Princeton University
  • Kang-Won Lee, Seraphin Calo
  • IBM T.J. Watson Research Center

2
Web Robots
  • Automatic agents
  • Web crawlers
  • URL link checkers
  • Malicious robots are widespread
  • Password cracking
  • Referrer/Blog spamming
  • Click frauds on Google search
  • Burning CPU with heavy CGI queries

3
Contributions
  • Real-time robot detector
  • Fast detection
  • 80% at 20 reqs, 95% at 57 reqs
  • High accuracy
  • 2.4% max false positive rate
  • Low overhead
  • 200 usec additional delay per page
  • Easy deployment

4
Operational Scenario
  • Server-side
  • Site Webserver
  • Many-to-one
  • Client-side
  • Firewall/Proxies at LAN
  • Many-to-many

[Figure: Clients, MON, Servers]
5
Design Goals
  • Transparency
  • No human intervention
  • Accuracy
  • Minimal false positives
  • Real-time proof
  • Periodic check should be possible
  • Authentication or CAPTCHA not enough
  • Practicality

6
Observation & Intuition
  • Robot behavior
  • Custom program
  • Goal-oriented
  • No embedded objs
  • No index file
  • Follow hidden links
  • No HW events
  • Human behavior
  • Standard browsers
  • Browsing purpose
  • Cascading style sheets
  • Images
  • Never follow hidden links
  • Mouse & keyboard

Humans are easier to detect
7
Browser Detection
  • No standard browser ⇒ robot
  • User-Agent HTTP header?
  • Use behavioral artifacts (dynamic mods)
  • Redundant embedded objects
  • Empty cascading style sheet (CSS)
  • Invisible images (1x1 JPEG) or mute sounds
  • Hidden links
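
The decoy objects listed above can be sketched in a few lines. This is only an illustration under assumptions, not the authors' implementation: the `/decoy/` paths, the `make_decoys` and `classify` helpers, and the three labels are all hypothetical.

```python
import secrets


def make_decoys():
    """Build per-page decoy objects and the HTML that references them."""
    token = secrets.token_hex(8)  # random name so robots cannot hard-code the paths
    decoys = {
        "css": f"/decoy/{token}.css",    # empty style sheet
        "img": f"/decoy/{token}.jpg",    # invisible 1x1 image
        "trap": f"/decoy/{token}.html",  # hidden link a human never sees
    }
    html = (
        f'<link rel="stylesheet" href="{decoys["css"]}">\n'
        f'<img src="{decoys["img"]}" width="1" height="1" alt="">\n'
        f'<a href="{decoys["trap"]}" style="display:none"></a>'
    )
    return html, decoys


def classify(decoys, fetched_paths):
    """Label a session from the set of decoy paths it requested."""
    if decoys["trap"] in fetched_paths:
        return "robot"    # followed a hidden link
    if decoys["css"] in fetched_paths or decoys["img"] in fetched_paths:
        return "browser"  # fetched embedded objects like a standard browser
    return "suspect"      # no browser-like behavior observed yet


html, decoys = make_decoys()
print(classify(decoys, {decoys["css"], decoys["img"]}))
```

A session stays "suspect" until it fetches a decoy, so the check can be repeated on later pages.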

8
Human Activity Detection
  • Human activities ⇒ human
  • Mouse/keyboard event tracking
  • Most robots don't generate HW events
  • Dynamically embed JavaScript code
  • MouseMove triggers the event handler
  • Event handler fetches a fake image
  • Semantically & lexically obfuscated
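
A minimal sketch of the mouse-event probe described above, assuming a server that generates the script per page. The `/probe/` path and the `make_mouse_probe` and `is_human_signal` helpers are hypothetical, and the obfuscation step from the slide is left out.

```python
import secrets


def make_mouse_probe():
    """Return (key, script): JavaScript that fetches a fake image on first mouse move."""
    key = secrets.token_hex(8)  # per-page secret the robot cannot guess in advance
    script = (
        "<script>\n"
        "(function () {\n"
        "  var sent = false;\n"
        "  document.addEventListener('mousemove', function () {\n"
        "    if (sent) return;\n"
        "    sent = true;\n"
        f"    new Image().src = '/probe/{key}.jpg';\n"
        "  });\n"
        "})();\n"
        "</script>"
    )
    return key, script


def is_human_signal(expected_key, requested_path):
    """True when the request is the fake-image fetch for this page's key."""
    return requested_path == f"/probe/{expected_key}.jpg"


key, script = make_mouse_probe()
print(is_human_signal(key, f"/probe/{key}.jpg"))
```

Because the key changes on every page, a robot that blindly requests a fixed image URL does not produce a valid signal.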

9
Test with CoDeeN
  • CoDeeN (http://codeen.cs.princeton.edu/)
  • Pulling-based CDN on PlanetLab over 3 years
  • 25 million reqs from 50K clients/day
  • Malicious robots seeking abuse
  • Results for 1-week measurement
  • But changes now permanent

10
Main Result
CSS fetch: 28.9%
Robots: 71.1%
11
Main Result
Max false positive rate = FP / negatives = 1.9 / 77.7 = 2.4%
CSS fetch: 28.9%

Only 9% passed (optional) CAPTCHA
Only 0.9% followed hidden links
Robots: 71.1%
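
The arithmetic behind the maximum false positive rate can be checked directly. The figures are copied from the slide; reading them as percentages of sessions is an assumption.

```python
# Figures copied from the slide; treating them as percentages of sessions is an assumption.
false_positives = 1.9
negatives = 77.7
css_fetch = 28.9
robots = 71.1

# Maximum false positive rate = FP / negatives, expressed as a percentage.
max_fp_rate = 100 * false_positives / negatives

print(f"max false positive rate: {max_fp_rate:.1f}%")
print(f"CSS fetch + robots: {css_fetch + robots:.1f}%")
```

The ratio comes out at about 2.4%, and the two chart segments add up to 100%.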
12
How Fast Can We Detect?
13
# of CoDeeN Complaints
14
Limitations
  • Defeating browser detection
  • Behave exactly like a standard browser
  • Human activity detection
  • Robots generating mouse/key events
  • Disable JavaScript 4
  • Solution
  • Ensemble techniques

15
Machine Learning (AdaBoost)
Three most effective attributes
  1. RESPONSE CODE 300
  2. REFERRER
  3. UNSEEN REFERRER
Drawbacks
  1. Heavy computation/memory
  2. Pattern may change
  3. Human intervention
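
For readers unfamiliar with AdaBoost, here is a small sketch with threshold stumps over three per-session features. The feature definitions (fractions of requests per session) and the toy data are hypothetical; the slide only names the attributes, and this is not the authors' classifier.

```python
import math


def train_adaboost(X, y, rounds=5):
    """Discrete AdaBoost with one-feature threshold stumps; labels are +1 (robot) or -1 (human)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        best = None
        # Pick the stump (feature, threshold, polarity) with the lowest weighted error.
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    preds = [pol if x[f] >= thr else -pol for x in X]
                    err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, preds)
        err, f, thr, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, pol))
        # Raise the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(a * (pol if x[f] >= thr else -pol) for a, f, thr, pol in model)
    return 1 if score >= 0 else -1


# Hypothetical features: [fraction of 3xx responses, fraction with referrer, fraction unseen referrer]
X = [[0.60, 0.10, 0.90], [0.50, 0.00, 0.80], [0.40, 0.20, 0.70],
     [0.05, 0.90, 0.10], [0.10, 0.80, 0.20], [0.00, 0.95, 0.05]]
y = [1, 1, 1, -1, -1, -1]

model = train_adaboost(X, y)
print([predict(model, x) for x in X])
```

Even this toy version scans every feature and threshold on each round, which hints at the computation drawback the slide lists.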
16
Conclusions
  • Practical robot detection tool
  • Detect humans by
  • Standard browser behavior
  • Human activities
  • Arms Race in the end
  • Turing test
  • Most simple bots screened out
  • Ensemble techniques promising