The Dark Side of the Web: An Open Proxy's View




1
The Dark Side of the Web: An Open Proxy's View
  • Vivek S. Pai, Limin Wang, KyoungSoo Park, Ruoming
    Pang, Larry Peterson
  • Princeton University

2
Origins: Surviving Heavy Loads
  • Surviving flash crowds, DDoS attacks
  • Absorb via massive resources
  • Raise the bar for attacks
  • Tolerate smaller crowds
  • Survive larger attacks
  • Existing approach
  • Content Distribution Networks

3
Building an Academic CDN
  • Flash crowds are real
  • We have the technology
  • OSDI '02 paper on CDN performance
  • USITS '03 proxy API
  • PlanetLab provides the resources
  • Continuous service, decentralized control
  • Seeing real traffic, reliability, etc.
  • We use it ourselves
  • Open access means more traffic

4
How Does CoDeeN Work?
  • Server surrogates (proxies) on most North
    American sites
  • Originally everywhere, but we cut back
  • Clients specify proxy to use
  • Cache hits served locally
  • Cache misses forwarded to CoDeeN nodes
  • Maybe forwarded to origin servers
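
The request path on this slide can be sketched in a few lines. This is only an illustrative model, not CoDeeN's implementation: the `Node` class, the `pick_peer` hash-based choice of peer, and the `fetch_origin` callback are all assumptions introduced for the example, and the slides do not say how a peer is actually selected.

```python
import hashlib


class Node:
    """A hypothetical proxy node with an in-memory cache (URL -> body)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}


def pick_peer(url, nodes):
    # Assumption: a plain hash of the URL picks the peer. The slides do not
    # describe the real selection scheme; this is a stand-in.
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]


def handle_request(url, local, nodes, fetch_origin):
    """Serve a request at the proxy the client chose (`local`)."""
    # Cache hits are served locally.
    if url in local.cache:
        return local.cache[url], "local hit"
    # Cache misses are forwarded to another node...
    peer = pick_peer(url, nodes)
    if url in peer.cache:
        body, source = peer.cache[url], "peer hit"
    else:
        # ...and only then, maybe, to the origin server.
        body = fetch_origin(url)
        peer.cache[url] = body
        source = "origin"
    local.cache[url] = body  # assumption: the local proxy also caches the reply
    return body, source
```

With three nodes, the first request for a URL reaches the origin, a repeat at the same proxy is a local hit, and a request at a different proxy is answered from the peer's cache without contacting the origin again.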

5
How Does CoDeeN Work?
[Figure labels: origin, CoDeeN Proxy]
  • Each CoDeeN proxy is a forward proxy, reverse proxy, and redirector

6
Steps For Inviting Trouble
  • Use a popular protocol
  • HTTP
  • Emulate a popular tool/interface
  • Web proxy servers
  • Allow open access
  • With HTTP's lack of accountability
  • Be more attractive than competition
  • Uptime, bandwidth, anonymity

7
Hello, Trouble!
  • Spammers
  • Bandwidth hogs
  • High request rates
  • Content thieves
  • Worrisome anonymity
  • Commonality: using CoDeeN to do things they would
    not do directly

8
The Root of All Trouble
[Figure: (Malicious) Client, CoDeeN Proxy, and origin, linked by http/tcp connections]

9
Spammers
  • SMTP (port 25) tunnels via CONNECT
  • Relay via open mail server
  • POST forms (formmail scripts)
  • Exploit website scripts
  • IRC channels (port 6667) via CONNECT
  • Captive audience, high port
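
To make the CONNECT abuse concrete, here is a minimal sketch of how a proxy could inspect the target port of a CONNECT request before opening a tunnel. It is not taken from the talk: the allow-list `ALLOWED_CONNECT_PORTS` and both function names are assumptions made for the example.

```python
# Assumption: only HTTPS tunnels (port 443) are permitted. The slides do not
# state which ports, if any, CoDeeN allowed for CONNECT.
ALLOWED_CONNECT_PORTS = {443}


def parse_connect_target(request_line):
    """Return (host, port) for a CONNECT request line, or None otherwise."""
    parts = request_line.split()
    if len(parts) != 3 or parts[0] != "CONNECT":
        return None
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdecimal():
        return None
    return host, int(port)


def allow_connect(request_line):
    """Permit a tunnel only when the target port is on the allow-list."""
    target = parse_connect_target(request_line)
    return target is not None and target[1] in ALLOWED_CONNECT_PORTS
```

Under this policy, `CONNECT mail.example.com:25 HTTP/1.1` (SMTP) and `CONNECT irc.example.net:6667 HTTP/1.1` (IRC) are both refused, while a tunnel to port 443 is allowed.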

10
Attempted SMTP Tunnels/Day
[Figure: attempted SMTP tunnels per day]

11
Bandwidth Hogs
  • Webcam trackers
  • Mass downloads of paid cam sites
  • Cross-Pacific traffic
  • Simultaneous large file downloads
  • Steganographers
  • Large files, small images
  • All uniform sizes

12
High Request Rates
  • Password crackers
  • Attacking random Yahoo! accounts
  • Google crawlers
  • Dictionary crawls baffle Googlians
  • Click counters
  • Defeat ad-supported game

13
Content Theft
  • Licensed content theft
  • Journals and databases are expensive
  • Intra-domain access
  • Protected pages within the hosting site

14
Worrisome Anonymity
  • Request spreaders
  • Use CoDeeN as a DDoS platform!
  • TCP over HTTP
  • Non-HTTP Port 80
  • Access logging insufficient
  • Vulnerability testing
  • Low rate, triggers IDS

15
Goals, Real & Otherwise
  • Desired: allow only safe accesses
  • Ideally
  • An oracle tells you what's safe
  • Your users are not impacted
  • Open proxies considered inherently bad
  • NLANR requires accounts, proxy-auth
  • JANET closed to outsiders
  • No research in partially open proxies

16
Privilege Separation
[Figure labels: Remote Proxy, Local Proxy, Local Server]

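
Only the diagram labels survive for this slide, so the following is one possible reading rather than the authors' design: treat clients inside the proxy's own site as privileged, and keep remote clients away from servers inside that site (the "intra-domain access" problem from the Content Theft slide). The prefix in `SITE_NETWORKS` and the function names are hypothetical.

```python
import ipaddress

# Hypothetical address range of the site hosting this proxy (documentation prefix).
SITE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_local(address):
    """True when the address belongs to the proxy's own site."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in SITE_NETWORKS)


def may_fetch(client_ip, server_ip):
    """Local clients are unrestricted; remote clients may not reach local servers."""
    if is_local(client_ip):
        return True
    return not is_local(server_ip)
```

For example, `may_fetch("198.51.100.7", "192.0.2.20")` is refused because a remote client is asking for a server inside the site, while the same client may still fetch from an outside server.
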
17
Rate Limiting
  • 3 scales capture burstiness: minute, hour, day
  • Exceptions
  • Login attempts
  • Vulnerability tests
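
A minimal sketch of rate limiting at three time scales follows. The per-window limits are invented for illustration (the slides give no thresholds), and the class and method names are assumptions. A request is admitted only if the client is under its limit in every window, so a short burst is caught by the minute window and a slow, sustained stream by the day window.

```python
from collections import defaultdict, deque

# Hypothetical limits: window length in seconds -> allowed requests per window.
DEFAULT_LIMITS = {60: 100, 3600: 2000, 86400: 20000}


class MultiScaleRateLimiter:
    def __init__(self, limits=None):
        self.limits = dict(DEFAULT_LIMITS if limits is None else limits)
        self.longest = max(self.limits)
        self.history = defaultdict(deque)  # client -> timestamps of admitted requests

    def allow(self, client, now):
        """Admit the request at time `now` (seconds) if every window has room."""
        times = self.history[client]
        # Forget requests that are older than the longest window.
        while times and now - times[0] >= self.longest:
            times.popleft()
        for window, limit in self.limits.items():
            recent = sum(1 for t in times if now - t < window)
            if recent >= limit:
                return False  # rejected requests are not recorded
        times.append(now)
        return True
```

The linear scan per window keeps the sketch short; a production limiter would more likely keep per-window counters. The exceptions listed on the slide (login attempts, vulnerability tests) could be modelled as separate limiter instances with tighter limits, though that is also an assumption.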

18
Other Techniques
  • Limiting methods: GET, (HEAD)
  • Local users not restricted
  • Sanity checking on requests
  • Browsers, machines very different
  • Modifying request stream
  • Most promising future direction
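
The first two techniques can be sketched as a single request check. This is an illustration under stated assumptions, not the deployed filter: the function name, the return shape, and the specific sanity test (an HTTP/1.1 request must carry a Host header, which the HTTP specification requires) are choices made for the example.

```python
# Methods permitted to remote clients, per the slide: GET and (HEAD).
REMOTE_METHODS = {"GET", "HEAD"}


def check_request(request_line, headers, client_is_local):
    """Return (allowed, reason) for one request.

    `headers` is a dict of header name -> value; `client_is_local` is decided
    elsewhere (for example by comparing the client address to the site prefix).
    """
    parts = request_line.split()
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return False, "malformed request line"
    method, _target, version = parts
    # Local users are not restricted to the short method list.
    if not client_is_local and method not in REMOTE_METHODS:
        return False, "method not allowed for remote clients"
    # Hypothetical sanity check: real browsers always send Host with HTTP/1.1.
    if version == "HTTP/1.1" and "host" not in {name.lower() for name in headers}:
        return False, "missing Host header"
    return True, "ok"
```

A remote `POST` is rejected here, which also closes the formmail route mentioned on the Spammers slide, while the same request from a local client passes.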

19
By The Numbers
  • Running 24/7 since May, 40 nodes
  • Over 400,000 unique IPs as clients
  • Over 150 million requests serviced
  • Valid rates up to 50K reqs/hour
  • Roughly 4 million reqs/day aggregate
  • About 4 real abuse incidents
  • Availability: high uptimes, fast upgrades

20
Daily Client Population Count

21
Daily Request Volume

22
Monitors & Other Venues
  • Routinely trigger open proxy alerts
  • Educating sysadmins, others
  • Really good honeypots
  • 6000 SMTP flows/minute at CMU
  • Spammers do 1M HTTP ops/day
  • Early problem detection
  • Failing PlanetLab nodes
  • Compromised university machines

23
Lessons & Directions
  • Few substitutes for reality
  • Non-dedicated hardware really interesting
  • Failure modes not present in NS-2
  • Stopgap measures pretty effective
  • Very slow arms race
  • Breathing time for better solutions
  • Next: more complex techniques
  • Machine learning, high-dim clustering

24
More Info
  • http://codeen.cs.princeton.edu
  • Thanks
  • Intel, HP, iMimic, PlanetLab Central