Title: Experiences with CoralCDN: A Five-Year Operational View
1Experiences with CoralCDN: A Five-Year Operational View
- Michael J. Freedman
- Princeton University
- www.coralcdn.org
2A Cooperative, Self-Organizing CDN
[Figure: Client and Resolver; request steps numbered 1 to 6]
- Goal: To make desired content widely available, regardless of publishers' own resources, by organizing and utilizing any cooperative resources
3 http://example.com/path
http://example.com.nyud.net/path
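The rewrite shown on this slide can be sketched in a few lines of Python. This is only an illustration, not CoralCDN's own code: the function name `coralize` is hypothetical, and the sketch assumes plain URLs without an explicit port or user info, since the slide does not show how those are handled.

```python
from urllib.parse import urlsplit, urlunsplit

CORAL_SUFFIX = ".nyud.net"  # suffix shown on the slide


def coralize(url: str) -> str:
    """Append the CoralCDN suffix to the hostname of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host or host.endswith(CORAL_SUFFIX):
        return url  # nothing to rewrite, or already rewritten
    if parts.port is not None or parts.username is not None:
        # Assumption: ports and user info are out of scope for this sketch.
        raise ValueError("URLs with a port or user info are not handled here")
    return urlunsplit((parts.scheme, host + CORAL_SUFFIX,
                       parts.path, parts.query, parts.fragment))


print(coralize("http://example.com/path"))  # http://example.com.nyud.net/path
```

Calling the function on an already rewritten URL returns it unchanged, so it is safe to apply twice.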
4 Adopted by: Clients, Servers, Third-parties
5Many of you have used CoralCDN
11Daily Request Volume
- 2M clients, 2 TB content, 20K origin domains
- From 300-400 PlanetLab servers
12Based on peer-to-peer DHT
[Figure: CoralCDN, made up of CoralCDN DNS servers, CoralCDN HTTP proxies, and Coral index nodes]
- Weakened consistency algorithms that prevent tree saturation during lookup
- Decentralized clustering for locality and hierarchical lookup
- Cooperative HTTP / DNS that leverages locality
14 [Figure: CoralCDN (DNS servers, HTTP proxies, Coral index nodes) with a Virtualization Layer; Interactions with the External Environment: Clients, Origin Domains]
15- Experiences
  - Naming
  - Fault Tolerance
  - Resource management
- Revisit CoralCDN's design
16Naming
✓ Flexible, open API
✗ Mismatch with domain-based access control policies
17CoralCDN's Platform-as-a-Service API
- Rewrite rules in origin webservers
    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx
    RewriteCond %{QUERY_STRING} !(^|&)coral-no-serve$
    RewriteRule ^(.*)$ http://%{HTTP_HOST}.nyud.net%{REQUEST_URI} [R,L]
18Elastic Provisioning
CoralCDN's Platform-as-a-Service API
- Rewrite rules in origin webservers
    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} !^CoralWebPrx
    RewriteCond %{QUERY_STRING} !(^|&)coral-no-serve$
    RewriteCond %{HTTP_REFERER} slashdot\.org [NC,OR]
    RewriteCond %{HTTP_REFERER} digg\.com [NC,OR]
    RewriteCond %{HTTP_REFERER} blogspot\.com [NC]
    RewriteRule ^(.*)$ http://%{HTTP_HOST}.nyud.net%{REQUEST_URI} [R,L]
- Sites integrate with load/bandwidth monitoring
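The decision encoded by the rewrite rules above can be restated as a small Python predicate, which may make the AND/OR structure easier to follow. This is a sketch for illustration only: the function name `redirect_target` and its parameters are hypothetical, and query-string forwarding is left out.

```python
import re
from typing import Optional

# Referers that trigger redirection in the slide's example (case-insensitive).
REFERER_PATTERN = re.compile(r"slashdot\.org|digg\.com|blogspot\.com", re.IGNORECASE)
# Opt-out marker in the query string, as in the slide's rules.
NO_SERVE_PATTERN = re.compile(r"(^|&)coral-no-serve$")


def redirect_target(host: str, request_uri: str, user_agent: str = "",
                    query_string: str = "", referer: str = "") -> Optional[str]:
    """Return the CoralCDN URL to redirect to, or None to serve locally."""
    if user_agent.startswith("CoralWebPrx"):
        return None  # request comes from a CoralCDN proxy: avoid a redirect loop
    if NO_SERVE_PATTERN.search(query_string):
        return None  # request explicitly opted out of CoralCDN
    if not REFERER_PATTERN.search(referer):
        return None  # only redirect traffic arriving from the listed sites
    return f"http://{host}.nyud.net{request_uri}"


print(redirect_target("example.com", "/path", referer="http://slashdot.org/story"))
```

The first two conditions are ANDed, while the three referer conditions form a single OR group, which is what the `[NC,OR]` flags express in the Apache configuration.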
19Naming Conflation
http://domain.service1.service2/path
- ✓ Location to retrieve content
- ✗ Human-readable name for administrative entity
- ✗ Security policies to govern objects' interactions
20Domain-based Security Policies
[Figure: Web page; evil.com and target.com; Cookies; Document Object Model]
21Domain-based Security Policies
[Figure: Web page; evil.com.nyud.net and target.com.nyud.net; Cookies; Document Object Model]
Defaults violate least privilege
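To see why a shared suffix matters for domain-based policies, consider cookie scoping. The sketch below is a simplified version of the domain-match test from RFC 6265 (it ignores IP addresses and public-suffix checks). The cookie scoped to `.nyud.net` is a hypothetical example, not something taken from the slides.

```python
def domain_matches(host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain match: equal, or host is a subdomain."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


# Hypothetical cookie scoped to the shared CoralCDN suffix:
for host in ("evil.com.nyud.net", "target.com.nyud.net"):
    print(host, domain_matches(host, ".nyud.net"))  # True for both hosts

# Without the shared suffix, the two sites do not match each other:
print(domain_matches("evil.com", "target.com"))  # False
```

Once both origins sit under the same suffix, a cookie scoped to that suffix is visible to pages from either one.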
22Fault Tolerance: Failure Decoupling
✓ Internal failures
- DHT nodes
- DNS servers, HTTP proxies
- Management service
✗ External failures
- Decouple IPs from hosts
- Interactions with origin sites
- Node failures
- DHT / DNS / HTTP level
- Management layer
- Couldn't decouple IPs from hosts
24 happens!
- Unresponsive -> Cache negative results
- Returns error code -> Serve stale content
- Reply truncated -> Use whole-file overwrites
Maintain status quo unless improvements are possible
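One way to read "maintain status quo unless improvements are possible" is as a cache-update rule: a cached copy is replaced only by a complete, successful reply. The sketch below illustrates that reading. All names (`Entry`, `OriginReply`, `update_cache`) and the TTL values are hypothetical and not taken from CoralCDN.

```python
from dataclasses import dataclass
from typing import Dict, Optional

NEGATIVE_TTL = 60.0  # seconds; hypothetical value for cached negative results


@dataclass
class Entry:
    body: Optional[bytes]  # None marks a cached negative result
    expires: float


@dataclass
class OriginReply:
    status: Optional[int]  # None means the origin was unresponsive
    body: bytes = b""
    content_length: Optional[int] = None


def update_cache(cache: Dict[str, Entry], url: str, reply: OriginReply,
                 now: float, ttl: float = 300.0) -> Optional[bytes]:
    """Apply an origin reply to the cache and return the body to serve, if any."""
    old = cache.get(url)
    complete = reply.status == 200 and (
        reply.content_length is None or len(reply.body) == reply.content_length)
    if complete:
        # Whole-file overwrite: replace only once the full reply has arrived.
        cache[url] = Entry(reply.body, now + ttl)
        return reply.body
    if old is not None and old.body is not None:
        # Error code, truncation, or no reply: keep serving the stale copy.
        return old.body
    # Nothing better is available: remember the failure for a short while.
    cache[url] = Entry(None, now + NEGATIVE_TTL)
    return None
```

With this rule, an origin that starts returning errors or truncated replies never makes the cached state worse than it already was.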
25What is failure?
Return values should have fail-safe defaults
26Resource Management
✓ Control over bandwidth consumption
✗ Control and visibility into environment's resources
29Some timeline
Mar 2004: CoralCDN released on PlanetLab
Aug 2004: Slashdotted
Dec 2004: Asian Tsunami
- PlanetLab traffic jumps
- Site threatens to yank PL
- PL admin kills slice
- Slice restored next day
- Initiates discussion of resource limits for slices
30Demand >> Supply: Enter Fair-Sharing Algorithms
$\sum_i d_i \gg S$
[Figure: Avg MB per hour ($d_i$) for the domains with heaviest consumption]
31Demand >> Supply: Enter Fair-Sharing Algorithms
find max $\lambda$, s.t. $\sum_i \min(\lambda, d_i) \le S$
[Figure: Avg MB per hour ($d_i$) for the domains with heaviest consumption, with threshold $\lambda$]
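The constraint above asks for the largest per-domain cap such that the capped demands still fit within the supply. A minimal sketch of one standard way to compute such a cap (water-filling over the sorted demands) follows. The function name and the example numbers are hypothetical, and this is not necessarily how CoralCDN computes it.

```python
from typing import Sequence


def fair_share_threshold(demands: Sequence[float], supply: float) -> float:
    """Largest cap lam such that sum(min(lam, d) for d in demands) <= supply.

    Returns infinity when total demand already fits within the supply.
    Assumes non-negative demands.
    """
    if sum(demands) <= supply:
        return float("inf")
    remaining = supply
    ordered = sorted(demands)
    n = len(ordered)
    for k, d in enumerate(ordered):
        share = remaining / (n - k)  # equal split among domains not yet satisfied
        if d > share:
            return share  # every remaining domain is capped at this level
        remaining -= d  # small domain is fully served; redistribute the rest
    return ordered[-1]  # not reached when sum(demands) > supply


demands = [1.0, 2.0, 10.0, 20.0]  # hypothetical per-domain demand, e.g. MB per hour
lam = fair_share_threshold(demands, supply=13.0)
print(lam)                                # 5.0
print(sum(min(lam, d) for d in demands))  # 13.0
```

Domains whose demand is below the cap are served in full, and only the heaviest consumers are limited.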
33Admission Control under Fair-Sharing
- 10 kB imgs: 3.3% rejected
- 5 MB videos: 89% rejected
- Demand > 10 TB, Supply 2 TB
34Some timeline
Mar 2004: CoralCDN released on PlanetLab
Aug 2004: Slashdotted
Dec 2004: Asian Tsunami
Mar 2006: PL deploys bandwidth throttling
36Resource Management: Us vs. Them
Us (CoralCDN):
- Track HTTP traffic
- If site > fair share rate, reject via HTTP 403
- If total > peak rate, close server socket
Them (PlanetLab):
- Track all network traffic
- If total > 80% of daily rate, BW shaping in kernel
Result: HTTP traffic is 1/2 - 2/3 of all traffic
Lower layers should expose greater visibility and control over resources
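The application-level checks listed for CoralCDN can be summarized as a small decision function. This is a sketch under assumptions: the names are hypothetical, and the order in which the two checks are applied is a guess, since the slide does not specify it.

```python
from enum import Enum


class Decision(Enum):
    SERVE = "serve"
    REJECT_403 = "reject with HTTP 403"
    CLOSE_SOCKET = "close server socket"


def admit(site_rate: float, fair_share_rate: float,
          total_rate: float, peak_rate: float) -> Decision:
    """Decide how to treat a request, given current HTTP traffic rates."""
    if total_rate > peak_rate:
        return Decision.CLOSE_SOCKET  # node as a whole is over its peak rate
    if site_rate > fair_share_rate:
        return Decision.REJECT_403    # this origin site exceeds its fair share
    return Decision.SERVE


print(admit(site_rate=8.0, fair_share_rate=5.0, total_rate=40.0, peak_rate=100.0))
```

Note that both checks see only HTTP traffic, which is the visibility gap the slide points out.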
41Usage Scenarios
- Resurrecting old content
- Accessing unpopular content
- Serving long-term popular content
Top URLs (%)   Reqs (%)   Agg Size (MB)
0.01           49.1       14
0.10           71.8       157
1.00           84.8       3744
10.00          92.2       28734

Result           Frequency (%)
Local Cache      70.4
Origin Site      9.9
CoralCDN Proxy   7.1
4xx/5xx Error    12.6
44Usage Scenarios
- Resurrecting old content
- Accessing unpopular content
- Serving long-term popular content
- Surviving flash crowds to content
Epochs with 1 domain with the given rate incr:
Epoch length   10x incr   100x incr   1000x incr
5 seconds      24%        0.006%      0%
10 minutes     99.93%     28%         0.21%
46Conclusions?
- Most requested content is long-term popular and already cached locally
- Flash crowds occur, but on order of minutes
- Focus on long-term popular
  - Little / no HTTP cooperation
  - Global discovery (e.g., DNS)
- Focus on flash crowds
  - Regional coop. as default
  - Global coop. as failover
47Reconfiguring CoralCDN's design
- Leverage Coral hierarchy for lookup
Latency 90
Origin Load 5
Failover to global 0.5
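Hierarchical lookup with failover can be sketched as trying each cluster level in order and stopping at the first hit. The level names, the `Lookup` type, and the dictionary-backed indexes below are hypothetical stand-ins for Coral's index, used only to illustrate the control flow.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Lookup = Callable[[str], List[str]]  # maps a URL to the proxies caching it


def hierarchical_lookup(url: str, levels: Sequence[Tuple[str, Lookup]]
                        ) -> Tuple[Optional[str], List[str]]:
    """Query levels in order (e.g. regional first, global last)."""
    for name, lookup in levels:
        proxies = lookup(url)
        if proxies:
            return name, proxies  # first level with a hit wins
    return None, []  # no cached copy found: caller goes to the origin site


# Hypothetical indexes for illustration.
regional = {"http://example.com/a": ["proxy-r1"]}
global_index = {"http://example.com/a": ["proxy-g1"],
                "http://example.com/b": ["proxy-g2"]}
levels = [("regional", lambda u: regional.get(u, [])),
          ("global", lambda u: global_index.get(u, []))]

print(hierarchical_lookup("http://example.com/a", levels))  # ('regional', ['proxy-r1'])
print(hierarchical_lookup("http://example.com/b", levels))  # ('global', ['proxy-g2'])
```

The global level is consulted only when the regional level has no answer, which matches the "regional as default, global as failover" idea.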
48Reconfiguring CoralCDN's design
- Leverage Coral hierarchy for lookup
- During admission control, bias against long-term use
$\sum_i \min(\lambda, d_i) < S$
Heavily weight history in EWMA
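An exponentially weighted moving average (EWMA) that leans heavily on history reacts slowly to sudden bursts, so a domain with sustained heavy use accumulates a high estimate while a short flash crowd does not. The sketch below illustrates the effect. The weight 0.875 and the sample values are hypothetical and not taken from CoralCDN.

```python
def ewma_update(previous: float, sample: float, history_weight: float = 0.875) -> float:
    """One EWMA step; a history_weight near 1 favors past consumption."""
    return history_weight * previous + (1.0 - history_weight) * sample


# A domain that has consumed 100 MB per epoch for a long time keeps a high estimate:
print(ewma_update(100.0, 100.0))  # 100.0

# A previously idle domain that suddenly sees 100 MB in one epoch registers far less:
print(ewma_update(0.0, 100.0))    # 12.5
```

If such estimates feed the fair-share computation, the long-term heavy domain is more likely to hit the cap than the domain experiencing a brief spike.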
49Conclusions
- Experiences
  - Naming
  - Fault Tolerance
  - Resource management
- Revisit CoralCDN's design
  - Current design unnecessary for deployment / most use
  - Easy changes to promote flash-crowd mitigation
50Can we reach Internet scale?
- www.firecoral.net
- Initial beta release of browser-based P2P web cache