Title: Overlay Networks (with a focus on Content Distribution Networks)
1 Overlay Networks (with a focus on Content Distribution Networks)
- Janardhan R. Iyengar
- CISC 856 TCP/IP and Upper Layer Protocols
- 04/23/2002
2 What is an Overlay?
What is the topology of this network? WHICH network??
3 Overlay Networks Overview
- Networks built using an existing network as substrate
- Also known as Virtual Networks
- Most popular overlay: the Internet, which evolved as an overlay on the POTS (Plain Old Telephone Service) network
- Overlays could consist of routing software installed at selected sites, connected by encapsulation tunnels or direct links
4 Overlay Networks Examples
- MBone, 6Bone, ABone
- RON, VNS
- P2P (Napster, FreeNet, Gnutella)
- Content Networks
  - Cooperating Caches
  - Server Farms
  - Content Distribution Networks (CDNs)
5 Example Overlays (1): MBone
- Semi-permanent testbed to carry IP multicast traffic
- Routing of IP multicast traffic is not commonly integrated and deployed in production routers on the Internet
- Hence, layered on the Internet to support routing of IP multicast packets using tunneling
6 Example Overlays (1): MBone
7 Example Overlays (2): 6Bone
- 6bone is an IPv6 testbed on the Internet
- Intended to eventually subsume the underlying IPv4 network
- IPv4 tunnels used to overlay the 6bone
- ABone is the Active Networks Backbone, for experimentation in Active networking. Uses tunneling
8 Example Overlays (2): 6Bone
9 Other known Overlays
- Resilient Overlay Network (RON): Provides fault tolerance and faster recovery as compared to conventional routing techniques
- Virtual Network Service (VNS): Infrastructure for provisioning QoS within Virtual Private Networks
- Peer to Peer Networks: Infrastructure for distribution and sharing of files (e.g., Napster, Gnutella, Freenet)
- Content Networks
  - Server Farms, Caching Proxies, Content Distribution Networks (CDNs)
- Today, we will try to focus on CDNs
- What are the motivations for Content Networks?
10 Motivations for Content Networks
- More hops between client and Web server => more congestion!
- Same data flowing repeatedly over links between clients and Web server
11 Motivations for Content Networks (contd.)
- Origin server is bottleneck as number of users grows
- Flash Crowds (for instance, Sept. 11)
- The Content Distribution Problem: Arrange a rendezvous between a content source at the origin server (www.cnn.com) and a content sink (us, as users)
12 Example content networks: Server Farms
- Simple solution to the content distribution problem: deploy a large group of servers
- Arbitrate client requests to servers using an intelligent L4-L7 switch
- Pretty widely used today
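The arbitration step can be illustrated with a small sketch. The slide does not say which policy the switch applies, so this assumes a least-connections policy; the class name and the server names (web1, web2, web3) are hypothetical.

```python
class LeastConnectionsSwitch:
    """Toy stand-in for an L4-L7 switch in front of a server farm."""

    def __init__(self, servers):
        # Track the number of active connections per back-end server.
        self.active = {name: 0 for name in servers}

    def dispatch(self):
        # Pick the server with the fewest active connections;
        # ties are broken by server name so the choice is deterministic.
        server = min(self.active, key=lambda s: (self.active[s], s))
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when a client connection to `server` closes.
        self.active[server] -= 1


switch = LeastConnectionsSwitch(["web1", "web2", "web3"])
first_three = [switch.dispatch() for _ in range(3)]
switch.release("web2")
fourth = switch.dispatch()
print(first_three, fourth)
```

Note that the switch only balances load among co-located servers; it has no influence on the network path between client and farm.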
13 Example content networks: Caching Proxies
[Figure labels: ISP; Client ren.cis.udel.edu; Client merlot.cis.udel.edu; Interceptors; Proxy; TCP port 80 traffic; Other traffic; Internet; www.cnn.com]
- Mainly motivated by ISP business interests: reduction in bandwidth consumption of ISP from the Internet
- Reduced network traffic
- Reduced user perceived latency
14 Consider, on September 11, 2001
New Content: WTC News!
15 Problems with discussed approaches: Server farms and Caching proxies
- Server farms do nothing about problems due to network congestion, or to improve latency issues due to the network
- Caching proxies serve only their clients, not all users on the Internet
- Content providers (say, Web servers) cannot rely on existence and correct implementation of caching proxies
- Accounting issues with caching proxies
  - For instance, www.cnn.com needs to know the number of hits to the webpage for advertisements displayed on the webpage
16 Again, on September 11, 2001
New Content: WTC News!
- Distribution
- Infrastructure
- Surrogate
17 Web replication: CDNs
- Overlay network to distribute content from origin servers to users
- Avoids large amounts of same data repeatedly traversing potentially congested links on the Internet
- Reduces Web server load
- Reduces user perceived latency
- Tries to route around congested networks
18 CDN vs. Caching Proxies
- Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users
- Caches are reactive; CDNs are proactive
- Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to the content providers (web servers) and clients
- CDNs give control over the content to the content providers; caching proxies do not
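The reactive/proactive distinction can be sketched as follows. This is a toy model under stated assumptions, not how any particular proxy or CDN is built: the class names, the `ORIGIN` dictionary, and its contents are all hypothetical.

```python
# Hypothetical origin content, keyed by URL path.
ORIGIN = {"/index.html": "<html>news</html>", "/logo.png": "PNG-bytes"}


class ReactiveCache:
    """Caching proxy: fetches from the origin only after a client miss."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}
        self.origin_fetches = 0

    def get(self, path):
        if path not in self.store:
            # Miss: content is pulled only because a client asked for it.
            self.store[path] = self.origin[path]
            self.origin_fetches += 1
        return self.store[path]


class ProactiveSurrogate:
    """CDN surrogate: the content provider pushes content ahead of requests."""

    def __init__(self):
        self.store = {}

    def push(self, path, body):
        # Provider-driven: the origin decides what is replicated and when.
        self.store[path] = body

    def get(self, path):
        return self.store[path]


cache = ReactiveCache(ORIGIN)
cache.get("/index.html")
cache.get("/index.html")  # second request is served from the cache

surrogate = ProactiveSurrogate()
for path, body in ORIGIN.items():
    surrogate.push(path, body)
print(cache.origin_fetches, sorted(surrogate.store))
```

In the sketch the cache holds only what its own clients happened to request, while the surrogate holds whatever the provider chose to replicate, which is the control difference the last bullet describes.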
19 CDN Architecture
20 CDN Components
- Content Delivery Infrastructure: Delivering content to clients from surrogates
- Request Routing Infrastructure: Steering or directing content request from a client to a suitable surrogate
- Distribution Infrastructure: Moving or replicating content from content source (origin server, content provider) to surrogates
- Accounting Infrastructure: Logging and reporting of distribution and delivery activities
21 Server Interaction with CDN
22 Client Interaction with CDN
Q: How did the CDN choose the Delaware surrogate over the California surrogate?
23 Request Routing Techniques
- Request routing techniques use a set of metrics to direct users to the best surrogate
- Proprietary, but underlying techniques known
  - DNS based request routing
  - Content Modification (URL rewriting)
  - Anycast based (how common is anycast?)
  - URL based request routing
  - Transport layer request routing
  - Combination of multiple mechanisms
24 DNS based Request-Routing
- Common due to the ubiquity of DNS as a directory service
- Specialized DNS server inserted in DNS resolution process
- DNS server is capable of returning a different set of A, NS or CNAME records based on policies/metrics
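A minimal sketch of the idea, assuming the policy is a simple lookup on the address of the requesting DNS server. The policy table, the `resolve` function, and all addresses (taken from the documentation ranges) are hypothetical; real CDN policies are proprietary and use richer metrics.

```python
import ipaddress

# Hypothetical policy table: requesting-resolver prefix -> surrogate address.
POLICY = [
    (ipaddress.ip_network("192.0.2.0/24"), "198.51.100.10"),
    (ipaddress.ip_network("203.0.113.0/24"), "198.51.100.20"),
]
DEFAULT_SURROGATE = "198.51.100.30"


def resolve(name, resolver_ip, ttl=10):
    """Return an A record as (name, type, address, ttl), chosen by policy."""
    addr = ipaddress.ip_address(resolver_ip)
    for prefix, surrogate in POLICY:
        if addr in prefix:
            return (name, "A", surrogate, ttl)
    # No policy entry matches the requesting resolver: use a fallback.
    return (name, "A", DEFAULT_SURROGATE, ttl)


print(resolve("www.example.com", "192.0.2.53"))
print(resolve("www.example.com", "203.0.113.53"))
```

The same name resolves to different surrogates depending on who asks. Note that the decision is keyed on the resolver's address, not the end client's, since that is all the specialized DNS server sees.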
25 DNS based Request-Routing
Q: How does the Akamai DNS know which surrogate is closest?
[Figure label: Akamai DNS]
26 DNS based Request-Routing
27 DNS based Request Routing: Caching
Requesting DNS: 76.43.32.4; Surrogate: 145.155.10.15
www.cnn.com A 145.155.10.15 TTL 10s
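The effect of the short TTL on the record above can be sketched with a toy cache at the requesting DNS server. The `ResolverCache` class is hypothetical, and time is passed in explicitly as `now` (in seconds) so the behavior is easy to follow.

```python
class ResolverCache:
    """Toy cache for A records at the requesting DNS server."""

    def __init__(self):
        self.records = {}  # name -> (address, expiry time in seconds)

    def put(self, name, address, ttl, now):
        self.records[name] = (address, now + ttl)

    def get(self, name, now):
        entry = self.records.get(name)
        if entry is None or now >= entry[1]:
            # Missing or expired: the resolver must ask the CDN DNS again,
            # which gives the CDN a chance to pick a different surrogate.
            self.records.pop(name, None)
            return None
        return entry[0]


cache = ResolverCache()
cache.put("www.cnn.com", "145.155.10.15", ttl=10, now=0)
hit = cache.get("www.cnn.com", now=5)    # within the 10 s TTL
miss = cache.get("www.cnn.com", now=10)  # TTL has run out
print(hit, miss)
```

A short TTL keeps the CDN in control of the mapping at the cost of more DNS queries; a long TTL does the opposite.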
28 DNS based Request Routing Techniques: Discussion
- Originator Problem: Client may be far removed from client DNS
- Client DNS Masking Problem: Virtually all DNS servers, except for root DNS servers, honor requests for recursion
  - Q: Which DNS server resolves pel.cis.udel.edu?
  - Q: Which DNS server performs the last recursion of the DNS request?
- Hidden Load Factor: A DNS resolution may result in drastically different load on the selected surrogate; an issue in load balancing requests and in predicting load on surrogates
29 Server Selection Metrics
- Network Proximity (Surrogate to Client)
  - Network hops (traceroute)
  - Internet mapping services (NetGeo, IDMaps)
- Surrogate Load
  - Number of active TCP connections
  - HTTP request arrival rate
  - Other OS metrics
- Bandwidth Availability
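One way such metrics might be combined is a weighted score, sketched below. The weights, the `score` formula, and the sample numbers are assumptions made for illustration; the slides do not specify how real CDNs combine their metrics. The surrogate names echo the Delaware/California question from the client interaction slide.

```python
from dataclasses import dataclass


@dataclass
class Surrogate:
    name: str
    hops: int              # network proximity to the client
    connections: int       # active TCP connections (load)
    bandwidth_mbps: float  # available bandwidth


def score(s, w_hops=1.0, w_load=0.1, w_bw=0.05):
    # Lower is better: distance and load add cost, bandwidth reduces it.
    # The weights are arbitrary illustrative values.
    return w_hops * s.hops + w_load * s.connections - w_bw * s.bandwidth_mbps


def select(surrogates):
    # Choose the surrogate with the lowest combined cost.
    return min(surrogates, key=score)


candidates = [
    Surrogate("delaware", hops=4, connections=80, bandwidth_mbps=100.0),
    Surrogate("california", hops=14, connections=20, bandwidth_mbps=100.0),
]
best = select(candidates)
print(best.name)
```

With these numbers the nearby but busier surrogate still wins; raising `w_load` would shift the choice toward the lightly loaded one.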
30 Value of a CDN
- Scale: Aggregate infrastructure size
- Reach: Diversity of content locations (diverse placement of surrogates)
- Request routing efficiency, delivery techniques
31 Content Distribution Internetworking (CDI)
- Interconnection of content networks: collaboration between caching proxies and CDNs, as well as between individual CDNs
- Greater reach, larger scale, higher capacity, increased fault tolerance
- A new area, lots of challenges
- Basic architecture involves gateways between various content networks
32 CDI Architecture
33 Traditional vs. Overlay Content Networks
- Content networks
  - Overlay "Content Layer" to enable richer services on top of layer 7 protocols (HTTP, RTSP)
  - Information processed at layers 4 through 7 of the OSI stack
  - Units of transported data in content networks are images, movies, songs
- Traditional networks
  - Information processed at layers 1 through 3 of the OSI stack
  - Units of transported data are frames and packets
34 In Summary
- Overlays are a concept which can be used to
  - deploy new services on the Internet (Mbone, 6bone, Abone, Peer-to-Peer, Content Networks)
  - get around problems in the underlying technology (Resilient Overlay Networks)
- Further reading - Overlays
  - www.savetz.com/mbone/
  - www.6bone.net/
  - nms.lcs.mit.edu/projects/ron/
  - www-2.cs.cmu.edu/hzhang/VNS/
- Further reading - CDNs
  - www.ietf.org/internet-drafts/draft-ietf-cdi-model-01.txt
  - www.ietf.org/internet-drafts/draft-ietf-cdi-known-request-routing-00.txt
  - Bunch of papers: send me mail if you are interested
- Questions? Answers? Thoughts?
35 Full-Site vs. Partial-Site Content Delivery
- Full-Site delivery is what we have seen so far: entire webpage is delivered from the CDN
- Partial-Site delivery delivers only embedded objects (say, only images on the webpage) from the CDN
- Embedded object redirection can be done using DNS based request routing or URL rewriting
- Q: How many TCP connections are needed to do a P-HTTP transfer of a webpage with embedded objects using the above 2 techniques?
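URL rewriting for partial-site delivery can be sketched as below. This is only an illustration under assumptions: the CDN hostname `cdn.example.net`, the scheme of embedding the origin host in the rewritten path, and the `rewrite_embedded` function are all hypothetical, and the regular expression handles only simple `<img src="...">` tags.

```python
import re

CDN_HOST = "cdn.example.net"  # hypothetical CDN hostname


def rewrite_embedded(html, origin_host):
    """Point <img src="..."> URLs on origin_host at the CDN instead."""
    pattern = re.compile(
        r'(<img\b[^>]*\bsrc=")http://' + re.escape(origin_host) + r'/'
    )
    # Keep the tag prefix (group 1) and swap in the CDN host; the origin
    # host is kept in the path so the surrogate knows where to fetch from.
    return pattern.sub(
        lambda m: m.group(1) + "http://" + CDN_HOST + "/" + origin_host + "/",
        html,
    )


page = (
    '<a href="http://www.example.com/story.html">'
    '<img src="http://www.example.com/logo.png"></a>'
)
rewritten = rewrite_embedded(page, "www.example.com")
print(rewritten)
```

Only the embedded image is redirected; the link to the page itself still points at the origin server, which is what distinguishes partial-site from full-site delivery.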
36 CDN with Full-Site Delivery
37 CDN with Partial-Site Delivery
38 CDN Types (Skeletal)
CDNs
Hosting CDN
Relaying CDN
Full Site Content Delivery
Partial Site Content Delivery
Request Routing Techniques
URL Rewriting
DNS based
39 DNS Outsourcing
[Figure: DNS resolution steps 1-6. Labels: Clients; Client ISP; Client DNS (local DNS server for client); Customer DNS (DNS containing NS entry for customer site); Content Provider; CDN; CDN DNS (DNS server maintained by CDN company); A or CNAME redirection]
40 Tunneling
[Figure: tunneling, with three nodes labeled v6]
41 Example Overlays (1): MBone
- IP multicast packets are encapsulated for transmission through tunnels
- Tunnel endpoints are typically workstation-class machines with OS support for IP multicast and running the mrouted multicast routing daemon
- DVMRP routing algorithm used in the overlay
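Encapsulation through a tunnel can be sketched with a toy packet model. This does not build real IP headers or reproduce what mrouted does; the `Packet` class, the helper functions, and all addresses are hypothetical and serve only to show the wrapping and unwrapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object  # bytes, or another Packet when encapsulated


def encapsulate(inner, tunnel_src, tunnel_dst):
    # The multicast packet rides as the payload of a unicast packet
    # addressed to the far tunnel endpoint, so ordinary routers that
    # know nothing about multicast can forward it.
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)


def decapsulate(outer):
    # The far endpoint strips the outer header and recovers the original.
    return outer.payload


multicast = Packet(src="192.0.2.7", dst="224.2.0.1", payload=b"audio frame")
outer = encapsulate(multicast, tunnel_src="192.0.2.1", tunnel_dst="198.51.100.1")
recovered = decapsulate(outer)
print(outer.dst, recovered.dst)
```

Between the tunnel endpoints only the outer unicast addresses are visible; the multicast group address reappears once the packet is decapsulated.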