Caching and Content Distribution Networks - PowerPoint PPT Presentation

About This Presentation
Title:

Caching and Content Distribution Networks

Slides: 36
Provided by: Michael2042

Transcript and Presenter's Notes

Title: Caching and Content Distribution Networks


1
Caching and Content Distribution Networks
2
Web Caching
  • As an example, we use the web to illustrate
    caching and other related issues

3
Web Browser Caching
  • Web browsers have their own caches. When a page
    is downloaded from a site the web page is put
    into the browser cache.
  • This is especially useful in those cases when the
    back button is pressed.
  • If a new copy is needed then a refresh can be
    done.
  • No page stays permanently in the cache. There is
    limited room.
  • A replacement algorithm is needed to determine
    which cached page should be purged.
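The slide does not name a specific replacement algorithm. As one common choice, a least-recently-used (LRU) policy can be sketched as follows; the class name `LRUPageCache` and the capacity of 2 are illustrative assumptions, not part of the original slides.

```python
from collections import OrderedDict


class LRUPageCache:
    """Tiny browser-style page cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # url -> page body, least recently used first

    def get(self, url):
        if url not in self.pages:
            return None  # cache miss: the caller must download the page
        self.pages.move_to_end(url)  # mark as most recently used
        return self.pages[url]

    def put(self, url, body):
        self.pages[url] = body
        self.pages.move_to_end(url)
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # purge the least recently used page


cache = LRUPageCache(capacity=2)
cache.put("/a", "page A")
cache.put("/b", "page B")
cache.get("/a")            # "/a" becomes the most recently used page
cache.put("/c", "page C")  # no room left, so "/b" is purged
print(list(cache.pages))   # ['/a', '/c']
```

Other policies (FIFO, least frequently used, size-aware) would only change which entry `put` purges.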

4
Why Web Server Caching
  • Reduce latency
  • Request does not require going to the server
  • Request is served from the client side, which
    means that network communication is avoided
  • Reduce traffic

5
Consistency
  • What if the page changes after it is saved in the
    cache?
  • This means that the cached copy is out of date
  • The copy and the original are not consistent
  • There are different strategies for dealing with
    this

6
Web Browser Caching
  • Client pull
  • The server provides the content with instructions
    on when the client should ask for a refreshed
    copy of the content or if the content should be
    cached.
  • Server push
  • The server transmits page information to the
    screen.
  • The browser application displays the information
    and leaves the connection to the server open.
  • With an open connection, the server can continue
    to push updated pages for your screen to display
    on an ongoing basis. You can close the connection
    by closing the page.
  • The server is in control
  • Browser caches are different from proxy caches
    (discussed next).

7
Web Caching
  • Proxy caches (also called proxy servers)
  • Intercepts HTTP requests from clients
  • Serves the object if it is in its cache and the
    date is still valid
  • If not, goes to the object's home server
  • On behalf of user, gets the object and possibly
    deposits in its cache before returning to user
  • Usually deployed at edges of a network
  • Wide area bandwidth savings, improved response
    time and increased availability of static
    web-based objects
  • A browser may have to be configured to point to
    the proxy server.
  • Usually a proxy cache is purchased and installed
    by an organization
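The request flow described above can be sketched roughly as follows. This is a simplified model, not a real HTTP proxy: `ProxyCache`, `fake_origin`, the fixed 60-second lifetime, and the injectable clock are all assumptions made for illustration.

```python
import time


class ProxyCache:
    """Serve from cache while the entry is still valid, else go to the origin."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.time):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> body
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.store = {}  # url -> (body, expiry timestamp)

    def handle_request(self, url):
        entry = self.store.get(url)
        if entry is not None and self.clock() < entry[1]:
            return entry[0], "HIT"  # object is cached and its date is still valid
        # On behalf of the user, get the object from its home server
        body = self.fetch_from_origin(url)
        self.store[url] = (body, self.clock() + self.ttl_seconds)
        return body, "MISS"


origin_calls = []


def fake_origin(url):
    """Hypothetical stand-in for the object's home server."""
    origin_calls.append(url)
    return "body of " + url


now = [0.0]  # fake clock so the example is deterministic
proxy = ProxyCache(fake_origin, ttl_seconds=60, clock=lambda: now[0])
first = proxy.handle_request("/index.html")   # fetched from the origin
second = proxy.handle_request("/index.html")  # served from the cache
now[0] = 120.0                                # the cached copy has expired
third = proxy.handle_request("/index.html")   # fetched from the origin again
print(first[1], second[1], third[1], len(origin_calls))  # MISS HIT MISS 2
```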

8
Web Caching
  • Not all web pages can be cached
  • If the Last-Modified header is present, then the page can be cached
  • Refresh is often done when
  • There is a request and
  • Expiry time has passed

9
Cooperative Caching
  • Caching infrastructure can have multiple web
    proxies
  • Proxies can be arranged in a hierarchy or other
    structures
  • Proxies can cooperate with one another
  • Answer client requests
  • Propagate server notifications
  • Uses a combination of HTTP and ICP (Internet
    Cache Protocol).
  • ICP can be used by one cache to quickly ask
    another cache if it has an object.
  • HTTP is used to actually retrieve the object.
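The division of labor between the two protocols can be modeled with a small in-memory sketch. It does not speak real ICP or HTTP: `icp_query` and `http_get` are hypothetical stand-ins for the lightweight "do you have it?" query and the actual object transfer, and `CacheNode` is an illustrative name.

```python
class CacheNode:
    """One cooperating proxy cache with a list of sibling caches."""

    def __init__(self, name, objects=None):
        self.name = name
        self.objects = dict(objects or {})  # url -> body
        self.siblings = []

    def icp_query(self, url):
        """Stand-in for an ICP query: answers only hit or miss, no payload."""
        return "HIT" if url in self.objects else "MISS"

    def http_get(self, url):
        """Stand-in for the HTTP request that transfers the object itself."""
        return self.objects[url]

    def lookup(self, url, fetch_from_origin):
        if url in self.objects:
            return self.objects[url], self.name
        for sibling in self.siblings:
            if sibling.icp_query(url) == "HIT":  # quick question first
                body = sibling.http_get(url)     # then the real retrieval
                self.objects[url] = body
                return body, sibling.name
        body = fetch_from_origin(url)            # nobody nearby has it
        self.objects[url] = body
        return body, "origin"


proxy_a = CacheNode("proxy-a")
proxy_b = CacheNode("proxy-b", {"/logo.png": "PNG bytes"})
proxy_a.siblings = [proxy_b]

from_sibling = proxy_a.lookup("/logo.png", lambda url: "origin copy")
from_local = proxy_a.lookup("/logo.png", lambda url: "origin copy")
from_origin = proxy_a.lookup("/news.html", lambda url: "origin copy")
print(from_sibling[1], from_local[1], from_origin[1])  # proxy-b proxy-a origin
```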

10
Problems
  • Caching proxies do not serve all Internet users
  • Content providers (say, Web servers) cannot rely
    on existence and correct implementation of
    caching proxies.
  • Accounting issues with caching proxies
  • Example: www.cnn.com needs to know the number of
    hits to the advertisements displayed on the web
    page.

11
Content Distribution Networks (CDN)
  • Business model: A content provider such as
    www.cnn.com or Yahoo pays a CDN company (such as
    Akamai) to get its content to the requesting
    users with short delays.
  • A CDN provides a mechanism for
  • Replicating content on multiple servers in the
    Internet
  • Providing clients with a means to determine the
    servers that can deliver the content fastest.

12
Terminology
  • Content: Any publicly accessible combination of
    text, images, applets, frames, MP3, video, flash,
    virtual reality objects, etc.
  • Content Provider: Any individual, organization,
    or company that has content that it wishes to
    make available to users.
  • Origin Server: Content provider's server, where
    the content is first uploaded.
  • Surrogate Server (sometimes called edge server):
    Content distributor's server, where the
    replicated content is kept.

13
Players
  • Content Provider: Yahoo, MSNBC, CNN, CBC
  • Content Distributor: Akamai
  • H/W and S/W Vendor: Cisco, Oracle, Sun
  • Hosting Provider: Bell
  • Figure arrow labels: Send content, Sells servers,
    Install servers

14
CDN Distribution
  • Content providers are CDN customers
  • Content replication
  • CDN company installs thousands of servers
    throughout Internet
  • In large datacenters
  • Or, close to users
  • CDN replicates customers' content
  • When provider updates content, CDN updates
    servers

  • Figure labels: origin server in North America,
    CDN distribution node, CDN servers in S. America,
    Asia, and Europe
15
CDN Functional Components
  • Distribution Service
  • Redirection Service
  • Accounting and Billing system

16
CDN Distribution Service
  • The content provider determines which of its
    objects it wants the CDN to distribute.
  • The content provider tags and then pushes this
    content to a CDN node, which in turn replicates
    and pushes the content to all its CDN servers.

17
CDN Redirection
  • When a browser in a user's host is instructed to
    retrieve a specific object (specified using a
    URL), how does the browser determine whether it
    should retrieve the object from the origin server
    or from one of the CDN servers?
  • As an example, suppose the hostname of the
    content provider is www.cnn.com

18
How Akamai Works
  • Figure (steps 1-2 labeled): End-user, cnn.com
    (content provider), DNS root server, Akamai global
    DNS server, Akamai regional DNS server, Akamai
    cluster, nearby Akamai cluster
  • Messages shown: HTTP GET index.html;
    http://a.73.g.akamai.net/7/23/cnn.com/af/cnn.com/foo.jpg

19
CDN Redirection
  • Users get an HTML document from www.cnn.com; this
    could be index.html.
  • The file index.html uses a modified URL for
    content that has been replicated.
  • Example: If the JPEG files are what has been
    replicated, then
  • <img src="http://cnn.com/af/foo.jpg">
  • may be modified as follows:
  • <img src="http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg">
  • The browser needs to resolve the a73.g.akamai.net
    hostname for replicated content.
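A rough sketch of this rewriting step, assuming the replicated objects are `.jpg` images referenced through plain `<img src="http://...">` tags. The regular expression and the function name `rewrite_replicated_images` are illustrative; the slides do not say how the provider's pages are actually rewritten.

```python
import re

# Prefix taken from the slide's example: CDN host plus the Akamai control part.
CDN_PREFIX = "http://a73.g.akamai.net/7/23/"

# Matches <img src="http://HOST/PATH.jpg"> and captures HOST/PATH.jpg.
IMG_SRC = re.compile(r'(<img\s+src=")http://([^"]+\.jpg)(")')


def rewrite_replicated_images(html):
    """Point JPEG image URLs at the CDN host, keeping the original URL as the path."""
    return IMG_SRC.sub(
        lambda m: m.group(1) + CDN_PREFIX + m.group(2) + m.group(3), html
    )


page = '<p>News</p><img src="http://cnn.com/af/foo.jpg">'
rewritten = rewrite_replicated_images(page)
print(rewritten)
# <p>News</p><img src="http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg">
```

Non-image content in the page is left untouched, which matches the idea that only replicated objects get a modified URL.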

20
CDN Redirection
  • What does this mean?
  • <img src="http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg">
  • Host part: a73.g.akamai.net
  • Akamai control part: /7/23
  • Content URL: /af/foo.jpg
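The decomposition above can be reproduced with the standard URL parser. This sketch assumes the control part is always the first two path segments and that the next segment names the content provider; that layout is inferred from the single example on the slide, and `split_akamai_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def split_akamai_url(url):
    """Split an Akamai-style URL into the parts named on the slide."""
    parts = urlsplit(url)
    # For the slide's example: ['', '7', '23', 'cnn.com', 'af', 'foo.jpg']
    segments = parts.path.split("/")
    return {
        "host": parts.hostname,
        "control": "/" + "/".join(segments[1:3]),   # assumed: two segments
        "provider": segments[3],                    # assumed: provider hostname
        "content_url": "/" + "/".join(segments[4:]),
    }


parsed = split_akamai_url("http://a73.g.akamai.net/7/23/cnn.com/af/foo.jpg")
print(parsed)
# {'host': 'a73.g.akamai.net', 'control': '/7/23',
#  'provider': 'cnn.com', 'content_url': '/af/foo.jpg'}
```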

21
CDN Redirection
  • DNS is configured so that all queries about
    g.akamai.net that arrive at a DNS server are sent
    to an authoritative DNS server for g.akamai.net.
  • This is referred to as an Akamai DNS server
    (authoritative DNS server)

22
How Akamai Works
  • Figure (steps 1-4 labeled): End-user, cnn.com
    (content provider), DNS root server, Akamai global
    DNS server, Akamai regional DNS server, Akamai
    cluster, nearby Akamai cluster
  • Messages shown: DNS lookup cache.cnn.com;
    ALIAS g.akamai.net

23
CDN Redirection
  • DNS is configured so that all queries about
    g.akamai.net that arrive at a DNS server are sent
    to an authoritative DNS server for g.akamai.net.
    This is referred to as an Akamai DNS server
    (authoritative DNS server).
  • When the Akamai DNS server receives the query, it
    extracts the IP address of the requesting
    browser.

24
How Akamai Works
  • Figure (steps 1-6 labeled): End-user, cnn.com
    (content provider), DNS root server, Akamai global
    DNS server, Akamai regional DNS server, Akamai
    cluster, nearby Akamai cluster
  • Messages shown: DNS lookup g.akamai.net;
    ALIAS a73.g.akamai.net

25
CDN Redirection
  • Based on the IP address and information that it
    has about the Internet (called a map), the IP
    address of an Akamai regional server is returned
    to the requesting browser based on policy
  • e.g., select the server that is the fewest hops
    away.
  • The regional server may choose a surrogate server
    for content retrieval
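A minimal sketch of the "fewest hops" policy, assuming the map is a table from client networks to hop counts per regional server. All addresses come from documentation ranges, and `INTERNET_MAP`, `DEFAULT_SERVER`, and `pick_regional_server` are hypothetical names; the real map is far richer than a static table.

```python
import ipaddress

# Hypothetical "map": client network -> hop count to each regional server.
INTERNET_MAP = {
    ipaddress.ip_network("203.0.113.0/24"): {"192.0.2.10": 3, "198.51.100.20": 9},
    ipaddress.ip_network("198.51.100.0/24"): {"192.0.2.10": 8, "198.51.100.20": 2},
}
DEFAULT_SERVER = "192.0.2.10"  # assumed fallback for clients not in the map


def pick_regional_server(client_ip):
    """Return the regional server that is the fewest hops from the client."""
    address = ipaddress.ip_address(client_ip)
    for network, hops in INTERNET_MAP.items():
        if address in network:
            return min(hops, key=hops.get)  # server with the smallest hop count
    return DEFAULT_SERVER


print(pick_regional_server("203.0.113.7"))    # 192.0.2.10
print(pick_regional_server("198.51.100.99"))  # 198.51.100.20
```

A different policy (for example, lowest measured load) would only change the key passed to `min`.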

26
How Akamai Works
  • Figure (steps 1-8 labeled): End-user, cnn.com
    (content provider), DNS root server, Akamai global
    DNS server, Akamai regional DNS server, Akamai
    cluster, nearby Akamai cluster
  • Messages shown: HTTP; DNS a73.g.akamai.net;
    Address 1.2.3.4

27
How Akamai Works
  • Figure (steps 1-9 labeled): End-user, cnn.com
    (content provider), DNS root server, Akamai global
    DNS server, Akamai regional DNS server, Akamai
    cluster, nearby Akamai cluster
  • Messages shown: HTTP; GET /foo.jpg with
    Host: cache.cnn.com
28
How Akamai Works
  • Figure (steps 1-9, 11, and 12 labeled): End-user,
    cnn.com (content provider), DNS root server,
    Akamai global DNS server, Akamai regional DNS
    server, Akamai cluster, nearby Akamai cluster
  • Messages shown: HTTP; GET foo.jpg; GET /foo.jpg
    with Host: cache.cnn.com
29
CDN Redirection
  • The Akamai DNS server IP address is now in the
    cache of the local DNS server.
  • This implies that it is not always necessary to
    go to the root DNS server.
  • The TTL associated with the IP address of an
    Akamai server (surrogate) is relatively small.
  • This is done for performance reasons: a short TTL
    lets the choice of surrogate be revisited often.
  • Akamai content distribution servers are caches
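The effect of the two TTLs can be illustrated with a toy local DNS resolver. Everything here is a simulation under assumed values: the TTLs (3600 s and 20 s), the Akamai DNS server address `192.0.2.53`, and the class `LocalResolver` are hypothetical; only the hostnames and the surrogate address 1.2.3.4 come from the slides.

```python
AKAMAI_DNS_NAME = "g.akamai.net"
AKAMAI_DNS_IP = "192.0.2.53"   # hypothetical address of the Akamai DNS server
AKAMAI_DNS_TTL = 3600          # assumed: long-lived record
SURROGATE_NAME = "a73.g.akamai.net"
SURROGATE_IP = "1.2.3.4"       # address shown in the slides
SURROGATE_TTL = 20             # assumed: relatively small TTL


class LocalResolver:
    """Toy local DNS server that counts how often it must ask upstream."""

    def __init__(self):
        self.cache = {}  # name -> (value, expiry time)
        self.root_queries = 0
        self.akamai_queries = 0

    def _cached(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]
        return None

    def resolve_surrogate(self, now):
        address = self._cached(SURROGATE_NAME, now)
        if address is not None:
            return address
        if self._cached(AKAMAI_DNS_NAME, now) is None:
            self.root_queries += 1  # must start from the root DNS server
            self.cache[AKAMAI_DNS_NAME] = (AKAMAI_DNS_IP, now + AKAMAI_DNS_TTL)
        self.akamai_queries += 1    # ask the Akamai DNS server directly
        self.cache[SURROGATE_NAME] = (SURROGATE_IP, now + SURROGATE_TTL)
        return SURROGATE_IP


resolver = LocalResolver()
resolver.resolve_surrogate(now=0)   # cold cache: root, then Akamai DNS
resolver.resolve_surrogate(now=10)  # surrogate record still cached
resolver.resolve_surrogate(now=30)  # surrogate record expired, Akamai DNS only
print(resolver.root_queries, resolver.akamai_queries)  # 1 2
```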

30
CDN Redirection
  • What if content is not there?
  • If the requested content is not found then the
    surrogate will ask other surrogates within a
    specified region for information.
  • If requested information is still not found or is
    stale, then a request is made to the original web
    site.

31
CDN Selection
  • The tricky issue is selecting which local content
    server to use for a particular request
  • Want to spread load evenly
  • Want minimal impact if server is added or
    removed.
  • In Akamai, each surrogate server sends
    measurement results to the Network Operations
    Command Center (NOCC).
  • Measurement results include number of active TCP
    connections, HTTP request arrival rate, bandwidth
    availability, etc.
  • This information is used by the Akamai DNS
    server.
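The slides do not say which technique is used to meet the two goals above. One well-known approach that spreads load and limits disruption when servers come and go is consistent hashing, sketched below as an assumption rather than as a description of Akamai's implementation. The class name, the 100 virtual points per server, and the server names are illustrative.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, servers=(), replicas=100):
        self.replicas = replicas  # virtual points per server, to even out load
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            del self._owner[point]
            self._points.remove(point)

    def lookup(self, url):
        """Return the server owning the first ring point after the URL's hash."""
        index = bisect.bisect(self._points, _hash(url)) % len(self._points)
        return self._owner[self._points[index]]


ring = ConsistentHashRing(["surrogate-1", "surrogate-2", "surrogate-3"])
urls = [f"/af/img{i}.jpg" for i in range(300)]
before = {url: ring.lookup(url) for url in urls}
ring.remove("surrogate-2")
after = {url: ring.lookup(url) for url in urls}
moved = [url for url in urls if before[url] != after[url]]
# Only URLs that lived on the removed server are reassigned.
print(all(before[url] == "surrogate-2" for url in moved))  # True
```

The measurement data sent to the NOCC could be layered on top, for example by skipping an overloaded server and taking the next point on the ring.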

32
Accounting Mechanism
  • Accounting mechanisms collect and track
    information related to request routing,
    distribution and delivery.
  • Information is gathered in real time and put into
    log files for each CDN component.
  • This gets sent to the Network Operations
    Command Center (NOCC).

33
Full Site Delivery vs. Partial Site Delivery
  • Full Site Delivery: All the contents are
    delivered by the CDN (including HTML, images,
    and other objects).
  • Partial Site Delivery: Only images, streaming
    media, and other bandwidth-intensive objects are
    delivered by the CDN.

34
Current Akamai Customers
35
Summary
  • We have examined replication and issues related
    to the design and implementation of a replicated
    system.
  • Many choices and tradeoffs to consider