Web Caching - PowerPoint PPT Presentation

About This Presentation
Title: Web Caching
Slides: 63
Provided by: alpa6
Transcript and Presenter's Notes

Title: Web Caching


1
Web Caching
  • By
  • Amisha Thakkar
  • Alpa Shah

2
Overview
  • What is a Web Cache ?
  • Caching Terminology
  • Why use a cache?
  • Disadvantages of Web Cache
  • Other Features
  • Caching Rules

3
Overview
  • Caching Architectures
  • Comparison of Architectures
  • Cache Deployment Scheme
  • Client Side Cache Cooperation
  • Active Caching

4
What is a Web Cache ?
  • Cache is a place where temporary copies of
    objects are stored
  • Cached information is generally closer to the
    requester than the permanent information is
  • Objects: HTML pages, images, files

5
What is a Web Cache?
6
Caching Terminology
  • Client: an application program that establishes
    connections for sending requests
  • Server: an application program that accepts
    connections and services requests by sending back
    responses
  • Origin server: the server on which a given
    resource resides or is to be created

7
Caching Terminology
  • Proxy: an intermediary program that acts as both
    a server and a client, making requests on behalf
    of other clients
  • A proxy is not necessarily a cache
  • A proxy does not always cache the replies
    passing through it
  • It may be used on a firewall to monitor
    accesses

8
Why use a cache ?
  • To reduce latency
  • To reduce network traffic
  • To reduce load on origin servers
  • Can isolate end users from network failures

9
Disadvantages of Web cache
  • With cached data there is always a chance of
    receiving stale information
  • Content providers lose access counts when cache
    hits are served
  • Manual configuration is often required
  • Operation of cache requires additional resources
  • In some situations the cache can be a single
    point of failure

10
Other Features
  • Depending on the perspective, the following may be
    good or bad
  • Because the cache requests on behalf of clients,
    the servers never see the clients' IP addresses
  • The cache provides an easy opportunity to
    monitor and analyze browsing activities
  • The cache can be used to block certain requests

11
Types of Web Caches
  • Proxy caches
    • Serve a large number of users
    • Large corporations and ISPs often set them up
      on their firewalls
    • They are a type of shared cache
  • Browser caches
    • Use a section of the computer's hard disk to
      store objects that you have seen

12
Caching Rules
  • Rules on which caches work:
    • Some of them are set in protocols
    • Some are set by the cache administrator
  • Most common rules:
    • If the object is authenticated or secure, it
      won't be cached
    • The object's headers indicate whether the
      object is cacheable or not
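These common rules can be sketched as a simple predicate. A minimal illustration in Python, assuming lowercase header names; the function name and dict layout are illustrative, not from the presentation:

```python
def is_cacheable(method, status, headers):
    """Apply the common caching rules from the slide above:
    skip authenticated/secure objects and objects whose
    headers say they must not be cached."""
    if method != "GET":
        return False
    if "authorization" in headers:           # authenticated object
        return False
    cache_control = headers.get("cache-control", "")
    if "no-store" in cache_control or "private" in cache_control:
        return False                         # headers forbid caching
    return status == 200
```

Real caches apply many more rules (Vary, cookies, query strings); this only captures the two bullets above.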

13
Caching Rules
  • An object is considered fresh when:
    • It has an expiry time or other age-controlling
      directive set, and is still within the fresh
      period
    • A browser cache has already seen the object,
      and has been set to check only once a session

14
Caching Rules
  • Or: a proxy cache has seen the object recently,
    and it was modified relatively long ago
  • Fresh documents are served directly from the
    cache without checking with the origin server

15
Caching Rules
  • For a stale object, the origin server will be
    asked to validate the object, or to tell the
    cache whether the copy is still good
  • The most common validator is the time that the
    object was last changed
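The freshness check and last-modified validation described above can be sketched in Python; the cache-entry dict layout is an assumption for illustration:

```python
import time
import email.utils

def is_fresh(entry, now=None):
    """A fresh object is served directly from the cache;
    a stale one must be validated with the origin server."""
    now = time.time() if now is None else now
    return entry.get("expires", 0.0) > now

def validation_headers(entry):
    """Build a conditional request using the most common
    validator: the time the object was last changed."""
    headers = {}
    if "last_modified" in entry:
        headers["If-Modified-Since"] = email.utils.formatdate(
            entry["last_modified"], usegmt=True)
    return headers
```

If the origin server answers 304 Not Modified, the cached copy is still good and can be served.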

16
Caching Architectures: Hierarchical/Simple Cache
  • Browser-cache interaction is the same as
    browser-host interaction, i.e. a TCP connection
    is made and the item requested
  • If the item is not found, the request is sent to
    the parent cache
  • A hierarchy is built up, each level indirectly
    serving a wider community of users

17
Caching Architectures: Hierarchical/Simple Cache
18
Caching Architectures: Distributed/Co-operating Cache
  • Decentralized (cache mesh)
  • Multiple servers cooperate in such a way that
    they share their individual caches to create a
    large distributed one
  • Simply put, caching proxies communicate with
    each other to serve different users
  • On a cache miss, a proxy checks with the other
    proxy caches before contacting the origin server

19
Caching Architectures: Distributed/Co-operating Cache
  • Caches communicate amongst themselves using a
    protocol like ICP (Internet Cache Protocol)
  • Caches can be selected on the basis of:
    • Distance from the end user
    • Specialization in particular URLs (location
      hints)

20
Caching Architectures: Distributed/Co-operating Cache
  • Why distributed? Limitations of the hierarchy:
    • Width of the hierarchy: caches at the same
      level are inaccessible to each other
    • The LRU policy implies sufficient disk space
    • Cost of replicating disk storage
    • The amount of disk space required depends on
      the number of users served and their breadth
      of reading

21
Caching Architectures: Distributed/Co-operating Cache
  • The more users, the more disk space needed
    higher in the hierarchy
  • Exponential growth of the number of documents on
    the WWW

22
Caching Architectures: Distributed/Co-operating Cache
  • Caching close to the user is more effective; the
    higher the level, the lower the efficiency
  • Can be created for load balancing
  • Most effective when serving a community of
    interests

23
Caching Architectures: Distributed/Co-operating Cache
  • First, a UDP packet is sent as a cache inquiry
  • The cache selection decision is determined by RTT
  • Potential problem: network congestion because of
    the extra UDP traffic
  • In favor:
    • A UDP exchange is 2 IP packets; TCP takes at
      least 8 packets

24
Caching Architectures: Distributed/Co-operating Cache
  • UDP reply from cache can indicate
  • a. Presence
  • b. Speed
  • c. Availability of requested documents
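The UDP inquiry can be illustrated with the ICP query format from RFC 2186: a 20-byte header, then a 4-byte requester host address and the null-terminated URL. A minimal sketch; the zeroed options and sender-address fields are simplifications of what a real cache like Squid would fill in:

```python
import struct

ICP_OP_QUERY = 1   # opcode for a cache inquiry (RFC 2186)
ICP_VERSION = 2

def build_icp_query(reqnum, url):
    """Build an ICP query datagram: 20-byte header followed
    by a 4-byte requester host address and the URL."""
    payload = struct.pack("!I", 0) + url.encode("ascii") + b"\x00"
    length = 20 + len(payload)
    header = struct.pack("!BBHIIII",
                         ICP_OP_QUERY, ICP_VERSION, length,
                         reqnum,      # matches the reply to this query
                         0, 0, 0)    # options, option data, sender address
    return header + payload
```

The datagram would be sent over a UDP socket to each neighbor cache, which replies with a hit or miss opcode for the requested document.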

25
Caching Architectures: Hybrid Cache
  • Combines a hierarchy with distributed
    cooperation; note the use of ICP between caches

26
Comparison of Architectures
  • Hierarchical: caches are placed at multiple
    levels
  • Distributed: caches only at the bottom level,
    with no intermediate caches

27
Comparison of Architectures
  • Performance parameters:
    • Connection time (Tc): the time from when the
      document is requested until the first data
      byte is received
    • Transmission time (Tt): the time taken to
      transmit the document
    • Total latency = Tc + Tt
    • Bandwidth usage
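These parameters combine into a simple latency model. A toy Python sketch (the bandwidth figure in the test is arbitrary) showing why transmission time dominates for large documents:

```python
def total_latency(t_connect_s, size_bytes, bandwidth_bps):
    """Total latency = connection time (Tc) + transmission
    time (Tt), with Tt = document size / available bandwidth."""
    t_transmit_s = size_bytes * 8 / bandwidth_bps
    return t_connect_s + t_transmit_s
```

For a small page, Tc dominates; for a megabyte file on the same link, Tt dominates, which is the trade-off the following comparison slides measure.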

28
Comparison of Architectures
  • Fig 3: Connection time for documents of different
    popularity

29
Comparison of Architectures
  • For unpopular documents, connection time is high
  • As the number of requests increases, the average
    connection time decreases
  • For extremely popular documents, distributed
    caching has smaller connection times

30
Comparison of Architectures
  • Fig 4: Network traffic generated

31
Comparison of Architectures
  • On the lower levels, distributed caching
    practically doubles the network bandwidth usage
  • Around the root node in the national network,
    the network traffic is reduced by half
  • Distributed caching uses all possible network
    shortcuts between institutional caches,
    generating more traffic in the less congested
    low network levels

32
Comparison of Architectures
  • Fig 5a: Uncongested national network

33
Comparison of Architectures
  • The only bottleneck on the path from the client
    to the origin server is the international path,
    hence transmission times are similar for both
    architectures

34
Comparison of Architectures
  • Fig 5b: Congested national networks

35
Comparison of Architectures
  • Both have higher transmission times compared to
    the previous case
  • Distributed caching gives shorter transmission
    times than hierarchical because many requests
    travel through lower network levels

36
Comparison of Architectures
  • Fig 6: Average total latency

37
Comparison of Architectures
  • For large documents, transmission time matters
    more than connection time
  • Hierarchical caching gives lower latencies for
    documents smaller than 200 KB, due to lower
    connection times
  • Distributed caching gives lower latencies for
    larger documents, due to lower transmission times

38
Comparison of Architectures
  • The size threshold depends on the degree of
    congestion in the national network
  • The higher the congestion, the lower the size
    threshold
  • Above the threshold, distributed caching has
    lower latencies than hierarchical

39
Comparison of Architectures: With Hybrid Scheme
  • Fig 7: Connection time

40
Comparison of Architectures: With Hybrid Scheme
  • Fig 8

41
Comparison of Architectures: With Hybrid Scheme
  • In the hybrid scheme, if the number of
    cooperating caches (kc) is very small, the
    connection time is high
  • As the number of cooperating caches increases,
    the connection time decreases to a minimum
  • If the number increases beyond the threshold,
    the connection time increases very fast

42
Comparison of Architectures: With Hybrid Scheme
  • Fig 9: Transmission time

43
Comparison of Architectures: With Hybrid Scheme
  • For an uncongested network, the number of
    cooperating caches (kt) at every level hardly
    influences Tt
  • If the number of cooperating caches is very
    small, Tt is high, and vice versa
  • If the number increases above the threshold, Tt
    increases
  • The optimum number of caches depends on the
    number of caches reachable while avoiding
    congested links

44
Comparison of Architectures: With Hybrid Scheme
  • Fig 10

45
Comparison of Architectures: With Hybrid Scheme
  • Fig 11: Total latency

46
Comparison of Architectures: With Hybrid Scheme
  • The number of cooperating caches (kopt) that
    minimizes the total latency at every level
    depends on the document size
  • For small documents, the optimum number is
    closer to kc
  • For large documents, the optimum number is
    closer to kt

47
Comparison of Architectures: With Hybrid Scheme
  • Fig 12

48
Comparison of Architectures: With Hybrid Scheme
  • For any document, the optimum kopt that
    minimizes the total latency satisfies
    kc ≤ kopt ≤ kt

49
Cache Deployment Schemes
  • Proxy caching

50
Cache Deployment Schemes
  • Advantages:
    • Clients point all web requests directly to the
      cache; no effect on non-web traffic
    • The cost of upgrading hardware/software is
      limited
    • Administration of the caches is limited to
      basic configuration

51
Cache Deployment Schemes
  • Disadvantages:
    • Every browser must be configured to point to
      the cache
    • Each client can hit only one cache
    • Single point of failure
    • Unnecessary duplication of data
    • Bottleneck in cases where the content is
      otherwise available in the LAN

52
Cache Deployment Schemes
  • Transparent Proxy caching

53
Cache Deployment Schemes
  • Advantages:
    • No browser configuration
    • The cost of upgrading hardware/software is
      limited
    • No administration of intermediate systems
      required

54
Cache Deployment Schemes
  • Disadvantages:
    • Each client can hit only one cache
    • If the cache goes down, both Internet and
      intranet access are lost
    • Negative impact on non-web traffic
    • The cache has to route non-web traffic
    • Routing, packet examination, and network
      address translation steal CPU cycles from the
      main cache-serving function

55
Cache Deployment Schemes
  • Transparent proxy caching with web cache
    redirection.

56
Cache Deployment Schemes
  • Advantages:
    • The switch/router examines the packets
    • Minimal impact on non-web traffic
    • Frees up CPU cycles for the web cache
    • Allows the client load to be dynamically
      spread over multiple caches
    • Eliminates the single point of failure,
      especially if redundant redirectors are used

57
Cache Deployment Schemes
  • Disadvantages:
    • Additional intermediate systems must be
      deployed
    • Increases expense

58
Client Side Cache Cooperation.
59
Active Caching
  • Current problem: dynamic documents cannot be
    cached
  • Goal: cache dynamic content on the web using an
    active cache
  • A cache applet is server-supplied code that is
    attached to a URL, or a collection of URLs
  • The applet is written in a platform-independent
    language

60
Active Caching
  • On a user request, the applet is invoked by the
    cache
  • The applet decides what is to be sent to the user
  • Other functions of the applet:
    • Logging user accesses
    • Checking access permissions
    • Rotating advertising banners
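A cache applet's role can be sketched as a small class the proxy invokes before serving a cached copy. All names here are illustrative assumptions, not the actual Active Cache API:

```python
class BannerApplet:
    """Illustrative cache applet: invoked by the cache on each
    user request, so dynamic work (access logging, banner
    rotation) still happens even though the document itself is
    served from the cache."""

    def __init__(self, banners):
        self.banners = banners
        self.access_log = []          # logging user accesses

    def process(self, user, cached_body):
        self.access_log.append(user)
        # rotate the advertising banner on every hit
        banner = self.banners[len(self.access_log) % len(self.banners)]
        return cached_body.replace(b"{{banner}}", banner)
```

The proxy would call process() per request; if it chooses not to run the applet, it must forward the request to the origin server rather than serve the cached copy.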

61
Active Caching
  • The proxy has the freedom not to invoke the
    applet, but instead to send the request to the
    server
  • The proxy promises not to send back a cached
    copy without invoking the applet
  • If the applet is too large, the proxy sends the
    request to the server
  • The proxy is not obligated to cache any applet;
    in that case it agrees not to service requests
    for that document

62
Active Caching
  • The proxy can devote resources to the applets
    associated with the URLs hottest among its users
  • Since the proxy that receives the request is
    typically the proxy closest to the user, the
    scheme automatically migrates server processing
    to the nodes that are close to users
  • This increases the scalability of web-based
    services