1
Web Cache Replacements
  • ykchang_at_mail.ncku.edu.tw

2
Introduction
  • Which page should be removed from the cache?
  • The goal is a replacement algorithm that yields a high hit rate.
  • Differences from traditional caching
  • nonhomogeneity of the object sizes
  • with equal access frequencies but different sizes, considering only hit rate favors smaller objects
  • Byte hit rate

3
Introduction
  • Other considerations
  • transfer time cost
  • Expiration time
  • Frequency
  • Measurement metrics?
  • admission control?
  • When or how often to perform the replacement
    operations?
  • How many documents to remove?

4
Measurement Metrics
  • Hit Rate (HR)
  • requests satisfied by cache
  • (shows fraction of requests not sent to server)
  • Volume measures
  • Weighted hit rate (WHR), also called Byte Hit Ratio
  • client-requested bytes returned by proxy (shows
    fraction of bytes not sent by server)
  • Fraction of packets not sent
  • Reduction in distance traveled (e.g., hop count)
  • Latency Time
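For concreteness, the two main metrics can be computed from a request log; a minimal Python sketch (the field names are illustrative, not from the slides):

    def hit_rate(log):
        # Fraction of requests satisfied by the cache.
        return sum(1 for r in log if r["hit"]) / len(log)

    def byte_hit_rate(log):
        # Fraction of client-requested bytes returned by the proxy cache.
        return sum(r["size"] for r in log if r["hit"]) / sum(r["size"] for r in log)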

5
Three Categories
  • Traditional replacement policies and their direct extensions
  • LRU, LFU
  • Key-based replacement policies
  • Cost-based replacement policies

6
Traditional replacement
  • Least Recently Used (LRU) evicts the object which
    was requested the least recently
  • prune off as many of the least recently used
    objects as is necessary to have sufficient space
    for the newly accessed object.
  • This may involve zero, one, or many replacements.

7
Traditional replacement
  • Least Frequently used (LFU) evicts the object
    which is accessed least frequently.
  • Pitkow/Recker evicts objects in LRU order, except
    if all objects are accessed within the same day,
    in which case the largest one is removed.

8
Key-based Replacement
  • The idea in key-based policies is to sort objects
    based upon a primary key, break ties based on a
    secondary key, break remaining ties based on a
    tertiary key, and so on.
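As a sketch, key-based ordering is just a lexicographic sort; in Python (the particular keys here are illustrative, matching the SIZE policy below):

    # Rank eviction candidates by primary key (size, largest first), then by
    # secondary key (least recently accessed first), then by a tertiary key.
    def removal_queue(objects):
        return sorted(objects, key=lambda o: (-o["size"], o["last_access"], o["url"]))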

9
Key-based Replacement
  • LRUMIN
  • This policy is biased in favor of smaller sized
    objects so as to minimize the number of objects
    replaced.
  • Let the size of the incoming object be S. Suppose
    that this object will not fit in the cache.
  • If there are any objects in the cache which have
    size at least S, we remove the least recently
    used such object from the cache.
  • If there are no objects with size at least S,
    then we start removing objects in LRU order of
    size at least S/2, then objects of size at least
    S/4, and so on until enough free cache space has
    been created.
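A minimal sketch of LRUMIN in Python (assuming the cache is kept in LRU order, least recently used first; names are illustrative):

    # cache: list of objects in LRU order, each with a .size in bytes.
    # free_space: current free room; needed: size S of the incoming object.
    def lrumin_make_room(cache, free_space, needed):
        threshold = needed                 # first consider objects of size >= S
        while free_space < needed and threshold >= 1:
            for obj in list(cache):        # scan in LRU order
                if obj.size >= threshold:
                    cache.remove(obj)
                    free_space += obj.size
                    if free_space >= needed:
                        return free_space
            threshold //= 2                # then S/2, then S/4, and so on
        return free_space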

10
Key-based Replacement
  • SIZE policy
  • In this policy, the objects are removed in order
    of size, with the largest object removed first.
  • Ties based on size are somewhat rare, but when
    they occur they are broken by considering the
    time since last access. Specifically, objects
    with higher time since last access are removed
    first.

11
Key-based Replacement
  • LRU-Threshold is the same as LRU, but objects
    larger than a certain threshold size are never
    cached.
  • Hyper-G is a refinement of LFU, breaking ties using the recency of last use and size.
  • Lowest Latency First minimizes average latency
    by evicting the document with the lowest download
    latency first.

12
Cost-based Replacement
  • Employ a potential cost function derived from
    different factors such as
  • time since last access,
  • entry time of the object in the cache,
  • transfer time cost,
  • object expiration time and so on.
  • GreedyDual-Size (GD-Size) associates a cost with each object and evicts the object with the lowest cost/size.
  • Hybrid associates a utility function with each object and evicts the one that has the least utility, so as to reduce the total latency.

13
Cost-based Replacement
  • Lowest Relative Value evicts the object with the lowest utility value.
  • Least Normalized Cost Replacement (LNC-R) employs a rational function of the access frequency, the transfer time cost and the size.
  • Bolot/Hoschka employs a weighted rational function of the transfer time cost, the size, and the time since last access.

14
Cost-based Replacement
  • Size-Adjusted LRU (SLRU) orders the objects by ratio of cost to size and chooses the objects with the best cost-to-size ratio.
  • Server-assisted scheme models the value of
    caching an object in terms of its fetching cost,
    size, next request time, and cache prices during
    the time period between requests. It evicts the
    object of the least value.
  • Hierarchical GreedyDual (Hierarchical GD) does
    object placement and replacement cooperatively in
    a hierarchy.

15
GreedyDual
  • GreedyDual was originally proposed by Young and Tarjan for the case when pages in a cache have the same size but incur different costs to fetch from secondary storage.
  • A value H is associated with each cached page p when the page is brought into the cache.
  • H is set to the cost of bringing p into the cache
  • the cost is always nonnegative.
  • (1) The page with the lowest H value (minH) is replaced, and (2) all remaining pages then reduce their H values by minH.

16
GreedyDual
  • If a page is accessed, its H value is restored to
    the cost of bringing it into the cache
  • Thus the H values of recently accessed pages
    retain a larger portion of the original cost than
    the pages that have not been accessed for a long
    time
  • By reducing the H values as time goes on and
    restoring them upon access, GreedyDual integrates
    the locality and cost concerns in a seamless
    fashion

17
GreedyDual-Size
  • Set H to cost/size upon an access to a document, where cost is the cost of bringing the document into the cache and size is the size of the document in bytes.
  • We call this extended version GreedyDual-Size.
  • The definition of cost depends on the goal of the replacement algorithm; cost is set to
  • 1 if the goal is to maximize hit ratio
  • the downloading latency if the goal is to minimize average latency
  • the network cost if the goal is to minimize the total cost

18
GreedyDual-Size
  • Implementation
  • Naively, we would need to decrement all pages in the cache by min H(q) every time a page q is replaced, which may be very inefficient.
  • The improved algorithm on the next slide avoids this by keeping a running offset L instead of decrementing every page.
  • Maintaining a priority queue based on H,
  • handling a hit requires O(log k) time, and
  • handling an eviction requires O(log k) time, since in both cases the queue needs updating.

19
GreedyDual-Size
  • Algorithm GreedyDual-Size(document p):
  • /* Initialize L ← 0 */
  • If p is already in memory:
  •     H(p) ← L + cost(p)/size(p)
  • If p is not in memory:
  •     while there is not enough room in memory for p:
  •         let L ← min H(q) over all q in cache
  •         evict q such that H(q) = L
  •     put p into memory; set H(p) ← L + cost(p)/size(p)
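A runnable rendering of this pseudocode in Python (my own sketch, using a heap with lazy deletion; it assumes each object fits within the cache capacity):

    import heapq

    class GDSizeCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.used = 0
            self.L = 0.0
            self.H = {}      # document -> current H value
            self.sizes = {}  # document -> size in bytes
            self.heap = []   # (H value, document); may contain stale entries

        def access(self, doc, size, cost):
            if doc in self.H:
                # Hit: restore H to L + cost/size and requeue.
                self.H[doc] = self.L + cost / size
                heapq.heappush(self.heap, (self.H[doc], doc))
                return
            # Miss: evict lowest-H documents until the new one fits.
            while self.used + size > self.capacity:
                h, victim = heapq.heappop(self.heap)
                if self.H.get(victim) != h:
                    continue                 # stale heap entry, skip it
                self.L = h                   # L <- min H(q) over all q in cache
                self.used -= self.sizes.pop(victim)
                del self.H[victim]
            self.H[doc] = self.L + cost / size
            self.sizes[doc] = size
            self.used += size
            heapq.heappush(self.heap, (self.H[doc], doc))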

20
Hybrid Algorithm (HYB)
  • Motivated by Bolot and Hoschka's algorithm.
  • HYB is a hybrid of several factors, considering not only download time but also the number of references to a document and the document size. HYB selects for replacement the document i with the lowest value of a utility expression, whose variables are defined on the next slide.

21
HYB
  • The utility function is defined with the following quantities
  • cs is the estimated time to connect to the server
  • bs is the estimated bandwidth to the server
  • zp is the size of the document
  • np is the number of times the document has been referenced
  • Wb and Wn are constants that set the relative importance of the variables bs and np, respectively
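Reconstructing the expression from these definitions (following Wooster and Abrams' formulation of HYB):

    HYB value of document p on server s = (cs + Wb/bs) × np^Wn / zp

Documents that are expensive to refetch, frequently referenced, and small receive high values; the document with the lowest value is evicted.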

22
Latency Estimation Algo. (LAT) REF
  • Motivated by estimating the time required to download a document, and then replacing the document with the smallest download time.
  • Apply some function to combine (e.g., smooth) these time samples to form an estimate of how long it will take to download the document.
  • Keeping a per-document estimate is probably not practical.
  • Alternative: keep statistics of past downloads on a per-server basis, rather than a per-document basis (less storage).
  • For each server j, the proxy maintains
  • clatj, the estimated latency (time) to open a connection to the server
  • cbwj, the estimated bandwidth of the connection (in bytes/second)

23
Latency Estimation Algo. (LAT) REF
  • When a new document is received from server j, the connection establishment latency (sclat) and the bandwidth for that document (scbw) are measured, and the estimates are updated as follows
  • clatj = (1 − ALPHA) × clatj + ALPHA × sclat
  • cbwj = (1 − ALPHA) × cbwj + ALPHA × scbw
  • ALPHA is a smoothing constant, set to 1/8 as it is in the TCP smoothed estimation of RTT.
  • Let ser(i) denote the server on which document i resides, and si denote the document size. Cache replacement algorithm LAT selects for replacement the document i with the smallest download time estimate, denoted di
  • di = clatser(i) + si / cbwser(i)

24
Latency Estimation Algo. (LAT)
  • One detail remains
  • a proxy runs at the application layer of a network protocol stack, and therefore would not be able to obtain the connection latency samples sclat.
  • Therefore the following heuristic is used to estimate connection latency. A constant CONN is chosen (e.g., 2 Kbytes). Every document that the proxy receives whose size is less than CONN is used as an estimate of connection latency sclat.
  • Every document whose size exceeds CONN is used as a bandwidth sample as follows
  • scbw = (document size) / (download time of document − current value of clatj)
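Putting the two LAT slides together, a small Python sketch of the per-server bookkeeping (ALPHA and CONN as defined above; the other names are illustrative):

    ALPHA = 1 / 8   # smoothing constant, as in TCP RTT estimation
    CONN = 2048     # documents smaller than this sample connection latency

    clat, cbw = {}, {}   # per-server latency (s) and bandwidth (bytes/s)

    def update_estimates(server, size, download_time):
        if size < CONN:
            sclat = download_time   # small document: latency sample
            clat[server] = (1 - ALPHA) * clat.get(server, sclat) + ALPHA * sclat
        else:                       # large document: bandwidth sample
            scbw = size / max(download_time - clat.get(server, 0.0), 1e-9)
            cbw[server] = (1 - ALPHA) * cbw.get(server, scbw) + ALPHA * scbw

    def download_estimate(server, size):
        # di = clat_ser(i) + si / cbw_ser(i); LAT evicts the smallest di.
        return clat.get(server, 0.0) + size / max(cbw.get(server, 1.0), 1e-9)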

25
Lowest Relative Value (LRV)
  • time from the last access t, chosen for its large influence on the probability of a new access
  • the probability of a new access, conditioned on the time from the last access, can be expressed as (1 − D(t))
  • number of previous accesses i: this parameter allows the proxy to select a relatively small number of documents with a much higher probability of being accessed again
  • document size s: this seems to be the most effective parameter for making a selection among documents with only one access

26
Distribution of interaccess times, D(t)
27
Prob. Density function of interaccess times, d(t)
28
Lowest Relative Value (LRV)
  • We compute the probability that a document is accessed again, Pr(i, t, s), as follows
  • Pr(i, t, s) = P1(s)(1 − D(t)) if i = 1
  • Pr(i, t, s) = Pi(1 − D(t)) otherwise
  • Pi: conditional probability that a document is referenced i+1 times given that it has been accessed i times
  • P1(s): percentage of documents of size s with at least 2 accesses
  • D(t): distribution of times between consecutive requests to the same document, derived as D(t) = 0.035 log(t+1) + 0.45(1 − e^(−t/λ)), with λ a fitted decay constant
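A Python sketch of these quantities (the value of λ here is an assumed placeholder, and P1/Pi would be measured from the trace; natural log assumed):

    import math

    LAMBDA = 2e6   # assumed decay constant for D(t); fit from trace data

    def D(t):
        # Distribution of interaccess times, per the formula above.
        return 0.035 * math.log(t + 1) + 0.45 * (1 - math.exp(-t / LAMBDA))

    def pr_access_again(i, t, s, P1, Pi):
        # P1(s): fraction of size-s documents with at least 2 accesses.
        # Pi(i): P(referenced i+1 times | accessed i times).
        return (P1(s) if i == 1 else Pi(i)) * (1 - D(t))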

29
Lowest Relative Value (LRV)
30
Lowest Relative Value (LRV)
31
Performance from Pei Cao
  • Uses hit ratio, byte hit ratio, reduced latency and reduced hops.
  • reduced latency: the sum of the downloading latencies for the pages that hit in cache, as a percentage of the sum of all downloading latencies
  • reduced hops: the sum of the network costs for the pages that hit in cache, as a percentage of the sum of the network costs of all Web pages
  • the network cost of each document is modeled as hops
  • each Web server has hop value 1 or 32; we assign 1/8 of the servers hop value 32 and 7/8 hop value 1
  • The hop value can be thought of either as the number of network hops traveled by a document or as the monetary cost associated with the document.

32
Performance from Pei Cao
  • GD-Size(1) sets the cost of each document to 1, thus trying to maximize hit ratio.
  • GD-Size(packets) sets the cost for each document to 2 + size/536, i.e., the estimated number of network packets sent and received if a miss to the document happens:
  • 1 packet for the request, 1 packet for the reply, and size/536 for extra data packets, assuming a 536-byte TCP segment size.
  • It tries to maximize both hit ratio and byte hit ratio.
  • Finally, GD-Size(hops) sets the cost for each document to the hop value of the document, trying to minimize network costs. (These cost settings plug directly into the GD-Size sketch shown earlier.)
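For example (illustrative Python; the hop value is assumed known per document):

    cost_hit = lambda size, hops: 1                # GD-Size(1): maximize hit ratio
    cost_pkt = lambda size, hops: 2 + size / 536   # GD-Size(packets): request + reply + data
    cost_hop = lambda size, hops: hops             # GD-Size(hops): minimize network cost

    # e.g. cache.access(doc, size, cost_pkt(size, hops))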

33
Performance from Pei Cao
  • See Cao's paper, page 4.

34
Weighted Hit Rate
  • Results on the best primary key are inconclusive
  • Most references are to small files, but most bytes are from large files
  • Why Size?
  • Most accesses are for smaller documents
  • A few large documents take the space of many
    small documents
  • Concentration of large inter-reference times

35
Exp. 3 Partitioning Cache by Media
  • Idea
  • Do clients that listen to music degrade the performance of clients using text and graphics?
  • Could a partitioned cache, with one portion dedicated to audio and the other to non-audio documents, increase the WHR experienced by either audio or non-audio documents?
  • Simulation
  • cache size = 10% of the maximum needed
  • two partitions: audio and non-audio

36
Exp. 4 Partitioning Cache by Media
  • In Experiment 4,
  • a one-level cache with SIZE as the primary key
  • random as the secondary key
  • three partition sizes: dedicate 1/4, 1/2, or 3/4 of the cache to audio;
  • the rest is dedicated to non-audio documents.

37
Exp. 4 Partitioning Cache by Media
38
Exp. 4 Partitioning Cache by Media
39
Problems to solve
  • Certain sorting keys have intuitive appeal.
  • The first is document type. A sorting key that puts text documents at the front of the removal queue would ensure low latency for text in Web pages, at the expense of latency for other document types.
  • The second sorting key is refetch latency. To a user of international documents, the most obvious caching criterion is one that caches documents so as to minimize overall latency.
  • A European user of North American documents would preferentially cache those documents over ones from other European servers, to avoid using heavily utilized transatlantic network links. Therefore a means of estimating the latency of refetching the documents in a cache could be used as a primary sorting key.

40
Problems to solve
  • Caching dynamic documents: a cache is only useless for a dynamic document if the document content completely changes; otherwise a portion, but not all, of the cached copy remains valid.
  • Allow caches to request the differences between the cached version and the latest version of a document.

41
Problems to solve
  • For example, in response to a conditional GET a server could send the "diff" of the current version and the version matching the Last-Modified date sent by the client, or a specific tag could allow a server to fill in a previously cached static query response form.
  • Another approach to changing semi-static pages (i.e., pages that are HTML but replaced often) is to allow Web servers to preemptively update inconsistent document copies, at least for the most popular ones.

42
Randomized Strategies
  • These strategies use randomized decisions to find
    an object for replacement.

43
Randomized Strategies
  • 1. RAND
  • This strategy removes a random object.
  • 2. HARMONIC (Hosseini-Khayat 1997)
  • Whereas RAND uses equal probability for each object, HARMONIC removes one item from the cache at random, with a probability inversely proportional to its specific cost ci/si.

44
Randomized Strategies
  • 3. LRU-C and LRU-S (Starobinski and Tse 2001)
  • LRU-C is a randomized version of LRU.
  • Let cmax = max(c1, ..., cN) be the maximum of the access costs of all N objects of a request sequence.
  • Let ĉi = ci/cmax be the normalized cost for object i. When an object i is requested, it is moved to the head of the cache with probability ĉi; otherwise, nothing is done.
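A sketch of LRU-C's probabilistic move-to-front in Python (illustrative; the cache is an LRU-ordered OrderedDict):

    import random
    from collections import OrderedDict

    def lru_c_access(cache: OrderedDict, obj, cost, cmax):
        # Move obj to the MRU end with probability cost/cmax; else do nothing.
        if obj in cache and random.random() < cost / cmax:
            cache.move_to_end(obj)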

45
Randomized Strategies
  • LRU-S uses the size instead of the cost. Let smin = min(s1, ..., sN) be the size of the smallest object among the N documents, and di = smin/si be the normalized density of object i.
  • LRU-S acts as LRU with probability di; otherwise the cache state is left unmodified.
  • Furthermore, Starobinski and Tse (2001) proposed an algorithm which deals with both varying-size and varying-cost objects.
  • The required quantities combine the normalized cost and density defined above.
  • Upon a request for object i, this algorithm performs the same operation as LRU with the corresponding probability, and otherwise leaves the cache state unmodified.

46
Randomized Strategies
  • 4. Randomized replacement with general value functions (Psounis and Prabhakar 2001)
  • This strategy draws N objects randomly from the cache and evicts the least useful object in the sample. The usefulness of a document can be determined by any utility function. After replacing the least useful object, the next M (M < N) least useful objects are retained in memory.
  • At the next replacement, N − M new samples are drawn from the cache, and the least useful of these N − M samples and the M previously retained objects is evicted. The M least useful of the remaining objects are stored in memory, and so on.
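A sketch of one round of this sampling scheme (illustrative Python; utility returns a usefulness score, lowest evicted first):

    import random

    def sample_evict(cache, utility, retained, N, M):
        # Top the candidate set back up to N with fresh random samples.
        fresh = random.sample([o for o in cache if o not in retained],
                              N - len(retained))
        candidates = sorted(retained + fresh, key=utility)
        victim = candidates[0]            # evict the least useful candidate
        cache.remove(victim)
        retained = candidates[1:M + 1]    # remember the next M least useful
        return victim, retained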

47
Randomized Strategies Summary
  • 1. Randomization presents a different approach
    to cache replacement.
  • 2. Randomized strategies try to reduce the
    complexity of the replacement process without
    sacrificing the quality too much.

48
Admission control
  • Should we store the response in the cache or not?
  • On the first access, do not save it.

49
Admission control
  • A heuristic to make this decision: the objects accessed most frequently in the recent past will most likely be accessed again. The words "frequently" and "recently" imply that the access frequency of objects, and a decay function applied to that frequency, are needed.
  • An extra space called the URL memory cache is introduced to store the URLs, and the associated access frequencies, of the requested objects.

50
Admission control
  • If the requested object is cacheable, the process of storing the object in the disk cache is delayed until the same object is accessed again. (Or we can say that cacheable objects are not stored in the disk cache unless they have been accessed before.)
  • Since the access stream is infinite, the size of the URL cache must be limited. A replacement policy is also needed in the URL cache.
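A minimal sketch of this second-access admission filter (illustrative Python; store_on_disk is a hypothetical disk-cache call, not from the slides):

    url_cache = {}   # URL -> reference count, for objects not yet on disk

    def admit_on_miss(url):
        # Admit to the disk cache only on the second access.
        if url in url_cache:
            count = url_cache.pop(url) + 1
            store_on_disk(url, ref_count=count)   # hypothetical helper
            return True
        url_cache[url] = 1    # first access: remember the URL only
        return False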

51
Admission control operations
  • Cache hits
  • The operations are similar to the original algorithm.
  • In addition to unused non-cacheable objects and hot objects in the memory cache, cacheable objects without disk copies are also candidates for replacement in the memory cache.
  • Consider the case where a copy of the requested object exists in the memory cache but not in the disk cache:
  • the reference count associated with the requested object in the memory cache is incremented by one, and the data is then stored in the disk cache.
  • If an object evicted from the memory cache is cacheable, its URL along with its reference count is then stored in the URL cache.

52
Admission control operations
  • Cache misses for cacheable objects
  • If the requested object is cacheable, the caching algorithm checks:
  • (1) if its URL is not stored in the URL cache,
  • replacement operations are performed to allocate enough space to hold the requested object.
  • The URL of the replaced object is then stored in the URL cache along with its reference count.
  • The replacement operations in the URL cache must also be performed;
  • the URLs evicted from the URL cache are released.
  • The requested object itself is not stored in the disk cache at this moment. Thus, no replacement in the disk cache is needed.

53
Admission control operations
  • Cache misses for cacheable objects
  • (2) if the URL of the requested object is stored in the URL cache,
  • its associated record in the URL cache is removed, the requested object is stored in the disk cache, and the reference count is set to one.
  • Similarly, the replacement operations in the disk cache must be performed. The URLs of the objects evicted from the disk cache are stored in the URL cache, and again the replacement operations in the URL cache are performed.

54
Admission control operations
  • Cache misses for non-cacheable objects
  • For a cache miss, if the object is non-cacheable, the operations are similar to the original algorithm. If the object evicted from the memory cache is cacheable and does not exist in the disk cache, its URL along with its reference count is stored in the URL cache.
  • Notice that the proposed approach may lose some possible hits in the disk cache when objects are accessed for the second time. However, it removes all the disk activity in which the disk cache stores objects that will not be accessed again before being evicted.

55
Admission control
  • Efficient Management of the URL Cache
  • A separate hash table, similar to that in the memory/disk cache, is used in the URL cache to support efficient search for the URL of a requested object.
  • The MD5 hash of the URL is employed as the search key.
  • We employ a replacement policy that is based on the URL access frequency:
  • the least frequently accessed entry in the URL cache is selected for replacement first.
  • A priority queue with access frequency as the key is a suitable implementation for such a replacement policy.
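A sketch of such a structure in Python (illustrative: a dict keyed by the URL's MD5 digest, plus a frequency-ordered heap with lazy deletion):

    import hashlib, heapq

    class URLCache:
        def __init__(self, max_entries):
            self.max_entries = max_entries
            self.freq = {}    # MD5 digest -> access frequency
            self.heap = []    # (frequency, digest); may hold stale entries

        def touch(self, url):
            key = hashlib.md5(url.encode()).digest()
            self.freq[key] = self.freq.get(key, 0) + 1
            heapq.heappush(self.heap, (self.freq[key], key))
            while len(self.freq) > self.max_entries:
                f, victim = heapq.heappop(self.heap)
                if self.freq.get(victim) == f:      # skip stale entries
                    del self.freq[victim]           # evict least frequent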

56
Admission control
  • Efficient Management of the URL Cache
  • Each entry of the URL cache records the MD5 of the URL, the access frequency, and a few pointers for maintaining the priority queue and hash table data structures.
  • The required memory space for each entry in the URL cache is constant.
  • The size of the hash table and priority queue themselves is small and does not depend on the number of entries hashed, and thus can be ignored.
  • Based on the size of the UC trace we studied in this paper, keeping all the URLs of the requests from a one-day period in the URL cache is reasonable. This accounts for 400K URLs. Therefore, assuming 80 bytes is needed for each entry in the URL cache, 32 MB of memory space is needed for the URL cache.

57
[Figure: hit ratio h(S) and effective hit ratio heff(S) versus cache size (1 to 32); curves labeled CHU and HR; hit-ratio axis ranges from 0.55 to 0.7]
58
Removal frequency
  • On-demand: run the policy when the size of the requested document exceeds the free room in the cache. (The removal takes time.)
  • Periodically: run the policy every T time units, for some T.
  • If removal is time consuming:
  • Both on-demand and periodically: run the policy at the end of each day, and on demand (Pitkow/Recker [13]).

59
On-demand
  • Two arguments suggest that the overhead of simply using on-demand replacement will not be significant.
  • First, this class of removal policies maintains a sorted list. If the list is kept sorted as the proxy operates, then the removal policy merely removes the head of the list, which should be a fast, constant-time operation.
  • Second, a proxy server keeps read-only documents. Thus there is no overhead for "writing back" a document, as there is in a virtual memory system upon removal of a page that was modified since being loaded.

60
How many to remove
  • The removal process is stopped when the free cache area equals or exceeds the requested document size.
  • Replace documents until a certain threshold (Pitkow and Recker's "comfort level") is reached.