Web Prefetching: Costs, Benefits and Performance

Description:

Department of Electrical and Computer Engineering, UNM. 1 ... Dept. of Electrical & Computer Engineering. The University of New Mexico. Aug 15, 2002 ... – PowerPoint PPT presentation

1
Web Prefetching: Costs, Benefits and Performance
  • Yingyin Jiang, Min-You Wu, Wei Shu
  • Dept. of Electrical & Computer Engineering
  • The University of New Mexico
  • Aug 15, 2002
  • WCW 2002, Boulder, Colorado

2
Talk Focus
  • A solution space of web prefetching
  • Several object-selection criteria
  • New simple selection algorithm
  • Select a good set of objects
  • Maximize benefit/cost
  • Can be tuned to achieve different goals
  • New performance metric
  • Balance of benefits (hit rate improvement)
    against costs (network bandwidth increase)

3
Outline
  • Motivation
  • Our approach
  • Performance Evaluation
  • Discussions and Conclusions

4
Motivation for Prefetching
  • Passive caching yields a limited hit ratio
  • Typically 20% to 40%
  • Limited by newly introduced and dynamically
    generated data, and by rapid changes to objects
    on the web
  • Prefetching can further improve the hit ratio (and
    reduce client latency), but at the cost of network
    bandwidth
  • Predict future accesses to objects
  • Fetch objects before users request them

5
Key Parameters for Prefetching Algorithms
  • Object popularity
  • Zipf popularity distribution
  • $p_i = C / i^{\alpha}$
  • The probability of a request for the ith most
    popular document is inversely proportional to
    $i^{\alpha}$
  • Object lifetime
  • An indication of how often the object is modified
  • Key factor in the design of a prefetching
    algorithm
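
A minimal Python sketch of the Zipf model on this slide, assuming the constant C is chosen so that the probabilities sum to 1 over a finite catalog. The function name `zipf_popularity` and the catalog size of 1,000 are illustrative and not from the talk; the exponent 0.986 is the parameter quoted on the evaluation-methodology slide.

```python
def zipf_popularity(num_objects, alpha):
    """Return [p_1, ..., p_N] with p_i = C / i**alpha, where C normalizes the sum to 1."""
    weights = [1.0 / (i ** alpha) for i in range(1, num_objects + 1)]
    c = 1.0 / sum(weights)  # normalizing constant C (assumption: finite catalog)
    return [c * w for w in weights]


# Illustrative catalog size; the talk's workload is much larger.
probs = zipf_popularity(num_objects=1000, alpha=0.986)
```

With this form, the most popular object is requested about 2^0.986 times as often as the second most popular one.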

6
Solution Space for Web Prefetching
  • Six models
  • Two extreme cases
  • Passive caches (non-prefetching)
  • Prefetching all objects
  • The other four algorithms use different
    object-selecting criteria and fetch objects with
    values that exceed the threshold
  • Popularity
  • Lifetime
  • Good Fetch
  • APL

7
Two Simple Schemes
  • Popularity
  • Keep the most popular objects in the system
  • Update these objects immediately whenever they
    are modified
  • Threshold: the object's popularity
  • Lifetime
  • Keep the objects with the longest lifetimes
  • Mostly considers network resource demands
  • Threshold: the object's expected lifetime

8
Good Fetch
  • Proposed by Venkataramani et al. [Venkataramani01]
  • Attempts to balance the benefit against the cost
    of keeping an object
  • Threshold: the probability that a prefetched object
    is accessed before it changes
  • $P_i = 1 - (1 - p_i)^{a \, l_i}$
  • $l_i$: object i's expected lifetime
  • $a$: average request arrival rate
  • $p_i$: object i's popularity
  • Prefetch object i if $P_i$ exceeds the threshold
9
APL
  • Attempts to balance benefit against cost
  • Threshold: the expected number of requests for
    object i that arrive during its lifetime
  • $a \, p_i \, l_i$
  • $l_i$: object i's expected lifetime
  • $a$: average request arrival rate
  • $p_i$: object i's popularity
  • Prefetch object i if $a \, p_i \, l_i$ exceeds the threshold
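
A small sketch of APL selection, taking the score to be a·p_i·l_i, the expected number of requests for object i during its lifetime. The catalog entries, the arrival rate, and the threshold of 1.0 are made-up values for illustration.

```python
def apl_score(p_i, a, l_i):
    """Expected number of requests for object i during its lifetime."""
    return a * p_i * l_i


def select_for_prefetch(objects, a, threshold):
    """objects: iterable of (name, p_i, l_i); keep names whose score exceeds the threshold."""
    return [name for name, p_i, l_i in objects if apl_score(p_i, a, l_i) > threshold]


# Hypothetical catalog: (name, popularity, expected lifetime in seconds).
catalog = [
    ("popular_short_lived", 1e-3, 60.0),     # score 0.6
    ("popular_long_lived", 1e-3, 86_400.0),  # score 864
    ("rare_long_lived", 1e-7, 86_400.0),     # score 0.0864
]
chosen = select_for_prefetch(catalog, a=10.0, threshold=1.0)
```

Only the object that is both popular and long-lived clears the threshold here, which is the balance the slide describes.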

10
Enhanced APL
  • Enhanced APL
  • Prefetch object i if $a \, (p_i)^n \, l_i$ exceeds the threshold
  • Motivation -- adapt to network status
  • When the network has abundant bandwidth, a larger
    value of n can be used to fetch more popular
    objects and improve the hit ratio
  • When the network is congested, a smaller value
    of n can be used, or prefetching can even be
    disabled, to save bandwidth
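
To illustrate how the exponent n shifts the preference, here is a sketch that ranks two hypothetical objects by a·(p_i)^n·l_i, the score form given on the APL Family slide. The object names and values are invented for the example.

```python
def enhanced_apl_score(p_i, a, l_i, n):
    """Score a * (p_i ** n) * l_i; n = 1 reduces to plain APL."""
    return a * (p_i ** n) * l_i


def rank_objects(objects, a, n):
    """objects: list of (name, p_i, l_i); returns names, highest score first."""
    ordered = sorted(objects, key=lambda o: enhanced_apl_score(o[1], a, o[2], n), reverse=True)
    return [name for name, _, _ in ordered]


# Hypothetical objects: one popular but short-lived, one rare but long-lived.
catalog = [
    ("popular_short_lived", 1e-2, 100.0),
    ("rare_long_lived", 1e-4, 100_000.0),
]
with_spare_bandwidth = rank_objects(catalog, a=1.0, n=2.0)  # larger n favors popularity
under_congestion = rank_objects(catalog, a=1.0, n=0.5)      # smaller n favors lifetime
```

Raising n moves the popular object to the front of the ranking; lowering it moves the long-lived object to the front.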

11
Performance Evaluation for Prefetching
  • Evaluation metrics
  • Benefit -- hit ratio
  • Cost -- bandwidth
  • Benefit/cost -- H/B
  • Algorithms to be evaluated
  • Popularity
  • Lifetime
  • Good Fetch
  • APL

12
New Evaluation Metric: H/B
  • Measures benefit/cost
  • Passive caching serves as the baseline for
    comparison
  • Enhanced metric: $H^k/B$
  • Emphasizes the benefit -- hit ratio improvement
  • When the system has plenty of spare bandwidth, even
    a small hit ratio improvement can be justified
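
One way the metric could be computed, as a sketch only. The slide says passive caching is the baseline but the exact normalization did not survive in this transcript, so the code assumes H is the hit ratio with prefetching divided by the passive-caching hit ratio, and B is the corresponding bandwidth ratio. The function name and sample values are hypothetical.

```python
def benefit_cost(hit, hit_base, bw, bw_base, k=1.0):
    """Assumed definition: (relative hit-ratio gain) ** k / (relative bandwidth increase).

    k = 1 gives plain H/B; k > 1 weights hit-ratio improvement more heavily.
    """
    h = hit / hit_base  # hit ratio relative to passive caching
    b = bw / bw_base    # bandwidth relative to passive caching
    return (h ** k) / b


# Hypothetical numbers: hit ratio 30% -> 45%, bandwidth 50 -> 100 kbps.
plain = benefit_cost(0.45, 0.30, 100.0, 50.0)          # about 0.75
weighted = benefit_cost(0.45, 0.30, 100.0, 50.0, k=2)  # about 1.125
```

Under this assumed definition a value above 1 means the relative benefit outweighs the relative cost, and passive caching itself scores exactly 1.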

13
Evaluation Methodology
  • Analytical simulation evaluation
  • Gives a proof of concept for the performance of
    the different algorithms
  • Experimental settings
  • Poisson model of user request arrivals
  • Workload of one million objects [Douglis97,
    Breslau99, Nishikawa98]
  • Zipf popularity distribution with parameter
    0.986 [Breslau99]
  • Object lifetime distribution obtained from
    [Douglis97]
  • Fixed object size of 10 KB [Bray96,
    Williams96, Abdulla98, Arlitt99]
  • No correlation between lifetimes, sizes, and
    popularities [Crovella98, Breslau99]
14
Distribution of Object Lifetimes [Douglis97]
  • We vary the mean lifetime of objects across
    several orders of magnitude
  • Shift factor 0: mean 3.8 months
  • Shift factor -2: mean 1.2 days
  • Shift factor -4: mean 16.7 minutes
  • The shift factor denotes the horizontal
    displacement of the cumulative distribution
    function along the lifetime axis (on a log scale)
15
Results -- Hit Ratio
[Figure: Hit Ratio vs. prefetched objects (log10 scale); panels for shift factor -4 and shift factor 0]
  • Popularity -- the highest hit ratio
  • APL (n = 1) performs very close to GoodFetch
  • GoodFetch and APL (n = 1) behave more like Lifetime
    at longer mean lifetimes, and more like Popularity
    at shorter mean lifetimes
  • Lifetime -- the lowest hit ratio

16
Results -- Bandwidth
[Figure: BW (kbps) vs. prefetched objects (log10 scale); two panels, one labeled shift factor -4]
  • Popularity consumes the most network bandwidth
  • GoodFetch and APL obtain a significant improvement
    in hit ratio at the expense of a moderate bandwidth
    increase
  • E.g., when prefetching 0.1% of objects, a 15-point
    increase in hit ratio (39.54% vs. 24.3%)
  • Total bandwidth < 2 × demand bandwidth (113.50 kbps
    vs. 60.57 kbps)
  • Lifetime consumes the smallest amount of bandwidth
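
The arithmetic behind the example on this slide, using the four figures quoted in it. The variable names are illustrative, and the hit-ratio figures are read as percentages, which is an assumption since the units were lost in the transcript.

```python
hit_base, hit_prefetch = 24.3, 39.54  # hit ratio, read as percent
bw_base, bw_prefetch = 60.57, 113.50  # bandwidth in kbps

hit_gain_points = hit_prefetch - hit_base  # about 15.2 percentage points
hit_gain_ratio = hit_prefetch / hit_base   # about 1.63x the baseline hit ratio
bw_ratio = bw_prefetch / bw_base           # about 1.87x, i.e. under 2x demand bandwidth
```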

17
Results -- H/B
[Figure: H/B vs. prefetched objects (log10 scale); panels for shift factor 0 and shift factor -4]
  • Popularity -- drops quickly, not comparable with
    the others
  • GoodFetch and APL -- attain high H/B values,
    showing their effectiveness at maximizing
    benefit/cost
  • Lifetime -- decreases slowly from the very
    beginning

18
APL Family -- Hit Ratio
[Figure: Hit Ratio vs. prefetched objects (log10 scale)]
  • $a \, (p_i)^n \, l_i$, with n = 0.5, 1, 2, 5
  • n = 1: APL -> Good Fetch, maximizes benefit/cost
  • n > 1: APL -> Popularity, increases hit ratio
  • n < 1: APL -> Lifetime, reduces bandwidth
    consumption
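
A small illustration, on synthetic data, of two of the limiting behaviors listed above: with n = 1 the APL ranking agrees with a Good Fetch ranking, and with n = 5 it agrees with a pure popularity ranking. The Good Fetch form 1 - (1 - p)^(a·l) is assumed, the four objects are invented, and agreement on this toy catalog is an example rather than a proof.

```python
def good_fetch(p, a, l):
    """Assumed Good Fetch probability: 1 - (1 - p) ** (a * l)."""
    return 1.0 - (1.0 - p) ** (a * l)


def apl(p, a, l, n=1.0):
    """APL family score: a * p ** n * l."""
    return a * (p ** n) * l


def ranking(objects, score):
    """Names ordered by score(p, l), highest first."""
    ordered = sorted(objects, key=lambda o: score(o[1], o[2]), reverse=True)
    return [name for name, _, _ in ordered]


a = 1.0  # hypothetical arrival rate
# Synthetic (name, popularity, lifetime) triples.
objects = [("A", 1e-3, 100.0), ("B", 1e-4, 5_000.0), ("C", 1e-5, 200_000.0), ("D", 1e-2, 5.0)]

by_good_fetch = ranking(objects, lambda p, l: good_fetch(p, a, l))
by_apl_n1 = ranking(objects, lambda p, l: apl(p, a, l, n=1.0))
by_apl_n5 = ranking(objects, lambda p, l: apl(p, a, l, n=5.0))
by_popularity = ranking(objects, lambda p, l: p)
```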

19
APL Family -- Bandwidth
[Figure: BW (kbps) vs. prefetched objects (log10 scale)]
  • E.g., with n = 5, APL's hit ratio is very close to
    that of Popularity, while its bandwidth cost is
    noticeably smaller.

20
APL Family -- H/B Ratio
[Figure: H/B vs. prefetched objects (log10 scale)]
  • From the H/B point of view, APL can be biased toward
    popular or long-lived objects without sacrificing
    too much benefit/cost
  • n = 1: achieves the best benefit/cost
  • n > 1: gains more hit ratio at a fair
    bandwidth cost
  • n < 1: reduces bandwidth while keeping a reasonable
    hit ratio

21
Enhanced $H^k/B$
  • Recall -- why do we extend H/B to $H^k/B$?
  • To emphasize hit ratio improvement when evaluating
    benefit/cost
  • When evaluating with $H^k/B$, a small
    hit ratio improvement can still be justified even
    at the cost of a disproportionate bandwidth
    increase
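
To make the effect of k concrete: if H and B are both ratios relative to passive caching (an assumption about the normalization), then H^k/B > 1 exactly when k > ln(B)/ln(H), provided H > 1. The sketch below computes that break-even exponent; the function name and the sample ratios are hypothetical.

```python
import math


def break_even_k(h, b):
    """Return ln(b) / ln(h): h ** k / b exceeds 1 exactly when k is above this value.

    Requires h > 1 (the hit ratio actually improved) and b > 0.
    """
    if h <= 1.0:
        raise ValueError("h must exceed 1 for any positive k to justify the cost")
    return math.log(b) / math.log(h)


# Hypothetical case: 20% relative hit-ratio gain for double the bandwidth.
k_star = break_even_k(h=1.2, b=2.0)  # about 3.8
```

In this example plain H/B (k = 1) would reject the trade, while k = 4 would accept it.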

22
$H^k/B$ Evaluation
[Figure: $H^k/B$ vs. prefetched objects (log10 scale)]
  • A higher k allows more objects to be prefetched,
    so $H^k/B$ rewards further hit ratio improvement
  • For $H^k/B > 1$, how many objects can be prefetched?
  • k = 1 -- 700; k = 2 -- 2,000; k = 3 -- 7,000;
    k = 4 -- 30,000; k = 5 -- 200,000

23
Discussions
[Figure: solution space plotted on Hit Ratio and Bandwidth axes]
  • We obtain a solution space for prefetching in which
    the different strategies occupy different positions
    along the hit ratio and bandwidth axes

24
Conclusions
  • We propose a new prefetching algorithm, $AP^nL$,
    that can adapt to different network
    conditions by varying n
  • Prefetching must consider both object popularity
    and lifetime in order to significantly improve
    hit ratios at modest cost