Using Database Technology to Improve Performance of Web Proxy Servers

1
Using Database Technology to Improve Performance
of Web Proxy Servers

K. Cheng¹, Y. Kambayashi¹, M. Mohania²
¹Kyoto University, Japan
²Western Michigan University, USA

2
Caching on web proxy servers
[Figure: a caching proxy between clients and web servers]

Improve throughput of proxy servers
Improve response times for end users
Bridge bandwidth gap between WAN and LAN
Distribute workload from web servers

3
Characteristics of proxy caching
Property              Traditional caching   Proxy caching
Storage               Memory-based          Disk-based
Cache size            Small                 Huge
Object survival time  Short                 Long
Algorithm             Simple                Can be complex
Who uses it?          Programmed processes  People with specific interests
4
Limitations of current caching schemes: case 1

Tom found a very good page, P1, about car models.
John was looking for the same kind of page, but he only found P2.
Both P1 and P2 were cached, but Tom didn't know about P2 and John didn't know about P1.
After several days, both pages were replaced because nobody visited them again.
As a result, Tom missed P2, John missed P1, and the cache missed two hits.

State-of-the-art caching schemes cannot deal with this case!
5
Limitations of current caching schemes: case 2

Suppose the users of a proxy server are mostly interested in XML, and rarely in Fuzzy.
Suppose some clients retrieved pages P1 and P2.
After checking the content of P1 and P2, we know that P1 is about XML and P2 is about Fuzzy.

Should we prefer to cache P1 or P2?
6
Why can't current schemes deal with these cases?

Physical-object-based cache management
  Content transparency → low utilization rate (Case 1)
    Approximately 60% of data in cache never used
    Approximately 90% of data in cache rarely used
  Usage-based object replacement → needlessly long stay time for irrelevant content (Case 2)

7
Our solution

We propose a hierarchical data model for managing web data (physical pages, logical pages and topics).
Object replacement based on
  Link structure (logical pages)
  Semantic similarity with other objects (topics)
Facilitate active access to cache contents

8
A hierarchical model for web data
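[Figure: three-level hierarchy. Topics (T1, T2), handled by the topic manager and reached by navigating, map to logical pages (L1, L2, L3), handled by the logical page manager and reached by searching, which in turn map to physical pages (p1 to p6), handled by the physical page manager and reached by browsing.]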
9
Physical pages
[Figure: example physical pages A and B. Labels shown: http://www.difa.unibas.it/webdb2001, ../icons/webdblogo.gif, /instructionsPage/index.html]
10
Logical page
[Figure: a logical page containing physical pages A and B]
11
Managing physical pages

Physical page
  HTML/plain text file (.html, .txt)
  Embedded media file (.gif, .png, .wav, .mp3)
  Application-generated file (.pdf, .ps, .doc)
Physical pages are managed based on
  URL (protocol, IP, port, path)
  Physical properties (e.g. size, cost)
  Usage (frequency, recency)
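
The three management criteria above can be pictured as one record per physical page. The following is a minimal sketch, not the authors' implementation: the `PhysicalPage` class, its field names, the `DEFAULT_PORTS` table and the helper `physical_page_from_url` are all hypothetical, and the host name stands in for the "IP" component.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Assumption of this sketch: fall back to well-known ports when the URL has none.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


@dataclass
class PhysicalPage:
    # URL components (protocol, host, port, path)
    protocol: str
    host: str
    port: int
    path: str
    # Physical properties
    size: int = 0        # bytes
    cost: float = 0.0    # retrieval cost; the unit is left open here
    # Usage
    frequency: int = 0          # number of accesses so far
    last_access: float = 0.0    # timestamp of the most recent access

    def record_access(self, now: float) -> None:
        """Update the usage fields when the page is requested."""
        self.frequency += 1
        self.last_access = now


def physical_page_from_url(url: str, size: int = 0, cost: float = 0.0) -> PhysicalPage:
    """Split a URL into the components used to index a physical page."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme, 0)
    return PhysicalPage(parts.scheme, parts.hostname or "", port,
                        parts.path or "/", size, cost)
```

For example, `physical_page_from_url("http://www.difa.unibas.it/webdb2001")` yields a record with protocol `http`, port 80 and path `/webdb2001`.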

12
Constructing logical pages

Basic logical pages
  Single multimedia document
  HTML (1) + embedded media files (1..)
Extended logical pages
  Several closely related, directly linked pages
  E.g. an HTML paper with its sections in separate multimedia documents
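
One way to picture a basic logical page is as an HTML file grouped with the media files it embeds. The sketch below is only an illustration under assumptions: the function `basic_logical_page`, the helper class `_MediaCollector` and the choice of tags treated as embedded media are hypothetical and do not come from the slides.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _MediaCollector(HTMLParser):
    """Collects the src attribute of tags that embed media in a page."""

    # Assumption: these tags are treated as "embedded media".
    MEDIA_TAGS = {"img", "embed", "audio", "video", "source"}

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag in self.MEDIA_TAGS:
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def basic_logical_page(html_url, html_text):
    """Return the URLs of the physical pages forming one basic logical page.

    The first entry is the HTML file itself; the rest are its embedded
    media files, resolved against the HTML file's URL.
    """
    collector = _MediaCollector()
    collector.feed(html_text)
    members = [html_url]
    for src in collector.sources:
        absolute = urljoin(html_url, src)
        if absolute not in members:
            members.append(absolute)
    return members
```

Extended logical pages would additionally follow hyperlinks between closely related HTML files, which this sketch does not attempt.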

13
Managing topics

Defining a topic
  Topic: <id, name, criteria, popularity, date, ...>
  Popularity = f(F, R, P, U)
    F: access frequency of the topic
    R: time interval between the last access time and the current time
    P: number of logical pages belonging to the topic
    U: number of users accessing the topic
Deciding membership of a logical page in a topic
  IR approaches (k-NN, ...)
  ML approaches (e.g. Support Vector Machine, SVM)
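
The topic record and its popularity can be sketched as follows. The slides do not give the form of f, so the formula in `popularity` is purely an assumption chosen to grow with F, P and U and shrink with R; the `Topic` class and its method names are hypothetical as well.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    id: int
    name: str
    criteria: str                 # e.g. keywords deciding page membership
    date: str = ""                # the slide lists a date field; its meaning is not specified
    frequency: int = 0            # F: access frequency of the topic
    last_access: float = 0.0      # used to derive R
    pages: set = field(default_factory=set)   # P = len(pages)
    users: set = field(default_factory=set)   # U = len(users)

    def record_access(self, user: str, logical_page: str, now: float) -> None:
        """Register one access to a logical page of this topic."""
        self.frequency += 1
        self.last_access = now
        self.pages.add(logical_page)
        self.users.add(user)

    def popularity(self, now: float) -> float:
        """Assumed form of f(F, R, P, U): (F + P + U) / (1 + R)."""
        r = max(now - self.last_access, 0.0)
        return (self.frequency + len(self.pages) + len(self.users)) / (1.0 + r)
```

With this assumed formula, a topic that stops being accessed loses popularity as R grows, while new pages and new users raise it.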

14
Definitions

We use the term "priority" for object replacement. It is a function of several parameters, e.g. access frequency (F), time interval (R), object size (S), retrieval cost (C) and significance (G).
Significance: importance of the topic

15
Caching policy: LRU-SP

Topic management
  Priority = f(F, R, G)
Logical page management
  Basic logical pages only
  Priority = g(F, R)
Physical page management
  LRU-SP: size-adjusted, popularity-aware LRU (K. Cheng et al., COMPSAC 2000)
  Priority = h(F, R, S)
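
The slides name the arguments of f, g and h but not their form. The sketch below picks simple hypothetical forms only to show how the three levels could differ; in particular, `physical_page_priority` is not the published LRU-SP formula, and all three function names are made up for this example.

```python
def topic_priority(frequency: int, interval: float, significance: float) -> float:
    """Assumed f(F, R, G): frequent, recent and significant topics rank higher."""
    return significance * frequency / (1.0 + interval)


def logical_page_priority(frequency: int, interval: float) -> float:
    """Assumed g(F, R): frequent and recently used logical pages rank higher."""
    return frequency / (1.0 + interval)


def physical_page_priority(frequency: int, interval: float, size: int) -> float:
    """Assumed h(F, R, S): like g, but large objects are penalised."""
    if size <= 0:
        raise ValueError("size must be positive")
    return frequency / ((1.0 + interval) * size)
```

Under these assumptions, a lower value means the object is a better candidate for replacement at its level.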

16
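Evaluate and add new objects
[Figure: the cache hierarchy ordered by priority, with an incoming new object D that is of higher priority. Topics: T2 (higher priority) and T1 (lower priority). Logical pages: L1, L2, L3. Physical pages: P10, P11, P12, P20, P21, P22, P30, P31, P40, P41, P42.]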

17
Replace an object

Choose a candidate topic (T1).
T1 has one logical page (L1), so choose L1.
L1 has three physical pages (P10, P11, P12), of which P12 is shared with L2 and is therefore not a candidate.
Choose a victim P from P10 and P11.
Replace P with the new page.
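
The walk-through above can be sketched as a top-down search for the victim. This is an illustration under assumptions: the dictionary layout, the function name `choose_victim` and the rule "lowest priority wins at each level" are hypothetical; the slides only state that a candidate is chosen at each level and that shared physical pages are skipped.

```python
def choose_victim(topics, logical_pages, physical_priority):
    """Pick the physical page to replace.

    topics:            {topic id: (priority, [logical page ids])}
    logical_pages:     {logical page id: (priority, [physical page ids])}
    physical_priority: {physical page id: priority}
    Returns a physical page id, or None if no candidate exists.
    """
    # 1. Candidate topic: the lowest-priority topic that still has pages.
    topic = min((t for t in topics if topics[t][1]),
                key=lambda t: topics[t][0], default=None)
    if topic is None:
        return None

    # 2. Candidate logical page: the lowest-priority page in that topic.
    page = min(topics[topic][1], key=lambda lp: logical_pages[lp][0])

    # 3. Physical pages also used by another logical page are not candidates.
    shared = {p
              for other, (_, members) in logical_pages.items() if other != page
              for p in members}
    candidates = [p for p in logical_pages[page][1] if p not in shared]

    # 4. Victim: the lowest-priority remaining physical page.
    return min(candidates, key=lambda p: physical_priority[p], default=None)
```

In the slide's example, P12 is excluded because L2 also refers to it, so the victim comes from P10 and P11.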

18
Preliminary experiments

Replayed the access logs of our proxy server (Squid)
  30 clients, 30 days
  873,824 requests, 21.30 GB of data
  7 topics, priority in 1..5
Significance factor (0, 2)
  Measures the significance of each topic
Hit rate (HR)
  Percentage of requests satisfied by the cache
Profit rate (PR)
  [Formula lost in extraction; one of its terms is the significance of the topic]
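
The hit rate defined above can be measured by replaying a request log against a simulated cache. The sketch below is a minimal example that uses plain LRU as a stand-in replacement policy, not LRU-SP or LRV; the function name `replay_hit_rate` and the `(url, size)` log format are assumptions.

```python
from collections import OrderedDict


def replay_hit_rate(requests, capacity_bytes):
    """Replay (url, size) requests against an LRU cache and return HR in percent."""
    cache = OrderedDict()   # url -> size, ordered from least to most recently used
    used = 0
    hits = 0
    total = 0
    for url, size in requests:
        total += 1
        if url in cache:
            hits += 1
            cache.move_to_end(url)      # mark as most recently used
            continue
        if size > capacity_bytes:
            continue                    # object can never fit; do not cache it
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)   # evict the LRU entry
            used -= evicted_size
        cache[url] = size
        used += size
    return 100.0 * hits / total if total else 0.0
```

Swapping the eviction step for a priority-based choice would let the same loop compare policies on one log.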

19
Baseline algorithm: LRV (Rizzo et al., 1998)

A physical-page-based algorithm
Uses size (S) to predict further accesses to incoming objects
Parameters considered
  Access frequency (F)
  Time interval (R)
  Object size (S)

20
Results: hit rates up 20%
[Figure: x-axis is cache space in % of total unique data]
21
Results: profit rates up 30%
[Figure: x-axis is cache space in % of total unique data]
22
Conclusion and future work

The performance of caching proxies can be remarkably improved if cache contents are well organized and managed
Proposed a hierarchical model and a cache management scheme based on that model
Future work
  Tuning various parameters to achieve better performance (logical page clustering, priority balancing between significance and popularity, etc.)
  More experiments
