Application-Controlled File Caching Policies
1
ECE 7995 Presentation
  • Application-Controlled File Caching Policies
  • Pei Cao, Edward W. Felten and Kai Li
  • Presented By
  • Mazen Daaibes
  • Gurpreet Mavi

2
Outline (Part 1)
  • Introduction
  • User-Level File Caching
  • Two-Level Replacement
  • Kernel Allocation Policy
  • An Allocation Policy
  • First Try
  • Swapping Positions
  • Place Holders
  • Final Allocation Scheme

3
Introduction
  • File caching is a widely used technique in file system implementation.
  • The major challenge is to provide a high cache hit ratio.
  • Two-level cache management is used:
  • Allocation: the kernel allocates physical pages to individual applications.
  • Replacement: each application is responsible for deciding how to use its physical pages.
  • Previous work has focused on replacement, largely ignoring allocation.
  • Some applications have special knowledge about their file access patterns, which can be used to make intelligent cache replacement decisions.
  • Traditionally, these applications buffer file data in user address space as a way of controlling replacement.
  • This approach leads to double buffering, because the kernel tries to cache file data as well.
  • It also does not give the application real control, because the virtual memory system can still page out data in the user address space.

4
Introduction Cont.
  • This paper considers how to improve the
    performance of file caching by allowing
    user-level control over file cache replacement
    decisions.
  • To reduce the miss ratio, a user-level cache
    needs not only an application-tailored
    replacement policy but also enough available
    cache blocks.
  • The challenge is to allow each user process to
    control its own caching and at the same time to
    maintain the dynamic allocation of cache blocks
    among processes in a fair way so that overall
    system performance improves.
  • The key element in this approach is a sound
    allocation policy for the kernel, which will be
    discussed next.

5
User Level File Caching
  • Two-Level Replacement
  • The approach presented in this paper is called two-level cache block replacement.
  • It splits responsibility between the kernel and user levels:
  • The kernel is responsible for allocating cache blocks to processes.
  • Each user process is free to control the replacement strategy on its share of cache blocks.
  • If a process chooses not to exercise this control, the kernel applies a default strategy (LRU). A minimal sketch of this split follows.
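
A minimal sketch of this split, in Python with hypothetical names (the paper gives no code): the kernel keeps a global LRU list and, on a miss, consults the owner of the least recently used block; processes that register no policy get plain LRU. The swapping and place-holder refinements of the later slides are deliberately omitted here.

    from collections import OrderedDict

    class TwoLevelCache:
        """Toy model of two-level replacement: the kernel owns allocation
        via a global LRU list; the owner of the LRU block may pick a
        different victim among its own blocks."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.lru = OrderedDict()   # block -> owning process; oldest first
            self.policy = {}           # process -> choose(candidate, own_blocks)

        def reference(self, proc, block):
            if block in self.lru:                  # hit: block becomes most recent
                self.lru.move_to_end(block)
                return "hit"
            if len(self.lru) >= self.capacity:     # miss with a full cache
                victim, owner = next(iter(self.lru.items()))  # global LRU block
                choose = self.policy.get(owner)
                if choose is not None:             # consult the owning process,
                    own = [b for b, p in self.lru.items() if p == owner]
                    victim = choose(victim, own)   # which returns one of its blocks
                del self.lru[victim]               # default: kernel's LRU choice
            self.lru[block] = proc                 # load the new block, most recent
            return "miss"

A process that never registers a policy gets exactly global LRU, which is the first principle on the next slide.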

6
User Level File Caching Cont.
Application P
Application Q
1. P misses
2. Kernel asks Q
4. Reallocates B
3. Q gives up B
Kernel
7
Kernel Allocation Policy
  • The kernel's allocation policy should satisfy three principles:
  • A process that never overrules the kernel does
    not suffer more misses than it would under global
    LRU.
  • A foolish decision by one process never causes
    another process to suffer more misses.
  • A wise decision by one process never causes any
    process, including itself, to suffer more misses.

8
First Try
Time  P's ref   Q's ref   Cache State   Global LRU List   Notes
t0                        A W X B       A W X B           A is the least recently used, but P suggests B
t1              Y (MISS)  A W X Y       A W X Y           B is replaced; A stays at the LRU tail
t3              Z (MISS)  Z W X Y       W X Y Z           P has no choice but to give up A
t5    A (MISS)            Z A X Y       X Y Z A
t7    B (MISS)            Z A B Y       Y Z A B

(LRU lists are ordered left to right in increasing recency.)
Total misses: 4 (2 by Q, 2 by P). Total misses under global LRU: 3 (2 by Q, 1 by P).
This violates Principle 3.
9
Swapping Positions
Time  P's ref   Q's ref   Cache State   Global LRU List   Notes
t0                        A W X B       A W X B           P suggests B: swap A and B in the LRU list, then replace B
t1              Y (MISS)  A W X Y       W X A Y
t3              Z (MISS)  A Z X Y       X A Y Z
t5    A (HIT)             A Z X Y       X Y Z A
t7    B (MISS)            A Z B Y       Y Z A B

(LRU lists are ordered left to right in increasing recency.)
Total misses: 3 (2 by Q, 1 by P). Total misses under global LRU: 3.
10
Swapping Positions
  • Swapping positions guarantees that if no process makes foolish choices, the global hit ratio is the same as or better than it would be under global LRU.
  • But what if a process makes a foolish choice?

11
No Place Holders
Time  P's ref   Q's ref   Cache State   Global LRU List   Notes
t0                        X A Y         X A Y             Q makes the wrong decision and replaces Y
t1              Z (MISS)  X A Z         A X Z
t3              Y (MISS)  X Y Z         X Z Y             P's block A is evicted for Q's mistake
t5    A (MISS)            A Y Z         Z Y A

(LRU lists are ordered left to right in increasing recency.)
Total misses: 3 (2 by Q, 1 by P). Total misses under global LRU: 1 (by Q).
This violates Principle 2.
12
With Place Holders
Time  P's ref   Q's ref   Cache State   Global LRU List   Notes
t0                        X A Y         X A Y             Q makes the wrong decision and replaces Y
t1              Z (MISS)  X A Z         A X(Y) Z          a place-holder for Y is created, pointing to X
t3              Y (MISS)  Y A Z         A Z Y             the place-holder is followed: X is replaced, not A
t5    A (HIT)             Y A Z         Z Y A

(LRU lists are ordered left to right in increasing recency.)
Total misses: 2 (2 by Q, 0 by P). Total misses under global LRU: 1 (by Q).
Q hurts itself by its foolish decision, but it does not hurt anyone else. Principle 2 is satisfied.
13
To summarize
  • If a reference to cache block b hits:
  • b is moved to the head of the global LRU list.
  • Any place-holder pointing to b is deleted.

14
To summarize
  • If a reference to cache block b misses:
  • 1st case: there is a place-holder for b, pointing to block t. Then t is replaced and its page is given to b. If t is dirty, it is written back to disk.
  • 2nd case: there is no place-holder for b. The kernel finds the block at the end of the LRU list, say block c, belonging to process P, and consults P.
  • If P chooses to replace block x, the kernel swaps x and c in the LRU list.
  • If there is a place-holder pointing to x, it is changed to point to c. Otherwise, a place-holder is built for x, pointing to c.
  • Finally, x's page is given to b. The sketch below puts both cases together.
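
A sketch in Python (hypothetical names, not the paper's implementation) combining the hit and miss rules above, including position swapping and place-holders:

    class PlaceholderCache:
        """Global LRU list plus place-holders: placeholder maps an evicted
        block to the resident block whose page it may later reclaim."""

        def __init__(self):
            self.lru = []            # blocks, least recently used first
            self.owner = {}          # block -> owning process
            self.policy = {}         # process -> choose(candidate, own_blocks)
            self.placeholder = {}    # evicted block -> resident block

        def hit(self, b):
            self.lru.remove(b)
            self.lru.append(b)                 # b moves to the head (most recent)
            for k, v in list(self.placeholder.items()):
                if v == b:                     # any place-holder pointing to b
                    del self.placeholder[k]    # is deleted

        def miss(self, b, proc):
            if b in self.placeholder:          # 1st case: follow b's place-holder
                t = self.placeholder.pop(b)
                self.lru.remove(t)             # t is replaced (written back if dirty)
                del self.owner[t]
            else:                              # 2nd case: consult the LRU block's owner
                c = self.lru[0]
                p = self.owner[c]
                own = [blk for blk in self.lru if self.owner[blk] == p]
                x = self.policy[p](c, own) if p in self.policy else c
                if x != c:
                    i = self.lru.index(x)      # swap x and c in the LRU list
                    self.lru[0], self.lru[i] = x, c
                    for k, v in self.placeholder.items():
                        if v == x:             # a place-holder pointing to x
                            self.placeholder[k] = c   # now points to c
                            break
                    else:
                        self.placeholder[x] = c       # else build one: x -> c
                self.lru.pop(0)                # x's page is given to b
                del self.owner[x]
            self.lru.append(b)                 # b becomes the most recent block
            self.owner[b] = proc

Replaying the slide-12 example (Q foolishly replacing Y) through this sketch reproduces the states shown there, including the X(Y) place-holder.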

15
Outline (Part 2)
  • Design Issues
  • - User/Kernel Interaction
  • - Shared Files
  • - Prefetching
  • Simulation
  • - Simulated Application Policies
  • - Simulation Environment
  • - Results
  • Conclusions

16
Design Issue 1: User/Kernel Interaction
  • Allow each user process to give hints to the kernel about which blocks it no longer needs or which blocks are less important than others.
  • Let a process inform the kernel about its access pattern for files (sequential, random, etc.); the kernel can then make replacement decisions for the user process.
  • Implement a fixed set of replacement policies in the kernel and let the user process choose from this menu: LRU, MRU, LRU-K, etc.
  • For full flexibility, the kernel can make an upcall to the manager process every time a replacement decision is needed.
  • Each manager process can maintain a list of free blocks, and the kernel can take blocks off the list when it needs them.
  • The kernel can implement some common policies and rely on upcalls for applications that do not want to use them. A sketch of such an interface follows.
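
A hypothetical interface sketch, in Python, of the options listed above; the names and signatures are illustrative only, not a real kernel API:

    from typing import Callable, List

    class FileCacheControl:
        """Illustrative user/kernel interaction surface."""

        def hint_unneeded(self, blocks: List[int]) -> None:
            """Hint: these blocks are no longer needed (or less important)."""

        def declare_pattern(self, path: str, pattern: str) -> None:
            """Declare an access pattern ('sequential', 'random', ...) and
            let the kernel choose replacements on the process's behalf."""

        def select_policy(self, name: str) -> None:
            """Pick a canned kernel policy from a fixed menu:
            'LRU', 'MRU', 'LRU-K', ..."""

        def register_upcall(self, choose: Callable[[int, List[int]], int]) -> None:
            """Full flexibility: the kernel upcalls choose(candidate, own_blocks)
            on each replacement decision and evicts the returned block."""

        def free_list(self, blocks: List[int]) -> None:
            """Maintain a list of free blocks the kernel may take when needed."""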

17
Design Issue 2: Shared Files
  • Concurrently shared files are handled in one of two ways:
  • If all the sharing processes agree to designate a
    single process as manager for the shared file,
    then the kernel allows this.
  • If the sharing processes fail to agree,
    management reverts to the kernel and the default
    global LRU policy is used.

18
Design Issue 3: Prefetching
  • The kernel would remain responsible for deciding how aggressively to prefetch; the prefetcher can simply be treated as another process competing for memory in the file cache.
  • Information about future file references (essential for prefetching) might be valuable to the replacement code as well, so adding prefetching may well make the allocator's job easier rather than harder.
  • Interaction between the allocator and the prefetcher would also be useful: the allocator could inform the prefetcher about the current demand for cache blocks, and the prefetcher could then voluntarily free blocks when it realizes that some prefetched blocks are no longer useful.

19
Simulation
  • Trace-driven simulation was used to evaluate two-level replacement.
  • In these simulations, the user-level managers used a general replacement strategy that takes advantage of knowledge about applications' file references.
  • Two sets of traces were used to evaluate the scheme:
  • Ultrix
  • Sprite

20
Simulated Application Policies
  • Two-level block replacement enables each user process to use its own replacement policy.
  • This solves the problem for sophisticated applications that know exactly what replacement policy they want.
  • For less sophisticated applications, knowledge about an application's file accesses can be used in the replacement policy.
  • Knowledge about file accesses can often be obtained through general heuristics, from the compiler, or from the application writer.

21
Simulated Application Policies (contd.)
  • The following replacement policy, based on the principle of RMIN (replace the block whose next reference is farthest in the future), is proposed to exploit partial knowledge of the future file access sequence.
  • When the kernel suggests a candidate replacement block to the manager process:
  • Find all blocks whose next references are definitely (or with high probability) after the next reference to the candidate block.
  • If there is no such block, replace the candidate block.
  • Otherwise, choose the block whose next reference is farthest in the future (a sketch follows).
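
A sketch of this heuristic in Python; next_ref is a hypothetical estimate of each block's next reference time, with math.inf standing for "probably never referenced again":

    import math

    def rmin_like_choice(candidate, own_blocks, next_ref):
        """Pick a victim per the RMIN-based rule above."""
        # Blocks whose next reference is (probably) after the candidate's.
        later = [b for b in own_blocks
                 if b != candidate and next_ref(b) > next_ref(candidate)]
        if not later:
            return candidate                 # no better victim is known
        return max(later, key=next_ref)      # next reference farthest in future

For an access-once pattern, for example, blocks already read would get a next_ref of math.inf and be evicted before blocks still awaiting their first reference.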

22
Simulated Application Policies (contd.)
  • This strategy can be applied to general applications with the following common file reference patterns (a detection sketch for the sequential case follows the list):
  • Sequential: most files are accessed sequentially most of the time.
  • File-specific sequence: some files are mostly accessed in one of a few sequences.
  • Filter: many applications access files one by one, in the order of their names on the command line, and access each file sequentially from beginning to end.
  • Same-order: a file or a group of files is repeatedly accessed in the same order.
  • Access-once: many programs do not re-read or re-write file data that they have already accessed.
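
As one illustration, a manager might detect the sequential pattern from recent block numbers; this tiny sketch and its window threshold are assumptions, not from the paper:

    def looks_sequential(block_ids, window=8):
        """True if the last `window` block numbers of a file each
        advance by exactly one (a simple sequential-access heuristic)."""
        recent = block_ids[-window:]
        return len(recent) >= 2 and all(
            b - a == 1 for a, b in zip(recent, recent[1:]))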

23
Simulation Environment
  • Two trace-driven simulations were used for a preliminary evaluation of the ideas presented in this paper:
  • Ultrix
  • Traces of various applications running on a DEC 5000/200 workstation.
  • 1.6 MB file cache.
  • Block size: 8 KB.
  • Sprite
  • File system traces from UC Berkeley.
  • Recording the file activities of about 40 clients over a period of 48 hours.
  • Client cache size: 7 MB.

24
Results- Ultrix Traces
  • Postgres: a relational database system.
  • The sequential access pattern was used as the user-level policy.
  • The designer of the database system could certainly give a better user-level policy, further improving the hit ratio.

25
Results- Ultrix Traces
  • Cscope: an interactive tool for examining C sources.
  • It reads a database of all source files sequentially from beginning to end to answer each query.
  • Applying the right user-level policy (the same-order access pattern) reduces the miss ratio significantly.

26
Results- Ultrix Traces
  • Link-editing: the Ultrix linker.
  • The linker in this system makes many small file accesses.
  • It does not fit the sequential access pattern, but fits the access-once pattern.

27
Results- Ultrix Traces
  • Multi-process workload: Postgres, cscope, and the linker running concurrently.
  • Each application was simulated running its own user-level policy, as discussed in the previous slides.
  • This yields the curve directly above RMIN.

28
Results Sprite Traces
  • In a system with a slow network (e.g., Ethernet), client caching performance determines the file system performance on each workstation.
  • Since most file accesses are sequential, the sequential heuristic can be used.
  • The sequential pattern improves the hit ratio for about 10% of the clients.

29
Conclusions
  • This paper proposes a two-level replacement scheme for file cache management:
  • a kernel policy for cache block allocation, and
  • several user-level replacement policies.
  • The kernel allocation policy guarantees performance improvements over the traditional global LRU file caching approach:
  • Processes that are unwilling or unable to predict their file access patterns perform at least as well as under global LRU.
  • A process that mispredicts its file access patterns cannot cause other processes to suffer more misses.
  • The key contribution is the guarantee that a good user-level policy will improve the file cache hit ratios of the entire system.

30
  • Questions.