Application-Controlled File Caching Policies
1
ECE 7995 Presentation
  • Application-Controlled File Caching Policies
  • Pei Cao, Edward W. Felten and Kai Li
  • Presented By
  • Mazen Daaibes
  • Gurpreet Mavi

2
Outline (Part 1)
  • Introduction
  • User-Level File Caching
  • Two-Level Replacement
  • Kernel Allocation Policy
  • An Allocation Policy
  • First Try
  • Swapping Positions
  • Place Holders
  • Final Allocation Scheme

3
Introduction
  • File caching is a widely used technique in file system implementation.
  • The major challenge is to provide a high cache hit ratio.
  • Two-level cache management is used:
  • Allocation: the kernel allocates physical pages to individual applications.
  • Replacement: each application is responsible for deciding how to use its physical pages.
  • Previous work has focused on replacement, largely ignoring allocation.
  • Some applications have special knowledge about their file access patterns, which can be used to make intelligent cache replacement decisions.
  • Traditionally, these applications buffer file data in user address space as a way of controlling replacement.
  • This approach leads to double buffering, because the kernel tries to cache file data as well.
  • It also does not give the application real control, because the virtual memory system can still page out data in the user address space.

4
Introduction Cont.
  • This paper considers how to improve the
    performance of file caching by allowing
    user-level control over file cache replacement
    decisions.
  • To reduce the miss ratio, a user-level cache
    needs not only an application-tailored
    replacement policy but also enough available
    cache blocks.
  • The challenge is to allow each user process to
    control its own caching and at the same time to
    maintain the dynamic allocation of cache blocks
    among processes in a fair way so that overall
    system performance improves.
  • The key element in this approach is a sound
    allocation policy for the kernel, which will be
    discussed next.

5
User Level File Caching
  • Two-Level Replacement
  • The approach presented in this paper is called two-level cache block replacement.
  • It splits responsibility between the kernel and user levels:
  • The kernel is responsible for allocating cache blocks to processes.
  • Each user process is free to control the replacement strategy on its share of cache blocks.
  • If a process chooses not to exercise this control, the kernel applies a default strategy (LRU). A minimal sketch of this split follows.
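
A minimal sketch of this split, in Python with hypothetical names (the paper gives no code): the kernel keeps a global LRU list and, on a miss, consults the owner of the least recently used block; processes that register no policy get plain LRU. The swapping and place-holder refinements of the later slides are deliberately omitted here.

    from collections import OrderedDict

    class TwoLevelCache:
        """Toy model of two-level replacement: the kernel owns allocation
        via a global LRU list; the owner of the LRU block may pick a
        different victim among its own blocks."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.lru = OrderedDict()   # block -> owning process; oldest first
            self.policy = {}           # process -> choose(candidate, own_blocks)

        def reference(self, proc, block):
            if block in self.lru:                  # hit: block becomes most recent
                self.lru.move_to_end(block)
                return "hit"
            if len(self.lru) >= self.capacity:     # miss with a full cache
                victim, owner = next(iter(self.lru.items()))  # global LRU block
                choose = self.policy.get(owner)
                if choose is not None:             # consult the owning process,
                    own = [b for b, p in self.lru.items() if p == owner]
                    victim = choose(victim, own)   # which returns one of its blocks
                del self.lru[victim]               # default: kernel's LRU choice
            self.lru[block] = proc                 # load the new block, most recent
            return "miss"

A process that never registers a policy gets exactly global LRU, which is the first principle on the next slide.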

6
User Level File Caching Cont.
Application P
Application Q
1. P misses
2. Kernel asks Q
4. Reallocates B
3. Q gives up B
Kernel
7
Kernel Allocation Policy
  • The kernel's allocation policy should satisfy three principles:
  • A process that never overrules the kernel does
    not suffer more misses than it would under global
    LRU.
  • A foolish decision by one process never causes
    another process to suffer more misses.
  • A wise decision by one process never causes any
    process, including itself, to suffer more misses.

8
First Try
Time  P's ref   Q's ref   Cache State   Global LRU List   Notes
t0                        A W X B       A W X B           A is the least recently used, but P suggests B
t1              Y (MISS)  A W X Y       A W X Y           B is replaced; A stays at the LRU tail
t3              Z (MISS)  Z W X Y       W X Y Z           P has no choice but to give up A
t5    A (MISS)            Z A X Y       X Y Z A
t7    B (MISS)            Z A B Y       Y Z A B

(LRU lists are ordered left to right in increasing recency.)
Total misses: 4 (2 by Q, 2 by P). Total misses under global LRU: 3 (2 by Q, 1 by P).
This violates Principle 3.
9
Swapping Positions
Time  P's ref   Q's ref   Cache State   Global LRU List   Notes
t0                        A W X B       A W X B           P suggests B: swap A and B in the LRU list, then replace B
t1              Y (MISS)  A W X Y       W X A Y
t3              Z (MISS)  A Z X Y       X A Y Z
t5    A (HIT)             A Z X Y       X Y Z A
t7    B (MISS)            A Z B Y       Y Z A B

(LRU lists are ordered left to right in increasing recency.)
Total misses: 3 (2 by Q, 1 by P). Total misses under global LRU: 3.
10
Swapping Positions
  • Swapping positions guarantees that if no process makes foolish choices, the global hit ratio is the same as or better than it would be under global LRU.
  • But what if a process makes a foolish choice?

11
No Place Holders
Time  P's ref   Q's ref   Cache State   Global LRU List   Notes
t0                        X A Y         X A Y             Q makes the wrong decision and replaces Y
t1              Z (MISS)  X A Z         A X Z
t3              Y (MISS)  X Y Z         X Z Y             P's block A is evicted for Q's mistake
t5    A (MISS)            A Y Z         Z Y A

(LRU lists are ordered left to right in increasing recency.)
Total misses: 3 (2 by Q, 1 by P). Total misses under global LRU: 1 (by Q).
This violates Principle 2.
12
With Place Holders
Time  P's ref   Q's ref   Cache State   Global LRU List   Notes
t0                        X A Y         X A Y             Q makes the wrong decision and replaces Y
t1              Z (MISS)  X A Z         A X(Y) Z          a place-holder for Y is created, pointing to X
t3              Y (MISS)  Y A Z         A Z Y             the place-holder is followed: X is replaced, not A
t5    A (HIT)             Y A Z         Z Y A

(LRU lists are ordered left to right in increasing recency.)
Total misses: 2 (2 by Q, 0 by P). Total misses under global LRU: 1 (by Q).
Q hurts itself by its foolish decision, but it does not hurt anyone else. Principle 2 is satisfied.
13
To summarize
  • If a reference to cache block b hits:
  • b is moved to the head of the global LRU list.
  • Any place-holder pointing to b is deleted.

14
To summarize
  • If a reference to cache block b misses:
  • 1st case: there is a place-holder for b, pointing to block t. Then t is replaced and its page is given to b. If t is dirty, it is written back to disk.
  • 2nd case: there is no place-holder for b. The kernel finds the block at the end of the LRU list, say block c, belonging to process P, and consults P.
  • If P chooses to replace block x, the kernel swaps x and c in the LRU list.
  • If there is a place-holder pointing to x, it is changed to point to c. Otherwise, a place-holder is built for x, pointing to c.
  • Finally, x's page is given to b. The sketch below puts both cases together.
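
A sketch in Python (hypothetical names, not the paper's implementation) combining the hit and miss rules above, including position swapping and place-holders:

    class PlaceholderCache:
        """Global LRU list plus place-holders: placeholder maps an evicted
        block to the resident block whose page it may later reclaim."""

        def __init__(self):
            self.lru = []            # blocks, least recently used first
            self.owner = {}          # block -> owning process
            self.policy = {}         # process -> choose(candidate, own_blocks)
            self.placeholder = {}    # evicted block -> resident block

        def hit(self, b):
            self.lru.remove(b)
            self.lru.append(b)                 # b moves to the head (most recent)
            for k, v in list(self.placeholder.items()):
                if v == b:                     # any place-holder pointing to b
                    del self.placeholder[k]    # is deleted

        def miss(self, b, proc):
            if b in self.placeholder:          # 1st case: follow b's place-holder
                t = self.placeholder.pop(b)
                self.lru.remove(t)             # t is replaced (written back if dirty)
                del self.owner[t]
            else:                              # 2nd case: consult the LRU block's owner
                c = self.lru[0]
                p = self.owner[c]
                own = [blk for blk in self.lru if self.owner[blk] == p]
                x = self.policy[p](c, own) if p in self.policy else c
                if x != c:
                    i = self.lru.index(x)      # swap x and c in the LRU list
                    self.lru[0], self.lru[i] = x, c
                    for k, v in self.placeholder.items():
                        if v == x:             # a place-holder pointing to x
                            self.placeholder[k] = c   # now points to c
                            break
                    else:
                        self.placeholder[x] = c       # else build one: x -> c
                self.lru.pop(0)                # x's page is given to b
                del self.owner[x]
            self.lru.append(b)                 # b becomes the most recent block
            self.owner[b] = proc

Replaying the slide-12 example (Q foolishly replacing Y) through this sketch reproduces the states shown there, including the X(Y) place-holder.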

15
Outline (Part 2)
  • Design Issues
  • - User/Kernel Interaction
  • - Shared Files
  • - Prefetching
  • Simulation
  • - Simulated Application Policies
  • - Simulation Environment
  • - Results
  • Conclusions

16
Design Issue 1: User/Kernel Interaction
  • Allow each user process to give hints to the kernel about which blocks it no longer needs or which blocks are less important than others.
  • Let a process inform the kernel about its access pattern for files (sequential, random, etc.); the kernel can then make replacement decisions for the user process.
  • Implement a fixed set of replacement policies in the kernel and let the user process choose from this menu: LRU, MRU, LRU-K, etc.
  • For full flexibility, the kernel can make an upcall to the manager process every time a replacement decision is needed.
  • Each manager process can maintain a list of free blocks, and the kernel can take blocks off the list when it needs them.
  • The kernel can implement some common policies and rely on upcalls for applications that do not want to use them. A sketch of such an interface follows.
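
A hypothetical interface sketch, in Python, of the options listed above; the names and signatures are illustrative only, not a real kernel API:

    from typing import Callable, List

    class FileCacheControl:
        """Illustrative user/kernel interaction surface."""

        def hint_unneeded(self, blocks: List[int]) -> None:
            """Hint: these blocks are no longer needed (or less important)."""

        def declare_pattern(self, path: str, pattern: str) -> None:
            """Declare an access pattern ('sequential', 'random', ...) and
            let the kernel choose replacements on the process's behalf."""

        def select_policy(self, name: str) -> None:
            """Pick a canned kernel policy from a fixed menu:
            'LRU', 'MRU', 'LRU-K', ..."""

        def register_upcall(self, choose: Callable[[int, List[int]], int]) -> None:
            """Full flexibility: the kernel upcalls choose(candidate, own_blocks)
            on each replacement decision and evicts the returned block."""

        def free_list(self, blocks: List[int]) -> None:
            """Maintain a list of free blocks the kernel may take when needed."""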

17
Design Issue 2: Shared Files
  • Concurrently shared files are handled in one of two ways:
  • If all the sharing processes agree to designate a
    single process as manager for the shared file,
    then the kernel allows this.
  • If the sharing processes fail to agree,
    management reverts to the kernel and the default
    global LRU policy is used.

18
Design Issue 3: Prefetching
  • The kernel would remain responsible for deciding how aggressively to prefetch; the prefetcher can simply be treated as another process competing for memory in the file cache.
  • Information about future file references (essential for prefetching) might be valuable to the replacement code as well, so adding prefetching may well make the allocator's job easier rather than harder.
  • Interaction between the allocator and the prefetcher would also be useful: the allocator could inform the prefetcher about the current demand for cache blocks, and the prefetcher could then voluntarily free blocks when it realizes that some prefetched blocks are no longer useful.

19
Simulation
  • Trace-driven simulation was used to evaluate two-level replacement.
  • In these simulations, the user-level managers used a general replacement strategy that takes advantage of knowledge about applications' file references.
  • Two sets of traces were used to evaluate the scheme:
  • Ultrix
  • Sprite

20
Simulated Application Policies
  • Two-level block replacement enables each user process to use its own replacement policy.
  • This solves the problem for sophisticated applications that know exactly what replacement policy they want.
  • For less sophisticated applications, knowledge about an application's file accesses can be used in the replacement policy.
  • Knowledge about file accesses can often be obtained through general heuristics, from the compiler, or from the application writer.

21
Simulated Application Policies (contd.)
  • The following replacement policy, based on the principle of RMIN (replace the block whose next reference is farthest in the future), is proposed to exploit partial knowledge of the future file access sequence.
  • When the kernel suggests a candidate replacement block to the manager process:
  • Find all blocks whose next references are definitely (or with high probability) after the next reference to the candidate block.
  • If there is no such block, replace the candidate block.
  • Otherwise, choose the block whose next reference is farthest in the future (a sketch follows).
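
A sketch of this heuristic in Python; next_ref is a hypothetical estimate of each block's next reference time, with math.inf standing for "probably never referenced again":

    import math

    def rmin_like_choice(candidate, own_blocks, next_ref):
        """Pick a victim per the RMIN-based rule above."""
        # Blocks whose next reference is (probably) after the candidate's.
        later = [b for b in own_blocks
                 if b != candidate and next_ref(b) > next_ref(candidate)]
        if not later:
            return candidate                 # no better victim is known
        return max(later, key=next_ref)      # next reference farthest in future

For an access-once pattern, for example, blocks already read would get a next_ref of math.inf and be evicted before blocks still awaiting their first reference.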

22
Simulated Application Policies (contd.)
  • This strategy can be applied to general applications with the following common file reference patterns (a detection sketch for the sequential case follows the list):
  • Sequential: most files are accessed sequentially most of the time.
  • File-specific sequence: some files are mostly accessed in one of a few sequences.
  • Filter: many applications access files one by one, in the order of their names on the command line, and access each file sequentially from beginning to end.
  • Same-order: a file or a group of files is repeatedly accessed in the same order.
  • Access-once: many programs do not re-read or re-write file data that they have already accessed.
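
As one illustration, a manager might detect the sequential pattern from recent block numbers; this tiny sketch and its window threshold are assumptions, not from the paper:

    def looks_sequential(block_ids, window=8):
        """True if the last `window` block numbers of a file each
        advance by exactly one (a simple sequential-access heuristic)."""
        recent = block_ids[-window:]
        return len(recent) >= 2 and all(
            b - a == 1 for a, b in zip(recent, recent[1:]))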

23
Simulation Environment
  • Two trace-driven simulations were used for a preliminary evaluation of the ideas presented in this paper:
  • Ultrix
  • Traces of various applications running on a DEC 5000/200 workstation.
  • 1.6 MB file cache.
  • Block size: 8 KB.
  • Sprite
  • File system traces from UC Berkeley.
  • Recording the file activities of about 40 clients over a period of 48 hours.
  • Client cache size: 7 MB.

24
Results- Ultrix Traces
  • Postgres: a relational database system.
  • The sequential access pattern was used as the user-level policy.
  • The designer of the database system could certainly give a better user-level policy, further improving the hit ratio.

25
Results- Ultrix Traces
  • Cscope: an interactive tool for examining C sources.
  • It reads a database of all source files sequentially from beginning to end to answer each query.
  • Applying the right user-level policy (the same-order access pattern) reduces the miss ratio significantly.

26
Results- Ultrix Traces
  • Link-editing: the Ultrix linker.
  • The linker in this system makes many small file accesses.
  • It does not fit the sequential access pattern, but fits the access-once pattern.

27
Results- Ultrix Traces
  • Multi-process workload: Postgres, cscope, and the linker running concurrently.
  • Each application was simulated running its own user-level policy, as discussed in the previous slides.
  • This yields the curve directly above RMIN.

28
Results Sprite Traces
  • In a system with a slow network (e.g., Ethernet), client caching performance determines the file system performance on each workstation.
  • Since most file accesses are sequential, the sequential heuristic can be used.
  • The sequential pattern improves the hit ratio for about 10% of the clients.

29
Conclusions
  • This paper proposes a two-level replacement scheme for file cache management:
  • a kernel policy for cache block allocation, and
  • several user-level replacement policies.
  • The kernel allocation policy guarantees performance improvements over the traditional global LRU file caching approach:
  • Processes that are unwilling or unable to predict their file access patterns perform at least as well as under global LRU.
  • A process that mispredicts its file access patterns cannot cause other processes to suffer more misses.
  • The key contribution is the guarantee that a good user-level policy will improve the file cache hit ratios of the entire system.

30
  • Questions.