Application-specific Disk Optimization with Stack-trace Analysis


1
Application-specific Disk Optimization with
Stack-trace Analysis
  • Feng Zhou, zf_at_cs
  • 5/14/04

2
Motivation
  • Current OSes often use a single policy for all
    applications in various I/O decisions
  • Page/buffer cache replacement
  • Disk head scheduling
  • File read-ahead
  • Hard to make one policy work well for all applications
  • An adaptive policy doesn't solve all problems
  • For sequential scans, all LRU-like policies are bad
  • Possible root cause: the OS doesn't know enough about
    applications and is forced to treat all accesses
    equally
  • Recent developments show the need for multiple
    policies
  • Linux 2.6 supports boot-time selectable schedulers

3
Domain Separated Disk I/O
  • Classify accesses into separate domains; each
    access is tagged with a domain ID
  • Each domain uses a different policy and parameters
  • Similar idea in the database literature
  • DBMIN ('85): the programmer passes in a domain type
    when opening a file and DBMIN decides which
    policy to use
  • Making it general and automatic for the OS
  • The OS kernel collects per-domain statistics (see the
    sketch at the end of this slide)
  • A daemon makes policy decisions periodically
  • How do we obtain the domain ID?
  • Stack-trace analysis
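
(A minimal sketch of the per-domain accounting idea, assuming a hypothetical
kernel-side counter table that a user-level policy daemon samples
periodically; the struct and function names are illustrative, not the
original kernel code.)

/* Hypothetical per-domain accounting: each buffered-file access is tagged
 * with a domain ID derived from its stack trace and counted here. */
#include <stdint.h>

#define MAX_DOMAINS 64

struct domain_stats {
    uint64_t accesses;    /* total page accesses in this domain */
    uint64_t sequential;  /* accesses that continued a sequential run */
    uint64_t last_page;   /* last page number seen, to detect runs */
};

static struct domain_stats stats[MAX_DOMAINS];

/* Called on every page access; domain_id comes from a stack-trace hash. */
void account_access(unsigned domain_id, uint64_t page)
{
    struct domain_stats *d = &stats[domain_id % MAX_DOMAINS];
    if (d->accesses > 0 && page == d->last_page + 1)
        d->sequential++;
    d->accesses++;
    d->last_page = page;
}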

4
Stack-trace Analysis
  • Try to understand app behavior by grouping file
    accesses by their stack back-traces.
  • Ideal policies (see the policy table after the figure)
  • 1. MRU replacement, large readahead
  • 2. LFU replacement, fixed readahead
  • 3. LRU replacement, whole-file readahead

[Figure: call stacks of two applications converging on read(fd, buf, pos,
count). foo_db reaches read() through btree_index_scan(),
btree_tuple_get(key, ...) and get_page(table, index); bar_httpd reaches it
through process_http_req() and send_file(). The call paths labeled 1, 2 and
3 are the three domains.]
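
(To make the per-domain "ideal policies" above concrete, a hypothetical
policy table; the type names and readahead sizes are illustrative
assumptions, not part of the original system.)

/* Hypothetical per-domain policy table for domains 1-3 in the figure. */
enum repl_policy { REPL_MRU, REPL_LFU, REPL_LRU };

struct io_policy {
    enum repl_policy repl;
    int readahead_pages;      /* -1 = read ahead the whole file */
};

static const struct io_policy domain_policy[] = {
    [1] = { REPL_MRU, 256 },  /* index scan: large readahead (assumed size) */
    [2] = { REPL_LFU,   8 },  /* point lookups: fixed readahead (assumed size) */
    [3] = { REPL_LRU,  -1 },  /* send_file(): whole-file readahead */
};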
5
Why Stack-trace Analysis?
  • Requirements for a good criterion of domain
    separation
  • Highly correlated with access patterns
  • Easy to obtain
  • Hopefully the stack trace reflects "what the program
    is doing"
  • Exceptions: script-driven apps, interpreted apps
  • 100 instructions to obtain a hash of the stack trace
    (see the sketch at the end of this slide)
  • Other possible criteria (what current OSes use!)
  • File. Problem: a DB has very different ways of
    accessing a single table.
  • Process/thread. Problem: a single thread can
    exhibit very different access patterns over time.
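
(A minimal sketch of such a stack-trace hash, assuming frame pointers are
available, e.g. GCC/x86 built without -fomit-frame-pointer; the walk depth
and hash constants are illustrative, not the original implementation.)

#include <stdint.h>

#define MAX_FRAMES 16

/* Hash the return addresses on the current call stack (FNV-1a style). */
uint32_t stack_trace_hash(void)
{
    uint32_t h = 2166136261u;
    void **fp = __builtin_frame_address(0);   /* current frame pointer */

    for (int i = 0; i < MAX_FRAMES && fp; i++) {
        uintptr_t ret = (uintptr_t)fp[1];     /* saved return address */
        for (unsigned b = 0; b < sizeof(ret); b++) {
            h ^= (ret >> (8 * b)) & 0xff;
            h *= 16777619u;
        }
        void **next = (void **)fp[0];         /* caller's frame pointer */
        if (next <= fp)                       /* stop on a malformed chain */
            break;
        fp = next;
    }
    return h;
}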

6
Adaptive Multi-Policy Cache Replacement
[Figure: the cache split into per-domain partitions (1: MRU; 2: LRU +
history pages; 3: LRU + history pages)]
  • Examine the recent trace; use MRU for a domain if
    (see the sketch at the end of this slide)
  • There exists a sequential access run of length > ½
    the number of unique pages
  • Or, the average distance between runs is < 20 pages
  • Parameters
  • cache_size, domain_no
  • s0 = cache_size / domain_no
  • scan_size_i: for MRU, the average number of pages in
    a scan
  • Tscan_i, sTscan_i: avg/std dev of time between scans
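
(A minimal sketch of that MRU-vs-LRU test over one domain's recent trace;
the run and gap detection is one plausible reading of the thresholds above,
not the original code.)

#include <stdlib.h>

/* Decide MRU for a domain if its recent trace has a sequential run longer
 * than half the unique pages, or runs are on average < 20 pages apart.
 * `pages` is the recent page-access trace for this domain. */
int should_use_mru(const long *pages, int n, int unique_pages)
{
    long longest_run = 1, run = 1, gaps = 0, gap_sum = 0;

    for (int i = 1; i < n; i++) {
        if (pages[i] == pages[i - 1] + 1) {
            run++;                            /* extend the sequential run */
        } else {
            if (run > longest_run)
                longest_run = run;
            gaps++;
            gap_sum += labs(pages[i] - pages[i - 1]);
            run = 1;
        }
    }
    if (run > longest_run)
        longest_run = run;

    if (2 * longest_run > unique_pages)
        return 1;                             /* dominant sequential scan */
    if (gaps > 0 && gap_sum / gaps < 20)
        return 1;                             /* runs close together */
    return 0;
}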

7
  • Decide the policy (MRU/LRU) for each domain
  • Start with each partition at capacity s0
  • Keep adding new pages while the cache is not full
  • When the cache is full, before adding a new page
    (see the sketch at the end of this slide)
  • If the current partition is over capacity, evict a
    page from it
  • If the current partition is not over capacity, evict
    a page from another, randomly chosen over-capacity
    partition
  • Evicted LRU pages are recorded in history queues
    of length s0
  • Adaptation
  • Grow an MRU partition i by 1 page every
    scan_size_i / s0 accesses of this domain; a random
    partition is shrunk by one
  • Grow an LRU partition by 1 page whenever one of
    its history pages is accessed
  • MRU garbage collection
  • For each MRU partition i, every Tscan_i + 2·sTscan_i
    steps, mark all un-accessed pages free (ready for
    replacement)
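
(An illustrative sketch of the eviction rule above; the partition struct and
the random-index helper are assumptions, not the paper's code.)

/* Pick the partition to evict from when inserting a page into partition
 * `cur`: evict locally if `cur` is at or over capacity, otherwise from a
 * randomly chosen over-capacity partition. */
struct partition {
    int used;        /* pages currently cached for this domain */
    int capacity;    /* current target capacity */
};

int pick_victim_partition(struct partition *p, int nparts, int cur,
                          int (*rand_idx)(int nparts))
{
    if (p[cur].used >= p[cur].capacity)
        return cur;

    for (int tries = 0; tries < nparts; tries++) {
        int i = rand_idx(nparts);
        if (p[i].used > p[i].capacity)
            return i;
    }
    return cur;      /* fallback: no over-capacity partition found */
}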

8
Example of stack-trace statistics (TPC-W trace)
[Chart: per-domain access statistics over time, with SCAN periods marked on
the time axis]
9
  • Simulation result: TPC-W-like benchmark (OSDL
    DBT-1) on PostgreSQL 7.4.2
  • 65482 accesses, 23843 unique pages

10
Open Source Database Benchmark on MySQL 4.0.18
(Berkeley DB table): 9610 accesses, 3112 unique
pages
11
  • Application-specific Disk Optimization with
    Stack-trace Analysis