1
Fractal Prefetching B-Trees: Optimizing Both
Cache and Disk Performance
Joint work with
2
B-Tree Operations Review
  • Search
  • binary search in every node on the path (see the
    sketch below)
  • Insertion/Deletion
  • search followed by data movement
  • Range Scan
  • locate a collection of tuples in a range
  • traverse the linked list of leaf nodes
  • different from search-like operations
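As a concrete reference for the operations above, here is a minimal search
sketch in C++; the node layout and names are hypothetical illustrations, not
the paper's code:

    #include <cstdint>

    // Hypothetical fixed-size B-Tree node: sorted keys plus child pointers.
    struct Node {
        bool     is_leaf;
        int      nkeys;
        uint32_t keys[255];
        Node*    children[256];   // unused in leaf nodes
    };

    // Binary search within one node: index of the first key >= target.
    static int lower_bound(const Node* n, uint32_t key) {
        int lo = 0, hi = n->nkeys;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (n->keys[mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Search: binary search in every node on the path from root to leaf.
    const Node* btree_search(const Node* root, uint32_t key) {
        const Node* n = root;
        while (!n->is_leaf)
            n = n->children[lower_bound(n, key)];
        return n;   // the caller probes this leaf for the exact match
    }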

3
Disk-optimized B-Trees
  • Traditional focus: I/O performance
  • minimize the number of disk accesses
  • optimal tree nodes are disk pages, typically
    4KB-64KB

4
Cache-optimized B-Trees
  • Recent studies: cache performance
  • e.g., Rao & Ross, SIGMOD'00; Bohannon, McIlroy &
    Rastogi, SIGMOD'01; Chen, Gibbons & Mowry,
    SIGMOD'01
  • cache line size is 32-128B
  • optimal tree nodes are only a few cache lines

5
Large Difference in Node Sizes
6
Cache-optimized B-Trees: Poor I/O Performance
  • may fetch a distinct disk page for every node on
    the path of a search
  • similar penalty for range scan

7
Disk-optimized B-Trees: Poor Cache Performance
  • Binary search in a large node suffers an excessive
    number of cache misses (explained later in the
    talk)

8
Optimizing for Both Cache and Disk Performance?
9
Our Approach
  • Fractal Prefetching B-Trees (fpB-Trees)
  • embedding cache-optimized trees inside
    disk-optimized trees

10
Outline
  • Overview
  • Optimizing Searches and Updates
  • Optimizing Range Scans
  • Experimental Results
  • Related Work
  • Conclusion

11
Page Structure of Disk-optimized B-Trees
  • We focus on fixed-size keys
  • (please see our full paper for a discussion of
    variable-size keys)

An index entry is <key, page ID> or <key, tuple
ID>, as sketched below
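A C++ sketch of such a page, using the example sizes from the next slide
(8KB page, 8B header, <4B key, 4B page ID> entries); the field names are
illustrative assumptions:

    #include <cstdint>

    // One index entry: <key, page ID> (or <key, tuple ID> in a leaf page).
    struct IndexEntry {
        uint32_t key;      // 4B key
        uint32_t page_id;  // 4B page ID
    };

    // One 8KB disk page: an 8B header followed by one big sorted array.
    struct DiskPage {
        // 8B header; these fields are assumed for illustration
        uint32_t   num_entries;
        uint32_t   level;
        // sorted entry array filling the rest of the page: 1023 entries
        IndexEntry entries[(8192 - 8) / sizeof(IndexEntry)];
    };

    static_assert(sizeof(DiskPage) == 8192, "page must fill 8KB exactly");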
12
Binary Search in a B-Tree Page
  • Suppose
  • an index entry array has 1023 index entries,
    numbered 1-1023
  • 8 index entries / cache line
  • the array occupies 128 cache lines
  • e.g., an 8KB page, <4B key, 4B page ID> entries,
    a 64B cache line, and an 8B header

[Figure: the 1023-entry array spread over 128 cache lines, labeled 1st through 128th]
13
Binary Search in a B-Tree Page
[Figure: searching for entry 71, the active range halves from entries 1-1023 on each probe; the early probes are far apart in the array, so nearly every one of the ~10 probes touches a different cache line, and only the last few probes share a line]
14
Fractal Prefetching B-Trees (fpB-Trees)
  • Embedding cache-optimized trees inside disk
    pages
  • good search cache performance
  • binary search in cache-optimized nodes
  • much better locality
  • use cache prefetching (see the sketch below)
  • good search disk performance
  • nodes are embedded into disk pages
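Within a page, a probe of one embedded node can prefetch all of that node's
cache lines up front. A minimal sketch, assuming GCC/Clang's
__builtin_prefetch and illustrative line and node sizes:

    #include <cstddef>

    constexpr size_t CACHE_LINE = 64;   // assumed cache line size
    constexpr size_t NODE_LINES = 4;    // cache-optimized node: a few lines wide

    // Issue prefetches for all cache lines of an embedded node before
    // binary-searching it, so its misses are serviced in parallel
    // rather than one line at a time.
    inline void prefetch_node(const void* node) {
        const char* p = static_cast<const char*>(node);
        for (size_t i = 0; i < NODE_LINES; ++i)
            __builtin_prefetch(p + i * CACHE_LINE, /*rw=*/0, /*locality=*/3);
    }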

15
Node Size Mismatch Problem
  • Disk page size and cache-optimized node size
  • determined by hardware parameters and key sizes
  • Ideally, cache-optimized trees fit nicely in disk
    pages
  • But usually this is not true! (see the worked
    example below)

A 2-level tree overflows
A 2-level tree underflows. But adding one more
level overflows.
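To make the mismatch concrete, here is a worked example with assumed
illustrative numbers (not taken from the paper): with 64B cache lines, 8B
index entries, and 4-line (256B) cache-optimized nodes, each node holds 32
entries. In an 8KB page, a one-level tree (a single node, 32 entries) badly
underflows the page, while a two-level tree needs 1 + 32 = 33 nodes, i.e.
33 × 256B = 8448B, which just overflows the page's 8192B.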
16
Two Solutions
  • Solution 1: use different sizes for in-page leaf
    and nonleaf nodes
  • e.g., a smaller root when the tree overflows, a
    larger root when it underflows

Solution 2: overflowing nodes become roots of new
pages
17
The Two Solutions from Another Point of View
  • Conceptually, we apply the disk and cache
    optimizations in different orders
  • Solution 1 is disk-first
  • first build the disk-optimized pages
  • then fit smaller trees into the disk pages by
    allowing different node sizes
  • Solution 2 is cache-first
  • first build the cache-optimized trees
  • then group nodes together and place them into
    disk pages

18
Insertion and Deletion Cache Performance
  • In disk-optimized B-Trees, data movement is very
    expensive
  • the huge array structure in disk pages
  • on average, we need to move half the array
  • In our fpB-Trees, the cost of data movement is
    much smaller
  • small cache-optimized nodes
  • We show that fpB-Trees have much better
    insertion/deletion performance than
    disk-optimized B-Trees with fixed-size keys
    (see the sketch below)
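The contrast comes down to the size of one memmove over a sorted array. A
hedged sketch with a hypothetical entry layout: in a disk-optimized page the
array holds about 1023 entries, so an average insert shifts roughly 4KB,
while a small in-page node shifts at most a few cache lines.

    #include <cstdint>
    #include <cstring>

    struct Entry { uint32_t key, id; };

    // Insert into a sorted entry array: shift everything at and after
    // `pos` one slot to the right, then write the new entry.
    void sorted_insert(Entry* a, int& n, int pos, Entry e) {
        std::memmove(&a[pos + 1], &a[pos], (n - pos) * sizeof(Entry));
        a[pos] = e;
        ++n;
    }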

19
Outline
  • Overview
  • Optimizing Searches and Updates
  • Optimizing Range Scans
  • Experimental Results
  • Related Work
  • Conclusion

20
Jump-pointer Array Prefetching for Range Scan
  • Recall that range scans essentially traverse the
    linked list of leaf nodes
  • Previous proposal for range scan cache
    performance (SIGMOD'01)
  • build data structures to hold leaf node addresses
  • prefetch leaf nodes during range scans (see the
    sketch below)

Internal Jump Pointer Array
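A sketch of how a jump-pointer array drives cache prefetching during a scan;
the container and the prefetch distance are assumptions for illustration,
not the paper's exact algorithm:

    #include <cstddef>
    #include <vector>

    constexpr size_t DIST = 8;   // prefetch distance (assumed; tuned empirically)

    // jump holds the addresses of all leaf nodes in key order. While
    // visiting leaf i, prefetch the start of leaf i + DIST so its cache
    // misses overlap with work on the current leaves.
    template <typename Leaf, typename Visit>
    void range_scan(const std::vector<Leaf*>& jump,
                    size_t first, size_t last, Visit visit) {
        for (size_t i = first; i <= last; ++i) {
            if (i + DIST <= last)
                __builtin_prefetch(jump[i + DIST], /*rw=*/0, /*locality=*/3);
            visit(jump[i]);   // process all entries in this leaf node
        }
    }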
21
New Proposal: I/O Prefetching,
linking leaf parent pages together
  • Employ jump-pointer array prefetching for I/O
  • jump-pointer arrays contain leaf page IDs
  • prefetching leaf pages improves range scan I/O
    performance (see the sketch below)
  • Very useful when leaf pages are not sequential on
    disk
  • e.g., a non-clustered index under frequent updates
    (when sequential prefetching is not applicable)
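At the disk level, the same idea can be expressed as an advisory read-ahead
over the leaf page IDs. A sketch using POSIX posix_fadvise and an assumed
one-file, page-addressed index layout (the paper and DB2 issue prefetch I/Os
through their own buffer managers):

    #include <fcntl.h>
    #include <cstdint>
    #include <vector>

    // leaf_page_ids is the I/O jump-pointer array: the page IDs of the
    // leaf pages in a range, in key order. Hint the OS to start fetching
    // the next `count` leaf pages, which need not be sequential on disk.
    void prefetch_leaf_pages(int fd,
                             const std::vector<uint32_t>& leaf_page_ids,
                             size_t next, size_t count, size_t page_size) {
        for (size_t i = next; i < next + count && i < leaf_page_ids.size(); ++i)
            posix_fadvise(fd, (off_t)leaf_page_ids[i] * page_size,
                          (off_t)page_size, POSIX_FADV_WILLNEED);
    }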

22
Both Cache and I/O Prefetching in fpB-Trees
  • Two jump-pointer arrays in fpB-Trees
  • One for range scan cache performance
  • containing leaf node addresses for cache
    prefetching
  • One for range scan disk performance
  • containing leaf page IDs for I/O prefetching

23
More Details in Our Paper
  • Computation for optimal node sizes
  • Data structures
  • Algorithms
  • Bulkload
  • Search
  • Insertion
  • Deletion
  • Range scan

24
Outline
  • Overview
  • Optimizing Searches and Updates
  • Optimizing Range Scans
  • Experimental Results
  • Related Work
  • Conclusion

25
Implementation
  • We implemented a buffer manager and three index
    structures on top of the buffer manager

26
Experiments and Methodology
  • Experiments
  • Search: (1) cache performance; (2) disk
    performance, i.e., improving cache performance
    while preserving good disk performance
  • Update: (3) cache performance, i.e., solving the
    data movement problem
  • Range Scan: (4) cache performance; (5) disk
    performance via jump-pointer array prefetching
  • Methodology
  • cache performance: detailed cycle-by-cycle
    simulations
  • memory system parameters set to near-future
    values, with better prefetching support
  • range scan I/O performance: execution times on
    real machines
  • search I/O performance: counting the number of
    I/Os, since I/O operations in a search do not
    overlap

27
Search Cache Performance
2000 random searches after bulkload; 100% full
except root; 16KB pages
  • fpB-Trees perform significantly better than
    disk-optimized B-Trees
  • achieving speedups of 1.09-1.77 at all sizes, over
    1.25 when trees contain at least 1M entries
  • The performance of the two fpB-Trees is similar

28
Search I/O Performance
2000 random searches after bulkloading 10M index
entries; 100% full except root
  • Disk-first fpB-Trees access < 3% more pages
  • Very small I/O performance impact
  • Cache-first fpB-Trees may access up to 25% more
    pages in our results

29
Insertion Cache Performance
2000 random insertions after bulkloading 3M keys,
70% full
  • fpB-Trees are significantly faster than
    disk-optimized B-Trees
  • achieving up to 35-fold speedups over
    disk-optimized B-Trees
  • Data movement costs dominate disk-optimized
    B-Tree performance

30
Range Scan Cache Performance
100 scans starting at random locations in an index
bulkloaded with 3M keys, 100% full; each range
contains 1M keys; 16KB pages
  • Disk-first and cache-first fpB-Trees achieve
    speedups of 4.2 and 3.5 over disk-optimized
    B-Trees
  • Jump-pointer array cache prefetching is effective

31
Range Scan I/O Performance
8-processor machine (RS/6000 line), 2GB memory,
80 SSA disks; mature index on a 12.8GB table
  • IBM DB2 Universal Database
  • Jump-pointer array I/O prefetching achieves
    speedups of 2.5-5.0 for disk-optimized B-Trees

32
Other Experiments
  • We find similar benefits in deletion cache
    performance
  • Up to 20-fold speedups
  • We performed many cache performance experiments
    and got similar results for
  • Varying tree sizes, bulkload factors, and page
    sizes
  • Mature trees
  • Varying key sizes: 20B keys
  • We performed range scan I/O experiments on our
    own index implementations and saw up to 6.9-fold
    speedups

33
Related Work
  • Micro-indexing (discussed briefly by Lomet,
    SIGMOD Record, Sep. 2001)

[Figure: a micro-index within a disk page]
  • We are the first to quantitatively analyze
    performance for micro-indexing
  • improves search cache performance
  • but suffers from the data movement problem on
    updates because of its contiguous array structure
  • fpB-Trees have much better update performance

34
Fractal Prefetching B-Trees: Conclusion
  • Search: combine cache-optimized and
    disk-optimized node sizes
  • better cache performance
  • speedups of 1.1-1.8 over disk-optimized B-Trees
  • good disk performance for disk-first fpB-Trees
  • disk-first fpB-Trees visit < 3% more disk pages
  • we recommend cache-first fpB-Trees only when
    memory is very large
  • Update: solve the data movement problem by using
    smaller nodes
  • better cache performance
  • up to a 20-fold speedup over disk-optimized
    B-Trees
  • Range Scan: employ jump-pointer array
    prefetching
  • better cache performance
  • better disk performance
  • speedups of 2.5-5.0 on IBM DB2

35
Backup Slides
36
Previous Work Prefetching B-Trees
  • (SIGMOD 2001)
  • Studies B-Trees in a main-memory environment
  • For search: prefetching wider tree nodes
  • increase node size to multiple cache lines
  • use prefetching to read all cache lines of a node
    in parallel

Prefetching B-Tree with four-line nodes
B-Tree with one-line nodes
37
Prefetching B-Trees (cont'd)
  • For range scan: jump-pointer array prefetching
  • build jump-pointer arrays to hold leaf node
    addresses
  • prefetch leaf nodes with jump-pointer array
  • two implementations

External Jump Pointer Array
Internal Jump Pointer Array
38
Optimization in Disk-first Approach
  • Two conflicting goals: 1) optimize search cache
    performance; 2) maximize page fan-out to preserve
    good I/O performance
  • Optimal criterion: maximize page fan-out while
    keeping the analytical search cost within 10% of
    optimal
  • Details in the paper

39
Cache-first fpB-Trees Structure
  • Group sibling leaf nodes into the same page for
    range scans
  • Group a parent and its children into the same page
    for searches
  • Leaf parent nodes may be put into overflow pages

40
Simulation Parameters
Models all the gory details, including memory
system contention
41
Optimal Node Size Computation (key = 4B)
  • Optimal criterion: maximize page fan-out while
    keeping the analytical search cost within 10% of
    optimal
  • We used these optimal values in our experiments

42
Search Cache Performance
2000 random searches after bulkload; 100% full
except root; 16KB pages
  • Cache-sensitive schemes (fpB-Trees and
    micro-indexing) all perform significantly better
    than disk-optimized B-Trees
  • The performance of the cache-sensitive schemes is
    similar

43
Search Cache Performance (Varying Page Sizes)
[Figure: execution time (M cycles) with 4KB, 8KB, and 32KB pages]
  • Same experiments but with different page sizes
  • We see the same trends: cache-sensitive schemes
    are better
  • They achieve speedups of 1.09-1.77 at all sizes,
    and 1.25-1.77 when trees contain at least 1M
    entries

44
Optimal Width Selection
(16KB pages, 4B keys)
[Two figures: optimal width selection for disk-first fpB-Trees and cache-first fpB-Trees]
  • Our selected trees perform within 2% and 5% of
    the best for disk-first and cache-first fpB-Trees

45
Search I/O Performance
(2000 random searches, 4B keys)
  • Disk-first fpB-Trees access < 3% more pages
  • Very small I/O performance impact
  • Cache-first fpB-Trees may access up to 25% more
    pages in our results

46
Insertion Cache Performance
2000 random insertions after bulkloading 3M keys,
70% full
47
Insertion Cache Performance II
2000 random insertions after bulkloading 3M keys;
16KB pages
  • fpB-Trees are significantly faster than both
    disk-optimized B-Trees and Micro-indexing
  • fpB-Trees achieve up to 35-fold speedups over
    disk-optimized B-Trees across all page sizes

48
Insertion Cache Performance II
2000 random insertions after bulkloading 3M keys;
16KB pages
  • Two major costs data movement, page split
  • Micro-indexing still suffers from data movement
    costs
  • fpB-Trees avoid this problem with smaller nodes

49
Space Utilization
(4B keys)
After Bulkload (100% full)
Mature Trees
  • Disk-first fpB-Trees incur < 9% space overhead
  • Cache-first fpB-Trees may use up to 36% more
    pages in our results

50
Range Scan Cache Performance
100 scans starting at random locations on an index
bulkloaded with 3M keys; each range spans 1M
keys; 16KB pages
  • Disk-first and cache-first fpB-Trees achieve
    speedups of 3.5-4.2 and 3.0-3.5 over
    disk-optimized B-Trees

51
Range Scan I/O Performance
(10M entries in the range)
(10 disks)
  • Setup: SGI Origin 200 with four 180MHz R10000
    processors, 128MB memory, 12 SCSI disks (10 of
    them used in the experiments); range scans on
    mature trees
  • Jump-pointer array prefetching achieves up to a
    6.9-fold speedup

52
Jump-pointer Array Prefetching on IBM DB2
  • Setup: 8-processor machine (RS/6000 line), 2GB
    memory, 80 SSA disks; mature index on a 12.8GB
    table; query: SELECT COUNT(*) FROM data
  • Jump-pointer array prefetching achieves speedups
    of 2.5-5.0