Locality Optimizations in OceanStore - PowerPoint PPT Presentation

Provided by: iako
Transcript and Presenter's Notes


1
Locality Optimizations in OceanStore
An introduction to introspective techniques for
exploiting locality in wide area storage
utilities.
  • Patrick R. Eaton
  • Dennis Geels

2
Agenda
  • OceanStore Review
  • Problem Overview
  • Previous Work
  • Proposed Solution
  • Prefetching Algorithm
  • Preliminary Results
  • Future Work

3
OceanStore Review
  • Properties of OceanStore relevant to
    introspective locality optimizations
      • implemented in the extremely wide area
      • has many places to put any single piece of data
      • cannot rely on users to make relationships among
        data explicit
      • depends on effective locality optimizations for
        improved performance
  • No possible way to solve exactly

4
Problem Overview
  • Passively observe data accesses
      • data shared among multiple users
      • single users accessing the network from different
        physical locations
      • data is replicated across the network
  • Optimize the location of data to provide quicker
    access to users
      • cluster semantically related data
      • replicate data to move it closer to consumers
      • migrate primary replicas toward the source of
        updates

5
Measurable Attributes
  • File Temperature
      • A measure that indicates the frequency of access
        to the file
      • A hot file is frequently accessed
  • Semantic Distance (Kuenning)
      • Any measure that can quantify relationships
        between files on the range [0, ∞)
      • Local distance relates one instance of a file
        access to another
      • Reference distance is an aggregate measure that
        summarizes all local distances for a pair of
        files
      • Typical measures use access order or timing
        information
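
The two attributes above can be sketched in a few lines. This is only an illustration, not the measure used in the talk: it assumes temperature is a raw access count, local distance is the number of accesses separating two files in the trace, and reference distance is the mean of the local distances for a pair. The function names and the `window` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def temperatures(trace):
    """File temperature, taken here (an assumption) as a raw access count."""
    return Counter(trace)


def reference_distances(trace, window=3):
    """Mean local distance for each ordered pair (a, b).

    Local distance (an assumption): how many accesses after an access
    to `a` the first following access to `b` occurs, capped at `window`.
    """
    local = defaultdict(list)
    for i, a in enumerate(trace):
        seen = set()
        for d in range(1, window + 1):
            if i + d >= len(trace):
                break
            b = trace[i + d]
            if b == a:
                break  # a new instance of `a` starts its own measurement
            if b not in seen:
                seen.add(b)
                local[(a, b)].append(d)
    # Reference distance: aggregate all local distances of a pair.
    return {pair: sum(ds) / len(ds) for pair, ds in local.items()}


trace = ["A", "B", "C", "A", "B", "D"]
print(temperatures(trace).most_common(2))
print(reference_distances(trace)[("A", "B")])
```

An order-based distance like this one ignores timing; a time-based variant would replace the index difference with a timestamp difference.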

6
Prefetching Techniques
  • Automatic Prefetching (Griffioen and Appleton)
      • construct a probability graph that records
        accesses which follow within a lookahead period
      • predict a prefetch when the chance of an access
        is above a tunable parameter
  • Context Modeling (Kroeger and Long)
      • record in a trie all access sequences which have
        been observed
      • maintain pointers to all nodes which represent
        current contexts
      • predict a prefetch when the chance of an access
        to a child of a current context is above a
        probability threshold
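
As a rough illustration of the first technique, the sketch below keeps a probability graph whose edge a → b counts how often b was accessed within the lookahead period after a. It is written from the slide's description only; the class name, the `lookahead` and `threshold` parameters, and the choice to estimate the chance of an access as an edge's share of its node's outgoing weight are all assumptions.

```python
from collections import defaultdict


class ProbabilityGraph:
    """Hypothetical sketch of a lookahead-based probability graph."""

    def __init__(self, lookahead=2, threshold=0.5):
        self.lookahead = lookahead      # assumed to be >= 1
        self.threshold = threshold      # the tunable parameter
        self.recent = []                # the last `lookahead` accesses
        self.follows = defaultdict(lambda: defaultdict(int))

    def access(self, f):
        """Record an access to f and return the files worth prefetching."""
        # Every file still inside the lookahead period gains an edge to f.
        for prev in set(self.recent):
            if prev != f:
                self.follows[prev][f] += 1
        self.recent = (self.recent + [f])[-self.lookahead:]
        return self.predict(f)

    def predict(self, f):
        edges = self.follows.get(f, {})
        total = sum(edges.values())
        if total == 0:
            return []
        # Chance of an access = edge weight / total outgoing weight.
        return sorted(g for g, c in edges.items()
                      if c / total >= self.threshold)


graph = ProbabilityGraph(lookahead=1, threshold=0.6)
for name in ["A", "B", "C", "A", "B", "C", "A"]:
    graph.access(name)
print(graph.access("B"))
```

A context model would replace the flat edge table with a trie of access sequences, so that predictions depend on more than the single most recent file.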

7
Our Approach
  • Exploit the ideas of semantic distance to compute
    relationships among data
  • Cluster data based on the observed relationships
  • Store a summary of these relationships with the
    data
  • Migrate (prefetch) files based on familiar
    patterns in the access stream
      • recognize higher-order correlations as in context
        modeling
      • tolerate noise in the access stream

8
Motivation for Prefetching Algorithm
  • Many patterns can be predicted only by
    observation of higher-order correlation, that is,
    by combining several pieces of past history.
  • Other patterns can only be detected through
    identification and filtering of noise.
[Figures omitted; node labels: A, Y, K, B, Z, A, B, C]

9
General Prefetching Algorithm
  • Update
      • Record the most recent file accesses in the file
        history buffer (FHB)
      • Each time a new file S is accessed, extract all
        triples of the form (FHB(i), FHB(j)) → S from the
        FHB and update them in the second-order distance
        table
  • Predict
      • Each time a new file S is accessed, examine the
        distance table entries of (FHB(i), S)
      • Prefetch files that are predicted with confidence
        above a certain threshold
  • Problems
      • O(k^2) work to update distance table
      • Noise infects distance table
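
The update and predict steps above might look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name is hypothetical, the table stores plain counts, confidence is taken to be a follower's share of the counts recorded for a pair, and k is assumed to be the FHB length.

```python
from collections import defaultdict, deque


class SecondOrderPrefetcher:
    """Hypothetical sketch of the unoptimized second-order algorithm."""

    def __init__(self, k=4, threshold=0.5):
        self.fhb = deque(maxlen=k)      # file history buffer
        self.threshold = threshold
        # (FHB(i), FHB(j)) -> {S: times S followed that pair}
        self.table = defaultdict(lambda: defaultdict(int))

    def access(self, s):
        history = list(self.fhb)
        # Update: credit every ordered pair in the FHB with s.
        # The nested loop is the O(k^2) cost noted above.
        for i in range(len(history)):
            for j in range(i + 1, len(history)):
                self.table[(history[i], history[j])][s] += 1
        # Predict: consult the entries for (FHB(i), s).
        predictions = set()
        for a in history:
            followers = self.table.get((a, s))
            if not followers:
                continue
            total = sum(followers.values())
            for f, count in followers.items():
                if count / total >= self.threshold:
                    predictions.add(f)
        self.fhb.append(s)
        return sorted(predictions)


prefetcher = SecondOrderPrefetcher(k=2)
for name in "ABCAB":
    predicted = prefetcher.access(name)
print(predicted)
```

Because every pair in the buffer is updated, an unrelated file that happens to sit in the FHB also leaves entries in the table, which is the noise problem listed above.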

10
Optimizations to the Prefetching Algorithm
  • First-order distance table
      • Records files that are close, as measured by
        semantic distance
      • Allows reverse lookup
  • Use first-order distance tables to filter out
    irrelevant file relationships
      • Update only relevant entries in the second-order
        distance table
      • Search for predictions based on only relevant
        access pairs

Indicative FHBs
11
Prefetching Algorithm Example
  • Update
      • Extract relevant triples by intersecting the FHB
        with the results from the reverse lookup in the
        first-order tables
  • Predict
      • Extract relevant doubles by intersecting the FHB
        with the results from the reverse lookup in the
        first-order tables
      • Prefetch if the second-order table predicts a
        future access with sufficient confidence
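
One way to picture the filtered variant is sketched below. The slides do not say how the first-order table is populated, so this sketch assumes a file S counts as close to a file A once S has followed A inside the FHB at least `min_count` times. That rule, the class name, and the count-share confidence are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class FilteredPrefetcher:
    """Hypothetical sketch: first-order tables filter second-order work."""

    def __init__(self, k=4, min_count=2, threshold=0.5):
        self.fhb = deque(maxlen=k)
        self.min_count = min_count
        self.threshold = threshold
        self.first = defaultdict(lambda: defaultdict(int))   # A -> {S: count}
        self.reverse = defaultdict(set)                      # S -> {A, ...}
        self.second = defaultdict(lambda: defaultdict(int))  # (A, B) -> {S: n}

    def access(self, s):
        # First-order update (assumed closeness rule, see lead-in).
        for a in set(self.fhb):
            if a != s:
                self.first[a][s] += 1
                if self.first[a][s] >= self.min_count:
                    self.reverse[s].add(a)
        # Intersect the FHB with the reverse lookup for s, keeping order.
        close = self.reverse.get(s, set())
        relevant = list(dict.fromkeys(f for f in self.fhb if f in close))
        # Update only the relevant triples (A, B) -> s.
        for i in range(len(relevant)):
            for j in range(i + 1, len(relevant)):
                self.second[(relevant[i], relevant[j])][s] += 1
        # Predict from the relevant doubles (A, s) only.
        predictions = set()
        for a in relevant:
            followers = self.second.get((a, s))
            if not followers:
                continue
            total = sum(followers.values())
            predictions.update(f for f, c in followers.items()
                               if c / total >= self.threshold)
        self.fhb.append(s)
        return sorted(predictions)


trace = ["A", "N1", "B", "C", "A", "N2", "B", "C", "A", "N3", "B"]
prefetcher = FilteredPrefetcher()
for name in trace:
    predicted = prefetcher.access(name)
print(predicted)
```

In this trace the one-off files N1, N2 and N3 never reach `min_count`, so they never enter the second-order table, and the pair loop runs over the filtered list rather than the whole buffer.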

12
Preliminary Results (Local System)
13
Future Work
  • Retarget the simulations to model OceanStore
  • Continue to refine the prefetching algorithm
  • Examine the potential of higher order prefetching
  • Combine prefetching and clustering
  • Look for opportunities to test the ideas on
    different workloads