Chisel: A Storage-efficient, Collision-free Hash-based Network Processing Architecture - PowerPoint PPT Presentation

1
Chisel: A Storage-efficient, Collision-free
Hash-based Network Processing Architecture
  • Jahangir Hasan, Srihari Cadambi,
  • Venkatta Jakkula, Srimat Chakradhar
  • Proceedings of the 33rd International Symposium
    on Computer Architecture (ISCA '06)

2
Outline
  • Introduction
  • Bloomier Filter
  • Chisel Architecture
  • Results
  • Conclusion

3
Introduction
  • There are three major families of techniques
    for performing LPM:
  • TCAM: prohibitive cost and power dissipation
  • Trie-based: large memory requirements and long
    lookup latencies
  • Hash-based

4
Introduction
  • Hash-Based Schemes Advantages
  • an order-of-magnitude lower power
  • small memory sizes
  • key-length-independent O(1) latencies
  • Hash-Based Schemes Disadvantages
  • incur collisions
  • hash functions cannot directly operate on
    wildcard bits

5
Introduction
  • Collision-free Hashing Scheme for LPM (Chisel)
  • builds upon a recent collision-free hashing
    scheme called the Bloomier filter
  • provides support for wildcard bits with small
    additional storage
  • supports fast and incremental updates

6
Bloomier Filter
  • An extension of Bloom filters.
  • Supports storage and retrieval of arbitrary
    per-key information.
  • Guarantees collision-free hashing for a
    constant-time lookup in the worst case.

7
Bloomier Filter
  • The Bloomier filter stores some function
    f : t → f(t) for all keys t.
  • The collection of all the keys is the key set.
  • The process of storing these f(t) values for all
    t is called function encoding.
  • The process of retrieving f(t) for a given t is
    called a lookup.

8
Bloomier Filter
  • The data structure consists of a table indexed by
    k hash functions. We call this the Index Table.
  • The k hash values of a key are collectively
    referred to as its hash neighborhood, represented
    by HN(t).
  • If some hash value of a key is not in the hash
    neighborhood of any other key in the set, then
    that value is said to be a singleton.

9
Bloomier Filter
  • The Index Table is set up such that for every t,
    we find a location τ(t) among HN(t) such that
    there is a one-to-one mapping between all t and
    τ(t).
  • Because τ(t) is unique for each t we can
    guarantee collision-free lookups.
  • Let us call the hash function that hashes t to
    location τ(t) the function hτ(t).

10
Bloomier Filter
  • The idea is to set up the Index Table so that a
    lookup for t returns τ(t).
  • Then we can store f(t) in a separate Result Table
    at address τ(t) and thereby guarantee
    deterministic, collision-free lookups of
    arbitrary functions.

11
Bloomier Filter
  • Encoding the Index Table
  • During encoding, once we find τ(t) for a certain
    t, we write V(t) from Equation 1 into the
    location τ(t).
  • Because τ(t) is unique for each t, we are
    guaranteed that this location will not be altered
    by the encoding of other keys.
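Equations 1 and 2 are not reproduced in this transcript. A reconstruction consistent with the XOR-based lookup the slides describe (writing Table[i] for the Index Table entries, and treating hτ(t) as the identifier of the hash function that maps t to τ(t)) would be:

```latex
% Equation 1 (encoding): the value written into location tau(t)
V(t) = h_{\tau}(t) \oplus \bigoplus_{\substack{i \in HN(t)\\ i \neq \tau(t)}} \mathrm{Table}[i]

% Equation 2 (lookup): XOR over the entire hash neighborhood
h_{\tau}(t) = \bigoplus_{i \in HN(t)} \mathrm{Table}[i]
```

Since Table[τ(t)] = V(t), the other Table[i] terms in Equation 2 cancel against those folded into V(t), leaving hτ(t).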

12
Bloomier Filter
  • Encoding the Index Table
  • Now during a lookup for t, hτ(t) can be retrieved
    by a simple XOR operation of the values in all k
    hash locations of t, as given in Equation 2.
  • We can use hτ(t) in turn to obtain τ(t), then
    read f(t) from the location τ(t) in the Result
    Table.
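The XOR-based encode/lookup pair can be sketched in a few lines of Python. This is an illustrative toy, not the paper's implementation: the table size, hash construction via sha256, and encoding the looked-up value directly (rather than a hash-function identifier plus checksum) are all our own simplifications.

```python
import hashlib

M, K = 64, 3  # illustrative Index Table size and number of hash functions

def hashes(t):
    """K distinct hash locations of key t: its hash neighborhood HN(t)."""
    locs, i = [], 0
    while len(locs) < K:
        h = int(hashlib.sha256(f"{i}:{t}".encode()).hexdigest(), 16) % M
        if h not in locs:
            locs.append(h)
        i += 1
    return locs

def encode(table, t, value, tau):
    """Equation 1: write V(t) into t's unique location (index tau within
    HN(t)) so that the XOR in lookup() recovers `value`.  Which location
    is safe to use comes from the setup algorithm's ordering G."""
    hn = hashes(t)
    v = value
    for loc in hn:
        if loc != hn[tau]:
            v ^= table[loc]
    table[hn[tau]] = v

def lookup(table, t):
    """Equation 2: XOR of the values in all K hash locations of t."""
    v = 0
    for loc in hashes(t):
        v ^= table[loc]
    return v
```

With an all-zero table, `encode(table, "10.1.0.0/16", 7, 0)` makes `lookup(table, "10.1.0.0/16")` return 7; encoding several keys into one table safely requires the ordering G produced by the setup algorithm described on the following slides.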

13
Bloomier Filter
14
Bloomier Filter
  • The Bloomier Filter Setup Algorithm
  • We first hash all keys into the Index Table and
    then make a single pass over the entire table to
    find keys that have any singletons (locations
    with no collisions).
  • All these keys with singletons are then pushed
    onto the top of a stack.

15
Bloomier Filter
  • The keys are considered one by one starting from
    the bottom of the stack, and removed from all of
    their k hash locations in the Index Table.
  • The affected locations are examined for any new
    singletons, which are then pushed onto the top of
    the stack.

16
Bloomier Filter
  • This process is repeated until the Index Table
    becomes empty.
  • The final stack, considered top to bottom,
    represents an ordering G.
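The peeling procedure on the last three slides can be sketched in software as follows. This is our own simplified rendering (hash construction and table size are arbitrary choices), returning for each key the singleton location τ(t) it was peeled at:

```python
import hashlib
from collections import defaultdict

M, K = 257, 3  # illustrative Index Table size and number of hash functions

def hashes(t):
    """K distinct hash locations (hash neighborhood) of key t."""
    locs, i = [], 0
    while len(locs) < K:
        h = int(hashlib.sha256(f"{i}:{t}".encode()).hexdigest(), 16) % M
        if h not in locs:
            locs.append(h)
        i += 1
    return locs

def setup_order(keys):
    """Repeatedly peel keys that own a singleton location, removing them
    from all K of their hash locations; the reversed removal sequence is
    the encoding ordering G, as (key, tau(t)) pairs."""
    occupants = defaultdict(set)          # location -> keys hashing there
    for t in keys:
        for loc in hashes(t):
            occupants[loc].add(t)
    removed_seq, done = [], set()
    progress = True
    while len(done) < len(keys) and progress:
        progress = False
        for t in keys:
            if t in done:
                continue
            tau = next((loc for loc in hashes(t)
                        if occupants[loc] == {t}), None)
            if tau is None:
                continue                  # no singleton for t (yet)
            removed_seq.append((t, tau))
            done.add(t)
            for loc in hashes(t):
                occupants[loc].discard(t)
            progress = True
    if len(done) < len(keys):
        return None                       # setup failure: no new singleton
    return list(reversed(removed_seq))    # G: last-removed key first
```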

17
Bloomier Filter
18
Bloomier Filter
19
Bloomier Filter
  • G ensures that every key t has at least one
    unique hash location τ(t) in its hash
    neighborhood.
  • We can now process the keys in order G, encoding
    V(t) into τ(t) for each t, using Equation 1.
  • Therefore, lookups are guaranteed to obtain the
    correct hτ(t) values for all t using Equation 2.
  • The running time of the setup algorithm is O(n)
    for n keys.

20
Chisel Architecture
  • Convergence of the Setup Algorithm
  • Removing False Positives
  • Supporting Wildcards
  • Incremental Updates

21
Convergence of the Setup Algorithm
  • At each step the setup algorithm removes some key
    from the Index Table and then searches for new
    singletons.
  • If at some step a new singleton is not found then
    the algorithm fails to converge.
  • For a Bloomier filter with k hash functions, n
    keys and an Index Table of size m ≥ kn, the
    probability of setup failure P(fail) is
    upper-bounded as follows

22
Convergence of the Setup Algorithm
23
Convergence of the Setup Algorithm
  • In the event that a setup failure does occur, we
    move a few problematic keys into a spillover TCAM
    and resume setup.
  • The probability of the same setup subsequently
    failing 1, 2, 3, and 4 times is 10^-14, 10^-21,
    10^-28, and 10^-35, respectively.
  • Therefore, a small spillover TCAM (e.g., 16 to 32
    entries) suffices.

24
Removing False Positives
  • A false positive can occur when a Bloomier filter
    lookup involves some key t which was not in the
    set of original keys used for setup.
  • Prior work [6] addresses such false positives by
    concatenating a checksum c(t) to hτ(t) and using
    this concatenation in place of hτ(t) in Equation
    1 during setup.
  • The wider this checksum field, the smaller the
    probability of false positives (PFP).

25
Removing False Positives
  • A non-zero PFP means that some specific keys will
    always incur false positives.
  • Therefore a non-zero PFP, no matter how small, is
    unacceptable for LPM.

26
Removing False Positives
  • We propose a storage-efficient scheme to
    eliminate false positives for our LPM
    architecture.
  • The basic idea is to store all original keys in
    the data structure and match them against the
    lookup keys.

27
Removing False Positives
  • During setup, we encode a pointer p(t) for each t
    instead of encoding hτ(t).
  • p(t) directly points into a Result Table having n
    locations.
  • Thus, the Index Table encoding equation (Equation
    1) is modified as follows

28
Removing False Positives
  • During lookup, we extract p(t) from the Index
    Table (using Equation 2), and read out both f(t)
    and t from the location p(t) in the Result Table.
  • We then compare the lookup key against the value
    of t.
  • If the two match then f(t) is a correct lookup
    result, otherwise it is a false positive.
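The pointer-then-compare lookup can be sketched as follows. This is a simplified software model: the `index_lookup` callback stands in for the Equation 2 XOR over the Index Table, and both tables are plain Python lists with hypothetical contents.

```python
def chisel_lookup(index_lookup, result_table, filter_table, key):
    """Resolve key to a pointer p(t) via the Index Table, then verify the
    stored original key to reject false positives."""
    p = index_lookup(key)                 # pointer extracted per Equation 2
    if p is None or not (0 <= p < len(result_table)):
        return None                       # pointer itself is garbage
    if filter_table[p] != key:            # stored original t vs. lookup key
        return None                       # mismatch => false positive
    return result_table[p]                # f(t), read from the Result Table

# Toy tables with two encoded keys; "10.9.0.0/16" was never encoded but
# (by construction here) aliases to pointer 1, simulating a false positive.
result_table = ["next-hop-A", "next-hop-B"]
filter_table = ["10.0.0.0/8", "10.8.0.0/13"]
index = {"10.0.0.0/8": 0, "10.8.0.0/13": 1, "10.9.0.0/16": 1}
```

Here `chisel_lookup(index.get, result_table, filter_table, "10.0.0.0/8")` returns "next-hop-A", while the unencoded key "10.9.0.0/16" reaches the same pointer but fails the Filter Table comparison and returns None.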

29
Removing False Positives
  • In order to facilitate hardware implementation,
    the actual architecture uses two separate tables
    to store f(t) and t, the former being the Result
    Table and the latter the Filter Table.

30
Chisel Architecture
31
Supporting Wildcards
  • We propose a novel technique called prefix
    collapsing which efficiently supports wildcard
    bits.
  • In contrast to CPE, prefix collapsing converts a
    prefix of length x into a single prefix of
    shorter length x−l (l ≥ 1) by replacing its l
    least significant bits with wildcard bits.
  • The maximum number of bits collapsed is called
    the stride.
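As a concrete (hypothetical) illustration, with prefixes written as '0'/'1' strings, collapsing amounts to truncation, since the dropped bits become don't-cares:

```python
def collapse(prefix, target_len):
    """Prefix collapsing: replace the least-significant bits of `prefix`
    (a '0'/'1' string) beyond `target_len` with wildcards, which for a
    bit string amounts to truncation."""
    assert 0 <= target_len <= len(prefix)
    return prefix[:target_len]

# A /6 prefix collapsed to length 4: '101100' and '101111' both
# collapse to the same key '1011'.
```

Unlike CPE, which expands a length-x prefix into 2^l longer prefixes, collapsing yields a single shorter key; prefixes that become identical after collapsing must then be disambiguated, which is the role of the Bit-vector Table mentioned later in the slides.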

32
Chisel Architecture
33
Supporting Wildcards
  • Each instance of Figure 6 is referred to as a
    Chisel sub-cell.
  • The Chisel architecture for the LPM application
    consists of a number of such sub-cells, one for
    each of the collapsed prefix lengths l1, ..., lj.
  • Prefixes having lengths between li and li+1 are
    stored in the sub-cell for li.

34
Supporting Wildcards
  • A lookup collapses the lookup key to lengths
    l1, ..., lj, and searches all sub-cells in
    parallel.
  • The results from all sub-cells are sent to a
    priority encoder, which picks the result from
    that matching sub-cell which corresponds to the
    longest collapsed length.
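A software model of this parallel search and priority encoding (sequential here for clarity; each sub-cell is modelled as a dict from collapsed prefix to result, which omits the per-sub-cell Chisel lookup machinery):

```python
def lpm_lookup(key_bits, subcells):
    """Probe every sub-cell with the key collapsed (truncated) to that
    sub-cell's length; a priority encoder keeps the hit belonging to the
    longest collapsed length."""
    best_len, best = -1, None
    for length, table in subcells.items():   # {collapsed_len: {prefix: f(t)}}
        hit = table.get(key_bits[:length])   # collapse = truncate to length
        if hit is not None and length > best_len:
            best_len, best = length, hit
    return best

# Two hypothetical sub-cells, for collapsed lengths 4 and 8.
subcells = {
    4: {"1010": "next-hop-A"},
    8: {"10101100": "next-hop-B"},
}
```

For the key "101011001111", both sub-cells match and the priority encoder picks the length-8 result "next-hop-B"; a key matching only the length-4 entry falls back to "next-hop-A".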

35
Chisel Architecture
36
Incremental Updates
  • The Bloomier Filter supports only a static set of
    keys.
  • To address this shortcoming, we equip the Chisel
    architecture with extensions based on certain
    heuristics, in order to support fast and
    incremental updates.

37
Incremental Updates
  • We observe that in real update traces, 99.9% of
    prefixes added by updates are such that when
    those prefixes are collapsed to the appropriate
    length, they become identical to some collapsed
    prefix already present in the Index Table.
  • Therefore, we need to update only the Bit-vector
    Table, and not the Index Table, for these updates.

38
Incremental Updates
  • We also observe that in real update traces a
    large fraction of updates are actually route
    flaps (i.e., a prefix is added back after being
    recently removed).
  • We temporarily mark the prefix dirty and
    temporarily retain it in the Index Table, instead
    of immediately removing it.

39
Incremental Updates
  • We maintain a shadow copy of the data structures
    in software.
  • When an update command is received, we first
    incrementally update the shadow copy, and then
    transfer the modified portions of the data
    structure to the hardware engine.

40
Results
  • Chisel versus EBF+CPE
  • Comparison against EBF with No Wildcards
  • Prefix Collapsing vs. Prefix Expansion
  • Chisel versus EBF+CPE
  • Scalability
  • Scaling with Router Table Size
  • Scaling with Key Width
  • Power using Embedded DRAM
  • Updates
  • Comparison with Other Families
  • Chisel vs. Tree Bitmap
  • Chisel vs. TCAMs

41
Results
42
Results
43
Results
44
Conclusion
  • Based upon a recently proposed hashing scheme
    called the Bloomier filter, we architected an LPM
    solution and proposed a novel technique, called
    prefix collapsing, for supporting wildcard bits.
  • We also built support for fast and incremental
    updates by exploiting key characteristics found
    in real update traces.

45
Conclusion
  • Another significant advantage of Chisel is that
    it has memory requirements small enough to be
    implemented on-chip using embedded DRAM.
  • Chisel performs only one off-chip access at the
    end of a lookup, when a pointer is sent to an
    off-chip next-hop table.