1
Foundations of Privacy: Informal Lecture
Anti-Persistence, or History Independent Data
Structures
  • Lecturer: Moni Naor

2
Why hide your history?
  • Core dumps
  • Losing your laptop
  • The entire memory representation of data structures is exposed
  • Emailing files
  • The editing history may be exposed (e.g. Word)
  • Maintaining lists of people
  • Sports teams, party invitees

3
Election Day
  • Elections for class president
  • Each student whispers in Mr. Drew's ear
  • Mr. Drew writes down the votes
[Figure: Mr. Drew's notebook lists the votes in arrival order: Carol, Alice, Alice, Bob]
  • Problem: Mr. Drew's notebook leaks sensitive information
  • The first student voted for Carol
  • The second student voted for Alice
4
Learning from history only what's necessary
  • A data structure has
  • A legitimate interface: the set of operations allowed to be performed on it
  • A memory representation
  • The memory representation should reveal no information that cannot be obtained from the legitimate interface

5
History of history independence
  • Issue dealt with in Cryptographic and Data
    Structures communities
  • Micciancio (1997): history independent trees
  • Motivation: incremental cryptography
  • Based on the shape of the data structure, not including the memory representation
  • Stronger performance model!
  • Uniquely represented data structures
  • Treaps (Seidel & Aragon), uniquely represented dictionaries
  • Ordered hash tables (Amble & Knuth, 1974)

6
More History
  • Persistent data structures: it is possible to reconstruct all previous states of the data structure (Sarnak and Tarjan)
  • We want the opposite: anti-persistence
  • Oblivious RAM (Goldreich and Ostrovsky)

7
Overview
  • Definitions
  • History independent open addressing hashing
  • History independent dynamic perfect hashing
  • Memory Management
  • (Union Find)
  • Open problems

8
Precise Definitions
  • A data structure is history independent if any two sequences of operations S1 and S2 that yield the same content induce the same probability distribution on the memory representation.
  • It is strongly history independent if, given any two sets of breakpoints along S1 and S2 such that corresponding points have identical contents, S1 and S2 induce the same probability distributions on the memory representation at those points.

Alternative definition: via transition probabilities
9
Relaxations
  • Statistical closeness
  • Computational indistinguishability
  • Example where this helps: erasing
  • Allow some information to be leaked
  • e.g. the total number of operations
  • n-history independent: identical distributions whenever the last n operations were identical as well
  • Under-defined data structures: the same query can yield several legitimate answers, e.g. an approximate priority queue
  • Define identical content: there is no suffix T such that the set of permitted results returned by S1∘T differs from the one returned by S2∘T

10
History independence is easy (sort of)
  • If it is possible to decide on the (lexicographically) first sequence of operations that produces a given content, just store the result of that sequence
  • This gives a history independent version of a
    huge class of data structures
  • Efficiency is the problem

11
Dictionaries
  • Operations are insert(x), lookup(x) and possibly
    delete(x)
  • The content of a dictionary is the set of
    elements currently inserted (those that have been
    inserted but not deleted)
  • Elements x ∈ U, some universe
  • Size of table/memory: N

12
Goal
  • Find a history independent implementation of
    dictionaries with good provable performance.
  • Develop general techniques for history
    independence

13
Approaches
  • Unique representation
  • e.g. array in sorted order
  • Yields strong history independence
  • Secret randomness
  • e.g. array in random order
  • Yields only (weak) history independence, not strong
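The sorted-array idea above can be made concrete. This is an illustrative Python sketch (not from the slides): because the layout is a function of the set alone, any two operation sequences with the same content leave an identical representation.

```python
# Illustrative sketch: a dictionary kept as a sorted array. The memory
# layout depends only on the current set, never on the operation history,
# so this is a uniquely represented (strongly history independent) structure.
import bisect

class SortedArrayDict:
    def __init__(self):
        self.elems = []                    # canonical form: sorted, no duplicates

    def insert(self, x):
        i = bisect.bisect_left(self.elems, x)
        if i == len(self.elems) or self.elems[i] != x:
            self.elems.insert(i, x)

    def delete(self, x):
        i = bisect.bisect_left(self.elems, x)
        if i < len(self.elems) and self.elems[i] == x:
            self.elems.pop(i)

    def lookup(self, x):
        i = bisect.bisect_left(self.elems, x)
        return i < len(self.elems) and self.elems[i] == x

# Two different histories with the same content:
d1, d2 = SortedArrayDict(), SortedArrayDict()
for v in [3, 1, 2]:
    d1.insert(v)
for v in [2, 3, 1, 7]:
    d2.insert(v)
d2.delete(7)
assert d1.elems == d2.elems == [1, 2, 3]   # same content, same representation
```

The trade-off is update cost: each insert or delete shifts O(n) elements, which is exactly the efficiency problem the hashing schemes address.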

14
Open addressing: traditional version
  • Each element x has a probe sequence
  • h1(x), h2(x), h3(x), ...
  • Linear probing: h2(x) = h1(x)+1, h3(x) = h1(x)+2, ...
  • Double hashing
  • Uniform hashing
  • Element is inserted into the first free space in
    its probe sequence
  • Search ends unsuccessfully at a free space
  • Efficient space utilization
  • Almost all the table can be full
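As a concrete reference point, here is a minimal Python sketch (not from the slides) of the traditional scheme with linear probing and a stand-in hash function; the "first free cell" insertion rule is what makes the layout depend on arrival order.

```python
# Toy sketch of traditional open addressing with linear probing:
# the probe sequence of x is h(x), h(x)+1, h(x)+2, ... (mod N).
N = 8
table = [None] * N

def h(x):
    return hash(x) % N              # stand-in for the first probe h1(x)

def insert(x):
    for k in range(N):
        i = (h(x) + k) % N          # k-th probe of x
        if table[i] is None or table[i] == x:
            table[i] = x            # first free cell in the probe sequence
            return
    raise RuntimeError("table full")

def lookup(x):
    for k in range(N):
        i = (h(x) + k) % N
        if table[i] is None:        # unsuccessful search ends at a free cell
            return False
        if table[i] == x:
            return True
    return False

insert(5); insert(13)               # both probe cell 5 first; 13 is pushed to 6
assert lookup(5) and lookup(13) and not lookup(99)
```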

15
Open addressing: traditional version
Not history independent: later-inserted elements move further along in their probe sequences.
[Animation: x arrived before y, so on a clash y moves on; with no clash y is inserted directly]
16
History independent version
  • At each cell i, decide elements' priorities independently of insertion order
  • Call the priority function pi(x,y).
  • If there is a clash, move the element of lower
    priority
  • At each cell, priorities must form a total order
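A minimal Python sketch of this rule, assuming linear probing and a single global priority (smaller value wins) standing in for pi(x,y): on a clash the lower-priority element is evicted and continues along its own probe sequence, and the final layout is independent of insertion order.

```python
# Sketch (assumptions: linear probing, a global priority where the smaller
# element wins the cell) of priority-based, history independent insertion.
def insert(table, x):
    n = len(table)
    i = x % n                        # first probe of x (integer keys)
    for _ in range(2 * n):
        y = table[i]
        if y is None:
            table[i] = x
            return
        if y == x:
            return
        if x < y:                    # x has priority: evict y from cell i
            table[i], x = x, y       # y resumes probing from the next cell
        i = (i + 1) % n              # next probe (linear probing)
    raise RuntimeError("table full")

# The same set inserted in two different orders yields the same layout:
t1, t2 = [None] * 8, [None] * 8
for v in [5, 13, 21, 2, 10]:
    insert(t1, v)
for v in [10, 21, 2, 13, 5]:
    insert(t2, v)
assert t1 == t2
```

Note that evicting into the *next* cell is valid here only because linear probing makes every element's next probe the adjacent cell; general probe sequences need the per-element step count (see the Youth-rules sketch).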

17
Insertion
[Animation: x clashes with y at cell 2; p2(x,y) does not hold, so x moves on to its next probe]
18
Search
  • Same as in the traditional algorithm
  • In unsuccessful search, can quit as soon as you
    find a lower-priority element

No deletions
  • Problematic in open addressing
  • A possible way out: clusters

19
Strong history independence
  • Claim: for all hash functions and priority functions, the final configuration of the table is independent of the order of insertion.
  • Conclusion: strongly history independent

20
Proof of history independence
A static insertion algorithm (clearly history independent): try to place all elements at once, resolve each clash by priority, gather up the rejects, and restart them in the next round.
[Animation: elements x1...x6 placed round by round; e.g. p1(x2,x1) holds, so insert x2; p3(x4,x5) and p3(x4,x6) hold, so insert x4 and move x5 and x6 on]
21
Proof of history independence
  • Nothing moves further in the static algorithm
    than in the dynamic one
  • By induction on rounds of the static alg.
  • Vice versa
  • By induction on the steps in the dynamic alg.
  • Strongly history independent

Alternative view: Blelloch-Golovin stable matching
22
Some priority functions
  • Global: a single priority function, independent of the cell
  • Random: choose a random order at each cell
  • Youth-rules: call an element younger if it has moved less far along its probe sequence; younger elements get higher priority

23
Youth-rules
[Animation: x and y clash at cell 2; p2(x,y) holds because x has taken fewer steps than y, so y moves on]
  • Use a tie-breaker if the number of steps is the same
  • This is a priority function
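Youth-rules can be sketched in Python, assuming linear probing for brevity (the slides allow general probe sequences): an element's "age" at a cell is how many probe steps it has taken, which under linear probing equals its displacement from its home cell, so it can be recomputed rather than stored.

```python
# Sketch of Youth-rules under linear probing: the younger element (fewer
# probe steps) wins the cell; ties break by value so priorities are a
# total order at each cell.
def insert(table, x):
    n = len(table)
    i, k = x % n, 0                       # current cell and step count of x
    for _ in range(2 * n):
        y = table[i]
        if y is None:
            table[i] = x
            return
        if y == x:
            return
        ky = (i - y % n) % n              # steps y took to reach cell i
        if k < ky or (k == ky and x < y): # x is younger, or wins the tie
            table[i], x, k = x, y, ky     # evict y; y resumes probing
        i, k = (i + 1) % n, k + 1
    raise RuntimeError("table full")

t1, t2 = [None] * 8, [None] * 8
for v in [5, 13, 21]:                     # all three hash to cell 5
    insert(t1, v)
for v in [21, 13, 5]:
    insert(t2, v)
assert t1 == t2                           # order independent
```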
24
Specifying a scheme
  • Priority rule
  • Choice of priority functions
  • In Youth-rules: determined by the probe sequence
  • Probe functions
  • How are they chosen, maintained, and computed?

25
Implementing Youth-rules
  • Let each hi be chosen from a pairwise independent collection:
  • For any two x and y, the random variables hi(x) and hi(y) are uniform and independent.
  • Let h1, h2, h3, ... be chosen independently
  • Example: hi(x) = ((ai·x + bi) mod U) mod N, with U prime
  • Space: two elements per function
  • Need only log N functions
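A sketch of the construction in Python, using the standard Carter-Wegman form hi(x) = ((ai·x + bi) mod U) mod N with U prime; each function costs two stored words of randomness, and only about log N of them are kept.

```python
# Sketch of a pairwise independent hash family in the spirit of the slide.
import random

U = (1 << 31) - 1        # a Mersenne prime, at least the universe size
N = 1024                 # table size

def make_h():
    # Two words of randomness per function: h(x) = ((a*x + b) mod U) mod N
    a = random.randrange(1, U)
    b = random.randrange(U)
    return lambda x: ((a * x + b) % U) % N

# Only about log N independently chosen functions are needed.
hs = [make_h() for _ in range(N.bit_length())]
```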
26
Performance Analysis
  • Based on a worst-case insertion sequence
  • The important parameter: α, the fraction of the table that is used (αN elements)
  • Analysis of expected insertion time and search time (number of probes to the table)
  • Have to distinguish successful and unsuccessful search

27
Analysis via the Static Algorithm
  • For insertions, the total numbers of probes in the static and dynamic algorithms are identical
  • Easier to analyze the static algorithm
  • Key point for Youth-rules: in phase i, all unsettled elements are at the i-th probe of their sequence
  • This assures fresh randomness of hi(x)

28
Performance
  • For Youth-rules, implemented as specified:
  • For any sequence of insertions, the expected probe-time for insertion is at most 1/(1-α)
  • For any sequence of insertions, the expected probe-time for successful or unsuccessful search is at most 1/(1-α)
  • Analysis based on the static algorithm
  • α is the fraction of the table that is used

29
Comparison to double hashing
  • Analysis of double hashing with truly random functions [Guibas & Szemerédi; Lueker & Molodowitch]
  • The truly random functions can be replaced by log n-wise independent functions [Schmidt & Siegel]
  • log n-wise independence is relatively expensive: either a lot of space or log n time
  • Youth-rules is a simple and provably efficient scheme with very little extra storage
  • An extra benefit of considering history independence

30
Other Priority Functions
  • Amble & Knuth: log(1/(1-α)) for the global priority
  • Truly random hash functions:
  • Experiments show about log(1/(1-α)) for most priority functions tried
  • Performance is for amortized search

31
Other types of data structures
  • Memory management (dealing with pointers)
  • Memory Allocation
  • Other state-related issues

32
Dynamic perfect hashing: FKS scheme, dynamized
  • n elements to be inserted
  • Top-level table: O(n) space, hashed by h
  • Low-level tables: O(n) space total; the low-level table for a bucket of si elements gets about si² cells
[Figure: h maps elements x1...x6 into buckets of sizes s0, s1, ..., sk; the bucket of size si is hashed by hi into its own sub-table]
The hi are perfect on their respective sets. Rechoose h or some hi to maintain perfection and linear space.
33
A subtle problem: the intersection bias problem
  • Suppose we have
  • a set of states σ1, σ2, ...
  • a set of objects h1, h2, ...
  • a way to decide whether hi is good for σj
  • Keep a current h as the state changes:
  • Change h only if it is no longer good.
  • Choose uniformly from the ones good for the current state σ.
  • Then this is not history independent:
  • h is biased towards the intersection of the functions good for the current σ and those good for previous states.

34
Dynamized FKS is not history independent
  • Does not erase upon deletion
  • Uses history-dependent memory allocation
  • Hash functions (h, h1, h2, ...) are changed
    whenever they cease to be good
  • Hence they suffer from the intersection bias
    problem, since they are biased towards functions
    that were good for previous sets of elements
  • Hence they leak information about past sets of
    elements

35
Making it history independent
  • Use history independent memory allocation
  • Upon deletion, erase the element and rechoose the
    appropriate hi. This solves the low-level
    intersection bias problem.
  • Some other minor changes
  • Solve the top-level intersection bias problem...

36
Solving the top-level intersection bias problem
  • Can't afford a top-level rehash on every deletion
  • Generate two potential h's, ĥ1 and ĥ2, at the beginning
  • Always use the first good one
  • If neither is good, rehash at every deletion
  • If not using ĥ1, keep a top-level table for it for easy goodness checking (likewise for ĥ2)
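A toy Python sketch of the two-candidate rule. The goodness test here is a hypothetical placeholder (a bound on the sum of squared bucket sizes, standing in for the FKS linear-space condition); the point is only the selection logic: always prefer the first candidate, fall back to the second, and signal "rehash every step" when both are bad.

```python
def is_good(h, elems, n):
    # Hypothetical goodness test: bounded sum of squared top-level bucket
    # sizes, a stand-in for the real FKS linear-space condition.
    buckets = [0] * n
    for x in elems:
        buckets[h(x)] += 1
    return sum(b * b for b in buckets) <= 2 * max(len(elems), 1)

def current_h(h1, h2, elems, n):
    """Return the first good candidate, or None (rehash at every step)."""
    if is_good(h1, elems, n):
        return h1                  # always prefer the first candidate
    if is_good(h2, elems, n):
        return h2
    return None

h1 = lambda x: x % 8
h2 = lambda x: (3 * x) % 8
assert current_h(h1, h2, {1, 2, 3, 4}, 8) is h1      # h1 still good
assert current_h(h1, h2, {0, 8, 16, 24}, 8) is None  # both collapse to one bucket
```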

37
Proof of history independence
  • The table's state is defined by:
  • The current set of elements
  • The top-level hash function: always the first good candidate, or rechosen at each step
  • The low-level hash functions: uniformly chosen from the perfect functions
  • The arrangement of sub-tables in memory: history-independent memory allocation
  • Some other history independent things

38
Performance
  • Lookup takes two steps
  • Insertion and deletion take expected amortized
    O(1) time
  • There is a 1/poly chance that they will take more

39
SHI and Unique Representation
  • Theorem (Hartline et al.): for a reversible data structure to be SHI, a canonical (unique) representation for each state must be determined at the data structure's initialization.

40
SHI with Deletions
  • Blelloch and Golovin: a dictionary based on linear probing
  • Goal: search in O(1) time (guaranteed)
  • Each cluster has size O(log n)
  • Can be obtained using 5-wise independence [Pagh et al., STOC 2007]
  • Needs a random oracle for the high-level intersection bias

41
Open Problems
  • Better analysis for youth-rules as well as other
    priority functions with no random oracles.
  • Efficient memory allocation
  • ours is O(s log s)
  • Separations
  • Between strong and weak history independence [Buchbinder-Petrank]
  • Between history independent and traditional
    versions
  • e.g. for Union Find
  • Can persistence and (computational) history
    independence co-exist efficiently?

42
References
  • Moni Naor and Vanessa Teague, Anti-persistence: History Independent Data Structures, STOC 2001.
  • Hartline, Hong, Mohr, Pentney and Rocke, Characterizing History Independent Data Structures, Algorithmica, 2005.
  • Buchbinder and Petrank, Lower and Upper Bounds on Obtaining History Independence, Information and Computation, 2006.
  • Guy Blelloch and Daniel Golovin, Strongly History-Independent Hashing with Applications, FOCS 2007.
  • Tal Moran, Moni Naor and Gil Segev, Deterministic History-Independent Strategies for Storing Information in Write-Once Memories, ICALP 2007.