Data types in P2P systems

1
Data types in P2P systems
  • Henning Schulzrinne
  • Columbia University

2
Data issues
  • What do we store and fetch?
  • How do we protect it?
  • separate topic: CMS, homebrew, ...
  • Analogies
  • file system (meta data, ACL, blob)
  • some had/have data types (e.g., VMS records)
  • hash libraries (e.g., hcreate())
  • database libraries (e.g., gdbm)
  • language features such as arrays and dictionaries
  • Tcl, Perl, PHP, Python, Ruby, ...

[Figure labels: ID, data, crypto wrapper]

3
Requirements
  • Need to be able to store any data item
  • of reasonable size
  • New types MUST NOT require rewriting node
    software
  • otherwise, little chance to generalize
  • ID (database key) must allow non-AOR values
  • otherwise, doesn't generalize to VoD and other
    uses
  • Allow both standard data objects and
    application-specific ones
  • e.g., SIP registration records and
    vendor-specific configuration data or I-D 00
    experiments
  • Provide access control
  • e.g., only creator can read
  • should work with semi-trusted nodes
  • Provide policy control per node or overlay
  • e.g., maximum size of objects
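
One way to read the requirements above is that a node stores opaque bytes tagged with a key and a type, and applies policy without interpreting the payload. The following is only a minimal sketch under that assumption; the names `StoredObject`, `NodePolicy`, `accept`, and the 4096-byte limit are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredObject:
    resource_id: bytes  # arbitrary key, not necessarily an AOR
    type_id: int        # numeric type tag; the node does not interpret it
    data: bytes         # opaque payload, so new types need no new node code


@dataclass(frozen=True)
class NodePolicy:
    max_object_size: int = 4096  # hypothetical per-node limit, in bytes


def accept(obj: StoredObject, policy: NodePolicy) -> bool:
    """Policy check that needs no knowledge of what the payload means."""
    return len(obj.data) <= policy.max_object_size
```

Because `accept` looks only at the payload length, a vendor-specific type can be stored by an unmodified node as long as it respects the size limit.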

4
Data typing
  • RELOAD -03: 32-bit integer
  • with some TBD allocation scheme
  • e.g., similar to IANA ports registry
  • compromise between space efficiency and
    extensibility
  • a usage may define one or more data types
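
To make the 32-bit type identifier concrete, here is a small sketch. The allocation scheme is TBD in the slides, so the split into a registered lower half and a private upper half is purely an assumption for illustration, as are the registry entries and the function names.

```python
import struct

# Assumption for illustration only: the slides leave the allocation scheme TBD.
# Here the lower half of the 32-bit space is "registered" and the upper half
# is left for private or experimental use, loosely like port-number ranges.
PRIVATE_BASE = 0x8000_0000
MAX_TYPE_ID = 0xFFFF_FFFF

# Hypothetical registry entries; the numbers and names are made up.
REGISTERED_TYPES = {
    1: "sip-registration",
    2: "turn-server",
}


def describe_type(type_id: int) -> str:
    """Return a human-readable label for a 32-bit data type identifier."""
    if not 0 <= type_id <= MAX_TYPE_ID:
        raise ValueError("type ID must fit in an unsigned 32-bit integer")
    if type_id >= PRIVATE_BASE:
        return "private/experimental"
    return REGISTERED_TYPES.get(type_id, "unassigned")


def encode_type(type_id: int) -> bytes:
    """Encode the type ID as 4 bytes in network byte order."""
    return struct.pack("!I", type_id)
```

A fixed 4-byte field keeps every object header small, while a reserved private range would let vendors experiment without a registry entry.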

5
Multiple objects with same key (ID)
  • What happens if there's a hash collision?
  • Possibilities
  • global space
  • egg is
  • may cause DoS attacks if ID = AOR
  • creator
  • e.g., ID space could be local to creator
  • (egg, alice) and (egg, bob) are two different
    objects
  • allows an "only AOR can register AOR" policy
  • data type
  • (egg, 17, alice) and (egg, 42, alice) are two
    different objects
  • auxiliary data
  • (egg, 17, alice, label1) and (egg, 17, alice,
    label2) are two different objects
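
The scoping options above can be sketched as a composite key, where each additional field narrows the namespace. This is only an illustration; the class name `ObjectKey` and its field names are hypothetical, and the field order follows the (egg, 17, alice, label1) examples.

```python
from typing import NamedTuple


class ObjectKey(NamedTuple):
    """Composite key: each extra field narrows the namespace further."""
    resource_id: str  # e.g., "egg"
    type_id: int      # e.g., 17
    creator: str      # e.g., "alice"
    label: str = ""   # optional auxiliary data


store = {}
store[ObjectKey("egg", 17, "alice", "label1")] = b"first"
store[ObjectKey("egg", 17, "alice", "label2")] = b"second"
store[ObjectKey("egg", 42, "alice")] = b"other type"
store[ObjectKey("egg", 17, "bob")] = b"other creator"
```

All four entries coexist under the same resource ID, so a collision on "egg" alone no longer means one object overwrites another.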

6
Access control and versioning
  • Basic access control policy: only owner
    (creator) can replace (or delete) object
  • enforced by peer
  • can't prevent peer from disobeying policy
  • Do we need more elaborate policies?
  • similar to Unix ACLs or chmod bits?
  • unbounded complexity
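
The basic owner-only policy could be enforced by the storing peer roughly as follows. This is a sketch, not a protocol definition: `PeerStore` is a hypothetical name, and the requester identity is assumed to have been authenticated already (for example by a signature check that is not shown).

```python
class PeerStore:
    """Storing peer that enforces 'only the owner may replace or delete'."""

    def __init__(self):
        self._objects = {}  # key -> (owner, data)

    def store(self, key, requester, data):
        current = self._objects.get(key)
        # The first writer becomes the owner; later writers must match.
        if current is not None and current[0] != requester:
            raise PermissionError("only the owner may replace this object")
        self._objects[key] = (requester, data)

    def delete(self, key, requester):
        current = self._objects.get(key)
        if current is None:
            raise KeyError(key)
        if current[0] != requester:
            raise PermissionError("only the owner may delete this object")
        del self._objects[key]

    def fetch(self, key):
        return self._objects[key][1]
```

Nothing here stops a misbehaving peer from skipping the check, which is the limitation the slide points out.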

7
Versioning and timestamps
  • Do objects have timestamps?
  • Option 1: just metadata
  • Option 2: cannot replace newer object with older
    one
  • prevents replay attack
  • does not require synchronized clock in overlay
  • as long as all instances of owner have roughly
    synchronized clocks or fetch current value
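
Option 2 can be sketched as a store that compares the owner-supplied timestamp of an incoming write against the timestamp already stored. The class name `TimestampedStore` is hypothetical, and rejecting equal timestamps (so that an exact replay also fails) is an assumption the slides do not state.

```python
class TimestampedStore:
    """Rejects writes that are not newer than the stored object."""

    def __init__(self):
        self._objects = {}  # key -> (timestamp, data)

    def store(self, key, timestamp, data):
        """Accept the write only if it is strictly newer than what is stored."""
        current = self._objects.get(key)
        if current is not None and timestamp <= current[0]:
            return False  # stale or replayed write is rejected
        self._objects[key] = (timestamp, data)
        return True

    def fetch(self, key):
        return self._objects[key]
```

The peer only compares timestamps supplied by the owner with each other and never consults its own clock, which is why the overlay itself does not need synchronized clocks.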

8
Compound data structures
  • Should there be compound data structures?
  • Options (at least)
  • But also STL (see next slide)
  • Scripting languages (PHP, Tcl, Ruby, ...)
  • Need data structures AND operations
  • e.g., enumeration, traversal (iterator),
    insertion at beginning/end, ...
  • Interaction with policies and replication

9
Example: STL containers
Category     Container       Notes
Sequential   vector          fast inserts at end
Sequential   list            inserts anywhere
Sequential   deque           inserts at start and end
Associative  multiset        duplicates allowed
Associative  set             no duplicates
Associative  multimap        1-to-many
Associative  map             1-to-1
Adapter      stack           FILO
Adapter      queue           FIFO
Adapter      priority_queue  sorted order

10
Example: Tcl
  • set name(first) "Mary"
  • Fake two-dimensional arrays: name(a,b)
  • No ordering guarantees
  • PHP differs, for example (insertion order)
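
For comparison, the same idea in Python: a flat dictionary whose keys happen to have two parts. This is an analogy rather than a translation of the Tcl semantics; the variable names are illustrative only.

```python
# Associative array with a single string key.
name = {}
name["first"] = "Mary"

# "Fake" two-dimensional array: the key is a tuple, the dict is still flat.
grid = {}
grid[("a", "b")] = 1
grid[("a", "c")] = 2

# Python dicts (3.7+) iterate in insertion order, as PHP arrays do.
keys_in_order = list(grid)
```

Whether iteration order is guaranteed matters for a P2P data model, since replicas that enumerate the same dictionary should agree on the result.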

11
Proposed simple type
  • Uniquely identified by H(data)
  • within (ID, owner, type)
  • Operations
  • replace
  • list all (ID, owner, type) hashes (?)
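
A minimal sketch of the proposed simple type follows: objects live in a bucket per (ID, owner, type) and are identified inside the bucket by H(data). The slides name neither the hash function nor the exact replace semantics, so SHA-256 and the optional `old_digest` argument are assumptions, and `SimpleTypeStore` is a hypothetical name.

```python
import hashlib
from collections import defaultdict


def H(data: bytes) -> str:
    # Assumption: SHA-256; the slides do not name a hash function.
    return hashlib.sha256(data).hexdigest()


class SimpleTypeStore:
    def __init__(self):
        # (resource_id, owner, type_id) -> {H(data): data}
        self._buckets = defaultdict(dict)

    def replace(self, resource_id, owner, type_id, new_data, old_digest=None):
        """Store new_data; if old_digest is given, remove that object first."""
        bucket = self._buckets[(resource_id, owner, type_id)]
        if old_digest is not None:
            del bucket[old_digest]  # KeyError if the old object is absent
        digest = H(new_data)
        bucket[digest] = new_data
        return digest

    def list_hashes(self, resource_id, owner, type_id):
        """List the hashes of all objects under (ID, owner, type)."""
        return sorted(self._buckets.get((resource_id, owner, type_id), {}))
```

Identifying objects by their hash means storing the same bytes twice is idempotent, and a client can name exactly which object it wants replaced.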

12
Other proposals
  • Three types
  • singleton
  • numeric array
  • dictionary
  • Issues that need to be resolved
  • operations beyond single-element
  • replication
  • access control for x separate from data?

13
Example applications
  • registrations
  • multiple handsets for same AOR
  • voicemail
  • may generally use a server (announcement), not
    just storage
  • multiple writers?
  • but probably want to hide meta data
  • TURN servers
  • possibly indexed by some location indication

14
Summary
  • Data model independent of DHT and protocol
  • But if more complicated, may need additional
    operations beyond store and fetch
  • or at least additional sub-operations (last)
  • Security issues -- what gets exposed to the
    (untrusted) server?