Hypertable - PowerPoint PPT Presentation

About This Presentation
Title:

Hypertable

Description:

High random insert, update, and delete rate. hypertable.org. Data Model ... Deletes are carried out by inserting 'delete' records. CellStore ... – PowerPoint PPT presentation

Provided by: dou144

Transcript and Presenter's Notes

Title: Hypertable


1
Hypertable
  • Doug Judd
  • Zvents, Inc.

2
Background
3
Web 2.0 Data Explosion
[Chart labels: Web 2.0, Web 1.0]
4
Traditional Tools Don't Scale Well
  • Designed for a single machine
  • Typical scaling solutions
    • ad-hoc
    • manual/static resource allocation

5
The Google Stack
  • Google File System (GFS)
  • Map-reduce
  • Bigtable

6
Architectural Overview
7
What is Hypertable?
  • An open source, high-performance, scalable
    database, modeled after Google's Bigtable
  • Not relational
  • Does not support transactions

8
Hypertable Improvements Over Traditional RDBMS
  • Scalable
  • High random insert, update, and delete rate

9
Data Model
  • Sparse, two-dimensional table with cell versions
  • Cells are identified by a 4-part key
    • Row
    • Column Family
    • Column Qualifier
    • Timestamp

10
Table Visual Representation
11
Table Actual Representation
12
Anatomy of a Key
  • Row key is \0 terminated
  • Column Family is represented with 1 byte
  • Column qualifier is \0 terminated
  • Timestamp is stored big-endian, ones' complement
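
The key layout on this slide can be illustrated with a short sketch. This is not Hypertable's actual serialization code: `encode_key` is a hypothetical helper, and the 8-byte timestamp width is an assumption, since the slide does not state it. A plausible reason for the ones' complement is that inverting the timestamp makes a plain byte-wise comparison sort newer versions of a cell before older ones.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of Hypertable: serializes the four key
// parts in the order given on the slide.
std::string encode_key(const std::string &row, uint8_t column_family,
                       const std::string &qualifier, uint64_t timestamp) {
  // '\0' is the terminator, so it must not appear inside row or qualifier.
  assert(row.find('\0') == std::string::npos);
  assert(qualifier.find('\0') == std::string::npos);

  std::string key;
  key.append(row);
  key.push_back('\0');                             // row key, \0 terminated
  key.push_back(static_cast<char>(column_family)); // column family, 1 byte
  key.append(qualifier);
  key.push_back('\0');                             // qualifier, \0 terminated

  // Ones' complement of the timestamp, most significant byte first.
  // The 64-bit width is an assumption made for this sketch.
  uint64_t inverted = ~timestamp;
  for (int shift = 56; shift >= 0; shift -= 8)
    key.push_back(static_cast<char>((inverted >> shift) & 0xFF));
  return key;
}
```

With this encoding, the key for timestamp 200 compares lower than the key for timestamp 100 when row, column family, and qualifier are equal, so a scan in key order would see the newest version first.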

13
Concurrency
  • Bigtable uses copy-on-write
  • Hypertable uses a form of MVCC (multi-version
    concurrency control)
  • Deletes are carried out by inserting delete
    records
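
The idea of deleting by inserting a delete record can be sketched as follows. This is a simplified illustration, not Hypertable's implementation: `VersionedStoreSketch` and its methods are hypothetical names, and here a delete record hides only the versions older than itself for a single key.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical illustration of versioned cells with delete records.
class VersionedStoreSketch {
 public:
  void set(const std::string &key, uint64_t ts, const std::string &value) {
    cells_[key][ts] = Version{false, value};
  }

  // A delete never removes data; it adds a marker version at time ts.
  void set_delete(const std::string &key, uint64_t ts) {
    cells_[key][ts] = Version{true, std::string()};
  }

  // Reads the newest version with timestamp <= read_ts. Returns false if
  // there is none or if that newest version is a delete record.
  bool get(const std::string &key, uint64_t read_ts, std::string &out) const {
    auto row = cells_.find(key);
    if (row == cells_.end()) return false;
    // Versions are ordered newest first, so lower_bound yields the first
    // version whose timestamp is <= read_ts.
    auto v = row->second.lower_bound(read_ts);
    if (v == row->second.end() || v->second.is_delete) return false;
    out = v->second.value;
    return true;
  }

 private:
  struct Version {
    bool is_delete;
    std::string value;
  };
  std::map<std::string,
           std::map<uint64_t, Version, std::greater<uint64_t>>> cells_;
};
```

Because writers only ever add versions, a reader working at an older timestamp still sees the data as it was at that time, which is the property MVCC relies on.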

14
CellStore
  • Sequence of 65K blocks of compressed key/value
    pairs

15
System Overview
16
Range Server
  • Manages ranges of table data
  • Caches updates in memory (CellCache)
  • Periodically spills (compacts) cached updates to
    disk (CellStore)
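
A minimal sketch of the cache-then-spill pattern described above, under stated assumptions: `RangeSketch` is a hypothetical class, the spill trigger is a simple entry count, and each "CellStore" is an immutable sorted vector held in memory rather than a compressed file in the DFS.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration: updates land in a sorted in-memory map
// (standing in for the CellCache) and are periodically frozen into
// immutable sorted runs (standing in for CellStores).
class RangeSketch {
 public:
  using Cell = std::pair<std::string, std::string>;

  explicit RangeSketch(std::size_t max_cached) : max_cached_(max_cached) {
    assert(max_cached > 0);
  }

  void set(const std::string &key, const std::string &value) {
    cache_[key] = value;
    if (cache_.size() >= max_cached_) compact();
  }

  // Spills the cached updates into a new immutable, sorted run.
  void compact() {
    if (cache_.empty()) return;
    stores_.emplace_back(cache_.begin(), cache_.end());
    cache_.clear();
  }

  // Checks the cache first, then the runs from newest to oldest.
  bool get(const std::string &key, std::string &out) const {
    auto cached = cache_.find(key);
    if (cached != cache_.end()) {
      out = cached->second;
      return true;
    }
    for (auto run = stores_.rbegin(); run != stores_.rend(); ++run) {
      auto it = std::lower_bound(
          run->begin(), run->end(), key,
          [](const Cell &cell, const std::string &k) { return cell.first < k; });
      if (it != run->end() && it->first == key) {
        out = it->second;
        return true;
      }
    }
    return false;
  }

  std::size_t store_count() const { return stores_.size(); }

 private:
  std::size_t max_cached_;
  std::map<std::string, std::string> cache_;
  std::vector<std::vector<Cell>> stores_;
};
```

Reading the newest run first means a later update to a key shadows any older copy that was spilled earlier.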

17
Client API
class Client {
  void create_table(const String &name, const String &schema);
  Table *open_table(const String &name);
  String get_schema(const String &name);
  void get_tables(vector<String> &tables);
  void drop_table(const String &name, bool if_exists);
};
18
Client API (cont.)
class Table {
  TableMutator *create_mutator();
  TableScanner *create_scanner(ScanSpec &scan_spec);
};
class TableMutator {
  void set(KeySpec &key, const void *value, int value_len);
  void set_delete(KeySpec &key);
  void flush();
};
class TableScanner {
  bool next(CellT &cell);
};
19
Language Bindings
  • Currently C++ only
  • Thrift Broker

20
Write Ahead Commit Log
  • Persists all modifications (inserts and deletes)
  • Written into underlying DFS
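
The write-ahead pattern can be sketched as below. This is an illustration only: `LoggedTableSketch` and `LogEntry` are hypothetical names, the log is an in-memory vector standing in for a file in the underlying DFS, and a delete simply erases the key to keep the example short.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical log record: one insert or delete.
struct LogEntry {
  bool is_delete;
  std::string key;
  std::string value;
};

// Hypothetical illustration of write-ahead logging: every modification is
// appended to the log before it is applied to the in-memory state, so the
// state can be rebuilt by replaying the log after a restart.
class LoggedTableSketch {
 public:
  // Replays any existing entries, as a recovering server would.
  explicit LoggedTableSketch(std::vector<LogEntry> &log) : log_(log) {
    for (const LogEntry &entry : log_) apply(entry);
  }

  void set(const std::string &key, const std::string &value) {
    log_.push_back(LogEntry{false, key, value});  // log first
    apply(log_.back());                           // then apply
  }

  void set_delete(const std::string &key) {
    log_.push_back(LogEntry{true, key, std::string()});
    apply(log_.back());
  }

  bool get(const std::string &key, std::string &out) const {
    auto it = cells_.find(key);
    if (it == cells_.end()) return false;
    out = it->second;
    return true;
  }

 private:
  void apply(const LogEntry &entry) {
    if (entry.is_delete) cells_.erase(entry.key);
    else cells_[entry.key] = entry.value;
  }

  std::vector<LogEntry> &log_;
  std::map<std::string, std::string> cells_;
};
```

Constructing a second `LoggedTableSketch` over the same log reproduces the state of the first, which is the recovery behavior the commit log is there to provide.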

21
Range Meta-Operation Log
  • Facilitates Range meta-operations
    • Loads
    • Splits
    • Moves
  • Part of Master and RangeServer
  • Ensures Range state and location consistency

22
Compression
  • Cell Stores store compressed blocks of key/value
    pairs
  • Commit Log stores compressed blocks of updates
  • Supported Compression Schemes
    • zlib (--best and --fast)
    • lzo
    • quicklz
    • bmz
    • none

23
Caching
  • Block Cache
    • Caches CellStore blocks
    • Blocks are cached uncompressed
  • Query Cache
    • Caches query results
    • TBD

24
Bloom Filter
  • Negative Cache
  • Probabilistic data structure
  • Indicates if key is not present
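
A minimal Bloom filter sketch shows why it works as a negative cache. The class name, the sizing parameters, and the double-hashing scheme built on `std::hash` are assumptions made for illustration; the slides do not say how Hypertable's filter is built.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical illustration of a Bloom filter.
class BloomFilterSketch {
 public:
  BloomFilterSketch(std::size_t num_bits, std::size_t num_hashes)
      : bits_(num_bits, false), num_hashes_(num_hashes) {
    assert(num_bits > 0 && num_hashes > 0);
  }

  void insert(const std::string &key) {
    for (std::size_t i = 0; i < num_hashes_; ++i) bits_[index(key, i)] = true;
  }

  // false: the key was definitely never inserted.
  // true:  the key may have been inserted (false positives are possible).
  bool may_contain(const std::string &key) const {
    for (std::size_t i = 0; i < num_hashes_; ++i)
      if (!bits_[index(key, i)]) return false;
    return true;
  }

 private:
  // Derives the i-th bit position from two base hashes (double hashing).
  // Salting the key with '#' for the second hash is an arbitrary choice.
  std::size_t index(const std::string &key, std::size_t i) const {
    std::size_t h1 = std::hash<std::string>{}(key);
    std::size_t h2 = std::hash<std::string>{}(key + '#');
    return (h1 + i * h2) % bits_.size();
  }

  std::vector<bool> bits_;
  std::size_t num_hashes_;
};
```

A lookup would consult the filter first and skip the disk read whenever `may_contain` returns false; a true result still requires checking the actual data.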

25
Scaling (part I)
26
Scaling (part II)
27
Scaling (part III)
28
Access Groups
  • Provides control of physical data layout --
    hybrid row/column oriented
  • Improves performance by minimizing I/O

CREATE TABLE crawldb (
  Title MAX_VERSIONS=3,
  Content MAX_VERSIONS=3,
  PageRank MAX_VERSIONS=10,
  ClickRank MAX_VERSIONS=10,
  ACCESS GROUP default (Title, Content),
  ACCESS GROUP ranking (PageRank, ClickRank)
);

29
Filesystem Broker Architecture
  • Hypertable can run on top of any distributed
    filesystem (e.g. Hadoop, KFS, etc.)

30
Keys To Performance
  • C++
  • Asynchronous communication

31
C++ vs. Java
  • Hypertable is CPU intensive
    • Manages large in-memory key/value map
    • Alternate compression codecs (e.g. BMZ)
  • Hypertable is memory intensive
    • Java uses 2-3 times the amount of memory to
      manage large in-memory map (e.g. TreeMap)
    • Poor processor cache performance

32
Performance Test (AOL Query Logs)
  • 75,274,825 inserted cells
  • 8 node cluster
    • 1 x 1.8 GHz dual-core Opteron
    • 4 GB RAM
    • 3 x 7200 RPM SATA drives
  • Average row key 7 bytes
  • Average value 15 bytes
  • Replication factor 3
  • 4 simultaneous insert clients
  • 500K random inserts/s
  • 680K scanned cells/s

33
Performance Test II
  • Simulated AOL query log data
  • 1TB data
  • 9 node cluster
    • 1 x 2.33 GHz quad-core Intel
    • 16 GB RAM
    • 3 x 7200 RPM SATA drives
  • Average row key 9 bytes
  • Average value 18 bytes
  • Replication factor 3
  • 4 simultaneous insert clients
  • Over 1M random inserts/s (sustained)

34
Weaknesses
  • Range data managed by a single range server
    • Though no data loss, can cause periods of
      unavailability
    • Can be mitigated with client-side cache or
      memcached

35
Project Status
  • Currently in alpha
  • Just released version 0.9.0.7
  • Will release beta version end of August
  • Waiting on Hadoop JIRA 1700

36
License
  • GPL 2.0
  • Why not Apache?

37
Questions?
  • www.hypertable.org