Title: Prophecy: Using History for High-Throughput Fault Tolerance
 1Prophecy: Using History for High-Throughput Fault Tolerance
- Siddhartha Sen 
 - Joint work with Wyatt Lloyd and Mike Freedman 
 - Princeton University
 
  2Non-crash failures happen
Model as Byzantine (malicious) 
 3Mask Byzantine faults
Service
Clients 
 4Mask Byzantine faults
Throughput
Clients
Replicated service 
 5Mask Byzantine faults
Throughput
Linearizability (strong consistency)
Clients
Replicated service 
 6Byzantine fault tolerance (BFT)
- Low throughput 
 - Modifies clients 
 - Long-lived sessions
 
  7Prophecy
- High throughput + good consistency 
 - No free lunch 
 - Read-mostly workloads 
 - Slightly weakened consistency
 
  8Byzantine fault tolerance (BFT)
- Low throughput 
 - Modifies clients 
 - Long-lived sessions
 
D-Prophecy
Prophecy 
 9Traditional BFT reads
application
Agree?
Clients
Replica Group 
 10A cache solution
cache
application
Agree?
Clients
Replica Group 
 11A cache solution
cache
application
- Problems 
 -  Huge cache 
 -  Invalidation 
 
Agree?
Clients
Replica Group 
 12A compact cache
cache
application
Requests Responses
req1 resp1
req2 resp2
req3 resp3
Clients
Replica Group 
 13A compact cache
cache
application
Requests Responses
sketch(req1) sketch(resp1)
sketch(req2) sketch(resp2)
sketch(req3) sketch(resp3)
Requests Responses
Clients
Replica Group 
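
The compact cache above can be sketched as a table that maps a request sketch to a response sketch. This is a minimal illustration, not the talk's implementation: the use of SHA-1 as the sketch function and the names `sketch` and `SketchTable` are assumptions made for the example.

```python
import hashlib


def sketch(data: bytes) -> bytes:
    """Return a short, fixed-size digest of data (SHA-1 is an assumption here)."""
    return hashlib.sha1(data).digest()


class SketchTable:
    """Hypothetical compact cache: stores sketch(request) -> sketch(response)."""

    def __init__(self):
        self._table = {}

    def record(self, request: bytes, response: bytes) -> None:
        # Only the two digests are kept, never the full request or response.
        self._table[sketch(request)] = sketch(response)

    def lookup(self, request: bytes):
        # Returns the stored response sketch, or None if the request is unknown.
        return self._table.get(sketch(request))

    def matches(self, request: bytes, response: bytes) -> bool:
        # True only if a sketch is stored and the response hashes to it.
        expected = self.lookup(request)
        return expected is not None and expected == sketch(response)


table = SketchTable()
table.record(b"GET /index.html", b"<html>hello</html>")
```

Each entry has a fixed size regardless of how large the webpage is, which is what keeps the cache small.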
 14A sketcher
sketcher
application
Clients
Replica Group 
 15A sketcher
sketch
webpage
Clients
Replica Group 
 16Executing a read
sketch
webpage
Agree? 
- Fast, load-balanced reads 
 
Clients
Replica Group 
 17Executing a read
sketch
webpage
Agree? 
Clients
Replica Group 
 18Executing a read
sketch
webpage
key-value store
replicated state machine
Clients
Replica Group 
 19Executing a read
sketch
webpage
Agree? 
Maintain a fresh cache
Clients
Replica Group 
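
One way to picture the read path on these slides: send the read to a single replica, compare the sketch of its response with the cached sketch, and fall back to a fully replicated read when they disagree. The code below is a sketch under stated assumptions. The majority vote in `replicated_read` is only a stand-in for a real BFT agreement protocol, replicas are modeled as plain Python callables, and all function names are hypothetical.

```python
import hashlib
import random
from collections import Counter


def sketch(data: bytes) -> bytes:
    """Short digest of data (SHA-1 is an assumption)."""
    return hashlib.sha1(data).digest()


def replicated_read(replicas, request: bytes) -> bytes:
    """Stand-in for BFT agreement: every replica executes, majority answer wins."""
    responses = [replica(request) for replica in replicas]
    return Counter(responses).most_common(1)[0][0]


def execute_read(table: dict, replicas, request: bytes, rng=random):
    """Return (response, path), where path is 'fast' or 'replicated'."""
    key = sketch(request)
    expected = table.get(key)
    if expected is not None:
        # Fast path: one randomly chosen replica executes the read.
        response = rng.choice(replicas)(request)
        if sketch(response) == expected:
            return response, "fast"
    # Slow path: all replicas execute, and the cache entry is refreshed.
    response = replicated_read(replicas, request)
    table[key] = sketch(response)
    return response, "replicated"


table = {}
replicas_v1 = [lambda req: b"page-v1" for _ in range(4)]
first = execute_read(table, replicas_v1, b"GET /")
second = execute_read(table, replicas_v1, b"GET /")
```

In this toy run the first read takes the replicated path and fills the cache, and the second read is served by a single replica. Choosing the replica at random is what spreads fast reads across the group.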
 20NO!
Did we achieve linearizability? 
 21Executing a read
sketch
webpage
Clients
Replica Group 
 22Executing a read
sketch
webpage
Agree? 
Clients
Replica Group 
 23Executing a read
sketch
webpage
Agree? 
Fast reads may be stale
Clients
Replica Group 
 24Load balancing
sketch
webpage
Agree? 
Pr(k stale) = g^k
Clients
Replica Group 
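
A small worked example of the load-balancing bound, reading the slide's expression as g raised to the power k. It assumes that g is the probability that one randomly chosen replica returns a stale response and that replicas are chosen independently for each read. Both the interpretation of g and the function name are assumptions for illustration.

```python
def prob_all_stale(g: float, k: int) -> float:
    """Probability that k independent fast reads are all stale, given per-read probability g."""
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must be a probability in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return g ** k


# Hypothetical value: one in four replicas returns stale data.
for k in range(1, 5):
    print(k, prob_all_stale(0.25, k))
```

Under these assumptions the chance of repeated stale reads shrinks geometrically with k.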
 25D-Prophecy vs. BFT
- Traditional BFT 
 - Each replica executes read 
 - Linearizability 
 - D-Prophecy 
 - One replica executes read 
 - Delay-once linearizability
 
  26Byzantine fault tolerance (BFT)
- Low throughput 
 - Modifies clients 
 - Long-lived sessions
 
D-Prophecy
Prophecy 
 27Key-exchange overhead
11
3 
 28Internet services
Clients
Replica Group 
 29A proxy solution
Consolidate sketchers
Clients
Replica Group 
 30A proxy solution
Sketcher must be fail-stop
Clients
Trusted
Replica Group 
 31A proxy solution
Sketcher must be fail-stop
-  Trust middlebox already 
 -  Small and simple
 
Clients
Trusted
Replica Group 
 32Executing a read
Prophecy
Fast, load-balanced reads
q
Clients
Trusted
Req Resp
s(q) 
??? ???
Replica Group 
 33Prophecy
Fast reads may be stale
Clients
Trusted
Req Resp
s(q) 
??? ???
Replica Group 
 34Delay-once linearizability 
 35Delay-once linearizability
Read-after-write property
... W, R, W, W, R, R, W, R ... 
 36Delay-once linearizability
Read-after-write property
... W, R, W, W, R, R, W, R ... 
 37Example application
- Upload embarrassing photos 
 - 1. Remove colleagues from ACL 
 - 2. Upload photos 
 - 3. (Refresh) 
 - Weak consistency may reorder 
 - Delay-once preserves order
 
  38Byzantine fault tolerance (BFT)
- Low throughput 
 - Modifies clients 
 - Long-lived sessions
 
D-Prophecy
Prophecy 
 39Implementation
- Modified PBFT 
 - PBFT is stable, complete 
 - Competitive with Zyzzyva et al. 
 - C, Tamer async I/O 
 - Sketcher ~2000 LOC 
 - PBFT library ~1140 LOC 
 - PBFT client ~1000 LOC 
 
  40Evaluation
- Prophecy vs. proxied-PBFT 
 - Proxied systems 
 - D-Prophecy vs. PBFT 
 - Non-proxied systems
 
  41Evaluation
- Prophecy vs. proxied-PBFT 
 - Proxied systems 
 - We will study 
 - Performance on null workloads 
 - Performance with real replicated service 
 - Where system bottlenecks, how to scale 
 
  42Basic setup
(concurrent)
Clients (100)
Replica Group (PBFT) 
 43Fraction of failed fast reads
Alexa top sites: < 15% 
 44Small benefit on null reads 
 45Apache webserver setup
Clients
Replica Group 
 46Large benefit on real workload
3.7x
2.0x 
 47Benefit grows with work
94 μs (Apache)
Null workloads are misleading! 
 48Benefit grows with work 
 49Single sketcher bottlenecks 
 50Scaling out 
 51Scales linearly with replicas 
 52Summary
- Prophecy good for Internet services 
 - Fast, load-balanced reads 
 - D-Prophecy good for traditional services 
 - Prophecy scales linearly while PBFT stays flat 
 - Limitations 
 - Read-mostly workloads (measurement study corroborates) 
 - Delay-once linearizability (useful for many apps)
 
  53Thank You 
 54Additional slides 
 55Transitions
- Prophecy good for read-mostly workloads 
 - Are transitions rare in practice? 
 
  56Measurement study
- Alexa top sites 
 - Access main page every 20 sec for 24 hrs 
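
Fetching a page every 20 seconds for 24 hours gives 24 × 3600 / 20 = 4320 snapshots per site. One plausible way to quantify how often the content changes is the fraction of consecutive snapshot pairs whose digests differ. The sketch below assumes that definition; the function name and the use of SHA-1 are illustrative, not taken from the study.

```python
import hashlib


def transition_ratio(snapshots) -> float:
    """Fraction of consecutive snapshot pairs whose content digest differs."""
    digests = [hashlib.sha1(page).digest() for page in snapshots]
    pairs = list(zip(digests, digests[1:]))
    if not pairs:
        return 0.0  # fewer than two snapshots: no transitions to count
    changes = sum(1 for before, after in pairs if before != after)
    return changes / len(pairs)


# Hypothetical snapshots of one page: the content changes once in four steps.
ratio = transition_ratio([b"a", b"a", b"b", b"b", b"b"])
```

A low ratio means most reads see unchanged content.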
 
  57Mostly static content 
 58Mostly static content
15% 
 59Dynamic content
- Rabin fingerprinting on transitions 
 - 43% differ by a single contiguous change 
 - Sampled 4000 of them, over half due to 
 - Load balancing directives 
 - Random IDs in links, function parameters