Title: Boxwood: Abstractions as the Foundation for Storage Infrastructure
1. Boxwood: Abstractions as the Foundation for Storage Infrastructure
- Lidong Zhou, Microsoft Research Silicon Valley
- Joint work with Chandu Thekkath, John MacCormick,
Nick Murphy, and Marc Najork
2. Distributed Storage Applications are Hard to Build
- Distributed storage: low hardware cost, but high development/deployment cost
- Application logic on low-level storage interface
- Hardware parallelism and concurrency control
- Fault tolerance a necessity
- Incremental expansion and dynamic reconfiguration vs. system consistency
- Our goal: distributed storage applications made easy to design, build, and deploy
3. Target Application and Setting
Enterprise storage applications and back-end
storage for data-intensive Internet services
4. Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion
5. Boxwood Vision
- Incorporate rich virtualized abstractions into low levels of the storage
- An evolution path for distributed storage
Storage Applications
6. Boxwood Vision
- Incorporate rich virtualized abstractions into low levels of the storage
- An evolution path for distributed storage
Storage Applications
Virtual Disk
7. Boxwood Vision
- Incorporate rich virtualized abstractions into low levels of the storage
- An evolution path for distributed storage
Storage Applications
Tree
Table
List
8. Why High-Level Abstractions
- Reduce the complexity of distributed storage applications
- Natural continuum of storage virtualization
- High-level programming language for building distributed storage applications
- Potential built-in performance optimization by exploiting structural information
- Caching
- Prefetching
9. Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion
10. Boxwood Architecture
Storage Application
B-Tree
High-level Storage Abstractions
Chunk Store
Reliable Media
Replicated Logical Device
Magnetic Media
11. Chunk Store
- Persistent storage with malloc-like interface
- Virtualization layer that hides the distributed nature
- Manage address space or free space for higher layers
- Reliable storage through replicated logical device
Allocate
Read
De-allocate
Write
Chunk Store
Replicated Logical Device
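The malloc-like interface named on this slide (Allocate, Read, De-allocate, Write) can be illustrated with a small in-memory sketch. This is not Boxwood's actual API: the class name `ChunkStore`, the method signatures, and the integer handles are all assumptions made for illustration, and the sketch ignores distribution and replication entirely.

```python
import itertools


class ChunkStore:
    """In-memory stand-in for a chunk store (hypothetical API, not Boxwood's)."""

    def __init__(self):
        self._chunks = {}                 # opaque handle -> chunk contents
        self._next = itertools.count(1)   # source of fresh handles

    def allocate(self, size):
        """Reserve a zero-filled chunk and return its opaque handle."""
        handle = next(self._next)
        self._chunks[handle] = bytes(size)
        return handle

    def write(self, handle, data, offset=0):
        """Overwrite part of a chunk; a write may not grow the chunk."""
        chunk = self._chunks[handle]
        if offset < 0 or offset + len(data) > len(chunk):
            raise ValueError("write outside chunk bounds")
        self._chunks[handle] = chunk[:offset] + data + chunk[offset + len(data):]

    def read(self, handle, offset=0, length=None):
        """Return `length` bytes starting at `offset` (to the end by default)."""
        chunk = self._chunks[handle]
        end = len(chunk) if length is None else offset + length
        return chunk[offset:end]

    def deallocate(self, handle):
        """Release the chunk; the handle becomes invalid."""
        del self._chunks[handle]


store = ChunkStore()
h = store.allocate(8)
store.write(h, b"box", offset=2)
data = store.read(h)
store.deallocate(h)
```

Callers only ever see handles, which is what lets a real implementation place and replicate chunks on any machine without changing the code above it.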
12. B-Tree Abstraction
- B-Tree: a proven useful data structure for storage applications
- Distributed/reliable B-Link trees in Boxwood
- B-Link trees: high concurrency with simple locking
- Distributed reliable storage from chunk store
- Caching for performance
- Distributed lock service for consistency
- Logging for recovery
Create
Lookup
Insert
Enumerate
Delete
B-Link Tree
Locking
Logging
Chunk Store
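The "high concurrency with simple locking" point rests on the defining feature of B-Link trees (Lehman and Yao): every node carries a high key and a link to its right sibling, so a reader that arrives during a split can simply move right instead of holding locks on the way down. The following is a minimal single-process sketch of that lookup rule, not Boxwood's code; the `Node` layout and the `lookup` function are assumptions for illustration.

```python
import bisect
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    keys: list
    values: list = field(default_factory=list)     # used by leaves only
    children: list = field(default_factory=list)   # used by internal nodes only
    high_key: Optional[int] = None                 # None means "no upper bound"
    right: Optional["Node"] = None                 # link to the right sibling

    @property
    def is_leaf(self):
        return not self.children


def lookup(root, key):
    """Find `key`, following right links past nodes that were split."""
    node = root
    while True:
        # If the key exceeds this node's high key, a split moved it right.
        while node.high_key is not None and key > node.high_key:
            node = node.right
        if node.is_leaf:
            break
        # Child i covers keys <= keys[i]; the last child covers the rest.
        node = node.children[bisect.bisect_left(node.keys, key)]
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


# A split in progress: leaf_a has handed keys 3 and 4 to leaf_b,
# but the parent has not yet been told about leaf_b.
leaf_b = Node(keys=[3, 4], values=["c", "d"])
leaf_a = Node(keys=[1, 2], values=["a", "b"], high_key=2, right=leaf_b)
root = Node(keys=[], children=[leaf_a])
found = lookup(root, 3)
```

Even though the parent still points only at `leaf_a`, the lookup reaches key 3 through the right link, which is why readers need no lock coupling during descent.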
13. Boxwood Services
- Distributed lock service for coordinating concurrent access to shared data
- Logging and recovery service for atomicity in the face of transient failures
- Consensus service for system consistency
- Clean design of these services is crucial for
scalability and for managing complexity
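To make the first service concrete, here is a rough single-process sketch of the bookkeeping a lock service performs. The slide does not specify lock modes, so the shared/exclusive split is an assumption, as are the `LockService` class and its method names; a real distributed service would also have to deal with client failures, which this sketch does not attempt.

```python
class LockService:
    """Toy lock table with shared and exclusive modes (hypothetical API)."""

    SHARED, EXCLUSIVE = "shared", "exclusive"

    def __init__(self):
        self._holders = {}   # resource name -> (mode, set of client ids)

    def try_acquire(self, client, resource, mode):
        """Grant the lock if compatible with current holders; never blocks."""
        held = self._holders.get(resource)
        if held is None:
            self._holders[resource] = (mode, {client})
            return True
        held_mode, clients = held
        # Only shared locks are compatible with one another.
        if mode == self.SHARED and held_mode == self.SHARED:
            clients.add(client)
            return True
        return False

    def release(self, client, resource):
        """Drop the client's hold; free the entry when nobody holds it."""
        held = self._holders.get(resource)
        if held is None:
            return
        _, clients = held
        clients.discard(client)
        if not clients:
            del self._holders[resource]


svc = LockService()
r1 = svc.try_acquire("a", "tree-7", LockService.SHARED)      # granted
r2 = svc.try_acquire("b", "tree-7", LockService.SHARED)      # granted, readers share
r3 = svc.try_acquire("c", "tree-7", LockService.EXCLUSIVE)   # refused while readers hold
svc.release("a", "tree-7")
svc.release("b", "tree-7")
r4 = svc.try_acquire("c", "tree-7", LockService.EXCLUSIVE)   # granted once readers leave
```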
14. Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion
15. Distributed Storage Applications on Boxwood: A Recipe
- Design applications for local storage
- Map application logic to storage abstractions
- Adapt the design for a distributed storage infrastructure
- Boxwood abstractions are virtualized
- Boxwood offers distributed services that facilitate the adaptation
- Separating algorithmic design from distributed system concerns is attractive.
16. From B-Link Tree Algorithm to Distributed Reliable B-Link Trees
B-Link Tree Algorithm
Local Locks
B-Link trees on a single machine
17. From B-Link Tree Algorithm to Distributed Reliable B-Link Trees
B-Link Tree Algorithm
Global Lock Service
Reliable Logging
Chunk Store
Replicated Logical Device
Distributed and reliable B-Link trees
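The step from slide 16 to slide 17 keeps the algorithm fixed and swaps the components underneath it. One way to picture this, as a sketch under stated assumptions, is to write the algorithm against a small locking interface and supply either local locks or a client of a global lock service. Every name here (`LocalLocks`, `GlobalLockClient`, `held`, `insert`) is hypothetical, and the "global" client only records the requests it would send.

```python
import threading
from contextlib import contextmanager


class LocalLocks:
    """Per-name locks for a single machine."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    @contextmanager
    def held(self, name):
        with self._guard:
            lock = self._locks.setdefault(name, threading.Lock())
        with lock:
            yield


class GlobalLockClient:
    """Stand-in for a distributed lock service client (hypothetical)."""

    def __init__(self):
        self.calls = []   # what would be remote acquire/release requests

    @contextmanager
    def held(self, name):
        self.calls.append(("acquire", name))
        try:
            yield
        finally:
            self.calls.append(("release", name))


def insert(store, locks, node_id, key, value):
    """Algorithm code: identical no matter which lock provider is used."""
    with locks.held(node_id):
        store.setdefault(node_id, {})[key] = value


local_store, dist_store = {}, {}
insert(local_store, LocalLocks(), "leaf-1", 7, "x")
remote = GlobalLockClient()
insert(dist_store, remote, "leaf-1", 7, "x")
```

The `insert` function never learns which provider it was given, which mirrors the slide's point that the B-Link tree algorithm itself stays unchanged.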
18. BoxFS: Multi-Node File Server on Boxwood
- Exported via NFS v2
- Directory/File → B-Tree
- Directory maps names to NFS file handle with embedded B-tree handle
- File maps block number to chunk handle
- File blocks → chunks
- Locking/caching at file system level
- 2500 lines of C code
BoxFS
Services
B-Link Tree
Chunk Store
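The two mappings on this slide (a directory maps names to handles, a file maps block numbers to chunk handles) can be sketched with dictionaries standing in for B-Link trees and for the chunk store. This is only an illustration of the data layout: the helper names, the integer handles, and the 4-byte block size are assumptions, and the sketch omits NFS, locking, and caching.

```python
BLOCK_SIZE = 4   # deliberately tiny so the example is easy to follow (assumption)

chunks = {}      # chunk handle -> bytes  (stand-in for the chunk store)
trees = {}       # tree handle  -> dict   (stand-in for B-Link trees)


def new_tree():
    """Create an empty tree and return its handle."""
    handle = len(trees) + 1
    trees[handle] = {}
    return handle


def new_chunk(data):
    """Store one block of data and return its chunk handle."""
    handle = len(chunks) + 1
    chunks[handle] = data
    return handle


def create_file(dir_tree, name, data):
    """File tree: block number -> chunk handle. Directory tree: name -> file tree."""
    file_tree = new_tree()
    for block_no, start in enumerate(range(0, len(data), BLOCK_SIZE)):
        trees[file_tree][block_no] = new_chunk(data[start:start + BLOCK_SIZE])
    trees[dir_tree][name] = file_tree   # the file handle embeds the tree handle


def read_file(dir_tree, name):
    """Resolve the name, then concatenate the file's blocks in order."""
    file_tree = trees[trees[dir_tree][name]]
    return b"".join(chunks[file_tree[n]] for n in sorted(file_tree))


root = new_tree()
create_file(root, "notes.txt", b"boxwood!!")
contents = read_file(root, "notes.txt")
```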
19. Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion
20. Prototype Deployment and Performance Evaluation
- System setup
- Eight Dell PowerEdge 2650 servers with a single 2.4 GHz Xeon processor, 1 GB of RAM
- Gigabit Ethernet switch
- Adaptec AIC-7899 dual SCSI adapter, and 5 SCSI drives
- Performance evaluation
- Single-machine non-replicated performance (BoxFS vs. NFS)
- B-tree operation scalability
- BoxFS operation scalability
21. BoxFS vs. NFS over NTFS: Connectathon Benchmarks
22. B-Tree Scaling (Private Tree)
23. BoxFS Scaling (Read)
24. B-Tree Scaling (Shared Tree)
25. BoxFS Scaling (Write/MkDirEnt)
26. Roadmap
- Boxwood Vision
- Boxwood Architecture
- Building Applications on Boxwood
- Performance
- Related Work and Conclusion
27. Related Work
- Distributed Storage/Operating Systems
- Virtual/Logical disks
- File systems
- Database systems
- Scalable Distributed Data Structures
- Linear Hash Table (LH) and its variants (Litwin, 1980 to present)
- Scalable distributed hash table (Gribble et al., 2000)
- Highly concurrent B-trees (Lehman and Yao, 1981; Sagiv, 1986)
28. Conclusion and Future Directions
- A storage infrastructure offering virtualized high-level abstractions is promising
- Future Work
- Explore more abstractions and applications; expose flexible interfaces (e.g., through hints)
- Leverage high-level abstractions for better load balancing, prefetching, and caching
- Graceful degradation during massive failures