Boxwood: Abstractions as the Foundation for Storage Infrastructure

1
Boxwood: Abstractions as the Foundation for
Storage Infrastructure
  • Lidong Zhou, Microsoft Research Silicon Valley
  • Joint work with Chandu Thekkath, John MacCormick,
    Nick Murphy, and Marc Najork

2
Distributed Storage Applications are Hard to Build
  • Distributed storage: low hardware cost, but high
    development/deployment cost
  • Application logic on low-level storage interface
  • Hardware parallelism and concurrency control
  • Fault tolerance a necessity
  • Incremental expansion and dynamic reconfiguration
    vs. system consistency
  • Our goal: distributed storage applications made
    easy to design, build, and deploy

3
Target Application and Setting
Enterprise storage applications and back-end
storage for data-intensive Internet services
4
Roadmap
  • Boxwood Vision
  • Boxwood Architecture
  • Building Applications on Boxwood
  • Performance
  • Related Work and Conclusion

5
Boxwood Vision
  • Incorporate rich virtualized abstractions into
    low levels of the storage
  • An evolution path for distributed storage

[Diagram: Storage Applications]

6
Boxwood Vision
  • Incorporate rich virtualized abstractions into
    low levels of the storage
  • An evolution path for distributed storage

[Diagram: Storage Applications on top of a Virtual Disk]

7
Boxwood Vision
  • Incorporate rich virtualized abstractions into
    low levels of the storage
  • An evolution path for distributed storage

[Diagram: Storage Applications on top of Tree, Table, and List abstractions]

8
Why High-Level Abstractions
  • Reduce the complexity of distributed storage
    applications
  • Natural continuum of storage virtualization
  • High-level programming language for building
    distributed storage applications
  • Potential built-in performance optimization by
    exploiting structural information
      • Caching
      • Prefetching

9
Roadmap
  • Boxwood Vision
  • Boxwood Architecture
  • Building Applications on Boxwood
  • Performance
  • Related Work and Conclusion

10
Boxwood Architecture
[Diagram, layer labels in order: Storage Application; B-Tree; High-level
Storage Abstractions; Chunk Store; Reliable Media; Replicated Logical
Device; Magnetic Media]

11
Chunk Store
  • Persistent storage with malloc-like interface
  • Virtualization layer that hides the distributed
    nature
  • Manage address space or free space for higher
    layers
  • Reliable storage through replicated logical device

[Diagram: Chunk Store operations (Allocate, De-allocate, Read, Write),
with the Chunk Store on top of the Replicated Logical Device]
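
The four operations on this slide can be illustrated with a toy, single-process stand-in. This is only a sketch of what a malloc-like persistent interface might look like: the class name, method names, and integer handles are assumptions made for illustration, and nothing here is persistent, replicated, or distributed.

```python
import itertools


class ChunkStore:
    """Toy in-memory stand-in for a chunk store (hypothetical API)."""

    def __init__(self):
        self._chunks = {}                # handle -> bytearray
        self._next = itertools.count(1)  # source of fresh handles

    def allocate(self, size):
        """Reserve a zero-filled chunk and return an opaque handle."""
        handle = next(self._next)
        self._chunks[handle] = bytearray(size)
        return handle

    def write(self, handle, data, offset=0):
        """Overwrite part of a chunk; the chunk never grows."""
        chunk = self._chunks[handle]     # KeyError if the handle is unknown
        if offset < 0 or offset + len(data) > len(chunk):
            raise ValueError("write outside chunk bounds")
        chunk[offset:offset + len(data)] = data

    def read(self, handle, offset=0, length=None):
        """Return a copy of (part of) a chunk as bytes."""
        chunk = self._chunks[handle]
        end = len(chunk) if length is None else offset + length
        return bytes(chunk[offset:end])

    def deallocate(self, handle):
        """Free a chunk; later reads and writes on the handle fail."""
        del self._chunks[handle]
```

A caller keeps only the handle, much as it would keep a pointer returned by malloc; where the bytes actually live is hidden behind the interface.
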
12
B-Tree Abstraction
  • B-Tree: a proven, useful data structure for
    storage applications
  • Distributed/reliable B-Link trees in Boxwood
  • B-Link trees: high concurrency with simple
    locking
  • Distributed reliable storage from chunk store
  • Caching for performance
  • Distributed lock service for consistency
  • Logging for recovery

[Diagram: B-Link Tree operations (Create, Lookup, Insert, Enumerate,
Delete), with the B-Link Tree on top of Locking, Logging, and the
Chunk Store]
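
The "high concurrency with simple locking" point rests on the right-link idea from Lehman and Yao: every node carries a high key and a pointer to its right sibling, so a reader that lands on a node that has just been split can still find its key by moving right. The sketch below shows only that idea on leaf nodes, in one process and without locks; the `Leaf`, `split`, and `lookup` names are illustrative assumptions, not Boxwood's code.

```python
import bisect
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list                      # sorted keys stored in this leaf
    values: list                    # values[i] belongs to keys[i]
    high_key: object = None         # largest key this leaf may hold; None = no bound
    right: Optional["Leaf"] = None  # link to the right sibling


def split(leaf):
    """Move the upper half of a leaf (two or more keys) into a new right sibling."""
    mid = len(leaf.keys) // 2
    sibling = Leaf(leaf.keys[mid:], leaf.values[mid:], leaf.high_key, leaf.right)
    leaf.keys, leaf.values = leaf.keys[:mid], leaf.values[:mid]
    leaf.high_key = leaf.keys[-1]   # the old leaf now covers a smaller range
    leaf.right = sibling
    return sibling


def lookup(leaf, key):
    """Search from a possibly stale leaf pointer, following right links."""
    while leaf.high_key is not None and key > leaf.high_key:
        leaf = leaf.right           # key moved right during an earlier split
    i = bisect.bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.values[i]
    return None
```

For example, after `split(leaf)` a reader that still holds the old `leaf` pointer can call `lookup(leaf, key)` for a key that moved to the new sibling and still find it.
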
13
Boxwood Services
  • Distributed lock service for coordinating
    concurrent access to shared data
  • Logging and recovery service for atomicity in
    face of transient failures
  • Consensus service for system consistency
  • Clean design of these services is crucial for
    scalability and for managing complexity
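
As a rough illustration of what a lock service offers its clients, the sketch below keeps a table of named locks. It assumes shared (reader) and exclusive (writer) modes and a non-blocking `try_acquire`. The class and method names are hypothetical, and everything runs in one process: there is no networking, leasing, or failure handling.

```python
class LockService:
    """Toy lock table with shared/exclusive modes (hypothetical API)."""

    MODES = ("shared", "exclusive")

    def __init__(self):
        self._held = {}  # lock name -> (mode, set of client ids)

    def try_acquire(self, name, client, mode):
        """Grant the lock if compatible with current holders; never blocks."""
        if mode not in self.MODES:
            raise ValueError("unknown mode: %r" % (mode,))
        entry = self._held.get(name)
        if entry is None:
            self._held[name] = (mode, {client})
            return True
        held_mode, clients = entry
        if mode == "shared" and held_mode == "shared":
            clients.add(client)      # any number of readers may share a lock
            return True
        return False                 # a writer conflicts with everyone else

    def release(self, name, client):
        """Drop one client's hold; forget the lock when nobody holds it."""
        mode, clients = self._held[name]
        clients.discard(client)
        if not clients:
            del self._held[name]
```

In this sketch two clients can hold the same name in shared mode at once, while an exclusive request succeeds only after both have released it.
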

14
Roadmap
  • Boxwood Vision
  • Boxwood Architecture
  • Building Applications on Boxwood
  • Performance
  • Related Work and Conclusion

15
Distributed Storage Applications on Boxwood: A
Recipe
  • Design applications for local storage
  • Map application logic to storage abstractions
  • Adapt the design for a distributed storage
    infrastructure
  • Boxwood abstractions are virtualized
  • Boxwood offers facilitating distributed services
  • Separating algorithmic design from distributed
    system concerns is attractive.

16
From B-Link Tree Algorithm to Distributed
Reliable B-Link Trees
[Diagram: B-Link Tree Algorithm combined with Local Locks gives
B-Link trees on a single machine]
17
From B-Link Tree Algorithm to Distributed
Reliable B-Link Trees
[Diagram: B-Link Tree Algorithm combined with Global Lock Service,
Reliable Logging, Chunk Store, and Replicated Logical Device gives
distributed and reliable B-Link trees]
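
One way to picture the move from local locks to a global lock service is to write the algorithm against a small locking interface and swap the implementation underneath. The sketch below assumes such an interface, a single `hold(name)` context manager. `LocalLocks` uses in-process locks, while `RecordingLockService` is a stand-in that only records the calls a remote service would receive. All names are hypothetical.

```python
import threading
from contextlib import contextmanager


class LocalLocks:
    """Per-name locks inside a single process."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table itself

    @contextmanager
    def hold(self, name):
        with self._guard:
            lock = self._locks.setdefault(name, threading.Lock())
        with lock:
            yield


class RecordingLockService:
    """Stand-in for a global lock service: records calls, does no real locking."""

    def __init__(self):
        self.calls = []

    @contextmanager
    def hold(self, name):
        self.calls.append(("acquire", name))
        try:
            yield
        finally:
            self.calls.append(("release", name))


def insert(node, key, locks):
    """Algorithm code: identical whichever lock provider is passed in."""
    with locks.hold(node["id"]):
        if key not in node["keys"]:
            node["keys"].append(key)
            node["keys"].sort()
```

The `insert` function never learns which provider it is using, which is the separation of algorithmic design from distributed-system concerns that the recipe describes.
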
18
BoxFS: Multi-Node File Server on Boxwood
  • Exported via NFS v2
  • Directory/File → B-Tree
  • Directory maps names to NFS file handle with
    embedded B-tree handle
  • File maps block number to chunk handle
  • File blocks → chunks
  • Locking/caching at file system level
  • 2500 lines of C# code

[Diagram: BoxFS on top of Services, B-Link Tree, and Chunk Store]
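
The two mappings on this slide (a directory maps names to file handles; a file maps block numbers to chunk handles) can be mimicked with plain dictionaries standing in for B-trees and for the chunk store. This is a sketch under those assumptions: the 4-byte block size, the function names, and the shared handle counter are all made up for illustration, and NFS file handles are reduced to bare tree handles.

```python
import itertools

BLOCK_SIZE = 4                  # tiny block size, chosen only for readability
_handles = itertools.count(1)   # one counter for tree and chunk handles

chunks = {}                     # chunk handle -> bytes (stand-in for the chunk store)
trees = {}                      # tree handle -> dict (stand-in for a B-tree)


def new_tree():
    """Create an empty 'B-tree' and return its handle."""
    handle = next(_handles)
    trees[handle] = {}
    return handle


root_dir = new_tree()           # the directory tree: file name -> file tree handle


def create_file(name, data):
    """Store data as fixed-size blocks, one chunk per block."""
    file_tree = new_tree()      # the file tree: block number -> chunk handle
    for block_no, start in enumerate(range(0, len(data), BLOCK_SIZE)):
        chunk = next(_handles)
        chunks[chunk] = data[start:start + BLOCK_SIZE]
        trees[file_tree][block_no] = chunk
    trees[root_dir][name] = file_tree


def read_block(name, block_no):
    """Resolve name -> file tree -> chunk handle -> bytes."""
    file_tree = trees[root_dir][name]
    return chunks[trees[file_tree][block_no]]
```

Reading a block therefore takes two lookups (directory, then file tree) followed by one chunk read.
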
19
Roadmap
  • Boxwood Vision
  • Boxwood Architecture
  • Building Applications on Boxwood
  • Performance
  • Related Work and Conclusion

20
Prototype Deployment and Performance Evaluation
  • System setup
      • Eight Dell PowerEdge 2650 servers, each with a single
        2.4 GHz Xeon processor and 1 GB of RAM
      • Gigabit Ethernet switch
      • Adaptec AIC-7899 dual SCSI adapter and 5 SCSI
        drives
  • Performance evaluation
      • Single-machine non-replicated performance (BoxFS
        vs. NFS)
      • B-tree operation scalability
      • BoxFS operation scalability

21
BoxFS vs. NFS over NTFS: Connectathon Benchmarks
22
B-Tree Scaling (Private Tree)
23
BoxFS Scaling (Read)
24
B-Tree Scaling (Shared Tree)
25
BoxFS Scaling (Write/MkDirEnt)
26
Roadmap
  • Boxwood Vision
  • Boxwood Architecture
  • Building Applications on Boxwood
  • Performance
  • Related Work and Conclusion

27
Related Work
  • Distributed Storage/Operating Systems
  • Virtual/Logical disks
  • File systems
  • Database systems
  • Scalable Distributed Data Structures
  • Linear Hash Table (LH) and its variants
    (Litwin, 1980-present)
  • Scalable distributed hash table (Gribble et al.,
    2000)
  • Highly concurrent B-trees
    (Lehman and Yao, 1981; Sagiv, 1986)

28
Conclusion and Future Directions
  • A storage infrastructure offering virtualized
    high-level abstractions is promising
  • Future Work
      • Explore more abstractions and applications;
        expose flexible interfaces (e.g., through hints)
      • Leverage high-level abstractions for better load
        balancing, prefetching, and caching
      • Graceful degradation during massive failures