Archival Storage. Venti: A new approach to archival storage. Sean Quinlan and Sean Dorward

Slides: 22
Provided by: rohitku
Learn more at: http://langevin.usc.edu

1
Archival Storage
Venti: A new approach to archival storage
Sean Quinlan and Sean Dorward
  • CS 599 Special topics in OS and Distributed
    Storage Systems
  • Rohit Kulkarni
  • 4th Feb 2004

2
Outline
  • Archival Storage
  • Venti key ideas
  • Applications
  • Implementation
  • Performance
  • Reliability and Recovery
  • Conclusion
  • Questions?

3
Archival Storage
  • Storing data for long periods of time - forever
  • Tape backup
  • Central server for a number of clients
  • Restoring data is painful
  • Full backup vs. incremental backup
  • Snapshots
  • consistent read-only view of file system at some
    point in past
  • Maintains file system permissions
  • Can be accessed by standard tools: ls, cat, cp,
    grep, diff
  • Avoids the tradeoff between full and incremental
    backup
  • Looks like a full backup
  • Implementation resembles incremental backup:
    blocks are shared

4
Venti
  • Goal: to provide a write-once archival
    repository that can be shared by multiple client
    machines and applications
  • Block level network storage system
  • actually a backend storage for client apps
  • Blocks addressed by hash of their contents
  • uses SHA-1 algorithm
  • SHA-1 output is 160 bit (20 byte) fingerprint of
    data block
  • Write once policy
  • once written cannot be deleted
  • Multiple writes of same data coalesced
  • data sharing saves storage capacity
  • makes write operation idempotent
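
The bullets above can be illustrated with a minimal in-memory sketch. It is not Venti's actual code: the class name `BlockStore` and its methods are hypothetical, and a Python dict stands in for the server's storage. It only shows how addressing blocks by their SHA-1 fingerprint makes duplicate writes coalesce.

```python
import hashlib

class BlockStore:
    """Toy write-once store keyed by SHA-1 fingerprint (hypothetical names)."""

    def __init__(self):
        self._blocks = {}  # fingerprint (20 bytes) -> block contents

    def write(self, data: bytes) -> bytes:
        score = hashlib.sha1(data).digest()  # 160-bit (20-byte) fingerprint
        # Storing the same contents again changes nothing: the write is idempotent.
        self._blocks.setdefault(score, data)
        return score

    def read(self, score: bytes) -> bytes:
        return self._blocks[score]

    def block_count(self) -> int:
        return len(self._blocks)

store = BlockStore()
first = store.write(b"hello venti")
second = store.write(b"hello venti")  # duplicate write is coalesced
```

Both writes return the same fingerprint and the store holds a single block. There is deliberately no delete method, mirroring the write-once policy.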

5
Venti (2)
  • Multiple clients can share a Venti server
  • Hash function gives a universal namespace
  • Inherent integrity checking of data
  • Fingerprint computation on retrieval
  • Caching is simplified
  • Uses magnetic disks as storage technology
  • access time comparable to non-archival data
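
As a rough illustration of the integrity check, a reader can recompute the fingerprint of whatever comes back and compare it with the fingerprint it asked for. The helper names below (`fingerprint`, `verified_read`) are assumptions made for this sketch, and a dict stands in for the server.

```python
import hashlib

def fingerprint(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def verified_read(blocks: dict, score: bytes) -> bytes:
    """Return the block for `score`, rejecting data that does not hash to it."""
    data = blocks[score]
    if fingerprint(data) != score:
        raise ValueError("block does not match its fingerprint")
    return data

blocks = {}
good = b"archival data"
score = fingerprint(good)
blocks[score] = good
ok = verified_read(blocks, score)

# Simulate corruption on the server or in transit.
blocks[score] = b"corrupted data"
try:
    verified_read(blocks, score)
    detected = False
except ValueError:
    detected = True
```

The same property is what simplifies caching: a block stored under a given fingerprint can never legitimately change, so a cached copy never goes stale.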

6
Data Organization
  • Data divided into blocks
  • App needs fingerprint for retrieval
  • Fingerprints packed together -> pointer blocks
  • Above repeated recursively to get a single
    fingerprint -> root of tree
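
The recursive construction can be sketched as follows. The block size and fan-out are deliberately tiny, assumed values chosen so the example builds several levels, and the function names are hypothetical. The sketch returns the tree depth alongside the root fingerprint because, unlike a real system, it keeps no other record of which blocks are pointer blocks.

```python
import hashlib

SCORE_SIZE = 20  # SHA-1 fingerprint length in bytes

def put(store, block):
    score = hashlib.sha1(block).digest()
    store.setdefault(score, block)
    return score

def write_tree(store, data, block_size=64, fanout=3):
    # Leaf level: split the data into blocks and store each one.
    scores = [put(store, data[i:i + block_size])
              for i in range(0, len(data), block_size)] or [put(store, b"")]
    depth = 0
    # Pack fingerprints into pointer blocks until one fingerprint remains.
    while len(scores) > 1:
        scores = [put(store, b"".join(scores[i:i + fanout]))
                  for i in range(0, len(scores), fanout)]
        depth += 1
    return scores[0], depth

def read_tree(store, root, depth):
    scores = [root]
    for _ in range(depth):
        # Expand each pointer block into the fingerprints it contains.
        scores = [blk[i:i + SCORE_SIZE]
                  for blk in (store[s] for s in scores)
                  for i in range(0, len(blk), SCORE_SIZE)]
    return b"".join(store[s] for s in scores)

store = {}
data = bytes(range(256)) * 4  # 1024 bytes of sample data
root, depth = write_tree(store, data)
restored = read_tree(store, root, depth)
```

With these sizes the 1024 bytes become 16 leaf blocks and three levels of pointer blocks, and `read_tree` recovers the original bytes from the root fingerprint alone.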

7
Data organization (2)
  • New or modified data blocks are stored
  • Unchanged sections of tree reused
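
A small experiment makes the reuse concrete. It is a sketch under assumed, unrealistically small parameters (4-byte blocks, two fingerprints per pointer block), and the helper names are hypothetical. It writes two versions of the data that differ only in the last block and counts how many blocks the second write adds.

```python
import hashlib

BLOCK = 4   # tiny block size, chosen only to keep the demo small
FANOUT = 2  # fingerprints per pointer block (assumed)

def put(store, data):
    score = hashlib.sha1(data).digest()
    store.setdefault(score, data)
    return score

def write_tree(store, data):
    scores = [put(store, data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]
    while len(scores) > 1:
        scores = [put(store, b"".join(scores[i:i + FANOUT]))
                  for i in range(0, len(scores), FANOUT)]
    return scores[0]

store = {}
root_v1 = write_tree(store, b"AAAABBBBCCCCDDDD")
blocks_v1 = len(store)                    # 4 data + 2 pointer + 1 root = 7
root_v2 = write_tree(store, b"AAAABBBBCCCCdddd")
new_blocks = len(store) - blocks_v1       # changed leaf + its pointer + new root = 3
```

Only the path from the modified leaf up to the root is rewritten; the subtree covering the first two blocks is shared by both versions, and both roots remain readable.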

8
Data organization (3)
  • More complex data structures
  • Mixing data and fingerprints in a block
  • e.g. structure for storing file system
  • 3 types of blocks
  • Directory: holds file metadata and root fingerprint
  • Pointer
  • Data

9
Venti application 1: Vac
  • Similar to zip and tar: stores a collection of
    files and directories as a single object
  • tree of blocks formed for selected files
  • vac archive file -> 45 bytes long
  • 20 bytes for root fingerprint
  • 25 bytes fixed header string
  • any amount of data compressed down to 45 bytes
  • Unvac to restore file from archive
  • Duplicate copies of file coalesced on server
  • Multiple users vac the same data: only 1 copy is stored
  • vac on changed contents
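
The 45-byte figure is just the sum of the two parts listed above. The sketch below shows that arithmetic only. The header bytes are a placeholder, since the slides do not give the real header string, and placing the header before the fingerprint is likewise an assumption.

```python
import hashlib

# Placeholder only: the slides say the header is a fixed 25-byte string
# but do not give its contents.
HEADER = b"#" * 25

# Stand-in for the root block of the tree built from the archived files.
root_fingerprint = hashlib.sha1(b"contents of the root block").digest()

archive = HEADER + root_fingerprint  # layout order is an assumption
```

The archive stays 45 bytes however large the tree is, because it only names the root. The data itself lives on the Venti server, so the archive is useless without it.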

10
Venti application 2: Physical Level Backup
  • Copy the raw disk blocks to Venti
  • No need to walk file hierarchy
  • Gives higher throughput
  • Duplicate blocks are coalesced
  • User sees full backup of device
  • Storage space advantages of incremental backup retained
  • Random access possible
  • Directly mounting a backup file system image
  • Lazy restore: restore on demand
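
A block-level backup along these lines can be sketched in a few lines. The 512-byte block size, the sample image, and the function names are assumptions made for illustration. A flat list of fingerprints stands in for the pointer-block tree to keep the example short.

```python
import hashlib

BLOCK_SIZE = 512  # assumed device block size

def backup_image(store, image: bytes):
    """Store every raw block; return the ordered list of fingerprints."""
    scores = []
    for off in range(0, len(image), BLOCK_SIZE):
        block = image[off:off + BLOCK_SIZE]
        score = hashlib.sha1(block).digest()
        store.setdefault(score, block)  # duplicate blocks are coalesced
        scores.append(score)
    return scores

def read_block(store, scores, n):
    """Random access: fetch block n on demand without restoring the rest."""
    return store[scores[n]]

store = {}
# Sample image: six empty blocks, one block with data, one more empty block.
image = (b"\x00" * BLOCK_SIZE * 6
         + b"data".ljust(BLOCK_SIZE, b"\x00")
         + b"\x00" * BLOCK_SIZE)
scores = backup_image(store, image)
```

The backup describes all eight blocks of the device, yet only two distinct blocks are stored. `read_block` is the basis of a lazy restore: a mounted image can fetch blocks as they are touched.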

11
Venti application 3: Plan 9 File System
  • Plan 9 FS on top of Venti
  • Primary location for data
  • Small amount of read/write storage
  • Stores daily changes to file system
  • Smaller than active file system
  • Venti stores permanent changes

12
Implementation
  • Append-only log of data blocks
  • RAID array
  • Separate index maps fingerprints to log location
  • Fingerprint location in index is random
  • Striped across multiple disks
  • Write buffering
  • Block cache
  • Hit -> index lookup and data log bypassed
  • Index cache
  • Hit -> index lookup bypassed
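
The read path described above can be modelled with a toy class. Everything here is an assumption for illustration: the class and attribute names are hypothetical, the log and index live in memory rather than on disk, and the caches are unbounded dicts. The counters exist only to show which steps each kind of cache hit skips.

```python
import hashlib

class LogStore:
    def __init__(self):
        self.log = bytearray()   # append-only data log
        self.index = {}          # fingerprint -> (offset, length)
        self.block_cache = {}    # fingerprint -> block contents
        self.index_cache = {}    # fingerprint -> (offset, length)
        self.index_lookups = 0
        self.log_reads = 0

    def write(self, data: bytes) -> bytes:
        score = hashlib.sha1(data).digest()
        if score not in self.index:
            self.index[score] = (len(self.log), len(data))
            self.log.extend(data)  # blocks are only ever appended
        return score

    def read(self, score: bytes) -> bytes:
        if score in self.block_cache:
            return self.block_cache[score]  # index and data log bypassed
        loc = self.index_cache.get(score)
        if loc is None:
            self.index_lookups += 1         # full index lookup
            loc = self.index[score]
            self.index_cache[score] = loc
        self.log_reads += 1
        offset, length = loc
        data = bytes(self.log[offset:offset + length])
        self.block_cache[score] = data
        return data

s = LogStore()
score = s.write(b"block one")
s.read(score)                                  # misses both caches
after_miss = (s.index_lookups, s.log_reads)
s.read(score)                                  # block cache hit
after_block_hit = (s.index_lookups, s.log_reads)
s.block_cache.clear()                          # simulate eviction
s.read(score)                                  # index cache hit, log still read
after_index_hit = (s.index_lookups, s.log_reads)
```

A block cache hit touches neither the index nor the log. An index cache hit saves the index lookup but still reads the data log.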

13
Implementation (2)

14
Implementation (3)
  • No. of possible fingerprints >> no. of blocks on a server
  • Index as disk-resident hash table
  • Hashing function maps fingerprints to index buckets
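
One way to picture the index is a fixed array of buckets, with part of the fingerprint selecting the bucket. The sketch below is an in-memory stand-in for the disk-resident table. The bucket count, the use of the first 8 bytes, and the function names are all assumptions, not details taken from the slides.

```python
import hashlib

NUM_BUCKETS = 1024  # assumed; a real index would size this to the disk

def bucket_for(score: bytes) -> int:
    # SHA-1 output is already well mixed, so a prefix of it serves as the hash.
    return int.from_bytes(score[:8], "big") % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def index_insert(score: bytes, location: int) -> None:
    buckets[bucket_for(score)].append((score, location))

def index_lookup(score: bytes):
    # Only one bucket is scanned; on disk this would be one bucket read.
    for s, loc in buckets[bucket_for(score)]:
        if s == score:
            return loc
    return None

known = hashlib.sha1(b"stored block").digest()
index_insert(known, 4096)
found = index_lookup(known)
missing = index_lookup(hashlib.sha1(b"never stored").digest())
```

Because fingerprints are effectively random, consecutive lookups land in unrelated buckets, which is why the slides describe the fingerprint location in the index as random.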

15
Performance: computing environments
  • 2 Plan 9 file servers, bootes and emelie
  • Spanning 1990 to 2001
  • 522 user accounts, 50-100 active all the time
  • Numerous development projects hosted
  • Several large data sets
  • Astronomical data, satellite imagery, multimedia
    files

16
Performance (2)

17
Performance (3)
  • When stored on Venti, three factors reduce the
    size of the jukebox data
  • Elimination of duplicate blocks
  • Elimination of block fragmentation
  • Compression of block contents

18
Reliability and Recovery
  • Tools for integrity checking and error recovery
  • Verifying structure of arena
  • Checking there is an index entry for every block
    in data log, and vice versa
  • Rebuilding index from data log
  • Copying arena to removable media
  • Data log on RAID 5 disk array
  • Protection against single drive failures
  • Off-site mirrors for server
  • Storing to write-once read-many optical jukeboxes
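
Rebuilding the index is possible because every fingerprint can be recomputed from the block it names. The sketch below assumes a made-up record format (a 4-byte big-endian length followed by the block contents) purely so the log can be scanned. The real on-disk layout is not given in the slides, and the function names are hypothetical.

```python
import hashlib
import struct

def append_block(log: bytearray, data: bytes) -> int:
    """Append one record (assumed format: 4-byte length + data); return its offset."""
    offset = len(log)
    log.extend(struct.pack(">I", len(data)) + data)
    return offset

def rebuild_index(log: bytes) -> dict:
    """Scan the whole log and recompute fingerprint -> record offset."""
    index = {}
    pos = 0
    while pos < len(log):
        (length,) = struct.unpack_from(">I", log, pos)
        data = bytes(log[pos + 4:pos + 4 + length])
        index.setdefault(hashlib.sha1(data).digest(), pos)
        pos += 4 + length
    return index

log = bytearray()
off_one = append_block(log, b"one")
off_two = append_block(log, b"two")
index = rebuild_index(log)  # as if the original index had been lost
```

The data log is therefore the only state that must survive. The index is derived data, which is why the slides put the RAID 5 and off-site protection around the log.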

19
Future Work
  • Load balancing
  • Distribute Venti across multiple machines
  • Replicate server
  • Use of proxy server to hide it from client
  • Better access control
  • Currently just authentication to server
  • Single root fingerprint gives access to entire
    file tree
  • Use of variable sized blocks as in LBFS

20
Conclusion
  • Addressing block by SHA-1 hash of contents
  • Write once model
  • Reduces accidental or malicious data loss
  • Simplifies administration
  • Simplifies caching
  • Allows sharing of data
  • Magnetic disks as storage technology
  • Large capacity at low price
  • Random access
  • Performance comparable to non-archival data

21
Questions?
  • Thank You