1
HPSS
The High Performance Storage System
Developed by IBM, LANL, LLNL, ORNL, SNL, NASA
Langley, NASA Lewis, Cornell, MHPCC, SDSC, UW
with funding from DoE, NASA, and NSF
Presented by Christopher Ho, CSci 599
2
Motivation
  • In the last 10 years, processor speeds have increased
    50-fold
  • Disk transfer rates have increased < 4x
  • RAID now successful, inexpensive
  • Tape speeds have increased < 4x
  • tape striping not widespread
  • Performance gap is widening!
  • Bigger & bigger files (10s, 100s of GB, soon TB)
  • => Launch scalable storage initiative

3
IEEE Mass Storage Reference Model
  • Defines layers of abstraction & transparency
  • device, location independence
  • Separation of policy and mechanism
  • Logical separation of control and data flow
  • Defines common terminology
  • compliance does not imply inter-operability
  • Scalable, Hierarchical Storage Management
  • see http://www.ssswg.org/sssdocs.html

4
Introduction: Hierarchical Storage
  • Storage pyramid, top to bottom (decreasing cost & speed,
    increasing capacity)
  • Memory
  • Disk
  • Optical disk
  • Magnetic Tape

5
HPSS Objectives
  • Scalable
  • transfer rate, file size, name space, geography
  • Modular
  • software subsystems replaceable, network/tape
    technologies updateable, API access
  • Portable
  • multiple vendor platforms, no kernel
    modifications, multiple storage technologies,
    standards-based, leverage commercial products

6
HPSS Objectives (cont)
  • Reliable
  • distributed software and hardware components
  • atomic transactions
  • mirror metadata
  • failed/restarted servers can reconnect
  • storage units can be varied on/offline

7
Access into HPSS
  • FTP
  • protocol already supports 3rd party transfers
  • new partial file transfer (offset & size)
  • Parallel FTP
  • pget, pput, psetblocksize, psetstripewidth
  • NFS version 2
  • most like traditional file system, slower than
    FTP
  • PIOFS
  • parallel distributed FS on IBM SP2 MPP
  • futures: AFS/DCE DFS, DMIG-API

8
HPSS architecture
[Figure: HPSS architecture. Labeled components: processing nodes, I/O node, MPP interconnect, HPSS server, Storage System Mgmt, control network, HiPPI/FC/ATM data network, network attached disk, network attached tape, NFS/FTP/DMIG-API access.]
9
Software infrastructure
  • Encina transaction processing manager
  • two-phase commit, nested transactions
  • guarantees consistency of metadata, server state
  • OSF Distributed Computing Environment
  • RPC calls for control messages
  • Thread library
  • Security (registry & privilege service)
  • Kerberos authentication
  • 64 bit Arithmetic functions
  • file sizes up to 2^64 bytes
  • 32 bit platforms, big/little endian architectures
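
The 64-bit arithmetic bullet can be illustrated with a minimal sketch of adding two 64-bit values on a platform that only has 32-bit words. The function names and the (high, low) pair representation are assumptions made for illustration; they are not taken from the HPSS library.

```python
MASK32 = 0xFFFFFFFF  # one 32-bit machine word


def from_int(value):
    """Split an integer into a (high, low) pair of 32-bit words (hypothetical helper)."""
    return (value >> 32) & MASK32, value & MASK32


def to_int(pair):
    """Rebuild the integer from a (high, low) pair of 32-bit words."""
    high, low = pair
    return (high << 32) | low


def add64(a, b):
    """Add two 64-bit values stored as (high, low) pairs, propagating the carry."""
    a_high, a_low = a
    b_high, b_low = b
    low_sum = a_low + b_low
    carry = low_sum >> 32                      # 1 if the low words overflowed
    high = (a_high + b_high + carry) & MASK32  # result wraps modulo 2**64
    return high, low_sum & MASK32
```

For example, adding 1 to 0xFFFFFFFF overflows the low word, so the carry moves into the high word and the result pair is (1, 0).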

10
Software components
  • Name server
  • map POSIX filenames to internal file, directory
    or link
  • Migration/Purge policy manager
  • when/where to migrate to next level in hierarchy
  • after migrated, when to purge copy on this level
  • purge initiated when usage exceeds
    administrator-configured high-water mark
  • each file evaluated by size, time since last read
  • migration, purge can also be manually initiated
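
The purge rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the HPSS policy manager: the `CachedFile` record, the low-water target, and the ranking (longest idle first, then largest) are all hypothetical; the slides only say that purging starts above a high-water mark and that files are evaluated by size and time since last read.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size: int          # bytes held at this level of the hierarchy
    idle_seconds: int  # time since last read
    migrated: bool     # True once a copy exists at the next level


def select_purge_candidates(files, capacity, high_water, low_water):
    """Return names of files to purge, or [] if usage is at or below the high-water mark.

    high_water and low_water are fractions of capacity; low_water is an
    assumed stopping point that the slides do not mention.
    """
    used = sum(f.size for f in files)
    if used <= high_water * capacity:
        return []
    target = low_water * capacity
    # Only files already migrated to the next level may be purged here.
    candidates = sorted(
        (f for f in files if f.migrated),
        key=lambda f: (f.idle_seconds, f.size),
        reverse=True,
    )
    purged = []
    for f in candidates:
        if used <= target:
            break
        purged.append(f.name)
        used -= f.size
    return purged
```

Files that have not yet been migrated are never selected, which mirrors the ordering in the slide: migrate first, then purge the copy on this level.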

11
Software components (cont)
  • Bitfile server
  • provides abstraction of bitfiles to client
  • provides scatter/gather capability
  • supports access by file offset, length
  • supports random and parallel reads/writes
  • works with file segment abstraction (see Storage
    server)

12
Software components (cont)
  • Storage server
  • map segments onto virtual volumes, virtual
    volumes onto physical volumes
  • virtual volumes allow tape striping
  • Mover
  • transfers data from a source to a sink
  • tape, disk, network, memory
  • device control: seek, load/unload, write tape
    mark, etc.
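
To show how a virtual volume could allow striping, here is a minimal sketch that maps a byte offset on a virtual volume to a physical volume and an offset on it. Round-robin block placement and the names `locate`, `block_size`, and `stripe_width` are assumptions for illustration; the slides do not specify the actual layout used by the storage server.

```python
def locate(offset, block_size, stripe_width):
    """Map a virtual-volume byte offset to (physical volume index, offset on that volume).

    Assumes blocks are dealt round-robin across stripe_width physical volumes.
    """
    block = offset // block_size
    volume = block % stripe_width   # which physical volume holds the block
    row = block // stripe_width     # how many full stripes precede it
    return volume, row * block_size + offset % block_size
```

With a block size of 4 and a stripe width of 3, byte 5 falls in block 1, so it lands on physical volume 1 at offset 1, while byte 13 falls in block 3, which wraps back to volume 0 at offset 5.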

13
Software components (cont)
  • Physical Volume Library
  • map physical volume to cartridge, cartridge to
    PVR
  • Physical Volume Repository
  • control cartridge mount/dismount functions
  • modules for Ampex D2, STK 4480/90 & SD-3, IBM
    3480 & 3590 robotic libraries
  • Repack server
  • deletions leave gaps on sequential media
  • read live data, rewrite on new sequential
    volume, free up previous volume
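
The repack idea can be sketched as a simple compaction pass. The in-memory list standing in for a tape volume and the `repack` function are hypothetical; this only illustrates the read-live-data, rewrite, free-old-volume sequence described above.

```python
def repack(old_volume):
    """Copy live segments, in order, onto a new sequential volume.

    old_volume is a list of (segment_id, data) in tape order, where data is
    None for a deleted segment (a gap). Returns the new volume and a map from
    segment_id to its new position; the old volume can then be freed.
    """
    new_volume = []
    relocation = {}
    for segment_id, data in old_volume:
        if data is None:
            continue  # skip the gap left by a deletion
        relocation[segment_id] = len(new_volume)
        new_volume.append((segment_id, data))
    return new_volume, relocation
```

The relocation map stands in for the metadata update that would be needed so that segments can still be found after they move.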

14
Software components (cont)
  • Storage system management
  • GUI to monitor/control HPSS
  • stop/start software servers
  • monitor events and alarms, manual mounts
  • vary devices on/offline

15
Parallel transfer protocol - goals
  • Provide parallel data exchange between
    heterogeneous systems and devices
  • Support different combinations of parallel and
    sequential source/sink
  • Support gather/scatter and random access
  • combinations of stripe width, both regular and
    irregular data block size
  • Scalable I/O bandwidth
  • Transport independent (TCP/IP, HiPPI, FCS, ATM)
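
A minimal sketch of the gather/scatter goal: a logical window of bytes is scattered across several destinations according to a list of (destination, length) entries with irregular block sizes, and gathered back with the same list. The list format and the function names are assumptions for illustration, not the HPSS protocol's actual data structures.

```python
def scatter(window, plan):
    """Split a logical window across destinations.

    plan is a list of (dest_id, length) entries in logical order; block
    lengths may be irregular. Returns {dest_id: bytes}.
    """
    buffers = {}
    pos = 0
    for dest, length in plan:
        buffers.setdefault(dest, bytearray()).extend(window[pos:pos + length])
        pos += length
    if pos != len(window):
        raise ValueError("plan does not cover the logical window exactly")
    return {dest: bytes(buf) for dest, buf in buffers.items()}


def gather(buffers, plan):
    """Rebuild the logical window from per-destination buffers using the same plan."""
    cursors = {dest: 0 for dest in buffers}
    out = bytearray()
    for dest, length in plan:
        start = cursors[dest]
        out.extend(buffers[dest][start:start + length])
        cursors[dest] = start + length
    return bytes(out)
```

Because each destination only needs its own entries from the list, the per-destination transfers are independent of one another, which is what lets them proceed in parallel.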

16
Gather/scatter lists
[Figure: a logical window and data blocks D1, D2, D3.]
17
Parallel transport architecture
[Figure: a client with control connections to S1 ... Sn and D1 ... Dn, and a parallel data flow between the S and D nodes.]
18
Parallel FTP transfer (pget)
[Figure: numbered steps 1-6 among the Parallel FTP client, Parallel FTPd, Name server, Bitfile server, Storage server, Movers, and Client movers.]
19
Summary
  • High performance
  • up to 1 GB/s aggregate transfer rates
  • Scalable storage
  • parallel architecture
  • terabyte-sized files
  • petabytes in archive
  • Robust
  • transaction processing manager
  • Portable
  • IBM, Sun implementations available

20
Conclusion
  • Feasibility has been demonstrated for large,
    scalable storage
  • Software exists, is shipping, and is actively
    used in the national labs on a daily basis
  • Distributed architecture and parallel
    capabilities mesh well with grid computing