Storage Research in the UCSC Storage Systems Research Center (SSRC)

1
Storage Research in the UCSC Storage Systems
Research Center (SSRC)
  • Scott A. Brandt
  • (scott_at_cs.ucsc.edu)
  • Computer Science Department
  • Storage Systems Research Center
  • Jack Baskin School of Engineering
  • University of California, Santa Cruz

2
SSRC Overview
  • Systems-oriented storage research center
  • Supported by low-level research in materials,
    devices and interconnects
  • Funded by DOE, NSF, and industry sponsors
  • 10 faculty members
  • Others involved as associates
  • External researchers as affiliates
  • Faculty growth in databases and security
  • Significant educational component
  • Undergraduate and graduate storage and related
    systems courses
  • 20-25 graduate and undergraduate students
  • Close cooperation with industry research sponsors
  • HP, IBM, Microsoft, Intel, Agile/OnStor
  • Actively pursuing others

3
Primary SSRC Faculty
  • Darrell Long, Director
  • High-performance storage systems, distributed
    systems
  • Patrick Mantey
  • Sensor networks, data acquisition, and multimedia
    systems
  • Scott Brandt
  • High-performance storage systems, distributed
    systems, soft real-time systems
  • Ethan Miller
  • Scalable security in file systems, next-generation
    file system design

4
Other SSRC Faculty
  • Alexandre Brandwajn (Performance Analysis)
  • Performance modeling and analysis of computer
    systems
  • Katia Obraczka (Networking)
  • Network architectures for large-scale storage
  • Hamid Sadjadpour (Coding Theory)
  • Storage system reliability and availability
  • Raymie Stata (Web Archaeology)
  • Mechanisms to preserve and mine multi-terabyte
    data sets
  • Claire Gu (Optical Storage)
  • Phokion Kolaitis (Logic and Databases)

5
SSRC Storage Research Challenges
  • Huge Capacity and Scalability
  • Internet Archive
  • Genomic databases
  • Performance
  • Ever-increasing gap between CPU and secondary
    storage
  • Security
  • Networked storage introduces many new security
    issues
  • Portability
  • Power management, disconnected operation, new
    devices
  • New storage technologies
  • MEMS, MRAM, Flash, etc.
  • Large-scale information management

6
SSRC Research Thrusts
  • Object-based Storage
  • Scalable high-performance distributed storage
  • Archival Storage
  • Large-scale on-line disk-based storage systems
  • New Storage Technologies
  • Systems-level research in new storage
    technologies
  • Predictive/Adaptive Techniques
  • Machine-learning based techniques for increasing
    performance and reducing I/O latency and traffic
  • Secure Storage
  • Secure file systems techniques and systems

7
1. Object-based Storage
  • Scalable High-Performance Object-based Storage
    from Commodity Components
  • 2 Petabytes
  • 100 GB/sec aggregate throughput
  • Parallel accesses from up to 10,000 clients
  • Possibly to the same file
  • File sizes from bytes to terabytes
  • 1-10,000 files/directory
  • 50 msec access times
  • Mid-performance local access by visualization
    workstations
  • Wide-area access

8
1. Object-based Storage: OBFS Object-based
Storage Manager
  • Design Principles
  • Flat object name space
  • Variable block size with fixed maximum (common)

OBFS outperforms Ext2/3 and meets or exceeds the
performance of XFS with 1/25 the code

9
1. Object-based Storage: Lazy Hybrid Metadata
Management
  • Efficient, flexible, scalable metadata cluster
    management
  • Filename hashing
  • Efficient
  • Avoids hot spots
  • Directory hierarchy
  • Provides standard hierarchical directory
    semantics
  • Lazy policies
  • Efficient metadata operations
  • Dual-entry Access Control List (DACL)
  • Server-side permission caching
  • Update Logging
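
The filename-hashing bullet above can be illustrated with a minimal sketch. This is not the SSRC implementation: the function name `metadata_server_for`, the choice of SHA-1, and the four-server cluster are all assumptions made for illustration. The idea shown is only that hashing the full pathname picks a metadata server directly, so files in one directory spread across the cluster.

```python
import hashlib

def metadata_server_for(path: str, num_servers: int) -> int:
    """Map a full pathname to a metadata server index by hashing it.

    Hypothetical helper: the slides do not specify the hash function.
    """
    digest = hashlib.sha1(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers

# Files in the same (hypothetical) directory land on different servers,
# which is how hashing avoids a per-directory hot spot.
load = [0] * 4
for i in range(1000):
    load[metadata_server_for(f"/home/user/dir/file{i}", 4)] += 1
```

A lookup needs no directory traversal in this sketch, because the server is computed from the name alone; the hierarchy is kept only to provide the usual directory semantics.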

10
1. Object-based Storage: Reliability in
Large-Scale Storage
  • More disks → reliability problem
  • Disk failures
  • Non-recoverable bit errors
  • 1 in 10^13 to 10^15 bits
  • Large disks → long rebuild time
  • Capacity outpaces data transfer rate
  • RAID alone cannot solve the problem
  • Solution for Disk failures
  • Configuration for a redundancy set
  • 2-way or 3-way mirroring
  • RAID 5+1
  • Fast Mirroring Copy
  • Lazy Parity Backup
  • Solution for Bit errors
  • Signature scheme

Mean-Time-To-Data-Loss of a 2 PB storage system
using three configurations and fast recovery
mechanisms. The upper lines are for a system
built from disks with 10^6 hour MTTF, and the
lower lines are for a system built from disks
with 10^5 hour MTTF.
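
The effect of disk MTTF and rebuild time on data loss can be sketched with the standard textbook approximation for mirrored pairs, MTTDL ≈ MTTF² / (2 · MTTR), divided by the number of independent pairs. This is not the model behind the figure: the 500 GB disk size, the 4,000-pair count, and the rebuild times are assumptions chosen only to make the arithmetic concrete.

```python
import math

def mttdl_mirrored_pairs(disk_mttf_h: float, rebuild_h: float, num_pairs: int) -> float:
    """Approximate system MTTDL (hours) for independent 2-way mirrored pairs."""
    pair_mttdl = disk_mttf_h ** 2 / (2.0 * rebuild_h)
    return pair_mttdl / num_pairs

def prob_unrecoverable_read(bits_read: float, bit_error_rate: float) -> float:
    """Probability of at least one non-recoverable error while reading bits_read bits."""
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-bit_error_rate))

# Assumed configuration: 2 PB of data on 500 GB disks gives 4,000 mirrored pairs.
PAIRS = 4000
slow_rebuild = mttdl_mirrored_pairs(1e5, rebuild_h=10.0, num_pairs=PAIRS)
fast_rebuild = mttdl_mirrored_pairs(1e5, rebuild_h=1.0, num_pairs=PAIRS)
better_disks = mttdl_mirrored_pairs(1e6, rebuild_h=10.0, num_pairs=PAIRS)

# Chance of hitting a bit error while re-reading one 500 GB disk (4e12 bits)
# at an assumed error rate of 1 in 10^14 bits.
rebuild_risk = prob_unrecoverable_read(4e12, 1e-14)
```

Under these assumptions, a tenfold faster rebuild raises MTTDL tenfold, while tenfold better disk MTTF raises it a hundredfold, which is why both fast recovery and the bit-error signature scheme matter at this scale.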
11
1. Object-based Storage: Robust Data Distribution
  • OBSDs are added to the system in groups
  • Allocation/Reallocation
  • Objects are placed in the new group with a
    probability equal to the fraction of the total
    number of OBSDs in the new group
  • Lookup
  • If an object isn't found in the newest group, the
    next newest group is checked as if the newest
    group does not exist
  • In-memory → very fast

This figure shows placement of an object into a
system with three groups. In this case, the
object didn't get placed in Group 3 or Group 2,
so it will be placed in Group 1.
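
The allocation and lookup rules above can be sketched as a single function that walks the groups from newest to oldest. This is a minimal sketch under stated assumptions, not the SSRC algorithm as published: the helper names, the SHA-256 based pseudo-random values, and the choice of a disk within a group by modulo are all hypothetical.

```python
import hashlib

def _hash_int(*parts) -> int:
    """Deterministic 64-bit integer derived from the given parts."""
    data = ":".join(str(p) for p in parts).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def _unit(*parts) -> float:
    """Deterministic value in [0, 1); 53 bits keep it strictly below 1.0."""
    return (_hash_int(*parts) >> 11) / 2**53

def locate(object_id, group_sizes):
    """Return (group index, disk index within group) for an object.

    group_sizes[0] is the oldest group and the last entry is the newest.
    """
    total = sum(group_sizes)
    for g in range(len(group_sizes) - 1, -1, -1):
        size = group_sizes[g]
        # Place here with probability size / (disks in this and older groups).
        if _unit(object_id, "group", g) < size / total:
            return g, _hash_int(object_id, "disk", g) % size
        total -= size  # continue as if this group did not exist
    raise RuntimeError("unreachable: the oldest group accepts with probability 1")
```

Because the per-group decision depends only on the object and the group, adding a new group in this sketch moves an object only if it is selected for the new group; every other object stays where it was, and no lookup table is needed.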
12
2. Archival Storage: Deep Store
  • Efficient On-line Deep Store
  • Differentially Compress Data (100:1)
  • Disk-based for on-line performance
  • Search for similar files
  • Compress against existing data
  • Organize similar files using data clusters
  • Scale to billions of files
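
Compressing a file against existing, similar data can be illustrated with a small sketch. The slides do not say which delta algorithm Deep Store uses, so this example swaps in zlib's preset-dictionary feature as a stand-in; the function names and the sample data are assumptions. zlib's window is 32 KB, so this only works as shown for small base files.

```python
import hashlib
import zlib

def delta_compress(new: bytes, base: bytes) -> bytes:
    """Compress `new` using `base` as a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=base)
    return comp.compress(new) + comp.flush()

def delta_decompress(delta: bytes, base: bytes) -> bytes:
    """Recover the original data; requires the same `base`."""
    decomp = zlib.decompressobj(zdict=base)
    return decomp.decompress(delta) + decomp.flush()

# Sample data: a 4 KB pseudo-random "existing file" and a near-copy of it.
base = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(128))
new = base[:2000] + b"EDIT" + base[2000:]

standalone = zlib.compress(new, 9)   # compressed on its own
delta = delta_compress(new, base)    # compressed against the similar file
```

Here the near-copy barely shrinks on its own but becomes very small when encoded against the stored file, which is the reason an archival store first searches for similar files.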

13
3. New Storage Technologies: MEMS-based Storage
  • MEMS-based storage: very dense, non-rotating,
    orthogonal magnetic or physical recording
  • 2D seeks
  • Large number of active read/write tips
  • Device Modeling
  • Power management
  • Aggressive spin-down, sequential request merging,
    subsector accesses
  • 50% lower power consumption with no performance
    penalty
  • Storage Subsystem Architectures
  • MEMS metadata storage and MEMS disk write buffer
  • Performance ≈ MEMS alone
  • Request scheduling
  • Zone-based Shortest Positioning Time First
  • SPTF-like response times and C-SCAN-like
    variability
  • Storage Allocation
  • Zone-based allocation (in progress)
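
Shortest Positioning Time First with 2D seeks can be sketched as a greedy ordering. This is plain SPTF, not the zone-based variant named above, whose details the slides do not give. The cost model is an assumption: the X and Y axes are taken to move at the same time, so positioning time is the larger of the two distances, and settle time is ignored.

```python
def positioning_time(pos, target):
    """Assumed cost model: X and Y seeks overlap, so the longer one dominates."""
    return max(abs(pos[0] - target[0]), abs(pos[1] - target[1]))

def sptf_order(start, requests):
    """Greedily service the pending request that is cheapest to reach next."""
    pending = list(requests)
    pos = start
    order = []
    while pending:
        nxt = min(pending, key=lambda r: positioning_time(pos, r))
        pending.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order

schedule = sptf_order((0, 0), [(5, 5), (1, 0), (2, 2)])
```

Pure SPTF of this kind can starve distant requests, which is the variability problem that a C-SCAN-like component is meant to limit.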

14
3. New Storage Technologies: HeRMES MRAM-based
Storage
  • Magnetic RAM: fast, non-volatile, DRAM-like storage
  • How to use MRAM in a file system?
  • Combined disk/MRAM file system
  • File systems for mobile devices
  • MRAM metadata storage and data caching
  • Online metadata and file compression

15
4. Predictive/Adaptive Techniques
  • Dynamic techniques improve application and system
    performance
  • Much better than static parameter tuning
  • Formal Machine Learning-based approach
  • Profs. Manfred Warmuth and David Helmbold
  • Problems identified/examined
  • File access pattern prediction for prefetching
    and grouping
  • Cache management algorithm selection
  • Disk spin-down timeout selection
  • File lifetime prediction
  • Network congestion control

16
4. Predictive/Adaptive Techniques: Predictive
Prefetching/Data Grouping
  • Recency-based models track file access patterns
  • Successor information maintained in metadata
  • Prefetch related files or groups
  • Reduces storage latency and increases cache
    effectiveness
  • Predictors
  • Finite Multi-Order Context (Kroger)
  • Noah aggregating cache (Amer)
  • Program-based Successor (Yeh)
  • Current research
  • Hoarding for mobility
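
The idea of keeping successor information per file can be shown with the simplest recency-based model, a last-successor predictor. This sketch is not any of the predictors listed above (Finite Multi-Order Context, Noah, or Program-based Successor); the class and method names are assumptions.

```python
class LastSuccessorPredictor:
    """Predict that a file will be followed by whatever followed it last time."""

    def __init__(self):
        self.successor = {}   # file name -> file most recently accessed after it
        self.previous = None  # most recently accessed file

    def access(self, name):
        """Record an access and return the predicted next file, or None."""
        if self.previous is not None:
            self.successor[self.previous] = name
        self.previous = name
        return self.successor.get(name)

predictor = LastSuccessorPredictor()
trace = ["a", "b", "c", "a", "b", "c", "a"]
predictions = [predictor.access(f) for f in trace]
```

A prefetcher would fetch each non-None prediction ahead of the next request; the per-file successor entry is small enough to live alongside the file's other metadata.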

17
4. Predictive/Adaptive Techniques: Adaptive
Caching
  • Best cache management policy changes over time
  • Workloads change
  • Filtering occurs
  • Cache relationships change
  • Solution: dynamically choose best policy
  • Machine Learning: Fixed-Share to Uniform Past
  • Refetching helps
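
Dynamic policy selection with a share update can be sketched as follows. This is a simplified Fixed-Share step that mixes the weights with the uniform distribution; it is not the Uniform Past variant named above. The learning rate `eta`, the share rate `alpha`, and the per-round losses (for example, each policy's miss rate over a window) are assumptions.

```python
import math

def fixed_share_update(weights, losses, eta=1.0, alpha=0.05):
    """One round: exponential loss update, then share a fraction uniformly.

    weights: current weight per cache policy (sums to 1)
    losses:  loss per policy for this round, e.g. miss rate in [0, 1]
    """
    updated = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(updated)
    n = len(updated)
    # The share step keeps every policy above alpha / n, so a policy that
    # was bad in the past can recover quickly when the workload changes.
    return [(1 - alpha) * (u / total) + alpha / n for u in updated]

# Two hypothetical policies: the first is better for 20 rounds, then the second.
weights = [0.5, 0.5]
for _ in range(20):
    weights = fixed_share_update(weights, [0.0, 1.0])
phase_one = list(weights)
for _ in range(20):
    weights = fixed_share_update(weights, [1.0, 0.0])
phase_two = list(weights)
```

The cache would follow the highest-weighted policy each round. Without the share step, the second policy's weight would decay toward zero in the first phase and take far longer to recover in the second.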

18
4. Predictive/Adaptive Techniques: Adaptive
Caching Results
28% fewer cache misses than LRU, 8% fewer than
BestFixed, and a 4-24% reduction in I/O traffic

19
5. Secure Storage: Secure Network-Attached Storage
  • For each file block on disk, keep sufficient
    information to
  • Decode the data (at the client)
  • Validate the sender of the data
  • Ensure data integrity
  • Use encryption to keep data secret on disk and in
    transit
  • Decryption occurs at the client
  • Information to decrypt available only at client!
  • Prevent compromise of data
  • Impossible to protect against denial of service
  • Loss of data may occur → make sure it's noticed!
  • Three similar security schemes
  • Trade off resistance to intrusion for speed
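
The per-block integrity and sender checks can be sketched with a keyed hash stored alongside each block. The slides do not describe the three schemes, so this is only an illustration under assumptions: HMAC-SHA256 with a key shared by authorized clients, hypothetical function names, and no encryption step (Python's standard library has no cipher, so confidentiality is left out).

```python
import hashlib
import hmac

def seal_block(key: bytes, file_id: str, block_no: int, data: bytes) -> bytes:
    """Compute the tag a client stores with a block it writes."""
    fid = file_id.encode("utf-8")
    # Binding the file id and block number detects blocks that were swapped.
    header = len(fid).to_bytes(4, "big") + fid + block_no.to_bytes(8, "big")
    return hmac.new(key, header + data, hashlib.sha256).digest()

def verify_block(key: bytes, file_id: str, block_no: int,
                 data: bytes, tag: bytes) -> bool:
    """Check at the client that the block is intact and was written by a key holder."""
    expected = seal_block(key, file_id, block_no, data)
    return hmac.compare_digest(expected, tag)

key = b"hypothetical-shared-client-key"
tag = seal_block(key, "report.txt", 7, b"block contents")
```

Verification happens only at the client, so a compromised storage server in this sketch can destroy or withhold blocks but cannot alter them without the change being noticed.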

20
5. Secure Storage: Intra-file Security (IFS)
  • IFS: end-to-end file system encryption technology
  • Encrypts independent file extents
  • Flexible encryption region size
  • Files may contain one or more isolated or
    overlapping secure regions
  • Transparent to the user
  • Supports strong encryption

21
Summary
  • The UCSC Storage Systems Research Center is
    becoming a nationally recognized storage systems
    research group
  • Darrell Long recently founded the Conference on
    File and Storage Technologies (FAST), already the
    premier storage systems research conference
  • We are actively recruiting faculty and students
    to participate in SSRC research activities
  • We are actively soliciting corporate sponsorships
    and research relationships