Automated administration for storage system - PowerPoint PPT Presentation

Provided by: Mural8
Learn more at: https://www.cse.psu.edu

Transcript and Presenter's Notes

Title: Automated administration for storage system


1
Automated administration for storage system
  • Presentation by Amitayu Das

2
Introduction
  • Major challenges in storage management
  • System design and configuration (device
    management)
  • Capacity Planning (space management)
  • Performance tuning (performance management)
  • High Availability (availability management)
  • Automation (all of the above, in a self-managing
    manner)

3
Motivation
  • Large disk arrays and networked storage lead to
    huge storage capacities and high bandwidth access
    to facilitate consolidated storage systems.
  • Enterprise-scale storage systems contain hundreds
    of host computers and storage devices and up to
    tens of thousands of disks.
  • Designing, deploying and managing such systems
    at runtime is very costly (often costlier than
    procuring the hardware)
  • We look at these problems in greater detail

4
Storage System life-cycle

5
Storage administration functions
  • Data protection
  • Performance tuning
  • Planning and deployment
  • Monitoring and record-keeping
  • Diagnosis and repair

6
A few notable attempts
  • System-managed storage (IBM)
  • Attribute-managed storage (HP)
  • Replication
  • RAID
  • Online snapshot support
  • Remote replication
  • Online archival
  • Interposed request routing
  • Smart file-system switches

7
Design problem
  • Given a pool of resources and workload, determine
    appropriate choice of devices, configure them and
    assign the workload to the configured storage.
  • The solution is not straightforward because:
  • The system is huge, with thousands of design
    choices, many of which have unforeseen
    consequences.
  • Personnel with detailed knowledge of
    applications' storage behavior are in short
    supply and hence quite expensive.
  • Design process is tedious and complicated to do
    by hand, usually leading to solutions that are
    grossly over-provisioned, substantially
    under-performing or, in the worst case, both.
  • Once a design is in place, implementing it is
    time-consuming, tedious and error-prone.
  • A mistake in any of these steps is difficult to
    identify and can result in a failure to meet the
    performance requirements.

8
Storage System life-cycle: design/configuration

9
System design and assignment problem
(Figure labels: Application (x4), Workload requirements, Workload, Storage, Assignment engine, Storage System, System configuration, Storage device abilities.)

10
Initial system design
  • Problem: convert workloads, business needs and
    device characteristics into an assignment of stores
    and streams to devices
  • One approach: constraint-based multi-dimensional
    bin-packing
  • Sample constraints of device 1:
  • - Sum of store sizes ≤ capacity
  • - Sum of stream utilizations ≤ 1.0
  • Sample objective functions:
  • - Minimize cost
  • - Balance load
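
The two sample constraints above can be illustrated with a minimal sketch. It assumes identical devices and uses a first-fit-decreasing heuristic, which is one common bin-packing heuristic and not necessarily the solver used by the tools discussed in this presentation. The names `Store`, `Device` and `first_fit_decreasing` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    name: str
    size_gb: float      # space the store needs
    utilization: float  # fraction of one device's time its streams keep busy


@dataclass
class Device:
    capacity_gb: float
    stores: list = field(default_factory=list)

    def fits(self, store):
        # Constraint 1: sum of store sizes <= capacity.
        used = sum(s.size_gb for s in self.stores)
        # Constraint 2: sum of stream utilizations <= 1.0.
        busy = sum(s.utilization for s in self.stores)
        return (used + store.size_gb <= self.capacity_gb
                and busy + store.utilization <= 1.0)


def first_fit_decreasing(stores, capacity_gb):
    """Assign stores to as few identical devices as the heuristic finds."""
    def demand(s):
        # Rank each store by its tighter dimension (space or utilization).
        return max(s.size_gb / capacity_gb, s.utilization)

    devices = []
    for store in sorted(stores, key=demand, reverse=True):
        if store.size_gb > capacity_gb or store.utilization > 1.0:
            raise ValueError(f"{store.name} cannot fit on any single device")
        for dev in devices:
            if dev.fits(store):
                dev.stores.append(store)
                break
        else:
            # No existing device can hold the store: add a new one.
            devices.append(Device(capacity_gb, [store]))
    return devices


workload = [Store("A", 60, 0.5), Store("B", 50, 0.3),
            Store("C", 30, 0.6), Store("D", 20, 0.1)]
design = first_fit_decreasing(workload, capacity_gb=100)
```

Here the objective is the "minimize cost" one (fewest devices). A "balance load" objective would instead pick the least-loaded device that fits.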

(Figure labels: Req. size, Capacity, I/O rate. Question posed: how many drives, holding which data?)

11
Initial system design > disk arrays
  • Problem
  • extending the single disk solution to disk arrays
  • The space of array designs is potentially huge
  • LUN sizes and RAID levels, stripe unit sizes,
    disks in LUNs
  • More work needed before the solver can run
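
To give a feel for how quickly the array design space grows, the sketch below enumerates combinations of the parameters listed above. All parameter values are illustrative assumptions, not figures from the presentation. The only validity rules applied are that RAID 5 needs at least three disks and that mirrored (RAID 1/0) LUNs need an even number of disks.

```python
from itertools import product

# Hypothetical option lists for a single LUN.
RAID_LEVELS = ["RAID1/0", "RAID5"]
STRIPE_UNIT_KB = [16, 32, 64, 128]
DISKS_PER_LUN = [2, 4, 8]
LUN_SIZE_GB = [50, 100, 200]


def enumerate_lun_designs():
    """Yield every valid combination of the options above."""
    for raid, stripe, disks, size in product(
            RAID_LEVELS, STRIPE_UNIT_KB, DISKS_PER_LUN, LUN_SIZE_GB):
        if raid == "RAID5" and disks < 3:
            continue  # RAID 5 needs at least three disks
        if raid == "RAID1/0" and disks % 2:
            continue  # mirrored pairs need an even disk count
        yield {"raid": raid, "stripe_kb": stripe,
               "disks": disks, "size_gb": size}


designs = list(enumerate_lun_designs())
print(len(designs), "candidate designs for one LUN")
```

Even with these few options there are dozens of designs per LUN, and an array holds many LUNs, so the combined space grows multiplicatively.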

12
Minerva: control flow. The array designer is
called as a subroutine by the allocator.
Minerva's role in the storage system life cycle.
Inputs and outputs are shown.

13
Minerva running a sample workload
14
Merits/demerits
  • Merits
  • Reasonable automation
  • Demerits
  • Requires accurate models of workloads,
    performance requirements, and devices
  • Addresses only the mechanisms, not the policy

15
Storage System life-cycle: redesign/reconfigure

16
System redesign/reconfiguration
  • new application added
  • new users added
  • system load increases
  • hardware/software upgraded
  • device fails
  • new storage arrives

(Figure: events triggering redesign/reconfiguration. Labels: Running System, Reconfigured System, performance tuning.)

17
Iterative storage management loop
(Figure labels: Analyze workload, Design new system, Implement design, Events triggering reconfiguration.)

18
Hippodrome
  • Two objectives
  • The automated loop must converge on a viable
    design that meets the workload's requirements
    without over- or under-provisioning.
  • It must converge to a stable final system as
    quickly as possible, with as little input as
    possible from its users.
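
The convergence requirement can be illustrated with a toy version of the analyze/design/implement loop. This is a sketch under strong assumptions, not Hippodrome's actual algorithm: the "design" is just a device count, `DEVICE_IOPS` and `TARGET_IOPS` are made-up constants, and an under-provisioned system is assumed to cap the load that the analysis step can observe.

```python
import math

DEVICE_IOPS = 100  # hypothetical peak I/O rate one device can deliver
TARGET_IOPS = 80   # hypothetical load the designer allows per device


def analyze_workload(true_demand, num_devices):
    """Observed load cannot exceed what the current system can deliver."""
    return min(true_demand, num_devices * DEVICE_IOPS)


def design_system(observed_iops):
    """Size the system for the observed load, leaving some headroom."""
    return max(1, math.ceil(observed_iops / TARGET_IOPS))


def management_loop(true_demand, num_devices=1, max_iterations=50):
    """Iterate until the design stops changing; return it with its history."""
    history = [num_devices]
    for _ in range(max_iterations):
        observed = analyze_workload(true_demand, num_devices)
        new_design = design_system(observed)
        if new_design == num_devices:
            return num_devices, history  # stable: the loop has converged
        num_devices = new_design         # "implement design"
        history.append(num_devices)
    raise RuntimeError("loop did not converge")


final, history = management_loop(true_demand=450)
print(final, history)
```

Starting from one device, each pass sees a saturated system and grows the design until the full demand becomes visible, after which the design stops changing. The headroom is what lets the loop discover demand that a saturated system hides.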

19
Components of Hippodrome
  • Analysis component (1)
  • Performance model component (2)
  • Solver components (3)
  • Migration component (4)

(Figure: data flow among the numbered components above. Labels: workload, summary, candidate design, utilization (design), finalized design.)

20
Issues in system design and allocation
  • What optimization algorithms are most effective?
  • What optimization objectives and constraints
    produce reasonable designs?
  • e.g., the cost of reconfiguring the system
  • What's the right part of the storage design space
    to explore?
  • e.g., RAID level vs. stripe unit size vs. cache
    management parameters
  • What are reasonable general guidelines for
    tagging a store's RAID level?
  • What (other) decompositions of the design and
    allocation problem are reasonable?
  • How to generalize system design?
  • for SAN environment
  • for host and applications

21
Issues in reconfiguration
  • How to do system discovery?
  • e.g., existing state, presence of new devices
  • Dealing with inconsistent information
  • In a scalable fashion
  • How to abstractly describe storage devices?
  • For system discovery output
  • For input to tools that perform changes
  • How to automate the physical redesign process?
  • e.g., physical space allocation etc.
  • Events trigger redesign decision
  • How do we decide when to reconfigure?
  • Reconfiguration inputs
  • current system configuration/assignment
  • desired system configuration/assignment
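
Given the two reconfiguration inputs above, a migration plan is essentially the difference between the current and desired assignments. The sketch below assumes each assignment is a simple mapping from store name to device name. The function name and the sample data are hypothetical, and a real planner would also have to order the moves and respect free space on the targets.

```python
def migration_plan(current, desired):
    """Return (store, source, target) for every store that must move.

    A source of None means the store is new and only needs to be created.
    """
    moves = []
    for store, target in sorted(desired.items()):
        source = current.get(store)
        if source != target:
            moves.append((store, source, target))
    return moves


current = {"db": "array1", "logs": "array1", "mail": "array2"}
desired = {"db": "array1", "logs": "array2",
           "mail": "array3", "web": "array2"}

plan = migration_plan(current, desired)
for store, source, target in plan:
    print(f"move {store}: {source} -> {target}")
```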

22
Self-* storage architecture

23
Administration and organization
  • Administrative interface
  • Supervisors
  • Administrative assistants
  • Data access and storage
  • Routers
  • Workers

24
Merits
  • Simpler storage administration
  • Data protection
  • Performance tuning
  • Planning and deployment
  • Monitoring and record-keeping
  • Diagnosis and repair

25
Demerits
  • The proposed solution is too simplistic to handle
    the issues raised.
  • The authors describe a solution from a high-level
    viewpoint, but it is not complete in any sense.
  • The implementation and evaluation are not
    convincing enough.
  • Not all aspects of self-* have been addressed
    as claimed.

26
Storage System life-cycle: virtualization
(Figure labels: (Dynamic) business requirements, Design/redesign, Configure/reconfigure, Monitor, Performance tuning.)

27
Runtime management problem
  • Often, enterprise customers outsource their
    storage needs to data centers.
  • At data centers, different workloads,
    applications and services share the underlying
    storage infrastructure.
  • Sharing (of disk drives, storage caches, network
    links, controllers etc.) can lead to interference
    between users/applications, and possibly to
    violations of performance-based QoS
    guarantees.
  • To prevent that, data centers need to insulate
    users from each other: virtualization.

28
Need for virtualization
  • At data centers, the many enterprise servers
    that support different business processes, such
    as Web servers, file servers and database
    servers, may place very different performance
    requirements on their backend storage server.
  • Sophisticated resource allocation and scheduling
    technology is required to effectively isolate
    these logical storage servers as if they are
    separate physical storage servers.
  • Storage Virtualization refers to the technology
    that allows creation of a set of logical storage
    devices from a single physical storage structure.
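
The definition above, in its simplest capacity-only form, amounts to carving logical devices out of one physical block range and translating addresses. The sketch below is a minimal illustration under that assumption. The class names are hypothetical, and it ignores the bandwidth and latency isolation that the rest of this part of the presentation is about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualDisk:
    base: int        # first physical block backing this virtual disk
    num_blocks: int  # size of the virtual disk in blocks

    def to_physical(self, logical_block):
        """Translate a logical block number to a physical block number."""
        if not 0 <= logical_block < self.num_blocks:
            raise IndexError("logical block out of range")
        return self.base + logical_block


class PhysicalStore:
    """One physical block range shared by several virtual disks."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.next_free = 0

    def create_virtual_disk(self, num_blocks):
        if self.next_free + num_blocks > self.total_blocks:
            raise ValueError("not enough physical capacity")
        vdisk = VirtualDisk(base=self.next_free, num_blocks=num_blocks)
        self.next_free += num_blocks
        return vdisk


store = PhysicalStore(total_blocks=1000)
web = store.create_virtual_disk(400)
db = store.create_virtual_disk(500)
print(web.to_physical(10), db.to_physical(10))
```

Each client addresses its own disk from block 0, while the two disks occupy disjoint physical ranges.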

29
Storage virtualization
(Figure: layered view. Labels: Storage management, Application, Clients, Abstract Interface, Virtual Disks, Storage Virtualization, Operating System, Hardware resources, Disks, Controllers, Physical Disks.)
  • Examples: LVM, xFS, StorageTank
  • Hides physical details from high-level
    applications

30
Dimensions of virtualization
  • Commercial storage virtualization systems are
    rather limited because they can virtualize
    only storage capacity.
  • However, from the standpoint of storage clients
    or enterprise servers, the virtual storage
    devices are desired to be as tangible as physical
    disks.
  • Need to virtualize efficiently any standard
    attribute associated with a physical disk, such
    as capacity, bandwidth, latency, availability
    etc.

31
Hardware Organization
(Figure labels: client (x2), Object interface, File interface, Storage manager, Data/cmds, Control mesg, Gigabit network.)

32
A 2-level CVC Scheduler
(Figure labels: Client, Storage Manager, Storage Server (x3), with numbered steps 1-7.)

33
References
  • Hippodrome: running circles around storage
    administration. Eric Anderson et al., FAST '02,
    pp. 175-188, January 2002.
  • Minerva: an automated resource provisioning tool
    for large-scale storage systems. G. Alvarez et
    al., ACM Transactions on Computer Systems 19(4),
    pp. 483-518, November 2001.
  • Ergastulum: quickly finding near-optimal storage
    system designs. Eric Anderson et al., Technical
    Report, HP Laboratories.
  • Disk Array Models in Minerva. Arif Merchant et
    al., Technical Report, HP Laboratories.
  • Self-* Storage: Brick-based Storage with
    Automated Administration. G. Ganger et al.,
    Technical Report, 2003.

34
References
  • SIGMETRICS '00 Tutorial, HP Laboratories.
  • Optimization algorithms
  • Bin-packing Heuristics [Coffman84]
  • Toyoda Gradient [Toyoda75]
  • Simulated Annealing [Drexl88]
  • Relaxation Approaches [Pattipati90, Trick92]
  • Genetic Algorithms [Chu97]
  • Multidimensional Storage Virtualization. Lan
    Huang et al., SIGMETRICS '04, New York, June
    2004.
  • An Interposed 2-Level I/O Scheduling Framework
    for Performance Virtualization. J. Zhang et al.,
    SIGMETRICS '05.
  • Efficiency-aware disk schedulers
  • - Cello, Prism, YFQ

35
THANK YOU !!!