Redundant Array of Inexpensive Disks - PowerPoint PPT Presentation


1
Redundant Array of Inexpensive Disks
  • Stephen Clarke
  • CS 215
  • Dr. Joel

2
History
  • An acronym first defined by David A. Patterson,
    Garth A. Gibson and Randy Katz at the University
    of California, Berkeley in 1987 to describe a
    Redundant Array of Inexpensive Disks.
  • Technology that allowed computer users to achieve
    high levels of storage reliability from low-cost
    and less reliable PC-class disk-drive components,
    via the technique of arranging the devices into
    arrays for redundancy.

3
Also Known as
  • Redundant Array of Independent Disks
  • "Inexpensive" was later changed to "Independent"
    to allow for more profit.

4
What is RAID?
  • Technology that allows users to achieve high
    levels of storage reliability from low-cost and
    less reliable PC-class disk-drive components
  • Done via the technique of arranging the devices
    into arrays for redundancy.

5
Purposes
  • Redundancy: writing the same data to multiple
    drives (known as mirroring), or writing extra
    data (known as parity data) across the array,
    calculated such that the failure of one (or
    possibly more, depending on the type of RAID)
    disk in the array will not result in loss of
    data.
  • So it is a way of preventing data loss from disk
    failure by keeping redundant copies of, or
    parity for, the data.

6
How RAID works
  • RAID combines two or more physical hard disks
    into a single logical unit by using either
    special hardware or software. Hardware solutions
    are often designed to present themselves to the
    attached system as a single hard drive, so that
    the operating system is unaware of the
    technical workings. For example, you might
    configure a 1TB RAID 5 array using three 500GB
    hard drives; with hardware RAID, the operating
    system would simply be presented with a "single"
    1TB disk. Software solutions are typically
    implemented in the operating system and
    present the RAID array as a single drive to
    applications running on the operating system.

7
Key Design Goals
  • Increased data reliability: a property of some
    disk arrays which provides fault tolerance, so
    that all or part of the data stored in the array
    can be recovered in the case of disk failure.
  • Increased input/output performance.

8
RAID Array
  • When multiple physical disks are set up to use
    RAID technology, they are said to be in a RAID
    array.
  • The array distributes data across multiple disks,
    but it is seen by the user and operating system
    as one single disk.

9
Single and Dual Parity
  • Large-capacity drives lengthen the time needed to
    recover from the failure of a single drive.
  • Single parity RAID levels are vulnerable to data
    loss until the failed drive is rebuilt; the
    larger the drive, the longer the rebuild will
    take.
  • Dual parity gives time to rebuild the array
    without the data being at risk if a (single)
    additional drive fails before the rebuild is
    complete.
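
The link between drive size and rebuild time can be sketched with simple arithmetic. This is only a rough best-case estimate: the function name and the 100 MB/s rebuild rate are assumptions made for illustration, not figures from the presentation, and real rebuilds are usually slower because the array keeps serving requests.

```python
def rebuild_hours(drive_capacity_tb: float, rebuild_rate_mb_per_s: float) -> float:
    """Best-case hours to rewrite a whole replacement drive at a steady rate."""
    if drive_capacity_tb <= 0 or rebuild_rate_mb_per_s <= 0:
        raise ValueError("capacity and rate must be positive")
    seconds = (drive_capacity_tb * 1e12) / (rebuild_rate_mb_per_s * 1e6)
    return seconds / 3600


ASSUMED_RATE_MB_PER_S = 100.0  # hypothetical figure, not a measured value

for capacity_tb in (0.5, 1.0, 4.0):
    hours = rebuild_hours(capacity_tb, ASSUMED_RATE_MB_PER_S)
    print(f"{capacity_tb} TB drive: at least {hours:.1f} hours to rebuild")
```

During that window a single parity array cannot survive another drive failure, while a dual parity array can survive one more.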

10
Common RAID Levels
  • RAID 0 - Striped set without parity
  • RAID 1 - Mirrored set of disks
  • RAID 5 - Striped disks with parity
  • RAID 6 - Striped disks with dual parity
  • RAID 10 - both striping and mirroring

11
Striping
  • The splitting of data across more than one disk.
  • Segments can be assigned to multiple physical
    devices (usually disk drives in the case of RAID
    storage) in a round-robin fashion and thus
    written concurrently.
  • This technique is useful if the processor is
    capable of reading or writing data faster than a
    single disk can supply or accept it. While data
    is being transferred from the first disk, the
    second disk can locate the next segment. Striping
    can be either coarse-grained or fine-grained.
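
The round-robin assignment described above can be sketched as follows. This is a minimal illustration, not the layout of any particular controller; the function names, the segment size, and the in-memory lists standing in for disks are all assumptions.

```python
def locate_segment(segment_index: int, num_disks: int) -> tuple[int, int]:
    """Map a logical segment number to (disk, position on that disk)."""
    if num_disks < 1:
        raise ValueError("need at least one disk")
    if segment_index < 0:
        raise ValueError("segment index must not be negative")
    return segment_index % num_disks, segment_index // num_disks


def stripe(data: bytes, num_disks: int, segment_size: int) -> list[list[bytes]]:
    """Split data into segments and deal them out to the disks in turn."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), segment_size):
        disk, _ = locate_segment(start // segment_size, num_disks)
        disks[disk].append(data[start:start + segment_size])
    return disks


layout = stripe(b"ABCDEFGHIJ", num_disks=3, segment_size=2)
# layout[0] == [b"AB", b"GH"], layout[1] == [b"CD", b"IJ"], layout[2] == [b"EF"]
```

Because consecutive segments land on different disks, a large sequential transfer keeps every disk busy at once.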

12
RAID 0
  • "Striped set without parity" or "Striping"
  • Provides improved performance and additional
    storage but no redundancy or fault tolerance.
  • Any disk failure destroys the array, and a
    failure becomes more likely as disks are added.
    A single disk failure destroys the entire array
    because data written to a RAID 0 array is broken
    into fragments spread across all of its disks.
  • RAID 0 does not implement error checking so any
    error is unrecoverable. More disks in the array
    means higher bandwidth, but greater risk of data
    loss.

  • The number of fragments is dictated by the number
    of disks in the array. The fragments are written
    to their respective disks simultaneously on the
    same sector. This allows smaller sections of the
    entire chunk of data to be read off the drives
    in parallel, increasing bandwidth.
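
The trade-off between bandwidth and risk can be made concrete with a small calculation. The sketch below assumes each disk fails independently with the same probability over some period. That is a simplification, and the 3% figure is a made-up value for illustration.

```python
def raid0_failure_probability(disk_failure_probability: float, num_disks: int) -> float:
    """Chance that at least one disk fails, which is enough to lose a RAID 0 array."""
    if not 0.0 <= disk_failure_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if num_disks < 1:
        raise ValueError("need at least one disk")
    # The array survives only if every disk survives.
    return 1.0 - (1.0 - disk_failure_probability) ** num_disks


ASSUMED_DISK_FAILURE_PROBABILITY = 0.03  # hypothetical value

for disks in (1, 2, 4, 8):
    risk = raid0_failure_probability(ASSUMED_DISK_FAILURE_PROBABILITY, disks)
    print(f"{disks} disk(s): {risk:.1%} chance of losing the array")
```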

13
RAID 1
  • Duplicates data across every disk in the array,
    providing full redundancy.
  • Two (or more) disks each store exactly the same
    data, at the same time, and at all times.
  • Data is not lost as long as one disk survives.
  • Total capacity of the array equals the capacity
    of the smallest disk in the array.
  • At any given instant, the contents of each disk
    in the array are identical to those of every
    other disk in the array.
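
A toy model of mirroring, written only to illustrate the points above: the Mirror class and its methods are hypothetical, and dictionaries stand in for disks.

```python
class Mirror:
    """Toy RAID 1: every write goes to all working disks; reads use any survivor."""

    def __init__(self, disk_sizes_gb):
        if len(disk_sizes_gb) < 2:
            raise ValueError("mirroring needs at least two disks")
        # Usable capacity is limited by the smallest member.
        self.capacity_gb = min(disk_sizes_gb)
        # Each disk maps block number -> data; None marks a failed disk.
        self.disks = [{} for _ in disk_sizes_gb]

    def write(self, block, data):
        for disk in self.disks:
            if disk is not None:
                disk[block] = data

    def fail(self, index):
        self.disks[index] = None

    def read(self, block):
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("all mirrors have failed; data is lost")


mirror = Mirror([500, 750])
mirror.write(0, b"payroll")
mirror.fail(0)              # one disk dies
recovered = mirror.read(0)  # the surviving copy still answers
```

Here the array offers 500 GB even though one member holds 750 GB, and the read succeeds after the first disk fails.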

14
RAID 5
  • Distributed parity requires all drives but one to
    be present to operate.
  • Drive failure requires replacement, but the array
    is not destroyed by a single drive failure. Upon
    drive failure, any subsequent reads can be
    calculated from the distributed parity such that
    the drive failure is masked from the end user.
  • The array will have data loss in the event of a
    second drive failure and is vulnerable until the
    data that was on the failed drive is rebuilt onto
    a replacement drive.
  • A single drive failure in the set will result in
    reduced performance of the entire set until the
    failed drive has been replaced and rebuilt.
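
How a read is "calculated from parity" can be shown with XOR, the usual parity operation. The sketch covers a single stripe with made-up block contents. A real RAID 5 array also rotates the parity block among the drives from stripe to stripe, which is not modeled here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus their parity block.
data_blocks = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second block, then recompute it
# from the surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'five'
```

Every degraded read needs the matching blocks from all surviving drives, which is why performance drops until the failed drive is rebuilt.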

15
RAID 6
  • Less common
  • Provides fault tolerance from two drive failures
  • Array continues to operate with up to two failed
    drives.
  • More practical for larger RAID groups

16
RAID 10
  • RAID 1+0 (or 10) uses both striping and
    mirroring. "0+1" or "01" is sometimes
    distinguished from "1+0" or "10": a striped set
    of mirrored subsets (10) and a mirrored set of
    striped subsets (01) are both valid, but
    distinct, configurations.
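
One practical difference between the two layouts is which double failures they survive. The sketch below uses a simplified model of a four-disk array: a mirror is lost only when all its disks fail, and a stripe set is treated as lost as soon as any one of its disks fails. The function names and disk numbering are illustrative assumptions.

```python
from itertools import combinations


def raid10_survives(failed, mirrors):
    """Striped set of mirrors: fine while every mirror keeps at least one disk."""
    return all(any(disk not in failed for disk in pair) for pair in mirrors)


def raid01_survives(failed, stripe_sets):
    """Mirrored set of stripes: fine while at least one stripe set is fully intact."""
    return any(all(disk not in failed for disk in members) for members in stripe_sets)


groups = [(0, 1), (2, 3)]  # the same four disks, grouped in pairs either way
two_disk_failures = [set(pair) for pair in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(failed, groups) for failed in two_disk_failures)
raid01_ok = sum(raid01_survives(failed, groups) for failed in two_disk_failures)
print(f"RAID 10 survives {raid10_ok} of {len(two_disk_failures)} two-disk failures")
print(f"RAID 01 survives {raid01_ok} of {len(two_disk_failures)} two-disk failures")
```

Under this model RAID 10 survives four of the six possible two-disk failures and RAID 01 survives two.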

17
Benefits of using RAID
  • Higher Data Security
  • Fault Tolerance
  • Improved Availability
  • Increased, Integrated Capacity
  • Improved Performance

18
Higher Data Security
  • Through the use of redundancy, most RAID levels
    provide protection for the data stored on the
    array.
  • This means that the data on the array can
    withstand even the complete failure of one hard
    disk (or sometimes more) without any data loss,
    and without requiring any data to be restored
    from backup.
  • This security feature is a key benefit of RAID
    and probably the aspect that drives the creation
    of more RAID arrays than any other.
  • All RAID levels provide some degree of data
    protection, depending on the exact
    implementation, except RAID level 0.

19
Fault Tolerance
  • RAID implementations that include redundancy
    provide a much more reliable overall storage
    subsystem than can be achieved by a single disk.
    This means there is a lower chance of the storage
    subsystem as a whole failing due to hardware
    failures. (At the same time, though, the added
    hardware used in RAID increases the chance of
    some hardware problem with an individual
    component, even if it doesn't take down the
    storage subsystem.)

20
Improved Availability
  • Availability refers to access to data. Good RAID
    systems improve availability both by providing
    fault tolerance and by providing special features
    that allow for recovery from hardware faults
    without disruption.

21
Integrated Capacity
  • By turning a number of smaller drives into a
    larger array, you add their capacity together
    (though a percentage of total capacity is lost to
    overhead or redundancy in most implementations).
    This facilitates applications that require large
    amounts of contiguous disk space, and also makes
    disk space management simpler.

22
Improved Performance
  • RAID systems improve performance by allowing the
    controller to exploit the capabilities of
    multiple hard disks to get around
    performance-limiting mechanical issues that
    plague individual hard disks. Different RAID
    implementations improve performance in different
    ways and to different degrees, but all improve it
    in some way.

23
Disadvantages
  • Less usable storage capacity. For instance, a
    2-disk RAID 1 array loses half of the total
    capacity that would otherwise have been available
    using both disks independently, and a RAID 5
    array with several disks loses the capacity of
    one disk. Other types of RAID arrays are arranged
    so that they are faster to write to and read from
    than a single disk.
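
The capacity cost of each common level can be worked out with a short function. This is a sketch under stated assumptions: every member contributes only as much as the smallest disk, RAID 10 uses two-way mirrors, and the function name and minimum disk counts are chosen for this example.

```python
MIN_DISKS = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}


def usable_capacity_gb(level, disk_sizes_gb):
    """Usable space for a RAID level, counting each disk at the smallest size."""
    count = len(disk_sizes_gb)
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if count < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 10 and count % 2:
        raise ValueError("RAID 10 with two-way mirrors needs an even disk count")
    data_disks = {
        0: count,         # no redundancy, everything holds data
        1: 1,             # every disk holds the same copy
        5: count - 1,     # one disk's worth of parity
        6: count - 2,     # two disks' worth of parity
        10: count // 2,   # half the disks are mirror copies
    }[level]
    return data_disks * min(disk_sizes_gb)


print(usable_capacity_gb(5, [500, 500, 500]))  # 1000
print(usable_capacity_gb(1, [500, 500]))       # 500
```

The first call reproduces the earlier example of a 1TB RAID 5 array built from three 500GB drives; the second shows the 2-disk RAID 1 array giving up half of its raw capacity.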

24
Disadvantages (Continued)
  • When the bad disk is replaced by a new one, the
    array is rebuilt while the system continues to
    operate normally. Some systems have to be powered
    down when removing or adding a drive; others
    support hot swapping, allowing drives to be
    replaced without powering down. RAID with
    hot-swapping is often used in high availability
    systems, where it is important that the system
    remains running as much of the time as possible.
  • RAID is NOT a good alternative to backing up
    data. Data may become damaged or destroyed
    without harm to the drive(s) on which it is
    stored. For example, part of the data may be
    overwritten by a system malfunction.
  • A file may be damaged or deleted by user error or
    malice and not noticed for days or weeks.
  • The entire array is at risk of physical damage.

25
More History
  • Now used as an umbrella term for computer data
    storage schemes that can divide and replicate
    data among multiple hard disk drives.

26
History
  • Norman Ken Ouchi at IBM was awarded a 1978 U.S.
    patent titled "System for recovering data stored
    in failed memory unit."
  • The claims for this patent describe what would
    later be termed RAID 5 with full stripe writes.
    This 1978 patent also mentions that disk
    mirroring or duplexing (what would later be
    termed RAID 1) and protection with dedicated
    parity (that would later be termed RAID 4) were
    prior art at that time.

27
History (Continued)
  • The term RAID was first defined by David A.
    Patterson, Garth A. Gibson and Randy Katz at the
    University of California, Berkeley in 1987. They
    studied the possibility of using two or more
    drives to appear as a single device to the host
    system and published a paper "A Case for
    Redundant Arrays of Inexpensive Disks (RAID)" in
    June 1988 at the SIGMOD conference.
  • This specification suggested a number of
    prototype RAID levels, or combinations of drives.
    Each had theoretical advantages and
    disadvantages. Over the years, different
    implementations of the RAID concept have
    appeared. Most differ substantially from the
    original idealized RAID levels, but the numbered
    names have remained. This can be confusing, since
    one implementation of RAID 5, for example, can
    differ substantially from another. RAID 3 and
    RAID 4 are often confused and even used
    interchangeably.

28
Further Reading
  • Wikipedia's article on RAID is detailed and lists
    many useful sources and citations
  • http://en.wikipedia.org/wiki/RAID
  • http://en.wikipedia.org/wiki/Standard_RAID_levels