1
The TickerTAIP Parallel RAID Architecture
  • P. Cao, S. B. Lim, S. Venkatraman, J. Wilkes (HP Labs)

2
RAID Architectures
  • Traditional RAID architectures have
  • A central RAID controller interfacing to the host
    and processing all I/O requests
  • Disk drives organized in strings
  • One disk controller per disk string (mostly SCSI)

3
Limitations
  • The capabilities of the RAID controller are crucial
    to the array's performance
  • Can become memory-bound
  • Presents a single point of failure
  • Can become a bottleneck
  • Having a spare controller is an expensive
    proposition

4
Our Solution
  • Have a cooperating set of array controller nodes
  • Major benefits:
  • Fault-tolerance
  • Scalability
  • Smooth incremental growth
  • Flexibility: can mix and match components

5
TickerTAIP
[Diagram: TickerTAIP array showing host interconnects and controller nodes]
6
TickerTAIP (I)
  • A TickerTAIP array consists of:
  • Worker nodes, each connected to one or more local
    disks through a bus
  • Originator nodes interfacing with host computer
    clients
  • A high-performance small-area network
  • Mesh-based switching network (Datamesh)
  • PCI backplanes for small networks

7
TickerTAIP (II)
  • Worker and originator nodes can be combined or
    kept separate
  • Parity calculations are done in a decentralized
    fashion
  • The bottleneck is memory bandwidth, not CPU speed
  • Cheaper than providing faster paths to a dedicated
    parity engine

8
Design Issues (I)
  • Normal-mode reads are trivial to implement
  • Normal-mode writes:
  • Three ways to calculate the new parity
    (see the sketch below)
  • Full stripe: calculate the parity from the new data
    alone
  • Small stripe: requires at least four I/Os (read old
    data and old parity, write new data and new parity)
  • Large stripe: if we rewrite more than half a
    stripe, we compute the parity by reading the
    unmodified data blocks
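
A minimal sketch of the three parity computations, assuming byte-string
blocks and a single modified block in the small-stripe case; the helper
names (xor, full_stripe_parity, small_stripe_parity, large_stripe_parity)
are illustrative, not TickerTAIP code:

def xor(*blocks: bytes) -> bytes:
    # XOR together any number of equal-sized blocks.
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def full_stripe_parity(new_blocks):
    # Full stripe: every data block is rewritten, so the parity is
    # computed from the new data alone.
    return xor(*new_blocks)

def small_stripe_parity(old_data, new_data, old_parity):
    # Small stripe (read-modify-write): read old data and old parity,
    # fold in the change, then write new data and new parity.
    return xor(old_parity, old_data, new_data)

def large_stripe_parity(new_blocks, unmodified_blocks):
    # Large stripe (reconstruct-write): more than half the stripe is
    # rewritten, so read the unmodified blocks and recompute the parity.
    return xor(*new_blocks, *unmodified_blocks)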

9
Design Issues (II)
  • Parity can be calculated:
  • At-originator: at the originator node
  • Solely-parity: at the parity node for the stripe
  • Must ship all involved blocks to the parity node
  • At-parity: same as solely-parity, but partial
    results for small-stripe writes are computed at the
    worker nodes and shipped to the parity node
    (sketch below)
  • Generates less traffic than solely-parity
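
A hypothetical sketch of the at-parity scheme for a small-stripe write,
reusing the xor helper from the sketch above: each worker XORs its old and
new block locally and ships only that partial result, and the parity node
folds the partials into the old parity. The function names are mine, not
TickerTAIP's.

def worker_partial(old_block: bytes, new_block: bytes) -> bytes:
    # Computed locally at the worker that owns the data disk; only this
    # one block-sized partial result is shipped over the network.
    return xor(old_block, new_block)

def parity_node_update(old_parity: bytes, partials: list) -> bytes:
    # The parity node folds every shipped partial into the old parity.
    return xor(old_parity, *partials)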

10
Handling single failures (I)
  • TickerTAIP must provide request atomicity
  • Disk failures are treated as in standard RAID
  • Worker failures
  • Treated like disk failures
  • Detected by time-outs (assuming fail-silent
    nodes)
  • A distributed consensus algorithm is run among the
    remaining nodes

11
Handling single failures (II)
  • Originator failures
  • Worst case is the failure of an originator/worker
    node during a write
  • TickerTAIP uses a two-phase commit protocol
  • Two options
  • Late commit
  • Early commit

12
Late commit/Early commit
  • Late commit: commits only after the parity has
    been computed
  • After the commit, only the disk writes remain to
    be performed
  • Early commit: commits as soon as the new data and
    the old data have been replicated
  • Somewhat faster
  • Harder to implement (see the sketch below)
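
A hypothetical sketch of where the two commit points fall in a simplified
write pipeline; the phase names are mine, and the real protocol is a
distributed two-phase commit across the participating nodes:

from enum import Enum, auto

class Phase(Enum):
    REPLICATE_DATA = auto()   # old and new data copied to other nodes
    COMPUTE_PARITY = auto()
    WRITE_TO_DISK = auto()

def commit_point(early: bool) -> Phase:
    # Early commit: commit once old and new data are replicated; the
    # parity can still be recomputed during recovery (somewhat faster,
    # but harder to implement).
    # Late commit: commit only after the parity has been computed, so
    # only the disk writes remain to be performed afterwards.
    return Phase.REPLICATE_DATA if early else Phase.COMPUTE_PARITY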

13
Handling multiple failures
  • Power failures during writes can corrupt the
    stripe being written
  • A UPS is used to eliminate them
  • Must guarantee that some specific requests will
    always be executed in a given order
  • Cannot update the i-nodes containing block
    addresses before writing the data blocks
  • Uses request sequencing to achieve partial write
    ordering

14
Request sequencing (I)
  • Each request:
  • Is given a unique identifier
  • Can specify one or more requests on whose
    previous completion it depends (explicit
    dependencies)
  • TickerTAIP adds enough implicit dependencies to
    prevent concurrent execution of overlapping
    requests (see the sketch below)
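
A hypothetical sketch of the sequencing idea (the data structures and names
are mine, not TickerTAIP's): each request carries a unique id and its
explicit dependencies, and the centralized sequencer adds implicit
dependencies between requests whose block ranges overlap.

from dataclasses import dataclass, field

@dataclass
class Request:
    req_id: int
    first_block: int
    num_blocks: int
    explicit_deps: set = field(default_factory=set)   # ids named by the client
    implicit_deps: set = field(default_factory=set)   # ids added by the sequencer

    def overlaps(self, other: "Request") -> bool:
        return (self.first_block < other.first_block + other.num_blocks and
                other.first_block < self.first_block + self.num_blocks)

class Sequencer:
    # Centralized sequencer: a request may start only after every request
    # it depends on (explicitly or implicitly) has completed.
    def __init__(self):
        self.in_flight = []

    def submit(self, req: Request) -> Request:
        # Add an implicit dependency on every in-flight request whose
        # block range overlaps, so overlapping requests never run
        # concurrently.
        for other in self.in_flight:
            if req.overlaps(other):
                req.implicit_deps.add(other.req_id)
        self.in_flight.append(req)
        return req

    def complete(self, req_id: int) -> None:
        self.in_flight = [r for r in self.in_flight if r.req_id != req_id]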

15
Request sequencing (II)
  • Sequencing is performed by a centralized
    sequencer
  • Several distributed solutions were considered but
    not selected because of the complexity of the
    recovery protocols they would require

16
Disk Scheduling
Not discussed in class in Fall 2005
  • Policies considered (FCFS vs. SSTF sketched below):
  • First come first served (FCFS): implemented in
    the working prototype
  • Shortest seek time first (SSTF)
  • Shortest access time first (SATF): considers both
    seek time and rotation time
  • Batched nearest neighbor (BNN): runs SATF on all
    requests in the queue
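
A minimal sketch contrasting FCFS with SSTF over a queue of track numbers;
SATF and BNN would additionally model rotational position, which is omitted
here (the code is illustrative, not the prototype's):

def fcfs(queue):
    # First come first served: service requests in arrival order.
    return list(queue)

def sstf(queue, head):
    # Shortest seek time first: repeatedly pick the pending request whose
    # track is closest to the current head position.
    pending = list(queue)
    order = []
    while pending:
        nxt = min(pending, key=lambda track: abs(track - head))
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order

# Example with the head at track 50:
#   fcfs([10, 95, 48, 60])          -> [10, 95, 48, 60]
#   sstf([10, 95, 48, 60], head=50) -> [48, 60, 95, 10]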

17
Evaluation (I)
  • Based upon
  • Working prototype
  • Used seven relatively slow Parsytec cards, each
    with its own disk drive
  • An event-driven simulator was used to test other
    configurations
  • Results were always within 6% of prototype
    measurements

18
Evaluation (II)
  • Read performance
  • 1MB/s links are enough unless the request sizes
    exceed 1MB

19
Evaluation (III)
  • Write performance
  • The large-stripe policy always results in a slight
    improvement
  • At-parity is significantly better than at-originator,
    especially for link speeds below 10 MB/s
  • The late commit protocol reduces throughput by at
    most 2% but can increase response time by up to 20%
  • Early commit protocol is not much better

20
Evaluation (IV)
  • TickerTAIP always outperforms a comparable
    centralized RAID architecture
  • Best disk scheduling policy is Batched Nearest
    Neighbor (BNN)

21
Conclusion
  • Can use physical redundancy to eliminate single
    points of failure
  • Can use eleven 5 MIPS processors instead of a
    single 50 MIPS processor
  • Can use off-the-shelf processors for parity
    computations
  • Disk drives remain the bottleneck for small
    request sizes