Scalability Terminology Farms, Clones,
Partitions, and Packs RACS and RAPS
  • Bill Devlin, Jim Gray, Bill Laing, George Spix
  • Microsoft Research
  • Dec. 1999
  • Presented by Hongwei Zhang, CIS Dept. OSU
  • May, 2001

  • Introduction
  • Basic scalability terminology / techniques
  • Software requirements for these scalable systems
  • Cost/performance metrics to consider
  • Summary

Why the Need to Scale?
  • Server systems must be able to start small
  • Small-size company (garage-scale) vs.
    international company (kingdom-scale)
  • System should be able to grow as demand grows
  • e.g.
  • eCommerce has made system growth more rapid and
    dynamic
  • ASPs (Application Service Providers) also need
    dynamic growth

How to Scale?
  • Two types
  • Scale up
  • Replacing existing servers with larger servers
  • Scale Out
  • Adding extra servers
  • Interesting slogans
  • Buy computing by the slice
  • Build systems from CyberBricks
  • The slice or CyberBrick is the fundamental
    building block for a scalable system

Part 1 Basic Terminology
  • Farms/Geoplex
  • Clones (RACS)
  • Partitions (RAPS)
  • Packs

Farm
  • DFN: the collection of servers, applications, and
    data at a particular site
  • Features
  • functionally specialized services (email, WWW,
    directory, database, etc.)
  • Administered as a unit
  • Common staff/management policies/facilities

  • Disaster-tolerance of a farm
  • A farm's hardware, applications, and data are
    duplicated at one or more geographically remote
    sites
  • This collection of farms is called a geoplex
  • Active-active: all farms carry some of the load
  • Active-passive: one or more farms are hot-standbys
    (waiting for fail-over of a corresponding active
    farm)

Ways to scale a farm
  • Cloning
  • Partitioning

Clone
  • DFN
  • A service is cloned on many replica nodes, each
    having the same software and data.
  • Feature
  • Requests are routed to individual members of the
    cloned set (load balancing)
  • e.g.
  • when a single-node service becomes overloaded, we
    could duplicate the node's hardware, software, and
    data on a second node

Clone (contd.)
  • Two ways of load balancing
  • External to the clones
  • IP sprayer, like Cisco's LocalDirector
  • LocalDirector dispatches (sprays) requests to
    different nodes in the clone set to achieve load
    balancing
  • Internal to the clones
  • IP sieve, like Network Load Balancing in Windows
    2000
  • Every request arrives at every node in the clone
    set, but each node intelligently accepts only a
    part of these requests
  • Distributed coordination among nodes
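The IP-sieve idea can be sketched in a few lines: every clone sees every request, and a deterministic hash decides which single node accepts it, so no external dispatcher is needed. This is a minimal illustration, not the actual Windows 2000 Network Load Balancing algorithm; the class and key names are invented for the example.

```python
import hashlib

def owner(request_key: str, num_nodes: int) -> int:
    """Deterministically map a request to exactly one node in the clone set."""
    digest = hashlib.md5(request_key.encode()).hexdigest()
    return int(digest, 16) % num_nodes

class SieveNode:
    """One clone: it is offered every request but accepts only its share."""
    def __init__(self, node_id: int, num_nodes: int):
        self.node_id = node_id
        self.num_nodes = num_nodes
        self.accepted = []

    def offer(self, request_key: str) -> bool:
        if owner(request_key, self.num_nodes) == self.node_id:
            self.accepted.append(request_key)
            return True
        return False

# Broadcast each request to all clones; exactly one accepts it,
# because every node computes the same hash over the same key.
nodes = [SieveNode(i, 3) for i in range(3)]
for req in [f"client-{n}" for n in range(30)]:
    accepted_by = [n.node_id for n in nodes if n.offer(req)]
    assert len(accepted_by) == 1
```

Because all nodes agree on the hash, the "distributed coordination" reduces to sharing the membership list (here, the fixed count of 3 nodes).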

RACS
  • DFN
  • The collection of clones for a particular service
    is called a RACS (Reliable Array of Cloned
    Services)
  • Two types of RACS (Fig. 2)
  • Shared-nothing RACS
  • Each node duplicates all the storage locally
  • Shared-disk RACS (also called a cluster)
  • All the nodes (clones) share a common storage
    manager. Stateless servers at different nodes
    access a common backend storage server

RACS (contd.)
  • Advantages of cloning and RACS
  • Offer both scalability and availability
  • Scalability: an excellent way to add processing
    power, network bandwidth, and storage bandwidth
    to a farm
  • Availability: nodes can act as backups for one
    another; if one node fails, the other nodes
    continue to offer service (probably with degraded
    performance)
  • Failures can be masked if node- and
    application-failure detection mechanisms are
    integrated with the load-balancing system or with
    client applications
  • Easy to manage
  • Administrative operations on one service instance
    at one node can be replicated to all others.

RACS (contd.)
  • Challenges
  • Shared-nothing RACS
  • not a good way to grow storage capacity: updates
    at one node must be applied to all other nodes
  • problematic for write-intensive services: all
    clones must perform all writes (no throughput
    improvement) and need subtle coordination
  • Shared-disk RACS could ameliorate (to some
    extent) this cost and complexity of cloned
    services
  • Shared-disk RACS
  • Storage server should be fault-tolerant for
    availability (only one copy of data)
  • Still requires subtle algorithms to manage updates
    (such as cache validation, lock managers,
    transaction logs, etc.)
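The write-amplification problem of shared-nothing RACS is easy to see in a toy model (names are illustrative): one logical write costs N physical writes because every clone holds a full copy, while reads can be spread across the clones.

```python
class Clone:
    """One replica in a shared-nothing RACS: a full local copy of the data."""
    def __init__(self):
        self.data = {}
        self.ops = 0  # count of physical operations performed by this node

    def write(self, key, value):
        self.data[key] = value
        self.ops += 1

    def read(self, key):
        self.ops += 1
        return self.data[key]

clones = [Clone() for _ in range(4)]

def write_all(key, value):
    # Every clone must apply every write, so adding clones
    # gives no write-throughput improvement.
    for c in clones:
        c.write(key, value)

def read_any(key, hint):
    # Reads, by contrast, can be load-balanced across clones.
    return clones[hint % len(clones)].read(key)

write_all("mailbox-1", "hello")
total_write_ops = sum(c.ops for c in clones)  # 4 physical writes for 1 logical write
```

The shared-disk variant avoids duplicating the write at each clone, at the cost of the cache-validation and locking machinery noted above.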

Partition
  • DFN
  • To grow a service by duplicating the hardware and
    software but dividing the data among the nodes
    (Fig. 3).
  • Features
  • Only the software is cloned; the data is divided
    among the nodes (unlike a shared-nothing clone)
  • Transparent to applications
  • Simple partitioning has only one copy of the data,
    and thus does not improve availability
  • Geoplex to guard against loss of storage
  • More common: locally duplex (RAID 1) or parity
    protect (RAID 5) the storage

Partition (contd.)
  • Example
  • Typically, the application middleware partitions
    the data and workload by object
  • Mail servers partition by mailboxes
  • Sales systems partition by customer accounts or
    product lines
  • Challenge
  • When a partition (node) is added, the data should
    be automatically repartitioned among the nodes to
    balance the storage and computational load.
  • The partitioning should automatically adapt as
    new data is added and as the load changes.
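The mail-server example, and the repartitioning challenge, can be sketched as follows. This is a deliberately naive scheme (mod-N hashing over invented mailbox names) chosen to show why automatic rebalancing is hard: when a node is added, most of the data changes owner and must move.

```python
import zlib

def partition(mailbox: str, num_nodes: int) -> int:
    """Assign a mailbox to a partition node by a stable hash of its name."""
    return zlib.crc32(mailbox.encode()) % num_nodes

mailboxes = [f"user{i}@example.com" for i in range(1000)]

before = {m: partition(m, 4) for m in mailboxes}
after = {m: partition(m, 5) for m in mailboxes}  # a fifth node is added

# Count how many mailboxes would have to migrate to a new node.
moved = sum(1 for m in mailboxes if before[m] != after[m])
# With naive mod-N hashing, roughly 80% of mailboxes change owner when
# going from 4 to 5 nodes; production systems use schemes such as
# consistent hashing to keep this movement small.
```

This is why the slide asks for repartitioning to be automatic: the middleware, not the application, should decide which mailboxes move when capacity changes.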

Pack
  • Purpose
  • To deal with hardware/software failure at a
    partition (node)
  • DFN
  • Each partition is implemented as a pack of two or
    more nodes that provide access to the storage
    (Fig. 3).

Pack (contd.)
  • Two types of Pack
  • Shared-disk pack
  • All members of the pack may access all the disks
    in the pack
  • Similar to a shared-disk clone, except that the
    pack is serving just one part of the total data
  • Shared-nothing pack
  • Each member of the pack may serve just one
    partition of the disk pool during normal
    conditions, but serves a failed partition if that
    partition's primary server fails

Shared-nothing Pack (contd.)
  • Two modes
  • Active-active pack
  • each member of the pack can have primary
    responsibility for one or more partitions
  • When a node in the pack fails, the service of its
    partition migrates to another node of the pack.
  • Active-passive pack
  • Just one node of the pack is actively serving
    requests while the other nodes act as hot-standbys
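An active-active shared-nothing pack can be modeled as a small routing table: each partition has a primary node and a designated backup within the pack, and on node failure the partition's service migrates to the backup. The class, node, and partition names here are hypothetical.

```python
class Pack:
    """Sketch of an active-active shared-nothing pack: every partition
    has a primary node and a designated backup inside the pack."""

    def __init__(self, nodes, partitions):
        self.live = set(nodes)
        self.primary = {}
        self.backup = {}
        # Round-robin primary assignment; the backup is the next node,
        # so every pack member carries primary load for some partition.
        for i, p in enumerate(partitions):
            self.primary[p] = nodes[i % len(nodes)]
            self.backup[p] = nodes[(i + 1) % len(nodes)]

    def serve(self, partition):
        # Normal case: the primary serves its partition; if it has
        # failed, service migrates to the partition's backup node.
        node = self.primary[partition]
        return node if node in self.live else self.backup[partition]

    def fail(self, node):
        self.live.discard(node)

pack = Pack(["n0", "n1", "n2"], ["p0", "p1", "p2"])
pack.fail("n0")  # p0's service migrates to its backup, n1
```

An active-passive pack is the degenerate case of the same table: one node is primary for every partition and the others appear only as backups.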

RAPS
  • DFN
  • The collection of nodes that supports a
    packed-partitioned service is called a RAPS
    (Reliable Array of Partitioned Services).
  • Advantage
  • Provides both scalability and availability
  • Better performance than RACS for write-intensive
    workloads

Brief Summary of Scalability Design
Summary (contd.)
  • Clones and RACS
  • For read-mostly applications with low
    consistency requirements and modest storage
    requirements (< 100 GB)
  • Web/file/security/directory servers
  • Partitions and RAPS
  • For update-intensive and large database
    applications (routing requests to specific
    partitions)
  • Email/instant messaging/ERP/record keeping

Example
  • Multi-tier applications
  • (Functional separation)
  • front-tier
  • Web and firewall services (read mostly)
  • middle-tier
  • File servers (read mostly)
  • data-tier
  • SQL servers (update intensive)

Example (contd.)
  • Load balancing and routing at each tier
  • Front-tier
  • IP-level load distribution scheme
  • Middle-tier
  • Data- and process-specific load steering,
    according to request semantics
  • Data-tier
  • Routing to the correct partition

Part 2 Software Requirements for Geoplex, Farms,
RACS, and RAPS
  • (more of a wish list than a reflection of current
    tools and capabilities)
  • Be able to manage everything from a single remote
    console, treating RACS and RAPS as entities
  • Automated operation software to deal with normal
    events (summarizing, auditing, etc.) and to help
    the operator manage exceptional events (detecting
    failures and orchestrating repair, etc.), reducing
    operator error (and thus enhancing site availability)

Software requirements (contd.)
  • Both the software and hardware components must
    allow online maintenance and replacement
  • Tools to support versioned software deployment
    and staging across a site
  • Good tools to design user interfaces, services,
    and databases
  • Good tools to configure and then load balance the
    system as it evolves

Software requirements (contd.)
  • RACS
  • Automatic replication of software and data to new
    nodes
  • Automatic request routing to load balance the
    work and to route around failures
  • Recognize repaired and new nodes
  • RAPS
  • Automatic routing of requests to nodes dedicated
    to serving a partition of the data (affinity
    routing)
  • Middleware to provide transparent partitioning
    and load balancing (an application-level service)
  • Manageability features similar to those of cloned
    systems (for Packs)

Part 3 Price/Performance Metrics
  • Why the need for cloning/partitioning?
  • One cannot buy a single 60 billion-instructions
    per second processor or a single 100TB server
  • So, at least some degree of cloning and
    partitioning is required
  • What is the right building block for a site?

Right building block
  • Mainframe vendors
  • The mainframe is the right choice!
  • Their hardware and software offer high
    availability
  • Easier to manage their systems than to manage
    cloned PCs
  • But mainframe prices are fairly high!
  • 3x to 10x more expensive

Right building block (contd.)
  • Commodity servers and storage
  • Less costly to use inexpensive clones for
    CPU-intensive services, such as web service
  • Commodity software is easier to manage than the
    traditional services that require skilled
    operators and administrators
  • One consensus
  • Much easier to manage homogeneous sites (all NT,
    all FreeBSD, etc.) than to manage heterogeneous
    ones
  • Stats middleware (such as Netscape, IIS, Notes,
    Exchange) is where the administrators spend most
    of their time

Part 4 Summary
  • Scalability technique
  • Replicate a service at many nodes
  • Simpler form of replication
  • Duplicate both programs and data: RACS
  • For large databases or update-intensive services
  • The data is partitioned: RAPS
  • Packs make partitions highly available
  • Against disaster
  • The entire farm is replicated to form a geoplex