1
Sun Clusters
Ira Pramanick
Sun Microsystems, Inc.

2
Outline
  • Today's vs. tomorrow's clusters
  • How they are used today and how this will change
  • Characteristics of future clusters
  • Clusters as general-purpose platforms
  • How they will be delivered
  • Sun's Full Moon architecture
  • Summary & conclusions

3
Clustering Today
  • Mostly for HA
  • Little sharing of resources
  • Exposed topology
  • Hard to use
  • Layered on OS
  • Reactive Solution

4
Clustering Tomorrow
  (diagram: cluster connected over the LAN/WAN, administered from a central console)
5
Sun Full Moon architecture
  • Turns clusters into general-purpose platforms
  • Cluster-wide file systems, devices, networking
  • Cluster-wide load-balancing and resource
    management
  • Integrated solution
  • HW, system SW, storage, applications,
    support/service
  • Embedded in Solaris 8
  • Builds on existing Sun Cluster line
  • Sun Cluster 2.2 -> Sun Cluster 3.0

6
Characteristics of tomorrow's clusters
  • High-availability
  • Cluster-wide resource sharing: files, devices,
    LAN
  • Flexibility & Scalability
  • Close integration with the OS
  • Load-balancing Application management
  • Global system management
  • Integration of all parts: HW, SW, applications,
    support, HA guarantees

7
High Availability
  • End-to-end application availability
  • What matters: applications as seen by network
    clients are highly available
  • Enable Service Level Agreements
  • Failures will happen
  • SW, HW, operator errors, unplanned maintenance,
    etc.
  • Mask failures from applications as much as
    possible
  • Mask application failures from clients

8
High Availability...
  • No single point of failure
  • Use multiple components for HA & scalability
  • Need strong HA foundation integrated into OS
  • Node group membership, with quorum
  • Well-defined failure boundaries--no shared memory
  • Communication integrated with membership
  • Storage fencing
  • Transparently restartable services
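A minimal sketch of the membership-with-quorum point above, assuming a hypothetical 4-node cluster with one vote per node (illustrative only, not the Sun Cluster implementation):

  # Majority-quorum check: a partition may continue only if it holds a
  # strict majority of all votes, so two halves of a split cluster can
  # never both stay up ("split brain").
  TOTAL_VOTES = 4                      # hypothetical: 4 nodes, 1 vote each

  def has_quorum(reachable_nodes):
      """True if this partition may keep running and fence the rest."""
      return len(reachable_nodes) > TOTAL_VOTES // 2

  partition = {"node1", "node2", "node3"}      # node4 is unreachable
  if has_quorum(partition):
      print("quorum held: fence node4's storage, restart its services")
  else:
      print("no quorum: halt this partition to avoid split brain")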

9
High Availability...
  • Applications are the key
  • Most applications are not cluster-aware
  • Mask most errors from applications
  • Restart when node fails, with no recompile
  • Provide support for cluster-aware apps
  • Cluster APIs, fast communication
  • Disaster recovery
  • Campus-separation and geographical data
    replication
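A sketch of how a non-cluster-aware application fails over (hypothetical daemon path and node names; illustrative only, not the Sun Cluster framework itself): the framework relaunches the same unmodified binary on a surviving node, which is why no recompile is needed.

  import subprocess

  SERVICE_CMD = ["/usr/sbin/in.mydaemon"]      # hypothetical, unmodified daemon
  NODES = ["node1", "node2", "node3"]

  def restart_on_survivor(failed_node, alive_nodes):
      """Relaunch the same start command on a surviving node."""
      target = alive_nodes[0]
      print(f"{failed_node} failed; restarting service on {target}")
      # In a real cluster this command would be run remotely on `target`.
      return subprocess.Popen(SERVICE_CMD)

  alive = [n for n in NODES if n != "node2"]
  # restart_on_survivor("node2", alive)        # would relaunch the daemon as-is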

10
Resource Sharing
  • What is important to applications?
  • Ability to run on any node in cluster
  • Uniform global access to all storage and network
  • Standard system APIs
  • What to hide?
  • Hardware topology, disk interconnect, LAN
    adapters, hardwired physical names

11
Resource Sharing...
  • What is needed?
  • Cluster-wide access to existing file systems,
    volumes, devices, tapes
  • Cluster-wide access to LAN/WAN
  • Standard OS APIs: no application
    rewrite/recompile
  • Use SMP model
  • Apps run on machine (not CPU 5, board 3, bus 2)
  • Logical resource names independent of actual path
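A small sketch of location-independent naming, with made-up logical names and device paths (illustrative only): applications open a stable logical name, and the cluster resolves it to whichever node and physical path currently hosts the resource.

  GLOBAL_NAMESPACE = {
      # logical name           -> (current hosting node, physical path)
      "/global/data/orders.db":   ("node2", "/dev/dsk/c1t3d0s2"),
      "/global/logs":             ("node1", "/dev/dsk/c0t1d0s6"),
  }

  def resolve(logical_name):
      """Map a stable logical name to its current node and device."""
      return GLOBAL_NAMESPACE[logical_name]

  node, path = resolve("/global/data/orders.db")
  print(f"orders.db is currently served by {node} via {path}")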

12
Resource Sharing...
  • Cluster-wide location-independent resource access
  • Run applications on any node
  • Failover/switchover apps to any node
  • Global job/work queues, print queues, etc.
  • Change/maintain hardware topology without
    affecting applications
  • But need not require fully-connected SAN
  • Main interconnect can be used through software
    support
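A sketch of the global work-queue idea (illustrative only; a real cluster would keep the queue on shared, highly available storage): because every node sees the same queue and the same storage, any node can take the next job.

  from collections import deque

  global_queue = deque(["print job 17", "batch report", "nightly backup"])

  def take_next_job(node):
      """Any node may dequeue work; placement is not tied to topology."""
      if global_queue:
          print(f"{node} picked up: {global_queue.popleft()}")

  take_next_job("node3")
  take_next_job("node1")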

13
Flexibility
  • Business needs change all the time
  • Therefore, platform must be flexible
  • System must be dynamic -- all done on-line
  • Resources can be added and removed
  • Dynamic reconfiguration of each node
  • Hot-plug (in and out) of I/O, CPUs, memory,
    storage, etc.
  • Dynamic reconfiguration between nodes
  • More nodes, load-balancing, application
    reconfiguration

14
Scalability
  • Cluster SMP nodes
  • Choose nodes as big as needed to scale
    application
  • Need expansion room within nodes too
  • Don't use clustering exclusively to scale
    applications
  • Interconnect speed slower than backplane speed
  • Few cluster-aware applications
  • Clustering a large number of small nodes is like
    herding chickens

15
Close integration with OS
  • Currently: multi-CPU (SMP) support in the OS
  • Does not make sense otherwise
  • Next step: cluster support in the OS
  • The next dimension of OS support: across nodes
  • Clustering will become part of the OS
  • Not a loosely-integrated layer

16
Advantages of OS integration
  • Ease of use
  • Same administration model, commands, installation
  • Availability
  • Integrated heartbeat, membership, fencing, etc.
  • Performance
  • In-kernel support, inter-node/process messaging,
    etc.
  • Leverage
  • All OS features/support available for clustering

17
Load-balancing
  • Load-balancing done at various levels
  • Built-in network load-balancing
  • For example, incoming HTTP requests & TCP/IP
    bandwidth
  • Transactions at middleware level
  • Global job queues
  • All nodes have access to all storage and network
  • Therefore any node can be eligible to perform the
    work
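A sketch of the built-in network load-balancing idea (node names hypothetical; illustrative only): requests arriving at one cluster address are spread round-robin across nodes, and any node is eligible because all nodes reach the same storage and network.

  import itertools

  NODES = ["node1", "node2", "node3", "node4"]
  rotation = itertools.cycle(NODES)

  def dispatch(request):
      """Forward an incoming request to the next node in rotation."""
      print(f"{request} -> {next(rotation)}")

  for i in range(6):
      dispatch(f"http request {i}")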

18
Resource management
  • Cluster-wide resource management
  • CPU, network, interconnect, I/O bandwidth
  • Cluster-wide application priorities
  • Global resource requirements guaranteed locally
  • Need per-node resource management
  • High availability is not just making sure an
    application is started
  • Must also guarantee the resources to finish the job
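A sketch of resource-aware placement (node capacities and requirements made up; illustrative only): the application is started only on a node that can actually reserve the CPU share it needs, which is the "guarantee resources to finish the job" point above.

  node_free_cpu = {"node1": 0.20, "node2": 0.55, "node3": 0.35}   # fraction free

  def place(app, required_cpu):
      """Pick a node that can guarantee the required CPU share, or refuse."""
      for node, free in node_free_cpu.items():
          if free >= required_cpu:
              node_free_cpu[node] = free - required_cpu   # reserve it
              return node
      return None

  print(place("billing batch", 0.50))    # node2 can guarantee the share
  print(place("big analytics", 0.60))    # None: no node can guarantee it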

19
Global cluster management
  • System management
  • Perform administrative functions once
  • Maintain same model as single node
  • Same tools/commands as base OS--minimize
    retraining
  • Hide complexity
  • Most administrative operations should not deal
    with HW topology
  • But still enable low-level diagnostics and
    management
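A sketch of "perform administrative functions once" (command and node list hypothetical; illustrative only): the administrator issues one logical operation and the management layer fans it out to every node, keeping the single-machine model.

  NODES = ["node1", "node2", "node3", "node4"]

  def cluster_exec(description, per_node_action):
      """Run one administrative action once, applied on every node."""
      print(f"cluster-wide: {description}")
      for node in NODES:
          per_node_action(node)

  cluster_exec("enable new NFS share",
               lambda node: print(f"  applied on {node}"))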

20
A Total Clustering Solution
  (diagram: applications, service and support, HA guarantee practice -- integration of all components)
21
Roadmap
  • Sun Cluster 2.2 currently shipping
  • Solaris 2.6, Solaris 7, Solaris 8 (3/00)
  • 4 nodes
  • Year 2000 compliant
  • Choice of servers, storage, interconnects,
    topologies, networks
  • 10 km separation
  • Sun Cluster 3.0
  • External Alpha 6/99, Beta Q1 CY00, GA 2H CY00
  • 8 nodes
  • Extensive set of new features: cluster FS, global
    devices, network load-balancing, new APIs (RGM),
    diskless application failover, SyMON integration

22
Wide Range of Applications
  • Agents developed, sold, and supported by Sun
  • Databases (Oracle, Sybase, Informix, Informix
    XPS), SAP
  • Netscape (http, news, mail, LDAP), Lotus Notes
  • NFS, DNS, Tivoli
  • Sold and supported by 3rd parties
  • IBM DB2 and DB2 PE, BEA Tuxedo
  • Agents developed and supported by Sun
    Professional Services
  • A large list, including many in-house
    applications
  • Toolkit for agent development
  • Application management API, training, Sun PS
    support
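At its core an agent is a small set of callbacks; the sketch below (method names hypothetical, not the Sun Cluster agent API) shows the shape the toolkit wraps around an off-the-shelf application: how to start it, stop it, and check its health.

  class Agent:
      def __init__(self, name):
          self.name = name
          self.running = False

      def start(self, node):          # invoked after a failover/switchover
          self.running = True
          print(f"starting {self.name} on {node}")

      def stop(self, node):           # invoked before moving the service
          self.running = False
          print(f"stopping {self.name} on {node}")

      def probe(self):                # periodic health check
          return self.running

  agent = Agent("in-house billing app")
  agent.start("node1")
  print("healthy:", agent.probe())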

23
Full Moon clustering
  (diagram: dynamic domains, global file system, global devices)
24
Summary
  • Clusters as general-purpose platforms
  • Shift from reactive to proactive clustering
    solution
  • Clusters must be built on a strong foundation
  • Embed into a solid operating system
  • Full Moon -- bakes clustering technology into
    Solaris
  • Make clusters easy to use
  • Hide complexity, hardware details
  • Must be an integrated solution
  • From the platform and service/support to HA guarantees