The Sun Cluster Grid Architecture (Sun Grid Engine Project) - PowerPoint PPT Presentation

About This Presentation
Title:

The Sun Cluster Grid Architecture (Sun Grid Engine Project)

Description:

The Sun Cluster Grid Architecture (Sun Grid Engine Project) Adam Belloum Computer Architecture & Parallel Systems group University of Amsterdam adam_at_science.uva.nl – PowerPoint PPT presentation

Slides: 74
Provided by: Pime
Learn more at: http://www.science.uva.nl
Transcript and Presenter's Notes

Title: The Sun Cluster Grid Architecture (Sun Grid Engine Project)


1
The Sun Cluster Grid Architecture (Sun Grid Engine Project)
  • Adam Belloum
  • Computer Architecture & Parallel Systems group
  • University of Amsterdam
  • adam@science.uva.nl

2
Sun Cluster Architecture
  • The architecture includes three tiers:
  • Front-end access nodes
  • Middle-tier management nodes
  • Back-end compute nodes

3
Access Tier
  • The access tier provides access and
    authentication services to the Cluster Grid
    users.
  • Access methods:
  • telnet, rlogin, or ssh can be used to grant access.
  • Web-based services:
  • can be provided to permit easy (tightly controlled) access to the facility.

4
Management Tier
  • This middle tier includes one or more servers that run:
  • the server elements of client-server software, such as Distributed Resource Management (DRM),
  • hardware diagnosis software,
  • system performance monitors.
  • File servers provide NFS service to other nodes in the Cluster Grid.
  • License key servers manage software license keys for the Cluster Grid.
  • Software provisioning servers manage:
  • operating system versions,
  • application software versioning,
  • patch application on other nodes in the Cluster Grid.

5
Compute Tier
  • Supplies the compute power for the Cluster Grid.
    Jobs submitted through upper tiers in the
    architecture are scheduled to run on one or more
    nodes in the compute tier.
  • Nodes in this tier run:
  • the client-side of the DRM software,
  • the daemons associated with message-passing
    environments,
  • any agents for system health monitoring.
  • The compute tier communicates with the management
    tier, receiving jobs to run and reporting job
    completion status and accounting details.

6
Hardware Considerations
  • The essential hardware components of a Cluster Grid are:
  • Computing systems,
  • Networking equipment,
  • Storage.
  • The choice of hardware at each tier depends on a
    number of factors, primarily
  • What are the required services?
  • What kind of service level is needed?
  • What is the expected user load?
  • What is the expected application type and mix?

7
Access, Compute, and Management Nodes
  • Nodes in the access tier are utilized by users to
    submit, control, and monitor jobs
  • Nodes in the compute tier are used to execute
    jobs
  • Nodes in the management tier run the majority of
    the software needed to implement the Cluster
    Grid.
  • The hardware requirements for each node depend,
    in part, on its location in the system
    architecture.

8
Access Node Requirements
  • Access nodes typically require no special
    configuration
  • Any desktop or server that is connected to the network can be configured to allow direct access to the Cluster Grid.
  • Introduction or modification of nodes in the
    access tier
  • is a simple operation that does not affect other
    tiers in the architecture.
  • Users without a system directly connected to the local area network can interface with an access node via conventional methods (e.g., telnet, rlogin, ftp, and ssh).

9
Compute Node Requirements
  • Compute nodes run jobs that are submitted to the
    Cluster Grid, and design of this tier is crucial
    to maximize application performance.
  • The Cluster Grid software places little load upon nodes in the compute tier.
  • Access nodes can also be configured as compute nodes, which may be appropriate in certain environments.
  • For example, desktop machines used as access
    nodes can also be tapped for their spare compute
    cycles after business hours or when the CPU is
    otherwise idle.

10
Management Node Requirements
  • Cluster Grids can be designed with one or more
    systems in the management tier.
  • Managing services on a single system is simplest, and this may be the best choice for small Cluster Grids.
  • Managing services on multiple systems provides greater scalability and can provide increased performance, especially for larger Cluster Grids.

11
Management Node Requirements
  • The system requirements for the master node are
    dependent on
  • Size of the compute cluster,
  • Volume of jobs being submitted,
  • Complexity of any scheduling decisions that must
    be made.
  • In large clusters, the DRM master node is dedicated:
  • it should not perform compute tier duties or act in any other capacity.
  • This is particularly relevant in clusters running large MPI jobs.
  • If the DRM server is acting as a compute node,
    system services can continually interrupt the MPI
    job in progress, thereby delaying a large job
    running across many nodes.

12
Networking Infrastructure
  • A typical Cluster Grid can be configured with three separate types of network interconnects:
  • Ethernet,
  • a serial interconnect,
  • a specialized low-latency, high-bandwidth interconnect.

13
Networking Infrastructure
  • Compute, management, and access nodes in a
    Cluster Grid typically are connected by a local
    area network utilizing Fast Ethernet or Gigabit
    Ethernet technology.
  • This network is used for file sharing,
    interprocess communication, and system
    management.
  • Care should be taken to separate standard
    Ethernet traffic from compute-related
    communications as far as possible if these
    functions share network hardware.

14
Networking Infrastructure
  • For resiliency and increased reliability, the
    network infrastructure can be configured to
    ensure that no single point of failure can
    compromise availability.
  • the network can be designed with redundant
    switches,
  • and multiple network interfaces can be used for
    increased throughput and to help meet network
    bandwidth requirements.
  • An additional serial interconnect can be used for
    administrative convenience.
  • A serial network can connect the system console
    of all compute nodes in the Cluster Grid to one
    or more terminal concentrators, which are in turn
    connected to the local area network.
  • A specialized low-latency, high-bandwidth system interconnect is crucial to the performance of large, communication-intensive MPI jobs.

15
Networking Infrastructure
  • Using a separate high-performance interconnect
    also reduces the networking load on the server
    CPU, freeing it for other tasks.
  • An additional network can be added to provide
    rapid data delivery to and from compute nodes if
    required.
  • This network can utilize high-speed Ethernet, or
    a Storage Area Network (SAN) can be implemented

16
Software Integration
  • Resilience
  • Interoperability
  • Manageability

17
Software Integration
  • Software integration includes writing utility scripts or modifying the scripts that perform application setup.
  • In the ideal case, applications can be submitted to the DRM without requiring recompilation or linking with special libraries.
  • Software integration also includes the
    integration of the DRM with parallel environments
    such as
  • Parallel Virtual Machine (PVM)
  • Message Passing Interface (MPI). With
    integration, parallel jobs submitted by users can
    be controlled and properly accounted for by the
    DRM.

18
Software Integration
  • Other aspects of software integration include the design of special interfaces at the access tier which automate or simplify the submission of tasks to the management tier for running on the compute tier.
  • This can include writing specialized wrapper
    scripts, Web interfaces, or more fully-featured
    graphical user interfaces (GUIs).

19
Resilience
  • On the compute tier, nodes are anonymous and
    independent.
  • If one node fails, the remaining nodes are
    unaffected and remain available to execute user
    jobs.
  • The cluster can be configured to redo any work
    that is lost if a server fails mid-job, making
    users unaware of any individual node failures and
    providing increased availability.
  • The RAS (Reliability, Availability, and
    Serviceability) features of the hardware and
    software elements are most relevant to the
    management tier.

20
Resilience
  • The system operating environment can also
    contribute to high availability with features
    like
  • live upgrades,
  • automatic dynamic reconfiguration,
  • file system logging,
  • and IP network failover.
  • The availability of data can be increased with
    redundant, hot-swappable storage components,
    multiple paths to data storage and hardware or
    software RAID capabilities.

21
High Availability
  • If required, High Availability (HA) software can
    provide even greater levels of availability. For
    example,
  • HA software can be used to provide a highly
    available NFS service to the Cluster Grid.
  • If the primary NFS server should fail for any
    reason, NFS data services are automatically and
    transparently failed over to a backup server.
  • Similar to the compute tier, the access tier
    generally contains many systems or devices, thus
    providing inherent redundancy.

22
Interoperability
  • Cluster Grid implementations work on the
    principle of an integratable stack and should be
    able to run across a heterogeneous environment.
  • servers running different operating environments
    should be permitted to belong to the same compute
    cluster.
  • Users should be able to submit jobs to any
    available architecture by simply submitting their
    job to the DRM software.
  • If the job must run on a particular architecture,
    users can specify this as a resource requirement
    when submitting the job.
  • The DRM software can then ensure that this job
    runs only on the correct system types and
    dispatch it appropriately.
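With Sun Grid Engine's qsub, such an architecture requirement is expressed as a resource request. A sketch (the architecture string shown is illustrative and depends on the installed platforms):

```shell
# Request that job.sh run only on hosts whose architecture matches.
qsub -l arch=sol-sparc64 job.sh
```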

23
Manageability
  • The scalability of the Cluster Grid architecture
    can result in hundreds or even thousands of
    managed nodes.
  • Management tools must scale with the size of a
    Cluster Grid, provide a single point of
    management, offer flexibility, and ensure
    security in a distributed environment.
  • Proactive system management, monitoring the health and functionality of systems, can provide improved service.
  • System management costs can be reduced significantly by utilizing installation and deployment technologies that help minimize the amount of time administrators spend installing and patching systems and software.

24
Cluster Grid Components
25
  • One of the most important features of the Cluster
    Grid architecture is its modular and open design.
  • Components are separate and have unique roles
    within the architecture.
  • This design is commonly referred to as a software
    stack, with each layer in the stack
    representing a different functionality.

26
(No Transcript)
27
Sun Grid Engine
  • Distributed Resource Management
  • Cluster Queues
  • Hostgroup and Hostlist
  • Scheduler

28
Sun Grid Engine
  • The Sun Grid Engine distributed resource management software is the essential component of any Cluster Grid.
  • Optimizes utilization of software/hardware
    resources
  • Aggregates the compute power available in cluster
    grids and presents a unified and simple access
    point to users needing compute cycles.
  • Sun Grid Engine software provides dependable,
    consistent, and pervasive access to both
    high-throughput and highly parallel computational
    capabilities.

29
Sun Grid Engine
  • Sun Grid Engine can also provide:
  • job accounting information,
  • statistics that are used to monitor resource utilization and determine how to improve resource allocation.
  • Administrators can specify job options:
  • priority,
  • hardware and license requirements,
  • dependencies,
  • and can define and control user access to compute resources.

30
Distributed Resource Management
  • The basis for DRM is the batch queuing mechanism.
  • In the normal operation of a cluster, if the
    proper resources are not currently available to
    execute a job, then the job is queued until the
    resources are available.
  • DRM further enhances batch queuing by monitoring
    host computers in the cluster for properly
    balanced load conditions.
  • Sun Grid Engine software provides DRM functions
  • batch queueing, load balancing, job accounting
    statistics, user-specifiable resources,
    suspending and resuming jobs, and cluster-wide
    resources.

31
Cluster Queues
  • The new cluster queue design is based on three
    major points
  • Multiple hosts per queue configuration.
  • Different queue attributes per execution host.
  • Introduction of the concept of Hostgroups.

32
Cluster Queues
  • The cluster queue named big serves three different hosts:
  • balrog, durin, and ori.
  • The seq_no attribute value is
  • 1 for balrog,
  • 2 for durin
  • and zero for ori.
  • Both the load_thresholds and suspend_thresholds
    attributes are the same for all execution hosts.
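A cluster queue like this might appear as follows in `qconf -sq big` output (a sketch; the threshold values are illustrative, and the bracketed entries are per-host overrides of the leading default value):

```
qname              big
hostlist           balrog durin ori
seq_no             0,[balrog=1],[durin=2]
load_thresholds    np_load_avg=1.75
suspend_thresholds NONE
```

Because load_thresholds and suspend_thresholds carry no bracketed overrides, they apply identically to all three execution hosts.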

33
Hostgroup and Hostlist
  • A hostgroup contains a list of grid engine execution hosts and is referred to by an at sign ('@') followed by a string.
  • A hostlist is a cluster queue attribute that can contain execution hosts and/or hostgroups.
  • The figure illustrates an example where the two created hostgroups @solaris64 and @linux belong to the queue named big.

34
How to dispatch jobs?
  • The scheduler selects queues for submitted jobs via attribute matching in an N1 Grid Engine 6 cluster, as opposed to direct submission to a named queue, which is popular in other DRM products.
  • Users can still submit to a specific queue if desired. An example would be a job that requests memory/CPU resources being submitted to a queue set up to fulfill this type of request.

35
How to dispatch jobs?
  • N1 Grid Engine 6 provides the capability to use
    regular expressions for matching resource
    requests.
  • qsub -q medium job.sh
  • → submits job.sh to the medium queue.
  • qsub -q fast@@solaris64 job.sh
  • → submits job.sh to the fast queue with the @solaris64 hostlist.
  • qsub -q fast@sf15k job.sh
  • → submits job.sh to the queue instance fast that belongs to the sf15k host.
  • qmod -e big
  • → enables queue big.
  • qmod -c big@@linux
  • → clears the alarm state from the queue big which is attached to the hosts in the @linux hostgroup.

36
Scheduler
  • Scheduler internal status creation is optimized
    for performance and the task of sending tickets
    from the scheduler to qmaster is streamlined.
  • The scheduler has look-ahead features, such as
  • Resource reservation
  • Backfilling
  • New prioritization scheme
  • Improved algorithm
  • Scheduling profile choices at install time

37
Scheduler
38
Scheduler
  • With the new scheduler, a high-priority job can use resource reservation to reserve the resources it needs.
  • Although the new scheduler ensures proper prioritization of jobs, it may leave resources idle for extended periods of time.

39
Scheduler (backfilling)
  • Grid Engine notices that queues 2 and 3 will be
    idle because the 3 CPU Job 2 will have to wait
    until Job 1 finishes.
  • It then scans the wait list for short jobs that
    could be run on queues 2 and 3 without delaying
    Job 2.
  • After this analysis, Jobs 3 and 4 are started.
  • Finally, Job 1 finishes after 3 and 4, freeing up
    the resources to start Job 2.
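The backfilling decision described above can be sketched as a toy simulation (the names and numbers are hypothetical, not Grid Engine code): a short job may start on idle resources only if it finishes before the reserved start time of the blocked high-priority job.

```python
# Toy backfill check: a waiting job may run on idle CPUs now only if it
# cannot delay the reservation held by the blocked high-priority job.

def can_backfill(job_runtime, now, reserved_start):
    """True iff the job finishes before the reservation begins."""
    return now + job_runtime <= reserved_start

# Job 1 occupies its queue until t=10; Job 2 has reserved 3 CPUs from t=10.
reserved_start = 10
now = 0

# Candidate short jobs from the wait list: (job id, expected runtime).
wait_list = [("job3", 8), ("job4", 6), ("job5", 15)]

# Jobs 3 and 4 fit into the idle window before Job 2's reservation; job5 does not.
started = [jid for jid, rt in wait_list if can_backfill(rt, now, reserved_start)]
print(started)
```

This mirrors the walkthrough: Jobs 3 and 4 are started on the otherwise-idle queues, and Job 2 still begins on time once Job 1 finishes.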

40
Scheduler
  • Wait lists are controlled by three factors:
  • priority (from POSIX priority),
  • urgency,
  • number of tickets.
  • Priority = normalized urgency × weight_urgency + normalized tickets × weight_tickets + normalized priority × weight_priority
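The weighted combination above can be sketched in a few lines (the weight values here are illustrative defaults for the sketch only; the real weights live in the Grid Engine scheduler configuration):

```python
# Sketch of the priority formula: each normalized input is scaled by its
# configured weight and the three contributions are summed.

def job_priority(norm_urgency, norm_tickets, norm_posix_priority,
                 weight_urgency=0.1, weight_tickets=0.01, weight_priority=1.0):
    """Combine the three normalized factors into one dispatch priority."""
    return (norm_urgency * weight_urgency
            + norm_tickets * weight_tickets
            + norm_posix_priority * weight_priority)

# A job with high urgency beats one favored only by the ticket policy
# when weight_urgency dominates weight_tickets.
p1 = job_priority(norm_urgency=1.0, norm_tickets=0.1, norm_posix_priority=0.5)
p2 = job_priority(norm_urgency=0.2, norm_tickets=1.0, norm_posix_priority=0.5)
print(p1 > p2)
```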

41
Scheduler
  • The scheduler has two new parameters to obtain
    more information about scheduling activities
  • PROFILE: if set to true, the scheduler will show how much time it spent on each step of a scheduling run.
  • MONITOR: if set to true, the scheduler will dump the information necessary to reproduce its job resource decisions.

42
  • ARCo (the Accounting and Reporting Console) has several predefined reports, such as:
  • Accounting per Department
  • Accounting per Project
  • Accounting per User
  • Host Load
  • Statistics
  • Average Job Turnaround time
  • Average Job Wait Time per day
  • Job Log
  • Number of Jobs Completed
  • Queue Consumables

43
Sun Grid Engine Architecture
  • Master host
  • A single host is selected to be the Sun Grid
    Engine master host.
  • This host handles all requests from users, makes
    job scheduling decisions, and dispatches jobs to
    execution hosts.
  • Execution hosts
  • Systems in the cluster that are available to
    execute jobs are called execution hosts.
  • Submit hosts
  • Submit hosts are machines configured to submit,
    monitor, and administer jobs, and to manage the
    entire cluster.

44
Sun Grid Engine Architecture
  • Software Job flow
  • Security
  • High Availability

45
Sun Grid Engine Architecture
  • Administration hosts
  • Sun Grid Engine administrators use administration
    hosts to make changes to the cluster
    configuration, such as
  • changing DRM parameters,
  • adding new nodes,
  • adding or changing users.
  • Shadow master host
  • While there is only one master host, other
    machines in the cluster can be designated as
    shadow master hosts to provide greater
    availability.
  • A shadow master host continually monitors the
    master host, and automatically and transparently
    assumes control in the event that the master host
    fails.

46
Software Job flow
  • Jobs are submitted to the master host and are
    held in a spooling area until the scheduler
    determines that the job is ready to run.
  • Sun Grid Engine software matches available
    resources to job requirements, such as available
    memory, CPU speed, and available software
    licenses.
  • The requirements of the jobs may be very
    different and only certain hosts may be able to
    provide the corresponding service.

47
Software Job flow
  • Job submission
  • When a user submits a job from a submit host, the job submission request is sent to the master host.
  • Job scheduling
  • The master host determines the host to which the
    job will be assigned. It assesses the load,
    checks for licenses, and evaluates any other job
    requirements.
  • Job execution
  • After obtaining scheduling information, the
    master host then sends the job to the selected
    execution host. The execution host saves the job
    in a job information database and starts a
    shepherd process which starts the job, and waits
    for completion.
  • Accounting information
  • When the job is complete, the shepherd process
    returns the job information, and the execution
    host then reports the job completion to the
    master host and removes the job from the job
    information database. The master host updates the
    job accounting database to reflect job completion.
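The four stages above can be modeled as a toy state machine (the dict-based structures are hypothetical stand-ins; real Grid Engine uses spool directories and daemons):

```python
# Toy model of the job flow: submission -> scheduling/execution -> accounting.

spool_area = []        # master host: jobs waiting to be scheduled
job_info_db = {}       # execution host: job information database
accounting_db = []     # master host: completed-job accounting records

def submit(job_id):
    """Job submission: the request lands in the master host's spooling area."""
    spool_area.append(job_id)

def schedule_and_execute():
    """Scheduling + execution: master dispatches the next job; the execution
    host saves it in its job information database (shepherd starts the job)."""
    job_id = spool_area.pop(0)
    job_info_db[job_id] = "running"
    return job_id

def complete(job_id):
    """Accounting: the execution host removes the job from its database and
    the master host updates the job accounting database."""
    del job_info_db[job_id]
    accounting_db.append(job_id)

submit("job-1")
jid = schedule_and_execute()
complete(jid)
print(accounting_db)
```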

48
Security
  • To control access to the cluster, the Sun Grid
    Engine master host maintains information about
    eligible submit and administration hosts.
  • Systems which have been explicitly listed as
    eligible submit hosts are able to submit jobs to
    the cluster.
  • Systems which have been added to the list of
    eligible administration hosts can be used to
    modify the cluster configuration.
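These eligible-host lists are maintained with qconf; a sketch (the hostnames are hypothetical):

```shell
qconf -as node42     # add node42 as an eligible submit host
qconf -ah admin1     # add admin1 as an eligible administration host
qconf -ss            # show the current list of submit hosts
qconf -sh            # show the current list of administration hosts
```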

49
High Availability
  • The cluster can be configured with one or more
    shadow master hosts,
  • eliminating the master host as a single point of
    failure and providing increased availability to
    users.
  • If the master goes down, the shadow master host
    automatically and transparently takes over as the
    master.
  • Shadow master host functionality is a
    fully-integrated part of the Sun Grid Engine
    software.
  • The only prerequisite for its use is a
    highly-available file system on which to install
    the software and configuration files.

50
Development Tools and Run-Time Libraries
  • Sun HPC ClusterTools
  • Parallel Application Development
  • Sun HPC ClusterTools Software
  • Integration with Sun Grid Engine
  • Forte for High Performance Computing
  • Technical Computing Portal

51
Development Tools and Run-Time Libraries
  • Sun HPC ClusterTools and Forte for High
    Performance Computing (HPC) software are commonly
    used to develop and run applications on Cluster
    Grids.
  • Sun HPC ClusterTools provides an integrated
    software environment for developing and deploying
    parallel distributed applications.
  • Forte HPC provides support for developing high-performance (non-parallel) applications in the FORTRAN, C, and C++ programming languages.

52
Sun HPC ClusterTools
  • Sun HPC ClusterTools 4 software is a complete
    integrated environment for parallel application
    development.
  • It delivers an end-to-end software development
    environment for parallel distributed applications
    and provides middleware to manage a workload of
    highly resource-intensive applications.
  • Sun HPC ClusterTools Software enables users to
    develop and deploy distributed parallel
    applications with continuous scalability from one
    to 2048 processes within a single well-integrated
    parallel development environment.

53
Parallel Application Development
  • Two primary high-performance parallel programming models are supported: the single-process model and the multi-process model.
  • The single-process model includes all types of
    multi-threaded applications.
  • may be automatically parallelized by high
    performance compilers using parallelization
    directives (e.g., OpenMP) or explicitly
    parallelized with user-inserted Solaris or POSIX
    threads.
  • The multi-process model supports the MPI standard
    for parallel applications that run both on single
    SMPs and on clusters of SMPs or thin nodes.

54
Parallel Application Development
  • Sun HPC ClusterTools software includes:
  • a high-performance, multi-protocol implementation of the industry-standard MPI,
  • a full implementation of the MPI I/O protocol,
  • tools for executing, debugging, performance analysis, and tuning of technical computing applications.
  • Sun HPC ClusterTools software is thread-safe, facilitating a third, hybrid parallel application model:
  • the mixing of threads and MPI parallelism to create applications that use MPI for communication between cooperating processes and threads within each process.

55
Sun HPC ClusterTools Software
  • Sun HPC ClusterTools software provides the
    features to effectively develop, deploy, and
    manage a workload of highly resource-intensive,
    MPI-parallel applications
  • Sun HPC ClusterTools is integrated to work with
    Sun Grid Engine software for use in Cluster Grid
    environments.
  • Sun HPC ClusterTools software supports standard
    programming paradigms like
  • MPI message passing and includes a parallel file
    system that delivers high-performance, scalable
    I/O.

56
Integration with Sun Grid Engine
  • Sun CRE provides Sun Grid Engine with the
    relevant information about parallel applications
    in which multiple resources are reserved for a
    single job.
  • the Sun Grid Engine software uses the Sun CRE
    component to handle the details of launching MPI
    jobs, while still presenting the familiar Sun
    Grid Engine interface to the user.
  • Integration of Sun HPC ClusterTools with the Sun
    Grid Engine framework provides a distinct
    advantage to users of a Sun Cluster Grid.
  • By running parallel jobs with Sun CRE under the DRM of Sun Grid Engine, users achieve both efficient resource utilization and effective control over parallel applications.

57
Forte for High Performance Computing (HPC)
  • 64-bit application development: 64-bit technology offers many benefits, including:
  • address space to handle large problems,
  • 64-bit integer arithmetic to increase the calculation speed for mathematical operations,
  • support for files greater than 4 GB in size.
  • Sun Performance Library compatibility: compatibility with the Sun Performance Library helps provide optimized performance for matrix algebra and signal processing tasks on single-processor and multiprocessor systems.

58
Forte for High Performance Computing (HPC)
  • Integrated programming environment: Forte HPC includes an integrated programming environment that enables developers to browse, edit, compile, debug, and tune applications efficiently.
  • Software configuration management tools: Forte HPC provides software configuration management tools to enable development teams to work together effectively and efficiently.

59
Forte for High Performance Computing (HPC)
  • Multi-threading technology: Forte HPC software enables developers to develop and tune multi-threaded/multi-processing applications using capabilities such as OpenMP API support for C and FORTRAN programs.
  • Performance analysis tools: performance analysis tools enable developers to evaluate code performance, spot potential performance issues, and locate problems quickly.

60
Technical Computing Portal
  • The Technical Computing Portal is a
    services-centric, Web-based, shared-everything
    approach to technical computing.
  • It offers an easy-to-use interface for job
    submission, job control, and access to results
    via the Sun ONE Portal Server (formerly iPlanet
    Portal Server) and the Sun Grid Engine software.
  • The Sun ONE Portal Server is a community-based server application that securely provides an aggregation of key content, applications, and services, personalized based on user role/identity, user preferences, and system-determined relevancy.

61
System Management Center
  • Sun Management Center
  • Intelligent Agent-Based Architecture
  • Sun Validation Test Suite
  • Installation and Deployment Technologies
  • Web Start Flash
  • Solaris JumpStart software
  • Solaris Live Upgrade

62
System Management
  • Cluster Grids can contain large numbers of
    distributed systems, and ensuring efficient and
    effective system management is essential.
  • Powerful system administration tools such as Sun
    Management Center provide comprehensive
    administrative and management operations.
  • Other tools include Sun Validation Test Suite
    (SunVTS) to test and verify hardware
    functionality
  • across a network, and automated installation and
    deployment technologies like the Solaris Web
    Start product line to help reduce the amount of
    time administrators spend installing and patching
    systems and software in a Cluster Grid.

63
Sun Management Center
  • Sun Management Center software is an advanced
    system management tool designed to support Sun
    systems.
  • It offers a single point of management for Sun
    systems, the Solaris Operating Environment,
    applications, and services for data center and
    highly distributed computing environments.
  • Sun Management Center software enables system administrators to:
  • perform remote system management,
  • monitor performance,
  • isolate hardware/software faults for hundreds of Sun systems,
  • all through an easy-to-use Web interface.
  • Enhanced, proactive event/alarm management provides early notification of potential service problems.

64
Intelligent Agent-Based Architecture
  • Sun Management Center is based on an intelligent
    agent-based architecture.
  • A manager monitors and controls managed entities by sending requests to agents residing on the managed node.
  • Agents are key software components that collect
    management data on behalf of the manager.

65
Intelligent Agent-Based Architecture
  • Scalability: distributing responsibility to the agents improves the Sun Management Center software's ability to scale as the number of managed nodes increases.
  • Increased reliability and availability: because agents process data locally and are not dependent on other software components, reliability and availability are enhanced.

66
Intelligent Agent-Based Architecture
  • Flexibility and extensibility Additional
    modules can be dynamically loaded to Sun
    Management Center agents.
  • Decreased bandwidth requirements Intelligent
    agents offer a savings in network bandwidth,
  • agents collect data on the managed nodes and
    only report status and significant events when
    necessary.
  • Security All users are authenticated, limiting
    administrators access to and management of only
    the systems within their control.

67
Sun Validation Test Suite
  • SunVTS is a comprehensive diagnostic tool that
    tests and validates Sun hardware by verifying the
    connectivity and functionality of most system
    hardware.
  • SunVTS can be tailored to run on various types of
    machines ranging from desktops to servers, and
    supports testing in both 32-bit and 64-bit
    Solaris operating environments.
  • Tests examine subsystems such as processors,
    peripherals, storage, network, memory, graphics
    and video, audio, and communication.

68
Sun Validation Test Suite
  • The primary goal of the SunVTS software is to
    create an environment in which Sun systems can be
    thoroughly tested to enable their proper
    operation or to find elusive problems.
  • SunVTS can be used to validate a system during
    development or production, as well as for
    troubleshooting, periodic maintenance, and system
    or subsystem stressing

69
Installation and Deployment Technologies
  • With Solaris Web Start software and the Solaris Web Start Wizards technology, the Solaris Operating Environment and other applications can be installed interactively with a browser-based interface.
  • Solaris JumpStart software provides automated
    installation and setup of multiple systems over
    the network.
  • Web Start Flash, Solaris JumpStart, and Solaris
    Live Upgrade technologies are particularly
    relevant to the Cluster Grid environment where
    large numbers of similarly configured systems
    must be managed.

70
Web Start Flash
  • Web Start Flash takes a complete system image of the Solaris Operating Environment, application stack, and system configuration and replicates that reference server configuration image onto multiple servers.
  • It is particularly applicable to Cluster Grid environments that contain large numbers of identical systems.
  • Complete system replication: system administrators can capture a snapshot image of a complete server.
  • Rapid deployment: Web Start Flash technology can reduce configuration complexity, improve deployment scalability, and significantly reduce installation time for rapid deployment.

71
Web Start Flash
  • Layered Flash deployment: Web Start Flash technology provides the ability to layer Flash Archives, increasing the flexibility of the Web Start Flash installation while also reducing the disk space required to store Flash Archives.
  • FRU server snapshot: Web Start Flash technology can also be used to store existing server configurations, thus making them a field replaceable unit (FRU).

72
Solaris JumpStart software
  • Solaris JumpStart software can install and set up a Solaris system anywhere on the network without any user interaction.
  • The Solaris Operating Environment and application software can be placed on centralized servers, and the install process can be customized by system administrators.
  • The process is highly customizable:
  • administrators can set rules which automatically match the characteristics of the node being installed to an installation method.

73
Solaris Live Upgrade
  • Solaris Live Upgrade promotes greater availability by:
  • providing a mechanism to upgrade and manage multiple on-disk instances of the Solaris Operating Environment,
  • allowing operating system upgrades to take place while the system continues to operate.
  • It can be used for patch testing and roll-out, and can also provide a safe fall-back environment to quickly recover from upgrade problems or failures.