Platform Computing: Strategy for Production Grid Computing PowerPoint PPT Presentation

Title: Platform Computing: Strategy for Production Grid Computing


1
Platform Computing Strategy for Production
Grid Computing
Dr. Fubo Zhang zfb_at_platform.com
2
Agenda
  • Grid evolution and technologies
  • Platform's Grid Strategy & Solutions
  • Production user requirements & experience
  • Summary

3
The Fifth Wave
  • 70s: Mainframe (Operational Productivity)
  • 80s: PC (Personal Productivity)
  • 90s: Client/Server (Departmental Productivity)
  • Late 90s: Internet (Channel Productivity)
Distributed computing aggregates resources to
provide always-on, unlimited compute power:
virtual, adaptable, open, on-demand
4
Distributed Computing
[Figure: timeline with axis marks 1985, 1992, 1996, 2000, 2005]
  • Stages, earliest first: DC Research; Batch; Distributed Batch Queuing
    (NQS, DQS, Condor, LSF); JobScheduler, Parallel, Analyzer, MultiCluster;
    Grid Research; Internet Grid Computing; Computing Ubiquity
  • System Arch Trend, earliest first: 2 Vaxen & Ethernet; UNIX workstations &
    supercomputers; SMPs & UNIX workstations; Linux & Windows farms with
    commodity chips
5
Grid: Its Three Stages
Grid: transparent, secure and coordinated
computing resource sharing across sites
(a cluster of clusters)
Scope of Sharing
  • intra-Grid: inter-departmental sharing within
    organizations (TI, Toshiba, GM, Monsanto, ...)
  • extra-Grid: across multiple organizations
    (DoD HPC, NASA IPG, Data Grid, ...)
  • Inter-Grid: supported by xSPs
[Figure: timeline with axis marks 1998, 2001, 2004, 2007]
6
LSF MultiCluster: intra- & extra-Grid
  • Cluster-to-cluster sharing & management
  • Integratable with external Grid services, e.g.,
    Kerberos authentication
  • Reliable file transfer & staging
  • User account mapping, SSL, firewalls

[Diagram: workstations, compute servers and file servers grouped into
clusters, with I.S.C. between clusters]
I.S.C.: Inter-cluster Sharing Conditions (e.g.,
time windows, types of jobs, job volume)
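The three example conditions named on this slide (time windows, types of jobs, job volume) can be pictured as a simple admission check. The sketch below is only an illustration under stated assumptions: the class `SharingConditions`, its fields and the sample values are hypothetical and are not LSF MultiCluster configuration syntax.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class SharingConditions:
    """Hypothetical model of one cluster's I.S.C. (not LSF syntax)."""
    window_start: time        # remote jobs accepted from this time of day
    window_end: time          # ... until this time of day
    allowed_job_types: set    # e.g. {"batch", "parallel"}
    max_remote_jobs: int      # cap on concurrently accepted remote jobs

    def accepts(self, job_type: str, now: time, remote_jobs_running: int) -> bool:
        # The window may wrap past midnight (e.g. 20:00 to 06:00).
        if self.window_start <= self.window_end:
            in_window = self.window_start <= now < self.window_end
        else:
            in_window = now >= self.window_start or now < self.window_end
        return (
            in_window
            and job_type in self.allowed_job_types
            and remote_jobs_running < self.max_remote_jobs
        )


# Assumed example: accept up to 50 remote batch jobs overnight only.
night_batch = SharingConditions(time(20, 0), time(6, 0), {"batch"}, max_remote_jobs=50)
print(night_batch.accepts("batch", time(23, 30), remote_jobs_running=10))
print(night_batch.accepts("interactive", time(23, 30), remote_jobs_running=10))
print(night_batch.accepts("batch", time(12, 0), remote_jobs_running=10))
```

The first call falls inside the window with an allowed job type; the second fails on job type and the third on the time window.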
7
Extra- and Inter-Grid
[Diagram components: Clients, GridPortal, Admin, GridManager,
Resource Directory, Clusters, SubCluster, Desktops;
Internet & Enterprise Data Centers]
8
Internet Computing Grid
Internet Data Centers operated by xSPs
Interconnected IDCs
9
3 Levels of Grid Perspectives
  • User Perspective
      • Totally transparent single system environment
      • Service capacity on demand
  • Application Perspective
      • Transparent to existing apps: no change needed;
        some new apps can be built using Grid APIs
  • System Perspective
      • Dynamic grid of autonomous clusters with sharing
        agreements and common protocols
      • Combination of cluster-to-cluster and global
        management
      • Open with levels of coherency and cooperation

10
Grid Software Environment
Applications
Application Services
Core Services
DRM
Servers, Networks & Node OSes
  • Core Services: Grid environmental services
  • Distributed Resource Management (DRM): management
    of work and computing resources
  • Application Services: support those applications
    programmed for Grid

11
Grid Software Functions
12
Evolution of Grid
  • From Technology to Solution
      • Integrate pieces into packages
  • From Special-purpose to General-purpose
      • Projects to gain experience; over time,
        general-purpose tech always replaces
        special-purpose
  • From Toolkit to End-user Products
      • Install & config, but not program, each grid
  • From HPC/Research to Industrial Applications
      • Transparent collaboration and resource sharing
      • Individual vs. organization trust
  • From Research Products to Industry Standards
      • Positive experience leads to broad adoption

13
Platform's Grid Strategy
  • Focus on a complete DRM solution: be the aggregator
    & solution provider for Grid
  • Focus on production users and evolve from
    extra-Grid to Inter-Grid
  • Support all app types: interactive, batch,
    parallel, sessions, transactions, multi-site
    distributed, ...
  • Transparently expand Cluster functions to Grid
  • Hybrid Global & Cluster-to-Cluster architecture
  • Provide a strong and dynamic system for thin
    clients
  • Partner to go to market for total Grid systems
    offerings
  • Stay open: interoperate with other Grid software
    like PBS and Globus-based systems
  • Drive Grid standards through NPI to open up the
    market

14
Management of Distributed Computing
Visibility
Performance Management
Workload Management
Resource Management
User Demand
Resource Supply
15
Resource Management
  • Config, admin & monitoring
  • Ensure supply of critical apps and services
  • Event automation & self healing
  • Automate routine operations
  • Security management

16
Performance Management
17
Grid Resource Directory
[Diagram: Site A and Site B exchange messages with the Resource Directory]
  • Query: "Who has Linux boxes and Synopsys licenses?"
  • Registration: "I join this grid. I export 128 Linux boxes
    and 48 Synopsys licenses."
  • Answer: "Site B."
Supports the Grid Manager's resource selection and
matching based on resource requirements
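The export-and-query exchange on this slide can be sketched as a small in-memory directory. This is a minimal illustration, not Platform's implementation: `ResourceDirectory`, `export` and `find` are hypothetical names, and Site A's export of 16 Linux boxes is an assumed value added only so that the query has something to filter out.

```python
from collections import defaultdict


class ResourceDirectory:
    """Hypothetical in-memory directory; names are illustrative only."""

    def __init__(self):
        # site name -> {resource name -> exported count}
        self._exports = defaultdict(dict)

    def export(self, site, **resources):
        """A site joins the grid and advertises what it is willing to share."""
        self._exports[site].update(resources)

    def find(self, **requirements):
        """Return sites exporting at least the requested count of every resource."""
        return sorted(
            site
            for site, offered in self._exports.items()
            if all(offered.get(name, 0) >= count for name, count in requirements.items())
        )


directory = ResourceDirectory()
directory.export("Site A", linux_boxes=16)                         # assumed value
directory.export("Site B", linux_boxes=128, synopsys_licenses=48)  # from the slide

# "Who has Linux boxes and Synopsys licenses?"
print(directory.find(linux_boxes=1, synopsys_licenses=1))  # ['Site B']
```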
18
Ensuring Site Autonomy
  • Resources exported by the local cluster
    management
  • Grid Manager enforces cross-cluster sharing
    policies and flow restrictions upon sites'
    (voluntary) participation
  • Submission cluster forwards job to execution
    cluster, directed by Grid Manager
  • Execution cluster accepts remote jobs just like
    local jobs, and it has full control of remote jobs

19
Advance Reservation
  • Advance Reservations allow resources such as job
    slots and special devices to be booked in
    advance, guaranteeing access to those resources
    at the specified time
  • Other jobs are backfilled around the reservation
  • A reservation may be on behalf of a user, a user
    group or a project
  • Advance Reservations may be possible on all
    reusable resources such as software licenses,
    allowing guaranteed access to booked software
    licenses
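The relationship between a reservation and backfilled jobs can be sketched as a simple admission test: a job may start now only if it cannot take slots away from a booking. The sketch is deliberately conservative and rests on assumptions that are not in the slides: times are minutes from now, currently busy slots are assumed to stay busy for the whole candidate runtime, and `Reservation` and `can_backfill` are hypothetical names rather than an LSF interface.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    owner: str   # a user, a user group or a project
    slots: int   # job slots booked
    start: int   # booking start, in minutes from now
    end: int     # booking end, in minutes from now


def can_backfill(total_slots, busy_slots, reservations, job_slots, job_runtime):
    """Return True if a job started now cannot delay any reservation.

    Assumption: slots busy now remain busy for the job's whole runtime.
    """
    job_end = job_runtime
    # Slots booked at any point while the candidate job would be running.
    reserved = sum(r.slots for r in reservations if r.start < job_end and r.end > 0)
    return busy_slots + reserved + job_slots <= total_slots


booking = [Reservation(owner="projectX", slots=4, start=60, end=120)]

# Short job finishes before the booking starts, so it fits around it.
print(can_backfill(8, 2, booking, job_slots=4, job_runtime=30))
# Long job would still hold 4 slots when the booking begins.
print(can_backfill(8, 2, booking, job_slots=4, job_runtime=90))
# A narrower long job leaves the 4 booked slots untouched.
print(can_backfill(8, 2, booking, job_slots=2, job_runtime=90))
```

The same check applies to any reusable resource counted in units, such as software licenses, by treating each license as a slot.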

20
Grid Resource Leasing
[Diagram: Licenses at Site A and Licenses at Site B]
21
Grid Fairshare Scheduling
  • Unique and Grid-wide resources can be dynamically
    fairshared across sites (e.g., software licenses,
    devices)
  • Resources scattered across sites can be
    aggregated and fairshared
  • Owner-guest bias can be supported
  • Flexible fairshare policies can be specified and
    enforced across Grid
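One way to picture fairsharing a Grid-wide resource with an owner-guest bias is a priority that falls as a site's current usage rises and is scaled up for the owning site. The formula below, share / (1 + usage) with an optional owner multiplier, is an assumption chosen for illustration and is not the fairshare formula of any Platform product; `next_recipient` and its parameters are hypothetical.

```python
def next_recipient(shares, usage, owner=None, owner_bias=1.0):
    """Pick the site that should receive the next unit of a shared resource.

    priority = share / (1 + usage); the owning site's priority is
    multiplied by owner_bias (values above 1 favour the owner over guests).
    Ties go to the alphabetically first site.
    """
    def priority(site):
        p = shares[site] / (1.0 + usage.get(site, 0))
        return p * owner_bias if site == owner else p

    return max(sorted(shares), key=priority)


shares = {"Site A": 1, "Site B": 1}    # equal entitlement (assumed)
in_use = {"Site A": 3, "Site B": 1}    # licenses currently held (assumed)

# Without bias, the site that has used less is served next.
print(next_recipient(shares, in_use))
# With a 3x owner bias, Site A is served first despite its heavier usage.
print(next_recipient(shares, in_use, owner="Site A", owner_bias=3.0))
```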

22
Grid Remote batch
[Diagram: Division A and Division B, each with Clients and Licenses,
connected by a Send Queue and a Receive Queue]
23
Production User Requirements
  • Full products, not middleware or toolkit
  • Configure, but not program, the Grid: no developers
  • Grid among organizations, not individuals
  • Resource management & sharing policies are key
  • Not user accounts everywhere
  • Existing apps more important than new ones
  • Share resources to get more done, not just
    capability
  • Transparent across Grid: single system
    environment
  • At most specify resource requirements
  • Thin client support
  • All services by the collective Grid resources
  • Separation of apps from Grid infrastructure
  • Open, standards, choices

24
Grid at DoD HPCMO
  • Initiative to share HPCMP's resources easily &
    transparently: SMDC, TACOM, NRL, NAVO and
    WSMR, ...
  • Build a meta-queuing system to integrate the
    centers
  • Primary benefit: the capability to submit a job
    to a single, common queue, from which it is sent to
    the first available computer in the queuing pool.
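The "single, common queue" idea can be sketched as a FIFO queue that hands each waiting job to the first center with an idle slot. This is a toy model under stated assumptions: `MetaQueue` and its methods are hypothetical, the center names are taken from the slide, and the slot counts are invented for the example.

```python
from collections import deque


class MetaQueue:
    """Hypothetical single common queue in front of several centers."""

    def __init__(self, free_slots):
        self.free_slots = dict(free_slots)  # center -> idle job slots
        self.pending = deque()              # waiting jobs, oldest first
        self.placements = []                # (job, center) in dispatch order

    def submit(self, job):
        self.pending.append(job)
        self.dispatch()

    def release(self, center):
        """A job finished at `center`; its slot becomes available again."""
        self.free_slots[center] += 1
        self.dispatch()

    def dispatch(self):
        # Send waiting jobs, oldest first, to the first center with an idle slot.
        while self.pending:
            center = next((c for c, n in self.free_slots.items() if n > 0), None)
            if center is None:
                return  # every center is busy; jobs stay queued
            self.free_slots[center] -= 1
            self.placements.append((self.pending.popleft(), center))


queue = MetaQueue({"NRL": 1, "NAVO": 1})  # one idle slot each (assumed)
for job in ["job1", "job2", "job3"]:
    queue.submit(job)
queue.release("NAVO")                     # a slot frees up; job3 is dispatched
print(queue.placements)
```

Users see only `submit`; which center runs the job is decided by availability at dispatch time.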

25
DoD Requirements for the Grid
Requirements
  • Transparent sharing of jobs
  • Resource reservation protocol
  • Transparent job control
  • Accounting
  • Full Kerberos 5 support
  • Reliable, secure file transfer
Solutions
  • Full Kerberos 5 support
      • All client-server, server-server interactions
        Kerberized
      • Ticket forwarding/renewal
      • Multi-realm support
      • Account mapping
  • Reliable, secure file transfer
      • Platform FTA
      • Kerberized
      • Fault tolerant
      • Fire and forget


Fully Operational Grid Computing
26
DoD HPCMP
Building a Production Grid
27
Grid at General Motors
Sites: NAENG (Warren, MI); TPC (Pontiac, MI);
MLCGF (Flint, MI); NA HPC (Warren, MI)
  • 3 independent product divisions, 8 clusters
  • Submit work to central Data Center using
    MultiCluster
  • Provide transparent access to HPC Center for jobs
    that cannot / should not run locally (e.g.,
    structure, crash, some CFD)
  • Share resources between local cluster and HPC
    Center
  • Control sharing of HPC Center by 8 clusters
28
Grid at Pharmacia
Situation: requirement to share resources
across newly acquired centers
Solution: two clusters totaling 600 servers and
workstations
  • Workload balancing by moving work to an
    underutilized cluster
  • Extra capacity is transparently available
29
Compaq and Platform
  • Partners since 1993: co-engineering, joint
    marketing, OEM, reselling, NPI
  • LSF Batch & Parallel OEMed with SC systems
  • Planning & partnering on Linux systems products:
    opportunity for end-to-end solutions from PCs to
    supercomputers
  • Joint customers start to share across sites:
    partnership opportunity to build on existing
    successes to deliver Grid solutions
  • Both companies committed to whole solution:
    easier for adoption and strong market position
  • Partnership between the best experts for the best
    solution

30
Grid Value Chain
  • Technology/toolkit developers
      • How to do it: develop key pieces
  • Product vendors
      • Integrate pieces into packages & support
  • Solution developers
      • Install & config, maybe program apps but not Grid
  • Service providers
      • Support Grid operations
  • End users
      • Just want the solution

Partnership is key to the success of Grid
Computing
31
About Platform
  • Founded 1992 by Berkeley PhD engineers
  • 400 employees and more than 100 developers
  • 1500 customers globally
  • Key partnerships: IBM, Compaq, Sun, SGI, etc.
  • Worldwide company, local support and consultants
  • 30 developers and support engineers in Beijing

32
Summary
  • DC is mainstream; Grid is emerging
  • R&D so far helps define the standards and
    architecture, and shows feasibility
  • Need to address both research and enterprise
    requirements
  • Need both open source and commercial products
  • Partnerships and standards are key
  • Platform takes a strong interest in Grid

33
Platform Computing
www.Platform.Com
zfb_at_platform.com
  • Thank You! Questions?