Title: Globus Virtual Workspaces


1
Globus Virtual Workspaces
  • HEPiX Fall 2007, St Louis
  • Kate Keahey
  • Argonne National Laboratory
  • University of Chicago
  • keahey@mcs.anl.gov

2
Why Virtual Workspaces?
  • Quality of Service
  • What we get: batch-style provisioning
  • One size fits all
  • A side-effect of job scheduling
  • What we need: advance reservations, urgent
    computing, periodic, best-effort, and other models
  • Separation of job scheduling and resource
    management
  • E.g., workflow-based apps and batch apps have
    different needs
  • Quality of Life
  • "I have 100 nodes I cannot use"
  • Complex applications
  • Hard to install
  • Require validation
  • Separation of environment preparation and
    resource leasing

3
What are Virtual Workspaces?
  • A dynamically provisioned environment
  • Environment definition: we get exactly the
    (software) environment we need, on demand.
  • Resource allocation: provision the resources the
    workspace needs (CPUs, memory, disk, bandwidth,
    availability), allowing for dynamic renegotiation
    to reflect changing requirements and conditions.
  • Implementation
  • Traditional means: publishing, automated
    configuration, coarse-grained enforcement
  • Virtual machines: encapsulated configuration and
    fine-grained enforcement

Paper: "Virtual Workspaces: Achieving Quality of
Service and Quality of Life in the Grid"
4
Virtual Machines (Xen)
  • Open source
  • Paravirtualization
  • The Good: high performance
  • The Bad: difficult to run proprietary OSs, and to
    mix 32-bit and 64-bit kernels (VT needed)
  • Xen terminology
  • Domain0 (the host)
  • DomainU (user domain, the guest)

5
Deploying Workspaces Remotely
[Diagram: the VWS Service deploying workspaces onto a pool of nodes]
  • Workspace
  • Workspace metadata
  • Pointer to the image
  • Logistics information
  • Deployment request
  • CPU, memory, node count, etc.
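A deployment request pairs workspace metadata (a pointer to the image plus logistics information) with a resource allocation. A minimal sketch of that pairing, assuming hypothetical field names (the real service exchanges XML metadata and deployment-request documents):

```python
from dataclasses import dataclass

# Illustrative sketch only: the field names below are hypothetical,
# not the actual workspace service schema.

@dataclass
class WorkspaceMetadata:
    image_url: str   # pointer to the VM image
    logistics: dict  # e.g. networking/logistics information

@dataclass
class DeploymentRequest:
    metadata: WorkspaceMetadata
    cpus: int = 1
    memory_mb: int = 1024
    node_count: int = 1
    duration_s: int = 3600  # requested lease duration

    def validate(self) -> bool:
        """Basic sanity checks before submitting the request."""
        return self.cpus > 0 and self.memory_mb > 0 and self.node_count > 0

req = DeploymentRequest(
    metadata=WorkspaceMetadata(image_url="http://example.org/images/worker.img",
                               logistics={"network": "private"}),
    cpus=2, memory_mb=2048, node_count=4)
print(req.validate())  # → True
```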

6
Interacting with Workspaces
[Diagram: the VWS Service and pool nodes, together forming the Trusted Computing Base (TCB)]
The workspace service publishes information on
each workspace as standard WSRF
Resource Properties.
Users can query those properties to find
out information about their workspace (e.g., what
IP address the workspace was bound to).
Users can interact directly with their workspaces
the same way they would with a physical machine.
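The IP-binding query above can be illustrated by parsing a property document. The element names and document shape here are invented for illustration; the actual WSRF responses are service-specific SOAP documents:

```python
import xml.etree.ElementTree as ET

# Hypothetical response shape, standing in for a real WSRF
# GetResourceProperty response.
response = """
<GetResourcePropertyResponse>
  <workspace>
    <state>Running</state>
    <ip>192.168.1.42</ip>
  </workspace>
</GetResourcePropertyResponse>
"""

root = ET.fromstring(response)
state = root.findtext("./workspace/state")
ip = root.findtext("./workspace/ip")
print(state, ip)  # → Running 192.168.1.42
```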
7
Workspace Service Components
[Diagram: the VWS Service front-end, workspace back-end, and pool nodes forming the Trusted Computing Base (TCB)]
Workspace service: WSRF front-end that allows
clients to deploy and manage virtual workspaces.
Workspace back-end: resource manager for a pool of
physical nodes; deploys and manages workspaces on
the nodes.
Each node must have a VMM (Xen) installed, as
well as the workspace control program that
manages individual nodes.
Contextualization creates a common context for a
virtual cluster.
8
Workspace Service Components
  • GT4 WSRF front-end
  • Leverages GT core and services: notifications,
    security, etc.
  • Follows the OGF WS-Agreement provisioning model
  • Publishes available lease terms
  • Provides lease descriptions
  • Workspace Resource Manager (back-end)
  • Currently focused on Xen
  • Works with multiple resource managers
  • Workspace Control
  • Contextualization
  • Puts the virtual appliance in its deployment
    context
  • Current release: 1.3, available at
  • http://workspace.globus.org

9
Workspace Resource Managers
  • Default resource manager (basic slot fitting)
  • Commercial datacenter technology would also fit
  • Amazon Elastic Compute Cloud (EC2)
  • EC2: selling cycles as Xen VMs
  • Software similar to the Workspace Service
  • No virtual clusters, contextualization,
    fine-grained allocations, etc.
  • Grid credential admission; EC2 charging model
  • STAR 100-node VM run
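Basic slot fitting, as the default resource manager is described above, can be sketched as first-fit placement over the pool's free memory. The data shapes here are assumptions for illustration, not the actual implementation:

```python
def first_fit(nodes, request):
    """Assign each requested VM slot to the first node with enough free
    memory; returns a node -> VM-count placement, or None if the request
    cannot fit. `nodes` maps node name -> free memory (MB)."""
    free = dict(nodes)  # work on a copy so the pool state is untouched
    placement = {}
    for _ in range(request["count"]):
        for name, mem in free.items():
            if mem >= request["memory_mb"]:
                free[name] -= request["memory_mb"]
                placement[name] = placement.get(name, 0) + 1
                break
        else:
            return None  # no node has room for this slot
    return placement

pool = {"node1": 2048, "node2": 4096}
print(first_fit(pool, {"count": 3, "memory_mb": 1024}))
# → {'node1': 2, 'node2': 1}
```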

10
Virtual Workspaces for STAR
  • STAR image configuration
  • A virtual cluster composed of an OSG headnode and
    STAR worker nodes
  • Using the workspace service over EC2 to provision
    resources
  • Allocations of up to 100 nodes
  • Dynamically contextualized for an out-of-the-box
    cluster

11
Workspace Resource Managers
  • Default resource manager (basic slot fitting)
  • Commercial datacenter technology would also fit
  • Amazon Elastic Compute Cloud (EC2)
  • EC2: selling cycles as Xen VMs
  • Software similar to the Workspace Service
  • No virtual clusters, contextualization,
    fine-grained allocations, etc.
  • Grid credential admission; EC2 charging model
  • STAR 100-node VM run
  • Workspace Pilot
  • Integrating VMs into current provisioning models
  • Long-term solutions
  • Interleaving soft and hard leases
  • Providing better-articulated leasing models
  • Developed in the context of existing schedulers

12
Providing Resources: The Workspace Pilot
  • Challenge: find the simplest way to integrate VMs
    into current provisioning models
  • Glide-ins (Condor): "poor man's" resource leasing
  • Best-effort semantics: submit a job pilot that
    claims resources but does not run a job
  • The Workspace Pilot
  • Resources booted to dom0
  • Pilot adjusts memory
  • VWS leases slots to VMs
  • Kill-all facility
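The pilot lifecycle described above (claim the slot, shrink dom0 so guests get the memory, tear everything down via kill-all) can be caricatured in a few lines. The class and method names are invented; real memory adjustment happens through Xen, not Python:

```python
# Conceptual sketch of the workspace pilot, not the real pilot's API.

class WorkspacePilot:
    def __init__(self, node_memory_mb, dom0_min_mb=512):
        self.node_memory_mb = node_memory_mb
        self.dom0_min_mb = dom0_min_mb  # memory kept for the host domain
        self.leased_mb = 0

    def claim(self):
        """The pilot runs as a 'job' that claims the slot: it shrinks
        dom0 so guest VMs can use the remaining memory."""
        self.leased_mb = self.node_memory_mb - self.dom0_min_mb
        return self.leased_mb

    def kill_all(self):
        """Kill-all facility: tear down guests, return memory to dom0."""
        self.leased_mb = 0

pilot = WorkspacePilot(node_memory_mb=4096)
print(pilot.claim())    # → 3584
pilot.kill_all()
print(pilot.leased_mb)  # → 0
```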

13
Workspace Control
  • VM control
  • Starting, stopping, etc.
  • To be replaced by the Xen API
  • Integrating into the network
  • Assigning MAC addresses and IP addresses
  • DHCP delivery tool
  • Building up a trusted networking layer
  • VM image propagation
  • Image management and reconstruction
  • Creating blank partitions
  • Talks to the workspace service via ssh
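The networking step above (assign a MAC address and deliver the matching IP via DHCP) might look roughly like this sketch. The helper names are hypothetical, though the dhcpd host-stanza syntax itself is standard:

```python
import random

def make_mac(seed=None):
    """Generate a locally administered MAC address; the 0x02 first octet
    sets the locally-administered bit, avoiding real vendor OUIs."""
    rnd = random.Random(seed)
    octets = [0x02, 0x00, 0x00] + [rnd.randrange(256) for _ in range(3)]
    return ":".join(f"{o:02x}" for o in octets)

def dhcp_host_entry(name, mac, ip):
    """Render an ISC dhcpd host stanza pinning this MAC to a fixed IP."""
    return (f"host {name} {{\n"
            f"  hardware ethernet {mac};\n"
            f"  fixed-address {ip};\n"
            f"}}")

mac = make_mac(seed=42)
print(dhcp_host_entry("workspace-1", mac, "192.168.1.42"))
```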

14
Security Issues
  • Secure admission of appliances/workspaces
  • The appliance vendor configures the appliance,
    asserts its properties, and signs the assertions
    into the appliance
  • Security and other updates, configuration and
    versioning assertions, disallowing offsite root
    access, etc.
  • The appliance deployer validates the signature
    and matches the assertions to policies
  • SC05 poster: "Making your workspace secure:
    establishing trust with VMs in the Grid"
  • Secure networking
  • Controlling spoofing
  • Isolating networks between different VM groups
  • Traffic monitoring
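The admission flow above (validate the vendor's signature over the assertions, then match them against site policy) can be sketched as follows. A real deployer would verify a public-key signature; HMAC stands in here only to keep the example self-contained:

```python
import hmac, hashlib, json

def sign(assertions, key):
    """Vendor side: sign the asserted properties (HMAC as a stand-in)."""
    payload = json.dumps(assertions, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def admit(assertions, signature, key, policy):
    """Deployer side: verify the signature, then check every policy
    requirement against the vendor's assertions."""
    payload = json.dumps(assertions, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False  # signature invalid: reject the appliance
    return all(assertions.get(k) == v for k, v in policy.items())

key = b"shared-secret"
asserted = {"patched": True, "offsite_root": False}
sig = sign(asserted, key)
print(admit(asserted, sig, key, {"offsite_root": False}))  # → True
```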

15
So -- you've deployed some VMs. Now what?
  • Do they have public IP addresses?
  • Do they actually represent something useful?
  • "I need an OSG cluster"
  • How do the VMs find out about each other?
  • Can they share storage?
  • Do they have host certificates?
  • And a gridmap file?
  • And all the other things that will integrate them
    into my VO?

16
Virtual Clusters
  • Challenge: what is a virtual cluster?
  • A more complex virtual machine
  • Networking, shared storage, etc., that will be
    portable across sites and implementations
  • Available at the same time and sharing a common
    context
  • Example
  • A set of worker nodes with some Edge Services in
    front and NFS-based shared storage
  • Solution: management of ensembles and sharing
  • Ensemble deployment, EPR management
  • Flexible, configurable cluster deployment
  • Networking
  • Edge Services have public IPs
  • Worker nodes are on a private network shared with
    the Edge Services
  • Exporting and sharing a common context
  • Configuring and joining the context

Paper: "Virtual Clusters for Grid Communities",
CCGrid 2006
17
Contextualization
  • Challenge: putting a VM in the deployment context
    of the Grid, the site, and other VMs
  • Assigning and sharing IP addresses, name
    resolution, application-level configuration, etc.
  • Solution: management of a common context
  • Configuration-dependent
  • provides/requires
  • Common understanding between the image vendor
    and the deployer
  • Mechanisms for securely delivering the required
    information to images across different
    implementations

[Diagram: a contextualization agent exchanging IP, hostname, and public key with the Common Context]
Paper: "A Scalable Approach To Deploying And
Managing Appliances", TeraGrid conference 2007
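The provides/requires model above can be sketched as a broker that pools what each appliance provides and resolves what each requires from the common context. All names here are illustrative, not the actual contextualization protocol:

```python
# Hypothetical context-broker sketch for the provides/requires model.

def build_context(appliances):
    """appliances: name -> {"provides": {...}, "requires": [...]}.
    Returns, per appliance, the resolved values it required."""
    common = {}  # the common context: everything anyone provides
    for name, spec in appliances.items():
        common.update(spec.get("provides", {}))
    resolved = {}
    for name, spec in appliances.items():
        missing = [r for r in spec.get("requires", []) if r not in common]
        if missing:
            raise ValueError(f"{name} missing {missing}")
        resolved[name] = {r: common[r] for r in spec.get("requires", [])}
    return resolved

cluster = {
    "headnode": {"provides": {"nfs_server": "10.0.0.1"}, "requires": []},
    "worker":   {"provides": {}, "requires": ["nfs_server"]},
}
print(build_context(cluster))
# → {'headnode': {}, 'worker': {'nfs_server': '10.0.0.1'}}
```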
18
Where Do VM Images Come From?
  • Appliance providers
  • Appliance providers configure, manage, and attest
    images
  • Contextualization: collaboration between
    appliance vendors and appliance deployers
  • Appliance providers
  • rPath
  • Recipe-style configuration (create a project,
    choose packages, cook, build the software
    appliance)
  • Freely available online, many appliances
  • http://www.rpath.com/rbuilder/
  • Bcfg2
  • Incrementally constructed configuration profiles
  • Configuration analysis capabilities
  • http://trac.mcs.anl.gov/projects/bcfg2

19
Image Management
  • Image partitions
  • Efficiency
  • Security
  • Flexibility
  • Partition management on deployment
  • Partition caching and generation
  • Partition sharing
  • Mounting
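Partition caching, as listed above, might be sketched as a content-addressed cache keyed on the image source, so repeated deployments of the same image skip re-propagation. The class and on-disk layout are assumptions for illustration:

```python
import hashlib, os, tempfile

class PartitionCache:
    """Illustrative partition cache keyed by the hash of the image URL."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def path_for(self, image_url):
        digest = hashlib.sha256(image_url.encode()).hexdigest()[:16]
        return os.path.join(self.cache_dir, digest + ".img")

    def fetch(self, image_url, download):
        """Return a cached copy, calling download(url, dest) on a miss."""
        dest = self.path_for(image_url)
        if not os.path.exists(dest):
            download(image_url, dest)
        return dest

calls = []
def fake_download(url, dest):
    calls.append(url)
    with open(dest, "wb") as f:
        f.write(b"image-bytes")

cache = PartitionCache(tempfile.mkdtemp())
a = cache.fetch("http://example.org/worker.img", fake_download)
b = cache.fetch("http://example.org/worker.img", fake_download)
print(a == b, len(calls))  # → True 1
```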

20
Workspace Ecosystem
21
Parting Thoughts
  • VMs are the raw materials from which a working
    system can be built
  • But we still have to build it!
  • Technical challenges: taking one step at a time
  • Social/procedural challenges
  • Division of labor
  • Resource providers
  • Appliance providers
  • Can we build trust between these two groups?
  • If you have a specific problem, give us a call
  • http://workspace.globus.org
  • In our copious spare time we also do research
  • Migration, fine-grained enforcement, resource
    management, load balancing, migration in time,
    lots of one-offs
  • VTDC07 (co-located with SC07)

22
Acknowledgements
  • Workspace team
  • Kate Keahey
  • Tim Freeman
  • Borja Sotomayor
  • Funding
  • NSF SDCI "Missing Links"
  • NSF CSR "Virtual Playgrounds"
  • DOE CEDPS Project
  • With thanks to many collaborators
  • Jerome Lauret (STAR, BNL), Doug Olson (STAR,
    LBNL), Marty Wesley (rPath), Stu Gott (rPath),
    Ken Van Dine (rPath), Predrag Buncic (Alice,
    CERN), Haavard Bjerke (CERN), Rick Bradshaw
    (Bcfg2, ANL), Narayan Desai (Bcfg2, ANL), Duncan
    Penfold-Brown (Atlas, UVic), Ian Gable (Atlas,
    UVic), David Grundy (Atlas, UVic), Ti Leggit
    (University of Chicago), Greg Cross (University
    of Chicago), Mike Papka (University of
    Chicago/ANL)

23
with thanks to Jerome Lauret and Doug Olson of
the STAR project
[Plot: running-job counts over time for the STAR run across the sites
VWS/EC2, BNL, WSU, Fermi, and PDSF; each site's count ramps down from
its peak (up to ~300 jobs) to 0 through job completion and file
recovery]
24
with thanks to Jerome Lauret and Doug Olson of
the STAR project
[Animation: accelerated display of workflow job states (Y: job number,
X: job state) across NERSC PDSF, EC2 (via the Workspace Service), and
WSU]