1
Slicing with SHARP
  • Jeff Chase
  • Duke University

2
Federated Resource Sharing
  • How do we safely share/exchange resources across
    domains?
  • Administrative, policy, security, trust
    domains or sites
  • Sharing arrangements form dynamic Virtual
    Organizations
  • Physical and logical resources (data sets,
    services, applications).

3
Location-Independent Services
[Figure: clients directed by request routing to a dynamic server set under varying load]
  • The last decade has yielded immense progress on
    building and deploying large-scale adaptive
    Internet services.
  • Dynamic replica placement and coordination
  • Reliable distributed systems and clusters
  • P2P, indirection, etc.

Example services: caching network or CDN,
replicated Web service, curated data, batch
computation, wide-area storage, file sharing.
4
Managing Services
[Figure: a service manager connected to the servers and clients by sensor/actuator feedback control]
  • A service manager adapts the service to changes
    in load and resource status (a toy feedback loop
    is sketched at the end of this slide).
  • e.g., gain or lose a server, instantiate
    replicas, etc.
  • The service manager itself may be decentralized.
  • The service may have contractual targets for
    service quality - Service Level Agreements or
    SLAs.
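The sensor/actuator loop above can be made concrete with a small sketch in Python; observe_load and the grow/shrink actions are hypothetical stand-ins, not a SHARP or COD interface.

    import random

    TARGET_UTILIZATION = 0.7          # desired per-server utilization

    def observe_load() -> float:
        """Stand-in sensor: offered load measured in 'server equivalents'."""
        return random.uniform(1.0, 8.0)

    def control_step(active_servers: int) -> int:
        """One actuator step: grow or shrink the server set toward the target."""
        desired = max(1, round(observe_load() / TARGET_UTILIZATION))
        if desired > active_servers:
            return active_servers + 1     # e.g., acquire one more server
        if desired < active_servers:
            return active_servers - 1     # e.g., release a server
        return active_servers

    servers = 2
    for _ in range(5):
        servers = control_step(servers)
        print("active servers:", servers)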

5
An Infrastructure Utility
Resource efficiency. Surge protection. Robustness.
Pay as you grow. Economy of scale. Geographic dispersion.
[Figure: services draw servers from a shared resource pool]
  • In a utility/grid, the service obtains resources
    from a shared pool.
  • Third-party hosting providers - a resource
    market.
  • Instantiate service wherever resources are
    available and demand exists.
  • Consensus: we need predictable performance and
    SLAs.

6
The Slice Abstraction
  • Resources are owned by sites.
  • E.g., node, cell, cluster
  • Sites are pools of raw resources.
  • E.g., CPU, memory, I/O, net
  • A slice is a partition or bundle or subset of
    resources.
  • The system hosts application services as
    guests, each running within a slice (a minimal
    data-model sketch appears at the end of this slide).

[Figure: services S1 and S2 hosted in slices spanning Sites A, B, and C]
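As a minimal sketch of this vocabulary (site, raw resources, slice, guest), with field names assumed for illustration; SHARP itself does not prescribe this model.

    from dataclasses import dataclass, field

    @dataclass
    class Site:
        name: str
        capacity: dict                      # raw resources, e.g. {"cpu": 64, "mem_gb": 256}
        allocated: dict = field(default_factory=dict)

    @dataclass
    class Slice:
        site: Site
        rset: dict                          # resource bundle bound to one guest service
        guest: str                          # name of the hosted service

    def carve_slice(site: Site, rset: dict, guest: str) -> Slice:
        """Carve a slice out of a site's raw resource pool, checking capacity."""
        for r, amount in rset.items():
            used = site.allocated.get(r, 0)
            if used + amount > site.capacity.get(r, 0):
                raise ValueError(f"site {site.name} lacks {r}")
            site.allocated[r] = used + amount
        return Slice(site, rset, guest)

    # Example: host Service S1 in a slice at Site A.
    a = Site("A", {"cpu": 64, "mem_gb": 256})
    s1 = carve_slice(a, {"cpu": 8, "mem_gb": 32}, "Service S1")
    print(s1.rset, "allocated at", s1.site.name)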
7
Slicing for SLAs
  • Performance of an application depends on the
    resources devoted to it [Muse01].
  • Slices act as containers with dedicated bundles
    of resources bound to the application.
  • Distributed virtual machine / computer
  • Service manager determines desired slice
    configuration to meet performance goals.
  • May instantiate multiple instances of a service,
    e.g., in slices sized for different user
    communities.
  • Services may support some SLA management
    internally, if necessary.

8
Example: Cluster Batch Pools in Cluster-on-Demand
(COD)
  • Partition a cluster into isolated virtual
    clusters.
  • Virtual cluster owner has exclusive control over
    servers.
  • Assign nodes to virtual clusters according to
    load, contracts, and resource usage policies.
  • Example service: a simple wrapper for SGE batch
    scheduler middleware to assess load and
    obtain/release nodes (a toy wrapper is sketched
    at the end of this slide).

[Figure: service managers (e.g., SGE) send requests to the COD cluster, which grants nodes back]
"Dynamic Virtual Clusters in a Grid Site Manager,"
HPDC 2003, with Laura Grit, David Irwin, Justin
Moore, and Sara Sprenkle.
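A toy version of such a wrapper, with queue_depth standing in for parsing SGE's qstat output and print statements standing in for real COD requests; all names here are hypothetical.

    def queue_depth() -> int:
        """Stand-in for parsing SGE's qstat output to count pending jobs."""
        return 12

    def manage_pool(current_nodes: int, jobs_per_node: int = 4) -> int:
        """One polling step: size the virtual cluster to the backlog."""
        pending = queue_depth()
        wanted = max(1, -(-pending // jobs_per_node))   # ceiling division
        if wanted > current_nodes:
            print(f"request {wanted - current_nodes} more node(s) from COD")
        elif wanted < current_nodes:
            print(f"release {current_nodes - wanted} idle node(s) to COD")
        return wanted

    manage_pool(current_nodes=2)    # -> "request 1 more node(s) from COD"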
9
A Note on Virtualization
  • Ideology for the future Grid: adapt the grid
    environment to the service rather than the
    service to the grid.
  • Enable user control over application/OS
    environment.
  • Instantiate complete environment down to the
    metal.
  • Don't hide the OS; it's just another replaceable
    component.
  • Requires/leverages new underware for
    instantiation.
  • Virtual machines (Xen, Collective, JVM, etc.)
  • Net-booted physical machines (Oceano, UDC, COD)
  • Innovate below the OS and alongside it
    (infrastructure services for the control plane).

10
SHARP
  • Secure Highly Available Resource Peering
  • Interfaces and services for external control of
    federated utility sites (e.g., clusters).
  • A common substrate for extensible
    policies/structures for resource management and
    distributed trust management.
  • Flexible on-demand computing for a site, and
    flexible peering for federations of sites
  • From PlanetLab to the New Grid
  • Use it to build a resource dictatorship or a
    barter economy, or anything in between.
  • Different policies/structures may coexist in
    different partitions of a shared resource pool.

11
Goals
  • The question addressed by SHARP: how do the
    service managers get their slices?
  • How does the system implement and enforce
    policies for allocating slices?
  • Predictable performance under changing
    conditions.
  • Establish priorities under local or global
    constraint.
  • Preserve resource availability across failures.
  • Enable and control resource usage across
    boundaries of trust or ownership (peering).
  • Balance global coordination with local control.
  • Extensible, pluggable, dynamic, decentralized.

12
Non-goals
  • SHARP does NOT
  • define a policy or style of resource exchange
  • E.g., barter, purchase, or central control (e.g.,
    PLC)
  • care how services are named or instantiated
  • understand the SLA requirements or specific
    resource needs of any application service
  • define an ontology to describe resources
  • specify mechanisms for resource control/policing.

13
Resource Leases
[Figure: the S1 service manager requests resources from the Site A authority, which grants a signed lease]

<lease>
  <issuer> A's public key </issuer>
  <signed_part>
    <holder> S1's public key </holder>
    <rset> resource description </rset>
    <start_time> </start_time>
    <end_time> </end_time>
    <sn> unique ID at Site A </sn>
  </signed_part>
  <signature> A's signature </signature>
</lease>
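A minimal sketch of the lease mechanics: the authority signs the signed_part, and the holder (or a third party) can check that signature later. Real SHARP leases are signed XML with public-key signatures; this self-contained example substitutes an HMAC over canonical JSON as the "signature", and all names are illustrative.

    import hashlib, hmac, json, time

    SITE_A_KEY = b"site-A-secret-key"   # stand-in for Site A's signing key

    def issue_lease(holder_key: str, rset: dict, start: float, end: float, sn: int) -> dict:
        signed_part = {"holder": holder_key, "rset": rset,
                       "start_time": start, "end_time": end, "sn": sn}
        blob = json.dumps(signed_part, sort_keys=True).encode()
        return {"issuer": "Site A public key", "signed_part": signed_part,
                "signature": hmac.new(SITE_A_KEY, blob, hashlib.sha256).hexdigest()}

    def verify_lease(lease: dict) -> bool:
        blob = json.dumps(lease["signed_part"], sort_keys=True).encode()
        expect = hmac.new(SITE_A_KEY, blob, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expect, lease["signature"])

    now = time.time()
    lease = issue_lease("S1 public key", {"type": 7, "count": 42}, now, now + 7200, sn=1)
    assert verify_lease(lease)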
14
Agents (Brokers)
[Figure: the agent sits between the S1 service manager and the Site A authority; requests flow through the agent and grants flow back]
  • Introduce agent as intermediary/middleman.
  • Factor policy out of the site authority.
  • Site delegates control over its resources to the
    agent.
  • Agent implements a provisioning/allocation policy
    for the resources under its control.

15
Leases vs. Tickets
  • The site authority retains ultimate control over
    its resources: only the authority can issue
    leases.
  • Leases are hard contracts for concrete
    resources.
  • Agents deal in tickets.
  • Tickets are soft contracts for abstract
    resources.
  • E.g., "You have a claim on 42 units of resource
    type 7 at 3PM for two hours... maybe." (Tickets
    are also signed XML.)
  • Tickets may be oversubscribed as a policy to
    improve resource availability and/or resource
    utilization.
  • The subscription degree gives configurable
    assurance, spanning a continuum from a hint to a
    hard reservation (see the sketch below).
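A sketch of how a subscription degree might be applied as an issuing check; the knob and its name are illustrative rather than a SHARP-defined interface.

    def can_issue(outstanding_units: int, requested: int,
                  capacity: int, degree: float) -> bool:
        """Allow new tickets while total promises stay within (1 + degree) * capacity."""
        return outstanding_units + requested <= (1 + degree) * capacity

    print(can_issue(40, 10, 40, degree=0.0))   # False: hard reservation
    print(can_issue(40, 10, 40, degree=0.5))   # True: accept some conflict risk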

16
Service Instantiation
[Figure: service instantiation flow among the Service Manager, Agent, and Site: (1) request, (2) request, (3) grant ticket, (4) grant ticket, (5) redeem ticket, (6) grant lease, (7) instantiate service in virtual machine]
Like an airline ticket, a SHARP ticket must be
redeemed for a lease (the boarding pass) before the
holder can occupy a slice; the flow is sketched
below.
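The service manager's side of that flow, sketched with hypothetical agent and site stubs (these classes and method names are illustrative, not the SHARP API).

    class Agent:
        def request_ticket(self, rset):
            # Steps 1-4: the request may pass through peer agents before a
            # ticket comes back; collapsed to one call here.
            return {"issuer": "agent", "rset": rset}

    class SiteAuthority:
        def redeem(self, ticket):
            # Steps 5-6: exchange the ticket for a lease on concrete resources.
            return {"lease_for": ticket["rset"]}
        def instantiate(self, lease, image):
            # Step 7: boot the guest in its slice.
            print("booting", image, "in slice", lease["lease_for"])

    agent, site = Agent(), SiteAuthority()
    ticket = agent.request_ticket({"type": 7, "count": 4})
    lease = site.redeem(ticket)
    site.instantiate(lease, image="service-S1.img")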
17
Ticket Delegation
Transfer of resources, e.g., as a result of a
peering agreement or an economic transaction.
The site has transitive trust in the delegate.
Agents may subdivide their tickets and delegate
them to other entities in a cryptographically
secure way. Secure ticket delegation is the basis
for a resource economy. Delegation is accountable:
if an agent promises the same resources to multiple
receivers, it can be caught (see the sketch below).
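A sketch of subdivision, with placeholder signatures and field names that mirror the claim records shown on later slides; it is not the SHARP wire format.

    def delegate(parent: dict, receiver: str, count: int) -> dict:
        """Split `count` units off a held claim and sign a subticket for `receiver`."""
        # A real agent would also track how much of the parent is already
        # delegated; this sketch only checks the single transfer.
        assert count <= parent["rset"]["count"], "cannot delegate more than held"
        return {
            "claimID": f'{parent["claimID"]}.{receiver}',
            "holder": receiver,
            "issuer": parent["holder"],
            "parent": parent["claimID"],
            "rset": {"type": parent["rset"]["type"], "count": count},
            "term": parent["term"],                  # must lie within the parent's term
            "sig": f'signed by {parent["holder"]}',  # placeholder signature
        }

    root = {"claimID": "a", "holder": "A", "issuer": "A", "parent": None,
            "rset": {"type": 7, "count": 40}, "term": (0, 100), "sig": "signed by site"}
    b = delegate(root, "B", 8)
    c = delegate(root, "C", 25)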
18
Peering
Sites may delegate resource shares to multiple
agents. E.g., "Let my friends at UNC use 20% of my
site this week and 80% next weekend. UNC can
allocate their share to their local users according
to their local policies. Allocate the rest to my
local users according to my policies."
Note: tickets issued at UNC self-certify their
users.
19
A SHARP Ticket
<ticket>
  <subticket>
    <issuer> A's public key </issuer>
    <signed_part>
      <principal> B's public key </principal>
      <agent_address> XML RPC redeem_ticket() </agent_address>
      <rset> resource description </rset>
      <start_time> </start_time>
      <end_time> </end_time>
      <sn> unique ID at Agent A </sn>
    </signed_part>
    <signature> A's signature </signature>
  </subticket>
  <subticket>
    <issuer> B's public key </issuer>
    <signed_part> ... </signed_part>
    <signature> B's signature </signature>
  </subticket>
</ticket>
20
Tickets are Chains of Claims
[Figure: a ticket as a chain of claims:
  claim a: claimID a, holder A, a.rset, a.term
  claim b: claimID b, holder B, issuer A, parent a, b.rset/term]
21
A Claim Tree
The set of active claims for a site forms a claim
tree. The site authority maintains the claim tree
over the redeemed claims.
[Figure: a claim tree rooted at a 40-unit anchor claim, subdivided into claims of 25 and 8 units and then into final claims of 9, 10, 3, and 3 units; a ticket T corresponds to one path from the anchor down the tree]
22
Ticket Distribution Example
  • Site transfers 40 units to agent A.
  • A transfers to B (8 units) and C (25 units).
  • B and C further subdivide resources: B grants
    D 3 and E 3; C grants F 9, G 10, and H 7.
  • C oversubscribes its holdings in granting to H,
    creating potential conflict.
[Figure: the delegations laid out in resource space against time from t0, with the conflicting region marked]
23
Detecting Conflicts
[Figure: the same delegations plotted in resource space against time from t0, with the resource axis marked at 100 and at A's 40-unit share; summing C's grants to F (9), G (10), and H (7) gives 26 units against C's 25-unit holding, exposing the conflict]
24
Conflict and Accountability
  • Oversubscription may be a deliberate strategy, an
    accident, or a malicious act.
  • Site authorities serve oversubscribed tickets
    FCFS; conflicting tickets are rejected.
  • Balance resource utilization and conflict rate.
  • The authority can identify the accountable agent
    and issue a cryptographically secure proof of its
    guilt.
  • The proof is independently verifiable by a third
    party, e.g., a reputation service or a court of
    law.
  • The customer must obtain a new ticket for
    equivalent resources, using its proof of
    rejection (a simplified check is sketched below).
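A simplified FCFS check, collapsing the site to a single pool with one resource type (SHARP's authority actually checks redeemed claims against the claim tree); the numbers are illustrative.

    CAPACITY = 100   # illustrative site capacity, single resource type

    def redeem_fcfs(requests):
        """requests: list of (holder, units, start, end), in arrival order."""
        granted, rejected = [], []
        for holder, units, start, end in requests:
            booked = sum(u for _, u, s, e in granted if s < end and start < e)
            if booked + units <= CAPACITY:
                granted.append((holder, units, start, end))
            else:
                rejected.append(holder)   # holder receives a signed proof of rejection
        return granted, rejected

    granted, rejected = redeem_fcfs([("F", 9, 0, 10), ("G", 10, 0, 10),
                                     ("H", 70, 0, 10), ("D", 30, 0, 10)])
    print("rejected:", rejected)   # -> ['D'] once capacity is exhausted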

25
Agent as Arbiter
  • Agents implement local policies for apportioning
    resources to competing customers. Examples:
  • Authenticated client identity determines
    priority.
  • Sell tickets to the highest bidder.
  • Meet long-term contractual obligations; sell the
    excess (one policy is sketched below).
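One such policy, sketched: sell units of a fixed holding to the highest bidders. Purely illustrative; SHARP leaves the policy to the agent.

    def allocate_to_highest_bidders(units: int, bids: dict) -> dict:
        """bids: {customer: (units_wanted, price_per_unit)}; returns grants."""
        grants = {}
        for customer, (wanted, _price) in sorted(
                bids.items(), key=lambda kv: kv[1][1], reverse=True):
            give = min(wanted, units)
            if give > 0:
                grants[customer] = give
                units -= give
        return grants

    print(allocate_to_highest_bidders(40, {"S1": (30, 5.0), "S2": (30, 3.5)}))
    # -> {'S1': 30, 'S2': 10}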

26
Agent as Aggregator
Agents may aggregate resources from multiple
sites.
Example: PlanetLab Central
  • Index by location and resource attributes.
  • Local policies match requests to resources.
  • Services may obtain bundles of resources across
    a federation.

27
Division of Knowledge and Function
Service Manager: knows the application. Instantiate
the app, monitor behavior, map SLA/QoS to resources,
acquire contracts, renew/release.
Agent/Broker: guesses global status. Availability of
resources: what kind, how much, where (site grain).
How much to expose about resource types? About
proximity?
Authority: knows local status. Resource status,
configuration, placement, topology, instrumentation,
thermals, etc.
28
Issues and Ongoing Work
  • SHARP combines resource discovery and brokering
    in a unified framework.
  • Configurable overbooking degree allows a
    continuum.
  • Many possibilities exist for SHARP agent/broker
    representations, cooperation structures,
    allocation policies, and discovery/negotiation
    protocols.
  • Accountability is a fundamental property needed
    in many other federated contexts.
  • Generalize accountable claim/command exchange to
    other infrastructure services and applications.
  • Bidding strategies and feedback control.

29
Conclusion
  • Think of PlanetLab as an evolving prototype for
    planetary-scale on-demand utility computing.
  • Focusing on app services that are light resource
    consumers but inhabit many locations: a network
    testbed.
  • It's growing organically, like the early Internet.
  • Rough consensus and (almost) working code.
  • PlanetLab is a compelling incubator and testbed
    for utility/grid computing technologies.
  • SHARP is a flexible framework for utility
    resource management.
  • But it's only a framework.

30
Performance Summary
  • SHARP prototype complete and running across
    PlanetLab
  • Complete performance evaluation in paper
  • 1.2s end-to-end time to:
  • Request resource 3 peering hops away
  • Obtain and validate tickets, hop-by-hop
  • Redeem ticket for lease
  • Instantiate virtual machine at remote site
  • Oversubscription allows flexible control over
    resource utilization versus rate of ticket
    rejection

31
Related Work
  • Resource allocation/scheduling mechanisms
  • Resource containers, cluster reserves,
    Overbooking
  • Cryptographic capabilities
  • Taos, CRISIS, PolicyMaker, SDSI/SPKI
  • Lottery ticket inflation
  • Issuing more tickets decreases value of existing
    tickets
  • Computational economies
  • Amoeba, Spawn, Muse, Millenium, eOS
  • Self-certifying trust delegation
  • PolicyMaker, PGP, SFS

32
Physical cluster: COD servers backed by a
configuration database.
Network boot, automatic configuration, resource
negotiation.
Dynamic virtual clusters. Database-driven network
install.
33
SHARP
  • Framework for distributed resource management,
    resource control, and resource sharing across
    sites and trust domains

Challenge and SHARP approach:
  • Maintain local autonomy: sites are the ultimate
    arbiters of local resources; decentralized
    protocol.
  • Resource availability in the presence of agent
    failures: tickets time out (leases); controlled
    oversubscription.
  • Malicious actors: signed tickets; audit the full
    chain of transfers.
34
[Figure: the claim tree at time t0 and claim delegations over time (anchor, ticket T, final claim, future claim), annotated with the claim records along one ticket T:
  claim a: claimID a, holder A, issuer A, parent null, a.rset/term
  claim b: claimID b, holder B, issuer A, parent a, b.rset/term
  claim c: claimID c, holder C, issuer B, parent b, c.rset/term]
subclaim(claim c, claim p) ≡
  c.issuer = p.holder ∧ c.parent = p.claimID
  ∧ contains(p.rset, c.rset) ∧ subinterval(c.term, p.term)

contains(rset p, rset c) ≡
  c.type = p.type ∧ c.count ≤ p.count

subinterval(term c, term p) ≡
  p.start ≤ c.start ≤ p.end ∧ p.start ≤ c.end ≤ p.end

ticket(c0, ..., cn) ≡
  anchor(c0) ∧ ∀ i ∈ 0..n-1: subclaim(c(i+1), c(i))

anchor(claim a) ≡
  a.issuer = a.holder ∧ a.parent = null
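These predicates transcribe almost directly into code. A minimal sketch under assumed field names (one resource type per claim, integer terms); this is not the SHARP implementation.

    from dataclasses import dataclass
    from typing import List, Optional, Tuple

    @dataclass
    class Claim:
        claimID: str
        holder: str
        issuer: str
        parent: Optional[str]      # claimID of the parent claim, or None for the anchor
        rtype: int                 # resource type
        count: int                 # units of that type
        term: Tuple[int, int]      # (start, end)

    def contains(p: Claim, c: Claim) -> bool:
        return c.rtype == p.rtype and c.count <= p.count

    def subinterval(c: Claim, p: Claim) -> bool:
        (ps, pe), (cs, ce) = p.term, c.term
        return ps <= cs <= pe and ps <= ce <= pe

    def subclaim(c: Claim, p: Claim) -> bool:
        return (c.issuer == p.holder and c.parent == p.claimID
                and contains(p, c) and subinterval(c, p))

    def anchor(a: Claim) -> bool:
        return a.issuer == a.holder and a.parent is None

    def valid_ticket(chain: List[Claim]) -> bool:
        return anchor(chain[0]) and all(
            subclaim(chain[i + 1], chain[i]) for i in range(len(chain) - 1))

    # The a -> b -> c chain from the figure, with illustrative numbers.
    a = Claim("a", "A", "A", None, 7, 40, (0, 100))
    b = Claim("b", "B", "A", "a", 7, 8, (0, 100))
    c = Claim("c", "C", "B", "b", 7, 3, (10, 50))
    print(valid_ticket([a, b, c]))   # True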
35
Mixed-Use Clustering
Virtual clusters carved from the physical clusters:
a BioGeometry batch pool, a SIMICS/Arch batch pool,
Internet Web/P2P emulations, a student semester
project, somebody's buggy hacked OS.
Vision: issue leases on isolated partitions of the
shared cluster for different uses, with push-button
Web-based control over software environments, user
access, file volumes, and DNS names.
36
Grids are federated utilities
  • Grids should preserve the control and isolation
    benefits of private environments.
  • There's a threshold of comfort that we must reach
    before grids become truly practical.
  • Users need service contracts.
  • Protect users from the grid (security cuts both
    ways).
  • Many dimensions
  • decouple Grid support from application
    environment
  • decentralized trust and accountability
  • data privacy
  • dependability, survivability, etc.

37
COD and Related Systems
Other cluster managers based on database-driven
PXE installs:
  • Oceano: hosts Web services under dynamic load.
  • NPACI Rocks: configures Linux compute clusters.
  • Netbed/Emulab: configures static clusters for
    emulation experiments.
[Figure: the systems compared along axes such as dynamic clusters, OS-agnostic, flexible configuration, and open source]
COD addresses hierarchical dynamic resource
management in mixed-use clusters with pluggable
middleware (multigrid).
38
Dynamic Virtual Clusters
[Figure: the COD manager, driven by the COD database and a Web interface, negotiates with pluggable service managers and configures Virtual Cluster 1, Virtual Cluster 2, and a reserve pool (off-power)]
Allocate resources in units of raw servers.
Database-driven network install.
Pluggable service managers: batch schedulers (SGE,
PBS), Grid PoP, Web services, etc.
39
Enabling Technologies
DHCP: host configuration
Linux driver modules, Red Hat Kudzu, partition handling
DNS, NIS, etc.: user/group/mount configuration
NFS, etc.: network storage, automounter
PXE: network boot
IP-enabled power units
Power: APM, ACPI, Wake-on-LAN
Ethernet VLANs
40
Recent Papers on Utility/Grid Computing
  • "Managing Energy and Server Resources in Hosting
    Centers," ACM Symposium on Operating Systems
    Principles, 2001.
  • "Dynamic Virtual Clusters in a Grid Site Manager,"
    IEEE High-Performance Distributed Computing, 2003.
  • "Model-Based Resource Provisioning for a Web
    Service Utility," USENIX Symposium on Internet
    Technologies and Systems, 2003.
  • "An Architecture for Secure Resource Peering,"
    ACM Symposium on Operating Systems Principles, 2003.
  • "Balancing Risk and Reward in a Market-Based Task
    Manager," IEEE High-Performance Distributed
    Computing, 2004.
  • "Designing for Disasters," USENIX Symposium on File
    and Storage Technologies, 2004.
  • "Comparing PlanetLab and Globus Resource
    Management Solutions," IEEE High-Performance
    Distributed Computing, 2004.
  • "Interposed Proportional Sharing for a Storage
    Service Utility," ACM Symposium on Measurement and
    Modeling of Computer Systems, 2004.