Cellular Disco: resource management using virtual clusters on shared memory multiprocessors - PowerPoint PPT Presentation



1
Cellular Disco: resource management using virtual
clusters on shared memory multiprocessors
  • Published by ACM in 1999; authors: K. Govil,
    D. Teodosiu, Y. Huang, M. Rosenblum.
  • Presenter: Soumya Eachempati

2
Motivation
  • Large-scale shared-memory multiprocessors
  • Large number of CPUs (32-128)
  • NUMA architectures
  • Off-the-shelf OSes are not scalable:
      • Cannot handle a large number of resources
      • Memory management not optimized for NUMA
      • No fault containment

3
Existing Solutions
  • Hardware partitioning
      • Provides fault containment
      • Rigid resource allocation
      • Low resource utilization
      • Cannot dynamically adapt to the workload
  • New operating system
      • Provides flexibility and efficient resource
        management
      • Requires considerable effort and time
  • Goal: exploit the hardware resources to the fullest
    with minimal effort, while improving flexibility and
    fault tolerance.

4
Solution: Disco (VMM)
  • Virtual machine monitor
  • Addresses NUMA-awareness issues and scalability
  • Issues not dealt with by Disco:
      • Hardware fault tolerance/containment
      • Resource management policies

5
Cellular Disco
  • Approach: convert the multiprocessor machine into a
    virtual cluster
  • Advantages:
      • Inherits the benefits of Disco
      • Can support legacy OSes transparently
      • Combines the strengths of hardware partitioning
        and a new OS
      • Provides fault containment
      • Fine-grained resource sharing
      • Less effort than developing a new OS

6
Cellular Disco
  • Internally structured into semi-independent cells
  • Much less development effort compared to Hive
  • No performance loss, even with fault containment
  • Warranted design decision: the Cellular Disco code
    itself is trusted to be correct

7
Cellular Disco Architecture
8
Resource Management
  • Over-commits resources
  • Gives flexibility to adjust the fraction of resources
    assigned to each VM
  • Fault containment places restrictions on resource
    allocation
  • Both CPU and memory load balancing, under these
    constraints:
      • Scalability
      • Fault containment
      • Avoiding contention
  • First-touch allocation, dynamic migration, and
    replication of hot memory pages
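The NUMA placement policies above can be sketched as follows. This is a minimal illustrative model, not the paper's implementation; the class and threshold names are invented:

```python
# Hypothetical sketch of NUMA-aware page placement: first-touch
# allocation, plus replication of hot, read-mostly pages on the
# nodes that read them. All names are illustrative.

class PagePlacement:
    def __init__(self, hot_threshold=64):
        self.home = {}         # page -> node that first touched it
        self.remote_hits = {}  # page -> count of remote read accesses
        self.replicas = {}     # page -> set of nodes holding a copy
        self.hot_threshold = hot_threshold

    def access(self, page, node, write=False):
        """Return the node whose copy services this access."""
        # First touch: allocate the page on the accessing node.
        if page not in self.home:
            self.home[page] = node
            self.replicas[page] = {node}
            return node
        # Writes invalidate replicas; only the home copy remains.
        if write:
            self.replicas[page] = {self.home[page]}
            return self.home[page]
        # Count remote reads; replicate hot pages on the reader.
        if node not in self.replicas[page]:
            self.remote_hits[page] = self.remote_hits.get(page, 0) + 1
            if self.remote_hits[page] >= self.hot_threshold:
                self.replicas[page].add(node)  # future reads are local
                return node
            return self.home[page]
        return node
```

Replication trades memory for locality: reads of a hot page stop crossing the interconnect, at the cost of extra copies that must be collapsed on a write.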

9
Hardware Virtualization
  • The VM interface mimics the underlying hardware
  • Virtual machine resources (user-defined):
      • VCPUs, memory, I/O devices (physical)
  • Physical vs. machine resources (allocated dynamically,
    based on the priority of the VM):
      • VCPUs map to CPUs
      • Physical pages map to machine pages
  • The VMM intercepts privileged instructions
  • Three modes: user, supervisor (guest OS), kernel (VMM)
  • In supervisor mode, all memory accesses are mapped
  • The VMM allocates machine memory to back the physical
    memory
  • pmap and memmap data structures
  • Second-level software TLB (L2TLB)
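The extra translation level can be sketched as below. This is an illustrative toy, not the VMM's code: the pmap maps a VM's physical pages to machine pages, and the memmap provides the reverse mapping (the free-pool size and method names are invented):

```python
# Sketch of the VMM's extra translation level: guest physical pages
# are lazily backed by machine pages; pmap gives physical -> machine,
# memmap gives machine -> (VM, physical) for reverse lookups.

class Cell:
    def __init__(self):
        self.pmap = {}    # (vm, phys_page) -> machine_page
        self.memmap = {}  # machine_page -> (vm, phys_page)
        self.free = list(range(100))  # toy pool of free machine pages

    def back(self, vm, phys_page):
        """Lazily allocate a machine page to back a physical page."""
        key = (vm, phys_page)
        if key not in self.pmap:
            mpage = self.free.pop()
            self.pmap[key] = mpage
            self.memmap[mpage] = key
        return self.pmap[key]

    def translate(self, vm, virt_page, guest_page_table):
        """virtual -> physical (guest OS) -> machine (VMM)."""
        phys = guest_page_table[virt_page]   # guest's own mapping
        return self.back(vm, phys)           # VMM's mapping
```

The real pmap/memmap also track ownership and replicas for page migration; the L2TLB caches the combined virtual-to-machine translations so most accesses avoid this two-step lookup.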

10
Hardware fault containment
11
Hardware fault containment
  • The VMM provides software fault containment
  • Cells
  • Inter-cell communication:
      • Inter-processor RPC
      • Messages: no locking needed, since they are
        serialized
      • Shared memory for some data structures (pmap,
        memmap)
  • Low latency, exactly-once semantics
  • A trusted system software layer enables the use of
    shared memory across cells
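One common way to obtain exactly-once semantics for inter-cell messages is receiver-side deduplication by sequence number; a minimal sketch under that assumption (the names are invented, and the paper does not specify this mechanism):

```python
# Hedged sketch of exactly-once message delivery between cells:
# the receiver tracks a per-sender sequence number, so a
# retransmitted request is executed only once.

class CellEndpoint:
    def __init__(self):
        self.next_seq = {}   # sender -> next expected sequence number
        self.log = []        # requests actually executed

    def deliver(self, sender, seq, request):
        expected = self.next_seq.get(sender, 0)
        if seq < expected:
            return "duplicate"      # already executed; drop the retry
        self.next_seq[sender] = seq + 1
        self.log.append(request)    # execute exactly once
        return "executed"
```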

12
Implementation 1: MIPS R10000
  • 32-processor SGI Origin 2000
  • Piggybacked on IRIX 6.4 (host OS)
  • Guest OS: IRIX 6.2
  • Cellular Disco (CD) is spawned as a multi-threaded
    kernel process
  • Additional overhead < 2% (time spent in the host IRIX)
  • No fault isolation: the IRIX kernel is monolithic
  • Solution: some host OS support is needed; one copy of
    the host OS per cell

13
I/O Request execution
  • Cellular Disco piggybacked on IRIX kernel

14
32 - MIPS R10000
15
Characteristics of Workloads
  • Database: decision-support workload
  • Pmake: I/O-intensive workload
  • Raytrace: CPU-intensive workload
  • Web: kernel-intensive web-server workload

16
Virtualization Overheads
17
Fault-containment Overheads
  • Left bar: single-cell configuration; right bar: 8-cell
    system.
18
CPU Management
  • Load-balancing mechanisms
  • Three types of VCPU migration: intra-node, inter-node,
    and inter-cell
      • Intra-node: loss of CPU cache affinity
      • Inter-node: cost of copying the L2TLB, plus a
        higher long-term cost
      • Inter-cell: loss of both cache and node affinity;
        increases fault vulnerability
      • The penalty is alleviated by replicating pages
  • Load-balancing policies: the idle balancer (local load
    stealer) and the periodic balancer (global
    redistribution)
  • Each CPU has a local run queue of VCPUs
  • Gang scheduling
      • Run all VCPUs of a VM simultaneously
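The gang-scheduling constraint above can be sketched as a simple admission check; this is an illustrative model, not the paper's scheduler, and the tuple layout is invented:

```python
# Sketch of gang scheduling: a VM is dispatched only when enough
# idle CPUs exist to run all of its VCPUs at the same time.

def pick_gang(vms, idle_cpus):
    """vms: list of (name, n_vcpus, priority) tuples.
    Returns the name of the VM to dispatch, or None."""
    # Consider VMs from highest to lowest priority.
    for name, n_vcpus, _prio in sorted(vms, key=lambda v: -v[2]):
        if n_vcpus <= idle_cpus:
            return name   # all VCPUs of this VM can run together
    return None
```

Running all VCPUs of a VM together matters because the guest OS spins on kernel locks: if one VCPU holding a guest lock is descheduled, the VM's other VCPUs waste their timeslices spinning.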

19
Load Balancing
  • Load tree: a low-contention, distributed data structure
  • Contention is limited to the higher-level tree nodes
  • Each VCPU carries the list of cells it is vulnerable to
  • Under heavy load, the idle balancer is not enough:
      • A local periodic balancer runs per 8-CPU region
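A load tree of this kind can be sketched as an array-based complete binary tree over the CPUs; this is a hedged illustration of the idea (the class and method names are invented):

```python
# Sketch of a load tree: each internal node holds the total
# run-queue length of the CPUs beneath it. An idle balancer
# descends from the root toward the heavier subtree to find a
# CPU to steal work from, touching only O(log n) nodes.

class LoadTree:
    def __init__(self, n_cpus):
        self.n = n_cpus
        self.tree = [0] * (2 * n_cpus)  # leaves tree[n..2n-1] = CPUs

    def update(self, cpu, delta):
        """Propagate a run-queue length change up to the root."""
        i = self.n + cpu
        while i >= 1:
            self.tree[i] += delta
            i //= 2

    def busiest_cpu(self):
        """Descend from the root toward the heavier subtree."""
        i = 1
        while i < self.n:
            left, right = 2 * i, 2 * i + 1
            i = left if self.tree[left] >= self.tree[right] else right
        return i - self.n
```

Because updates touch only one root-to-leaf path, CPUs mostly write disjoint nodes; contention concentrates on the few shared nodes near the root, matching the bullet above.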

20
CPU Scheduling and Results
  • Scheduling: pick the highest-priority gang-runnable
    VCPU that has been waiting, then send out RPCs
  • Three configurations on 32 processors:
      • One VM with 8 VCPUs, running an 8-process raytrace
      • 4 VMs
      • 8 VMs (total of 64 VCPUs)
  • The pmap is migrated only when all VCPUs have migrated
    out of a cell
  • Data pages are also migrated, for independence

21
Memory Management
  • Each cell has its own freelist of pages, indexed by
    home node
  • A page allocation request is:
      • Satisfied from the local node,
      • Else satisfied from the same cell,
      • Else borrowed from another cell
  • Memory balancing:
      • A low-memory threshold triggers borrowing and
        lending
      • Each VM has a priority list of lender cells
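The allocation fallback chain above can be sketched as follows; a toy model with invented data shapes, not the VMM's freelist code:

```python
# Sketch of the page allocation fallback chain: try the
# requesting node, then any node in the same cell, then borrow
# from another cell. Returns (page, node_it_came_from).

def allocate_page(node, cells, node_to_cell):
    """cells: cell -> {node: [free pages]};
    node_to_cell: node -> cell."""
    home_cell = node_to_cell[node]
    # 1. Local node (best NUMA locality).
    if cells[home_cell].get(node):
        return cells[home_cell][node].pop(), node
    # 2. Any other node in the same cell.
    for n, pages in cells[home_cell].items():
        if pages:
            return pages.pop(), n
    # 3. Borrow from another cell (widens fault exposure).
    for cell, nodes in cells.items():
        if cell == home_cell:
            continue
        for n, pages in nodes.items():
            if pages:
                return pages.pop(), n
    raise MemoryError("no free pages")
```

Note the fault-containment cost of step 3: a VM backed by borrowed pages becomes vulnerable to the lender cell's failure, which is why borrowing is the last resort.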

22
Memory Paging
  • Page replacement:
      • Second-chance FIFO
      • Avoids double-paging overheads
  • Tracking used pages:
      • Uses annotated OS routines
  • Page sharing:
      • Explicit marking of shared pages
  • Redundant paging:
      • Avoided by trapping every access to the virtual
        paging disk
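Second-chance FIFO can be sketched as below; a minimal illustration of the classic algorithm, not the VMM's actual code:

```python
# Sketch of second-chance FIFO replacement: a page at the head
# of the FIFO that was referenced since the last pass is given
# one more trip through the queue instead of being evicted.

from collections import deque

class SecondChanceFIFO:
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()      # FIFO order of resident pages
        self.referenced = set()   # pages touched since last pass

    def access(self, page):
        """Touch a page; return the evicted page, or None."""
        if page in self.queue:
            self.referenced.add(page)   # hit: set reference bit
            return None
        evicted = None
        if len(self.queue) >= self.capacity:
            while True:
                victim = self.queue.popleft()
                if victim in self.referenced:
                    self.referenced.discard(victim)
                    self.queue.append(victim)  # second chance
                else:
                    evicted = victim
                    break
        self.queue.append(page)
        return evicted
```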

23
Implementation 2: FLASH Simulation
  • FLASH has hardware fault-recovery support
  • Simulation of the FLASH architecture on SimOS
  • A fault injector is used to inject:
      • Power failures
      • Link failures
      • Firmware failures
  • Results: 100% fault containment

24
Fault Recovery
  • Hardware support is needed to:
      • Determine which resources are operational
      • Reconfigure the machine to use the good resources
  • Cellular Disco recovery:
      • Step 1: all cells agree on a liveset of nodes
      • Step 2: abort RPCs/messages to dead cells
      • Step 3: kill VMs that depend on failed cells
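Step 3 reduces to a dependency check; a hedged one-function sketch (names invented) under the assumption that each VM's cell dependencies are tracked:

```python
# Sketch of recovery step 3: once the cells agree on a liveset,
# a VM survives only if every cell it depends on (for CPU,
# memory, or I/O) is still alive.

def surviving_vms(vm_dependencies, liveset):
    """vm_dependencies: vm -> set of cells it uses;
    liveset: set of cells that survived the fault."""
    return {vm for vm, cells in vm_dependencies.items()
            if cells <= liveset}
```

This is why the earlier resource-allocation restrictions matter: the fewer cells a VM touches, the fewer failures can kill it.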

25
Fault-recovery Times
  • Recovery times are higher for larger memories
  • Recovery requires scanning memory for fault detection

26
Summary
  • Virtual machine monitor:
      • Flexible resource management
      • Legacy OS support
  • Cellular Disco:
      • Cells provide fault containment
      • Creates a virtual cluster
      • Needs hardware support for recovery