Chapter 8 System Virtual Machines - PowerPoint PPT Presentation

Slides: 47
Provided by: altair8


Transcript and Presenter's Notes

Title: Chapter 8 System Virtual Machines

Chapter 8: System Virtual Machines
System VMs
  • 2005.11.9
  • Dong In Shin
  • Distributed Computing System Laboratory
  • Seoul National Univ.

Performance Enhancement of System Virtual Machines

Reasons for Performance Degradation
  • Setup
  • Emulation
  • Some guest instructions need to be emulated
    (usually via interpretation) by the VMM.
  • Interrupt handling
  • State saving
  • Bookkeeping
  • Ex. The accounting of time charged to a user
  • Time elongation

Instruction Emulation Assists
  • The VMM emulates a privileged instruction using
    a routine whose operation depends on whether the
    virtual machine is supposed to be executing in
    system mode or in user mode.
  • Hardware assist for checking the state and
    performing the actions.
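
The mode-dependent emulation routine described above can be sketched as follows. This is a minimal illustration, not the mechanism of any particular VMM: the `VirtualCPU` class, the "disable interrupts" instruction, and the trap name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualCPU:
    virtual_mode: str = "user"          # mode the guest believes it is in
    interrupts_enabled: bool = True     # virtual interrupt-enable state
    pending_trap: Optional[str] = None  # trap to reflect back into the guest


def emulate_disable_interrupts(vcpu: VirtualCPU) -> VirtualCPU:
    """Emulate a hypothetical privileged 'disable interrupts' instruction."""
    if vcpu.virtual_mode == "system":
        # The guest OS issued the instruction: apply its effect to the
        # virtual machine state, never to the real processor.
        vcpu.interrupts_enabled = False
    else:
        # A guest user program issued it: the guest OS must see the same
        # privilege violation it would see on real hardware.
        vcpu.pending_trap = "privileged-operation"
    return vcpu
```

A hardware assist would perform the same check and the same state update without leaving the guest, which is where the saving comes from.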

Virtual Machine Monitor Assists
  • Context switch
  • Using hardware to save and restore registers
  • Decoding of privileged instructions
  • Hardware assists, such as decoding the privileged
    instructions
  • Virtual interval timer
  • Decrementing the virtual counter by some amount
    estimated by the VMM from the amount that the
    real timer decrements.
  • Adding to the instruction set
  • A number of new instructions that are not a part
    of the ISA of the machine.

Improving Performance of the Guest System
  • Non-paged mode
  • The guest OS disables dynamic address translation
    and defines its real address space to be as large
    as the largest virtual address space. As a result,
    page frames are mapped to fixed real pages.
  • The guest OS no longer has to exercise demand
    paging.
  • No double paging
  • No potential conflict in paging decisions by the
    guest OS and the VMM

Double Paging
  • Two independent layers of paging interact and
    can perform poorly.
  • The guest OS incorrectly believes a page to be in
    physical memory (green/gold pages in the figure).
  • The VMM believes an unneeded page is still in use
    (teal pages).
  • The guest evicts a page despite available physical
    memory (red pages).

Pseudo-page-fault handling
  • A page fault in a VM system
  • A page fault in some VM's page table
  • A page fault in the VMM's page table
  • Pseudo page-fault handling
  • Process
  • The VMM initiates the page-in operation from the
    backing store.
  • The VMM triggers a guest pseudo page fault.
  • The guest OS suspends the guest's user process.
  • The VMM does not suspend the guest.
  • On completion of the page-in operation
  • The VMM calls the guest pseudo page fault handler
    again.
  • The guest OS handler wakes up the blocked user
    process.
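
The pseudo-page-fault handshake above can be sketched as a small simulation. The class and method names (`GuestOS`, `VMM`, `pseudo_page_fault`, `page_in_complete`) are hypothetical, and the backing store is reduced to a dictionary of pending requests.

```python
class GuestOS:
    def __init__(self, processes):
        self.runnable = set(processes)
        self.blocked = set()

    def pseudo_page_fault(self, process, completed):
        """Handler invoked by the VMM at the start and end of a page-in."""
        if not completed:
            # First call: block only the faulting user process.
            self.runnable.discard(process)
            self.blocked.add(process)
        else:
            # Second call: the page is resident, wake the process up.
            self.blocked.discard(process)
            self.runnable.add(process)


class VMM:
    def __init__(self, guest):
        self.guest = guest
        self.pending = {}      # page -> faulting guest process
        self.resident = set()  # pages currently in host memory

    def handle_host_page_fault(self, page, process):
        # Start the page-in and notify the guest; the guest as a whole is
        # not suspended and can keep scheduling its other processes.
        self.pending[page] = process
        self.guest.pseudo_page_fault(process, completed=False)

    def page_in_complete(self, page):
        process = self.pending.pop(page)
        self.resident.add(page)
        self.guest.pseudo_page_fault(process, completed=True)
```

The point of the design is visible in `handle_host_page_fault`: only one guest process waits for the disk, while the rest of the virtual machine continues to run.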

The others
  • Spool files
  • Without any special mechanism, the VMM has to
    intercept the I/O commands and work out that the
    virtual machines are simultaneously attempting to
    send a job to the I/O devices.
  • Handshaking allows the VMM to pick up the spool
    file and merge it into its own buffer.
  • Inter-virtual-machine communication
  • Communication between two physical machines
    involves the processing of message packets
    through several layers at the sender/receiver
  • This process can be streamlined, simplified, and
    made faster if the two machines are virtual
    machines on the same host platform.

Specialized Systems
  • Virtual-equals-real (VR) virtual machine
  • The host address space representing the guest
    real memory is mapped one-to-one to the host real
    memory address space.
  • Shadow-table bypass assist
  • The guest page tables can point directly to
    physical addresses if the dynamic address
    translation hardware is allowed to manipulate the
    guest page tables.
  • Preferred-machine assist
  • Allows a guest OS to operate in system mode
    rather than user mode.
  • Segment sharing
  • Sharing the code segments of the operating system
    among the virtual machines, provided the
    operating system code is written in a reentrant
    manner.

Generalized Support for Virtual Machines
  • Interpretive Execution Facility (IEF)
  • The processor directly executes most of the
    functions of the virtual machine in hardware.
  • An extreme case of a VM assist.
  • Interpretive Execution Entry and Exit
  • Entry
  • Start Interpretive Execution (SIE): the software
    gives up control to the hardware IEF and the
    processor enters interpretive execution mode.
  • Exit
  • Host Interrupt
  • Interception
  • Unsupported hardware instructions.
  • Exception during the execution of interpreted
    instructions.
  • Some special cases

Interpretive Execution Entry and Exit
[Figure: VMM software, entry into interpretive execution mode, exit for interception, host interrupt handler, exit for host interrupt]

Full-virtualization Versus Para-virtualization
  • Full virtualization
  • Provides a total abstraction of the underlying
    physical system and creates a complete virtual
    system in which the guest operating systems can
    execute.
  • No modification is required in the guest OS or
    its applications.
  • The guest OS or application is not aware of the
    virtualized environment.
  • Advantages
  • Streamlining the migration of applications and
    workloads between different physical systems.
  • Complete isolation of different applications,
    which makes this approach highly secure.
  • Disadvantages
  • Performance penalty
  • Examples: Microsoft Virtual Server and VMware ESX
    Server

Full-virtualization Versus Para-virtualization
  • Para Virtualization
  • The virtualization technique that presents a
    software interface to virtual machines that is
    similar but not identical to that of the
    underlying hardware.
  • This technique requires modifications to the
    guest OSes running on the VMs.
  • The guest OSes are aware that they are executing
    on a VM.
  • Advantages
  • Near-native performance
  • Disadvantages
  • Some limitations, including security concerns
    such as guest OS cached data, unauthenticated
    connections, and so forth.
  • Example: the Xen system

Case Study: VMware Virtual Platform

VMware Virtual Platform
  • A popular virtual machine infrastructure for
    IA-32-based PCs and servers.
  • An example of a hosted virtual machine system
  • Native virtualization architecture product:
    VMware ESX Server
  • This book is limited to the hosted system, VMware
    GSX Server (VMWare2001)
  • Challenges
  • Difficulty of virtualizing the IA-32 environment
    efficiently.
  • The openness of the system architecture.
  • Easy Installation.

VMware's Hosted Virtual Machine Model

Processor Virtualization
  • Critical instructions in the Intel IA-32
    architecture are not efficiently virtualizable.
  • Protection system references
  • Reference the storage protection system, memory
    system, or address relocation system (e.g., mov
    ax, cs)
  • Sensitive register instructions
  • Read or change resource-related registers and
    memory locations (e.g., POPF)
  • Problems
  • Sensitive instructions executed in user mode do
    not behave as expected unless the instruction is
    emulated.
  • Solutions
  • The VM monitor substitutes the instruction with
    another set of instructions that emulates the
    action of the original code.
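
The POPF problem named above can be made concrete with a small sketch. When POPF runs at user privilege on IA-32 (assuming the current privilege level is greater than IOPL), the interrupt-enable flag in EFLAGS is left unchanged and no trap is raised, so a guest kernel running deprivileged silently loses its update. The function names below are hypothetical, and the model ignores the other privileged flag bits.

```python
IF_FLAG = 1 << 9  # interrupt-enable flag, bit 9 of EFLAGS


def popf_in_user_mode(current_flags: int, popped_value: int) -> int:
    """Hardware behaviour at user privilege: IF keeps its old value."""
    return (popped_value & ~IF_FLAG) | (current_flags & IF_FLAG)


def popf_emulated(virtual_flags: int, popped_value: int,
                  guest_in_system_mode: bool) -> int:
    """What the substituted code must compute on the guest's virtual flags."""
    if guest_in_system_mode:
        # The guest kernel expects POPF to update IF as well.
        return popped_value
    # Guest user code gets the same silent behaviour as on real hardware.
    return popf_in_user_mode(virtual_flags, popped_value)
```

Because the native instruction does not trap, the monitor cannot rely on trap-and-emulate here, which is why the slide lists instruction substitution as the solution.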

Input/Output Virtualization
  • The PC platform supports many more devices and
    types of devices than any other platform.
  • Emulation in the VMMonitor
  • Converting IN and OUT I/O instructions to new I/O
    operations
  • Requires some knowledge of the device interfaces.
  • New Capability for Devices Through Abstraction
  • VMApp's ability to insert a layer of abstraction
    above the physical device.
  • Advantages
  • Reduce performance losses due to virtualization.
  • Ex. Virtual Ethernet switch between a virtual NIC
    and a physical NIC.

Using the Services of the Host Operating System
  • The request is converted into a host OS call.
  • Advantages
  • No limitations on the VMM's access to the host
    OS's I/O features.
  • Running performance-critical applications

Memory Virtualization
  • Paging requests of the guest OS
  • Not directly intercepted by the VMM, but
    converted into disk read/writes.
  • The VMMonitor translates them into requests on the
    host OS through the VMApp.
  • Page replacement policy of host OS
  • The host could replace the critical pages of the
    VM system in competition with other host
    applications.
  • The VMDriver pins critical pages for the virtual
    memory system.

VMware ESX Server
  • Native VM
  • A thin software layer designed to multiplex
    hardware resources among virtual machines
  • Providing higher I/O performance and complete
    control over resource management
  • Full Virtualization
  • For servers running multiple instances of
    unmodified operating systems

Page Replacement Issues
  • Problem of double paging
  • Unintended interactions between the native memory
    management policies of the guest operating
    systems and the host system.
  • Ballooning
  • Reclaims the pages considered least valuable by
    the operating system running in a virtual
    machine.
  • Small balloon module loaded into the guest OS as
    a pseudo-device driver or kernel service.
  • Module communicates with ESX server via a private
    channel.

Ballooning in VMware ESX Server
  • Inflating a balloon
  • When the server wants to reclaim memory
  • The driver allocates pinned physical pages within
    the VM.
  • This increases memory pressure in the guest OS,
    which reclaims space to satisfy the driver
    allocation request.
  • The driver communicates the physical page number
    for each allocated page to ESX server.
  • Deflating
  • Frees up memory for general use within the guest
    OS.
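
The inflate and deflate steps can be sketched with a toy guest memory manager. This is an illustration only: the `Guest` class is hypothetical, and "evict the oldest in-use page" stands in for whatever replacement policy the guest OS really applies.

```python
class Guest:
    def __init__(self, total_pages):
        self.free = list(range(total_pages))  # guest physical page numbers
        self.in_use = []                      # oldest page first
        self.balloon = []                     # pages pinned by the driver

    def use(self, n):
        """Guest workload touches n free pages."""
        for _ in range(n):
            self.in_use.append(self.free.pop(0))

    def _allocate_pinned(self):
        if not self.free:
            if not self.in_use:
                raise MemoryError("nothing left to reclaim")
            # Memory pressure: the guest's own policy chooses the victim.
            self.free.append(self.in_use.pop(0))
        return self.free.pop()

    def inflate(self, n):
        """Balloon driver pins n pages and reports their page numbers."""
        pages = [self._allocate_pinned() for _ in range(n)]
        self.balloon.extend(pages)
        return pages  # the server may now reuse the backing machine pages

    def deflate(self, n):
        """Balloon driver releases n pages back to the guest."""
        released = [self.balloon.pop() for _ in range(n)]
        self.free.extend(released)
        return released
```

The server never chooses which guest page to give up. It only sets the balloon size, and the guest's own policy decides what to evict, which avoids the double-paging conflict.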

Virtualizing I/O Devices on VMware Workstation
  • Supported virtual devices of VMware
  • PS/2 keyboard, PS/2 mouse, floppy drive, IDE
    controllers with ATA disks and ATAPI CD-ROMs, a
    Soundblaster 16 sound card, serial and parallel
    ports, virtual BusLogic SCSI controllers, AMD
    PCNet Ethernet adapters, and an SVGA video
    controller
  • Procedures
  • Intercept I/O operations issued by the guest OS
    (IA-32 IN and OUT).
  • Emulated either in the VMM or the VMApp.
  • Drawbacks
  • Virtualizing I/O devices can incur overhead from
    world switches between the VMM and the host
  • Handling the privileged instructions used to
    communicate with the hardware

Case Study: The Intel VT-x (Vanderpool) Technology
  • VT-x (Vanderpool) technology for IA-32 processors
  • Enhances the performance of VM implementations
    through hardware enhancements to the processor.
  • Main Feature
  • The inclusion of the new VMX mode of operation
    (VMX root/non-root operation)
  • VMX root operation
  • Fully privileged, intended for the VM monitor
  • New instructions: the VMX instructions
  • VMX non-root operation
  • Not fully privileged, intended for guest software
  • Reduces guest software privilege without relying
    on rings

Technological Overview
[Figure: VT-x operations, showing IA-32 operation, VMX root operation, VMX non-root operation, and VM exit]

Capabilities of the Technology
  • A key aspect
  • The elimination of the need to run all guest code
    in user mode.
  • Maintenance of state information
  • Major source of overhead in a software-based
    approach
  • Hardware technique that allows all of the
    state-holding data elements to be mapped to their
    native structures.
  • VMCS (Virtual Machine Control Structure)
  • The hardware implementation takes over the tasks
    of loading and unloading the state from their
    physical locations.

Virtual Machine Control Structure (VMCS)
  • Control Structures in Memory
  • Only one VMCS active per virtual processor at any
    given time
  • VMCS Payload
  • VM execution, VM exit, and VM entry controls
  • Guest and host state
  • VM-exit information fields
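
How a control structure with this payload is used across VM entry and VM exit can be sketched as follows. It is a simplified model under stated assumptions: the field names, the two-register state, and the `Processor` class are hypothetical and do not reflect Intel's actual VMCS encoding.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VMCS:
    guest_state: dict = field(default_factory=dict)  # loaded on VM entry
    host_state: dict = field(default_factory=dict)   # loaded on VM exit
    exit_controls: set = field(default_factory=set)  # events that force an exit
    exit_reason: Optional[str] = None                # VM-exit information


class Processor:
    def __init__(self):
        self.registers = {"ip": 0, "sp": 0}
        self.mode = "root"
        self.current_vmcs = None

    def vm_entry(self, vmcs):
        """Switch to non-root operation, loading guest state from the VMCS."""
        self.current_vmcs = vmcs
        self.registers = dict(vmcs.guest_state)
        self.mode = "non-root"

    def vm_exit(self, reason):
        """Save guest state, record the reason, and reload host state."""
        vmcs = self.current_vmcs
        vmcs.guest_state = dict(self.registers)
        vmcs.exit_reason = reason
        self.registers = dict(vmcs.host_state)
        self.mode = "root"
```

In this model the VM monitor fills in `host_state` before entry and reads `exit_reason` after an exit, while the state swapping itself is the part that the hardware takes over from software.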

Case Study: Xen Virtualization

Xen Design Principles
  • Support for unmodified application binaries is
    essential.
  • Supporting full multi-application operating
    systems is important.
  • Paravirtualization is necessary to obtain high
    performance and strong resource isolation.

Xen Features
  • Secure isolation between VMs
  • Resource Control and QoS
  • Only guest kernel needs to be ported
  • All user-level apps and libraries run unmodified.
  • Linux 2.4/2.6, NetBSD, FreeBSD, WinXP
  • Execution performance is close to native.
  • Live Migration of VMs between Xen nodes.

Xen 3.0 Architecture

Xen para-virtualization
  • Arch Xen/X86: replace privileged instructions
    with Xen hypercalls.
  • Hypercalls
  • Notifications are delivered to domains from Xen
    using an asynchronous event mechanism
  • Modify OS to understand virtualized environment
  • Wall-clock time vs. virtual processor time
  • Xen provides both types of alarm timer
  • Expose real resource availability
  • Xen Hypervisor
  • Additional protection domain between guest OSes
    and I/O devices.

X86 Processor Virtualization
  • Xen runs in ring 0 (most privileged)
  • Rings 1 and 2 for the guest OS, ring 3 for user
    space
  • Xen lives in the top 64MB of the linear address
    space.
  • Segmentation is used to protect Xen, as switching
    page tables is too slow on standard x86
  • Hypercalls jump to Xen in ring 0
  • Guest OS may install a fast trap handler
  • MMU virtualization: shadow vs. direct mode

Para-virtualizing the MMU
  • Guest OSes allocate and manage their own page
    tables
  • Hypercalls to change the page table base.
  • Xen Hypervisor is responsible for trapping
    accesses to the virtual page table, validating
    updates and propagating changes.
  • Xen must validate page table updates before use
  • Updates may be queued and batch processed
  • Validation rules applied to each PTE
  • Guest may only map pages it owns
  • XenoLinux implements a balloon driver
  • Adjusts a domain's memory usage by passing memory
    pages back and forth between Xen and XenoLinux
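
The validate-then-apply rule for page table updates can be sketched as below. This is a minimal illustration, not Xen's hypercall interface: the `Hypervisor` class and its methods are hypothetical, and the only validation rule modelled is "a guest may only map pages it owns".

```python
class Hypervisor:
    def __init__(self, ownership):
        self.ownership = ownership  # machine frame -> owning domain
        self.page_tables = {}       # domain -> {virtual page: frame}
        self.queue = []             # pending (domain, vpage, frame) updates

    def queue_update(self, domain, vpage, frame):
        """Guest requests a mapping; nothing is applied yet."""
        self.queue.append((domain, vpage, frame))

    def flush(self):
        """Validate and apply all queued updates in one batch."""
        rejected = []
        for domain, vpage, frame in self.queue:
            if self.ownership.get(frame) == domain:
                self.page_tables.setdefault(domain, {})[vpage] = frame
            else:
                # The frame belongs to another domain (or to nobody).
                rejected.append((domain, vpage, frame))
        self.queue.clear()
        return rejected
```

Queuing the updates and flushing them together amortizes the cost of entering the hypervisor over many page table entries.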

MMU virtualization

Writable Page Tables

I/O Architecture
  • Asynchronous buffer descriptor rings
  • Using shared-memory
  • Xen I/O spaces delegate to guest OSes protected
    access to specified hardware devices
  • The guest OS passes buffer information vertically
    through the system.
  • Xen performs validation checks.
  • Xen supports a lightweight event-delivery
    mechanism which is used for sending asynchronous
    notifications to a domain.
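
An asynchronous buffer descriptor ring in shared memory can be sketched as a fixed-size circular buffer with a producer index and a consumer index. The `DescriptorRing` class and its method names are hypothetical, and only the request direction is modelled. A real ring would also carry responses back to the guest.

```python
class DescriptorRing:
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.req_prod = 0  # advanced by the guest when it posts a request
        self.req_cons = 0  # advanced by the hypervisor side when it takes one

    def post_request(self, descriptor):
        """Guest side: place a buffer descriptor on the ring if there is room."""
        if self.req_prod - self.req_cons == self.size:
            return False  # ring full, the guest must retry later
        self.slots[self.req_prod % self.size] = descriptor
        self.req_prod += 1
        return True

    def take_request(self):
        """Consumer side: remove the oldest outstanding descriptor, if any."""
        if self.req_cons == self.req_prod:
            return None  # ring empty
        descriptor = self.slots[self.req_cons % self.size]
        self.req_cons += 1
        return descriptor
```

Each side writes only its own index, so the two ends can run asynchronously, and an event notification is needed only to tell the other side that new entries are waiting.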

Data Transfer: I/O Descriptor Rings

Device Channel Interface

Thank You!