Eclipse: an Operating System with Quality of Service support - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Eclipse: an Operating System with Quality of Service support


1
QoS Support in Operating Systems
Banu Özden, Bell Laboratories, ozden_at_research.bell-labs.com
2
Vision
  • Service providers will offer storage and
    computing services
  • through their distributed data centers
  • connected with high bandwidth networks
  • to globally distributed clients.
  • Clients will access these services via diverse
    devices and networks, e.g.
  • mobile devices and wireless networks,
  • high-end computer systems and high bandwidth
    networks.
  • These services will become utilities (e.g.,
    storage utility, computing utility).
  • Eventually resources will be exchanged and traded
    between geographically dispersed data centers to
    address fluctuating demand.

3
Eclipse/BSD: an Operating System with Quality of
Service Support
Banu Özden, ozden_at_research.bell-labs.com
4
Motivation
  • QoS support for (server) applications
  • web servers
  • video servers
  • Isolation and differentiation of different
  • entities serviced on the same platform
  • applications running on the same platform
  • QoS requirements
  • client-based
  • service-based
  • content-based

5
Design Goals
  • QoS support in a general purpose operating system
  • Remain compatible with the underlying operating
    system
  • QoS parameters
  • Isolation
  • Differentiation
  • Fairness
  • (Cumulative) throughput
  • Flexible resource management
  • capable of implementing a large set of
    provisioning needs
  • supports a large set of server applications
    without imposing significant changes to their
    design

6
Talk Outline
  • Schedulers
  • Reservation File System (reservfs)
  • Tagging
  • Web Server Experiments
  • Access Control and Profiles
  • Eclipse/BSD Status
  • Related Work
  • Future Work

7
Proportional sharing
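  • Generalized processor sharing (GPS)
  • $\phi_i$: weight of flow i
  • $W_i(t_1, t_2)$: service received by flow i in $(t_1, t_2]$
  • $F$: set of flows
  • For any flow i continuously backlogged in $(t_1, t_2]$:
    $\frac{W_i(t_1, t_2)}{W_j(t_1, t_2)} \ge \frac{\phi_i}{\phi_j}$ for all $j \in F$
  • Thus, the rate of flow i in $(t_1, t_2]$ is at least
    $\frac{\phi_i}{\sum_{j \in F} \phi_j} \, r$, where $r$ is the rate of the shared resource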
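
Under GPS, the flows that currently have work queued split the resource in proportion to their weights. The sketch below is only an illustration of that rule; the function name `gps_rates`, the flow names, and the weights are assumptions made for the example, not part of Eclipse.

```python
def gps_rates(weights, backlogged, link_rate):
    """Return the GPS service rate of every backlogged flow.

    weights    -- dict mapping flow name to a positive weight
    backlogged -- flows that currently have queued work
    link_rate  -- total capacity of the shared resource
    """
    active = list(backlogged)
    total = sum(weights[f] for f in active)
    # Idle flows get nothing; their share is redistributed by weight.
    return {f: link_rate * weights[f] / total for f in active}


weights = {"a": 2, "b": 1, "c": 1}          # hypothetical weights
print(gps_rates(weights, ["a", "b", "c"], 100.0))
print(gps_rates(weights, ["a", "b"], 100.0))  # flow c idle
```

When flow c goes idle, flows a and b keep their 2:1 ratio but absorb c's share, which is why the guaranteed rate is a lower bound rather than a fixed allocation.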

8
QoS Guarantees
  • Fairness
  • Throughput
  • Packet delay

9
Schedulers in Eclipse
  • Resource characteristics differ
  • Different hierarchical proportional-share
    schedulers for resources
  • Link scheduler: WF2Q
  • Disk scheduler: YFQ
  • CPU scheduler: MTR-LS
  • Network input: SRP

10
Hierarchical GPS Example
[Figure: hierarchical proportional sharing vs. proportional sharing]
11
Schedulers
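  • Hierarchical proportional sharing (hierarchical GPS)
  • $D(n)$: descendant queue nodes of node n
  • $W_n(t_1, t_2) = \sum_{q \in D(n)} W_q(t_1, t_2)$: service received by
    scheduler node n in $(t_1, t_2]$
  • $S(n)$: set of immediate descendant nodes of the
    parent of node n
  • For any node n continuously backlogged in $(t_1, t_2]$:
    $\frac{W_n(t_1, t_2)}{W_m(t_1, t_2)} \ge \frac{\phi_n}{\phi_m}$ for all $m \in S(n)$,
    where $\phi_n$ is the weight of node n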
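
In hierarchical proportional sharing, each scheduler node divides whatever it receives among its backlogged children by weight, so an idle queue's share stays inside its own subtree. The following is a minimal sketch of that recursion; the nested-dict tree layout, the node names, and all weights are assumptions chosen for illustration.

```python
def is_backlogged(node):
    """A queue node is backlogged if flagged so; a scheduler node if any child is."""
    children = node.get("children")
    if children is None:
        return node.get("backlogged", True)
    return any(is_backlogged(c) for c in children.values())


def hierarchical_rates(node, rate, prefix=""):
    """Split `rate` down the tree; return {queue path: rate} for backlogged queues."""
    children = node.get("children")
    if children is None:
        return {prefix: rate} if is_backlogged(node) else {}
    active = {n: c for n, c in children.items() if is_backlogged(c)}
    total = sum(c["weight"] for c in active.values())
    rates = {}
    for name, child in active.items():
        share = rate * child["weight"] / total
        rates.update(hierarchical_rates(child, share, f"{prefix}/{name}"))
    return rates


# Hypothetical tree: two companies with equal weight, two pages each.
tree = {"weight": 1, "children": {
    "companyA": {"weight": 1, "children": {
        "page1": {"weight": 3}, "page2": {"weight": 1}}},
    "companyB": {"weight": 1, "children": {
        "page1": {"weight": 1}, "page2": {"weight": 1}}},
}}
print(hierarchical_rates(tree, 100.0))
```

If company A's second page goes idle, its share moves to company A's first page rather than to company B, which is the isolation property a flat proportional-share scheduler does not give.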

12
Link Aggregation
  • Need to incrementally scale bandwidth
  • Resource aggregation is emerging as a solution
  • Grouping multiple resources into a single logical
    unit
  • QoS over such aggregated links?

13
Multi-Server Model
  • Multi Server Fair Queuing (MSFQ)
  • A packetized algorithm for a system with N links,
    each with a bandwidth of r, that approximates a
    GPS system with a single link with Nr bandwidth

[Figure: reference model and packetized scheduler]
14
Multi-Server Model (Contd.)
  • Goals
  • Guarantee bandwidth and packet delay bounds that
    are independent of the number of flows
  • Allow flows to arrive and depart dynamically
  • Be work-conserving
  • Algorithm
  • When a server is idle, schedule the packet that
    would complete transmission earliest under a
    single server GPS system with a bandwidth of Nr

Sigcomm 2001
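
The scheduling rule (an idle server takes the packet that would finish earliest under a single GPS server of bandwidth Nr) can be sketched for the simplified case where every packet is already queued at time 0. This is an illustration under that assumption, not the published algorithm's implementation; the function names and the example flows are hypothetical.

```python
import heapq


def gps_finish_times(flows, rate):
    """Fluid GPS on one server of speed `rate`, all packets queued at t=0.

    flows -- dict: name -> (weight, [packet sizes])
    Returns [(finish_time, flow, packet_index, size), ...] in completion order.
    """
    heads = {f: sizes[0] for f, (_, sizes) in flows.items() if sizes}
    index = {f: 0 for f in heads}
    now, finished = 0.0, []
    while heads:
        total_weight = sum(flows[f][0] for f in heads)
        speed = {f: rate * flows[f][0] / total_weight for f in heads}
        # Advance to the moment the next head-of-line packet completes.
        dt, done = min((heads[f] / speed[f], f) for f in heads)
        now += dt
        for f in heads:
            heads[f] = max(0.0, heads[f] - speed[f] * dt)
        sizes = flows[done][1]
        finished.append((now, done, index[done], sizes[index[done]]))
        index[done] += 1
        if index[done] < len(sizes):
            heads[done] = sizes[index[done]]
        else:
            del heads[done]
    return finished


def msfq_schedule(flows, n_servers, r):
    """Give each idle server the packet with the earliest GPS finish time.

    Returns [(flow, packet_index, server, start, finish), ...].
    """
    order = sorted(gps_finish_times(flows, n_servers * r))
    servers = [(0.0, s) for s in range(n_servers)]  # (time free, server id)
    heapq.heapify(servers)
    schedule = []
    for _, flow, idx, size in order:
        free_at, s = heapq.heappop(servers)
        finish = free_at + size / r
        schedule.append((flow, idx, s, free_at, finish))
        heapq.heappush(servers, (finish, s))
    return schedule


flows = {"a": (1, [4, 4]), "b": (1, [2])}  # hypothetical flows
for entry in msfq_schedule(flows, n_servers=2, r=1.0):
    print(entry)
```

In this example the last packet of flow a finishes later on the two slower links than it does in the single-link GPS reference, a small instance of the service discrepancy that the multi-server analysis has to bound.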
15
MSFQ Preliminary Properties
Multi-Server specific properties
  • Ordering: a pair of packets scheduled in the
    order of their GPS finishing times may complete
    in reverse order
  • GPS busy implies MSFQ busy, but the converse is not true
  • Non-coinciding busy periods
  • Work backlog?

16
MSFQ Properties
  • Maximum service discrepancy (buffer requirement)
  • Maximum packet delay
  • Maximum per-flow service discrepancy

17
Schedulers (contd.)
  • Disk scheduling with QoS
  • tradeoffs between QoS and total disk performance
  • driver queue management
  • queue depth
  • queue ordering
  • fragmentation
  • Hierarchical YFQ
  • CPU scheduling with QoS
  • lengths of CPU phases are not known a priori
  • cumulative throughput
  • Hierarchical MTR-LS

18
Eclipse's Key Elements
  • Hierarchical, proportional share resource
    schedulers
  • Reservation, reservation file system (reservfs)
  • Tagging mechanism
  • Access and admission control, reservation domain

19
Reservations and Schedulers
  • (Resource) reservations
  • unit for QoS assignment
  • similar to the concept of a flow in packet
    scheduling
  • Hierarchical schedulers
  • a tree with two kinds of nodes
  • scheduler nodes
  • queue nodes
  • each node corresponds to a reservation
  • Schedulers are dynamically reconfigurable

20
Web Server Example
  • Hosting two companies' web sites, each with two
    web pages

[Figure: network bandwidth shared between company A and company B]
21
Reservfs
  • We built the reservation file system
  • to create and manipulate reservations
  • to access and configure resource schedulers

22
Reservfs
  • Hierarchical
  • Each reservation directory corresponds to a node
    at a scheduler
  • Each resource is represented by a reservation
    directory under /reserv

23
Reservfs
  • Two types of reservation directories
  • scheduler directories
  • queue directories
  • Scheduler directories are hierarchically
    expandable
  • Queue directories are not expandable

24
Reservfs
  • Scheduler directory
  • share
  • newqueue
  • newreserv
  • special queue q0
  • Queue directory
  • share
  • backlog

25
Reservfs
[Figure: web server and video server above the application interface, the reservation file system, and the scheduler interface]
26
Reservfs API
  • Creation of a new queue/scheduler reservation
  • fd = open(newqueue or newreserv, O_CREAT)
  • returns the fd of the newly created share file

27
Creating Queue Reservation
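[Figure: the /reserv tree with resource directories cpu, fxp0, da0 and fxp1, their q0 queues and scheduler reservation r1; open(newqueue, O_CREAT) returns an fd and adds queue q1]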
28
Creating Scheduler Reservation
[Figure: the /reserv tree with resource directories cpu, fxp0 and fxp1, their q0 queues, queue q1 and scheduler reservation r1]
29
Reservfs API
  • Changing QoS parameters
  • writing a weight and min value to the share file
  • Getting QoS parameters
  • reading the share file
  • Getting/setting queue parameters
  • reading/writing the backlog file
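
Because QoS parameters live in ordinary files, changing them needs only plain file I/O. The sketch below assumes the share file holds a weight and a min value separated by a space, as the `50 1000000` value in the command-line slides suggests; the helper names are hypothetical, and a temporary file stands in for a real path under /reserv so the example runs on any system.

```python
import os
import tempfile


def write_share(share_path, weight, minimum):
    """Set QoS parameters by writing "<weight> <min>" to a share file."""
    with open(share_path, "w") as f:
        f.write(f"{weight} {minimum}\n")


def read_share(share_path):
    """Get QoS parameters by reading a share file back."""
    with open(share_path) as f:
        weight, minimum = f.read().split()
    return int(weight), int(minimum)


# Stand-in for a share file inside a queue directory under /reserv.
with tempfile.TemporaryDirectory() as tmp:
    share = os.path.join(tmp, "share")
    write_share(share, 50, 1000000)
    print(read_share(share))
```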

30
Reservfs API
Command line output
killerbee# cd /reserv
killerbee# ls -al
total 5
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:37 .
drwxr-xr-x  20 root  wheel  512 Sep 12 21:54 ..
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:37 cpu
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:37 fxp0
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:37 fxp1
killerbee# cd fxp0
killerbee# ls -alR
total 6
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
-rw-------   1 root  wheel    1 Sep 15 11:39 newqueue
-rw-------   1 root  wheel    1 Sep 15 11:39 newreserv
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 q0
-r--------   1 root  wheel    1 Sep 15 11:39 share

./q0:
total 4
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
-rw-------   1 root  wheel    1 Sep 15 11:39 backlog
-rw-------   1 root  wheel    1 Sep 15 11:39 share
31
Reservfs API
killerbee# cd q1
killerbee# ls -al
total 4
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
-rw-------   1 root  wheel    1 Sep 15 11:39 share
-rw-------   1 root  wheel    1 Sep 15 11:39 backlog
killerbee# cat share
50 1000000
killerbee#

killerbee# cd r0
killerbee# ls -al
total 6
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
-rw-------   1 root  wheel    1 Sep 15 11:39 newqueue
-rw-------   1 root  wheel    1 Sep 15 11:39 newreserv
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 q0
-r--------   1 root  wheel    1 Sep 15 11:39 share
killerbee# echo 50 1000000 > newqueue
killerbee# ls -al
total 6
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
-rw-------   1 root  wheel    1 Sep 15 11:39 newqueue
-rw-------   1 root  wheel    1 Sep 15 11:39 newreserv
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 q0
dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 q1
-r--------   1 root  wheel    1 Sep 15 11:39 share
32
Reservfs
[Figure: web server and video server above the application interface, the reservation file system, and the scheduler interface]
33
Reservfs Scheduler Interface
  • Schedulers register by providing the following
    interface routines via reservfs_register()
  • init(priv)
  • create(priv, parent, type)
  • start(priv, parent, type)
  • delete(priv, node)
  • get/set(priv, node, values, type)

34
Reservfs Implementation
  • Built via vnode/vfs interface
  • A reserv structure represents each reservfs
    file
  • A reserv representing a directory contains a
    pointer to the corresponding node at the scheduler
  • Scheduler independent
  • Implements garbage collection mechanism

35
Talk Outline
  • Introduction
  • Schedulers
  • Reservation File System (reservfs)
  • Tagging
  • Web Server Experiments
  • Access Control and Profiles
  • Eclipse/BSD Status
  • Related Work
  • Future Work

36
Tagging
  • A request arriving at a scheduler must be
    associated with the appropriate reservation
  • Each request is tagged with a pointer to a queue
    node
  • mbuf, buf and proc are augmented
  • How is a request tagged?

37
Tagging (contd.)
  • For a file, its file descriptor is tagged with a
    disk reservation
  • For a connected socket, its file descriptor is
    tagged with a network reservation
  • For unconnected sockets, we provide a late
    tagging mechanism
  • Each process is tagged with a cpu reservation
  • We associate reservations with references to
    objects

38
Default List of a Process
  • Default reservations of a process, one for each
    resource
  • A list of tags (pointers to queue directories)
  • Used when a tag is otherwise not specified
  • Two new files are added for each process pid in
    /proc/pid
  • /proc/pid/default to represent the default list
  • /proc/pid/cdefault to represent the child
    default list

39
Default List of a Process (contd.)
  • Reading these files returns the names of the default
    queue directories, e.g.,
  • /reserv/cpu/q1
  • /reserv/fxp0/r2/q1
  • /reserv/da0/r1/q3
  • A process, with the appropriate access rights,
    can change the entries of default files
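
A process's default list can be turned into a per-resource lookup with a few lines of parsing. This sketch assumes the file returns one queue-directory path per line and that the resource name is the path component directly under /reserv; both are assumptions, and `parse_default_list` is a hypothetical helper.

```python
def parse_default_list(text):
    """Map each resource name to its default queue directory.

    text -- contents read from a default-list file, assumed to be one
            path per line such as /reserv/cpu/q1
    """
    defaults = {}
    for line in text.splitlines():
        path = line.strip()
        if not path:
            continue
        parts = path.split("/")  # e.g. ['', 'reserv', 'cpu', 'q1']
        if len(parts) < 4 or parts[1] != "reserv":
            raise ValueError(f"not a queue directory under /reserv: {path!r}")
        defaults[parts[2]] = path
    return defaults


sample = "/reserv/cpu/q1\n/reserv/fxp0/r2/q1\n/reserv/da0/r1/q3\n"
print(parse_default_list(sample))
```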

40
Implicit Tagging
  • The file descriptor returned by open(), accept()
    or connect() is automatically tagged with the
    default reservation
  • The tag of the file descriptor of an unconnected
    socket is set to the default at sendto() and
    sendmsg()
  • When a process forks, the child process is tagged
    with the default cpu reservation

41
Explicit Tagging
  • The tag of a file descriptor can be set/read with
    new commands to fcntl()
  • F_SET_RES
  • F_GET_RES
  • A new system call chcpures() to change the cpu
    reservation of a process
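
The difference between implicit and explicit tagging can be modelled in user space. The class below is a toy model only: it does not call the real fcntl() commands, and `ProcessModel`, `open_on`, `set_res` and `get_res` are hypothetical names standing in for open()/accept()/connect() and the F_SET_RES and F_GET_RES commands.

```python
class ProcessModel:
    """Toy model of a process's descriptors and their reservation tags."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)  # resource -> default queue directory
        self.fd_tags = {}               # descriptor -> queue directory
        self._next_fd = 3

    def open_on(self, resource):
        """Model of open()/accept()/connect(): tag implicitly with the default."""
        fd = self._next_fd
        self._next_fd += 1
        self.fd_tags[fd] = self.defaults[resource]
        return fd

    def set_res(self, fd, queue_dir):
        """Model of the F_SET_RES command: retag a descriptor explicitly."""
        self.fd_tags[fd] = queue_dir

    def get_res(self, fd):
        """Model of the F_GET_RES command: read a descriptor's tag."""
        return self.fd_tags[fd]


proc = ProcessModel({"fxp0": "/reserv/fxp0/r2/q1", "da0": "/reserv/da0/r1/q3"})
sock = proc.open_on("fxp0")                # implicit tag from the default list
proc.set_res(sock, "/reserv/fxp0/r2/q2")   # explicit override (hypothetical queue)
print(proc.get_res(sock))
```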

42
Reservation Domains
  • Permissions of a process to use, create and
    manipulate reservations
  • The reservation domain of a process is
    independent of its protection domain

43
Reservations and Reservation Domains
[Figure: reservations grouped into reservation domain 1 and reservation domain 2]
44
Reservfs Garbage Collection
  • Based on reference counts
  • every application that is using a specific node
    adds a reference on it (to the vnode)
  • Triggered by the vnode layer
  • when the last application finishes using the node,
    the node is garbage collected
  • fcntl() available to maintain the node even if no
    references to it exist
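
Reference-counted collection with an opt-out can be sketched as follows. The class and method names are hypothetical, and the `pinned` flag only models the effect of the fcntl() option that keeps a node alive; the real mechanism is driven by the vnode layer.

```python
class ReservationNode:
    """Toy model of a reservfs node that is collected at zero references."""

    def __init__(self, name, pinned=False):
        self.name = name
        self.refs = 0
        self.pinned = pinned  # models the "maintain even if unreferenced" option
        self.alive = True

    def acquire(self):
        """An application starts using the node."""
        self.refs += 1

    def release(self):
        """An application stops using the node; collect it if nobody is left."""
        self.refs -= 1
        if self.refs == 0 and not self.pinned:
            self.alive = False


q1 = ReservationNode("q1")
q1.acquire()
q1.acquire()
q1.release()
print(q1.alive)  # still referenced by one application
q1.release()
print(q1.alive)  # last reference dropped
```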

45
SRP Input Processing
  • Demultiplexes incoming packets
  • before network and higher-level protocol
    processing
  • Unprocessed input queue per socket
  • Processes input protocols in context of receiving
    process
  • Drops packets when per-socket queue is full
  • Avoids receive livelock
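
The core idea, early demultiplexing into bounded per-socket queues with protocol work deferred to the receiver, can be modelled in a few lines. This is a user-space illustration rather than kernel code; the class name, the use of port numbers as socket keys, and the queue limit are assumptions.

```python
from collections import deque


class SrpInput:
    """Toy model of demultiplexing packets into bounded per-socket queues."""

    def __init__(self, limit):
        self.limit = limit  # maximum unprocessed packets per socket
        self.queues = {}    # port -> queue of raw packets
        self.dropped = 0

    def bind(self, port):
        self.queues[port] = deque()

    def arrive(self, port, packet):
        """Arrival path: demultiplex only, and drop early if the queue is full."""
        q = self.queues.get(port)
        if q is None or len(q) >= self.limit:
            self.dropped += 1
            return False
        q.append(packet)
        return True

    def receive(self, port):
        """Receiver context: protocol processing would happen here."""
        q = self.queues[port]
        return q.popleft() if q else None


srp = SrpInput(limit=2)
srp.bind(80)
for pkt in ["p1", "p2", "p3"]:
    srp.arrive(80, pkt)
print(srp.dropped, srp.receive(80))
```

Dropping at the per-socket queue means an overloaded receiver costs the system almost nothing per excess packet, which is how this style of input processing avoids receive livelock.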

46
Talk Outline
  • Introduction
  • Schedulers
  • Reservation File System (reservfs)
  • Tagging
  • Web Server Experiments
  • Access Control and Profiles
  • Eclipse/BSD Status
  • Related Work
  • Future Work

47
QoS Support for Web Server
  • Virtual hosting with Apache server
  • separate Apache server for each virtual host
  • single Apache server for all virtual hosts
  • Eclipse/BSD isolates and differentiates
    performance of virtual hosts
  • multiple Apache servers: implicit tagging
  • single Apache server: explicit tagging
  • We implemented an Apache module for explicit
    tagging

48
Experimental Setup
  • Apache Web Server
  • A multi-process server
  • (Pre)spawns helper processes
  • A process handles one request at a time
  • Each process calls accept() to service the next
    connection request
  • HTTP clients run on five different machines
  • Servers are running FreeBSD 2.2.8 or Eclipse/BSD
    2.2.8 on a PC (266 MHz Pentium Pro, 64 MB RAM, 9
    GB Seagate ST39173W fast wide SCSI disk)
  • Machines are connected with a 10/100 Mbps
    Ethernet switch

49
Experiments
  • Hosting two sites with two servers

Reservation domain of server 1
Reservation domain of server 2
50
CPU Intensive Workload
51
CPU Intensive Workload
52
Network Intensive Workload
53
Disk Intensive Workload
54
Input Intensive Workload
55
Input Intensive Workload
56
Experiments
  • Hosting virtual hosts with a single Apache server
  • Four web sites

57
Apache Module for Tagging
  • Apache code not modified; module added
  • Apache config defines which reservation to use
    based on a rule, e.g.,
  • directory-based
  • port-based
  • Module uses fcntl() and chcpures() for explicit
    tagging
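
A rule table of the kind described could look like the sketch below. The real module's configuration syntax is not shown in the slides, so the rule tuples, the function name `pick_reservation`, and all paths and ports are hypothetical.

```python
def pick_reservation(rules, port, path, default):
    """Return the reservation for a request, using the first matching rule.

    rules -- list of (kind, key, reservation) with kind "port" or "directory"
    """
    for kind, key, reservation in rules:
        if kind == "port" and key == port:
            return reservation
        if kind == "directory":
            prefix = key.rstrip("/")
            # Match the directory itself or anything beneath it.
            if path == prefix or path.startswith(prefix + "/"):
                return reservation
    return default


rules = [  # hypothetical rules
    ("port", 8080, "/reserv/fxp0/r1/q1"),
    ("directory", "/sites/companyA", "/reserv/fxp0/r2/q1"),
]
print(pick_reservation(rules, 80, "/sites/companyA/index.html", "/reserv/fxp0/q0"))
```

The chosen reservation would then be applied to the connection and the serving process through explicit tagging.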

58
Isolating Web Sites
Eclipse/BSD
59
Isolating Web Sites
FreeBSD
60
Talk Outline
  • Introduction
  • Reservation File System (reservfs)
  • Tagging
  • Schedulers
  • Apache Web Server Experiments
  • Access Control and Profiles
  • Eclipse/BSD Status
  • Related Work
  • Future Work

61
Access Control
  • Permissions of a process to use or modify the
    objects belonging to the reservfs
  • Currently, a process can use/modify reservations
    below its default list
  • Soon, Eclipse/BSD will have more sophisticated
    access control
  • process can have different permissions on a
    reservation (e.g., permission for tagging but
    not for modifying)
  • process can have permission on arbitrary set of
    reservations

62
Multiple Default Lists: Profiles
  • Multiple default lists (profiles) simplify
    explicit tagging
  • Server applications typically serve different
    entities (depending on client, content, etc.)
    with different QoS assignments
  • Global list of system-wide profiles
  • Profiles provide an easy way to manage and share
    default reservations of different entities

63
Talk Outline
  • Introduction
  • Reservation File System (reservfs)
  • Tagging
  • Schedulers
  • Apache Web Server Experiments
  • Access Control and Profiles
  • Eclipse/BSD Status
  • Related Work
  • Future Work

64
Eclipse/BSD Status
  • Derived from FreeBSD
  • 3.2
  • 2.2.8
  • FreeBSD compatible
  • Eclipse/BSD code is available at
    http://www.bell-labs.com/project/eclipse
    including
  • reservfs
  • hierarchical network scheduling
  • hierarchical disk scheduling
  • hierarchical cpu scheduling
  • input scheduling
  • also, Apache module for tagging and other
    applications

65
Related Work
  • ALTQ
  • good for routers
  • not sufficient for QoS support in a
    general-purpose OS
  • Resource Containers
  • different from Reservation Domains
  • limited (similar to our Profiles)
  • not flexible enough to specify a number of useful
    provisioning needs

66
Future work
  • QoS on cluster of servers
  • Support for fine-grained automatic tagging
  • More server applications
  • Supporting other QoS parameters
  • Other schedulers

67
Eclipse/BSD: an Operating System with Quality of
Service Support
Banu Özden, ozden_at_research.bell-labs.com