Conclusions from the European Roadmap on Control of Computing Systems

Slides: 53

Provided by: karl89

Learn more at: http://www.controlofsystems.org

1
Conclusions from the European Roadmap on Control
of Computing Systems
Karl-Erik Årzén, Anders Robertsson, Dan Henriksson
LTH, Lund University, Sweden
Mikael Johansson, Håkan Hjalmarsson, Karl Henrik Johansson
Royal Institute of Technology, Sweden
FeBiD06, Vancouver, April 3, 2006
2
Background

Recent large research interest
(academically as well as industrially initiated)
in
control-based methods for resource management
in real-time computing and communication systems
In most cases: allocation of memory, computing,
and/or communication resources

3
Examples

Performance control of web-servers,
Dynamic resource management in embedded systems,
Traffic control in communication networks,
Transaction management in database servers,
Autonomic computing
etc.

4
eBusiness

Multi-tier systems of Web browsers, business
logic and databases
Feedback at various levels
Queue Control
IBM, HP, Microsoft, Amazon, ...
Challenges
Modeling formalisms (DES, ODEs, queuing theory,
...)
Design of software and computing systems for
controllability

courtesy J. Hellerstein
5
ARTIST2

Roadmap outcome from ARTIST2-workshop in Lund,
Sweden, May 2005
EU/IST FP6 Network of Excellence
Embedded Systems Design
NSF-supported workshop on
Future trends in control of computer systems
by Hellerstein, Tilbury, and Abdelzaher, May 2005

6
(No Transcript)
7
Roadmap

Available for download at
http://www.control.lth.se/user/karlerik/roadmap1.pdf
Experiment
You have wireless network access: try the
server!
... or not.

8
An admission (control) problem
9
Report from the Swedish Emergency Management Agency
10
How to handle the overload problem?

Overprovision
(more capacity than needed on average)
Admission control
Some are denied access, but server continues to
operate.
Change service
(sending text-only at high loads)

11
Why is control of computing systems interesting?

Multidisciplinary
Several new challenges
Not covered within one traditional research
domain (queueing theory, computer science,
systems and control)
Need systematic tools for design and analysis
robustness to disturbances
better performance
Cost of operating computing systems is
rising/dominating (60-90%) [Hellerstein et al.,
2005]

12
Outline

Background and motivation
Computer systems in a control theoretic framework
Modeling issues
Roadmap: Research challenges in
Control of server systems,
Control of CPU resources,
Feedback scheduling of control systems,
Control of communication networks,
Error control of software systems,
Control middleware.
- - - - - - - - - - - - - - - - - - - - - - - -
4.15 pm Panel: Top Three Challenges in Control
of Networks and Systems

13
Contents of roadmap

Six research areas
Control of server systems,
Control of CPU resources
Feedback scheduling of control systems,
Control of communication networks,
Error control of software systems,
Control middleware.

how flexibility, adaptivity, performance and
robustness can be achieved in a real-time
computing or communication system through the use
of control theory
14
Modeling Formalisms

Heuristic approach vs. Model based control
Inherent robustness from feedback control
One reason why many ad hoc strategies work
More can be gained (systematic design and analysis)
Basic principle: Use simple enough models for
design and analysis
Model should capture the essential dynamics and show
behavior similar to the system's for different
distributions and load cases.

15
Modeling Formalisms

Identification
Sampling (SoH), noise, inherent nonlinearities
First principles (conservation, queueing
theory)
Computing systems: discrete-event dynamic systems
(DEDS)
real-time systems -> timed automata or
timed Petri nets
Risk of state-space explosion (does not scale
with arrival/service rates)
Well suited for safety and blocking properties,
but how does it relate to stability and
robustness?

16
Modeling of queueing-systems

Discrete event models
Queue theoretic model (Markov chains etc.)
Flow models (cont. time / average models)
Discrete time models
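
A minimal sketch of the last two model classes combined: a flow (fluid) queue model sampled in discrete time. The update rule x(k+1) = max(0, x(k) + h(lambda(k) - mu)), the function name, and the sample values are illustrative assumptions, not taken from the roadmap.

```python
def simulate_flow_queue(arrival_rates, service_rate, h=1.0, x0=0.0):
    """Discrete-time flow model of a single queue (illustrative sketch).

    x[k+1] = max(0, x[k] + h * (lambda[k] - mu)), where x is the queue
    length, lambda[k] the average arrival rate in interval k, and mu the
    average service rate. Individual requests are not modeled.
    """
    x = x0
    trajectory = [x]
    for lam in arrival_rates:
        # The queue integrates the rate difference and cannot go negative.
        x = max(0.0, x + h * (lam - service_rate))
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # Two intervals of overload followed by three intervals of underload.
    print(simulate_flow_queue([12, 12, 8, 8, 8], service_rate=10))
```

The max(0, ...) saturation is one source of the inherent nonlinearity of queue models: the model is a pure integrator only while the queue is non-empty.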

17
Modeling aspects

Gain-scheduling (standard control principle)
Choose among different control-parameters
depending on e.g., operating condition.
Good model structure of corresponding computing
system may change with work load (e.g., for
server systems)
Flow models OK for high loads
DEDS-models feasible for low loads
Interpolation between different model
structures?!
Transient vs steady-state behavior
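
Gain scheduling can be sketched as a table lookup on the operating condition. The breakpoints on utilization and the (Kp, Ki) pairs below are hypothetical placeholders; in practice each region would get gains designed for the model that is valid there.

```python
from bisect import bisect_right

# Hypothetical schedule: utilization breakpoints split the operating
# range into three regions, each with its own (Kp, Ki) controller gains.
BREAKPOINTS = [0.5, 0.8]
GAINS = [(0.2, 0.02), (0.5, 0.05), (1.0, 0.10)]


def scheduled_gains(utilization):
    """Return the (Kp, Ki) pair for the region containing `utilization`."""
    return GAINS[bisect_right(BREAKPOINTS, utilization)]


if __name__ == "__main__":
    for load in (0.3, 0.6, 0.95):
        print(load, scheduled_gains(load))
```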

18
Actuator Mechanisms

The difference between the service rate, µ, and
the arrival rate, λ, determines the delay
experienced by the requests.
Enqueue actuators (Changing the arrival rate)
Admission control mechanism
Change inter-arrival period of task upstream in
multitiered system
Dequeue actuators (Changing the service rate)
Number of server threads
Quality adaptation
Dynamic voltage scaling
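
As a worked illustration of how the rate difference sets the delay, assume an M/M/1 queue (an assumption made here; the slide does not fix a queue model). Its stationary mean time in system is 1/(µ - λ) for λ < µ, so both actuator types act on the same quantity.

```python
def mm1_mean_delay(service_rate, arrival_rate):
    """Mean time in system for a stable M/M/1 queue: 1 / (mu - lambda).

    Returns infinity when the queue is not stable (lambda >= mu).
    The M/M/1 assumption is for illustration only.
    """
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    # Enqueue actuation: admission control lowers lambda from 9 to 8.
    print(mm1_mean_delay(10.0, 9.0), mm1_mean_delay(10.0, 8.0))
    # Dequeue actuation: more threads raise mu from 10 to 11.
    print(mm1_mean_delay(11.0, 9.0))
```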

19
Actuators - Implementation aspects

Gate model
Call gapping: accept first u(kh) calls in
control interval
Percent blocking preserves distribution
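
The two gate mechanisms can be sketched as follows. Function names and the representation of calls as a list of arrival times within one control interval are assumptions for illustration. Percent blocking admits each call independently with a fixed probability; random thinning of this kind keeps a Poisson arrival stream Poisson, which is one reading of "preserves distribution".

```python
import random


def call_gapping(arrival_times, u):
    """Admit only the first u calls of the control interval."""
    return arrival_times[:max(0, int(u))]


def percent_blocking(arrival_times, admit_fraction, rng=None):
    """Admit each call independently with probability `admit_fraction`."""
    rng = rng or random.Random()
    return [t for t in arrival_times if rng.random() < admit_fraction]


if __name__ == "__main__":
    calls = [0.1, 0.2, 0.5, 0.9]  # arrival times within one interval
    print(call_gapping(calls, 2))
    print(percent_blocking(calls, 0.5, random.Random(0)))
```

Call gapping admits a burst at the start of each interval, whereas percent blocking spreads the admitted calls over the whole interval.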

20
Related research areas

Similarities/differences
of the different domains
Traffic flow control
Manufacturing and supply chains
Communication networks
Power networks
with respect to
Where does the congestion appear?
Routing?
Available information (dest.)?
Time/distance matters?
Package dropping OK or not?
Control action?

21
Control of server systems

Temporal control locally at server
Direct or indirect objective
(service provider vs. customer)
Queue-management and load balancing
Inherent nonlinearities
Multi-tiered systems including large eCommerce
systems

22
Example: Admission control

Objective
Good transient behavior for traffic changes
Preserve good performance for overload
situations
Measure of admission
queue length
average time
utilization
CPU load / energy consumption
memory
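
A minimal sketch of admission control with queue length as the measured signal, assuming a discrete-time flow model of the queue and a PI controller with the service rate as a feedforward term. The controller structure, gains, and all numbers are illustrative assumptions, not the design used in the presentation.

```python
def pi_admission_control(offered_rates, service_rate, q_ref,
                         kp=0.5, ki=0.1, h=1.0):
    """Simulate PI admission control on a flow-model queue (sketch).

    The controller computes a desired admitted rate u from the queue
    error; the gate can admit at most the offered rate and at least 0.
    Returns a list of (admitted_rate, queue_length) per interval.
    """
    q = 0.0
    integral = 0.0
    history = []
    for offered in offered_rates:
        error = q_ref - q
        candidate = integral + ki * h * error
        u = service_rate + kp * error + candidate
        u_sat = min(max(u, 0.0), offered)
        if u_sat == u:
            # Anti-windup: integrate only while the gate is not saturated.
            integral = candidate
        q = max(0.0, q + h * (u_sat - service_rate))
        history.append((u_sat, q))
    return history


if __name__ == "__main__":
    # Sustained overload: offered rate 30, service rate 10, target queue 20.
    result = pi_admission_control([30.0] * 50, 10.0, 20.0)
    print(result[0], result[-1])
```

Under overload the queue settles at the reference and the admitted rate at the service rate; at low load the gate saturates and every request is admitted.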

23
Example: Feedforward + feedback
24
Control of server systems

Prediction and state estimation based control
State and actuator constraints
Interesting region: When do the flow models
cease to be valid?
Changing models and criteria in different load
situations...
Very exciting new results on discrete-event based
estimation and control
DE-sampling vs. DT-sampling
control ratio 1/5,
bandwidth allocation 1/2

25
Server systems - Research challenges

Modeling issues (as discussed before)
Control + queueing theory?
Event-based control: theory gap
Control objectives
References (load, utilization)
Performance metrics and cost functions
(upcrossing probabilities)
Security, reliability, availability, efficiency
Design patterns/Control patterns
Software structure control structure and
analysis for software design
Well known in e.g., process control (ratio
control, cascade, midranging etc)
When should a queue problem be considered as
an admission problem?
a delay control problem?
Large-scale distributed systems / multi-tier
systems
Distributed control, MPC, ...

26
Control of CPU resources

A large number of feedback-based or adaptive
global QoS management systems have been proposed.
Early ad hoc schemes
of multi-level feedback
queue scheduling
Control-theoretical approaches
using FC-EDF, EUCON
Stankovic, Lu, Buttazzo, ...

The FC-EDF scheme (from Stankovic et al., 1999)
27
Control of CPU resources The challenges and
research directions

Multiprocessor systems
Power-aware CPU scheduling
Dynamic Voltage Scaling
The joint optimization problem of minimizing energy
while still meeting real-time constraints
already receives considerable attention
from the research community.
End-to-end resource management
Resource management in distributed systems where
an activity spans multiple nodes
Hierarchical resource allocation schemes
Cascaded structures with local allocation
Efficient feedback scheduling mechanisms
Scheduling algorithm overhead: is online
optimization doable?

28
Feedback scheduling of control tasks

Actuation
Task period h_i
Solve two different problems
Resource regulation
Control the total utilization to avoid overloads
Optimal resource distribution
Assign individual task periods to optimize
performance
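
Resource regulation can be sketched with a simple rescaling rule, assuming the utilization is U = sum(C_i / h_i) with execution times C_i and periods h_i. Uniform rescaling of all nominal periods is one simple choice and is used here as an assumption; it does not address the optimal distribution problem.

```python
def rescale_periods(exec_times, nominal_periods, u_setpoint):
    """Stretch all task periods by a common factor to meet a utilization
    setpoint (illustrative sketch).

    Utilization is U = sum(C_i / h_i). Periods are never made shorter
    than their nominal values.
    """
    u_nominal = sum(c / h for c, h in zip(exec_times, nominal_periods))
    scale = max(1.0, u_nominal / u_setpoint)
    return [h * scale for h in nominal_periods]


if __name__ == "__main__":
    # Two tasks that together request 100% CPU, setpoint 50%.
    print(rescale_periods([5.0, 5.0], [10.0, 10.0], 0.5))
```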

29
Example: Dynamic Real-Time Scheduling of Model
Predictive Controllers

Based on on-line optimization of a cost function
Convex optimization problem solved in each sample
Iterative anytime algorithm
Result gradually refined up to a certain bound
Attractive control strategy
Straightforward to use for multi-variable
processes
Ability to handle constraints
Unattractive real-time properties
High computational demands
Very large variations in execution times

Henriksson et al. 2004
30
Example: Feedback scheduling of MPC control tasks
Main idea

A process in stationarity may need fewer resources
than a process in a transient phase
Use feedback from the optimization algorithm to
determine
for each MPC task, when to terminate the
optimization and output the control signal
(the optimization may be terminated early and
still produce acceptable results), and
which of several ready MPC tasks should be
scheduled for execution.

Henriksson et al., 2004
31

Current values of the cost functions act as
dynamic task priorities
Constitutes an on-line QoS measure for the task
Reflects the relative importance of the tasks
Feedback scheduler distributes the computing
resources
Schedules MPC task with highest cost
Invoked after each iteration
Implemented as a separate task
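
The scheduling rule above can be sketched as a loop that, after each optimizer iteration, picks the task with the highest current cost. The `improve` callback stands in for one iteration of an anytime optimizer and, like the task names and the iteration budget, is a hypothetical placeholder.

```python
def schedule_by_cost(costs, improve, budget):
    """Cost-based dynamic-priority scheduling of anytime tasks (sketch).

    costs:   dict mapping task name -> current cost function value
    improve: function giving the cost after one more optimizer iteration
    budget:  total number of iterations available before the deadline
    Returns the execution order and the final costs.
    """
    costs = dict(costs)  # do not modify the caller's dict
    order = []
    for _ in range(budget):
        # The scheduler is invoked after each iteration: highest cost wins.
        task = max(costs, key=costs.get)
        costs[task] = improve(costs[task])
        order.append(task)
    return order, costs


if __name__ == "__main__":
    # Assume, for illustration only, that each iteration halves the cost.
    print(schedule_by_cost({"A": 8.0, "B": 3.0}, lambda c: c / 2, budget=3))
```

A task in a transient phase (high cost) receives most of the iterations, while a task near stationarity is terminated early.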

Cooperative robot task under resource constraints
Master and slave configuration
Ball and beam application

Problems
MPC tasks exhibit very large variations in
execution time
Traditional scheduling theory not applicable
Solutions
Premature termination of optimization
Dynamic scheduling based on cost functions

34
The challenges and research directions for
feedback scheduling of control tasks

include all the challenges and research directions
of control of CPU resources.
Additionally, the following items are important:
Temporal robustness indices
Formal performance guarantees
open question whether it is possible to combine
the flexibility implied by feedback scheduling
with formal guarantees

35
Control of Communication Networks

Example
Feedback control is embedded in the TCP protocol
in the form of a sliding window mechanism.
Introduced in the 80s to solve the congestive
failure problems that had brought down the
network.
We have not experienced system-wide congestive
failures again even though the network has grown
orders of magnitude.
This is a testament to the effectiveness of
feedback control in a highly dynamic,
decentralized, and fast changing environment.
Remark
9.00 talk: "Robust yet Fragile: Intrinsic Tradeoffs in
Layered Architectures"
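
The feedback in TCP's window mechanism can be sketched with the additive-increase, multiplicative-decrease (AIMD) rule used in congestion avoidance. This is a strongly simplified model introduced here for illustration: slow start, timeouts, and the receiver window are ignored, and the parameter values are the textbook defaults.

```python
def aimd_step(cwnd, loss, increase=1.0, decrease=0.5, min_cwnd=1.0):
    """One round-trip update of a congestion window (simplified AIMD).

    No loss: the window grows by `increase` segments per round trip.
    Loss:    the window is multiplied by `decrease` (halved by default).
    """
    if loss:
        return max(min_cwnd, cwnd * decrease)
    return cwnd + increase


if __name__ == "__main__":
    cwnd = 10.0
    for loss in (False, False, True, False):
        cwnd = aimd_step(cwnd, loss)
        print(cwnd)
```

Packet loss acts as the measured congestion signal, and every sender runs the same rule locally without any central coordinator.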

36
Control of Communication Networks

Feedback control mechanisms are fundamental for
the separation of communication layers
Gives robustness and allows local optimization
and refinements

Example
Reliable data transfer over wireless link through
suitable feedback control of
transmission power
modulation scheme
channel coding

37
Research Challenges in Control of Communication
Networks

Architectures and model abstractions for network
control
Network models suitable for control and observer
design
Robustness of large scale and distributed systems
Resource management in wireless networks
Cross-layer adaptation for new services and
optimized performance

38
Cross-layer adaptation for improved performance
of cellular and wired networks

Bandwidth variations in radio link give
performance degradations due to large
end-to-end delay and improper transport protocol
Proxy between cellular and wired networks adapts
sending rate to bandwidth variations through
available radio link state information

[Figure: network diagram. Labels: App Server, Internet, PROXY, 3G-GGSN, 3G-SGSN, RNC, BTS, Terminal, 3G Cellular Network, TCP, BW variations]
39
Proxy hybrid control law

Controller in proxy regulates sending rate based
on
Events generated by bandwidth changes obtained
from RNC
Sampled measurements of queue length in RNC

Möller et al., 2005
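
A minimal sketch of a hybrid (event-driven plus sampled) rate controller, assuming the sending rate is the latest reported bandwidth plus a proportional correction on the queue error. This structure, the class name, and the parameters are assumptions for illustration; the actual control law of Möller et al. is not reproduced here.

```python
class ProxyRateController:
    """Hybrid rate controller sketch (hypothetical structure).

    Bandwidth changes arrive as events from the RNC; queue-length
    measurements arrive as periodic samples. Both update the rate.
    """

    def __init__(self, bandwidth, q_ref, gain):
        self.bandwidth = bandwidth  # last bandwidth reported by an event
        self.q_ref = q_ref          # desired queue length in the RNC
        self.gain = gain            # proportional gain on the queue error
        self.queue = q_ref          # last sampled queue length

    def on_bandwidth_event(self, new_bandwidth):
        """Event-triggered part: react at once to a bandwidth change."""
        self.bandwidth = new_bandwidth
        return self.rate()

    def on_queue_sample(self, queue_length):
        """Time-triggered part: correct the rate from the sampled queue."""
        self.queue = queue_length
        return self.rate()

    def rate(self):
        """Current sending rate, never negative."""
        return max(0.0, self.bandwidth + self.gain * (self.q_ref - self.queue))


if __name__ == "__main__":
    ctrl = ProxyRateController(bandwidth=100.0, q_ref=10.0, gain=2.0)
    print(ctrl.on_bandwidth_event(50.0))  # bandwidth drop reported by RNC
    print(ctrl.on_queue_sample(20.0))     # queue above reference
```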
40
Experimental evaluation

Improved time-to-serve-user and link utilization
compared to traditional end-to-end protocol

Stability and robustness analysis of new protocol
Ongoing experimental evaluation and testing with

Möller et al., 2005
41
Network-aware control architecture

Estimate network state
Delay
Data loss probability
Bandwidth
Adjust controller accordingly

42
Network-aware controllers

Control algorithms to cope with communication
imperfections
Control under network delay
Control under data loss
Control under bandwidth limitation
Control under topology constraints

Characteristics depend on network technology
43
Delay estimation

Internet round-trip time (RTT) data are
noisy with a piecewise constant average
Complex network dynamics, hard to model
RTT estimation in TCP
Improved estimation through a Kalman filter with
hypothesis test (CUSUM filter)

Jacobsson et al., 2004
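
A sketch of the idea, assuming a scalar random-walk model for the mean RTT and a two-sided CUSUM test on the filter innovations. All parameter values are hypothetical, and this is not the exact algorithm of Jacobsson et al.; it only illustrates how a change detector lets the filter smooth heavily between level shifts and still track a shift quickly.

```python
def kalman_cusum(samples, q=0.01, r=25.0, drift=5.0, threshold=20.0):
    """Scalar Kalman filter with CUSUM change detection (sketch).

    q, r:      assumed process and measurement noise variances
    drift:     CUSUM slack subtracted from each innovation
    threshold: CUSUM alarm level
    Returns (estimates, alarm_indices).
    """
    x, p = samples[0], r
    g_pos = g_neg = 0.0
    estimates, alarms = [], []
    for k, y in enumerate(samples):
        p_pred = p + q                       # time update (random walk)
        innovation = y - x
        g_pos = max(0.0, g_pos + innovation - drift)
        g_neg = max(0.0, g_neg - innovation - drift)
        if g_pos > threshold or g_neg > threshold:
            # Change detected: restart the filter at the new level.
            x, p = y, r
            g_pos = g_neg = 0.0
            alarms.append(k)
        else:
            gain = p_pred / (p_pred + r)     # measurement update
            x = x + gain * innovation
            p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates, alarms


if __name__ == "__main__":
    rtt = [100.0] * 20 + [150.0] * 20        # noise-free level shift
    est, alarms = kalman_cusum(rtt)
    print(alarms, est[19], est[-1])
```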
44
Control middleware

Middleware
a software abstraction layer that mediates the
interactions between a component or application
Commonly used in distributed systems to provide
communication services.
Java-RMI, Microsoft's .COM, and CORBA
Networked embedded system applications,
e.g., mobile systems and sensor systems.
GAIA [Román et al., 2002], WSAMI [Issarny et al.,
2005], and AURA

45
Control middleware

Research Directions
The most important research item for control
middleware is to develop these systems from
research prototypes to something that may be used
more widely.
Middleware functionality
It is still an open question whether the middleware
should be
passive, i.e., provide sensing and actuation
services that the application can use to
implement the feedback control itself, or
active, i.e., the middleware should be
responsible for the actual control loop.
Both of these approaches have advantages and
disadvantages.

46
Error control of software systems [L. Sha]

The idea behind error control of software is to
use ideas similar to those used in feedback
control in order to detect malfunctioning
software components and, in that case, fall back
on a well-tested core software component that is
able to provide the basic application service
with guarantees on performance and safety.
Provide techniques and tools that support making
the semantic assumptions of each software
component explicit and machine checkable.

Simple and reliable core
System remains in recoverable states
SIMPLEX architecture [Sha]
High assurance vs. high performance
Need to stay in recoverable state
Runs in parallel (cf. bumpless transfer)
--------------------------------------------------
ORTGA FeBID06
Maximum stability region
How to detect conditions for switches? (FDI)
False alarm vs. Non-recovery risk of instability
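
The switching logic can be sketched as a decision module that keeps the high-performance controller's command only while the state lies inside a recoverable region. Here the region is assumed to be a Lyapunov-type ellipsoid x^T P x <= level; the function name, the matrix P, and the level are hypothetical, and a real decision module would typically also predict where the command takes the state.

```python
def simplex_select(state, p_matrix, level, hp_command, safe_command):
    """Simplex-style decision module (illustrative sketch).

    Uses the high-performance command while V(x) = x^T P x <= level,
    otherwise falls back on the well-tested safe controller's command.
    """
    n = len(state)
    v = sum(state[i] * p_matrix[i][j] * state[j]
            for i in range(n) for j in range(n))
    return hp_command if v <= level else safe_command


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(simplex_select([0.5, 0.5], identity, 1.0, "hp", "safe"))
    print(simplex_select([1.0, 1.0], identity, 1.0, "hp", "safe"))
```

Choosing the level trades false alarms (switching too early and losing performance) against the risk of leaving the region from which the safe controller can still recover.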

48
Roadmap

Available for download at
http://www.control.lth.se/user/karlerik/roadmap1.pdf

49
(No Transcript)
50
Conclusions

Thank you for your attention!
Questions?
Panel debate

51
(No Transcript)
52
Proposed solutions for wireless TCP

Split connection
Destroys end-to-end semantics
End-to-end protocols
Deployment issues
Link-layer improvements
Performance limitations
E.g., Balakrishnan et al., Ludwig and Katz,
Xylomenos et al., Huang et al., Hossain et al.,
RFC 3135 and 3366,
