Conclusions from the European Roadmap on Control of Computing Systems

1
Conclusions from the European Roadmap on Control
of Computing Systems
Karl-Erik Årzén, Anders Robertsson, Dan Henriksson
LTH, Lund University, Sweden
Mikael Johansson, Håkan Hjalmarsson, Karl Henrik Johansson
Royal Institute of Technology, Sweden
FeBiD06, Vancouver, April 3, 2006
2
Background
  • Recent large research interest (academically as
    well as industrially initiated) in control-based
    methods for resource management in real-time
    computing and communication systems
  • In most cases, allocation of memory, computing
    and/or communication resources

3
Examples
  • Performance control of web-servers,
  • Dynamic resource management in embedded systems,
  • Traffic control in communication networks,
  • Transaction management in database servers,
  • Autonomic computing
  • etc.

4
eBusiness
  • Multi-tier systems of Web browsers, business
    logic and databases
  • Feedback at various levels
  • Queue Control
  • IBM, HP, Microsoft, Amazon, ...
  • Challenges
  • Modeling formalisms (DES, ODEs, queuing theory,
    ...)
  • Design of software and computing systems for
    controllability

courtesy J. Hellerstein
5
ARTIST2
  • Roadmap outcome from ARTIST2-workshop in Lund,
    Sweden, May 2005
  • EU/IST FP6 Network of Excellence
  • Embedded Systems Design
  • NSF-supported workshop on
  • Future trends in control of computer systems
  • by Hellerstein, Tilbury and Abdelzaher, May 2005

6
(No Transcript)
7
Roadmap
  • Available for download at
  • http://www.control.lth.se/user/karlerik/roadmap1.pdf
  • Experiment
  • You have wireless network access: try the
    server!
  • or not.

8
An admission (control) problem
9
Report from the Swedish Emergency Management Agency
10
How to handle the overload problem?
  • Overprovision
  • (more capacity than needed on average)
  • Admission control
  • Some are denied access, but server continues to
    operate.
  • Change service
  • (sending text-only at high loads)

11
Why is control of computing systems interesting?
  • Multidisciplinary
  • Several new challenges
  • Not covered within one traditional research
    domain (queueing theory, computer science,
    systems and control)
  • Need systematic tools for design and analysis
  • robustness to disturbances
  • better performance
  • Cost of operating computing systems is
    rising/dominating (60-90%) (Hellerstein et al.,
    2005)

12
Outline
  • Background and Motivation
  • Computer systems in a control theoretic framework
  • Modeling issues
  • Roadmap: Research challenges in
  • Control of server systems,
  • Control of CPU resources,
  • Feedback scheduling of control systems,
  • Control of communication networks,
  • Error control of software systems,
  • Control middleware.
  • - - - - - - - - - - - - - - - - - - - - - - - -
  • 4.15 pm Panel: Top Three Challenges in Control
    of Networks and Systems

13
Contents of roadmap
  • Six research areas
  • Control of server systems,
  • Control of CPU resources
  • Feedback scheduling of control systems,
  • Control of communication networks,
  • Error control of software systems,
  • Control middleware.

how flexibility, adaptivity, performance and
robustness can be achieved in a real-time
computing or communication system through the use
of control theory
14
Modeling Formalisms
  • Heuristic approach vs. Model based control
  • Inherent robustness from feedback control
  • One reason why many ad hoc strategies work
  • More can be gained (systematic design and analysis)
  • Basic principle: Use simple enough models for
    design and analysis
  • Model should capture essential dynamics and show
    similar behavior as system for different
    distributions and load cases.

15
Modeling Formalisms
  • Identification
  • Sampling (SoH), noise, inherent nonlinearities
  • First principles (conservation, queueing
    theory)
  • Computing systems: discrete-event dynamic systems
    (DEDS)
  • real-time systems -> timed automata or
    timed Petri nets
  • Risk of state-space explosion (does not scale
    with arrival/service rates)
  • Well-suited for safety and blocking properties,
    but how does it relate to stability and
    robustness?

16
Modeling of queueing-systems
  • Discrete event models
  • Queue theoretic model (Markov chains etc.)
  • Flow models (cont. time / average models)
  • Discrete time models
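
A discrete-time flow model of a single queue can be sketched as below. This is only an illustration and not taken from the slides: the update rule x(k+1) = max(0, x(k) + h(λ(k) − µ)), the function name and all numbers are assumptions.

```python
def simulate_flow_queue(arrival_rates, service_rate, h=1.0, x0=0.0):
    """Discrete-time fluid (flow) model of one queue.

    x(k+1) = max(0, x(k) + h * (lambda(k) - mu)), where h is the
    sampling interval. Names and values are illustrative assumptions.
    """
    x = x0
    trajectory = [x]
    for lam in arrival_rates:
        # The queue grows when arrivals exceed service, and cannot go negative.
        x = max(0.0, x + h * (lam - service_rate))
        trajectory.append(x)
    return trajectory


# Load step: arrivals exceed the service rate for three intervals.
queue = simulate_flow_queue([5, 15, 15, 15, 5, 5], service_rate=10)
print(queue)  # prints [0.0, 0.0, 5.0, 10.0, 15.0, 10.0, 5.0]
```

Such an average model ignores the randomness of individual arrivals, which is why it is mainly reasonable at high load.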

17
Modeling aspects
  • Gain-scheduling (standard control principle)
  • Choose among different control-parameters
    depending on e.g., operating condition.
  • Good model structure of corresponding computing
    system may change with work load (e.g., for
    server systems)
  • Flow models OK for high loads
  • DEDS-models feasible for low loads
  • Interpolation between different model
    structures?!
  • Transient vs steady-state behavior
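
Gain scheduling can be illustrated with a small lookup of controller parameters by operating condition. The sketch below is hypothetical: the choice of utilization as scheduling variable, the breakpoints and the gain values are assumptions made only for illustration.

```python
# Hypothetical schedule: (upper utilization bound, controller gains).
GAIN_SCHEDULE = [
    (0.5, {"kp": 0.2, "ki": 0.02}),   # low load
    (0.8, {"kp": 0.5, "ki": 0.10}),   # medium load
    (1.0, {"kp": 0.9, "ki": 0.30}),   # high load
]


def select_gains(utilization):
    """Return the gain set of the first region that contains the utilization."""
    for upper_bound, gains in GAIN_SCHEDULE:
        if utilization <= upper_bound:
            return gains
    # Above the last breakpoint: keep using the high-load gains.
    return GAIN_SCHEDULE[-1][1]


print(select_gains(0.3))   # low-load gains
print(select_gains(0.95))  # high-load gains
```

The same lookup idea could switch between model structures (DEDS at low load, flow model at high load) instead of between gains.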

18
Actuator Mechanisms
  • The difference between the service rate, µ, and
    the arrival rate, λ, determines the delay
    experienced by the requests.
  • Enqueue actuators (Changing the arrival rate)
  • Admission control mechanism
  • Change inter-arrival period of task upstream in
    multitiered system
  • Dequeue actuators (Changing the service rate)
  • Number of server threads
  • Quality adaptation
  • Dynamic voltage scaling
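
As a worked example of how the rate difference sets the delay, assume an M/M/1 queue (Poisson arrivals, exponential service times, one server). This assumption is not made on the slide. With arrival rate λ and service rate µ, the mean time in system T and the mean number in system N are:

```latex
T = \frac{1}{\mu - \lambda}, \qquad
N = \lambda T = \frac{\lambda}{\mu - \lambda}, \qquad \lambda < \mu
```

With µ = 100 requests/s, λ = 90 gives T = 0.1 s while λ = 99 gives T = 1 s, so both enqueue actuators (lowering λ) and dequeue actuators (raising µ) act on the same difference.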

19
Actuators - Implementation aspects
  • Gate model
  • Call gapping: accept first u(kh) calls in
    control interval
  • Percent blocking: preserves distribution
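
The two gate mechanisms can be contrasted with a minimal sketch. The function names, the timestamps and the admission fraction are assumptions for illustration only.

```python
import random


def call_gapping(arrivals, u):
    """Admit only the first u requests that arrive in the control interval."""
    return arrivals[:u]


def percent_blocking(arrivals, admit_fraction, rng):
    """Admit each request independently with probability admit_fraction."""
    return [a for a in arrivals if rng.random() < admit_fraction]


# Arrival timestamps (seconds) within one control interval; made-up data.
arrivals = [0.05, 0.10, 0.30, 0.45, 0.60, 0.80, 0.95]

print(call_gapping(arrivals, u=3))  # prints [0.05, 0.1, 0.3]
print(percent_blocking(arrivals, 0.5, random.Random(0)))
```

Call gapping admits everything early in the interval and nothing afterwards, so the admitted stream is bursty. Percent blocking thins the stream at random over the whole interval, which is why it preserves the shape of the arrival distribution.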

20
Related research areas
  • Similarities/differences
  • of the different domains
  • Traffic flow control
  • Manufacturing and supply chains
  • Communication networks
  • Power networks
  • with respect to
  • Where does the congestion appear?
  • Routing?
  • Available information (dest.)?
  • Time/distance matters?
  • Package dropping OK or not?
  • Control action?

21
Control of server systems
  • Temporal control locally at server
  • Direct or indirect objective
  • (service provider vs. customer)
  • Queue-management and load balancing
  • Inherent nonlinearities
  • Multi-tiered systems including large eCommerce
    systems

22
Example: Admission control
  • Objective
  • Good transient behavior for traffic changes
  • Preserve good performance for overload
    situations
  • Measure of admission
  • queue length
  • average time
  • utilization
  • CPU load / energy consumption
  • memory
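
A minimal sketch of admission control as a feedback loop, assuming queue length as the measured signal, a discrete-time flow model of the queue and a PI controller with simple anti-windup. The gains, rates and names are assumptions and are not taken from the presentation.

```python
def pi_admission_control(offered, service_rate, q_ref, kp=0.5, ki=0.1, h=1.0):
    """PI control of queue length via the admitted arrival rate (flow model)."""
    q, integral = 0.0, 0.0
    history = []
    for lam in offered:
        error = q_ref - q
        u_unsat = kp * error + ki * integral
        # Actuator limits: cannot admit a negative rate or more than is offered.
        u = min(max(u_unsat, 0.0), lam)
        if u == u_unsat:
            # Anti-windup: integrate only while the actuator is not saturated.
            integral += h * error
        q = max(0.0, q + h * (u - service_rate))
        history.append((u, q))
    return history


# Constant overload: 30 requests/s offered, 10 requests/s served.
history = pi_admission_control([30.0] * 200, service_rate=10.0, q_ref=20.0)
admitted, queue_length = history[-1]
print(round(admitted, 3), round(queue_length, 3))
```

In steady state the controller admits exactly the service rate and holds the queue at its reference, while the excess load is rejected.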

23
Example: Feedforward and feedback
24
Control of server systems
  • Prediction and state estimation based control
  • State and actuator constraints
  • Interesting region: When do the flow models
    cease to be valid?
  • Changing models and criteria in different load
    situations...
  • Very exciting new results on discrete-event based
    estimation and control
  • DE-sampling vs. DT-sampling
  • control ratio 1/5,
  • bandwidth allocation 1/2

25
Server systems - Research challenges
  • Modeling issues (as discussed before)
  • Control queueing theory ?
  • Event-based control theory gap
  • Control objectives
  • References (load, utilization)
  • Performance metrics and cost functions
  • (upcrossing probabilities)
  • Security, reliability, availability, efficiency
  • Design patterns/Control patterns
  • Software structure control structure and
    analysis for software design
  • Well known in e.g., process control (ratio
    control, cascade, midranging etc)
  • When should a queue problem be considered as
  • an admission problem?
  • a delay control problem?
  • Large-scale distributed systems / multi-tier
    systems
  • Distributed control, MPC, ...

26
Control of CPU resources
  • A large number of feedback-based or adaptive
    global QoS management systems have been proposed.
  • Early ad hoc schemes of multi-level feedback
    queue scheduling
  • Control-theoretical approaches using FC-EDF, EUCON
  • Stankovic, Lu, Buttazzo, ...

The FC-EDF scheme (from Stankovic et al., 1999)
27
Control of CPU resources: The challenges and
research directions
  • Multiprocessor systems
  • Power-aware CPU scheduling
  • Dynamic Voltage Scaling
  • joint optimization problem of minimizing energy
    while still meeting real-time constraints
  • already receives considerable attention
    from the research community.
  • End-to-end resource management
  • Resource management in distributed systems where
    an activity spans multiple nodes
  • Hierarchical resource allocation schemes
  • Cascaded structures with local allocation
  • Efficient feedback scheduling mechanisms
  • Scheduling algorithm overhead: is online
    optimization doable?

28
Feedback scheduling of control tasks
  • Actuation
  • Task period h_i
  • Solve two different problems
  • Resource regulation
  • Control the total utilization to avoid overloads
  • Optimal resource distribution
  • Assign individual task periods to optimize
    performance
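
Resource regulation can be sketched as a rescaling of the task periods. The example assumes that each task i has a known execution time C_i and that all periods are stretched by one common factor. This is a simple illustrative choice, not the optimal resource distribution, and all values are made up.

```python
def rescale_periods(exec_times, periods, u_setpoint):
    """Stretch all task periods by one common factor when overloaded.

    Utilization is U = sum(C_i / h_i). If U exceeds u_setpoint, every
    period h_i is multiplied by U / u_setpoint, which brings U back to
    the setpoint. Nominal periods are kept when there is no overload.
    """
    utilization = sum(c / h for c, h in zip(exec_times, periods))
    factor = max(1.0, utilization / u_setpoint)
    return [h * factor for h in periods]


exec_times = [2.0, 3.0, 4.0]      # C_i in ms (made-up values)
periods = [10.0, 10.0, 10.0]      # nominal h_i in ms (made-up values)
new_periods = rescale_periods(exec_times, periods, u_setpoint=0.6)
new_utilization = sum(c / h for c, h in zip(exec_times, new_periods))
print(new_periods, new_utilization)
```

Here the nominal utilization is 0.9, so all periods grow by a factor of about 1.5 and the utilization returns to 0.6. An optimal distribution would instead weight each task by how much its control performance suffers from a longer period.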

29
Example: Dynamic Real-Time Scheduling of Model
Predictive Controllers
  • Based on on-line optimization of a cost function
  • Convex optimization problem solved in each sample
  • Iterative anytime algorithm
  • Result gradually refined up to a certain bound
  • Attractive control strategy
  • Straightforward to use for multi-variable
    processes
  • Ability to handle constraints
  • Unattractive real-time properties
  • High computational demands
  • Very large variations in execution times

Henriksson et al., 2004
30
Example: Feedback scheduling of MPC control tasks
Main idea
  • A process in stationarity may need fewer resources
    than a process in a transient phase
  • Use feedback from the optimization algorithm to
    determine
  • for each MPC task, when to terminate the
    optimization and output the control signal
    (the optimization may be terminated early and
    still produce acceptable results), and
  • which of several ready MPC tasks should be
    scheduled for execution.

Henriksson et al., 2004
31
  • Current values of the cost functions act as
    dynamic task priorities
  • Constitutes an on-line QoS measure for the task
  • Reflects the relative importance of the tasks
  • Feedback scheduler distributes the computing
    resources
  • Schedules MPC task with highest cost
  • Invoked after each iteration
  • Implemented as a separate task
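
The scheduling rule above can be sketched as follows. The `AnytimeTask` class is a hypothetical stand-in for an MPC task: the real cost comes from the MPC optimization, whereas here each iteration simply shrinks the cost by an assumed factor.

```python
class AnytimeTask:
    """Stand-in for an MPC task whose optimization improves per iteration."""

    def __init__(self, name, cost, shrink):
        self.name = name
        self.cost = cost        # current value of the cost function
        self.shrink = shrink    # assumed per-iteration improvement factor

    def iterate(self):
        """Run one optimization iteration (modeled as a cost reduction)."""
        self.cost *= self.shrink


def feedback_schedule(tasks, iterations):
    """After each iteration, run the task that currently has the highest cost."""
    order = []
    for _ in range(iterations):
        task = max(tasks, key=lambda t: t.cost)
        task.iterate()
        order.append(task.name)
    return order


tasks = [AnytimeTask("A", 8.0, 0.5), AnytimeTask("B", 5.0, 0.5)]
print(feedback_schedule(tasks, 4))  # prints ['A', 'B', 'A', 'B']
```

Because the scheduler is invoked after every iteration, a task in a transient (high cost) gets the processor until its cost drops below that of the other tasks.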

32
  • Cooperative robot task under resource constraints
  • Master and slave configuration
  • Ball and beam application

33
  • Problems
  • MPC tasks exhibit very large variations in
    execution time
  • Traditional scheduling theory not applicable
  • Solutions
  • Premature termination of optimization
  • Dynamic scheduling based on cost functions

34
The challenges and research directions for
feedback scheduling of control tasks
  • include all the challenges and research directions
    of control of CPU resources.
  • Additionally, the following items are important
  • Temporal robustness indices
  • Formal performance guarantees
  • open question whether it is possible to combine
    the flexibility implied by feedback scheduling
    with formal guarantees

35
Control of Communication Networks
  • Example
  • Feedback control is embedded in the TCP protocol
    in the form of a sliding window mechanism.
  • Introduced in the 80s to solve the congestive
    failure problems that had brought down the
    network.
  • We have not experienced system-wide congestive
    failures again even though the network has grown
    orders of magnitude.
  • This is a testament to the effectiveness of
    feedback control in a highly dynamic,
    decentralized, and fast changing environment.
  • Remark
  • 9.00 Robust yet Fragile: Intrinsic Tradeoffs in
    Layered Architectures
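
The feedback in TCP can be illustrated by the additive-increase/multiplicative-decrease (AIMD) rule that TCP congestion avoidance applies to its window. The sketch is a strong simplification (no slow start, no timeouts, one loss flag per round trip), and the function name and loss pattern are assumptions.

```python
def aimd_window(loss_events, cwnd=1.0, increase=1.0, decrease=0.5):
    """Evolve a congestion window with AIMD, one step per round trip.

    loss_events is a sequence of booleans: True means a loss was
    detected in that round trip (the feedback signal).
    """
    trace = []
    for lost in loss_events:
        if lost:
            cwnd = max(1.0, cwnd * decrease)   # multiplicative decrease
        else:
            cwnd = cwnd + increase             # additive increase
        trace.append(cwnd)
    return trace


print(aimd_window([False, False, False, True, False, False]))
# prints [2.0, 3.0, 4.0, 2.0, 3.0, 4.0]
```

Each sender acts only on its own loss signal, which is what makes the scheme decentralized.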

36
Control of Communication Networks
  • Feedback control mechanisms are fundamental for
    the separation of communication layers
  • Gives robustness and allows local optimization
    and refinements
  • Example
  • Reliable data transfer over wireless link through
    suitable feedback control of
  • transmission power
  • modulation scheme
  • channel coding

37
Research Challenges in Control of Communication
Networks
  • Architectures and model abstractions for network
    control
  • Network models suitable for control and observer
    design
  • Robustness of large scale and distributed systems
  • Resource management in wireless networks
  • Cross-layer adaptation for new services and
    optimized performance

38
Cross-layer adaptation for improved performance
of cellular and wired networks
  • Bandwidth variations in radio link give
    performance degradations due to large
    end-to-end delay and improper transport protocol
  • Proxy between cellular and wired networks adapts
    sending rate to bandwidth variations through
    available radio link state information

[Figure: network diagram with labels App Server, Internet, PROXY, 3G Cellular Network (3G-GGSN, 3G-SGSN, RNC, BTS), Terminal, TCP, and BW variations]
39
Proxy hybrid control law
  • Controller in proxy regulates sending rate based
    on
  • Events generated by bandwidth changes obtained
    from RNC
  • Sampled measurements of queue length in RNC
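
The hybrid structure, with one event-triggered input and one time-triggered input, can be sketched as below. This is not the control law of Möller et al.: the class, the proportional correction and all parameters are hypothetical and only show how the two information sources might be combined.

```python
class ProxyRateController:
    """Hypothetical sending-rate controller with event and sampled inputs."""

    def __init__(self, q_ref, gain, bandwidth):
        self.q_ref = q_ref            # desired queue length in the RNC
        self.gain = gain              # assumed proportional gain
        self.bandwidth = bandwidth    # last reported radio link bandwidth
        self.rate = bandwidth

    def on_bandwidth_event(self, new_bandwidth):
        """Event-triggered part: follow the reported bandwidth immediately."""
        self.bandwidth = new_bandwidth
        self.rate = new_bandwidth
        return self.rate

    def on_queue_sample(self, queue_length):
        """Time-triggered part: correct the rate from the sampled queue length."""
        correction = self.gain * (self.q_ref - queue_length)
        self.rate = max(0.0, self.bandwidth + correction)
        return self.rate


controller = ProxyRateController(q_ref=10, gain=2.0, bandwidth=100.0)
print(controller.on_queue_sample(20))       # queue too long: prints 80.0
print(controller.on_bandwidth_event(40.0))  # bandwidth drop: prints 40.0
```

The event path reacts to bandwidth changes without waiting for the next sample, while the sampled path removes the remaining queue error.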

Möller et al., 2005
40
Experimental evaluation
  • Improved time-to-serve-user and link utilization
    compared to traditional end-to-end protocol
  • Stability and robustness analysis of new protocol
  • Ongoing experimental evaluation and testing with

Möller et al., 2005
41
Network-aware control architecture
  • Estimate network state
  • Delay
  • Data loss probability
  • Bandwidth
  • Adjust controller accordingly

42
Network-aware controllers
  • Control algorithms to cope with communication
    imperfections
  • Control under network delay
  • Control under data loss
  • Control under bandwidth limitation
  • Control under topology constraints

Characteristics depend on network technology
43
Delay estimation
  • Internet round-trip time (RTT) data are noisy
    with piecewise constant average
  • Complex network dynamics hard to model
  • RTT estimation in TCP
  • Improved estimation through Kalman filter with
    hypothesis test (CUSUM filter)
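
The two estimators can be contrasted with a small sketch. The first function follows the smoothed RTT update of the standard TCP retransmission timer (gains 1/8 and 1/4), leaving out clock granularity and the minimum timeout. The second is not the filter of Jacobsson et al.: it is a simplified fixed-gain filter with a two-sided CUSUM test standing in for the Kalman filter, and its parameters are assumptions.

```python
def tcp_rtt_estimator(samples, alpha=1 / 8, beta=1 / 4, k=4):
    """Smoothed RTT and timeout, simplified from the TCP timer rules."""
    srtt = rttvar = None
    out = []
    for r in samples:
        if srtt is None:
            srtt, rttvar = r, r / 2
        else:
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
            srtt = (1 - alpha) * srtt + alpha * r
        out.append((srtt, srtt + k * rttvar))
    return out


def cusum_mean_filter(samples, gain=0.1, drift=2.0, threshold=10.0):
    """Fixed-gain mean filter that restarts when a CUSUM test fires."""
    estimate = samples[0]
    g_pos = g_neg = 0.0
    estimates, alarms = [], []
    for k, r in enumerate(samples):
        residual = r - estimate
        g_pos = max(0.0, g_pos + residual - drift)   # upward change statistic
        g_neg = max(0.0, g_neg - residual - drift)   # downward change statistic
        if g_pos > threshold or g_neg > threshold:
            estimate, g_pos, g_neg = r, 0.0, 0.0     # change detected: restart
            alarms.append(k)
        else:
            estimate += gain * residual
        estimates.append(estimate)
    return estimates, alarms


rtt = [50.0] * 5 + [80.0] * 5   # made-up RTT samples (ms) with a level shift
estimates, alarms = cusum_mean_filter(rtt)
print(alarms, estimates[-1])    # prints [5] 80.0
print(tcp_rtt_estimator(rtt)[-1][0])
```

On a piecewise constant average, the change detector lets the estimate jump to the new level at once, whereas the exponentially smoothed estimate approaches it gradually.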

Jacobsson et al., 2004
44
Control middleware
  • Middleware
  • a software abstraction layer that mediates the
    interactions between a component or application
  • Commonly used in distributed systems to provide
    communication services.
  • Java-RMI, Microsoft's .COM, and CORBA
  • Networked embedded system applications,
  • e.g., mobile systems and sensor systems.
  • GAIA (Román et al., 2002), WSAMI (Issarny et al.,
    2005), and AURA

45
Control middleware
  • Research Directions
  • The most important research item for control
    middleware is to develop these systems from
    research prototypes to something that may be used
    more widely.
  • Middleware functionality
  • Still an open question whether the middleware
    should
  • be passive, i.e., provide sensing and actuation
    services that the application can use to itself
    implement the feedback control, or if it should
    be
  • active, i.e., the middleware should be
    responsible for the actual control loop.
  • Both of these approaches have advantages and
    disadvantages.

46
Error control of software systems (L. Sha)
  • The idea behind error control of software is to
    use ideas similar to those used in feedback
    control in order to detect malfunctioning
    software components and, in that case, fall back
    on a well-tested core software component that is
    able to provide the basic application service
    with guarantees on performance and safety.
  • Provide techniques and tools that support making
    the semantic assumptions of each software
    component explicit and machine checkable.

47
  • Simple and reliable core
  • System remains in recoverable states
  • SIMPLEX architecture (Sha)
  • High assurance vs. high performance
  • Need to stay in recoverable state
  • Runs in parallel (cf. bumpless transfer)
  • --------------------------------------------------
  • ORTGA FeBID06
  • Maximum stability region
  • How to detect conditions for switches? (FDI)
  • False alarm vs. non-recovery (risk of instability)
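
The switching idea behind a Simplex-style architecture can be sketched as below. Everything specific is an assumption: the scalar plant x(k+1) = x(k) + u(k), the recoverable region |x| <= 1 and both controllers are toy choices, whereas a real design would derive the region from a stability analysis of the safe controller.

```python
def simplex_step(x, high_performance, safe, in_recoverable_region, predict):
    """Use the high-performance command only if the predicted state stays recoverable."""
    u = high_performance(x)
    if in_recoverable_region(predict(x, u)):
        return u, "high-performance"
    # Fall back on the simple, well-tested controller.
    return safe(x), "safe"


# Toy setup (all assumed): integrator plant and two proportional controllers.
predict = lambda x, u: x + u
in_recoverable_region = lambda x: abs(x) <= 1.0
high_performance = lambda x: -3.0 * x   # aggressive, possibly faulty
safe = lambda x: -0.5 * x               # conservative core controller

print(simplex_step(0.2, high_performance, safe, in_recoverable_region, predict)[1])
# prints high-performance
print(simplex_step(0.8, high_performance, safe, in_recoverable_region, predict)[1])
# prints safe
```

A tighter region gives more false alarms (needless switches to the safe controller), while a looser one risks leaving the set of states from which the safe controller can still recover.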

48
Roadmap
  • Available for download at
  • http://www.control.lth.se/user/karlerik/roadmap1.pdf

49
(No Transcript)
50
Conclusions
  • Thank you for your attention!
  • Questions?
  • Panel debate

51
(No Transcript)
52
Proposed solutions for wireless TCP
  • Split connection
  • Destroys end-to-end semantics
  • End-to-end protocols
  • Deployment issues
  • Link-layer improvements
  • Performance limitations
  • E.g., Balakrishnan et al., Ludwig and Katz,
    Xylomenos et al., Huang et al., Hossain et al.,
    RFC 3135 and 3366,