Control Theory in Log Processing Systems

1
Control Theory in Log Processing Systems
  • Wei Xu (xuw_at_cs.berkeley.edu)
  • UC Berkeley
  • Joseph L. Hellerstein
  • IBM T.J. Watson Research Center

2
Outline
  • Data streams and log processing
  • Applying control theory
  • Controlling queue length
  • Load balancing
  • Lessons learned

3
Introduction
  • Goal of our project
  • A tool
  • A testbed
  • Problem: data rate up to 1 TB a day
  • Distributed infrastructure
  • How to make the infrastructure itself reliable?

4
Example of system log data
  • request data
  • Apache logs, etc.
  • performance data
  • CPU, memory, etc.
  • failure data
  • Detected problems / error messages
  • reports from operators

5
The big picture
[Diagram: Production System, Data Collection, Preprocessing, Sanitized Data, Repository, Automatic Analysis, Failure Detection; several links marked "?"]
6
Preprocessing
  • Sanitize the data
  • Put logs into common format
  • Merge information from various sources
  • Sampling
  • Needs to be fast

7
Stream processing
  • Log data are data streams
  • Preprocessing tasks are continuous queries
  • Telegraph Continuous Query (TCQ)
  • SQL queries
  • adaptive execution optimized on-the-fly
  • performance doesn't depend on queries

8
Data preprocessing architecture
[Diagram: load splitter, SLT 1, SLT 2, TCQ nodes each running query Q, combiner; stages labeled Intra-Event Processing and Inter-Event Processing]
9
Problem: performance disturbance
  • CPU contention
  • Maintenance tasks
  • Packet drops
  • Other failures

SELECTIVITY changes
10
The result of disturbance
[Plot: end-to-end response time (ms) vs. time (seconds)]
11
Solution: control theory
  • Treat this as a failure?
  • Not necessary and too expensive
  • Feedback control theory as a first-tier defense mechanism
  • Try to keep the system stable, at least for some time
  • If that doesn't work out, try recovery

12
Outline
  • Data streams and log processing
  • Applying control theory
  • Controlling queue length
  • Load balancing
  • Lessons learned

13
The problem
[Figure labels: Source, Buffer, TCQ, Result Q]
14
Why does this happen?
TCQ: complex internal structure
[Diagram labels: Input Buffer, Controlled Data Source]
TCQ drops tuples silently if the result queue is full. Back pressure is not possible.
15
Control Problems
  • Goal?
  • No dropping tuples
  • What to control?
  • The result queue length
  • The Knob?
  • Input data rate to the TCQ node

16
Control block diagram
Target system (system identification)
$u(k) = u(k-1) + (K_P + K_I)\,e(k) - K_P\,e(k-1)$
  • $e(k)$: error
  • $u(k)$: data rate in next interval
  • $e(k-1)$: last error
  • $u(k-1)$: data rate in last interval
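
As a minimal sketch of how the update rule on this slide could be coded, the Python below implements the incremental PI law u(k) = u(k-1) + (K_P + K_I) e(k) - K_P e(k-1). The class name, the gain values, and the first-order queue model used to exercise it are assumptions for illustration; none of them come from the slides or from TCQ.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Incremental PI controller: u(k) = u(k-1) + (kp + ki)*e(k) - kp*e(k-1)."""

    kp: float            # proportional gain (illustrative value, not from the slides)
    ki: float            # integral gain (illustrative value, not from the slides)
    u_prev: float = 0.0  # data rate used in the last interval, u(k-1)
    e_prev: float = 0.0  # last error, e(k-1)

    def update(self, target: float, measured: float) -> float:
        """Return the data rate for the next interval, u(k)."""
        e = target - measured
        u = self.u_prev + (self.kp + self.ki) * e - self.kp * self.e_prev
        self.u_prev, self.e_prev = u, e
        return u


def simulate(steps: int = 200, target: float = 50.0) -> float:
    """Drive a toy queue model toward the target length and return the final length."""
    a, b = 0.8, 0.5  # assumed plant: y(k+1) = a*y(k) + b*u(k)
    ctrl = PIController(kp=0.2, ki=0.3)
    y = 0.0  # measured result queue length
    for _ in range(steps):
        u = ctrl.update(target, y)
        y = a * y + b * u
    return y
```

With these assumed values the simulated queue length settles at the target. For a real TCQ node the plant parameters, and therefore the gains, would have to come from system identification.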
17
Result Under CPU Contention
[Figure labels: Source, Buffer, TCQ, Result Q]
18
Why useful?
  • Original system
  • Input data rate -> tuple drop vs. no drop
  • New system
  • Input data rate -> response time
  • Make it ready for load balancing

19
Outline
  • System log as data streams
  • Applying control theory
  • Controlling queue length
  • Load balancing
  • Lessons learned

20
The problem
  • Barrier in system
  • Different response times
  • End-to-end response time matches that of the slower node

21
The control problem
  • Goal?
  • Make the response time equal
  • What to control?
  • Response time on each node
  • The knob?
  • Tuples assigned to each node
  • What to monitor?
  • Queue length vs. response time
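
The control problem above can be sketched for two nodes as follows. This is an illustrative integral-style rule, not the controller from the talk: the function names, the gain `ki`, the node speeds, and the static response-time model are all assumptions. The splitter adjusts the fraction of tuples sent to node 1 until both measured response times agree; a queue-length measurement could be substituted for response time as the monitored signal.

```python
from typing import Tuple


def rebalance(f: float, r1: float, r2: float, ki: float) -> float:
    """Return the new fraction of tuples for node 1, given both response times."""
    f_new = f + ki * (r2 - r1)  # node 2 slower -> send more tuples to node 1
    return min(1.0, max(0.0, f_new))  # the knob is a fraction, so clamp to [0, 1]


def simulate(steps: int = 100) -> Tuple[float, float, float]:
    """Toy model: response time of a node = tuples assigned per second / node speed."""
    rate = 100.0                  # assumed total input rate (tuples/s)
    speed1, speed2 = 100.0, 50.0  # assumed capacities; node 2 suffers CPU contention
    f = 0.5                       # start with an even split
    r1 = r2 = 0.0
    for _ in range(steps):
        r1 = f * rate / speed1
        r2 = (1.0 - f) * rate / speed2
        f = rebalance(f, r1, r2, ki=0.1)
    return f, r1, r2
```

In this toy model the split converges to speed1 / (speed1 + speed2), at which point both nodes report the same response time and the barrier no longer waits on the slower node.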

22
System with control
[Diagram label: Response time]
23
Control block diagram
24
Result
[Plot: end-to-end response time (ms) vs. time (seconds)]
25
Outline
  • System log as data streams
  • Applying control theory
  • Controlling queue length
  • Load balancing
  • Lessons learned

26
Advantages of control theory
  • Performance can be analyzed
  • Stability
  • Accuracy
  • Settling time
  • Overshoot
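
To make "performance can be analyzed" concrete, here is a small sketch under stated assumptions: the target system is modeled as first order, y(k+1) = a*y(k) + b*u(k), and it is driven by the PI law u(k) = u(k-1) + (K_P + K_I) e(k) - K_P e(k-1). The closed-loop poles are then the roots of z^2 + (b(K_P + K_I) - a - 1) z + (a - b K_P) = 0. The function names and the numbers in the usage note are illustrative, and the settling-time formula is a common rule of thumb rather than an exact result.

```python
import cmath
import math


def closed_loop_poles(a: float, b: float, kp: float, ki: float):
    """Roots of z^2 + (b*(kp+ki) - a - 1)*z + (a - b*kp) = 0."""
    c1 = b * (kp + ki) - a - 1.0
    c0 = a - b * kp
    disc = cmath.sqrt(c1 * c1 - 4.0 * c0)
    return (-c1 + disc) / 2.0, (-c1 - disc) / 2.0


def analyze(a: float, b: float, kp: float, ki: float):
    """Return (stable, largest pole magnitude, approximate settling time in intervals)."""
    radius = max(abs(p) for p in closed_loop_poles(a, b, kp, ki))
    stable = radius < 1.0  # all poles strictly inside the unit circle
    if not stable:
        settling = math.inf
    elif radius == 0.0:
        settling = 0.0
    else:
        settling = -4.0 / math.log(radius)  # rule-of-thumb estimate
    return stable, radius, settling
```

For example, `analyze(0.8, 0.5, 0.2, 0.3)` reports a stable loop with the largest pole magnitude near 0.84, while raising `kp` to 4.0 pushes a pole outside the unit circle. Complex poles, as in the first case, also indicate that some oscillation or overshoot should be expected.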

27
Other advantages
  • Simple implementation
  • Encourage good system design
  • Modeling the system
  • Treat system as black box
  • First line of defense against disturbances in the system

28
Limitations
  • Not all software systems are designed to be controlled
  • Finite input produces unbounded output
  • E.g., join in TCQ
  • Useful state not measurable
  • Queuing theory helps, but other good theory is lacking
  • Many binary variables
  • Failed vs. working correctly

29
Other Limitations
  • The model for the target system is complex
  • Lack of a reliable knob
  • E.g., changing the result queue length of TCQ sometimes makes it crash
  • Over what range can you turn the knob?
  • How often can you turn it?
  • How long does the system take to respond?
  • Cannot find the cause of the problem

30
Solution?
  • More advanced modeling and controller?
  • Adaptive control
  • Design controller-friendly systems?
  • A simple model
  • User-configurable parameters -> knobs?

31
Future Work
  • As a tool, real users?
  • Scheduling multiple streams
  • Dynamically scale up/down
  • Other control theory applications

32
Backup Slides
33
Future Work
  • Load balancer
  • Load control across multiple tiers
  • Scheduling of multiple streams

34
System with control
35
Result
[Figure labels: Source, Buffer, TCQ, Result Q]
36
Conclusion
  • Advantages of feedback control
  • Make system more robust under disturbance
  • Allows more time for failure detection
  • Treat complex systems as black boxes
  • Cope with the system characteristics instead of having to change them
  • Theoretical analysis
  • Implementation is easy
  • System statistics can also be used for SLT

37
What is going on?
[Diagram labels: Controlled Output Thread (code reuse), Queue Length Controller, Desired Queue Length, Data Rate to TCQ, Actual Queue Length]
38
Theory meets reality
[Plot: queue length vs. time]
39
Tricky part of parameter estimation
Model evaluation
Making the system operate in the desired range
Easy for data source, but queue length ..
[Plot: data rate vs. free space, with a non-linear range marked]
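
One common way to do this kind of parameter estimation is a least-squares fit of a first-order model, sketched below. The model form y(k+1) = a*y(k) + b*u(k), the function names, and the sample data are assumptions for illustration; the slides do not state which model or fitting method was used. Samples should be collected only while the system stays in its linear operating range, since a linear fit cannot describe the non-linear range.

```python
def simulate_plant(a, b, u, y0=0.0):
    """Generate outputs of y(k+1) = a*y(k) + b*u(k); stands in for measured data."""
    y = [y0]
    for uk in u:
        y.append(a * y[-1] + b * uk)
    return y


def fit_first_order(u, y):
    """Least-squares estimate of (a, b) in y(k+1) = a*y(k) + b*u(k)."""
    if len(y) != len(u) + 1:
        raise ValueError("need one more output sample than input samples")
    y_now, y_next = y[:-1], y[1:]
    # Sums that make up the 2x2 normal equations.
    syy = sum(v * v for v in y_now)
    syu = sum(v * w for v, w in zip(y_now, u))
    suu = sum(w * w for w in u)
    sny = sum(n * v for n, v in zip(y_next, y_now))
    snu = sum(n * w for n, w in zip(y_next, u))
    det = syy * suu - syu * syu
    if det == 0:
        raise ValueError("input does not vary enough to identify a and b")
    a = (sny * suu - snu * syu) / det
    b = (snu * syy - sny * syu) / det
    return a, b
```

A quick check is to generate data with `simulate_plant(0.8, 0.5, u)` for a varied input sequence `u` and confirm that `fit_first_order` recovers the same two parameters.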
40
Why do we need control?
  • Data source does not provide accurate data rate

41
Control Problems
  • Not accurate for various reasons
  • Scheduling
  • Time spent on I/O
  • Etc.
  • Providing an accurate data source using feedback
    control
  • By controlling the input of desired rate

42
The Control Architecture
[Diagram values: 1500, 1900, 1600]
P Controller (with precompensation): $u(k) = K_P\,e(k)$
PI Controller: $u(k) = u(k-1) + (K_P + K_I)\,e(k) - K_P\,e(k-1)$
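
The difference between the two controllers on this slide can be illustrated with a small simulation. Everything here beyond the two control laws is an assumption: the first-order model of the data source, the gains, the target rate, and the textbook form of precompensation, in which the reference is scaled by (1 - a + b*K_P) / (b*K_P) so that the nominal model reaches the target exactly.

```python
def run_p_precomp(target, a, b_model, b_actual, kp, steps=200):
    """P control with the reference scaled using the nominal model (a, b_model)."""
    pre = (1.0 - a + b_model * kp) / (b_model * kp)  # precompensation factor
    y = 0.0  # achieved data rate
    for _ in range(steps):
        u = kp * (pre * target - y)
        y = a * y + b_actual * u  # assumed plant: y(k+1) = a*y(k) + b*u(k)
    return y


def run_pi(target, a, b_actual, kp, ki, steps=200):
    """PI control: u(k) = u(k-1) + (kp + ki)*e(k) - kp*e(k-1)."""
    y = u_prev = e_prev = 0.0
    for _ in range(steps):
        e = target - y
        u = u_prev + (kp + ki) * e - kp * e_prev
        y = a * y + b_actual * u
        u_prev, e_prev = u, e
    return y
```

With a = 0.5, a nominal b of 0.4, K_P = 1.0, and a target of 1000, the precompensated P controller reaches the target when the plant matches the model, but settles at a lower rate if the actual gain drops to 0.3. The PI controller with K_I = 0.5 still reaches the target in that disturbed case, because its integral action removes the steady-state error.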
43
Result: an accurate data source
[Plots: P Controller with Pre-compensation; PI Controller]
44
Zoom In
A lot of small disturbances in a Java program: incremental garbage collection
[Plots: P Controller; PI Controller]