Trace-based Network Bandwidth Analysis and Prediction



1
Trace-based Network Bandwidth Analysis and
Prediction
  • Yi QIAO
  • 06/10/2002

2
  • OUTLINE
  • Introduction
  • Data Collection and Transformation
  • Basic Statistical Analysis of Bandwidth
  • Trace Classification
  • Bandwidth Prediction
  • Conclusion

3
  • Introduction
  • Fact: Network bandwidth is one of the most
    important characteristics for both WANs and LANs
  • We want to know:
  • What does a bandwidth time series look like?
  • Are there any correlations between bandwidth at
    different times?
  • Do bandwidth series from different traces share any
    common properties?
  • Is network bandwidth predictable or not?
  • Are there any differences between bandwidth data
    from long-period traces and data from short-period
    traces?

4
Step by step:
  • Trace Collection and Transformation
  • Classification of the Traces
  • Bandwidth Prediction
5
  • 2. Data Collection and Transformation
  • Three Data Sets
  • NLANR short-period (90 seconds) WAN traces
  • AUCKLAND long-period (1 day) WAN traces
  • BC traces: 2 WAN traces and 2 LAN traces

6
Converting Trace File to Bandwidth Data
  1. Original trace file (time stamp, IP header, TCP header)
  2. Extract the time stamp and the packet length (from the IP header)
  3. Assign packets to their bins according to their timestamps, and compute the instantaneous bandwidth
  4. Final bandwidth file
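The conversion on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the tool used for the talk: the function name `packets_to_bandwidth`, the `(timestamp, length)` input format, and the choice of bytes per second as the unit are all assumptions.

```python
from collections import defaultdict

def packets_to_bandwidth(packets, bin_size):
    """Convert (timestamp_seconds, packet_length_bytes) pairs into a
    per-bin bandwidth series in bytes per second (assumed unit)."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    packets = list(packets)
    if not packets:
        return []
    start = min(ts for ts, _ in packets)
    bytes_per_bin = defaultdict(int)
    for ts, length in packets:
        # Assign each packet to a bin according to its timestamp.
        index = int((ts - start) // bin_size)
        bytes_per_bin[index] += length
    n_bins = max(bytes_per_bin) + 1
    # Bins without packets count as zero bandwidth.
    return [bytes_per_bin[i] / bin_size for i in range(n_bins)]

# Hypothetical four-packet trace, for illustration only.
trace = [(0.0, 1500), (0.4, 500), (1.2, 1000), (3.1, 40)]
print(packets_to_bandwidth(trace, bin_size=1.0))
```

Calling the same function with several values of `bin_size` yields the bandwidth series at different time scales that the later slides compare.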

7
3. Basic Statistical Analysis
After some basic statistical analysis of the
bandwidth data, such as the mean and maximum value of
bandwidth and the standard deviation of bandwidth, we
get:

[Table: Correlation Coefficient]
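As a rough sketch of this step, the per-trace statistics and a correlation coefficient across traces could be computed as follows. The table itself is not in the transcript, so which pairs of statistics were correlated is an assumption here (mean versus maximum is used as an example), and the sample series are hypothetical.

```python
import math
import statistics

def summarize(bandwidth):
    """Mean, min, max and standard deviation of one bandwidth series."""
    return {
        "mean": statistics.fmean(bandwidth),
        "min": min(bandwidth),
        "max": max(bandwidth),
        "std": statistics.pstdev(bandwidth),
    }

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical bandwidth series, one list per trace.
traces = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
stats = [summarize(t) for t in traces]
means = [s["mean"] for s in stats]
maxima = [s["max"] for s in stats]
print(pearson(means, maxima))
```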
8
Relationship between Mean, Min and Max Bandwidth
Now, what's the effect of bin size on these
properties?

[Table: Correlation Coefficient]
9
Relationship between bin sizes and COV
Relationship between bin sizes and Max/Mean
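One way to explore these two relationships is to re-aggregate a fine-grained series into larger bins and recompute both quantities. The sketch below assumes that COV stands for the coefficient of variation (standard deviation divided by mean), which the slide does not spell out; the helper names and the sample data are hypothetical.

```python
import statistics

def rebin(bandwidth, factor):
    """Average `factor` consecutive bins to emulate a bin size that is
    `factor` times larger. A trailing partial bin is dropped."""
    usable = len(bandwidth) - len(bandwidth) % factor
    return [
        sum(bandwidth[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]

def cov(bandwidth):
    """Coefficient of variation: standard deviation divided by mean."""
    return statistics.pstdev(bandwidth) / statistics.fmean(bandwidth)

def max_over_mean(bandwidth):
    """Ratio of peak bandwidth to mean bandwidth."""
    return max(bandwidth) / statistics.fmean(bandwidth)

# Hypothetical series at some base bin size.
base = [0.0, 4.0, 0.0, 4.0, 2.0, 2.0, 0.0, 4.0]
for factor in (1, 2, 4):
    series = rebin(base, factor)
    print(factor, round(cov(series), 3), round(max_over_mean(series), 3))
```

In this toy series both quantities shrink as the bins grow, because averaging over longer intervals smooths out short bursts.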
10
4. Trace Classification
How to?
  • What does the time series plot look like?
  • What does the shape of the ACF plot look like?
  • What percentage of ACFs is significant?
  • What best describes the distribution (histogram) of bandwidth?
  • What does the PSD plot look like? Does it decrease linearly (in a log-log plot) as the frequency increases?
Result: 12 classes for NLANR traces, 8 classes for AUCKLAND traces.
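Two of the questions above have simple numerical counterparts: the fraction of significant ACF values and the slope of the PSD in a log-log plot. The sketch below uses NumPy and makes two assumptions not stated on the slide: significance is judged against the usual white-noise band of ±1.96/√N, and the PSD is estimated with a plain periodogram.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation for lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

def significant_acf_fraction(x, max_lag):
    """Fraction of lags whose ACF lies outside the approximate 95% band
    +/- 1.96 / sqrt(N) expected for white noise (assumed criterion)."""
    bound = 1.96 / np.sqrt(len(x))
    return float(np.mean(np.abs(acf(x, max_lag)) > bound))

def psd_loglog_slope(x, bin_size):
    """Slope of a least-squares line through the periodogram in log-log
    coordinates: near 0 for a flat, white-noise-like PSD, clearly
    negative when low-frequency components dominate."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=bin_size)
    keep = (freqs > 0) & (power > 0)
    slope, _ = np.polyfit(np.log10(freqs[keep]), np.log10(power[keep]), 1)
    return float(slope)

# Synthetic examples: white noise versus a strongly correlated series.
rng = np.random.default_rng(0)
noise = rng.normal(size=4096)
walk = np.cumsum(noise)
for name, series in (("noise", noise), ("walk", walk)):
    print(name, significant_acf_fraction(series, 50),
          psd_loglog_slope(series, bin_size=1.0))
```

The white-noise series behaves like the "not predictable" classes on the following slides (few significant ACFs, flat PSD), while the correlated series behaves like the predictable ones.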


11
I. NLANR short-period WAN traces classification
A. Class 1: Not predictable, under-utilized
Bin size: 0.001 S
ACF: Small values, low percentage of significant ACFs
Bandwidth Distribution: Heavy-tailed distribution, y = x^(-a)
PSD: Flat, contains all frequency components, like white noise
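For a heavy-tailed shape of the form y = x^(-a), the exponent a can be estimated crudely by fitting a straight line in log-log coordinates. The sketch below is only illustrative: it allows a constant factor c (so it fits y = c · x^(-a)), the function names are hypothetical, and a regression on histogram counts is a rough estimator rather than the method used in the talk.

```python
import numpy as np

def fit_power_law_exponent(x, y):
    """Estimate a in y = c * x**(-a) by linear regression of
    log(y) on log(x). Non-positive points are ignored."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = (x > 0) & (y > 0)
    slope, _ = np.polyfit(np.log(x[keep]), np.log(y[keep]), 1)
    return float(-slope)

def histogram_exponent(bandwidth, bins=50):
    """Apply the log-log fit to the histogram of a bandwidth series."""
    counts, edges = np.histogram(bandwidth, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return fit_power_law_exponent(centers, counts)

# Exact power-law data: the fit should recover the exponent.
xs = np.arange(1.0, 101.0)
print(fit_power_law_exponent(xs, 3.0 * xs ** -2.0))

# Synthetic heavy-tailed sample standing in for a bandwidth series.
sample = np.random.default_rng(0).pareto(2.0, 10000) + 1.0
print(histogram_exponent(sample))
```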
12
Effect of different bin sizes
[Plots at bin sizes 0.01 S, 0.1 S and 1 S]
Different bin sizes can all give us some useful
information. We should examine all of these bin sizes for
each trace.
13
B. Class 2: Little predictability, under-utilized
Bin size: 0.1 S for ACF, 0.001 S for other plots
ACF: Small values, low percentage of significant ACFs
Bandwidth Distribution: Multiple heavy-tailed distributions, y = x^(-a)
PSD: Flat, contains all frequency components, like white noise
14
C. Class 2a: No predictability, well-utilized
Bin size: 0.1 S for ACF, 0.001 S for other plots
ACF: Small values, low percentage of significant ACFs
Bandwidth Distribution: Left branch: half a normal distribution; right branch: heavy-tailed distribution, y = x^(-a)
PSD: Flat, contains all frequency components, like white noise
15
D. Class 4: Some predictability, under-utilized
Bin size: 0.1 S for ACF, 0.001 S for other plots
ACF: Over 50% significant ACFs
Bandwidth Distribution: Multiple heavy-tailed distributions of the form y = x^(-a)
PSD: Decreasing linearly in log-log plot as frequency increases; low-frequency components are dominant
16
E. Class 5: Some predictability, fairly-utilized
Bin size: 0.01 S for ACF, 0.001 S for other plots
ACF: Over 50% significant ACFs, high-frequency oscillation
Bandwidth Distribution: Left branch: half a normal distribution; right branch: heavy-tailed distribution, y = x^(-a)
PSD: A dominant frequency (frequency band) component
17
II. Auckland long-period WAN traces classification
A. Class 1: Good predictability, fairly-utilized
Bin size: 1 S for all plots
ACF: Over 90% significant ACFs, regular and smooth plot
Bandwidth Distribution: Two separate parts and two separate peaks, all heavy-tailed
PSD: Decreasing linearly in log-log plot as frequency increases; low-frequency components are dominant
18
B. Class 1a: Good predictability, fairly-utilized
Bin size: 1 S for all plots
ACF: Over 85% significant ACFs, regular and smooth plot
Bandwidth Distribution: Two separate parts and two separate peaks, with large parts overlapping
PSD: Decreasing linearly in log-log plot as frequency increases; low-frequency components are dominant
19
C. Class 2: Some predictability, well-utilized
Bin size: 1 S for all plots
ACF: Over 70% significant ACFs, with some high-frequency fluctuation
Bandwidth Distribution: Left branch: half a normal distribution; right branch: heavy-tailed distribution, y = x^(-a)
PSD: Decreasing linearly in log-log plot as frequency increases; low-frequency components are dominant
20
III. Tree-based Classification
Why do this?
  • Some classes could be very similar to each other while some are quite different. This is best described by a tree structure.
  • Tree-based classification enables us to classify at different granularities.

21
A. Tree-based Classification for NLANR traces

22
B. Tree-based Classification for Auckland traces
23
IV. Summary of Traces Classification
Summary for NLANR traces (12 classes)

24
Summary for AUCKLAND traces (8 classes)

25
Pie Chart for NLANR traces and AUCKLAND traces
26
What else can we learn?
  • All the long traces have some predictability.
  • Most of the short traces are not predictable. Even for the short traces that are predictable, their predictability is still not as good as that of the long traces.
  • Only a small fraction of the short traces make good use of the bandwidth, while all the long traces have good (or fairly good) utilization of the bandwidth.
  • All traces that are predictable demonstrate some degree of long-range dependence, including both short NLANR traces and long AUCKLAND traces.

27
5. Bandwidth Prediction
What do we want to know?
  • What's the real predictability for each class that we classified?
  • Which prediction model is best suited for bandwidth prediction?
  • What's the effect of different bin sizes on bandwidth prediction?
Prediction models used (part of the RPS Toolkit): MEAN, LAST, MA, BM, AR, ARMA, ARIMA, ARFIMA
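To give a feel for the simpler models in this list, here is a minimal sketch of one-step-ahead LAST, MEAN, MA and AR predictors written with NumPy. It is not the RPS Toolkit code and does not use its API: all function names are hypothetical, and the AR fit uses ordinary least squares, which is only one of several possible fitting methods.

```python
import numpy as np

def predict_last(history):
    """LAST: the next value equals the most recent value."""
    return history[-1]

def predict_mean(history):
    """MEAN: the next value equals the mean of everything seen so far."""
    return float(np.mean(history))

def predict_ma(history, window):
    """MA: the next value equals the mean of the last `window` values."""
    return float(np.mean(history[-window:]))

def fit_ar(train, order):
    """Least-squares AR(order) fit on a mean-removed training series.
    Returns (mean, coeffs); coeffs[0] multiplies the newest sample."""
    train = np.asarray(train, dtype=float)
    mean = train.mean()
    z = train - mean
    rows = [z[t - order:t][::-1] for t in range(order, len(z))]
    coeffs, *_ = np.linalg.lstsq(np.array(rows), z[order:], rcond=None)
    return mean, coeffs

def predict_ar(history, mean, coeffs):
    """One-step-ahead AR prediction from the latest len(coeffs) values."""
    order = len(coeffs)
    recent = np.asarray(history[-order:], dtype=float)[::-1] - mean
    return float(mean + np.dot(coeffs, recent))

# A pure sinusoid obeys an exact AR(2) recurrence, so AR(2) predicts it
# almost perfectly while LAST lags behind.
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 20)
mean, coeffs = fit_ar(signal, order=2)
print(predict_ar(signal, mean, coeffs), predict_last(signal))
```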


28
How to evaluate predictability?
Three evaluation criteria:
  1. The ratio of the mean squared error (msqerr) to the variance of the testing
    sequence, that is, msqerr / variance of the testing sequence
  2. How well does the error distribution fit the
    normal distribution? (1 ideally)
  3. What percentage of ACFs for the prediction error is
    significant? (0 ideally)
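The first and third criteria can be sketched directly in Python. The second is left out because the transcript does not say which goodness-of-fit statistic was used. The function names are hypothetical, and the ±1.96/√N significance band is an assumption.

```python
import numpy as np

def mse_to_variance_ratio(actual, predicted):
    """Mean squared prediction error divided by the variance of the
    testing sequence. A value of 1 matches always predicting the mean
    of the testing sequence; smaller is better."""
    actual = np.asarray(actual, dtype=float)
    errors = actual - np.asarray(predicted, dtype=float)
    return float(np.mean(errors ** 2) / np.var(actual))

def significant_error_acf_fraction(actual, predicted, max_lag):
    """Fraction of lags 1..max_lag at which the ACF of the prediction
    error falls outside +/- 1.96 / sqrt(N) (assumed significance band)."""
    errors = (np.asarray(actual, dtype=float)
              - np.asarray(predicted, dtype=float))
    e = errors - errors.mean()
    denom = np.dot(e, e)
    if denom == 0.0:
        return 0.0  # constant error: no correlation structure to detect
    acf = np.array([np.dot(e[:-k], e[k:]) / denom
                    for k in range(1, max_lag + 1)])
    return float(np.mean(np.abs(acf) > 1.96 / np.sqrt(len(e))))

# Illustration with a LAST predictor (predict the previous value).
series = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
actual, predicted = series[1:], series[:-1]
print(mse_to_variance_ratio(actual, predicted))
```

A ratio well below 1 means the predictor explains much of the variation in the testing sequence, and a low fraction of significant error ACFs means little predictable structure is left in the residual.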

29
I. Effectiveness of different predictors
A. Bandwidth prediction for NLANR traces

Mean squared err/variance of testing sequence
Bin size 0.01 S
30
Normal Distribution Fit
Bin size 0.01 S
Percentage of error ACFs that are significant
Bin size 0.01 S
31
B. Bandwidth prediction for AUCKLAND traces
Mean squared err/variance of testing sequence
Bin size 10 S
32
Normal Distribution Fit
Bin size 10 S
Percentage of error ACFs that are significant
Bin size 10 S
33
C. Bandwidth prediction for BC traces
Mean squared err/variance of testing sequence
Bin size 10 S for 2 WAN traces, 0.1 S for 2 LAN
traces
34
What does bandwidth prediction really look
like?

An AUCKLAND Trace
Bin Size 1000S, 100S, 10S and 1S
A NLANR Trace
Bin Size 1S, 0.1S, 0.01S and 0.001S
35
D. Observations
  • For almost all classes of traces, the AR model yields optimal or near-optimal prediction results among the eight predictors tested.
  • For almost all classes and all predictors, the error distribution is very close to a normal distribution.
  • The value of sigacffrac (the fraction of significant error ACFs) for the AR model is almost always the lowest among all predictors for any class.
  • Our expectations of predictability for the different classes have been confirmed by real results: all the long traces are predictable, and a large fraction of them have very good predictability, while for short traces only 20 of them have some predictability.
  • BC traces also have some predictability.




36
II. Influence of bin size on bandwidth prediction
A. NLANR traces (AR 32)
Mean squared err/variance of testing sequence at
different bin sizes (0.001S, 0.01S, 0.1 S and 1S)
37
B. AUCKLAND traces (AR 32)
Mean squared err/variance of testing sequence at
different bin sizes (1S, 10S, 100S and 1000S)
38
C. Observations
  • For NLANR traces, a bin size of 0.1 second gives the best prediction among the four bin sizes.
  • For most AUCKLAND traces, a bin size of 100 seconds or 10 seconds gives the best prediction performance among the four bin sizes.
  • For any trace, there probably exists an optimal bin size that gives the best prediction performance.
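A search for the best bin size can be sketched as a sweep: re-aggregate the series at each candidate bin size, fit an AR model on the first half, predict the second half one step ahead, and compare the error-to-variance ratios. Everything below is a hedged illustration on synthetic data. The helper names are hypothetical, the train/test split is assumed, and an AR order of 8 is used instead of the AR 32 setting named on the slides so that the short aggregated series still leave enough training rows.

```python
import numpy as np

def rebin(series, factor):
    """Average `factor` consecutive samples into one larger bin."""
    series = np.asarray(series, dtype=float)
    usable = len(series) - len(series) % factor
    return series[:usable].reshape(-1, factor).mean(axis=1)

def ar_ratio(series, order):
    """Fit AR(order) on the first half, predict the second half one step
    ahead, and return mean squared error / variance of the test part."""
    half = len(series) // 2
    train, test = series[:half], series[half:]
    mean = train.mean()
    z = train - mean
    rows = np.array([z[t - order:t][::-1] for t in range(order, len(z))])
    coeffs, *_ = np.linalg.lstsq(rows, z[order:], rcond=None)
    zt = test - mean
    preds = np.array([mean + coeffs @ zt[t - order:t][::-1]
                      for t in range(order, len(zt))])
    errors = test[order:] - preds
    return float(np.mean(errors ** 2) / np.var(test[order:]))

def best_bin_factor(series, factors, order=8):
    """Return (factor with the lowest ratio, all ratios by factor)."""
    ratios = {f: ar_ratio(rebin(series, f), order) for f in factors}
    return min(ratios, key=ratios.get), ratios

# Synthetic series: a slow oscillation buried in noise.
rng = np.random.default_rng(0)
n = 20000
base = np.sin(2 * np.pi * np.arange(n) / 2000) + rng.normal(size=n)
best, ratios = best_bin_factor(base, factors=(1, 10, 100))
print(best, ratios)
```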



39
D. Further Probe
For AUCKLAND traces, there seems to be an optimal bin size for bandwidth
prediction.

Red: a Class 1 trace
Green: a Class 1c trace
There seems to be an optimal bin size around 20
seconds.
40
6. Conclusion
  • Bandwidth traces can be classified based on their time series plot, ACF plot, distribution of bandwidth, and PSD plot.
  • Most long-period WAN traces are predictable, with some degree of long-range dependence.
  • A small part of the short-period WAN traces have some predictability, also with some degree of long-range dependence. The BC LAN traces are also predictable.
  • The AR model is an ideal model for prediction because of its accuracy and efficiency.
  • For each trace, there exists an optimal bin size that gives the best prediction performance.




41
Acknowledgement
Many thanks to Peter, Dong, and Jason!