1
Fast Pattern-Based Throughput Prediction for
TCP Bulk Transfers
  • Tsung-i (Mark) Huang
  • Jaspal Subhlok
  • University of Houston
  • GAN05 / May 10, 2005

2
Outline
  • Background
  • Problem Description
  • Methodology
  • Experiments and Results
  • Conclusion and Future Work

3
Are we there yet?
  • When do you need throughput prediction?
  • File download: "xx minutes left" (MS IE vs. Mozilla)
  • Mirror site selection: download Knoppix from Florida State
    Univ. (fsu.edu) or TU Ilmenau, Germany (tu-ilmenau.de)?
  • Resource selection in a grid environment
  • Cache selection for web content delivery services

4
Which site will give the best throughput?
  • Current approaches and tools
  • Geographical distance
  • Ping (ICMP)
  • Download 512 KBytes (fixed size): NWS / iperf
  • Download for 10 seconds (fixed duration): iperf
  • The last two approaches are the most accurate
  • How much data to download? For how long?
  • Is Bandwidth × Delay the answer? One size fits
    all?
  • All or nothing: no result is available until
    the end of transmission

5
Problem Description
  • Predicted future throughput can be used in
    mirror/replica site selection
  • Predict throughput of a TCP bulk transfer
  • Single TCP stream
  • Input: time series of (Arrival time, Bytes
    received)
  • Output: predicted future throughput
  • Make a prediction of future throughput after 10
    to 100 RTTs
  • Utilize knowledge of TCP flow patterns
  • Assume TCP flow patterns will repeat later in the
    same TCP stream

6
TCP Flow Patterns
  • Textbook examples: (a) Rate Control, (b) Congestion Control
  • In reality: (c) Rate Control with delay, (d) Mixed Congestion Control

7
Approach to Throughput Prediction
  • Analyze Time-Series (TS1) of (Arrival Time, Bytes
    received) to get a meaningful throughput
    Time-Series
  • Possible solutions
  • Instant throughput: throughput since the previous TCP
    segment
  • Fixed-interval throughput: average throughput over a
    fixed time period
  • Per-RTT throughput: partition using the fixed SYN-ACK
    RTT
  • Idea: TCP sends a window full of data segments
    every RTT
  • Partition Time-Series (TS1) with the fixed SYN-ACK
    RTT, and get per-RTT Throughput (TS2)
  • Analyze per-RTT Throughput Time-Series (TS2) to
    predict future throughput
  • Compare different prediction methods across all
    traces
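The partitioning step (TS1 to TS2) can be sketched as below. This is only an illustration, not the authors' script (the slides say the traces were analyzed with Perl scripts). The function name `per_rtt_throughput`, the use of seconds and bytes as units, and the choice to report empty bins as 0.0 are all assumptions.

```python
from typing import List, Tuple

def per_rtt_throughput(samples: List[Tuple[float, int]], rtt: float) -> List[float]:
    """Partition (arrival_time_sec, bytes_received) samples into RTT-wide bins.

    Returns one throughput value in bytes/sec per bin (TS2).
    Bins that received no data are reported as 0.0 (an assumption).
    """
    if not samples or rtt <= 0:
        return []
    start = min(t for t, _ in samples)
    end = max(t for t, _ in samples)
    n_bins = int((end - start) / rtt) + 1
    bytes_per_bin = [0] * n_bins
    for arrival, nbytes in samples:
        # Bin index = number of whole RTTs elapsed since the first segment.
        bytes_per_bin[int((arrival - start) / rtt)] += nbytes
    return [b / rtt for b in bytes_per_bin]

# Hypothetical trace: four segments, fixed SYN-ACK RTT of 250 ms.
ts1 = [(0.0, 1000), (0.1, 1000), (0.3, 500), (0.8, 1500)]
ts2 = per_rtt_throughput(ts1, rtt=0.25)
```

A fixed bin width keeps the sketch simple, which mirrors the "simple and effective" argument for using the SYN-ACK RTT rather than re-estimating the RTT during the transfer.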

8
TCP Segment Partitioning (1)
Figure (log scale): instant throughput, fixed-interval throughput (interval of 100 ms), and per-RTT throughput (SYN-ACK RTT of 176 ms); marked levels: 121 KB/sec and 40 KB/sec.
  • Instant throughput shows a wide range of fluctuation.
  • Fixed-interval throughput shows less fluctuation.

9
TCP Segment Partitioning (2)
  • RTT estimation
  • Use fixed SYN-ACK RTT
  • Simple and effective
  • Partition TCP segments into per RTT throughput
    time series

10
Throughput Prediction (1)
  • TCP Patterns
  • Rate Control limited (RC)
  • Congestion Control limited (CC)
  • Identify basic elements
  • Flat regions
  • Exponential Climb regions
  • Linear Climb regions
  • Drop points
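One way to picture the four basic elements is to label each step of the per-RTT series by the ratio between consecutive samples. The slides do not say how the elements are detected, so everything here is a hypothetical heuristic: the function name `label_steps` and the thresholds `flat_tol`, `drop_frac`, and `exp_ratio` are assumptions chosen for illustration.

```python
from typing import List

def label_steps(ts: List[float], flat_tol: float = 0.1,
                drop_frac: float = 0.3, exp_ratio: float = 1.5) -> List[str]:
    """Label each step ts[i-1] -> ts[i] as 'flat', 'exp', 'linear', or 'drop'.

    All thresholds are illustrative assumptions, not values from the slides.
    """
    labels = []
    for prev, cur in zip(ts, ts[1:]):
        if prev <= 0:
            # No meaningful ratio; treat growth from zero as a climb.
            labels.append("exp" if cur > 0 else "flat")
            continue
        ratio = cur / prev
        if ratio <= 1 - drop_frac:
            labels.append("drop")      # sharp decrease: drop point
        elif ratio >= exp_ratio:
            labels.append("exp")       # multiplicative growth: exponential climb
        elif ratio > 1 + flat_tol:
            labels.append("linear")    # moderate growth: linear climb
        else:
            labels.append("flat")      # roughly constant: flat region
    return labels

labels = label_steps([10, 20, 40, 41, 46, 52, 20])
```

A ratio test like this will mislabel slow linear growth at high throughput as flat, so a real detector would likely look at runs of samples rather than single steps.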

11
Throughput Prediction (2)
  • Peak of slow start
  • Data points up to the end of the first slow
    start are ignored for prediction, because the
    initial slow start does not repeat
  • RC-based prediction
  • Use flat regions
  • CC-based prediction
  • Use complete CC cycles
  • Window-based prediction
  • If no clear pattern observed
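The selection logic above can be sketched roughly as follows, covering only the RC-based and window-based cases. This is an assumption-heavy illustration, not the authors' algorithm: taking the first local peak as the end of slow start, the tolerance `flat_tol`, the minimum flat length `min_flat`, and the fallback `window` are all hypothetical choices.

```python
from statistics import mean
from typing import List

def end_of_slow_start(ts: List[float]) -> int:
    """Index of the first local peak, taken here as the end of the initial slow start."""
    for i in range(1, len(ts)):
        if ts[i] < ts[i - 1]:
            return i - 1
    return len(ts) - 1

def predict(ts: List[float], flat_tol: float = 0.1,
            min_flat: int = 5, window: int = 10) -> float:
    """Predict future throughput from per-RTT samples (hypothetical heuristic)."""
    if not ts:
        raise ValueError("need at least one sample")
    # Ignore the initial slow start; keep the last sample if nothing else remains.
    usable = ts[end_of_slow_start(ts) + 1:] or ts[-1:]
    ref = usable[-1]
    run = []
    for x in reversed(usable):
        # Extend the trailing flat region while samples stay close to the latest one.
        if ref > 0 and abs(x - ref) <= flat_tol * ref:
            run.append(x)
        else:
            break
    if len(run) >= min_flat:
        return mean(run)               # RC-style: average of the flat region
    return mean(usable[-window:])      # window-based fallback: no clear pattern

rc_case = predict([1, 2, 4, 8, 16, 15, 16, 15, 16, 16])
fallback_case = predict([1, 2, 4, 8, 16, 15, 16, 4, 5, 6], window=3)
```

In the first call the samples after the slow-start peak form a flat region, so its average is returned; in the second a sharp drop breaks the flat region and the last three samples are averaged instead.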

12
Experiments (1) - Setup
  • Download data files from 290 web sites
    (Debian/Gentoo mirrors)
  • Use tcpdump to capture receiver-side traffic
  • Record SYN-ACK RTTs
  • Include retransmitted packets (0.09%)
  • Average file size is 30 MBytes
  • 461 traces collected at Univ. of Houston
  • Traces are analyzed using Perl scripts

13
Experiments (2) - Prediction Methods
  • Prediction methods compared
  • Moving Average (MA): average throughput of the previous
    10 RTTs
  • Exponential Weighted Moving Average (EWMA)
  • Aggregate throughput: the average of all past throughput
    (same as the cumulative average), used as the
    predicted throughput
  • TCP Pattern prediction
  • Metric: average error in predicted future throughput
  • Error is cut off at 100% if over, in case the measured future
    throughput is very small
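For reference, the three baseline predictors can be written in a few lines each. This is a minimal sketch over a per-RTT throughput list; the function names are hypothetical, and the EWMA smoothing factor `alpha` is an assumption because the slides do not state the value used.

```python
from typing import List

def moving_average(ts: List[float], n: int = 10) -> float:
    """MA: average throughput of the previous n RTTs (n = 10 on the slide)."""
    recent = ts[-n:]
    return sum(recent) / len(recent)

def ewma(ts: List[float], alpha: float = 0.5) -> float:
    """EWMA: each new sample gets weight alpha (alpha is an assumed value)."""
    estimate = ts[0]
    for x in ts[1:]:
        estimate = alpha * x + (1 - alpha) * estimate
    return estimate

def aggregate(ts: List[float]) -> float:
    """Aggregate: cumulative average of all past per-RTT throughput."""
    return sum(ts) / len(ts)

history = [10, 20, 30, 40]          # hypothetical per-RTT throughput values
ma_pred = moving_average(history, n=2)
ewma_pred = ewma(history)
agg_pred = aggregate(history)
```

MA and EWMA react quickly to recent samples, while Aggregate weighs the whole history equally, which is why it is dragged down by the initial slow start when only a few RTTs are available.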

14
Illustration of Prediction (1)
Figure: prediction for the next 200 RTTs; x-axis: window size (in RTTs); marked points: 25th RTT, drop at 27th RTT, 40th RTT.

Prediction at 25th RTT
  • Aggregate Throughput Prediction: average
    throughput of RTTs 0 to 25
  • TCP Throughput Prediction: average throughput of
    RTTs 9 to 25 (RC-based prediction)

Prediction at 40th RTT
  • TCP Throughput Prediction: window-based
    prediction after the 27th RTT (a significant drop)

15
Illustration of Prediction (2)
Figure: prediction for the next 200 RTTs; x-axis: window size (in RTTs).
  • The closer to 0, the better the prediction.
  • Average error against the measured future throughput of the
    next 200 RTTs
  • (for example, at the 20th RTT, the average throughput of
    RTTs 21 to 220 is used)
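The evaluation described here can be sketched as a prediction made from the first k RTTs and scored against the mean of the following 200. Treating the error as an absolute relative error is an assumption (the slides only say "average error"), and the helper names `future_mean`, `relative_error`, and `evaluate` are hypothetical.

```python
from typing import Callable, List

def future_mean(ts: List[float], k: int, horizon: int = 200) -> float:
    """Measured future throughput: mean of RTTs k+1 .. k+horizon (1-based)."""
    future = ts[k:k + horizon]
    return sum(future) / len(future)

def relative_error(predicted: float, measured: float, cap: float = 1.0) -> float:
    """Absolute relative error, cut off at 100% (cap = 1.0)."""
    if measured <= 0:
        return cap
    return min(abs(predicted - measured) / measured, cap)

def evaluate(ts: List[float], k: int,
             predictor: Callable[[List[float]], float],
             horizon: int = 200) -> float:
    """Predict from the first k RTTs and score against the next `horizon` RTTs."""
    return relative_error(predictor(ts[:k]), future_mean(ts, k, horizon))

# Hypothetical trace: 20 RTTs at 60 units, then 200 RTTs at 50 units.
trace = [60] * 20 + [50] * 200
cumulative = lambda hist: sum(hist) / len(hist)
err = evaluate(trace, k=20, predictor=cumulative)
```

The cap matters when the measured future throughput is close to zero, since an uncapped ratio would let a single stalled trace dominate the average.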

16
Illustration of Prediction (3)
Figure: throughput prediction for the next 200 RTTs using Congestion-Control based patterns; one complete CC cycle is marked.
  • Prediction made at the 65th RTT using 3 complete CC
    cycles
  • The closer to 0, the better the prediction.

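A CC-based prediction over complete cycles might look like the sketch below: find the drop points, then average only the samples between the first and last drop so that partial sawtooth cycles do not bias the result. The drop threshold `drop_frac` and the function name are assumptions, and the slides do not describe the authors' exact cycle detection.

```python
from typing import List, Optional

def cc_cycle_prediction(ts: List[float], drop_frac: float = 0.3) -> Optional[float]:
    """Average per-RTT throughput over complete congestion-control cycles.

    A drop point is a sample that falls by at least drop_frac relative to
    the previous sample (an assumed threshold). Returns None when fewer
    than two drops are seen, i.e. no complete cycle is available.
    """
    drops = [i for i in range(1, len(ts))
             if ts[i - 1] > 0 and ts[i] <= (1 - drop_frac) * ts[i - 1]]
    if len(drops) < 2:
        return None
    # Samples from the first drop up to (not including) the last drop
    # cover whole cycles only.
    cycles = ts[drops[0]:drops[-1]]
    return sum(cycles) / len(cycles)

# Hypothetical sawtooth: two complete cycles between three drops.
sawtooth = [8, 16, 8, 10, 12, 14, 16, 8, 10, 12, 14, 16, 8, 10]
cc_pred = cc_cycle_prediction(sawtooth)
```

Averaging whole cycles gives the long-run mean of the sawtooth, whereas a window that ends mid-climb would under- or over-estimate depending on where it stops.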
17
Results (1): predicting the next 200 RTTs at different
times
  • Marked on the plot: 30th RTT
  • Aggregate is not accurate for a small window size
    (< 30 RTTs)
  • MA / EWMA are generally not as accurate

18
Results (2): predicting at the 15th RTT for different
times in the future
  • When only limited data is available:
  • Aggregate is not accurate
  • MA performs best; TCP Pattern is close

19
Results (3): predicting at the 25th RTT for different
times in the future
  • When more data is available:
  • Aggregate performs better
  • TCP Pattern performs best; MA is close

20
Results (4): predicting at the 50th RTT for different
times in the future
  • When even more data is available:
  • TCP Pattern performs best; Aggregate is close
  • MA now performs worse, due to the dynamics of TCP
    flows

21
Summary of Results
  • Aggregate is accurate with sufficient data, not
    with a few RTTs of data
  • MA performs very well for a few RTTs of data
  • EWMA is not a good predictor
  • TCP Pattern generally performs as well as or better
    than the other methods

22
Summary of Results (table view)

23
Conclusion and Future Work
  • TCP-pattern based throughput prediction is as
    good as or better than other methods.
  • Good predictions within 25 RTTs (or 5 sec).
  • Patterns observed: 65% Rate Control, few
    Congestion Control
  • Methods using Aggregate (e.g. NWS) cannot be
    expected to work well for small test files
  • What's next?
  • Identify more patterns
  • Add a degree of confidence for each prediction
  • Multiple TCP streams

24
That's all, folks!
  • Thank You!

25
Supplement Slides
26
Characteristics of collected traces (1)
27
Characteristics of collected traces (2)
  • Classification: a trace is labeled with a pattern type
    when that pattern covers over 50% of the trace.

28
Some Trace Patterns (300 RTTs)
Under-estimated RTT 100 RTTs
29
Results (0.5): predicting the next 100 RTTs at
different times

30
Results (1.5): predicting the next 400 RTTs at
different times

31
Bandwidth
  • Bandwidth
  • The amount of data that can be pushed through a
    link in unit time. Usually measured in bits or
    bytes per second.
  • Bottleneck Bandwidth (BB)
  • Available Bandwidth (AB)
  • Throughput (T)
  • T ≤ AB ≤ BB