Title: The Joint Distribution of Internet Flow Sizes and Durations
- CHEOLWOO PARK
- J. STEPHEN MARRON
- The University of North Carolina at Chapel Hill
- Motivation of the study
- Data description, scatter plots and density estimation
- Correlation plots
- Conclusions and future plans
- Started from conflict between two papers
- Extremal Dependence: Internet Traffic Applications (2002)
- Felix Hernandez Campos, J. S. Marron, Sidney I. Resnick and Kevin Jeffay
- On the Characteristics and Origins of Internet Flow Rates (2002)
- Yin Zhang, Lee Breslau, Vern Paxson and Scott Shenker, SIGCOMM '02
- Why are we interested in this topic?
- Size and rate are naturally considered independent
- Do users choose the sizes of the files they transfer depending on their available bandwidth?
- Modeling of Internet traffic
- Different earlier analyses of Internet Flow Sizes and Durations
S = Size, D = Duration, R = S/D = Rate, IR = D/S = Inverse Rate
- Nearly contradictory answers!
- Why? Possibilities:
- Data from different sources?
- Different types of data? (HTTP responses vs. all web traces)
- Different correlation measure?
- Different threshold values?
- The two analyses applied thresholding to different variables
- They also used different threshold values
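Why the choice of thresholded variable matters can be illustrated with a small simulation. The sketch below is not the analysis from either paper: it assumes, purely for illustration, that log10(S) and log10(R) are independent normal variables (the means, standard deviations, sample size, and threshold values are all made up), derives log10(D) = log10(S) - log10(R), and compares the size/rate correlation after thresholding on size with the one after thresholding on duration.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_flows(n, seed=0):
    """Return (log10 S, log10 D, log10 R) triples.

    Assumption for illustration only: log10 S and log10 R are independent normals.
    """
    rng = random.Random(seed)
    flows = []
    for _ in range(n):
        log_s = rng.gauss(4.0, 1.0)   # hypothetical size distribution
        log_r = rng.gauss(3.0, 1.0)   # hypothetical rate distribution
        flows.append((log_s, log_s - log_r, log_r))  # log D = log S - log R
    return flows

def corr_size_rate(flows, keep):
    """Correlation of log10 S and log10 R over the flows selected by keep()."""
    kept = [f for f in flows if keep(f)]
    return pearson([f[0] for f in kept], [f[2] for f in kept])

flows = simulate_flows(20000)
corr_all = corr_size_rate(flows, lambda f: True)
corr_big_size = corr_size_rate(flows, lambda f: f[0] > 4.0)       # threshold on S
corr_long_duration = corr_size_rate(flows, lambda f: f[1] > 1.0)  # threshold on D
print(corr_all, corr_big_size, corr_long_duration)
```

Under these assumptions, thresholding on size leaves the size/rate correlation near zero, while thresholding on duration produces a clearly positive correlation from the same independent data, because keeping only long flows selects on the difference log10(S) - log10(R).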
- Motivation of the study
- Data description, scatter plots and density estimation
- Correlation plots
- Conclusions and future plans
- Data
- HTTP responses
- Sunday morning (8:00 AM to 12:00 PM)
- In April 2001
- From UNC Main Link
- Variables of Interest
- S = Size (bytes)
- D = Duration (seconds)
- R = Rate (throughput, bytes/sec)
- IR = Inverse Rate (sec/byte)
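The four variables are tied together by R = S/D and IR = D/S, so on the log scale log10(R) = log10(S) - log10(D). A minimal Python sketch of that bookkeeping follows. The function name `flow_variables` and the decision to reject non-positive inputs are assumptions made for illustration; the slides do not say how such flows were treated.

```python
import math

def flow_variables(size_bytes, duration_sec):
    """Derive rate, inverse rate, and their log10 values for a single flow.

    Hypothetical helper: raises ValueError for non-positive size or duration,
    since neither the ratio nor the logarithm is defined there.
    """
    if size_bytes <= 0 or duration_sec <= 0:
        raise ValueError("size and duration must be positive")
    rate = size_bytes / duration_sec       # R, bytes/sec
    inv_rate = duration_sec / size_bytes   # IR, sec/byte
    return {
        "rate": rate,
        "inv_rate": inv_rate,
        "log10_size": math.log10(size_bytes),
        "log10_duration": math.log10(duration_sec),
        "log10_rate": math.log10(rate),
        "log10_inv_rate": math.log10(inv_rate),
    }

# A made-up flow: 1000 bytes transferred in 10 seconds.
example = flow_variables(1000, 10)
print(example)
```

Because any two of the four log variables determine the other two, each scatterplot pair is a linear re-parameterization of the same data; what changes between views is which variable a threshold is applied to.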
- Scatterplot log10(Size) vs. log10(Duration)
- Scatterplot log10(Size) vs. log10(Rate)
- Scatterplot log10(Duration) vs. log10(Inv. Rate)
- Motivation of the Study
- Data description and scatter plots
- Log-log correlation plots with global thresholdings
- Conclusions and future plans
log10(Size) vs. log10(Duration)
log10(Size) vs. log10(Rate)
log10(Duration) vs. log10(Inv. Rate)
Simulated bivariate normal
log10(Size) vs. log10(Rate)
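A correlation plot under global thresholding can be sketched on simulated data as follows. This is an illustration under stated assumptions, not a reconstruction of the plotted figure: the underlying correlation `rho = 0.5`, the sample size, the quantile grid, and the choice to threshold on the first coordinate (standing in for log10(Size)) are all hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_bivariate_normal(n, rho, seed=0):
    """Standard bivariate normal pairs (x, y) with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho ** 2)
    points = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + scale * rng.gauss(0.0, 1.0)
        points.append((x, y))
    return points

def correlation_curve(points, quantiles):
    """For each quantile q, keep points whose x is at or above the q-th
    sample quantile of x and return (q, correlation of the kept points)."""
    xs_sorted = sorted(p[0] for p in points)
    n = len(xs_sorted)
    curve = []
    for q in quantiles:
        threshold = xs_sorted[min(int(q * n), n - 1)]
        kept = [p for p in points if p[0] >= threshold]
        curve.append((q, pearson([p[0] for p in kept], [p[1] for p in kept])))
    return curve

points = simulate_bivariate_normal(50000, rho=0.5)
curve = correlation_curve(points, [0.0, 0.5, 0.9])
for q, r in curve:
    print(f"threshold quantile {q:.1f}: correlation {r:.3f}")
```

Even though the dependence structure of the simulated data is fixed, the sample correlation of the retained points shrinks as the threshold rises, so two analyses of the same data using different thresholds can report noticeably different correlations.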
- Motivation of the Study
- Data description and scatter plots
- Log-log correlation plots with global thresholdings
- Conclusions and future plans
- Conclusions
- The blind men and the elephant
- Thresholding is CRITICAL
- Deeper investigation
- What threshold values should we use?
- On Size?
- On Duration?
- On Both?
- How to handle zero durations?
- Which methods are robust to thresholding?