Title: Factor Analysis of Network Flow Throughput Measurements for Inferring Congestion Sharing
1. Factor Analysis of Network Flow Throughput Measurements for Inferring Congestion Sharing
- Dogu Arifler and Brian L. Evans
- Eastern Mediterranean University and The University of Texas at Austin
- European Signal Processing Conference
- Antalya, Turkey, September 4-8, 2005
http://www.emu.edu.tr
http://www.ece.utexas.edu
2. Inference of congested path sharing
- Motivation: Network managers need information about resource sharing in other networks to better plan for services and diagnose performance problems
  - Internet service providers need to diagnose configuration errors and link failures in peer networks
  - Content providers need to balance workload and plan cache placement
- Problem: In general, properties of networks outside one's administrative domain are unknown
  - Little or no information on routing, topology, or link utilizations
- Solution: Network tomography
  - Inferring characteristics of networks from available network traffic measurements
3. Autoregressive model for available capacity
Throughput correlation
- Throughputs of TCP flows that temporally overlap at a congested resource are correlated
- Removing large- and small-sized flows helps in capturing positive throughput correlations due to resource sharing
[Figure: throughput correlation; labels "Duration of f1 = 20" and "Start time of f2"; annotation "high correlation for temporally overlapping flows"]
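
The correlation claim on this slide can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not in the slides: available capacity is modeled as a zero-mean AR(1) process with coefficient `phi`, each flow's throughput is taken to be the average capacity over its lifetime, and all parameter values (`phi`, `duration`, `trials`) are illustrative.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def throughput_correlation(start_f2, duration=20, phi=0.9, trials=2000, seed=0):
    """Correlation between the throughputs of flows f1 and f2.

    Assumptions (not from the slides): capacity follows a zero-mean AR(1)
    process, f1 occupies steps [0, duration), f2 occupies
    [start_f2, start_f2 + duration), and throughput is mean capacity.
    """
    rng = random.Random(seed)
    stationary_std = 1.0 / math.sqrt(1.0 - phi * phi)
    horizon = start_f2 + duration
    thr_f1, thr_f2 = [], []
    for _ in range(trials):
        c = rng.gauss(0.0, stationary_std)  # start in the stationary regime
        series = []
        for _ in range(horizon):
            series.append(c)
            c = phi * c + rng.gauss(0.0, 1.0)  # AR(1) update
        thr_f1.append(sum(series[:duration]) / duration)
        thr_f2.append(sum(series[start_f2:start_f2 + duration]) / duration)
    return pearson(thr_f1, thr_f2)


for start in (5, 10, 20, 40, 60):
    print(start, round(throughput_correlation(start), 3))
```

Under these assumptions, the estimated correlation is high while the two flows overlap and falls toward zero once f2 starts well after f1 has ended.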
4. Measured data: component variances
- Use 4 flow classes: AOL1, AOL2, HotMail1, and Hotmail2
- Filter flow records based on
  - Packets: Discard flows consisting of only 1 packet
  - Duration: Discard flows with duration shorter than 1 second
  - Size: Discard flows with sizes < 8 kB or > 64 kB
- Normalized component variances
  - 2 significant components with explanatory power of 72% for Dataset2002 and 63% for Dataset2004
Principal component   Dataset2002 95% confidence interval   Dataset2004 95% confidence interval
1                     (1.5457, 1.7900)                      (1.3646, 1.4786)
2                     (1.0861, 1.3206)                      (1.0237, 1.1603)
3                     (0.7058, 0.9150)                      (0.8230, 0.9690)
4                     (0.2194, 0.4458)                      (0.5413, 0.6379)
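
The filtering rules and the explanatory-power figures above can be checked with a short sketch. Several things here are assumptions rather than content from the slides: the `Flow` record and its field names are hypothetical, 1 kB is taken as 1000 bytes, the interval midpoints stand in for the point estimates, and the normalized component variances are assumed to be eigenvalues of a 4 x 4 correlation matrix (so they sum to 4).

```python
from dataclasses import dataclass

KB = 1000  # assumption: the slides do not say whether 1 kB is 1000 or 1024 bytes


@dataclass
class Flow:
    """Hypothetical flow record; field names are illustrative."""
    flow_class: str
    packets: int
    duration_s: float
    size_bytes: int


def keep(flow: Flow) -> bool:
    """Apply the three filters listed on the slide."""
    if flow.packets <= 1:          # single-packet flows
        return False
    if flow.duration_s < 1.0:      # flows shorter than 1 second
        return False
    if flow.size_bytes < 8 * KB or flow.size_bytes > 64 * KB:
        return False
    return True


def explanatory_power(intervals, n_significant):
    """Fraction of total variance carried by the leading components.

    Interval midpoints approximate the point estimates; the total is taken
    to be the number of components (the trace of a correlation matrix).
    """
    mids = [(lo + hi) / 2 for lo, hi in intervals]
    return sum(mids[:n_significant]) / len(mids)


dataset2002 = [(1.5457, 1.7900), (1.0861, 1.3206), (0.7058, 0.9150), (0.2194, 0.4458)]
dataset2004 = [(1.3646, 1.4786), (1.0237, 1.1603), (0.8230, 0.9690), (0.5413, 0.6379)]

# Made-up records, only to exercise the filter.
flows = [
    Flow("AOL1", 1, 0.2, 500),
    Flow("AOL1", 12, 3.5, 20_000),
    Flow("HotMail1", 90, 8.0, 120_000),
]
kept = [f for f in flows if keep(f)]
p2002 = explanatory_power(dataset2002, 2)
p2004 = explanatory_power(dataset2004, 2)
print(len(kept), round(100 * p2002), round(100 * p2004))  # prints: 1 72 63
```

With these midpoints, the first two components account for about 72% and 63% of the total, matching the figures quoted on the slide. The table also shows that the intervals for components 1 and 2 lie entirely above 1 while those for components 3 and 4 lie entirely below 1, which is consistent with an eigenvalue-greater-than-one rule, although the slides do not state which significance criterion was used.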
5. Measured data: factor analysis
- Based on 2 significant components, determine factor loadings
- Rotated factor loading estimates
  - Rows correspond to classes
  - Columns correspond to shared infrastructure
- Estimate 95% bootstrap confidence intervals for loadings to establish accuracy
- With 95% confidence, we can identify which flow classes share infrastructure!
[Figure: two panels, Dataset2002 and Dataset2004, each labeled with the flow classes AOL1, AOL2, HotMail1, Hotmail2]
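
The procedure on this slide (loadings for 2 factors, rotation, bootstrap intervals) might be sketched as follows. The slides do not say how loadings were extracted or rotated, so this sketch assumes principal-component loadings from the correlation matrix, a varimax rotation, and percentile bootstrap intervals. The data are synthetic: four hypothetical classes in which classes 0 and 1 share one resource and classes 2 and 3 share another. The helper names are illustrative.

```python
import itertools
import numpy as np


def pc_loadings(X, k):
    """Unrotated loadings from the k leading eigenpairs of the correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    vals, vecs = np.linalg.eigh(corr)            # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(vals[order])


def varimax(L, tol=1e-8, max_iter=200):
    """Varimax rotation via the standard SVD-based iteration."""
    p, k = L.shape
    rot = np.eye(k)
    crit = 0.0
    for _ in range(max_iter):
        LR = L @ rot
        grad = L.T @ (LR ** 3 - LR * ((LR ** 2).sum(axis=0) / p))
        u, s, vt = np.linalg.svd(grad)
        rot = u @ vt
        if s.sum() < crit * (1 + tol):
            break
        crit = s.sum()
    return L @ rot


def align(L, ref):
    """Resolve column-order and sign ambiguity by matching L to ref."""
    best, best_err = None, np.inf
    for perm in itertools.permutations(range(L.shape[1])):
        cand = L[:, list(perm)]
        signs = np.sign((cand * ref).sum(axis=0))
        signs[signs == 0] = 1.0
        cand = cand * signs
        err = ((cand - ref) ** 2).sum()
        if err < best_err:
            best, best_err = cand, err
    return best


def bootstrap_ci(X, k, n_boot=200, seed=0):
    """Rotated loadings with 95% percentile bootstrap intervals."""
    rng = np.random.default_rng(seed)
    ref = varimax(pc_loadings(X, k))
    n = X.shape[0]
    samples = np.empty((n_boot,) + ref.shape)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)         # resample rows with replacement
        samples[b] = align(varimax(pc_loadings(X[idx], k)), ref)
    lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)
    return ref, lo, hi


# Synthetic throughput samples: rows are observations, columns are classes.
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))          # two hypothetical shared resources
weights = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.8], [0.0, 0.7]])
X = factors @ weights.T + 0.6 * rng.standard_normal((500, 4))

ref, lo, hi = bootstrap_ci(X, k=2)
print(np.round(ref, 2))                          # rows: classes, columns: factors
```

In this sketch, a class is read as using a shared resource when its loading on the corresponding column is large and its bootstrap interval excludes zero. The alignment step is a design choice to keep factor order and sign consistent across bootstrap replicates, since rotated loadings are only defined up to column permutation and sign.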
D. Arifler, Network Tomography Based on Flow
Level Measurements, Ph.D. Dissertation, 2004.