Performance Analysis of a Parallel Downloading Scheme from Mirror Sites Throughout the Internet

Allen Miu, Eugene Shih, 6.892 Class Project, December 3, 1999

Slides: 25

Learn more at: http://nms.csail.mit.edu

Transcript and Presenter's Notes

1
Performance Analysis of a Parallel Downloading
Scheme from Mirror Sites Throughout the Internet

Allen Miu, Eugene Shih
6.892 Class Project
December 3, 1999

2
Overview

Problem Statement
Advantages/Disadvantages
Operation of Paraloading
Goals of Experiment
Setup of Experiment
Current Results
Summary
Questions

3
Problem Statement: Is Paraloading Good?
Paraloading is downloading from multiple mirror sites in parallel.
[Figure: a paraloader connected to Mirror A, Mirror B, and Mirror C]
4
Advantages of Paraloading

Performance is proportional to the realized aggregate bandwidth of the parallel connections
Less prone to complete download failure than a single-connection download
Facilitates dynamic load balancing among parallel connections
Facilitates reliable, out-of-order delivery (similar to Netscape)

5
Disadvantages of Paraloading

Can be overly aggressive
Consumes more server resources
Overhead costs for scheduling, maintaining buffers, and sending block request messages
Only effective when mirror servers are available

6
Step 1: Obtain Mirror List

Hard-coded
DNS?

[Figure: the paraloader with its mirror list and Mirrors A, B, and C]
7
Step 2: Obtain File Length
[Figure: the paraloader and Mirrors A, B, and C]
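
The slides do not say how the file length is obtained. One plausible approach over HTTP/1.1 is a HEAD request to a mirror, reading the Content-Length response header. The sketch below is written under that assumption and is not the authors' Java implementation; the function names parse_content_length and fetch_file_length are hypothetical.

```python
import urllib.request


def parse_content_length(headers):
    """Return the file length in bytes from a mapping of HTTP response headers.

    Header names are matched case-insensitively. Raises ValueError if the
    Content-Length header is missing, non-numeric, or negative.
    """
    for name, value in headers.items():
        if name.lower() == "content-length":
            length = int(value)  # raises ValueError on non-numeric input
            if length < 0:
                raise ValueError("negative Content-Length")
            return length
    raise ValueError("no Content-Length header")


def fetch_file_length(url, timeout=10.0):
    """Send a HEAD request to one mirror and return the advertised length.

    Assumption: the mirror answers HEAD and reports Content-Length.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(dict(response.headers.items()))
```

Only one mirror needs to be asked if, as the deck assumes elsewhere, the target data is consistent across mirrors.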
8
Step 3: Send Block Requests
[Figure: the paraloader and Mirrors A, B, and C]
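
Block requests can be expressed as HTTP/1.1 range requests, which the experiment setup says the paraloader uses. A minimal sketch of dividing a file into blocks and building the matching Range header follows; the helper names block_ranges and range_header are hypothetical, and treating 1 MB as 1,048,576 bytes and 32 KB as 32,768 bytes is an assumption.

```python
def block_ranges(file_length, block_size):
    """Split a file of file_length bytes into inclusive (first, last) byte ranges.

    Every block is block_size bytes long except possibly the last one.
    """
    if file_length < 0 or block_size <= 0:
        raise ValueError("file_length must be >= 0 and block_size > 0")
    return [
        (start, min(start + block_size, file_length) - 1)
        for start in range(0, file_length, block_size)
    ]


def range_header(first_byte, last_byte):
    """Build the HTTP Range header for one block (byte positions are inclusive)."""
    return {"Range": f"bytes={first_byte}-{last_byte}"}
```

Under those size assumptions, a 1 MB file with 32 KB blocks yields 32 ranges, and the first request carries the header Range: bytes=0-32767.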
9
Step 4: Re-order
[Figure: the paraloader and Mirrors A, B, and C]
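
Because blocks from different mirrors can arrive out of order, the client needs a buffer that releases data only in file order. The slides do not describe the buffer used, so the class below is a sketch of one common approach; the name ReorderBuffer and its interface are hypothetical.

```python
class ReorderBuffer:
    """Hold blocks that arrive early and release them in block-index order."""

    def __init__(self):
        self._pending = {}      # block index -> data, for blocks not yet released
        self._next_index = 0    # index of the next block to release

    def add(self, index, data):
        """Store one received block and return the blocks now deliverable.

        The returned list is in file order and may be empty if an earlier
        block is still missing. Duplicate blocks are ignored.
        """
        if index < self._next_index or index in self._pending:
            return []
        self._pending[index] = data
        ready = []
        while self._next_index in self._pending:
            ready.append(self._pending.pop(self._next_index))
            self._next_index += 1
        return ready
```

If block 1 arrives before block 0, add(1, ...) returns an empty list, and the later add(0, ...) returns both blocks in order.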
10
Step 5: Send Next Request
[Figure: the paraloader and Mirrors A, B, and C]
11
Goals of Experiment

Main goal: to compare the performance of serial and parallel downloading
To verify the results of Rodriguez et al.
To examine whether varying the degree of parallelism (the number of mirror servers used) affects performance
To gain experience with paraloading and to find out what issues are involved in designing efficient paraloading systems

12
Experiment Setup

Implemented a paraloader application in Java, using HTTP/1.1 (range requests and persistent connections)
Files were downloaded at MIT from 3 different sets of 7 mirror servers (kernel, mars, tucows)
Degree of parallelism examined: M = 1, 3, 5, 7
Downloaded a 1 MB file and a 300 KB file (S = 1 MB, 300 KB) at 1-hour intervals for 7 days
Block size: 32 KB

13
Results

Paraloading decreases download time relative to the average single-connection case
Speedup is far from the optimal case (aggregate bandwidth)
Block request gaps result in wasted bandwidth
Gaps are proportional to RTT
Congestion at client? Possible but unlikely.
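
The effect of block request gaps can be illustrated with a simple model that is not taken from the slides: assume each connection sits idle for exactly one RTT between finishing one block and receiving the first byte of the next, with no pipelining. The fraction of time the connection spends transferring data is then the transfer time divided by the transfer time plus one RTT. The function name link_utilization and the numbers in the note below are illustrative assumptions.

```python
def link_utilization(block_bytes, bandwidth_bytes_per_s, rtt_s):
    """Fraction of time one connection spends transferring data.

    Assumed model: every block costs one idle RTT (the request gap)
    followed by block_bytes / bandwidth_bytes_per_s seconds of transfer.
    """
    if block_bytes <= 0 or bandwidth_bytes_per_s <= 0 or rtt_s < 0:
        raise ValueError("sizes and bandwidth must be positive, RTT non-negative")
    transfer_s = block_bytes / bandwidth_bytes_per_s
    return transfer_s / (transfer_s + rtt_s)
```

For example, a 32,768-byte block on a hypothetical 102,400 byte/s connection with a 0.1 s RTT gives a utilization of about 76%, so roughly a quarter of that connection's bandwidth would be lost to gaps under this model. Larger blocks or pipelined requests would shrink the loss.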

14
[Figure: results for S = 1 MB]
15
[Figure: results for S = 1 MB]
16
[Figure: results for S = 763 KB, B = 30, M = 4]
17
Acknowledgements

Dave Anderson
Dorothy Curtis
Wendi Heinzelmann
WIND Group

18
Questions
19
(No Transcript)
20
Summary of Contributions

Implemented a paraloader
Verified that paraloading provides a performance gain in some cases
Increasing the degree of parallelism improves overall performance
Performance gains are not as good as those reported by Rodriguez et al.

21
Future Work

Examine how block size affects performance gain
Examine cost of paraloading
Implement and test various optimization techniques
Perform measurements at different client sites

22
Paraloading Will Not Be Effective In All Situations

Clients should have enough slack bandwidth capacity to open more than one connection
Parallel connections are bottleneck-disjoint
Target data on mirror servers is consistent and static
Security and authentication services are installed where appropriate
Data transport is reliable
Mirror locations are quickly and easily obtained

23
Step-by-step Process of the Block Scheduling Paraloading Scheme

1. Obtain a list of mirror sites
2. Open a connection to a mirror server and obtain file length
3. Divide file length into blocks
4. Send a block request to each open connection
5. Wait for a response
6. Send a new block request to the first connection that finished downloading a block
7. Loop back to 5 until all blocks are retrieved
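
Steps 3 to 7 above can be sketched as a small event-driven simulation rather than a real downloader. This is an illustration under stated assumptions, not the authors' implementation: each mirror is modeled by a fixed bandwidth and RTT, each block costs one RTT plus its transfer time, and the function name simulate_block_scheduling and all mirror values are hypothetical.

```python
import heapq


def simulate_block_scheduling(file_length, block_size, mirrors):
    """Simulate the block scheduling loop and return (finish_time_s, blocks_per_mirror).

    mirrors is a list of (bandwidth_bytes_per_s, rtt_s) pairs, one per connection.
    """
    # Step 3: divide the file into blocks (the last block may be shorter).
    sizes = [min(block_size, file_length - start)
             for start in range(0, file_length, block_size)]
    next_block = 0
    blocks_per_mirror = [0] * len(mirrors)
    events = []          # min-heap of (completion_time, mirror_index)
    finish_time = 0.0

    def request(mirror_index, now):
        """Assign the next unrequested block to one connection."""
        nonlocal next_block
        bandwidth, rtt = mirrors[mirror_index]
        done = now + rtt + sizes[next_block] / bandwidth
        heapq.heappush(events, (done, mirror_index))
        blocks_per_mirror[mirror_index] += 1
        next_block += 1

    # Step 4: send one block request on each open connection.
    for index in range(len(mirrors)):
        if next_block < len(sizes):
            request(index, 0.0)

    # Steps 5 to 7: whenever a connection finishes a block, give it the next one.
    while events:
        finish_time, index = heapq.heappop(events)
        if next_block < len(sizes):
            request(index, finish_time)

    return finish_time, blocks_per_mirror
```

Because a faster connection finishes blocks sooner, it is handed more of them, which is the dynamic load balancing the scheme relies on. In the model, the RTT term added to every block is also where the request gaps show up.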

24
Paraloading is Not a Well-studied Concept

Byers et al. proposed using Tornado codes to facilitate paraloading.
Rodriguez et al. proposed the block scheduling paraloading scheme that is used in our project.
