1
Performance Analysis of a Parallel Downloading
Scheme from Mirror Sites Throughout the Internet
  • Allen Miu, Eugene Shih
  • 6.892 Class Project
  • December 3, 1999

2
Overview
  • Problem Statement
  • Advantages/Disadvantages
  • Operation of Paraloading
  • Goals of Experiment
  • Setup of Experiment
  • Current Results
  • Summary
  • Questions

3
Problem Statement: Is Paraloading Good?
Paraloading is downloading from multiple mirror
sites in parallel.
[Diagram: Paraloader downloading from Mirrors A, B, and C]
4
Advantages of Paraloading
  • Performance is proportional to the realized
    aggregate bandwidth of the parallel connections
  • Less prone to complete download failure than a
    single-connection download
  • Facilitates dynamic load balancing among parallel
    connections
  • Facilitates reliable, out-of-order delivery
    (similar to Netscape)

5
Disadvantages of Paraloading
  • Can be overly aggressive
  • Consumes more server resources
  • Overhead costs for scheduling, maintaining
    buffers, and sending block request messages
  • Only effective when mirror servers are available

6
Step 1: Obtain Mirror List
  • Hard-coded
  • DNS?

[Diagram: Paraloader obtains the mirror list (Mirrors A, B, C)]
7
Step 2: Obtain File Length
[Diagram: Paraloader queries one mirror; Mirrors A, B, C]
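The slides do not show how the length is actually obtained. One way to do it, a minimal sketch assuming the project's Java/HTTP 1.1 setup (see the Experiment Setup slide), is an HTTP HEAD request that returns only the headers; the mirror URL and class name below are placeholders, not the project's code.

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class FileLengthProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; the experiment used kernel, mars, and tucows mirrors.
        URL mirror = new URL("http://mirror.example.org/pub/target.tar.gz");

        HttpURLConnection conn = (HttpURLConnection) mirror.openConnection();
        conn.setRequestMethod("HEAD");         // ask for headers only, no body
        int length = conn.getContentLength();  // value of the Content-Length header (-1 if absent)
        conn.disconnect();

        System.out.println("File length: " + length + " bytes");
    }
}
```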
8
Step 3: Send Block Requests
[Diagram: Paraloader sends a block request to each of Mirrors A, B, C]
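Block requests map naturally onto HTTP/1.1 range requests, which the Experiment Setup slide says the Java paraloader uses. The sketch below shows what a single 32 KB block request could look like; fetchBlock, BLOCK_SIZE, and the overall structure are illustrative assumptions, not the project's actual code.

```java
import java.io.DataInputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class BlockRequest {
    static final int BLOCK_SIZE = 32 * 1024;   // block size used in the experiment

    // Request block number 'blockIndex' of the file from one mirror.
    static byte[] fetchBlock(URL mirror, int blockIndex, long fileLength) throws Exception {
        long start = (long) blockIndex * BLOCK_SIZE;
        long end = Math.min(start + BLOCK_SIZE, fileLength) - 1;      // inclusive range end

        HttpURLConnection conn = (HttpURLConnection) mirror.openConnection();
        conn.setRequestProperty("Range", "bytes=" + start + "-" + end);  // HTTP/1.1 range request

        byte[] block = new byte[(int) (end - start + 1)];
        InputStream in = conn.getInputStream();    // server answers 206 Partial Content
        new DataInputStream(in).readFully(block);  // read exactly one block
        in.close();
        return block;
    }
}
```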
9
Step 4: Re-order
[Diagram: Paraloader re-orders blocks received from Mirrors A, B, C]
10
Step 5: Send Next Request
[Diagram: Paraloader sends the next block request; Mirrors A, B, C]
11
Goals of Experiment
  • Main goal: To compare the performance of serial
    and parallel downloading
  • To verify the results of Rodriguez et al.
  • To examine whether varying the degree of
    parallelism (the number of mirror servers used)
    affects performance
  • To gain experience with paraloading and to find
    out what issues are involved in designing
    efficient paraloading systems

12
Experiment Setup
  • Implemented a paraloader application in Java,
    using HTTP/1.1 (range requests and persistent
    connections)
  • Files are downloaded at MIT from 3 different sets
    (kernel, mars, tucows) of 7 mirror servers
  • Degree of parallelism examined: M = 1, 3, 5, 7
  • Downloaded a 1 MB and a 300 KB file (S = 1 MB,
    300 KB) at 1-hour intervals for 7 days
  • Block size: 32 KB
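With a 32 KB block size, the 1 MB file splits into 32 full blocks and the 300 KB file into 10 blocks (the last one partial). A small Java check of that arithmetic, using an illustrative helper name:

```java
public class BlockCount {
    // Number of 32 KB blocks needed to cover a file (the last block may be partial).
    static int numBlocks(long fileLengthBytes) {
        int blockSize = 32 * 1024;
        return (int) ((fileLengthBytes + blockSize - 1) / blockSize);
    }

    public static void main(String[] args) {
        System.out.println(numBlocks(1024 * 1024));  // S = 1 MB   -> 32 blocks
        System.out.println(numBlocks(300 * 1024));   // S = 300 KB -> 10 blocks
    }
}
```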

13
Results
  • Paraloading decreases download time relative to
    the average single-connection case
  • Speedup falls well short of the optimal case (the
    full aggregate bandwidth)
  • Block request gaps result in wasted bandwidth
  • Gaps are proportional to RTT
  • Congestion at client? Possible but unlikely.

14
[Result plot: S = 1 MB]
15
[Result plot: S = 1 MB]
16
[Result plot: S = 763 KB, B = 30, M = 4]
17
Acknowledgements
  • Dave Anderson
  • Dorothy Curtis
  • Wendi Heinzelmann
  • WIND Group

18
Questions
20
Summary of Contributions
  • Implemented a paraloader
  • Verified that paraloading does provide a
    performance gain in some cases
  • Increasing the degree of parallelism improves
    overall performance
  • Performance gains are not as good as those
    reported by Rodriguez et al.

21
Future Work
  • Examine how block size affects performance gain
  • Examine cost of paraloading
  • Implement and test various optimization
    techniques
  • Perform measurements at different client sites

22
Paraloading Will Not Be Effective In All
Situations
  • Clients should have enough slack bandwidth
    capacity to open more than one connection
  • Parallel connections should be bottleneck-disjoint
  • Target data on mirror servers should be consistent
    and static
  • Security and authentication services should be
    installed where appropriate
  • Data transport should be reliable
  • Mirror locations should be quickly and easily
    obtainable

23
Step-by-step Process of the Block Scheduling
Paraloading Scheme
  • 1. Obtain a list of mirror sites
  • 2. Open a connection to a mirror server and
    obtain file length
  • 3. Divide the file into fixed-size blocks
  • 4. Send a block request to each open connection
  • 5. Wait for a response
  • 6. Send a new block request to the first
    connection that finished downloading a block
  • 7. Loop back to step 5 until all blocks are
    retrieved (sketched in Java below)
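The outline above amounts to a simple work-queue loop: each mirror connection pulls the next block index, issues a range request, copies the bytes to the block's offset, and immediately asks for another. The sketch below is our reading of the scheme, reusing the hypothetical fetchBlock helper from the Step 3 slide; it is written in current Java idiom and is not the project's original implementation.

```java
import java.net.URL;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Paraloader {
    static final int BLOCK_SIZE = 32 * 1024;

    // Steps 3-7: divide the file into blocks, keep one outstanding block
    // request per mirror connection, and reassemble blocks at their offsets.
    static byte[] download(URL[] mirrors, long fileLength) throws InterruptedException {
        int numBlocks = (int) ((fileLength + BLOCK_SIZE - 1) / BLOCK_SIZE);
        byte[] file = new byte[(int) fileLength];

        Queue<Integer> pending = new ConcurrentLinkedQueue<>();
        for (int b = 0; b < numBlocks; b++) pending.add(b);

        ExecutorService pool = Executors.newFixedThreadPool(mirrors.length);
        CountDownLatch done = new CountDownLatch(mirrors.length);

        for (URL mirror : mirrors) {
            pool.execute(() -> {
                Integer block;
                // Step 6: as soon as a block finishes, request the next one
                // on the same (persistent) connection.
                while ((block = pending.poll()) != null) {
                    try {
                        byte[] data = BlockRequest.fetchBlock(mirror, block, fileLength);
                        // Step 4: blocks may arrive out of order; copy each to its offset.
                        System.arraycopy(data, 0, file, block * BLOCK_SIZE, data.length);
                    } catch (Exception e) {
                        pending.add(block);   // hand the block back to another connection
                        break;                // stop using this troubled mirror
                    }
                }
                done.countDown();
            });
        }

        done.await();        // step 7: all blocks retrieved (or all mirrors gave up)
        pool.shutdown();
        return file;
    }
}
```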

24
Paraloading is Not a Well-studied Concept
  • Byers et al. proposed using Tornado codes to
    facilitate paraloading.
  • Rodriguez et al. proposed the block-scheduling
    paraloading scheme that is used in our project.