Course Overview

1
Course Overview
  • Dongkyoo Shin
  • Fall Semester, 2002
  • Sejong University

2
Textbook
  • Web Caching and Replication, Michael Rabinovich
    and Oliver Spatscheck, Addison-Wesley, 2002
  • Book contents:
  • Introduction
  • Part 1: Background (prerequisite information,
    terminology, Web behavior)
  • Part 2: Web Caching
  • Part 3: Web Replication
  • Part 4: Further Directions

3
Grading
  • Presentation of selected papers (30%)
  • Midterm exam (30%)
  • Final exam (30%)
  • Attendance (10%)
  • Presentations will be scheduled after the midterm
    exam

4
Background
  • With the emergence of the World Wide Web, the
    primary use of the Internet has become content
    distribution/delivery. [Cheriton 01]
  • 70-80 percent of wide-area Internet traffic is
    HTTP traffic. [McCreary 00]
  • Much of the remainder consists of RealAudio
    streams and DNS. [McCreary 00]
  • However, the Internet was never architected for
    scalable content delivery. [Cheriton 01]

5
Problem: Flash Crowds
  • Unpredictably moving hot spots caused by
    unpredictable demand on the Internet [Seltzer
    96]
  • High server load, high network load, and long
    latency
  • Information access has not been, nor will it
    likely be, evenly distributed. As has been
    repeatedly observed, popular Web pages create
    hot spots of network load, with the same data
    transmitted over the same network links again and
    again to thousands of different users. Hot spots
    also move around. Recent studies by Margo Seltzer
    of Harvard University also confirm that
    flash crowds are very common, and that the "cool
    site of the day" moves around. Bottleneck hot
    spots develop and break up more quickly than the
    network or the Web servers can be re-provisioned.
    A brute-force approach to provisioning is not
    only infeasible, but also ineffective. [Zhang
    97]

6
Famous examples of flash crowds
  • JPL Web site after:
  • the Shoemaker-Levy 9 comet struck Jupiter
  • the Mars Pathfinder's successful landing on Mars
  • IBM site during the Deep Blue vs. Kasparov chess
    tournament
  • CNN site just after the 9/11 terror attacks on
    the WTC, New York
  • And many more
  • Much as a stadium parking lot gets jammed after a
    World-Cup Soccer game

7
Revisiting the Internet Architecture
  • The End-to-End Arguments [Clark 81, Clark 88]
  • Dumb networks and smart end systems
  • For simple and easy internetworking and
    robustness
  • No state management inside the networks.
  • Good things? [Plank 99]
  • It scales.
  • It has worked in the past. It works now. It will
    work in the future.
  • Bad things? [Plank 99]
  • End-to-End retransmissions required
  • Performance suffers.
  • Data movement cannot be managed/cached/replicated
    inside the networks.
  • Locality (Network Proximity information) cannot
    be exploited.

8
Early Approaches (1994 onward)
  • Goal: reduce network bandwidth consumption,
    server load, and user latency
  • Buying more resources
  • Cluster-based Server engineering
  • Web Caching
  • Mirroring

9
Caching and Replication

10
Lessons Learned
  • Both caching and replication have proved to be
    promising technologies.
  • Since 1997, most ISPs have deployed caches on
    their expensive trans-oceanic links (30-70%
    hit ratio).
  • Several sites have replicated their popular
    content and redirected users' requests to
    replicas (e.g., JPL's Mars Pathfinder site,
    Linux/FreeBSD S/W archives, ...).
  • However, each technology has its own problems...
  • Limitations of caching
  • Point solution: effective only within an ISP's
    administrative boundary.
  • Content providers lose accountability.
  • Limitations of replication
  • Automatic redirection/resolution mechanisms on
    the Internet were not available.
  • Another point solution: useful only for the
    server load distribution problem; it does little
    for network congestion problems.
  • Content providers must invest in building their
    own private replication network.
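
The hit-ratio figure quoted for ISP caches can be made concrete with a small simulation. The sketch below is illustrative only: it assumes an LRU replacement policy and a Zipf-like request skew (page k requested with weight 1/k), neither of which is stated on the slide, and the names `zipf_requests` and `lru_hit_ratio` are hypothetical helpers.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def zipf_requests(num_pages, num_requests, seed=0):
    """Generate page IDs where the page of rank k has weight 1/k (assumed skew)."""
    rng = random.Random(seed)
    cum_weights = list(accumulate(1.0 / rank for rank in range(1, num_pages + 1)))
    return rng.choices(range(num_pages), cum_weights=cum_weights, k=num_requests)


def lru_hit_ratio(requests, capacity):
    """Replay requests through an LRU cache of `capacity` pages; return hits/requests."""
    cache = OrderedDict()
    hits = 0
    for page in requests:
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / len(requests)


if __name__ == "__main__":
    trace = zipf_requests(num_pages=10_000, num_requests=50_000)
    for capacity in (100, 1_000, 5_000):
        print(capacity, round(lru_hit_ratio(trace, capacity), 3))
```

On the same trace, a larger LRU cache never yields a lower hit ratio than a smaller one, so the loop shows how much extra capacity buys under the assumed skew.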

11
What Do We Need?
  • We need to develop a new scalable infrastructure
    for data dissemination which:
  • Covers global Internet connectivity.
  • Handles the network congestion problem.
  • Helps the load distribution of servers at the
    same time.
  • Adapts dynamically to quickly moving hot spots.
  • A dynamic, adaptive, scalable solution!

12
Infrastructure-wide solutions
  • Cooperative Cache Network
  • SQUID Cache Network [Wessels 97]
  • Adaptive Web Caching [Zhang 97]
  • LSAM Proxy Cache [Touch 98]
  • Distributed Storage Infrastructure [Beck 98]
  • Content Delivery Network [Day 01]
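
The general idea behind a cooperative cache network can be sketched in a few lines: a cache that misses locally asks its peers before going back to the origin server. This is a toy model under stated assumptions, not how Squid, Adaptive Web Caching, or LSAM actually work. The `ProxyCache` class is hypothetical, the origin server is a plain dictionary, and a direct lookup in the sibling's store stands in for a real inter-cache query protocol.

```python
class ProxyCache:
    """Toy proxy cache that asks sibling caches before contacting the origin."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin   # dict standing in for the origin server
        self.store = {}        # locally cached objects
        self.siblings = []     # other cooperating caches

    def get(self, url):
        """Return (body, source), where source says who served the object."""
        if url in self.store:
            return self.store[url], f"{self.name} (local hit)"
        for sibling in self.siblings:
            if url in sibling.store:        # stand-in for a sibling query
                self.store[url] = sibling.store[url]
                return self.store[url], f"{sibling.name} (sibling hit)"
        body = self.origin[url]             # miss everywhere: go to the origin
        self.store[url] = body
        return body, "origin"


if __name__ == "__main__":
    origin = {"/index.html": "<html>hello</html>"}
    a, b = ProxyCache("cache-a", origin), ProxyCache("cache-b", origin)
    a.siblings, b.siblings = [b], [a]
    print(a.get("/index.html")[1])  # first request goes to the origin
    print(b.get("/index.html")[1])  # served by sibling cache-a
    print(b.get("/index.html")[1])  # now a local hit on cache-b
```

Only the first request reaches the origin; later requests are absorbed by the cache group, which is the load and bandwidth saving that cooperation is meant to provide.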

13
Pros and Cons