Title: Link Layer
1Link Layer
2Admin
- Written Assignment Network: new due date Monday, April 23
- If you are considering replacement work, please stop by to talk to me
- Any feedback/suggestions on the course will be appreciated.
3Recap Internet Routing
- Intradomain routing and interdomain routing
- CIDR to allow flexibility in aggregation of destination addresses to improve routing scalability
- Longest prefix matching to determine the next hop to a destination
- Basic switching fabric design
4Putting it Together Example 1 (same network) A -> B
[Datagram: misc fields | src 223.1.1.1 | dst 223.1.1.3 | data]
- Look up dest address
- find dest is on same net
- Hand datagram to link layer to send inside a
link-layer frame
[Figure: example network with interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9, 223.1.3.1, 223.1.3.2, 223.1.3.27, and 223.1.4.1 toward the Internet]
5Putting it Together Example 2 (Different Networks) A -> E
[Datagram: misc fields | src 223.1.1.1 | dst 223.1.2.3 | data]
- look up dest address in forwarding table
- routing table: next hop router to dest is 223.1.1.4
- Hand datagram to link layer to send to router 223.1.1.4 inside a link-layer frame
- the dest. of the link-layer frame is the router interface 223.1.1.4 (addressed by its MAC address), while the IP dest. stays 223.1.2.3
[Forwarding table entry at A: 0.0.0.0/0 -> next hop 223.1.1.4. Figure: same network as Example 1, with 223.1.4.1 toward the Internet]
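The two forwarding examples can be sketched as a longest-prefix lookup at host A. This is a minimal illustration, not the slides' own code: the default route 0.0.0.0/0 -> 223.1.1.4 comes from the slide, while the /24 prefix for A's own subnet and the names `FORWARDING_TABLE` and `next_hop` are assumptions made for the example.

```python
import ipaddress

# Hypothetical forwarding table for host A (223.1.1.1): (prefix, next hop).
# A next hop of None means "directly attached": deliver to the destination itself.
# The /24 local prefix is an assumption; the default route is from the slide.
FORWARDING_TABLE = [
    (ipaddress.ip_network("223.1.1.0/24"), None),
    (ipaddress.ip_network("0.0.0.0/0"), ipaddress.ip_address("223.1.1.4")),
]

def next_hop(dst: str):
    """Return the IP address whose link-layer address the frame is sent to."""
    addr = ipaddress.ip_address(dst)
    # Keep every entry whose prefix contains the destination ...
    matches = [(net, hop) for net, hop in FORWARDING_TABLE if addr in net]
    # ... and pick the most specific one (longest prefix).
    net, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return addr if hop is None else hop

print(next_hop("223.1.1.3"))  # Example 1: same network, deliver directly
print(next_hop("223.1.2.3"))  # Example 2: different network, go via the router
```

In both cases the IP destination in the datagram is unchanged; only the link-layer destination differs.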
6Summary of Network Layer
- We have covered the basics of the network layer
- routing and forwarding
- There are multiple other topics that we did not cover
- Multicast/anycast
- QoS
- slides will be linked on the schedule page in case you need reading in the summer
7Recap The Hourglass Architecture of the Internet
[Figure: hourglass. Applications (Telnet, Email, FTP, WWW) over TCP/UDP over IP over link technologies (Ethernet, FDDI, Wireless, ADSL, Cable/DOCSIS)]
8Link Layer Introduction
- Some terminology
- hosts and routers are nodes
- (bridges and switches too)
- communication channels that connect adjacent nodes along a communication path are links
- wired, wireless
- dedicated, shared
- the layer-2 PDU is called a frame; it encapsulates the layer-3 PDU (datagram)
9Link layer Context
- Data-link layer has responsibility of transferring a datagram from one node to an adjacent node over a link
- Datagram may be transferred by different link protocols over different links, e.g.,
- Ethernet on first link,
- frame relay on intermediate links
- 802.11 on last link
- transportation analogy
- trip from New Haven to San Francisco
- taxi home to union station
- train union station to JFK
- plane JFK to San Francisco airport
- shuttle airport to hotel
10Link Layer Services
- Framing
- encapsulate datagram into frame, adding header, trailer and error detection/correction
- Multiplexing/demultiplexing
- frame headers to identify src, dest
- Media access control
- Forwarding/switching within a link-layer (Layer 2) domain
- Reliable delivery between adjacent nodes
- we learned how to do this already!
- seldom used on low bit error links (fiber, some twisted pair)
- common for wireless links: high error rates
11Adaptors Communicating
[Figure: sending node and receiving node, each with an adapter; the datagram travels between the adapters in a frame under the link layer protocol]
- sending side
- encapsulates datagram in a frame
- adds error checking bits, rdt, flow control, etc.
- receiving side
- looks for errors, rdt, flow control, etc
- extracts datagram, passes to receiving node
- link layer typically implemented in adapter (aka NIC)
- Ethernet card, modem, 802.11 card
- adapter is semi-autonomous, implementing link and physical layers
12LAN/MAC/Physical Address
- In most link-layer technologies, each adapter has a unique link-layer address (also called MAC address)
- used as address in datalink frames to identify the interface
- 48-bit MAC address (for most types of LANs) burned into the adapter ROM
- MAC address allocation administered by IEEE
- manufacturer buys portion of MAC address space (to assure uniqueness)
13Recall Earlier Routing Discussion
- Starting at A, given IP datagram addressed to E
- look up net. address of E, find C
- link layer sends datagram to C inside a link-layer frame; the dest. address should be C's MAC address
14ARP Address Resolution Protocol
- Each IP node (Host, Router) on LAN has ARP table
- ARP Table: IP/MAC address mappings for some LAN nodes
- < IP address, MAC address, TTL >
- TTL (Time To Live): time after which address mapping will be forgotten (typically 20 min)
yry3@cicada yry3 /sbin/arp
Address                  HWtype  HWaddress          Flags Mask  Iface
zoo-gatew.cs.yale.edu    ether   AA:00:04:00:20:D4  C           eth0
artemis.zoo.cs.yale.edu  ether   00:06:5B:3F:6E:21  C           eth0
lab.zoo.cs.yale.edu      ether   00:B0:D0:F3:C7:A5  C           eth0
15ARP Protocol
- ARP is plug-and-play
- nodes create their ARP tables without intervention from net administrator
- A broadcast protocol
- source broadcasts query frame, containing queried IP address
- all machines on LAN receive ARP query
- destination D receives ARP frame, replies
- frame sent to A's MAC address (unicast)
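The table-plus-query behavior above can be sketched as a small cache with expiry. This is a toy model under stated assumptions, not a real ARP implementation: `ArpTable`, `resolve`, and the `broadcast_query` callback (standing in for the LAN broadcast and the unicast reply) are hypothetical names, and the 20-minute TTL is the typical value quoted on the slide.

```python
import time

class ArpTable:
    """Toy ARP cache: IP address -> (MAC address, expiry time)."""

    def __init__(self, ttl_seconds=20 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds      # mapping lifetime (typically 20 min)
        self.clock = clock          # injectable clock, handy for testing
        self.entries = {}

    def learn(self, ip, mac):
        """Record a mapping and restart its TTL."""
        self.entries[ip] = (mac, self.clock() + self.ttl)

    def lookup(self, ip):
        """Return the cached MAC, or None if unknown or expired."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires = entry
        if self.clock() >= expires:
            del self.entries[ip]    # forget stale mappings
            return None
        return mac

def resolve(table, ip, broadcast_query):
    """Use the cache if possible; otherwise broadcast a query and cache the reply."""
    mac = table.lookup(ip)
    if mac is None:
        mac = broadcast_query(ip)   # hypothetical: returns the replier's MAC or None
        if mac is not None:
            table.learn(ip, mac)
    return mac
```

No administrator fills in the table: entries appear only as replies arrive, which is the plug-and-play property.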
16Comparison of IP address and MAC Address
- IP address is locator
- address depends on network to which an interface is attached
- NOT portable
- introduces features (e.g., CIDR) for routing scalability
- IP address needs to be globally unique (if no NAT)
- MAC address is an identifier
- dedicated to a device
- portable
- flat
- MAC address does not need to be globally unique,
but the current assignment ensures uniqueness
17Outline
- Admin
- Link layer overview
- Error detection
18Error Detection
- D: Data protected by error checking, may include header fields
- ED: Error Detection bits (redundancy)
- Error detection not 100% reliable!
- a good error detector may miss some errors, but rarely
- larger ED field generally yields better detection
- Error detection design also considers which computation primitives are available.
19Cyclic Redundancy Check Background
- Widely used in practice, e.g.,
- Ethernet, DOCSIS (Cable Modem), FDDI, PKZIP, WinZip, PNG
- For a given data D, consider it as a polynomial D(x)
- consider the string of 0 and 1 as the coefficients of a polynomial
- e.g. consider string 10011 as x^4 + x + 1
- addition and subtraction are modulo 2, thus the same as xor
- Choose generator polynomial G(x) with r+1 bits, where r is called the degree of G(x)
20Cyclic Redundancy Check Encode
- Given G(x) and data D(x), choose R(x) with r bits, such that
- D(x) x^r + R(x) is exactly divisible by G(x)
- The bits corresponding to D(x) x^r + R(x) are sent to the receiver
21Ethernet Frame Structure
- Sending adapter encapsulates IP datagram (or other network layer protocol packet) in Ethernet frame
- Preamble: 8 bytes
- 7 bytes with pattern 10101010 followed by one byte with pattern 10101011 (why the preamble?)
- Source and dest. addresses: 6 bytes each
- Type: indicates the higher layer protocol, mostly IP but others may be supported such as Novell IPX and AppleTalk
- CRC: CRC-32 checked at receiver; if error is detected, the frame is simply dropped
22Cyclic Redundancy Check Decode
- Since G(x) is global, when the receiver receives the transmission T'(x), it divides T'(x) by G(x)
- if non-zero remainder: error detected!
- if zero remainder, assumes no error
[Figure: D -> Encode: CRC(G) -> T = D(x) x^r + R(x) -> channel -> T' -> check]
23CRC Steps and an Example
- Suppose the degree of G(x) is r
- Append r zeros to D(x), i.e. consider D(x) x^r
- Divide D(x) x^r by G(x). Let R(x) denote the remainder
- Send <D, R> to the receiver
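The steps above can be sketched with bit strings and xor-based long division. This is a minimal illustration rather than an optimized CRC: the function names are made up for the example, and the data 1101011011 with generator 10011 (x^4 + x + 1, so r = 4) is an assumed worked example, not one taken from the slides.

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of bits divided by generator, using modulo-2 (xor) arithmetic."""
    r = len(generator) - 1                  # degree of G(x)
    work = [int(b) for b in bits]
    gen = [int(b) for b in generator]
    # Long division: wherever the leading bit is 1, xor in the generator.
    for i in range(len(work) - r):
        if work[i] == 1:
            for j, g in enumerate(gen):
                work[i + j] ^= g
    return "".join(str(b) for b in work[-r:])

def crc_encode(data: str, generator: str) -> str:
    """Append r zeros, divide by G, and send data followed by the remainder R."""
    r = len(generator) - 1
    return data + mod2_remainder(data + "0" * r, generator)

def crc_check(received: str, generator: str) -> bool:
    """Receiver side: a zero remainder means no error was detected."""
    return set(mod2_remainder(received, generator)) <= {"0"}

sent = crc_encode("1101011011", "10011")
print(sent, crc_check(sent, "10011"))
```

Flipping any single bit of `sent` makes `crc_check` return False, since this generator has more than one term.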
24The Power of CRC
- Let T(x) denote D(x) x^r + R(x), and E(x) the polynomial of the error bits
- the received signal is T'(x) = T(x) + E(x)
- Since T(x) is divisible by G(x), we only need to consider if E(x) is divisible by G(x)
25Designing CRC
- Detect a single-bit error: E(x) = x^i
- if G(x) contains two or more terms, E(x) is not divisible by G(x)
- Detect an odd number of errors: E(x) has an odd number of terms
- lemma: if E(x) has an odd number of terms, E(x) cannot be divisible by (x+1)
- suppose E(x) = (x+1)F(x); let x = 1, then the left-hand side will be 1 (mod 2), while the right-hand side will be 0
- thus if G(x) contains x+1 as a factor, E(x) will not be divisible by G(x)
- Many more errors can be detected by designing the right G(x)
26Example G(x)
- 32-bit CRC
- CRC32: x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
- used by Ethernet, FDDI, PKZIP, WinZip, and PNG
- GSM phones
- For more details see the link below and further links it contains
- http://en.wikipedia.org/wiki/Cyclic_redundancy_check
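As a quick experiment, Python's standard `zlib.crc32` computes a CRC-32 with this generator polynomial (the PKZIP/PNG variant). Real implementations also apply bit reflection and initial/final xor steps that these slides do not cover, so this shows the detector in use rather than the bare polynomial division. The helper name `detects_bit_flip` is an assumption for the example.

```python
import zlib

def detects_bit_flip(message: bytes, byte_index: int, bit: int) -> bool:
    """Flip one bit of the message and report whether the CRC-32 changes."""
    corrupted = bytearray(message)
    corrupted[byte_index] ^= 1 << bit
    return zlib.crc32(bytes(corrupted)) != zlib.crc32(message)

message = b"123456789"
print(hex(zlib.crc32(message)))          # checksum the receiver would recompute
print(detects_bit_flip(message, 0, 0))   # a single-bit error is always caught
```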
27Outline
- Admin
- Link layer overview
- Error detection/correction
- Link access
28Multiple Access Links and Protocols
- Two types of links
- point-to-point
- e.g., a leased dedicated line, PPP for dial-up access
- broadcast (shared wire or medium)
- traditional Ethernet; cable networks
- 802.11 wireless LAN; cellular networks
- satellite
29Multiple Access Protocols
- Single shared broadcast channel
- thus, if two or more nodes transmit simultaneously, the transmissions interfere; only one node can send successfully at a time (see CDMA later for an exception)
- multiple access protocol
- Protocol that determines how nodes share channel, i.e., determines when nodes can transmit
- Communication about channel sharing must use channel itself!
- Discussion: properties of an ideal multiple access protocol.
30Ideal Multiple Access Protocol
- Broadcast channel of rate R bps
- Efficiency: when only one node wants to transmit, it can send at full rate R
- Rate allocation
- simple fairness: when N nodes want to transmit, each can send at average rate R/N
- we may need more complex rate control
- Decentralized
- no special node to coordinate transmissions
- no synchronization of clocks
- Simple
31MAC Protocols a Taxonomy
- Goals
- efficient, rate control, decentralized, simple
- Three broad classes
- channel partitioning
- divide channel into smaller pieces (time slot, frequency, code)
- non-partitioning
- random access
- allow collisions
- taking-turns
- a token coordinates shared access to avoid collisions
32Outline
- Admin. and recap
- Link layer overview
- Error detection and correction
- Media access control (MAC) protocols
- channel partitioning
33Channel Partitioning TDMA
- TDMA: time division multiple access
- Access to channel in "rounds"
- Each station gets fixed length slot (length = pkt trans time) in each round
- Unused slots go idle
- Example: 6-station LAN, 1,3,4 have pkt, slots 2,5,6 idle
34Channel Partitioning FDMA
- FDMA: frequency division multiple access
- Channel spectrum divided into frequency bands
- Each station assigned fixed frequency band
- Unused transmission time in frequency bands goes idle
- Example: 6-station LAN, 1,3,4 have pkt, frequency bands 2,5,6 idle
[Figure: frequency bands 1-6 plotted against time]
35GSM - TDMA/FDMA
- 935-960 MHz: 124 channels (200 kHz each), downlink
- 890-915 MHz: 124 channels (200 kHz each), uplink
[Figure: GSM TDMA frame on frequency vs. time axes. GSM time-slot (normal burst): guard space | tail (3 bits) | user data (57 bits) | S (1 bit) | training (26 bits) | S (1 bit) | user data (57 bits) | tail (3 bits) | guard space]
- S indicates data or control
36Channel Partitioning CDMA
- CDMA (Code Division Multiple Access)
- Used mostly in wireless broadcast channels (cellular, satellite, etc.)
- A spread-spectrum technique
History: http://people.seas.harvard.edu/jones/cscie129/nu_lectures/lecture7/hedy/lemarr.htm
37CDMA Encoding
- All users share same frequency, but each user m has its own unique chipping sequence (i.e., code) c_m to encode data, i.e., code set partitioning
- e.g. c_m = 1 1 1 -1 1 -1 -1 -1
- Assume original data are represented by 1 and -1
- Encoded signal = (original data) modulated by (chipping sequence)
- assume c_m = 1 1 1 -1 1 -1 -1 -1
- if data is d, send d c_m,
- if data d is 1, send c_m
- if data d is -1, send -c_m
38CDMA Encoding
[Figure: user data d(t) multiplied by chipping sequence c(t) gives the resulting signal; tb: bit period, tc: chip period]
39CDMA Decoding
- Inner-product (summation of bit-by-bit product) of encoded signal and chipping sequence
- if inner-product > 0, the data is 1; else -1
40CDMA Encode/Decode
Encode
Code of user m: c_m = 1 1 1 -1 1 -1 -1 -1
Decode
- The number of bits of each chipping sequence is M
41CDMA Deal with Multiple-User Interference
- Two codes Ci and Cj are orthogonal if Ci . Cj = 0, where we use . to denote inner product, e.g.
C1 = 1  1  1 -1  1 -1 -1 -1
C2 = 1 -1  1  1  1 -1  1  1
C1 . C2 = 1 + (-1) + 1 + (-1) + 1 + 1 + (-1) + (-1) = 0
- If codes are orthogonal, multiple users can coexist and transmit simultaneously with minimal interference
Analogy: Speak in different languages!
42CDMA Two-Sender Interference
Code 1: 1 1 1 -1 1 -1 -1 -1
Code 2: 1 -1 1 1 1 -1 1 1
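The two-sender case can be checked numerically with the two codes above. This is a sketch under an idealized assumption: the channel simply adds the chips of both senders, with no noise and perfect chip synchronization. The function names are made up for the illustration.

```python
CODE1 = [1, 1, 1, -1, 1, -1, -1, -1]
CODE2 = [1, -1, 1, 1, 1, -1, 1, 1]

def encode(bit, code):
    """Send code for data 1 and -code for data -1."""
    return [bit * chip for chip in code]

def inner(a, b):
    """Inner product: sum of chip-by-chip products."""
    return sum(x * y for x, y in zip(a, b))

def channel(*signals):
    """Idealized channel: simultaneous transmissions add up chip by chip."""
    return [sum(chips) for chips in zip(*signals)]

def decode(received, code):
    """Data is 1 if the inner product with the code is positive, else -1."""
    return 1 if inner(received, code) > 0 else -1

received = channel(encode(1, CODE1), encode(-1, CODE2))
print(decode(received, CODE1), decode(received, CODE2))
```

Because the codes are orthogonal, the other sender contributes 0 to each inner product, so each receiver sees only its own data scaled by the code length.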
43Discussions
- Advantages of channel partitioning
- Problems of channel partitioning
44Outline
- Recap
- Link layer overview
- Error detection and correction
- MAC protocols
- Partitioning protocols
- Non-partitioning MAC protocols
- Random access
45Random Access Protocols
- When a node has packets to send
- transmit at full channel data rate R
- no a priori coordination among nodes
- Two or more transmitting nodes -> collision
- Random access MAC protocol specifies
- when to access channel?
- how to detect collisions?
- how to recover from collisions?
- Examples of random access MAC protocols
- slotted ALOHA and pure ALOHA
- CSMA and CSMA/CD, CSMA/CA
46Slotted Aloha Norm Abramson
- Time is divided into equal size slots (= pkt trans. time)
- Node with new arriving pkt transmits at beginning of next slot
- If collision: retransmit pkt in future slots with probability p, until successful.
[Figure: timeline of Success (S), Collision (C), Empty (E) slots]
47Slotted Aloha Efficiency
- Q: What is the fraction of successful slots?
- suppose n stations have packets to send
- suppose each transmits in a slot with probability p
- prob. of succ. by a specific node: p (1-p)^(n-1)
- prob. of succ. by any one of the n nodes:
- S(p) = n * Prob(only one transmits)
- = n p (1-p)^(n-1)
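The formula S(p) = n p (1-p)^(n-1) is easy to evaluate numerically. The sketch below (the function name is an assumption) shows that p = 1/n beats nearby choices and that the resulting efficiency approaches 1/e as n grows.

```python
import math

def slotted_aloha_success(n: int, p: float) -> float:
    """Probability that exactly one of n backlogged nodes transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)

for n in (2, 10, 100):
    best = slotted_aloha_success(n, 1 / n)   # p = 1/n maximizes S(p)
    print(n, round(best, 4))

print(round(1 / math.e, 4))                  # limit as n grows
```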
48Goodput vs. Offered Load
[Figure: S = throughput = goodput (success rate) vs. G = offered load = np]
- when p n < 1, as p (or n) increases
- probability of empty slots reduces
- probability of collision is still low, thus goodput increases
- when p n > 1, as p (or n) increases,
- probability of empty slots does not reduce much, but
- probability of collision increases, thus goodput decreases
- goodput is optimal when p n = 1
49Maximum Efficiency vs. n
[Figure: maximum efficiency vs. n, approaching 1/e = 0.37]
50Pure (unslotted) Aloha
- Unslotted Aloha: simpler, no clock synchronization
- Whenever pkt needs transmission
- send without waiting for the beginning of a slot
- Collision probability increases
- pkt sent at t0 collides with other pkts sent in [t0-1, t0+1]
51Pure Aloha (cont.)
- Assume a node transmits with probability p in one unit of time
- P(success by a given node) = P(node transmits) .
    P(no other node transmits in [t0-1, t0]) .
    P(no other node transmits in [t0, t0+1])
  = p . (1-p)^(n-1) . (1-p)^(n-1)
  = p . (1-p)^(2(n-1))
- P(success by any of n nodes) = n p . (1-p)^(2(n-1))
- Bound: 1/(2e) = .18
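For reference, the 1/(2e) bound can be recovered by maximizing the success probability over p. This is a standard calculus step added for completeness, using the same symbols as above:

```latex
\begin{aligned}
S(p) &= n\,p\,(1-p)^{2(n-1)} \\
\frac{dS}{dp} &= n\,(1-p)^{2(n-1)-1}\,\bigl[(1-p) - 2(n-1)\,p\bigr] = 0
  \;\Longrightarrow\; p^{*} = \frac{1}{2n-1} \\
S(p^{*}) &= \frac{n}{2n-1}\left(1-\frac{1}{2n-1}\right)^{2(n-1)}
  \;\xrightarrow[n\to\infty]{}\; \frac{1}{2}\cdot\frac{1}{e} = \frac{1}{2e} \approx 0.18
\end{aligned}
```

The vulnerable period is two time units instead of one, which is why the best achievable goodput is half that of slotted Aloha.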
52Goodput vs. Offered Load
[Figure: S = throughput = goodput (success rate), axis 0 to 0.4, vs. G = offered load = Np, axis 0 to 2.0]
53Dynamics of (Slotted) Aloha
- In reality, the number of stations backlogged is changing
- we need to study the dynamics when using a fixed transmission probability p
- Assume we have a total of m stations (the machines on a LAN)
- n of them are currently backlogged, each tries with a (fixed) probability p
- the remaining m-n stations are not backlogged. They may start to generate packets with a probability pa, where pa is much smaller than p
54Model
- n backlogged: each transmits with prob. p
- m-n unbacklogged: each transmits with prob. pa
55Dynamics of Aloha Effects of Fixed Probability
- assume a total of m stations
- pa << p
- success rate is the departure rate, the rate at which the backlog is reducing
[Figure: dep. and arrival rate of backlogged stations vs. n, the number of backlogged stations (0 to m); offered load = 1 marked]
Lesson: if we fix p, but n varies, we may have an undesirable stable point
56Summary of Problems of Aloha Protocols
- Problems
- slotted Aloha has better efficiency than pure Aloha but clock synchronization is hard to achieve
- Aloha protocols have low efficiency due to collision or empty slots
- when offered load is optimal (p = 1/N), the goodput is only about 37%
- when the offered load is not optimal, the goodput is even lower
- undesirable steady state at a fixed transmission rate, when the number of backlogged stations varies
- Ethernet design addresses the problems
- approximate slotted Aloha without clock synchronization
- reduce the penalty of collision or empty slots
- infer optimal transmission rate
57The Basic MAC Mechanisms of Ethernet
get a packet from upper layer
K = 0; n = 0        // K: control wait time; n: no. of collisions
repeat:
    wait for K * 512 bit-time
    while (network busy) wait
    wait for 96 bit-time after detecting no signal
    transmit and detect collision
    if detect collision
        stop and transmit a 48-bit jam signal
        n++
        m = min(n, 10), where n is the number of collisions
        choose K randomly from {0, 1, 2, ..., 2^m - 1}
        if n < 16 goto repeat
        else give up
58Ethernet
- Dominant LAN technology
- First widely used LAN technology
- Kept up with speed race: 10 Mbps, 100 Mbps, 1 Gbps, 10 Gbps
[Figure: Metcalfe's Ethernet sketch]
59Course Topics Summary
- The Internet is a general-purpose, large-scale, distributed computer network
- Major design features/principles
- packet switching/statistical multiplexing
- hour-glass architecture
- end-to-end principle
- decentralized architecture
- e.g., DNS, interdomain routing
- resource allocation framework
- optimization decomposition through duality
- adaptive control
- e.g., AIMD, sliding window self clocking, Ethernet
- queueing modeling/performance analysis and design
- tradeoff between theoretical impossibility and practice
60Evolution
- Driven by Technology, Infrastructure, Policy, Applications, and Understanding
- technology
- e.g., wireless/optical communication technologies and device miniaturization (sensors)
- infrastructure
- e.g., cloud computing
- applications
- e.g., content distribution, games, telepresence, sensing, grid computing, VoIP
- understanding
- e.g., resource sharing principle, routing principles, mechanism design, optimal stochastic control (randomized access)
- Complexity comes from evolution.
- Don't be afraid to challenge the foundation and redesign!
62Backup Slides
63Ethernets Exponential Backoff
- Goal: adapt retransmission attempts to estimated current load
- compared with CSMA, 1/2^m can be considered as p
- not a static p: adjusted using exponential backoff
- first collision: choose K from {0,1}; delay is K x 512 bit transmission times
- after second collision: choose K from {0,1,2,3}
- after ten or more collisions, choose K from {0,1,2,3,4,...,1023}
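The backoff rule can be sketched in a few lines. This only models the choice of K and the resulting delay in bit times; carrier sensing, the jam signal, and actual transmission are left out, and the constant and function names are assumptions for the illustration. The limits (cap at 10, give up at 16) follow the MAC mechanism slide.

```python
import random

SLOT_BIT_TIMES = 512   # delay unit: K x 512 bit transmission times
BACKOFF_CAP = 10       # the range stops doubling after 10 collisions
MAX_COLLISIONS = 16    # give up once this many collisions have occurred

def backoff_range(collisions: int) -> int:
    """Number of choices for K after the given number of collisions: 2^m."""
    m = min(collisions, BACKOFF_CAP)
    return 2 ** m

def backoff_delay(collisions: int, rng=random) -> int:
    """Pick K uniformly from {0, ..., 2^m - 1} and return the delay in bit times."""
    if collisions >= MAX_COLLISIONS:
        raise RuntimeError("too many collisions, giving up")
    k = rng.randrange(backoff_range(collisions))
    return k * SLOT_BIT_TIMES

for n in (1, 2, 3, 10, 15):
    print(n, backoff_range(n), backoff_delay(n))
```

Each collision halves the effective transmission probability 1/2^m, which is how a station infers a suitable rate without knowing how many others are backlogged.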
64Many Issues
- How to make it faster
- How to make it more efficient
- How to make it more reliable/robust/secure
65CSMA Carrier Sense Multiple Access
- CSMA: listen before transmit
- Objective: approximate slotted Aloha (compared with pure Aloha)
- If backlogged, wait until channel sensed idle, then transmit pkt with prob. p
- human analogy: don't interrupt others!
66CSMA Collisions
[Figure: space-time diagram; spatial layout of nodes A, B, C, D along Ethernet vs. time from t0]
- collisions can still occur: propagation delay means two nodes may not hear each other's transmission
- Collision: entire packet transmission time wasted; still not very efficient!
67CSMA/CD (Collision Detection)
- Human analogy: the polite conversationalist
- CSMA/CD
- observations
- collisions can be detected within short time
- if colliding transmissions are aborted, we can reduce channel wastage
- carrier sensing, deferral as in CSMA
- collision detection
- easy in wired LANs: measure signal strengths, compare transmitted, received signals
- difficult in wireless LANs: receiver shuts off while transmitting
68CSMA/CD Collision Detection
[Figure: two space-time diagrams; spatial layout of nodes A, B, C, D along Ethernet vs. time from t0; B detects collision, aborts; D detects collision, aborts]
- instead of wasting the whole packet transmission time, abort after detection.
69Efficiency of CSMA/CD
- Given collision detection, instead of wasting the whole packet transmission time (a slot), we waste only the time needed to detect collision.
- Use a contention slot of 2 T, where T is one-way propagation delay (why 2 T ?)
- When the transmission probability p is approximately optimal (p = 1/N), we try approximately e times before each successful transmission
- P: packet size, e.g. 1000 bits; C: link capacity, e.g. 10 Mbps; packet transmission time is P/C
70Efficiency of CSMA/CD
- The efficiency (the percentage of useful time) is approximately (P/C) / (P/C + 2Te) = 1 / (1 + 2e a), where a = T / (P/C)
- The value of a plays a fundamental role in the efficiency of CSMA/CD protocols.
- Question: you want to increase the capacity of a link layer technology (e.g., 10 Mbps Ethernet to 100 Mbps), but still want to maintain the same efficiency. What can you do?
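The question can be explored numerically. The sketch below assumes the model from the previous slide: each successful packet (P/C seconds) is preceded by about e contention slots of length 2T. The function name and the one-way delay of 25.6 microseconds are assumptions chosen for illustration; P = 1000 bits and C = 10 Mbps are the slide's example values.

```python
import math

def csma_cd_efficiency(packet_bits, capacity_bps, one_way_delay_s):
    """Useful time / total time, assuming about e contention slots of 2T per packet."""
    transmit_time = packet_bits / capacity_bps
    contention_time = math.e * 2 * one_way_delay_s
    return transmit_time / (transmit_time + contention_time)

T = 25.6e-6  # assumed one-way propagation delay in seconds
base = csma_cd_efficiency(1000, 10e6, T)
faster_only = csma_cd_efficiency(1000, 100e6, T)           # 10x capacity, nothing else
shorter_net = csma_cd_efficiency(1000, 100e6, T / 10)      # also shrink the network
bigger_pkts = csma_cd_efficiency(10000, 100e6, T)          # or grow the packets
print(round(base, 3), round(faster_only, 3), round(shorter_net, 3), round(bigger_pkts, 3))
```

Under this model, raising C alone raises a and lowers efficiency; keeping a constant requires a proportionally smaller T (shorter network) or a larger P.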
71Summary of Problems to be Addressed
- Approximate slotted Aloha
- Reduce the penalty of collision or empty slots
- Infer optimal transmission rate
72Physical Layer
73Internet Bandwidth Growth
Source: TeleGeography Research
74What Determines Transmission Rate?
- Service: transmit a bit stream from a sender to a receiver
[Figure: input bit stream -> sender (Encoding) -> channel -> receiver (Decoding) -> output bit stream]
Question to be addressed: how much can we send through the channel?
75Basic Theory Channel Capacity
- The maximum number of bits that can be transmitted per second (bps) by a physical medium is C = W log2(1 + S/N)
- where W is the frequency range (bandwidth in Hz) and S/N is the signal-to-noise ratio. We assume Gaussian noise.
76Fourier Transform
- Suppose the period of a data unit is T (fundamental frequency f = 1/T), then the data unit can be represented as the sum of many harmonics (sin(), cos()) with frequencies f, 2f, 3f, 4f, ...
- A reasonably behaved periodic function g(t), with minimal period T, can be constructed as the sum of a series of sines and cosines
77char b
78Signal Attenuation
- The quality of signal will degrade when it travels
- loss, frequency passing
79Frequency Dependent Attenuation
- The received signal will be distorted even when there is no interference and the transmitted signal is a perfect square waveform
Example: Voltage-attenuation magnitude ratios of Category 5 cable. For example, 500 feet of cable attenuates a 10-MHz, 1-V signal to 0.32 V, which corresponds to about 9.90 dB (= 20 log10(1/0.32))
80Example
V.34 (33.6 kbps Dialup Modem)
[Figure: input bit stream -> sender modem (modulation, digital -> analog) -> telephone network channel (3 kHz bandwidth, adds white noise; analog-to-digital quantization for transmitting through the digital telephone backbone) -> ISP modem (demodulation) -> output bit stream]
- Example: W = 3000 Hz, S/N ≈ 4000
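Plugging the example numbers into the capacity formula C = W log2(1 + S/N) gives a feel for why dial-up modems topped out in the tens of kbps. The function name is an assumption; the inputs are the slide's example values, with S/N read as roughly 4000.

```python
import math

def shannon_capacity_bps(bandwidth_hz: float, snr: float) -> float:
    """Channel capacity C = W * log2(1 + S/N), with S/N as a plain ratio (not dB)."""
    return bandwidth_hz * math.log2(1 + snr)

capacity = shannon_capacity_bps(3000, 4000)
print(round(capacity / 1000, 1), "kbps")   # upper bound for this channel
```

The bound lands slightly above the 33.6 kbps that V.34 achieves, so that modem generation was already operating close to the limit of the voice channel.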
81Example ADSL
- Spectrum allocation: divided into a total of 256 downstream and 32 upstream tones, where each tone is a standard 4 kHz voice channel
- During initial negotiation, a tone is used only if the S/N is above 6 dB (≈ 4)
82Faster
83The Wire Fiber
- A look at a fiber
- How does it work?
[Figure: a graded index fiber]
84The Wire Fiber
- Wide spectrum at low loss: 0.3 dB/km (c.f. copper 190 dB/km @ 100 MHz), 30-100 km without repeater
- Bandwidth of a single fiber
- theoretical: 100-200 Tbps http://www.trnmag.com/Stories/080101/Study_shows_fiber_has_room_to_grow_080101.html
- Lightweight: 33 tons of copper to transmit the same amount of information carried by ¼ pound of optical fiber
85Advantages of Fibers
86How to Do Switching?
- Optical-Electrical-Optical
- Optical switch: optical micro-electro-mechanical systems (MEMS)
[Figure: optical path through one optical switch]
http://www.qwest.com/largebusiness/enterprisesolutions/networkMaps/preloader.swf
87Example MEMS Optical Switch
- Using mirrors, e.g. Lambda Router
88Implications
- Fine-grained switching may not be feasible
- What is the architecture of optical networks: packet switching, circuit switching, or others?
89More Efficient
90Problem Inefficient Interactions
- Large deployment of highly adaptive, multipoint applications
- An iterative process between two sets of adaptation
- ISP: traffic engineering to change routing to shift traffic away from higher utilized links
- current traffic pattern -> new routing matrix
- App: direct traffic to better performing end points
- current routing matrix -> new traffic pattern
91ISP Traffic Engineering App Latency Optimizer
- red: App adjusts alone; fixed ISP routing
- blue: ISP traffic engineering adapts alone; fixed App communications
ISP optimizer interacts poorly with App.
92The Fundamental Problem
- Traditional Internet architectural feedback to application efficiency is limited
- routing (hidden)
- rate control through coarse-grained TCP congestion feedback
- To achieve better efficiency, we need explicit communications between network resource providers and applications
93P4P Framework Design Goals
- Performance improvement
- Scalability and extensibility: support diverse ISP objectives and application scenarios in large networks
- Privacy preservation
- Ease of implementation
- Open standard: any ISP, provider, or application can easily implement it
94Current Status
- AT&T
- Bezeq Intl
- BitTorrent
- CacheLogic
- Cisco Systems
- Grid Networks
- Joost
- LimeWire
- Manatt
- Oversi
- Pando Networks
- PeerApp
- Telefonica Group
- VeriSign
- Verizon
- Vuze
- Univ of Washington
- Yale University
- Abacast
- AHT Intl
- Akamai
- Alcatel Lucent
- CableLabs
- Cablevision
- Comcast
- Cox Comm
- Juniper Networks
- Microsoft
- MPAA
- NBC Universal
- Nokia
- RawFlow
- Solid State Networks
- Thomson
- Time Warner Cable
- Turner Broadcasting
- P4P-WG
- Next step
- wider integration
- IETF standard
95Reliability
96Is the Internet Reliable?
- A key design objective of the Internet (i.e., packet-switched networks) is robustness
- Does the Internet infrastructure achieve the target reliability objective of a highly reliable system (99.999%)?
97Perspective
- 911 Phone service (1993 NRIC report)
- 29 minutes per year per line
- 99.994% availability
- Std. Phone service (various sources)
- 53 minutes per line per year
- 99.99% availability
- what about the Internet?
- Various studies: about 99.5%
- Need to reduce down time by 500 times to achieve five nines; 50 times to match phone service
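The ratios above follow directly from converting availability into downtime per year. A small sketch, assuming a 365-day year; the function names are made up for the illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes, assuming a 365-day year

def downtime_minutes(availability_percent: float) -> float:
    """Expected minutes of downtime per year at the given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR

def availability_percent(downtime_min: float) -> float:
    """Availability implied by a given number of downtime minutes per year."""
    return 100 * (1 - downtime_min / MINUTES_PER_YEAR)

internet = downtime_minutes(99.5)
print(round(internet))                               # minutes per year at 99.5%
print(round(internet / downtime_minutes(99.999)))    # factor to reach five nines
print(round(internet / 53))                          # factor to match std. phone service
```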
98Unreachable Networks 10 days
99Internet Disaster Recovery Response
- Why slow response?
- the cable repair is slow: not until 21 days after the quake
- BGP is not designed to create business relationships
- Objective
- a meta-BGP to facilitate discovery and creation of BGP business relationships
101Backup IP Multicast
102IP Fragmentation Reassembly
- Network links have MTU (max. transfer size): largest possible link-level frame.
- different link types, different MTUs, e.g. Ethernet MTU is 1500 bytes
- Large IP datagram divided (fragmented)
- one datagram becomes several datagrams
- reassembled only at final destination
- IP header bits used to identify, order related fragments
[Figure: fragmentation: in, one large datagram; out, 3 smaller datagrams; reassembly at the destination]
103IP Fragmentation and Reassembly
- Example
- 4000 byte datagram
- MTU 1500 bytes
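The example can be worked through in code. This sketch assumes a 20-byte IPv4 header with no options and uses the IPv4 rule that fragment offsets are expressed in 8-byte units; the function and field names are illustrative, not taken from the slides.

```python
IP_HEADER_BYTES = 20   # assumes no IP options

def fragment(total_length: int, mtu: int):
    """Split a datagram of total_length bytes into fragments that fit the MTU."""
    payload = total_length - IP_HEADER_BYTES
    # Data per fragment must be a multiple of 8 bytes (offsets count 8-byte units).
    max_data = (mtu - IP_HEADER_BYTES) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        data = min(max_data, payload - offset)
        fragments.append({
            "length": data + IP_HEADER_BYTES,   # each fragment carries its own header
            "offset": offset // 8,              # value stored in the offset field
            "more_fragments": offset + data < payload,
        })
        offset += data
    return fragments

for frag in fragment(4000, 1500):
    print(frag)
```

The 4000-byte datagram carries 3980 data bytes, which go out as 1480 + 1480 + 1020 bytes in three fragments; only the last one has the more-fragments flag cleared.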
104IP Multicast Service Model
- Multicast group concept: use of indirection
- A group is identified by a location-independent logical address (class D IP address, prefix 1110)
- Open group model
- Anyone can send packets to the logical group address
- Anyone can join a group and receive packets
- Normal, best-effort delivery semantics of IP
Needed: infrastructure to deliver mcast-addressed datagrams to all hosts that have joined that multicast group
105Multicast Across LANs
- Goal: find a tree (or trees) connecting routers having local mcast group members
- source-based: different tree from each sender to the receivers
- Distance-vector multicast routing protocol (DVMRP)
- Protocol-independent multicast-dense mode (PIM-DM)
- shared-tree: same tree used by all group members
- Core-Based Tree (CBT)
- Protocol-independent multicast-sparse mode (PIM-SM)
[Figure: shared tree]
106Source Tree Reverse Path Flooding (RPF)
- A router x forwards a packet from source (S) iff it arrives via neighbor y, and y is on the shortest path from x back to S
- A packet is replicated to all but the incoming interface
[Figure: example topology with source S and routers x, y, z, t, a; all link costs 1]
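The forwarding rule can be sketched with a breadth-first search standing in for unicast routing (valid here because all link costs are 1). The topology below is hypothetical, chosen for illustration and not a reconstruction of the slide's figure; `next_hop_toward` and `rpf_forward` are made-up names.

```python
from collections import deque

def next_hop_toward(graph, source):
    """BFS from source: parent[x] is x's neighbor on a shortest path back to source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent

def rpf_forward(graph, parent, router, arrived_from):
    """Return the neighbors that get a copy, or [] if the packet fails the RPF check."""
    if parent[router] != arrived_from:
        return []                      # not on the shortest path back to S: drop
    return [n for n in graph[router] if n != arrived_from]

# Hypothetical topology with unit link costs.
GRAPH = {
    "S": ["y"],
    "y": ["S", "x", "z"],
    "x": ["y", "z", "a"],
    "z": ["y", "x", "t"],
    "t": ["z"],
    "a": ["x"],
}
PARENT = next_hop_toward(GRAPH, "S")
print(rpf_forward(GRAPH, PARENT, "x", "y"))   # accepted: y is x's next hop to S
print(rpf_forward(GRAPH, PARENT, "x", "z"))   # dropped: copy arriving via z
```

Note that x still sends a copy to z, which z then drops; that wasted transmission is what the child-link improvement on the next slide removes.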
107Reverse Path Forwarding Improvement
- Basic idea: forward a packet from S only on child links for S
- A child link of router x for source S
- a link that has x as parent on the shortest path from the link to S
- a child x notifies its parent y (through the routing protocol) that it has selected y as its parent
[Figure: example topology with source S and routers x, y, z, t, a]
108Reverse Path Forwarding Pruning
- No need to forward datagrams down subtree with no mcast group members
- prune msgs sent upstream by router with no downstream group members
[Figure: source S and routers R1-R7; prune messages (P) sent upstream. Legend: router with attached group member; router with no attached group member; prune message; links with multicast forwarding]
109Pruning
- Prune (Source, Group) at a leaf router if no members
- send No-Membership Report (NMR) up tree
- If all children of router R prune (S,G)
- propagate prune for (S,G) to its parent
- What do you do when a member of a group (re)joins?
- send a Graft message to upstream parent
- How to deal with failures?
- prune dropped
- flow is reinstated
- downstream routers re-prune
- Note: again a soft-state approach
110Implementation of Source Trees in the Internet
- Multicast OSPF (MOSPF)
- Membership is part of the link state distribution; calculate source-specific, pre-pruned trees
- Reverse Path Forwarding
- Distance Vector Multicast Routing Protocol (DVMRP)
- Protocol Independent Multicast Dense Mode (PIM-DM)
- very similar to DVMRP
- Difference: PIM uses any unicast routing algorithm to determine the path from a router to the source; DVMRP uses distance vector
- Question: the state requirement of Reverse Path Forwarding
111Building a Shared Tree
- Steiner Tree: minimum cost tree connecting all routers with attached group members
- A Steiner tree is not a spanning tree because you do not need to connect all nodes in the network
- Problem is NP-hard
- Excellent heuristics exist
- Not used in practice
- computational complexity
- information about entire network needed
- monolithic: rerun whenever a router needs to join/leave
112Center (Core) based Shared Tree
- Single delivery tree shared by all
- One router identified as center of tree
- Tree construction is receiver-based
- edge router sends unicast join-msg addressed to center router
- join-msg processed by intermediate routers and forwarded towards center
- join-msg either hits existing tree branch for this center, or arrives at center
- path taken by join-msg becomes new branch of tree for this router
- A sender unicasts a packet to center
- The packet is distributed on the tree when it hits the tree
113Example M3 Joins
[Figure: core, members M1, M2, M3, and sender S1; M3 joins]
Discussion: what is the property of the constructed tree?
114Example M1 Sends Data
- Group members M1, M2, M3
- M1 sends data
[Figure: core, members M1, M2, M3, and sender S1; legend: control (join) messages, data]
115Shared Tree Protocols in the Internet
- Core Based Tree
- Protocol Independent Multicast (PIM) Sparse mode
- The catch: how do you know the center?
- session announcement
116Mbone Tunneling
Q: How to connect islands of multicast routers in a sea of unicast routers?
[Figure: physical topology and logical topology]
- mcast datagram encapsulated inside normal (non-multicast-addressed) datagram
- normal IP datagram sent thru tunnel via regular IP unicast to receiving mcast router
- receiving mcast router unencapsulates to get mcast datagram