High Performance Switches and Internet Routers: Architecture and Scheduling

1
High Performance Switches and Internet Routers
Architecture and Scheduling
TU Delft, June 18, 2004
Lotfi Mhamdi, Computer Engineering Lab., HKUST, Hong Kong
http://www.cs.ust.hk/lotfi
2
Outline

The Need for fast Routers
Routers Architecture
Scheduling in Input Queued (IQ) Switches
The Buffered Crossbar Switching Architecture (BCS)
Scheduling in BCS
Output Queueing (OQ) switch Emulation
Concluding Remarks

3
Where high performance packet switches are used
- Core Router - ATM Switch - Frame Relay Switch
The Internet Core
4
Recent trends
6
Basic Architectural Components: Data Path (Per-Packet Processing)
[Figure: data path through ingress, interconnect, and egress stages]
7
Interconnects: Two Basic Techniques
Input Queueing: usually a non-blocking switch fabric (e.g. crossbar)
Output Queueing: usually a fast bus or shared memory
8
Output Queueing: The Ideal
9
Input Queueing: Head-of-Line Blocking
10
Head of Line Blocking
13
Input Queueing: Virtual Output Queues
14
IQ Switch with VOQs
It can be quite complex!
16
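Matching (Scheduling)
Service matrix $S(n) = [S_{i,j}(n)]$, where
$S_{i,j}(n) = 1$ if input $i$ sends to output $j$, and $S_{i,j}(n) = 0$ otherwise.
Our objective is to maximize $S(n)$ subject to
$\sum_i S_{i,j}(n) \le 1$ for every output $j$, and $\sum_j S_{i,j}(n) \le 1$ for every input $i$.
Matching: Maximum Weight or Maximum Size?
[Figure: request graph and the resulting bipartite match]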
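
The two constraints on the service matrix can be checked mechanically. The following is a minimal sketch, assuming the matrix is given as a list of 0/1 rows; the function names `is_valid_matching` and `matching_size` are illustrative and do not come from the slides.

```python
def is_valid_matching(S):
    """Return True if the 0/1 service matrix S is a valid crossbar matching."""
    n_inputs = len(S)
    n_outputs = len(S[0]) if S else 0
    # Each input i sends to at most one output: sum over j of S[i][j] <= 1.
    rows_ok = all(sum(row) <= 1 for row in S)
    # Each output j receives from at most one input: sum over i of S[i][j] <= 1.
    cols_ok = all(
        sum(S[i][j] for i in range(n_inputs)) <= 1 for j in range(n_outputs)
    )
    return rows_ok and cols_ok


def matching_size(S):
    """Number of input/output pairs connected in this time slot."""
    return sum(sum(row) for row in S)
```

A maximum size matching maximizes `matching_size(S)` over all valid `S`; a maximum weight matching would instead maximize a weighted sum of the entries.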
17
Maximum Size/Weight Matching

Maximum Size Matching
Maximizes instantaneous throughput
Complexity $O(N^{2.5})$
Hard to implement in hardware, slow

Maximum Weight Matching
Weight (queue length, waiting time) gives 100% throughput
Stable under any admissible input traffic (LQF, OCF)
Complexity $O(N^3 \log N)$
Hard to implement in hardware and very slow
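
To make the maximum size matching concrete, here is a small sketch using the simple augmenting-path method. This is not the $O(N^{2.5})$ algorithm quoted above; it is a slower, easier-to-read alternative that returns a matching of the same (maximum) size. The `requests` matrix and the function name are assumptions made for illustration.

```python
def maximum_size_matching(requests):
    """Maximum size bipartite matching by repeated augmenting paths.

    requests[i][j] is truthy when input i has a cell queued for output j.
    Returns match_of_output, where match_of_output[j] is the input matched
    to output j, or None if output j stays idle.
    """
    n_in = len(requests)
    n_out = len(requests[0]) if requests else 0
    match_of_output = [None] * n_out

    def try_augment(i, visited):
        # Try to give input i an output, re-routing earlier matches if needed.
        for j in range(n_out):
            if requests[i][j] and j not in visited:
                visited.add(j)
                current = match_of_output[j]
                if current is None or try_augment(current, visited):
                    match_of_output[j] = i
                    return True
        return False

    for i in range(n_in):
        try_augment(i, set())
    return match_of_output
```

For example, with `requests = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]`, input 0 is first matched to output 0 and then moved to output 1 so that input 1 can use output 0. This kind of global re-routing is what makes the approach costly in hardware.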

18
Parallel Iterative Matching (PIM)
Random Selection

100% throughput under uniform traffic (converges in log N iterations)
63% throughput with 1 iteration (pointer synchronization)
Quite complex in hardware (random encoders)
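
The request/grant/accept cycle with random selection can be sketched as follows. This is an illustrative software model under simplifying assumptions (a 0/1 `requests` matrix, Python's `random.Random` standing in for the hardware random encoders); the function name `pim` and its parameters are hypothetical.

```python
import random


def pim(requests, iterations, rng=None):
    """Parallel Iterative Matching sketch; returns in_match[i] = output or None."""
    if rng is None:
        rng = random.Random()
    n_in, n_out = len(requests), len(requests[0])
    in_match = [None] * n_in
    out_match = [None] * n_out
    for _ in range(iterations):
        # Request + grant: each unmatched output picks, at random, one of the
        # unmatched inputs that has a cell queued for it.
        grants = {}  # input -> list of outputs that granted to it
        for j in range(n_out):
            if out_match[j] is not None:
                continue
            candidates = [
                i for i in range(n_in) if in_match[i] is None and requests[i][j]
            ]
            if candidates:
                grants.setdefault(rng.choice(candidates), []).append(j)
        if not grants:
            break  # nothing left to match
        # Accept: each input that received grants picks one at random.
        for i, offered in grants.items():
            j = rng.choice(offered)
            in_match[i] = j
            out_match[j] = i
    return in_match
```

Each iteration only considers ports left unmatched by earlier iterations, so the matching can only grow from one iteration to the next.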

19
iSLIP
Round-Robin Selection

Easy to implement in hardware
Converges in log N iterations
100% throughput under uniform traffic
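
A single iSLIP-style iteration, with round-robin pointers replacing random selection, might look like the sketch below. It assumes a square N x N switch and models only one iteration per time slot; the names `islip_iteration`, `grant_ptr`, and `accept_ptr` are illustrative.

```python
def islip_iteration(requests, grant_ptr, accept_ptr):
    """One round-robin request/grant/accept iteration (iSLIP-style sketch).

    grant_ptr[j] and accept_ptr[i] are the round-robin pointers of output j
    and input i. They are updated in place, and only when a grant is accepted.
    Returns the list of matched (input, output) pairs.
    """
    n = len(requests)
    # Grant: each output grants to the first requesting input at or after
    # its pointer, in round-robin order.
    grants = {}  # input -> list of outputs that granted to it
    for j in range(n):
        for k in range(n):
            i = (grant_ptr[j] + k) % n
            if requests[i][j]:
                grants.setdefault(i, []).append(j)
                break
    # Accept: each input accepts the first granting output at or after
    # its pointer, in round-robin order.
    matches = []
    for i, offered in grants.items():
        j = min(offered, key=lambda out: (out - accept_ptr[i]) % n)
        matches.append((i, j))
        grant_ptr[j] = (i + 1) % n   # move one past the accepted input
        accept_ptr[i] = (j + 1) % n  # move one past the accepted output
    return matches
```

On a fully loaded 2x2 switch with all pointers starting at 0, the first time slot matches only one pair because both outputs grant to input 0. The pointer update then separates the outputs, and the second time slot matches both pairs.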

20
Performance: 16x16 Switch, Uniform Traffic
[Plot with curves for FIFO, PIM, RRM, and iSLIP]
21
Pointer Synchronization
22
Performance: 3x3 Switch, Non-uniform Traffic
23
So Far

Maximum size/weight matching algorithms are impractical (hardware implementation).
Iterative matching algorithms (PIM, iSLIP, ...) are practical but unstable under non-uniform traffic.
Their centralized design is a bottleneck.

Is it possible to design distributed scheduling algorithms?
If yes, how?

25
Buffered Crossbar Switch Architecture
26
I/O Contention Resolution
28
Scheduling Process

Scheduling is divided into three steps:
Input scheduling: every input i independently (in parallel) selects the HoL cell of an eligible VOQ and sends it to the corresponding internal buffer.
Output scheduling: every output j independently (in parallel) selects a cell from among all non-empty internal buffers XPi,j to be delivered to the output port.
Delivery notifying (flow control): for each delivered cell, inform the corresponding input of the internal buffer status.

Eligible VOQi,j: VOQi,j is non-empty and XPi,j is empty.
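
The three steps can be sketched as one time slot of a software model. This is a simplified illustration, not the hardware design: it assumes one cell per internal buffer, round-robin arbiters at both inputs and outputs, and that a cell placed in an internal buffer may leave in the same time slot. The names `bcs_time_slot`, `voq`, `xp`, `in_ptr`, and `out_ptr` are hypothetical.

```python
from collections import deque


def bcs_time_slot(voq, xp, in_ptr, out_ptr):
    """Run one time slot of a buffered crossbar; returns delivered cells.

    voq[i][j] is a deque of cells at input i destined to output j.
    xp[i][j] is the internal (crosspoint) buffer: a cell or None.
    """
    n = len(voq)
    # 1. Input scheduling: each input independently picks an eligible VOQ
    #    (non-empty VOQ whose internal buffer is empty) and forwards its HoL cell.
    for i in range(n):
        for k in range(n):
            j = (in_ptr[i] + k) % n
            if voq[i][j] and xp[i][j] is None:
                xp[i][j] = voq[i][j].popleft()
                in_ptr[i] = (j + 1) % n
                break
    # 2. Output scheduling: each output independently picks one non-empty
    #    internal buffer in its column.
    delivered = []
    for j in range(n):
        for k in range(n):
            i = (out_ptr[j] + k) % n
            if xp[i][j] is not None:
                delivered.append((i, j, xp[i][j]))
                # 3. Flow control: freeing the buffer is what makes VOQ(i, j)
                #    eligible again at input i.
                xp[i][j] = None
                out_ptr[j] = (i + 1) % n
                break
    return delivered
```

Note that two inputs can both forward a cell for the same output in one time slot, because each has its own internal buffer. The contention is resolved later by the output arbiter, with no central matching step.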
29
Existing Algorithms

Round Robin (RR-RR): round-robin scheduling at both the inputs and the outputs.

Oldest Cell First (OCF-OCF): select the oldest HoL cell at each input, and the oldest cell at each output.

Longest Queue First - Round Robin (LQF-RR): select the HoL cell of the longest VOQ at each input, and round robin at the outputs.
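
The three selection rules differ only in how an arbiter ranks its candidates. A minimal sketch follows, assuming the arbiter is handed the list of eligible queue indices along with per-queue lengths or HoL arrival times; all function and parameter names are illustrative.

```python
def select_rr(pointer, eligible, n):
    """Round robin: first eligible index at or after the pointer (mod n)."""
    if not eligible:
        return None
    return min(eligible, key=lambda j: (j - pointer) % n)


def select_ocf(hol_arrival_times, eligible):
    """Oldest Cell First: eligible queue whose HoL cell arrived earliest."""
    if not eligible:
        return None
    return min(eligible, key=lambda j: hol_arrival_times[j])


def select_lqf(queue_lengths, eligible):
    """Longest Queue First: eligible queue holding the most cells."""
    if not eligible:
        return None
    return max(eligible, key=lambda j: queue_lengths[j])
```

Because every arbiter works only on its own row or column of queues, each of these can run independently per port.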

30
Internal-Buffer-Based Scheduling
XPi,j: internal buffer
31
SBF-LBF Performance
32
Performance (VOQs Occupancies)
34
OQ Emulation: The Speedup Problem
Output Queueing: best delay and throughput performance, but requires a high fabric speedup (S = N)
Input Queueing: a speedup of one is sufficient, but delay is unpredictable due to input contention
[Figure: memory speeds for a 32x32 ATM switch]
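
The cost of a speedup of N shows up in the memory access time. The sketch below works through the arithmetic under stated assumptions: a whole cell is written or read in one memory access, an output-queued memory must absorb up to N writes plus 1 read per cell time, and an input-queued memory sees 1 write plus 1 read. The line rate is a parameter of the example, not a figure taken from the slides.

```python
def cell_access_time_ns(line_rate_bps, cell_bytes, accesses_per_cell_time):
    """Time budget per memory access, in nanoseconds."""
    cell_time_s = cell_bytes * 8 / line_rate_bps
    return cell_time_s / accesses_per_cell_time * 1e9


def oq_access_time_ns(n_ports, line_rate_bps, cell_bytes=53):
    # Output queueing: up to N arrivals (writes) plus 1 departure (read)
    # per cell time.
    return cell_access_time_ns(line_rate_bps, cell_bytes, n_ports + 1)


def iq_access_time_ns(line_rate_bps, cell_bytes=53):
    # Input queueing: 1 arrival (write) plus 1 departure (read) per cell time.
    return cell_access_time_ns(line_rate_bps, cell_bytes, 2)


if __name__ == "__main__":
    rate = 155.52e6  # assumed line rate in bit/s, chosen only for illustration
    print(f"OQ, 32 ports: {oq_access_time_ns(32, rate):.1f} ns per access")
    print(f"IQ:           {iq_access_time_ns(rate):.1f} ns per access")
```

Whatever the line rate, the output-queued memory of a 32-port switch must be 33/2 = 16.5 times faster than the input-queued one under these assumptions.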
35
The Ideal Solution
Find a compromise, 1 < Speedup << N, to get the performance of an OQ switch at close to the cost of an IQ switch.

Question: can we find a simple and good algorithm that exactly mimics output queueing, regardless of switch sizes and traffic patterns?

36
Proposed Algorithms

IQ bufferless crossbar switch
A speedup of 4 was shown to be sufficient (MUCF algorithm).
A speedup of just two was also shown to be sufficient (GBVOQ algorithm).
The bad news: both schemes are impractical (high complexity due to the centralized scheduling).

Buffered crossbar switch
A speedup of just two was proven to be enough for the exact emulation of an OQ switch (MCAF_LTF algorithm).
MCAF_LTF is practical and simple to implement in hardware.

38
Concluding Remarks

The IQ crossbar switching architecture is becoming less attractive due to its scalability and scheduling complexity challenges.
The BCS switching architecture shows good potential for overcoming the problems of IQ switching.

Open Questions

Is 100% throughput achievable with a speedup < 2 for buffered crossbars using parallel scheduling?
Will storing multiple cells per crosspoint further simplify scheduling?
