Transcript and Presenter's Notes

Title: The ongoing evolution from Packet-based networks to Hybrid Networks in Research & Education Networks


1
  • The ongoing evolution from Packet-based networks
    to Hybrid Networks in Research & Education
    Networks

Olivier Martin, CERN
Swiss ICT Task Force (Fribourg)
2
Presentation Outline
  • The demise of conventional packet-based networks
    in the R&E community
  • The advent of community-managed dark fiber
    networks
  • The Grid & its associated Wide Area Networking
    challenges
  • "On-Demand Lambda Grids"
  • Ethernet over SONET & new standards
  • WAN-PHY, GFP, VCAT/LCAS, G.709, OTN
  • Disclaimer: The views expressed herein are not
    necessarily those of CERN. Furthermore, although I
    am formally a CERN staff member until July 31,
    2006, I have not worked for CERN since October 3,
    being on a pre-retirement program.

3
(No Transcript)
4
(Figure labels: OC-768c, 40-GE)
5
Some facts
(5 of 12)
  • Internet is everywhere
  • Ethernet is everywhere
  • The advent of next-generation G.709 Optical
    Transport Networks is very uncertain!
  • hence one has to learn how to live best with
    existing network infrastructures,
  • which may well explain all the hype about
    on-demand lambda Grids!
  • For the first time in the history of the
    Internet, the Commercial and the Research &
    Education Internet appear to follow different
    routes
  • Will they ever converge again?
  • Dark-fiber-based, customer-owned long-distance
    networks are booming!
  • users are becoming their own Telecom Operators
  • Is it a good or a bad thing?

6
Internet Backbone Speeds
(Chart: Internet backbone speed evolution, in Mbps:
T1 lines, T3 lines, ATM VCs, OC3c, OC12c, IP/λ)
7
High Speed IP Network Transport Trends
Multiplexing, protection and management at every
layer
(Diagram: the B-ISDN style stack of IP, signalling,
ATM, SONET/SDH and Optical layers collapsing into
fewer layers)
Higher speed; lower cost, complexity and overhead
8
(No Transcript)
9
(No Transcript)
10
Network Exponentials
  • Network vs. computer performance
  • Computer speed doubles every 18 months
  • Network speed doubles every 9 months
  • Difference: an order of magnitude every 5 years
    (worked out in the sketch below)
  • 1986 to 2000
  • Computers x 500
  • Networks x 340,000
  • 2001 to 2010
  • Computers x 60
  • Networks x 4000

Moore's Law vs. storage improvements vs. optical
improvements. Graph from Scientific American
(Jan-2001) by Cleo Vilett, source Vinod Khosla,
Kleiner, Caufield and Perkins.
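The doubling times quoted above imply roughly a tenfold divergence between
network and computer performance every five years; a minimal sketch of that
arithmetic (only the 18- and 9-month doubling periods come from the slide,
the helper name is mine):

    # Growth of computer vs. network performance, assuming the
    # doubling periods quoted on the slide (18 vs. 9 months).
    def growth(months, doubling_period_months):
        """Multiplicative improvement after `months` months."""
        return 2 ** (months / doubling_period_months)

    months = 5 * 12
    computers = growth(months, 18)   # roughly 10x in 5 years
    networks = growth(months, 9)     # roughly 100x in 5 years
    print(f"computers x{computers:.0f}, networks x{networks:.0f}, "
          f"gap x{networks / computers:.0f} per 5 years")
    # -> computers x10, networks x102, gap x10 per 5 years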
11
Know the user
(3 of 12)
(Chart: number of users vs. BW requirements, from
ADSL to GigE LAN, for the three user classes A, B
and C)
A → Lightweight users: browsing, mailing, home use
B → Business applications: multicast, streaming,
VPNs, mostly LAN
C → Special scientific applications: computing,
data grids, virtual presence
12
What the user needs
(4 of 12)
(Chart: total BW vs. BW requirements, from ADSL to
GigE LAN, for the three user classes A, B and C)
A → Need full Internet routing, one to many
B → Need VPN services and/or full Internet routing,
several to several
C → Need very fat pipes, limited multiple Virtual
Organizations, few to few
13
So what are the facts
(5 of 12)
  • Costs of fat pipes (fibers) are one third of the
    cost of the equipment needed to light them up
  • or so the Lambda salesmen told Cees de Laat
    (University of Amsterdam & SURFnet)
  • Costs of optical equipment are 10% of switching,
    which is 10% of full routing equipment, for the
    same throughput
  • 100-byte packet @ 10 Gb/s → 80 ns to look it up in
    a 100-Mbyte routing table (light travels from me to
    you on the back row in that time!); the arithmetic
    is sketched below
  • Big sciences need fat pipes
  • Bottom line: create a hybrid architecture which
    serves all users in one coherent and cost-effective
    way

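The 80 ns figure is just the serialization time of a 100-byte packet on a
10 Gb/s link, i.e. the time a router has to complete the lookup before the
next such packet arrives; a minimal sketch of the arithmetic (variable names
are mine):

    # Time budget for one routing-table lookup at line rate:
    # the lookup must finish before the next minimum-size packet
    # has been received on the 10 Gb/s link.
    packet_bytes = 100
    link_bps = 10e9                          # 10 Gb/s

    budget_s = packet_bytes * 8 / link_bps
    print(f"lookup budget: {budget_s * 1e9:.0f} ns")        # -> 80 ns

    # In 80 ns light travels only ~24 m in vacuum (~16 m in fibre),
    # roughly "from me to you on the back row".
    print(f"light travel (vacuum): {3e8 * budget_s:.0f} m")  # -> 24 m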
14
Utilization trends
(Chart: utilization in Gbps vs. time, approaching
the network capacity limit; data up to Jan 2005)
15
Today's hierarchical IP network
(Diagram: each University connects to its NREN (A,
B, C or D); the NRENs connect to a National or
Pan-National IP Network, which peers with other
national networks)
16
Tomorrow's peer-to-peer IP network
(Diagram: a National DWDM Network with child
lightpaths connecting NRENs A-D, universities and
servers directly to each other and to the rest of
the world)
17
Creation of application VPNs
Direct connect bypasses campus firewall
(Diagram: university departments connect directly to
discipline-specific networks such as a High Energy
Physics network including CERN, a Bio-informatics
network and an eVLBI network, alongside the
Commodity Internet and the Research Network)
18
Production vs Research Campus Networks
  • Increasingly, campuses are deploying parallel
    networks for high-end users
  • Reduces costs by providing high-end network
    capability only to those who need it
  • Limitations of the campus firewall and border
    router are eliminated
  • Many issues with regard to security, back-door
    routing, etc.
  • Campus networks may follow the same evolution as
    campus computing
  • Discipline-specific networks are being extended
    into the campus

19
UCLP is intended for projects like National
LambdaRail
20
GEANT2 POP Design
21
LHC Data Grid Hierarchy
CERN/Outside Resource Ratio ~1:2
Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1
(Diagram of the tiered model:)
  • Experiment → Online System: ~PByte/sec
  • Online System → Tier 0 + 1 (CERN: ~700k SI95, ~1 PB
    disk, tape robot, HPSS): 100-400 MBytes/sec
  • Tier 0 + 1 → Tier 1 centers (FNAL: ~200k SI95,
    600 TB; IN2P3 Center; INFN Center; RAL Center):
    10 Gbps
  • Tier 1 → Tier 2 centers: 2.5/10 Gbps
  • Tier 2 → Tier 3 (Institutes, ~0.25 TIPS each):
    2.5 Gbps
  • Tier 3 → Tier 4 (physics data caches,
    workstations): 0.1-1 Gbps
Physicists work on analysis channels; each institute
has ~10 physicists working on one or more channels.
22
Main Networking Challenges
  • Fulfill the (yet unproven) assertion that the
    network can be "nearly" transparent to the Grid
  • Deploy suitable Wide Area Network infrastructure
    (50-100 Gb/s)
  • Deploy suitable Local Area Network infrastructure
    (matching or exceeding that of the WAN)
  • Seamless interconnection of LAN & WAN
    infrastructures
  • firewalls?
  • End-to-end issues (transport protocols, PCs
    (Itanium, Xeon), 10GigE NICs (Intel, S2io)): where
    are we today?
  • memory to memory: 7.5 Gb/s (PCI bus limit)
  • memory to disk: 1.2 GByte/s (Windows 2003
    Server/Newisys)
  • disk to disk: 400 MByte/s (Linux), 600 MByte/s
    (Windows)

23
Main TCP issues
  • Does not scale well in some environments:
  • High speed, high latency
  • Noisy
  • Unfair behaviour with respect to:
  • Round Trip Time (RTT)
  • Frame size (MSS)
  • Access bandwidth
  • Widespread use of multiple streams in order to
    compensate for inherent TCP/IP limitations (e.g.
    GridFTP, bbFTP)
  • A bandage rather than a cure
  • New TCP/IP proposals aim to restore performance
    in single-stream environments
  • Not clear if/when they will have a real impact
  • In the meantime there is an absolute requirement
    for backbones with:
  • zero packet losses,
  • and no packet re-ordering,
  • which reinforces the case for lambda Grids

24
TCP dynamics (10 Gbps, 100 ms RTT, 1500-byte
packets)
  • Window size (W) = Bandwidth × Round Trip Time
  • W = 10 Gbps × 100 ms = 1 Gbit
  • W = 1 Gbit / (8 × 1500) ≈ 83,333 packets
  • Standard Additive Increase Multiplicative
    Decrease (AIMD) mechanisms
  • W → W/2 (halving the congestion window on each
    loss event)
  • W → W + 1 (increasing the congestion window by
    one packet every RTT)
  • Time to recover from W/2 to W (congestion
    avoidance) at 1 packet per RTT
  • RTT × W/2 ≈ 1.157 hours
  • In practice, 1 packet per 2 RTT because of
    delayed ACKs, i.e. 2.31 hours
  • Packets per second
  • W / RTT ≈ 833,333 packets/s (see the sketch below)

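A minimal sketch reproducing the AIMD figures above (10 Gbps, 100 ms RTT,
1500-byte packets; variable names are mine):

    # Bandwidth-delay product and AIMD recovery time for a single
    # TCP stream at 10 Gbps with 100 ms RTT and 1500-byte packets.
    rate_bps = 10e9
    rtt_s = 0.100
    mss_bytes = 1500

    window_bits = rate_bps * rtt_s               # 1 Gbit in flight
    window_pkts = window_bits / (8 * mss_bytes)  # ~83,333 packets
    pkts_per_s = window_pkts / rtt_s             # ~833,333 packets/s

    # After a loss the window is halved; congestion avoidance then
    # adds one packet per RTT, so W/2 RTTs are needed to reach W again.
    recover_s = (window_pkts / 2) * rtt_s
    print(f"window: {window_pkts:.0f} packets, {pkts_per_s:.0f} pkt/s")
    print(f"recovery: {recover_s / 3600:.3f} h, "
          f"{2 * recover_s / 3600:.2f} h with delayed ACKs")
    # -> window: 83333 packets, 833333 pkt/s
    # -> recovery: 1.157 h, 2.31 h with delayed ACKs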
25
Internet2 land speed record history (IPv4 &
IPv6), period 2000-2004
26
Layer1/2/3 networking (1)
  • Conventional layer 3 technology is no longer
    fashionable because of:
  • High associated costs, e.g. 200-300 kUSD for a
    10G router interface
  • Implied use of shared backbones
  • The use of layer 1 or layer 2 technology is very
    attractive because it helps to solve a number of
    problems, e.g.:
  • the 1500-byte Ethernet frame size (layer 1)
  • Protocol transparency (layer 1 & layer 2)
  • Minimum functionality, hence, in theory, much
    lower costs (layer 1 & 2)

27
Layer1/2/3 networking (2)
  • "On-demand Lambda Grids" are becoming very
    popular
  • Pros:
  • circuit-oriented model like the telephone
    network, hence no need for complex transport
    protocols
  • Lower equipment costs (i.e. "in theory" a
    factor of 2 or 3 per layer)
  • the concept of a dedicated end-to-end lightpath
    is very elegant
  • Cons:
  • "End to end" is still very loosely defined: site
    to site, cluster to cluster, or really host to
    host?
  • Higher circuit costs, scalability, additional
    middleware to deal with circuit set-up/tear-down,
    etc.
  • Extending dynamic VLAN functionality is a
    potential nightmare!

28
"Lambda Grids": what does it mean?
  • Clearly different things to different people,
    hence the apparently easy consensus!
  • Conservatively, on-demand "site to site"
    connectivity
  • Where is the innovation?
  • What does it solve in terms of transport
    protocols?
  • Where are the savings?
  • Fewer interfaces needed (customer) but more
    standby/idle circuits needed (provider)
  • Economics from the service provider vs. the
    customer perspective?
  • Traditionally, switched services have been very
    expensive:
  • Usage vs. flat charge
  • Break-even point, switched vs. leased: a few
    hours/day
  • Why would this change?
  • In case there are no savings, why bother?
  • More advanced: cluster to cluster
  • Implies even more active circuits in parallel
  • Is it realistic?
  • Even more advanced: host to host, or even "per
    flow"
  • All optical
  • Is it really realistic?

29
Some Challenges
  • Real bandwidth estimates given the chaotic nature
    of the requirements.
  • End-to-end performance given the whole chain
    involved
    (disk-bus-memory-bus-network-bus-memory-bus-disk)
  • Provisioning over complex network infrastructures
    (GEANT, NRENs, etc.)
  • Cost model for options (packet SLAs, circuit
    switched, etc.)
  • Consistent performance (dealing with firewalls)
  • Merging leading-edge research with production
    networking

30
Tentative conclusions
  • There is a very clear trend towards
    community-managed dark fiber networks
  • As a consequence, National Research & Education
    Networks are evolving into Telecom Operators; is
    this right?
  • In the short term, almost certainly YES
  • In the longer term, probably NO
  • In many countries, there is NO other way to have
    affordable access to multi-Gbit/s networks,
    therefore this is clearly the right move
  • The Grid & its associated Wide Area Networking
    challenges
  • "On-demand Lambda Grids" are, in my opinion,
    extremely doubtful!
  • Ethernet over SONET & new standards will
    revolutionize the Internet
  • WAN-PHY (IEEE) has, in my opinion, NO future!
  • However, GFP, VCAT/LCAS, G.709 and OTN are very
    likely to have a very bright future.

31
Single TCP stream performance under periodic
losses
  • Loss rate: 0.01%
  • LAN BW utilization: 99%
  • WAN BW utilization: 1.2%

Bandwidth available: 1 Gbps
  • TCP throughput is much more sensitive to packet
    loss in WANs than in LANs (see the sketch below)
  • TCP's congestion control algorithm (AIMD) is not
    well suited to gigabit networks
  • The effect of packet loss can be disastrous
  • TCP is inefficient in high bandwidth×delay
    networks
  • The future performance outlook for computational
    grids looks bad if we continue to rely solely on
    the widely deployed TCP Reno

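A minimal sketch of why the same 0.01% loss rate leaves a LAN near line rate
but collapses WAN utilization to about 1%. It uses the well-known Mathis et
al. approximation for steady-state TCP throughput, ≈ 1.22 × MSS / (RTT × √p);
the approximation itself and the RTT values (1 ms LAN, 120 ms WAN) are my
assumptions, only the measured utilizations come from the slide:

    from math import sqrt

    # Steady-state TCP throughput under random loss p (Mathis et al.
    # approximation), capped at the available link rate.
    def mathis_bps(mss_bytes, rtt_s, loss_rate):
        return 1.22 * mss_bytes * 8 / (rtt_s * sqrt(loss_rate))

    link_bps = 1e9          # 1 Gbps available, as on the slide
    loss = 1e-4             # 0.01% packet loss
    mss = 1500              # bytes

    for name, rtt_s in [("LAN", 0.001), ("WAN", 0.120)]:
        bps = min(mathis_bps(mss, rtt_s, loss), link_bps)
        print(f"{name}: {100 * bps / link_bps:.1f}% of the link")
    # -> LAN: 100.0% of the link
    # -> WAN: 1.2% of the link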
32
Responsiveness
  • Time to recover from a single packet loss:

    r = C × RTT² / (2 × MSS)    with C the capacity
    of the link (reproduced in the sketch below)

Path               | Bandwidth | RTT (ms) | MTU (Byte) | Time to recover
LAN                | 10 Gb/s   | 1        | 1500       | 430 ms
Geneva-Chicago     | 10 Gb/s   | 120      | 1500       | 1 hr 32 min
Geneva-Los Angeles | 1 Gb/s    | 180      | 1500       | 23 min
Geneva-Los Angeles | 10 Gb/s   | 180      | 1500       | 3 hr 51 min
Geneva-Los Angeles | 10 Gb/s   | 180      | 9000       | 38 min
Geneva-Los Angeles | 10 Gb/s   | 180      | 64k (TSO)  | 5 min
Geneva-Tokyo       | 1 Gb/s    | 300      | 1500       | 1 hr 04 min
  • A large MTU accelerates the growth of the window
  • The time to recover from a packet loss decreases
    with a large MTU
  • A larger MTU reduces the overhead per frame (saves
    CPU cycles, reduces the number of packets)

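A minimal sketch evaluating the responsiveness formula above for the paths in
the table (treating the MTU column as the MSS, which is why the results differ
slightly from the quoted times; names are mine):

    # Time to recover full rate after a single loss:
    # r = C * RTT^2 / (2 * MSS), with C the link capacity in bit/s.
    def recovery_s(capacity_bps, rtt_s, mss_bytes):
        return capacity_bps * rtt_s ** 2 / (2 * mss_bytes * 8)

    paths = [
        ("LAN",                10e9, 0.001, 1500),
        ("Geneva-Chicago",     10e9, 0.120, 1500),
        ("Geneva-Los Angeles",  1e9, 0.180, 1500),
        ("Geneva-Los Angeles", 10e9, 0.180, 1500),
        ("Geneva-Los Angeles", 10e9, 0.180, 9000),
        ("Geneva-Tokyo",        1e9, 0.300, 1500),
    ]
    for name, cap, rtt, mss in paths:
        r = recovery_s(cap, rtt, mss)
        print(f"{name:20s} {cap / 1e9:4.0f} Gb/s  RTT {rtt * 1e3:3.0f} ms"
              f"  MSS {mss:5d} B  recover {r:8.1f} s (~{r / 60:.0f} min)")
    # The LAN recovers in ~0.4 s; Geneva-Chicago at 10 Gb/s needs
    # ~100 min, close to the ~1.5 h quoted in the table.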
33
Single TCP stream between Caltech and CERN
  • Available (PCI-X) bandwidth: 8.5 Gbps
  • RTT: 250 ms (16,000 km)
  • 9000-byte MTU
  • 15 min to increase throughput from 3 to 6 Gbps
  • Sending station:
  • Tyan S2882 motherboard, 2x Opteron 2.4 GHz,
    2 GB DDR
  • Receiving station:
  • CERN OpenLab HP rx4640, 4x 1.5 GHz Itanium-2,
    zx1 chipset, 8 GB memory
  • Network adapter:
  • S2io 10 GbE

(Chart annotations: CPU load 100%, single packet
loss, burst of packet losses)
34
High Throughput Disk-to-Disk Transfers: from 0.1
to 1 GByte/sec
  • Server hardware (rather than network)
    bottlenecks
  • Write/read and transmit tasks share the same
    limited resources: CPU, PCI-X bus, memory, IO
    chipset
  • PCI-X bus bandwidth: 8.5 Gbps (133 MHz x 64 bit);
    the arithmetic is sketched after this slide
  • Link aggregation (802.3ad): a logical interface
    with two physical interfaces on two independent
    PCI-X buses
  • LAN test: 11.1 Gbps (memory to memory)

Performance in this range (from 100 MByte/sec up
to 1 GByte/sec) is required to build a
responsive Grid-based Processing and Analysis
System for LHC
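The PCI-X figure above is why two adapters on two independent buses were
needed; a minimal sketch of the arithmetic (variable names are mine):

    # A single 133 MHz x 64-bit PCI-X bus cannot carry one 10 GE
    # stream at line rate, so two NICs on two independent buses are
    # aggregated (802.3ad) to reach the 11.1 Gbps memory-to-memory
    # LAN result quoted on the slide.
    bus_hz = 133e6
    bus_width_bits = 64

    pcix_bps = bus_hz * bus_width_bits
    print(f"one PCI-X bus: {pcix_bps / 1e9:.1f} Gb/s")      # -> 8.5 Gb/s
    print(f"two buses:     {2 * pcix_bps / 1e9:.1f} Gb/s")  # -> 17.0 Gb/s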
35
Transferring a TB from Caltech to CERN in 64-bit
MS Windows
  • Latest disk-to-disk over a 10 Gbps WAN: 4.3
    Gbits/sec (536 MB/sec), 8 TCP streams from CERN
    to Caltech, 1 TB file
  • 3 Supermicro Marvell SATA disk controllers, 24
    SATA 7200 rpm disks
  • Local disk IO: 9.6 Gbits/sec (1.2 GBytes/sec
    read/write, with <20% CPU utilization)
  • S2io SR 10GE NIC
  • 10 GE NIC: 7.5 Gbits/sec (memory-to-memory, with
    52% CPU utilization)
  • 2 x 10 GE NICs (802.3ad link aggregation): 11.1
    Gbits/sec (memory-to-memory)
  • Memory-to-memory WAN data flow and local
    memory-to-disk read/write flow are not matched
    when combining the two operations
  • Quad Opteron AMD 848 2.2 GHz processors with 3
    AMD-8131 chipsets, 4 64-bit/133 MHz PCI-X slots
  • The Interrupt Affinity Filter allows a user to
    change the CPU affinity of the interrupts in a
    system
  • Overcome packet loss with re-connect logic
  • Proposed Internet2 Terabyte File Transfer
    Benchmark