Title: The Advanced Networks and Services Underpinning the Large-Scale Science of DOE
1 The Advanced Networks and Services Underpinning the Large-Scale Science of DOE's Office of Science
The Evolution of Production Networks Over the Next 10 Years to Support Large-Scale International Science: An ESnet View
- William E. Johnston (wej_at_es.net), ESnet Manager and Senior Scientist
- Lawrence Berkeley National Laboratory
- www.es.net
2 DOE Office of Science Drivers for Networking
- The DOE Office of Science supports more than 40% of all US R&D in high-energy physics, nuclear physics, and fusion energy sciences (http://www.science.doe.gov)
- This large-scale science that is the mission of the Office of Science depends on high-speed networks for
  - Sharing of massive amounts of data
  - Supporting thousands of collaborators world-wide
  - Distributed data processing
  - Distributed simulation, visualization, and computational steering
  - Distributed data management
- The role of ESnet is to provide networking that supports and anticipates these uses for the Office of Science Labs and their collaborators
- The issues were explored in two Office of Science workshops that formulated networking requirements to meet the needs of the science programs (see refs.)
3 Increasing Large-Scale Science Collaboration is Reflected in Network Usage
- As of May 2005, ESnet is transporting about 530 Terabytes/mo.
- ESnet traffic has increased by 10X every 46 months, on average, since 1990
[Figure: ESnet Monthly Accepted Traffic, Feb. 1990 to May 2005. Vertical axis: TBytes/Month; horizontal axis: months from Feb. 1990 to Feb. 2005.]
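The growth rate quoted on this slide (10X every 46 months) can be restated as an annual growth factor and a doubling time. The sketch below is only arithmetic on the slide's two numbers; the function names are illustrative, and any projection assumes the historical trend simply continues, which the slide does not guarantee.

```python
import math

TENFOLD_PERIOD_MONTHS = 46.0   # from the slide: 10X every 46 months
CURRENT_TB_PER_MONTH = 530.0   # from the slide: May 2005 volume


def growth_factor(months, tenfold_period=TENFOLD_PERIOD_MONTHS):
    """Multiplicative traffic growth over `months` at 10X per period."""
    return 10.0 ** (months / tenfold_period)


def doubling_time_months(tenfold_period=TENFOLD_PERIOD_MONTHS):
    """Months needed for traffic to double at the same exponential rate."""
    return tenfold_period * math.log10(2.0)


def project_traffic(months_ahead, current=CURRENT_TB_PER_MONTH):
    """Extrapolated monthly volume (TBytes), assuming the trend holds."""
    return current * growth_factor(months_ahead)


if __name__ == "__main__":
    print(f"annual growth factor: {growth_factor(12):.2f}")
    print(f"doubling time (months): {doubling_time_months():.1f}")
    print(f"extrapolated TBytes/month in 5 years: {project_traffic(60):.0f}")
```

At this rate traffic grows by roughly 1.8X per year and doubles about every 14 months.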
4 Large-Scale Science Has Changed How the Network is Used
[Figure: ESnet Top 100 Host-to-Host Flows, Feb. 2005. Vertical axis: TBytes/Month. Flow categories: DOE Lab-International R&E, Lab-U.S. R&E (domestic), Lab-Lab (domestic), Lab-Comm. (domestic), grouped as International, Domestic, and Inter-Lab. All other flows are < 0.28 TBy/month each.]
Total ESnet traffic, Feb. 2005: 323 TBy in approx. 6,000,000,000 flows
- A small number of large-scale science users now account for a significant fraction of all ESnet traffic
- Over the next few years this will grow to be the dominant use of the network
5 Large-Scale Science Has Changed How the Network is Used
- These flows are primarily bulk data transfer at this point and are candidates for circuit based services for several reasons
  - Traffic engineering to manage the traffic on the backbone
  - Guaranteed bandwidth is needed to satisfy deadline scheduling requirements
  - Traffic isolation will permit the use of efficient, but TCP unfriendly, data transfer protocols
6 Virtual Circuit Network Services
- A top priority of the science community
- Today
  - Primarily to support bulk data transfer with deadlines
- In the near future
  - Support for widely distributed Grid workflow engines
  - Real-time instrument operation
  - Coupled, distributed applications
- To get an idea of how circuit services might be used to support the current trends, look at the one year history of the flows that are currently the top 20
  - Estimate from the flow history what would be the characteristics of a circuit set up to manage the flow
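7 Source and Destination of the Top 20 Flows, Sept. 2005
8 What are Characteristics of Today's Flows: How Dynamic a Circuit?
LIGO - CalTech: over 1 year the circuit duration is about 3 months
[Figure: daily volume of this flow over one year. Vertical axis: Gigabytes/day; part of the year is marked "(no data)".]
9 What are Characteristics of Today's Flows: How Dynamic a Circuit?
SLAC - IN2P3 (FR): over 1 year the circuit duration is about 1 day to 1 week
[Figure: daily volume of this flow over one year. Vertical axis: Gigabytes/day; part of the year is marked "(no data)".]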
10 Between ESnet, Abilene, GÉANT, and the connected regional R&E networks, there will be dozens of lambdas in production networks that are shared between thousands of users who want to use virtual circuits: very complex inter-domain issues
[Figure: the US R&E environment. Labels: ESnet, Abilene, ESnet-Abilene x-connects, "similar situation in US regionals", "similar situation in GÉANT and the European NRENs".]
11 OSCARS: Virtual Circuit Service
- Despite the long circuit duration, these circuits cannot be managed by hand: there are too many circuits
  - There must be automated scheduling, authorization, path analysis and selection, and path setup: the management plane and control plane
- Virtual circuits must operate across domains
  - End points will be on campuses or research institutes that are served by ESnet, Abilene's regional networks, and GÉANT's regional networks: typically five domains to cross to do an end-to-end system connection
  - There are many issues here that are poorly understood
  - A collaboration between Internet2/HOPI, DANTE/GÉANT, and ESnet is building a prototype-production, interoperable service
- ESnet virtual circuit project: On-demand Secure Circuits and Advance Reservation System (OSCARS) (Contact Chin Guok (chin_at_es.net) for information.)
12 What about lambda switching?
- Two factors argue that this is a long ways out for production networks
- 1) There will not be enough lambdas available to satisfy the need
  - Just provisioning a single lambda ring around the US (7,000 miles / 11,000 km) is still about $2,000,000, even on R&E networks
  - This should drop by a factor of 5-10 over the next decade
- 2) Even if there were a lot of lambdas (hundreds?) there are thousands of large-scale science users
  - Just considering sites (and not scientific groups) there are probably 300 major science research sites in the US and a comparable number in Europe
- So, lambdas will have to be shared for the foreseeable future
  - Multiple QoS paths per lambda
  - Guaranteed minimum level of service for best effort traffic when utilizing the production IP networks
  - Allocation management
    - There will be hundreds to thousands of contenders with different science priorities
13 References: DOE Network Related Planning Workshops
- 1) High Performance Network Planning Workshop, August 2002
  - http://www.doecollaboratory.org/meetings/hpnpw
- 2) DOE Science Networking Roadmap Meeting, June 2003
  - http://www.es.net/hypertext/welcome/pr/Roadmap/index.html
- 3) DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003
  - http://www.csm.ornl.gov/ghpn/wk2003
- 4) Science Case for Large Scale Simulation, June 2003
  - http://www.pnl.gov/scales/
- 5) Workshop on the Road Map for the Revitalization of High End Computing, June 2003
  - http://www.cra.org/Activities/workshops/nitrd
  - http://www.sc.doe.gov/ascr/20040510_hecrtf.pdf (public report)
- 6) ASCR Strategic Planning Workshop, July 2003
  - http://www.fp-mcs.anl.gov/ascr-july03spw
- 7) Planning Workshops-Office of Science Data-Management Strategy, March & May 2004
  - http://www-conf.slac.stanford.edu/dmw2004
15 ESnet Today Provides Global High-Speed Internet Connectivity for DOE Facilities and Collaborators
[Figure: map of the ESnet IP core (Packet over SONET optical ring and hubs) and the ESnet Science Data Network (SDN) core. Labels on the map include SEA, SNV, SNV SDN, SDSC, ALB, ELP, CHI, CHI-SL, NYC, DC, ATL, MAE-E, PAIX-PA, Equinix, MREN, StarTap, and PNWGPoP/PAcificWave.]
- International connections: Japan (SINet), Australia (AARNet), Canada (CA*net4), Taiwan (TANet2, ASCC), SingAREN, France, Netherlands, GLORIAD (Russia, China), Korea (Kreonet2)
- 42 end user sites: Office of Science sponsored (22), NNSA sponsored (12), Joint sponsored (3), Other sponsored (NSF LIGO, NOAA), Laboratory sponsored (6)
- Link types in the legend: International (high speed); 10 Gb/s SDN core; 10 Gb/s IP core; 2.5 Gb/s IP core; MAN rings (10 Gb/s); OC12 ATM (622 Mb/s); OC12 / GigEthernet; OC3 (155 Mb/s); 45 Mb/s and less
- Other legend entries: commercial and R&E peering points; ESnet core hubs; high-speed peering points with Internet2/Abilene
16 DOE Office of Science Drivers for Networking
- The DOE Office of Science supports more than 40% of all US R&D in high-energy physics, nuclear physics, and fusion energy sciences (http://www.science.doe.gov)
- This large-scale science that is the mission of the Office of Science depends on networks for
  - Sharing of massive amounts of data
  - Supporting thousands of collaborators world-wide
  - Distributed data processing
  - Distributed simulation, visualization, and computational steering
  - Distributed data management
- The role of ESnet is to provide networking that supports these uses for the Office of Science Labs and their collaborators
- The issues were explored in two Office of Science workshops that formulated networking requirements to meet the needs of the science programs (see refs.)
17 CERN / LHC High Energy Physics Data Provides One of Science's Most Challenging Data Management Problems (CMS is one of several experiments at LHC)
[Figure: tiered LHC data distribution diagram. Courtesy Harvey Newman, CalTech. Labels recoverable from the diagram are listed below.]
- CERN LHC CMS detector: 15m X 15m X 22m, 12,500 tons, $700M
- Online System
- Tier 0 +1: HPSS
- Tier 1: German Regional Center, French Regional Center, FermiLab (USA Regional Center), Italian Center
- Tier 2
- Tier 3: Institutes (0.25 TIPS), physics data cache
- Tier 4: Workstations
- Processing stages labeled: event reconstruction, event simulation, analysis
- Data rates labeled: PByte/sec, 100 MBytes/sec, 2.5-40 Gbits/sec, 0.6-2.5 Gbps (twice), 100 - 1000 Mbits/sec
- 2000 physicists in 31 countries are involved in this 20-year experiment in which DOE is a major player.
- Grid infrastructure spread over the US and Europe coordinates the data analysis
18 LHC Networking
- This picture represents the MONARC model: a hierarchical, bulk data transfer model
- Still accurate for Tier 0 (CERN) to Tier 1 (experiment data centers) data movement
- Not accurate for the Tier 2 (analysis) sites, which are implementing Grid based data analysis
19 Example: Complicated Workflow, Many Sites
20 Distributed Workflow
- Distributed / Grid based workflow systems involve many interacting computing and storage elements that rely on smooth inter-element communication for effective operation
- The new LHC Grid based data analysis model will involve networks connecting dozens of sites and thousands of systems for each analysis center
21 Example: Multidisciplinary Simulation
A complete approach to climate modeling involves many interacting models and data that are provided by different groups at different locations (Tim Killeen, NCAR)
[Figure: coupled climate and ecosystem processes grouped by time scale. Courtesy Gordon Bonan, NCAR: Ecological Climatology: Concepts and Applications. Cambridge University Press, Cambridge, 2002. Labels recoverable from the diagram are listed below.]
- Climate: temperature, precipitation, radiation, humidity, wind
- Chemistry: CO2, CH4, N2O, ozone, aerosols
- Exchanged quantities: heat, moisture, momentum; CO2, CH4, N2O, VOCs, dust
- Minutes-to-hours: biogeophysics (aerodynamics, energy, water), biogeochemistry (carbon assimilation, decomposition, mineralization), microclimate, canopy physiology
- Days-to-weeks: phenology (bud break, leaf senescence), hydrology (intercepted water, soil water, snow), evaporation, transpiration, snow melt, infiltration, runoff, gross primary production, plant respiration, microbial respiration, nutrient availability
- Years-to-centuries: ecosystems (species composition, ecosystem structure, nutrient availability, water), watersheds (surface water, subsurface water, geomorphology), disturbance (fires, hurricanes, ice storms, windthrows), vegetation dynamics, hydrologic cycle
22 Distributed Multidisciplinary Simulation
- Distributed multidisciplinary simulation involves integrating computing elements at several remote locations
  - Requires co-scheduling of computing, data storage, and network elements
  - Also Quality of Service (e.g. bandwidth guarantees)
  - There is not a lot of experience with this scenario yet, but it is coming (e.g. the new Office of Science supercomputing facility at Oak Ridge National Lab has a distributed computing elements model)
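Co-scheduling, as mentioned on this slide, amounts to finding a time window in which every required resource is free at once. The following is a minimal sketch of that idea only, not a description of any scheduler used by ESnet or the labs. The resource names, the hour-based time units, and the assumption that each resource publishes a list of non-overlapping free windows are all illustrative.

```python
def intersect(a, b):
    """Intersect two sorted lists of non-overlapping (start, end) windows."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever window finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def earliest_common_slot(free_windows, duration):
    """Earliest (start, end) slot of `duration` that is free on every resource.

    `free_windows` maps a resource name to its list of free (start, end)
    windows. Returns None when no common slot is long enough.
    """
    common = None
    for windows in free_windows.values():
        windows = sorted(windows)
        common = windows if common is None else intersect(common, windows)
    for start, end in common or []:
        if end - start >= duration:
            return (start, start + duration)
    return None


# Hypothetical availability, in hours from now.
free = {
    "compute": [(0, 4), (6, 12)],
    "storage": [(2, 9)],
    "network": [(3, 5), (7, 20)],
}
slot = earliest_common_slot(free, duration=2)
```

With the hypothetical availability above, the only 2-hour window shared by all three resources starts at hour 7.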
23 Projected Science Requirements for Networking
Science Areas considered in Workshop 1 (not including Nuclear Physics and Supercomputing) | Today End2End Throughput | 5 years End2End Documented Throughput Requirements | 5-10 Years End2End Estimated Throughput Requirements | Remarks
High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | high bulk throughput with deadlines (Grid based analysis systems require QoS)
Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | high bulk throughput
SNS NanoScience | Not yet started | 1 Gb/s | 1000 Gb/s | remote control and time critical throughput (QoS)
Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.198 Gb/s (500 MB/20 sec. burst) | N x 1000 Gb/s | time critical throughput (QoS)
Astrophysics | 0.013 Gb/s (1 TBy/week) | N*N multicast | 1000 Gb/s | computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TBy/day) | 100s of users | 1000 Gb/s | high throughput and steering
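Several entries in this table pair a data volume per period with an average rate in Gb/s. The sketch below shows the conversion, assuming decimal units (1 TBy = 10^12 bytes) and 8 bits per byte. The slide does not state which unit convention it used, so the helper name and unit tables here are illustrative only.

```python
SECONDS_PER = {"sec": 1, "min": 60, "hour": 3600, "day": 86400, "week": 7 * 86400}
BYTES_PER = {"MB": 1e6, "GB": 1e9, "TB": 1e12}  # decimal units (an assumption)


def average_gbps(amount, unit, period_count, period_unit):
    """Average rate in Gb/s for `amount` of data moved per stated period."""
    bits = amount * BYTES_PER[unit] * 8
    seconds = period_count * SECONDS_PER[period_unit]
    return bits / seconds / 1e9


if __name__ == "__main__":
    print(f"1 TBy/week    = {average_gbps(1, 'TB', 1, 'week'):.3f} Gb/s")
    print(f"1 TBy/day     = {average_gbps(1, 'TB', 1, 'day'):.3f} Gb/s")
    print(f"500 MB/20 sec = {average_gbps(500, 'MB', 20, 'sec'):.3f} Gb/s")
```

Under these assumptions the results land close to the table's figures (about 0.013, 0.093, and 0.2 Gb/s against 0.013, 0.091, and 0.198). The small differences presumably come from rounding or a different unit convention.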
24 ESnet Goal 2009/2010
- 10 Gbps enterprise IP traffic
- 40-60 Gbps circuit based transport
[Figure: map of the planned network showing the ESnet IP Core (10 Gbps), the ESnet Science Data Network (2nd core, 30-50 Gbps, National Lambda Rail), and Metropolitan Area Rings. Hub labels: SEA, SNV, SDG, DEN, ALB, CHI, ATL, DC, NYC. International connection labels: AsiaPac, Japan, Aus., Europe, CERN.]
- Legend: ESnet hubs; New ESnet hubs; Metropolitan Area Rings; High-speed cross connects with Internet2/Abilene; Production IP ESnet core; Science Data Network core; Lab supplied; Major international
- Link rates shown: 10 Gb/s, 30 Gb/s, 40 Gb/s
25 Observed Drivers for the Evolution of ESnet
- ESnet is currently transporting about 530 Terabytes/mo., and this volume is increasing exponentially
- ESnet traffic has increased by 10X every 46 months, on average, since 1990
[Figure: ESnet Monthly Accepted Traffic, Feb. 1990 to May 2005. Vertical axis: TBytes/Month; horizontal axis: months from Feb. 1990 to Feb. 2005.]
26 Observed Drivers: The Rise of Large-Scale Science
- A small number of large-scale science users now account for a significant fraction of all ESnet traffic
[Figure: ESnet Top 100 Host-to-Host Flows, Feb. 2005. Vertical axis: TBytes/Month. Flow categories: DOE Lab-International R&E, Lab-U.S. R&E (domestic), Lab-Lab (domestic), Lab-Comm. (domestic), grouped as International, Domestic, and Inter-Lab. All other flows are < 0.28 TBy/month each.]
Total ESnet traffic, Feb. 2005: 323 TBy in approx. 6,000,000,000 flows
27 Traffic Evolution over the Next 5-10 Years
- The current traffic pattern trend, in which large-scale science projects give rise to the top 100 data flows that represent about 1/3 of all network traffic, will continue to evolve
- This evolution in traffic patterns and volume is driven by large-scale science collaborations and will result in large-scale science data flows overwhelming everything else on the network in 3-5 yrs. (WEJ predicts)
  - The top 100 flows will become the top 1000 or 5000 flows
  - These large flows will account for 75-95% of a much larger total ESnet traffic volume
  - The remaining 6 billion flows will continue to account for the remainder of the traffic, which will also grow even as its fraction of the total becomes smaller
28 Virtual Circuit Network Services
- Every requirements workshop involving the science community has put bandwidth-on-demand as the highest priority, e.g. for
  - Massive data transfers for collaborative analysis of experiment data
  - Real-time data analysis for remote instruments
  - Control channels for remote instruments
  - Deadline scheduling for data transfers
  - Smooth interconnection for complex Grid workflows
29 What is the Nature of the Required Circuits?
- Today
  - Primarily to support bulk data transfer with deadlines
- In the near future
  - Support for widely distributed Grid workflow engines
  - Real-time instrument operation
  - Coupled, distributed applications
- To get an idea of how circuit services might be used, look at the one year history of the flows that are currently the top 20
  - Estimate from the flow history what would be the characteristics of a circuit set up to manage the flow
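One way to estimate circuit characteristics from a flow history is to find the stretches of consecutive days on which the flow was active and treat each stretch as a candidate circuit lifetime. The sketch below is an assumption about how such an estimate could be made, not the method used to produce the slides that follow. The threshold, the gap tolerance, and the sample data are hypothetical.

```python
def circuit_episodes(daily_gb, threshold_gb, max_gap_days=2):
    """Return (first_day, last_day) index pairs for runs of active days.

    A day is active when its volume is at least `threshold_gb`. Missing
    samples (None, i.e. "no data") count as inactive. Active runs separated
    by at most `max_gap_days` inactive days are merged into one episode.
    """
    episodes = []
    start = last_active = None
    for day, volume in enumerate(daily_gb):
        if volume is None or volume < threshold_gb:
            continue
        if start is None:
            start = day
        elif day - last_active - 1 > max_gap_days:
            # The quiet gap was too long: close the episode, open a new one.
            episodes.append((start, last_active))
            start = day
        last_active = day
    if start is not None:
        episodes.append((start, last_active))
    return episodes


# Hypothetical daily volumes in Gigabytes/day; None marks a day with no data.
history = [0, 120, 150, None, 130, 0, 0, 0, 0, 200, 210]
episodes = circuit_episodes(history, threshold_gb=100)
durations = [last - first + 1 for first, last in episodes]
```

Each episode's length in days gives a rough circuit duration, and the peak daily volume inside an episode would bound the bandwidth such a circuit needs.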
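30 Source and Destination of the Top 20 Flows, Sept. 2005
31 What are Characteristics of Today's Flows: How Dynamic a Circuit?
LIGO - CalTech: over 1 year the circuit duration is about 3 months
[Figure: daily volume of this flow over one year. Vertical axis: Gigabytes/day; part of the year is marked "(no data)".]
32 What are Characteristics of Today's Flows: How Dynamic a Circuit?
SLAC - IN2P3 (FR): over 1 year the circuit duration is about 1 day to 1 week
[Figure: daily volume of this flow over one year. Vertical axis: Gigabytes/day; part of the year is marked "(no data)".]
33 What are Characteristics of Today's Flows: How Dynamic a Circuit?
SLAC - INFN (IT): over 1 year the circuit duration is about 1 to 3 months
[Figure: daily volume of this flow over one year. Vertical axis: Gigabytes/day; part of the year is marked "(no data)".]
34 What are Characteristics of Today's Flows: How Dynamic a Circuit?
FNAL - IN2P3 (FR): over 1 year the circuit duration is about 2 to 3 months
[Figure: daily volume of this flow over one year. Vertical axis: Gigabytes/day; part of the year is marked "(no data)".]
35 What are Characteristics of Today's Flows: How Dynamic a Circuit?
INFN (IT) - SLAC: over 1 year the circuit duration is about 3 weeks to 3 months
[Figure: daily volume of this flow over one year. Vertical axis: Gigabytes/day; part of the year is marked "(no data)".]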
36 Characteristics of Today's Circuits: How Dynamic?
- These flows are candidates for circuit based services for several reasons
  - Traffic engineering to manage the traffic on the IP production backbone
  - To satisfy deadline scheduling requirements
  - Traffic isolation to permit the use of efficient, but TCP unfriendly, data transfer protocols
- Despite the long circuit duration, this cannot be managed by hand: there are too many circuits
  - There must be automated scheduling, authorization, path analysis and selection, and path setup
37 Virtual Circuit Services - What about lambda switching?
- Two factors argue that this is a long ways out for production networks
- 1) There will not be enough lambdas available to satisfy the need
  - Just provisioning a single lambda ring around the US (7,000 miles / 11,000 km) is still about $2,000,000, even on R&E networks
  - This should drop by a factor of 5-10 over the next 5-10 years
- 2) Even if there were a lot of lambdas (hundreds?) there are thousands of large-scale science users
  - Just considering sites (and not scientific groups) there are probably 300 major science research sites in the US and a comparable number in Europe
- So, lambdas will have to be shared for the foreseeable future
  - Multiple QoS paths per lambda
  - Guaranteed minimum level of service for best effort traffic when utilizing the production IP networks
  - Allocation management
    - There will be hundreds to thousands of contenders with different science priorities
38 OSCARS: Guaranteed Bandwidth Service
- Virtual circuits must operate across domains
  - End points will be on campuses or research institutes that are served by ESnet, Abilene's regional networks, and GÉANT's regional networks: typically five domains to cross to do an end-to-end system connection
  - There are many issues here that are poorly understood
  - An ESnet, Internet2/HOPI, and DANTE/GÉANT collaboration
- ESnet virtual circuit project: On-demand Secure Circuits and Advance Reservation System (OSCARS) (Contact Chin Guok (chin_at_es.net) for information.)
39 OSCARS: Guaranteed Bandwidth Service
[Figure: OSCARS component diagram. End points: user system 1 at site A and user system 2 at site B. Component labels: bandwidth broker, allocation manager, authorization, path manager (dynamic, global view of network), resource manager (three instances), policer (two instances), shaper.]
- To address all of the issues is complex
  - There are many potential restriction points
  - There are many users that would like priority service, which must be rationed
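Rationing priority service on a shared link comes down to an admission decision: a new advance reservation is accepted only if, at every instant of its requested window, the already-reserved bandwidth plus the new request fits within the link capacity. The sketch below illustrates that check under simple assumptions (a single link, integer time slots, fixed-rate reservations). It is not the OSCARS implementation, and every name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    start: int    # first time slot, inclusive (e.g. hours from now)
    end: int      # last time slot, exclusive
    gbps: float   # guaranteed rate for the whole window


def peak_reserved(reservations, start, end):
    """Highest total reserved bandwidth at any instant in [start, end)."""
    # Load can only rise where a reservation begins, so checking the window
    # start plus every reservation start inside the window finds the peak.
    points = {start} | {r.start for r in reservations if start < r.start < end}
    return max(
        sum(r.gbps for r in reservations if r.start <= t < r.end)
        for t in points
    )


def admit(reservations, request, capacity_gbps):
    """Accept `request` only if the link never exceeds its capacity."""
    peak = peak_reserved(reservations, request.start, request.end)
    return peak + request.gbps <= capacity_gbps


# Hypothetical bookings on a 10 Gb/s link.
booked = [Reservation(0, 10, 4.0), Reservation(5, 15, 3.0)]
fits = admit(booked, Reservation(8, 12, 3.0), capacity_gbps=10.0)
too_big = admit(booked, Reservation(8, 12, 4.0), capacity_gbps=10.0)
```

A real service would repeat this check on every link of the selected path and would add the authorization and priority policy that the slide lists as separate components.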
40 ESnet 2010 Lambda Infrastructure and LHC T0-T1 Networking
[Figure: map of the planned lambda infrastructure. Node labels: Seattle, Boise, Sunnyvale, LA, San Diego, Phoenix, Albuq., El Paso - Las Cruces, Denver, KC, Tulsa, Dallas, San Ant., Houston, Baton Rouge, Pensacola, Jacksonville, Atlanta, Raleigh, Wash DC, Pitts, Clev, Chicago, New York, BNL, FNAL. External connection labels: TRIUMF, CANARIE, CERN-1, CERN-2, CERN-3, GÉANT-1, GÉANT-2.]
- Node legend: NLR PoPs; ESnet IP core hubs; ESnet SDN/NLR hubs; Tier 1 Centers; Cross connects with Internet2/Abilene; New hubs
- Link legend: ESnet Production IP core (10-20 Gbps); ESnet Science Data Network core (10G/link) (incremental upgrades, 2007-2010); Other NLR links; CERN/DOE supplied (10G/link); International IP connections (10G/link)
41 Abilene and LHC Tier 2, Near-Term Networking
[Figure: map of the planned lambda infrastructure with Abilene and LHC Tier 2 sites added (WEJ projection of future Abilene). Node labels: Seattle, Boise, Sunnyvale, LA, San Diego, Phoenix, Albuq., El Paso - Las Cruces, Denver, KC, Tulsa, Dallas, San Ant., Houston, Baton Rouge, Pensacola, Jacksonville, Atlanta, Raleigh, Wash DC, Pitts, Clev, Chicago, New York, BNL, FNAL. External connection labels: TRIUMF, CANARIE, CERN-1, CERN-2, CERN-3, GÉANT-1, GÉANT-2.]
- Atlas Tier 2 Centers
  - University of Texas at Arlington
  - University of Oklahoma Norman
  - University of New Mexico Albuquerque
  - Langston University
  - University of Chicago
  - Indiana University Bloomington
  - Boston University
  - Harvard University
  - University of Michigan
- CMS Tier 2 Centers
  - MIT
  - University of Florida at Gainesville
  - University of Nebraska at Lincoln
  - University of Wisconsin at Madison
  - Caltech
  - Purdue University
  - University of California San Diego
- Node legend: NLR PoPs; ESnet IP core hubs; ESnet SDN/NLR hubs; Tier 1 Centers; Abilene/GigaPoP nodes; USLHC nodes; Cross connects with Internet2/Abilene; New hubs
- Link legend: ESnet Production IP core (10-20 Gbps); ESnet Science Data Network core (10G/link) (incremental upgrades, 2007-2010); Other NLR links; CERN/DOE supplied (10G/link); International IP connections (10G/link); < 10G connections to Abilene; 10G connections to USLHC or ESnet
42 Between ESnet, Abilene, GÉANT, and the connected regional R&E networks, there will be dozens of lambdas in production networks that are shared between thousands of users who want to use virtual circuits: very complex inter-domain issues
[Figure: the US R&E environment. Labels: ESnet, Abilene, ESnet-Abilene x-connects, "similar situation in US regionals", "similar situation in Europe".]
43 ESnet Optical Networking Roadmap
- Dedicated virtual circuits
- Dynamic virtual circuit allocation
[Figure: roadmap timeline with year marks 2005, 2006, 2007, 2008, 2009, 2010.]
- Interoperability between GMPLS circuits, VLANs, and MPLS circuits (Layer 1-3)
- Interoperability between VLANs and MPLS circuits (Layer 2 & 3)
- Dynamic provisioning of MPLS circuits (Layer 3)
44 Tying Domains Together (1/2)
- Motivation
  - For a virtual circuit service to be successful, it must
    - Be end-to-end, potentially crossing several administrative domains
    - Have consistent network service guarantees throughout the circuit
- Observation
  - Setting up an intra-domain circuit is easy compared with coordinating an inter-domain circuit
- Issues
  - Cross domain authentication and authorization
    - A mechanism to authenticate and authorize a bandwidth on-demand (BoD) circuit request must be agreed upon in order to automate the process
  - Multi-domain Acceptable Use Policies (AUPs)
    - Domains may have very specific AUPs dictating what the BoD circuits can be used for and where they can transit/terminate
  - Domain specific service offerings
    - Domains must have a way to guarantee a certain level of service for BoD circuits
  - Security concerns
    - Are there mechanisms for a domain to protect itself? (e.g. RSVP filtering)
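The issues listed above imply that an end-to-end request succeeds only if every domain on the path independently accepts it. Each domain must trust the credential, permit the use under its AUP, and be able to guarantee the requested service level. The sketch below illustrates that all-or-nothing check only. The domain names, policy fields, and issuer label are hypothetical and do not describe any agreed inter-domain protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str
    available_gbps: float
    allowed_purposes: set = field(default_factory=set)   # stands in for the AUP
    trusted_issuers: set = field(default_factory=set)    # credential issuers


def end_to_end_offer(path, requested_gbps, purpose, issuer):
    """Return (accepted, reason) for a BoD request along an ordered path."""
    for domain in path:
        if issuer not in domain.trusted_issuers:
            return False, f"{domain.name}: credential issuer not trusted"
        if purpose not in domain.allowed_purposes:
            return False, f"{domain.name}: use not permitted by AUP"
        if domain.available_gbps < requested_gbps:
            return False, f"{domain.name}: insufficient bandwidth"
    return True, "accepted by all domains"


# A hypothetical five-domain path: campus, regional, backbone, regional, campus.
names = ["campus-a", "regional-a", "backbone", "regional-b", "campus-b"]
path = [Domain(n, 10.0, {"science-data"}, {"example-ca"}) for n in names]
path[3].available_gbps = 2.0   # one domain can guarantee less than the others

ok_small = end_to_end_offer(path, 1.0, "science-data", "example-ca")
ok_large = end_to_end_offer(path, 5.0, "science-data", "example-ca")
```

In this example the most constrained domain sets the guarantee available to the whole circuit, which is why consistent service definitions across domains matter.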
45 Tying Domains Together (2/2)
- Approach
  - Utilize existing standards and protocols (e.g. GMPLS, RSVP)
  - Adopt widely accepted schemas/services (e.g. X.509 certificates)
  - Collaborate with like-minded projects (e.g. JRA3 (DANTE/GÉANT), BRUW (Internet2/HOPI)) to
    - 1. Create a common service definition for BoD circuits
    - 2. Develop an appropriate User-Network-Interface (UNI) and Network-Network-Interface (NNI)