Network Control and Management in the 100x100 Architecture - PowerPoint PPT Presentation

Slides: 75
Provided by: dma113
Transcript and Presenter's Notes

Title: Network Control and Management in the 100x100 Architecture


1
Network Control and Management in the 100x100
Architecture
2
The Role of Network Control and Management
  • Many different network environments
  • Data center networks, enterprise/campus
  • Access, backbone networks
  • Many different technologies
  • Longest-prefix routing, label switching,
    switching
  • IP, MPLS, ATM, optical circuits
  • Many different policies
  • Routing, reachability, transit, traffic
    engineering, robustness
  • The control plane software binds these elements
    together and defines the network

3
Control Plane: The Key Leverage Point
  • Great Potential: control plane determines the
    behavior of the network
  • Reaction to events, reachability, services
  • Great Opportunities
  • A radical clean-slate control plane can be
    deployed
  • Agnostic to packet format: IPv4/v6, Ethernet
  • No changes to end-system software
  • Control plane is the nexus of network evolution
  • Changing the control plane logic can smooth
    transitions in network technologies and
    architectures

4
100x100 Project Themes
5
A Clean-slate Design
  • What are the fundamental causes of outages?
  • How to reduce/simplify the software in networks?
  • Control logic is software: no reason it should
    be hard to update, but how to avoid complexity
    pitfalls?
  • What functionality needs to be distributed, and
    what can be centralized?
  • What would a RISC router look like?
  • Leverage technology trends
  • CPU and link speed growing faster than # of
    switches

6
Three Principles for Network Control and Management
  • Network-level Objectives
  • Express goals explicitly
  • Security policies, QoS, egress point selection
  • Do not bury goals in box-specific configuration

Reachability matrix Traffic engineering rules
Management Logic
7
Three Principles for Network Control and Management
  • Network-wide Views
  • Design network to provide timely, accurate info
  • Topology, traffic, resource limitations
  • Give logic the inputs it needs

Reachability matrix Traffic engineering rules
Management Logic
Read state info
8
Three Principles for Network Control and Management
  • Direct Control
  • Allow logic to directly set forwarding state
  • FIB entries, packet filters, queuing parameters
  • Logic computes desired network state, let it
    implement it

Reachability matrix Traffic engineering rules
Write state
Management Logic
Read state info
9
Overview of the 4D Architecture
Network-level objectives
Decision
Dissemination
Direct control
Network-wide views
Discovery
Data
  • Decision Plane
  • All management logic implemented on centralized
    servers making all decisions
  • Decision Elements use views to compute data plane
    state that meets objectives, then directly write
    this state to routers
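
As a rough illustration of the decision plane described above, the Python sketch below takes a network-wide view (a list of links) and a reachability matrix, and computes the state a Decision Element would write to routers: next-hop FIB entries plus per-router drop lists. The function name `compute_state`, the hop-count metric, and the choice to enforce the matrix at the source router are all assumptions made for illustration; none of them come from the 4D prototype.

```python
from collections import deque

def compute_state(links, allowed):
    """Return (fibs, filters) for every router in the view.

    links   -- iterable of (a, b) bidirectional links (network-wide view)
    allowed -- set of (src, dst) pairs permitted by the reachability matrix
    fibs    -- {router: {dst: next_hop}} along shortest hop-count paths
    filters -- {router: set of dsts that router must drop as a source}
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    fibs = {r: {} for r in neighbors}
    filters = {r: set() for r in neighbors}
    for dst in sorted(neighbors):
        # BFS outward from dst; parent[r] is r's next hop toward dst.
        parent = {dst: None}
        queue = deque([dst])
        while queue:
            node = queue.popleft()
            for nbr in sorted(neighbors[node]):
                if nbr not in parent:
                    parent[nbr] = node
                    queue.append(nbr)
        for src, hop in parent.items():
            if hop is None:
                continue
            fibs[src][dst] = hop          # forwarding state (direct control)
            if (src, dst) not in allowed:
                filters[src].add(dst)     # objective enforced as a filter
    return fibs, filters
```

Because routes and filters come out of the same computation, a topology change that alters the FIBs also regenerates the filters, which is the property the later reachability example depends on.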

10
Overview of the 4D Architecture
Network-level objectives
Decision
Dissemination
Direct control
Network-wide views
Discovery
Data
  • Dissemination Plane
  • Provides a robust communication channel to each
    router and robustness is the only goal!
  • May run over same links as user data, but
    logically separate and independently controlled

11
Overview of the 4D Architecture
Network-level objectives
Decision
Dissemination
Direct control
Network-wide views
Discovery
Data
  • Discovery Plane
  • Each router discovers its own resources and its
    local environment
  • E.g., the identity of its immediate neighbors
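
A minimal sketch of how per-router discovery reports might be merged into a network-wide view follows. The report format and the rule that a link counts only when both ends report each other are assumptions for illustration, not details given in the slides.

```python
def build_view(reports):
    """Merge per-router neighbor reports into a set of confirmed links.

    reports -- {router: set of neighbor IDs that router discovered}
    A link is kept only when both ends report each other (an assumption).
    """
    links = set()
    for router, nbrs in reports.items():
        for nbr in nbrs:
            if router in reports.get(nbr, set()):
                links.add(tuple(sorted((router, nbr))))
    return links
```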

12
Overview of the 4D Architecture
Network-level objectives
Decision
Dissemination
Direct control
Network-wide views
Discovery
Data
  • Data Plane
  • Spatially distributed routers/switches
  • Ideally exposes interface to tables in hardware
  • Can deploy with today's technology

13
Concerns and Challenges
  • How does the 4D simplify the problem?
  • How will communication between routers and DEs
    survive failures in the network?
  • Can a robust dissemination plane be built?
  • Latency means the DEs' view of the network lags
    behind reality. Will the control loop be stable?
  • What is the overhead to/from the DEs?
  • What happens in a network partition?

14
Fundamental Problem: Wrong Abstractions
Shell scripts
Traffic Eng
  • Management Plane
  • Figure out what is happening in network
  • Decide how to change it

Planning tools
Databases
Configs
SNMP
netflow
modems
OSPF
  • Control Plane
  • Multiple routing processes on each router
  • Each router with different configuration program
  • Huge number of control knobs: metrics, ACLs,
    policy

Link metrics
Routing policies
FIB
  • Data Plane
  • Distributed routers
  • Forwarding, filtering, queueing
  • Based on FIB or labels

FIB
FIB
Packet filters
15
Good Abstractions Reduce Complexity
Management Plane
Configs
Decision Plane
Control Plane
FIBs, ACLs
FIBs, ACLs
Dissemination
Data Plane
Data Plane
  • All decision making logic lifted out of control
    plane
  • Eliminates duplicate logic in management plane
  • Dissemination plane provides robust communication
    to/from data plane switches

16
4D Separates Distributed Computing Issues from
Networking Issues
  • Distributed computing issues → protocols and
    network architecture
  • Overhead
  • Resiliency
  • Scalability
  • Networking issues → management logic
  • Traffic engineering and service provisioning
  • Egress point selection
  • Reachability control (VPNs)
  • Precomputation of backup paths

17
4D Can Leverage Network Structure
  • Decision plane logic can be specialized for
    structure of each physical network
  • Distributed protocols must be prepared for
    arbitrary topology graphs
  • 4D enables network logic specialized differently
    for access and for backbone
  • Advantages
  • Faster route computations
  • Retain flexibility to evolve network as needed
  • Support transition to 100x100 architecture

18
The Feasibility of the 4D Architecture
  • We designed and built a prototype of the 4D
    Architecture
  • 4D Architecture permits many designs; the
    prototype is a single, simple design point
  • Decision plane
  • Contains logic to simultaneously compute routes
    and enforce reachability matrix
  • Multiple Decision Elements per network, using
    simple election protocol to pick master
  • Dissemination plane
  • Uses source routes to direct control messages
  • Extremely simple, but can route around failed
    data links
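
The idea of source-routed control messages that route around failed data links can be sketched as below. This is an assumed, simplified model (breadth-first search over the links the Decision Element believes are up); the function name `source_route` and the node names in the usage note are hypothetical, and the prototype's actual mechanism is not described in the slides.

```python
from collections import deque

def source_route(links, failed, src, dst):
    """Return a hop list from src to dst that avoids failed links, or None."""
    down = {frozenset(link) for link in failed}
    neighbors = {}
    for a, b in links:
        if frozenset((a, b)) in down:
            continue  # skip links known to have failed
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk predecessors back to src to build the explicit route.
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in neighbors.get(node, []):
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    return None
```

With links DE-R1, R1-R2, DE-R3 and R3-R2, a message from DE to R2 normally goes through R1; if R1-R2 fails, the same call returns the route through R3.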

19
Evaluation of the 4D Prototype
  • Evaluated using Emulab (www.emulab.net)
  • Linux PCs used as routers (650-800 MHz)
  • Tested on 9 enterprise network topologies
    (10-100 routers each)

Example network with 49 switches and 5 DEs
20
Performance of the 4D Prototype
  • Trivial prototype has performance comparable to
    well-tuned production networks
  • Recovers from single link failure in < 300 ms
  • < 1 s response considered excellent
  • Survives failure of master Decision Element
  • New DE takes control within 1 s
  • No disruption unless second fault occurs
  • Gracefully handles complete network partitions
  • Less than 1.5 s of outage
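
The slides say only that a "simple election protocol" picks the master Decision Element. One possible shape for such a protocol is sketched below: every DE tracks heartbeats from its peers, and the lowest-ID DE with a fresh heartbeat is master. The lowest-ID rule, the `timeout` value, and the name `elect_master` are assumptions, not the prototype's actual design.

```python
def elect_master(last_heartbeat, now, timeout=0.5):
    """Return the lowest-ID DE whose heartbeat is fresh, or None.

    last_heartbeat -- {de_id: time the last heartbeat was heard, in seconds}
    now            -- current time in seconds
    timeout        -- hypothetical freshness limit in seconds
    """
    alive = [de for de, seen in last_heartbeat.items() if now - seen <= timeout]
    return min(alive) if alive else None
```

When the master's heartbeats stop, the next call after the timeout returns the next-lowest ID, so takeover time is bounded by the timeout plus the time to recompute and push state.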

21
Future Work
  • Scalability
  • Evaluate over 1-10K switches, 10-100K routes
  • Networks with backbone-like propagation delays
  • Structuring decision logic
  • Arbitrate among multiple, potentially competing
    objectives
  • Unify control when some logic takes longer than
    others
  • Protocol improvements
  • Better dissemination and discovery planes
  • Deployment in today's networks
  • Data center, enterprise, campus, backbone (RCP)

22
Themes of Network Control and Management
  • Holistic Design
  • Many different technologies, a few common
    problems
  • Find the right abstractions, exploit commonality
  • Clean Slate
  • How much autonomy do routers/switches need?
  • New principles for controlling networks
  • Eliminate duplicate logic
  • Leverage Network Structure
  • Many different types of networks exist - each
    with different objectives and topologies
  • Separate networking issues from distributed
    system issues

23
Recent Results
  • G. Xie, J. Zhan, D. A. Maltz, H. Zhang, A.
    Greenberg, G. Hjalmtysson, J. Rexford, "On Static
    Reachability Analysis of IP Networks," IEEE
    INFOCOM 2005, Orlando, FL, March 2005.
  • J. Rexford, A. Greenberg, G. Hjalmtysson, D. A.
    Maltz, A. Myers, G. Xie, J. Zhan, H. Zhang,
    "Network-Wide Decision Making: Toward a
    Wafer-Thin Control Plane," Proceedings of ACM
    HotNets-III, San Diego, CA, November 2004.
  • D. A. Maltz, J. Zhan, G. Xie, G. Hjalmtysson, A.
    Greenberg, H. Zhang, "Routing Design in
    Operational Networks: A Look from the Inside,"
    Proceedings of the 2004 Conference on
    Applications, Technologies, Architectures, and
    Protocols for Computer Communications (ACM
    SIGCOMM 2004), Portland, Oregon, 2004.
  • D. A. Maltz, J. Zhan, G. Xie, H. Zhang, G.
    Hjalmtysson, A. Greenberg, J. Rexford, "Structure
    Preserving Anonymization of Router Configuration
    Data," Proceedings of ACM/Usenix Internet
    Measurement Conference (IMC 2004), Sicily, Italy,
    2004.

24
Questions?
25
Fundamental Problem: Wrong Abstractions
  • interface Ethernet0
  • ip address 6.2.5.14 255.255.255.128
  • interface Serial1/0.5 point-to-point
  • ip address 6.2.2.85 255.255.255.252
  • ip access-group 143 in
  • frame-relay interface-dlci 28
  • router ospf 64
  • redistribute connected subnets
  • redistribute bgp 64780 metric 1 subnets
  • network 66.251.75.128 0.0.0.127 area 0
  • router bgp 64780
  • redistribute ospf 64 match route-map
    8aTzlvBrbaW
  • neighbor 66.253.160.68 remote-as 12762
  • neighbor 66.253.160.68 distribute-list 4 in

access-list 143 deny 1.1.0.0/16
access-list 143 permit any
route-map 8aTzlvBrbaW deny 10
match ip address 4
route-map 8aTzlvBrbaW permit 20
match ip address 7
ip route 10.2.2.1/16 10.2.1.7
26
Fundamental Problem: Wrong Abstractions
Figure: size of configuration files in a single
enterprise network (881 routers). Y-axis: lines in
config file (0 to 2000). X-axis: router ID, sorted by
file size (0 to 881).
29
Fundamental Problem: Conflating Distributed
Systems Issues with Networking Issues
Routing Process
D left
D
D
Routing Process
Routing Process
D
D
D left
D left
  • Distributed Systems Concern: resiliency to link
    failures
  • Solution: multiple paths through routing process
    graph

30
Fundamental Problem: Conflating Distributed
Systems Issues with Networking Issues
Routing Process
D right
D
Routing Process
Routing Process
D
D
D left
D left
  • Distributed Systems Concern: resiliency to link
    failures
  • Solution: multiple paths through routing process
    graph

31
Fundamental Problem: Conflating Distributed
Systems Issues with Networking Issues
Routing Process
D left
D
D
Routing Process
Routing Process
D
D
D left
D left
  • Networking Concern: implement resource or
    security policy
  • Solution: restrict flow of routing information,
    filter routes, summarize/aggregate routes

34
Fundamental Problem: Computing Configurations is
Intractable
  • Computing configuration files that cause control
    plane to compute desired forwarding states is
    intractable
  • NP-hard in many cases
  • Requires predictive model of control plane
    behavior
  • Configuration files form a program that defines
    a set of forwarding states
  • Very hard to create a program that permits only
    desired states, and doesn't transit through bad
    ones

Forwarding states allowed by configs
Auto-adaptation leads to/thru bad states
Planned responses avoid bad states
35
Direct Control Provides Complete Control
  • Zero device-specific configuration
  • Supports many models for pushing routes
  • Trivial push: convergence requires time for all
    updates to be received and applied, same as today
  • Synchronized update: updates propagated, but not
    applied until an agreed time in the future; clock
    skew defines convergence time
  • Controlled state trajectory: DE serializes
    updates to avoid all incorrect transient states
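
To make the synchronized-update claim concrete, the sketch below models each router applying an update when its own clock reads the agreed time. The interval during which routers disagree then equals the spread of their clock offsets. The function names and the offset values used in the usage note are hypothetical.

```python
def apply_times(agreed_time, clock_offsets):
    """Map each router to the true time at which it applies the update.

    clock_offsets -- {router: local_clock - true_time}, in seconds
    A router applies the update when its local clock reads agreed_time.
    """
    return {r: agreed_time - off for r, off in clock_offsets.items()}

def convergence_window(agreed_time, clock_offsets):
    """Length of the interval during which routers hold mixed state."""
    times = apply_times(agreed_time, clock_offsets).values()
    return max(times) - min(times)
```

For example, with offsets of 0, +0.25 and -0.25 seconds, routers apply the update at true times 100.0, 99.75 and 100.25 for an agreed time of 100.0, so the network is inconsistent for 0.5 seconds regardless of how long propagation took.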

36
4D and Today's Networks
  • 4D architecture and principles apply to today's
    networks as well as 100x100
  • Enterprise/campus/university networks
  • Data center networks
  • Access/backbone networks
  • Greater expressivity in determining behavior
  • Behavior of butterfly graph gadgets under failure
  • Selection of traffic egress points

37
4D Supports Network Evolution and Expansion
  • Decision logic can be upgraded as needed
  • No need for update of distributed protocols
    implemented in software distributed on every
    switch
  • Decision Elements can be upgraded as needed
  • Network expansion requires upgrades only to DEs,
    not every switch

38
Three Key Questions
  • Is there any transition path to deploy the 4D
    architecture?
  • Is the 4D architecture feasible?
  • Does the 4D architecture have more expressive
    power than today's approaches to network control
    and management?

39
Deployment of the 4D Architecture
  • Pre-existing industry trend towards separating
    router hardware from software
  • IETF FORCES, GSMP, GMPLS
  • SoftRouter (Lakshman, HotNets '04)
  • Incremental deployment path exists
  • Individual networks can upgrade to 4D and gain
    benefits
  • Small enterprise networks have most to gain
  • No changes to end-systems required

40
Reachability Example
R1
R2
Chicago (chi)
New York (nyc)
Data Center
Front Office
R5
R4
R3
  • Two locations, each with a data center and a
    front office
  • All routers exchange routes over all links

41
Reachability Example
R1
R2
Chicago (chi)
New York (nyc)
Data Center
Front Office
R5
R4
R3
chi-DC
chi-FO
nyc-DC
nyc-FO
chi-DC
chi-FO
nyc-DC
nyc-FO
42
Reachability Example
Figure: routers R1-R5 linking the chi and nyc sites
(Data Center, Front Office). Packet filters shown:
"Drop nyc-FO ->, Permit" and "Drop chi-FO ->, Permit".
43
Reachability Example
Figure: routers R1-R5 linking the chi and nyc sites
(Data Center, Front Office). Packet filters shown:
"Drop nyc-FO ->, Permit" and "Drop chi-FO ->, Permit".
  • A new short-cut link added between data centers
  • Intended for backup traffic between centers

44
Reachability Example
Figure: routers R1-R5 linking the chi and nyc sites
(Data Center, Front Office). Packet filters shown:
"Drop nyc-FO ->, Permit" and "Drop chi-FO ->, Permit".
  • Oops: new link lets packets violate security
    policy!
  • Routing changed, but
  • Packet filters don't update automatically
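
The failure mode above can be reproduced with a small reachability check. The four-node topology below is a simplification invented for illustration (the slides show five routers whose exact wiring is not recoverable from the transcript): each site has a front office and a data center, the front offices are linked, and a filter on the hop into each data center drops packets sourced from the other site's front office.

```python
def can_reach(links, filters, src, dst):
    """True if a packet sourced at src can reach dst.

    links   -- iterable of (a, b) bidirectional links
    filters -- {(from_node, to_node): set of sources dropped on that hop}
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, stack = {src}, [src]
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nbr in neighbors.get(node, ()):
            # Skip hops already explored or blocked by a filter for this source.
            if nbr in seen or src in filters.get((node, nbr), ()):
                continue
            seen.add(nbr)
            stack.append(nbr)
    return False

# Hypothetical topology and filters, not taken from the slides.
LINKS = [("chi-FO", "chi-DC"), ("nyc-FO", "nyc-DC"), ("chi-FO", "nyc-FO")]
FILTERS = {("nyc-FO", "nyc-DC"): {"chi-FO"}, ("chi-FO", "chi-DC"): {"nyc-FO"}}
SHORTCUT = ("chi-DC", "nyc-DC")  # the new link between data centers
```

Before the shortcut exists, `can_reach(LINKS, FILTERS, "chi-FO", "nyc-DC")` is false because the only path crosses the filtered hop. After appending `SHORTCUT` to the links, the same call is true: the new path never touches a filter, so the policy is silently violated.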

45
Prohibiting Packets from chi-FO to nyc-DC
46
Reachability Example
Figure: routers R1-R5 linking the chi and nyc sites
(Data Center, Front Office). Packet filters shown:
"Drop nyc-FO ->, Permit" and "Drop chi-FO ->, Permit".
  • Typical response: add more packet filters to
    plug the holes in security policy

47
Reachability Example
Figure: routers R1-R5 linking the chi and nyc sites
(Data Center, Front Office). Packet filters shown:
"Drop nyc-FO ->" and "Drop chi-FO ->".
  • Packet filters have surprising consequences
  • Consider a link failure
  • chi-FO and nyc-FO still connected

48
Reachability Example
Figure: routers R1-R5 linking the chi and nyc sites
(Data Center, Front Office). Packet filters shown:
"Drop nyc-FO ->" and "Drop chi-FO ->".
  • Network has less survivability than topology
    suggests
  • chi-FO and nyc-FO still connected
  • But packet filter means no data can flow!
  • Probing the network won't predict this problem

49
Allowing Packets from chi-FO to nyc-FO
52
Packet Filters Implement Policy
  • Packet filters used extensively throughout
    networks
  • Protect routers from attack
  • Implement reachability matrix
  • Define which hosts can communicate
  • Localize traffic, particularly multicast

53
Mechanisms for Action at a Distance
A
Routing Process
Routing Process
Routing Process
Atag12
Atag12
Tag?
FIB
FIB
FIB
R1
R2
R3
  • Policy often implemented by tagging routes on one
    router
  • And testing for tag at another router
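
The tag-then-test pattern can be sketched as follows. The slide labels suggest a route "A" carrying tag 12, but whether the remote router permits or denies tagged routes is not stated, so the deny-on-match rule here is an assumption, as are the `Route` class and both function names.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Route:
    prefix: str
    tags: frozenset = field(default_factory=frozenset)

def tag_on_export(routes, tag):
    """First router: attach tag to every route it announces."""
    return [Route(r.prefix, r.tags | {tag}) for r in routes]

def import_filter(routes, denied_tag):
    """Distant router: install only routes that do not carry denied_tag."""
    return [r for r in routes if denied_tag not in r.tags]
```

The policy lives in two places: nothing on the distant router says why tag 12 is refused, and nothing on the tagging router says who tests for it. That coupling is the "action at a distance" the slide title refers to.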

54
Multiple Interacting Routing Processes
Client
Server
55
The Routing Instance Graph of an 881-Router
Network
56
Reconvergence Time Under Single Link Failure
57
Reconvergence Time When Master DE Crashes
58
Reconvergence Time When Network Partitions
59
Reconvergence Time When Network Partitions
60
Systems of Systems
  • Systems are designed as components to be used in
    larger systems in different contexts, for
    different purposes, interacting with different
    components
  • Example: OSPF and BGP are complex systems in their
    own right, yet they are components in the routing
    system of a network, interacting with each other,
    with packet filters, and with management
    tools
  • Complex configuration to enable flexibility
  • The glue has tremendous impact on network
    performance
  • State of the art: multiple interacting distributed
    programs written in assembly language
  • Lack of intellectual framework to understand
    global behavior

61
Many Implementations Possible
Single redundant decision engine
  • Multiple decision engines
  • Hot stand-by
  • Divide network and load share
  • Distributed decision engines
  • Up to one per router
  • Choice can be based on reliability requirements
  • Dissemination plane can be in-band, or leverage
    OOB links
  • Less need for distributed solutions (harder to
    reason about)
  • More focus on network issues, less on distributed
    protocols

62
Direct Expression Enables New Algorithms
D
  • OSPF normally calculates a single path to each
    destination D
  • OSPF allows load-balancing only for equal-cost
    paths to avoid loops
  • Using ECMP requires careful engineering of link
    weights

D
  • Decision Plane with network-wide view can compute
    multiple paths
  • Backup paths installed for free!
  • Bounded stretch, bounded fan-in
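
One way a decision plane with a network-wide view could install a backup path alongside the primary is sketched below: find a shortest path, then find another path that avoids every link the first one used. This greedy approach is an illustrative assumption (it can miss disjoint pairs that an algorithm such as Suurballe's would find), and it does not attempt the bounded-stretch or bounded-fan-in properties mentioned above.

```python
from collections import deque

def shortest_path(links, src, dst, avoid=frozenset()):
    """Shortest hop-count path from src to dst, skipping links in avoid."""
    neighbors = {}
    for a, b in links:
        if frozenset((a, b)) in avoid:
            continue
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in neighbors.get(node, []):
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    return None

def primary_and_backup(links, src, dst):
    """Return (primary, backup); backup shares no link with primary."""
    primary = shortest_path(links, src, dst)
    if primary is None:
        return None, None
    used = {frozenset(hop) for hop in zip(primary, primary[1:])}
    return primary, shortest_path(links, src, dst, avoid=used)
```

Note that the two paths need not have equal cost, which is exactly the case equal-cost multipath cannot use.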

63
Slides under Development
64
Supporting Network Evolution
  • Logic for controlling the network needs to change
    over time
  • Traffic engineering rules
  • Interactions with other networks
  • Service characteristics
  • Upgrades to field-deployed network equipment must
    be avoided
  • Very high cost
  • Software upgrades often require hardware upgrades
    (more CPU or memory)

65
Supporting Network Evolution Today
  • Today's Solution
  • Vendors stuff their routers with software
    implementing all possible features
  • Multiple routing protocols
  • Multiple signaling protocols (RSVP, CR-LDP)
  • Each feature controlled by parameters set at
    configuration time to achieve late binding
  • Feature-creep creates configuration nightmare
  • Tremendous complexity for syntax and semantics
  • Mis-interactions between features are common
  • Our Goal: Separate decision making logic from the
    field-deployed devices

66
Supporting Network Expansion
  • Networks are constantly growing
  • New routers/switches/links added
  • Old equipment rarely removed
  • Adding a new switch can cause old equipment to
    become overloaded
  • CPU/Memory demands on each device should not
    scale up with network size

67
Supporting Network Expansion Today
  • Routers run a link-state routing protocol
  • Size of link-state database scales with # of
    routers
  • Expanding network can exceed memory limits of old
    routers
  • Today's Solution
  • Monitor resources on all routers
  • Predict approach of exhaustion and then
  • Global upgrade
  • Rearchitecture of routing design to add
    summarization, route aggregation, information
    hiding
  • Our Goal: make demands scale with hardware (e.g.,
    # of interfaces)

68
Supporting Remote Devices
  • Maintaining communication with all network
    devices is critical for network management
  • Diagnosis of problems
  • Monitoring status and network health
  • Updating configuration or software
  • The chicken or the egg:
  • Cannot send device configuration/management
    information until it can communicate
  • Device cannot communicate until it is correctly
    configured

69
Supporting Remote Devices Today
  • Today's Solution
  • Use PSTN as management network of last resort
  • Connect console of remote routers to phone modem
  • Can't be used for customer premises equipment
    (CPE): DSL/cable modems, integrated access
    devices (IADs)
  • In a converged network, PSTN is decommissioned
  • Our Goal: Preserve management communication to
    any device that is not physically partitioned,
    regardless of configuration state

70
Network Control and Management Today
  • State everywhere!
  • Dynamic state in FIBs
  • Configured state in settings, policies, packet
    filters
  • Programmed state in magic constants, timers
  • Many dependencies between bits of state
  • State updated in uncoordinated, decentralized way!
  • Data Plane
  • Distributed routers
  • Forwarding, filtering, queueing
  • Based on FIB or labels

Packet filters
71
Network Control and Management Today
  • Logic everywhere!
  • Path Computation built into routing protocols
  • Routing Policy distributed across the routers
  • Packet Filters placed by tools in Management Plane
  • No way to arbitrate inconsistencies between logic!
  • State everywhere!
  • Dynamic state in FIBs
  • Configured state in settings, policies, packet
    filters
  • Programmed state in magic constants, timers
  • Many dependencies between bits of state
  • State updated in uncoordinated, decentralized way!
  • Data Plane
  • Distributed routers
  • Forwarding, filtering, queueing
  • Based on FIB or labels

Packet filters
72
A Study of Operational Production Networks
  • How complicated/simple are real control planes?
  • What is the structure of the distributed system?
  • Use reverse-engineering methodology
  • There are few or no documents
  • The ones that exist are out-of-date
  • Anonymized configuration files for 31 active
    networks (>8,000 configuration files)
  • 6 Tier-1 and Tier-2 Internet backbone networks
  • 25 enterprise networks
  • Sizes between 10 and 1,200 routers
  • 4 enterprise networks significantly larger than
    the backbone networks

73
Learning from Ethernet Evolution Experience
Current Implementations: Everything Changed
Except Name and Framing
HUB Switch
Router
WAN
  • Switched solution
  • Little use for collision domains
  • Servers, routers: 10 x station speed
  • 10/100/1000 Mbps, 10 Gbps coming; copper, fiber

Ethernet Conc..
Server
74
Ethernet: Re-inventing the Wheel
  • Becoming as service-rich and complex as IP
  • Traffic engineering
  • Reachability control and traffic isolation
    (VLANs)
  • QoS (802.1q)
  • Ethernet networks rediscovering the problems and
    solutions faced by IP networks
  • Is there commonality to exploit?
  • Switch/routers are all fundamentally table-driven
  • Destination addr, MPLS labels, VLANs, Circuit IDs
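
As a small illustration of "table-driven" forwarding, the sketch below does a longest-prefix match over a table that maps prefixes to actions. The table contents and action strings are made up for the example. Label, VLAN, and circuit-ID tables would be simpler still, since they are exact-match lookups that a plain dictionary covers.

```python
import ipaddress

def lookup(table, dst):
    """Longest-prefix match: return the action of the most specific entry.

    table -- {prefix string such as "10.0.0.0/8": action}
    dst   -- destination address string
    """
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, action in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, action)
    return best[1] if best else None
```

A linear scan is used for clarity; real forwarding hardware uses tries or TCAMs, but the table abstraction a decision plane would write to is the same.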

75
Control/Management Needs of 100x100 Network
Architecture
  • Control/Management creates logical network from
    physical network
  • Supports architecture and end-to-end view of
    100x100 network
  • Access Network
  • Logical level: aggregation tree between CPE and
    Regional Node
  • Physical level: network with redundant links and
    multiple Regional Nodes
  • Backbone Network
  • Logical level: full mesh of links among Regional
    Nodes
  • Physical level: sparse graph of fiber routes
    constrained by geography