Title: Does it have to be that complicated? Thoughts on a next-generation Internet
1 Does it have to be that complicated? Thoughts on a next-generation Internet
- Henning Schulzrinne
- Internet Real Time Laboratory
- Computer Science Dept., Columbia University, New York
- http://www.cs.columbia.edu/IRT
2 Overview
- The transformation in keynote big pictures
- The transition in cost metrics
- What has made the Internet successful?
- Some Internet problems
- Simplicity wins
- Architectural complexity
- New protocol engineering
3 Philosophy transition
- mainframe era, home phone, party line: one computer/phone, many users
- PC era, cell phone era: one computer/phone, one user
- ubiquitous computing: many computers/phones, one user
- anywhere, any time, any media → right place (device), right time, right media
4 Evolution of VoIP
- 1996-2000: "amazing, the phone rings" (long-distance calling, ca. 1930)
- 2000-2003: "does it do call transfer?" (catching up with the digital PBX)
- 2004-: "how can I make it stop ringing?" (going beyond the black phone)
5 Transition in cost balance
- Total cost of ownership
- Ethernet port cost ≈ $10
- about 80% of Columbia CS's system support cost is staff cost
- about $2500/person/year ≈ 2 new PCs/year
- much of the rest is backup and licenses for spam filters
- Does not count hours of employee or son/daughter time
- PC, Ethernet port and router cost seem to have reached a plateau
- just that the $10 now buys a 100 Mb/s port instead of 10 Mb/s
6 What has made the Internet successful?
- 36 years → approaching mid-life crisis → time for self-reflection
- → next generation suddenly no longer finds it hip
- Transparency in the core
- new applications
- Narrow interfaces
- socket interface, resolver
- HTTP and SMTP messaging as applications
- prevent change leakage
- Low barrier to entry
- L2: minimalist assumptions
- technical: basic connectivity is within
- economical: below 20?
- Commercial off-the-shelf systems
- scale: compare 802.11 router vs. cell base station
(figure labels: Ethernet, web server)
7 IP hourglass
(figure: protocol layers, top to bottom)
email, WWW, phone, ...
SMTP, HTTP, RTP, ...
TCP, UDP
IP
Ethernet, PPP
CSMA, async, SONET, ...
copper, fiber, radio, ...
Source: Steve Deering, IETF, Aug. 2001
8 User issues (guesses)
- Lack of trust
- small mistakes → identity gone
- waste time on spam, viruses, worms, spyware, ...
- Lack of reliability
- 99.5% instead of 99.999%
- even IETF meeting can't get reliable 802.11 connectivity
- Lack of symmetry
- asymmetric bandwidth: ADSL
- asymmetric addressing: NAT, firewalls → client(-server) only, packet relaying via TURN or p2p
- Users as "Internet mechanics"
- why does a user need to know whether to use IMAP or POP?
- navigate circle of blame
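To make the reliability gap above concrete, the following minimal sketch converts an availability percentage into expected downtime per year. The function name and the 365-day year are assumptions made for illustration; the two percentages are the ones quoted on the slide.

```python
# Downtime per year implied by an availability percentage.
# Assumption: a 365-day year (8760 hours), ignoring leap years.
HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the expected hours of unavailability per year."""
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.5, 99.999):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% available: {hours:.2f} h/year ({hours * 60:.1f} min/year)")
```

Under these assumptions, 99.5% corresponds to roughly 44 hours of outage per year, while 99.999% corresponds to about 5 minutes.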
9 Technical infrastructure issues
- Sensor networks
- addressing mechanisms not suitable → geographic addressing, self-routing packets
- TCP model
- partial and temporal connectivity
- Mobile ad-hoc networks (MANET)
- address acquisition?
- Mobile networks (e.g., plane (Connexion), train, car, ...)
- routing granularity: each plane a BGP AS?
- network merging and splitting
10 Technical infrastructure issues
- Multi-homing and mobility
- address vs. locator issues
- Large-scale Internet
- secure routing
- routing scaling (60,000 AS)
- Architecture
- standardization delays → now routinely 3-5 years for minor extensions
- resistance to change at L4
- difficulty in deploying new applications
- Internet service: outbound ports 80 and 443
11 What has gone wrong?
- Familiar to anybody who has an old house
- Entropy
- as parts are added, complexity and interactions increase
- Changing assumptions
- trust model: research colleagues → far more spammers and phishers than friends
- AOL: 80% of email is spam
- internationalization: internationalized domain names, email character sets
- criticality: email research papers → transfers $B and dial 9-1-1
- economics: competing providers
- "Internet does not route money" (Clark)
- Backfitting
- had to backfit security, I18N, autoconfiguration, ...
- → Tear down the old house, gut interior or more wall paper?
12 In more detail
- Deployment problems
- Layer creep
- Simple and universal wins
- Scaling in human terms
- Cross-cutting concerns, e.g.,
- CPU vs. human cycles
- we optimize the $100 component, not the $100/hour labor
- introspection
- graceful upgrades
- no policy magic
13 The transformation of protocol stacks
14 Cause of death for the next big thing
Candidates: QoS, multicast, mobile IP, active networks, IPsec, IPv6
(table: one mark per affected candidate; only the number of marks per row is recoverable)
- not manageable across competing domains: 4 of 6
- not configurable by normal users (or apps writers): 3 of 6
- no business model for ISPs: all 6
- no initial gain: 5 of 6
- 80% solution in existing system: all 6 (NAT)
- increase system vulnerability: 4 of 6
15 Simple wins (mostly)
- Examples
- Ethernet vs. all other L2 technologies
- HTTP vs. HTTPng and all the other hypertext attempts
- SMTP vs. X.400
- SDP vs. SDPng
- TLS vs. IPsec (simpler to re-use)
- no QoS and MPLS vs. RSVP
- DNS-SD (Bonjour) vs. SLP
- SIP vs. H.323 (but conversely SIP vs. Jabber, SIP vs. Asterisk)
- the failure of almost all middleware
- future: demise of 3G vs. plain SIP
- Efficiency is not important
- BitTorrent, P2P searching, RSS, ...
16 Measuring complexity
- Traditional O(.) metrics rarely helpful
- except maybe for routing protocols
- Focus on parsing, messaging complexity
- marginally helpful, but no engineering metrics for trade-offs
- No protocol engineering discipline, lacking
- guidelines for design
- learning from failures
- we have plenty to choose from, but hard to look at our own (communal) failures
- re-usable components
- components not designed for plug-and-play
- we don't do APIs → we don't worry about whether a simple API can be written that can be taught in Networking 101
17 Measuring complexity
- Conceptual complexity
- can I explain the protocol operation in one class?
- e.g., counter examples: PIM-SM, MADCAP, OSPF
- Observable vs. hidden
- one side can see, without god box
- hidden state and actions increase information complexity
- unknown variables can have any state
- Number of system interfaces
- see BISDN, 3GPP, NGN, ...
18 Possible complexity metrics
- new code needed (vs. reuse → less likely to be buggy or have buffer overflows)
- e.g., new text format almost the same
- numerous binary formats
- security components
- new identities and identifiers needed
- number of configurable options and parameters
- must be configured vs. can be configured (with interop impact)
- discoverable vs. manual/unspecified
- SIP experience: things that shouldn't be configurable will be
- RED experience: parameter robustness
- "mute programmer" interop test: two implementations, no side channel
- number of "left to local policy" decisions
- DSCP confusion
- start-up latency (protocol boot time)
- IPv4 DAD, IGMP
19 Democratization of protocol engineering
- Traditional Internet application protocols (IETF et al.)
- one protocol for each type of application
- SMTP for email, ftp for file transfer, HTTP for web access, POP for email retrieval, NNTP for netnews, ...
- slow protocol development process
- re-do security (authentication) for each
- each new protocol has its own text encoding
- similarity across protocols: SMTP-style headers
- Content-Type: text/plain; charset="us-ascii"; format=flowed
- large parsing exposure → new buffer overflows for each protocol
- Separate worlds
- most of the new protocols in the real world based on WS
- IETF stuck in bubble of one-off protocols → more fun!
- re-use considered a disadvantage
- insular protocols that have local cult following (BEEP)
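The SMTP-style header quoted above is an example of syntax that an existing, well-tested parser can handle, rather than a new hand-written one per protocol. The sketch below uses Python's standard `email` package to pull the media type and parameters out of that header. The message body and the helper name `describe_content_type` are made up for illustration, and this parser targets mail messages; applying it to other protocols with similar headers would be an assumption, not something the slides claim.

```python
from email import message_from_string

# Header taken from the slide; the one-line body is a made-up placeholder.
RAW_MESSAGE = (
    'Content-Type: text/plain; charset="us-ascii"; format=flowed\n'
    "\n"
    "hello\n"
)


def describe_content_type(raw_message: str) -> dict:
    """Parse the Content-Type header with the stdlib parser (hypothetical helper)."""
    msg = message_from_string(raw_message)
    return {
        "type": msg.get_content_type(),       # media type without parameters
        "charset": msg.get_param("charset"),  # quotes are removed by the parser
        "format": msg.get_param("format"),
    }


if __name__ == "__main__":
    print(describe_content_type(RAW_MESSAGE))
```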
20 The transformation of protocol design
- One application, one protocol → common infrastructure for new applications
- Old model
- RPC for corporate one-off applications
- custom protocols for common Internet-scale applications
- Far too many new applications
- and not enough protocol engineers
- network specialist → application specialist
- new IETF application protocol design takes 5 years
- Many of the applications (email to file access) could be modeled as RPC
(figure: protocol encoding styles)
- custom text protocol (ftp)
- RFC 822 protocol (SMTP, HTTP, RTSP, SIP, ...)
- use XML for protocol bodies (IETF IM and presence)
- SOAP and other XML protocols
- ASN.1-based (SNMP, X.400)
21 Why are web services succeeding (*) after RPC failed?
- SOAP: just another remote procedure call mechanism
- plenty of predecessors: SunRPC, DCE, DCOM, Corba, ...
- client-server computing
- all of them were to transform (enterprise) computing, integrate legacy applications, end world hunger, ...
- Why didn't they?
- Speculation
- no web front end (no three-tier applications)
- few open-source implementations
- no common protocol between PC client (Microsoft) and backend (IBM mainframes, Sun, VMS)
- corporate networks local only (one site), with limited backbone bandwidth
(figure labels: Corba, DCOM, SunRPC)
(*) we hope
22 Why did web services succeed after RPC failed?
- More speculation
- Corba, DCOM, SunRPC: no real security story
- had custom-made security instead of TLS
- Many initially designed for LAN only
- e.g., use of UDP, service naming by ports only
- Limited language support
- e.g., no PHP or Perl support for Corba
- Limited platform support
- DCOM: Microsoft
- Corba: all-but-Microsoft
23 Technical challenges for web services
- Resistance to common protocol infrastructure
- "Yet another RPC fad"
- SOAP overhead: the price of generality
- SOAP envelope
- inefficient binary encoding (images, etc.) compared to MIME multipart
- clumsy load-balancing and redundancy
- inefficient implementations
- high start-up costs
- XML problems
- XML schema hard to work with
- Namespace clutter
- hard to extend among multiple independent parties (→ RelaxNG)
- can only do <any other>
(figure labels: SOAP, servlets, PHP)
24 Emerging light-weight alternatives
- Many SOAP services, but public services are often XML-RPC only
- or even HTTP POST
- Examples
- eBay, Amazon, Google
- Ajax
- Reasons?
- SOAP envelope adds modest value
- Scripting languages have
- REST principles: identify objects by URL in request, not by identifier in RPC body
- easier dispatch to PHP/Ruby/... scripts
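The dispatch point can be illustrated with a small sketch that contrasts the two styles. Everything here is hypothetical: the `ITEMS` table, the `/items/<id>` URL layout and the `getItem` method name are invented for illustration, and JSON stands in for the XML body that an XML-RPC or SOAP stack would have to parse.

```python
import json
from urllib.parse import urlsplit

# Hypothetical in-memory service data; all names are illustrative only.
ITEMS = {"42": {"title": "used router"}}


def rest_dispatch(method: str, url: str):
    """REST style: the object is named by the URL, so no body parsing is needed."""
    parts = urlsplit(url).path.strip("/").split("/")
    if method == "GET" and len(parts) == 2 and parts[0] == "items":
        return ITEMS.get(parts[1])
    return None


def rpc_dispatch(body: str):
    """RPC style: one endpoint; the method name and object id sit inside the body."""
    call = json.loads(body)  # stand-in for parsing an XML request body
    if call.get("method") == "getItem":
        return ITEMS.get(call["params"]["id"])
    return None


if __name__ == "__main__":
    print(rest_dispatch("GET", "http://example.com/items/42"))
    print(rpc_dispatch('{"method": "getItem", "params": {"id": "42"}}'))
```

In the REST variant a front-end server could route the request to a script by path alone, which is one plausible reading of "easier dispatch".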
25 Time for a new protocol stack?
- Now: add x.5 sublayers and overlays
- HIP, MPLS, TLS, ...
- Doesn't tell us what we could/should do
- or where functionality belongs
- use of upper layers to help lower layers (security associations, authorization)
- what is innate complexity and what is entropy?
- Examples
- Applications: do we need ftp, SMTP, IMAP, POP, SIP, RTSP, HTTP, p2p protocols?
- Network: can we reduce complexity by integrating functionality or re-assigning it?
- e.g., should e2e security focus on transport layer rather than network layer?
- probably need pub/sub layer: currently kludged locally (email, IM, specialized)
26 NSF "Green Field" approach
- US National Science Foundation (NSF) working on new funding thrust → next-generation Internet
- idea: incremental components → new architecture
- vs. traditional "brown field" approach
- Two major components
- GENI: large-scale experimental testbed for testing next-generation ideas
- building on PlanetLab (hundreds of public-access servers) → p2p, CDN, measurement infrastructures
- probably offers circuits (optical or virtual with bandwidth guarantees)
- $300M (not yet allocated)
- FIND
- regular research program within NETS ($15M)
- prepare architecture designs
27 NSF FIND and GENI, cont'd
- Fundamental notions
- not constrained by existing Internet architecture
- Difficulties
- not coordinated → too many moving pieces?
- no single research team can do everything
- point optimization: Internet for
- benchmarks missing
- how do you compare architectures?
- are there quantifiable requirements?
- are there metrics to compare different approaches?
- Cynic's prediction based on the past
- IPv6: "you'll get security, QoS, autoconfiguration, mobility, ..."
- IPv4: "good ideas, I'll do those, too"
28 (My) guidelines for a new Internet
- Maintain success factors, such as
- service transparency
- low barrier to entry
- narrow interfaces
- New guidelines
- optimize human cycles, not CPU cycles
- design for symmetry
- security built-in, not bolted-on
- everything can be mobile, including networks
- sending me data is a privilege, not a right
- reliability paramount
- isolation of flows
- New possibilities
- another look at circuit switching?
- knowledge and control (signaling) planes?
- separate packet forwarding from control
- better alignment of costs and benefit
- better scaling for Internet-scale routing
- more general services
29 More network services
- Currently, very specialized and limited
- packet forwarding
- DNS for identifier lookup
- DHCP for configuration
- New opportunities
- packet forwarding with control
- general identifier storage and lookup
- both server-based and peer-to-peer
- SLP/Jini/UDDI service location → ontology-based data store
- network file storage → for temporarily-disconnected mobiles
- network computation → translation, relaying
- trust services (→ IRT trust paths work)
30 Security
- More than just encryption!
- Need identity and role-based certificates
- May want reverse-path reachability (bank → customer)
Questions each side is asking:
- user: do I know this person? is he a likely sender of spam? is this really a bank? am I connected to a real network or an impostor?
- network: is this a customer? is this BGP route advertisement legitimate?
31 Summary
- Traditional protocol engineering
- must do congestion control
- must do security
- must be efficient
- New module engineering
- must reduce operations cost
- out-of-the-box experience
- re-usable components
- most protocol design will be done by domain experts (cf. PHP vs. C)
- What would a clean-room design look like?
- keep what made Internet successful
- generalize, adjust to new conditions
32 Managing (VoIP) Applications: DYSWIS
- Henning Schulzrinne
- Dept. of Computer Science
- Columbia University
- July 2005
33 Overview
- User experience for VoIP still inferior
- Existing network management doesn't work for VoIP and other modern applications
- Need user-centric rather than operator-centric management
- Proposal: peer-to-peer management
- "Do You See What I See?"
- Also use for reliability estimation and statistical fault characterization
34 VoIP user experience
- Only 95-99.5% call attempt success
- "Keynote was able to complete VoIP calls 96.9% of the time, compared with 99.9% for calls made over the public network. Voice quality for VoIP calls on average was rated at 3.5 out of 5, compared with 3.9 for public-network calls and 3.6 for cellular phone calls. And the amount of delay the audio signals experienced was 295 milliseconds for VoIP calls, compared with 139 milliseconds for public-network calls." (InformationWeek, July 11, 2005)
- Mid-call disruptions
- Lots of knobs to turn
- Separate problem: manual configuration
35 Diagnostic undecidability
- symptom: "cannot reach server"
- more precise: send packet, but no response
- causes
- NAT problem (return packet dropped)?
- firewall problem?
- path to server broken?
- outdated server information (moved)?
- server dead?
- 5 causes → very different remedies
- no good way for non-technical user to tell
- Whom do you call?
36 Circle of blame
(figure: ISP, OS, VSP and app vendor in a circle, each passing the blame on)
- "probably packet loss in your Internet connection" → reboot your DSL modem
- "probably a gateway fault" → choose us as provider
- "must be a Windows registry problem" → re-install Windows
- "must be your software" → upgrade
37 Traditional network management model
(figure: SNMP, "management from the center")
38 Old assumptions, now wrong
- Single provider (enterprise, carrier)
- has access to most path elements
- professionally managed
- Typically, hard failures or aggregate problems
- element failures
- substantial packet loss
- Mostly L2 and L3 elements
- switches, routers
- rarely 802.11 APs
- Indirect detection
- MIB variable vs. actual protocol performance
- No real end system management
- DMI and SNMP never succeeded
- each application does its own updates
39 Example: VoIP, managing the protocol stack
(figure: protocol stack with typical problems per layer)
- media: echo, gain problems, VAD action
- RTP: protocol problem, playout errors
- SIP: protocol problem, authorization, asymmetric conn (NAT)
- UDP/TCP: TCP neg. failure, NAT time-out, firewall policy
- IP: no route, packet loss
40 Example: VoIP, call lifecycle view
(figure: call phases with the questions raised at each)
- REGISTER: auth? registrar?
- get addresses: STUN failure
- SIP INVITE: outbound proxy? dest. proxy?
- get 200 OK
- exchange media: loss? gain? silence suppression?
- terminate call
41 Types of failures
- Hard failures
- connection attempt fails
- no media connection
- NAT time-out
- Soft failures (degradation)
- packet loss (bursts)
- access network? backbone? remote access?
- delay (bursts)
- OS? access networks?
- acoustic problems (microphone gain, echo)
42 Examples of additional problems
- ping and traceroute no longer work reliably
- WinXP SP 2 turns off ICMP
- some networks filter all ICMP messages
- Early NAT binding time-out
- initial packet exchange succeeds, but then TCP binding is removed ("web-only Internet")
- policy intent vs. failure
- "broken by design"
- "we don't allow port 25" vs. "SMTP server temporarily unreachable"
43 "Do You See What I See?"
- Each node has a set of active and passive measurement tools
- Use intercept (NDIS, pcap)
- to detect problems automatically
- e.g., no response to HTTP or DNS request
- gather performance statistics (packet jitter)
- capture RTCP and similar measurement packets
- Nodes can ask others for their view
- possibly also dedicated "weather stations"
- Iterative process, leading to
- user indication of cause of failure
- in some cases, work-around (application-layer routing) → TURN server, use remote DNS servers
- Nodes collect statistical information on failures and their likely causes
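One way to picture the "ask others for their view" step is a simple comparison of the local observation with what peers report for the same server. The decision rule below is an assumption made for illustration, not the algorithm from the talk; the function name, the peer names and the result strings are all hypothetical.

```python
from typing import Mapping


def diagnose(local_ok: bool, peer_views: Mapping[str, bool]) -> str:
    """Guess where a failure lies from local and peer reachability results.

    peer_views maps a (hypothetical) peer name to True if that peer
    could reach the same server. The rule is an illustrative heuristic.
    """
    if local_ok:
        return "no local failure"
    if not peer_views:
        return "unknown: no peer views available"
    reachable = sum(1 for ok in peer_views.values() if ok)
    if reachable == 0:
        # Nobody can reach the server: server dead, moved, or path broken.
        return "likely remote: server or path to server"
    if reachable == len(peer_views):
        # Everyone else succeeds: suspect the local NAT, firewall or access link.
        return "likely local: NAT, firewall or access network"
    return "likely partial: path- or policy-specific failure"


if __name__ == "__main__":
    print(diagnose(False, {"peer-a": True, "peer-b": True}))
```

A real system would iterate, as the slide says, running further probes to narrow down the cause before telling the user.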
44 Failure detection tools
- STUN server
- what is your IP address?
- ping and traceroute
- Transport-level liveness
- open TCP connection to port
- send UDP ping to port
(figure: protocol stack: media, RTP, UDP/TCP, IP)
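The "open TCP connection to port" check can be sketched with the standard socket library. This is a minimal illustration under stated assumptions: the function name `tcp_alive` and the 2-second default timeout are arbitrary choices, and the sketch only tells "connection opened" from "not opened" without distinguishing a refused connection from a silent drop.

```python
import socket


def tcp_alive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS errors and timeouts.
        return False


if __name__ == "__main__":
    # Example with a placeholder host; replace with a server of interest.
    print(tcp_alive("example.com", 80))
```

Recording which exception occurred (refused vs. timed out) would give a node more to compare against its peers' results.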
45 How to find management peers?
- Use carrier-provided bootstrap list
- Previous session partners
- e.g., address book
- Watcher list
46 Need failure statistics
- Which parts of the network are most likely to fail (or degrade)
- access network
- network interconnects
- backbone network
- infrastructure servers (DHCP, DNS)
- application servers (SIP, RTSP, HTTP, ...)
- protocol failures/incompatibility
- Currently, mostly guesses
- End nodes can gather and accumulate statistics
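As a rough illustration of how an end node might accumulate such statistics, the sketch below counts diagnosed failures per category and reports each category's share. The class name, the method names and the exact category strings are assumptions made for this example; only the categories themselves come from the list above.

```python
from collections import Counter

# Categories taken from the slide; the exact strings are illustrative.
CATEGORIES = (
    "access network",
    "network interconnect",
    "backbone network",
    "infrastructure server",
    "application server",
    "protocol failure",
)


class FailureStats:
    """Accumulates per-category failure counts on one end node (hypothetical)."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, category: str) -> None:
        if category not in CATEGORIES:
            raise ValueError(f"unknown failure category: {category}")
        self.counts[category] += 1

    def shares(self) -> dict:
        """Return the fraction of recorded failures per observed category."""
        total = sum(self.counts.values())
        if total == 0:
            return {}
        return {cat: n / total for cat, n in self.counts.items()}


if __name__ == "__main__":
    stats = FailureStats()
    for cat in ("access network", "access network", "application server"):
        stats.record(cat)
    print(stats.shares())
```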
47 Conclusion
- Internet middle-aged → time for reflection
- can we keep what has worked and re-consider the others?
- need to work on control, management and reflection
- opportunity for new building blocks vs. classical middleware
- opportunity