Title: Surviving Large Scale Internet Outages
1Surviving Large Scale Internet Outages
- Dr. Krishna Kant
- Intel Research
- Acknowledgements: Work supported by the National Science Foundation
- Collaborative work with A. Sahoo & P. Mohapatra
2Outline
- Overview
- Routing and Name resolution infrastructures
- Some large scale failures
- Routing Vulnerabilities
- Routing algorithms & their properties
- Improving inter-domain routing
- Dealing with Name Resolution Failures
- Name resolution preliminaries
- DNS vulnerabilities & solutions
3The Problem
- Internet has two critical elements
- Routing (inter- & intra-domain)
- Name resolution
- How robust are they against large scale failures/attacks?
- How do we improve them?
4Internet Routing
- Not a homogeneous network
- A network of autonomous systems (AS)
- Large variation in AS sizes, typically heavy-tailed.
- Inter-AS routing
- Border Gateway Protocol (BGP)
- Complex configuration parameters
- Flexible, but serious stability, recoverability and configurability issues
- Intra-AS routing
- Usually easier to manage
- Central control, smaller network, etc.
- But can suffer from similar problems
5Internet Name Resolution
- Domain Name System (DNS)
- Translates names to IP addresses.
- Critical for all networking services
- Hierarchical structure
- Caching of data in proxy servers & resolvers
- DNS Vulnerabilities
- Complex dependencies & easy to poison
- Can lead to large scale failures
- Inability to access sites, or diversion to malicious sites.
[Figure: an application issues "ftp acme.com"; the resolver queries the DNS proxy server, which queries the authoritative DNS server; acme.com resolves to 10.7.196.31]
6Large Scale Failures
- Characteristics
- Large service impact.
- Usually non-uniformly distributed, e.g., an affected geographical area, hijacked .com domain, etc.
- Why study large scale failures?
- Several moderate sized incidents already.
- Larger failures will happen
- Can cause other undesirable impacts
- Secondary failures due to large recovery traffic
- Substantial imbalance in load
7Routing Failures
- Physical Damage
- Earthquake, hurricane, high BW cable cuts, etc.
- SW bugs & configuration errors
- Incorrect input or output filtering rules
- Aggregation of large un-owned IP blocks
- Incompatible policies among ASes
- Network wide congestion (DoS attack)
- Malicious route advertisements via worms
8Name Resolution Failures
- Compromising name resolution
- Poisoning (altering/insertion) of address records
- Doesn't even require compromising the server
- Extensive caching → More points of entry
- Substitution of rogue DNS server
- Security holes due to configuration errors
- Potential large scale effects
- Poisoning at higher levels → Large scale disruption
- Example: March 2005 .com attack
- Redirection to malicious sites to collect sensitive info
9Some Significant Failure Events
10Taiwan Earthquake Dec 2006
- Major outage in SE Asia, 60% drop in traffic
- Issues
- Global traffic passes through a small number of seismically active choke points.
- Luzon strait, Malacca strait, South coast of Japan
- Satellite & overland cables → Inadequate backup capacity
- Several countries depend on 1-2 landing pts
- Outlook: Potential repeat performance
- Economics makes change unlikely.
- May be exploited by pirates & terrorists
- Reference: http://master.apan.net/meetings/xian2007/publication/051_Kitamura.pdf
11Hurricane Katrina (Aug 2005)
- Major local outages. No major regional cable routes through the worst affected areas.
- Outages persisted for weeks & months. Notable after-effects in FL (significant outages 4 days later!)
- Reference
- http://www.renesys.com/tech/presentations/pdf/Renesys-Katrina-Report-9sep2005.pdf
12NY Power Outage (Aug 2003)
- [Graph: no. of concurrent network outages vs. time]
- Large ASes suffered less than smaller ones.
- Many ASes had all routers down for 4 hours.
- Very similar power outage in Italy, Sept 2003.
13Slammer Worm (Jan 2003)
- Worm started w/ buffer overflow of MS SQL.
- Very rapid replication, huge congestion buildup in 10 mins
- Korea falls out, 5/13 DNS root servers fail, failed ATMs, etc.
- High BGP activity to find working routes.
- Reference: http://www.cs.ucsd.edu/~savage/papers/IEEESP03.pdf
14DNS Attack (Jan 2006)
- Attack Type
- Authoritative TLD DNS servers attacked using 100 zombie clients & 51K recursive servers.
- 55 Byte zombie query → 4.2KB response.
- Responses directed to target name server (w/ spoofed IP address).
- Impact
- Failures in networks in the path, including transit providers, to authoritative TLD DNS servers
- Graph
- Unanswered queries (Y-axis) vs. Time (X-axis)
- Red: failure, yellow: slow
- Reference: http://www.oecd.org/dataoecd/34/40/38653402.pdf
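The query and response sizes above imply a large amplification factor. A minimal sketch of that arithmetic, assuming 4.2KB means 4200 bytes (the slide does not say whether KB is 1000 or 1024 bytes):

```python
# Sizes taken from the slide; reading 4.2KB as 4200 bytes is an assumption.
QUERY_BYTES = 55
RESPONSE_BYTES = 4200


def amplification_factor(query_bytes: int, response_bytes: int) -> float:
    """Bytes delivered to the target per byte sent by a zombie client."""
    return response_bytes / query_bytes


factor = amplification_factor(QUERY_BYTES, RESPONSE_BYTES)
print(f"amplification is roughly {factor:.0f}x")
```

Each byte a zombie sends therefore turns into tens of bytes arriving at the spoofed target, which is why a small number of zombies suffices.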
15Infrastructure Induced Failures
- En-masse use of backup routes by 4000 Cisco routers in May 2007 (Japan)
- Routing table rewrites → 7 hr downtime in NE Japan
- Ref: http://www.networkworld.com/news/2007/051607-cisco-routers-major-outage-japan.html
- Akamai CDN failure, June 2004
- Probably widespread failures in Akamai's DNS.
- Ref: http://www.landfield.com/isn/mail-archive/2004/Jun/0064.html
- Worldcom router mis-configuration, Oct 2002
- Misconfigured eBGP router flooded internal routers with routes.
- Ref: http://www.isoc-chicago.org/internetoutage.pdf
16Routing Infrastructure
17Routing Basics
- Distance vector based (DV)
- RIP (Routing Information Protocol)
- IGRP (Interior Gateway Routing Protocol)
- Link State Based (LS)
- OSPF (Open Shortest Path First)
- IS-IS (Intermediate System to Intermediate System)
- Path Vector Based (PV)
- BGP (Border Gateway Protocol)
- Intra-domain (iBGP) & inter-domain (eBGP) versions.
18Distance Vector (DV) Protocols
- Build RT using successive path advertisements.
- May use stale info when handling failures
- "Count to infinity" problem: several versions to fix this.
- Difficult to use policies
[Figure: example network with nodes A-F and the routing table for A]
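As a rough illustration of how a routing table is built from successive advertisements, the sketch below merges one neighbor's distance vector into a local table. The function name, table layout, node names and costs are all hypothetical and not taken from the slides.

```python
INF = float("inf")


def dv_update(own, neighbor_vector, link_cost, neighbor):
    """Merge a neighbor's advertised distances into our table.

    own maps dest -> (cost, next_hop); returns True if anything changed.
    """
    changed = False
    for dest, adv_cost in neighbor_vector.items():
        new_cost = link_cost + adv_cost
        cur_cost, cur_hop = own.get(dest, (INF, None))
        # Take a cheaper path, or follow any change reported by the
        # neighbor we already route through (its old cost is now stale).
        if new_cost < cur_cost or (cur_hop == neighbor and new_cost != cur_cost):
            own[dest] = (new_cost, neighbor)
            changed = True
    return changed


# Hypothetical example: A hears from B (link cost 1) that B reaches D at cost 1.
table_a = {"A": (0, "A")}
dv_update(table_a, {"B": 0, "D": 1}, link_cost=1, neighbor="B")
print(table_a)
```

Because a node only ever sees costs, not paths, it can keep trusting a neighbor whose route actually runs back through itself, which is where count-to-infinity comes from.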
19Link State (LS) Protocols
- Each node keeps complete adjacency/cost matrix & computes shortest paths locally
- Any failure propagated via flooding
- Expensive in a large network
- Loop-free & can use policies easily.
[Figure: example network with nodes A-E and numbered links]
20Path Vector Protocols
- Each node initialized w/ a set of paths for each destination
- Active paths updated much like in DV
- Explicitly withdraw failed paths (& advertise next best)
- Filtering on incoming/outgoing paths, path selection policies
- Paths A to D
- Via B: cost 3
- Via C: cost 4
- Entire path not stored (only cost, next hop)
21Intra-domain Routing under Failures
- Routing algorithms
- Link state (OSPF)
- Flooding can handle failures quickly.
- Path vector (iBGP)
- iBGP routers are fully meshed in small networks (routing not much of an issue)
- In large networks, route reflectors may be used for scalability
- Can recover rather quickly
- Single domain of control
- High visibility, common management network, etc.
- Easy to configure consistent values at all routers
- iBGP with route reflection shown to suffer from oscillations, but can be remedied.
- Reference: A. Rawat & M.A. Shayman, "Preventing persistent oscillations and loops in IBGP configuration with route reflection", Computer Networks, Vol 50, No 18, Dec 2006, pp 3642-3665
22Inter-domain Routing
- BGP: Default inter-AS protocol (RFC 1771)
- Path vector protocol, runs on TCP
- Scalable, rich policy settings
- But prone to long convergence delays
- High packet loss & delay during convergence
23Inter-domain routing: BGP specifics and vulnerabilities
24BGP Routing Table
- Prefix: origin address for dest & mask (e.g., 207.8.128.0/17)
- Next hop: Neighbor that announced the route
- One active route, others kept as backup
- Only active route can be advertised
- Route attributes -- may be conveyed outside
- ASpath: Used for loop avoidance.
- MED (multi-exit discriminator): preferred incoming path
- Local pref: Used for local path selection
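The attributes above can be tied together in a small sketch of route selection. It is a heavily simplified stand-in for the real BGP decision process: it rejects routes whose AS path already contains the local AS, then prefers higher local pref, shorter AS path and lower MED. The `Route` class, AS numbers, next hops and the default local pref of 100 are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    prefix: str
    next_hop: str
    as_path: List[int]
    local_pref: int = 100  # assumed default, for illustration only
    med: int = 0


def accept(route: Route, my_as: int) -> bool:
    """Loop avoidance: drop any route whose AS path already contains us."""
    return my_as not in route.as_path


def best_route(routes: List[Route], my_as: int) -> Optional[Route]:
    """Pick the active route; the other accepted routes stay as backups."""
    candidates = [r for r in routes if accept(r, my_as)]
    if not candidates:
        return None
    # Simplified ordering: local pref (higher wins), AS path length, MED.
    return min(candidates, key=lambda r: (-r.local_pref, len(r.as_path), r.med))


MY_AS = 65001
short = Route("207.8.128.0/17", "10.0.0.1", [65002, 65010])
looped = Route("207.8.128.0/17", "10.0.0.2", [65003, 65001, 65010])
preferred = Route("207.8.128.0/17", "10.0.0.3", [65004, 65005, 65010], local_pref=200)
best = best_route([short, looped, preferred], MY_AS)
print(best.next_hop)
```

Here the longer path wins because local pref is compared first, which mirrors how local policy can override path length.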
25BGP Messages
- Message Types
- Open (establish TCP conn), notification, update, keepalive
- Update
- Withdraw zero or more old routes
- Optionally advertise exactly one new route.
- May need to also advertise sub-prefix
- E.g., 207.8.240.0/24, which is contained in 207.8.128.0/17
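The sub-prefix relationship in the example can be checked with Python's standard `ipaddress` module. The helper `longest_match` and the test address 207.8.240.9 are illustrative additions; the sketch only shows why a more specific prefix, once advertised, attracts traffic for the addresses it covers.

```python
import ipaddress

covering = ipaddress.ip_network("207.8.128.0/17")
specific = ipaddress.ip_network("207.8.240.0/24")

# True when every address of the /24 also lies inside the /17.
is_sub_prefix = specific.subnet_of(covering)


def longest_match(addr, prefixes):
    """Return the most specific prefix containing addr, or None."""
    ip = ipaddress.ip_address(addr)
    matches = [p for p in prefixes if ip in p]
    return max(matches, key=lambda p: p.prefixlen) if matches else None


print(is_sub_prefix)
print(longest_match("207.8.240.9", [covering, specific]))
```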
26Routing Process
- Input policy engine
- Filter routes by path attributes, prefix, etc.
- Output policy engine
- Manipulate attributes, e.g. Local pref., MED, etc.
- Multiple points for possible configuration errors & mismatch between ASes
27BGP Recovery
- BGP Convergence Delay
- Time for all routes to stabilize following an event
- Four durations of interest
- Tup, Tshort, Tlong, Tdown
- Min. Route Advertisement Interval (MRAI)
- Applies only to adv., not withdrawals
- Intended per destination, implemented per peer
- Damps out oscillations
[Figure: convergence delay vs. MRAI]
28Impact of BGP Recovery
- Long Recovery Times
- 3 min. for 30% of isolated failures
- 15 min. for 10% of cases
- Longer for larger failures
- Consequences
- Connection attempts over invalid routes fail.
- Big increase in pkt loss (30X) and delay (4X)
- Compromised QoS
Graphs taken from ref [2], Labovitz et al.
29BGP Illustration (1)
- Best path: PSD = (N, cost): X
- S, D: Source & destination nodes
- N: Next hop
- X: Actual path (for illustration only)
- Sample starting paths to C
- PBC = (D,3): BDAC, PDC = (A,2): DAC, etc.
- Paths shown using arrows (all share seg AC)
- Failure of A
- BGP does not attempt to diagnose problem or broadcast failure events.
30BGP Illustration (2)
- NOTE: Affected node names in blue, rest in white
- A's neighbors install paths not using A as next hop →
- PDC = (B,5): DBFEAC, PEC = (F,5): EFBDAC, PGC = (H,6): GHIBDAC
- Full path unknown → Passage of these paths thru A is not known!
- D advertises PDC = (B,5) to B
- Current PBC is via D → B must pick a path not via D →
- B installs PBC = (F,4): BFEAC & advertises it to F & I
- Note: Green indicates first advertisement by B
31BGP Illustration (3)
- E advertises PEC: EFBDAC to F
- Current PFC is via E →
- F installs PFC = (B,4): FBDAC & advertises to E & B
- G advertises PGC: GHIBDAC to H
- Current PHC is via G →
- H installs PHC = (I,5): HIBDAC & advertises to I
32BGP Illustration (4)
- B's adv BFEAC reaches F & I
- PFC = (B,4): FBDAC is thru B → F withdraws PFC & has no path to C!
- PIC = (H,5): IHGAC is shorter → I retains it.
- F's adv FBDAC reaches B; PBC = (F,4): BFEAC is thru F →
- B installs PBC = (I,6): BIHGAC and advertises to D, F & I
- Note: Green text: B's first adv; Grey text: B's subsequent adv. (disallowed by MRAI)
33BGP Illustration (5a)
- H's adv HIBDAC reaches I
- PIC = (H,5): IHGAC is thru H → I installs PIC = (B,6): IBDAC & advertises to B & H.
- B's adv BIHGAC reaches D, F
- D updates PDC = (B,8): DBIHGAC (Just a local update)
- F updates PFC = (B,8): FBIHGAC & advertises to E
- w/ MRAI
- D & F have wrong (lower) cost metric, but will still follow the same path thru B.
34BGP Illustration (5b)
- B's adv BIHGAC reaches I
- PIC = (B,6): IBDAC is thru B → I withdraws PIC & has no path to C!
- w/ MRAI
- I will continue to use the nonworking path IBDAC. Same as having no path.
- I's adv IBDAC reaches B & H
- H changes its path to HIBDAC
- B's path is thru I, so B installs (C,10) & advertises to its neighbors D, F & I
35BGP Illustration (5c)
- F's update reaches E
- E updates its path locally.
- I's withdrawal of IBDAC reaches H (& also B)
- H withdraws the path HIBDAC & has no path to C!
- H's withdrawal of HIBDAC reaches G (& also I)
- G withdraws the path GHIBDAC & has no path to C!
- w/ MRAI
- Nonworking paths stay at E, H & G
36BGP Illustration (6) No MRAI
- B's adv (C,10) reaches D, F & I (in some order)
- D updates its path cost (B,11)
- F updates its path cost (B,11) & advertises PFC to E.
- I updates its path cost (B,13) & advertises PIC to H
- Final updates
- F's update FBC reaches E, which updates its path locally
- I's adv IBC reaches H
- H updates its path cost (I,14): HIBC & advertises PHC to G
- G does a local update
37BGP Illustration (5) w/ MRAI
- H's adv HIBDAC reaches I
- PIC = (H,5): IHGAC is thru H → I installs PIC = (B,6): IBDAC & advertises to B & H.
- I's adv IBDAC reaches B & H
- H changes its path to HIBDAC
- B's path is thru I, so B installs (C,10)
- When MRAI expires, B advertises to its neighbors D, F & I
- Note: If MRAI is large, path recovery gets delayed
38BGP Illustration (6) w/ MRAI
- B's adv (C,10) reaches D, F & I (in some order)
- D updates its path cost (B,11)
- F updates its path cost (B,11) & advertises PFC to E.
- I installs updated path IBC and advertises it to H
- Final updates: Same as for (6)
- W/ vs. w/o MRAI
- MRAI avoids some unnecessary path updates (less router load)
39BGP Convergence Delay Analysis
40Known Analytic Results
- Lots of work for isolated failures, none on large scale failures.
- Labovitz [1]: Convergence delay bound for full mesh networks
- O(n^3) for average case, O(n!) for worst case.
- Labovitz [2], Obradovic [3], Pei [8]
- Convergence delay ∝ Length of longest path involved
- Applies only for unit cost hops
- Griffin and Premore [4]
- V shaped curve of convergence delay wrt MRAI.
- No. of messages wrt MRAI decreases at a decreasing rate.
41Evaluation of Large Scale (LS) Failures
- Evaluation methods
- Primarily simulation. Analysis is intractable
- BGP Simulation Tools
- Several available, but simulation expense is the key!
- SSFNet: scalable, but max 240 nodes on 32-bit machine
- SSFNet default parameter settings
- MRAI jittered by 25% to avoid synchronization
- OSPFv2 used as the intra-domain protocol
42Topology Modeling
- Topology Generation: BRITE
- Enhanced to generate arbitrary degree distributions
- Heavy tailed, based on actual measurements.
- Approx 70% low & 30% high degree nodes.
- Mostly used 1 router/AS → Easier to see trends.
- Failure topology: Geographical placement
- Emulated by placing all AS routers and ASes on a 1000x1000 grid
- The area of an AS ∝ No. of routers in AS
43Convergence Delay vs. Failure Extent
- Initial rapid increase, then flattens out.
- Delays & increase rate both go up with network size
- → Large failures can pose a problem!
44Delay & Msg Traffic vs. MRAI
- Small networks in simulation →
- Optimal MRAI for isolated failures small (0.375 s).
- Main observations
- Larger failure → Larger MRAI more effective
45Convergence Delay vs. MRAI
- A V-shaped curve, as expected
- Curve flattens out as failure extent increases
- Optimal MRAI shifts to right with failure extent.
46Impact of AS Distance
- ASes more likely to be connected to other nearby ASes.
- b indicates the preference for shorter distances (smaller b → higher preference)
- Lower convergence delay for lower b.
47Improving BGP Convergence Delay
48Reducing Convergence Delays
- Many schemes, mostly evaluated for isolated failures
- Some popular schemes
- Ghost Flushing
- Consistency Assertions
- Root Cause Notification
- Our work (Large scale failure focused)
- Dynamic MRAI
- Batching
- Speculative Invalidation
49Ghost Flushing
- Bremler-Barr, Afek, Schwarz: Infocom 2003
- An adv. implicitly replaces old path
- GF withdraws old path immediately.
- Pros
- Withdrawals will cascade thru ntwk
- More likely to install new working routes
- Cons
- Substantial addl load on routers
- Flushing takes away a working route!
- Install BC →
- Routes at D, F, I via B will start working
- Flushing will take them away.
50Consistency Assertion
- Pei, Zhao, et al.: Infocom 2002
- If S has two paths SN1xD & SN2yN1xD, and the first path is withdrawn, then the second path is not used (considered infeasible).
- Pros
- Avoids trying out paths that are unlikely to be working.
- Cons
- Consistency checking can be expensive
[Figure: S reaches D via N1 and segment x, or via N2 and segment y to N1, then x]
51Root Cause Notification
- Pei, Azuma, Massey, Zhang: Computer Networks, 2004
- Modify BGP messages to carry root cause (e.g., node/link failure).
- Pros
- Avoid paths w/ failed nodes/links → substantial reduction in conv. delay.
- Cons
- Change to BGP protocol. Unlikely to be adopted.
- Applicability to large scale failures unclear (diagnosis difficult)
[Figure: example network with nodes A-I and link costs]
- D, E, G diagnose if A or link to A has failed.
- Propagate this info to neighbors
52Large Scale Failures: Our Approach
- What we can't or wouldn't do?
- No coordination between ASes
- Business issues, security issues, very hard to do, etc.
- No change to wire protocol (i.e., no new msg type).
- No substantial router overhead
- Solution applicable to both isolated & LS failures.
- What we can do?
- Change MRAI based on network and/or load parms
- e.g., degree dependent, backlog dependent, etc.
- Process messages (& generate updates) differently
53Key Idea: Dynamic MRAI
- Increase MRAI when the router is heavily loaded
- Reduces load of route changes.
- Relationship to large scale failure
- Larger failure size → Greater router loading → Larger MRAI more appropriate.
- Router load directed MRAI caters to all failure sizes!
- Implementation
- Queue length threshold based MRAI adjustment.
[Figure: queue length thresholds: increase th1, decrease th1, increase th2, decrease th2]
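One possible reading of the queue-length threshold scheme is a small state machine with separate increase and decrease thresholds per level, so the MRAI does not flap when the backlog hovers near a boundary. The class name, the three MRAI levels and all threshold values below are assumptions made for illustration; only the 0.375 s base value appears in the slides.

```python
class DynamicMrai:
    """Threshold-based MRAI adjustment with hysteresis (illustrative values)."""

    LEVELS = [0.375, 1.0, 2.0]  # MRAI in seconds; only 0.375 is from the slides
    INCREASE_TH = [20, 50]      # step up from level i when backlog exceeds this
    DECREASE_TH = [10, 30]      # step down to level i when backlog drops below this

    def __init__(self):
        self.level = 0

    def update(self, queue_len: int) -> float:
        """Adjust the level for the current update-queue length; return the MRAI."""
        if self.level < len(self.INCREASE_TH) and queue_len > self.INCREASE_TH[self.level]:
            self.level += 1
        elif self.level > 0 and queue_len < self.DECREASE_TH[self.level - 1]:
            self.level -= 1
        return self.LEVELS[self.level]


mrai = DynamicMrai()
trace = [mrai.update(q) for q in (5, 25, 15, 60, 25, 5)]
print(trace)
```

Keeping each decrease threshold below the matching increase threshold means a backlog of 15 after a burst of 25 leaves the MRAI at the higher level rather than dropping back immediately.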
54Dynamic MRAI: Effect on Delay
- Change wrt fixed MRAI = 0.375 secs.
- Improves convergence delay as compared to fixed values.
55Key Idea: Message Batching
- BGP default: FIFO message processing →
- Unnecessary processing, if
- A later update (already in queue) changes route to dest.
- Timer expiry before a later msg is processed.
- Relationship to large scale failure
- Significant batching (and hence batching advantage) likely for large scale failures only.
- Algorithm
- A separate logical queue/dest. allows processing of all updates to dest as a batch.
- >1 update from same neighbor → Delete older ones.
[Figure: FIFO queue of updates for destinations A, B and C regrouped into per-destination batches]
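The batching algorithm can be sketched as one logical queue per destination in which a newer update from a neighbor replaces that neighbor's older one. The class and method names and the sample updates are hypothetical; a real router would also have to respect MRAI timers and withdrawals, which this sketch ignores.

```python
from collections import OrderedDict, defaultdict


class BatchingQueue:
    """Per-destination logical queues; keeps only the latest update per neighbor."""

    def __init__(self):
        # dest -> OrderedDict(neighbor -> latest pending update)
        self._by_dest = defaultdict(OrderedDict)

    def enqueue(self, dest, neighbor, update):
        pending = self._by_dest[dest]
        pending.pop(neighbor, None)  # an older update from this neighbor is now moot
        pending[neighbor] = update

    def next_batch(self):
        """Return (dest, [(neighbor, update), ...]) for the oldest destination, or None."""
        if not self._by_dest:
            return None
        dest = next(iter(self._by_dest))
        pending = self._by_dest.pop(dest)
        return dest, list(pending.items())


q = BatchingQueue()
q.enqueue("A", "n1", "u1")
q.enqueue("B", "n1", "u2")
q.enqueue("A", "n2", "u3")
q.enqueue("A", "n1", "u4")  # supersedes u1
first = q.next_batch()
second = q.next_batch()
print(first, second)
```

All pending updates for destination A are handled in one pass, so the route to A is recomputed once instead of three times.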
56Batching: Effect on Delay
- Behavior similar to dynamic MRAI w/o actually making it dynamic
- Combination w/ dynamic MRAI works somewhat better.
57Key Idea: Speculative Invalidation
- Large scale failure
- A lot of route withdrawals for the failed AS, say X
- No. of withdrawn paths w/ AS X ∈ AS_path exceeds thres → Invalidate all paths containing X
- Implementation Issues
- Going through the routes for invalidation is inefficient
- Use output route filters at each node
- Threshold estimation → Computed (see paper)
- Reverting routes to valid state → time-slot based
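A minimal sketch of the counting idea: tally how many withdrawn paths contain each AS and treat an AS as suspect once its count passes a threshold. The class name, the fixed threshold and the strict "greater than" comparison are assumptions (the slides compute the threshold and do not give the exact rule), and the time-slot based reverting is omitted.

```python
from collections import Counter


class SpeculativeInvalidator:
    """Flags an AS as failed once enough withdrawn paths contain it."""

    def __init__(self, threshold: int):
        self.threshold = threshold       # assumed fixed here; computed in the paper
        self.withdraw_count = Counter()  # AS -> no. of withdrawn paths containing it
        self.suspect = set()

    def on_withdraw(self, as_path):
        for asn in set(as_path):
            self.withdraw_count[asn] += 1
            if self.withdraw_count[asn] > self.threshold:
                self.suspect.add(asn)

    def is_valid(self, as_path) -> bool:
        """Acts like a filter: a path through a suspect AS is not used."""
        return self.suspect.isdisjoint(as_path)


inv = SpeculativeInvalidator(threshold=2)
for path in (["B", "A", "C"], ["E", "A", "D"], ["G", "A", "F"]):
    inv.on_withdraw(path)
print(inv.suspect)
```

Checking paths against the suspect set at filter time avoids walking the whole routing table when an AS is invalidated, which is the inefficiency the slide warns about.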
58Effect of Invalidation
- Avoids exploring unnecessary paths
- Reduces conv. delay significantly, but
- May affect connectivity adversely.
- Implement only at nodes with degree 4 or higher
59Comparison of Various Schemes
- CA is the best scheme throughout!
- GF is rather poor
- Batching & dynamic MRAI do pretty well considering their simplicity
60Routing Recovery Metrics
61What's the right performance metric?
- Convergence delay
- Network centric, not user centric
- Instability in infrequently used routes is almost irrelevant
- User Centric Metrics
- Packet loss & packet delays
- Convergence delay does not correlate well with user centric metrics
62User Centric Metrics
- Frac of pkts lost or frac increase in pkt delay
- Pkt delay needs E2E monitoring → Impractical
- Metric computation
- Single number: Overall avg over routes & time
- Distribution wrt routes, time dependent rate, etc.
63Comparison between Schemes
- Comparisons
- Consistency assertion (CA), Ghost Flushing (GF), Speculative Invalidation (SI)
- All 3 schemes reduce conv delay substantially, but only CA can really reduce the pkt losses!
64How Schemes affect routes
- Cumulative time for which there is no valid path
- T_noroute: Time for which there is no route at all
- T_allinval: Time for which all neighbors advertise an invalid route
- T_BGPinval: Time for which BGP chooses an invalid route (even though some neighbor has a valid route).
- GF increases T_noroute the most, CA reduces T_allinval the most
65Changes to Reduce pkt Losses
- GF: Difficult to reduce T_noroute. Not attempted.
- CA: Use best route even if all of them are infeasible, but don't advertise infeasible routes.
- Improves substantially
- SI: Mark the route invalid probabilistically depending on fail count (instead of deterministically)
- Improves substantially
66Routing Misconfiguration
67BGP Configuration Faults
- Configuration parameters
- Which neighboring networks can send traffic?
- Where traffic enters and leaves the network?
- How routers within the network learn routes to external destinations?
- Potential Problems
- Invisible path: Valid route exists, but not made available
- Invalid route, e.g., routing loop
68Configuration Checking
- Fault checking by a tool called RCC
- N. Feamster & H. Balakrishnan, "Detecting BGP faults with static analysis", NSDI 2005
- Config faults in every AS, most related to lack of coordination
- Some faults could have global consequences
- Consistency checking required for each change in policies: hard!
69Conclusions & Open Issues
- Inter-domain routing does not perform very well for large scale failures.
- Considered several schemes for improvement. Room for further work.
- Convergence delay is not the right metric
- Defined pkt loss related metric & a simple scheme to improve it.
- Open Issues for large scale failures
- Analytic modeling of convergence properties.
- What aspects affect pkt losses & can we model them?
- Account for policies & AS relationships.
- Effective & efficient methods for misconfiguration detection.
70Domain Name System: Basics & Vulnerabilities
71DNS Infrastructure
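[Figure: applications (browser, e-mail, FTP) and the O/S call the resolver, which queries a DNS proxy server with a cache; the proxy walks the DNS tree (root; nz, au, sg; edu, gov; ips, sa, gb) to reach the authoritative DNS server]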
- Three levels of name resolution
- Client side (OS provided resolver)
- DNS proxy server (organization level)
- Authoritative server (serves a zone)
72DNS Basics
- DNS manages zones
- A set of names that are under the same authority
- E.g., ftp.acme.com and www.acme.com under acme.com
- DNS is a "lookup service"
- Simple queries → No search or 'best fit' answers
- Limited data expansion capability
- TTL (Time To Live): The time an RRSet can be cached/reused by a non-authoritative server
- Best matching records
- Iterative vs. recursive resolution
73TTL Values Used by DNS Proxies
- 2.7 Million names on dmoz.org
- Some values, e.g. 1 hr, 1 day, 2 days dominate
- Some extremely small values
- Large TTL
- Low overhead, more likely to be stale/incorrect
- Small TTL
- High overhead, but up to date.
[Figure: CDF of TTLs]
74DNS Resource Record (RR)
- Domain name
- (length, name) pairs, e.g., cisco.com → 05cisco03com00
- Record Types
- DNS Internal types
- Authority: NS, SOA; DNSSEC: DS, DNSKEY, RRSIG, etc.
- Many others: TSIG, TKEY, CNAME, DNAME, etc.
- Terminal RR
- Address records: A, AAAA
- Informational: TXT, HINFO, KEY, etc. (data carried to apps)
- Non Terminal RR
- MX, SRV, PTR, etc. w/ domain names resulting in further queries.
- Other fields
- RL: Record length, RDATA: IP address, referral, etc.
- TTL: Time To Live in a cache
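The (length, name) encoding of a domain name can be reproduced in a few lines. The function name is an illustrative choice; the sketch handles plain ASCII labels only and does not cover the root name or message compression.

```python
def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:  # a label is limited to 63 bytes
            raise ValueError(f"invalid label length in {name!r}")
        out.append(len(raw))
        out += raw
    out.append(0)  # zero-length label terminates the name
    return bytes(out)


encoded = encode_name("cisco.com")
print(encoded)
```

The result is the byte string written on the slide as 05cisco03com00: a length byte 5, "cisco", a length byte 3, "com", and a terminating zero.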
75DNS Attacks
- Inject incorrect RR into DNS proxy (poisoning)
- Capture query & send fake response before the real response
- Randomly send fake responses
- Query interception relatively easy
- UDP based → Don't need any context!
- DNS query uses 16-bit trans-id to connect query w/ response.
- Randomized in newer implementations, but attacker can generate a large number of replies.
- Response can include additional RRs
- Intercept updates to authoritative server
- Technically not poisoning, but a problem
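To see why a randomized 16-bit transaction id is weak against a flood of forged replies, the sketch below estimates the chance that at least one forgery matches. It assumes the attacker sends replies with distinct ids for a single outstanding query and that everything else in the reply already matches; the function name and the sample counts are illustrative.

```python
def spoof_success_probability(forged_replies: int, id_bits: int = 16) -> float:
    """Chance that one of the distinct guessed ids matches a uniformly random id."""
    space = 2 ** id_bits
    return min(forged_replies, space) / space


p_small = spoof_success_probability(100)
p_tenth = spoof_success_probability(6554)
print(f"{p_small:.4f} {p_tenth:.4f}")
```

Under these assumptions a few thousand forged replies per attempt already give about a one in ten chance, and the attacker can simply retry.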
76Poisoning Consequences
- Can be exploited in many ways
- Disallow name resolution
- Direct all traffic to small set of servers
- DDOS attack!
- Direct to a malicious server to collect info or drop malware
- Scale of attack depends on the level in the hierarchy!
- Poison propagates downwards
- Set large TTL to avoid expiry
- Actual scenario in Mar 05 (.com entry poisoned)
[Figure: proxy cache]
77Kaminsky DNS Attack
- Attack target: www.abc.com
- Poisoning of auth server response possible on TTL expiry in the proxy → hard
- Generate queries for fake x.abc.com (x = 1, 2, 3, ...)
- Supply fake responses with guessed TXID (ahead of auth server response)
- In fake response, delegate www.abc.com to a server of attacker's choosing
- Source port guessing is necessary for this attack
- Repeat until something works, say 83.abc.com
- The proxy-server now has a valid DNS record for 83.abc.com.
- Queries to www.abc.com are now directed to attacker's site
- Reference: http://www.doxpara.com/DMK_BO2K8.ppt
78DNSchanger Attack
- Attack via DHCP
- Doesn't exploit any security vulnerability
- Depends on ndisprot.sys driver installed on
infected box that generates fake DHCP server
offers.
- Attack Scenario
- Infected client X connects to some network N
- A benign client Y requests an IP address on N
- X responds w/ DHCP-offer that supplies rogue DNS server address.
- Y's DNS requests can now be translated to arbitrary IP addresses.
79DNS Robustness
80Making DNS Robust
- TSIG (symmetric key crypto)
- Intended for secure master ↔ slave proxy comm.
- Issues: Not general, Scalability
- DNSSEC
- Stops cache poisoning, but issues of overhead, infrastructure change, key mgmt, etc.
- Based on PKI; a symmetric key version also exists.
- Cooperative Lookup
- Direct requests to responsive clients (CoDNS)
- Distributed hash table (DHT) structure for DNS (CoDoNS)
- Cooperative checking between clients (DoX)
81PK-DNSSEC
- Auth. chain starts from root
- Parent signs child certificates (avoids lying about public key)
- Encrypted exchange also supplies signed public keys
- F: public key, f: private key
[Figure: DNS tree (root; nz, sg, au; gov, edu; ips, sa, gb) with the root certificate at the top; the DNS proxy sends Fgov(query) to gov and receives fgov(resp, Fgb), then sends Fgb(query) to gb and receives fgb(resp)]
82CoDoNS
- Organize DNS using DHT (distributed hash table).
- Enhances availability via distribution and replication
- Explicit version control to keep all copies current
- Issues
- DHT issues (equal capacity nodes)
- Explicit version control unscalable.
- Not directed towards poisoning control (but DNSSEC can be used)
83Domain Name Cross-referencing (DoX)
- Client peer groups
- Diversity & common interest based
- Peers agree to cooperate on verification of popular records.
- Mutual verification
- Assumes that authoritative server is not poisoned.
[Figure: Peer1-Peer4 verifying records with one another]
84Choosing Peers
- Give & get
- Give: A peer must participate in verification even if it is not interested in the address → Overhead
- Get: Immediate poison detection, high data currency.
- Selection of peers
- Topic channel w/ subscription by peers
- E.g. names under a Google/Yahoo directory
- Community channel, e.g., peers within the same org
- Minimizing overhead
- Verify only popular (perhaps most vulnerable) names
85DoX Verification
- Uses verification cache for efficiency
- Verification
- DNS copy (Rd) = verified copy (Rv) → Stop
- Else send (Ro,Rn) = (Rv,Rd) to all other peers
- At least m peers agree → Stop, else obtain authoritative copy Ra & if Ra != Rd, poison detected.
- Agreement procedure
- Involves local copy Rv & remotely received (Ro,Rn)
- If Rv = Rn → agree, else peer gets auth. copy Ra
- Several cases, e.g., if Rv = Ro, Ra = Rn → agree
- Verified copy was obsolete, got correct one now → Forced removal of obsolete copy
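The verification and agreement steps can be sketched as two small functions. Records are modeled as sets of IP addresses, peers as callables, and only the two agreement cases spelled out on the slide are implemented; every other case is treated as disagreement, which is an assumption. All names and the addresses other than 10.7.196.31 are hypothetical.

```python
def peer_agrees(rv, ro, rn, fetch_authoritative):
    """Peer side: compare local verified copy rv with the received (ro, rn)."""
    if rv == rn:
        return True
    ra = fetch_authoritative()
    # Case from the slide: our verified copy was the old one and the
    # authoritative server confirms the new one.
    return rv == ro and ra == rn


def verify(rd, rv, peers, m, fetch_authoritative):
    """Requester side: return 'ok' or 'poisoned' for a freshly resolved copy rd."""
    if rd == rv:
        return "ok"
    agreeing = sum(1 for peer in peers if peer(rv, rd))  # sends (Ro, Rn) = (Rv, Rd)
    if agreeing >= m:
        return "ok"
    ra = fetch_authoritative()
    return "ok" if ra == rd else "poisoned"


old = frozenset({"10.7.196.31"})
new = frozenset({"10.7.196.32"})    # hypothetical legitimate update
forged = frozenset({"192.0.2.66"})  # hypothetical poisoned answer


def make_peers(auth_copy, count=3):
    return [lambda ro, rn: peer_agrees(old, ro, rn, lambda: auth_copy)
            for _ in range(count)]


poison_result = verify(forged, old, make_peers(old), 2, lambda: old)
update_result = verify(new, old, make_peers(new), 2, lambda: new)
print(poison_result, update_result)
```

A legitimate change is confirmed by peers after they consult the authoritative server, while a forged answer gets no agreement and fails the final authoritative check.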
86Handling Multiple IPs per name
- DNS directed load distribution
- Easily handled with set comparison
- Multiple Views
- Used to differentiate inside/outside clients
- All peers should belong to same view (statically or by trial & error).
- Content Distribution Networks (CDNs)
- Same name translates to different IP addresses in different regions
- Need a flowset based IP address comparison
87Results: Normal DNS
- Poison spreads in the cache
- More queries are affected
88Results: DoX
- Poison removed immediately
89DoX vs. DNSSEC
90DNS Delegation
- Delegation
- Allows portions of name space to be managed separately
- Often used for geographical diversity & redundancy
- How it works
- sales.dell.com delegated to domain pc1.com
- Client needs to follow DNS tree to resolve pc1.com (unless there is a glue record)
[Figure: DNS tree: root; .com; dell.com, pc1.com, pc2.com; support.dell.com, sales.dell.com; parts.sales.dell.com]
91Delegation Related Failures
- Delegation Characteristics
- Any domain server may delegate its subspace further
- Delegation done manually w/o any global visibility.
- Delegation related problems
- Potential for long delegation chains & long resolution delays
- Lame delegation: Child fails to inform parent of IP address change
- Cyclic dependencies
- Greatly amplified opportunities for compromise, poisoning & delegation to rogue servers.
92Exploiting Delegation
- How do we exploit delegation to increase DNS resilience?
- Diversity: physical, geographical and organizational
- Careful selection of nodes to delegate to
- Active monitoring of delegation related anomalies.
- Delegation Sentry
- Overlay based monitoring of delegations
- Each DNS server monitored by a selected set of peers
- Need to establish policies for selecting delegate
- Based on reputation, availability, capacity, location, domain, chain length, etc.
- Mechanisms to defeat compromised servers.
- Automated checking of problems such as lame delegation, cyclic dependencies, etc.
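One of the automated checks mentioned above, detecting cyclic dependencies, amounts to finding a cycle in a graph whose edges point from a zone to the zones it depends on for resolution. The sketch below uses a depth-first search; the function name and the sample dependency edges are hypothetical and do not describe any real delegation.

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of zones, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(node, path):
        color[node] = GREY
        path.append(node)
        for nxt in depends_on.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:  # reached a zone already on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt, path)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in depends_on:
        if color.get(start, WHITE) == WHITE:
            cycle = visit(start, [])
            if cycle:
                return cycle
    return None


# Hypothetical dependencies: each zone needs the next one to be resolvable.
cyclic = {
    "sales.dell.com": ["pc1.com"],
    "pc1.com": ["pc2.com"],
    "pc2.com": ["sales.dell.com"],
}
cycle = find_cycle(cyclic)
print(cycle)
```

The same traversal also yields delegation chain lengths, so a monitor could flag overly long chains in the same pass.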
93Conclusions
- DNS has numerous vulnerabilities & is easy to attack
- Several proposed solutions, none entirely satisfactory
- Large deployed base resists significant overhaul
- Securing DNS remains a challenge
- Combine the best of DNSSEC, CoDoNS & DoX.
- Automated detection and correction of delegation problems.
94That's all folks! Questions?
95BGP References
- A.L. Barabasi and R. Albert, "Emergence of Scaling in Random Networks", Science, pp. 509-512, Oct. 1999.
- A. Basu, C.L. Ong, et al., "Route oscillations in IBGP with route reflection", in Proc. ACM SIGCOMM (Pittsburgh, PA, Aug. 2002).
- A. Bremler-Barr, Y. Afek, and S. Schwarz, "Improved BGP convergence via ghost flushing", in Proc. IEEE INFOCOM 2003, vol. 2, San Francisco, CA, Mar 2003, pp. 927-937.
- S. Deshpande and B. Sikdar, "On the Impact of Route Processing and MRAI Timers on BGP Convergence Times", in Proc. GLOBECOM 2004, Vol. 2, pp. 1147-1151.
- L. Gao, T.G. Griffin, J. Rexford, "Inherently safe backup routing with BGP", in Proc. IEEE INFOCOM (Anchorage, AK, Apr. 2001).
- T.G. Griffin and B.J. Premore, "An experimental analysis of BGP convergence time", in Proc. ICNP 2001, Riverside, California, Nov 2001, pp. 53-61.
- T.G. Griffin, F.B. Shepherd, G. Wilfong, "The stable paths problem and inter-domain routing", IEEE/ACM Transactions on Networking 10, 1 (2002), 232-243.
- F. Hao, S. Kamat, and P. V. Koppol, "On metrics for evaluating BGP routing convergence", Bell Laboratories Tech. Rep., 2003.
- C. Labovitz, G. R. Malan, and F. Jahanian, "Internet Routing Instability", IEEE/ACM Transactions on Networking, vol. 6, no. 5, pp. 515-528, Oct. 1998.
- C. Labovitz, A. Ahuja, et al., "Delayed internet routing convergence", in Proc. ACM SIGCOMM 2000, Stockholm, Sweden, Aug 2000, pp. 175-187.
- C. Labovitz, A. Ahuja, et al., "The Impact of Internet Policy and Topology on Delayed Routing Convergence", in Proc. IEEE INFOCOM 2001, vol. 1, Anchorage, Alaska, Apr 2001, pp. 537-546.
96BGP References
- A. Lakhina, J.W. Byers, et al., "On the Geographic Location of Internet Resources", IEEE Journal on Selected Areas in Communications, vol. 21, no. 6, pp. 934-948, Aug. 2003.
- R. Mahajan, D. Wetherall, T. Anderson, "Understanding BGP misconfiguration", in Proc. ACM SIGCOMM (Pittsburgh, PA, Aug. 2002), pp. 3-17.
- A. Medina, A. Lakhina, et al., "BRITE: Universal topology generation from a user's perspective", in Proc. MASCOTS 2001, Cincinnati, Ohio, Aug 2001, pp. 346-353.
- N. Feamster & H. Balakrishnan, "Detecting BGP faults with static analysis", Proc. of NSDI 2005.
- D. Obradovic, "Real-time Model and Convergence Time of BGP", in Proc. IEEE INFOCOM 2002, vol. 2, New York, June 2002, pp. 893-901.
- D. Pei, et al., "A study of packet delivery performance during routing convergence", in Proc. DSN 2003, San Francisco, CA, June 2003, pp. 183-192.
- D. Pei, B. Zhang, et al., "An analysis of convergence delay in path vector routing protocols", Computer Networks, vol. 30, no. 3, Feb. 2006, pp. 398-421.
- D. Pei, X. Zhao, et al., "Improving BGP convergence through consistency assertions", in Proc. IEEE INFOCOM 2002, vol. 2, New York, NY, June 23-27, 2002, pp. 902-911.
- Y. Rekhter, T. Li, and S. Hares, "Border Gateway Protocol 4", RFC 4271, Jan. 2006.
- J. Rexford, J. Wang, et al., "BGP routing stability of popular destinations", in Proc. Internet Measurement Workshop 2002, Marseille, France, Nov. 6-8, 2002, pp. 197-202.
- A. Sahoo, K. Kant, and P. Mohapatra, "Characterization of BGP recovery under Large-scale Failures", in Proc. ICC 2006, Istanbul, Turkey, June 11-15, 2006.
97BGP References
- A. Sahoo, K. Kant, and P. Mohapatra, "Improving BGP Convergence Delay for Large Scale Failures", in Proc. DSN 2006, June 25-28, 2006, Philadelphia, Pennsylvania, pp. 323-332.
- A. Sahoo, K. Kant, and P. Mohapatra, "Speculative Route Invalidation to Improve BGP Convergence Delay under Large-Scale Failures", in Proc. ICCCN 2006, Arlington, VA, Oct. 2006.
- A. Sahoo, K. Kant, and P. Mohapatra, "Improving Packet Delivery Performance of BGP During Large-Scale Failures", submitted to Globecom 2007.
- SSFNet: Scalable Simulation Framework. [Online]. Available: http://www.ssfnet.org/
- W. Sun, Z. M. Mao, K. G. Shin, "Differentiated BGP Update Processing for Improved Routing Convergence", in Proc. ICNP 2006, Santa Barbara, CA, Nov. 12-15, 2006, pp. 280-289.
- H. Tangmunarunkit, J. Doyle, et al., "Does Size Determine Degree in AS Topology?", ACM SIGCOMM, vol. 31, issue 5, pp. 7-10, Oct. 2001.
- R. Teixeira, S. Agarwal, and J. Rexford, "BGP routing changes: Merging views from two ISPs", ACM SIGCOMM, vol. 35, issue 5, pp. 79-82, Oct. 2005.
- B. Waxman, "Routing of Multipoint Connections", IEEE Journal on Selected Areas in Communications, vol. 6, no. 9, pp. 1617-1622, Dec. 1988.
- B. Zhang, R. Liu, et al., "Measuring the internet's vital statistics: Collecting the internet AS-level topology", ACM SIGCOMM, vol. 35, issue 1, pp. 53-61, Jan. 2005.
- B. Zhang, D. Massey, and L. Zhang, "Destination Reachability and BGP Convergence Time", in Proc. GLOBECOM 2004, vol. 3, Dallas, TX, Nov 3, 2004, pp. 1383-1389.
98DNS References
- R. Arends, R. Austein, et al., "DNS Security Introduction and Requirements", RFC 4033, 2005.
- G. Ateniese & S. Mangard, "A new approach to DNS security (DNSSEC)", in Proc. 8th ACM conf on comp & comm system security, 2001.
- D. Atkins & R. Austein, "Threat analysis of the domain name system", http://www.rfc-archive.org/getrfc.php?rfc=3833, August 2004.
- R. Curtmola, A. D. Sorbo, G. Ateniese, "On the performance and analysis of DNS security extensions", in Proceedings of CANS, 2005.
- M. Theimer & M. B. Jones, "Overlook: Scalable name service on an overlay network", in Proc. 22nd ICDCS, 2002.
- K. Park, V. Pai, et al., "CoDNS: Improving DNS performance and reliability via cooperative lookups", in Proc. 6th Symp on OS design & impl., 2004.
- V. Pappas, Z. Xu, S. Lu, D. Massey, A. Terzis, and L. Zhang, "Impact of configuration errors on DNS robustness", SIGCOMM CCR, vol. 34, no. 4, pp. 319-330, 2004.
- L. Yuan, K. Kant, et al., "DoX: A peer-to-peer antidote for DNS cache poisoning attacks", in Proc. IEEE ICC, 2006.
- L. Yuan, K. Kant & P. Mohapatra, "A proxy view of quality of domain name service", in IEEE Infocom 2007.
- V. Ramasubramanian and E. G. Sirer, "Perils of transitive trust in the Domain Name System", in Proc. International Measurement Conference, 2005.
- V. Ramasubramanian & E.G. Sirer, "The design & implementation of next generation name service for internet", SIGCOMM 2004.
99Backup
100Quality of DNS Service (QoDNS)
- Availability
- Measures if DNS can answer the query.
- Prob of correct referral when record is not cached.
- Accuracy
- Prob of hitting a stale record in proxy cache
- Poison Propagation
- Prob(poison at leaf level at time t | level k poisoned at t = 0)
- Latency: Additional time per query
- Overhead: Additional msgs/BW per query
101Computation of Metrics
- Modification at authoritative server
- Copy obsolete, but proxy not aware until TTL expires & a new query forces a reload
[Figure: timeline of query hits and misses at the proxy relative to a modification at the authoritative server]
- XR: Residual time of query arrival process
- MR: Residual time of modification process
- Y: Inter-miss period = TTL + XR
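Reading the last bullet as Y = TTL + XR, the inter-miss period is easy to evaluate once an arrival process is fixed. The sketch below assumes Poisson query arrivals, so the residual time XR has mean 1/rate; that assumption, the function names and the sample numbers are illustrative and not taken from the slides.

```python
def expected_inter_miss(ttl: float, query_rate: float) -> float:
    """E[Y] = TTL + E[XR], with E[XR] = 1/rate under Poisson arrivals (assumed)."""
    return ttl + 1.0 / query_rate


def miss_fraction(ttl: float, query_rate: float) -> float:
    """Long-run share of queries that miss: one miss per period against
    an expected rate * TTL hits while the record is cached."""
    return 1.0 / (1.0 + query_rate * ttl)


TTL_SECONDS = 3600.0  # hypothetical 1 hr TTL
QUERY_RATE = 0.01     # hypothetical: one query every 100 s on average

y = expected_inter_miss(TTL_SECONDS, QUERY_RATE)
miss = miss_fraction(TTL_SECONDS, QUERY_RATE)
print(y, miss)
```

The miss stream of one proxy computed this way is what feeds the query arrival process of its parent in the hierarchy.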
102Dealing with Hierarchy
[Figure: DNS tree fragment: .gov at level h-1; .sa, .ips, .gb at level h]
- A miss at a node → a query at its parent
- Superposition of miss processes of children → query arrival process of parent
- Recursively derive the arrival process bottom-up
103Deriving QoDNS Metrics
- Accuracy
- Prob. leaf record is current
- (Un)Availability
- Prob. BMR is Obsolete referral
- Latency
- RTT x no. of referrals
- Overhead
- Related to referrals for current RRs & tries for obsolete RRs
[Figure: three DNS tree examples (nodes .sg, .nz, .au, .gov, .edu, .sa, .ips, .gb, .abc, .xys, .qwe) showing the best matching record (BMR) as current, as an obsolete referral, and as an obsolete record]