Title: The Design and Implementation of a Next Generation Name Service for the Internet
1The Design and Implementation of a Next
Generation Name Service for the Internet
- Leo Bhebhe
- leo.bhebhe_at_nokia.com
2Contents
- Introduction
- Name Servers and Name Resolution
- Current Issues With Domain Name System (DNS)
- Cooperative Domain Name System (CoDoNS)
- Performance measurements
- Summary/Conclusions
3Introduction
- Analyzes the current Domain Name System (DNS): the limitations of its structure, its bottlenecks, and its performance issues
- Proposes a new Cooperative Domain Name System (CoDoNS) to replace the old DNS
Domain Name System (DNS) | Cooperative Domain Name System (CoDoNS)
Slow | High lookup performance through proactive caching
Vulnerable to denial of service (DoS) attacks | Resilience to denial of service attacks through automatic load-balancing
Does not support fast updates | Fast propagation of updates
4Name Servers and Name Resolution
[Figure: name resolution example for flits.cs.vu.nl]
5Current Issues With Domain Name System (DNS)
- Network failures
- Susceptible to denial of service (DoS) attacks
- A small number of nameservers serve each domain, so redundancy in name servers is limited
- At the network level:
- 80% of domain names are served by just two servers
- 0.8% are served by only one
- 32% of servers are connected to the Internet through a single gateway (risk of a serious outage or DoS)
- At the top of the hierarchy:
- Small number of servers (targets for DoS)
- A recent DoS attack severely affected the availability of Microsoft's web services for hours
- Checked by performing traceroutes to 10,000 different nameservers, which serve 5,000 randomly chosen names, from 50 globally distributed sites on PlanetLab
6Current Issues With Domain Name System (DNS)
- Network failures
- Failure resilience: implementation errors
- 20% of name server implementations contain security flaws
- 18% of servers do not respond to version queries
- 14% do not report valid BIND versions
- 2% of the nameservers have the tsig bug, which permits a buffer overflow that can enable malicious agents to gain access to the system
- 19% of the servers have the negcache problem, which can be exploited to launch a DoS attack by providing negative responses with a large TTL value from a malicious server
- Checked by surveying 150,000 servers for known vulnerabilities in the Berkeley Internet Name Domain (BIND) software
7Current Issues With Domain Name System (DNS)
- Performance: system latency makes DNS unsuitable for dynamic updates
- Name-to-address translation in the DNS incurs long delays. The legacy DNS relies on aggressive caching to reduce the latency of query resolution
- But explosive growth of the namespace has decreased the effectiveness of caching
- Short timeouts (TTLs) reduce DNS cache hit rates
- Increase in load imbalance: root and TLD nameservers handle a large load
- Configuration errors, such as broken (lame) or inconsistent delegations, can introduce latent performance problems
- Update propagation: with large-scale caching, it is hard to maintain the consistency of cached records in the presence of dynamic changes (TTL-based expiry)
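The effect of short TTLs on cache hit rates can be illustrated with a minimal sketch. The cache class, the query schedule, and the TTL values below are hypothetical and chosen only for illustration; they are not measurements from the study.

```python
class TTLCache:
    """Minimal resolver cache: an entry is served until its TTL expires."""

    def __init__(self):
        self._expiry = {}  # name -> absolute expiry time in seconds

    def lookup(self, name, now, ttl):
        """Return True on a cache hit; on a miss, refetch and cache for ttl seconds."""
        if self._expiry.get(name, float("-inf")) > now:
            return True
        self._expiry[name] = now + ttl
        return False


def hit_rate(ttl, query_times, name="example.com"):
    """Fraction of queries answered from the cache for a given TTL."""
    cache = TTLCache()
    hits = sum(cache.lookup(name, t, ttl) for t in query_times)
    return hits / len(query_times)


# Hypothetical workload: one query every 10 seconds for 10 minutes.
QUERY_TIMES = list(range(0, 600, 10))
long_ttl_rate = hit_rate(300, QUERY_TIMES)   # TTL of 5 minutes
short_ttl_rate = hit_rate(5, QUERY_TIMES)    # TTL shorter than the query interval
```

With a TTL shorter than the interval between queries, every lookup misses, which is the situation the slide describes for records published with short timeouts.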
8Cooperative Domain Name System (CoDoNS)
- CoDoNS derives its performance characteristics from a proactive caching layer called Beehive
- Beehive automatically replicates the DNS mappings throughout the network to match anticipated demand and provides a strong performance guarantee
- Achieves a targeted average lookup latency with a minimum number of replicas
- Beehive is a proactive replication framework that enables prefix-matching DHTs to achieve O(1) lookup performance
- Pastry and Tapestry are examples of structured DHTs that use prefix-matching to look up objects
- Example:
- If the identifier of the record is, e.g., 110011, Pastry stores the value at the host whose identifier is closest to it (the home node)
- If proactive caching is used for one level, the record is copied to all hosts whose identifier begins with 11001
- For two levels, it is copied to all hosts whose identifier begins with 1100, and so on
- The home node periodically receives information about the usage of the record and decides how many levels to cache the record at
- Thus the most frequently used records can be disseminated to almost all nodes, while rarely used ones are stored in relatively few places
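The prefix example above can be sketched as follows. This is a minimal illustration, assuming a hypothetical network with one node for every 6-bit identifier; the function name and the node set are not part of Beehive or Pastry.

```python
from itertools import product


def replica_nodes(record_id, node_ids, levels):
    """Nodes that hold a copy when the record is cached `levels` levels out.

    Following the slide's example, each extra level drops one trailing digit
    from the prefix that a node identifier must share with the record.
    """
    prefix = record_id[: len(record_id) - levels]
    return [n for n in node_ids if n.startswith(prefix)]


# Hypothetical network: one node for every 6-bit identifier (64 nodes).
NODES = ["".join(bits) for bits in product("01", repeat=6)]
RECORD = "110011"

for levels in range(7):
    holders = replica_nodes(RECORD, NODES, levels)
    print(levels, len(holders))
```

In this base-2 setting each additional level doubles the number of replicas, so a record cached at every level ends up on all nodes while an uncached record stays on the home node only.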
9Cooperative Domain Name System (CoDoNS)
- Replicating every object at every node would achieve O(1) lookups, but would:
- Incur excessive space overhead
- Consume significant bandwidth
- Lead to large update latencies
- Beehive minimizes bandwidth by posing the following optimization problem:
- Minimize the total number of replicas subject to the constraint that the aggregate lookup latency is less than a desired constant C
- For power-law (Zipf-like) query distributions, Beehive analytically derives the optimal closed-form solution to this problem
- The final expression minimizes the total number of replicas for Zipf-like query distributions with parameter α < 1
- In that expression, b is the base of the underlying DHT and x_i is the fraction of the most popular objects that get replicated at level i
- Selecting the appropriate C enables applications to achieve any targeted average lookup latency
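A minimal sketch of how such a closed form can be evaluated. The expression itself is not on the slide; the code assumes the form x_i = [d^i (k' - C) / (1 + d + ... + d^(k'-1))]^(1/(1-α)) with d = b^((1-α)/α), where k' is the number of levels that are not fully replicated. The latency model, which takes the top fraction x of objects under a Zipf-like distribution to attract about x^(1-α) of the queries, is likewise an assumption, and all parameter values are hypothetical.

```python
def replication_fractions(b, alpha, C, k_prime):
    """Assumed closed form: fraction x_i of objects replicated at levels 0..k_prime-1."""
    d = b ** ((1 - alpha) / alpha)
    denom = sum(d ** j for j in range(k_prime))
    fractions = [
        (d ** i * (k_prime - C) / denom) ** (1 / (1 - alpha))
        for i in range(k_prime)
    ]
    # k_prime must be small enough that every value is a valid fraction.
    assert all(0 < x <= 1 for x in fractions), "choose a smaller k_prime"
    return fractions


def average_hops(fractions, alpha):
    """Assumed model: a query misses level i with probability 1 - x_i^(1 - alpha)."""
    return sum(1 - x ** (1 - alpha) for x in fractions)


# Hypothetical parameters: base-16 DHT, Zipf parameter 0.9, target of 1 hop.
B, ALPHA, TARGET_C, K_PRIME = 16, 0.9, 1.0, 3
xs = replication_fractions(B, ALPHA, TARGET_C, K_PRIME)
achieved = average_hops(xs, ALPHA)
```

Under these assumptions the fractions grow with the level, so only a small share of the most popular objects is replicated widely, and the achieved average matches the target C.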
10Cooperative Domain Name System (CoDoNS)
- CoDoNS architecture
- Consists of globally distributed nodes that self-organize to form a peer-to-peer network
- CoDoNS designates the node whose identifier is closest to the consistent hash of the domain name as the home node for that domain name
- The home node stores a permanent copy of the resource records owned by that domain name and manages their replication
- If the home node fails, the next closest node in the identifier space automatically becomes the new home node
- CoDoNS replicates all records on several nodes adjacent to the home node in the identifier space in order to avoid data loss due to node failures
- It can serve as a backup for legacy DNS, as well as a complete replacement
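Home node selection and failover can be sketched as below. This is an illustration only: the use of SHA-1, the 32-bit identifier space, the node names, and the function names are assumptions, not details taken from the CoDoNS implementation.

```python
import hashlib

ID_BITS = 32  # hypothetical identifier width, kept small for readability
ID_SPACE = 1 << ID_BITS


def identifier(key):
    """Map a string to a point in the circular identifier space via SHA-1."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ID_SPACE


def ring_distance(a, b):
    """Shortest distance between two identifiers on the ring."""
    diff = abs(a - b)
    return min(diff, ID_SPACE - diff)


def closest_nodes(domain, live_nodes, count=1):
    """Live nodes ordered by closeness to the hash of the domain name."""
    target = identifier(domain)
    ranked = sorted(
        live_nodes,
        key=lambda n: (ring_distance(identifier(n), target), n),
    )
    return ranked[:count]


NODES = ["node-%d.example.net" % i for i in range(16)]  # hypothetical hosts

# The closest node is the home node; the next ones hold backup copies.
home, *backups = closest_nodes("www.example.com", NODES, count=4)

# If the home node fails, the next closest live node takes over.
survivors = [n for n in NODES if n != home]
new_home = closest_nodes("www.example.com", survivors)[0]
```

Because the backups are exactly the nodes adjacent to the home node in the identifier space, the node that takes over after a failure already holds a copy of the records.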
11CoDoNS Deployment and The Process of Query
Resolution
- Clients send DNS queries to the local CoDoNS server
- The local CoDoNS server obtains records from the home node or an intermediate node
- The local CoDoNS server then responds to the client
- In the background, the home node interacts with the legacy DNS to keep records fresh and propagate updates to cached copies
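The resolution steps above can be sketched as a small simulation. It is a simplification under stated assumptions: a single home node, no intermediate nodes, a dictionary standing in for the legacy DNS, and hypothetical class and method names.

```python
class HomeNode:
    """Holds the permanent copy of records and keeps them fresh."""

    def __init__(self, legacy_dns):
        self.legacy_dns = legacy_dns  # stand-in for legacy DNS: name -> address
        self.records = {}

    def fetch(self, name):
        """Return the record, reading it from legacy DNS on first use."""
        if name not in self.records:
            self.records[name] = self.legacy_dns[name]
        return self.records[name]

    def refresh(self, name, replicas):
        """Re-read the record from legacy DNS and push changes to cached copies."""
        fresh = self.legacy_dns[name]
        if fresh != self.records.get(name):
            self.records[name] = fresh
            for server in replicas:
                if name in server.cache:
                    server.cache[name] = fresh


class LocalServer:
    """The CoDoNS server that clients query directly."""

    def __init__(self, home):
        self.home = home
        self.cache = {}
        self.remote_fetches = 0

    def resolve(self, name):
        """Answer from the local cache, else obtain the record from the home node."""
        if name not in self.cache:
            self.cache[name] = self.home.fetch(name)
            self.remote_fetches += 1
        return self.cache[name]


legacy = {"www.example.com": "192.0.2.10"}  # hypothetical record
home_node = HomeNode(legacy)
local = LocalServer(home_node)

first = local.resolve("www.example.com")    # fetched from the home node
second = local.resolve("www.example.com")   # answered locally

legacy["www.example.com"] = "192.0.2.20"    # the record changes in legacy DNS
home_node.refresh("www.example.com", [local])
third = local.resolve("www.example.com")    # updated copy, no new fetch
```

The client only ever talks to the local server; freshness is handled by the home node in the background rather than by cache expiry at the client side.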
12Issues and Implications
- CoDoNS uses cryptographic delegations and self-verifying records based on the DNSSEC standard
- DNSSEC uses public key cryptography to enable authentication of resource records
- Every namespace operator has a public-private key pair
- The private key is used to digitally sign DNS records managed by that operator
- The corresponding public key is in turn certified by a signature from a domain higher up in the hierarchy
- The signature and the public key are stored in DNS as resource records of type SIG and KEY, respectively
- The use of cryptographic certificates enables any client to check the authenticity of a record independently, and keeps peers in the network from forging certificates
- To speed up certificate verification, CoDoNS servers cache the certificates along with the resource records and provide them to the clients
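The chain of signatures described above can be illustrated with a toy example. It uses textbook RSA without padding, which is insecure and is not the DNSSEC wire format; the key sizes, record contents, and function names are hypothetical and serve only to show how a client checks a record against a trusted parent key. It requires Python 3.8 or later for the modular inverse.

```python
import hashlib


def make_keypair(p, q, e=65537):
    """Textbook RSA key pair from two primes (toy only, no padding)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (n, e), (n, pow(e, -1, phi))  # (public key, private key)


def digest(data, n):
    """SHA-256 digest of the data, reduced into the RSA modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data, private_key):
    n, d = private_key
    return pow(digest(data, n), d, n)


def verify(data, signature, public_key):
    n, e = public_key
    return pow(signature, e, n) == digest(data, n)


def key_bytes(public_key):
    """Serialize a public key so that it can itself be signed."""
    return repr(public_key).encode("ascii")


# Hypothetical parent and child zone keys built from known Mersenne primes.
parent_pub, parent_priv = make_keypair(2**107 - 1, 2**127 - 1)
child_pub, child_priv = make_keypair(2**61 - 1, 2**89 - 1)

record = b"www.example.com. A 192.0.2.10"
record_sig = sign(record, child_priv)                    # role of a SIG record
child_key_sig = sign(key_bytes(child_pub), parent_priv)  # parent certifies the child key


def verify_chain(record, record_sig, child_pub, child_key_sig, trusted_parent_pub):
    """Accept the record only if the child key is certified and the record is signed by it."""
    key_ok = verify(key_bytes(child_pub), child_key_sig, trusted_parent_pub)
    return key_ok and verify(record, record_sig, child_pub)


chain_ok = verify_chain(record, record_sig, child_pub, child_key_sig, parent_pub)
```

A peer that alters the record, or that signs it with a key the parent never certified, fails one of the two checks, which is why records can be served from untrusted caches.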
13Cumulative Distribution of Look Up Latency
- CoDoNS achieves low latencies for name resolution
- More than 50% of queries incur no network delay, as they are answered from the local CoDoNS cache
- This is because proactive replication pushes responses for the most popular domain names to all CoDoNS servers
14Median latency
- CoDoNS latency decreases significantly as proactive caching takes effect in the background
- Initially, CoDoNS servers have an empty cache and redirect most of the queries to legacy DNS
- Consequently, they incur higher latencies than the legacy DNS
- But as resource records are fetched from legacy DNS and replication in the background pushes records to other CoDoNS servers, the latency decreases significantly
15Median resolution lookup latency in CoDoNS
- A flash crowd is introduced at 6 hours
- CoDoNS detects the flash crowd quickly and adapts the amount of caching to counter it while continuing to provide high performance
- Beehive's proactive replication in the background detects the changes in popularity, adjusts the number of replicas, and decreases the lookup latency
16Load Balance
- CoDoNS handles flash crowds by balancing the query load uniformly across nodes
- The graph quantifies load balance as the ratio of the standard deviation to the mean of the query load across all nodes
- At the start of the experiment, the query load is highly unbalanced, since the home nodes of popular domain names receive a far greater number of queries than average
- The imbalance is significantly reduced as the records for popular domains get replicated in the system
- Even when a flash crowd is introduced at the six hour mark, dynamic changes in caching keep the load balanced after a temporary increase in load variance
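The load balance metric from the graph can be computed as in this minimal sketch. The per-node query counts are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def load_imbalance(queries_per_node):
    """Ratio of the standard deviation to the mean of per-node query load."""
    return pstdev(queries_per_node) / mean(queries_per_node)


# Hypothetical loads: one home node absorbs most queries before replication.
before_replication = [960, 10, 10, 10, 10]
after_replication = [200, 200, 200, 200, 200]

imbalance_before = load_imbalance(before_replication)
imbalance_after = load_imbalance(after_replication)
```

A value near zero means the queries are spread evenly; a large value means a few nodes, such as the home nodes of popular names, carry most of the load.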
17Update Propagation Time
- CoDoNS incurs low latencies for propagating updates
- 98% of replicas get updated within one second
- It takes a few seconds longer to update some replicas due to high variance in network delays and loads at some hosts
- The latency to update 99% of replicas one hop from the home node is about one second
- Overall, update propagation latency in CoDoNS depends on the extent of replication of records
18Summary/Conclusions
- Performance measurements from a planetary-scale deployment against a real workload indicate that CoDoNS can provide low latencies for query resolution
- Massive replication for the most popular records, but a modest number of replicas per server, achieves high performance with low overhead
- Eliminating the static query processing hierarchy and shedding load dynamically onto peer nodes greatly decreases the vulnerability of CoDoNS to denial of service attacks
- Self-organization and continuous adaptation of replication avoid bottlenecks in the presence of flash crowds
- Proactive update propagation ensures that unanticipated changes can be quickly disseminated and cached in the system
- The Cooperative Domain Name System (CoDoNS) is proposed as an alternative to DNS