1
Naming
  • Introduction to Distributed Systems
    CS 457/557, Fall 2008
    Kenneth Chiu

2
Entities, Names, IDs, and Addresses
  • A name is a sequence of bits that can be used to
    refer to an entity.
  • Entities: the things that we want to refer to.
  • What properties are desired in a name?
  • Location independence
  • Easy to remember
  • Suppose I have the name of an entity. Can I now
    operate on it immediately? Suppose I have your
    name and want to access you?
  • To operate on an entity, it is necessary to
    access it via an access point. Names of access
    points are addresses.
  • Can an entity have more than one access point?
  • Can an entity's access point change over time?
  • Are names unique? Permanent?
  • Addresses unique? Permanent?

3
  • IDs are also a kind of name. What properties do
    they have?
  • Refers to at most one entity
  • Each entity has at most one ID
  • ID is permanent
  • Examples?
  • What is your name? What is your address? What is
    your ID?
  • SSN, phone number, passport number, street
    address, e-mail address.
  • Can we substitute one for the other?
  • Use your phone number as your name?
  • Use your SSN as your name?
  • Use your name as your SSN? Phone number?

4
Central Question
  • How to resolve names to addresses?

5
Naming vs. Locating
  • DNS works well for static situations, where
    addresses don't change very often.
  • What if we assume that they do?
  • Suppose we want to change ftp.cs.binghamton.edu
    to ftp.cs.albany.edu. How?
  • Change the IP address for ftp.cs.binghamton.edu.
  • Make a symbolic link from ftp.cs.binghamton.edu
    to ftp.cs.albany.edu. In other words, put an
    entry in DNS that says ftp.cs.binghamton.edu has
    been renamed to ftp.cs.albany.edu.
  • Compare and contrast?
  • If the first is done over long distances, latency
    may be high. It could also be a bottleneck, since
    it is centralized.
  • Adding indirection via a symbolic link can
    create very long chains. Chain management is an
    issue.

6
  • A better solution is to divide this into separate
    naming and location services.
  • How many levels of naming are used in the typical
    Internet?

(Figure: direct, single-level mapping between names
and addresses, vs. two-level mapping using identities
and a separate naming and location service.)
7
Name Spaces
  • The name space is the way that names in a
    particular system are organized. This also
    defines the set of all possible names.
  • Examples?
  • Phone numbers
  • Credit card numbers
  • DNS
  • Human names in the US
  • Files in UNIX, Windows
  • URLs

8
Flat Naming
9
Flat Naming
  • Given an unstructured name (ID), how do we locate
    the access point?
  • Broadcasting
  • Forwarding
  • Home-based
  • Distributed hash tables
  • Hierarchical location service

10
Broadcast/Multicast Location
  • How does Ethernet addressing work?
  • MAC address
  • How does a host learn the IP-to-MAC address
    mapping?
  • ARP: it broadcasts a request and gets an answer.
  • Disadvantage?
  • Could waste bandwidth if network is large.
  • Interrupts hosts to check if they are the one
    being sought.
  • What if multicast instead of broadcast?
  • Can also be used to find best replica.

11
Forwarding
  • Question: How does the post office deal with
    mobility?
  • When entity moves, leave a reference.
  • Disadvantages?
  • Chain too long if lots of movement.
  • All intermediate locations have to maintain the
    forwarding.
  • Vulnerable to failure.
  • Performance is bad.
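The forwarding-pointer scheme on this and the following slides can be sketched as follows; the locations (P3, P4, P5), the object name, and the `resolve` helper are all invented for illustration. The shortcut cache anticipates the redirection idea shown shortly.

```python
# Hypothetical sketch of forwarding pointers: when an entity moves, the old
# location keeps a pointer to the new one; a lookup follows the chain, and a
# client stub can then cache a shortcut past the intermediate hops.

locations = {
    "P3": ("moved", "P4"),       # entity moved from P3 to P4 ...
    "P4": ("moved", "P5"),       # ... and then on to P5
    "P5": ("here", "object-O"),
}
shortcuts = {}                   # client-side cache of resolved locations

def resolve(loc):
    """Follow the chain of forwarding pointers; install a shortcut."""
    start, hops = loc, 0
    loc = shortcuts.get(loc, loc)
    while True:
        kind, value = locations[loc]
        if kind == "here":
            shortcuts[start] = loc        # shortcut for future lookups
            return value, hops
        loc, hops = value, hops + 1

obj, hops = resolve("P3")        # first lookup walks the whole chain
obj2, hops2 = resolve("P3")      # second lookup uses the shortcut
```

The first resolution pays for every intermediate hop; the cached shortcut makes later lookups direct, which is exactly why long chains and chain maintenance are the scheme's weak points.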

12
Forwarding Pointers in SSP
  • Object originally in P3, then moved to P4. P1
    passes object reference to P2.
  • Do we always need to go through the whole chain?

13
Shortcut
  • Redirecting a forwarding pointer, by storing a
    shortcut in a client stub.

How to deal with broken chains?
14
Home-Based Approaches
  • Let a home keep track of where the entity is.
    Example: Mobile IP.
  • All hosts use a fixed IP address (essentially
    functioning as an ID) as a home location (or home
    address). The home location is registered with a
    naming service.
  • A home agent monitors this address.
  • When entity moves, it registers a foreign address
    as the care-of address.
  • Clients send to home location first. When home
    agent receives a packet, it tunnels it to current
    care-of address, and also responds back to client
    with current location.
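A minimal sketch of the home-based scheme described above; the addresses are hypothetical and the tunneling, agent discovery, and registration security of real Mobile IP are omitted.

```python
# Toy model of home-based location: clients always contact the fixed home
# address; the home agent forwards ("tunnels") to the registered care-of
# address and reports the entity's current location back.

care_of = {}                    # home address -> current care-of address

def register(home, foreign):
    """Entity moved: register the foreign address as its care-of address."""
    care_of[home] = foreign

def send(home, packet):
    """Home agent tunnels to the care-of address (or delivers locally)."""
    current = care_of.get(home, home)
    return {"delivered_to": current, "current_location": current,
            "payload": packet}

register("130.37.0.5", "192.31.231.42")     # hypothetical IP addresses
reply = send("130.37.0.5", "hello")
```

The reply carries the current location, so a client can bypass the home on subsequent sends, matching the two-tiered refinement discussed next.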

15
(No Transcript)
16
  • Disadvantages?
  • Home address has to be supported as long as
    entity is alive.
  • The home address is fixed; what happens when the
    move is permanent?
  • What if entity is local, but home is far away?
  • Try a two-tiered scheme, first see if the entity
    is local, then try the home.
  • Solution for moves?
  • Use a naming service to find the home.

17
Distributed Hash Tables
  • Chord
  • Organize all nodes in a ring.
  • Each node is assigned a random m-bit ID.
  • Each entity is assigned a unique m-bit key.
  • The entity with key k is managed by the node with
    the smallest id >= k, called the successor,
    succ(k).
  • Simple solution?
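The "simple solution" hinted at, where every node knows only its successor, can be sketched like this. The ring of node IDs is an assumed example (chosen to match the key values 19, 26, and 12 used on the next slides), and `linear_lookup` is an illustrative name.

```python
# Simple Chord lookup: every node knows only its successor, so a lookup
# walks the ring linearly -- O(N) hops in the worst case.

m = 5
nodes = sorted([1, 4, 9, 11, 14, 18, 20, 21, 28])   # example nodes on a 2^m ring

def succ(k):
    """The node managing key k: smallest node id >= k, wrapping around."""
    for n in nodes:
        if k <= n:
            return n
    return nodes[0]          # wrapped past the largest id

def linear_lookup(start, k):
    """Walk successor pointers until reaching the node managing k."""
    ring = {nodes[i]: nodes[(i + 1) % len(nodes)] for i in range(len(nodes))}
    node, hops = start, 0
    while node != succ(k):
        node = ring[node]
        hops += 1
    return node, hops

assert succ(19) == 20        # node 1 asked for key 19: node 20 is responsible
```

This answers the next slide's question: node 1 simply forwards along the ring until it reaches succ(19) = 20, which motivates finger tables as a shortcut structure.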

18
Node 1 receives request to find key 19. What does
it do?
(Figure: a Chord ring with possible key values and
actual nodes, showing pred(p) and succ(p+1).)
19
  • Finger tables
  • Each node p maintains a finger table FT_p with at
    most m entries, where FT_p[i] = succ(p + 2^(i-1)),
    i.e., entry i points to the first node succeeding
    p by at least 2^(i-1).
  • To look up a key k, node p forwards the request
    to the node with index j satisfying
    FT_p[j] <= k < FT_p[j+1].
  • If p < k < FT_p[1], then the request is also
    forwarded to FT_p[1].
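The finger-table routing rule can be sketched on the same example ring; the node IDs, modular-interval helper, and function names are all illustrative, but the resulting routes match the resolutions shown on the following slides (key 26 from node 1, key 12 from node 28).

```python
# Sketch of Chord finger-table routing on an example ring of 2^5 key values.

m = 5
SIZE = 2 ** m
nodes = sorted([1, 4, 9, 11, 14, 18, 20, 21, 28])

def succ(k):
    """First actual node with id >= k, wrapping around the ring."""
    k %= SIZE
    for n in nodes:
        if n >= k:
            return n
    return nodes[0]

def finger_table(p):
    """FT_p[i] = succ(p + 2^(i-1)), for i = 1..m."""
    return {i: succ(p + 2 ** (i - 1)) for i in range(1, m + 1)}

def in_interval(x, a, b):
    """True if x lies in the half-open ring interval (a, b]."""
    x, a, b = x % SIZE, a % SIZE, b % SIZE
    if a < b:
        return a < x <= b
    return x > a or x <= b

def lookup(p, k):
    """Route a lookup for key k starting at node p; return nodes visited."""
    path = [p]
    while p != succ(k):
        ft = finger_table(p)
        if in_interval(k, p, ft[1]):
            nxt = ft[1]                   # the successor manages k
        else:
            nxt = ft[1]
            for j in range(m, 0, -1):     # closest preceding finger in (p, k)
                if in_interval(ft[j], p, k - 1):
                    nxt = ft[j]
                    break
        path.append(nxt)
        p = nxt
    return path
```

With these tables, resolving key 26 from node 1 visits 1, 18, 20, 21, 28, and resolving key 12 from node 28 visits 28, 4, 9, 11, 14.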

20
(Figure: a Chord ring with possible key values,
actual nodes, and each node's finger table.)
21
  • Resolving key 26 from node 1 and key 12 from node
    28 in a Chord system.

22
Lookup
  • The primary task is, given a key, to look up the
    node responsible for storing the value of that
    key.
  • Typically, the key might be the hash of a file,
    for example.
  • Each node uses its finger table to decide which
    node to forward the request to.

23
(Figure: resolving k = 26 from node 1 and k = 12 from
node 28 using the finger tables; entry i of node p's
table points to succ(p + 2^(i-1)).)
24
Joining
  • How does a node join?

25
(Figure: inserting a new node with id 24.)
1. New node asks any node to look up succ(p+1).
2. New node informs succ(p+1) that it itself is the
   new predecessor.
26
(Figure: inserting a new node with id 24, continued.)
1. New node asks any node to look up succ(p+1).
2. New node informs succ(p+1) that it itself is the
   new predecessor.
3. New node builds its own finger table by doing
   successive lookups.
27
  • Maintaining connectivity
  • Nodes periodically contact succ(p+1) and ask it
    to return pred(succ(p+1)).
  • If that is p itself, all is fine.
  • If different, what happened?
  • If different, some node q must have joined. p
    will set FT_p[1] to q, then contact q to ask for
    its predecessor.
  • Nodes periodically contact pred(p) to see if it
    is still alive.
  • If dead, pred(p) is set to null.
  • If a node discovers that pred(succ(p+1)) is null,
    then it informs succ(p+1) that its predecessor
    is very likely to be p.
  • Maintaining finger tables
  • Nodes periodically look up succ(p + 2^(i-1)).
  • As long as it's not too far out of whack, lookups
    will still succeed efficiently.
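The join-and-repair steps above can be sketched with a toy in-process model; the `Node` class, `between` helper, the example IDs, and the fixed number of stabilization rounds are all illustrative assumptions.

```python
# Sketch of Chord stabilization: each node p asks succ(p) for succ(p)'s
# predecessor; if a newly joined node slipped in between, p adopts it as
# its successor and then notifies that successor.

SIZE = 32

class Node:
    def __init__(self, ident):
        self.id = ident
        self.succ = self       # FT_p[1]
        self.pred = self

def between(x, a, b):
    """True if x lies strictly between a and b going clockwise."""
    x, a, b = x % SIZE, a % SIZE, b % SIZE
    if a < b:
        return a < x < b
    return x > a or x < b

def stabilize(p):
    x = p.succ.pred
    if x is not None and between(x.id, p.id, p.succ.id):
        p.succ = x                         # node x joined between p and succ(p)
    s = p.succ                             # notify succ(p): p may be its pred
    if s.pred is None or between(p.id, s.pred.id, s.id):
        s.pred = p

# A stable ring 1 -> 9 -> 14 -> 1:
n1, n9, n14 = Node(1), Node(9), Node(14)
n1.succ, n9.succ, n14.succ = n9, n14, n1
n1.pred, n9.pred, n14.pred = n14, n1, n9

# Node 4 joins: it only looks up its successor; pointers start inconsistent.
n4 = Node(4)
n4.succ, n4.pred = n9, None

for _ in range(3):                         # periodic stabilization repairs all
    for n in (n1, n4, n9, n14):
        stabilize(n)
```

After a few rounds the pointers converge to 1 -> 4 -> 9 -> 14, without any global coordination.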

28
  • Locality: Chord ignores the network topology.
  • Topology-based assignment: when assigning an ID
    to a node, make sure that nodes close in ID space
    are also close in the network.
  • How do network failures impact this?
  • Proximity routing: maintain more than one
    alternative for each entry in the table.
    Currently, each entry is the first node in the
    range [p + 2^(i-1), p + 2^i). It can point to
    multiple nodes, scattered in network space.
  • Proximity neighbor selection: if there is a
    choice as to neighbors, select the closest one.

29
Hierarchical Approaches
  • Build a large-scale search tree by dividing the
    network into hierarchical domains.
  • Each domain has a directory node.
  • Generalization of
  • First try the local registry, then try the home.
  • Can generalize to multiple tiers.

30
  • Hierarchical organization of a location service
    into domains, each having an associated directory
    node.
  • Is this a hierarchical namespace?
  • Note that namespace is still flat!

31
  • Each entity in domain D has a location record in
    dir(D).
  • The location record in the leaf domain contains
    the current address.
  • The location record in a non-leaf domain contains
    a reference to the directory node of the correct
    next lower domain.

(Figure: the root R and directory nodes N1 and N2 each
hold a location record for E pointing to the next
lower node; the leaf record holds the address of E,
which the client looks up.)
32
  • An example of storing information of an entity
    having two addresses in different leaf domains.

33
  • Looking up a location in a hierarchically
    organized location service.
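The up-then-down lookup can be sketched on a toy domain tree; the domain names, the record contents, and E's address are invented for the example.

```python
# Hierarchical location lookup: the client starts at its local leaf
# directory and walks up until some node has a record for E, then follows
# the downward chain of references to the leaf storing E's address.

parent = {"leaf-A": "N1", "leaf-B": "N2", "N1": "root", "N2": "root"}
# records[d] maps entity -> address (at a leaf) or -> next lower directory
records = {
    "root":   {"E": "N2"},
    "N2":     {"E": "leaf-B"},
    "leaf-B": {"E": "addr-of-E"},
}
leaves = {"leaf-A", "leaf-B"}

def lookup(start, entity):
    node, hops = start, 0
    while entity not in records.get(node, {}):   # walk up toward the root
        node, hops = parent[node], hops + 1
    while node not in leaves:                    # follow references down
        node, hops = records[node][entity], hops + 1
    return records[node][entity], hops

addr, hops = lookup("leaf-A", "E")   # up via N1 to root, down via N2 to leaf-B
```

Because the upward walk stops at the first node that knows E, a lookup for a nearby entity never reaches the root: this is the locality property discussed below.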

34
  • Each entity in domain D has a location record in
    dir(D).
  • The location record in the leaf domain contains
    the current address.
  • The location record in a non-leaf domain contains
    a reference to the directory node of the correct
    next lower domain.

(Figure: the root R and directory nodes N1 and N2 each
hold a location record for E pointing to the next
lower node; the leaf record holds the address of E,
which the client looks up.)
35
  • Inserting a replica: the request is forwarded to
    the first node that knows about entity E.

36
  • A chain of forwarding pointers to the leaf node
    is created.

37
  • Delete operations are analogous.
  • Directory node dir(D) in leaf domain is requested
    to remove entry for E.
  • If dir(D) has no more entries for E (no more
    replicas), it then requests its parent to also
    remove the entry pointing to dir(D).

38
Why Is This Any Better?
  • Let's play Devil's Advocate and compare this to
    straightforward solutions.
  • Exploits locality
  • Search expands in a ring.
  • As an entity moves, it usually is a local
    operation.

39
Pointer Caching
  • How effective is caching addresses?
  • Depends on degree of mobility.
  • If very mobile, what can we do to help?
  • If D is the smallest domain that E moves around
    in, can cache dir(D).
  • Called pointer caching.

40
  • Caching a reference to a directory node of the
    lowest-level domain in which an entity will
    reside most of the time.

41
  • If a replica is inserted locally, caches should
    be updated to point to the local replica.

42
Scalability
  • Where is the bottleneck here?
  • Root has to store everything.
  • How to address?
  • Federate/distribute the root (multiple roots).
  • Each root is responsible for a subset.
  • Cluster solution?
  • Also distribute geographically.

43
Scalability Issues
  • Locating subnodes correctly is a challenge.

44
Structured Names
45
Naming Graph
  • Path, local name, absolute name
  • Should it be a tree, DAG, allow cycles?

46
Name Spaces (2)
  • The general organization of the UNIX file system
    implementation on a logical disk of contiguous
    disk blocks.

47
Name Resolution
  • Looking up a name (finding the value) is called
    name resolution.
  • Closure mechanism (where to start)
  • How to find the root node, for example.
  • Examples: file systems, ZIP codes, DNS.

48
Aliases
  • Aliases
  • Can be hard.
  • Can be soft, like a forwarding address.

49
  • Naming graph for symbolic link.

50
Merging Namespaces
  • How can we merge namespaces? Are there any
    issues?
  • Mounting
  • Can be used to merge namespaces.
  • In the hierarchical case, what is needed is a
    special directory node that jumps to the other
    namespace.

51
Linking and Mounting
  • Consider a collection of hierarchical namespaces
    distributed across different machines.
  • Each namespace implemented by different server.
  • Information required to mount a foreign name
    space in a distributed system
  • The name of an access protocol.
  • The name of the server.
  • The name of the mounting point in the foreign
    name space.

52
  • Mounting remote name spaces through a specific
    process protocol.
  • How do you access steen's mbox from Machine A?

Network
53
Name Space Implementation
  • Name spaces always map names to something.
  • DNS maps what to what?
  • Can be divided into three layers
  • Global layer: doesn't change very often.
  • Administrational layer: a single organization,
    like a department or division.
  • Managerial layer: changes regularly, such as a
    local area network.

54
Global layer
Administrational layer
Managerial layer
55
Name Server Characteristics
  • A comparison between name servers for
    implementing nodes from a large-scale name space
    partitioned into a global layer, an
    administrational layer, and a managerial layer.

56
Name Resolution
  • A name resolver looks up names.
  • How about a simple hash table?
  • Bottleneck if just one.
  • Replicate?
  • How do you update a record? Every single replica?
  • The idea is that you want to distribute the load,
    but do it in the right way.
  • Assume that we use a hierarchical name space.

57
Iterative Name Resolution
  • Consider the name root:<nl, vu, cs, ftp, pub,
    globe, index.txt>.

58
Recursive Name Resolution
  • Which loads root server more?
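Both styles can be modeled on a toy name space; the zone contents and the final address are made up. The contrast makes the answer concrete: iteratively, the client contacts every server itself, while recursively the root (and each server below it) must do forwarding work on the client's behalf.

```python
# Toy name space: each server resolves one label and names the next server.
zones = {
    "root":      {"nl": "nl-server"},
    "nl-server": {"vu": "vu-server"},
    "vu-server": {"cs": "cs-server"},
    "cs-server": {"ftp": "192.31.231.42"},   # illustrative final address
}

def iterative(labels):
    """The client itself contacts every server along the path."""
    server, contacted = "root", []
    for label in labels:
        contacted.append(server)
        server = zones[server][label]
    return server, contacted

def recursive(server, labels):
    """Each server forwards the remaining name on the client's behalf."""
    answer = zones[server][labels[0]]
    if len(labels) == 1:
        return answer
    return recursive(answer, labels[1:])

addr1, contacted = iterative(["nl", "vu", "cs", "ftp"])
addr2 = recursive("root", ["nl", "vu", "cs", "ftp"])
```

Both return the same address, but in the recursive case the client sends a single request and the servers carry the rest of the load.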

59
Recursion and Caching
  • Recursive name resolution of <nl, vu, cs, ftp>.
    Name servers cache intermediate results for
    subsequent lookups.
  • Is iterative or recursive better for caching?

60
Name Resolution Communication Costs
  • The comparison between recursive and iterative
    name resolution with respect to communication
    costs.

61
Name Resolution Communication Costs
  • A comparison of iterative vs. recursive

(Figure: recursive name resolution (R1, R2, R3) vs.
iterative name resolution (I1, I2, I3) between the
client and the name servers for the nl, vu, and cs
nodes; the iterative case repeats the long-distance
communication.)
62
Example DNS
  • The name space is a tree of nodes.
  • A label can be at most 63 characters.
  • The maximum name length is 255 characters.
  • Path names are represented in two ways
  • root:<nl, vu, cs, flits>
  • flits.cs.vu.nl.
  • Subtree is a domain. Path name is a domain name.
  • A node contains a collection of resource records.
  • A zone is the part of the tree that a nameserver
    is responsible for.
  • A domain is made up of one or more zones.

63
Resource Records
64
DNS Implementation
  • Each zone is managed by a name server.

65
Node Contents
  • An excerpt from the DNS database for the zone
    cs.vu.nl.

66
(No Transcript)
67
DNS Subdomains
  • Part of the description for the vu.nl domain
    which contains the cs.vu.nl domain.

68
Decentralized DNS
  • Basic idea: take the DNS name, hash it, and use a
    DHT to find the key.
  • Disadvantage?
  • Pastry
  • Prefixes of keys are used to route to nodes.
  • Each digit is taken from base b.
  • Suppose you have base 4. A node with ID 3210 is
    responsible for all keys with prefix 321. It
    keeps the following table.

69
  • Suppose it receives a lookup request for 3123?
    1000?
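A sketch of prefix routing for these two lookups; the node set and the deterministic tie-breaking rule are made up, and real Pastry additionally uses numeric closeness and leaf sets, which are omitted here.

```python
# Prefix routing sketch, base 4: each hop forwards to some node that shares
# at least one more digit of prefix with the key than the current node does.

nodes = ["3210", "3102", "3001", "2023", "1302", "0123"]   # invented IDs

def shared_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current, key):
    """Any node whose shared prefix with key is strictly longer."""
    plen = shared_prefix_len(current, key)
    candidates = [n for n in nodes if shared_prefix_len(n, key) > plen]
    return min(candidates, default=None)   # deterministic pick for the sketch

def route(start, key):
    path, cur = [start], start
    while True:
        nxt = next_hop(cur, key)
        if nxt is None:
            return path        # no closer node: cur is responsible for key
        path.append(nxt)
        cur = nxt
```

A lookup for 3123 improves the matched prefix one digit per hop (3..., 31...) until no node matches further; a lookup for 1000 stops after one hop because no node extends the prefix "1".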

70
Replication
  • The main problem with this is that there is going
    to be a lot of hops.
  • Replicate to higher levels. For example, key 3211
    is replicated to all nodes having prefix 321.
  • What happens if you replicate everything?
  • Suppose you want to guarantee that on average, it
    takes C hops? Which keys should be replicated?

71
Distribution
  • How are queries distributed? Are some more common
    than others? What does it look like?
  • The Zipf distribution says that the frequency of
    the n-th ranked item is proportional to
    1/n^alpha, with alpha being close to 1.
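A quick numeric illustration of the skew; the rank cutoff and alpha = 1.0 are arbitrary choices for the example.

```python
# Zipf's law: frequency of the n-th most popular item is ~ 1/n^alpha.
alpha = 1.0
weights = [1 / n ** alpha for n in range(1, 6)]
total = sum(weights)
freqs = [w / total for w in weights]   # normalized query frequencies
# the top-ranked item is requested five times as often as the fifth-ranked
```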

72
Selective Replication
  • Assume a Zipf distribution of queries; then the
    formula above shows the fraction of most popular
    keys that should be replicated at level i. d is
    based on alpha and the base b. N is the total
    number of nodes. C is the desired hop count.

73
Example
  • Example: assume that you want an average of one
    hop, with base b = 4, alpha = 0.9, N = 10,000,
    and 1,000,000 records.
  • 61 most popular should be replicated at level 0.
  • 284 next most popular should be replicated at
    level 1.
  • 1323 next most popular should be replicated at
    level 2.
  • 6177 next most popular should be replicated at
    level 3.
  • 28826 next most popular should be replicated at
    level 4.
  • 134505 next most popular should not be replicated.

74
Attribute-Based Naming
75
Directory Services
  • Sometimes you want to search for things based on
    some kind of description of them.
  • Usually known as directory services.
  • Coming up with attributes can be hard.
  • Resource Description Framework (RDF) is
    specifically designed for this.

76
Example LDAP
  • DNS resolves a name to a node in the namespace
    graph.
  • LDAP is a directory service, which allows more
    general queries.
  • Consists of a set of records.
  • Each record is a list of attribute-value pairs,
    possibly with multiple values per attribute.

77
LDAP Directory Entry
  • A simple example of an LDAP directory entry using
    LDAP naming conventions.
  • /C=NL/O=Vrije Universiteit/OU=Math. & Comp. Sc.

78
  • Collection of all entries is a directory
    information base (DIB).
  • Each naming attribute is a relative distinguished
    name (RDN).
  • The RDNs, in sequence, can be used to form a
    directory information tree (DIT).

79
Hierarchical Implementations LDAP (2)
  • Part of a directory information tree.

80
Children Nodes
  • Two directory entries having Host_Name as RDN.

81
Using DHTs for attribute-value searches
  • So far, we have assumed that the search is
    centralized.

82
Lookups
  • To do a lookup, represent as a path, then hash
    the path.

83
Range Queries
  • Divide key into two parts, name and value.
  • Hash the name. Assume that a group of servers is
    responsible for that.
  • Each server in the group is responsible for a
    range.
  • A resource described by two attribute-value pairs
    must be stored under both of them.
  • Example: movies made after 1980 with a rating of
    four to five stars.
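A sketch of the scheme for the movie example; the server names, attributes, and value ranges are invented, and the hashing of the attribute name to a server group is assumed rather than shown.

```python
# Range queries: the attribute name selects a server group (via hashing in
# a real system); within a group, each server covers a sub-range of values.

groups = {
    "year":   [("s1", 1900, 1970), ("s2", 1970, 2000), ("s3", 2000, 2030)],
    "rating": [("s4", 0, 3), ("s5", 3, 6)],
}

def servers_for(attr, lo, hi):
    """All servers whose value range [a, b) overlaps the query [lo, hi)."""
    return [s for (s, a, b) in groups[attr] if a < hi and lo < b]

# movies made after 1980 with a 4-5 star rating: query each attribute's
# group, then intersect the resource sets the two sides return
year_servers = servers_for("year", 1980, 2030)
rating_servers = servers_for("rating", 4, 6)
```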

84
Semantic Overlay Networks
  • Maintaining a semantic overlay through gossiping.

85
Garbage Collection
86
Garbage Collection
  • How do you handle a server object that is unused?
  • Can it be deleted by the server?
  • More context.

87
Unreferenced Objects
  • An example of a graph representing objects
    containing references to each other.

88
Example
  • class Class2 { };
    class Class1 { public: Class2 *cls2; };
    Class2 *global = new Class2;
    void foo() {
        Class1 *obj = new Class1;
        obj->cls2 = new Class2;
        global = new Class2;
    }

89
Reference Counting
  • How to avoid the double-counting?

90
Passing a Reference
  • Copying a reference to another process and
    incrementing the counter too late
  • A solution.

91
Weighted Reference Counting
  • Can we avoid sending increment messages?

(Figure: plain reference counting: copying a reference
requires an increment message to the skeleton, and
deleting a reference requires a decrement message.)
92
Weighted References
  • Think of each reference as a token. If you give
    each reference multiple tokens, then it can hand
    those out without contacting the skeleton.

Step 1: A reference is created with weight 2.
Step 2: A copy of the reference is made; the weight
is divided by two.
Step 3: A reference is deleted; a "decrement by 1"
message is sent.
93
Weighted References
  • Works correctly in the simple case.

Step 1: A reference is created with weight 2.
Step 2: The reference is deleted; a "decrement by 2"
message is sent.
94
Total and Partial Weights
  • To work in real situation, the skeleton keeps
    track of the initial total weight that is
    available.
  • Each proxy/stub then keeps track of how much
    weight it is carrying (the partial weight).
  • When a proxy is duplicated, the partial weight is
    halved.
  • When a proxy is deleted, a decrement message is
    sent.
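The mechanism just described can be sketched as follows; the class names and the initial total weight of 64 are illustrative choices.

```python
# Weighted reference counting sketch: the skeleton records the total weight
# handed out; proxies split their partial weight when copied and send one
# decrement message when deleted -- no increment messages are ever needed.

class Skeleton:
    def __init__(self, total):
        self.total = total

    def decrement(self, weight):
        self.total -= weight
        return self.total == 0       # True: object can be collected

class Proxy:
    def __init__(self, skeleton, weight):
        self.skeleton, self.weight = skeleton, weight

    def copy(self):
        """Hand half our partial weight to the copy; skeleton not contacted."""
        half = self.weight // 2
        self.weight -= half
        return Proxy(self.skeleton, half)

    def delete(self):
        return self.skeleton.decrement(self.weight)

sk = Skeleton(total=64)
p1 = Proxy(sk, 64)
p2 = p1.copy()               # p1: 32, p2: 32 -- no message sent
p3 = p2.copy()               # p2: 16, p3: 16
```

Deleting p3, then p1, then p2 sends decrements of 16, 32, and 16; only the last one brings the total to zero, so the object is collected exactly when the final reference disappears.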

95
Weighted Reference Counting
  • The initial assignment of weights in weighted
    reference counting
  • Weight assignment when creating a new reference.

96
Passing a Weighted Ref Count
  • Weight assignment when copying a reference.

97
Indirection
  • Creating an indirection when the partial weight
    of a reference has reached 1.

98
Generation Reference Counting
  • Simple reference counting requires a message for
    incrementing and a message for decrementing.
  • Can we somehow combine those into one message?
    Maybe delay the increment somehow?
  • Generation reference counting gets rid of one of
    the messages (the increment message).
  • Basic idea is to try to defer the increment until
    we actually decrement.

99
Delayed Incrementing
1. A proxy (C = 0) and a skeleton with count 1.
2. The proxy creates two more proxies. The ref count
   at the skeleton is not updated, but the first
   proxy keeps track of the fact that it created two
   other proxies (C = 2).
3. The first proxy is deleted. It sends a message
   saying it created two other proxies, so increment
   the ref count by two before decrementing by one.
100
A Problem
1. A proxy (C = 0) and a skeleton with count 1.
2. The proxy creates two more proxies. The ref count
   at the skeleton is not updated, but the first
   proxy keeps track of the fact that it created two
   other proxies (C = 2).
3. The third proxy is deleted. It has not created
   any proxies, so it just sends a message to
   decrement by one. This causes the object to be
   improperly deleted.
101
Generations
  • Proxies have generations, as in humans.

Generation 0: one proxy (G = 0, C = 2).
Generation 1: two proxies (G = 1, C = 2 and
G = 1, C = 1).
Generation 2: three proxies (G = 2, C = 0).
102
Generational Ref Counting
  • Creating and copying a remote reference in
    generation reference counting.

103
Deleting a Proxy
  • The skeleton maintains a table G[i], which counts
    the references for generation i.
  • When a proxy is deleted, a message is sent with
    its generation number k and its number of copies
    c.
  • The skeleton decrements G[k] and increments
    G[k+1] by c.
  • Only when the table is all 0 is the skeleton
    deleted.
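The table update can be sketched as follows; the class and method names are made up. Note how deleting a generation-1 proxy before its generation-0 creator leaves a negative entry, which is exactly what prevents the premature deletion shown two slides earlier.

```python
# Generation reference counting sketch: the skeleton keeps G[i] per
# generation i. Deleting a proxy of generation k that made c copies sends
# one message (k, c): G[k] -= 1 and G[k+1] += c. Collect when all zero.

from collections import defaultdict

class Skeleton:
    def __init__(self):
        self.G = defaultdict(int)
        self.G[0] = 1                  # the initial proxy, generation 0

    def proxy_deleted(self, generation, copies):
        self.G[generation] -= 1
        self.G[generation + 1] += copies
        return all(v == 0 for v in self.G.values())   # True: collect object

sk = Skeleton()
# The generation-0 proxy made two generation-1 copies. One gen-1 copy is
# deleted first (it made no copies of its own): G becomes {0:1, 1:-1, 2:0},
# so the object is correctly kept alive despite the "early" decrement.
r1 = sk.proxy_deleted(1, 0)
r2 = sk.proxy_deleted(0, 2)            # gen-0 proxy deleted, reporting 2 copies
r3 = sk.proxy_deleted(1, 0)            # last gen-1 proxy deleted
```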

104
Deleting a Proxy
1. A proxy (G = 0, C = 0) and a skeleton with
   G[0] = 1.
2. The proxy creates two more proxies (it becomes
   G = 0, C = 2; the copies are G = 1, C = 0). The
   ref count at the skeleton is not updated.
3. The third proxy is deleted. It has not created
   any proxies, so it just sends "decrement G[1] by
   1". Its generation number differs from the first
   proxy's, so the skeleton now has G[0] = 1,
   G[1] = -1 and does not delete the object.
105
Refresh Distributed Garbage Collection
  • The problem: how do you discover when no one
    needs a server object?
  • One solution Develop distributed versions of
    garbage collection algorithms.

106
Reference Listing
  • Distributed reference counting is tricky because
    of failures.
  • If you send an increment, how do you know it
    arrived?
  • What if the ack is lost?
  • If you can design it so that it doesn't matter
    how many times you send a message, then it is
    simpler.
  • This is called idempotency. An idempotent
    operation can be done many times without negative
    effect.
  • Are these idempotent?
  • Withdrawing 50 from your bank account.
  • Cancelling a credit card account.
  • Registering for a course.

107
Idempotent Reference Counting
  • How can we make reference counting idempotent?
  • What turns non-idempotent registration into
    idempotent registration?
  • Keep track of which proxies have been created in
    the skeleton.
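A sketch of the idempotent variant; the class and method names are made up. Keeping a set of proxies instead of a counter means a retransmitted "register" message changes nothing.

```python
# Reference listing sketch: the skeleton tracks the set of proxies, not a
# count. set.add is idempotent, so register messages can be resent safely.

class Skeleton:
    def __init__(self):
        self.proxies = set()

    def register(self, proxy_id):
        self.proxies.add(proxy_id)     # duplicate registrations are harmless

    def unregister(self, proxy_id):
        self.proxies.discard(proxy_id)
        return not self.proxies        # True: no references remain

sk = Skeleton()
sk.register("P1")
sk.register("P1")                      # retransmission: still one entry
sk.register("P2")
```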

108
Reference Listing
(Figure: a skeleton keeping a reference list {P1, P2}
for proxies P1 and P2.)
  • Keep a list of proxies in the skeleton.
  • Failures can be handled with heartbeats, etc.
  • Main issue is scaling, if millions of proxies.

109
Tracing in Groups
  • Garbage collect first within a group of
    distributed processes.
  • An object is not collected if
  • It has a reference from outside the group
  • It has a reference from the root set
  • This is conservative.
  • It is possible that a reference from outside the
    group is not reachable.

110
The Model
  • Proxies (stubs), skeletons, and objects.
  • Only one proxy per object per process.
  • A root set of references. The root set is not
    proxies.

111
Basic Steps
  • Find all skeletons in a group that are reachable
    from outside or from root set.
  • Mark them hard, others are soft.
  • Within a process, proxies that are reachable from
    something hard are marked hard; those reachable
    only from something soft are marked soft. Some
    are marked none.
  • Repeat the above until stable (no change).

112
  • Skeletons
  • Hard: reachable from a proxy outside of the
    group, or from a root object inside the group.
  • Soft: reachable only from proxies inside the
    group.
  • Proxies
  • Hard: reachable from the root set.
  • Soft: reachable from a skeleton marked soft.
  • None: neither.
113
Initial Marking of Skeletons
  • All proxies in group report to skeleton.
  • If the ref count is greater than the number of
    proxies, then there must be external references.

114
Local Propagation
  • Propagate hard/soft marks locally.

115
Iterate Till Convergence
  • Final marking.
  • Anything not hard can be collected.