Distributed Systems - Class 8 (Distributed Naming)

Naming exists in order to find things such as URLs quickly. DNS is an example.

In order to reach a server, we need to know the name and address of the machine. If you send an email, the email address and name have to be looked up.

Name

Pure name: just a bit pattern (0s and 1s) with no interpretation.
Non-pure name: the name gives us some attributes.

Two types of name services

Name services: the most common lookup is (name + location).
Directory services: lookup by property and value, i.e. "can you give me a name?"
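The difference between the two lookups can be sketched as follows. This is only an illustration: the record names, addresses, and attributes are invented, and the function names name_lookup and directory_lookup are hypothetical.

```python
# Hypothetical records: names, addresses and attributes are made up for illustration.
RECORDS = {
    "printer-3f": {"address": "10.0.3.17", "type": "printer", "floor": 3},
    "mail-gw": {"address": "10.0.0.25", "type": "mail", "floor": 1},
    "printer-1f": {"address": "10.0.1.9", "type": "printer", "floor": 1},
}


def name_lookup(name):
    """Name service: given a name, return its location (address)."""
    return RECORDS[name]["address"]


def directory_lookup(**properties):
    """Directory service: given property/value pairs, return the matching names."""
    return sorted(
        name
        for name, attrs in RECORDS.items()
        if all(attrs.get(key) == value for key, value in properties.items())
    )


print(name_lookup("mail-gw"))
print(directory_lookup(type="printer"))
print(directory_lookup(type="printer", floor=1))
```

The name service goes from a name to a location, while the directory service goes the other way, from attributes to names.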

Resolve URL

A name is mapped through different address spaces.

Process: find network (IP) -> machine -> port -> directory -> file -> HTML. But the machine can't interpret the HTML name itself; it only recognizes 0s and 1s, so the name has to be mapped down to 0s and 1s.
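The chain above (machine -> port -> directory -> file) can be read directly off a URL. A minimal sketch using Python's standard urllib.parse module; the URL and the helper name url_parts are assumptions made up for illustration, and the final host-to-IP step (the DNS lookup) is left out so the example runs without a network.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes handled in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def url_parts(url):
    """Split a URL into the pieces that are resolved one after another."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    return {
        "host": parts.hostname,  # resolved to an IP address by DNS (not done here)
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "directory": directory or "/",
        "file": filename,
    }


# Hypothetical URL, for illustration only.
parts = url_parts("http://www.example.com:8080/courses/cs454/index.html")
print(parts)
```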

Requirements on Name Services

With over 50 million devices by 2020, how do we manage all the naming schemes? Name services are the heart of distributed applications. Fault isolation means that if one name service shuts down, lookups should go to another name service, and the failure should not affect the others.

Name spaces

Take the names in a file system: they give you navigation, and '/' provides important context and makes the name space easy to extend. A name space therefore supports organization and lookup.

Naming graph with single root

A name on a leaf node must be a file. A naming graph can be laid out as a tree. The global layer comes right below the root and is run by global organizations; it is very stable. The administrational layer is also very stable; it holds entries for organizations such as "waterloo", which sometimes change. The managerial layer is where users manage the nodes themselves; for example, renaming a file changes the name. The number of nodes increases from top to bottom.
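Resolving a path name through a single-rooted naming graph can be sketched like this. The graph contents and the function name resolve_path are hypothetical; directory nodes are modelled as dicts and leaf nodes as plain strings.

```python
# Hypothetical naming graph: directory nodes are dicts, leaf nodes are strings.
ROOT = {
    "home": {
        "alice": {"notes.txt": "file:notes", "todo.txt": "file:todo"},
    },
    "etc": {"hosts": "file:hosts"},
}


def resolve_path(path, root=ROOT):
    """Walk the graph from the root, one label at a time."""
    node = root
    for label in filter(None, path.split("/")):
        # Each directory node is the context in which the next label is resolved.
        if not isinstance(node, dict) or label not in node:
            raise KeyError(f"cannot resolve {label!r} in {path!r}")
        node = node[label]
    return node


print(resolve_path("/home/alice/notes.txt"))
print(sorted(resolve_path("/home/alice")))
```

Adding a new subtree under any directory node extends the name space without touching the rest of the graph.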

Name space distribution (2)

Sending an email ends with a lookup. The lookup reaches the very top layer of nodes and then keeps asking, layer by layer, where the address of the machine is. For example, if you send an email from our university to a Japanese university, your email service asks the global domain for Japan first, then the university, then the machine, then the port, until it reaches the person. The global layer must be very reliable; by contrast, the files on your laptop have very little replication for backup. Caching helps because I may send the same email to the same locations: publish your updates and cache the values, so that you only need to look a domain up again if it has changed. Because changes in the global and administrational layers don't happen frequently, you rarely need to update cached entries for them. Domain names also need to be cached because so many users send emails to the same address.

Partitioning helps locate the address. Replication and caching need an authority. Every machine has a DNS client, and your DNS client does its own caching, so caching happens at every level. Most DNS clients cycle through servers: if one server can't resolve the name, the client goes to another server, again and again, until the name is resolved or the failure goes back to the client.
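Client-side caching can be sketched as a small wrapper around a lookup function. This is an assumption-laden sketch, not how any particular DNS client is implemented: the class name CachingResolver, the fixed expiry time (ttl_seconds), and the injected clock are all hypothetical choices made so the example is deterministic.

```python
import time


class CachingResolver:
    """Wraps a lookup function and remembers answers for a limited time."""

    def __init__(self, lookup, ttl_seconds=60.0, clock=time.monotonic):
        self._lookup = lookup    # function: name -> address (the slow path)
        self._ttl = ttl_seconds  # assumed fixed lifetime for every cached entry
        self._clock = clock
        self._cache = {}         # name -> (address, expiry time)
        self.misses = 0

    def resolve(self, name):
        now = self._clock()
        entry = self._cache.get(name)
        if entry is not None and entry[1] > now:
            return entry[0]      # still fresh: answer from the cache
        self.misses += 1
        address = self._lookup(name)
        self._cache[name] = (address, now + self._ttl)
        return address


# Hypothetical data and a fake clock so the example does not have to wait.
fake_now = [0.0]
resolver = CachingResolver(
    lookup={"mail.u-example.jp": "192.0.2.10"}.__getitem__,
    ttl_seconds=60.0,
    clock=lambda: fake_now[0],
)
resolver.resolve("mail.u-example.jp")  # miss: goes to the lookup function
resolver.resolve("mail.u-example.jp")  # hit: served from the cache
fake_now[0] = 61.0
resolver.resolve("mail.u-example.jp")  # entry expired: looked up again
print(resolver.misses)
```

Entries for stable layers could be given a longer lifetime than entries for layers that change often.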

Iterative: The client sends a request to node 1, which may respond "yes, done" or "I haven't found it". The client then asks other nodes. Every node responds to the client directly, and the client decides the next step and when to stop.
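Iterative navigation can be simulated in a few lines. The server names, the zone tables, and the helper functions ask and resolve_iteratively are hypothetical; each server either gives a final answer or points at the next server, and the client makes every request itself.

```python
# Hypothetical name servers. Each maps a name suffix either to a final
# address ("answer") or to the next server to ask ("referral").
SERVERS = {
    "root": {"jp": ("referral", "ns-jp")},
    "ns-jp": {"u-example.jp": ("referral", "ns-univ")},
    "ns-univ": {"mail.u-example.jp": ("answer", "192.0.2.10")},
}


def ask(server, name):
    """One request/response: return the entry for the longest matching suffix."""
    table = SERVERS[server]
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    return ("not_found", None)


def resolve_iteratively(name, start="root", max_steps=10):
    """The client contacts each server itself and decides the next step."""
    server, contacted = start, []
    for _ in range(max_steps):
        contacted.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, contacted
        if kind == "not_found":
            return None, contacted  # the client decides to stop here
        server = value              # referral: the client chooses to follow it
    return None, contacted


address, contacted = resolve_iteratively("mail.u-example.jp")
print(address, contacted)
```

Every entry in the contacted list is a round trip made by the client, which is why this style costs more when the client is far from the servers.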

Recursive: Server-controlled navigation means the name servers behave as clients to resolve the name, instead of the client driving the resolution. The client sends one request to NS1. If NS1 can't resolve it, NS1 sends the request on to NS2; NS2 resolves the part it can and sends the remaining part to the next node. There are a couple of issues. Every client request must be answered and cannot be forgotten, so when there are a lot of requests, the nodes in all layers have to hold the pending requests until they are done. Which one is faster, iterative or recursive? Recursive. The hops between name servers are physically very close to each other, so this way is faster. Caching also happens in every layer, which saves lookup time and improves communication efficiency.
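Recursive navigation with caching at each server can be sketched as follows. The NameServer class, the server names, and the addresses are hypothetical; the nested call stands in for a server holding a pending request while it waits for the next server's reply.

```python
class NameServer:
    """Hypothetical name server that resolves recursively and caches results."""

    def __init__(self, name, answers=None, referrals=None):
        self.name = name
        self.answers = dict(answers or {})      # full name -> address
        self.referrals = dict(referrals or {})  # suffix -> next NameServer
        self.cache = {}
        self.requests_handled = 0

    def resolve(self, name):
        self.requests_handled += 1
        if name in self.answers:
            return self.answers[name]
        if name in self.cache:
            return self.cache[name]
        for suffix, next_server in self.referrals.items():
            if name == suffix or name.endswith("." + suffix):
                # This server now acts as a client and holds the request
                # until the next server replies.
                address = next_server.resolve(name)
                if address is not None:
                    self.cache[name] = address
                return address
        return None


ns_univ = NameServer("ns-univ", answers={"mail.u-example.jp": "192.0.2.10"})
ns_jp = NameServer("ns-jp", referrals={"u-example.jp": ns_univ})
ns1 = NameServer("ns1", referrals={"jp": ns_jp})

first = ns1.resolve("mail.u-example.jp")   # travels ns1 -> ns-jp -> ns-univ
second = ns1.resolve("mail.u-example.jp")  # answered from the cache at ns1
print(first, second, ns_univ.requests_handled)
```

The client makes a single request each time; the second lookup never leaves NS1 because the intermediate server cached the result.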

A DNS client is called a resolver:

normally implemented as library S/W
communicates with one or more name servers
simple request-reply protocol using UDP (you only do a lookup, so if it fails you just send the request again; you don't want to waste resources such as setting up connections)
client implements timeout and resends query if necessary (because you use UDP, your request may get lost, but you just send it again)
can configure resolver to contact an initial list of name servers, e.g., in some order of preference
the DNS architecture allows for recursive and iterative navigation (once a server does recursive navigation, you can't force it to do things your way; to decrease latency, you can send a few requests to get multiple IPs)
the resolver can specify which type of navigation is required when contacting a name server
- but name server is not bound to follow this
- can have issues with tying up server threads, causing delays in requests
- can pack multiple queries in a single msg and can get multiple replies in a single msg
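The timeout-and-resend behaviour over an ordered list of name servers can be sketched like this. To keep the example runnable without a network, the transport is a made-up stand-in (make_lossy_transport) that raises TimeoutError where a real resolver would time out waiting on a UDP socket; the function name resolve_with_retry and the retry count are also assumptions.

```python
def resolve_with_retry(name, servers, send_query, retries_per_server=2):
    """Try each server in order of preference, resending on timeout."""
    attempts = []
    for server in servers:
        for _ in range(retries_per_server):
            attempts.append(server)
            try:
                return send_query(server, name), attempts
            except TimeoutError:
                continue  # request or reply was lost: just send it again
    return None, attempts  # every server timed out


def make_lossy_transport():
    """Hypothetical transport: ns-a never replies, ns-b loses the first datagram."""
    state = {"ns_b_queries": 0}

    def send_query(server, name):
        if server == "ns-a":
            raise TimeoutError("no reply from ns-a")
        state["ns_b_queries"] += 1
        if state["ns_b_queries"] == 1:
            raise TimeoutError("datagram lost")
        return "192.0.2.10"

    return send_query


address, attempts = resolve_with_retry(
    "mail.u-example.jp", ["ns-a", "ns-b"], make_lossy_transport()
)
print(address, attempts)
```

Because the exchange is a single request and a single reply, resending is cheap and no connection state has to be kept on either side.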

Domain Name System (DNS)

name-to-IP mapping

Resources sit on servers throughout the whole world, so distribution is naturally required. A centralized DNS is not needed and would not scale: you only care about the domains you want to contact, and the others are far away from you.

Reference material:
Book: Distributed Systems, Third edition, Version 3.02 (2018), Maarten van Steen and Andrew S. Tanenbaum.
Lectures: University of Waterloo, CS 454/654 (Distributed Systems), Winter 2020 term, Professor Khuzaima Daudjee.