Distributed system - Class 5

Alternative organizations

scale out: increase more memory and storage, add more servers such as machines.

Vertical Distribution (partitioning)

One machine assigns job to different machines according to the required functions. It requires to figure out which request sent to which server and decide what functioins each server holds.

Horizontal Distribution (replicated service)

One server handles all load and ever other machine has same functions. It looks an easy but updating data is difficult, becuase all machines need to update.

Example:

95% chance of a server being up at any instant.

for 2 servers to be up and running at any time: 0.95^2. Service not being running: around 10%
for 4 servers to be up and running at any time: vertical distribution. service not being running: (1-81%) ~ 20%
for 14 …… < 50%
If use vertical distribution, adding more machines results in increasing risk of fail as each one stores a function.

If use horizontal distribution, every machine is doing the same work, once one server is down, others are still running. It is good. But when updating data some machines are down, then they are running latter. It is useless because their data may not be updated.

Peer-to-peer architecture

kye-value hash key to a node location. We hava key value 1~~200 mapping to node value ranges in 1~~200.

Consistent hashing (most popluar structure of hashing: chord). Using sccessor function to assign key to each node. We have key=1,2,6, and node ID=1,3,6, which is a node ring. Look if there is a node=1 can accommodate key=1, yes; Look if there is a node=2 can accomodate key=2, no, then serach the next closest node=3, so put key=2 to node=3; Look if a node=6 can accomodate key=6, no, then continuing to look the closest highest node, node=0 can accept. In order to make key as most as closed to node location, we can add node if we want to distribute data more evently. This is Join.

Delete node, move keys stored in node to next closest one and remove this node. If we know all nodes, then we know where keys should go. For example, we deign finger table (power of 2). for means a formula to compute start, start means the key value start with (.), succ means key value starts with start eventually go to (.). For key=2, node=0 ask node=3: if you have key=2? node=3 response yes.

Reference material:
Book: Distributed Systems, Third edition, Version 3.02(2018), Maarten van Steen and Andrew S. Tanenbaum.
Lectures: University of Waterloo, CS 454/654 (Distributed System), 2020 winter term, Professor Khuzaima Daudjee.