Data-Centric Consistency

four consistency rules:

  • monotonic reads: r1[x] - r2[x], r2[x] see at least recent status as r1[x]
  • monotonic writes: w1[x] - w2[x], w2[x] change status of data based on w1[x]’s last status of data
  • reads your writes: w[x] - r[x], r[x] will read what w[x] writes
  • writes follow reads: r[x] - w[x], w[x] happends based on what r[x] most recently read

Different background how you can put replicants:

  • permanent replicas: eg, when you download app online, you may be asked which mirror site you want to download according to the cities.
  • server-initited replicas: differece between cache and replicas: cache means caching systems “pull data” from the origin server, while replication systems tend to “push” data to maintain mirror copies of the same data at various place on the network. Replicas depends on requests.
  • client-initited replicas: eg, browser cache

Distribute load is still a chanllenge. This is not good solution to do dynamic replicas. One solution is to keep servers running, but it’s costly.

Update propagation

replicas algorithms

Q1: what to propagate

  • write happen on server A, after done, send invalidation to server B, tell it “your replicas” is not valid any more. Msg is short and only send once, therfore it’s cheap and simple.
  • write happen on server A, keep a log to record the changes, when all of the operation are done on server A, it propagates the same operation order to server B.
  • this protocal is not common used. server A updates physical file, then send to B to replace the old file. If the file is very large, this is expensive.

Q2: who propagate( who send or who recieve )

depends the efficiency and latency.

  • server push updates? server only send one single msg, put request. Save half msg. But workload is higher, because server has to remember everthing, such as which client I need to send msg.
  • client pull updates? client ask for “could you send update”, server reply “yes” then send to client. Round msg in total.

Higher approach of propagate: move functionality from server to client(lease-based): when load scales up(lease expired), transfer protocal from put based to pull based.

Epidemic protocols:

site 1 and site 2 try to fix inconsistency in order to eventually consistency.

example: site 1 holds file x, site 2 holds fresh x and y. site 1 does w1[x], then site 2 does w2[x]. site 1 recieves r1[y], but it doesn’t has y, then it passes the request to site 2. After site 2 does r1[y], it sends msg to site 1: “Hi, site 1, I finished r1[y], by the way, I finished w2[y]”. It sends w2[y] back to site 1, site 1 discard change of w1[y] and does w2[y] first, then does w1[y]. Because it has to fix the eventual consistency. This machnism is popular. The protocal will pass this machenism to all sites as epidemic.

client-centralized protocol model

  • Primary Copy Remote-Write Protocol: the remote server is the primary to do each operation, then it updates all other server.
  • Primary Copy Local-Write Protocol: the remote server is the old primary, all operation must be local, so local copy become the new primary, all operation happen on new primary, then it updates all other server.

Quorum-Based protocol( votes get something to happen )

when we talk aout quorums, we really talk about majority quorums, also called majority quorums onsensus. Quorum system assignm each replica one vote.

N_r: read quorum, vote from the replica sites indicates you can read on these sites.

V_r: means how many votes I need so that I can do read on them, if no so many votes, I have to wait untill I get enough votes.

N_w: write quorum, vote from the replica sites indicates you can wirte on these sites.

V_w: means how many votes I need so that I can do write on them, if no so many votes, I have to wait untill I get enough votes.

V: total number of votes

Two rules a quorum system design has to hold:( design means V_r = ? and V_w = ?)

  • V_r + V_w > V: if V_r + V_w > V, you can’t do read and write at the same time, have to do them in order, look at timestamp. If write has finished, read happens later, read has to see at least one site which latest update. Because write and read overlap, this must be true. This is the power of quorum system.
  • V_w > V/2: If not satisifed, two writes may run at the same time and update total different sites.
  • extream exmaple: V_r=1, V_w=12, V=12, write on all sites, and doesn’t matter read from any one site. It can be successful, but it can’t be quorum system, because it is eager and synchrounize, which is expensive, then quorum protocal is meaningless.

Reminder: when design quorum system, use as less as votes, because more votes means more contact with sites, so more expensive and slower.

Reference material:
Book: Distributed Systems, Third edition, Version 3.02(2018), Maarten van Steen and Andrew S. Tanenbaum.
Lectures: University of Waterloo, CS 454/654 (Distributed System), 2020 winter term, Professor Khuzaima Daudjee.