transaction characterization

T_1, r_1[x] + w_1[x] + c_1

T_2, r_2[x] + w_2[x] + c_2

  • T_i - transaction i; r_i[x] - T_i reads x; w_i[x] - T_i writes x; c_i - T_i commits

The operations of a transaction happen in sequence, not all at the same instant.

At least one read or write must happen during a transaction because, once you want to change something, you must first know which server holds it.

Distributed transaction execution: you need to know
the replica control protocol: once one machine gets a copy, decide whether to replicate it on all machines or to split the files across the machines.

isolation

More concurrency means better performance, and concurrency is what a distributed system is for, so allowing only one client transaction at a time is not good. We want both high concurrency and strong isolation, but we have to trade one for the other: better concurrency comes with less isolation.

lost update problem

Two clients run transactions at the same time, and each applies its own update regardless of what the other client has done. In the end, only one transaction's update survives.
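The interleaving behind a lost update can be sketched in a few lines. This is only an illustration: the dictionary `store`, the starting value 100 and the increments 10 and 20 are assumptions made up for the example, not values from the lecture.

```python
# Hypothetical in-memory "server" holding a single item x.
store = {"x": 100}

# Both transactions read x before either one writes it back.
t1_local = store["x"]        # r_1[x] sees 100
t2_local = store["x"]        # r_2[x] also sees 100

store["x"] = t1_local + 10   # w_1[x] writes 110
store["x"] = t2_local + 20   # w_2[x] writes 120 and overwrites T_1's update

final_x = store["x"]
print(final_x)               # 120, while running T_1 then T_2 serially gives 130
```

T_1's increment is lost because T_2 computed its new value from a copy of x that was already stale.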

inconsistent retrievals problem

Suppose accounts A and B each hold $200 and one client withdraws $100. A second client that adds up the two accounts should see $400 in total, but it sees the first client's result immediately, so it sees only $300. This happens because both transactions run concurrently and each can see the other's changes immediately. It is called a dirty read (or dirty write) when the other transaction has not committed yet. Without isolation you also have to watch out for rollbacks: if a transaction rolls back after other transactions have seen its changes, all affected transactions must abort as well.
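The account example can be replayed as a small sketch, assuming a hypothetical store in which uncommitted changes are visible to every client. The names `committed` and `working` are invented for the illustration.

```python
# Committed balances, and a working copy that every client can see (no isolation).
committed = {"A": 200, "B": 200}
working = dict(committed)

working["A"] -= 100                        # client 1 withdraws $100, not committed yet
total_seen = working["A"] + working["B"]   # client 2 reads the uncommitted state

working = dict(committed)                  # client 1 rolls back
total_after_rollback = working["A"] + working["B"]

print(total_seen, total_after_rollback)    # 300 400
```

Client 2 has acted on a total of $300 that never became part of the committed state, which is why it has to abort once client 1 rolls back.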

serializability

w_1[x] doesn't conflict with w_2[y], because they operate on different files, x and y. They can run concurrently.

r_1[x] doesn't conflict with r_2[x], because both of them only read. They can run concurrently.

w_1[x] doesn't conflict with r_2[y], because they operate on totally different files. They can run concurrently.

w_1[x] conflicts with w_2[x], because both of them write the same file. They can't run concurrently.

w_1[x] conflicts with r_2[x], because they operate on the same file and one of them writes it. They can't run concurrently.

w_1[x] doesn't conflict with r_1[x], because they belong to the same transaction. A transaction can't conflict with itself.
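The six cases above reduce to one rule: two operations conflict when they come from different transactions, touch the same item, and at least one of them is a write. A minimal sketch of that rule, where the `Op` tuple and the `conflicts` function are names chosen for this example:

```python
from typing import NamedTuple

class Op(NamedTuple):
    kind: str   # "r" for read, "w" for write
    txn: int    # transaction number, the i in r_i[x]
    item: str   # the file or data item, the x in r_i[x]

def conflicts(a: Op, b: Op) -> bool:
    """True when a and b cannot be reordered freely."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)

print(conflicts(Op("w", 1, "x"), Op("w", 2, "x")))  # True
print(conflicts(Op("r", 1, "x"), Op("r", 2, "x")))  # False
```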

Serialization means the order of transactions. If we have isolation, we have consistency of the data. A history is conflict serializable if it orders every pair of conflicting operations the same way as some serial history does.

Build a graph from the history and topologically sort it to figure out the serialization order. If the graph has a cycle, the history is not serializable.
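The graph check can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The history format, a list of `(kind, txn, item)` tuples, and the function name `serial_order` are assumptions for this example. Commit operations are left out for brevity.

```python
from graphlib import TopologicalSorter, CycleError

def serial_order(history):
    """Return an equivalent serial order of transactions, or None if the graph has a cycle."""
    # preds[t] holds the transactions that must come before t.
    preds = {txn: set() for _, txn, _ in history}
    for i, (k1, t1, x1) in enumerate(history):
        for k2, t2, x2 in history[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                preds[t2].add(t1)
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None

# r_1[x] w_1[x] r_2[x] w_2[x]: T_1 finishes with x before T_2 starts.
ok = [("r", 1, "x"), ("w", 1, "x"), ("r", 2, "x"), ("w", 2, "x")]
# r_1[x] r_2[x] w_1[x] w_2[x]: the lost update interleaving.
bad = [("r", 1, "x"), ("r", 2, "x"), ("w", 1, "x"), ("w", 2, "x")]

print(serial_order(ok))   # [1, 2]
print(serial_order(bad))  # None
```

In the second history r_2[x] precedes w_1[x] while r_1[x] precedes w_2[x], so the graph has edges in both directions between T_1 and T_2 and no serial order exists.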

Distributed transaction serializability conditions

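  • Each local history should be serializable. Check the serialization history at each site.
  • Put the local serialization graphs together to make a global serialization graph. If the local orders do not contradict each other, that is, the global graph has no cycle, the execution is serializable.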
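The two conditions can be illustrated with a small sketch in which each site reports its serialization graph as a set of `(before, after)` edges. The edge sets `site_a` and `site_b` and the helper `is_acyclic` are hypothetical names for this example. It relies on `graphlib` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(edges):
    """Check whether a set of (before, after) edges has no cycle."""
    preds = {}
    for before, after in edges:
        preds.setdefault(before, set())
        preds.setdefault(after, set()).add(before)
    try:
        list(TopologicalSorter(preds).static_order())
        return True
    except CycleError:
        return False

site_a = {(1, 2)}   # this site ordered T_1 before T_2
site_b = {(2, 1)}   # this site ordered T_2 before T_1
global_edges = site_a | site_b

print(is_acyclic(site_a), is_acyclic(site_b), is_acyclic(global_edges))  # True True False
```

Each site is serializable on its own, yet the combined graph contains the cycle T_1, T_2, T_1, so local checks alone are not enough.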

concurrency control

Checking serializability in real time is difficult. Locking instead deals directly with conflicting operations.

  • Two phase locking (2PL)

    If you want to read, you must acquire a read lock. If you want to write, you must acquire a write lock. This is shown in our assignment.
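A minimal, non-blocking sketch of such a lock table follows. It is an illustration under stated assumptions, not the assignment's implementation: the class `LockManager` and its methods are invented names, a refused request returns False instead of waiting, and a transaction releases all of its locks at once. The two-phase rule is enforced by refusing any acquire after a transaction has released.

```python
class LockManager:
    def __init__(self):
        self.readers = {}       # item -> set of transactions holding a read lock
        self.writer = {}        # item -> transaction holding the write lock
        self.released = set()   # transactions that have entered the release phase

    def acquire(self, txn, item, mode):
        """Try to lock item in mode "r" or "w"; return True if granted."""
        if txn in self.released:
            raise RuntimeError("2PL violation: acquire after release")
        holder = self.writer.get(item)
        other_readers = self.readers.get(item, set()) - {txn}
        if holder is not None and holder != txn:
            return False        # another transaction is writing this item
        if mode == "r":
            self.readers.setdefault(item, set()).add(txn)
            return True
        if other_readers:
            return False        # a write lock must be exclusive
        self.writer[item] = txn
        return True

    def release_all(self, txn):
        """Release every lock held by txn."""
        self.released.add(txn)
        for holders in self.readers.values():
            holders.discard(txn)
        for item in [i for i, t in self.writer.items() if t == txn]:
            del self.writer[item]

lm = LockManager()
print(lm.acquire(1, "x", "r"))  # True
print(lm.acquire(2, "x", "r"))  # True, read locks are shared
print(lm.acquire(2, "x", "w"))  # False, T_1 still holds a read lock
```

Read locks are compatible with each other, and a write lock conflicts with every lock held by another transaction, which mirrors the conflict rules for reads and writes.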

Reference material:
Book: Distributed Systems, Third edition, Version 3.02(2018), Maarten van Steen and Andrew S. Tanenbaum.
Lectures: University of Waterloo, CS 454/654 (Distributed System), 2020 winter term, Professor Khuzaima Daudjee.