Distributed Systems
CS3.401Prof. Kishore Kothapalli•Monsoon 2025-26•4 credits
200-mark mock paper · PAPER B — 200 marks
Duration: 180 min • Max marks: 200
Section A — Short Answer Questions [3 marks each]
30 marks- 1.Berkeley algorithm: how does the master compute the time offset to broadcast? What happens if one slave's clock has drifted significantly (e.g., 2 hours)?3 m
- 2.NTP hierarchy: explain stratum 1, 2, 3 with the typical accuracy at each level over a LAN.3 m
- 3.Singhal-Kshemkalyani differential technique: state the storage overhead per process and the channel requirement. Why is FIFO necessary?3 m
- 4.True / False with explanation: (a) Lai-Yang requires every process to record every message ever sent until snapshot completes. (b) In 2PC, a participant in \<ready T\> state can always safely abort if the coordinator is unreachable.3 m
- 5.Maximal Independent Set: define it. Give an MIS for the graph K_3 (complete graph on 3 vertices) and for P_4 (path of 4 vertices).3 m
- 6.BASE acronym: expand it. How is each component a relaxation of the corresponding ACID property?3 m
- 7.Mitchell-Merritt priority extension: when two blocked nodes have equal public labels in a transmit step, what action does the algorithm take?3 m
- 8.Raft leader-completeness property: state it precisely. Why does this guarantee that committed entries survive leader changes?3 m
- 9.In GFS write protocol's pipelined data push, why is the data forwarded to the closest replica first rather than to the primary directly?3 m
- 10.Chord monotonicity property: define and explain why it minimizes data movement when a bucket joins or leaves.3 m
Section B — Medium Answer Questions [5 marks each]
50 marks- 1.Causal-order Multicast using Vector Clocks: How can you implement causal-order delivery using vector clocks piggybacked on every message? Describe the delivery condition at the receiver.5 m
- 2.Maekawa V2 Trace: 3 processes P0, P1, P2 with quorums S0={0,1}, S1={1,2}, S2={2,0} (so each quorum is size 2 and any pair shares one node). Suppose all three request CS simultaneously with timestamps t0 \< t1 \< t2. Show the sequence of FAILED, INQUIRE, YIELD messages to resolve the deadlock and let process 0 enter CS first.5 m
- 3.Ricart-Agarwala Trace: 4 processes P1, P2, P3, P4. P1 requests CS at time 3, P3 requests at time 5, P2 requests at time 4 (all timestamped). Show the REQUEST/REPLY exchanges and indicate the order processes enter the CS.5 m
- 4.Diffusion-based OR Deadlock Detection: 4 processes in a knot configuration: P1 OR-waits on {P2, P3}; P2 OR-waits on {P4}; P3 OR-waits on {P4}; P4 OR-waits on {P1}. P1 initiates diffusion. Show the engaging queries, replies, and how deadlock is detected.5 m
- 5.Raft AppendEntries with Log Mismatch: Leader's log = [(1,a), (1,b), (2,c), (3,d), (3,e)]. Follower's log = [(1,a), (1,b), (1,x), (1,y)]. Each entry is (term, command). Trace how the leader brings the follower into consistency.5 m
- 6.Acharya-Badrinath SENT/RECD: 3 processes P1, P2, P3. Initially all SENT and RECD arrays are zero. Sequence: P1 sends 2 msgs to P2 (received), P2 sends 1 msg to P3 (received), P1 sends 1 more msg to P3 (in transit). Snapshot is now initiated. Compute SENT and RECD arrays for each process. Use them to derive channel states.5 m
- 7.Phase King for n=4, f=1 --- SUCCESSFUL execution: Choose initial values and assume the single faulty process is one of the lieutenants (not a king in any phase). Show how the algorithm successfully reaches consensus.5 m
- 8.Consistent Hashing with Virtual Nodes: 3 physical nodes A, B, C, each with 2 virtual nodes placed at hash positions A1=5, A2=80, B1=20, B2=60, C1=40, C2=95 on ring [0, 100). 10 items hashing to positions {3, 15, 22, 30, 45, 55, 65, 75, 85, 92}. Compute which physical node each item maps to and verify load balance.5 m
- 9.Chord Finger Table for m = 6 (ring mod 64): Nodes at positions {5, 15, 30, 42, 55, 60}. Compute the finger table for node 15.5 m
- 10.GFS Copy-on-Write Snapshot: A directory /data contains files F1 and F2, each 1 chunk. Trace what happens when (i) snapshot of /data is taken, (ii) a client subsequently appends to F1, (iii) reads F2 from the snapshot. State the chunk reference counts at each step.5 m
Section C — Long Answer Questions [10 marks each]
120 marks- 1.Time Synchronization --- Complete Comparison **a.** Describe Cristian's algorithm and the Berkeley algorithm. State a scenario where each is preferable. **b.** Why are logical clocks needed even when NTP provides millisecond accuracy? Give a concrete example where physical-time ordering would be incorrect. **c.** Compare scalar, vector, and matrix clocks on the strongest causal information each can reveal.10 m
- 2.Chandy-Lamport Correctness Proof **a.** Prove that the global state recorded by Chandy-Lamport satisfies condition C1 (every message recorded as sent is either in the channel state or recorded as received). **b.** Prove condition C2 (no message is recorded as in-channel or received without being recorded as sent). **c.** Why does the algorithm's correctness break if a process delays recording its state after receiving its first marker?10 m
- 3.Mutual Exclusion --- Token vs Non-Token Deep Comparison **a.** Describe Raymond's algorithm: tree structure, Holder pointer, request queue. Trace one CS request. **b.** Describe Suzuki-Kasami's algorithm: token contents, RN/LN arrays. Trace one CS request.10 m
- 4.OR-Model Deadlock Detection via Diffusion **a.** Define a KNOT formally. Prove that in the OR model, a process is deadlocked iff it lies in a knot (or has every path leading into a knot). **b.** Describe the diffusion algorithm (Chandy et al.) for OR-model deadlock detection: engaging queries, replies, termination condition. Why does the algorithm correctly detect deadlock?10 m
- 5.Termination Detection --- Two Algorithms Compared **a.** Describe the weight-throwing algorithm. State the conservation invariant and how termination is detected. **b.** Describe Dijkstra-Scholten's spanning-tree algorithm. State the deficit computation and ACK protocol. **c.** Compare the two on robustness to: (i) message duplication, (ii) message loss, (iii) process crashes.10 m
- 6.Crash-Consensus Algorithm **a.** State the algorithm (pseudocode) for synchronous consensus tolerating up to f crash failures. **b.** Prove that the algorithm achieves Agreement, Validity, and Termination. **c.** Why does the algorithm need EXACTLY f + 1 rounds --- neither more nor less?10 m
- 7.Commit Protocols --- End-to-End **a.** Prove that 2PC blocks under coordinator failure when all participants are in \<ready T\>. **b.** Argue that 3PC, under its assumptions (no partitions, ≤ K failures), does not block. **c.** Give one scenario where 3PC violates atomicity (despite its assumptions) --- i.e., when its assumptions break, 3PC can produce inconsistency.10 m
- 8.GHS Algorithm --- State and Trace **a.** List the node states (sleep, find, found) and edge states (basic, branch, reject). Describe the transitions. **b.** Trace GHS on a 4-node graph: nodes A, B, C, D with edges A-B (weight 1), B-C (weight 2), C-D (weight 3), A-D (weight 4), A-C (weight 5). Identify the MST step by step.10 m
- 9.GFS --- Architecture and Fault Tolerance **a.** Describe the master's metadata: file/chunk namespaces, chunk-to-location mapping, lease/version info. What is replicated to remote machines vs reconstructed? **b.** Describe how GFS recovers from: (i) chunkserver failure, (ii) master failure, (iii) data corruption on a single chunk. **c.** Why is having a single master not a bottleneck despite serving all metadata requests?10 m
- 10.Distributed Databases --- DynamoDB vs BigTable **a.** Describe Dynamo's design: consistent hashing, replication, quorum (N, W, R), conflict resolution via vector clocks. **b.** Describe BigTable's design: tablets, GFS as storage, strong consistency within a row. **c.** When would you choose Dynamo over BigTable, and vice versa? Justify with one concrete use case for each.10 m
- 11.MPC --- Parallel BFS with Diameter D10 m
- 12.Phase King --- Full Correctness Proof **a.** State the invariant maintained at the end of each phase: at the end of the first 'good' phase (honest king), all honest processes share the same value. **b.** Prove this invariant assuming n \> 4f. Cover the three sub-cases (both took majority / both took king / mixed). **c.** Prove that once the invariant is established, it is maintained through all subsequent phases (even with malicious kings).10 m
Track your attempt locally — score and time are recorded in your browser. (Coming soon: timed-attempt mode.)