Saral Shiksha Yojna
Distributed Systems

CS3.401
Prof. Kishore Kothapalli · Monsoon 2025-26 · 4 credits
Revision Notes / Unit 5: Distributed Mutual Exclusion

Lamport, Ricart-Agrawala, Maekawa, Suzuki-Kasami, Raymond — Complete Comparison

Intuition

Mutual exclusion in a distributed system is the same idea as a critical section in OS — except there's no shared memory, no semaphore, no central arbitrator. Everything is messages. Five algorithms attack the problem in two families: non-token (request-permission from peers — Lamport, R-A, Maekawa) and token (a single token grants entry — Suzuki-Kasami, Raymond). Trade-offs: message count, synchronisation delay, FIFO requirement, fault behaviour.

Explanation

Three required properties of any DME algorithm. Safety — at any instant, at most one process in the CS. Liveness — no endless wait; every request eventually granted in finite time. Fairness — CS requests granted in order of their logical-clock arrival (timestamp order).

Four performance metrics. Message complexity: messages per CS invocation. Synchronisation delay (SD): time after one site leaves CS before the next enters. Response time: time from a request being made until CS execution completes. System throughput $= 1/(SD + E)$, where $E$ = average CS execution time.
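A small numeric sketch of the throughput metric, taking throughput as $1/(SD + E)$. The delay and execution-time values below are made-up illustrative numbers, and the function name `throughput` is hypothetical.

```python
def throughput(sd: float, e: float) -> float:
    """CS executions per unit time, using throughput = 1 / (SD + E)."""
    if sd < 0 or e <= 0:
        raise ValueError("SD must be >= 0 and E must be > 0")
    return 1.0 / (sd + e)


# Assumed illustrative values (not from the notes): T = 10 ms, E = 5 ms.
T = 10.0
E = 5.0

rate_sd_t = throughput(sd=T, e=E)       # algorithms with SD = T
rate_sd_2t = throughput(sd=2 * T, e=E)  # algorithms with SD = 2T

print(f"SD = T : {rate_sd_t:.4f} CS entries per ms")
print(f"SD = 2T: {rate_sd_2t:.4f} CS entries per ms")
```

Doubling SD does not halve throughput, because the CS execution time $E$ is paid in both cases.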

Two algorithm families. Non-token (permission-based): Lamport, Ricart-Agrawala, Maekawa. Token-based: Suzuki-Kasami (broadcast request, token holds queue), Raymond (token migrates along tree).

LAMPORT — assumptions. FIFO channels, bidirectional, scalar (Lamport) timestamps, no failures.

Lamport: algorithm. *Request*: $S_i$ broadcasts REQUEST$(ts_i, i)$ to all other sites; places own request in $request\_queue_i$. *Receive REQUEST*: $S_j$ puts the request in its queue and sends a timestamped REPLY to $S_i$. *Enter CS*: $S_i$ enters when (L1) it has received a message with timestamp $> (ts_i, i)$ from every other site (need not be a REPLY; any later message works), AND (L2) its own request is at the top of its queue. *Release*: $S_i$ removes its request from its queue and broadcasts RELEASE; on receipt, every $S_j$ removes $S_i$'s request from its queue.

Lamport: metrics. Messages = $3(N-1)$ per CS (REQUEST + REPLY + RELEASE to each of $N-1$ peers). SD = $T$, the max message transmission time. Requests granted in increasing timestamp order: strong fairness.
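The entry test for Lamport's algorithm (L1 and L2 together) can be sketched as a single predicate. This is a minimal illustration, not a full implementation: the function name `can_enter_cs`, the `latest_from` map and the use of `(clock, site_id)` tuples as timestamps are assumptions made for the example.

```python
def can_enter_cs(me, sites, my_request, request_queue, latest_from):
    """Return True when both Lamport entry conditions hold at site `me`.

    my_request    : (clock, site_id) timestamp of this site's own request
    request_queue : list of (clock, site_id) requests known locally
    latest_from   : site_id -> (clock, site_id) of the latest message
                    received from that site (hypothetical bookkeeping)
    """
    # L1: a message with a larger timestamp has arrived from every other site.
    l1 = all(
        site in latest_from and latest_from[site] > my_request
        for site in sites
        if site != me
    )
    # L2: own request is the smallest entry in the local request queue.
    l2 = len(request_queue) > 0 and min(request_queue) == my_request
    return l1 and l2


sites = [1, 2, 3]
queue = [(1, 1), (2, 2)]  # site 1 requested at clock 1, site 2 at clock 2

# Site 1 has heard later-stamped messages from 2 and 3, and is at the top.
print(can_enter_cs(1, sites, (1, 1), queue, {2: (2, 2), 3: (2, 3)}))

# Site 2 satisfies L1 but not L2 while site 1's request is still queued.
print(can_enter_cs(2, sites, (2, 2), queue, {1: (3, 1), 3: (3, 3)}))
```

Comparing whole tuples means ties on the clock value are broken by site ID.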

RICART-AGRAWALA — improvement over Lamport. Eliminates RELEASE messages. A site need not REPLY immediately — it *defers* a REPLY when it has a higher-priority (smaller-ts) request of its own.

Ricart-Agrawala: algorithm. *Request*: $S_i$ broadcasts a timestamped REQUEST. *Receive REQUEST* at $S_j$: send REPLY if $S_j$ is idle OR $S_j$ is requesting but $S_i$'s request has smaller timestamp. Else *defer* (set $RD_j[i] = 1$). *Enter CS*: $S_i$ enters after receiving REPLY from all sites. *Release*: for every $j$ with $RD_i[j] = 1$, send REPLY to $S_j$; reset $RD_i[j] = 0$.

Ricart-Agrawala: metrics + advantage. $2(N-1)$ messages per CS. Does not require FIFO. SD = $T$.
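A minimal sketch of the reply-or-defer decision in Ricart-Agrawala, with messages replaced by direct method calls. The class `RASite` and its method names are hypothetical; the `deferred` set stands in for the per-site deferral flags, and timestamps are assumed to be `(clock, site_id)` tuples.

```python
class RASite:
    """One Ricart-Agrawala site; message passing is simulated by method calls."""

    def __init__(self, site_id, n):
        self.id = site_id
        self.n = n
        self.requesting = False
        self.in_cs = False
        self.my_ts = None      # (clock, id) of own pending request
        self.deferred = set()  # sites whose REPLY is being held back
        self.replies = set()

    def request(self, clock):
        self.requesting = True
        self.my_ts = (clock, self.id)
        self.replies = set()
        return self.my_ts      # REQUEST to broadcast

    def on_request(self, ts):
        """Return True if a REPLY is sent now, False if it is deferred."""
        sender = ts[1]
        if self.in_cs or (self.requesting and self.my_ts < ts):
            self.deferred.add(sender)
            return False
        return True

    def on_reply(self, sender):
        self.replies.add(sender)
        if self.requesting and len(self.replies) == self.n - 1:
            self.in_cs = True
        return self.in_cs

    def release(self):
        """Leave the CS and return the sites that now get their deferred REPLY."""
        self.in_cs = False
        self.requesting = False
        self.my_ts = None
        to_reply = sorted(self.deferred)
        self.deferred.clear()
        return to_reply


s1, s2, s3 = RASite(1, 3), RASite(2, 3), RASite(3, 3)
r1 = s1.request(clock=1)
s2.on_request(r1)                     # idle, replies at once
s3.on_request(r1)                     # idle, replies at once
r2 = s2.request(clock=2)
print("S1 replies to S2 now?", s1.on_request(r2))  # deferred: S1 has priority
```

No RELEASE message appears anywhere: the deferred REPLY sent in `release` does that job.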

Roucairol-Carvalho optimisation. Once $S_i$ has received a REPLY from $S_j$, $S_i$ need not send a REQUEST to $S_j$ again to re-enter CS, unless $S_i$ has since sent a REPLY to $S_j$. Message complexity varies from $0$ to $2(N-1)$. Worst-case unchanged.

MAEKAWA: quorum key idea. Each site $S_i$ has a Request Set $R_i$ (quorum); permission needed only from the members of $R_i$. Requirements: (i) $S_i \in R_i$; (ii) $R_i \cap R_j \neq \emptyset$ for all $i, j$ (every two quorums intersect, and the common member arbitrates); (iii) $|R_i| = K$ for all $i$; (iv) every node is in exactly $D$ quorums. Optimum: $K = D \approx \sqrt{N}$ (minimises total messages while satisfying intersection).
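The four quorum requirements can be checked mechanically. The sketch below uses a 7-site, size-3 design (the lines of the Fano plane) chosen purely as an illustration; this design and the helper name `check_maekawa` are assumptions and are not taken from the notes.

```python
from itertools import combinations


def check_maekawa(quorums):
    """Check Maekawa's four conditions for a dict site -> set of sites."""
    sites = set(quorums)
    sizes = {len(q) for q in quorums.values()}
    load = {s: sum(s in q for q in quorums.values()) for s in sites}
    return {
        # (i) every site belongs to its own request set
        "self_in_quorum": all(s in q for s, q in quorums.items()),
        # (ii) every two request sets share at least one site
        "pairwise_intersect": all(
            quorums[a] & quorums[b] for a, b in combinations(sites, 2)
        ),
        # (iii) all request sets have the same size K
        "equal_size": len(sizes) == 1,
        # (iv) every site appears in the same number D of request sets
        "equal_load": len(set(load.values())) == 1,
    }


# Assumed example: N = 7, K = D = 3, so N = K(K - 1) + 1.
FANO = {
    1: {1, 2, 3},
    2: {2, 4, 6},
    3: {3, 5, 6},
    4: {1, 4, 5},
    5: {2, 5, 7},
    6: {1, 6, 7},
    7: {3, 4, 7},
}

for condition, ok in check_maekawa(FANO).items():
    print(condition, ok)
```

A ring-style design such as `{1: {1, 2}, 2: {2, 3}, 3: {3, 4}, 4: {4, 1}}` fails the intersection check, since opposite quorums share no site.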

Maekawa V1 protocol. $S_i$ sends a timestamped REQUEST to all sites in $R_i$. On receiving a request, a site that is not locked sends an ACK to the requester and locks itself to it; while locked, it queues further requests. $S_i$ enters CS after an ACK from every member of $R_i$. To exit: send RELEASE to all in $R_i$. Each recipient unlocks and sends an ACK to the lowest-timestamp request in its queue. Messages: $3\sqrt{N}$ per CS.

Why Maekawa V1 deadlocks. Cyclic locking among quorums: three sites with overlapping quorums can each lock one member of another's quorum, so the requesters wait on each other in a cycle and none can proceed. V2 fix adds three messages. FAILED: sent by an arbiter $S_j$ when its reply has already been given to a higher-priority request. INQUIRE: sent by $S_j$ when a higher-priority request arrives; asks the previously-acked site 'are you in CS?'. YIELD: sent by a process that received a FAILED elsewhere, relinquishing its lock so $S_j$ can grant to the higher-priority one. V2 messages: up to $5\sqrt{N}$. SD = $2T$ (double Lamport / R-A).
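The cyclic-locking scenario can be reproduced with a toy model in which each arbiter simply locks to the first REQUEST that reaches it (V1 behaviour, with no INQUIRE or YIELD). The three two-member quorums, the delivery orders and the helper names are assumptions chosen for illustration.

```python
def v1_wait_for(quorums, arrival_order):
    """Build a wait-for graph: requester -> requesters holding locks it needs.

    arrival_order is a list of (requester, arbiter) pairs in delivery order;
    each arbiter locks to the first requester that reaches it.
    """
    locked_to = {}
    for requester, arbiter in arrival_order:
        locked_to.setdefault(arbiter, requester)
    return {
        r: {locked_to[a] for a in q if a in locked_to and locked_to[a] != r}
        for r, q in quorums.items()
    }


def has_cycle(waits):
    """Depth-first search for a cycle in the wait-for graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in waits}

    def visit(n):
        colour[n] = GREY
        for m in waits.get(n, ()):
            if colour.get(m, BLACK) == GREY:
                return True
            if colour.get(m) == WHITE and visit(m):
                return True
        colour[n] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in waits)


QUORUMS = {1: {1, 2}, 2: {2, 3}, 3: {3, 1}}  # assumed overlapping quorums

# Each site's REQUEST reaches its own arbiter first, then the other one.
interleaved = [(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (3, 1)]
# Site 1's REQUESTs are all delivered before anyone else's.
serial = [(1, 1), (1, 2), (2, 2), (2, 3), (3, 3), (3, 1)]

print("interleaved deadlocks:", has_cycle(v1_wait_for(QUORUMS, interleaved)))
print("serial deadlocks:", has_cycle(v1_wait_for(QUORUMS, serial)))
```

In the interleaved order every requester holds one lock and waits for another, which is exactly the cycle that FAILED, INQUIRE and YIELD are meant to break.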

SUZUKI-KASAMI: token-based with broadcast request. Data structures: the token contains a FIFO queue $Q$ of pending requestor IDs, and an array $LN[1..N]$ where $LN[j]$ = sequence number of the most recently executed request of $S_j$. Each site $S_i$ maintains $RN_i[1..N]$, where $RN_i[j]$ is the largest sequence number it has seen in any REQUEST from $S_j$.

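Suzuki-Kasami: algorithm. *Request CS*: if $S_i$ doesn't hold the token, $RN_i[i]$++ and broadcast REQUEST$(i, sn)$ with $sn = RN_i[i]$. *Receive REQUEST at $S_j$*: $RN_j[i] := \max(RN_j[i], sn)$. If $S_j$ has the token and is idle and $RN_j[i] = LN[i] + 1$ (this is a fresh, not outdated request), send token to $S_i$. *Enter CS*: on receiving token. *Release CS*: $LN[i] := RN_i[i]$. For every $S_j$ not in $Q$, if $RN_i[j] = LN[j] + 1$, append $S_j$ to $Q$. If $Q$ is non-empty, pop its head and send the token to it.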

Suzuki-Kasami: metrics. Messages = $0$ (if the site already holds the token) or $N$ (broadcast REQUEST + token transfer = $(N-1) + 1$). SD = $0$ or $T$ (max msg delay). No starvation. The condition $RN_j[i] = LN[i] + 1$ filters outdated requests.
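The token bookkeeping in Suzuki-Kasami can be sketched with two small classes. This is a simplified illustration under stated assumptions: sites are numbered from 0, messages are replaced by direct method calls, and the names `Token`, `SKSite`, `rn` and `ln` are chosen here to mirror the per-site and per-token sequence-number arrays.

```python
from collections import deque


class Token:
    def __init__(self, n):
        self.queue = deque()  # FIFO queue of waiting site IDs
        self.ln = [0] * n     # ln[j]: last served sequence number of site j


class SKSite:
    def __init__(self, site_id, n, token=None):
        self.id = site_id
        self.rn = [0] * n     # rn[j]: highest sequence number seen from site j
        self.token = token
        self.in_cs = False

    def request(self):
        """Return the REQUEST (id, sn) to broadcast, or None if no message is needed."""
        if self.token is not None:
            self.in_cs = True  # already holds the token: zero messages
            return None
        self.rn[self.id] += 1
        return (self.id, self.rn[self.id])

    def on_request(self, sender, sn):
        """Record the request; hand over the token only if idle and the request is fresh."""
        self.rn[sender] = max(self.rn[sender], sn)
        fresh = self.token is not None and self.rn[sender] == self.token.ln[sender] + 1
        if fresh and not self.in_cs:
            token, self.token = self.token, None
            return token
        return None

    def on_token(self, token):
        self.token = token
        self.in_cs = True

    def release(self):
        """Leave the CS; return (next_site, token) if the token moves, else None."""
        tok = self.token
        self.in_cs = False
        tok.ln[self.id] = self.rn[self.id]
        for j in range(len(self.rn)):
            if j != self.id and j not in tok.queue and self.rn[j] == tok.ln[j] + 1:
                tok.queue.append(j)
        if tok.queue:
            nxt = tok.queue.popleft()
            self.token = None
            return nxt, tok
        return None


sites = [SKSite(0, 3, Token(3)), SKSite(1, 3), SKSite(2, 3)]
req = sites[1].request()
handed_over = sites[0].on_request(*req)  # idle holder sees a fresh request
print("token handed to site 1:", handed_over is not None)
```

A duplicate of an already-served REQUEST fails the freshness test, because by then the token's `ln` entry has caught up with the sequence number it carries.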

RAYMOND: token-based tree. Logical k-ary directed tree, root = token holder. Each node has a Holder pointer, which points to its parent on the path to the current root (itself if it holds the token), and a FIFO queue $request\_q$ of pending requests (its own and from children).

Raymond: algorithm. *Request CS*: place own request in $request\_q$. If not token-holder and $request\_q$ was previously empty, send REQUEST to Holder. *Non-root receives REQUEST*: place in $request\_q$. If no prior REQUEST sent up, send REQUEST to Holder. *Root receives REQUEST*: if the root is not in CS, send token to requester; set Holder := requester. *On receiving token*: pop head of $request\_q$; if popped entry is self, enter CS; else forward token to that node and set Holder := that node. If $request\_q$ is still non-empty, send REQUEST upward (to new parent). *Release CS*: same as receiving-token logic for next in queue.

Raymond: metrics. Average messages per CS $O(\log N)$ in a balanced tree. SD $= (T \log N)/2$.
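A compact simulation of Raymond's tree algorithm on a 7-node balanced tree. It is a sketch under stated assumptions: messages are delivered one at a time from a single in-memory queue, and the names `RaymondSim`, `request_q`, `asked` and `using` are chosen for the example (`asked` records that a REQUEST has already been sent toward the holder).

```python
from collections import deque


class Node:
    def __init__(self, name, holder):
        self.name = name
        self.holder = holder      # neighbour toward the token, or own name
        self.using = False        # currently in the CS
        self.asked = False        # a REQUEST was already sent toward holder
        self.request_q = deque()  # pending requests (self or neighbours)


class RaymondSim:
    def __init__(self, holders):
        self.nodes = {n: Node(n, h) for n, h in holders.items()}
        self.inbox = deque()      # (kind, src, dst) messages in flight
        self.sent = 0             # total messages sent so far

    def _send(self, kind, src, dst):
        self.sent += 1
        self.inbox.append((kind, src, dst))

    def _step(self, node):
        # Pass the privilege on if this node holds the token and is idle.
        if node.holder == node.name and not node.using and node.request_q:
            node.holder = node.request_q.popleft()
            node.asked = False
            if node.holder == node.name:
                node.using = True
            else:
                self._send("TOKEN", node.name, node.holder)
        # Ask for the token if someone is waiting and no REQUEST is outstanding.
        if node.holder != node.name and node.request_q and not node.asked:
            node.asked = True
            self._send("REQUEST", node.name, node.holder)

    def _drain(self):
        while self.inbox:
            kind, src, dst = self.inbox.popleft()
            node = self.nodes[dst]
            if kind == "REQUEST":
                node.request_q.append(src)
            else:                 # TOKEN: this node is now the holder
                node.holder = dst
            self._step(node)

    def request_cs(self, name):
        node = self.nodes[name]
        node.request_q.append(name)
        self._step(node)
        self._drain()

    def release_cs(self, name):
        node = self.nodes[name]
        node.using = False
        self._step(node)
        self._drain()


# Assumed tree: 1 is the root and token holder; 2, 3 are its children;
# 4, 5 hang under 2 and 6, 7 under 3.
sim = RaymondSim({1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3})
sim.request_cs(4)
print("node 4 in CS:", sim.nodes[4].using, "messages:", sim.sent)
```

A leaf at depth 2 costs two REQUEST hops up and two TOKEN hops down, which is where the logarithmic message count in a balanced tree comes from.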

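Comparison table (memorise, high-yield).

| Algorithm | Type | Msgs per CS | SD | Note |
|---|---|---|---|---|
| Lamport | non-token | $3(N-1)$ | $T$ | FIFO required |
| Ricart-Agrawala | non-token | $2(N-1)$ | $T$ | no FIFO |
| Maekawa V1 | quorum | $3\sqrt{N}$ | $2T$ | deadlock possible |
| Maekawa V2 | quorum | up to $5\sqrt{N}$ | $2T$ | deadlock-free |
| Suzuki-Kasami | token | $0$ or $N$ | $0$ or $T$ | broadcast REQUEST |
| Raymond | token tree | $O(\log N)$ | $(T \log N)/2$ | balanced tree |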
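To get a feel for how the per-CS message counts in these notes diverge as $N$ grows, the counts can be tabulated for a few system sizes. The function name is hypothetical, and the Raymond entry uses $\log_2 N$ only as an order-of-magnitude stand-in for $O(\log N)$.

```python
import math


def messages_per_cs(n: int) -> dict:
    """Per-CS message counts for n sites, using the formulas from the notes."""
    root = math.sqrt(n)
    return {
        "Lamport": 3 * (n - 1),
        "Ricart-Agrawala": 2 * (n - 1),
        "Maekawa V1": 3 * root,
        "Maekawa V2 (worst case)": 5 * root,
        "Suzuki-Kasami (token not held)": n,
        "Raymond (order of)": math.log2(n),  # stand-in for O(log N)
    }


for n in (16, 64, 256):
    print(f"N = {n}")
    for name, count in messages_per_cs(n).items():
        print(f"  {name:32s} {count:8.1f}")
```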

Why Lamport needs FIFO but Ricart-Agrawala doesn't. Lamport relies on the order REQUEST → RELEASE: if RELEASE could overtake an earlier REQUEST from the same sender, sites would remove a request that hasn't yet been queued. FIFO is also what lets L1 conclude that any earlier REQUEST from a site has already arrived once a later-timestamped message from that site is seen. Ricart-Agrawala has no RELEASE at all; deferred REPLYs implicitly serialise things by timestamp.

**Why Maekawa picks $K = \sqrt{N}$.** Cost $\approx 3K$ msgs (request + reply + release within the quorum). Quorum intersection forces $N \le K(K-1) + 1$ (Fisher inequality), so $K \ge \sqrt{N}$. Minimum $K \approx \sqrt{N}$.

**Why Suzuki-Kasami's $RN_j[i] = LN[i] + 1$.** $LN[i]$ = last executed seq num for $S_i$. $RN_j[i]$ = max seq num from $S_i$ seen by $S_j$. Equality confirms the request is *fresh* (immediately next), not a stale duplicate from a delayed message. It filters out outdated requests so the token isn't sent for a request that's already been served.

Token-based DME: disadvantages. Token loss requires regeneration. Token-holder failure halts the system. Suzuki-Kasami broadcasts → doesn't scale beyond moderate $N$. Raymond: all requests funnel through the root → root bottleneck.

Centralised mutex — why not preferred in DS. Single coordinator = single point of failure. Communication bottleneck. High latency for far-away nodes. Coordinator crash blocks everything.

Definitions

  • DME (safety / liveness / fairness): Safety: ≤ 1 in CS. Liveness: every requester eventually enters. Fairness: served in timestamp arrival order.
  • Synchronisation delay (SD): Time after one site leaves the CS before the next enters. Determines throughput as $1/(SD + E)$.
  • Lamport DME: Non-token. FIFO required. Per-site request queue. Entry: L1 (later-ts msg from every site) ∧ L2 (own at top). 3(N-1) msgs per CS.
  • Ricart-Agrawala: Non-token. No FIFO. REQUEST + REPLY only; REPLY deferred when own ts is smaller. 2(N-1) msgs per CS.
  • Roucairol-Carvalho: Optimisation of R-A: once you have a REPLY from $S_j$, don't re-request from $S_j$ unless you've replied to $S_j$ in between. 0 to 2(N-1) msgs.
  • Maekawa quorum: Request set $R_i$ with $|R_i| = K$, every pair $R_i \cap R_j \neq \emptyset$, optimum $K \approx \sqrt{N}$. Quorum-based permission.
  • Maekawa V2 messages: FAILED (replied to higher), INQUIRE (higher arrived; ask 'in CS?'), YIELD (relinquish lock after FAILED). Breaks V1's cyclic deadlock.
  • Suzuki-Kasami: Token-based with broadcast REQUEST. Token holds Q + LN[]. Each site has RN[]. Token sent only on fresh request ($RN_j[i] = LN[i] + 1$).
  • Raymond's tree algorithm: Token-based on a logical tree. Holder pointer at each node points toward root. Token migrates along Holder chain. $O(\log N)$ msgs in balanced tree.
  • Token-based vs non-token DME: Token: a single privilege passes around; requires fault-tolerant token. Non-token: each request asks permission from peers; no token to lose, but more messages per CS.

Formulas

Derivations

Lamport DME safety. Suppose $S_i$ and $S_j$ are both in CS simultaneously. Both have own requests at top of their queues. WLOG $ts_i < ts_j$. By L1, $S_j$ has received a message from $S_i$ with timestamp $> ts_j$. By FIFO, this means $S_i$'s REQUEST (which has $ts_i < ts_j$) has been queued at $S_j$. Since $ts_i < ts_j$, $S_j$'s request is below $S_i$'s in the queue: contradiction with $S_j$'s L2.

**Why R-A is $2(N-1)$ msgs.** REQUEST to $N-1$ peers + REPLY from $N-1$ peers (eventually). No RELEASE. Total $2(N-1)$.

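**Maekawa quorum intersection ⇒ $K \ge \sqrt{N}$.** Fix one quorum $R_i$ of size $K$. Every other quorum must share at least one member with $R_i$. Each member of $R_i$ belongs to $D = K$ quorums, one of which is $R_i$ itself, so it can lie in at most $K - 1$ other quorums. Counting $R_i$ plus all the quorums reachable through its members gives $N \le K(K-1) + 1 \le K^2$. Hence $K \ge \sqrt{N}$.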

Examples

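  • Lamport trace for 3 sites. $S_i$ wants CS at time $t_i$. $S_j$ wants at $t_j > t_i$. $S_i$ broadcasts REQUEST$(t_i, i)$. $S_j, S_k$ queue it and REPLY. $S_j$ broadcasts REQUEST$(t_j, j)$. $S_i, S_k$ queue it and reply. $S_i$'s queue: $[(t_i, i), (t_j, j)]$. $(t_i, i)$ at top + L1 satisfied → $S_i$ enters. RELEASE. Queue: $[(t_j, j)]$. $S_j$ enters next.
  • Ricart-Agrawala defer. $S_i$ requests with timestamp $ts_i$; $S_k$ replies (idle). $S_j$ requests with $ts_j > ts_i$; $S_i$ defers because $ts_i < ts_j$ and $S_i$ has its own request pending. After $S_i$ leaves CS, $S_i$ sends the deferred REPLY to $S_j$. $S_j$ enters.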
  • **Maekawa quorum example .** Pick . One valid quorum design: , , — every pair shares 1. Each node in quorums (designed so total ).
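  • Suzuki-Kasami token transfer. $S_i$ holds token. $S_j$ requests: $RN_j[j]$++, broadcast REQUEST$(j, sn)$. $S_i$ receives, updates $RN_i[j] = sn$. $LN[j] = sn - 1$ (previous). $RN_i[j] = LN[j] + 1$ ✓ fresh → send token to $S_j$ if $S_i$ is idle.
  • Raymond tree, 7-node balanced tree, $P_4$ holds token. Another node requests CS: it pushes itself to its $request\_q$ (empty before) and sends REQUEST to its Holder (its neighbour on the path toward $P_4$). REQUEST flows along the Holder pointers to the token holder; the holder forwards the token back to the requester along the same path; Holder pointers update along the path.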

Diagrams

  • Comparison table: rows = Lamport / R-A / Maekawa-V2 / Suzuki-Kasami / Raymond; cols = type, msgs, SD, assumption, fault note.
  • Lamport request flow: 3-process timeline with REQUEST, REPLY, RELEASE arrows + per-site queues at each instant.
  • Maekawa quorum diagram: with ; show three quorums and their pairwise intersections.
  • Maekawa V2 deadlock fix: arrows for FAILED, INQUIRE, YIELD.
  • Suzuki-Kasami token state: token's Q + LN[] visualised as data structures.
  • Raymond tree with Holder pointers all converging to root token-holder.

Edge cases

  • Lamport without FIFO breaks: RELEASE may overtake REQUEST, removing a queue entry that hasn't been added.
  • Maekawa V1 deadlock in cyclic locking; V2 with FAILED/INQUIRE/YIELD breaks the cycle.
  • Token loss in Suzuki-Kasami / Raymond → need regeneration protocol (election + reset of the token state Q, LN[] / Holder pointers).
  • Suzuki-Kasami broadcast doesn't scale to thousands of nodes.
  • Raymond root bottleneck if request rate is high near the root.
  • Tied timestamps — break with process ID; without tie-break, fairness breaks.
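The tie-break by process ID amounts to comparing `(clock, site_id)` pairs lexicographically, which Python tuples already do. A minimal sketch, with a hypothetical `LamportClock` helper and made-up request values:

```python
class LamportClock:
    """Scalar logical clock: tick on local events, jump ahead on receive."""

    def __init__(self):
        self.time = 0

    def tick(self):
        self.time += 1
        return self.time

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two requests share clock value 5; the site ID decides between them.
requests = [(5, 3), (5, 1), (4, 2)]
ordered = sorted(requests)
print(ordered)

clock = LamportClock()
clock.tick()          # local event
clock.receive(7)      # message stamped 7 arrives
print(clock.time)
```

Without the second tuple component, the two requests stamped 5 would be unordered and different sites could serve them in different orders.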

Common mistakes

  • Confusing message counts: Lamport = 3(N-1), R-A = 2(N-1), Maekawa V1 = 3√N, V2 = up to 5√N, S-K = 0 or N, Raymond = O(log N). Memorise the table.
  • Saying 'R-A needs FIFO'. No — R-A's beauty is that it doesn't.
  • Forgetting L1 + L2 are BOTH required in Lamport. Either alone is insufficient.
  • Saying 'Maekawa V1 is deadlock-free'. No — V1 can deadlock; V2 fixes it.
  • Suzuki-Kasami token send condition is $RN_j[i] = LN[i] + 1$ (strict equality), not $\ge$ or $>$.

Shortcuts

  • Lamport: 3(N-1) msgs, FIFO, L1+L2 entry.
  • R-A: 2(N-1) msgs, no FIFO, deferred REPLY.
  • Maekawa: K=D=√N, V1 deadlocks → V2 FAILED/INQUIRE/YIELD, SD = 2T.
  • **S-K: 0 or N msgs, broadcast REQ + fresh-only token, $RN_j[i] = LN[i] + 1$.**
  • Raymond: O(log N), Holder pointer, root bottleneck.

Proofs / Algorithms

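Lamport DME safety (formal). Assume two sites $S_i, S_j$ are in CS at the same time with $ts_i < ts_j$. Both satisfy L2 (own request at top), so $S_j$ has $(ts_j, j)$ at the top of its queue. L1 at $S_j$ requires a message from $S_i$ with timestamp $> ts_j$; since $S_i$'s REQUEST carries $ts_i < ts_j$ and was sent earlier, FIFO guarantees that this REQUEST has already been queued at $S_j$. The only way it could be missing is that $S_i$'s RELEASE has also arrived, but then $S_i$ has already left the CS, contradicting the assumption. Otherwise $(ts_i, i)$ sits above $(ts_j, j)$ in $S_j$'s queue, contradicting L2 at $S_j$.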

Ricart-Agrawala safety. $S_i$ enters CS after REPLY from all $S_j$. If $S_j$ has a higher-priority pending request, $S_j$ would have deferred $S_i$'s REPLY until $S_j$ leaves CS, so $S_i$ couldn't have all REPLYs while $S_j$ is in CS. Hence at most one site in CS.

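Maekawa intersection-based safety. Two simultaneous requesters $S_i, S_j$ each need an ACK from every member of their own quorum ($R_i$ and $R_j$). But $R_i \cap R_j \neq \emptyset$: some site $S_k$ is in both. $S_k$ can only ACK one at a time. So only one of $S_i, S_j$ gets full ACKs and enters CS.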

Suzuki-Kasami safety. Only the token-holder can enter CS. Token is unique (single token in system). Hence safety.