Revision Notes/Unit 5 — Distributed Mutual Exclusion/Lamport, Ricart-Agrawala, Maekawa, Suzuki-Kasami, Raymond — Complete Comparison

Lamport, Ricart-Agrawala, Maekawa, Suzuki-Kasami, Raymond — Complete Comparison


Intuition

Mutual exclusion in a distributed system is the same idea as a critical section in OS — except there's no shared memory, no semaphore, no central arbitrator. Everything is messages. Five algorithms attack the problem in two families: non-token (request-permission from peers — Lamport, R-A, Maekawa) and token (a single token grants entry — Suzuki-Kasami, Raymond). Trade-offs: message count, synchronisation delay, FIFO requirement, fault behaviour.

Explanation

Three required properties of any DME algorithm. Safety — at any instant, at most one process in the CS. Liveness — no endless wait; every request eventually granted in finite time. Fairness — CS requests granted in order of their logical-clock arrival (timestamp order).

Four performance metrics. Message complexity — messages per CS invocation. Synchronisation delay (SD) — time after one site leaves CS before the next enters. Response time — time from a request being made until CS execution completes. System throughput $= 1/ (SD + E)$ where $E$ = average CS execution time.
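
As a quick numeric check of the throughput formula, the sketch below plugs in sample values. The function name `throughput` and the numbers (10 ms message delay, 40 ms in the CS) are illustrative assumptions, not figures from these notes.

```python
def throughput(sd: float, e: float) -> float:
    """System throughput = 1 / (SD + E), in CS entries per unit time."""
    if sd + e <= 0:
        raise ValueError("SD + E must be positive")
    return 1.0 / (sd + e)


# Assumed values for illustration only: T = 10 ms, E = 40 ms.
T, E = 0.010, 0.040
rate_sd_T = throughput(T, E)        # SD = T, about 20 CS entries per second
rate_sd_2T = throughput(2 * T, E)   # SD = 2T, about 16.7 CS entries per second
```

Doubling SD costs less than half the throughput here because $E$ dominates the denominator; with a very short CS the penalty approaches a factor of two.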

Two algorithm families. Non-token (permission-based): Lamport, Ricart-Agrawala, Maekawa. Token-based: Suzuki-Kasami (broadcast request, token holds queue), Raymond (token migrates along tree).

LAMPORT — assumptions. FIFO channels, bidirectional, scalar (Lamport) timestamps, no failures.

Lamport — algorithm. *Request*: $S_{i}$ broadcasts $REQUEST (t s_{i}, i)$ to all; places own request in $\mathit{request\_queue}_{i}$ . *Receive REQUEST*: $S_{k}$ puts the request in its queue and sends a timestamped REPLY to $S_{i}$ . *Enter CS*: $S_{i}$ enters when (L1) it has received a message with timestamp $> (t s_{i}, i)$ from every other site (need not be a REPLY — any later message works), AND (L2) its own request is at the top of its queue. *Release*: $S_{i}$ removes its request from queue and broadcasts RELEASE; on receipt, every $S_{j}$ removes $S_{i}$ 's request from its queue.

Lamport — metrics. Messages = $3 (N - 1)$ per CS (REQUEST + REPLY + RELEASE to each of $N - 1$ peers). SD = max message transmission time $T$ . Requests granted in increasing timestamp order — strong fairness.
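
The entry test L1 ∧ L2 can be sketched as per-site state plus two handlers. This is a minimal sketch, not a full implementation: the class `LamportSite`, its method names, and the timestamps in the replay are assumptions made for illustration, and message transport and clock maintenance are left out.

```python
import heapq


class LamportSite:
    """State kept by one site; the network itself is not modelled."""

    def __init__(self, site_id, peers):
        self.id = site_id
        self.queue = []                            # min-heap of (ts, site)
        self.latest = {p: (0, p) for p in peers}   # newest (ts, site) heard per peer
        self.own = None                            # this site's pending request

    def request(self, ts):
        self.own = (ts, self.id)
        heapq.heappush(self.queue, self.own)

    def on_message(self, kind, ts, sender):
        # Any message type (REQUEST, REPLY, RELEASE) can satisfy L1.
        self.latest[sender] = max(self.latest[sender], (ts, sender))
        if kind == "REQUEST":
            heapq.heappush(self.queue, (ts, sender))
        elif kind == "RELEASE":
            self.queue = [r for r in self.queue if r[1] != sender]
            heapq.heapify(self.queue)

    def can_enter(self):
        if self.own is None:
            return False
        l1 = all(seen > self.own for seen in self.latest.values())
        l2 = self.queue[0] == self.own
        return l1 and l2


# Three sites; S1 requested at (2, 1), S2 at (3, 2). View from S2.
s2 = LamportSite(2, peers=[1, 3])
s2.on_message("REQUEST", 2, 1)   # S1's request arrives first
s2.request(3)                    # S2's own request (3, 2)
s2.on_message("REPLY", 4, 3)
s2.on_message("REPLY", 4, 1)
blocked = s2.can_enter()         # L1 holds, but (2, 1) is still at the top
s2.on_message("RELEASE", 5, 1)
entered = s2.can_enter()         # now both L1 and L2 hold
```

In the replay, S2 has heard later-timestamped messages from everyone (L1) yet still cannot enter until S1's RELEASE removes `(2, 1)` from the queue, which is why neither condition is sufficient alone.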

RICART-AGRAWALA — improvement over Lamport. Eliminates RELEASE messages. A site need not REPLY immediately — it *defers* a REPLY when it has a higher-priority (smaller-ts) request of its own.

Ricart-Agrawala — algorithm. *Request*: broadcast timestamped REQUEST. *Receive REQUEST* at $S_{k}$ : send REPLY if $S_{k}$ is idle OR $S_{k}$ is requesting but $S_{i}$ 's request has smaller timestamp. Else *defer* (set $R D_{k} [i] = 1$ ). *Enter CS*: $S_{i}$ enters after receiving REPLY from all $N - 1$ sites. *Release*: for every $j$ with $R D_{i} [j] = 1$ , send REPLY to $S_{j}$ ; reset $R D_{i} [j] = 0$ .

Ricart-Agrawala — metrics + advantage. $2 (N - 1)$ messages per CS. Does not require FIFO. SD = $T$ .
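
The reply-or-defer rule is easy to get backwards, so here is a small sketch of it. The class `RASite`, the state names `IDLE`/`WANTED`/`HELD`, and the tuple message format are assumptions for illustration; the set `deferred` plays the role of $R D_{i}$.

```python
class RASite:
    def __init__(self, site_id, n):
        self.id, self.n = site_id, n
        self.state = "IDLE"        # IDLE, WANTED or HELD
        self.ts = None
        self.replies = set()
        self.deferred = set()      # j in deferred  <=>  RD_i[j] = 1

    def request(self, ts):
        self.state, self.ts, self.replies = "WANTED", ts, set()
        return [("REQUEST", ts, self.id, j)
                for j in range(1, self.n + 1) if j != self.id]

    def on_request(self, ts, sender):
        # Defer if we are in the CS, or we want it and our request is older.
        own_first = self.state == "HELD" or (
            self.state == "WANTED" and (self.ts, self.id) < (ts, sender))
        if own_first:
            self.deferred.add(sender)
            return None
        return ("REPLY", self.id, sender)

    def on_reply(self, sender):
        self.replies.add(sender)
        if self.state == "WANTED" and len(self.replies) == self.n - 1:
            self.state = "HELD"

    def release(self):
        self.state = "IDLE"
        out = [("REPLY", self.id, j) for j in sorted(self.deferred)]
        self.deferred.clear()
        return out


# Two sites: S1 requests at (2, 1), S2 at (3, 2).
s1, s2 = RASite(1, 2), RASite(2, 2)
sent = s1.request(2)
first_reply = s2.on_request(2, 1)    # S2 is idle, so it replies at once
s2.request(3)
deferred_reply = s1.on_request(3, 2) # S1's request is older, so it defers
s1.on_reply(2)                       # S1 now holds the CS
released = s1.release()              # the deferred REPLY goes out on exit
s2.on_reply(1)
```

Each CS entry in this run costs one REQUEST and one REPLY per peer, matching $2 (N - 1)$ with $N = 2$.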

Roucairol-Carvalho optimisation. Once $S_{i}$ has received a REPLY from $S_{j}$ , $S_{i}$ need not send a REQUEST to $S_{j}$ again to re-enter CS — unless $S_{i}$ has since sent a REPLY to $S_{j}$ . Message complexity varies from $0$ to $2 (N - 1)$ . Worst-case unchanged.

MAEKAWA — quorum key idea. Each site $S_{i}$ has a Request Set $R_{i}$ (quorum); permission is needed only from the members of $R_{i}$ . Requirements: (i) $i \in R_{i}$ ; (ii) $\forall i, j : R_{i} \cap R_{j} \neq \emptyset$ (every two quorums intersect, and a common member arbitrates between them); (iii) $|R_{i}| = K$ for all $i$ ; (iv) every node is in exactly $D$ quorums. Optimum: $K = D \approx \sqrt{N}$ (the smallest quorum size that still satisfies intersection, which minimises messages per CS).
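
The four quorum requirements can be checked mechanically. The sketch below is a hypothetical helper (`check_quorums` is not part of Maekawa's algorithm), run on a seven-site design with $K = D = 3$ and on a deliberately broken four-site design.

```python
from itertools import combinations


def check_quorums(quorums):
    """quorums maps each site id to its request set R_i (a set of site ids)."""
    sites = list(quorums)
    sizes = {len(q) for q in quorums.values()}
    load = {s: sum(s in q for q in quorums.values()) for s in sites}
    return {
        "self_in_quorum": all(i in q for i, q in quorums.items()),       # (i)
        "pairwise_intersect": all(bool(quorums[i] & quorums[j])
                                  for i, j in combinations(sites, 2)),   # (ii)
        "equal_size": len(sizes) == 1,                                   # (iii)
        "equal_load": len(set(load.values())) == 1,                      # (iv)
    }


# N = 7, K = D = 3: every pair of sets shares exactly one site.
seven = {
    1: {1, 2, 3}, 2: {2, 4, 6}, 3: {3, 5, 6}, 4: {1, 4, 5},
    5: {2, 5, 7}, 6: {1, 6, 7}, 7: {3, 4, 7},
}
good = check_quorums(seven)

# A ring of pairs: R_1 and R_3 are disjoint, so safety would be lost.
ring = {1: {1, 2}, 2: {2, 3}, 3: {3, 4}, 4: {4, 1}}
bad = check_quorums(ring)
```

The ring design passes (i), (iii) and (iv) but fails (ii), which shows that equal size and equal load do not imply intersection.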

Maekawa V1 protocol. Send timestamped REQUEST to all in $R_{i}$ . On receiving a request: if unlocked, send ACK to the requester and lock yourself to it; otherwise queue the request. Enter CS after ACK from every member of $R_{i}$ . To exit: send RELEASE to all in $R_{i}$ . Recipient unlocks; sends ACK to the lowest-timestamp request in its queue and locks to it. Messages: $3 \sqrt{N}$ per CS ( $\sqrt{N}$ each of REQUEST, ACK, RELEASE).

Why Maekawa V1 deadlocks. Cyclic locking among quorums. Three sites $S_{1}, S_{2}, S_{3}$ with overlapping quorums can each lock one member of another's quorum, so that every site waits on a member locked by the next: a cycle, hence deadlock. V2 fix adds three messages: FAILED — sent when reply already given to a higher-priority request. INQUIRE — sent by $S_{k}$ when a higher-priority request arrives; asks the previously-acked site 'are you in CS?'. YIELD — sent by a process that received a FAILED elsewhere, relinquishing its lock so $S_{k}$ can grant to the higher-priority one. V2 messages: up to $5 \sqrt{N}$ . SD = $2 T$ (double Lamport / R-A).

SUZUKI-KASAMI — token-based with broadcast request. Data structures: Token contains a FIFO queue $Q$ of pending requestor IDs, and $L N [1.. n]$ where $L N [j]$ = sequence number of most recently executed request of $P_{j}$ . Each site $P_{i}$ maintains $R N_{i} [1.. n]$ — the largest sequence number it has seen in any REQUEST from $P_{j}$ .

Suzuki-Kasami — algorithm. *Request CS*: if $P_{i}$ doesn't hold token, $R N_{i} [i]$ ++ and broadcast $REQUEST (i, R N_{i} [i])$ . *Receive REQUEST $(i, s n)$ at $P_{j}$ *: $R N_{j} [i] := max (R N_{j} [i], s n)$ . If $P_{j}$ has the token and is idle and $R N_{j} [i] = L N [i] + 1$ (this is a fresh, not outdated request), send token to $P_{i}$ . *Enter CS*: on receiving token. *Release CS*: $L N [i] := R N_{i} [i]$ . For every $j$ not in $Q$ , if $R N_{i} [j] = L N [j] + 1$ , append $j$ to $Q$ . If $Q$ non-empty, pop head and send token to it.

Suzuki-Kasami — metrics. Messages = $0$ (if already holds token) or $N$ (broadcast REQUEST + token transfer = $N - 1 + 1 = N$ ). SD = $0$ or max msg delay. No starvation. The condition $R N_{j} [i] = L N [i] + 1$ filters outdated requests.
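
The interplay of $R N$, $L N$ and $Q$ is easier to follow in code. The following is a minimal sketch under assumptions: the classes `Token` and `SKSite` and their method names are invented for illustration, arrays are 1-indexed, and the caller is responsible for delivering the returned messages.

```python
from collections import deque


class Token:
    def __init__(self, n):
        self.Q = deque()            # pending requester ids, FIFO
        self.LN = [0] * (n + 1)     # LN[j]: last served request of P_j


class SKSite:
    def __init__(self, site_id, n, token=None):
        self.id, self.n = site_id, n
        self.RN = [0] * (n + 1)     # RN[j]: highest request number seen from P_j
        self.token = token
        self.in_cs = False

    def request_cs(self):
        if self.token is not None:  # already holds the token: 0 messages
            self.in_cs = True
            return None
        self.RN[self.id] += 1
        return (self.id, self.RN[self.id])   # REQUEST(i, sn) to broadcast

    def on_request(self, i, sn):
        """Return the token if it should be sent to P_i, else None."""
        self.RN[i] = max(self.RN[i], sn)
        fresh = self.token is not None and self.RN[i] == self.token.LN[i] + 1
        if fresh and not self.in_cs:
            tok, self.token = self.token, None
            return tok
        return None

    def on_token(self, token):
        self.token, self.in_cs = token, True

    def release_cs(self):
        """Return (destination, token) if the token moves on, else None."""
        self.in_cs = False
        tok = self.token
        tok.LN[self.id] = self.RN[self.id]
        for j in range(1, self.n + 1):
            if j not in tok.Q and self.RN[j] == tok.LN[j] + 1:
                tok.Q.append(j)
        if tok.Q:
            self.token = None
            return tok.Q.popleft(), tok
        return None


# P1 holds the token and is in the CS when P2 and P3 request.
p1, p2, p3 = SKSite(1, 3, Token(3)), SKSite(2, 3), SKSite(3, 3)
p1.request_cs()
req2 = p2.request_cs()
busy = p1.on_request(*req2)          # P1 is in the CS, so no token yet
req3 = p3.request_cs()
p1.on_request(*req3)
dest, tok = p1.release_cs()          # P2 is queued first, P3 stays in Q
p2.on_token(tok)
dest2, tok = p2.release_cs()
p3.on_token(tok)
p3.release_cs()                      # nobody waiting: P3 keeps the token
stale = p3.on_request(2, 1)          # delayed duplicate of P2's old request
```

The last call shows the filter at work: `RN[2]` is 1 and `LN[2]` is already 1, so the equality fails and the token stays where it is.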

RAYMOND — token-based tree. Logical k-ary directed tree, root = token holder. Each node has: a Holder pointer, which points to its parent on the path to the current root (itself if it holds the token), and a FIFO queue $Q_{i}$ of pending requests (its own and from children).

Raymond — algorithm. *Request CS*: place own request in $Q_{i}$ . If not token-holder and $Q_{i}$ was previously empty, send REQUEST to Holder. *Non-root receives REQUEST*: place in $Q_{j}$ . If no prior REQUEST sent up, send REQUEST to Holder. *Root receives REQUEST*: send token to requester; set Holder := requester. *On receiving token*: pop head of $Q$ ; if popped entry is self, enter CS; else forward token to that node and set Holder := that node. If $Q$ still non-empty, send REQUEST upward (to new parent). *Release CS*: same as receiving-token logic for next in queue.

Raymond — metrics. Average $O (\log N)$ messages per CS in a balanced tree. SD $= (\log N) / 2 \cdot T$ .
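
A small simulation makes the Holder updates concrete. This is a sketch under stated assumptions: the class `RaymondNode`, the `asked` flag (meaning a REQUEST has already been sent upward), the shared `net` queue standing in for the network, and the tree layout (1 at the top, children 2 and 3, then 4 and 5 under 2, 6 and 7 under 3) are all illustrative choices.

```python
from collections import deque


class RaymondNode:
    def __init__(self, node_id, holder, net):
        self.id, self.holder, self.net = node_id, holder, net
        self.q = deque()        # pending requests: own id or a neighbour's id
        self.using = False      # currently in the CS
        self.asked = False      # a REQUEST has already been sent to Holder

    def _assign_privilege(self):
        if self.holder == self.id and not self.using and self.q:
            self.holder = self.q.popleft()
            self.asked = False
            if self.holder == self.id:
                self.using = True
            else:
                self.net.append(("TOKEN", self.id, self.holder))

    def _make_request(self):
        if self.holder != self.id and self.q and not self.asked:
            self.net.append(("REQUEST", self.id, self.holder))
            self.asked = True

    def _step(self):
        self._assign_privilege()
        self._make_request()

    def request_cs(self):
        self.q.append(self.id)
        self._step()

    def release_cs(self):
        self.using = False
        self._step()

    def receive(self, kind, src):
        if kind == "REQUEST":
            self.q.append(src)
        else:                   # TOKEN: this node is now the root
            self.holder = self.id
        self._step()


def run(nodes, net):
    """Deliver messages until the network is empty; return how many were sent."""
    delivered = 0
    while net:
        kind, src, dst = net.popleft()
        nodes[dst].receive(kind, src)
        delivered += 1
    return delivered


# Token starts at node 4, so every Holder pointer leads towards 4.
net = deque()
holders = {1: 2, 2: 4, 3: 1, 4: 4, 5: 2, 6: 3, 7: 3}
nodes = {i: RaymondNode(i, h, net) for i, h in holders.items()}
nodes[3].request_cs()
total = run(nodes, net)     # 3 REQUESTs up the path, 3 TOKEN hops back
```

After the run, node 3 is in the CS and the pointers along the path 4, 2, 1, 3 have flipped to lead towards node 3, while nodes off the path (5, 6, 7) are untouched.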

Comparison table (memorise — high-yield).
Lamport — non-token, $3 (N - 1)$ msgs, SD $T$ , FIFO required.
Ricart-Agrawala — non-token, $2 (N - 1)$ msgs, SD $T$ , no FIFO.
Maekawa V1 — quorum, $3 \sqrt{N}$ msgs, SD $2 T$ , deadlock possible.
Maekawa V2 — quorum, up to $5 \sqrt{N}$ msgs, SD $2 T$ , deadlock-free.
Suzuki-Kasami — token, $0$ or $N$ msgs, SD $0$ or $T$ .
Raymond — token tree, $O (\log N)$ msgs, SD $(\log N) / 2 \cdot T$ .
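
To get a feel for how the counts diverge as $N$ grows, the sketch below evaluates the per-CS message formulas. The function name is hypothetical, the Maekawa entries use the $\sqrt{N}$ quorum size, and the Raymond entry uses $\log_2 N$ only as an order-of-magnitude stand-in for $O (\log N)$.

```python
import math


def messages_per_cs(n):
    """Per-CS message counts from the comparison, for n sites."""
    root = math.sqrt(n)
    return {
        "Lamport": 3 * (n - 1),
        "Ricart-Agrawala": 2 * (n - 1),
        "Maekawa V1": 3 * root,
        "Maekawa V2 (worst case)": 5 * root,
        "Suzuki-Kasami (token elsewhere)": n,
        "Raymond (balanced tree, approx.)": math.log2(n),
    }


counts = messages_per_cs(16)
for name, value in counts.items():
    print(f"{name}: {value:g}")
```

For 16 sites this gives 45 and 30 for the two broadcast permission schemes against 12 for Maekawa V1, which is the gap the quorum idea is meant to buy.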

Why Lamport needs FIFO but Ricart-Agrawala doesn't. Lamport relies on the order REQUEST → RELEASE — if RELEASE could overtake an earlier REQUEST from the same sender, sites would remove a request that hasn't yet been queued. Ricart-Agrawala has no RELEASE at all; deferred REPLYs implicitly serialise things by timestamp.

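**Why Maekawa picks $\sqrt{N}$ .** Cost $= K + K + K$ (request + reply + release in quorum) $= 3 K = O (K)$ msgs, so the goal is the smallest workable $K$ . Quorum intersection with $K = D$ forces $N \leq K (K - 1) + 1$ , hence $K^{2} \geq N$ and $K \geq \sqrt{N}$ . Minimum $K \approx \sqrt{N}$ .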

**Why Suzuki-Kasami's $R N_{j} [i] = L N [i] + 1$ .** $L N [i]$ = last executed seq num for $P_{i}$ . $R N_{j} [i]$ = max seen by $P_{j}$ . Equality confirms the request is *fresh* (immediately next) — not a stale duplicate from a delayed message. Filters out outdated requests so the token isn't sent for a request that's already been served.

Token-based DME — disadvantages. Token loss requires regeneration. Token-holder failure halts the system. Suzuki-Kasami broadcasts → doesn't scale beyond moderate $N$ . Raymond — all requests funnel through the root → root bottleneck.

Centralised mutex — why not preferred in DS. Single coordinator = single point of failure. Communication bottleneck. High latency for far-away nodes. Coordinator crash blocks everything.

Definitions

DME — safety / liveness / fairness — Safety: ≤ 1 in CS. Liveness: every requester eventually enters. Fairness: served in timestamp arrival order.
Synchronisation delay (SD) — Time after one site leaves the CS before the next enters. Determines throughput as $1/ (SD + E)$ .
Lamport DME — Non-token. FIFO required. Per-site request queue. Entry: L1 (later-ts msg from every site) ∧ L2 (own at top). 3(N-1) msgs per CS.
Ricart-Agrawala — Non-token. No FIFO. REQUEST + REPLY only; REPLY deferred when own ts is smaller. 2(N-1) msgs per CS.
Roucairol-Carvalho — Optimisation of R-A: once you have a REPLY from $S_{j}$ , don't re-request from $S_{j}$ unless you've replied to $S_{j}$ in between. 0 to 2(N-1) msgs.
Maekawa quorum — Request set $R_{i}$ with $|R_{i}| = K$ , every pair $R_{i} \cap R_{j} \neq \emptyset$ , optimum $K \approx \sqrt{N}$ . Quorum-based permission.
Maekawa V2 messages — FAILED (replied to higher), INQUIRE (higher arrived; ask 'in CS?'), YIELD (relinquish lock after FAILED). Breaks V1's cyclic deadlock.
Suzuki-Kasami — Token-based with broadcast REQUEST. Token holds Q + LN[]. Each site has RN[]. Token sent only on fresh request ( $R N [i] = L N [i] + 1$ ).
Raymond's tree algorithm — Token-based on a logical tree. Holder pointer at each node points toward root. Token migrates along Holder chain. $O (\log N)$ in balanced tree.
Token-based vs non-token DME — Token: a single privilege passes around; requires fault-tolerant token. Non-token: each request asks permission from peers; no token to lose, but more messages per CS.

Formulas

Lamport msgs: $3 (N - 1)$ per CS
Ricart-Agrawala msgs: $2 (N - 1)$
Maekawa V1: $3 \sqrt{N}$ ; V2: up to $5 \sqrt{N}$
Suzuki-Kasami: $0$ (holds token) or $N$
Raymond: $O (\log N)$ (balanced tree)
Throughput: $1/ (SD + E)$
Lamport entry: L1 (later-ts msg from every site) $\land$ L2 (own at top)
Maekawa quorum optimum: $K = D \approx \sqrt{N}$ , from $K^{2} \geq N$ (intersection)

Derivations

Lamport DME safety. Suppose $S_{i}, S_{j}$ are both in the CS simultaneously. Both have their own requests at the top of their queues (L2). WLOG $(t s_{i}, i) < (t s_{j}, j)$ . By L1 at $S_{j}$ , $S_{j}$ has received a message from $S_{i}$ with timestamp $> (t s_{j}, j)$ . $S_{i}$ 's REQUEST carries the smaller timestamp and was sent before that message, so FIFO guarantees it reached $S_{j}$ first and has been queued there. $S_{i}$ is still in the CS, so no RELEASE has removed it. Since $(t s_{i}, i) < (t s_{j}, j)$ , $S_{i}$ 's request sits above $S_{j}$ 's in $S_{j}$ 's queue, contradicting L2 at $S_{j}$ .

**Why R-A is $2 (N - 1)$ msgs.** REQUEST to $N - 1$ peers + REPLY from $N - 1$ peers (eventually). No RELEASE. Total $2 (N - 1)$ .

**Maekawa quorum intersection ⇒ $K^{2} \geq N$ .** Consider all $\binom{N}{2}$ pairs of quorums. Each pair must share at least one site. A site that belongs to $D$ quorums can be the common member for at most $\binom{D}{2}$ pairs, so $N \binom{D}{2} \geq \binom{N}{2}$ , i.e. $D (D - 1) \geq N - 1$ . With $K = D$ this gives $K (K - 1) + 1 \geq N$ , and since $K^{2} \geq K (K - 1) + 1$ , $K^{2} \geq N$ . Hence $K \geq \sqrt{N}$ .

Examples

Lamport trace for 3 sites. $S_{1}$ wants CS at time $(2, 1)$ . $S_{2}$ wants at $(3, 2)$ . $S_{1}$ broadcasts $REQUEST (2, 1)$ . $S_{2}, S_{3}$ queue it and REPLY. $S_{2}$ broadcasts $REQUEST (3, 2)$ . $S_{1}, S_{3}$ queue it and REPLY. $S_{1}$ 's queue: $[(2, 1) \to S_{1}, (3, 2) \to S_{2}]$ . $(2, 1)$ at top + L1 satisfied → $S_{1}$ enters. On exit $S_{1}$ broadcasts RELEASE and every site drops $(2, 1)$ . Queue: $[(3, 2) \to S_{2}]$ . $S_{2}$ enters next.
Ricart-Agrawala defer. $S_{1}$ requests $(2, 1)$ → $S_{2}$ replies (idle). $S_{2}$ requests $(3, 2)$ → $S_{1}$ defers because $(2, 1) < (3, 2)$ and $S_{1}$ has its own request pending. After $S_{1}$ leaves CS, $S_{1}$ sends deferred REPLY to $S_{2}$ . $S_{2}$ enters.
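**Maekawa quorum example $N = 7$ .** Pick $K = 3$ . One valid quorum design: $R_{1} = \{1, 2, 3\}$ , $R_{2} = \{2, 4, 6\}$ , $R_{3} = \{3, 5, 6\}$ , $R_{4} = \{1, 4, 5\}$ , $R_{5} = \{2, 5, 7\}$ , $R_{6} = \{1, 6, 7\}$ , $R_{7} = \{3, 4, 7\}$ . Every pair of quorums shares exactly one site (for example $R_{1} \cap R_{4} = \{1\}$ and $R_{2} \cap R_{3} = \{6\}$ ), each $R_{i}$ contains $i$ , and each node is in $D = 3$ quorums. The sizes fit $N = K (K - 1) + 1 = 7$ .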
Suzuki-Kasami token transfer. $P_{1}$ holds token. $P_{2}$ requests: increments $R N_{2} [2]$ to $5$ , broadcasts REQUEST $(2, 5)$ . $P_{1}$ receives, updates $R N_{1} [2] = 5$ . The token's $L N [2] = 4$ (previous request served). $5 = 4 + 1$ ✓ fresh → send token to $P_{2}$ if $P_{1}$ is idle.
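Raymond tree, 7-node balanced tree, $P_{4}$ holds token. $P_{3}$ requests CS: pushes self to $Q_{3}$ (empty before); sends REQUEST to its Holder (its neighbour on the path towards $P_{4}$ ). The REQUEST is relayed along Holder pointers until it reaches $P_{4}$ , the current root. The token then travels back along the same path to $P_{3}$ , and each node that passes it on sets its Holder to the node it sent the token to, so $P_{3}$ becomes the new root.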

Diagrams

Comparison table: rows = Lamport / R-A / Maekawa-V2 / Suzuki-Kasami / Raymond; cols = type, msgs, SD, assumption, fault note.
Lamport request flow: 3-process timeline with REQUEST, REPLY, RELEASE arrows + per-site queues at each instant.
Maekawa quorum diagram: $N = 7$ with $K = 3$ ; show three quorums and their pairwise intersections.
Maekawa V2 deadlock fix: arrows for FAILED, INQUIRE, YIELD.
Suzuki-Kasami token state: token's $Q$ + $L N []$ visualised as data structures.
Raymond tree with Holder pointers all converging to root token-holder.

Edge cases

Lamport without FIFO breaks: RELEASE may overtake REQUEST, removing a queue entry that hasn't been added.
Maekawa V1 deadlock in cyclic locking; V2 with FAILED/INQUIRE/YIELD breaks the cycle.
Token loss in Suzuki-Kasami / Raymond → need regeneration protocol (election + reset $L N []$ / Holder).
Suzuki-Kasami broadcast doesn't scale to thousands of nodes.
Raymond root bottleneck if request rate is high near the root.
Tied timestamps — break with process ID; without tie-break, fairness breaks.

Common mistakes

Confusing message counts: Lamport = 3(N-1), R-A = 2(N-1), Maekawa V1 = 3√N, V2 = up to 5√N, S-K = 0 or N, Raymond = O(log N). Memorise the table.
Saying 'R-A needs FIFO'. No — R-A's beauty is that it doesn't.
Forgetting L1 + L2 are BOTH required in Lamport. Either alone is insufficient.
Saying 'Maekawa V1 is deadlock-free'. No — V1 can deadlock; V2 fixes it.
Suzuki-Kasami token send condition is $R N [i] = L N [i] + 1$ (strict equality) — not $\geq$ or $\leq$ .

Shortcuts

Lamport: 3(N-1) msgs, FIFO, L1+L2 entry.
R-A: 2(N-1) msgs, no FIFO, deferred REPLY.
Maekawa: K=D=√N, V1 deadlocks → V2 FAILED/INQUIRE/YIELD, SD = 2T.
S-K: 0 or N msgs, broadcast REQ + fresh-only token, $R N [i] = L N [i] + 1$ .
Raymond: O(log N), Holder pointer, root bottleneck.

Proofs / Algorithms

Lamport DME safety (formal). Assume two sites $S_{i}, S_{j}$ are in the CS at the same time with $(t s_{i}, i) < (t s_{j}, j)$ . Both satisfy L2, so $S_{j}$ has $(t s_{j}, j)$ at the top of its queue. L1 at $S_{j}$ requires a message from $S_{i}$ with timestamp $> (t s_{j}, j)$ ; by FIFO, $S_{i}$ 's earlier REQUEST (broadcast with the smaller timestamp) must already have arrived and been queued at $S_{j}$ . The only way it could have left $S_{j}$ 's queue is a RELEASE from $S_{i}$ , but $S_{i}$ is still in the CS and has not sent one. So $(t s_{i}, i)$ is in $S_{j}$ 's queue ahead of $(t s_{j}, j)$ , contradicting L2 at $S_{j}$ .

Ricart-Agrawala safety. $S_{i}$ enters CS after REPLY from all $N - 1$ . If $S_{j}$ has a higher-priority pending request, $S_{j}$ would have deferred $S_{i}$ 's REPLY until $S_{j}$ leaves CS — so $S_{i}$ couldn't have all REPLYs while $S_{j}$ is in CS. Hence at most one site in CS.

Maekawa intersection-based safety. For two requesters $S_{i}, S_{j}$ to be in the CS simultaneously, each would need ACKs from all of its own quorum at the same time. But $R_{i} \cap R_{j} \neq \emptyset$ , so some site $S_{k}$ is in both. $S_{k}$ is locked to one requester at a time and can only ACK one of them. So only one of $S_{i}, S_{j}$ gets full ACKs and enters CS.

Suzuki-Kasami safety. Only the token-holder can enter CS. Token is unique (single token in system). Hence safety.


Distributed Systems