Logical Clocks (Scalar / Vector / Matrix) + Physical Time Sync
Time Without A Clock
The most surprising thing about distributed systems is that there's no shared "now". Each node has its own quartz crystal that ticks at a slightly different rate. Two events happen — was X before Y, after Y, or simultaneously? The honest answer is: we can't tell without a discipline.
Leslie Lamport's 1978 paper "Time, Clocks, and the Ordering of Events in a Distributed System" gave us that discipline. The trick: forget physical time. Define causality instead, then build clocks that respect it.
The Happened-Before Relation
Lamport's happened-before relation $\to$ has three rules:
1. Same-process order: if $a$ and $b$ are in the same process and $a$ occurs first, then $a \to b$.
2. Send precedes receive: $\mathrm{send}(m) \to \mathrm{receive}(m)$.
3. Transitive: $a \to b$ and $b \to c$ imply $a \to c$.
Two events that are NOT related by $\to$ (in either direction) are concurrent. Two ships passing in the night; neither caused the other.
Notice what's NOT in this definition: anything physical. We've defined a *partial order* on events purely from local sequencing + communication. Causality is now a graph property, not a wall-clock property.
Logical Clocks
A logical clock is a function $C$ that assigns timestamps to events such that $a \to b \Rightarrow C(a) < C(b)$. That's called consistency (or monotonicity). The stronger property — strong consistency — requires the converse too: $C(a) < C(b) \Rightarrow a \to b$.
This converse is what separates scalar clocks from vector/matrix clocks. Memorise this distinction; it's the most-asked exam trap on logical clocks.
Scalar (Lamport) Clocks
Each process $P_i$ keeps a single integer $C_i$.
- R1 (before any event): $C_i := C_i + d$ (usually $d = 1$).
- R2 (on receive of message with timestamp $C_{msg}$): $C_i := \max(C_i, C_{msg})$, then R1, then deliver.
To get a *total* order from this partial one, tie-break with the pair $(C_i, i)$ — timestamp first, then process ID, lex order.
Consistency, yes; strong consistency, no. The forward direction holds: if $a \to b$ via a chain of sends/recvs/local events, the clock values strictly increase along the chain. But the converse FAILS: say $P_1$ runs purely locally to $C_1 = 5$ while $P_2$ runs independently to $C_2 = 3$. We have $C_2 < C_1$, but the events are concurrent — no causal relationship. Scalar clocks lose that information.
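The two rules can be sketched in a few lines (a minimal in-memory sketch; class and method names are my own, not from the paper):

```python
class LamportClock:
    """Scalar (Lamport) clock for one process. Names are illustrative."""

    def __init__(self) -> None:
        self.c = 0

    def tick(self) -> int:
        # R1: increment before any local event (d = 1).
        self.c += 1
        return self.c

    def on_receive(self, msg_ts: int) -> int:
        # R2: max of local value and message timestamp, then R1, then deliver.
        self.c = max(self.c, msg_ts)
        return self.tick()

# Why scalar clocks are not strongly consistent: two processes that
# never communicate still produce comparable timestamps.
p1, p2 = LamportClock(), LamportClock()
for _ in range(5):
    p1.tick()          # five local events on P1 -> C = 5
p2.tick()              # one local event on P2  -> C = 1
# p2.c < p1.c, yet the events are concurrent, not causally ordered.
```

The demo at the bottom is exactly the converse-failure example from the text: ordered timestamps, no causal link.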
Vector Clocks
Each $P_i$ keeps a vector $V_i$ of length $n$ (one entry per process).
- R1 (before any event): $V_i[i] := V_i[i] + 1$.
- R2 (on receive with vector $V_{msg}$): $V_i[k] := \max(V_i[k], V_{msg}[k])$ for all $k$, then R1, then deliver.
Comparison: $V_a < V_b$ iff $V_a[k] \le V_b[k]$ componentwise AND $V_a \ne V_b$. If neither $V_a < V_b$ nor $V_b < V_a$ — concurrent.
Strong consistency: $a \to b \iff V_a < V_b$. The $j$-th component of $P_i$'s vector clock equals the number of events at $P_j$ causally preceding the current event. Vector clocks capture exactly the partial order of causality.
The cost: $O(n)$ storage per process, $O(n)$ integers per message. In a cluster of 10,000 nodes, that's heavy.
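In code, the rules and the comparison read as follows (a sketch; the helper names are mine):

```python
from typing import List

class VectorClock:
    """Vector clock for process `pid` among `n` processes (sketch)."""

    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        self.v = [0] * n

    def tick(self) -> None:
        # R1: increment own component before any local event.
        self.v[self.pid] += 1

    def stamp_send(self) -> List[int]:
        self.tick()
        return list(self.v)          # piggyback a copy on the message

    def on_receive(self, w: List[int]) -> None:
        # R2: componentwise max, then R1.
        self.v = [max(a, b) for a, b in zip(self.v, w)]
        self.tick()

def happened_before(a: List[int], b: List[int]) -> bool:
    # a < b iff a[k] <= b[k] everywhere and the vectors differ.
    return all(x <= y for x, y in zip(a, b)) and a != b

def concurrent(a: List[int], b: List[int]) -> bool:
    return not happened_before(a, b) and not happened_before(b, a)
```

`concurrent` is the concurrency-detection capability scalar clocks lack: neither vector dominates, so neither event caused the other.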
Singhal-Kshemkalyani Optimisation
Most of the time most components don't change. Send only the *changed* entries since the last send to that recipient. Each $P_i$ maintains $LS_i[j]$ = value of $V_i[i]$ when it last sent to $P_j$, and $LU_i[k]$ = value of $V_i[i]$ when $V_i[k]$ was last updated. On send to $P_j$, include only the pairs $\{(k, V_i[k]) : LU_i[k] > LS_i[j]\}$.
Requires FIFO channels — the recipient reconstructs the full vector incrementally; later partial-vectors arriving before earlier ones would corrupt that reconstruction.
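A sketch of the LS/LU bookkeeping (function names are my own; this is the per-message selection and merge, not a full protocol):

```python
from typing import Dict, List

def sk_payload(v_i: List[int], lu: List[int], ls_to_j: int) -> Dict[int, int]:
    """Sender side (Singhal-Kshemkalyani): include only entries whose
    last-update time lu[k] is newer than the last send to j (ls_to_j)."""
    return {k: v_i[k] for k in range(len(v_i)) if lu[k] > ls_to_j}

def sk_merge(v_j: List[int], payload: Dict[int, int]) -> List[int]:
    """Receiver side: merge the partial vector into the local one.
    Correct only over FIFO channels, as noted above."""
    for k, val in payload.items():
        v_j[k] = max(v_j[k], val)
    return v_j
```

For example, with $V_i = [7, 3, 5]$, $LU_i = [7, 2, 6]$ and a last send to $j$ at own-clock 4, only entries 0 and 2 travel.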
Matrix Clocks
Each $P_i$ keeps an $n \times n$ matrix $M_i$.
- $M_i[i][\cdot]$ = $P_i$'s own vector clock.
- $M_i[j][k]$ = $P_i$'s knowledge of $P_j$'s knowledge of $P_k$'s clock.
That's *second-order knowledge*: "what do I know about what J knows?" Useful when you need to safely discard information that's universally known.
Killer application — obsolete-message GC. Once $\min_j M_i[j][k] \ge t$, every process has been observed to have seen $P_k$'s clock pass $t$ — so messages from $P_k$ with timestamps $\le t$ are universally delivered and can be discarded from buffers. Used in replicated databases.
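The GC test itself is tiny (a sketch; the min over rows is the "everyone has seen $P_k$ reach $t$" check):

```python
from typing import List

def can_discard(m_i: List[List[int]], k: int, t: int) -> bool:
    """From P_i's matrix clock, decide whether messages from P_k with
    timestamp <= t are universally delivered: every row j's k-th entry
    (P_i's view of P_j's view of P_k's clock) must have reached t."""
    return min(row[k] for row in m_i) >= t
```

E.g. with $M_i = [[5, 2], [4, 3]]$, messages from $P_0$ stamped $\le 4$ are safe to drop, but $P_1$'s column has only reached 2.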
When To Use Which
| Clock | Storage | Consistency | Use |
|---|---|---|---|
| Scalar | 1 int | Consistent (forward only) | Total ordering with tie-break |
| Vector | n ints | Strongly consistent | Causal order, concurrency detection |
| Matrix | n² ints | Strong + 2nd-order knowledge | Obsolete-message GC, replicated DBs |
Physical Time Sync — Cristian's, Berkeley, NTP
Even with logical clocks, sometimes you need real time. Build timestamps. Security tokens. Coordinated timeouts. The four classical algorithms:
Cristian's algorithm. Client polls a single time server. RTT $= T_1 - T_0$ (client's send and receive times). Estimated time $= T_{server} + \mathrm{RTT}/2$. Uses UTC. Single point of failure.
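The estimate is a one-liner; `t0`/`t1` below are the client's local send and receive times (parameter names assumed):

```python
def cristian_estimate(t0: float, server_time: float, t1: float) -> float:
    """Cristian's algorithm: assume the server's reply sits at the
    midpoint of the round trip, so add RTT/2 to the reported time."""
    rtt = t1 - t0
    return server_time + rtt / 2.0
```

E.g. a request sent at local time 100.0 s, answered with server time 205.0 and received at 104.0, yields an estimated current time of 207.0.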
Berkeley algorithm. Master polls all slaves, averages their times (after correcting transit delay and discarding outliers), sends each slave its delta. No UTC — only internal agreement among nodes. Useful for LAN clusters that don't need world time.
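One round can be sketched like this (function name and `tolerance` parameter are my own; the real algorithm also corrects each slave's reading for transit delay before averaging):

```python
from typing import List

def berkeley_round(master_time: float, slave_times: List[float],
                   tolerance: float) -> List[float]:
    """One Berkeley round: pool the master's and slaves' readings,
    discard outliers beyond `tolerance` of the master, average the rest,
    and return the delta each node (master first, then slaves) applies."""
    readings = [master_time] + slave_times
    kept = [t for t in readings if abs(t - master_time) <= tolerance]
    avg = sum(kept) / len(kept)
    return [avg - t for t in readings]
```

Note the nodes converge on each other, not on UTC — internal agreement only, as the text says.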
Decentralised averaging. At fixed times, every machine broadcasts. Each averages received times. Fully decentralised; no master.
NTP — the production answer. Hierarchical strata: stratum 0 (atomic clocks, GPS) → stratum 1 (servers directly connected to a stratum-0 source) → stratum 2+ (secondary). Clients exchange four timestamps with servers and compute offset + delay. Accuracy: typically milliseconds to tens of ms over a WAN, sub-ms on a LAN. Cannot achieve perfect sync — stratum delay accumulates; different nodes may use different NTP servers.
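The four-timestamp exchange determines both quantities; this is the standard NTP arithmetic, with the conventional $t_1..t_4$ names:

```python
def ntp_offset_delay(t1: float, t2: float, t3: float, t4: float):
    """t1: client send, t2: server receive, t3: server send,
    t4: client receive. Returns (offset of server clock relative to
    the client, round-trip network delay), both in the same units."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay
```

The delay term subtracts the server's processing time, so only the network portion of the round trip counts.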
Why Still Need Logical Clocks?
NTP gives only approximate synchronisation. Distributed mutex, snapshots, deadlock detection — these need *causal* correctness, not numeric closeness. A 1 ms clock skew between two nodes is invisible to NTP but breaks safety properties of algorithms that rely on "happened before". Logical clocks capture causality exactly, regardless of physical drift.
What You Walk In Carrying
The DS model (processors, channels, three event types). Lamport's $\to$ — three rules + concurrency definition. Logical clock definition + consistency vs strong consistency. Scalar fails the converse; vector and matrix satisfy both. Scalar R1 (increment own), R2 (max, then R1). Vector R1 (increment own entry), R2 (componentwise max, then R1). Comparison: $V_a < V_b$ iff componentwise $\le$ + strict inequality somewhere. Singhal-Kshemkalyani $LS$/$LU$ + FIFO requirement. Matrix interpretation as second-order knowledge + GC application. Cristian's RTT/2 + UTC. Berkeley average + no UTC. NTP stratum hierarchy + accuracy.