How does a database stay correct when a thousand transactions run at once — and a server crashes mid-write? Five mechanisms, one coherent story: isolation, durability, and distributed atomicity.
Without any concurrency control, concurrent transactions produce incorrect results. The four anomalies below are the classic starting point, but they are not the whole story. Real isolation levels are ultimately judged by whether they permit serialization anomalies such as write skew, not just by this short list.
```javascript
// Simulate concurrency anomalies without any concurrency control.
// Each function shows what goes wrong when two transactions interleave.

function dirtyRead() {
  // T2 reads a value written by T1 before T1 commits.
  // If T1 aborts, T2 has seen data that never existed.
  let balance = 1000;
  const T1_write = () => { balance = 1500; }; // T1: transfer in (not yet committed)
  const T2_read  = () => balance;             // T2: reads T1's uncommitted write
  const T1_abort = () => { balance = 1000; }; // T1: rolls back

  T1_write();
  const seen = T2_read(); // T2 sees 1500
  T1_abort();
  console.log(`Dirty read: T2 saw ${seen}, actual=${balance} ← T2 used phantom money!`);
}

function nonRepeatableRead() {
  // T1 reads the same row twice; T2 modifies it between the reads.
  let price = 100;
  const read1 = price; // T1 first read
  price = 120;         // T2 updates and commits
  const read2 = price; // T1 second read — different result!
  console.log(`Non-repeatable read: T1 read price as ${read1} then ${read2} in one transaction`);
}

function phantomRead() {
  // T1 runs a query twice; T2 inserts rows matching the predicate between them.
  const orders = [{ id: 1, amount: 50 }, { id: 2, amount: 80 }];
  const query = () => orders.filter(o => o.amount > 40);
  const scan1 = query();               // T1 first scan: 2 rows
  orders.push({ id: 3, amount: 200 }); // T2 inserts a new large order
  const scan2 = query();               // T1 second scan: 3 rows — phantom appeared!
  console.log(`Phantom read: first scan=${scan1.length} rows, second scan=${scan2.length} rows`);
}

function lostUpdate() {
  // Both T1 and T2 read-then-write; one write overwrites the other.
  let views = 1000;
  const v1 = views; // T1 reads 1000
  const v2 = views; // T2 reads 1000
  views = v1 + 1;   // T1 writes 1001
  views = v2 + 1;   // T2 writes 1001 — T1's increment is LOST
  console.log(`Lost update: after 2 increments, views=${views} (should be 1002)`);
}

dirtyRead();
nonRepeatableRead();
phantomRead();
lostUpdate();

// SQL isolation levels and what they prevent:
const levels = [
  { level: "READ UNCOMMITTED", dirty: "✗", nonrep: "✗", phantom: "✗", lost: "✗" },
  { level: "READ COMMITTED",   dirty: "✓", nonrep: "✗", phantom: "✗", lost: "✗" },
  { level: "REPEATABLE READ",  dirty: "✓", nonrep: "✓", phantom: "✗", lost: "✓" },
  { level: "SERIALIZABLE",     dirty: "✓", nonrep: "✓", phantom: "✓", lost: "✓" },
];
console.log("\nIsolation levels (✓=prevented, ✗=possible):");
levels.forEach(l =>
  console.log(`  ${l.level.padEnd(20)} dirty=${l.dirty} nonrep=${l.nonrep} phantom=${l.phantom} lost=${l.lost}`));
```
2PL is the classic pessimistic approach. Every transaction must acquire a shared lock (S) before reading a resource and an exclusive lock (X) before writing. The "two phases" are: a growing phase where locks are only acquired, and a shrinking phase where locks are only released. Once you release any lock, you can't acquire new ones.
This strict ordering guarantees serializability — the execution is equivalent to some serial order. The cost: deadlocks. If T1 holds X on A and waits for B, while T2 holds X on B and waits for A, neither can proceed. The database must detect and break the cycle by aborting one transaction.
```javascript
// Two-Phase Locking: acquire locks (growing), then release (shrinking).
// Strict 2PL: hold ALL locks until commit/abort (used by most real DBs).
class LockManager {
  constructor() {
    this.locks    = new Map(); // resource → { mode, holders: Set, waiters: [] }
    this.txLocks  = new Map(); // txId → Set of resources held
    this.waitsFor = new Map(); // txId → txId it is waiting for
  }

  _getLock(resource) {
    if (!this.locks.has(resource))
      this.locks.set(resource, { mode: null, holders: new Set(), waiters: [] });
    return this.locks.get(resource);
  }

  acquire(txId, resource, mode) {
    const lock = this._getLock(resource);
    const compatible = (existing, requested) =>
      existing === null || (existing === 'S' && requested === 'S');
    if (compatible(lock.mode, mode) || lock.holders.has(txId)) {
      lock.holders.add(txId);
      lock.mode = lock.holders.size > 1 ? 'S' : mode;
      if (!this.txLocks.has(txId)) this.txLocks.set(txId, new Set());
      this.txLocks.get(txId).add(resource);
      console.log(`  T${txId} acquired ${mode}-lock on ${resource}`);
      return true;
    }
    // Must wait — record the waits-for edge for deadlock detection
    lock.waiters.push({ txId, mode });
    this.waitsFor.set(txId, [...lock.holders][0]);
    console.log(`  T${txId} WAITING for ${mode}-lock on ${resource} (held by T${[...lock.holders][0]})`);
    return false;
  }

  release(txId) {
    const held = this.txLocks.get(txId) || new Set();
    for (const res of held) {
      const lock = this.locks.get(res);
      lock.holders.delete(txId);
      if (lock.holders.size === 0) {
        lock.mode = null;
        const next = lock.waiters.shift();
        if (next) {
          lock.holders.add(next.txId);
          lock.mode = next.mode;
          this.waitsFor.delete(next.txId);
          console.log(`  T${next.txId} unblocked — acquired ${next.mode}-lock on ${res}`);
        }
      }
    }
    this.txLocks.delete(txId);
    console.log(`  T${txId} released all locks`);
  }

  detectDeadlock() {
    // DFS on the waits-for graph looking for a cycle
    const visited = new Set(), stack = new Set();
    const hasCycle = (node) => {
      if (stack.has(node)) return true;
      if (visited.has(node)) return false;
      visited.add(node);
      stack.add(node);
      const next = this.waitsFor.get(node);
      if (next && hasCycle(next)) return true;
      stack.delete(node);
      return false;
    };
    for (const tx of this.waitsFor.keys()) if (hasCycle(tx)) return true;
    return false;
  }
}

const lm = new LockManager();
console.log("=== Normal 2PL scenario ===");
lm.acquire(1, "account_A", "S"); // T1 reads A
lm.acquire(2, "account_A", "S"); // T2 reads A — compatible, both get S-lock
lm.acquire(1, "account_B", "X"); // T1 writes B — no one holds B, granted
lm.acquire(2, "account_B", "X"); // T2 wants X on B — must wait for T1
console.log(`Deadlock detected: ${lm.detectDeadlock()}`);
lm.release(1); // T1 commits, releases all locks

console.log("\n=== Deadlock scenario ===");
const lm2 = new LockManager();
lm2.acquire(1, "A", "X"); // T1 holds A
lm2.acquire(2, "B", "X"); // T2 holds B
lm2.acquire(1, "B", "X"); // T1 waits for B (held by T2)
lm2.acquire(2, "A", "X"); // T2 waits for A (held by T1) → CYCLE
console.log(`Deadlock detected: ${lm2.detectDeadlock()} → abort one transaction`);
```
OCC assumes conflicts are rare. Instead of blocking on locks, every transaction executes freely — reads and writes go to a private workspace. At commit time, a validation phase checks whether any conflicts actually occurred. If not, the writes are installed. If yes, the transaction is aborted and retried.
OCC has three phases: Read (execute freely, track read-set and write-set), Validate (check for conflicts with recently committed transactions), Write (install changes atomically if validation passes). A correct validator must reject both read/write conflicts and write/write conflicts that would otherwise collapse into last-writer-wins.
```javascript
// Optimistic Concurrency Control — three phases: Read, Validate, Write.
// No locks during execution. Conflicts detected only at commit time.
class OCCManager {
  constructor(db) {
    this.db = { ...db }; // shared database state
    this.committed = []; // recently committed txns { id, writeSet, endTime }
    this.clock = 0;
  }

  begin() {
    return {
      id: ++this.clock,
      startTime: this.clock,
      readSet: new Set(),
      writeSet: new Map(),
      workspace: { ...this.db },
    };
  }

  read(tx, key) {
    tx.readSet.add(key);
    return tx.workspace[key]; // reads from private workspace snapshot
  }

  write(tx, key, value) {
    tx.writeSet.set(key, value);
    tx.workspace[key] = value; // buffered in private workspace, not committed yet
  }

  validate(tx) {
    tx.validationTime = ++this.clock;
    // For every committed txn whose interval overlaps ours:
    // its write-set must not intersect our read-set or write-set.
    for (const committed of this.committed) {
      if (committed.endTime <= tx.startTime) continue; // committed before we started
      for (const key of committed.writeSet.keys()) {
        if (tx.readSet.has(key)) {
          console.log(`  T${tx.id} CONFLICT: committed T${committed.id} wrote '${key}' which T${tx.id} read → ABORT`);
          return false;
        }
        if (tx.writeSet.has(key)) {
          console.log(`  T${tx.id} CONFLICT: committed T${committed.id} also wrote '${key}' → ABORT`);
          return false;
        }
      }
    }
    return true;
  }

  commit(tx) {
    if (!this.validate(tx)) return false;
    // Write phase: install all buffered writes atomically
    for (const [key, val] of tx.writeSet) this.db[key] = val;
    tx.endTime = ++this.clock;
    this.committed.push(tx);
    console.log(`  T${tx.id} COMMITTED. DB: ${JSON.stringify(this.db)}`);
    return true;
  }
}

const occ = new OCCManager({ balance_A: 1000, balance_B: 500 });
console.log("=== Scenario 1: No conflict (different keys) ===");
const t1 = occ.begin();
const t2 = occ.begin();
occ.read(t1, "balance_A"); occ.write(t1, "balance_A", 900);
occ.read(t2, "balance_B"); occ.write(t2, "balance_B", 600);
occ.commit(t1); // T1 commits first
occ.commit(t2); // T2 commits — no overlap in read/write sets

console.log("\n=== Scenario 2: Write-Read conflict ===");
const occ2 = new OCCManager({ stock: 10 });
const ta = occ2.begin();
const tb = occ2.begin();
occ2.read(ta, "stock"); occ2.write(ta, "stock", 9); // buy 1 unit
occ2.read(tb, "stock"); occ2.write(tb, "stock", 9); // also buy 1 unit
occ2.commit(ta); // ta commits: stock = 9
occ2.commit(tb); // tb: ta wrote 'stock' which tb read → ABORT, retry!
```
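The abort at the end is only half the story: a real OCC client wraps the whole transaction in a retry loop, re-reading fresh state on each attempt. A minimal sketch of that loop, with validation simplified to a single version counter (a compare-and-swap stand-in for the full read-set/write-set check; `tryDecrement` and `buyOneWithRetry` are illustrative names, not part of any real API):

```javascript
// Hedged sketch: the abort-and-retry loop around an optimistic transaction.
// Validation is simulated with a version counter: install the new value
// only if no one else committed since our read.
const row = { value: 10, version: 0 }; // shared row: stock level

function tryDecrement(expectedVersion, newValue) {
  // Atomic validate + write in a real engine; simulated here
  if (row.version !== expectedVersion) return false; // someone committed in between
  row.value = newValue;
  row.version++;
  return true;
}

function buyOneWithRetry(maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = { value: row.value, version: row.version }; // read phase
    const newValue = snapshot.value - 1;                         // compute privately
    if (tryDecrement(snapshot.version, newValue)) {              // validate + write
      console.log(`committed on attempt ${attempt}, stock=${row.value}`);
      return true;
    }
    console.log(`attempt ${attempt} aborted — retrying with fresh read`);
  }
  return false; // give up; caller surfaces an error
}

// Simulate a conflicting commit sneaking in before our validation
const snap = { value: row.value, version: row.version };
row.value = 9; row.version++;               // another client commits first
tryDecrement(snap.version, snap.value - 1); // our stale attempt fails → retry path
buyOneWithRetry();                          // fresh read succeeds, stock drops to 8
```

Under low contention the loop almost always succeeds on the first pass, which is exactly the workload OCC bets on; under high contention the retries are where its cost shows up.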
MVCC is the approach used by PostgreSQL, MySQL InnoDB, Oracle, and most modern databases. Instead of locking rows during reads, each transaction gets a consistent snapshot of the database as it existed at the transaction's start time. Writes create new versions; readers see the version that was current when their transaction began, plus their own writes.
The easiest way to picture this is as a version chain per row. A row is not overwritten in place. When a transaction updates price, the database keeps the old version and appends a new one. A reader walks backward through that chain and asks: “Which version was alive at the moment my snapshot was taken?” The first visible one is the value that transaction reads.
That is what the metadata means in the code below. xmin is the transaction ID that created this version. xmax is the transaction ID that later replaced or deleted it. So a version is visible to a snapshot if it was born early enough (xmin is old enough) and not yet dead at that snapshot point (xmax is still in the future, or ∞ if it is the current version).
The result: readers never block writers, and writers never block readers. A long-running read-only query never holds up a write transaction. The cost is version storage — old versions must eventually be garbage-collected (PostgreSQL calls this VACUUM). Snapshot-based MVCC is great for read concurrency, but it is not automatically serializable; write skew is the classic counterexample.
In practice, this gap is what separates snapshot-based REPEATABLE READ from true SERIALIZABLE.

```javascript
// MVCC: every write creates a new version; reads see the snapshot at tx start
// plus the transaction's own writes. This is snapshot isolation, not full serializability.
// Used by: PostgreSQL, MySQL InnoDB, Oracle, CockroachDB (with engine-specific details)
class MVCCDatabase {
  constructor() {
    this.versions = new Map(); // key → [{ value, xmin, xmax }] (version chain)
    this.txCounter = 0;
    this.activeIds = new Set();
  }

  begin() {
    const txId = ++this.txCounter;
    this.activeIds.add(txId);
    // Snapshot = this txId + the set of in-progress txns at begin time
    return { txId, snapshotMin: txId, activeTxns: new Set(this.activeIds) };
  }

  write(key, value, txId) {
    if (!this.versions.has(key)) this.versions.set(key, []);
    const chain = this.versions.get(key);
    // Mark previous head as superseded (simplified: xmax set at write time;
    // real engines treat it as dead only once the writer commits)
    if (chain.length) chain[chain.length - 1].xmax = txId;
    chain.push({ value, xmin: txId, xmax: Infinity });
    console.log(`  T${txId} wrote '${key}' = ${JSON.stringify(value)} (version xmin=${txId})`);
  }

  read(key, snapshot) {
    const chain = this.versions.get(key) || [];
    // Find the newest version visible to this snapshot
    for (let i = chain.length - 1; i >= 0; i--) {
      const v = chain[i];
      const selfVisible = v.xmin === snapshot.txId;
      const creatorCommitted = v.xmin < snapshot.snapshotMin && !snapshot.activeTxns.has(v.xmin);
      const notYetDeleted = v.xmax > snapshot.snapshotMin || snapshot.activeTxns.has(v.xmax);
      if ((selfVisible || creatorCommitted) && notYetDeleted) return v.value;
    }
    return undefined; // no visible version
  }

  commit(tx) {
    this.activeIds.delete(tx.txId);
    console.log(`  T${tx.txId} committed`);
  }

  vacuum(oldestActiveTx) {
    // Remove versions invisible to all active transactions (PostgreSQL VACUUM)
    let removed = 0;
    for (const [key, chain] of this.versions)
      this.versions.set(key, chain.filter(v => {
        const dead = v.xmax <= oldestActiveTx;
        if (dead) removed++;
        return !dead;
      }));
    console.log(`  VACUUM removed ${removed} dead version(s)`);
  }
}

const db = new MVCCDatabase();
// Seed initial data (committed before any tx)
db.versions.set("price", [{ value: 100, xmin: 0, xmax: Infinity }]);

console.log("=== MVCC snapshot isolation ===");
const longReader = db.begin();       // T1: starts a long read-only report
const writer = db.begin();           // T2: will update the price
db.write("price", 120, writer.txId); // T2 writes a new version
db.commit(writer);                   // T2 commits

// T1 (longReader) still sees the OLD version because it started before T2
const seen = db.read("price", longReader);
console.log(`  T1 (longReader) sees price = ${seen} (started before write, sees old version)`);

const freshReader = db.begin();
const seen2 = db.read("price", freshReader);
console.log(`  T3 (freshReader) sees price = ${seen2} (started after write, sees new version)`);

db.commit(longReader);
db.vacuum(3); // clean up versions invisible to all active txns
```
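The write-skew hazard mentioned above is easy to make concrete. The classic setup: two doctors are on call, and the invariant is "at least one stays on call." Each transaction checks the invariant against its own snapshot, then updates a different row, so their write-sets are disjoint and snapshot isolation's write-write check never fires. A minimal self-contained sketch (plain objects standing in for snapshotted rows; `requestLeave` is an illustrative name):

```javascript
// Hedged sketch of write skew under snapshot isolation.
// Invariant: at least one doctor must remain on call.
const oncall = { alice: true, bob: true }; // committed state

// Each transaction reads from its own snapshot taken at begin time
const txAlice = { snapshot: { ...oncall }, writes: {} };
const txBob   = { snapshot: { ...oncall }, writes: {} };

function requestLeave(tx, who) {
  // Check the invariant... against MY snapshot only
  const othersOnCall = Object.entries(tx.snapshot)
    .filter(([name, on]) => name !== who && on).length;
  if (othersOnCall >= 1) tx.writes[who] = false; // looks safe from here
}

requestLeave(txAlice, "alice"); // sees bob on call → OK to leave
requestLeave(txBob, "bob");     // sees alice on call → OK to leave

// Both commit: write-sets are disjoint ({alice} vs {bob}), so snapshot
// isolation raises no write-write conflict — both commits succeed
Object.assign(oncall, txAlice.writes);
Object.assign(oncall, txBob.writes);

const anyOnCall = Object.values(oncall).some(Boolean);
console.log(`after both commits: ${JSON.stringify(oncall)} — invariant held? ${anyOnCall}`);
// A serializable engine (e.g. PostgreSQL's SSI) would abort one of the two
```

No single commit is wrong; the interleaving is. That is why "prevents the four classic anomalies" is not the same claim as "serializable."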
WAL (also called the "redo log" in Oracle/MySQL) solves a different problem: durability. If a server crashes after a transaction commits but before the actual data pages have been flushed to disk, that data is lost. WAL prevents this with one rule:
Every change is first appended to a sequential log file (fast sequential I/O), with each record assigned a monotonically increasing Log Sequence Number (LSN). At commit, only the log needs to be flushed — the data pages can be written lazily in the background. Recovery logic varies by engine: ARIES is one important design (Analysis → Redo → Undo), while other systems describe crash recovery more simply as replaying WAL from the last checkpoint and consulting transaction status.
```javascript
// Write-Ahead Logging — durability through sequential log writes.
// ARIES recovery: Analysis → Redo → Undo
class WALEngine {
  constructor() {
    this.log = [];        // the WAL — sequential, append-only
    this.dataPages = {};  // in-memory "disk" (may lag behind log)
    this.activeTxns = {}; // txId → { writes: [] }
    this.lsn = 0;         // Log Sequence Number — ever-increasing
    this.flushedLSN = 0;  // last LSN synced to durable storage
    this.lastCheckpoint = 0;
  }

  _appendLog(type, txId, data = {}) {
    const entry = { lsn: ++this.lsn, type, txId, timestamp: Date.now(), ...data };
    this.log.push(entry);
    return entry.lsn;
  }

  begin(txId) {
    this.activeTxns[txId] = { writes: [], prevLSN: 0 };
    this._appendLog("BEGIN", txId);
    console.log(`[LSN ${this.lsn}] BEGIN T${txId}`);
  }

  write(txId, key, oldVal, newVal) {
    // Write to log FIRST (before touching data pages)
    const lsn = this._appendLog("UPDATE", txId, { key, oldVal, newVal });
    this.activeTxns[txId].writes.push({ key, oldVal, lsn });
    // Data page write can be delayed (lazy/background)
    this.dataPages[key] = newVal; // in practice: buffered, flushed later
    console.log(`[LSN ${lsn}] UPDATE T${txId}: ${key} ${oldVal}→${newVal}`);
  }

  commit(txId) {
    const lsn = this._appendLog("COMMIT", txId);
    // Force-flush the log up to this LSN before returning to the client
    this.flushedLSN = lsn;
    delete this.activeTxns[txId];
    console.log(`[LSN ${lsn}] COMMIT T${txId} — log flushed to LSN ${lsn}`);
  }

  checkpoint() {
    // Fuzzy checkpoint: record active txns and the current LSN — don't stop the world
    const active = Object.keys(this.activeTxns).map(Number);
    this.lastCheckpoint = this.lsn;
    this._appendLog("CHECKPOINT", null, { activeTxns: active });
    console.log(`[LSN ${this.lsn}] CHECKPOINT — active txns: [${active.map(t => 'T' + t).join(',') || 'none'}]`);
  }

  recover() {
    console.log("\n=== ARIES RECOVERY ===");
    // Phase 1: Analysis — start from the checkpoint's active-txn list,
    // then scan forward to classify txns as committed (redo) or in-flight (undo)
    const cpEntry = [...this.log].reverse().find(e => e.type === "CHECKPOINT");
    const winner = new Set();                                 // committed txns to redo
    const loser = new Set(cpEntry ? cpEntry.activeTxns : []); // active-at-crash txns to undo
    const startLSN = cpEntry ? cpEntry.lsn : 0;
    for (const e of this.log.filter(e => e.lsn > startLSN)) {
      if (e.type === "BEGIN") loser.add(e.txId);
      if (e.type === "COMMIT") { loser.delete(e.txId); winner.add(e.txId); }
      if (e.type === "ABORT") loser.delete(e.txId);
    }
    console.log(`Analysis: REDO txns=[${[...winner].map(t => 'T' + t)}], UNDO txns=[${[...loser].map(t => 'T' + t)}]`);

    // Phase 2: Redo — replay all updates of winners (ensures durability)
    const db = {};
    for (const e of this.log) {
      if (e.type === "UPDATE" && winner.has(e.txId)) db[e.key] = e.newVal;
    }
    console.log("After Redo, DB:", JSON.stringify(db));

    // Phase 3: Undo — roll back all writes of loser txns (ensures atomicity)
    for (const e of [...this.log].reverse()) {
      if (e.type === "UPDATE" && loser.has(e.txId)) db[e.key] = e.oldVal;
    }
    console.log("After Undo, DB:", JSON.stringify(db));
    return db;
  }
}

const wal = new WALEngine();
wal.begin(1);
wal.begin(2);
wal.write(1, "balance_A", 1000, 900);
wal.write(2, "balance_B", 500, 600);
wal.commit(1);    // T1 commits — log flushed, durable
wal.checkpoint(); // T2 still active, recorded in the checkpoint
wal.write(2, "balance_C", 200, 300);
// --- CRASH HERE --- T2 was still active ---
console.log("\n*** SERVER CRASHES (T2 never committed) ***");
const recovered = wal.recover(); // T1 is redone (committed), T2 is undone (was in-flight)
```
So far, everything has been single-node. In a distributed system (sharded databases, microservices writing to multiple stores), you need a way to commit a transaction across multiple nodes atomically — either all commit or all abort. 2PC achieves this with a coordinator and one or more participants.
Phase 1 (Prepare): coordinator sends "prepare to commit" to all participants. Each participant durably writes its work to its WAL and replies "vote YES" (I'm ready) or "vote NO" (I can't). Phase 2 (Commit/Abort): if all voted YES, coordinator sends "COMMIT" to all. If any voted NO, it sends "ABORT" to all.
```javascript
// Two-Phase Commit — atomic commit across multiple nodes.
// Shows both the happy path and the coordinator-failure blocking scenario.
class Participant {
  constructor(name, willVoteYes = true) {
    this.name = name;
    this.state = "idle";
    this.prepared = null; // written to WAL when prepared
    this.willVote = willVoteYes;
  }

  prepare(txId, writes) {
    if (!this.willVote) {
      console.log(`  ${this.name}: VOTE NO (constraint violation)`);
      this.state = "aborted";
      return false;
    }
    // Force-write a prepare record to the WAL — the vote cannot be taken back
    this.prepared = { txId, writes };
    this.state = "prepared";
    console.log(`  ${this.name}: VOTE YES (wrote prepare record to WAL)`);
    return true;
  }

  commit() {
    if (this.state !== "prepared") throw new Error("Not prepared!");
    this.state = "committed";
    console.log(`  ${this.name}: COMMITTED`);
  }

  abort() {
    this.state = "aborted";
    console.log(`  ${this.name}: ABORTED`);
  }
}

class Coordinator {
  constructor(participants) { this.participants = participants; }

  commit(txId, writesPerParticipant) {
    console.log(`[Phase 1 — PREPARE] TxId=${txId}`);
    const votes = this.participants.map((p, i) =>
      p.prepare(txId, writesPerParticipant[i] || []));
    const decision = votes.every(v => v) ? "COMMIT" : "ABORT";
    console.log(`[Coordinator decision: ${decision}]`);
    console.log(`[Phase 2 — ${decision}]`);
    // Coordinator writes the decision to its OWN WAL before sending
    this.participants.forEach(p => decision === "COMMIT" ? p.commit() : p.abort());
    return decision;
  }
}

console.log("=== Happy path: all vote YES ===");
const p1 = [new Participant("node-A"), new Participant("node-B"), new Participant("node-C")];
const coord1 = new Coordinator(p1);
coord1.commit("tx-001", [["w(A,1)"], ["w(B,2)"], ["w(C,3)"]]);

console.log("\n=== One participant votes NO ===");
const p2 = [new Participant("node-A"), new Participant("node-B", false), new Participant("node-C")];
const coord2 = new Coordinator(p2);
coord2.commit("tx-002", [[], [], []]);

console.log("\n=== Blocking scenario: coordinator crashes after Phase 1 ===");
const p3 = [new Participant("node-A"), new Participant("node-B")];
console.log("Phase 1: all vote YES...");
p3.forEach(p => p.prepare("tx-003", []));
console.log("*** COORDINATOR CRASHES before sending Phase 2 ***");
console.log("node-A state:", p3[0].state, "← BLOCKED: holds locks, cannot commit or abort");
console.log("node-B state:", p3[1].state, "← BLOCKED: holds locks, cannot commit or abort");
console.log("Resolution: wait for coordinator recovery, or contact other participants");
```
3PC adds a middle phase between "prepared" and "committed" to eliminate the blocking problem. The insight: 2PC blocks because participants can't tell whether the coordinator crashed before or after deciding. 3PC makes the decision observable by adding a pre-commit phase.
Phase 1 (CanCommit): coordinator asks "can you commit?". Phase 2 (PreCommit): if all say yes, coordinator sends "prepare to commit" — this is new. Now every participant knows all others have voted YES. Phase 3 (DoCommit): coordinator sends actual commit. If the coordinator crashes during Phase 3, any participant can take over — they know every participant is in pre-committed state, so committing is safe.
```javascript
// Three-Phase Commit — non-blocking (under synchronous, partition-free networks).
// Adds a PRE-COMMIT phase so all participants know everyone voted YES.
class ThreePCParticipant {
  constructor(name, willVote = true) {
    this.name = name;
    this.state = "idle"; // idle → prepared → pre-committed → committed/aborted
    this.willVote = willVote;
  }

  canCommit() {
    if (!this.willVote) { console.log(`  ${this.name}: NO`); return false; }
    console.log(`  ${this.name}: YES to canCommit`);
    this.state = "prepared";
    return true;
  }

  preCommit() {
    // Receiving preCommit means ALL others have voted YES.
    // Now this node can commit unilaterally if the coordinator disappears.
    this.state = "pre-committed";
    console.log(`  ${this.name}: ack preCommit (now knows all voted YES)`);
  }

  doCommit() { this.state = "committed"; console.log(`  ${this.name}: COMMITTED`); }
  doAbort()  { this.state = "aborted";   console.log(`  ${this.name}: ABORTED`); }

  recoverWithPeers(peers) {
    // If the coordinator is gone, check peer states to decide
    if (this.state === "pre-committed") {
      // Pre-committed ⇒ everyone voted YES — safe to commit unless a peer aborted
      const anyAborted = peers.some(p => p.state === "aborted");
      if (anyAborted) this.doAbort(); else this.doCommit();
    } else if (this.state === "prepared") {
      // Only prepared (not pre-committed) — commit only if some peer got further
      const anyPreCommit = peers.some(p => p.state === "pre-committed" || p.state === "committed");
      if (anyPreCommit) this.doCommit();
      else this.doAbort(); // safe to abort — no one pre-committed yet
    }
  }
}

class ThreePCCoordinator {
  constructor(participants) { this.participants = participants; }

  run(txId, crashAfterPhase = null) {
    // Phase 1: CanCommit?
    console.log(`[Phase 1 — CAN COMMIT? TxId=${txId}]`);
    const votes = this.participants.map(p => p.canCommit());
    if (!votes.every(v => v)) { this.participants.forEach(p => p.doAbort()); return; }
    if (crashAfterPhase === 1) { console.log("*** COORDINATOR CRASHES after Phase 1 ***"); return; }

    // Phase 2: PreCommit — tell everyone all voted YES
    console.log(`[Phase 2 — PRE-COMMIT]`);
    this.participants.forEach(p => p.preCommit());
    if (crashAfterPhase === 2) { console.log("*** COORDINATOR CRASHES after Phase 2 ***"); return; }

    // Phase 3: DoCommit
    console.log(`[Phase 3 — DO COMMIT]`);
    this.participants.forEach(p => p.doCommit());
  }
}

// 3PC crash after Phase 1 — participants self-resolve (safe to abort)
console.log("=== 3PC: coordinator crashes after Phase 1 ===");
const ps1 = [new ThreePCParticipant("A"), new ThreePCParticipant("B")];
new ThreePCCoordinator(ps1).run("tx-1", 1);
console.log("Peer recovery:");
ps1[0].recoverWithPeers([ps1[1]]);
ps1[1].recoverWithPeers([ps1[0]]);

// 3PC crash after Phase 2 — all pre-committed, safe to commit
console.log("\n=== 3PC: coordinator crashes after Phase 2 ===");
const ps2 = [new ThreePCParticipant("A"), new ThreePCParticipant("B")];
new ThreePCCoordinator(ps2).run("tx-2", 2);
console.log("Peer recovery:");
ps2[0].recoverWithPeers([ps2[1]]);
ps2[1].recoverWithPeers([ps2[0]]);
console.log("Both committed without coordinator — non-blocking!");
```
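The non-blocking claim depends on that "synchronous network" caveat. Under a network partition, 3PC can split-brain: suppose the coordinator sent preCommit to one node, crashed, and then the network partitioned. Each side runs the termination rule on the peers it can still reach and reaches a different decision. A self-contained sketch of that divergence (simplified states, not the full protocol; node and function names are illustrative):

```javascript
// Hedged sketch: why 3PC is unsafe under network partitions.
// The coordinator delivered preCommit to C only, then crashed;
// the network then partitions into {A, B} and {C}.
const A = { name: "A", state: "prepared" };
const B = { name: "B", state: "prepared" };
const C = { name: "C", state: "pre-committed" }; // got the preCommit message

// Each node applies the termination rule, but sees only its own partition
function resolve(node, reachablePeers) {
  const anyPreCommit = [node, ...reachablePeers]
    .some(p => p.state === "pre-committed" || p.state === "committed");
  node.state = anyPreCommit ? "committed" : "aborted";
  console.log(`${node.name} decides ${node.state}`);
}

resolve(A, [B]); // partition {A, B}: no pre-commit visible → abort
resolve(B, [A]);
resolve(C, []);  // partition {C}: it IS pre-committed → commit

const decisions = new Set([A.state, B.state, C.state]);
console.log(`outcomes: ${[...decisions].join(", ")}`); // divergent — atomicity broken
```

Both sides followed the protocol correctly with the information available to them, yet the transaction committed on one side and aborted on the other. This is the core reason quorum-based consensus (Paxos/Raft) displaced 3PC for distributed commit.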
All five mechanisms solve different parts of the same problem. Production systems combine several, but not in one universal recipe. PostgreSQL is MVCC + WAL, with optional serializable conflict tracking and optional 2PC for prepared transactions. InnoDB mixes MVCC snapshots with locking. Oracle has its own multiversion design. SQL Server is lock-based by default, with row-versioned modes available. Spanner and CockroachDB layer MVCC on top of consensus.
| Mechanism | Solves | Read blocks write? | Write blocks read? | Overhead | Used in |
|---|---|---|---|---|---|
| 2PL | Isolation | Yes (S-lock) | Yes (X-lock) | Lock mgmt + deadlock detect | SQL Server default, classic lock managers, InnoDB locking |
| OCC | Isolation | No | No | Abort + retry under contention | Datomic, some key-value stores |
| MVCC | Isolation | No | No | Version storage + GC (VACUUM) | PostgreSQL, Oracle, CockroachDB, InnoDB snapshots |
| WAL | Durability | N/A (not about concurrency) | N/A | Sequential log I/O | All serious databases |
| 2PC | Distributed atomicity | N/A (cross-node coordination) | N/A | Blocking on coordinator failure | XA / prepared transactions / distributed coordinators |
| 3PC | Non-blocking dist. atomicity | N/A | N/A | Extra round-trip; unsafe under network partitions | Rarely used; superseded by Paxos/Raft |
| Paxos/Raft | Distributed consensus | N/A | N/A | Quorum round-trips; partition-tolerant and live | Spanner, CockroachDB, etcd, MongoDB |