# CAP theorem
Brewer’s theorem says that a distributed system can simultaneously provide at most two of: consistency, availability, and partition tolerance. In practice, partition tolerance is non-negotiable, so the real choice is between consistency and availability under failure.
## The trinity

```mermaid
flowchart TB
    classDef pillar fill:#1e293b,stroke:#7dd3fc,color:#e2e8f0,stroke-width:2px,rx:8;
    classDef pick fill:#0f172a,stroke:#a5b4fc,color:#c7d2fe,stroke-width:2px,rx:6;
    C[Consistency<br/><i>every read sees the latest write</i>]:::pillar
    A[Availability<br/><i>every request gets a response</i>]:::pillar
    P[Partition tolerance<br/><i>survives network splits</i>]:::pillar
    C --- CA[CA<br/>single-node DBs]:::pick --- A
    C --- CP[CP<br/>etcd, ZooKeeper, MongoDB]:::pick --- P
    A --- AP[AP<br/>Cassandra, DynamoDB, Riak]:::pick --- P
```
## Formal statement

Given a system serving operations on a register, let $R$ be the set of reads and $W$ the set of writes, ordered by real time. Linearizability requires that for every read $r \in R$, the value returned equals the value of the most recent write $w \in W$ such that $w$ completed before $r$ began:

$$
\mathrm{val}(r) = \mathrm{val}(w^{\ast}) \quad\text{where}\quad w^{\ast} = \operatorname*{arg\,max}_{\{\,w \in W \,:\, \mathrm{end}(w) < \mathrm{start}(r)\,\}} \mathrm{end}(w)
$$
Under a partition, no protocol can satisfy this constraint and still respond to every request; that is the impossibility result Gilbert and Lynch formalized in 2002.
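The condition above can be checked mechanically for a single register once operations carry real-time stamps. A minimal Python sketch (the `Write`/`Read` records and function names are illustrative; it covers only the non-overlapping case stated above, not full linearizability with concurrent operations):

```python
# Sketch: verify the linearizability condition for one register, given
# timestamped operations. Illustrative types, not from any library.
from dataclasses import dataclass

@dataclass
class Write:
    value: int
    start: float
    end: float    # time the write completed

@dataclass
class Read:
    value: int    # value the read returned
    start: float

def latest_completed_write(writes, read):
    """Most recent write that finished before the read began."""
    done = [w for w in writes if w.end < read.start]
    return max(done, key=lambda w: w.end, default=None)

def satisfies_condition(writes, read, initial=0):
    w = latest_completed_write(writes, read)
    expected = initial if w is None else w.value
    return read.value == expected

writes = [Write(value=1, start=0.0, end=1.0), Write(value=2, start=2.0, end=3.0)]
print(satisfies_condition(writes, Read(value=2, start=4.0)))  # True
print(satisfies_condition(writes, Read(value=1, start=4.0)))  # False -- stale
```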
## Real-world choices
| System | Mode | When partition occurs | Notes |
|---|---|---|---|
| etcd | CP | minority partition refuses writes | Raft-based |
| ZooKeeper | CP | minority partition becomes read-only | ZAB protocol |
| Cassandra | AP | both sides accept writes; reconciled later | last-write-wins |
| DynamoDB | AP | regional reads/writes continue | tunable consistency |
| MongoDB | CP* | primary on minority side steps down | *with majority writes |
| Spanner | CP | TrueTime bounds keep linearizability | uses GPS + atomic clocks |
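The AP rows above reconcile divergent replicas after the partition heals. For Cassandra-style last-write-wins, each value carries a timestamp and the higher timestamp wins on merge. A minimal sketch (the dict-of-tuples representation is illustrative; real systems also handle timestamp ties and tombstones):

```python
# Last-write-wins reconciliation: key -> (value, timestamp) per replica;
# on merge, the entry with the higher timestamp wins.
def lww_merge(a, b):
    """Merge two replica views a and b under last-write-wins."""
    merged = dict(a)
    for key, (value, ts) in b.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged

# Both sides accepted writes during the partition:
side1 = {"x": (1, 100), "y": (7, 90)}
side2 = {"x": (2, 105)}             # later write to x on the other side
print(lww_merge(side1, side2))      # {'x': (2, 105), 'y': (7, 90)}
```

Note that last-write-wins silently drops the losing write to `x`; that is the consistency cost the AP rows accept.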
## What it doesn’t say
Note — CAP doesn’t apply during normal (non-partitioned) operation. A system can offer strong consistency and high availability when the network is healthy. The theorem only constrains the worst-case tradeoff.
The more useful frame for everyday work is PACELC:
- Partition: pick Availability or Consistency.
- Else (no partition): pick Latency or Consistency.
Most “AP” systems are also “EL” — they trade some consistency for latency even when the network is fine.
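One concrete knob behind that "EL" choice is quorum sizing. With $N$ replicas, read quorum $R$, and write quorum $W$, the standard Dynamo-style rule is that $R + W > N$ forces every read quorum to overlap every write quorum, at the cost of waiting on more replicas per request:

```python
# Dynamo-style quorum arithmetic: a read quorum and a write quorum must
# share at least one replica for reads to see the latest acked write.
def quorums_overlap(n, r, w):
    return r + w > n

# N = 3 replicas:
print(quorums_overlap(3, r=2, w=2))  # True  -- consistent, higher latency
print(quorums_overlap(3, r=1, w=1))  # False -- fast, but reads may be stale
```

Dropping to `r=1` is exactly the "EL" move: lower latency in the healthy case, weaker consistency.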
## Sequence under partition

```mermaid
sequenceDiagram
    participant Client
    participant Replica1
    participant Replica2
    Note over Replica1,Replica2: Network partition begins
    Client->>Replica1: write(x = 1)
    Replica1-->>Client: ok (AP; a CP system would block)
    Client->>Replica2: read(x)
    Replica2-->>Client: x = 0 (stale, AP) or refuse (CP)
    Note over Replica1,Replica2: Partition heals
    Replica1->>Replica2: replicate x = 1
    Client->>Replica2: read(x)
    Replica2-->>Client: x = 1 (eventually)
```
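The same scenario as a toy Python model (the `Replica` class is illustrative; replication is reduced to copying state after the partition, which is what makes the final read consistent):

```python
# Toy model of the diagram: during the partition an AP replica serves
# whatever it has; once the partition heals, state converges.
class Replica:
    def __init__(self):
        self.x = 0
    def write(self, value):
        self.x = value
    def read(self):
        return self.x

r1, r2 = Replica(), Replica()

# Partition begins: replication between r1 and r2 is cut.
r1.write(1)          # client writes x = 1 to r1; an AP system acks
print(r2.read())     # 0 -- stale read from r2 (a CP system would refuse)

# Partition heals: r1 replicates its state to r2.
r2.write(r1.read())
print(r2.read())     # 1 -- eventually consistent
```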
## Related
- distributions — relevant if you’re modeling tail-latency tradeoffs