Systems

Paxos

Paxos is the original consensus algorithm — a protocol for reaching agreement on a single value among a group of unreliable processes. Most real systems use Multi-Paxos (a sequence of Paxos instances) or the more readable raft reformulation.

Roles

  • Proposer — proposes values
  • Acceptor — votes on proposals
  • Learner — learns the chosen value

A node typically plays multiple roles.

The two phases

sequenceDiagram
    participant P as Proposer
    participant A1 as Acceptor 1
    participant A2 as Acceptor 2
    participant A3 as Acceptor 3

    Note over P,A3: Phase 1: Prepare
    P->>A1: prepare(n)
    P->>A2: prepare(n)
    P->>A3: prepare(n)
    A1-->>P: promise(n, last accepted)
    A2-->>P: promise(n, last accepted)

    Note over P,A3: Phase 2: Accept
    P->>A1: accept(n, v)
    P->>A2: accept(n, v)
    A1-->>P: accepted(n, v)
    A2-->>P: accepted(n, v)

Safety invariants

For a value vv chosen with proposal number nn:

If v is chosen, every higher-numbered proposal proposes v\text{If } v \text{ is chosen, every higher-numbered proposal proposes } v

This requires that any new proposer first reads the highest accepted value among a majority — phase 1’s “prepare” round — and adopts it if found.

Why people prefer raft

Paxos is correct but notoriously hard to teach and implement. The original paper deliberately omits a complete pseudocode listing. Production systems that “use Paxos” almost always use heavily modified variants:

  • Multi-Paxos — amortizes phase 1 across many proposals
  • Cheap Paxos — uses non-voting auxiliary nodes
  • Fast Paxos — single round-trip in the common case
  • EPaxos — leaderless, uses dependency graphs

Each variant fixes a real performance limitation but compounds the implementation complexity.

  • raft — the more understandable reformulation
  • cap-theorem — both Paxos and Raft are CP