Paxos
Paxos is a family of protocols for solving consensus in a network of unreliable processors. Consensus is a fundamental problem in distributed computing, where an agreement must be reached by some or all of the processes (e.g., computers, nodes) in the system. Here's a detailed overview:
History
The Paxos algorithm was developed by Leslie Lamport in the early 1990s. It was inspired by an imagined ancient Greek legislative process on the island of Paxos, hence the name. Lamport's initial paper on the subject, "The Part-Time Parliament," was published in 1998, although the protocol itself was conceptualized earlier.
Key Concepts
- Consensus: Achieving agreement among multiple agents or processes, even in the presence of failures.
- Proposer: A process that proposes a value to be agreed upon.
- Acceptor: A process that votes on the proposals.
- Learner: A process that learns the outcome of the consensus.
- Phases: The protocol operates in two phases:
- Phase 1 (Prepare): Proposers send prepare requests to acceptors to gain the right to propose a value.
- Phase 2 (Accept): If the proposer receives sufficient promises, it can then send an accept request with a value to the acceptors.
Protocol Outline
- Prepare: A proposer chooses a proposal number and sends a prepare request with this number to a majority of acceptors.
- Promise: An acceptor responds with a promise not to accept any proposals numbered less than the one proposed and informs the proposer of the highest-numbered proposal it has accepted so far.
- Accept: If the proposer receives a majority of promises, it sends an accept request with the value (which might be the value from the highest proposal number if one was previously accepted) to the acceptors.
- Accepted: An acceptor accepts the proposal unless it has promised to ignore proposals with lower numbers.
- Learning: Once a value has been accepted by a majority, it is learned by the learners, which can then act on the consensus.
Properties
- Safety: Only a value that has been proposed can be learned, and at most one value can be learned.
- Liveness: Under certain conditions (e.g., a majority of non-faulty nodes), the protocol guarantees progress.
- Fault Tolerance: Can handle failures as long as a majority of the nodes are functioning correctly.
Applications
Paxos has been influential in various distributed systems:
- It forms the basis for many consensus algorithms in Distributed Systems, especially in database replication and state machine replication.
- Systems like Google's Chubby lock service use variations of the Paxos algorithm.
- ZooKeeper by Apache uses a variation called Zab, which is inspired by Paxos.
Criticism and Variants
- Paxos has been criticized for its complexity, leading to several variations:
- Raft - A more understandable alternative designed to have stronger guarantees on leader election and log consistency.
- Multi-Paxos - An optimization for handling multiple instances of consensus with a single leader.
References
Related Topics