Three-Phase Commit: Non-Blocking Distributed Transactions
Learn how Three-Phase Commit (3PC) extends 2PC with a pre-commit phase, its assumptions, limitations, and when to use it.
Two-Phase Commit works when everyone cooperates. The coordinator stays up, participants respond, and the network behaves. But distributed systems do not always cooperate. When the coordinator crashes mid-transaction, participants can wait forever. This blocking problem is what Three-Phase Commit tries to solve.
I ran into this during a database migration. We had a distributed transaction spanning three data centers, and the coordinator crashed at exactly the wrong moment. Two participants thought the transaction was pending. One thought it had aborted. We spent hours untangling the state. 2PC is simple but fragile. 3PC is smarter but comes with its own costs.
The Blocking Problem in 2PC
Here’s the issue with 2PC. After voting yes in Phase 1, participants enter the prepared state. They hold locks and wait. And wait. And wait for the coordinator’s decision. If the coordinator crashes at this point, those participants are stuck. They cannot commit (maybe the coordinator decided to abort). They cannot abort (maybe the coordinator decided to commit). They just block.
graph TD
subgraph "2PC Coordinator Crash Scenario"
C[Coordinator]
P1[Participant 1 - PREPARED]
P2[Participant 2 - PREPARED]
C -->|Phase 1| P1
C -->|Phase 1| P2
P1 -->|YES| C
P2 -->|YES| C
C -.- X[CRASH]
X -.->|stuck| B1[BLOCKED]
X -.->|stuck| B2[BLOCKED]
end
This blocking is not just a performance issue. Locks sit held. Resources stay consumed. In worst cases, someone has to manually untangle things.
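To make the uncertainty concrete, here is a minimal sketch that lists which final outcomes are still consistent with what a single 2PC participant has seen locally. The state names and both function names are illustrative assumptions, not part of any real library.

```python
def possible_outcomes_2pc(state):
    """Outcomes the coordinator may have chosen, given one participant's local state."""
    if state == "INIT":
        # Has not voted yet, so the coordinator cannot have committed.
        return {"ABORT"}
    if state == "PREPARED":
        # Voted Yes, but the other votes are unknown: either decision is possible.
        return {"COMMIT", "ABORT"}
    if state == "COMMITTED":
        return {"COMMIT"}
    if state == "ABORTED":
        return {"ABORT"}
    raise ValueError(f"unknown state: {state}")


def can_decide_alone(state):
    """A participant can act without the coordinator only if one outcome remains."""
    return len(possible_outcomes_2pc(state)) == 1
```

A participant in `PREPARED` gets two possible outcomes back, which is exactly the blocked case: whichever one it picks on its own might contradict the coordinator.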
How 3PC Extends 2PC
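3PC adds an extra phase between voting and committing. The extra phase lets participants decide on their own after a coordinator crash, so the protocol is non-blocking as long as message delays are bounded and the network does not partition. Those assumptions are stronger than what 2PC needs for safety, which is the trade-off covered in the sections below.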
Phase 1: CanCommit
The coordinator asks all participants if they can commit a transaction. This is identical to 2PC’s prepare phase.
sequenceDiagram
participant C as Coordinator
participant P1 as Participant 1
participant P2 as Participant 2
C->>P1: CanCommit?
C->>P2: CanCommit?
P1-->>C: Yes
P2-->>C: Yes
If any participant votes No or times out, the coordinator sends Abort. The transaction ends. No blocking at this stage.
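The vote-collection rule above can be sketched as follows. This is a minimal illustration, assuming each participant is represented by a plain callable that returns "YES" or "NO"; the names `collect_votes` and `phase_one_decision` are hypothetical, and a real coordinator would send network requests instead of calling functions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def collect_votes(participants, timeout_s):
    """Ask every participant in parallel; silence or an error counts as a No vote.

    participants: dict mapping a name to a zero-argument callable (assumed interface).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(participants)))
    futures = {name: pool.submit(ask) for name, ask in participants.items()}
    deadline = time.monotonic() + timeout_s
    votes = {}
    for name, future in futures.items():
        remaining = max(0.0, deadline - time.monotonic())
        try:
            votes[name] = future.result(timeout=remaining)
        except Exception:
            # Covers both a timeout and a participant that raised an error.
            votes[name] = "NO"
    pool.shutdown(wait=False)  # do not wait for slow participants
    return votes


def phase_one_decision(votes):
    """PreCommit only on a unanimous Yes; anything else aborts."""
    if votes and all(v == "YES" for v in votes.values()):
        return "PRECOMMIT"
    return "ABORT"
```

Treating a timeout as a No vote is safe at this stage because nobody has been told to commit yet, so aborting cannot contradict any earlier decision.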
Phase 2: PreCommit
If all participants vote Yes, the coordinator sends PreCommit to all participants. This is the new phase in 3PC.
sequenceDiagram
participant C as Coordinator
participant P1 as Participant 1
participant P2 as Participant 2
Note over C,P2: Phase 2: PreCommit
C->>P1: PreCommit
C->>P2: PreCommit
P1-->>C: ACK
P2-->>C: ACK
Once a participant receives PreCommit, it knows something: all participants voted Yes, and the coordinator is still alive (it managed to send PreCommit messages). This knowledge changes the failure semantics.
Phase 3: DoCommit
After receiving ACK from all participants, the coordinator sends DoCommit. Participants then finalize the transaction.
sequenceDiagram
participant C as Coordinator
participant P1 as Participant 1
participant P2 as Participant 2
Note over C,P2: Phase 3: DoCommit
C->>P1: DoCommit
C->>P2: DoCommit
P1-->>C: Committed
P2-->>C: Committed
Why 3PC Is Non-Blocking
Here’s what happens when the coordinator crashes during Phase 2.
With 2PC, a participant in the prepared state cannot decide on its own once the coordinator dies, because it does not know how the others voted. With 3PC, when a participant receives PreCommit, it knows every participant voted Yes. If the coordinator crashes after sending PreCommit, participants can safely complete the commit. They have enough information to decide.
graph TD
subgraph "3PC Coordinator Crash After PreCommit"
C[Coordinator]
P1[Participant 1]
P2[Participant 2]
C -->|PreCommit| P1
C -->|PreCommit| P2
P1 -->|ACK| C
P2 -->|ACK| C
C -.- X[CRASH]
P1 -->|I can commit| D1[DoCommit]
P2 -->|I can commit| D2[DoCommit]
end
A participant that receives PreCommit and times out waiting for DoCommit can safely commit. It knows all participants voted Yes and the coordinator was alive long enough to send PreCommit.
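The timeout rule can be written as a small pure function. The commit-after-PreCommit branch comes from the description above; the abort branch for a participant that voted Yes but never saw PreCommit is the usual textbook counterpart and is an assumption here, as are the state names.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    READY = "ready"                # voted Yes in CanCommit
    PRECOMMITTED = "precommitted"  # received PreCommit
    COMMITTED = "committed"
    ABORTED = "aborted"


def on_coordinator_timeout(state):
    """Return the state a participant moves to when the coordinator goes silent."""
    if state is State.PRECOMMITTED:
        # Everyone voted Yes, so finishing the commit is safe.
        return State.COMMITTED
    if state in (State.INIT, State.READY):
        # No PreCommit seen: nobody can have committed yet, so abort.
        return State.ABORTED
    # COMMITTED and ABORTED are final.
    return state
```

The key property is that no state leaves the participant undecided, which is what removes the blocking seen in 2PC.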
Critical Assumptions
3PC requires assumptions that are often violated in practice:
Network Synchrony Assumption
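3PC assumes the network behaves synchronously, at least for the duration of the transaction: messages are delivered within a known time bound, so a timeout reliably signals a crash rather than a slow link. In a truly asynchronous network, where messages can be delayed or lost indefinitely, a participant cannot tell a dead coordinator from a slow one, and 3PC cannot guarantee non-blocking behavior without risking inconsistency.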
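This is where the FLP impossibility result becomes relevant: in a fully asynchronous system where even one node can crash, no deterministic protocol can guarantee both agreement and termination in every execution. Likewise, if the network can partition forever, no protocol can be both safe and live in all executions.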
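A small worked example shows why a timing bound matters for timeouts. If every message takes at most `max_delay_ms` and a node needs at most `max_processing_ms` to reply, then waiting longer than one round trip plus processing time means silence can only be a crash. The function names and the numbers used with them are illustrative assumptions.

```python
def crash_detection_timeout_ms(max_delay_ms, max_processing_ms, margin_ms=0):
    """Smallest wait after which a missing reply implies a crash, given the bounds."""
    # Request travels one way, the node processes it, the reply travels back.
    return 2 * max_delay_ms + max_processing_ms + margin_ms


def is_false_suspicion(actual_reply_ms, timeout_ms):
    """True if a live node would be wrongly treated as crashed."""
    return actual_reply_ms > timeout_ms
```

With a 50 ms delay bound and 20 ms of processing, the timeout is 120 ms. If the real network occasionally takes 200 ms, the bound is violated and a live coordinator gets treated as dead, which is the case 3PC does not handle.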
Bounded Node Failure
3PC assumes nodes do not fail forever. If a node crashes and never recovers, 3PC cannot complete that transaction. The protocol handles transient coordinator failures but not permanent participant failures.
No Partition After PreCommit
3PC guarantees non-blocking behavior when the coordinator crashes after PreCommit AND the network does not partition. If a network partition occurs at exactly the wrong moment, participants could diverge.
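The divergence case can be reproduced with a toy simulation. It assumes the simplest timeout rule (commit if PreCommit was received, abort otherwise) and no recovery protocol among the survivors; `run_with_partition` and `diverged` are hypothetical names for this sketch.

```python
def run_with_partition(participants, reachable):
    """Simulate a coordinator that crashes after PreCommit reached only some nodes.

    participants: iterable of names, all of which voted Yes.
    reachable: names that received PreCommit before the partition.
    """
    states = {p: "READY" for p in participants}
    for p in reachable:
        states[p] = "PRECOMMITTED"

    # The coordinator is gone, so every participant eventually times out.
    final = {}
    for p, state in states.items():
        final[p] = "COMMITTED" if state == "PRECOMMITTED" else "ABORTED"
    return final


def diverged(final_states):
    """True if participants ended up with different outcomes."""
    return len(set(final_states.values())) > 1
```

Calling `run_with_partition(["p1", "p2", "p3"], {"p1", "p2"})` leaves `p3` aborted while the other two commit, so atomicity is lost. When every participant is reachable, all three commit and the outcome is consistent.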
When 3PC Might Be Considered
3PC is rarely used in production, but here are scenarios where it could make sense:
- Short-duration transactions on reliable networks: If your network is mostly reliable and transactions complete quickly, the extra phase overhead might be acceptable.
- Systems requiring strict liveness: If blocking is unacceptable and your network assumptions match 3PC’s requirements, the protocol provides better liveness guarantees than 2PC.
- Research and educational contexts: Understanding 3PC helps understand the trade-offs in distributed transaction protocols.
2PC vs 3PC vs Saga: A Comparison
| Aspect | 2PC | 3PC | Saga |
|---|---|---|---|
| Blocking | Yes, coordinator crash in prepared state | No (under assumptions) | No |
| Phases | 2 | 3 | Many (one per step) |
| Coordinator crash during prepared | Blocks participants | Participants can recover | No effect |
| Network assumptions | None for safety (can block) | Bounded delays, no partitions | None |
| Rollback on failure | Atomic | Atomic | Compensating transactions |
| Performance overhead | 2 round trips | 3 round trips | N round trips |
| Complexity | Low | Medium | High |
| Use case | Tight consistency | Tight consistency | Eventual consistency |
| Example systems | PostgreSQL, MySQL | Rarely used | AWS Step Functions, Temporal |
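The round-trip row translates into rough message and latency numbers. The sketch below assumes the coordinator contacts all participants in parallel, that each round trip is one request and one reply per participant, and that processing and disk flush time are ignored; `protocol_cost` is a hypothetical helper.

```python
def protocol_cost(round_trips, participants, rtt_ms):
    """Estimate message count and minimum latency for one transaction."""
    return {
        # One request and one reply per participant in every round trip.
        "messages": 2 * round_trips * participants,
        # Round trips happen one after another; participants answer in parallel.
        "latency_ms": round_trips * rtt_ms,
    }


two_pc = protocol_cost(round_trips=2, participants=3, rtt_ms=40)
three_pc = protocol_cost(round_trips=3, participants=3, rtt_ms=40)
```

For three participants and a 40 ms round trip, this gives 12 messages and 80 ms for 2PC against 18 messages and 120 ms for 3PC, so the extra phase costs about 50% more on both counts under these assumptions.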
Why 3PC Is Rarely Used
Despite its theoretical advantages, 3PC is rarely used in production:
- The assumptions are hard to meet. Network synchrony is not guaranteed in real systems. Wide-area networks especially can experience prolonged partitions.
- The improvement is marginal. 3PC eliminates blocking only under specific failure scenarios. Most systems just use timeouts and manual intervention instead of adding 3PC’s complexity.
- Saga pattern is often better. For long-running transactions, compensating transactions are more practical than trying to maintain locks across distributed participants.
- The performance cost matters. The extra round trip hurts high-throughput systems. For most use cases, the blocking probability with 2PC is low enough that 3PC’s extra latency is hard to justify.
Implementing a Simple 3PC
Here is a simplified view of how 3PC coordinator logic works:
class ThreePhaseCommitCoordinator:
    """Simplified coordinator: no timeouts, logging, or crash recovery."""

    def __init__(self, participants):
        self.participants = participants
        self.state = "INIT"

    def execute(self, transaction):
        # Phase 1: CanCommit. Collect a vote from every participant.
        votes = []
        for p in self.participants:
            vote = p.can_commit()
            votes.append(vote)

        if all(v == "YES" for v in votes):
            # Phase 2: PreCommit. Tell everyone the vote was unanimous.
            self.state = "PRECOMMIT"
            for p in self.participants:
                p.pre_commit()

            # Phase 3: DoCommit. Finalize on every participant.
            self.state = "COMMIT"
            for p in self.participants:
                p.do_commit()
        else:
            # Abort: at least one participant did not vote Yes.
            self.state = "ABORT"
            for p in self.participants:
                p.abort()
The participant side follows a similar pattern with timeouts at each phase that enable recovery decisions.
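A matching participant could look like the sketch below. It exposes the four methods the coordinator calls (`can_commit`, `pre_commit`, `do_commit`, `abort`) and adds a deadline that is checked by a `tick` method. The injected `clock` callable, the `tick` method, and the abort-on-timeout rule for a participant that has only voted are assumptions made for this illustration.

```python
class ThreePhaseCommitParticipant:
    def __init__(self, name, timeout_s, clock):
        self.name = name
        self.timeout_s = timeout_s
        self.clock = clock      # callable returning the current time in seconds
        self.state = "INIT"
        self.deadline = None

    def _arm_timer(self):
        # Restart the wait for the coordinator's next message.
        self.deadline = self.clock() + self.timeout_s

    def can_commit(self):
        # A real participant would validate the work and acquire locks here.
        self.state = "READY"
        self._arm_timer()
        return "YES"

    def pre_commit(self):
        if self.state == "READY":
            self.state = "PRECOMMITTED"
            self._arm_timer()

    def do_commit(self):
        if self.state == "PRECOMMITTED":
            self.state = "COMMITTED"
            self.deadline = None

    def abort(self):
        if self.state != "COMMITTED":
            self.state = "ABORTED"
            self.deadline = None

    def tick(self):
        """Call periodically; applies the timeout rule once the deadline passes."""
        if self.deadline is None or self.clock() < self.deadline:
            return
        if self.state == "PRECOMMITTED":
            self.state = "COMMITTED"   # everyone voted Yes, safe to finish
        elif self.state == "READY":
            self.state = "ABORTED"     # never saw PreCommit, give up
        self.deadline = None
```

Passing the clock in as a callable keeps the timeout logic testable without real waiting. In production code the same decision would typically be driven by a timer or an event loop.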
Quick Recap
- 3PC adds a PreCommit phase between 2PC’s voting and commit phases
- The PreCommit phase lets participants recover when the coordinator crashes
- 3PC is non-blocking only under assumptions of bounded network delays, no partitions, and bounded failures
- In practice, 3PC is rarely used because its assumptions are hard to meet
- Saga pattern is often preferred for long-running distributed transactions
- 2PC remains the most common protocol for short distributed transactions requiring atomicity
For more on distributed transactions, see Two-Phase Commit for the protocol that 3PC builds upon. To understand the broader consistency landscape, read Consistency Models. For handling failures without blocking, see the Saga Pattern and Outbox Pattern.
Three-Phase Commit solves 2PC’s blocking problem in theory. In practice, the assumptions required for 3PC to work are harder to guarantee than just dealing with 2PC’s rare blocking scenarios. Understanding the trade-offs helps you choose the right protocol for your specific requirements.