Synchronous Replication: Guaranteeing Data Consistency Across Nodes
Every write to your database has to go somewhere. With synchronous replication, that write does not confirm until replicas have confirmed receipt. This trades latency for certainty. When a primary fails, no data is lost because every committed write lives on multiple nodes.
I learned to appreciate synchronous replication the hard way. We ran async replication across three data centers. The primary went down during a network partition. When we brought up the failover, two hours of transactions were gone. The replicas had not caught up. That incident taught me that async is a business decision, not a technical one.
How Synchronous Replication Works
Synchronous replication blocks the commit operation until replicas confirm they have written the data. The primary sends the transaction to replicas, waits for acknowledgment, then returns success to the client.
```mermaid
sequenceDiagram
    participant Client
    participant Primary
    participant Replica1
    participant Replica2
    Client->>Primary: BEGIN; UPDATE...; COMMIT
    Primary->>Replica1: Send WAL entry
    Primary->>Replica2: Send WAL entry
    Replica1-->>Primary: ACK received
    Replica2-->>Primary: ACK received
    Primary-->>Client: COMMIT confirmed
```
The protocol varies. Some systems use write-ahead log (WAL) shipping. Others use statement-based replication. PostgreSQL’s synchronous replication uses WAL, while MySQL Group Replication uses a certification-based approach.
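The commit path above can be sketched as a loop that ships the WAL record and blocks until every configured replica acknowledges it. This is a minimal in-process sketch; the `Replica` class and its `apply` method are illustrative stand-ins, not any real database API:

```python
# Sketch of a synchronous commit path: ship the WAL record, block until
# every configured replica acknowledges, then confirm to the client.
# The Replica class is illustrative, not a real replication API.

class Replica:
    def __init__(self, name):
        self.name = name
        self.wal = []            # WAL records this replica has applied

    def apply(self, record):
        self.wal.append(record)  # durably apply the record
        return "ACK"             # acknowledge back to the primary


def synchronous_commit(replicas, record):
    """Return True only after all replicas acknowledge the record."""
    acks = [r.apply(record) for r in replicas]   # ship WAL entry to each replica
    if all(a == "ACK" for a in acks):
        return True                              # COMMIT confirmed to client
    raise RuntimeError("commit not confirmed: missing replica ACK")


replicas = [Replica("replica1"), Replica("replica2")]
committed = synchronous_commit(replicas, "UPDATE accounts SET ...")
print(committed)  # True: every replica holds the record before commit returns
```

The key property is the ordering: the client never sees a successful commit for a record that exists only on the primary.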
The Cost: Latency
Every synchronous write waits for a network round trip to at least one replica. In a single-datacenter setup, this latency might be 1-5ms. Cross-datacenter latency jumps to 20-100ms or more depending on geography.
```python
import time

# Synchronous replication adds latency
start = time.time()
result = db.execute("UPDATE accounts SET balance = ? WHERE id = ?", new_balance, account_id)
# This blocks until the replica confirms
end = time.time()
print(f"Write latency: {(end - start) * 1000:.1f}ms")  # Typically 5-50ms with sync repl
```
This latency is the reason most systems default to async. For critical data, the durability guarantee is worth the wait.
Quorum-Based Replication
Modern synchronous replication often uses quorum-based approaches. With a quorum, you write to multiple nodes but only wait for a majority to confirm.
```mermaid
graph TD
    subgraph "Quorum Write"
        Client1[Client] --> Write[Write to N nodes]
        Write --> Q{Quorum?}
        Q -->|Write confirmed| Success[Success to Client]
    end
    subgraph "3 Node Cluster"
        N1[(Node 1)]
        N2[(Node 2)]
        N3[(Node 3)]
    end
    Write --> N1
    Write --> N2
    Write --> N3
    N1 --> Q
    N2 --> Q
    N3 --> Q
```
With 3 nodes, you need 2 confirmations (a majority). With 5 nodes, you need 3. The quorum provides fault tolerance: you can lose floor((N-1)/2) nodes and still have a majority available to confirm writes.
```python
# Quorum calculation
def required_quorum(total_nodes):
    return (total_nodes // 2) + 1

print(required_quorum(3))  # 2 -- 3 nodes require 2 acks
print(required_quorum(5))  # 3 -- 5 nodes require 3 acks
print(required_quorum(7))  # 4 -- 7 nodes require 4 acks
```
Google Spanner uses this approach. So does CockroachDB. Both provide strong consistency across globally distributed nodes.
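The quorum write itself can be sketched in a few lines: send the record to every node, succeed as soon as a majority acknowledges, and tolerate a minority of failed or slow nodes. Node failures are simulated here with a simple `up` flag; this is a toy model, not any real client API:

```python
# Sketch of a quorum write: the write succeeds once a majority of nodes
# acknowledge, so a minority of failed nodes cannot block it.

def required_quorum(total_nodes):
    return (total_nodes // 2) + 1

def quorum_write(nodes, record):
    """Return True if a majority of nodes durably stored the record."""
    acks = 0
    for node in nodes:
        if node["up"]:
            node["log"].append(record)  # node stores the record and ACKs
            acks += 1
    return acks >= required_quorum(len(nodes))

# 3-node cluster with one node down: 2 ACKs still form a majority.
nodes = [{"up": True, "log": []},
         {"up": True, "log": []},
         {"up": False, "log": []}]
print(quorum_write(nodes, "txn-42"))  # True: 2 of 3 nodes acknowledged
```

With two of three nodes down, the same call returns `False` and the write must be rejected rather than silently under-replicated.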
Read-Your-Writes Consistency
Synchronous replication gives you read-your-writes consistency by default. If a client writes data and then immediately reads it, even from a replica, the read will see that write. No special handling needed.
```sql
-- With synchronous replication, this always works
BEGIN;
UPDATE users SET email = 'new@example.com' WHERE id = 1;
COMMIT;
-- Immediate read returns the new email
SELECT email FROM users WHERE id = 1;
-- Returns: 'new@example.com'
```
With async replication, that immediate read might hit a replica that has not yet received the update. You would see the old email.
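The difference is easy to see in a toy model where plain dicts stand in for the primary and a replica (purely illustrative, not a database API):

```python
# Toy illustration: a synchronous write is visible on the replica before the
# commit returns; an asynchronous write leaves the replica temporarily stale.

primary, replica = {}, {}

def sync_write(key, value):
    primary[key] = value
    replica[key] = value   # commit returns only after the replica has it

def async_write(key, value):
    primary[key] = value   # commit returns immediately; replica lags behind

sync_write("email", "new@example.com")
print(replica.get("email"))  # new@example.com -- read-your-writes holds

async_write("phone", "555-0100")
print(replica.get("phone"))  # None -- a replica read sees stale data
```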
Single-Leader vs Multi-Leader
Synchronous replication works with both single-leader and multi-leader topologies.
Single-leader is simpler. All writes go to one primary. The primary coordinates synchronous replication to followers. PostgreSQL streaming replication in synchronous mode uses this topology.
Multi-leader allows writes to any node. Each leader synchronously replicates to other leaders. This is more complex because concurrent writes to the same row can conflict.
```mermaid
graph TD
    subgraph "Single Leader"
        L1[Leader] --> F1[Follower 1]
        L1 --> F2[Follower 2]
    end
    subgraph "Multi Leader"
        L3[Leader A] <--> L4[Leader B]
        L4 <--> L5[Leader C]
        L5 <--> L3
    end
```
Most production systems use single-leader synchronous replication. Multi-leader conflicts are genuinely hard to handle correctly.
Configuring Synchronous Replication
PostgreSQL
```sql
-- Enable synchronous replication
ALTER SYSTEM SET synchronous_commit = on;
ALTER SYSTEM SET synchronous_standby_names = 'replica1,replica2';
SELECT pg_reload_conf();

-- Verify status (LSNs cannot be subtracted directly; use pg_wal_lsn_diff)
SELECT client_addr, state, sent_lsn, write_lsn,
       flush_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replication_lag_bytes
FROM pg_stat_replication;
```
MySQL Group Replication
```sql
-- Group Replication commits only after a majority of the group certifies
-- the transaction; these settings additionally enable multi-primary mode
SET GLOBAL group_replication_single_primary_mode = OFF;
SET GLOBAL group_replication_enforce_update_everywhere_checks = ON;
START GROUP_REPLICATION;
```
CockroachDB
CockroachDB uses the Raft consensus algorithm for synchronous replication. By default, it writes to 3 nodes and waits for 2 confirmations.
```sql
-- CockroachDB replicates each Range synchronously via Raft;
-- 3-way replication is built in
SHOW RANGES FROM TABLE users;
```
When Synchronous Replication Makes Sense
Synchronous replication is not always the answer. Use it when:
- Financial transactions: Ledgers, payments, anything where data loss is unacceptable
- Inventory systems: You cannot oversell because a replica was behind
- Regulatory requirements: Some compliance frameworks mandate zero data loss (RPO = 0)
- Critical metadata: User accounts, permissions, access control lists
```python
# Example: Synchronous replication for financial transactions
def transfer_funds(from_account, to_account, amount):
    with db.transaction():
        # These writes confirm synchronously
        db.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                   amount, from_account)
        db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                   amount, to_account)
        # Transaction only commits when replicas confirm
```
Semi-Synchronous Replication
Pure synchronous replication can block forever if a replica goes down. Semi-synchronous replication offers a middle ground: wait for at least one replica, and fall back to asynchronous mode if none respond within a timeout.
```mermaid
graph TD
    P[Primary] --> R1[Replica 1]
    P --> R2[Replica 2]
    R1 -->|confirms| P
    P -->|after 1 ack| Client[Client notified]
    R2 -.->|timeout| P
```
PostgreSQL approximates this with `synchronous_standby_names = 'ANY 1 (replica1, replica2)'`, which waits for any one standby from the list. MySQL provides it directly through its semi-synchronous replication plugin, which reverts to asynchronous mode after a configurable timeout.
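The "first ACK wins, timeout falls back to async" behavior can be sketched with a thread pool. Replica latencies are simulated with `time.sleep`; none of this is a real replication API:

```python
# Sketch of semi-synchronous commit: wait for the first replica ACK, but
# fall back to asynchronous mode if no replica answers within the timeout.
import concurrent.futures
import time

def semi_sync_commit(replica_delays, timeout=0.05):
    """Return 'semi-sync' if one replica ACKs in time, else 'async-fallback'."""
    def send(delay):
        time.sleep(delay)   # simulated network + replica write latency
        return "ACK"

    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(send, d) for d in replica_delays]
        done, _ = concurrent.futures.wait(
            futures, timeout=timeout,
            return_when=concurrent.futures.FIRST_COMPLETED)
    return "semi-sync" if done else "async-fallback"

print(semi_sync_commit([0.001, 0.2]))  # semi-sync: fast replica ACKed in time
print(semi_sync_commit([0.2, 0.2]))    # async-fallback: no ACK before timeout
```

Note the asymmetry: the fallback trades the durability guarantee for availability, which is exactly the semi-sync bargain.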
Synchronous Replication Failure Flow
Understanding what happens when components fail is critical for designing resilient systems.
```mermaid
flowchart TD
    Start[Write Request] --> Primary[Primary receives write]
    Primary --> Send1{Send to Replica 1}
    Primary --> Send2{Send to Replica 2}
    Send1 -->|ACK received| Wait2{Wait for Replica 2}
    Send2 -->|ACK received| Wait1{Wait for Replica 1}
    Send1 -->|timeout| Retry1{Retry timeout?}
    Send2 -->|timeout| Retry2{Retry timeout?}
    Retry1 -->|exceeded| Block[Primary blocks write]
    Retry2 -->|exceeded| Block
    Wait2 -->|both acks| Success[Write confirmed to client]
    Wait1 -->|both acks| Success
    Block -->|Replica recovers| Resume[Resume waiting]
    Block -->|replica down long| FailWrite[Write fails - transaction rejected]
```
Failure scenarios:
| Scenario | Behavior | Impact |
|---|---|---|
| One replica times out | Primary blocks until timeout exceeds threshold | Write latency increases |
| Both replicas timeout | Write fails, transaction rejected | Client must retry |
| Primary fails before commit | No data loss (no commit until replicas confirm) | Automatic failover to replica |
| Network partition | Primary cannot reach quorum | Primary blocks or fails |
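Where the table above says "client must retry", the usual pattern is a bounded retry with exponential backoff. This is a generic sketch; `do_write` stands in for any database call that raises when the write is rejected:

```python
# Sketch of client-side retry for rejected synchronous writes:
# bounded attempts with exponential backoff between them.
import time

def write_with_retry(do_write, attempts=3, backoff=0.1):
    """Retry a rejected write up to `attempts` times, backing off between tries."""
    for attempt in range(attempts):
        try:
            return do_write()
        except RuntimeError:                    # replicas unreachable, write rejected
            if attempt == attempts - 1:
                raise                           # out of attempts: surface the error
            time.sleep(backoff * 2 ** attempt)  # 0.1s, 0.2s, ...

# Simulate a write that fails once while a replica recovers, then succeeds.
calls = {"n": 0}
def flaky_write():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("quorum not reached")
    return "committed"

print(write_with_retry(flaky_write))  # committed (after one retry)
```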
Trade-Off Comparison: Sync vs Semi-Sync vs Async
| Dimension | Synchronous | Semi-Synchronous | Asynchronous |
|---|---|---|---|
| Durability Guarantee | Full (all replicas confirm) | Partial (at least one) | Minimal (primary only) |
| Write Latency | Highest (waits for all) | Medium (waits for one) | Lowest (no wait) |
| Blocking Risk | High (any replica down) | Low (only primary down) | None |
| Data Loss Risk | None | Minimal | Possible |
| Availability | Lower | Medium | Highest |
| Typical RPO | Zero | Near-zero | Lag-dependent |
| Cross-DC Support | Poor (high latency) | Moderate | Excellent |
Monitoring Synchronous Replication
Watch these metrics:
```sql
-- PostgreSQL: Check replication health
SELECT client_addr, state, sent_lsn, write_lsn,
       flush_lsn, replay_lsn,
       replay_lag, flush_lag, write_lag
FROM pg_stat_replication;

-- MySQL: Group Replication status
SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;
```
Key metrics to track:
| Metric | Warning | Critical |
|---|---|---|
| Replication lag | > 1s | > 10s |
| Replica state | — | not "streaming" |
| Disk usage | > 70% | > 85% |
| Connection count | > 80% max | > 95% max |
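The lag and state thresholds from the table above can be wired into a simple health check. The metric values here are made up; in practice you would pull them from `pg_stat_replication` or the Group Replication tables:

```python
# Sketch of a health check applying the warning/critical thresholds from
# the monitoring table: lag > 1s warns, lag > 10s or a non-streaming
# replica state is critical.

def replication_status(lag_seconds, state):
    """Classify a replica as 'ok', 'warning', or 'critical'."""
    if state != "streaming":
        return "critical"      # replica disconnected or in catchup
    if lag_seconds > 10:
        return "critical"
    if lag_seconds > 1:
        return "warning"
    return "ok"

print(replication_status(0.2, "streaming"))  # ok
print(replication_status(3.0, "streaming"))  # warning
print(replication_status(0.2, "catchup"))    # critical
```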
Split-Brain Prevention
Synchronous replication helps prevent split-brain scenarios. If the primary fails and a replica promotes, that replica has all committed data because writes only confirmed after replicas received them.
But you still need quorum-based failover logic. Multiple replicas might think the primary is down simultaneously. Use consensus-based leader election to ensure only one replica promotes.
```mermaid
graph TD
    M[Monitor] --> P1{Primary alive?}
    P1 -->|no| E[Election]
    E --> Q{Quorum agrees?}
    Q -->|yes| P2[Promote replica]
    Q -->|no| E
    P2 --> NM[New Primary]
```
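The quorum gate in the election above reduces to one check: a replica may promote only if a strict majority of monitors agrees the primary is down. A minimal sketch:

```python
# Sketch of quorum-gated promotion: promote a replica only when a strict
# majority of the cluster agrees the primary is dead, preventing two
# replicas from promoting simultaneously.

def may_promote(votes_primary_down, cluster_size):
    """True only when a strict majority reports the primary as down."""
    return votes_primary_down > cluster_size // 2

print(may_promote(2, 3))  # True: 2 of 3 agree, safe to promote one replica
print(may_promote(1, 3))  # False: no quorum, do not promote (avoids split-brain)
```

Because two disjoint majorities cannot exist in the same cluster, at most one replica can ever pass this check for a given round of votes.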
Common Pitfalls
- Configuring sync with only one replica: If you have synchronous replication configured but only one replica, losing that replica blocks all writes. Always have at least two replicas for synchronous mode.
- Mixing sync and async modes: Some databases let you configure which replicas use sync vs async. Mixing modes creates confusing behavior where some data is durable and some is not.
- Ignoring latency during peak load: During high write throughput, synchronous replication latency compounds. Test under load, not just idle conditions.
- Not testing failover: If you have never tested synchronous failover, you do not know if it works correctly. Schedule regular failover drills.
- Geographic distribution: Synchronous replication across continents is usually impractical. If you need strong consistency with geographic distribution, consider synchronizing to a nearby replica and async-replicating to distant ones.
Quick Recap
- Synchronous replication blocks until replicas confirm writes
- Latency cost: expect 1-50ms per write depending on topology
- Quorum-based replication provides fault tolerance with majority confirmation
- Best for: financial data, inventory, anything requiring zero data loss
- Monitor replication lag, connection counts, and disk usage
- Test failover procedures before you need them
For more on replication strategies, see Database Replication for a broader overview. To understand the consistency trade-offs, read Consistency Models. If you are evaluating replication against other availability patterns, see Availability Patterns.
Synchronous replication is not free, but for data that cannot be lost, it is worth the latency cost. The key is knowing which data truly needs it and configuring your system accordingly.