CQRS and Event Sourcing: Distributed Data Management Patterns
Learn about Command Query Responsibility Segregation and Event Sourcing patterns for managing distributed data in microservices architectures.
Most database patterns assume the same model works for reading and writing. CRUD against a single table works fine when your system is simple. That assumption breaks down as you scale. Read-heavy workloads collide with write-heavy workloads. Reporting queries slow down transactional writes. Security gets tangled when the same model serves both purposes.
CQRS and Event Sourcing tackle these problems head-on. These are not new ideas, but they have become more relevant as teams move toward microservices and distributed systems where a monolithic database is no longer the right tool.
This post explains both patterns, shows how they work together, and helps you decide when the tradeoffs make sense.
What is CQRS
CQRS stands for Command Query Responsibility Segregation. The core idea is straightforward: use different models for writing data versus reading data.
In a typical application, one data model handles both operations:
```sql
-- Writing
INSERT INTO orders (customer_id, total, status) VALUES (1, 99.99, 'pending');

-- Reading
SELECT * FROM orders WHERE customer_id = 1;
```
With CQRS, you split this into two distinct models:
- Command model: Handles writes. Optimized for validating and persisting state changes.
- Query model: Handles reads. Optimized for answering specific questions efficiently.
The command model might store data in a normalized relational structure. The query model might use a denormalized format optimized for specific access patterns, like a separate read model for each screen in your application.
Why Separate Reads and Writes
Reads and writes have fundamentally different characteristics:
| Aspect | Writes | Reads |
|---|---|---|
| Pattern | Infrequent, discrete operations | Frequent, potentially complex queries |
| Data volume | Single record or small batch | Large result sets, aggregations |
| Optimization | Normalization, constraints | Denormalization, indexes |
| Timing | Immediate consistency needed | Can tolerate staleness |
When you try to optimize both on the same model, you end up compromising both. CQRS lets you optimize each side independently.
Basic CQRS Flow
```mermaid
graph LR
    Command[Command] --> CommandHandler[Command Handler]
    CommandHandler --> WriteStore[(Write Store)]
    WriteStore -->|async projection| ReadStore[(Read Store)]
    Query[Query] --> QueryHandler[Query Handler]
    QueryHandler --> ReadStore
```
Commands flow to the command handler, which validates and persists changes to the write store. Read models are updated asynchronously from the write store through an event-driven projection mechanism.
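The flow above can be sketched with in-memory stores. All names here are hypothetical, and the projection runs synchronously for brevity; real systems project asynchronously.

```python
# Minimal CQRS sketch: a normalized write store and a denormalized read
# store, kept in sync by a projection step.
write_store = {}   # order_id -> normalized order record
read_store = {}    # customer_id -> list of order summaries (denormalized)

def handle_place_order(order_id, customer_id, total):
    # Command side: validate, then persist to the write store
    if total <= 0:
        raise ValueError("total must be positive")
    write_store[order_id] = {"customer_id": customer_id,
                             "total": total, "status": "pending"}
    project_order(order_id)  # projection keeps the read model in sync

def project_order(order_id):
    # Projection: copy into a shape optimized for "orders by customer"
    order = write_store[order_id]
    read_store.setdefault(order["customer_id"], []).append(
        {"order_id": order_id, "total": order["total"]})

def handle_orders_by_customer(customer_id):
    # Query side: a single dictionary lookup, no joins
    return read_store.get(customer_id, [])

handle_place_order("o-1", 1, 99.99)
print(handle_orders_by_customer(1))  # [{'order_id': 'o-1', 'total': 99.99}]
```

The point of the split is that each side can now change shape independently: the write store could become a relational schema and the read store a document collection without either knowing about the other.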
Event Sourcing Fundamentals
Event sourcing changes how you think about persistence. Instead of storing current state, you store the sequence of events that led to that state.
Storing Events Instead of State
Traditional state storage overwrites values:
```sql
UPDATE accounts SET balance = 500 WHERE id = 1;
```
The old state is lost.
Event sourcing appends instead:
```text
AccountCredited { account_id: 1, amount: 200, balance: 200, timestamp: 2026-03-24T10:00:00Z }
AccountCredited { account_id: 1, amount: 300, balance: 500, timestamp: 2026-03-24T11:00:00Z }
```
Every change becomes an immutable record. You derive the current balance by replaying events.
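A minimal sketch of deriving state by replay, with a hypothetical event shape:

```python
# Current state is a fold over the event stream, not a stored value.
events = [
    {"type": "AccountCredited", "amount": 200},
    {"type": "AccountCredited", "amount": 300},
    {"type": "AccountDebited",  "amount": 50},
]

def replay_balance(events):
    balance = 0
    for e in events:
        if e["type"] == "AccountCredited":
            balance += e["amount"]
        elif e["type"] == "AccountDebited":
            balance -= e["amount"]
    return balance

print(replay_balance(events))  # 450
```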
The Event Store
An event store is a specialized database designed for event sourcing. Core operations:
- Append events: Add new events to a stream
- Read stream: Retrieve all events for an aggregate in order
- Subscribe: Get notified when new events are added
The event store is the source of truth. Everything else is derived.
```mermaid
graph LR
    E1[AccountOpened] --> Store[(Event Store)]
    E2[Deposited] --> Store
    E3[Withdrawn] --> Store
    Store -->|replay| Snapshot[Current State]
```
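The three core operations fit in a toy in-memory store. This is a sketch only: a real event store adds durability, ordering guarantees, and concurrency control.

```python
from collections import defaultdict

class InMemoryEventStore:
    """Toy event store illustrating append, read-stream, and subscribe."""

    def __init__(self):
        self.streams = defaultdict(list)  # stream_id -> ordered events
        self.subscribers = []

    def append(self, stream_id, event):
        self.streams[stream_id].append(event)  # append-only, never update
        for callback in self.subscribers:      # notify live subscribers
            callback(stream_id, event)

    def read_stream(self, stream_id):
        return list(self.streams[stream_id])   # all events, in order

    def subscribe(self, callback):
        self.subscribers.append(callback)

store = InMemoryEventStore()
store.subscribe(lambda s, e: print(f"{s}: {e['type']}"))
store.append("account-1", {"type": "AccountOpened"})
store.append("account-1", {"type": "Deposited", "amount": 100})
print(len(store.read_stream("account-1")))  # 2
```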
Aggregates and Event Streams
In domain-driven design, an aggregate is a cluster of related objects treated as a single unit for data changes. Each aggregate has its own event stream.
For an order management system:
- `OrderAggregate` stream contains: `OrderCreated`, `ItemAdded`, `ItemRemoved`, `OrderConfirmed`, `OrderShipped`, `OrderCancelled`
- `InventoryAggregate` stream contains: `InventoryReserved`, `InventoryReleased`, `InventoryDamaged`
Each stream is append-only. Events are immutable once written.
Benefits of Event Sourcing
Event sourcing solves several problems that are difficult to handle with state storage:
Complete audit trail. You have a log of every state change. Regulatory requirements, debugging, and understanding user behavior all become easier with the full history.
Temporal queries. “What was the account balance on March 1st at 3pm?” With event sourcing, you replay events up to that timestamp. With state storage, you typically do not have this data readily available.
Replay capability. If your read model has a bug, you fix it and rebuild from the event stream. No data is lost because the events are the source of truth.
Debugging and tracing. You can replay production events locally to reproduce bugs. The event log becomes a precise record of what happened.
Parallel projections. Multiple read models can be built from the same event stream independently. You can add new views of your data without modifying the write side.
Challenges of Event Sourcing
Event sourcing is not without friction:
Event schema evolution. Event schemas change over time. Old events must still deserialize correctly. You need a strategy for handling changes: upcasters, version numbers, or both.
Projection rebuild times. Replaying millions of events takes time. Large aggregates can become slow to reconstruct. Periodic snapshots help mitigate this.
Eventual consistency. Read models built from events lag behind writes. Your application must handle stale data.
Increased complexity. The mental model differs from traditional CRUD. Teams need time to adapt.
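The schema-evolution challenge is commonly addressed with upcasters that translate old event versions on read. A sketch, with hypothetical event and field names:

```python
# Upcaster sketch: v1 events stored a single "name" field; current code
# expects "first_name" and "last_name". The upcaster rewrites old events
# as they are read, so stored events never need to be mutated.
def upcast_customer_registered_v1_to_v2(event):
    if event.get("version", 1) >= 2:
        return event  # already in the current shape
    first, _, last = event["name"].partition(" ")
    upcasted = {k: v for k, v in event.items() if k != "name"}
    upcasted.update({"first_name": first, "last_name": last, "version": 2})
    return upcasted

old = {"type": "CustomerRegistered", "name": "Ada Lovelace", "version": 1}
print(upcast_customer_registered_v1_to_v2(old)["last_name"])  # Lovelace
```

In practice upcasters are chained (v1 to v2 to v3) so each migration stays small, and they run in the deserialization path before any handler sees the event.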
Event Stores
Standard relational databases can be pressed into service as event stores, but doing it well takes significant effort. Beyond modest volumes, you want infrastructure designed for append-only event storage.
Kafka
Apache Kafka is frequently used as an event store for event sourcing. It provides durable, ordered, partitioned event streams with consumer group semantics.
Kafka excels at:
- High-throughput event streaming
- Event replay from any offset
- Multi-subscriber event consumption
- Partitioning for horizontal scaling
Kafka is not a true event store in the domain-driven design sense because it lacks aggregate-centric operations. It is an event streaming platform adapted for event sourcing patterns.
EventStoreDB
EventStoreDB is purpose-built for event sourcing. It understands aggregates, streams, and event versioning natively.
EventStoreDB provides:
- Aggregate-centric APIs
- Built-in projections
- Event versioning and upcasting support
- Subscription mechanisms for building read models
EventStoreDB is a better fit when event sourcing is the primary pattern and you want infrastructure designed specifically for it.
Other Options
- Postgres with append-only tables: Simple approach for low-volume scenarios
- Axon Server: Commercial event store with built-in CQRS support
- Marten: Event sourcing library for .NET using Postgres
The choice depends on your scale requirements, team expertise, and whether you need the broader Kafka ecosystem for other use cases.
Building Read Models with Projections
Projections build read models from events. A projection subscribes to event streams and updates a read model accordingly.
How Projections Work
```mermaid
graph LR
    EventStore[(Event Store)] --> Projection[Projection]
    Projection --> ReadModel[(Read Model)]
    subgraph Projection Logic
        P1[Handle AccountOpened]
        P2[Handle Deposit]
        P3[Handle Withdrawal]
    end
    Projection --> P1
    Projection --> P2
    Projection --> P3
```
When an event is written, the projection handler for that event type executes. It reads the event, updates the read model, and commits.
Projection Examples
Given these events:
```text
OrderCreated { order_id: 123, customer_id: 456, items: [...] }
ItemAdded { order_id: 123, item_id: 789, quantity: 2 }
OrderConfirmed { order_id: 123 }
```
An “orders by customer” read model projection:
```python
def handle_order_created(event):
    read_model.insert({
        'order_id': event.order_id,
        'customer_id': event.customer_id,
        'status': 'created',
        # Sum quantities so later ItemAdded increments stay consistent
        'total_items': sum(item.quantity for item in event.items)
    })

def handle_item_added(event):
    read_model.update(
        {'order_id': event.order_id},
        {'$inc': {'total_items': event.quantity}}
    )

def handle_order_confirmed(event):
    read_model.update(
        {'order_id': event.order_id},
        {'$set': {'status': 'confirmed'}}
    )
```
The projection logic can be anything you can express in code. This flexibility is what makes projections powerful.
Command Handler Implementation
```python
import uuid
from datetime import datetime

class CommandHandler:
    """Handles commands and emits events."""

    def __init__(self, event_store):
        self.event_store = event_store

    def handle_create_order(self, command: CreateOrderCommand) -> OrderCreatedEvent:
        # Validate business rules
        if not command.items:
            raise InvalidOrderError("Order must have at least one item")

        # Create and persist event
        event = OrderCreatedEvent(
            order_id=str(uuid.uuid4()),
            customer_id=command.customer_id,
            items=command.items,
            total=sum(item.price * item.quantity for item in command.items),
            created_at=datetime.utcnow()
        )
        self.event_store.append(event)
        return event

    def handle_confirm_order(self, command: ConfirmOrderCommand) -> OrderConfirmedEvent:
        # Load current state from event stream
        events = self.event_store.get_stream(command.order_id)
        order = OrderAggregate.reconstruct(events)

        # Validate state transition: only a newly created order can be confirmed
        if order.status != "created":
            raise InvalidStateTransitionError(
                f"Cannot confirm order in status: {order.status}")

        # Emit event
        event = OrderConfirmedEvent(
            order_id=command.order_id,
            confirmed_at=datetime.utcnow()
        )
        self.event_store.append(event)
        return event
```
Aggregate Reconstruction with Snapshots
```python
class OrderAggregate:
    """Reconstructs aggregate state from events with snapshot support."""

    SNAPSHOT_INTERVAL = 100  # Create snapshot every 100 events

    @classmethod
    def reconstruct(cls, events: list[Event], snapshot: Snapshot = None) -> "OrderAggregate":
        aggregate = cls()
        if snapshot:
            # Start from snapshot
            aggregate.order_id = snapshot.order_id
            aggregate.customer_id = snapshot.customer_id
            aggregate.status = snapshot.status
            aggregate.items = snapshot.items
            aggregate.total = snapshot.total
            aggregate.version = snapshot.version
        else:
            # Replay from the beginning
            aggregate.version = 0

        # Apply only events newer than the snapshot; `events` is the full
        # stream, indexed by version
        start_version = snapshot.version if snapshot else 0
        for event in events[start_version:]:
            aggregate._apply(event)
            aggregate.version += 1
        return aggregate

    def _apply(self, event: Event):
        if isinstance(event, OrderCreated):
            self.order_id = event.order_id
            self.customer_id = event.customer_id
            self.items = event.items
            self.total = event.total
            self.status = "created"
        elif isinstance(event, ItemAdded):
            self.items.append(event.item)
            self.total += event.item.price * event.item.quantity
        elif isinstance(event, OrderConfirmed):
            self.status = "confirmed"

    def should_snapshot(self) -> bool:
        # Guard against version 0, which would otherwise trigger a snapshot
        return self.version > 0 and self.version % self.SNAPSHOT_INTERVAL == 0

    def create_snapshot(self) -> Snapshot:
        return Snapshot(
            order_id=self.order_id,
            customer_id=self.customer_id,
            status=self.status,
            items=self.items,
            total=self.total,
            version=self.version
        )
```
Idempotent Projection Handler
```python
class IdempotentProjection:
    """Projection that handles duplicate events safely."""

    def __init__(self, read_model_db):
        self.read_model = read_model_db
        self.processed_events = set()  # Track processed event IDs

    def handle(self, event: Event):
        # Idempotency check
        if event.event_id in self.processed_events:
            return  # Already processed, skip

        # Process event
        if isinstance(event, OrderCreated):
            self._handle_order_created(event)
        elif isinstance(event, OrderConfirmed):
            self._handle_order_confirmed(event)

        # Mark as processed
        self.processed_events.add(event.event_id)

        # Prune old event IDs to prevent unbounded memory growth
        if len(self.processed_events) > 100_000:
            self._prune_processed_events()  # implementation not shown

    def _handle_order_created(self, event):
        # Upsert instead of insert, so replaying the event cannot fail
        # or create a duplicate row
        self.read_model.upsert(
            {'order_id': event.order_id},
            {
                '$set': {
                    'customer_id': event.customer_id,
                    'status': 'created',
                    'created_at': event.created_at
                },
                '$setOnInsert': {'_id': event.order_id}
            }
        )
```
Because projections are derived from events, they can be rebuilt from scratch at any time. This is invaluable when:
- You find a bug in projection logic
- You need a new read model from historical data
- You want to change the read model schema
Rebuilding involves:
- Clear the existing read model
- Reset the projection checkpoint to the beginning
- Replay all events through the projection handler
The event store provides the replay capability.
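The three rebuild steps can be sketched with dicts and lists standing in for the store, read model, and checkpoint (all hypothetical):

```python
# Rebuild sketch: clear the read model, reset the checkpoint, replay.
event_log = [  # the event store's global, ordered log
    {"type": "OrderCreated", "order_id": 1},
    {"type": "OrderConfirmed", "order_id": 1},
]
read_model = {}               # order_id -> status
checkpoint = {"position": 0}  # how far the projection has consumed

def handle(event):
    if event["type"] == "OrderCreated":
        read_model[event["order_id"]] = "created"
    elif event["type"] == "OrderConfirmed":
        read_model[event["order_id"]] = "confirmed"

def rebuild():
    read_model.clear()            # 1. clear the existing read model
    checkpoint["position"] = 0    # 2. reset the projection checkpoint
    for i, event in enumerate(event_log):
        handle(event)             # 3. replay all events through the handler
        checkpoint["position"] = i + 1

rebuild()
print(read_model)  # {1: 'confirmed'}
```

In production the rebuild typically writes into a fresh read model and swaps it in atomically once it catches up, so readers never see a half-built view.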
Eventual Consistency Implications
CQRS and event sourcing introduce eventual consistency between the write side and read models. This has practical implications for how you build applications.
What Eventual Consistency Means
When you write data, the command completes successfully. The event is persisted. But the read model update happens asynchronously. For a brief window, reads return stale data.
The duration of this window varies:
- Local event processing: milliseconds
- Kafka consumer lag: milliseconds to seconds
- Cross-datacenter replication: seconds to minutes
You must design your application assuming reads can be stale.
User Experience Considerations
Users notice eventual consistency in specific scenarios:
- They submit an order and immediately check their order list
- They update their profile and refresh the page
- They delete an item and see it briefly before the page updates
Mitigation strategies:
- Optimistic UI: Show the expected state immediately, reconcile if needed
- Read-your-writes consistency: Route reads to a model that includes your own writes
- Refresh mechanisms: Provide manual refresh or auto-refresh for stale-sensitive views
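Read-your-writes can be approximated by having the command return the version of the event it wrote, and having the query side wait until the projection has applied at least that version. A sketch with hypothetical names:

```python
import time

# Shared state: the version of the last event applied to the read model.
state = {"projection_version": 0, "orders": []}

def projection_apply(event, version):
    # Projection worker: update the read model, then advance the version
    state["orders"].append(event["order_id"])
    state["projection_version"] = version

def query_orders(min_version, timeout_s=1.0):
    # Block (briefly) until the read model includes the caller's own write
    deadline = time.monotonic() + timeout_s
    while state["projection_version"] < min_version:
        if time.monotonic() >= deadline:
            raise TimeoutError("read model still behind; retry or show stale data")
        time.sleep(0.005)
    return list(state["orders"])

projection_apply({"order_id": 123}, version=1)  # projection catches up
print(query_orders(min_version=1))  # [123]
```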
Eventual Consistency Flow
```mermaid
graph LR
    Command[Command] --> Validate[Validate Command]
    Validate --> Persist[Persist to Event Store]
    Persist --> Emit[Emit Event]
    Emit --> Async[Async Projection]
    Async --> Update[Update Read Model]
    Update --> Query[Query Read Model]
    Query --> User[Return to User]
    Persist -.->|read immediately after write<br/>sees stale data| Query
```
Production Failure Scenarios
| Failure | Impact | Mitigation |
|---|---|---|
| Projection lag too long | Users see stale data for extended period | Monitor projection lag; scale projection workers; alert on SLA breach |
| Event schema changes break projections | Read models stop updating | Implement upcasters; version event schemas; test schema evolution |
| Projection worker crashes mid-update | Partial update to read model | Idempotent projections; event replay capability; transactional outbox |
| Event store becomes unavailable | Commands cannot be processed | Use replicated event store; partition for availability |
| Snapshot too old | Long aggregate reconstruction time | Define snapshot frequency; monitor snapshot age |
| Concurrent projections conflict | Duplicate or lost updates in read model | Use optimistic concurrency; idempotent operations; event ordering guarantees |
| Kafka consumer group rebalance | Brief read model unavailability | Plan for rebalance delays; use consumer lag monitoring |
| Read model database unavailable | Reads fail but writes succeed | Read replicas; circuit breakers; graceful degradation |
Consistency Guarantees
CQRS does not mean you abandon consistency entirely:
- Commands validate against current state before accepting changes
- Events are written atomically within an aggregate
- Read models eventually catch up
The aggregate boundary defines your consistency scope. Within an aggregate, you have strong consistency. Across aggregates, you have eventual consistency.
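Within that aggregate boundary, event stores typically enforce strong consistency through optimistic concurrency: an append carries the stream version the writer last saw, and conflicting appends are rejected. A toy sketch; the API shape is an assumption, loosely modeled on expected-version appends:

```python
class ConcurrencyError(Exception):
    pass

streams = {}  # stream_id -> list of events

def append(stream_id, event, expected_version):
    # Succeed only if no one else has appended since the caller read the
    # stream; otherwise the caller must reload, re-validate, and retry.
    stream = streams.setdefault(stream_id, [])
    if len(stream) != expected_version:
        raise ConcurrencyError(
            f"expected version {expected_version}, stream is at {len(stream)}")
    stream.append(event)

append("order-1", {"type": "OrderCreated"}, expected_version=0)
append("order-1", {"type": "OrderConfirmed"}, expected_version=1)
try:
    # A concurrent writer holding a stale version is rejected
    append("order-1", {"type": "OrderCancelled"}, expected_version=1)
except ConcurrencyError as e:
    print(e)  # expected version 1, stream is at 2
```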
When to Use CQRS and Event Sourcing
These patterns solve specific problems. They introduce complexity that must be justified.
Trade-off Comparison
| Criteria | Traditional CRUD | CQRS Only | CQRS + Event Sourcing |
|---|---|---|---|
| Complexity | Low | Medium | High |
| Read/Write Optimization | Limited (same model) | Independent optimization | Independent optimization |
| Audit Trail | Limited (last state only) | Limited (last state only) | Complete history |
| Temporal Queries | Not supported | Not supported | Full support |
| Projection Rebuild | N/A | Clear and rebuild | Replay events |
| Event Schema Evolution | N/A | N/A | Required |
| Infrastructure | Standard database | Write + read stores | Event store + projections |
| Team Learning Curve | Low | Medium | High |
| Debugging | Standard | Standard | Event replay |
Good Fits
CQRS and event sourcing work well when:
Audit requirements are strict. Financial systems, healthcare records, compliance-heavy domains. The immutable event log satisfies audit requirements naturally.
Temporal queries are frequent. “Show me all state changes for account X in Q3 2025.” With events, this is straightforward. With state storage, you need special infrastructure.
Multiple read models exist. Different consumers need different views of the same data. Reporting, analytics, operational dashboards, user-facing views. Each can have its own optimized read model.
Complex domain with rich behavior. When the write side involves significant business logic, a separate command model lets you express that logic clearly without mixing in read concerns.
Distributed systems with event-driven integration. Microservices that communicate through events naturally benefit from event sourcing. The events become the integration contract between services.
Poor Fits
These patterns are likely overkill when:
Simple CRUD dominates. If your application is mostly create, read, update, delete with straightforward queries, CQRS adds complexity without benefit.
Strong consistency is required everywhere. If the business cannot tolerate stale reads in any scenario, eventual consistency creates friction.
Small team with limited experience. The operational overhead is real. You need expertise in event store administration, projection development, and debugging distributed systems.
Low latency requirements are strict. The additional async steps in CQRS introduce latency. If sub-millisecond response times are critical, the added hops hurt.
Decision Framework
Before adopting CQRS and event sourcing:
- Do you have a genuine need for event replay or temporal queries?
- Do you need multiple, differently-optimized read models?
- Is the team prepared for the operational complexity?
- Is eventual consistency acceptable for your use cases?
If you answered yes to the first two questions and are prepared for the complexity, these patterns will likely pay off.
Related Patterns
CQRS and event sourcing connect to other patterns in distributed systems.
Event sourcing pairs naturally with event-driven architecture. Events are the communication mechanism, and event sourcing provides the persistence strategy for those events.
The saga pattern handles distributed transactions across microservices. Sagas work well with event sourcing because each saga step can emit events that trigger the next step.
For distributed transactions that span services, see the post on distributed transactions, which covers two-phase commit and related approaches.
The database per service pattern supports CQRS because each side can use a database optimized for its workload. The write side might use a relational database for transactional integrity. The read side might use a document store or columnar database for query performance.
Quick Recap
Key Points
- CQRS separates read and write models for independent optimization
- Event sourcing stores events instead of state, enabling replay and audit
- Event stores like Kafka and EventStoreDB provide the persistence layer
- Projections build read models from events asynchronously
- Eventual consistency is a fundamental tradeoff
- These patterns suit audit-heavy, temporal-query-rich, and multi-read-model domains
- They add complexity and require careful consideration before adoption
Pre-Deployment Checklist
- [ ] Event store selected and operational (Kafka, EventStoreDB, or other)
- [ ] Aggregate boundaries defined and documented
- [ ] Event schema versioning strategy implemented (upcasters or versioned types)
- [ ] Projection infrastructure in place for read model updates
- [ ] Snapshot strategy defined for long-running aggregates
- [ ] Monitoring for projection lag and event consumption delays
- [ ] Read model rebuild procedure documented and tested
- [ ] Eventual consistency handling documented for UI teams
- [ ] Security model defined for event store access
- [ ] Schema evolution testing implemented
Conclusion
CQRS and event sourcing are powerful patterns for managing complexity in distributed systems. They separate concerns that are muddled in traditional architectures, enable audit and replay capabilities that are otherwise difficult to achieve, and support multiple optimized read models from a single source of truth.
The benefits are real, but so is the cost. These patterns require new mental models, additional infrastructure, and team expertise. Start by understanding the problems they solve and whether your system faces those problems. Adopt incrementally where the fit is clear.
For most applications, a well-designed monolith with a single database is sufficient. For systems with demanding audit requirements, complex domain behavior, or genuine microservices boundaries, CQRS and event sourcing provide tools that are worth the learning curve.