Exactly-Once Delivery: The Elusive Guarantee

Explore exactly-once semantics in distributed messaging - why it's hard, how Kafka and SQS approach it, and practical patterns for deduplication.

published: March 24, 2026 reading time: 26 min read author: GeekWorkBench updated: June 17, 2026

Quick Summary

Exactly-once delivery is harder than it sounds because duplicates can creep in at the broker, across the network, or on the consumer side, and most systems actually deliver at-least-once anyway. This post shows how Kafka transactions, SQS FIFO deduplication, and Pub/Sub's subscription-level exactly-once delivery each tackle the problem differently, then gets into the practical patterns that actually work: idempotency keys, deduplication tables, and the fact that some operations are naturally idempotent so you don't need external stores at all. The takeaway for most teams is that at-least-once delivery combined with idempotent consumers is usually the right tradeoff between complexity and correctness.

Introduction

Distributed systems fail in ways that are hard to predict. A producer sends a message. The broker receives it and acknowledges. The acknowledgment gets lost in transit. The producer sends again. Now the broker has two copies. Or the consumer processes the message, crashes before committing its offset, and on restart receives the same message again.

sequenceDiagram
    Producer->Broker: Send(msg)
    Broker->Producer: ACK
    Note over Producer,Broker: ACK lost in transit
    Producer->Broker: Send(msg)
    Broker->Producer: ACK
    Note over Broker,Consumer: Two copies in broker

Any retry between producer, broker, or consumer can create duplicates. At-least-once is straightforward. Exactly-once requires coordinating all three.
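The lost-ACK sequence above can be reproduced with a toy model. This is a minimal sketch, not a real broker client: Broker, send_with_retry, and the ack_lost_on_attempts parameter are hypothetical names introduced for illustration.

```python
class Broker:
    """Toy broker that appends every message it receives."""

    def __init__(self):
        self.log = []

    def receive(self, msg):
        self.log.append(msg)
        return "ACK"


def send_with_retry(broker, msg, ack_lost_on_attempts, max_attempts=5):
    """Send msg, retrying whenever the ACK does not arrive.

    ack_lost_on_attempts is a set of 1-based attempt numbers on which the
    network drops the ACK after the broker has already stored the message.
    """
    for attempt in range(1, max_attempts + 1):
        ack = broker.receive(msg)
        if attempt in ack_lost_on_attempts:
            ack = None  # broker stored the message, producer never hears back
        if ack == "ACK":
            return attempt
    raise RuntimeError("gave up after max_attempts")


broker = Broker()
attempts = send_with_retry(broker, "order-12345", ack_lost_on_attempts={1})
print(attempts, broker.log)
```

The producer behaves correctly from its own point of view, yet the broker log ends up with two copies of the same message.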

Delivery Semantics

Message systems typically offer one of three guarantees.

At-least-once means messages may be redelivered but never lost. The consumer acknowledges after processing. If processing succeeds but the ack fails, the message comes back. This is the most common setting, but it requires idempotent consumers.

At-most-once means messages may be lost but never duplicated. The consumer acknowledges before processing. If the consumer crashes after ack but before processing, the message is gone. This is rare and usually a poor trade-off.

Exactly-once means each message is processed precisely once. This requires either a transaction spanning the message and its side effects, or idempotent processing with deduplication.

stateDiagram-v2
    [*] --> AtMostOnce: Ack before processing
    AtMostOnce --> AtLeastOnce: Ack after processing
    AtLeastOnce --> ExactlyOnce: Add deduplication/idempotency

    note right of AtMostOnce
        Risk: Message loss
        Use case: Log aggregation
    end note

    note right of AtLeastOnce
        Risk: Duplicates
        Use case: Most applications
    end note

    note right of ExactlyOnce
        Risk: Higher latency
        Use case: Financial transactions
    end note

The delivery guarantee spectrum:

Guarantee	Duplicates	Loss	Complexity	Typical Use
At-most-once	No	Yes	Low	Loss-tolerant logging and metrics
At-least-once	Yes	No	Medium	Most applications
Exactly-once	No	No	High	Financial transactions
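The difference between acking before and after processing shows up as soon as a consumer crashes. The sketch below is a toy simulation under stated assumptions: consume, FlakyHandler, and run are hypothetical helpers, and a raised RuntimeError stands in for a process crash.

```python
def consume(queue, handler, ack_before_processing):
    """Deliver each queued message once; return the messages to redeliver."""
    redeliver = []
    for msg in queue:
        try:
            handler(msg)
        except RuntimeError:
            # Acked up front: the broker has already forgotten the message.
            # Acked after processing: no ack was sent, so the broker retries.
            if not ack_before_processing:
                redeliver.append(msg)
    return redeliver


class FlakyHandler:
    """Crashes once on a chosen message, before or after its side effect."""

    def __init__(self, crash_on, crash_after_effect):
        self.crash_on = crash_on
        self.crash_after_effect = crash_after_effect
        self.effects = []
        self.crashed = False

    def __call__(self, msg):
        crash_now = msg == self.crash_on and not self.crashed
        if crash_now and not self.crash_after_effect:
            self.crashed = True
            raise RuntimeError("crash before side effect")
        self.effects.append(msg)
        if crash_now:
            self.crashed = True
            raise RuntimeError("crash after side effect, before ack")


def run(ack_before_processing, crash_after_effect):
    handler = FlakyHandler("b", crash_after_effect)
    pending = ["a", "b", "c"]
    while pending:
        pending = consume(pending, handler, ack_before_processing)
    return handler.effects


at_most_once = run(ack_before_processing=True, crash_after_effect=False)
at_least_once = run(ack_before_processing=False, crash_after_effect=True)
print(at_most_once)
print(at_least_once)
```

In the first run message "b" is lost; in the second its side effect is applied twice, which is the case an idempotent consumer has to absorb.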

Kafka’s Exactly-Once Semantics

Kafka offers exactly-once semantics within the Kafka ecosystem. It can guarantee that messages produced to a topic are consumed exactly once by consumers within Kafka.

Kafka achieves this through transactions introduced in version 0.11. When you enable transactions, Kafka writes messages to partitions as part of a transaction. Consumer offsets are committed as part of the same transaction, tying message consumption to offset commit.

flowchart LR
    Producer-->|produce| Kafka[Kafka Cluster]
    Kafka-->|consume| Consumer[Consumer]
    Consumer-->|commit offset| Kafka
    Kafka-->|write tx| Partition[Partition]
    Consumer-->|write| DB[(Database)]
    Note over Producer,DB: Atomic if both in same transaction

For exactly-once processing outside Kafka, say writing to a database, you need to handle it yourself. The consumer must write the offset and the database update in a single transaction, or rely on idempotent semantics to deduplicate.
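One way to sketch the single-transaction approach is to keep the consumer offset in the same database as the business data. The example below uses SQLite from the standard library in place of a production database; the table names, apply_record, and the sample records are assumptions for illustration, and no Kafka client is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL);
    CREATE TABLE consumer_offsets (
        topic TEXT NOT NULL,
        part INTEGER NOT NULL,
        next_offset INTEGER NOT NULL,
        PRIMARY KEY (topic, part)
    );
    INSERT INTO balances VALUES ('alice', 0);
    INSERT INTO consumer_offsets VALUES ('payments', 0, 0);
""")


def apply_record(conn, topic, part, offset, account, delta):
    """Apply one record and advance the stored offset atomically.

    Returns False when the record is behind the stored offset, which
    means it was already applied and this delivery is a duplicate.
    """
    with conn:  # commits on success, rolls back if anything raises
        (next_offset,) = conn.execute(
            "SELECT next_offset FROM consumer_offsets WHERE topic = ? AND part = ?",
            (topic, part),
        ).fetchone()
        if offset < next_offset:
            return False
        conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ?",
            (delta, account),
        )
        conn.execute(
            "UPDATE consumer_offsets SET next_offset = ? WHERE topic = ? AND part = ?",
            (offset + 1, topic, part),
        )
        return True


results = [
    apply_record(conn, "payments", 0, 0, "alice", 50),
    apply_record(conn, "payments", 0, 0, "alice", 50),  # redelivery of offset 0
    apply_record(conn, "payments", 0, 1, "alice", 25),
]
balance = conn.execute(
    "SELECT amount FROM balances WHERE account = 'alice'"
).fetchone()[0]
print(results, balance)
```

Because the balance update and the offset advance commit together, a crash leaves either both or neither applied. A real consumer would also seek to the stored offset on startup instead of relying on the offset committed to Kafka.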

For more on Kafka’s partitioning and consumer groups, see Apache Kafka.

AWS SQS FIFO

SQS offers two queue types. Standard queues provide at-least-once delivery. FIFO queues provide exactly-once processing and preserve order.

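FIFO queues use a deduplication ID, either supplied explicitly or derived from the message body, to discard duplicate messages within a 5-minute window. Send the same message twice within 5 minutes and SQS drops the second copy.

import json
import boto3

sqs = boto3.client('sqs')
queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo'  # placeholder URL

sqs.send_message(
    QueueUrl=queue_url,
    MessageBody=json.dumps({'order_id': '12345', 'action': 'ship'}),
    MessageDeduplicationId='order-12345-ship',  # explicit deduplication ID
    MessageGroupId='orders'
)

When content-based deduplication is enabled and you omit MessageDeduplicationId, SQS derives the deduplication ID from a SHA-256 hash of the message body (the message attributes are not included). Identical bodies produce identical deduplication IDs.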
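The windowed behavior can be modeled in a few lines. This is a toy model of the idea, not SQS's implementation: DedupQueue and its send method are hypothetical, and the caller passes the clock in as now so the example stays deterministic.

```python
import hashlib

DEDUP_WINDOW_SECONDS = 300  # the 5-minute window described above


class DedupQueue:
    """Toy FIFO queue that drops repeats of a deduplication ID inside the window."""

    def __init__(self):
        self.messages = []
        self._seen = {}  # dedup_id -> time the ID was first accepted

    def send(self, body, now, dedup_id=None):
        # Without an explicit ID, derive one from the body (content-based).
        if dedup_id is None:
            dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False  # duplicate inside the window: dropped
        self._seen[dedup_id] = now
        self.messages.append(body)
        return True


q = DedupQueue()
body = '{"order_id": "12345", "action": "ship"}'
accepted = [q.send(body, now=0), q.send(body, now=120), q.send(body, now=400)]
print(accepted, len(q.messages))
```

The third send is accepted because it arrives after the window has closed, which is why a retry that can take longer than 5 minutes still needs consumer-side deduplication.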

SQS FIFO can guarantee exactly-once within the queue itself, but your consumer still needs to be idempotent. If your consumer times out and does not delete the message within the visibility timeout window, SQS will deliver it again.

For more on SQS features, see AWS SQS and SNS.

RabbitMQ and Idempotent Consumers

RabbitMQ does not support exactly-once semantics out of the box. It gives you at-least-once with manual acknowledgments, so you have to make the consumer idempotent yourself.

Idempotency means processing the same message multiple times has the same effect as processing it once. Here is how to actually do it.

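Deduplication table: Store processed message IDs in a database. Rather than checking first and inserting later, insert the ID and let the unique constraint reject duplicates.

INSERT INTO processed_messages (msg_id, processed_at)
VALUES ('msg-123', NOW())
ON CONFLICT (msg_id) DO NOTHING;

-- ON CONFLICT DO NOTHING never raises an error, so check the affected row count:
-- 1 row inserted: the ID is new, process the message
-- 0 rows inserted: the ID already exists, skip processing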
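A minimal sketch of the deduplication table in application code, using SQLite from the standard library as a stand-in for the production database. The shipments table and the handle function are assumptions for illustration. The point it tries to show is that the ID insert and the business write share one transaction, so a failure in the work also rolls back the dedup record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processed_messages (
        msg_id TEXT PRIMARY KEY,
        processed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE shipments (order_id TEXT NOT NULL);
""")


def handle(conn, msg_id, order_id):
    """Record the message ID and do the work in a single transaction."""
    with conn:  # commits on success, rolls back if anything raises
        cur = conn.execute(
            "INSERT INTO processed_messages (msg_id) VALUES (?) "
            "ON CONFLICT (msg_id) DO NOTHING",
            (msg_id,),
        )
        if cur.rowcount == 0:
            return False  # ID already recorded: duplicate, skip the work
        conn.execute("INSERT INTO shipments (order_id) VALUES (?)", (order_id,))
        return True


outcomes = [handle(conn, "msg-123", "12345"), handle(conn, "msg-123", "12345")]
shipment_count = conn.execute("SELECT COUNT(*) FROM shipments").fetchone()[0]
print(outcomes, shipment_count)
```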

Natural idempotency: Some operations are naturally idempotent. Setting a field to the same value twice leaves the database in the same state. UPDATE users SET status = 'active' WHERE id = 1 can run twice and nothing changes. Spotting these operations and using them cuts down on overhead.

Message-level transactions: Combine the message acknowledgment with your database transaction. If the database update fails, the message is not acknowledged and comes back.

For more on RabbitMQ’s acknowledgment model, see RabbitMQ.

Practical Patterns

Idempotency Keys

Every message carries a unique ID or you derive one from content. The consumer stores processed IDs. When a message arrives, check if its ID was already processed and skip if so. This approach works with any message broker.

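import hashlib
import redis

r = redis.Redis()  # assumes a local Redis instance

def process_message(msg):
    # hash() is salted per process for str/bytes, so use a stable digest.
    # Assumes msg.body is bytes.
    msg_id = msg.headers.get('x-idempotency-key') or hashlib.sha256(msg.body).hexdigest()

    if r.exists(f"processed:{msg_id}"):
        return

    do_work(msg)
    r.setex(f"processed:{msg_id}", 86400, '1')  # setex(name, time, value)

Set the TTL longer than your maximum retry window so that a late redelivery still finds its key. The TTL itself is what keeps the key set from growing without bound.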

Outbox Pattern

The outbox pattern addresses the dual-write problem. Instead of writing to the database and publishing a message in two separate steps, you write both to the database in a single transaction.

flowchart TD
    Service-->|1 write| DB[(Outbox Table)]
    DB-->|2 read| Relay[Outbox Relay]
    Relay-->|3 publish| Broker[Message Broker]
    Note over Service,Broker: Atomic write ensures consistency

The outbox table lives in the same database as your business data. You write the business record and the outbox event in the same transaction. A separate process polls the outbox and publishes to the broker.

The key guarantee: if the business record exists, the event will eventually be published. The event is never published without the corresponding record.
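The write path and a polling relay can be sketched with SQLite from the standard library. Table names, place_order, relay_once, and the publish callback are hypothetical; a real relay would call a broker client where this example appends to a list.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        event_id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def place_order(conn, order_id):
    """Write the business record and its event in one transaction."""
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "order_placed", "order_id": order_id}),),
        )


def relay_once(conn, publish):
    """Publish pending events in order, marking each only after publish returns."""
    rows = conn.execute(
        "SELECT event_id, payload FROM outbox WHERE published = 0 ORDER BY event_id"
    ).fetchall()
    for event_id, payload in rows:
        publish(event_id, payload)  # may raise; the row then stays pending
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE event_id = ?", (event_id,)
            )
    return len(rows)


place_order(conn, "12345")
try:
    place_order(conn, "12345")  # primary key violation rolls back the whole write
except sqlite3.IntegrityError:
    pass

sent = []
relay_once(conn, lambda event_id, payload: sent.append((event_id, json.loads(payload))))
outbox_rows = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(sent, outbox_rows)
```

The failed second order leaves no outbox row behind, so no event exists without its business record.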

For more on these patterns, see Event-Driven Architecture and Asynchronous Communication.

Transactional Outbox with CDC

In high-throughput systems, polling the outbox adds latency. Change Data Capture tools like Debezium tail the database transaction log and publish when records change. This is faster but adds operational complexity.

The pattern stays the same: write to the outbox atomically with business data, let CDC handle propagation.

When Exactly-Once Matters

You do not always need exactly-once. For many use cases, at-least-once with idempotent consumers is sufficient and simpler. A log aggregator can handle duplicates. A notification system can usually tolerate a duplicate email if it is not critical.

Exactly-once matters for financial transactions (duplicate charges are serious), inventory operations (double decrements overstate deductions), unique constraints (double user creation may violate uniqueness), and external API calls (retrying a non-idempotent API creates duplicate side effects).

Measure the cost of duplicates in your system before implementing exactly-once. The complexity is real and the operational burden is higher.

Saga Pattern vs Outbox Pattern

The saga pattern and outbox pattern solve related but distinct problems. Both help keep state consistent when messages are retried, but they handle different failure scenarios.

The saga pattern coordinates multiple services in a distributed transaction. When step 3 fails, saga executes compensating transactions to undo steps 1 and 2. This handles partial failure without distributed locks.

The outbox pattern handles the dual-write problem within a single service. Instead of writing to the database and publishing a message in two separate steps, you write both in a single database transaction. If the business record exists, the event will eventually be published.

Aspect	Saga Pattern	Outbox Pattern
Problem solved	Multi-service distributed transactions	Single-service dual-write consistency
Failure handling	Compensating transactions	Eventual publication
Idempotency needed	Yes (compensating transactions must be idempotent)	Yes (outbox relay may retry)
Complexity	High (requires careful compensation design)	Medium (requires polling outbox)
Consistency model	Eventual	Stronger eventual (within a single service)
Use when	Multiple services must update in coordinated way	Service needs to publish events atomically with database changes

Use the saga pattern when you have multiple independent services each making state changes that must be coordinated. Use the outbox pattern when a single service needs to publish events reliably without a distributed transaction coordinator.
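The compensation flow of a saga can be sketched as an orchestrator that remembers which steps completed. This is a simplified in-process illustration, not a production coordinator: run_saga and the three step names are hypothetical, and real compensations would be idempotent calls to other services.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If an action raises, run the compensations of the completed steps in
    reverse order and report the failure.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for _, undo in reversed(completed):
                undo()
            return f"rolled back after {name} failed: {exc}"
        completed.append((name, compensate))
    return "committed"


log = []


def fail_shipping():
    raise RuntimeError("no carrier available")


steps = [
    ("reserve_inventory", lambda: log.append("reserve"), lambda: log.append("release")),
    ("charge_payment", lambda: log.append("charge"), lambda: log.append("refund")),
    ("schedule_shipping", fail_shipping, lambda: log.append("cancel_shipping")),
]
outcome = run_saga(steps)
print(outcome)
print(log)
```

The failed step's own compensation is not run, since the step never completed; only the two earlier steps are undone, most recent first.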

For more on saga, see Saga Pattern. For outbox, see Outbox Pattern.

Idempotent Consumer with Redis Deduplication

Redis works well as a fast deduplication store for idempotent consumers. Store processed message IDs with a TTL longer than your retry window:

import redis
import json
import hashlib

class RedisIdempotentConsumer:
    def __init__(self, redis_client: redis.Redis, ttl_seconds: int = 86400):
        self.redis = redis_client
        self.ttl = ttl_seconds

    def _get_dedup_key(self, message) -> str:
        """Derive a unique key from the message."""
        msg_id = message.get('message_id')
        if msg_id:
            return f"dedup:{msg_id}"

        # Fallback: hash the message content
        content = json.dumps(message, sort_keys=True)
        content_hash = hashlib.sha256(content.encode()).hexdigest()[:16]
        return f"dedup:hash:{content_hash}"

    def is_duplicate(self, message) -> bool:
        """Check if this message was already processed."""
        key = self._get_dedup_key(message)
        return self.redis.exists(key) == 1

    def mark_processed(self, message) -> None:
        """Mark message as processed with TTL."""
        key = self._get_dedup_key(message)
        self.redis.setex(key, self.ttl, '1')

    def process(self, message: dict) -> bool:
        """
        Process message idempotently.
        Returns True if processed, False if duplicate.
        """
        if self.is_duplicate(message):
            return False

        # Process the actual work
        do_work(message)

        # Mark as processed
        self.mark_processed(message)
        return True

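# Usage in a consumer loop.
# do_work() and receive_messages() are placeholders for your own handler and broker client.
redis_client = redis.Redis()  # assumes a local Redis instance
consumer = RedisIdempotentConsumer(redis_client, ttl_seconds=86400)

for message in receive_messages():
    msg_id = message.get('message_id', '<no id>')  # the ID may be absent on the hash fallback path
    if consumer.process(message):
        print(f"Processed message {msg_id}")
    else:
        print(f"Skipped duplicate {msg_id}")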

Set the TTL longer than your maximum retry window. If a message might be retried for up to 1 hour due to broker retry policies, set TTL to 2 hours or more. If you process millions of messages per day, watch Redis memory usage.
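A separate check followed by a later write leaves a gap in which two consumers can both see "not processed". One possible way to close it is to claim the ID atomically, which Redis supports through SET with the NX and EX options. The sketch below uses FakeRedis, a hypothetical in-memory stand-in that mimics only that behavior so the example runs without a server; process_once is likewise an illustrative name.

```python
import time


class FakeRedis:
    """In-memory stand-in that mimics only SET with NX/EX and DEL."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expiry time or None)
        self._clock = clock

    def set(self, key, value, nx=False, ex=None):
        now = self._clock()
        entry = self._data.get(key)
        live = entry is not None and (entry[1] is None or entry[1] > now)
        if nx and live:
            return None  # key exists and has not expired
        self._data[key] = (value, None if ex is None else now + ex)
        return True

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def process_once(store, msg_id, work, ttl_seconds=86400):
    """Claim the message ID atomically, then do the work.

    The claim is released if the work raises, so a retry can succeed.
    """
    key = f"dedup:{msg_id}"
    if not store.set(key, "1", nx=True, ex=ttl_seconds):
        return False  # another delivery already claimed this ID
    try:
        work()
    except Exception:
        store.delete(key)
        raise
    return True


store = FakeRedis()
calls = []
first = process_once(store, "msg-123", lambda: calls.append("done"))
second = process_once(store, "msg-123", lambda: calls.append("done"))
print(first, second, calls)
```

Claiming first has its own trade-off: if the process dies between the claim and the work, the redelivered message is skipped until the key expires. Where that matters, a database transaction that covers both the claim and the work is the safer choice.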

Kafka Exactly-Once with External Systems

Kafka transactions guarantee exactly-once within the Kafka ecosystem. When writing to external systems like databases, you need additional handling.

Kafka Transactions with JDBC

The Kafka JDBC sink connector takes a different approach than the outbox pattern. Kafka transactions cover only Kafka topics and consumer offsets, so the connector cannot commit an offset and a database write atomically. Kafka Connect commits offsets for a sink task only after the task has written its records. If the database write fails, the offset is not committed, and Kafka redelivers the records. That is at-least-once delivery.

The connector gets to effectively-once results by making the redelivered write harmless. In upsert mode it writes each row keyed on a primary key, so replaying a record overwrites the row with the same values instead of inserting a duplicate. The primary key is the deduplication boundary. The configuration:

# Kafka Connect JDBC sink configuration for idempotent writes
transforms=insertIdempotenceKey
transforms.insertIdempotenceKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms.insertIdempotenceKey.fields=id

# Upsert on the record key so redelivered records overwrite rather than duplicate
insert.mode=upsert
pk.mode=record_key
pk.fields=id

ValueToKey copies the id field from the record value into the record key, and pk.mode=record_key tells the connector to use that key as the primary key. Producer-side settings such as enable.idempotence=true with acks=all protect the write into Kafka from duplicate sends on retry. They do not change how a sink connector writes to the database.

The timeout mismatch problem trips people up here. If your database transaction times out but Kafka has already committed the offset, you end up with a message marked consumed that never reached the database. Set your database statement timeout to at least 2x Kafka’s transaction.timeout.ms. If Kafka defaults to 60 seconds, give your database 120 seconds or more.

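For Kafka Streams, you can sometimes skip exactly-once by making processing idempotent as well as deterministic. Determinism alone is not enough: a word count that reprocesses a record after a failure counts it twice, because incrementing is associative and commutative but not idempotent. Aggregations such as max, set union, or last-value-per-key give the same result when a record is applied again, so at-least-once reprocessing is harmless for them. For counts and sums, either enable exactly-once processing or deduplicate before aggregating.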
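Whether reprocessing is harmless depends on the aggregation, and one way to check is to replay a record and compare results. The sketch below uses made-up sensor readings and two hypothetical helpers, count_per_key and max_per_key, in plain Python rather than the Kafka Streams API.

```python
from collections import Counter

records = [("sensor-1", 20), ("sensor-2", 35), ("sensor-1", 27)]
replayed = records + [records[-1]]  # the last record is delivered twice


def count_per_key(stream):
    """Increment-style aggregate: every delivery changes the result."""
    return dict(Counter(key for key, _ in stream))


def max_per_key(stream):
    """Idempotent aggregate: re-applying a record leaves the result unchanged."""
    result = {}
    for key, value in stream:
        result[key] = max(value, result.get(key, value))
    return result


count_stable = count_per_key(records) == count_per_key(replayed)
max_stable = max_per_key(records) == max_per_key(replayed)
print(count_stable, max_stable)
```

The count changes under replay while the max does not, so only the second aggregate is safe to run with at-least-once reprocessing and no deduplication.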

Google Pub/Sub Exactly-Once

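Pub/Sub offers exactly-once delivery as an opt-in setting on pull subscriptions. With it enabled, Pub/Sub does not redeliver a message once its acknowledgment has succeeded, and does not redeliver while the ack deadline is still running. The guarantee is keyed on the message ID that Pub/Sub assigns at publish time, so it does not cover publisher retries: publishing the same payload twice yields two messages with different IDs. Unacknowledged messages are retained for 7 days by default. To let subscribers catch publish-side duplicates, attach your own key as a message attribute:

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()

# Extra keyword arguments become message attributes.
# Pub/Sub does not deduplicate on this attribute; subscribers use it.
future = publisher.publish(
    'projects/my-project/topics/my-topic',
    data=b'event data',
    idempotency_key='my-unique-idempotency-key'
)

The idempotency_key attribute travels with the message. Pub/Sub does not deduplicate on it, so if you publish twice with the same key the subscriber receives both copies and uses the key to skip the second.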

On the subscriber side, the callback must acknowledge each message explicitly. If the subscriber crashes before acking, Pub/Sub redelivers the message once the ack deadline expires:

subscriber = pubsub_v1.SubscriberClient()
subscription_path = 'projects/my-project/subscriptions/my-sub'

def callback(message):
    try:
        process(message.data)  # process() is a placeholder for your handler
        message.ack()
    except Exception:
        # Nack and let Pub/Sub redeliver
        message.nack()

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull_future.result()  # block the main thread while messages arrive

Pub/Sub exactly-once works when:

The subscription has exactly-once delivery enabled (pull subscriptions only)
Subscriber acknowledges within the ack deadline
Subscriber is idempotent (safe to process the same message twice)

Pub/Sub does not provide exactly-once if the subscriber crashes after processing but before acknowledging. In that case, the message is redelivered, so the processing must be idempotent.

Comparing Broker Support

Broker	Exactly-Once Support	Notes
Kafka	Yes (transactions)	Within Kafka ecosystem
SQS FIFO	Yes (deduplication)	Within SQS, 5-min window
RabbitMQ	No (build it yourself)	At-least-once with idempotency
Google Pub/Sub	Yes (opt-in per subscription)	Pull subscriptions only; no publish-side deduplication

Trade-off Comparison Tables

Idempotency Implementation Strategies

Picking an idempotency approach means balancing latency, storage, and complexity against your message volume. This table shows each strategy across those dimensions.

Strategy	Complexity	Latency Overhead	Storage Required	Best For
Deduplication table	Medium	1-2 DB round trips	Grows with message volume	High-value transactions
Redis TTL keys	Low	1-2 Redis ops	Keys with TTL	High-throughput, short retry windows
Natural idempotency	Low	None	None	Operations that set deterministic values
Message-level transactions	High	Transaction overhead	None	When message ack must align with DB commit

Message Broker Exactly-Once Comparison

Exactly-once support varies across brokers, and the guarantees are not equivalent. This table shows the scope, deduplication window, and whether your consumer still needs idempotency logic.

Broker	Exactly-Once Scope	Deduplication Window	Consumer Idempotency Required?
Kafka	Within Kafka ecosystem only	N/A (transactional)	Yes for external systems
SQS FIFO	Within SQS, queue-level	5 minutes	Yes, after visibility timeout
RabbitMQ	Not supported natively	N/A	Yes, mandatory
Google Pub/Sub	Per pull subscription, no redelivery of acked messages	N/A (keyed on Pub/Sub message ID)	Yes, after crash before ack

Outbox Pattern Deployment Options

The outbox pattern works in a few different ways, each with different latency and operational costs. This table compares polling, CDC, and hybrid approaches.

Approach	Latency	Complexity	Consistency Guarantee	Operational Overhead
Polling outbox	Higher (poll interval)	Low	Eventual (within poll window)	Low
CDC (Debezium)	Low (log-based)	High	Near real-time	High (additional infrastructure)
Polling + CDC	Flexible	Medium	Tunable	Medium

Saga vs Outbox Decision Matrix

Saga and outbox overlap but suit different architectures. Use this matrix to pick the right pattern based on how many services are involved and what consistency guarantees you need.

Condition	Use Saga Pattern	Use Outbox Pattern
Multi-service distributed transaction	Yes	No
Single-service dual-write problem	No	Yes
Cross-service consistency required	Yes	No
Event publication alongside business data	No	Yes
Compensating transactions available	Yes	No
Low operational complexity desired	No	Yes

Production Failure Scenarios

Scenario 1: The Payment Processor Crash

A payment service processes an order and publishes a “payment successful” event to an SQS FIFO queue. The consumer receives the message, calls the payment gateway, updates the order status, but crashes before calling DeleteMessage on SQS. The visibility timeout passes. Another consumer instance picks up the same message and processes the payment again.

The failure window is the visibility timeout, defaulting to 30 seconds but configurable up to 12 hours. If processing takes longer than the timeout, SQS redelivers the message before your consumer has deleted it. This is a split-brain scenario that happens more often than people expect when running multiple consumer instances.

Idempotency is not optional here. A longer visibility timeout only delays the problem. The only real fix is making your payment consumer recognize that order X has already been paid and skip the duplicate.

Store order_id + payment_status + processed_at in a payments table with a unique constraint. Before processing, check if a record exists with status completed for this order ID. If it does, acknowledge and move on. If not, process the payment and insert the record atomically.

On the visibility timeout itself: set it to at least 2x your maximum processing time. If your payment gateway calls take up to 10 seconds, go with 30 seconds minimum. More is fine, but remember that messages are invisible to all consumers during this window, which can back up processing under load.

Scenario 2: The Kafka Transaction Timeout

A service consumes from Kafka with transactions enabled and writes each result to a database. The Kafka transaction, including the offset commit, succeeds, but the database write that was meant to accompany it times out. A Kafka transaction cannot include the external database, so nothing rolls the offset back. The offset says the message was processed, but the side effect did not occur.

The root cause is a timeout mismatch. Kafka’s default transaction timeout is 60 seconds. PostgreSQL might have a 30-second statement timeout. When the database write hits 30 seconds, PostgreSQL cancels the statement, rolls back the transaction, but Kafka has already committed the offset. The message is consumed. The side effect is gone.

Fix the timeout ordering: set your database statement timeout to at least 2x Kafka’s transaction.timeout.ms. If Kafka is at 60 seconds, give your database 120 seconds or more. This gives the database enough room to complete writes even when the system is under load.

The outbox pattern sidesteps this problem entirely. Write to the outbox table in the same database transaction as your business data. Kafka reads from the outbox and publishes separately. The Kafka transaction never spans the external database, so there is no timeout coupling. If the database write succeeds, the outbox event eventually publishes. If the database write fails, no event gets created.

If you are already in this inconsistent state, a batch job that compares consumer group offsets against processed database records can surface the gap. When the offset is ahead of what actually exists in the database, you have a missing side effect that needs replay.

Scenario 3: The RabbitMQ Network Partition

During a network partition, RabbitMQ’s publisher confirms timeout. The producer retries. After the partition heals, the original message and the retry both get delivered, creating duplicates. The consumer processes both and fails to detect the duplicate because a bug in its hash generation makes the same message look different each time.

Network partitions typically last from a few seconds to several minutes. During that window, the producer retries the same message multiple times. When the partition heals, the consumer gets all copies in rapid succession. If your deduplication logic uses message content hashing and your message contains a timestamp or sequence number, identical business payloads produce different hashes. The consumer treats them as distinct messages.

Use an explicit message_id derived from the business event, not a value that changes on every publish attempt:

import json
import time
import pika

# channel is assumed to be an open pika channel

# Bad: the body carries a per-attempt timestamp, so content hashes differ across retries
channel.basic_publish(
    exchange='',
    routing_key='orders',
    body=json.dumps({'order_id': '12345', 'created_at': time.time()}),
)

# Good: stable message_id from the business key
channel.basic_publish(
    exchange='',
    routing_key='orders',
    body=json.dumps({'order_id': '12345', 'action': 'ship'}),
    properties=pika.BasicProperties(message_id='order-12345-ship'),  # stable ID
)

Your deduplication table must survive consumer restarts. An in-memory cache wipes on restart and lets duplicates through. Store processed IDs in a database table with a unique constraint, or use Redis with persistence enabled (RDB or AOF). The ON CONFLICT DO NOTHING pattern handles concurrent inserts safely.
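The hashing pitfall from this scenario is easy to reproduce. In the sketch below, content_hash and business_key are hypothetical helpers and the two payloads are made-up retries of the same business event that differ only in a timestamp.

```python
import hashlib
import json


def content_hash(payload):
    """Hash the whole payload, volatile fields included."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def business_key(payload):
    """Build a stable ID from the fields that identify the business event."""
    return f"order-{payload['order_id']}-{payload['action']}"


first_attempt = {"order_id": "12345", "action": "ship", "created_at": 1700000000.0}
retry_attempt = {"order_id": "12345", "action": "ship", "created_at": 1700000042.5}

hashes_match = content_hash(first_attempt) == content_hash(retry_attempt)
keys_match = business_key(first_attempt) == business_key(retry_attempt)
print(hashes_match, keys_match)
```

The whole-payload hashes differ even though both attempts describe the same shipment, while the key built from order_id and action stays the same across retries.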

Scenario 4: The Pub/Sub Ack Deadline Miss

A Pub/Sub subscriber processes a message that requires calling an external API. The API call takes longer than the ack deadline. Pub/Sub redelivers the message to another subscriber instance. Both instances call the external API with the same payload. The external system gets duplicate side effects.

The ack deadline defaults to 10 seconds, configurable from 10 to 600 seconds. When processing exceeds this window, Pub/Sub treats the message as unacknowledged and redelivers it. With multiple subscriber instances running, you can end up with concurrent processing of the same message by different workers.

For long-running operations, increase the ack deadline to at least 2x your expected processing time. If your API call takes 15 seconds, set the deadline to 45 seconds. You can also call modifyAckDeadline dynamically when you know processing is running long.

Even with a longer deadline, you are not safe if the subscriber crashes mid-processing. The real fix is idempotent external API calls, not just message-level deduplication. The external API must handle duplicate requests. If it is a payment gateway, it must recognize a duplicate charge and return the original response rather than processing twice.

If the external API does not support idempotency keys, distributed locking is a last resort. Acquire a lock scoped to the message ID before calling the API, release it after ack. A second subscriber picks up the same message, finds the lock held, and nacks to trigger redelivery. This adds latency and complexity but protects systems that have no built-in idempotency.

Scenario 5: The Outbox Relay Crash

A service writes to the outbox table and business tables in a single transaction. The outbox relay crashes after reading the outbox event but before publishing to the broker. If the relay marked the event as processed when it read it, the event never gets published. Downstream consumers never receive it. The business operation completed, but the notification never went out.

The outbox relay reads unpublished events from the outbox table and publishes them to the broker. If it crashes between reading and publishing, the event is still in the outbox table but has not been published. On restart, the relay picks it up again and publishes it. That only works if the relay implements at-least-once semantics and idempotent publishing.

When the relay reads an outbox event, it should publish to the broker first and mark the event as published only after the broker confirms. The two steps cannot be atomic, because the broker is not part of the database transaction. If the publish succeeds but the mark-as-published update fails, the relay publishes the same event again on its next poll. The broker must accept a duplicate publish without creating duplicate messages. Use a deduplication ID derived from the outbox event ID or the business event ID. If your broker has built-in deduplication, it drops the second publish attempt. If it does not, make your consumer idempotent based on the outbox event ID.
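The retry path can be simulated without a real broker. This is a toy sketch under stated assumptions: DedupBroker stands in for a broker with built-in deduplication, relay_poll is a hypothetical relay loop, and the crash_before_mark flag simulates the relay dying between the publish and the mark.

```python
class DedupBroker:
    """Toy broker that ignores a publish whose event ID it has already accepted."""

    def __init__(self):
        self.delivered = []
        self._seen_ids = set()

    def publish(self, event_id, payload):
        if event_id in self._seen_ids:
            return False
        self._seen_ids.add(event_id)
        self.delivered.append(payload)
        return True


def relay_poll(outbox, broker, crash_before_mark=False):
    """Publish every unmarked event, then mark it as published.

    crash_before_mark simulates the relay dying after the broker accepted
    the event but before the outbox row was updated.
    """
    for event in outbox:
        if event["published"]:
            continue
        broker.publish(event["id"], event["payload"])
        if crash_before_mark:
            raise RuntimeError("relay crashed before marking the event")
        event["published"] = True


outbox = [{"id": "evt-1", "payload": "order 12345 placed", "published": False}]
broker = DedupBroker()
try:
    relay_poll(outbox, broker, crash_before_mark=True)
except RuntimeError:
    pass  # the relay restarts
relay_poll(outbox, broker)  # the second poll publishes evt-1 again
print(broker.delivered, outbox[0]["published"])
```

The event is published twice but delivered once, because the second publish carries the same event ID.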

On polling versus push-based relay: a polling relay reads the outbox on a schedule, so publish latency can be as long as the poll interval. A push-based relay uses database triggers or change data capture to get notified immediately when an outbox event is created. Polling is simpler. CDC is faster but adds operational complexity.

Interview Questions

1. What makes exactly-once delivery hard in distributed systems?

Partial failures cause message duplication when acks get lost
Consumers can crash after processing but before committing offsets
Any retry across producer-broker-consumer chain creates duplicates

2. How do at-least-once, at-most-once, and exactly-once differ?

At-least-once: redelivers but never loses; requires idempotent consumers
At-most-once: loses messages but never duplicates; consumer acks before processing
Exactly-once: processed precisely once; needs transactions or idempotent deduplication

3. How does Kafka implement exactly-once within its ecosystem?

Transactions (0.11+) write messages and offset commits atomically
Ties message consumption to offset commit in a single transaction
Only applies within Kafka; external systems need extra handling

4. What is content-based deduplication in SQS FIFO?

SQS hashes the message body to generate deduplication IDs
Identical bodies within 5 minutes get dropped automatically
You can also supply an explicit MessageDeduplicationId

5. Why does RabbitMQ need idempotent consumers?

RabbitMQ only guarantees at-least-once with manual acks
Use deduplication tables with ON CONFLICT DO NOTHING
Or leverage natural idempotency (setting a field to the same value)
Message-level transactions tie ack to database commit

6. What is the outbox pattern and what problem does it solve?

Writes the business record and outbox event in one database transaction
A relay process polls the outbox and publishes to the broker
If the business record exists, the event will eventually publish

7. When do you choose saga over outbox?

Saga: multi-service distributed transactions with compensating actions
Outbox: single-service dual-write consistency (one database, one transaction)
Saga coordinates across services; outbox handles reliable event publication

8. How does Redis help with message deduplication?

Use message_id or content hash as the dedup key
SETEX with TTL longer than your retry window prevents premature expiry
Fast in-memory lookups scale well for high-throughput systems

9. What are compensating transactions in sagas?

When a step fails, undo previous steps by running counter-transactions
Each saga step needs a corresponding compensation
Compensations must be idempotent since they can be retried

10. How does Google Pub/Sub deliver exactly-once?

Opt-in exactly-once delivery on pull subscriptions, keyed on the Pub/Sub-assigned message ID
No publish-side deduplication; carry your own key as an attribute for subscriber-side checks
Subscribers must ack within the deadline; crash before ack means redelivery

11. What happens in Pub/Sub when a consumer crashes after processing but before ack?

Message gets redelivered after the ack deadline expires
Subscriber must be idempotent to survive this safely
Exactly-once is violated unless your consumer handles duplicates

12. What is natural idempotency and why does it matter?

Operations that produce the same result regardless of repetitions
Example: UPDATE users SET status='active' WHERE id=1 changes nothing if run twice
Reduces need for external deduplication stores and overhead

13. How does CDC relate to the outbox pattern?

CDC tools like Debezium tail the transaction log instead of polling
Faster propagation in high-throughput systems
Trade-off: more operational complexity than polling outbox

14. What are the costs of implementing exactly-once?

Higher latency from transactional coordination
Operational complexity: outbox polling, CDC, saga coordination
Consumer idempotency is non-negotiable regardless of broker support
At-least-once + idempotent consumers is simpler for most use cases

15. How does SQS visibility timeout interact with idempotency?

Consumer has a visibility window to process and delete the message
If it times out without deleting, SQS redelivers
Your consumer needs to handle this redelivery idempotently

16. What is an idempotency key in message publishing?

A unique identifier the publisher attaches to enable broker-side deduplication
Kafka's idempotent producer uses a producer ID and sequence numbers; SQS FIFO uses MessageDeduplicationId
Lets publishers retry safely without creating duplicate messages

17. Why measure duplicate cost before implementing exactly-once?

Not every system needs it; at-least-once + idempotent consumers often suffices
Log aggregators and non-critical notifications tolerate duplicates fine
Financial transactions, inventory, and unique constraints may actually need it

18. How does Kafka JDBC sink achieve exactly-once?

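Relies on idempotent writes (upsert keyed on a primary key), not on a Kafka transaction that spans the database
Offsets are committed only after the records have been written
Failed DB write means the offset is not committed, so the record is redelivered rather than lost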

19. When does idempotent stream processing make exactly-once unnecessary?

Reprocessing is harmless only if the operation is idempotent, not merely deterministic
Max, set union, and last-value-per-key give the same result when a record is applied again
Counts and sums double count on replay, so a word count still needs exactly-once or deduplication

20. Compare exactly-once support across Kafka, SQS FIFO, RabbitMQ, and Pub/Sub.

Kafka: yes via transactions (within Kafka only)
SQS FIFO: yes via content dedup (5-minute window)
RabbitMQ: no native support, build it yourself
Pub/Sub: yes, opt-in per pull subscription; no publish-side deduplication

Conclusion

Exactly-once delivery is not a single mechanism but a combination of transport guarantees, idempotent processing, and sometimes distributed transactions. Most systems should start with at-least-once and idempotent consumers. Add exactly-once semantics only where the cost of duplicates justifies the implementation complexity.

The outbox pattern and idempotency keys are the practical tools here. They work across brokers and do not require special broker features.

For related reading, see Message Queue Types, Distributed Transactions, and Pub-Sub Patterns.