Exactly-Once Delivery: The Elusive Guarantee

Explore exactly-once semantics in distributed messaging - why it's hard, how Kafka and SQS approach it, and practical patterns for deduplication.



Every message processed exactly once, no duplicates, no loss. That is exactly-once delivery, and it is harder than it sounds. The name is misleading because exactly-once usually refers to processing semantics, not transport. The network can deliver a message multiple times. What matters is that the downstream system processes it only once.

This matters for financial transactions, inventory updates, and any system where duplicates create real problems. A payment processed twice is a disaster. An inventory decrement applied twice is a stock discrepancy.

Why Exactly-Once Is Hard

Distributed systems fail partially. A producer sends a message. The broker receives it and acknowledges. The acknowledgment gets lost in transit. The producer resends. Now the broker has two copies. Or the consumer processes the message, crashes before committing its offset, and on restart receives the same message again.

sequenceDiagram
    Producer->>Broker: Send(msg)
    Broker-->>Producer: ACK
    Note over Producer,Broker: ACK lost in transit
    Producer->>Broker: Send(msg) (retry)
    Broker-->>Producer: ACK
    Note over Producer,Broker: Broker now holds two copies

Any retry between producer, broker, or consumer can produce duplicates. At-least-once is easy. Getting exactly-once requires coordination across all three.
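The lost-ACK scenario can be reduced to a few lines of simulation. Everything here is a toy stand-in (`ToyBroker` and `produce_with_retry` are invented names), but it shows how one dropped acknowledgment turns into two stored copies:

```python
# Toy simulation of the lost-ACK retry problem (illustrative, not a real client).
class ToyBroker:
    def __init__(self):
        self.log = []            # messages the broker has durably stored

    def send(self, msg):
        self.log.append(msg)     # broker stores the message...
        return "ACK"             # ...and acknowledges

def produce_with_retry(broker, msg, ack_lost_first_time=True):
    ack = broker.send(msg)
    if ack_lost_first_time:      # the ACK never reaches the producer
        ack = broker.send(msg)   # producer times out and resends
    return ack

broker = ToyBroker()
produce_with_retry(broker, {"id": "m1", "body": "charge $10"})
print(len(broker.log))           # 2 -- the broker now holds two copies
```

The producer did nothing wrong: resending on a lost ACK is the only safe choice if messages must not be lost. That is why deduplication has to happen somewhere downstream.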

Delivery Semantics

Message systems typically offer one of three guarantees.

At-least-once means messages may be redelivered but never lost. The consumer acknowledges after processing. If processing succeeds but the ack fails, the message comes back. This is the most common setting, but it requires idempotent consumers.

At-most-once means messages may be lost but never duplicated. The consumer acknowledges before processing. If the consumer crashes after ack but before processing, the message is gone. This is rare and usually a poor trade-off.

Exactly-once means each message is processed precisely once. This requires either a transaction spanning the message and its side effects, or idempotent processing with deduplication.

stateDiagram-v2
    [*] --> AtMostOnce: Ack before processing
    AtMostOnce --> AtLeastOnce: Ack after processing
    AtLeastOnce --> ExactlyOnce: Add deduplication/idempotency

    note right of AtMostOnce
        Risk: Message loss
        Use case: Log aggregation
    end note

    note right of AtLeastOnce
        Risk: Duplicates
        Use case: Most applications
    end note

    note right of ExactlyOnce
        Risk: Higher latency
        Use case: Financial transactions
    end note

The delivery guarantee spectrum:

| Guarantee | Duplicates | Loss | Complexity | Typical Use |
|---------------|-----|-----|--------|----------------------------|
| At-most-once | No | Yes | Low | Metrics, non-critical logs |
| At-least-once | Yes | No | Medium | Most applications |
| Exactly-once | No | No | High | Financial transactions |

Kafka’s Exactly-Once Semantics

Kafka offers exactly-once semantics within its own ecosystem: it can guarantee that a message produced to a topic is processed exactly once, as long as both the inputs and the outputs stay inside Kafka.

Kafka achieves this through transactions introduced in version 0.11. When you enable transactions, Kafka writes messages to partitions as part of a transaction. Consumer offsets are committed as part of the same transaction, tying message consumption to offset commit.

graph LR
    Producer-->|produce| Kafka[Kafka Cluster]
    Kafka-->|write tx| Partition[Partition]
    Kafka-->|consume| Consumer[Consumer]
    Consumer-->|commit offset| Kafka
    Consumer-->|write| DB[(Database)]

For exactly-once processing outside Kafka, say writing to a database, you need to handle it yourself. The consumer must write the offset and the database update in a single transaction, or rely on idempotent semantics to deduplicate.

For more on Kafka’s partitioning and consumer groups, see Apache Kafka.

AWS SQS FIFO

SQS offers two queue types. Standard queues provide at-least-once delivery. FIFO queues provide exactly-once processing and preserve order.

FIFO queues deduplicate within a 5-minute window: either you supply an explicit MessageDeduplicationId, or you enable content-based deduplication and SQS derives one from the body. Send the same message twice within 5 minutes and SQS drops the second copy.

sqs.send_message(
    QueueUrl=queue_url,
    MessageBody=json.dumps({'order_id': '12345', 'action': 'ship'}),
    MessageDeduplicationId='order-12345-ship',
    MessageGroupId='orders'
)

When content-based deduplication is enabled, SQS derives the deduplication ID from a SHA-256 hash of the message body. Identical bodies produce identical deduplication IDs.
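Conceptually, content-based deduplication plus the 5-minute window looks like the sketch below (an in-memory illustration, not the service's actual implementation):

```python
import hashlib
import time

DEDUP_WINDOW = 300  # seconds, mirroring SQS FIFO's 5-minute window

_seen = {}  # dedup_id -> time the first copy arrived

def accept(body, now=None):
    """Return True to enqueue the message, False to drop it as a duplicate."""
    if now is None:
        now = time.time()
    dedup_id = hashlib.sha256(body).hexdigest()  # content-based dedup ID
    first_seen = _seen.get(dedup_id)
    if first_seen is not None and now - first_seen < DEDUP_WINDOW:
        return False               # duplicate inside the window: drop it
    _seen[dedup_id] = now          # start a new window for this content
    return True

print(accept(b"order-12345-ship", now=0.0))    # True  -- first copy
print(accept(b"order-12345-ship", now=10.0))   # False -- duplicate within 5 min
print(accept(b"order-12345-ship", now=400.0))  # True  -- window expired
```

Note the implication: two legitimately distinct messages with identical bodies inside the window are also dropped, which is why an explicit MessageDeduplicationId is safer when bodies can repeat.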

SQS FIFO can guarantee exactly-once within the queue itself, but your consumer still needs to be idempotent. If your consumer times out and does not delete the message within the visibility timeout window, SQS will deliver it again.

For more on SQS features, see AWS SQS and SNS.

RabbitMQ and Idempotent Consumers

RabbitMQ does not support exactly-once semantics out of the box. It gives you at-least-once delivery with manual acknowledgments; making the consumer idempotent is up to you.

Idempotency means processing the same message multiple times has the same effect as processing it once. Here is how to actually do it.

Deduplication table: Store processed message IDs in a database. Before processing, check if the ID exists.

INSERT INTO processed_messages (msg_id, processed_at)
VALUES ('msg-123', NOW())
ON CONFLICT (msg_id) DO NOTHING;

-- If insert succeeds, process the message
-- If insert fails (duplicate), skip processing
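The same pattern is easy to try end to end. This sketch uses SQLite, whose upsert syntax closely matches the Postgres snippet above; the `claim()` helper is an invented name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed_messages (msg_id TEXT PRIMARY KEY, processed_at TEXT)"
)

def claim(msg_id):
    """True if this is the first delivery (process it), False if duplicate."""
    cur = conn.execute(
        "INSERT INTO processed_messages VALUES (?, datetime('now')) "
        "ON CONFLICT(msg_id) DO NOTHING",
        (msg_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # 1 row inserted = first time; 0 = duplicate

print(claim("msg-123"))  # True  -- process the message
print(claim("msg-123"))  # False -- duplicate, skip
```

The database's unique constraint does the heavy lifting: even two consumers racing on the same message cannot both win the insert.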

Natural idempotency: Some operations are naturally idempotent. Setting a field to the same value twice leaves the database in the same state. UPDATE users SET status = 'active' WHERE id = 1 can run twice and nothing changes. Spotting these operations and using them cuts down on overhead.

Message-level transactions: Combine the message acknowledgment with your database transaction. If the database update fails, the message is not acknowledged and comes back.
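A sketch of that ordering, with a stand-in channel object (`FakeChannel` is invented; real code would use a pika channel): the ack happens only after the database commit, and a duplicate caught by the primary key is acked without reprocessing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY)")

class FakeChannel:
    """Stand-in for a RabbitMQ channel (illustrative only)."""
    def __init__(self):
        self.acked, self.nacked = [], []
    def basic_ack(self, delivery_tag):
        self.acked.append(delivery_tag)
    def basic_nack(self, delivery_tag, requeue=True):
        self.nacked.append(delivery_tag)

def handle(channel, delivery_tag, order_id):
    try:
        with conn:  # database transaction commits before we ack
            conn.execute("INSERT INTO orders VALUES (?)", (order_id,))
        channel.basic_ack(delivery_tag)       # ack only after commit
    except sqlite3.IntegrityError:
        channel.basic_ack(delivery_tag)       # duplicate: already applied, safe to ack
    except Exception:
        channel.basic_nack(delivery_tag)      # failed: let the broker redeliver

ch = FakeChannel()
handle(ch, 1, "order-1")
handle(ch, 2, "order-1")   # redelivery: primary key dedups, still acked
print(len(ch.acked), conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 2 1
```

If the process dies between the commit and the ack, the broker redelivers and the duplicate branch absorbs it, which is exactly the at-least-once-plus-idempotency contract.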

For more on RabbitMQ’s acknowledgment model, see RabbitMQ.

Practical Patterns

Idempotency Keys

Every message carries a unique ID or you derive one from content. The consumer stores processed IDs. When a message arrives, check if its ID was already processed and skip if so. This approach works with any message broker.

def process_message(msg):
    # Prefer an explicit key from the producer; fall back to hashing the body
    msg_id = msg.headers.get('x-idempotency-key') or hash(msg.body)

    if redis.exists(f"processed:{msg_id}"):
        return  # already processed, skip

    do_work(msg)
    # redis-py setex takes (name, time, value)
    redis.setex(f"processed:{msg_id}", 86400, '1')

Note that the exists/setex pair is not atomic: two concurrent deliveries of the same message can both pass the check, so do_work should still tolerate a rare double run.

Set the TTL longer than your maximum retry window to prevent unbounded growth.

Outbox Pattern

The outbox pattern addresses the dual-write problem. Instead of writing to the database and publishing a message in two separate steps, you write both to the database in a single transaction.

graph TD
    Service-->|1 write| DB[(Outbox Table)]
    DB-->|2 read| Relay[Outbox Relay]
    Relay-->|3 publish| Broker[Message Broker]

The outbox table lives in the same database as your business data. You write the business record and the outbox event in the same transaction. A separate process polls the outbox and publishes to the broker.

The key guarantee: if the business record exists, the event will eventually be published. The event is never published without the corresponding record.
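A minimal end-to-end sketch of the pattern (SQLite; the table layout and names are illustrative). Note that the relay itself is only at-least-once: if it crashes after publishing but before marking the row, the event is published again, so downstream consumers still deduplicate:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY, topic TEXT, payload TEXT,
                         published INTEGER DEFAULT 0);
""")

def place_order(order_id):
    with conn:  # business row and outbox event commit atomically
        conn.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES ('orders', ?)",
            (f'{{"order_id": "{order_id}", "event": "placed"}}',),
        )

def relay(publish):
    """Poll unpublished events, publish, then mark them published."""
    for row_id, topic, payload in conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0"
    ).fetchall():
        publish(topic, payload)  # a crash here -> the event publishes again later
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    conn.commit()

sent = []
place_order("order-7")
relay(lambda topic, payload: sent.append((topic, payload)))
print(len(sent))   # 1
```

The guarantee comes entirely from the first transaction: the order and its event are one atomic write, so the event can never be lost or orphaned.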

For more on these patterns, see Event-Driven Architecture and Asynchronous Communication.

Transactional Outbox with CDC

In high-throughput systems, polling the outbox adds latency. Change Data Capture tools like Debezium tail the database transaction log and publish when records change. This is faster but adds operational complexity.

The pattern stays the same: write to the outbox atomically with business data, let CDC handle propagation.

When Exactly-Once Matters

You do not always need exactly-once. For many use cases, at-least-once with idempotent consumers is sufficient and simpler. A log aggregator can handle duplicates. A notification system can usually tolerate the occasional duplicate email.

Exactly-once matters for financial transactions (duplicate charges are serious), inventory operations (double decrements overstate deductions), unique constraints (double user creation may violate uniqueness), and external API calls (retrying a non-idempotent API creates duplicate side effects).

Measure the cost of duplicates in your system before implementing exactly-once. The complexity is real and the operational burden is higher.

Saga Pattern vs Outbox Pattern

The saga pattern and outbox pattern solve related but distinct problems. Both address exactly-once in distributed systems, but they handle different failure scenarios.

The saga pattern coordinates multiple services in a distributed transaction. When step 3 fails, saga executes compensating transactions to undo steps 1 and 2. This handles partial failure without distributed locks.
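The compensation logic can be sketched in a few lines (the step names are invented, and a real saga also persists its progress so compensation survives a crash):

```python
log = []

# Saga steps: each action is paired with a compensating action.
def reserve_inventory(): log.append("reserve")
def release_inventory(): log.append("release")
def charge_card():       log.append("charge")
def refund_card():       log.append("refund")
def ship():              raise RuntimeError("carrier unavailable")
def no_op():             pass

def run_saga(steps):
    """Execute steps in order; on failure, run compensations in reverse."""
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
    except Exception:
        for compensate in reversed(done):  # undo completed steps, newest first
            compensate()
        return False
    return True

ok = run_saga([(reserve_inventory, release_inventory),
               (charge_card, refund_card),
               (ship, no_op)])
print(ok, log)   # False ['reserve', 'charge', 'refund', 'release']
```

Step 3 fails, so the saga refunds the card and releases the inventory, in reverse order, leaving the system as if nothing had happened.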

The outbox pattern handles the dual-write problem within a single service. Instead of writing to the database and publishing a message in two separate steps, you write both in a single database transaction. If the business record exists, the event will eventually be published.

| Aspect | Saga Pattern | Outbox Pattern |
|---|---|---|
| Problem solved | Multi-service distributed transactions | Single-service dual-write consistency |
| Failure handling | Compensating transactions | Eventual publication |
| Idempotency needed | Yes (compensating transactions must be idempotent) | Yes (outbox relay may retry) |
| Complexity | High (requires careful compensation design) | Medium (requires polling the outbox) |
| Consistency model | Eventual | Stronger eventual (within a single service) |
| Use when | Multiple services must update in a coordinated way | Service needs to publish events atomically with database changes |

Use the saga pattern when you have multiple independent services each making state changes that must be coordinated. Use the outbox pattern when a single service needs to publish events reliably without a distributed transaction coordinator.

For more on saga, see Saga Pattern. For outbox, see Outbox Pattern.

Idempotent Consumer with Redis Deduplication

Redis works well as a fast deduplication store for idempotent consumers. Store processed message IDs with a TTL longer than your retry window:

import redis
import json
import hashlib

class RedisIdempotentConsumer:
    def __init__(self, redis_client: redis.Redis, ttl_seconds: int = 86400):
        self.redis = redis_client
        self.ttl = ttl_seconds

    def _get_dedup_key(self, message) -> str:
        """Derive a unique key from the message."""
        msg_id = message.get('message_id')
        if msg_id:
            return f"dedup:{msg_id}"

        # Fallback: hash the message content
        content = json.dumps(message, sort_keys=True)
        content_hash = hashlib.sha256(content.encode()).hexdigest()[:16]
        return f"dedup:hash:{content_hash}"

    def is_duplicate(self, message) -> bool:
        """Check if this message was already processed."""
        key = self._get_dedup_key(message)
        return self.redis.exists(key) == 1

    def mark_processed(self, message) -> None:
        """Mark message as processed with TTL."""
        key = self._get_dedup_key(message)
        self.redis.setex(key, self.ttl, '1')

    def process(self, message: dict) -> bool:
        """
        Process message idempotently.
        Returns True if processed, False if duplicate.

        Note: the duplicate check and the mark are not atomic, so two
        concurrent deliveries can both pass the check; do_work must
        tolerate a rare double run.
        """
        if self.is_duplicate(message):
            return False

        # Process the actual work
        do_work(message)

        # Mark as processed
        self.mark_processed(message)
        return True

# Usage in a consumer loop
consumer = RedisIdempotentConsumer(redis_client, ttl_seconds=86400)

for message in receive_messages():
    if consumer.process(message):
        print(f"Processed message {message['message_id']}")
    else:
        print(f"Skipped duplicate {message['message_id']}")

Set the TTL longer than your maximum retry window. If a message might be retried for up to 1 hour due to broker retry policies, set TTL to 2 hours or more. If you process millions of messages per day, watch Redis memory usage.
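Sizing that cache is simple arithmetic: keys resident at any moment ≈ message rate × TTL. A back-of-envelope estimate with assumed numbers:

```python
# Back-of-envelope sizing for the Redis dedup cache (all figures are assumptions).
msgs_per_day = 10_000_000
ttl_seconds = 2 * 3600      # 2-hour TTL covering a 1-hour retry window
bytes_per_key = 100         # key + value + per-key Redis overhead, rough guess

resident_keys = msgs_per_day / 86_400 * ttl_seconds
print(f"~{resident_keys:,.0f} keys resident, ~{resident_keys * bytes_per_key / 1e6:.0f} MB")
# -> ~833,333 keys resident, ~83 MB
```

Under these assumptions the cache is modest; it is the combination of very high volume and long TTLs that makes Redis memory worth watching.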

Kafka Exactly-Once with External Systems

Kafka transactions guarantee exactly-once within the Kafka ecosystem. When writing to external systems like databases, you need additional handling.

Idempotent Writes with the Kafka Connect JDBC Sink

The Kafka Connect JDBC sink connector gets effectively-once database writes not through transactions but through idempotent upserts keyed on the record key (a configuration sketch; property names follow the Confluent JDBC sink connector):

# Route the id field from the value into the record key
transforms=insertKey
transforms.insertKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms.insertKey.fields=id

# Upsert on the primary key so redelivered records overwrite themselves
insert.mode=upsert
pk.mode=record_key
pk.fields=id

With upserts, a record delivered twice simply overwrites its own row, so redeliveries do not create duplicates.

Consider a Kafka Streams word-count application: each count update is an upsert, so if a count is recalculated during reprocessing, the database write produces the same result. When your stream processing produces deterministic, upserted outputs, exactly-once transactions may be unnecessary complexity.

Google Pub/Sub Exactly-Once

Pub/Sub offers exactly-once delivery as a setting on pull subscriptions. When enabled, Pub/Sub guarantees that a message whose acknowledgment succeeds is not redelivered. There is no publish-side deduplication, though: a publisher retry after a lost response still creates a second message. If you need end-to-end deduplication, carry an application-level key as a message attribute and check it in the subscriber:

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()

# The attribute name is our own convention; Pub/Sub does not interpret it.
future = publisher.publish(
    'projects/my-project/topics/my-topic',
    data=b'event data',
    idempotency_key='order-12345-ship'
)
future.result()  # block until the publish succeeds

On the subscriber side, nothing is acknowledged until your callback acks explicitly. If the subscriber crashes before acking, Pub/Sub redelivers the message after the ack deadline expires:

subscriber = pubsub_v1.SubscriberClient()
subscription_path = 'projects/my-project/subscriptions/my-sub'

def callback(message):
    try:
        process(message.data)
        message.ack()
    except Exception:
        # Nack and let Pub/Sub redeliver
        message.nack()

subscriber.subscribe(subscription_path, callback=callback)

Pub/Sub exactly-once delivery holds when:

  1. The subscription has exactly-once delivery enabled (pull subscriptions only)
  2. The subscriber acknowledges within the ack deadline
  3. The subscriber is still idempotent (safe to process the same message twice)

Even with the setting enabled, a subscriber that crashes after processing but before a successful acknowledgment sees the message again, so processing must remain idempotent.

Comparing Broker Support

| Broker | Exactly-Once Support | Notes |
|---|---|---|
| Kafka | Yes (transactions) | Within the Kafka ecosystem |
| SQS FIFO | Yes (deduplication) | Within SQS, 5-minute window |
| RabbitMQ | No (build it yourself) | At-least-once with idempotency |
| Google Pub/Sub | Yes (subscription setting) | Pull subscriptions; no publish-side dedup |

Summary

Exactly-once delivery is not a single mechanism but a combination of transport guarantees, idempotent processing, and sometimes distributed transactions. Most systems should start with at-least-once and idempotent consumers. Add exactly-once semantics only where the cost of duplicates justifies the implementation complexity.

The outbox pattern and idempotency keys are the practical tools here. They work across brokers and do not require special broker features.

For related reading, see Message Queue Types, Distributed Transactions, and Pub-Sub Patterns.
