Publish/Subscribe Patterns: Topics, Subscriptions, Filtering

Learn publish-subscribe messaging patterns: topic hierarchies, subscription management, message filtering, fan-out, and dead letter queues.

published: reading time: 21 min read author: GeekWorkBench

Publish/Subscribe Patterns: Topics, Subscriptions, and Filtering

Publish/subscribe is a messaging pattern where senders (publishers) broadcast messages to topics instead of sending them directly to specific receivers. Subscribers receive messages from the topics they care about. This decoupling is practical but comes with tradeoffs around topic design, filtering, and delivery semantics.

This post walks through the practical aspects of building pub/sub systems: topic hierarchies, subscription management, message filtering, and fan-out patterns.

Introduction

Publish/subscribe is a messaging pattern where senders — called publishers — do not direct messages to specific receivers. Instead, they broadcast messages to named topics. Subscribers listen to the topics they care about and receive all messages published to those topics. This decouples producers from consumers in both time and space: publishers do not need to know who (if anyone) is listening, and subscribers do not need to be online when messages are published.

A single order.placed event can simultaneously trigger inventory updates in the warehouse service, receipt generation in the billing service, fraud detection in the risk service, and analytics in the data pipeline — without the order service knowing anything about those downstream consumers.

This post covers the practical building blocks: how to design topics, manage subscriptions, filter messages selectively, handle fan-out to many subscribers, and deal with failures through dead letter queues. It also covers the trade-offs that determine when pub/sub fits and when a queue or direct RPC is simpler.

Topic-Specific Deep Dives

Topic Design

Topics are the core abstraction in pub/sub. Getting them right makes everything else easier.

Flat Topics

The simplest approach: one topic per message type.

user.created
order.placed
payment.processed

Each subscriber chooses which topics to listen to. Simple, but limited. What if you want all user events? You subscribe to multiple topics.

Hierarchical Topics

Hierarchies organize topics into trees, enabling broader subscriptions:

users/
users.created
users.updated
users.deleted

orders/
orders.placed
orders.updated
orders.cancelled

payments/
payments.processed
payments.failed

Subscribing to orders/ gives you all order events. Subscribing to users.* gives you all user lifecycle events.

graph TD
    Publisher -->|publish| Topic[orders.placed]

    subgraph Subscribers
        Analytics[Analytics Service]
        Notifications[Notification Service]
        Audit[Audit Service]
    end

    Topic -->|orders.placed| Analytics
    Topic -->|orders.placed| Notifications
    Topic -->|orders.placed| Audit

Naming conventions matter. Without some discipline, hierarchies turn into a mess pretty quickly.

Subject-Based vs Content-Based

Most pub/sub systems are subject-based: topics are predefined strings that publishers use. Subscribers choose topics to listen to.

Content-based systems route based on message content. A subscriber might say “give me all messages where amount > 1000.” This is more flexible but harder to implement efficiently. It also raises security concerns—what if a subscriber crafts queries to sniff data they should not see?

Most teams end up using subject-based routing. It is simpler to implement and performs well at scale.

Subscription Management

Subscriptions track what each subscriber wants. Managing their lifecycles is trickier than it sounds.

Durable Subscriptions

A durable subscription persists even when the subscriber is offline. When it reconnects, it gets the messages that arrived while it was disconnected.

// Pseudo-code for durable subscription
subscription = client.subscribe("orders.*", durable=true)
client.disconnect()
client.reconnect()
subscription.resume() // missed messages delivered

This matters for mobile clients and services that restart. Without durability, offline periods mean permanent message loss.

Shared Subscriptions

When multiple instances of a service run (say, three notification service instances), you want them to share the work. Shared subscriptions distribute messages across instances.

orders/ -> [notification-1, notification-2, notification-3]
          each instance gets roughly 1/3 of messages

This is essential for scaling consumers horizontally. Without sharing, all instances receive all messages, which defeats the purpose of running multiple workers.

Subscription Types by Delivery

  • Exclusive: Only one consumer receives messages (no sharing)
  • Shared: Messages distributed across multiple consumers
  • Failover: One consumer receives all messages, others standby

Message Filtering

Sometimes subscribers only need a subset of what a topic publishes. Filtering lets them narrow what they receive.

Topic-Level Filtering

The simplest form: use narrow topics. A subscriber to orders.high-value gets only high-value orders, not everything else.

This requires publishers to classify messages correctly, which couples them to subscriber knowledge.

Header-Based Filtering

Messages have headers (key-value pairs) that you can use for filtering:

{
  "topic": "orders",
  "headers": {
    "priority": "high",
    "region": "us-west",
    "source": "web"
  },
  "body": { "order_id": "12345", "amount": 5000 }
}

A subscriber can filter: headers.region = 'us-west' AND headers.priority = 'high'.

This separates classification (publisher job) from filtering (subscriber job).

SQL-Based Filtering

Some systems let subscribers express filters as SQL-like conditions:

SELECT * FROM orders WHERE amount > 1000 AND region = 'us-west'

Amazon SNS supports SQL-based filtering:

{
  "FilterPolicy": {
    "amount": [{ "numeric": [">", 1000] }],
    "region": ["us-west"]
  }
}

Fan-Out Patterns

Fan-out describes how one message reaches multiple subscribers. The pattern changes depending on what you need.

Broadcast Fan-Out

Every subscriber receives every message. The simplest model.

graph LR
    Pub[Publisher] -->|1 message| Topic
    Topic -->|copy 1| Sub1[Subscriber 1]
    Topic -->|copy 2| Sub2[Subscriber 2]
    Topic -->|copy 3| Sub3[Subscriber 3]

The broker makes copies. Cost scales with subscriber count.

Selective Fan-Out

With filtering, only matching subscribers receive the message. More efficient but requires the broker to evaluate filters.

Dead Letter Queues

When a subscriber cannot process a message (crashes, returns error, times out), what happens? Dead letter queues capture failed messages so you can inspect them later.

graph LR
    Pub[Publisher] --> Topic
    Topic -->|normal| Sub[Subscriber]
    Topic -->|failed after retries| DLQ[Dead Letter Queue]

Without DLQs, poison messages block the queue or get dropped silently. DLQs let you see what failed and why.

Implementation Considerations

Message Ordering

Most pub/sub systems do not guarantee ordering across topics. Within a single ordered stream (like Kafka partitions), ordering holds. Across multiple publishers or topics, it does not.

If you need global ordering, you need a totally ordered broadcast protocol, which is expensive. More commonly, you accept per-partition ordering and use correlation IDs to reconstruct order client-side.

Backpressure Handling

When subscribers cannot keep up with message throughput, backpressure prevents the system from collapsing. Without it, consumers run out of memory or broker queues grow without bound.

Buffer-based approaches accumulate messages locally up to a limit:

local_buffer = RingBuffer(capacity=1000)
while local_buffer.has_capacity():
    message = broker.fetch()
    local_buffer.push(message)

process_batch(local_buffer.drain())

When the buffer fills, the subscriber must either drop messages, pause consumption, or signal the broker to slow down.

Flow control signals let subscribers push back on publishers:

subscriber --> [Window Full] --> broker --> publisher slows production

RabbitMQ uses basic.qos with prefetch to limit unacknowledged messages. Kafka has consumer group rebalance hooks for similar behavior.

Exponential backoff on consumption failures keeps repeated processing failures from creating message pileup:

retry_delay = 1_000  # ms
for attempt in range(max_retries):
    try:
        process(message)
        ack(message)
        break
    except TransientError:
        sleep(retry_delay)
        retry_delay *= 2
        retry_delay = min(retry_delay, max_delay)

The key takeaway: backpressure should propagate upward, not accumulate silently. If your subscribers cannot keep up, the system should throttle publishers or shed load, not buffer forever.

Message Deduplication

At-least-once delivery can deliver the same message twice. Idempotent consumers handle this by tracking processed message IDs.

if message.id in processed_ids:
    skip
else:
    process(message)
    add message.id to processed_ids

Design for at-least-once and make processing idempotent. Exactly-once is usually not worth the cost.

Delivery Guarantees Deep Dive

Pub/sub systems offer different delivery guarantees. The right choice depends on your processing model.

GuaranteeDescriptionIdempotent RequiredUse Case
At-most-onceMessage delivered once or not at allNoMetrics, fire-and-forget events
At-least-onceMessage delivered, may be duplicatedYesProcessing jobs, state updates
Exactly-onceMessage delivered exactly onceYes (and coord.)Payments, critical business events

Most broker systems (RabbitMQ, Amazon SNS/SQS, Google Cloud Pub/Sub) provide at-least-once by default. Exactly-once requires distributed consensus and adds significant overhead.

When designing consumers, assume at-least-once arrival. Make every processing step idempotent using message IDs stored in a deduplication table or bloom filter.

Pub/sub connects closely to event-driven architecture, where topics carry domain events between services. It also relates to message queue types for understanding where pub/sub fits in the broader messaging landscape.

Trade-off Analysis

Design ChoiceProsConsWhen to Use
Flat TopicsSimple, easy to understandNo wildcard subscriptions, subscribers must list each topicSmall number of independent topics
Hierarchical TopicsWildcard subscriptions, natural organizationNaming discipline required, can become messyLarge topic spaces with natural groupings
Subject-Based RoutingEfficient, simple broker logicLimited flexibilityMost production systems
Content-Based RoutingMaximum flexibility for subscribersComplex broker logic, security concernsSpecialized filtering requirements
Durable SubscriptionsNo message loss during offline periodsHigher broker memory/storageMobile clients, fault-tolerant services
Shared SubscriptionsHorizontal scaling, efficient resource useMessage ordering not guaranteed per instanceStateless workers, scaled consumers
Exclusive SubscriptionsStrict ordering guaranteeNo horizontal scalingOrdered processing, single consumer
Broadcast Fan-OutSimple, all subscribers get all messagesScales with subscriber countNotifications, event broadcasting
Selective Fan-OutEfficient, subscribers get only relevant messagesBroker must evaluate filtersTargeted deliveries, filtered streams
Delivery GuaranteeMessage Loss RiskDuplicate RiskImplementation Complexity
At-most-oncePossible (message may not arrive)NoneLow
At-least-onceNonePossible (duplicates)Medium (idempotency needed)
Exactly-onceNoneNoneHigh (distributed consensus)

When to Use / When Not to Use

When to Use Publish-Subscribe

Use pub/sub when one event should trigger actions in multiple services simultaneously — analytics, notifications, audit, and others. When services need to react to shared events without calling each other directly. When pushing updates to web clients, mobile apps, or IoT devices. When the same data needs to live in multiple systems — databases, caches, search indexes.

When Not to Use Publish-Subscribe

Don’t reach for pub/sub when you need a reply from a specific service — use direct RPC or a queue with a reply-to header. Not when messages must be processed in strict order with a single consumer. Not when you’re just distributing work evenly across workers (a plain queue handles that better). And not when all consumers must finish processing before the operation completes — pub/sub is eventually consistent by nature.

Production Failure Scenarios

FailureImpactMitigation
Broker goes downAll publishing and subscribing stopsDeploy broker clusters with replication; use multiple broker endpoints
Subscriber crash during processingMessage may be lost or reprocessedUse manual acknowledgment; configure prefetch limits
Message explosion (too many subscribers)Network saturation, increased latencyLimit subscription count; use content filtering at broker
Topic explosion (too many topics)Memory pressure on brokerImplement topic naming policies; use topic hierarchies instead of flat topics
Subscription driftConsumers receiving unintended messagesUse filter policies; audit subscriptions regularly
Ordering violationsMessages arrive out of order across topicsUse correlation IDs to reorder; use single-partition streams for ordering
Poison message blockingFailed messages block subscriberConfigure dead letter queues; set retry limits with exponential backoff

Common Pitfalls / Anti-Patterns

Common Pitfalls

Pitfall 1: Topic Proliferation

Creating too many topics (one per event type) leads to management overhead and broker resource exhaustion. Use hierarchical topics to group related events and apply naming policies.

Pitfall 2: Treating Pub/Sub as Synchronous Communication

Pub/sub is inherently asynchronous. If you need synchronous request/response, do not use topics. Clients must be designed to handle out-of-order, asynchronous responses.

Pitfall 3: Forgetting That Subscribers Can Go Offline

Without durable subscriptions, offline subscribers miss messages permanently. For critical subscribers, make sure durability is configured.

Pitfall 4: Not Designing for Message Replay

Subscribers that crash and restart need to reprocess messages they missed. Design consumers to handle gaps in the message stream gracefully.

Pitfall 5: Over-Filtering at the Publisher

When publishers classify messages for subscribers, they become coupled to subscriber requirements. Push filtering to subscribers and keep publishers unaware of subscriber specifics.

Pitfall 6: Ignoring Dead Letter Queue Accumulation

If DLQs are accumulating, something is wrong with message processing. Monitor DLQ depth and investigate root causes promptly.

Interview Questions

1. What is the fundamental difference between subject-based and content-based routing in pub/sub systems?

Expected answer points:

  • Subject-based routing uses predefined topic strings; subscribers choose which topics to listen to
  • Content-based routing evaluates message body attributes; subscribers define filter conditions on message content
  • Content-based routing is more flexible but harder to implement efficiently and raises security concerns
2. How does a durable subscription work and why is it important for critical subscribers?

Expected answer points:

  • Durable subscriptions persist state server-side so messages arriving while subscriber is offline are queued
  • On reconnect, the subscriber receives missed messages from the persisted state
  • Critical for mobile clients and services that restart frequently
  • Without durability, offline periods cause permanent message loss
3. What is the purpose of dead letter queues and how should they be used?

Expected answer points:

  • DLQs capture messages that fail processing after all retry attempts
  • Prevents poison messages from blocking the queue or being dropped silently
  • Enables debugging of processing failures by inspecting failed messages
  • DLQ depth should be monitored and alerts configured for threshold violations
4. Explain the difference between broadcast fan-out and selective fan-out.

Expected answer points:

  • Broadcast fan-out delivers every message to every subscriber; broker makes copies
  • Selective fan-out uses filtering so only matching subscribers receive the message
  • Selective fan-out is more efficient but requires broker to evaluate filters
  • Cost scales with subscriber count in broadcast; less so in selective
5. What are the three subscription types by delivery pattern and how do they differ?

Expected answer points:

  • Exclusive: Only one consumer receives messages from the subscription
  • Shared: Messages are distributed across multiple consumers in the group
  • Failover: One consumer receives all messages; others remain standby for high availability
6. What is backpressure handling in pub/sub systems and why is it necessary?

Expected answer points:

  • Backpressure prevents system collapse when subscribers cannot keep up with message throughput
  • Without it, consumers run out of memory or broker queues grow without bound
  • Buffer-based approaches accumulate messages locally up to a limit
  • Flow control signals let subscribers push back on publishers to slow things down
  • Should propagate upward, not accumulate silently
7. How does at-least-once delivery differ from exactly-once and at-most-once?

Expected answer points:

  • At-most-once: message delivered once or not at all; no idempotency required
  • At-least-once: message guaranteed to be delivered but may be duplicated; idempotent processing required
  • Exactly-once: message delivered exactly once; requires distributed consensus and significant overhead
  • Most brokers provide at-least-once by default
8. What is the difference between shared subscriptions and exclusive subscriptions?

Expected answer points:

  • Shared subscriptions distribute messages across multiple consumer instances for horizontal scaling
  • Exclusive subscriptions ensure only one consumer receives all messages
  • Without shared subscriptions, all instances get all messages, which defeats horizontal scaling
  • Shared works well for scaling consumers; exclusive is for single-threaded processing requirements
9. What are the key observability metrics to monitor in a pub/sub system?

Expected answer points:

  • Topic message rate (messages published per second per topic)
  • Subscription delivery rate (messages delivered per second per subscription)
  • Subscriber lag (time between publish and delivery)
  • Dead letter queue depth (failed messages waiting to be inspected)
  • Filter match rate (percentage of messages matching subscriber filters)
  • Connection count (active publishers and subscribers per topic)
10. What are the main failure scenarios in pub/sub systems and their mitigations?

Expected answer points:

  • Broker failure: Deploy broker clusters with replication; use multiple broker endpoints
  • Subscriber crash during processing: Use manual acknowledgment; configure prefetch limits
  • Message explosion (too many subscribers): Limit subscription count; use content filtering
  • Topic explosion (too many topics): Implement naming policies; use topic hierarchies
  • Ordering violations: Use correlation IDs to reorder; use single-partition streams
  • Poison message blocking: Configure dead letter queues; set retry limits with exponential backoff
11. How does message filtering at the header level differ from SQL-based filtering in pub/sub systems?

Expected answer points:

  • Header-based filtering uses key-value pairs attached to messages; subscribers filter by header attributes
  • SQL-based filtering lets subscribers express complex conditions with operators like >, <, =, AND, OR
  • Header filtering is simpler and faster but limited to predefined header fields
  • SQL filtering is more expressive but requires the broker to parse and evaluate SQL-like expressions
  • Amazon SNS supports SQL-based filtering via FilterPolicy with numeric and string matching
12. What is the relationship between pub/sub and event-driven architecture?

Expected answer points:

  • Pub/sub is often the messaging backbone for event-driven architecture
  • In EDA, topics carry domain events between services that react to those events
  • Pub/sub provides the decoupled, asynchronous communication that EDA requires
  • Services do not call each other directly; they subscribe to topics and react to events
  • This separation allows services to evolve independently without tight coupling
13. What are the tradeoffs between hierarchical topics and flat topics in pub/sub design?

Expected answer points:

  • Flat topics are simple but require explicit subscriptions for each topic
  • Hierarchical topics enable broad subscriptions using wildcard patterns (e.g., orders/*)
  • Hierarchies reduce coupling between publishers and subscribers
  • The tradeoff is naming discipline—without conventions, hierarchies become messy
  • Hierarchies work well when there is a natural domain tree; flat works when topics are independent
14. Why is idempotent processing important in pub/sub systems and how do you implement it?

Expected answer points:

  • At-least-once delivery can duplicate messages; idempotent processing handles duplicates
  • Idempotent processing means processing a message multiple times has the same effect as once
  • Implementation: track processed message IDs in a deduplication table or bloom filter
  • Check if message.id exists before processing; skip if already processed
  • Design for at-least-once and make every processing step idempotent
15. What is the role of correlation IDs in pub/sub message ordering?

Expected answer points:

  • Pub/sub does not guarantee ordering across topics or publishers
  • Correlation IDs allow clients to reconstruct order on their side
  • Messages related to the same logical operation share a correlation ID
  • Subscribers sort messages by correlation ID to restore proper sequence
  • For strict ordering, use single-partition ordered streams and correlation IDs together
16. How do you handle backpressure in a pub/sub consumer that cannot keep up with message throughput?

Expected answer points:

  • Buffer-based approaches accumulate messages locally in a ring buffer up to capacity
  • When buffer fills, the subscriber must drop messages, pause consumption, or signal the broker
  • Flow control signals let subscribers push back on publishers to slow production
  • RabbitMQ uses basic.qos with prefetch to limit unacknowledged messages
  • Kafka has consumer group rebalance hooks for similar backpressure behavior
  • Backpressure should propagate upward, not accumulate silently
17. What are the security considerations in a pub/sub system?

Expected answer points:

  • Authentication: Require credentials for publishers and subscribers; use mutual TLS
  • Authorization: Implement topic-level access controls; restrict who can publish to which topics
  • Encryption in transit: Enable TLS for all broker connections
  • Encryption at rest: Enable disk encryption if broker stores messages
  • Message-level security: Validate message schemas; sanitize content to prevent injection
  • Subscription security: Prevent unauthorized subscription creation
18. What is the difference between exclusive and shared subscriptions in terms of scaling behavior?

Expected answer points:

  • Exclusive subscriptions deliver all messages to one consumer; others receive nothing
  • Shared subscriptions distribute messages across multiple consumers in a group
  • Without sharing, all instances get all messages—defeating horizontal scaling
  • Shared works well for scaling out stateless workers; exclusive is for single-threaded requirements
  • Failover subscriptions designate one active consumer with others on standby
19. How does exponential backoff help with message processing failures?

Expected answer points:

  • Exponential backoff increases retry delays progressively after transient failures
  • Starting delay doubles or triples after each failed attempt up to a maximum
  • Prevents message pileup from repeated processing failures creating backlog
  • Example: retry_delay starts at 1000ms, doubles each attempt, caps at max_delay
  • Keeps the system responsive while failed messages are retried at sustainable rate
20. What observability metrics should you monitor in a production pub/sub system?

Expected answer points:

  • Topic message rate: Messages published per second per topic
  • Subscription delivery rate: Messages delivered per second per subscription
  • Subscriber lag: Time between message publish and delivery to subscribers
  • Dead letter queue depth: Failed messages awaiting inspection
  • Filter match rate: Percentage of published messages matching subscriber filters
  • Connection count: Active publishers and subscribers per topic
  • Alerts should trigger when DLQ depth exceeds threshold or delivery failures exceed rate

Further Reading

Conclusion

Key Points

  • Pub/sub broadcasts messages to all subscribers; each subscriber gets a copy
  • Topic hierarchies organize messages by domain and enable broad subscriptions
  • Durable subscriptions persist messages for offline subscribers
  • Shared subscriptions distribute load across multiple consumer instances
  • Dead letter queues capture poison messages for debugging
  • Design for at-least-once delivery with idempotent processing
  • Ordering is not guaranteed across topics; use correlation IDs when needed

Pre-Deployment Checklist

- [ ] Topic naming convention documented and enforced
- [ ] Durable subscriptions enabled for critical subscribers
- [ ] Dead letter queues configured with monitoring
- [ ] Filter policies documented per subscription
- [ ] Manual acknowledgment implemented in consumers
- [ ] Idempotent message processing implemented
- [ ] Retry limits set with exponential backoff
- [ ] TLS/encryption enabled for all connections
- [ ] Topic-level access controls configured
- [ ] Alert thresholds set for DLQ depth and delivery failures
- [ ] Consumer group scaling strategy defined
- [ ] Correlation ID propagation implemented for distributed tracing

Observability

Metrics to Monitor

  • Topic message rate: Messages published per second per topic
  • Subscription delivery rate: Messages delivered per second per subscription
  • Subscriber lag: Time between message publish and delivery to subscribers
  • Dead letter queue depth: Failed messages awaiting inspection
  • Filter match rate: Percentage of published messages matching subscriber filters
  • Connection count: Active publishers and subscribers per topic

Logs to Capture

  • Topic creation and deletion events
  • Subscription creation, modification, and deletion
  • Message publish events with topic, routing keys, and headers
  • Message delivery events per subscription
  • Dead letter queue arrivals with original topic and failure reason
  • Filter policy evaluations (when a message is filtered out)

Alerts to Configure

  • Dead letter queue depth exceeds threshold
  • Subscriber delivery failures exceed rate threshold
  • Topic message rate anomalies (spike or drop)
  • Subscription lag exceeds SLA threshold
  • Broker memory or connection limits approaching

Security Checklist

  • Authentication: Require credentials for publishers and subscribers; use mutual TLS for authentication
  • Authorization: Implement topic-level access controls; restrict who can publish to which topics
  • Encryption in transit: Enable TLS for all broker connections
  • Encryption at rest: Enable disk encryption if broker stores messages
  • Message-level security: Validate message schemas; sanitize content to prevent injection attacks
  • Subscription security: Prevent unauthorized subscription creation; validate subscriber identity
  • Audit trails: Log all topic and subscription administrative actions
  • Data classification: Do not publish sensitive data to shared topics without encryption

Category

Related Posts

CQRS Pattern

Separate read and write models. Command vs query models, eventual consistency implications, event sourcing integration, and when CQRS makes sense.

#database #cqrs #event-sourcing

Event Sourcing

Storing state changes as immutable events. Event store implementation, event replay, schema evolution, and comparison with traditional CRUD approaches.

#database #event-sourcing #cqrs

Message Queue Types: Point-to-Point vs Publish-Subscribe

Understand the two fundamental messaging patterns - point-to-point and publish-subscribe - and when to use each, including JMS, AMQP, and MQTT protocols.

#messaging #queues #distributed-systems