Apache Kafka: Distributed Streaming Platform

Learn how Apache Kafka handles distributed streaming with partitions, consumer groups, exactly-once semantics, and event-driven architecture patterns.

Kafka is not a message queue. It is a distributed streaming platform built for continuous data streams, not batch message processing. This distinction matters. Queues deliver messages point-to-point. Kafka publishes and consumes streams of events with durable storage and replay capability.

This post covers Kafka’s core concepts: topics and partitions, consumer groups, exactly-once semantics, and practical use cases.

Topics and partitions

Kafka organizes data into topics. Unlike a queue where messages are consumed and deleted, Kafka topics are logs. Messages are appended and kept for a configurable retention period: hours, days, or indefinitely.

Topic: order-events
Partition 0: [msg1, msg2, msg3, msg5, msg8]
Partition 1: [msg4, msg6, msg7, msg9]
Partition 2: [msg10, msg11, msg12]

Each topic splits into partitions for parallelism. Partitions distribute across brokers. Within a partition, messages have a monotonically increasing offset that uniquely identifies each one.
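
The log-with-offsets model can be sketched in a few lines of Python. This toy in-memory `Topic` class is purely illustrative, not how the broker actually stores data:

```python
class Topic:
    """Toy in-memory model of a topic: one append-only log per partition."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition: int, message: str) -> int:
        """Append and return the message's offset (its index in the log)."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1

    def read(self, partition: int, offset: int) -> str:
        """Reading never deletes: the same offset can be read again later."""
        return self.partitions[partition][offset]

topic = Topic(num_partitions=3)
first = topic.append(0, "msg1")   # offset 0
second = topic.append(0, "msg2")  # offset 1: offsets grow monotonically
```

Two properties fall out of this structure: offsets within a partition are strictly increasing, and consuming a message does not remove it from the log.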

Message keying

Producers can specify a key when publishing:

producer.send(new ProducerRecord<>("order-events", orderId, orderJson));

Kafka hashes the key to determine the partition. Messages with the same key always go to the same partition, which means ordering per key. All events for the same order arrive in the same partition, in order.
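
A hedged sketch of key-based partitioning: the Java client actually uses murmur2 hashing, but CRC32 stands in here so the example needs only the standard library.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Deterministic key -> partition mapping. The Java client uses murmur2;
    CRC32 is a stand-in for illustration."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Every event for the same order lands on the same partition, hence in order.
p = partition_for("order-42", 3)
```

Because the hash is deterministic, repeated sends with the same key always choose the same partition, which is exactly what preserves per-key ordering.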

Partition assignment

Partitions are assigned to brokers at topic creation time:

Broker 1: Partition 0, Partition 3
Broker 2: Partition 1, Partition 4
Broker 3: Partition 2, Partition 5

The leader broker for each partition handles reads and writes. Followers replicate for fault tolerance.

Consumer groups

Kafka consumers belong to consumer groups. Each message in a topic is delivered to exactly one consumer within each group, while separate groups each receive the full stream.

graph LR
    Producer -->|publish| Topic[Topic with 3 Partitions]
    Topic -->|P0| CG1[Consumer Group A]
    Topic -->|P1| CG1
    Topic -->|P2| CG1
    Topic -->|P0| CG2[Consumer Group B]
    Topic -->|P1| CG2
    Topic -->|P2| CG2

Group A might have one consumer processing all partitions. Group B might have three consumers, each owning one partition. Both groups receive all messages independently.
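
That split can be modeled with a simple assignment function. Kafka's real assignors (range, round-robin, sticky) are pluggable, so treat this round-robin sketch as illustrative only:

```python
def assign_partitions(partitions: list, consumers: list) -> dict:
    """Round-robin partition assignment within one consumer group."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# Group A: a single consumer owns every partition.
group_a = assign_partitions([0, 1, 2], ["a1"])
# Group B: three consumers, one partition each; both groups see all data.
group_b = assign_partitions([0, 1, 2], ["b1", "b2", "b3"])
```

The key invariant in any assignor: each partition belongs to exactly one consumer per group, so parallelism within a group is capped by the partition count.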

Rebalancing

When a consumer joins or leaves a group, Kafka rebalances partition assignments. The system redistributes partitions to maintain even load.

Rebalancing has a cost: during rebalancing, no consumer processes messages. If rebalances happen frequently, throughput suffers.

Offset management

Kafka tracks consumption progress via offsets. Consumers commit offsets to mark what they have processed:

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        process(record);
    }
    consumer.commitSync();
}

If a consumer crashes after processing but before committing, it receives the same messages on restart. This is at-least-once delivery. With idempotent processing, duplicates are harmless.
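
Idempotent processing can be as simple as tracking processed message IDs. This in-memory sketch assumes each message carries a unique ID; in practice the seen-ID set would live in the output store, not process memory:

```python
processed_ids = set()   # in practice: a unique constraint in the output store
results = []

def process_idempotently(message_id: str, payload: str) -> None:
    """Redelivered duplicates are detected by ID and skipped."""
    if message_id in processed_ids:
        return                       # already handled before the crash
    results.append(payload)          # the actual side effect
    processed_ids.add(message_id)

process_idempotently("m1", "ship order")
process_idempotently("m1", "ship order")  # at-least-once redelivery: a no-op
```

With this in place, the crash-before-commit scenario above produces a harmless duplicate delivery rather than a duplicate side effect.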

Exactly-once semantics

Exactly-once is the gold standard: each message is processed precisely once, no duplicates, no loss. Kafka offers exactly-once semantics via transactions.

The problem

Exactly-once is hard because processing involves multiple systems:

Kafka -> Consumer -> Database

If the consumer crashes after writing to the database but before committing the Kafka offset, the message is reprocessed on restart, causing a duplicate database write.

Kafka transactions

Kafka transactions solve this by atomically committing offsets and output:

producer.initTransactions();
producer.beginTransaction();
producer.send(producerRecord);
// offsets: a Map<TopicPartition, OffsetAndMetadata> built from the records just processed
producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
producer.commitTransaction();

If the transaction commits, the output records and the offset commit become visible atomically. If it aborts, neither takes effect and the consumer reads the message again.

When you need exactly-once

Most use cases do not need exactly-once. At-least-once with idempotent processing is simpler and performs better. Consider exactly-once only when:

  • Duplicate processing has serious consequences (financial transactions, inventory updates)
  • Your output system does not support idempotent writes
  • The cost is acceptable

For most event processing pipelines, at-least-once plus idempotency is the right choice.

End-to-end exactly-once flow

The exactly-once flow in practice:

sequenceDiagram
    participant Producer
    participant Kafka as Kafka Cluster
    participant Consumer as Consumer App
    participant DB as Output DB
    participant Coordinator as Transaction Coordinator

    Producer->>Kafka: send() with transactional ID
    Kafka->>Producer: acknowledged

    Consumer->>Kafka: poll() receives record
    Consumer->>DB: write to database
    Consumer->>Coordinator: sendOffsetsToTransaction()
    Coordinator->>Kafka: commit offsets + data atomically
    Kafka->>Coordinator: commit confirmed
    Coordinator->>Consumer: offset commit confirmed

    Note over Consumer,DB: If crash here, replay from last committed offset; DB writes must be idempotent

The transaction coordinator bundles the Kafka output records and the offset commit together: they either both commit or both abort. Note the limit, though: writes to an external database sit outside the Kafka transaction. Those still need to be idempotent, or the offsets must be stored in the same database transaction as the output. When the consumer restarts, it resumes from the last committed offset and the idempotent sink absorbs any replayed records.

Without transactions, there is a window between “wrote to database” and “committed offset.” Crash in that window and you get duplicates. Transactions close that window.
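
For an external database, one way to close that window is to commit the offset in the same database transaction as the output. A sketch with SQLite standing in for the output store; the schema and the `handle` helper are invented for illustration:

```python
import sqlite3

# Invented schema: one table for output rows, one for stored offsets.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, payload TEXT)")
db.execute("CREATE TABLE offsets (part INTEGER PRIMARY KEY, pos INTEGER)")

def handle(part: int, offset: int, order_id: str, payload: str) -> None:
    """Write the output row and advance the stored offset in ONE DB transaction."""
    with db:  # sqlite3 connection as context manager: atomic commit/rollback
        db.execute("INSERT OR IGNORE INTO orders VALUES (?, ?)", (order_id, payload))
        db.execute(
            "INSERT INTO offsets VALUES (?, ?) "
            "ON CONFLICT(part) DO UPDATE SET pos = excluded.pos",
            (part, offset + 1),  # next offset to resume from
        )

handle(0, 41, "o-1", "shipped")
handle(0, 41, "o-1", "shipped")  # replay after a crash: no duplicate row
```

On restart, the consumer seeks to the offset stored in the database rather than the one in Kafka, so the output and the consumption position can never disagree.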

Broker replication and ISR

Kafka replicates partitions across brokers for fault tolerance. Each partition has a leader and multiple followers. Followers replicate messages from the leader, staying in sync.

graph LR
    subgraph Broker-1
        L1[Partition 0 Leader]
    end
    subgraph Broker-2
        F1[Partition 0 Follower - ISR]
    end
    subgraph Broker-3
        F2[Partition 0 Follower - ISR]
    end
    Producer -->|writes| L1
    L1 -->|replicate| F1
    L1 -->|replicate| F2
    L1 -->|consume| Consumer

In-Sync Replicas (ISR) are replicas that have fully caught up with the leader. Only ISR members can become leaders after a failure. The replication.factor setting controls how many replicas exist, and min.insync.replicas defines the minimum ISR size for acknowledging writes.

For example, with replication.factor=3 and min.insync.replicas=2, a partition has 3 replicas. Writes acknowledge when at least 2 replicas (the leader plus 1 follower) have persisted the message.
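
The acknowledgment rule can be expressed directly. `write_accepted` is a hypothetical helper for illustration, not a Kafka API:

```python
def write_accepted(isr_size: int, min_insync: int, acks: str) -> bool:
    """Whether the broker acknowledges a produce, given the current ISR size."""
    if acks == "all":
        return isr_size >= min_insync  # need the configured quorum in sync
    return isr_size >= 1               # acks=1: the leader alone suffices

# replication.factor=3, min.insync.replicas=2:
healthy = write_accepted(3, 2, "all")   # all replicas in sync: accepted
one_down = write_accepted(2, 2, "all")  # one follower lagging: still accepted
two_down = write_accepted(1, 2, "all")  # only the leader left: rejected
```

This is why replication.factor=3 with min.insync.replicas=2 is the common production pairing: it tolerates one replica failure without losing write availability.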

Key configuration defaults

| Parameter | Default | Description | Production recommendation |
| --- | --- | --- | --- |
| replication.factor | 1 | Number of replicas per partition | 3 for critical topics |
| min.insync.replicas | 1 | Minimum ISR to acknowledge a write | 2 (requires acks=all) |
| retention.ms | 7 days | Message retention period | Based on replay requirements |
| acks | 1 (leader); all since Kafka 3.0 | Acknowledgment required | all for critical data |
| compression.type | producer | Compression codec | lz4 or zstd |
| max.in.flight.requests.per.connection | 5 | Unacknowledged requests per connection | Up to 5 with idempotence enabled; 1 without it if ordering matters |

Kafka Streams example

Kafka Streams is a client library for building real-time stream processing applications. The word count example is the canonical demonstration. It counts occurrences of words across an infinite stream of text events:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;

import java.util.Arrays;
import java.util.Properties;

public class WordCountApplication {
    public static void main(String[] args) {
        Properties config = new Properties();
        config.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-app");
        config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        config.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Source: read from input topic
        KStream<String, String> textLines = builder.stream("text-lines-topic");

        // Process: split into words, group, count
        textLines
            .flatMapValues(textLine -> Arrays.asList(textLine.toLowerCase().split("\\W+")))
            .groupBy((key, word) -> word)
            .count(Materialized.as("word-counts-store"))
            .toStream()
            .to("word-counts-output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), config);
        streams.start();
    }
}

How it works:

  1. flatMapValues splits each input line into lowercase words
  2. groupBy re-keys the stream by word, discarding the original key (this triggers a repartition through an internal topic)
  3. count maintains a running count per word in a state store
  4. toStream.to writes results to the output topic

Scaling behavior: the groupBy step repartitions the stream by word through an internal topic, so all occurrences of a word are routed to the same task regardless of which input partition they arrived on. The count per word is therefore already global. The trade-off is the extra hop through the repartition topic, which adds latency and broker load.

Fault tolerance: Kafka Streams backs each state store with a changelog topic in Kafka. If a stream processor crashes, another instance restores the store from the changelog and resumes processing without data loss.
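
The same topology can be mimicked in plain Python to see the data flow without a cluster. `word_count` is an illustrative stand-in, not Kafka Streams code:

```python
import re
from collections import Counter

def word_count(lines):
    """Plain-Python analogue of the topology: flatMapValues -> groupBy -> count."""
    counts = Counter()
    for line in lines:                                # each input record
        for word in re.split(r"\W+", line.lower()):   # flatMapValues: split to words
            if word:                                  # drop empty splits
                counts[word] += 1                     # groupBy word + count
    return counts

counts = word_count(["Kafka streams", "kafka counts words"])
```

The real version differs mainly in that the counts live in a replicated state store and the input never ends; the per-record transformation is the same.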

Partition count sizing

The number of partitions determines parallelism for both producers and consumers. Getting it right at topic creation matters: partitions can be added later but never removed, and adding them changes the key-to-partition mapping, which breaks per-key ordering for existing keys.

Factors that drive partition count:

| Factor | Impact | Consideration |
| --- | --- | --- |
| Desired consumer parallelism | More partitions means more concurrent consumers | One partition per consumer in a group |
| Producer throughput target | Each partition handles about 10 MB/s | Partitions should be >= target_MB_s / 10 |
| Maximum broker scale | Kafka performance degrades past roughly 4,000 partitions per broker | Consider broker count when sizing |
| Key cardinality | High-cardinality keys distribute evenly | Avoid partitions that become hot spots |

Sizing formula:

import math

def calculate_partition_count(
    target_throughput_mbps: float,
    max_consumer_parallelism: int,
    num_brokers: int,
    replication_factor: int = 3
) -> dict:
    """
    Calculate recommended partition count for a topic.
    """
    # Throughput-based: each partition handles ~10 MB/s reliably
    partitions_for_throughput = math.ceil(target_throughput_mbps / 10)

    # Consumer-based: need at least as many partitions as consumers
    partitions_for_consumers = max_consumer_parallelism

    # Broker-based: avoid too many partitions per broker
    # Guideline: < 4000 partitions per broker for good performance
    max_partitions_per_broker = 4000
    partitions_for_broker_capacity = num_brokers * max_partitions_per_broker

    recommended = max(partitions_for_throughput, partitions_for_consumers)
    recommended = min(recommended, partitions_for_broker_capacity)

    # Account for replication: total partitions including replicas
    total_partitions_in_cluster = recommended * replication_factor

    return {
        'recommended_partitions': recommended,
        'based_on': 'throughput' if partitions_for_throughput >= partitions_for_consumers else 'consumer_parallelism',
        'throughput_achievable_mbps': recommended * 10,
        'max_consumer_parallelism': recommended,
        'total_partition_slots_used': total_partitions_in_cluster,
        'partitions_per_broker': recommended / num_brokers
    }

# Example: 100 MB/s target, 30 consumers, 6 brokers
result = calculate_partition_count(100, 30, 6)
# Returns: recommended=30 (based on consumer count),
#          throughput_achievable=300 MB/s,
#          partitions_per_broker=5

Practical guidelines:

  • Start conservative: 6-12 partitions for most topics
  • Increase when consumer lag appears and more consumers cannot be added
  • Monitor per-partition throughput to detect hot spots
  • When repartitioning is needed, create a new topic and migrate data with a consumer that reads from both

Producer and consumer implications:

  • More partitions means more producer connections and higher client-side memory
  • More partitions means longer leader election time after broker failure
  • Consumers with many partitions need more heap for offset tracking

Kafka use cases

Kafka excels at specific workloads:

Event streaming

Capture events from many sources and distribute to multiple consumers:

Clickstream -> Kafka -> Analytics
                     -> Personalization
                     -> Fraud Detection
                     -> Audit Log

Message bus replacement

Replace traditional message queues with Kafka for better throughput and replay capability.

Data integration

Connect different systems without point-to-point coupling:

Database CDC -> Kafka -> Search Index
                     -> Cache
                     -> Data Warehouse
                     -> ML Pipeline

Change Data Capture from databases fits this pattern well. Any system can consume the stream without touching the source database.

Kafka vs traditional queues

| Aspect | Kafka | Traditional Queue |
| --- | --- | --- |
| Retention | Days/weeks/forever | Until consumed |
| Replay | Yes | No (usually) |
| Ordering | Per partition | Per queue |
| Throughput | Very high | Moderate |
| Consumer groups | Independent per group | Shared or exclusive |

Kafka’s retention and replay make it unique. You can reprocess historical data if your processing logic changes. Traditional queues cannot do this.
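
Replay is just reading the retained log again from an earlier offset; a minimal sketch:

```python
def replay(partition_log: list, from_offset: int = 0) -> list:
    """Re-read a partition from any retained offset, something a queue cannot do."""
    return partition_log[from_offset:]

partition = ["evt1", "evt2", "evt3", "evt4"]
reprocessed = replay(partition, from_offset=0)   # full historical reprocess
tail = replay(partition, from_offset=2)          # resume mid-log
```

The real mechanism is `consumer.seek()` to the desired offset; the point is that the data is still there to seek back to.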

For broader event-driven patterns, see our post on event-driven architecture. For pub/sub patterns that overlap with Kafka’s topic model, see pub/sub patterns.

When to use and when not to use

When to use Apache Kafka

  • High-throughput event streaming: when you need to process millions of events per second (clickstreams, IoT sensors, logs)
  • Event sourcing: when you need to store and replay event history for audit trails or read model rebuilding
  • Data pipeline backbone: when connecting multiple systems (CDC, ETL, data lake ingestion) without point-to-point coupling
  • Multiple independent consumers: when many different consumer groups need to process the same data stream independently
  • Ordered per-key processing: when messages with the same key must maintain order (same partition)
  • Long-term message retention: when you need replay capability for historical data analysis

When not to use Apache Kafka

  • Simple task queues: when you just need point-to-point work distribution with single consumer
  • Low-volume point-to-point messaging: when the overhead of managing partitions and consumer groups is not justified
  • Request/response: when you need synchronous responses from specific services
  • Message priority: when you need priority queuing (Kafka does not support this)
  • Small teams with limited operational capacity: when Kafka’s operational complexity (partition management, replication, leader election) exceeds team capabilities
  • Very low latency requirements: when sub-millisecond latency is critical (Kafka adds latency for batching and acknowledgment)

Production failure scenarios

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Broker goes down | Partition leader election; temporary unavailability | Configure replication factor of 3; use ISR configuration |
| Controller failure | Cluster-wide coordination pauses | Run multiple brokers; use ZooKeeper/KRaft for controller election |
| Network partition | Followers fall out of ISR; potential data loss | Monitor ISR size; alert when replicas fall behind |
| Producer retry storm | Duplicate messages after transient failures | Enable idempotent producer; design idempotent consumers |
| Consumer rebalance storm | Throughput drops during rebalancing | Use sticky partition assignment; avoid frequent consumer restarts |
| Offset commit failure | Messages reprocessed or skipped | Use transactional producers with exactly-once semantics when needed |
| Partition imbalance | Some brokers overloaded while others idle | Monitor partition distribution; use Cruise Control for rebalancing |
| Data loss on leader change | Under-replicated partitions lose messages | Ensure min.insync.replicas >= 2; acks=all on producers |

Observability checklist

Metrics to monitor

  • Consumer lag: difference between latest offset and consumer position (critical for SLAs)
  • Under-replicated partitions: partitions without full replication
  • ISR size: In-Sync Replicas count per partition
  • Message throughput: messages and bytes per second per topic
  • Request latency: P99 producer and consumer request latencies
  • Disk usage: broker disk utilization and growth rate
  • Consumer group status: active members and partition assignments
  • Controller status: leader elections and controller changes
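
Consumer lag, the first metric above, is a simple per-partition subtraction. This sketch assumes the log-end and committed offsets come from your monitoring source:

```python
def consumer_lag(log_end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag: log-end offset minus the consumer's committed position."""
    return {p: log_end_offsets[p] - committed_offsets.get(p, 0)
            for p in log_end_offsets}

# Hypothetical offsets pulled from monitoring:
lag = consumer_lag({0: 1500, 1: 980}, {0: 1480, 1: 980})
total_lag = sum(lag.values())  # alert when this exceeds the SLA threshold
```

Tracking lag per partition, not just the total, also surfaces hot partitions: one partition falling behind while the rest are current points at key skew rather than under-provisioned consumers.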

Logs to capture

  • Broker startup and shutdown events
  • Partition leader election events
  • Consumer group rebalancing events
  • Producer acknowledgment failures and retries
  • Controller changes and election events
  • Under-replicated partition events
  • Disk space warnings

Alerts to configure

  • Consumer lag exceeds SLA threshold (for example, more than 5 minutes behind)
  • Under-replicated partitions greater than 0
  • Broker disk usage above 80%
  • Producer error rate above 1%
  • Consumer group has no active members
  • Controller is unavailable
  • Messages per second exceeds capacity threshold
  • Request latency P99 exceeds threshold

ZooKeeper and KRaft storage requirements

Kafka needs somewhere to store cluster metadata — ZooKeeper traditionally, KRaft in Kafka 3.3+.

What is stored:

  • Partition leadership and ISR membership
  • Consumer group offsets and membership
  • Access control lists (ACLs)
  • Topic configurations
  • Delegation token information

ZooKeeper/KRaft storage ≈
    partitions × (leadership + ISR state)
  + consumer_groups × (member offsets + metadata)
  + topics × (configs + ACLs)
  + delegation_tokens
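
The formula can be turned into a rough estimator. The per-item MB costs below are invented assumptions fitted to the sizing table, covering snapshot and log history as well as live state; they are not measured znode sizes:

```python
def estimate_metadata_mb(partitions: int, consumer_groups: int, topics: int,
                         members_per_group: int = 3,
                         delegation_tokens: int = 0) -> float:
    """Rough metadata footprint in MB following the formula above.
    Per-item costs are illustrative assumptions, not measured values."""
    return (partitions * 0.2                              # leadership + ISR state
            + consumer_groups * members_per_group * 0.1   # offsets + membership
            + topics * 0.05                               # configs + ACLs
            + delegation_tokens * 0.01)

# Medium cluster: ~1,000 partitions, 150 consumer groups, 200 topics
medium = estimate_metadata_mb(partitions=1000, consumer_groups=150, topics=200)
```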

Typical storage needs:

| Cluster size | Topics | Partitions | Consumer groups | ZooKeeper/KRaft storage |
| --- | --- | --- | --- | --- |
| Small (3 brokers) | 50 | 200 | 30 | 50-200 MB |
| Medium (6 brokers) | 200 | 1,000 | 150 | 200-500 MB |
| Large (12 brokers) | 500 | 5,000 | 500 | 1-3 GB |
| Very large (24+ brokers) | 1,000+ | 20,000+ | 2,000+ | 5-10 GB |

Storage is not the real bottleneck. Znode count is. ZooKeeper falls over when there are millions of child znodes — which happens when you have many consumer group members or partition replicas. KRaft mode removes ZooKeeper dependency and scales better.

If you are still on ZooKeeper, migrate to KRaft. Kafka 3.6+ supports live ZooKeeper-to-KRaft migration, no downtime needed.

Security checklist

  • Authentication: use SASL/PLAIN or SCRAM for client authentication; mTLS for certificate-based auth
  • Authorization: implement ACLs to restrict topic access; principle of least privilege
  • Encryption in transit: enable SSL/TLS for all broker and client connections
  • Encryption at rest: Kafka has no built-in at-rest encryption; use disk or filesystem encryption, or encrypt sensitive fields client-side before producing
  • Schema validation: validate message schemas with Confluent Schema Registry
  • Data sanitization: sanitize message keys and values to prevent injection
  • Audit logging: capture admin operations via authorizer logs (Apache Kafka has no built-in audit log; Confluent and other platforms add one)
  • Network segmentation: place brokers in private networks; restrict inter-broker communication

Common pitfalls and anti-patterns

Pitfall 1: too many partitions

Each partition increases Kafka’s overhead (file handles, memory, leader elections). Creating thousands of partitions when you only need dozens causes unnecessary complexity. Start with fewer partitions and increase only when needed.

Pitfall 2: ignoring consumer lag

If consumers cannot keep up with producers, lag grows indefinitely. Monitor lag continuously and scale consumers or optimize processing logic.

Pitfall 3: not using compression

Uncompressed messages waste network bandwidth and storage. Enable compression (LZ4, ZSTD, or Snappy) on producers.

Pitfall 4: auto.offset.reset = earliest without understanding consequences

This setting replays all messages from the beginning of the log. In production with large retention, this can overwhelm consumers. Use it deliberately, not as a default.

Pitfall 5: sending sensitive data unencrypted

Kafka does not encrypt messages by default. If you send PII, credentials, or sensitive data, enable encryption or do not send such data through Kafka.

Pitfall 6: not planning for partition count

Partition count can only grow after topic creation, never shrink, and adding partitions remaps keys to different partitions, breaking per-key ordering for existing data. If you need a clean remapping, you must recreate the topic. Plan partition count based on expected throughput and consumer parallelism requirements.

Quick recap

Key points

  • Kafka is a distributed log, not a queue; messages are retained and can be replayed
  • Topics divide into partitions for parallelism; same key goes to same partition
  • Consumer groups enable independent consumption by different services
  • At-least-once delivery is the default; exactly-once requires Kafka transactions
  • Rebalancing occurs when consumers join or leave; frequent rebalances hurt throughput
  • ZooKeeper (or KRaft in newer versions) manages cluster metadata

Pre-deployment checklist

- [ ] Replication factor set to 3 for critical topics
- [ ] min.insync.replicas configured appropriately
- [ ] Consumer lag monitoring configured and alerts set
- [ ] Producer retries and idempotency configured
- [ ] Compression enabled on producers
- [ ] Schema Registry deployed for schema validation
- [ ] ACLs configured for topic access control
- [ ] TLS/SSL encryption enabled for all connections
- [ ] Partition count planned based on throughput requirements
- [ ] Retention policy configured (hours, days, weeks)
- [ ] Dead letter topic configured for failed messages
- [ ] Consumer group offset reset policy defined
- [ ] Backup and disaster recovery plan documented

Conclusion

Kafka is a distributed log that happens to support messaging patterns. Topics are logs, partitions are shards, consumer groups enable independent consumption. The offset-based consumption model gives you replay capability that traditional queues lack.

For high-throughput event streaming, Kafka delivers. The operational complexity is real (ZooKeeper or KRaft for metadata, partition management, replication), but the scalability and durability are legitimate.

If you need true exactly-once, Kafka transactions are available but costly. For most pipelines, at-least-once with idempotent consumers is simpler and sufficient.
