Apache Kafka - Achieving Exactly-Once Semantics (EOS)

November 2025 | Tags: Kafka, Distributed System, Event-driven

In distributed systems, message delivery guarantees are one of the most critical design considerations. Every pipeline must answer a deceptively simple question: how many times will this message be processed?

Traditionally, systems have offered two guarantees:

  • At-most-once delivery: Messages are processed zero or one time. This avoids duplicates but risks data loss if failures occur.
  • At-least-once delivery: Messages are processed one or more times. This ensures reliability but introduces duplicates that downstream systems must handle.

In domains like payments, fraud detection, and compliance auditing, duplicates or lost events can have severe consequences.

Exactly-once Semantics (EOS)

Exactly-once semantics ensures that each record is processed once and only once, even in the face of retries, crashes, or rebalances. Achieving this guarantee in a distributed, high-throughput system is notoriously difficult.

Apache Kafka introduced native support for exactly-once semantics through idempotent producers, transactional writes, and atomic offset commits, making it possible to build pipelines that are both reliable and correct.

In this blog, we’ll explore:

  • Why exactly-once semantics is hard to achieve in distributed systems
  • How Kafka implements EOS under the hood
  • The trade-offs between correctness and performance

Distributed Systems and their Unreliability

Distributed systems are inherently unreliable: networks drop packets, brokers crash, consumers restart, and retries happen. Because of this, message delivery guarantees are framed in three categories:

1. At-most-once delivery

Each message is attempted once and never retried. For example, if a consumer commits its offset and then crashes before processing completes, the message is skipped on restart and is lost.

2. At-least-once delivery

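Messages are retried until they are acknowledged, so nothing is lost. The cost is duplicates: a retry after a lost acknowledgment, or a crash before the offset commit, means the same message can be processed more than once.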

3. Exactly-once delivery

This is the “holy grail” of messaging guarantees. Each message is delivered once and only once, even in the presence of retries, crashes, or rebalances.

Why Exactly-Once Is Hard

Several factors make exactly-once delivery difficult to achieve:

  • A producer retry after a timeout can cause the same message to be appended twice.
  • If a consumer commits offsets before processing, crashes can cause data loss. If it commits after processing, retries can cause duplicates.
  • Broker crashes, network partitions, or consumer rebalances can break the guarantee.
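
The commit-ordering dilemma in the second point can be illustrated with a small simulation. This is a toy model, not Kafka client code: the `consume` function, the record names, and the single simulated crash are all assumptions made for illustration.

```python
def consume(records, commit_first, crash_at):
    """Process `records` in order, crashing once at offset `crash_at`.

    After the crash the consumer restarts from the last committed offset.
    Returns the list of side effects that actually happened.
    """
    committed = 0        # next offset to read after a restart
    side_effects = []    # what downstream systems observed
    crashed = False
    offset = 0
    while offset < len(records):
        crash_now = (offset == crash_at) and not crashed
        if commit_first:
            committed = offset + 1                  # commit, then process
            if crash_now:
                crashed, offset = True, committed   # restart skips the record
                continue
            side_effects.append(records[offset])
        else:
            side_effects.append(records[offset])    # process, then commit
            if crash_now:
                crashed, offset = True, committed   # restart replays the record
                continue
            committed = offset + 1
        offset += 1
    return side_effects


records = ["r0", "r1", "r2", "r3", "r4"]
print(consume(records, commit_first=True, crash_at=2))   # ['r0', 'r1', 'r3', 'r4']
print(consume(records, commit_first=False, crash_at=2))  # ['r0', 'r1', 'r2', 'r2', 'r3', 'r4']
```

Committing first loses `r2`; committing last processes it twice. Neither ordering alone gives exactly-once behavior.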

How Kafka implements EOS under the hood

Achieving exactly-once semantics in Kafka relies on three key features working together: idempotent producers, transactions, and atomic offset commits.

Idempotent Producers

Retries are inevitable in distributed systems, but they often cause duplicates. Kafka solves this with idempotent producers: each producer gets a unique ID, and every message carries a sequence number. Brokers track these numbers and discard duplicates, ensuring that even if a producer retries, only one copy is stored.
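
To make the deduplication idea concrete, here is a minimal sketch of broker-side bookkeeping for a single partition. It is a simplified model under stated assumptions, not Kafka's implementation: the `PartitionLog` class is hypothetical, it tracks one sequence number per message rather than per batch, and it ignores out-of-order sequence numbers, which a real broker would reject.

```python
class PartitionLog:
    """Toy single-partition log that drops retried writes."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer_id -> highest sequence number stored

    def append(self, producer_id, seq, value):
        """Store the record unless this sequence number was already written.

        Returns True if the record was stored, False if it was a duplicate.
        """
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # duplicate from a retry: acknowledge, do not store
        self.last_seq[producer_id] = seq
        self.records.append(value)
        return True


log = PartitionLog()
log.append(7, 0, "debit:42")
log.append(7, 0, "debit:42")   # retry after a lost acknowledgment
log.append(7, 1, "credit:42")
print(log.records)  # ['debit:42', 'credit:42']
```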

    enable.idempotence=true

Transactions

Idempotence prevents duplicates, but not partial updates across partitions. Kafka transactions allow producers to group multiple writes together and commit or abort them as a unit. This guarantees atomicity, so consumers never see half-completed operations.

Example scenario: Consider a payment pipeline where a debit event must be written to one topic and a credit event to another. If one succeeds and the other fails, the system is left in an inconsistent state.
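
The all-or-nothing behavior in this scenario can be sketched with a toy model. Everything here is hypothetical (the `ToyTransaction` class, the `debits` and `credits` topic names, the account names). It also simplifies one detail: real Kafka appends transactional records to the log immediately and uses commit or abort markers to control visibility, whereas this sketch simply buffers writes until commit.

```python
class ToyTransaction:
    """Buffers writes to several topics and publishes all of them or none."""

    def __init__(self, topics):
        self.topics = topics   # dict: topic name -> list of visible records
        self.pending = []      # writes not yet visible to readers

    def send(self, topic, value):
        self.pending.append((topic, value))

    def commit(self):
        for topic, value in self.pending:
            self.topics[topic].append(value)
        self.pending = []

    def abort(self):
        self.pending = []      # discard everything written in this transaction


def transfer(topics, amount, fail_before_commit=False):
    """Write a debit and a credit as one unit; return True if committed."""
    tx = ToyTransaction(topics)
    try:
        tx.send("debits", ("alice", -amount))
        tx.send("credits", ("bob", amount))
        if fail_before_commit:
            raise RuntimeError("simulated crash")
        tx.commit()
        return True
    except RuntimeError:
        tx.abort()
        return False


topics = {"debits": [], "credits": []}
transfer(topics, 10, fail_before_commit=True)   # neither topic changes
transfer(topics, 10)                            # both topics get one record
print(topics)
```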

    # unique per producer instance, stable across restarts
    transactional.id=tx-1

Atomic Offset Commits

Consumers also play a role in achieving exactly-once semantics. A consumer that commits offsets separately from its output writes risks double-processing, because a crash between the two steps leaves them out of sync. Kafka resolves this by letting the producer commit the consumed offsets as part of the same transaction as its writes, so both succeed or fail together. Consumers configured with ‘read_committed’ isolation only see transactional records once their transaction has committed, so aborted or in-flight writes never reach downstream readers.
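
The visibility rule can be sketched as a filter over a toy log. This is a simplified model, not the broker's actual logic: the tuple-based log format and the `read_committed` function are assumptions for illustration. It does capture one real behavior, namely that a reader cannot move past the first record of a still-open transaction.

```python
def read_committed(log):
    """Return the values a read_committed consumer could see in this toy log.

    `log` holds ("data", txn_id, value) entries and ("commit", txn_id) or
    ("abort", txn_id) markers.
    """
    outcome = {e[1]: e[0] for e in log if e[0] in ("commit", "abort")}
    visible = []
    for entry in log:
        if entry[0] != "data":
            continue                 # markers are never delivered to the application
        _, txn_id, value = entry
        if txn_id not in outcome:
            break                    # open transaction: stop reading here
        if outcome[txn_id] == "commit":
            visible.append(value)    # aborted records are skipped
    return visible


log = [
    ("data", "t1", "a"),
    ("data", "t2", "b"),
    ("commit", "t1"),
    ("abort", "t2"),
    ("data", "t3", "c"),             # t3 is still open
]
print(read_committed(log))           # ['a']
log.append(("commit", "t3"))
print(read_committed(log))           # ['a', 'c']
```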

    # Consumer-side setting
    isolation.level=read_committed

Entire Workflow

Producer-side

  1. Producer sends messages with idempotence enabled.
  2. Messages are grouped into a transaction.
  3. Offsets are committed atomically with the transaction.

Consumer-side

  4. Consumers read only committed messages.
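
The steps above can be sketched as a single consume-transform-produce function. The method names follow our understanding of the confluent-kafka Python client (`begin_transaction`, `send_offsets_to_transaction`, `commit_transaction`, and so on), so check them against the client version you use. The function name `process_one`, the `output_topic` argument, and the `transform` callback are hypothetical. The sketch assumes the caller has already created a producer with `transactional.id` set, called `init_transactions()` once at startup, and created a consumer with auto-commit disabled.

```python
def process_one(consumer, producer, output_topic, transform, timeout=1.0):
    """Consume one record, write its transformed value, and commit the
    consumer position in the same transaction.

    Returns True if a transaction was committed, False if nothing was read.
    """
    msg = consumer.poll(timeout)
    if msg is None or msg.error():
        return False

    producer.begin_transaction()
    try:
        # Output write and offset commit belong to one atomic unit.
        producer.produce(output_topic, value=transform(msg.value()))
        producer.send_offsets_to_transaction(
            consumer.position(consumer.assignment()),
            consumer.consumer_group_metadata(),
        )
        producer.commit_transaction()
        return True
    except Exception:
        # Discard the output and the offsets together, then surface the error.
        producer.abort_transaction()
        raise
```

Taking the consumer and producer as parameters keeps the function easy to test with stand-in objects. A production loop would also need to rewind the consumer to its last committed position after an abort, which this sketch leaves out.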

Using Kafka Streams for EOS

Kafka Streams makes exactly-once semantics simple by handling transactions, state updates, and output records under the hood. With a single configuration, it ensures that every record is processed once and only once, even when failures or retries occur. This is achieved by atomically committing state store changes and output messages together, so your stream processing logic remains consistent and correct.

    processing.guarantee=exactly_once_v2

Performance & Trade-Offs

  • Transaction overhead - Each transaction requires coordination with the broker’s transaction coordinator, adding latency.
  • Reduced throughput - EOS pipelines process fewer messages per second compared to at-least-once delivery.
  • Broker-side state - Brokers must track producer IDs, sequence numbers, and open transactions. Large or long-running transactions also hold back the last stable offset for the partitions they touch.
  • Consumer lag - With read_committed isolation, consumers wait for transactions to finalize before records are visible.
  • Correctness vs performance - EOS ensures accuracy but sacrifices raw speed.

Hybrid approach - Many architectures mix guarantees — EOS where correctness matters, at-least-once where throughput is key.

While exactly-once semantics has long been considered the “holy grail” of distributed messaging, in practice, EOS should be applied selectively. By understanding the building blocks, workflows, and pitfalls, engineers can design systems that balance reliability with efficiency — and confidently build event-driven applications that process each record once and only once.

Let's Connect

for a cup of coffee, challenges, or conversations that spark something new

dakshin.g [at] outlook [dot] com
www.dakshin.cc