Kafka Offset

In Apache Kafka, an offset is a unique number that identifies the position of a message within a partition. Think of it like a page number in a book. Each message in a Kafka partition gets a sequential number, starting from 0.

Offsets help Kafka consumers know:

  • Which messages have already been read
  • Which message should be read next
  • How to resume reading after a restart or failure

In simple words, an offset marks a message's position in a partition, and the offset a consumer stores tracks how far it has read.
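To make the numbering concrete, here is a minimal sketch of how sequential offsets could be assigned. `PartitionLog` is a hypothetical in-memory stand-in for one partition, written only for illustration; it is not part of any Kafka client API.

```python
class PartitionLog:
    """Toy in-memory stand-in for a single partition (not a Kafka API)."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Store a message and return the offset assigned to it."""
        offset = len(self._messages)  # next sequential position, starting at 0
        self._messages.append(message)
        return offset

    def read(self, offset):
        """Return the message stored at the given offset."""
        return self._messages[offset]


log = PartitionLog()
first = log.append("a")
second = log.append("b")
print(first, second)  # 0 1
print(log.read(1))    # b
```

The offset is simply the message's index in the append-only log, which is why it works like a page number.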


Key Features of Kafka Offset

  • Unique per Partition: Offsets are unique only within a partition, not across the entire topic.
  • Sequential Order: Kafka assigns offsets in increasing order (0, 1, 2, 3, and so on).
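  • Consumer Controlled: The broker assigns an offset to each message when it is written, but each consumer tracks and commits its own read position. Producers play no part in this.
  • Supports Replay: Consumers can reset offsets to re-read old messages, as long as those messages are still within the topic's retention.
  • Fault Tolerant: Kafka stores committed offsets (in the internal __consumer_offsets topic), allowing consumers to resume after crashes.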
  • Independent Consumption: Different consumer groups can read the same messages using their own offsets.
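Several of the features above can be illustrated with a small simulation. The sketch below assumes a toy topic held in plain Python lists; the group names `billing` and `analytics` and the helpers `produce` and `poll` are hypothetical and only mimic the behaviour described, not the real client API.

```python
# Toy model: a topic with two partitions, each a plain list.
topic = {0: [], 1: []}


def produce(partition, message):
    """Append to one partition; the offset is the index within that partition only."""
    topic[partition].append(message)
    return len(topic[partition]) - 1


offsets_assigned = [produce(0, "a"), produce(1, "b"), produce(0, "c")]
print(offsets_assigned)  # [0, 0, 1] -> offset 0 exists in both partitions

# Each (group, partition) pair keeps its own read position.
positions = {("billing", 0): 0, ("analytics", 0): 0}


def poll(group, partition):
    """Return the next unread message for this group, or None if caught up."""
    pos = positions[(group, partition)]
    if pos >= len(topic[partition]):
        return None
    positions[(group, partition)] = pos + 1
    return topic[partition][pos]


billing_reads = [poll("billing", 0), poll("billing", 0)]  # ["a", "c"]
analytics_read = poll("analytics", 0)                      # "a", unaffected by billing

# Replay: rewind the billing group to offset 0 and read again.
positions[("billing", 0)] = 0
replayed = poll("billing", 0)                              # "a" once more
```

Because positions are keyed by group, one group's progress or rewind never changes what another group sees.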

Simple Example of Kafka Offset

Let’s understand Kafka offset with a simple example.

Scenario

  • Topic name: order-topic
  • Partition: Partition 0

Messages in the Partition

Offset   Message
0        Order Created
1        Payment Received
2        Order Packed
3        Order Shipped

How Consumer Uses Offset

If a consumer has successfully processed messages up to offset 1, it means:

  • Messages with offset 0 and 1 are already consumed
  • The next message to read will be offset 2

If the consumer crashes and restarts after committing this progress, it resumes from its last committed position (offset 2 here) instead of starting from the beginning.
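The scenario above can be replayed in a few lines. This is a sketch under simplifying assumptions: the partition is a Python list, the `committed` dictionary stands in for Kafka's durable offset storage, and the group name `order-service` and the function `run_consumer` are hypothetical.

```python
partition_0 = ["Order Created", "Payment Received", "Order Packed", "Order Shipped"]

# Stand-in for the committed-offset store; it survives the consumer "crash".
committed = {}


def run_consumer(group, max_messages):
    """Process up to max_messages, starting from the group's committed position."""
    position = committed.get(group, 0)
    processed = []
    for _ in range(max_messages):
        if position >= len(partition_0):
            break
        processed.append((position, partition_0[position]))
        position += 1
        committed[group] = position  # stored value = next offset to read
    return processed


before_crash = run_consumer("order-service", 2)
print(before_crash)                 # [(0, 'Order Created'), (1, 'Payment Received')]
print(committed["order-service"])   # 2

# Simulated restart: a fresh run picks up at the committed position.
after_restart = run_consumer("order-service", 10)
print(after_restart)                # [(2, 'Order Packed'), (3, 'Order Shipped')]
```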


Kafka Offset Commit

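When a consumer finishes processing messages, it commits the offset. This tells Kafka, “I have safely processed messages up to this point.” By convention, the committed value is the offset of the next message to read, that is, the last processed offset plus one.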

Offsets can be committed in two ways:

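  • Automatic Commit: The consumer client commits offsets in the background at a fixed interval (enabled with enable.auto.commit, with the interval set by auto.commit.interval.ms)
  • Manual Commit: The application turns auto commit off and explicitly commits offsets after processing

Manual commit gives finer control over exactly when progress is recorded, so it is commonly used in production systems.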
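The idea behind manual commit can be sketched as a loop that records progress only after a message has been handled successfully. This is a toy simulation rather than real client code: `consume_with_manual_commit`, `handler`, and the message names are all hypothetical, and the returned integer stands in for the committed offset.

```python
def consume_with_manual_commit(messages, committed_offset, handler):
    """Process messages from committed_offset onward.

    Progress is recorded only after handler succeeds. Returns the new
    committed offset, i.e. the next offset to read.
    """
    position = committed_offset
    while position < len(messages):
        try:
            handler(messages[position])
        except RuntimeError:
            break      # stop without committing the failed message
        position += 1  # "commit": advance only after successful processing
    return position


messages = ["m0", "m1", "m2", "m3"]
processed = []
failures = {"m2": 1}  # m2 fails once, then succeeds


def handler(message):
    if failures.get(message, 0) > 0:
        failures[message] -= 1
        raise RuntimeError("temporary failure")
    processed.append(message)


first_attempt = consume_with_manual_commit(messages, 0, handler)
print(first_attempt)   # 2 -> m2 was not committed, so it will be retried

second_attempt = consume_with_manual_commit(messages, first_attempt, handler)
print(second_attempt)  # 4
print(processed)       # ['m0', 'm1', 'm2', 'm3']
```

Since the failed message is never committed, the retry starts exactly where processing stopped.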


Why Kafka Offset is Important

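  • Limits message loss: when offsets are committed only after processing, a crash does not skip unprocessed messages
  • Reduces duplicate processing: a restarted consumer resumes from its committed offset rather than from the beginning (a crash between processing and commit can still cause a few messages to be re-read)
  • Enables reliable and scalable data processing
  • Supports consumer recovery and reprocessing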
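How much loss or duplication actually occurs depends on whether the offset is committed before or after a message is processed. The simulation below is a sketch of that trade-off, assuming a crash that hits between the two steps for one message; `run_until_crash` and `restart` are hypothetical helpers, and offsets stand in for the messages themselves.

```python
def run_until_crash(n_messages, crash_at, commit_first):
    """Simulate a consumer that crashes between its two steps at offset crash_at.

    Returns (processed_offsets, committed_offset).
    """
    processed, committed = [], 0
    for offset in range(n_messages):
        if commit_first:
            committed = offset + 1    # step 1: commit
            if offset == crash_at:
                break                 # crash before processing
            processed.append(offset)  # step 2: process
        else:
            processed.append(offset)  # step 1: process
            if offset == crash_at:
                break                 # crash before committing
            committed = offset + 1    # step 2: commit
    return processed, committed


def restart(n_messages, committed):
    """After a restart, the consumer processes everything from the committed offset."""
    return list(range(committed, n_messages))


# Commit before processing: offset 2 is never processed (message loss).
p1, c1 = run_until_crash(4, crash_at=2, commit_first=True)
commit_first_total = p1 + restart(4, c1)
print(commit_first_total)   # [0, 1, 3]

# Commit after processing: offset 2 is processed twice (duplicate).
p2, c2 = run_until_crash(4, crash_at=2, commit_first=False)
process_first_total = p2 + restart(4, c2)
print(process_first_total)  # [0, 1, 2, 2, 3]
```

Committing after processing is the usual choice when losing a message is worse than handling one twice; the handler then needs to tolerate repeats.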

Summary

A Kafka offset is a number that represents the position of a message in a partition. It helps consumers track what they have already read and what comes next. Offsets make Kafka reliable, fault-tolerant, and flexible by allowing message replay and safe recovery after failures.

In short, Kafka offsets are the backbone of message tracking in Kafka.