
Apache Kafka ZooKeeper

Apache Kafka is a distributed event streaming platform used to build real-time data pipelines and streaming applications. A Kafka cluster needs a coordination service to keep track of its servers and their shared metadata. In ZooKeeper-based deployments, Apache ZooKeeper fills that role.

Role of ZooKeeper in Kafka

ZooKeeper acts as the coordinator of a Kafka cluster. It keeps track of the Kafka servers (called brokers), stores configuration information, and elects the controller, which is the broker that in turn assigns a leader to each topic partition. This keeps the cluster consistent and lets it recover when a broker fails.

  • Tracks which brokers are registered and alive in the cluster.
  • Records which broker is the leader for each topic partition.
  • Stores metadata about topics and partitions (and, in older Kafka versions, consumer group offsets).
  • Notifies brokers of changes in the cluster.
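
The roles listed above can be observed directly, because ZooKeeper stores this information in a small tree of nodes. The following is a minimal sketch, assuming it is run from the Kafka installation directory with ZooKeeper listening on localhost:2181. The helper names zk_inspect and zk_overview and the variables ZK_ADDR and KAFKA_BIN are hypothetical and exist only for this illustration; zookeeper-shell.sh is the script shipped with ZooKeeper-based Kafka distributions.

```shell
# Assumptions: ZooKeeper runs on localhost:2181 and the Kafka scripts live in ./bin.
ZK_ADDR="${ZK_ADDR:-localhost:2181}"
KAFKA_BIN="${KAFKA_BIN:-bin}"

# Hypothetical helper: run a single ZooKeeper shell command against $ZK_ADDR.
zk_inspect() {
  "$KAFKA_BIN/zookeeper-shell.sh" "$ZK_ADDR" "$@"
}

# Hypothetical helper: print the cluster state that Kafka keeps in ZooKeeper.
zk_overview() {
  zk_inspect ls /brokers/ids      # IDs of the brokers currently registered
  zk_inspect get /controller      # which broker is currently the controller
  zk_inspect ls /brokers/topics   # topics whose metadata is stored
}
```

After pasting these definitions into a shell, running zk_overview prints the contents of each node, usually alongside a few connection log lines from the ZooKeeper client.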

Simple Kafka-ZooKeeper Example

Suppose you want to create a Kafka topic called orders. The controller assigns a leader broker to each partition of the topic, and that assignment is recorded in ZooKeeper. Producers and consumers then learn from the brokers which one to send data to or read data from.

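# Start ZooKeeper first (it runs in the foreground, so use its own terminal)
bin/zookeeper-server-start.sh config/zookeeper.properties

# Start a Kafka broker in a second terminal, once ZooKeeper is up
bin/kafka-server-start.sh config/server.properties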

# Create a topic named "orders"
bin/kafka-topics.sh --create --topic orders --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
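
Once the topic exists, its leader assignment can be checked with the --describe option of the same kafka-topics.sh script. This is a minimal sketch under the same assumptions as the commands above (broker on localhost:9092, scripts in ./bin). The helper name describe_topic and the variables KAFKA_BIN and BOOTSTRAP are hypothetical.

```shell
# Assumptions: a broker listens on localhost:9092 and the Kafka scripts live in ./bin.
KAFKA_BIN="${KAFKA_BIN:-bin}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Hypothetical helper: show partitions, leaders, and replicas for a topic.
# The topic name defaults to "orders" when no argument is given.
describe_topic() {
  "$KAFKA_BIN/kafka-topics.sh" --describe \
    --topic "${1:-orders}" \
    --bootstrap-server "$BOOTSTRAP"
}
```

Calling describe_topic orders lists each of the three partitions together with its leader. With a single broker, every partition reports that same broker as its leader.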

Summary

In simple terms, ZooKeeper is the backbone that keeps a ZooKeeper-based Kafka cluster organized and running smoothly. It acts as a coordinator: it registers brokers, elects the controller, and holds cluster metadata. Newer Kafka versions replace ZooKeeper with a built-in mechanism called KRaft, but the coordination tasks described here still have to be done in any distributed Kafka deployment.