Apache Kafka Part 5: KRaft - The Future of Kafka Architecture

Welcome to the final part of our Apache Kafka series! Today we’ll explore KRaft (Kafka Raft) - the game-changing consensus protocol that represents the future of Kafka architecture.

This is Part 5 of our comprehensive Kafka series. Make sure you’ve read the previous parts: Part 1 (Introduction), Part 2 (Building Blocks), Part 3 (Development Tools), and Part 4 (Administration).

What is KRaft?

KRaft (Kafka Raft Consensus Protocol) is Kafka’s native consensus mechanism that replaces Apache ZooKeeper for cluster coordination. Introduced in Kafka 2.8.0, KRaft represents a fundamental architectural shift toward a more streamlined, self-contained system.

The Evolution: From ZooKeeper to KRaft

Traditional Kafka Architecture (with ZooKeeper):
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              ZooKeeper Cluster                  β”‚
β”‚         (Manages Metadata & Coordination)       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚         β”‚         β”‚
        β–Ό         β–Ό         β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚Broker 1 β”‚ β”‚Broker 2 β”‚ β”‚Broker 3 β”‚
   β”‚         β”‚ β”‚         β”‚ β”‚         β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Modern Kafka Architecture (with KRaft):
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚           KRaft Controller Quorum               β”‚
β”‚    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”         β”‚
β”‚    β”‚Ctrl 1   β”‚β—„β”‚Ctrl 2   β”‚β–Ίβ”‚Ctrl 3   β”‚         β”‚
β”‚    β”‚(Leader) β”‚ β”‚         β”‚ β”‚         β”‚         β”‚
β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚ (Metadata Distribution)
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚         β”‚         β”‚
        β–Ό         β–Ό         β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚Broker 1 β”‚ β”‚Broker 2 β”‚ β”‚Broker 3 β”‚
   β”‚         β”‚ β”‚         β”‚ β”‚         β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Why Replace ZooKeeper?

Challenges with ZooKeeper

  1. Operational Complexity: Managing two separate systems (Kafka + ZooKeeper)
  2. Scaling Limitations: ZooKeeper becomes a bottleneck at scale
  3. Metadata Propagation: Inefficient broadcast model
  4. Resource Overhead: Additional infrastructure requirements

KRaft Advantages

Simplified Architecture: KRaft eliminates external dependencies, making Kafka truly self-contained and easier to operate.

  • Simpler Deployment: Single system to manage and monitor
  • Improved Scalability: Better handling of large clusters
  • Faster Metadata Operations: More efficient consensus mechanism
  • Right-sized Clusters: No need for separate ZooKeeper ensemble
  • Faster Recovery: Quicker failover and startup times

How KRaft Works

The Raft Consensus Algorithm

KRaft implements the Raft consensus algorithm, a well-understood distributed consensus protocol that ensures:

  • Leader Election: Automatic selection of a single leader
  • Log Replication: Consistent state across all nodes
  • Safety: Strong consistency guarantees

KRaft Architecture Components

# KRaft cluster roles
Controller Nodes: Manage cluster metadata and consensus
Broker Nodes: Handle client requests and data storage
Combined Nodes: Act as both controller and broker (for smaller deployments)

Metadata Management

KRaft stores all cluster metadata in a special internal topic:

# The metadata topic
__cluster_metadata

# What it contains:
- Cluster membership information
- Controller election state
- Topic configurations (partitions, replicas)
- Access Control Lists (ACLs)
- Quota configurations
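
One way to peek at these records is the stock kafka-dump-log tool with its cluster-metadata decoder; a sketch, assuming a default log directory (the segment filename will differ on your machine):

```shell
# Decode the records in the first __cluster_metadata segment
kafka-dump-log.sh --cluster-metadata-decoder \
  --files /var/kafka-logs/__cluster_metadata-0/00000000000000000000.log
```

Because the metadata "topic" is an ordinary Kafka log, the same tooling you already know for data logs works for cluster state.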

KRaft Architecture Deep Dive

Controller Quorum

# Example KRaft controller configuration
process.roles=controller
node.id=1
controller.quorum.voters=1@localhost:9093,2@localhost:9094,3@localhost:9095
listeners=CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
log.dirs=/var/kafka-logs

Metadata Synchronization

Instead of ZooKeeper’s broadcast model, KRaft uses a pull-based approach:

  1. Active Controller: Leader of the metadata partition
  2. Follower Controllers: Replica followers of metadata
  3. Brokers: Replica observers that fetch metadata changes

KRaft Metadata Synchronization Flow:

Active Controller    Follower Controller    Broker
       β”‚                     β”‚               β”‚
       β”‚ 1. Append metadata  β”‚               β”‚
       β”‚    change to log    β”‚               β”‚
       β”‚                     β”‚               β”‚
       │◄────────────────────│               β”‚
       β”‚ 2. Fetch latest     β”‚               β”‚
       β”‚    metadata         β”‚               β”‚
       │────────────────────►│               β”‚
       β”‚ 3. Metadata updates β”‚               β”‚
       β”‚                     β”‚               β”‚
       │◄────────────────────┼───────────────│
       β”‚ 4. Fetch latest     β”‚               β”‚
       β”‚    metadata         β”‚               β”‚
       │─────────────────────┼──────────────►│
       β”‚ 5. Metadata updates β”‚               β”‚
       β”‚                     β”‚               β”‚

Benefits of Pull-Based Model

  • Faster Restarts: Brokers load entire metadata cache on demand
  • Better Synchronization: All nodes stay in sync automatically
  • Reduced Network Traffic: Efficient metadata distribution

Setting Up KRaft

Controller Configuration

# kraft-controller.properties
process.roles=controller
node.id=1
controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
listeners=CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
log.dirs=/var/kafka-controller-logs
metadata.log.dir=/var/kafka-controller-logs

Broker Configuration

# kraft-broker.properties
process.roles=broker
node.id=101
controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
listeners=PLAINTEXT://localhost:9092
log.dirs=/var/kafka-broker-logs

Combined Node Configuration

# kraft-combined.properties (for smaller deployments)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093,2@localhost:9094,3@localhost:9095
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
log.dirs=/var/kafka-logs
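
Unlike ZooKeeper mode, a KRaft node's log directories must be formatted with a cluster ID before first start. A minimal first-boot sequence for the combined node above:

```shell
# One-time step: generate a cluster ID and format storage
KAFKA_CLUSTER_ID="$(kafka-storage.sh random-uuid)"
kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c kraft-combined.properties

# Start the combined broker/controller
kafka-server-start.sh kraft-combined.properties
```

Every node in the same cluster must be formatted with the same cluster ID; a broker whose storage carries a different ID will refuse to join.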

Migration from ZooKeeper to KRaft

Migration Process Overview

Production Migration: Migrating from ZooKeeper to KRaft in production requires careful planning and testing. Always test in a staging environment first.

# Step 1: Format controller storage with the cluster ID
kafka-storage.sh format -t <cluster-id> -c kraft-controller.properties

# Step 2: Start the KRaft controllers (with migration enabled, they
# copy the existing metadata out of ZooKeeper)
kafka-server-start.sh kraft-controller.properties

# Step 3: Restart each broker against the new controller quorum
kafka-server-start.sh kraft-broker.properties

# Step 4: Verify the migrated metadata
kafka-metadata-shell.sh --snapshot /path/to/metadata/snapshot
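
During the bridge phase of a ZooKeeper-to-KRaft migration (KIP-866), the controllers also need to know where ZooKeeper lives. A sketch of the extra controller settings, assuming a hypothetical three-node ZooKeeper ensemble:

```properties
# Added to kraft-controller.properties for the migration bridge phase
# (zk1:2181 etc. are assumed hostnames -- use your ensemble's connection string)
zookeeper.metadata.migration.enable=true
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
```

These settings are removed again once the migration is finalized and the cluster runs in pure KRaft mode.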

Migration Considerations

  1. Cluster ID: Generate and maintain consistent cluster ID
  2. Metadata Export: Export existing ZooKeeper metadata
  3. Rolling Migration: Gradual transition of brokers
  4. Validation: Verify metadata consistency post-migration
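
For the first consideration above, a migrating quorum must reuse the cluster ID the ZooKeeper-based cluster already has (stored at the /cluster/id znode), rather than generating a new one. A sketch with the stock CLIs, assuming ZooKeeper on localhost:

```shell
# Migration: read the existing cluster ID from ZooKeeper
# (returns JSON like {"version":"1","id":"<uuid>"})
zookeeper-shell.sh localhost:2181 get /cluster/id

# Brand-new (non-migration) cluster: generate a fresh ID instead
kafka-storage.sh random-uuid
```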

Monitoring KRaft Clusters

KRaft-Specific Metrics

# Controller metrics
kafka.controller:type=KafkaController,name=ActiveControllerCount
kafka.controller:type=ControllerStats,name=LeaderElectionRateAndTimeMs

# Metadata metrics
kafka.server:type=metadata-log,name=NumRecordsInLog
kafka.server:type=metadata-log,name=CommittedOffset

Health Checks

# Inspect controller state from the metadata log
kafka-metadata-shell.sh --snapshot /var/kafka-logs/__cluster_metadata-0/00000000000000000000.log

# Verify quorum health (available since Kafka 3.3)
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status

Performance Improvements

Startup Time Comparison

# Traditional Kafka with ZooKeeper
Startup Time: 30-60 seconds (depending on metadata size)

# KRaft Mode
Startup Time: 5-15 seconds (faster metadata loading)

Scalability Improvements

  • Larger Clusters: Support for 100,000+ partitions
  • Faster Metadata Operations: Reduced latency for topic operations
  • Better Resource Utilization: No ZooKeeper overhead

Production Readiness

KRaft Maturity Timeline

Kafka 2.8.0 (April 2021): Early access
Kafka 3.0.0 (September 2021): Preview, still maturing
Kafka 3.3.0 (October 2022): Declared production-ready for new clusters (KIP-833)
Kafka 3.5.0 (June 2023): ZooKeeper mode deprecated

Current Limitations (as of Kafka 3.6)

Migration Path: While KRaft is production-ready, some advanced features are still being migrated from the ZooKeeper implementation.

  • JBOD (Just a Bunch of Disks): Limited support
  • Delegation Tokens: Not yet supported
  • Some Admin Operations: Still being ported

Best Practices for KRaft

Controller Deployment

# Recommended controller setup
- Use dedicated controller nodes for large clusters
- Deploy controllers across different availability zones
- Use odd number of controllers (3 or 5)
- Ensure fast, reliable network between controllers
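
The odd-number rule follows from Raft's majority requirement: n voters need floor(n/2)+1 votes, so an even-sized quorum tolerates no more failures than the next smaller odd one. A quick check with shell arithmetic:

```shell
# Failures tolerated by an n-voter Raft quorum: n - (floor(n/2) + 1)
for n in 1 2 3 4 5; do
  majority=$(( n / 2 + 1 ))
  tolerated=$(( n - majority ))
  echo "$n voters: majority $majority, tolerates $tolerated failure(s)"
done
```

Three and four voters both tolerate a single failure, so the fourth node adds cost without adding resilience, which is why 3 or 5 controllers is the standard recommendation.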

Resource Planning

# Controller resource requirements
CPU: 2-4 cores per controller
Memory: 4-8 GB heap size
Storage: Fast SSD for metadata logs
Network: Low-latency, high-bandwidth

Security Configuration

# KRaft controller security settings (a sketch; adjust mechanisms and paths to your setup)
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:SASL_SSL
sasl.mechanism.controller.protocol=PLAIN
ssl.keystore.location=/path/to/controller.keystore.jks

The Future of Kafka

Roadmap Highlights

  • Complete ZooKeeper Removal: Full feature parity achieved
  • Enhanced Scalability: Support for even larger clusters
  • Improved Operations: Better tooling and monitoring
  • Cloud-Native Features: Better Kubernetes integration

Industry Impact

KRaft positions Kafka as:

  • Simpler to Operate: Reduced operational complexity
  • More Scalable: Better performance at scale
  • Cloud-Ready: Easier containerization and orchestration
  • Future-Proof: Modern architecture for next-generation workloads

Series Conclusion

Throughout this 5-part series, we’ve explored:

  1. Kafka Fundamentals: Event streaming concepts and motivation
  2. Core Building Blocks: Topics, partitions, producers, and consumers
  3. Development Tools: APIs and frameworks for building applications
  4. Administration: Monitoring, security, and operational excellence
  5. KRaft Architecture: The future of Kafka without ZooKeeper

Key Takeaways

  • KRaft Simplifies Operations: Eliminates ZooKeeper dependency
  • Better Performance: Faster startup and metadata operations
  • Production Ready: Suitable for enterprise deployments
  • Future Direction: Represents Kafka’s architectural evolution
  • Migration Path: Gradual transition from ZooKeeper is possible

The Future is KRaft: ZooKeeper mode is already deprecated (since Kafka 3.5) and slated for removal, so adopting KRaft positions your Kafka infrastructure for long-term success.

Apache Kafka with KRaft represents the culmination of years of architectural evolution, delivering a more robust, scalable, and operationally simple event streaming platform. Whether you’re starting fresh or planning a migration, KRaft is the foundation for your event-driven future.
