Apache Kafka: Part 6 - KRaft, The Future of Kafka Architecture
Welcome to the final part of my Apache Kafka series! Today we'll explore KRaft (Kafka Raft), the game-changing consensus protocol that represents the future of Kafka architecture.
What is KRaft?
KRaft (Kafka Raft Consensus Protocol) is Kafka's native consensus mechanism that replaces Apache ZooKeeper for cluster coordination. Introduced as early access in Kafka 2.8.0, KRaft represents a fundamental architectural shift toward a more streamlined, self-contained system.
The Evolution: From ZooKeeper to KRaft
Traditional Kafka Architecture (with ZooKeeper):
┌─────────────────────────────────────────────────┐
│                ZooKeeper Cluster                │
│        (Manages Metadata & Coordination)        │
└────────────────────────┬────────────────────────┘
                         │
            ┌────────────┼────────────┐
            │            │            │
            ▼            ▼            ▼
       ┌─────────┐  ┌─────────┐  ┌─────────┐
       │Broker 1 │  │Broker 2 │  │Broker 3 │
       │         │  │         │  │         │
       └─────────┘  └─────────┘  └─────────┘
Modern Kafka Architecture (with KRaft):
┌─────────────────────────────────────────────────┐
│             KRaft Controller Quorum             │
│      ┌─────────┐  ┌─────────┐  ┌─────────┐      │
│      │Ctrl 1   │◄─┤Ctrl 2   ├─►│Ctrl 3   │      │
│      │(Leader) │  │         │  │         │      │
│      └─────────┘  └─────────┘  └─────────┘      │
└────────────────────────┬────────────────────────┘
                         │ (Metadata Distribution)
            ┌────────────┼────────────┐
            │            │            │
            ▼            ▼            ▼
       ┌─────────┐  ┌─────────┐  ┌─────────┐
       │Broker 1 │  │Broker 2 │  │Broker 3 │
       │         │  │         │  │         │
       └─────────┘  └─────────┘  └─────────┘
The key difference: with KRaft, metadata management and coordination live inside Kafka itself instead of in a separate ZooKeeper cluster.
Why Replace ZooKeeper?
Challenges with ZooKeeper
- Operational Complexity: Managing two separate systems (Kafka + ZooKeeper)
- Scaling Limitations: ZooKeeper becomes a bottleneck at scale
- Metadata Propagation: Metadata changes are broadcast to brokers, which gets inefficient as clusters grow
- Resource Overhead: Additional infrastructure requirements
Running ZooKeeper alongside Kafka means more moving parts, more things that can break, and more things to monitor.
KRaft Advantages
Simplified Architecture: KRaft eliminates external dependencies, making Kafka truly self-contained and easier to operate.
- Simpler Deployment: Single system to manage and monitor
- Improved Scalability: Better handling of large clusters
- Faster Metadata Operations: More efficient consensus mechanism
- Right-sized Clusters: No need for a separate ZooKeeper ensemble
- Faster Recovery: Quicker failover and startup times
How KRaft Works
The Raft Consensus Algorithm
KRaft implements the Raft consensus algorithm, a well-understood distributed consensus protocol that ensures:
- Leader Election: Automatic selection of a single leader
- Log Replication: Consistent state across all nodes
- Safety: Strong consistency guarantees
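To make the log replication guarantee a bit more concrete, here is a minimal Python sketch of the majority rule that Raft-style protocols rely on: a record only counts as committed once a majority of voters has stored it. The function name and the sample offsets are my own illustrative assumptions, not Kafka's actual implementation.

```python
def committed_offset(voter_end_offsets):
    """Return the highest offset that a majority of voters has replicated.

    voter_end_offsets: last replicated offset per voter (leader included).
    """
    if not voter_end_offsets:
        raise ValueError("a quorum needs at least one voter")
    majority = len(voter_end_offsets) // 2 + 1
    # Sort descending: the value at index (majority - 1) is held by at
    # least `majority` voters.
    ranked = sorted(voter_end_offsets, reverse=True)
    return ranked[majority - 1]


# Hypothetical 3-voter quorum: leader at 120, followers at 118 and 95.
print(committed_offset([120, 118, 95]))
```

In this example the slowest follower does not hold back the commit point, because two of the three voters already have everything up to offset 118.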
KRaft Architecture Components
# KRaft cluster roles
Controller Nodes: Manage cluster metadata and consensus
Broker Nodes: Handle client requests and data storage
Combined Nodes: Act as both controller and broker (for smaller deployments)
Metadata Management
KRaft stores all cluster metadata in a special internal topic:
# The metadata topic
__cluster_metadata
# What it contains:
- Cluster membership information
- Controller election state
- Topic configurations (partitions, replicas)
- Access Control Lists (ACLs)
- Quota configurations
KRaft Architecture Deep Dive
Controller Quorum
# Example KRaft controller configuration
process.roles=controller
node.id=1
controller.quorum.voters=1@localhost:9093,2@localhost:9094,3@localhost:9095
listeners=CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
log.dirs=/var/kafka-logs
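The controller.quorum.voters value is a comma-separated list of id@host:port entries. As a rough sketch (the helper name is hypothetical and this is not how Kafka parses it internally), the format can be read like this:

```python
def parse_quorum_voters(value):
    """Parse 'id@host:port,...' into {node_id: (host, port)}."""
    voters = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        node_id, _, endpoint = entry.partition("@")
        host, _, port = endpoint.rpartition(":")
        if not node_id.isdigit() or not host or not port.isdigit():
            raise ValueError(f"malformed voter entry: {entry!r}")
        if int(node_id) in voters:
            raise ValueError(f"duplicate node id: {node_id}")
        voters[int(node_id)] = (host, int(port))
    return voters


voters = parse_quorum_voters("1@localhost:9093,2@localhost:9094,3@localhost:9095")
print(voters)
```

Each node.id in the quorum has to be unique, which is why the sketch rejects duplicates.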
Metadata Synchronization
Instead of the broadcast model used with ZooKeeper, KRaft uses a pull-based approach:
- Active Controller: Leader of the metadata partition
- Follower Controllers: Replica followers of metadata
- Brokers: Replica observers that fetch metadata changes
KRaft Metadata Synchronization Flow:
Active Controller      Follower Controller        Broker
        │                       │                    │
        │ 1. Write metadata     │                    │
        │    change             │                    │
        │◄──────────────────────┤                    │
        │ 2. Fetch latest       │                    │
        │    metadata           │                    │
        ├──────────────────────►│                    │
        │ 3. Metadata updates   │                    │
        │◄──────────────────────┼────────────────────┤
        │ 4. Fetch latest       │                    │
        │    metadata           │                    │
        ├───────────────────────┼───────────────────►│
        │ 5. Metadata updates   │                    │
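Benefits of Pull-Based Model
- Faster Restarts: A restarting broker only has to fetch the metadata records it missed, not reload everything
- Better Synchronization: Every node tracks its position in the metadata log by offset, so it is always clear how far behind it is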
- Reduced Network Traffic: Efficient metadata distribution
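Here is a toy Python model of the pull-based idea, assuming a simple in-memory list as the metadata log. The class names are made up for illustration and Kafka's real fetch protocol is far more involved, but the principle is the same: each broker remembers the next offset it needs and only asks for what it has not seen yet.

```python
class MetadataLog:
    """Toy stand-in for the active controller's metadata log."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def fetch(self, from_offset):
        """Return every record at or after from_offset."""
        return self.records[from_offset:]


class Broker:
    """Observer that pulls metadata and remembers how far it has read."""

    def __init__(self):
        self.cache = []
        self.next_offset = 0

    def poll(self, log):
        new_records = log.fetch(self.next_offset)
        self.cache.extend(new_records)
        self.next_offset += len(new_records)
        return len(new_records)


log = MetadataLog()
broker = Broker()
log.append("create topic orders")
log.append("create topic payments")
first = broker.poll(log)    # broker catches up on both records
log.append("alter config orders")
second = broker.poll(log)   # only the one new record is fetched
print(first, second)
```

The second poll transfers a single record, which is the behavior behind the faster restarts and reduced traffic listed above.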
Setting Up KRaft
Controller Configuration
# kraft-controller.properties
process.roles=controller
node.id=1
controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
listeners=CONTROLLER://controller1:9093
controller.listener.names=CONTROLLER
log.dirs=/var/kafka-controller-logs
metadata.log.dir=/var/kafka-controller-logs
Broker Configuration
# kraft-broker.properties
process.roles=broker
node.id=101
controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
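listeners=PLAINTEXT://localhost:9092
# Brokers also need to know which listener name the controllers use
controller.listener.names=CONTROLLER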
log.dirs=/var/kafka-broker-logs
Combined Node Configuration
# kraft-combined.properties (for smaller deployments)
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093,2@localhost:9094,3@localhost:9095
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
log.dirs=/var/kafka-logs
For smaller setups or development environments, combined nodes work great. For production, you'll probably want dedicated controllers.
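Before starting a node, it can help to sanity-check its properties file. The sketch below is a hypothetical helper of my own, not a Kafka tool, and it covers only a handful of rules. It also assumes the static controller.quorum.voters style used in this post.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def check_kraft_config(props):
    """Return a list of problems found (an empty list means none found)."""
    problems = []
    roles = {r.strip() for r in props.get("process.roles", "").split(",") if r.strip()}
    if not roles or not roles <= {"broker", "controller"}:
        problems.append("process.roles must be broker, controller, or both")
    if not props.get("node.id", "").isdigit():
        problems.append("node.id must be a non-negative integer")
    if not props.get("controller.listener.names"):
        problems.append("controller.listener.names is not set")
    voter_ids = {
        v.split("@")[0].strip()
        for v in props.get("controller.quorum.voters", "").split(",")
        if v.strip()
    }
    if "controller" in roles and props.get("node.id") not in voter_ids:
        problems.append("controller node.id is missing from controller.quorum.voters")
    return problems


combined = """
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093,2@localhost:9094,3@localhost:9095
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
log.dirs=/var/kafka-logs
"""
print(check_kraft_config(parse_properties(combined)))
```

An empty list only means none of these few checks failed. Kafka itself validates much more at startup.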
Migration from ZooKeeper to KRaft
Migration Process Overview
Production Migration: Migrating from ZooKeeper to KRaft in production requires careful planning and testing. Always test in a staging environment first!
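# Step 1: Format storage on each new controller, reusing the existing cluster's ID
kafka-storage.sh format -t <cluster-id> -c kraft-controller.properties
# Step 2: Start the KRaft controllers with migration enabled
# (zookeeper.metadata.migration.enable=true and zookeeper.connect set in the properties file)
kafka-server-start.sh kraft-controller.properties
# Step 3: Roll the existing brokers with the migration settings so the controllers
# can copy the metadata out of ZooKeeper, then inspect the migrated metadata
kafka-metadata-shell.sh --snapshot /path/to/metadata/snapshot
# Step 4: Restart each broker in KRaft mode
kafka-server-start.sh kraft-broker.properties
# Step 5: Once every broker runs in KRaft mode, remove the migration settings
# from the controllers and restart them to finalize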
Migration Considerations
- Cluster ID: Reuse the existing cluster's ID when formatting the new controllers
- Metadata Copy: The existing ZooKeeper metadata has to be carried over into the KRaft metadata log
- Rolling Migration: Gradual transition of brokers
- Validation: Verify metadata consistency after migration
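For the validation step, one simple approach is to capture the topic list before and after the migration and diff the two. The sketch below assumes you have already collected each snapshot as a topic-to-partition-count mapping (for example from your usual admin tooling). The function name and sample data are made up for illustration.

```python
def diff_topics(before, after):
    """Compare {topic: partition_count} snapshots taken before and after."""
    missing = sorted(set(before) - set(after))
    unexpected = sorted(set(after) - set(before))
    changed = sorted(t for t in set(before) & set(after) if before[t] != after[t])
    return {"missing": missing, "unexpected": unexpected, "changed": changed}


# Hypothetical snapshots: one topic vanished and one changed partition count.
zk_snapshot = {"orders": 12, "payments": 6, "audit": 3}
kraft_snapshot = {"orders": 12, "payments": 8}
report = diff_topics(zk_snapshot, kraft_snapshot)
print(report)
```

A clean migration should produce three empty lists. The same idea extends to ACLs, quotas, and topic configs.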
Monitoring KRaft Clusters
KRaft Specific Metrics
# Controller metrics
kafka.controller:type=KafkaController,name=ActiveControllerCount
kafka.controller:type=KafkaController,name=LastCommittedRecordOffset
# Raft quorum metrics (exposed under kafka.server:type=raft-metrics)
current-leader
high-watermark
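A common alerting rule on ActiveControllerCount is that the values reported by all controllers should add up to exactly one. Here is a minimal sketch of that rule, assuming the per-node values have already been scraped (via JMX or an exporter, not shown). The function name and node IDs are hypothetical.

```python
def active_controller_status(counts_by_node):
    """Classify the cluster from per-node ActiveControllerCount values."""
    total = sum(counts_by_node.values())
    if total == 1:
        return "ok"
    if total == 0:
        return "no active controller"
    return "multiple active controllers"


# Hypothetical scrape: controller 1 is active, 2 and 3 are followers.
print(active_controller_status({1: 1, 2: 0, 3: 0}))
```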
Health Checks
# Inspect the cluster metadata log
kafka-metadata-shell.sh --snapshot /var/kafka-logs/__cluster_metadata-0/00000000000000000000.log
# Verify quorum health (Kafka 3.3+)
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
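Kafka 3.3 and later ship a kafka-metadata-quorum.sh tool whose describe --status subcommand prints the quorum state as Key: value lines. The sketch below turns that kind of output into a simple health verdict. The sample text is an assumed illustration of the format rather than output captured from a real cluster, and the lag threshold is an arbitrary example value.

```python
def parse_quorum_status(text):
    """Parse 'Key: value' lines into a dict of strings."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def quorum_is_healthy(status, max_lag=100):
    """Healthy means a leader is elected and follower lag is bounded."""
    has_leader = int(status.get("LeaderId", -1)) >= 0
    lag_ok = int(status.get("MaxFollowerLag", max_lag + 1)) <= max_lag
    return has_leader and lag_ok


# Assumed sample output, for illustration only.
sample = """\
LeaderId:        1
LeaderEpoch:     5
HighWatermark:   1042
MaxFollowerLag:  0
CurrentVoters:   [1,2,3]
"""
status = parse_quorum_status(sample)
print(quorum_is_healthy(status))
```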
Performance Improvements
Startup Time Comparison
# Traditional Kafka with ZooKeeper
Startup Time: 30-60 seconds (depending on metadata size)
# KRaft Mode
Startup Time: 5-15 seconds (faster metadata loading)
That's a huge improvement, though the exact numbers depend on cluster size and metadata volume. Faster startup means faster recovery from failures.
Scalability Improvements
- Larger Clusters: Support for 100,000+ partitions
- Faster Metadata Operations: Reduced latency for topic operations
- Better Resource Utilization: No ZooKeeper overhead
Production Readiness
KRaft Maturity Timeline
Kafka 2.8.0 (April 2021): Early Access
Kafka 3.0.0 (September 2021): Preview (not recommended for production)
Kafka 3.3.0 (October 2022): Production Ready for new clusters
Kafka 3.5.0 (June 2023): ZooKeeper mode deprecated
Current Limitations (as of Kafka 3.6)
Migration Path: While KRaft is production ready, some advanced features are still being migrated from the ZooKeeper implementation.
- JBOD (Just a Bunch of Disks): Multiple log directories are not yet supported in KRaft mode
- Delegation Tokens: Supported in KRaft mode only from Kafka 3.6 onward
- Some Admin Operations: Still being ported
These limitations are being addressed in newer Kafka releases.
Best Practices for KRaft
Controller Deployment
# Recommended controller setup
- Use dedicated controller nodes for large clusters
- Deploy controllers across different availability zones
- Use an odd number of controllers (3 or 5)
- Ensure fast, reliable network between controllers
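The odd-number advice follows from how majorities work: the quorum stays available only while more than half of the controllers are up. A quick sketch of the arithmetic (the function name is my own):

```python
def tolerated_failures(controllers):
    """Controllers that can fail while a majority is still available."""
    if controllers < 1:
        raise ValueError("need at least one controller")
    return (controllers - 1) // 2


for n in (3, 4, 5):
    print(n, "controllers tolerate", tolerated_failures(n), "failure(s)")
```

Going from 3 to 4 controllers adds a node without adding any fault tolerance, which is why 3 or 5 is the usual choice.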
Resource Planning
# Controller resource requirements
CPU: 2-4 cores per controller
Memory: 4-8 GB heap size
Storage: Fast SSD for metadata logs
Network: Low latency, high bandwidth
Security Configuration
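# KRaft security settings (controller listener secured with SASL_SSL)
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:SASL_SSL
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.controller.protocol=PLAIN
ssl.keystore.location=/path/to/controller.keystore.jks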
The Future of Kafka
Roadmap Highlights
- Complete ZooKeeper Removal: Full feature parity achieved
- Enhanced Scalability: Support for even larger clusters
- Improved Operations: Better tooling and monitoring
- Cloud Native Features: Better Kubernetes integration
Industry Impact
KRaft positions Kafka as:
- Simpler to Operate: Reduced operational complexity
- More Scalable: Better performance at scale
- Cloud Ready: Easier containerization and orchestration
- Future-Proof: Modern architecture for next-generation workloads
Series Conclusion
Throughout this six-part series, we've explored:
- Kafka Fundamentals: Event streaming concepts and motivation
- Core Building Blocks: Topics, partitions, producers, and consumers
- Cluster Architecture: Creating topics and understanding partition distribution
- Development Tools: APIs and frameworks for building applications
- Administration: Monitoring, security, and operational excellence
- KRaft Architecture: The future of Kafka without ZooKeeper
It's been quite a journey! I hope these posts have helped you understand Kafka better.
Key Takeaways
- KRaft Simplifies Operations: Eliminates ZooKeeper dependency
- Better Performance: Faster startup and metadata operations
- Production Ready: Suitable for enterprise deployments
- Future Direction: Represents Kafka's architectural evolution
- Migration Path: Gradual transition from ZooKeeper is possible
The Future is KRaft: ZooKeeper mode has been deprecated since Kafka 3.5 and is removed in Kafka 4.0, so adopting KRaft positions your Kafka infrastructure for long-term success.
Apache Kafka with KRaft represents the culmination of years of architectural evolution. It delivers a more robust, scalable, and operationally simple event streaming platform. Whether you're starting fresh or planning a migration, KRaft is the foundation for your event-driven future.
Thanks for following along with this series!
Part 6 of 6