True to its name, Apache ZooKeeper corrals the proverbial zoo of otherwise unruly elements within an Apache Kafka cluster. The open-source ZooKeeper project delivers a centralized way of coordinating Kafka’s brokers and cluster topology, acting as a file-system-like store that keeps Kafka’s configuration information consistent. It also runs leader elections for Kafka brokers and topic partitions and cleanly manages service discovery for the brokers that make up the cluster.
Kafka itself brings a low-overhead design that: 1) is exceptionally friendly to horizontal scaling and 2) runs very smoothly on inexpensive commodity hardware. Adding ZooKeeper to the mix accentuates Kafka’s inherent advantages. Pair ZooKeeper with abundant bandwidth and powerful disks in line with Kafka’s best practices, and you will drastically reduce latency.
Here are three best practices for optimizing Kafka deployments with ZooKeeper while ensuring security and keeping latency to a minimum.
1. Keep ZooKeeper nodes to five or fewer (unless you have a strong case for surpassing that limit)
In dev environments, a single ZooKeeper node is all you need. In staging environments, it makes sense to mirror the number of nodes that will be in place in production. Generally, a typical Kafka cluster will be well served by three ZooKeeper nodes. If a Kafka deployment is particularly large, consider utilizing five ZooKeeper nodes. Doing so can lower latencies, though it has the side effect of increasing the burden on the nodes.
It’s critical not to push ZooKeeper beyond its bounds — which are most often determined by CPU and network throughput limits. Unless a use case presents a particularly compelling reason (and is likely to function effectively), avoid scaling ZooKeeper beyond five nodes. Larger ensembles produce massive load, as all nodes try to stay in sync while handling Kafka requests. (That said, this issue continues to diminish as new versions of Kafka are released that place less reliance and strain upon ZooKeeper.)
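To see why three or five nodes is the sweet spot, it helps to work through the majority-quorum math. The sketch below is a standalone illustration, not part of any Kafka or ZooKeeper API: an ensemble of n nodes only survives the loss of a minority, so growing from three to five nodes buys exactly one extra tolerated failure while every write still has to be acknowledged by a majority of a now-larger ensemble.

```java
// Illustrative sketch only: majority-quorum arithmetic for a ZooKeeper ensemble.
// An ensemble of n nodes tolerates floor((n - 1) / 2) failures, so 3 nodes
// tolerate 1 failure and 5 nodes tolerate 2 -- but every write must still be
// acknowledged by a majority, which is why larger ensembles add coordination load.
public class EnsembleSizing {
    static int failuresTolerated(int ensembleSize) {
        return (ensembleSize - 1) / 2;   // integer division = floor
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 3, 5, 7}) {
            System.out.printf("%d ZooKeeper nodes -> survives %d failure(s), quorum = %d%n",
                    n, failuresTolerated(n), n / 2 + 1);
        }
    }
}
```

Running this prints, for example, that a five-node ensemble survives two failures but needs three acknowledgements per write — the trade-off described above.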
2. Secure Kafka and ZooKeeper by practicing isolation
The security of a Kafka deployment hinges on two key elements: 1) Kafka’s internal configuration and 2) the infrastructure it uses. Kafka supports authentication between clients and brokers and between brokers and ZooKeeper, as well as TLS encryption that safeguards deployments whose clients connect over the public internet.
Isolation is essential to securing Kafka and ZooKeeper. ZooKeeper should only connect with Kafka — never the public internet. Kafka and ZooKeeper should also be isolated and protected by appropriate firewalls and security groups. Brokers should be positioned in a private network that does not accept outside connections, and Kafka itself should remain a step removed from public internet clients through the use of a load balancing or middleware layer.
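To make the TLS point concrete, here is a minimal sketch of a Java producer configured to reach brokers over SSL. The broker addresses, topic name, and keystore/truststore paths are placeholders for illustration; the configuration keys themselves (security.protocol, ssl.truststore.location, and so on) are standard Kafka client settings.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TlsProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker addresses and keystore paths -- substitute your own.
        props.put("bootstrap.servers", "broker1.internal:9093,broker2.internal:9093");
        props.put("security.protocol", "SSL");
        props.put("ssl.truststore.location", "/etc/kafka/secrets/client.truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        // Client certificate for mutual TLS (only needed if the brokers require it).
        props.put("ssl.keystore.location", "/etc/kafka/secrets/client.keystore.jks");
        props.put("ssl.keystore.password", "changeit");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Send a single record over the encrypted connection.
            producer.send(new ProducerRecord<>("example-topic", "key", "value"));
        }
    }
}
```

Note that even with TLS in place, the isolation guidance above still applies: the encrypted listener should face only the load-balancing or middleware layer, never ZooKeeper or the open internet.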
3. Tune up ZooKeeper to minimize latency
There are several other relatively quick tweaks that, implemented together and conscientiously maintained, add up to a more effective ZooKeeper deployment. Use servers with high-performance network bandwidth. Utilize appropriate disks, and store ZooKeeper’s transaction log on a separate disk. Fully isolate the ZooKeeper process and make sure swapping is disabled. Finally, implement effective monitoring and alerts, and track latency using instrumentation dashboards.
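For the monitoring piece, one low-effort way to track ZooKeeper latency is the built-in mntr four-letter command, which reports counters such as zk_avg_latency. The sketch below polls it over a plain socket; the hostname is a placeholder, and recent ZooKeeper releases only answer four-letter commands that have been whitelisted via 4lw.commands.whitelist in zoo.cfg.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ZkLatencyProbe {
    public static void main(String[] args) throws Exception {
        // Hypothetical host/port -- point this at one of your ZooKeeper nodes.
        try (Socket socket = new Socket("zk1.internal", 2181)) {
            // Send the "mntr" four-letter command; the server replies and closes.
            OutputStream out = socket.getOutputStream();
            out.write("mntr".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null) {
                // Surface only the latency counters; feed these into dashboards and alerts.
                if (line.startsWith("zk_avg_latency")
                        || line.startsWith("zk_max_latency")
                        || line.startsWith("zk_min_latency")) {
                    System.out.println(line);
                }
            }
        }
    }
}
```

A scheduled run of something like this (or an equivalent exporter in your monitoring stack) gives you the latency trend lines needed to catch a struggling ensemble before Kafka feels it.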