Performance tuning — Apache Kafka

Running a self-managed Apache Kafka cluster can be a daunting task once you move beyond a basic hello-world setup and start moving serious volumes of data.

Here are some Kafka cluster optimizations gathered from around the internet.

Not everything is possible

You can't have everything. Kafka performance tuning means balancing two opposing goals, latency and throughput: how quickly each message must reach the consumer versus how many messages the cluster must move.

Trade-offs like this show up in every distributed system, including DynamoDB and Elasticsearch. The best-known example is the CAP theorem, which describes the trade-off between consistency, availability, and partition tolerance.


Partitions determine how much parallelism consumers can achieve. Within a consumer group, each partition is read by at most one consumer, so adding partitions lets you add consumers and raise throughput. Size the partition count from the throughput of a single consumer and the consumption rate you need. For instance, if one consumer can handle 1,000 events per second (EPS) and you need to consume 5,000 EPS, you need at least 5 partitions.
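The sizing rule above can be sketched in a few lines of Python. This is a minimal illustration, not a capacity planner: the function name `required_partitions` and the `headroom` multiplier are assumptions introduced here, and real sizing should also account for producer throughput and key distribution.

```python
import math


def required_partitions(target_eps: float, per_consumer_eps: float,
                        headroom: float = 1.0) -> int:
    """Return the smallest partition count that allows enough parallel consumers.

    target_eps       -- consumption rate the consumer group must sustain
    per_consumer_eps -- measured throughput of a single consumer
    headroom         -- hypothetical safety multiplier (1.0 means none)
    """
    if target_eps <= 0 or per_consumer_eps <= 0:
        raise ValueError("rates must be positive")
    if headroom < 1.0:
        raise ValueError("headroom must be at least 1.0")
    # One consumer per partition at most, so round up to a whole partition.
    return math.ceil(target_eps * headroom / per_consumer_eps)


# The example from the text: 5,000 EPS needed, 1,000 EPS per consumer.
print(required_partitions(5000, 1000))       # 5
# The same target with 50% headroom for traffic spikes.
print(required_partitions(5000, 1000, 1.5))  # 8
```

Rounding up matters: a fractional result such as 7.5 still needs 8 partitions, because a partition cannot be shared by two consumers in the same group.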


Optimal Kafka replication settings depend on your use case, infrastructure, and performance requirements. Some general recommendations:

  • Replication Factor: Set a replication factor that ensures fault tolerance. A replication factor of 3 is commonly used: combined with `min.insync.replicas=2` and `acks=all`, it lets one broker fail while the cluster keeps accepting writes and without losing acknowledged data.

  • Min In-Sync Replicas: Configure `min.insync.replicas` to a value greater than 1 (2 is typical with a replication factor of 3). For producers using `acks=all`, a write is acknowledged only once at least this many in-sync replicas have it.

  • Unclean Leader Election: Disable unclean leader election to prevent a replica with out-of-date data from becoming a leader. This helps maintain data consistency.

  • Replica Lag Time: Adjust `` to control the maximum time a replica can lag behind the leader. This can help prevent slow or lagging replicas from affecting overall cluster performance.
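To make the interaction between the replication factor and `min.insync.replicas` concrete, here is a small sketch of the arithmetic. The function name and the dictionary keys are hypothetical, and the numbers assume producers use `acks=all` and that unclean leader election is disabled.

```python
def replication_tolerance(replication_factor: int,
                          min_insync_replicas: int) -> dict:
    """Estimate how many broker failures a partition can absorb.

    Assumes acks=all producers and unclean leader election disabled.
    """
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("need 1 <= min.insync.replicas <= replication factor")
    return {
        # Writes keep succeeding while at least min.insync.replicas
        # replicas are still in sync.
        "failures_while_writable": replication_factor - min_insync_replicas,
        # An acknowledged write is on at least min.insync.replicas brokers,
        # so it is guaranteed to survive one fewer failure than that.
        "failures_without_losing_acked_data": min_insync_replicas - 1,
    }


print(replication_tolerance(3, 2))
print(replication_tolerance(2, 2))
```

With a replication factor of 3 and `min.insync.replicas=2`, both values are 1. With a replication factor of 2 and `min.insync.replicas=2`, the first value drops to 0: a single broker outage blocks `acks=all` writes until the replica is back in sync.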

JVM Settings

  • Heap Settings: Set the initial (`-Xms`) and maximum (`-Xmx`) heap size to the same value to prevent the JVM from resizing the heap dynamically.

  • Garbage Collection: Use the G1 garbage collector (`-XX:+UseG1GC`), which generally keeps pauses short without giving up much throughput.

  • Thread Stack Size: Adjust the thread stack size (`-Xss`) if necessary.

  • JVM Performance Options: Optimize for server-class machines:

  • Set the JVM to aggressive optimization

  • Use tiered compilation for improved startup

  • File Descriptors: Increase the maximum number of file descriptors if needed

  • JMX (Java Management Extensions): Enable JMX for monitoring Kafka using tools like JConsole

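An easier way to pass all of these settings to the Kafka container is through an environment variable:

    docker run \
      -e KAFKA_JVM_PERFORMANCE_OPTS="-Xms8G -Xmx8G -XX:+UseG1GC -Xloggc:/var/log/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xss512k" \
      other_kafka_options \
      confluentinc/cp-kafka

The GC logging flags shown are the Java 8 forms; on Java 9 and later they are replaced by unified logging, for example `-Xlog:gc*:file=/var/log/gc.log`.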

Broker Settings

    # Broker Listeners
    listeners=PLAINTEXT://:9092,SSL://:9093
    # Log Directory
    log.dirs=/path/to/kafka/data
    # Auto-Create Topics
    auto.create.topics.enable=false
    # Num Partitions
    num.partitions=3
    # Default Replication Factor
    default.replication.factor=2
    # Unclean Leader Election
    unclean.leader.election.enable=false
    # Min In-Sync Replicas
    min.insync.replicas=2
    # Default Log Segment Size
    log.segment.bytes=1073741824
    # Offsets Topic Replication Factor
    offsets.topic.replication.factor=2
    # Socket Buffer Sizes
    socket.send.buffer.bytes=102400
    socket.receive.buffer.bytes=102400
    # Fetch Min Bytes
    fetch.min.bytes=1
    # Group Initial Rebalance
    # Advertised Listeners
    advertised.listeners=PLAINTEXT://
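Before rolling out a broker configuration like this, it can help to check the replication-related settings against each other. The sketch below is a hypothetical helper, not part of Kafka: `parse_properties` and `check_broker_config` are names made up for this example, the `SAMPLE` text copies a few values from the settings above, and the fallback values used when a key is missing are the broker defaults as I understand them.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_broker_config(props: dict[str, str]) -> list[str]:
    """Return human-readable warnings for risky replication settings."""
    warnings = []
    rf = int(props.get("default.replication.factor", "1"))
    min_isr = int(props.get("min.insync.replicas", "1"))
    if min_isr > rf:
        warnings.append("min.insync.replicas exceeds default.replication.factor")
    elif min_isr == rf and rf > 1:
        warnings.append("one broker outage will block acks=all writes")
    if props.get("unclean.leader.election.enable", "false").lower() == "true":
        warnings.append("unclean leader election can lose acknowledged data")
    if props.get("auto.create.topics.enable", "true").lower() == "true":
        warnings.append("topics will be auto-created with default settings")
    return warnings


SAMPLE = """
# Values copied from the broker settings above
auto.create.topics.enable=false
default.replication.factor=2
unclean.leader.election.enable=false
min.insync.replicas=2
"""

for warning in check_broker_config(parse_properties(SAMPLE)):
    print("WARNING:", warning)
```

For the sample values the only warning is the one about `acks=all` writes, since the replication factor and `min.insync.replicas` are both 2. Raising `default.replication.factor` to 3 clears it.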

Producer Settings

    # Bootstrap Servers
    bootstrap.servers=kafka-broker1:9092,kafka-broker2:9092
    # Acknowledge Mode (idempotence requires acks=all)
    acks=all
    # Retries
    retries=3
    # Batch Size
    batch.size=16384
    # Linger
    # Compression Type
    compression.type=snappy
    # Max Request Size
    max.request.size=1048576
    # Buffer Memory
    buffer.memory=33554432
    # Idempotence
    enable.idempotence=true
    # Max In-Flight Requests Per Retry Delivery Transaction
    # Security Settings
    security.protocol=SSL
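The idempotent producer only works with certain combinations of settings: `acks` must be `all`, `retries` must be greater than 0, and `max.in.flight.requests.per.connection` must not exceed 5. The following sketch checks a producer configuration against those three rules. The function name `idempotence_problems` is hypothetical, and the fallbacks used for missing keys assume the client defaults of Kafka 3.0 and later.

```python
def idempotence_problems(props: dict[str, str]) -> list[str]:
    """List settings that conflict with enable.idempotence=true."""
    # Assumption: a missing key means the Kafka 3.0+ client default.
    if props.get("enable.idempotence", "true").lower() != "true":
        return []
    problems = []
    acks = props.get("acks", "all")
    if acks not in ("all", "-1"):
        problems.append(f"acks={acks}: idempotence requires acks=all")
    if int(props.get("retries", "2147483647")) <= 0:
        problems.append("retries must be greater than 0")
    if int(props.get("max.in.flight.requests.per.connection", "5")) > 5:
        problems.append("max.in.flight.requests.per.connection must be at most 5")
    return problems


risky = {"acks": "1", "retries": "3", "enable.idempotence": "true"}
print(idempotence_problems(risky))

safe = {"acks": "all", "retries": "3", "enable.idempotence": "true"}
print(idempotence_problems(safe))
```

The first configuration is flagged because `acks=1` conflicts with idempotence; the second passes all three checks.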
