
Kafka Deployment Hardware and Tuning Specs by Arkilic


This is a summary of all the documents I have been through online:

Hardware
Memory: Kafka relies heavily on the filesystem for caching and storing messages. It uses heap space carefully and does not require a heap size of more than 5 GB. Plan for a filesystem cache of up to 3 GB per machine. 64 GB machines are a safe bet; anything under 32 GB is not worth deploying.

CPUs: Enabling SSL hurts CPU performance. Overall, Kafka does not require fancy CPUs: more cores are better than fewer, more powerful ones. 24+ cores is a safe bet.

Disks: Disks should be dedicated to Kafka rather than performing tasks for the rest of the OS. For low latency, one can either RAID the drives together into a single volume or format each drive as its own directory. Whichever is chosen, data must be well balanced among partitions. There are several trade-offs to discuss, so it would be better if a sysadmin weighs the pros and cons before making the final decision on how these disks will be mounted.

Network: The 1-10 GbE we have in place is sufficient. A Kafka cluster assumes all nodes are equal and that the latency introduced by the network is low.

Filesystem: Kafka recommends either XFS or ext4.

JVM
Kafka recommends the latest version of the JDK with the G1 collector. There are tuning specs available online depending on the traffic we expect. Kunal might want to take a good look at them, as he is the most qualified Java person around.

Large messages can cause longer garbage collection (GC) pauses as brokers allocate large
chunks. Monitor the GC log and the server log. If long GC pauses cause Kafka to abandon the
ZooKeeper session, you may need to configure longer timeout values for
zookeeper.session.timeout.ms.
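If long pauses are the cause, the timeout can be raised in the broker configuration; the value below is an illustrative example to benchmark, not a tested recommendation:

```properties
# Tolerate longer GC pauses before the broker's ZooKeeper session expires
# (illustrative value; the default is 6000 ms)
zookeeper.session.timeout.ms=30000
```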
Commonly Tweaked Configuration Parameters

Kafka ships with very good defaults, especially when it comes to performance-related settings
and options. When in doubt, just leave the settings alone.
With that said, there are some logistical configurations that should be changed for production.
These changes are necessary either to make your life easier, or because there is no way to set
a good default (because it depends on your cluster layout).
zookeeper.connect
The list of ZooKeeper hosts that the broker registers with. It is recommended that you configure this to all the hosts in your ZooKeeper cluster.
Type: string
Importance: high
broker.id
Integer id that identifies a broker. No two brokers in the same Kafka cluster can have the same
id.
Type: int
Importance: high
log.dirs
The directories in which the Kafka log data is located.
Type: string
Default: /tmp/kafka-logs
Importance: high
listeners
Comma-separated list of URIs (including protocol) that the broker will listen on. Specify
hostname as 0.0.0.0 to bind to all interfaces or leave it empty to bind to the default interface. An
example is PLAINTEXT://myhost:9092.
Type: string
Default: PLAINTEXT://host.name:port where the default for host.name is an
empty string and the default for port is 9092
Importance: high
advertised.listeners
Listeners to publish to ZooKeeper for clients to use. In IaaS environments, this may need to be
different from the interface to which the broker binds. If this is not set, the value for listeners will
be used.
Type: string
Default: listeners
Importance: high
num.partitions
The default number of log partitions for auto-created topics. We recommend increasing this, as it is better to over-partition a topic. Over-partitioning leads to better data balancing and aids consumer parallelism. For keyed data in particular, you want to avoid changing the number of partitions in a topic.
Type: int
Default: 1
Importance: medium
Replication configs
default.replication.factor
The default replication factor that applies to auto-created topics. We recommend setting this to
at least 2.
Type: int
Default: 1
Importance: medium
min.insync.replicas
The minimum number of replicas in ISR needed to commit a produce request with
required.acks=-1 (or all).
Type: int
Default: 1
Importance: medium
unclean.leader.election.enable
Indicates whether to enable replicas not in the ISR set to be elected as leader as a last resort,
even though doing so may result in data loss.
Type: boolean
Default: true
Importance: medium
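Taken together, the parameters above can be sketched as a minimal production server.properties. Every host name, path, and id below is a placeholder, not a value from our environment:

```properties
# Unique per broker in the cluster
broker.id=1
# Register with every host in the ZooKeeper ensemble
zookeeper.connect=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
# Dedicated data disks, never the /tmp default
log.dirs=/data/kafka/disk1,/data/kafka/disk2
# Bind to all interfaces; advertise the externally resolvable name
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://broker1.example.com:9092
# Over-partition rather than under-partition auto-created topics
num.partitions=8
# Replication settings from the recommendations above
default.replication.factor=2
min.insync.replicas=1
unclean.leader.election.enable=false
```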

Handling Large Messages


Before configuring Kafka to handle large messages, first consider the following options to
reduce message size:
The Kafka producer can compress messages. For example, if the original
message is a text-based format (such as XML), in most cases the compressed message will be
sufficiently small.
Use the compression.codec and compressed.topics producer configuration parameters to
enable compression. Gzip and Snappy are supported.

If shared storage (such as NAS, HDFS, or S3) is available, consider placing large
files on the shared storage and using Kafka to send a message with the file location. In many
cases, this can be much faster than using Kafka to send the large file itself.
Split large messages into 1 KB segments with the producing client, using
partition keys to ensure that all segments are sent to the same Kafka partition in the correct
order. The consuming client can then reconstruct the original large message.
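The split-and-reassemble option above can be sketched in plain Python. The helper names and the (sequence, total) chunk header are illustrative assumptions, not a Kafka API; in practice the producer would use msg_id as the record key (so all chunks land on one partition) and carry the header in the payload:

```python
CHUNK_SIZE = 1024  # 1 KB segments, as suggested above

def split_message(msg_id, payload, chunk_size=CHUNK_SIZE):
    """Producer side: split payload into ordered chunks sharing one key."""
    chunks = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
    total = len(chunks)
    # Each record: (partition key, (sequence, total) header, chunk bytes)
    return [(msg_id, (seq, total), chunk) for seq, chunk in enumerate(chunks)]

def reassemble(records):
    """Consumer side: reorder by sequence number and concatenate."""
    ordered = sorted(records, key=lambda r: r[1][0])
    # All chunks must be present before the message is rebuilt
    assert all(r[1][1] == len(ordered) for r in ordered), "missing chunks"
    return b"".join(chunk for _, _, chunk in ordered)
```

Because all chunks share msg_id as their partition key, Kafka's per-partition total ordering delivers them in sequence to a single consumer.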
If you still need to send large messages with Kafka, modify the following configuration
parameters to match your requirements:
Broker Configuration
message.max.bytes
Maximum message size the broker will accept. Must be smaller than the consumer
fetch.message.max.bytes, or the consumer cannot consume the message.
Default value: 1000000 (1 MB)

log.segment.bytes
Size of a Kafka data file. Must be larger than any single message.
Default value: 1073741824 (1 GiB)

replica.fetch.max.bytes
Maximum message size a broker can replicate. Must be larger than message.max.bytes, or a
broker can accept messages it cannot replicate, potentially resulting in data loss.
Default value: 1048576 (1 MiB)

Consumer Configuration
fetch.message.max.bytes
Maximum message size a consumer can read. Must be at least as large as
message.max.bytes.
Default value: 1048576 (1 MiB)
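For instance, to allow messages up to a hypothetical 10 MB, all three settings would be raised together so the constraints above hold (values are illustrative):

```properties
# Broker: accept messages up to 10 MB (10485760 = 10 * 1024 * 1024)
message.max.bytes=10485760
# Broker: must be >= message.max.bytes so followers can replicate everything
replica.fetch.max.bytes=10485760

# Consumer: must be >= message.max.bytes to read the largest message
fetch.message.max.bytes=10485760
```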

Tuning Kafka for Optimal Performance


Performance tuning involves two important metrics: Latency measures how long it takes to
process one event, and throughput measures how many events arrive within a specific amount
of time. Most systems are optimized for either latency or throughput. Kafka is balanced for both.
A well tuned Kafka system has just enough brokers to handle topic throughput, given the latency
required to process information as it is received.
Tuning your producers, brokers, and consumers to send, process, and receive the largest
possible batches within a manageable amount of time results in the best balance of latency and
throughput for your Kafka cluster.

Tuning Kafka Producers


Kafka uses an asynchronous publish/subscribe model. When your producer calls the send()
command, the result returned is a future. The future provides methods to let you check the
status of the information in process. When the batch is ready, the producer sends it to the
broker. The Kafka broker waits for an event, receives the result, and then responds that the
transaction is complete.
If you do not use a future, you could get just one record, wait for the result, and then send a response. Latency is very low, but so is throughput: if each transaction takes 5 ms, throughput is 200 events per second, far slower than the expected 100,000 events per second.
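The arithmetic behind those figures, as a quick sanity check:

```python
# One record at a time: wait 5 ms for each acknowledgement before sending the next
latency_ms = 5
events_per_second = 1000 / latency_ms
print(events_per_second)  # 200.0 -- batching is what closes the gap to ~100,000
```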
When you use Producer.send(), you fill up buffers on the producer. When a buffer is full, the
producer sends the buffer to the Kafka broker and begins to refill the buffer.
Two parameters are particularly important for latency and throughput: batch size and linger time.

Batch Size
batch.size measures batch size in total bytes instead of the number of messages. It controls
how many bytes of data to collect before sending messages to the Kafka broker. Set this as
high as possible, without exceeding available memory. The default value is 16384.
If you increase the size of your buffer, it might never fill. The producer eventually sends the information anyway, based on other triggers, such as linger time in milliseconds. Although setting the batch size too high can waste memory, it does not impact latency.
If your producer is sending all the time, you are probably getting the best throughput possible. If
the producer is often idle, you might not be writing enough data to warrant the current allocation
of resources.
Linger Time
linger.ms sets the maximum time to buffer data in asynchronous mode. For example, a setting
of 100 batches 100ms of messages to send at once. This improves throughput, but the buffering
adds message delivery latency.
By default, the producer does not wait. It sends the buffer any time data is available.
Instead of sending immediately, you can set linger.ms to 5 and send more messages in one
batch. This would reduce the number of requests sent, but would add up to 5 milliseconds of
latency to records sent, even if the load on the system does not warrant the delay.
The farther away the broker is from the producer, the more overhead required to send
messages. Increase linger.ms for higher latency and higher throughput in your producer.
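A producer configuration sketch combining both knobs; the values are illustrative starting points to benchmark, not figures from the sources above:

```properties
# Collect up to 64 KB per batch instead of the 16384-byte default
batch.size=65536
# Wait up to 5 ms for a batch to fill before sending it to the broker
linger.ms=5
```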

Tuning Kafka Brokers


Topics are divided into partitions. Each partition has a leader. Most partitions are written into
leaders with multiple replicas. When the leaders are not balanced properly, one might be
overworked, compared to others. For more information on load balancing, see Partitions and
Memory Usage.
Depending on your system and how critical your data is, you want to be sure that you have
sufficient replication sets to preserve your data. Cloudera recommends starting with one
partition per physical storage disk and one consumer per partition.

Tuning Kafka Consumers


Consumers can create throughput issues on the other side of the pipeline. The maximum
number of consumers for a topic is equal to the number of partitions. You need enough
partitions to handle all the consumers needed to keep up with the producers.
Consumers in the same consumer group split the partitions among them. Adding more
consumers to a group can enhance performance. Adding more consumer groups does not
affect performance.
How you use the replica.high.watermark.checkpoint.interval.ms property can affect throughput.
When reading from a partition, you can mark the last point where you read information. That
way, if you have to go back and locate missing data, you have a checkpoint from which to move
forward without having to reread prior data. If you set the checkpoint watermark for every event,
you will never lose a message, but it significantly impacts performance. If, instead, you set it to
check the offset every hundred messages, you have a margin of safety with much less impact
on throughput.

Configuring JMX Ephemeral Ports


Kafka uses two high-numbered ephemeral ports for JMX. These ports are listed when you view
netstat -anp information for the Kafka Broker process.
You can change the first port by adding an option similar to -Dcom.sun.management.jmxremote.rmi.port=<port number> to the Additional Broker Java Options (broker_java_opts) field in Cloudera Manager. The JMX_PORT configuration maps to com.sun.management.jmxremote.port by default.
The second ephemeral port used for JMX communication is implemented for the JRMP protocol
and cannot be changed.
Some More Kafka Tuning Recommendations by Cloudera (in addition to the documentation above)
Kafka Brokers per Server
Recommend 1 Kafka broker per server: Kafka is not only disk-intensive but can also be network-intensive, so if you run multiple brokers on a single host, network I/O can become the bottleneck. Running a single broker per host and scaling the cluster out gives you better availability.
Increase Disks allocated to Kafka Broker
Kafka parallelism is largely driven by the number of disks and partitions per topic.
From the Kafka documentation: We recommend using multiple drives to get
good throughput and not sharing the same drives used for Kafka data with application logs or
other OS filesystem activity to ensure good latency. As of 0.8 you can format and mount each
drive as its own directory. If you configure multiple data directories partitions will be assigned
round-robin to data directories. Each partition will be entirely in one of the data directories. If
data is not well balanced among partitions this can lead to load imbalance between disks.
Number of Threads
Make sure you set num.io.threads to at least the number of disks you are going to use; by default it is 8. It can be higher than the number of disks.
Set num.network.threads higher based on the number of concurrent producers, consumers, and the replication factor.
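As an illustration, a broker with six data disks and a moderate client load might use something like the following (both values are assumptions to benchmark, not recommendations from the sources):

```properties
# At least one I/O thread per data disk (default 8; may exceed the disk count)
num.io.threads=8
# Raise with concurrent producers, consumers, and replication traffic
num.network.threads=6
```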
Number of partitions
Ideally you want the default number of partitions (num.partitions) to spread across at least n-1 servers. This can break up the write workload, and it allows for greater parallelism on the consumer side. Remember that Kafka does total ordering within a partition, not over multiple partitions, so make sure you partition intelligently on the producer side to parcel up units of work that might span multiple messages/events.
Message Size
Kafka is designed for small messages. I recommend avoiding Kafka for larger messages. If that is not avoidable, there are several ways to go about sending larger messages, such as 1 MB ones. If the original message is JSON, XML, or text, using compression is the best option to reduce the size. Large messages will affect your performance and throughput. Check your topic partitions and replica.fetch.max.bytes to make sure they do not exceed your physical RAM.
Large Messages
Another approach is to break the message into smaller chunks and use the same message key to send them to the same partition. This way you are sending small messages that can be re-assembled on the consumer side.
Broker side:
1. message.max.bytes (defaults to 1000000): the maximum size of a message that a Kafka broker will accept.
2. replica.fetch.max.bytes (defaults to 1 MB): this has to be bigger than message.max.bytes, otherwise brokers will not be able to replicate messages.
Consumer side:
1. fetch.message.max.bytes (defaults to 1 MB): the maximum size of a message that a consumer can read. This should be equal to or larger than message.max.bytes.
Kafka Heap Size
By default the Kafka broker JVM is set to 1 GB; this can be increased using the Ambari kafka-env template. When you are sending large messages, JVM garbage collection can be an issue. Try to keep the Kafka heap size below 4 GB.
Example: in kafka-env.sh add the following settings.
export KAFKA_HEAP_OPTS="-Xmx4g -Xms4g"
export KAFKA_JVM_PERFORMANCE_OPTS="-XX:MetaspaceSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80"
Dedicated Zookeeper
Have a separate ZooKeeper cluster dedicated to Storm/Kafka operations. This will improve Storm/Kafka's performance for writing offsets to ZooKeeper, since it will not be competing with HBase or other components for read/write access.
ZK on separate nodes from Kafka Broker
Do not install ZooKeeper nodes on the same node as the Kafka broker if you want optimal Kafka performance, as both Kafka and ZooKeeper are disk I/O intensive.
Disk Tuning sections
Please review the Kafka documentation on filesystem tuning parameters here.
Disable THP according to documentation here.
Either ext4 or xfs filesystems are recommended for performance benefit.
Minimal replication
If you are doing replication, start with 2x rather than 3x for Kafka clusters larger than 3 machines. Alternatively, use 2x even on a 3-node cluster if you are able to reprocess upstream from your source.
Avoid Cross Rack Kafka deployments
Avoid cross-rack Kafka deployments for now, until Kafka 0.8.2 - see: https://issues.apache.org/jira/browse/KAFKA-1215
