@mostafa-asg
Created January 13, 2022 08:24
How many partitions to use for a topic?
The formula for determining the number of partitions per Kafka topic has been pretty well explored over time.
When creating a new topic in your Kafka cluster, first think about your desired throughput (t) in MB/sec.
Next, consider the producer throughput that you can achieve on a single partition (p). This is affected by producer
configuration but generally sits in the tens of MB/sec. Finally, determine the consumer throughput (c) that you can
achieve on a single partition. This is application-dependent, so you'll have to measure it yourself. You should plan
for at least max(t/p, t/c) partitions for that topic. For example, with a throughput requirement of 250 MB/sec and
producer and consumer throughputs of 50 MB/sec and 25 MB/sec respectively, you should use at least
max(250/50, 250/25) = max(5, 10) = 10 partitions for that topic.
Source: https://www.confluent.io/blog/5-common-pitfalls-when-using-apache-kafka/
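
The rule of thumb above can be sketched as a small Python helper. This is only an illustration, not part of the quoted article: the function name min_partitions and its parameter names are made up here, and rounding up with math.ceil is an added assumption so that non-integer ratios still give a whole number of partitions.

```python
import math


def min_partitions(target_mb_s: float, producer_mb_s: float, consumer_mb_s: float) -> int:
    """Lower bound on the partition count: max(t/p, t/c), rounded up.

    target_mb_s   -- desired topic throughput t, in MB/sec
    producer_mb_s -- measured producer throughput per partition p, in MB/sec
    consumer_mb_s -- measured consumer throughput per partition c, in MB/sec
    """
    if target_mb_s <= 0 or producer_mb_s <= 0 or consumer_mb_s <= 0:
        raise ValueError("all throughputs must be positive")
    # Rounding up is an assumption of this sketch: a fractional
    # partition count is not meaningful, so take the next integer.
    return max(
        math.ceil(target_mb_s / producer_mb_s),
        math.ceil(target_mb_s / consumer_mb_s),
    )


# The example from the text: t = 250, p = 50, c = 25
print(min_partitions(250, 50, 25))  # prints 10
```

The result is a minimum for the throughput target only. The p and c values have to come from your own measurements.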