@jeqo
Last active March 21, 2024 06:39
[kafka] Active segment rotation
  1. Shorten the retention check interval (e.g. to 10 seconds) in server.properties so that retention runs more often:
log.retention.check.interval.ms=10000

Then start ZooKeeper and Kafka.
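A minimal sketch of applying that setting from the command line. `set_prop` is a hypothetical helper, not part of Kafka, and the path `config/server.properties` in the usage note is an assumption about where the broker config lives.

```shell
# Hypothetical helper: set key=value in a properties file.
# Replaces an existing "key=" line, otherwise appends a new one.
# Caveats: dots in the key are treated as regex wildcards, the value must not
# contain '|' or '&', and the file is expected to end with a newline.
set_prop() {
  file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file"; then
    # -i.bak works with both GNU and BSD sed; it leaves a "$file.bak" backup.
    sed -i.bak "s|^${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}
```

Usage would look like `set_prop config/server.properties log.retention.check.interval.ms 10000`, run before the broker is started.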

  1. Create two topics: t1 with time-based retention only, and t2 with size-based retention only:
❯ bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic t1 --config retention.ms=60000 --config retention.bytes=-1
Created topic t1.
❯ bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic t2 --config retention.ms=-1 --config retention.bytes=10000
Created topic t2.
❯ bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe
Topic: t1       TopicId: ddEp183ZRh-SBHc7BKXd6w PartitionCount: 1       ReplicationFactor: 1     Configs: retention.ms=60000,retention.bytes=-1
        Topic: t1       Partition: 0    Leader: 0       Replicas: 0     Isr: 0
Topic: t2       TopicId: W6XrFsD1TKWv3Q4rRskFug PartitionCount: 1       ReplicationFactor: 1     Configs: retention.ms=-1,retention.bytes=10000
        Topic: t2       Partition: 0    Leader: 0       Replicas: 0     Isr: 0
❯ ls /tmp/kafka-logs/t1-0
00000000000000000000.index     leader-epoch-checkpoint
00000000000000000000.log       partition.metadata
00000000000000000000.timeindex
  1. Write 2000 records of 10 bytes each (~20KB of payload) to both topics:
❯ bin/kafka-producer-perf-test.sh --producer-props bootstrap.servers=localhost:9092 --num-records 2000 --record-size 10 --throughput -1 --topic t1
2000 records sent, 8771.929825 records/sec (0.08 MB/sec), 123.60 ms avg latency, 222.00 ms max latency, 123 ms 50th, 128 ms 95th, 129 ms 99th, 129 ms 99.9th.
❯ bin/kafka-producer-perf-test.sh --producer-props bootstrap.servers=localhost:9092 --num-records 2000 --record-size 10 --throughput -1 --topic t2
2000 records sent, 18018.018018 records/sec (0.17 MB/sec), 3.92 ms avg latency, 97.00 ms max latency, 4 ms 50th, 7 ms 95th, 9 ms 99th, 9 ms 99.9th.
  1. Wait for retention to kick in. Before it does, each partition holds a single 35K active segment:
❯ ls -lh /tmp/kafka-logs/t*
/tmp/kafka-logs/t1-0:
total 41048
-rw-r--r--@ 1 jorge.quilcate  wheel    10M Mar 21 06:10 00000000000000000000.index
-rw-r--r--@ 1 jorge.quilcate  wheel    35K Mar 21 06:16 00000000000000000000.log
-rw-r--r--@ 1 jorge.quilcate  wheel    10M Mar 21 06:10 00000000000000000000.timeindex
-rw-r--r--@ 1 jorge.quilcate  wheel     8B Mar 21 06:10 leader-epoch-checkpoint
-rw-r--r--@ 1 jorge.quilcate  wheel    43B Mar 21 06:10 partition.metadata

/tmp/kafka-logs/t2-0:
total 41048
-rw-r--r--@ 1 jorge.quilcate  wheel    10M Mar 21 06:12 00000000000000000000.index
-rw-r--r--@ 1 jorge.quilcate  wheel    35K Mar 21 06:16 00000000000000000000.log
-rw-r--r--@ 1 jorge.quilcate  wheel    10M Mar 21 06:12 00000000000000000000.timeindex
-rw-r--r--@ 1 jorge.quilcate  wheel     8B Mar 21 06:12 leader-epoch-checkpoint
-rw-r--r--@ 1 jorge.quilcate  wheel    43B Mar 21 06:12 partition.metadata
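The listing shows 35K on disk rather than the ~20KB of payload that was produced. A quick back-of-the-envelope check, using the exact segment size (35991 bytes) that the broker log reports for t1-0; the gap is presumably record and batch framing overhead, which is an assumption and not something measured here.

```shell
# Figures from this run: 2000 records of 10 bytes each;
# 35991 is the segment size reported in the broker log for t1-0.
records=2000
record_size=10
segment_bytes=35991

payload_bytes=$((records * record_size))
overhead_bytes=$((segment_bytes - payload_bytes))
# Integer average, rounded to the nearest byte.
overhead_per_record=$(( (overhead_bytes + records / 2) / records ))

echo "payload=$payload_bytes overhead=$overhead_bytes per_record=$overhead_per_record"
# prints: payload=20000 overhead=15991 per_record=8
```

Either way, the single segment is well above the retention.bytes=10000 configured on t2.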

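About a minute later, once retention.ms (60000) has elapsed and the next retention check has run, Kafka rotates the active segment of the time-based retention topic (t1) and schedules it for deletion: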


[2024-03-21 06:17:32,066] INFO [UnifiedLog partition=t1-0, dir=/tmp/kafka-logs] Deleting segment LogSegment(baseOffset=0, size=35991, lastModifiedTime=1711001784748, largestRecordTimestamp=1711001784630) due to log retention time 60000ms breach based on the largest record timestamp in the segment (kafka.log.UnifiedLog)
[2024-03-21 06:17:32,072] INFO [UnifiedLog partition=t1-0, dir=/tmp/kafka-logs] Incremented log start offset to 2000 due to segment deletion (kafka.log.UnifiedLog)
[2024-03-21 06:18:32,076] INFO [LocalLog partition=t1-0, dir=/tmp/kafka-logs] Deleting segment files LogSegment(baseOffset=0, size=35991, lastModifiedTime=1711001784748, largestRecordTimestamp=1711001784630) (kafka.log.LocalLog$)

The active segment of the size-based topic (t2) is not rotated, even though its size (35K) exceeds retention.bytes (10000):

❯ ls -lh /tmp/kafka-logs/t*
/tmp/kafka-logs/t1-0:
total 112
-rw-r--r--@ 1 jorge.quilcate  wheel    16B Mar 21 06:17 00000000000000000000.index.deleted
-rw-r--r--@ 1 jorge.quilcate  wheel    35K Mar 21 06:16 00000000000000000000.log.deleted
-rw-r--r--@ 1 jorge.quilcate  wheel    24B Mar 21 06:17 00000000000000000000.timeindex.deleted
-rw-r--r--@ 1 jorge.quilcate  wheel     0B Mar 21 06:17 00000000000000002000.log
-rw-r--r--@ 1 jorge.quilcate  wheel    56B Mar 21 06:17 00000000000000002000.snapshot
-rw-r--r--@ 1 jorge.quilcate  wheel    10M Mar 21 06:17 00000000000000002000.timeindex
-rw-r--r--@ 1 jorge.quilcate  wheel    11B Mar 21 06:17 leader-epoch-checkpoint
-rw-r--r--@ 1 jorge.quilcate  wheel    43B Mar 21 06:10 partition.metadata

/tmp/kafka-logs/t2-0:
total 41048
-rw-r--r--@ 1 jorge.quilcate  wheel    10M Mar 21 06:12 00000000000000000000.index
-rw-r--r--@ 1 jorge.quilcate  wheel    35K Mar 21 06:16 00000000000000000000.log
-rw-r--r--@ 1 jorge.quilcate  wheel    10M Mar 21 06:12 00000000000000000000.timeindex
-rw-r--r--@ 1 jorge.quilcate  wheel     8B Mar 21 06:12 leader-epoch-checkpoint
-rw-r--r--@ 1 jorge.quilcate  wheel    43B Mar 21 06:12 partition.metadata
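One way to read the difference between t1 and t2 is sketched below. This is a simplified model of the two retention checks, not Kafka's actual code: the function names are hypothetical, and the size rule (delete a segment only if the log stays at or above retention.bytes afterwards) is an assumption that happens to match what was observed. The "now" timestamp is the 06:17:32,066 log line converted to epoch milliseconds, assuming the broker log clock is UTC.

```shell
# Hypothetical model of the two retention checks; not Kafka's actual code.

# Time-based: eligible once the newest record in the segment is older than
# retention_ms. A negative retention_ms disables the check.
time_breached() {
  now_ms=$1 largest_ts_ms=$2 retention_ms=$3
  [ "$retention_ms" -ge 0 ] && [ $((now_ms - largest_ts_ms)) -gt "$retention_ms" ]
}

# Size-based: walk segment sizes oldest first and count how many can be removed
# while the remaining log is still at least retention_bytes (assumed rule).
# A negative retention_bytes disables the check.
size_deletable() {
  retention_bytes=$1; shift
  if [ "$retention_bytes" -lt 0 ]; then echo 0; return; fi
  total=0
  for s in "$@"; do total=$((total + s)); done
  excess=$((total - retention_bytes))
  count=0
  for s in "$@"; do
    [ $((excess - s)) -ge 0 ] || break
    excess=$((excess - s))
    count=$((count + 1))
  done
  echo "$count"
}

# t1: largestRecordTimestamp and retention.ms from the log line above.
if time_breached 1711001852066 1711001784630 60000; then
  echo "t1: delete"
else
  echo "t1: keep"
fi
# prints: t1: delete

# t2: a single 35991-byte segment against retention.bytes=10000.
echo "t2: $(size_deletable 10000 35991) of 1 segments deletable"
# prints: t2: 0 of 1 segments deletable
```

Under this model the lone t2 segment survives because removing it would take the log from 35991 bytes to 0, below the configured 10000. With two such segments, `size_deletable 10000 35991 35991` reports 1, so only the older one would go.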