Skip to content

Instantly share code, notes, and snippets.

@stanislavkozlovski
stanislavkozlovski / rc4_test_failures_system_test_run_1.md
Created February 14, 2024 19:49
rc4_test_failures_system_test_run_1
{'kafkatest.benchmarks.core.benchmark_test.Benchmark#test_consumer_throughputArguments:{  "compression_type": "none",  "interbroker_security_protocol": "PLAINTEXT",  "security_protocol": "SSL",  "tls_version": "TLSv1.2"}': 1,
 'kafkatest.benchmarks.core.benchmark_test.Benchmark#test_consumer_throughputArguments:{  "compression_type": "none",  "interbroker_security_protocol": "PLAINTEXT",  "security_protocol": "SSL",  "tls_version": "TLSv1.3"}': 1,
 'kafkatest.benchmarks.core.benchmark_test.Benchmark#test_consumer_throughputArguments:{  "compression_type": "snappy",  "interbroker_security_protocol": "PLAINTEXT",  "security_protocol": "SSL",  "tls_version": "TLSv1.2"}': 1,
 'kafkatest.benchmarks.core.benchmark_test.Benchmark#test_consumer_throughputArguments:{  "compression_type": "snappy",  "interbroker_security_protocol": "PLAINTEXT",  "security_protocol": "SSL",  "tls_version": "TLSv1.3"}': 1,
 'kafkatest.benchmarks.core.benchmark_test.Benchmark#test_end_to_end_latencyArguments:{  "compression_type": "no

Hey all,

I've heard various complaints that build times in trunk are taking too long, some taking as much as 8 hours (the timeout) - and this is slowing us down from being able to meet the code freeze deadline.

I took it upon myself to gather up some data in Gradle Enterprise to see if there are any outlier tests that are causing this slowness. Turns out there are, in this particular build - https://ge.apache.org/s/un2hv7n6j374k/ - which took 10 hours and 29 minutes in total

Here are the top offending tests:

  • ✅ - PASS
  • 🟧 - FLAKY
un2hv7n6j374k-org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest-shouldAggregateWindowedWithNoGrace[ON_WINDOW_UPDATE_false]-1-output.txt
metaPropertiesEnsemble=MetaPropertiesEnsemble(metadataLogDir=Optional.empty, dirs={/tmp/kafka-7862265964045152878: EMPTY})
[2023-12-06 05:14:14,945] INFO [LocalTieredStorage Id=0] Creating directory: [/tmp/kafka-remote-tier-deletesegmentsbyretentionsizetest18303542483502926029/kafka-tiered-storage] (org.apache.kafka.server.log.remote.storage.LocalTieredStorage:289)
[2023-12-06 05:14:14,946] INFO [LocalTieredStorage Id=0] Created local tiered storage manager [0]:[kafka-tiered-storage] (org.apache.kafka.server.log.remote.storage.LocalTieredStorage:301)
[2023-12-06 05:14:14,946] INFO Started configuring topic-based RLMM with configs: {remote.log.metadata.topic.replication.factor=1, remote.log.metadata.topic.num.partitions=5, remote.log.metadata.common.client.bootstrap.servers=localhost:36617, broker.id=0, remote.log.metadata.initialization.retry.interval.ms=300, remote.log.metadata.common.client.security.protocol=PLAINTEXT, cluster.id=Si-GEbyrSJeNRePZuz8ysQ, log.dir=/tmp/kafka-9703822247831753795} (org.apache.ka
[2023-12-06 05:14:34,969] INFO [LocalTieredStorage Id=0] Creating directory: [/tmp/kafka-remote-tier-transactionswithtieredstoretest7406968791210700929/kafka-tiered-storage] (org.apache.kafka.server.log.remote.storage.LocalTieredStorage:289)
[2023-12-06 05:14:34,969] INFO [LocalTieredStorage Id=0] Created local tiered storage manager [0]:[kafka-tiered-storage] (org.apache.kafka.server.log.remote.storage.LocalTieredStorage:301)
[2023-12-06 05:14:34,969] INFO Started configuring topic-based RLMM with configs: {remote.log.metadata.topic.replication.factor=3, remote.log.metadata.topic.num.partitions=3, remote.log.metadata.common.client.bootstrap.servers=localhost:36181, broker.id=0, remote.log.metadata.initialization.retry.interval.ms=300, remote.log.metadata.common.client.security.protocol=PLAINTEXT, cluster.id=9Ott-KxpTMqMRChhvk2mQg, log.dir=/tmp/kafka-3584143055344816110} (org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager:358)
[2023-12-06 05:14:34,970] INFO Successfully con
metaPropertiesEnsemble=MetaPropertiesEnsemble(metadataLogDir=Optional.empty, dirs={/tmp/kafka-10894890209183283184: EMPTY})
[2023-12-06 03:13:26,603] WARN maxCnxns is not configured, using default value 0. (org.apache.zookeeper.server.ServerCnxnFactory:309)
[2023-12-06 04:03:28,062] WARN Client session timed out, have not heard from server in 39992ms for session id 0x0 (org.apache.zookeeper.ClientCnxn:1256)
metaPropertiesEnsemble=MetaPropertiesEnsemble(metadataLogDir=Optional.empty, dirs={/tmp/kafka-11154148799385091495: EMPTY})
[2023-12-06 05:14:49,465] WARN [QuorumController id=1000] Performing controller activation. The metadata log appears to be empty. Appending 1 bootstrap record(s) in metadata transaction at metadata.version 3.7-IV3 from bootstrap source 'test harness'. Setting the ZK migration state to NONE since this is a de-novo KRaft cluster. (org.apache.kafka.controller.QuorumController:108)
[2023-12-06 05:14:53,973] WARN [QuorumController id=1000] Performing controller activation. Loaded ZK migration state of NONE. (org.apache.kafka.controller.QuorumController:204)
[2023-12-06 05:14:59,814] ERROR Unexpected error handling org.apache.kafka.server.AssignmentsManager$DispatchEvent@17fa1fff (org.apache.kafka.server.AssignmentsManager:117)
java.lang.IllegalStateException: Cannot enqueue a request if the request thread is not running
at kafka.server.NodeToControllerRequestThread.enqueue(NodeToControllerCh
[2023-12-06 05:14:50,514] WARN maxCnxns is not configured, using default value 0. (org.apache.zookeeper.server.ServerCnxnFactory:309)
[2023-12-06 05:15:08,116] WARN [RequestSendThread controllerId=0] Controller 0's connection to broker localhost:45709 (id: 1 rack: null) was unsuccessful (kafka.controller.RequestSendThread:72)
java.net.SocketTimeoutException: Failed to connect within 1500 ms
at kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:299)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:251)
at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
[2023-12-06 05:15:10,365] WARN [RequestSendThread controllerId=0] Controller 0's connection to broker localhost:45709 (id: 1 rack: null) was unsuccessful (kafka.controller.RequestSendThread:72)
java.net.SocketTimeoutException: Failed to connect within 1500 ms
at kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:299)
at kafka.controller.RequestS