@amoeba
Created March 21, 2024 04:48

From main:

Benchmark Time CPU Iterations items_per_second
AllocateDeallocate/size:4096/real_time/threads:1 40.6 ns 40.5 ns 17256246 24.6592M/s
AllocateDeallocate/size:4096/real_time/threads:2 37.1 ns 74.3 ns 18816032 26.9313M/s
AllocateDeallocate/size:4096/real_time/threads:4 68.5 ns 274 ns 10096888 14.5911M/s
AllocateDeallocate/size:4096/real_time/threads:8 85.2 ns 682 ns 8969680 11.733M/s
AllocateDeallocate/size:4096/real_time/threads:16 64.1 ns 1022 ns 10904176 15.5995M/s
AllocateDeallocate/size:4096/real_time/threads:32 57.1 ns 1033 ns 13711680 17.5085M/s
AllocateDeallocate/size:65536/real_time/threads:1 45.0 ns 45.0 ns 15550007 22.2443M/s
AllocateDeallocate/size:65536/real_time/threads:2 52.1 ns 104 ns 18439302 19.1794M/s
AllocateDeallocate/size:65536/real_time/threads:4 76.0 ns 304 ns 9022692 13.1598M/s
AllocateDeallocate/size:65536/real_time/threads:8 87.0 ns 696 ns 9883360 11.4977M/s
AllocateDeallocate/size:65536/real_time/threads:16 61.2 ns 978 ns 11521504 16.3496M/s
AllocateDeallocate/size:65536/real_time/threads:32 52.3 ns 983 ns 13423712 19.1278M/s
AllocateDeallocate/size:1048576/real_time/threads:1 46.2 ns 46.2 ns 15085182 21.6221M/s
AllocateDeallocate/size:1048576/real_time/threads:2 48.8 ns 97.6 ns 15959920 20.4852M/s
AllocateDeallocate/size:1048576/real_time/threads:4 72.6 ns 290 ns 8770500 13.7701M/s
AllocateDeallocate/size:1048576/real_time/threads:8 82.6 ns 661 ns 9626720 12.1014M/s
AllocateDeallocate/size:1048576/real_time/threads:16 60.6 ns 970 ns 11545920 16.489M/s
AllocateDeallocate/size:1048576/real_time/threads:32 55.2 ns 976 ns 13927584 18.1264M/s
AllocateDeallocate/size:16777216/real_time/threads:1 47.3 ns 47.3 ns 14798147 21.1493M/s
AllocateDeallocate/size:16777216/real_time/threads:2 53.0 ns 106 ns 12193390 18.8762M/s
AllocateDeallocate/size:16777216/real_time/threads:4 73.1 ns 292 ns 8888352 13.673M/s
AllocateDeallocate/size:16777216/real_time/threads:8 83.5 ns 667 ns 8606728 11.9754M/s
AllocateDeallocate/size:16777216/real_time/threads:16 61.3 ns 978 ns 11482768 16.3183M/s
AllocateDeallocate/size:16777216/real_time/threads:32 53.5 ns 965 ns 14079456 18.6938M/s

After commenting out the body of UpdateAllocatedBytes:

Benchmark Time CPU Iterations items_per_second
AllocateDeallocate/size:4096/real_time/threads:1 38.1 ns 38.1 ns 18245705 26.2773M/s
AllocateDeallocate/size:4096/real_time/threads:2 21.1 ns 42.2 ns 33524684 47.439M/s
AllocateDeallocate/size:4096/real_time/threads:4 11.2 ns 44.6 ns 64754672 89.6438M/s
AllocateDeallocate/size:4096/real_time/threads:8 6.32 ns 50.5 ns 80000000 158.29M/s
AllocateDeallocate/size:4096/real_time/threads:16 5.92 ns 94.7 ns 120264464 168.939M/s
AllocateDeallocate/size:4096/real_time/threads:32 5.29 ns 93.8 ns 157807072 189.189M/s
AllocateDeallocate/size:65536/real_time/threads:1 42.6 ns 42.6 ns 16384852 23.4741M/s
AllocateDeallocate/size:65536/real_time/threads:2 22.2 ns 44.4 ns 31642666 45.0489M/s
AllocateDeallocate/size:65536/real_time/threads:4 11.4 ns 45.6 ns 61972996 87.7315M/s
AllocateDeallocate/size:65536/real_time/threads:8 6.44 ns 51.5 ns 103990392 155.353M/s
AllocateDeallocate/size:65536/real_time/threads:16 6.15 ns 98.2 ns 112948512 162.575M/s
AllocateDeallocate/size:65536/real_time/threads:32 5.65 ns 98.2 ns 153753472 176.919M/s
AllocateDeallocate/size:1048576/real_time/threads:1 42.5 ns 42.5 ns 16411385 23.5441M/s
AllocateDeallocate/size:1048576/real_time/threads:2 22.2 ns 44.4 ns 31565932 45.0146M/s
AllocateDeallocate/size:1048576/real_time/threads:4 11.5 ns 45.9 ns 61519376 87.1179M/s
AllocateDeallocate/size:1048576/real_time/threads:8 6.44 ns 51.5 ns 100738824 155.244M/s
AllocateDeallocate/size:1048576/real_time/threads:16 6.20 ns 99.2 ns 114865280 161.251M/s
AllocateDeallocate/size:1048576/real_time/threads:32 5.52 ns 97.8 ns 139016768 181.033M/s
AllocateDeallocate/size:16777216/real_time/threads:1 43.4 ns 43.4 ns 16132988 23.0522M/s
AllocateDeallocate/size:16777216/real_time/threads:2 22.2 ns 44.4 ns 31198652 45.0013M/s
AllocateDeallocate/size:16777216/real_time/threads:4 14.4 ns 57.7 ns 56729008 69.2928M/s
AllocateDeallocate/size:16777216/real_time/threads:8 7.73 ns 61.8 ns 98822728 129.359M/s
AllocateDeallocate/size:16777216/real_time/threads:16 6.24 ns 99.5 ns 111817376 160.287M/s
AllocateDeallocate/size:16777216/real_time/threads:32 5.13 ns 97.6 ns 140379712 194.757M/s
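
The gap between the two runs above is what one would expect if every allocation and deallocation updates counters shared by all threads. The following is a minimal sketch of that pattern, not Arrow's actual code: the struct `SharedStats` and its fields are hypothetical, and only the method name `UpdateAllocatedBytes` comes from the note above. It assumes the statistics are kept in atomics and that a high-water mark is maintained with a compare-exchange loop.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a memory pool's statistics (an assumption,
// not Arrow's implementation).
struct SharedStats {
  std::atomic<int64_t> bytes_allocated{0};
  std::atomic<int64_t> max_memory{0};

  // Called with diff > 0 on allocation and diff < 0 on deallocation.
  void UpdateAllocatedBytes(int64_t diff) {
    // Every thread performs a read-modify-write on the same counter,
    // so the cache line holding it bounces between cores.
    int64_t now = bytes_allocated.fetch_add(diff) + diff;

    // Raise the high-water mark if this update exceeded it. On failure,
    // compare_exchange_weak reloads `seen`, and the loop retries.
    int64_t seen = max_memory.load();
    while (now > seen && !max_memory.compare_exchange_weak(seen, now)) {
    }
  }
};
```

With one thread these operations are cheap, which matches the nearly identical `threads:1` rows in both tables. With many threads calling `UpdateAllocatedBytes` in a tight loop, the shared atomics can serialize the threads, which would explain why commenting out the body restores scaling.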

Built from just before a recent change in Arrow 13, https://github.com/apache/arrow/commit/ddfa8eed9b188fcc7b38767d1858c2588c588f05#diff-2111aac8ee579e238fb15d4380f2ea1e2f3e2830da939ec3f07c61dc68038d1f. The results fall between the two runs above:

Benchmark Time CPU Iterations items_per_second
AllocateDeallocate/size:4096/real_time/threads:1 39.8 ns 39.8 ns 17704095 25.1397M/s
AllocateDeallocate/size:4096/real_time/threads:2 33.0 ns 66.0 ns 21156262 30.2882M/s
AllocateDeallocate/size:4096/real_time/threads:4 49.3 ns 197 ns 14550456 20.2986M/s
AllocateDeallocate/size:4096/real_time/threads:8 53.1 ns 425 ns 13473320 18.8344M/s
AllocateDeallocate/size:4096/real_time/threads:16 29.1 ns 462 ns 23901536 34.4059M/s
AllocateDeallocate/size:4096/real_time/threads:32 26.5 ns 463 ns 31765728 37.7303M/s
AllocateDeallocate/size:65536/real_time/threads:1 44.0 ns 44.0 ns 15886376 22.7218M/s
AllocateDeallocate/size:65536/real_time/threads:2 35.6 ns 71.3 ns 17757214 28.0662M/s
AllocateDeallocate/size:65536/real_time/threads:4 38.0 ns 152 ns 19507004 26.2896M/s
AllocateDeallocate/size:65536/real_time/threads:8 36.9 ns 295 ns 19788320 27.098M/s
AllocateDeallocate/size:65536/real_time/threads:16 29.8 ns 477 ns 23458016 33.5527M/s
AllocateDeallocate/size:65536/real_time/threads:32 27.6 ns 482 ns 32773568 36.2807M/s
AllocateDeallocate/size:1048576/real_time/threads:1 45.3 ns 45.3 ns 15442833 22.0897M/s
AllocateDeallocate/size:1048576/real_time/threads:2 41.7 ns 83.4 ns 17880106 23.9737M/s
AllocateDeallocate/size:1048576/real_time/threads:4 32.6 ns 131 ns 18217508 30.6478M/s
AllocateDeallocate/size:1048576/real_time/threads:8 36.3 ns 290 ns 19272912 27.5667M/s
AllocateDeallocate/size:1048576/real_time/threads:16 30.1 ns 481 ns 23301200 33.2427M/s
AllocateDeallocate/size:1048576/real_time/threads:32 27.8 ns 489 ns 31171680 35.9621M/s
AllocateDeallocate/size:16777216/real_time/threads:1 45.6 ns 45.6 ns 15317019 21.9423M/s
AllocateDeallocate/size:16777216/real_time/threads:2 44.7 ns 89.4 ns 16540960 22.3799M/s
AllocateDeallocate/size:16777216/real_time/threads:4 37.8 ns 151 ns 19134904 26.4453M/s
AllocateDeallocate/size:16777216/real_time/threads:8 36.4 ns 291 ns 19363048 27.4389M/s
AllocateDeallocate/size:16777216/real_time/threads:16 30.0 ns 479 ns 23268240 33.3274M/s
AllocateDeallocate/size:16777216/real_time/threads:32 27.8 ns 486 ns 30116928 35.9979M/s
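
One way to compare the three runs is a scaling efficiency per thread count. The sketch below assumes that the Time column is wall time per iteration amortized across threads, which the tables suggest since CPU is roughly Time multiplied by the thread count. The function name `ScalingEfficiency` is illustrative only. Under that assumption, perfect scaling gives a Time of t1 / N at N threads, so the efficiency is t1 / (N * tN).

```cpp
// Scaling efficiency at `threads` threads, given the single-thread time
// `t1_ns` and the N-thread time `tn_ns` (both from the Time column).
// 1.0 means perfect scaling; values near 1/threads mean no scaling at all.
// Assumes Time is amortized across threads (see the lead-in).
double ScalingEfficiency(double t1_ns, double tn_ns, int threads) {
  return t1_ns / (static_cast<double>(threads) * tn_ns);
}
```

For example, the `size:4096`, `threads:8` rows would be evaluated as `ScalingEfficiency(40.6, 85.2, 8)` for main, `ScalingEfficiency(38.1, 6.32, 8)` with the body of UpdateAllocatedBytes commented out, and `ScalingEfficiency(39.8, 53.1, 8)` for the build from before the Arrow 13 change.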