Here are detailed 64-bit pyperformance results on my Windows 10 PC (dusty i5-4570 CPU), run with `--fast --affinity 0`
for commit https://github.com/python/cpython/pull/129907/commits/9db1a297d95574cc3114854c8c427736d9521917, while working on
python/cpython#130090: Support PGO for clang-cl
Compilers used:
- Microsoft Visual Studio 2022 17.13.0 Preview 5.0, which can do PGO again (python/cpython#129244)
- which ships with clang-cl 19.1.1
- for the clang-cl 18.1.8 builds I manually added
  `<AdditionalOptions Condition="'$(Platform)' == 'x64' and $(PlatformToolset) == 'ClangCL'">/arch:AVX</AdditionalOptions>`,
  hence "dirty". See also python/cpython#130213
- `cg` marks computed gotos builds, where I manually added `HAVE_COMPUTED_GOTOS` to `PreprocessorDefinitions`, hence "dirty"
- `tc` marks tail call builds, where I manually added `Py_TAIL_CALL_INTERP=1` to `PreprocessorDefinitions`, hence "dirty"
- fixed in 20.x, backport not yet: llvm/llvm-project#130585
Raw data is here: https://gist.github.com/chris-eibl/c73b02762a7c467e9a410a0aa19c7701
Interestingly, clang 18.1.8 is faster than 19.1.1; for the computed gotos builds in particular, this is a known issue.
Unsurprisingly, with "working" computed gotos the regex benchmarks are faster too, because the regex engine uses computed gotos as well.
So I propose to add `--with-computed-gotos` to `build.bat`, like `--tail-call-interp` (python/cpython#130040), since the latter won't speed up regex (yet).
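
For context, here is a minimal sketch of what "computed gotos" means for an interpreter loop. It is not CPython's actual code: the three opcodes, the `run` function and the `SKETCH_COMPUTED_GOTOS` macro are made up for illustration. It only assumes that the compiler supports the GNU C "labels as values" extension when the macro is defined.

```c
#include <stddef.h>

/* Hypothetical three-instruction bytecode, for illustration only. */
enum { OP_INC = 0, OP_DOUBLE = 1, OP_HALT = 2 };

/* Runs `code` starting from `acc`. `code` must end with OP_HALT and
   contain only the opcodes above (the computed-goto path does no
   bounds check on the opcode). */
long run(const unsigned char *code, long acc)
{
    size_t pc = 0;

#if defined(SKETCH_COMPUTED_GOTOS) && (defined(__GNUC__) || defined(__clang__))
    /* Labels as values (GNU C extension): each handler jumps straight
       to the next handler through its own indirect branch. */
    static void *const targets[] = { &&op_inc, &&op_double, &&op_halt };
#define DISPATCH() goto *targets[code[pc++]]
    DISPATCH();
op_inc:
    acc += 1;
    DISPATCH();
op_double:
    acc *= 2;
    DISPATCH();
op_halt:
    return acc;
#undef DISPATCH
#else
    /* Portable fallback: every opcode goes through one shared switch. */
    for (;;) {
        switch (code[pc++]) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return acc;  /* unknown opcode: stop */
        }
    }
#endif
}
```

Both paths compute the same result; for the program `{ OP_INC, OP_INC, OP_DOUBLE, OP_HALT }` starting at 0, `run` returns 4. The only difference is how the next handler is reached, which is the part that a flag like the proposed `--with-computed-gotos` would toggle at build time.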