Created August 16, 2019 22:37
pseudo-ucode expansion for LOOP <dest>
lea   rcx, [rcx-1]   ; decrement rcx w/o flag update
mov   temp0, rax     ; save rax that we're about to trash
lahf                 ; save original flags
test  rcx, rcx       ; check whether updated rcx is zero
setz  temp1          ; temp1 = 1 if rcx=0, 0 otherwise
sahf                 ; restore flags
mov   rax, temp0     ; restore rax
jecxz temp1, dest    ; jump if temp1 is zero, not rcx (doesn't exist in regular ISA,
                     ; but rcx is renamed anyway so the internal uop can do any source)
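
A rough way to sanity-check the sequence above is to step through it in a small software model and compare against the architectural behavior of LOOP (decrement rcx, branch if the result is nonzero, leave flags and rax alone). The sketch below is only a model under stated assumptions: `loop_reference`, `loop_expanded` and `check_equivalence` are hypothetical names, `test` is simplified to produce only ZF, and only the five flags that architectural LAHF/SAHF cover (SF, ZF, AF, PF, CF) are tracked. How the internal uops treat OF is not something the note above states, so the model leaves it out.

```python
MASK64 = (1 << 64) - 1
ZF = 1 << 6
LAHF_MASK = 0xD5  # SF, ZF, AF, PF, CF: the flags architectural LAHF/SAHF cover


def loop_reference(rcx, rax, flags):
    """Architectural LOOP: decrement rcx, branch if nonzero, flags untouched."""
    rcx = (rcx - 1) & MASK64
    return rcx, rax, flags, rcx != 0


def loop_expanded(rcx, rax, flags):
    """Step through the eight pseudo-uops, one statement group per uop."""
    rcx = (rcx - 1) & MASK64                    # lea rcx, [rcx-1]
    temp0 = rax                                 # mov temp0, rax
    ah = (flags & LAHF_MASK) | 0x02             # lahf: bit 1 of AH always reads as 1
    rax = (rax & ~0xFF00 & MASK64) | (ah << 8)  #       AH is bits 8..15 of rax
    flags = ZF if rcx == 0 else 0               # test rcx, rcx (simplified: ZF only)
    temp1 = 1 if flags & ZF else 0              # setz temp1
    flags = (rax >> 8) & LAHF_MASK              # sahf
    rax = temp0                                 # mov rax, temp0
    taken = temp1 == 0                          # jecxz temp1, dest
    return rcx, rax, flags, taken


def check_equivalence():
    """Compare both models over a few rcx/rax values and all tracked flag combos."""
    for rcx in (0, 1, 2, 1 << 63):
        for rax in (0, 0x1234_5678_9ABC_DEF0):
            for raw in range(256):
                flags = raw & LAHF_MASK
                assert loop_expanded(rcx, rax, flags) == loop_reference(rcx, rax, flags)
    return True
```

Calling `check_equivalence()` raises an `AssertionError` if the expanded sequence ever disagrees with the reference on rcx, rax, flags, or the branch decision.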
NOTE: the actual ucode expansion probably doesn't have the MOVs, since I would expect the internal LAHF/SAHF uops
to have explicit sources, meaning the actual code is more like:
lea   rcx, [rcx-1]   ; decrement rcx w/o flag update
lahf  temp0          ; save original flags to ucode temp
test  rcx, rcx       ; check whether updated rcx is zero
setz  temp1          ; temp1 = 1 if rcx=0, 0 otherwise
sahf  temp0          ; restore flags
jecxz temp1, dest    ; jump if temp1 is zero, not rcx (doesn't exist in regular ISA,
                     ; but rcx is renamed anyway so the internal uop can do any source)
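
The "explicit sources" idea can be sketched by writing each uop as data with a named operand and running it against one register file that holds both architectural registers and ucode temps. Everything here is hypothetical: the opcode names, `SHORT_EXPANSION` and `run` are made up for illustration, `test` again only models ZF, and the flag save/restore is assumed to copy the whole flags value into the temp. The point is just that rax never appears as an operand, so no MOVs are needed.

```python
MASK64 = (1 << 64) - 1
ZF = 1 << 6

# Each hypothetical uop is (opcode, operand). Operands name entries in a single
# register file shared by architectural registers and ucode temps.
SHORT_EXPANSION = [
    ("dec_noflags", "rcx"),      # lea rcx, [rcx-1]
    ("save_flags", "temp0"),     # lahf temp0
    ("test", "rcx"),             # test rcx, rcx
    ("setz", "temp1"),           # setz temp1
    ("restore_flags", "temp0"),  # sahf temp0
    ("jump_if_zero", "temp1"),   # jecxz temp1, dest
]


def run(uops, regs):
    """Execute the uop list on a copy of regs; return (final regs, branch taken)."""
    regs = dict(regs, temp0=0, temp1=0)
    taken = False
    for op, r in uops:
        if op == "dec_noflags":
            regs[r] = (regs[r] - 1) & MASK64
        elif op == "save_flags":
            regs[r] = regs["flags"]          # assumption: saves every flag
        elif op == "test":
            regs["flags"] = ZF if regs[r] == 0 else 0
        elif op == "setz":
            regs[r] = 1 if regs["flags"] & ZF else 0
        elif op == "restore_flags":
            regs["flags"] = regs[r]
        elif op == "jump_if_zero":
            taken = regs[r] == 0
        else:
            raise ValueError(f"unknown uop: {op}")
    return regs, taken
```

For example, `run(SHORT_EXPANSION, {"rcx": 3, "rax": 99, "flags": 0x81})` leaves rax and flags as they were, sets rcx to 2, and reports the branch as taken.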
also note that jecxz is internally expanded into two uops, and these would be "spelled out" in the actual ucode.
anyway, that's 7 uops (5 single-uop steps plus 2 for jecxz), which matches the uop count reported for LOOP on Skylake.
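
The tally behind that number is simple enough to write down. This is just the arithmetic from the note above, assuming each step other than jecxz is a single uop; `UOPS_PER_STEP` and `total_uops` are hypothetical names.

```python
# uop cost per step of the shorter expansion; the single-uop entries are an
# assumption, the 2 for jecxz comes from the note about its internal expansion
UOPS_PER_STEP = {
    "lea rcx, [rcx-1]": 1,
    "lahf temp0": 1,
    "test rcx, rcx": 1,
    "setz temp1": 1,
    "sahf temp0": 1,
    "jecxz temp1, dest": 2,
}


def total_uops(steps=UOPS_PER_STEP):
    """Sum the per-step uop counts."""
    return sum(steps.values())
```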