@rygorous
Created August 16, 2019 22:37
pseudo-ucode expansion for LOOP <dest>
lea rcx, [rcx-1] ; decrement rcx w/o flag update
mov temp0, rax ; save rax that we're about to trash
lahf ; save original flags (architectural LAHF skips OF, which TEST clears; the internal uop presumably saves it too)
test rcx, rcx ; check whether updated rcx is zero
setz temp1 ; temp1 = 1 if rcx=0, 0 otherwise
sahf ; restore flags
mov rax, temp0 ; restore rax
jecxz temp1, dest ; jump if temp1 is zero, not rcx (doesn't exist in regular ISA but rcx is renamed anyway so the internal uop can do any source)
NOTE: the actual ucode expansion probably doesn't have the MOVs, since I would expect the internal LAHF/SAHF uops
to take explicit operands rather than going through rax, meaning the actual code is more like:
lea rcx, [rcx-1] ; decrement rcx w/o flag update
lahf temp0 ; save original flags to ucode temp
test rcx, rcx ; check whether updated rcx is zero
setz temp1 ; temp1 = 1 if rcx=0, 0 otherwise
sahf temp0 ; restore flags
jecxz temp1, dest ; jump if temp1 is zero, not rcx (doesn't exist in regular ISA but rcx is renamed anyway so the internal uop can do any source)
Also note that jecxz is itself expanded into two uops internally, and these would be "spelled out" in the actual ucode.
Anyway, the six lines of the shorter listing come to 7 uops (5 single-uop lines plus 2 for jecxz), which matches the uop count reported for LOOP on Skylake.
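One way to sanity-check the shorter listing is to step through it in a tiny software model and compare it against the documented behaviour of LOOP (decrement rcx, leave flags alone, jump if rcx is nonzero). The Python below is only a sketch under stated assumptions, not anything derived from real microcode: `State`, `loop_architectural` and `loop_expanded` are hypothetical names, flags are treated as an opaque integer in which `test` computes only ZF, and every line other than jecxz is assumed to cost one uop, as in the count above.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1
ZF = 1 << 6  # position of the zero flag in the flags register


@dataclass
class State:
    rcx: int
    flags: int  # opaque flags value; the model only checks that it survives


def loop_architectural(s):
    """Documented LOOP behaviour: rcx -= 1, flags untouched, jump if rcx != 0."""
    s.rcx = (s.rcx - 1) & MASK64
    return s.rcx != 0


def loop_expanded(s):
    """Step through the shorter pseudo-ucode listing.

    Returns (branch_taken, uop_count). The uop costs are assumptions
    that follow the counting in the text, not measured values.
    """
    uops = 0

    s.rcx = (s.rcx - 1) & MASK64        # lea rcx, [rcx-1]  (no flag update)
    uops += 1

    temp0 = s.flags                     # lahf temp0        (save flags)
    uops += 1

    s.flags = ZF if s.rcx == 0 else 0   # test rcx, rcx     (simplified: only ZF modelled,
    uops += 1                           #                    all other flag bits clobbered)

    temp1 = 1 if s.flags & ZF else 0    # setz temp1
    uops += 1

    s.flags = temp0                     # sahf temp0        (restore flags)
    uops += 1

    taken = temp1 == 0                  # jecxz temp1, dest (jump if temp1 is zero)
    uops += 2                           # counted as two uops, per the note above

    return taken, uops


if __name__ == "__main__":
    for rcx in (0, 1, 2, 1 << 63):
        for flags in (0x000, 0x8D5):
            a, b = State(rcx, flags), State(rcx, flags)
            taken_a = loop_architectural(a)
            taken_b, uops = loop_expanded(b)
            print(hex(rcx), hex(flags), taken_a == taken_b, a == b, uops)
```

Note that rcx = 0 on entry wraps to 2^64-1 and therefore takes the branch in both models, and that the model deliberately clobbers every flag in the `test` step so that a missing restore would show up as a mismatch.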