@travisdowns
Created June 8, 2019 23:58
903:
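vmovdqa ymm0,YMMWORD PTR [rax+0x20] ; load the second 32 bytes of this 64-byte block
add rax,0x40 ; advance the pointer by 64 bytes
vmovdqa ymm3,YMMWORD PTR [rax-0x40] ; load the first 32 bytes (rax is already advanced)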
vperm2i128 ymm4,ymm3,ymm0,0x20 ; what
vperm2i128 ymm3,ymm3,ymm0,0x31 ; no
vpshufd ymm4,ymm4,0xd8 ; don't need
vpshufd ymm3,ymm3,0xd8 ; nope
vpunpcklqdq ymm7,ymm4,ymm3 ; I really don't understand
vpunpckhqdq ymm3,ymm4,ymm3 ; why it is doing this
vpsrld ymm4,ymm7,0x10 ; starting here it's fine
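vpand ymm7,ymm6,ymm7 ; ymm6 is set up outside this loop, presumably a low-16-bit mask
vpsrld ymm0,ymm3,0x10 ; high 16 bits of each dword
vpand ymm3,ymm6,ymm3 ; masked bits of each dword
vpaddd ymm3,ymm3,ymm7 ; add the two masked vectors
vpaddd ymm0,ymm0,ymm4 ; add the two shifted vectors
vpaddd ymm1,ymm1,ymm3 ; accumulate masked halves in ymm1
vpaddd ymm2,ymm2,ymm0 ; accumulate shifted halves in ymm2
cmp rcx,rax ; rcx is presumably the end pointer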
jne 903 <sum_halves_bench(unsigned long, void*)+0x53>
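
A scalar C model of the loop may help show why the shuffles look redundant. This is only a sketch under stated assumptions: `ymm6` is set outside the listing and is assumed here to hold 0x0000FFFF in every dword, the final horizontal reduction is not part of the listing and is assumed, and the names `halves_acc`, `halves_sum`, `sum_halves_shuffled` and `sum_halves_direct` are hypothetical. The net effect of the `vperm2i128` / `vpshufd` / `vpunpck{l,h}qdq` sequence is modeled as splitting the 16 loaded dwords into even-indexed and odd-indexed ones.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the value held in every dword of ymm6 (not shown in the listing). */
#define LOW16_MASK 0x0000FFFFu

/* Per-lane accumulators, standing in for ymm1 (masked halves) and ymm2 (shifted halves). */
typedef struct { uint32_t lo[8]; uint32_t hi[8]; } halves_acc;

/* Final totals after an assumed horizontal reduction of the accumulators. */
typedef struct { uint32_t lo; uint32_t hi; } halves_sum;

/* One loop iteration as listed: deinterleave 16 dwords, then mask/shift/add. */
static void step_shuffled(const uint32_t *p, halves_acc *acc)
{
    uint32_t even[8], odd[8];
    for (int i = 0; i < 8; i++) {
        even[i] = p[2 * i];     /* ymm7 after vpunpcklqdq */
        odd[i]  = p[2 * i + 1]; /* ymm3 after vpunpckhqdq */
    }
    for (int i = 0; i < 8; i++) {
        acc->lo[i] += (even[i] & LOW16_MASK) + (odd[i] & LOW16_MASK);
        acc->hi[i] += (even[i] >> 16) + (odd[i] >> 16);
    }
}

/* The same iteration with the two loads used as they arrive, no shuffles. */
static void step_direct(const uint32_t *p, halves_acc *acc)
{
    for (int i = 0; i < 8; i++) {
        uint32_t a = p[i];      /* dword i of the first 32-byte load  */
        uint32_t b = p[8 + i];  /* dword i of the second 32-byte load */
        acc->lo[i] += (a & LOW16_MASK) + (b & LOW16_MASK);
        acc->hi[i] += (a >> 16) + (b >> 16);
    }
}

/* Horizontal reduction of the eight lanes (assumed, not in the listing). */
static halves_sum reduce(const halves_acc *acc)
{
    halves_sum s = {0, 0};
    for (int i = 0; i < 8; i++) {
        s.lo += acc->lo[i];
        s.hi += acc->hi[i];
    }
    return s;
}

/* n_dwords is assumed to be a multiple of 16 (one 64-byte block per iteration). */
halves_sum sum_halves_shuffled(const uint32_t *data, size_t n_dwords)
{
    halves_acc acc = {{0}, {0}};
    for (size_t i = 0; i + 16 <= n_dwords; i += 16)
        step_shuffled(data + i, &acc);
    return reduce(&acc);
}

halves_sum sum_halves_direct(const uint32_t *data, size_t n_dwords)
{
    halves_acc acc = {{0}, {0}};
    for (size_t i = 0; i + 16 <= n_dwords; i += 16)
        step_direct(data + i, &acc);
    return reduce(&acc);
}
```

The per-lane contents of the accumulators differ between the two versions, but every dword is masked, shifted and added exactly once in both. Unsigned 32-bit addition wraps and is commutative, so the reduced totals agree for any input, which is consistent with the comments in the listing that the shuffles are not needed.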