@pelletier
Last active January 5, 2022 21:44
AVX2 right shift cross lanes fill with zeroes
// Go avo code that generates a cross-lane right shift of a 32-byte block with zero fill, for a byte offset that is known at generation time and less than 16.
Comment("Shift right that works if remaining bytes >= 16")
VMOVDQU(Mem{Base: d}, currentBlockY)
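// Build src1 = [high lane of currentBlockY, zero lane].
zeroes := YMM()
VPXOR(zeroes, zeroes, zeroes)
src1 := YMM()
VPERM2I128(Imm(3), currentBlockY, zeroes, src1)
// Per 128-bit lane, concatenate src1:currentBlockY and shift right by garbageSize bytes.
// ymm3 then holds the whole 32-byte block shifted right, with zeroes in the top bytes.
ymm3 := YMM()
VPALIGNR(Imm(garbageSize), currentBlockY, src1, ymm3)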
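
To see why the VPERM2I128 + VPALIGNR pair amounts to a 32-byte right shift with zero fill, here is a minimal pure-Go model of the two instructions. It is only a sketch for checking the lane logic, not part of the generator. The names `vec`, `highLaneToLow`, `palignr`, `shiftRight32` and `reference` are hypothetical helpers introduced for illustration, and the model assumes a shift of 0 to 15 bytes.

```go
package main

import "fmt"

// vec models a 256-bit YMM register as 32 bytes; index 0 is the lowest byte.
type vec [32]byte

// highLaneToLow models VPERM2I128 as used above (imm8 = 3, second source all zero):
// the low lane receives the high lane of block, the high lane is zero.
func highLaneToLow(block vec) vec {
	var out vec // high lane stays zero
	copy(out[:16], block[16:])
	return out
}

// palignr models VPALIGNR for 0 <= n <= 15: in each 128-bit lane it concatenates
// hi:lo, shifts the 32-byte value right by n bytes and keeps the low 16 bytes.
func palignr(hi, lo vec, n int) vec {
	var out vec
	for lane := 0; lane < 32; lane += 16 {
		var cat [32]byte
		copy(cat[:16], lo[lane:lane+16])
		copy(cat[16:], hi[lane:lane+16])
		copy(out[lane:lane+16], cat[n:n+16])
	}
	return out
}

// shiftRight32 combines the two steps the same way the avo snippet does.
func shiftRight32(block vec, n int) vec {
	return palignr(highLaneToLow(block), block, n)
}

// reference is the plain definition: drop the n lowest bytes, fill the top with zeroes.
func reference(block vec, n int) vec {
	var out vec
	copy(out[:], block[n:])
	return out
}

func main() {
	var block vec
	for i := range block {
		block[i] = byte(i + 1)
	}
	for n := 0; n < 16; n++ {
		if shiftRight32(block, n) != reference(block, n) {
			panic(fmt.Sprintf("mismatch for n=%d", n))
		}
	}
	fmt.Println(shiftRight32(block, 3))
}
```

In the low lane the bytes shifted in come from the high lane of the original block, and in the high lane they come from the zero lane, which is what makes the per-lane VPALIGNR behave like a single shift across both lanes.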