@emoon
Created June 15, 2012 15:40
selb in odd pipe
Challenge:
Implement selb using only odd-pipeline instructions on the SPU.
----------------------------------------------------------------------------------
input : mask (comes from a floating-point compare, so each of the four 32-bit words is either all zeros or all ones)
a, b to select between
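
For reference, the behaviour being replaced can be sketched in Python as a plain bitwise select: bits of b where the mask is 1, bits of a where it is 0. This is only a model of the semantics, not SPU code; the function name and the 16-byte test values are made up for illustration.

```python
def selb(a: bytes, b: bytes, mask: bytes) -> bytes:
    """Bitwise select: bits of b where mask is 1, bits of a where mask is 0."""
    return bytes((x & ~m & 0xFF) | (y & m) for x, y, m in zip(a, b, mask))


# Illustrative inputs: a holds bytes 00..0F, b holds bytes 10..1F.
a = bytes(range(0x00, 0x10))
b = bytes(range(0x10, 0x20))

# A compare-style mask: words 0 and 3 all ones, words 1 and 2 all zeros.
mask = bytes.fromhex("FFFFFFFF 00000000 00000000 FFFFFFFF")

print(selb(a, b, mask).hex())
```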
----------------------------------------------------------------------------------
Suggestion 1 by @daniel_collin
----------------------------------------------------------------------------------
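// 18 cycles latency (4 + 4 + 6 + 4)
gb t, mask // 4: gather one bit per word into a 4-bit index
rotqbii offset, t, 4 // 4: index * 16 = byte offset of the table entry
lqx shufb_mask, offset, shuffle_table // 6: load the precomputed shufb mask
shufb res, a, b, shufb_mask // 4: pick each word from a or b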
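
The gist does not show the contents of shuffle_table, so the following Python sketch is only a guess at how it could be laid out: 16 entries of 16 bytes, one per possible 4-bit result of gb. It assumes gb puts word 0's bit in the most significant position of the 4-bit value, and that shufb control bytes 00..0F pick bytes from a and 10..1F pick bytes from b. All function names are hypothetical.

```python
def build_shuffle_table() -> list:
    """Hypothetical layout: entry t is the shufb mask for gathered bits t."""
    table = []
    for t in range(16):
        entry = bytearray()
        for word in range(4):
            take_b = (t >> (3 - word)) & 1      # assumed: word 0 is the top bit
            base = 0x10 if take_b else 0x00     # 1x selects from b, 0x from a
            entry += bytes(base + 4 * word + i for i in range(4))
        table.append(bytes(entry))
    return table


def gb(mask: bytes) -> int:
    """Model of gb: gather the lowest bit of each 32-bit word into 4 bits."""
    t = 0
    for word in range(4):
        t = (t << 1) | (mask[4 * word + 3] & 1)
    return t


def shufb(a: bytes, b: bytes, control: bytes) -> bytes:
    """Model of shufb for ordinary control bytes only (00..1F)."""
    src = a + b
    return bytes(src[c & 0x1F] for c in control)


def selb_via_table(a: bytes, b: bytes, mask: bytes, table: list) -> bytes:
    t = gb(mask)                        # gb t, mask
    offset = t << 4                     # rotqbii offset, t, 4
    shufb_mask = table[offset // 16]    # lqx shufb_mask, offset, shuffle_table
    return shufb(a, b, shufb_mask)      # shufb res, a, b, shufb_mask
```

With this layout the table costs 256 bytes of local store, which is the price paid for the short instruction sequence.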
----------------------------------------------------------------------------------
Suggestion 2 by @postgoodism
----------------------------------------------------------------------------------
SPU selb using only odd instructions
(with details elided because I'm only half-awake)
Given a selb mask:
v1 = FF00FFFF 0000FFFF 00FF0000 FFFFFFFF
SHUFB a qword of zeros, using v1 as the shuffle mask. A control byte of FF produces the constant 80, and a control byte of 00 selects byte 0 of the zero qword:
v2 = 80008080 00008080 00800000 80808080
Shift v2 to the right by 7 bits with a ROTQMBII (ROTQMBYBI shifts by whole bytes, so it is not the one wanted here):
v3 = 01000101 00000101 00010000 01010101
Broadcast the bytes of v3 into two new qwords v4 and v5 using SHUFB, interleaved with bytes from the following constant k1:
k1 = 00102030 40506070 8090A0B0 C0D0E0F0
v4 = 01000010 01200130 00400050 01600170
v5 = 00800190 00A000B0 01C001D0 01E001F0
Shift v4 and v5 right by 4 bits using ROTQMBII to create v6 and v7:
v6 = 00100001 00120013 00040005 00160017
v7 = 00080019 000A000B 001C001D 001E001F
Re-combine v6 and v7 into v8 using shufb, taking only the low byte of each halfword (the odd-numbered bytes, counting from 0) from each:
v8 = 10011213 04051617 08190A0B 1C1D1E1F
v8 is a shufb mask that replicates the original selb mask: bytes of the form 0x select from a, bytes of the form 1x select from b.
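
The steps above can be checked with a small Python emulation. This is a sketch, not SPU code: shufb is modelled with its three special control-byte classes, the right shifts are done on a 128-bit integer, and the three shufb control patterns (lo, hi, pick) are my own reconstruction, since the notes elide them. It assumes every mask byte is either 00 or FF.

```python
def shufb(a: bytes, b: bytes, control: bytes) -> bytes:
    """Model of shufb, including the special control-byte classes."""
    src = a + b
    out = bytearray()
    for c in control:
        if c & 0xC0 == 0x80:        # 10xxxxxx -> constant 00
            out.append(0x00)
        elif c & 0xE0 == 0xC0:      # 110xxxxx -> constant FF
            out.append(0xFF)
        elif c & 0xE0 == 0xE0:      # 111xxxxx -> constant 80
            out.append(0x80)
        else:                       # ordinary: byte (c & 1F) of a followed by b
            out.append(src[c & 0x1F])
    return bytes(out)


def shift_right_bits(q: bytes, n: int) -> bytes:
    """Logical right shift of a 16-byte quadword by n bits."""
    return (int.from_bytes(q, "big") >> n).to_bytes(16, "big")


def selb_mask_to_shufb_mask(v1: bytes) -> bytes:
    zeros = bytes(16)
    k1 = bytes.fromhex("00102030 40506070 8090A0B0 C0D0E0F0")
    # Assumed control patterns: interleave v3[i] with k1[i].
    lo = bytes(x for i in range(0, 8) for x in (i, 0x10 + i))
    hi = bytes(x for i in range(8, 16) for x in (i, 0x10 + i))
    # Assumed control pattern: low byte of every halfword of v6, then of v7.
    pick = bytes([2 * i + 1 for i in range(8)] +
                 [0x10 + 2 * i + 1 for i in range(8)])

    v2 = shufb(zeros, zeros, v1)    # FF -> 80, 00 -> 00
    v3 = shift_right_bits(v2, 7)    # 80 -> 01
    v4 = shufb(v3, k1, lo)
    v5 = shufb(v3, k1, hi)
    v6 = shift_right_bits(v4, 4)
    v7 = shift_right_bits(v5, 4)
    return shufb(v6, v7, pick)      # v8


v1 = bytes.fromhex("FF00FFFF 0000FFFF 00FF0000 FFFFFFFF")
v8 = selb_mask_to_shufb_mask(v1)
print(v8.hex())
```

Counting the instructions in this model gives five shufb and three shifts, all on the odd pipe, with no table load from local store beyond the constants.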