The following sorting network for 29 inputs has 165 compare-exchange units, which is one fewer than the smallest 29-input sorting networks that I could find in the literature. Here is how I generated it: first it sorts the first 16 inputs and the last 13 inputs independently. Then it merges the two sorted subarrays using a size 32 Batcher odd-even merge network (the version that does not need the inputs to be interleaved), from which every compare-exchange unit touching an index greater than 28 has been dropped. Dropping comparators this way is safe: imagine that the values at the indexes [29, 32) are greater than every other value in the array to sort. Every dropped comparator then has its higher index in that range, so it would never swap anything, and removing it leaves a correct merging network of a smaller size.
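The merging step described above can be sketched as follows. This is a minimal illustration and not the code actually used to generate the network: it assumes the usual recursive formulation of Batcher's odd-even merge for two contiguous sorted halves, and the names `oddeven_merge`, `truncated_merge`, `apply_network` and `MERGE_29` are hypothetical helpers introduced for the example.

```python
def oddeven_merge(lo, hi, r=1):
    """Yield comparators (i, j) with i < j that merge the two sorted halves
    of the inclusive index range [lo, hi], whose length is a power of two."""
    step = 2 * r
    if step < hi - lo:
        # Merge the even-indexed and odd-indexed subsequences recursively...
        yield from oddeven_merge(lo, hi, step)
        yield from oddeven_merge(lo + r, hi, step)
        # ...then fix up neighbouring elements of the two merged subsequences.
        for i in range(lo + r, hi - r, step):
            yield (i, i + r)
    else:
        yield (lo, lo + r)


def truncated_merge(size, n):
    """Keep only the comparators of the size-`size` merge whose indexes are
    all below n (assumption: the dropped indexes behave like +infinity)."""
    return [(i, j) for i, j in oddeven_merge(0, size - 1) if j < n]


def apply_network(network, values):
    """Run the comparators in order, moving the smaller value to index i."""
    out = list(values)
    for i, j in network:
        if out[j] < out[i]:
            out[i], out[j] = out[j], out[i]
    return out


# Merge network for a sorted run of 16 followed by a sorted run of 13.
MERGE_29 = truncated_merge(32, 29)
```

In this sketch the full size 32 merge has 65 comparators and the truncated one keeps 60 of them. Since the two inputs of a merge are already sorted, checking every pair of sorted 0/1 runs (17 × 14 cases) is enough to validate the truncated network.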
That said, even though I have been unable to find a 29-input sorting network with as few as 165 compare-exchange units in the literature, I can't claim that I found the technique used to generate it: the unclassif