Inside cmp() the following code:
return std.mem.order(u8, word1, word2).compare(std.math.CompareOperator.lt);
is much faster than the following two alternatives (both of which compile to the same machine code as each other):
return std.mem.order(u8, word1, word2) == .lt;
return std.mem.lessThan(u8, word1, word2);
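
For anyone who wants to reproduce this, here is a minimal sketch of the three variants as standalone functions. The function names and the `(_: void, word1, word2)` signature are assumptions, since only the body of cmp() is shown above; the signature is meant to mirror the usual less-than callback shape passed to the standard library's sort routines. The test only checks that all three return the same results and says nothing about their relative speed.

```zig
const std = @import("std");

// Variant 1: std.math.Order.compare with an explicit CompareOperator.
// (Function name and signature are hypothetical, for illustration only.)
fn cmpCompare(_: void, word1: []const u8, word2: []const u8) bool {
    return std.mem.order(u8, word1, word2).compare(std.math.CompareOperator.lt);
}

// Variant 2: compare the returned Order directly against .lt.
fn cmpEquals(_: void, word1: []const u8, word2: []const u8) bool {
    return std.mem.order(u8, word1, word2) == .lt;
}

// Variant 3: the std.mem.lessThan helper.
fn cmpLessThan(_: void, word1: []const u8, word2: []const u8) bool {
    return std.mem.lessThan(u8, word1, word2);
}

test "all three variants return the same result" {
    // Strictly less: every variant reports true.
    try std.testing.expect(cmpCompare({}, "apple", "banana"));
    try std.testing.expect(cmpEquals({}, "apple", "banana"));
    try std.testing.expect(cmpLessThan({}, "apple", "banana"));

    // Equal or greater: every variant reports false.
    try std.testing.expect(!cmpCompare({}, "same", "same"));
    try std.testing.expect(!cmpEquals({}, "same", "same"));
    try std.testing.expect(!cmpLessThan({}, "same", "same"));
    try std.testing.expect(!cmpCompare({}, "banana", "apple"));
    try std.testing.expect(!cmpEquals({}, "banana", "apple"));
    try std.testing.expect(!cmpLessThan({}, "banana", "apple"));
}
```

Saving this to a file and running `zig test` on it should confirm the variants are interchangeable in behavior, so any timing difference comes from the generated code rather than from different results.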