Ibex implements Latin squares in a way that can have serious unintended consequences: it can produce spurious effects.
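To make the mechanism concrete, here is a minimal simulation sketch. It is a toy model, not Ibex's actual code: it assumes each participant gets list `counter % n_lists`, that the counter advances when a participant submits, and that everyone needs the same fixed time to finish. The function name `assign_lists`, the arrival times, and the 20-minute duration are all made up for illustration, and the "start" mode is only a hypothetical contrast, not an existing Ibex option.

```python
from collections import Counter


def assign_lists(arrivals, duration, n_lists, increment_on):
    """Return the list index assigned to each participant (toy model).

    arrivals     -- sorted arrival times in minutes (illustrative values)
    duration     -- assumed fixed time each participant needs to finish
    n_lists      -- number of lists in the Latin square
    increment_on -- "submit": counter advances when results are submitted
                    "start":  counter advances when a participant arrives
                              (hypothetical contrast case)
    """
    assigned = []
    for i, t in enumerate(arrivals):
        if increment_on == "start":
            # Every earlier arrival has already advanced the counter.
            counter = i
        else:
            # Only participants who have already finished have advanced it.
            counter = sum(1 for s in arrivals[:i] if s + duration <= t)
        assigned.append(counter % n_lists)
    return assigned


# An initial burst of eight workers, then increasingly sparse arrivals.
arrivals = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 10.0, 25.0, 46.0, 70.0]

by_submit = Counter(assign_lists(arrivals, 20.0, 4, "submit"))
by_start = Counter(assign_lists(arrivals, 20.0, 4, "start"))

print("counter advances on submit:", dict(by_submit))
print("counter advances on start: ", dict(by_start))
```

In this toy run, 10 of the 12 participants end up on list 0 when the counter advances on submission, whereas advancing it on arrival gives each of the four lists exactly 3 participants.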
The problem: When you submit an experiment to Amazon Mechanical Turk, many workers jump at it immediately, but the rate of participation quickly drops off (the distribution over time often looks like an exponential decay). For every participant, Ibex selects a stimulus list based on an internal counter, and this counter is incremented only when a participant submits their results. Unfortunately, this means that the participants in the initial wave, who all start before anyone has submitted, all work on the same list of the Latin square, and this list will therefore be strongly overrepresented. This can lead to strong spurious effects that are due not to the experimental manipulation but to between-item differences. The problem is easy to miss, and I would not be surprised if some published results obtained with Ibex were false because of this prob