@Ravenslofty
Last active February 11, 2020 18:25

This is kind of a long and rambling post, but I hope it makes some sense.

First off, a data point that caused this whole ramble. The following graph comes from 100 runs of Yosys and nextpnr on picorv32_large.v from yosys-tests, targeting an iCE40HX8K.

[Figure: distribution of post-PnR Fmax over 100 runs of picorv32_large.v on iCE40HX8K, ABC1 vs ABC9]

Obviously, ABC1 resulting in a higher average Fmax than ABC9 is concerning, given that the point of ABC9 is to improve synthesis quality compared to ABC1.

But this got me wondering: how do we actually quantify that a change produces a better maximum frequency?

My background is mostly computer chess, which has internally struggled with a similar question over the past two decades, and has found a solution in the form of the sequential probability-ratio test (SPRT).

SPRT requires two hypotheses to test: for chess, these are the null hypothesis, which states that a change has a win rate of p0, and the alternate hypothesis, which states that a change has a win rate of p1, where p1 > p0. Generally, p0 is a win rate at which we would reject the change, and p1 is a win rate at which we would accept it. If you are aiming to accept only changes that improve win rate, you would set p0 to about 50% (to reject changes which do not improve strength) and p1 to, say, 52% (to accept changes which do improve strength).

SPRT then requires two error rates: alpha, the false-positive rate (i.e. the chance of accepting a change that is not an improvement), and beta, the false-negative rate (i.e. the chance of rejecting a change that is an improvement). Note that reducing the error rates requires correspondingly more evidence and thus takes longer on average (e.g. halving the error rates requires approximately double the number of samples).

SPRT then runs sequentially, taking the results of chess games played from a list of predefined starting positions, until it either accepts the null hypothesis or the alternate hypothesis, with the error probabilities bounded by alpha and beta. Computer chess generally uses alpha = beta = 0.05, i.e. a 5% chance of each kind of error. Wikipedia explains the basic algorithm fairly well, so I'll leave the implementation details to them.
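
To make that concrete, here is a minimal sketch of the binomial (win/loss) form of Wald's SPRT in Python. It is not exactly what the chess testing frameworks use (they typically work in Elo units and model draws separately), but the structure is the same: each trial nudges a log-likelihood ratio up or down, and the test stops once that ratio crosses one of two bounds derived from alpha and beta. The class and method names here are just for illustration.

```python
import math

class SPRT:
    """Wald's sequential probability ratio test for win/loss trials.

    H0: the change wins with probability p0 (not an improvement).
    H1: the change wins with probability p1 (an improvement), with p1 > p0.
    """

    def __init__(self, p0=0.50, p1=0.52, alpha=0.05, beta=0.05):
        # Wald's decision bounds: accept H0 at or below `lower`,
        # accept H1 at or above `upper`.
        self.lower = math.log(beta / (1.0 - alpha))
        self.upper = math.log((1.0 - beta) / alpha)
        # Per-trial increments to the log-likelihood ratio.
        self.win_step = math.log(p1 / p0)
        self.loss_step = math.log((1.0 - p1) / (1.0 - p0))
        self.llr = 0.0

    def record(self, win):
        """Record one trial (True if the change won) and return a decision:
        'H1' to accept the change, 'H0' to reject it, or None to keep testing."""
        self.llr += self.win_step if win else self.loss_step
        if self.llr >= self.upper:
            return 'H1'
        if self.llr <= self.lower:
            return 'H0'
        return None
```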

So, how could this be applied to synthesis?

First, we would need a testing criterion. You could award the win to whichever change produces the smaller netlist, or to whichever produces the lower maximum delay.

But my personal belief is that metrics based on synthesis alone are flawed. Consider the "lowest resulting netlist size" criterion: it would encourage changes that maximise LUT input usage, which might be a good idea on LUT4 architectures such as ECP5, but could result in poor fracturable-LUT usage on LUT6 architectures like modern Xilinx and Intel chips. Likewise, minimum delay from a synthesis-only point of view would have to assume a fixed interconnect delay and no routing congestion, which is not representative of real designs.

As a result, I think post-PnR maximum frequency (the inverse of maximum delay) is probably the best metric, with the "win" going to whichever change achieves the higher Fmax. This would incorporate factors like netlist routability into the final number.
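
To sketch what that "win" predicate might look like: run the baseline flow and the changed flow on the same design (and the same PnR seed), then compare the Fmax that nextpnr reports. The run_flow helper below is hypothetical, and the log-parsing pattern is an assumption about nextpnr's output format; both would need adjusting to a real flow.

```python
import re

def post_pnr_fmax(log_text):
    """Extract the worst (lowest) reported clock Fmax from a nextpnr log.

    Assumption: the log contains lines like
    'Max frequency for clock ...: NN.NN MHz' -- adjust the pattern to
    whatever your nextpnr version actually prints."""
    freqs = [float(f) for f in
             re.findall(r"Max frequency for clock[^:]*: ([0-9.]+) MHz", log_text)]
    return min(freqs) if freqs else None

def run_flow(design, seed, changed):
    """Hypothetical helper: run Yosys and nextpnr on `design` with the given
    PnR seed, with or without the change under test, and return the nextpnr
    log as a string. The real implementation depends on the flow being tested."""
    raise NotImplementedError

def change_wins(design, seed):
    """Score one trial: True if the changed flow achieves the higher Fmax."""
    baseline = post_pnr_fmax(run_flow(design, seed, changed=False))
    candidate = post_pnr_fmax(run_flow(design, seed, changed=True))
    return candidate > baseline
```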

awygle raised the point of wanting to verify that -nowidelut actually results in lower area (as that is its intended tradeoff), so there is also the question of whether post-PnR area should be tested as a criterion.

Additionally, runtime could be measured; since an easier-to-route design implies lower router runtime, this would also serve as a measure of relative routability.

There will be some inherent nondeterminism due to PnR randomness, but the bigger risk is that testing against a single design would cause changes to be fitted to producing the best result for that one design; a wide variety of input designs should therefore be used as test cases.

So, for synthesis, the null hypothesis is that the change produces a better Fmax with probability p0, and the alternate hypothesis is that it produces a better Fmax with probability p1; SPRT can then be run on per-design comparisons to decide whether to accept or reject the change.
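
Putting the pieces together, a test harness might look something like the sketch below: draw a design and a PnR seed at random, score a win or loss for the change, and feed the result into the SPRT object sketched earlier until it reaches a decision. The design pool, seed handling, and trial cap are placeholders.

```python
import random

def run_sprt(designs, change_wins, max_trials=10000):
    """Run SPRT over randomly drawn (design, seed) pairs until it decides.

    `designs` is the pool of test designs (picorv32_large and friends),
    `change_wins` is the win predicate sketched above, and the SPRT
    parameters are the p0/p1/alpha/beta values discussed earlier."""
    sprt = SPRT(p0=0.50, p1=0.52, alpha=0.05, beta=0.05)
    for trial in range(1, max_trials + 1):
        design = random.choice(designs)
        seed = random.randrange(1, 1 << 30)
        decision = sprt.record(change_wins(design, seed))
        if decision is not None:
            return decision, trial
    return None, max_trials  # inconclusive within the trial budget
```

A result of 'H1' would mean the change can be accepted at the chosen error rates, 'H0' would mean it should be rejected, and an inconclusive result suggests the hypotheses or the trial budget need revisiting.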

I suppose my suggestion here is not so much to argue for SPRT specifically, but rather to argue that there should be some form of objective criterion for acceptance of a patch into Yosys/nextpnr.
