0xekez/Gas Efficient Ranked Choice Voting.md Secret

## Gas Efficient Ranked Choice Voting.md

      
    Raw
  

              Gas Efficient Ranked Choice Voting.md
            
          
    Gas Efficient Ranked Choice Voting

ekez & bluenote
Here we describe a method for selecting the Condorcet winner from a set of ballots suitable for implementation in a smart contract. To do so we guarantee that if a proposal can be created it can be voted on and executed within gas limits, by performing tallying in constant time over the number of votes cast. We provide a complete implementation of this method as a DAO DAO proposal module and formally verify the conditions for proposals being passed and rejected early.
This implementation has completed an audit with Oak Security as part of a larger DAO DAO audit.
Background

Smart contracts are computer programs that can be run on blockchains. This unique execution environment leads to two properties relevant to this essay:

Execution is deterministic.
Compute is tightly limited.

Both properties stem from properties of the blockchains smart contracts run on. Code executed on a blockchain must be deterministic, so that all of the nodes on the chain can come to agreement about what its output is. Compute must be limited so that one program can't crowd out all the compute power on the chain. In blockchain lingo, we call compute "gas" and compute limits "gas limits".
An interesting consequence of properties one and two is that any program execution that runs out of gas will always run out of gas.
Smart Contracts & Elections

For an election system to be suitable for implementation in a smart contract it must never run out of gas. Where this tends to get challenging is tallying votes.
For example, for yes / no voting, we might design an algorithm like:

Collect ballots
Once all ballots have been submitted: tally up the yes and no ballots and see which has the higher score.

As step two requires iterating over all of the ballots and taking their sum, this algorithm is linear time over the number of ballots cost. Thus, this system is not suitable for implementation in a smart contract, as there is some number of ballots that may be cast which makes the compute cost of step two more than the gas limit, making it impossible to determine the election's outcome.
Fortunately, we can modify our yes / no voting algorithm to make it constant time over the number of votes cast:

Keep track a the yes tally and a no tally.
When receiving a ballot: if it is a no vote, add one to the no tally, if it is a yes vote, add one to the yes tally.
Once all ballots have been submitted: check if the yes or no tallys are larger.

The remainder of this essay explores how the same treatment can be applied to ranked choice voting.
Instant Run Off

The most common form of ranked choice voting, instant run off, works like this:

Voters submit a list of candidates sorted by their preference.
If there is an option with the majority of first-preference votes, that option is the winner.
Otherwise, remove the option with the fewest first-preference votes from all preference lists and repeat.

A naive implementation of this has a runtime that scales with the number of votes cast (note the line "from all preference lists"). We were unable to discover a method for tallying Instant Runoff that didn't have this property.
The Condorcet Method


Without looking terribly deeply into the history of it, this appears to have been first described by Paul Cuff Et al. 1 in 2012. We didn't come up with this part.

A Condorcet winner is a candidate in an election who would win a 1v1 with every other candidate. If we assume that voters won't change their relative preferences when candidates are removed we can find the Condorcet winner in a set of ranked choice ballots.
For example, given three ballots:
maroon > indigo > violet
indigo > maroon > violet
violet > maroon > indigo

Per our assumption that a candidate dropping out won't change relative preferences, to see who would win in a 1v1 we can remove all other candidates from the ballots and compare them using majority-wins.
maroon v indigo  |  maroon v violet  
                 |
maroon > indigo  |  maroon < voilet
indigo > maroon  |  maroon < voilet
maroon > indigo  |  voilet < maroon

With these views on the ballots, we can see that maroon is the winner as it wins the majority of 1v1s against both indigo and violet.

$O(1)$ Condorcet Method

In order to implement the Condorcet Method in a smart contract, we have two requirements.

Tallying results is constant time over the number of votes cast.
All created proposals can be executed and voted on within gas limits, i.e. gas(proposal_creation) >= gas(vote) && gas(proposal_creation) >= gas(execute).

Constant Time Over Votes Cast

Let $C$ be the set of candidates and $V$ be the set of voters and their ballots weights, defined such that $\forall v \in V$ $v[c]$ is the weight given to candidate $c$ by voter $v$. we can then say that $c_i$ is a Condorcet winner if ¹:
$$
\forall c \in C, c \neq c_i \implies 2 * \vert\vert { v \mid v \in V, v[c_i] > v[c] }\vert\vert > \vert\vert V \vert\vert
$$
This is just a mathematical restatement of our earlier definition: a candidate is a Condorcet winner if they would win a majority runoff with all other candidates.
From here, we set our sights on pre-computing this value whenever a vote is cast. To do so, we define a new matrix $M$:
$$
M[i][j] := \vert\vert { (a, b) \mid a \in V,b \in V, a \neq b, a[i] > a[j] } \vert\vert - \vert\vert { (a, b) \mid a \in V,b \in V, a \neq b, a[j] > a[i] } \vert\vert
$$
The math notation here is unpleasant, so in pseudocode:
M[i][j] = number_of_times_i_has_beaten_j - number_of_times_j_has_beaten_i

Given such a matrix, a candidate $c_i$ is a winner if:
$$
\forall v \in M[c_i], v \neq c_i \implies v \gt 0
$$
In english: a candidate wins if they were preferred more-often-than-not in a 1v1 with every other candidate.
The complexity of finding this candidate is $O(candidates^2)$ which does not scale with the number of votes so this satisfies our requirement that tallying doesn't scale with number of votes cast.
This runtime is made clear by in this example implementation of this method:
pub struct Condorcet {
    m: Vec<Vec<i32>>,
}

impl Condorcet {
    pub fn new(candidates: usize) -> Self {
        Self {
            m: vec![vec![0; candidates]; candidates],
        }
    }

    pub fn vote(&mut self, preferences: Vec<usize>) {
        for (index, preference) in preferences.iter().enumerate() {
            // increment every value in self.m[preference] which
            // appears in preferences[index + 1..] as preference is
            // ranked higher.
            for victory in (index + 1)..preferences.len() {
                self.m[*preference][preferences[victory]] += 1
            }
            // decrement every value in self.m[preference] which
            // appears in preferences[0..index] as perference is
            // ranked lower.
            for defeat in 0..index {
                self.m[*preference][preferences[defeat]] -= 1
            }
        }
    }

    pub fn winner(&self) -> Option<usize> {
        // a winner is someone who wins a majority runoff with all of
        // the other candidates.
        self.m
            .iter()
            .enumerate()
            .find(|(index, row)| row.iter().skip_nth(*index).all(|&p| p > 0))
            .map(|(index, _)| index)
    }
}
Created Proposals Can Be Executed

Our second requirement for our system is that all proposals that can be created can also be voted on and executed. The solution to this lacks math theory, and is primarily an implementation detail, though an important one.
At a high level, to accomplish this:

We divide up state such that more state is read and written when creating a proposal than when executing and voting on it.
When a proposal is created we perform a tally on the empty $M$ matrix to ensure that the gas cost in terms of compute of creating a proposal is >= voting on a proposal as the only compute needed when voting is a tally.

A for a more detailed discussion of this in terms of the implementation, see here.
Diagonalization

An interesting property of our $M$ matrix is that it is reflected across the line $y=x$.
$$
M[i][j] = -M[j][i]
$$
In english:
M[i][j] = number_of_times_i_has_beaten_j - number_of_times_j_has_beaten_i
M[j][i] = number_of_times_j_has_beaten_i - number_of_times_i_has_beaten_j
M[i][j] = -M[j][i]

This means that we can decrease the size of $M$ in storage to $\frac{N(N-1)}{2}$ by storing only the upper diagonal of the matrix in a vector $V$. To translate an index in $M$ to an index in $V$:
$$
index(x, y) = y * N - (y + 1) * y / 2 + x - (y + 1)
$$
And to retrieve a value from $M$:
$$
get(x, y) =
\begin{cases}
\neg get(y, x),& \text{if } x\lt y\\
V[index(x, y)],& \text{otherwise}
\end{cases}
$$
This indexing method can be hard to understand without working out on paper. Here's our implementation of this with some comments which may help build intuition:
fn index(&self, (x, y): (u32, u32)) -> u32 {
	let n = self.n;
	// the start of the row in `self.cells`.
	//
	// the easiest way to conceptualize this is
	// geometrically. `y*n` is the area of the whole matrix up to
	// row `y`, and thus the start index of the row if
	// `self.cells` was not diagonalized [1]. `(y + 1) * y / 2` is the
	// area of the space that is not in the upper diagonal.
	//
	// whole_area - area_of_non_diagonal = area_of_diagonal
	//
	// because we're in the land of discrete math and things are
	// zero-indexed, area_of_diagonal == start_of_row.
	let row = y * n - (y + 1) * y / 2;
	// we know that x > y, so to get the index in `self.cells` we
	// offset x by the distance of x from the line x = y (the
	// diagonal), as `self.cells`' first index corresponds to the
	// first item in that row of the upper diagonal.
	let offset = x - (y + 1);
	row + offset
}
Making use of this diagonalization, our actual implementation has a runtime of $O(\frac{N(N-1)}{2})$, where $N\times N$ is the size of the $M$ matrix and the number of candidates.
Complexity Conclusion

This completes a description of an implementation of the Condorcet Method that is constant time over the number of votes cast, storage efficient, and that guarantees that created proposals can be voted on.
Runtime alone a usable voting system does not make. We now analyze the conditions for passing and rejecting proposals early, and discuss this systems resistance to strategic voting.
Passing Proposals Early

A proposal can be closed early if there is currently a winner, and no sequence of votes can change that winner.
For the Condorcet Method, there is an undisputed winner if there is a winner who's smallest margin of victory in a 1v1 is larger than all outstanding voting power. For example, if candidate a is winning their closest 1v1 by a margin of $6$ votes, and there are $5$ votes outstanding, candidate a is the undisputed winner.
In terms of our $M$ matrix, we can pass early if there is a column with only positive values, and the minimum value in that column is $&gt;$ the outstanding voting power. This is shown more rigorously later.
The Filibuster

A side-effect of this is that a group of voters who's voting power is larger than the smallest margin of the winning candidate can attempt to collude to prevent a proposal from having a winner by ranking the winning candidate last on all of their ballots. For example, if c was the current winner, an example ballot modification might be [a, c, b] -> [a, b, c].
Fortunately, as discussed here, if the other voters catch on to this they may also modify their ballots in the opposite way by always ranking c > a and thus prevent the filibuster and still electing the Condorcet winner.
Practically speaking, this method biases itself towards no outcome. A relatively small group of voters preferring no action to the winning option may cause no winner to be selected unless the larger voting body is active in their resistance to this happening.
Strategic Voting

This filibustering is a form of strategic voting wherein voters vote against their true preferences to cause an outcome. What about other forms of strategic voting?
Fortunately, we appear to be quite safe. There is a rather large collection of literature which explores and confirms the resistance of Condorcet methods to strategic voting (see: 2, 3, 4).
We do not claim to be experts on voting systems nor their math, so it is quite possible that there exist unknown unknowns here.
A Voting Game

To help build intuition about strategic voting, we designed The Condorcet Voting Game, a game where if you can find a winning strategy, you have also found a strategic voting method. We were unable to come up with any strategies other than the filibuster described earlier.
Rejecting Proposals Early

Here we prove the conditions for early proposal rejection. We first analyze the properties of our $M$ matrix in more depth, then with those in hand prove a conservative bound for early proposal rejection.
Properties of $M$

Let $M$ be a $N\times N$ matrix where $N$ is the number of candidates in a ranked choice election. This is the same matrix that we defined earlier, and, like earlier, say that $M[i][j]$ is the number of times that candidate $i$ has beat candidate $j$ in a 1v1, minus the number of times candidate $j$ has beat candidate $i$.
M[i][j] = number_of_times_i_has_beaten_j - number_of_times_j_has_beaten_i

We then say if there exists a candidate $c_i$ who has won more often than they have lost against every other candidate, that candidate is the Condorcet Winner. This being another way of saying that they have won the majority of ballots against every other candidate.
$$
\forall v \in M[c_i], v \neq c_i \implies v \gt 0
$$
Note here the $v \neq c_i$ condition which prevents a comparison between the candidate and themselves. It will actually be the case that $\forall (x, y) \in \mathbb{Z}^2, x = y \implies M[x][y] = 0$ , which is a math heavy way of saying that ballots will never prefer a candidate over themselves, so there will only be zero values where that is the case. We imprecisely ignore the $y=x$ line when calling a row "all positive" throughout this piece.
Because of the diagonalization discussed earlier, if there is a row like this there can be no other rows that are also all positive. This row casts a shadow of negative values across all other rows due to the symmetry that preferences have.
$$
A \gt B \iff B \lt A
$$
Condorcet Paradox

A Condorcet Paradox is a situation wherein no candidate is majority-preferred to every other candidate in pairwise comparison, meaning, there is no winner. In $M$, this would appear as a matrix with no column with only positive values.
It may help to have an example of a Condorcet Paradox when working through these proofs. Here is one set of ballots and a corresponding matrix $M$ that creates a Condorcet Paradox:
Choices: A, B, C

Ballots: 

A > B > C
B > C > A
C > A > B

Resulting $M$ matrix:


A
B
C


A
0
1
-1


B
-1
0
1


C
1
-1
0


As no column is positive, there is no winner for this set of votes.
Definitions

For a column $c_i$ let $dfp(c_i)$, its distance from positivity, be the number of voting power that would need to be added to candidate $c_i$ for all the values in that column to be positive.
$$
dfp(c) := sum({|v| + 1 \mid v \in M[c_i]})
$$
We use the notation ${v \mid v \in M[c_i]}$ to mean the set of values in the column $c_i$ in $M$ and ${c \mid c \in M}$ to mean the set of columns of $M$.
Let $minDistance$ be the minimum distance from positivity over all columns.
$$
minDistance := min({dfp(c) \mid c \in M})
$$
For a column $c_i$ let $mnm(c_i)$ be the highest magnitude negative number in that column.
$$
mnm(c_i) := max({ abs(v) | v \in M[c_i], c_i <= 0 })
$$
Let $P$ be the outstanding voting power. For example, with three voters with one voting power each, if only one had voted $P$ would be $2$.
When a vote is cast with voting power $V$, and its highest ranked choice is $c_i$, then $V$ is added to every cell for the column of $c_i$ excluding the cell at $(c_i, c_i$). This is because for each of the $N-1$ choices besides $c_i$, there is now $V$ more voting power that has ranked $c_i$ has over that choice. Therefore, $V(N-1)$ is added in total to $c_i$'s column.
Claims and proofs

A: In a matrix with a Condorcet Paradox, there will be no Condorcet Winner found if:
$$
minDistance > P(N-1)
$$
B: In a matrix with a Condorcet Paradox, there will be no Condorcet Winner found if:
$$
\forall c_i \in M,dpf(c_i)\le P(N-1)\implies mnm(c_i) \ge P
$$
Proof Of Claim A

We claim that if this is true, there will never be a Condorcet Winner in $M$:
$$
minDistance > P(N-1)
$$
Say there is a way that you can create a Condorcet Winner in $M$ using the remaining voting power.
Let this Condorcet Winner be $W$, and its corresponding column in $M$ be $C$, and let $C'$ be the state of $C$ after $W$ becomes the Condorcet Winner.
Let $X = dfp(C)$. If $C'$ is the Condorcet Winner, then it must be the case that $P(N-1) &gt;= X$.
$minDistance \le X$ by definition, so $minDistance &lt;= X &lt;= P(N-1)$.
But this contradicts our premise that $minDistance &gt; P(N-1)$.
Therefore, if $minDistance &gt; P(N-1)$, there will never be a Condorcet Winner in $M$.
Proof Of Claim B

We claim that if this is true, there will never be a Condorcet Winner in $M$:
$$
\forall c_i \in M, dfp(c_i) \le P(N-1) \implies mnm(c_i) \ge P
$$
Let $C$ be a column in $M$ such that $dfp(C) \le P(N-1)$ and $mnm(C) \ge P$.
Say there is a way that you can create a Condorcet Winner in $M$ using the remaining voting power.
Let this Condorcet Winner be $W$, and its corresponding column in $M$ be $C$, and let $C'$ be the state of $C$ after $W$ becomes the Condorcet Winner.
Let $Y = mnm(C)$, and $z$ be the corresponding cell for which $C[z] = Y$.
Let $p = C'[z] - C[z]$. $p &gt; Y$ as $C'[z]$ must be positive by the definition of Condorcet Winner.
$p \ge P$, since the most that $C[z]$ could have been increased by is $P$.
So this means $P \ge p &gt; Y = mnm(C)$.
But this contradicts our premise as we have selected $C$ from a set of columns where $mnm(C) \ge P$².
Therefore, there will never be a Condorcet Winner in $M$ if:
$$
\forall c_i \in M, dfp(c_i) \le P(N-1) \implies mnm(c_i) \ge P
$$
Conclusion

Here we have described a ranked choice voting system that is suitable for use by on-chain DAOs. To do so, we have:

Designed a tallying algorithm that is constant time over the number of votes cast.
Arranged our contract's state and gas usage such that any proposal that can be created may be voted on and executed.
Increased the storage efficiency of our implementation by taking advantage of the symmetry of preferences.
Explored our systems resistance to strategic voting and found a filibuster.
Proven the conditions under which a proposal may be rejected or passed early.

Future work will incorporate this voting system in the DAO DAO frontend and explore effectively communicating the status of a proposal in detail. A complete implementation of this as a DAO DAO proposal module can be found under the BSD license here.
Footnotes


Note that the condition for winning a majority can be written as $2*votes > total$. ↩


Note that we also know that no value outside of the set we have selected $C$ from could be a Condorcet Winner from our earlier proof of claim A. ↩