@anbnyc
Last active September 27, 2023 07:33
Unofficial README for NYC Ranked Choice Voting Cast Vote Record (CVR)

The NYC Board of Elections (BOE) released the data files without any metadata or explanation, so we have to work it out ourselves.

What is in the files?

Each row is one cast ballot, and most columns hold the voter's ranked preference in one race. The following columns appear in every spreadsheet, regardless of borough or ballot type:

| Column | Explanation |
| --- | --- |
| Cast Vote Record | Unique ID number |
| Precinct | AD (assembly district) and ED (election district) where the voter is registered (your polling place is determined by your AD/ED) |
| Ballot Style | Identifier for the type of ballot the voter used, based on the unique combination of races in their AD/ED |

The remaining columns follow this format: [PARTY] [Office] Choice [#] of 5 [County] ([race id]). For example, DEM Borough President Choice 1 of 5 New York (024307). In a race with fewer than 5 candidates, the columns may only go up to the number of candidates, for example, ...Choice 1 of 3....
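The header layout above can be split into its parts with a regular expression. This is a minimal sketch, not something published by the BOE: the names `HEADER_RE` and `parse_choice_header` are made up for illustration, and the pattern assumes the party is a single token and the race id is all digits, which matches the example header but has not been checked against every file.

```python
import re

# Assumed layout, taken from the description above:
# [PARTY] [Office] Choice [#] of [max] [County] ([race id])
HEADER_RE = re.compile(
    r"^(?P<party>\S+) (?P<office>.+?) Choice (?P<rank>\d+) of (?P<max_rank>\d+) "
    r"(?P<county>.+?) \((?P<race_id>\d+)\)$"
)


def parse_choice_header(header):
    """Return the header's parts as a dict, or None if it is not a ranked-choice column."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    parts = match.groupdict()
    # Ranks are easier to sort and compare as integers.
    parts["rank"] = int(parts["rank"])
    parts["max_rank"] = int(parts["max_rank"])
    return parts


example = parse_choice_header("DEM Borough President Choice 1 of 5 New York (024307)")
```

The race id is kept as a string so that leading zeros such as `024307` survive. Metadata columns such as `Precinct` do not match the pattern and return `None`.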

Since there is a column for every race in the borough in each spreadsheet, most columns don't actually apply to any given row, which is why we see so many undervote cells. The spreadsheet does not differentiate, for example, between a City Council race outside the voter's district and a City Council race where the voter skipped one or more ranked choices. Both cases are listed as undervote.

What do the file names mean?

Prefixes (P*)

| Abbreviation | Meaning |
| --- | --- |
| P1 | New York county (Manhattan) |
| P2 | Bronx county |
| P3 | Kings county (Brooklyn) |
| P4 | Queens county |
| P5 | Richmond county (Staten Island) |

Suffixes

| Abbreviation | Meaning |
| --- | --- |
| ABS | Absentee ballots |
| AFF | Affidavit ballots |
| ELE# | Day-of election ballots (numbers after ELE appear to be related to file size, not contents) |
| EMG | Emergency ballots |

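The two abbreviation tables can be turned into lookup dictionaries. The sketch below assumes you have already pulled the prefix and suffix tokens out of a file name yourself, since the exact file name layout is not described here. `describe_file` and the dictionary names are hypothetical.

```python
import re

# Prefix -> county, copied from the prefix table above.
COUNTY_BY_PREFIX = {
    "P1": "New York (Manhattan)",
    "P2": "Bronx",
    "P3": "Kings (Brooklyn)",
    "P4": "Queens",
    "P5": "Richmond (Staten Island)",
}

# Suffix -> ballot type, copied from the suffix table above.
BALLOT_TYPE_BY_SUFFIX = {
    "ABS": "Absentee",
    "AFF": "Affidavit",
    "ELE": "Day-of election",
    "EMG": "Emergency",
}


def describe_file(prefix, suffix):
    """Map a prefix token (e.g. "P3") and a suffix token (e.g. "ELE2") to their meanings."""
    # The digits after ELE appear to reflect file splitting, not contents, so drop them.
    base_suffix = re.sub(r"\d+$", "", suffix.upper())
    return COUNTY_BY_PREFIX[prefix.upper()], BALLOT_TYPE_BY_SUFFIX[base_suffix]


county, ballot_type = describe_file("P3", "ELE2")
```

An unknown token raises `KeyError`, which is probably what you want when a file does not fit the naming scheme.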
Undervotes and Overvotes

This section is a best guess based on comparing the CVR to published RCV results.

If a ballot starts with an undervote, that rank is skipped and the vote goes to the candidate in the next column that has a valid selection; in effect, everything shifts to the left. For example, a ballot with undervote in first and Eric L. Adams in second would be counted for Eric L. Adams as if he were the voter's first choice. If a ballot starts with an overvote, the ballot does not count for that race, because it is impossible to determine which candidate should receive the vote. (The BOE gave people the opportunity to correct overvotes in person if they voted in person, or by mail if they voted absentee, but some overvotes still got through.)

This image from the BOE page explaining RCV illustrates both an undervote and an overvote: here the voter cast an undervote (no candidate selected) in rank 1, and an overvote (multiple candidates selected) in rank 2. This ballot would not count for any candidate because the undervote is skipped but the overvote disqualifies the ballot.

Illustration from BOE website of a ballot with two marks in one column, which is an error.
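The rule described in this section can be sketched as a small function that finds the candidate a ballot counts for in the first round. Like the section itself, this is a best guess and not the BOE's tabulation code. It assumes the cells contain the literal strings `undervote` and `overvote`, and the name `first_counted_choice` is made up for illustration.

```python
# Assumed cell markers; adjust if the spreadsheets spell them differently.
UNDERVOTE = "undervote"
OVERVOTE = "overvote"


def first_counted_choice(ranks):
    """Return the candidate this ballot counts for in round one, or None if it does not count.

    `ranks` is the list of cell values for one race, ordered from choice 1 upward.
    """
    for cell in ranks:
        value = str(cell).strip()
        if value.lower() == UNDERVOTE:
            # Skipped rank: look at the next column instead.
            continue
        if value.lower() == OVERVOTE:
            # An overvote reached before any valid choice disqualifies the ballot.
            return None
        return value
    # Every rank was an undervote, so the voter did not vote in this race
    # (or the race was not on their ballot).
    return None


shifted = first_counted_choice(["undervote", "Eric L. Adams", "undervote"])
disqualified = first_counted_choice(["undervote", "overvote", "undervote"])
```

Here `shifted` is `"Eric L. Adams"`, matching the example above, and `disqualified` is `None`, matching the ballot in the BOE illustration. The sketch only covers the first round; how an overvote in a later rank is treated after the first choice is eliminated is not covered by this section.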

Duplicate votes

This section is a best guess based on comparing the CVR to published RCV results.

Some ballots contain multiple votes for the same candidate, either consecutively or with other candidates in between. It appears that the first rank where a candidate appears on the ballot is counted toward that candidate, and subsequent ones are disregarded. For example:

| Cast vote | How it's counted |
| --- | --- |
| Wiley, Wiley, Adams, Garcia | 1st choice: Wiley. 2nd choice: Adams. 3rd choice: Garcia |
| Wiley, Yang, Wiley, Garcia | 1st choice: Wiley. 2nd choice: Yang. 3rd choice: Garcia |
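The apparent handling of duplicates amounts to keeping the first rank where each candidate appears and dropping later repeats. A minimal sketch of that, assuming undervote and overvote cells have already been handled separately (`drop_repeat_rankings` is a hypothetical name):

```python
def drop_repeat_rankings(ranks):
    """Keep only the first appearance of each candidate, preserving ballot order."""
    seen = set()
    counted = []
    for candidate in ranks:
        if candidate not in seen:
            seen.add(candidate)
            counted.append(candidate)
    return counted


consecutive = drop_repeat_rankings(["Wiley", "Wiley", "Adams", "Garcia"])
separated = drop_repeat_rankings(["Wiley", "Yang", "Wiley", "Garcia"])
```

Both table rows come out as described: `consecutive` is `["Wiley", "Adams", "Garcia"]` and `separated` is `["Wiley", "Yang", "Garcia"]`.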