@apetresc
Last active July 20, 2021 18:14
A script to filter a PGN game collection by a minimum time control

Installation

The script requires the python-chess library, which can be installed via:

pip install -r requirements.txt

Usage

The filter.py script takes a .pgn file as its argument and keeps only the games whose time control is longer than a preset minimum (MIN_TIME_CONTROL in the script). Games at or below that minimum are dropped.

The time control is measured in seconds. If the game used anything more complex than a single fixed time, the script estimates the game's duration by assuming an average of 40 moves per game. For example, a 120+5 game counts as a 120 + 40 * 5 = 320-second game.
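The estimate described above can be sketched as a standalone function. This is a minimal illustration rather than the script itself: the names `estimate_duration` and `ASSUMED_MOVES` are hypothetical, and treating unparseable values (such as `-` or `?`) as 0 seconds is an assumption made here for simplicity.

```python
ASSUMED_MOVES = 40  # hypothetical name for the 40-move assumption


def estimate_duration(tc: str) -> float:
    """Estimate a game's duration in seconds from a PGN TimeControl value."""
    try:
        if "/" in tc:
            # "moves/seconds", e.g. "40/9000": scale to ASSUMED_MOVES moves.
            moves, seconds = map(int, tc.split("/"))
            return seconds / moves * ASSUMED_MOVES
        if "+" in tc:
            # "base+increment", e.g. "120+5": add the increment for each move.
            base, increment = map(int, tc.split("+"))
            return base + increment * ASSUMED_MOVES
        # Plain fixed time, e.g. "600".
        return int(tc)
    except (ValueError, ZeroDivisionError):
        # Assumption: unknown or unsupported values count as 0 seconds.
        return 0


print(estimate_duration("120+5"))    # 320
print(estimate_duration("600"))      # 600
print(estimate_duration("40/9000"))  # 9000.0
```

With MIN_TIME_CONTROL set to 180, a plain `180` game would be dropped, because the script's comparison is strict (`>`).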

Example

$ for g in $(curl -Ls https://api.chess.com/pub/player/Hikaru/games/archives | jq -rc ".archives[]") ; do curl -Ls "$g" | jq -rc ".games[].pgn" ; done >> games.pgn
$ python filter.py games.pgn > long_games.pgn
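For readers without `jq`, the download step above can be approximated in Python with only the standard library. This is a rough sketch: the JSON field names (`archives`, `games`, `pgn`) are taken from the `jq` filters in the shell example, the function names are hypothetical, and error handling and rate limiting are left out.

```python
import json
import urllib.request

ARCHIVES_URL = "https://api.chess.com/pub/player/Hikaru/games/archives"


def fetch_json(url: str) -> dict:
    """Download a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def extract_pgns(archive: dict) -> list:
    """Mirror the jq filter `.games[].pgn` for one archive."""
    # Assumption: entries without a "pgn" field are skipped.
    return [game["pgn"] for game in archive.get("games", []) if "pgn" in game]


def download_all(path: str = "games.pgn") -> None:
    """Append every game from every archive to `path`, like the shell loop."""
    with open(path, "a") as out:
        for url in fetch_json(ARCHIVES_URL)["archives"]:
            for pgn in extract_pgns(fetch_json(url)):
                # A blank line keeps consecutive games separated.
                out.write(pgn + "\n\n")
```

Calling `download_all()` would then produce a `games.pgn` file that can be passed to filter.py as in the second command.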
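import sys

import chess.pgn

# Games whose estimated duration (in seconds) does not exceed this are dropped.
MIN_TIME_CONTROL = 180

with open(sys.argv[1], 'r') as pgn:
    # First pass: record the file offset of every game that is slow enough.
    offsets = []
    while True:
        offset = pgn.tell()
        headers = chess.pgn.read_headers(pgn)
        if headers is None:
            break

        tc = headers.get("TimeControl", "0")
        try:
            if '/' in tc:
                # "moves/seconds": scale to an assumed 40-move game.
                moves, secs = map(int, tc.split('/'))
                atc = secs / moves * 40
            elif '+' in tc:
                # "base+increment": add 40 moves' worth of increment.
                main, inc = map(int, tc.split('+'))
                atc = main + inc * 40
            else:
                atc = int(tc)
        except (ValueError, ZeroDivisionError):
            # Values such as "?" or "-" cannot be parsed; as an assumption,
            # treat them as too fast instead of crashing.
            atc = 0

        if atc > MIN_TIME_CONTROL:
            offsets.append(offset)

    # Second pass: re-read and print only the selected games.
    for offset in offsets:
        pgn.seek(offset)
        print(chess.pgn.read_game(pgn))
        print()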