♟️
Building things

Kirill Bobyrev kirillbobyrev
kirillbobyrev / eda.py
Last active November 30, 2023 01:36
Scripts for "Analyzing long win streaks in online chess"
import requests
import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt
from collections import Counter
PLAYER = "Hikaru"
START_DATE = "2023-01-01"
END_DATE = "2023-11-28"
kirillbobyrev / td-gamma.ipynb
Last active June 6, 2018 15:35
TD(\gamma).ipynb
kirillbobyrev / tic_tac_toe.py
Last active May 30, 2022 11:11
Reinforcement Learning vs Tic Tac Toe: Temporal Difference (TD) Agent that beats Tic Tac Toe game through self-play
'''
Author: Kirill Bobyrev (https://github.com/kirillbobyrev)
This module implements "An Extended Example: Tic Tac Toe", Section 1.5 of
`Reinforcement Learning: An Introduction`_ by Richard S. Sutton and Andrew G.
Barto (January 1, 2018 complete draft). The implemented Reinforcement Learning
algorithm is TD(0), trained via self-play between two agents. The update rule
is slightly modified to fit the environment and to match the one introduced in
Chapter 1 but, as shown later, it is equivalent to the one used in generic
settings.
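
A minimal sketch of the two update rules mentioned above, not the module's
actual code. The function names, the step size ``alpha``, and the default
state value of 0.5 are illustrative assumptions. It assumes zero reward on
every move, no discounting, and terminal states whose stored value is the game
outcome; under those assumptions the Chapter 1 rule and generic TD(0) produce
the same update.

```python
# Hypothetical helpers: names and defaults are assumptions for illustration,
# not taken from the module.

def chapter1_update(values, state, next_state, alpha=0.1, default=0.5):
    # Chapter 1 form: V(s) <- V(s) + alpha * (V(s') - V(s)).
    v = values.get(state, default)
    v_next = values.get(next_state, default)
    values[state] = v + alpha * (v_next - v)
    return values[state]


def td0_update(values, state, reward, next_state,
               alpha=0.1, gamma=1.0, default=0.5):
    # Generic TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).
    v = values.get(state, default)
    v_next = values.get(next_state, default)
    values[state] = v + alpha * (reward + gamma * v_next - v)
    return values[state]


# A winning terminal state is stored with value 1.0 (assumed convention).
table_a = {"win": 1.0}
table_b = {"win": 1.0}
a = chapter1_update(table_a, "before_win", "win")
b = td0_update(table_b, "before_win", 0.0, "win")
```

With ``reward=0.0`` and ``gamma=1.0`` the TD(0) target reduces to ``V(s')``,
so both tables end up with the same value for ``"before_win"``.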