import requests
import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt
from collections import Counter

PLAYER = "Hikaru"
START_DATE = "2023-01-01"
END_DATE = "2023-11-28"
'''
Author: Kirill Bobyrev (https://github.com/kirillbobyrev)
This module implements "An Extended Example: Tic Tac Toe" from the `Reinforcement
Learning: An Introduction`_ book by Richard S. Sutton and Andrew G. Barto
(January 1, 2018 complete draft), described in Section 1.5. The implemented
Reinforcement Learning algorithm is TD(0), and it is trained via self-play
between two agents. The update rule is slightly modified given the environment
specifics to comply with the one introduced in Chapter 1, but, as shown
later, it is equivalent to the one used in generic settings.