@deebs67
Last active March 4, 2021 12:57
Reinforcement Learning (RL)

On this page we pull together some key links on the topic of Reinforcement Learning (RL), a branch of Machine Learning (ML), which in turn sits within the wider field of Artificial Intelligence (AI).

RL educational resources

Here is 'Reinforcement Learning with MATLAB and Simulink'. It is produced by MathWorks, the company behind the software products mentioned in the title. There are some videos (narrated by Brian Douglas) and an ebook, which together do a good job of explaining how RL might be used to control robots, navigate gridworlds, and tackle other example problems:
https://uk.mathworks.com/campaigns/offers/reinforcement-learning-with-matlab-ebook.html

Gym from OpenAI is a toolkit for developing and comparing Reinforcement Learning algorithms. It supports teaching RL 'agents' everything from walking to playing games like Pong or Pinball:
https://gym.openai.com/
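
The interaction pattern that Gym standardises (reset the environment, then repeatedly choose an action and receive an observation, a reward and a 'done' flag) can be sketched without installing anything. The sketch below does not import Gym itself: `CorridorEnv` is a hypothetical toy environment invented for illustration, and it only mimics the classic `reset()` / `step()` convention of returning (observation, reward, done, info). Later Gym releases changed these return values, so treat this as an outline of the idea rather than the library's exact API.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment (not part of Gym): walk along cells
    0..size-1 and receive a reward of 1.0 on reaching the right-hand end."""

    def __init__(self, size=5, max_steps=50):
        self.size = size
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        # Start every episode in the left-most cell.
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # Action 1 moves right; any other action moves left.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.size - 1)
        self.steps += 1
        reached_goal = self.position == self.size - 1
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done, {}


def run_random_episode(env, rng):
    """Run one episode with an agent that picks actions uniformly at random."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = rng.choice([0, 1])
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward, env.steps


if __name__ == "__main__":
    total, steps = run_random_episode(CorridorEnv(), random.Random(0))
    print(f"return={total}, steps={steps}")
```

A learning agent would replace the random `rng.choice` call with a policy that improves as rewards come in.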

One particular RL technique is called 'Q-learning'. Whilst much of the literature on Q-learning is somewhat tricky for the beginner to understand, this short (~13-minute) video, and the others in the same series, have proven to be quite enlightening:
https://www.youtube.com/watch?v=bHeeaXgqVig
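
For readers who prefer code to video, here is a minimal sketch of tabular Q-learning. It is not taken from the video series: the five-cell corridor, the `step` function and the parameter values (`alpha`, `gamma`, `epsilon`) are all assumptions chosen to keep the example short. The core of the technique is the single update line, which nudges Q(state, action) towards reward + gamma * max Q(next_state, ·).

```python
import random

N_STATES = 5      # assumed corridor of cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical corridor dynamics: reward 1.0 on reaching the last cell."""
    move = 1 if action == 1 else -1
    next_state = min(max(state + move, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the two values tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action,
            # unless the episode has ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-goal cell according to the learned table."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]


if __name__ == "__main__":
    table = train()
    print("greedy policy (1 = right):", greedy_policy(table))
```

With these settings the learned policy should move right in every cell, since moving right is the only way to collect the reward.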

Matchbox Machine-Learning (MML)

Matchboxes filled with coloured beads offer a 'low-tech' implementation of RL, pioneered by Donald Michie with his 'MENACE' MML algorithm for playing noughts-and-crosses (tic-tac-toe). For more information on MML, see the Gist page dedicated to the topic here:
https://gist.github.com/deebs67/8fbcf8b127a63e70d4a3f8590c97701d
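
The matchbox idea can be sketched in a few lines of code. To keep the example short, the sketch below swaps noughts-and-crosses for a much simpler game, single-pile Nim (take 1 to 3 stones; whoever takes the last stone wins), so it is an illustration of the bead mechanism rather than a reproduction of MENACE. Each 'matchbox' is a dictionary of bead counts for one pile size, a move is chosen by drawing a bead at random, and beads are added after a win or removed after a loss. The bead numbers (`INITIAL_BEADS`, `reward`, `penalty`) and the rule of always leaving at least one bead per move are assumptions made for this sketch.

```python
import random

MAX_TAKE = 3       # a player may take 1, 2 or 3 stones
INITIAL_BEADS = 4  # assumed starting beads per move in each matchbox


def legal_moves(stones):
    return list(range(1, min(MAX_TAKE, stones) + 1))


def new_boxes(start_stones):
    # One matchbox per pile size, with a bead count for every legal move.
    return {s: {m: INITIAL_BEADS for m in legal_moves(s)}
            for s in range(1, start_stones + 1)}


def draw_bead(box, rng):
    # Drawing a bead at random: moves with more beads are more likely.
    moves = list(box)
    return rng.choices(moves, weights=[box[m] for m in moves])[0]


def play_game(boxes, start_stones, rng):
    """Learner moves first against a random opponent.
    Returns (list of (pile size, move) made by the learner, learner_won)."""
    stones = start_stones
    history = []
    while True:
        move = draw_bead(boxes[stones], rng)
        history.append((stones, move))
        stones -= move
        if stones == 0:
            return history, True   # learner took the last stone
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return history, False  # opponent took the last stone


def reinforce(boxes, history, won, reward=3, penalty=1):
    # Add beads to every move used in a win, remove one after a loss,
    # but never let a move drop below a single bead (a simplification).
    for stones, move in history:
        if won:
            boxes[stones][move] += reward
        else:
            boxes[stones][move] = max(1, boxes[stones][move] - penalty)


def train(games=5000, start_stones=10, seed=0):
    rng = random.Random(seed)
    boxes = new_boxes(start_stones)
    wins = 0
    for _ in range(games):
        history, won = play_game(boxes, start_stones, rng)
        reinforce(boxes, history, won)
        wins += won
    return boxes, wins / games


if __name__ == "__main__":
    boxes, win_rate = train()
    print("win rate during training:", round(win_rate, 3))
    print("beads for a pile of 3 stones:", boxes[3])
```

Inspecting the boxes after training shows which moves have accumulated beads; for small piles, the move that takes every remaining stone can only ever gain beads, because it always wins.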

RL for playing games

Reinforcement Learning has had significant success in recent years in learning to play board games (e.g. Chess, Go/Baduk) and computer games (e.g. Atari arcade games). Here are some useful links:

AlphaGo is software from DeepMind (now a part of Google) which learned to play Go at, and then beyond, the level of the best human professional players (prior to AlphaGo, computer programs had never exceeded amateur-level play at Go). Variants of the AlphaGo approach, such as AlphaZero, have also been applied to other games such as Chess:
https://deepmind.com/research/case-studies/alphago-the-story-so-far
...and here's the official trailer for a documentary movie about AlphaGo and its challenge match against top Go player Lee Sedol:
https://www.youtube.com/watch?v=8tq1C8spV_g
...and the full movie (~1hr30 mins, but highly recommended - it's both educational and gripping at the same time):
https://www.youtube.com/watch?v=WXuK6gekU1Y

DeepMind have also had success in teaching RL agents to play Atari 2600 video games. Here's a link to a paper about the 'Arcade Learning Environment' (ALE), an environment for teaching an RL agent to play such games:
https://arxiv.org/abs/1207.4708

Blog post on using RL to play the OpenAI Taxi game:
https://medium.com/@arnavparuthi/a-simple-reinforcement-learning-algorithm-which-plays-openais-taxi-game-2b1c6c251bb1

Potential serious 'real-world' applications of RL

For those who may not consider playing games to be a 'serious' application of RL, here is a space for links to some other 'real-world' applications. Many applications of RL can readily be envisaged, such as autonomous driving, tumour detection in medical imaging, target detection in radar etc. Note, however, that it isn't always easy to tell which of these have actually been implemented, and which have so far only been talked about...

Blog article on '10 Real-Life Applications of Reinforcement Learning':
https://neptune.ai/blog/reinforcement-learning-applications

RL for tuning microwave RF cavity filters:
Z. Wang, J. Yang, J. Hu, W. Feng and Y. Ou, "Reinforcement Learning Approach to Learning Human Experience in Tuning Cavity Filters", Proceedings of the 2015 IEEE Conference on Robotics and Biomimetics, Zhuhai, China, December 6-9, 2015

Google DeepMind on 'Using AI to predict retinal disease progression':
https://deepmind.com/blog/article/Using_ai_to_predict_retinal_disease_progression
...and 'AlphaFold: Using AI for scientific discovery' (protein folding):
https://deepmind.com/blog/article/AlphaFold-Using-AI-for-scientific-discovery
