@domdomegg
Created July 13, 2020 20:47
Quick n' dirty Q learning implementation in JavaScript
const input = {
  learningRate: 0.1,
  discountFactor: 0.95,
  initialQ: {},
  moves: [
    { start: 's17', action: 'right', reward: 0, finish: 's18' },
    { start: 's18', action: 'up', reward: 10, finish: 's14' },
    { start: 's14', action: 'right', reward: -4, finish: 's15' },
    { start: 's23', action: 'up', reward: 0, finish: 's18' },
    { start: 's18', action: 'up', reward: 0, finish: 's13' },
    { start: 's13', action: 'right', reward: 10, finish: 's14' },
  ],
};
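// Start from the supplied initial Q-table (deep-copied so the input is not mutated).
const Q = JSON.parse(JSON.stringify(input.initialQ));
input.moves.forEach(({ start, action, reward, finish }) => {
  Q[start] = Q[start] || {};
  Q[start][action] = Q[start][action] || 0;
  // Unseen states and untried actions count as 0, hence the 0 passed to Math.max.
  const bestNext = Math.max(0, ...Object.values(Q[finish] || {}));
  // Q(s, a) += learningRate * (reward + discountFactor * max Q(s', a') - Q(s, a))
  Q[start][action] += input.learningRate * (reward - Q[start][action] + input.discountFactor * bestNext);
});
console.log(Q);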
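
The update rule in the loop above can also be pulled out into a small function, which makes each step easier to check by hand. This is only a sketch: `updateQ`, `table`, and `exampleMoves` are hypothetical names that are not part of the gist, and it assumes the same convention as the gist, namely that unseen states and untried actions have a value of 0.

```javascript
// Hypothetical helper (not in the original gist): applies one tabular
// Q-learning update in place and returns the new value for (start, action).
function updateQ(table, move, learningRate, discountFactor) {
  const { start, action, reward, finish } = move;
  table[start] = table[start] || {};
  const oldValue = table[start][action] || 0;
  // Assumption carried over from the gist: untried actions count as 0,
  // so the best next-state value is never below 0.
  const bestNext = Math.max(0, ...Object.values(table[finish] || {}));
  const target = reward + discountFactor * bestNext;
  table[start][action] = oldValue + learningRate * (target - oldValue);
  return table[start][action];
}

// Same moves and hyperparameters as the gist's input.
const exampleMoves = [
  { start: 's17', action: 'right', reward: 0, finish: 's18' },
  { start: 's18', action: 'up', reward: 10, finish: 's14' },
  { start: 's14', action: 'right', reward: -4, finish: 's15' },
  { start: 's23', action: 'up', reward: 0, finish: 's18' },
  { start: 's18', action: 'up', reward: 0, finish: 's13' },
  { start: 's13', action: 'right', reward: 10, finish: 's14' },
];

const table = {};
for (const move of exampleMoves) {
  updateQ(table, move, 0.1, 0.95);
}
console.log(table);
```

Worked by hand, and up to floating-point rounding, the six moves give `s17.right = 0`, `s18.up = 0.9` (1 after the first visit, then pulled down by the zero-reward second visit), `s14.right = -0.4`, `s23.up = 0.095`, and `s13.right = 1`.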