@PetarIvancevic
Created September 24, 2018 14:35
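// Assumes `_` (lodash), `moves`, `numMoves`, `netConfig` and
// `normalizeReluOutput` are defined elsewhere in the surrounding file.
const discountFactor = 0.85
const trainingSets = []

// Build one training target per move from the reward plus the discounted
// value of the next state. The last move has no successor and is skipped.
for (let i = 0; i < numMoves - 1; i++) {
  // Not used in the target below.
  const currentNetStateNormalized = netConfig.netNormalizedOutput(moves[i].boardVector)[0]
  const reward = moves[i].reward
  const nextNetStateNormalized = netConfig.netNormalizedOutput(moves[i + 1].boardVector)[0]
  const netOutput = normalizeReluOutput(reward + discountFactor * nextNetStateNormalized)

  trainingSets.push({
    boardVector: moves[i].boardVector,
    netOutput: [netOutput]
  })
}

// Train the network for a single iteration on the collected pairs.
netConfig.net.train(_.map(trainingSets, function (trainingSet) {
  return {
    input: trainingSet.boardVector,
    output: trainingSet.netOutput
  }
}), {
  iterations: 1
})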
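
The snippet above depends on objects that are not shown (`netConfig`, `moves`, `normalizeReluOutput`), so it cannot be run on its own. The following is a minimal, self-contained sketch of the same target construction. The helper `buildTdTargets`, the stub value function `valueOf`, and the `clamp01` squashing function are hypothetical stand-ins, not the author's actual network or normalization.

```javascript
// Hypothetical helper: builds (input, output) training pairs from one game.
// `valueOf` stands in for netConfig.netNormalizedOutput(...)[0] and
// `squash` stands in for normalizeReluOutput; both are assumptions.
function buildTdTargets (moves, valueOf, discountFactor, squash) {
  const trainingSets = []
  // The last move has no successor state, so it produces no target.
  for (let i = 0; i < moves.length - 1; i++) {
    const nextValue = valueOf(moves[i + 1].boardVector)
    const target = squash(moves[i].reward + discountFactor * nextValue)
    trainingSets.push({ input: moves[i].boardVector, output: [target] })
  }
  return trainingSets
}

// Toy data, invented purely for illustration.
const moves = [
  { boardVector: [0, 0], reward: 0 },
  { boardVector: [1, 0], reward: 1 },
  { boardVector: [1, 1], reward: 0 }
]

// Stub value function: a fixed linear score instead of a trained network.
const valueOf = (boardVector) => boardVector[0] * 0.5 + boardVector[1] * 0.25

// Assumed squashing: clamp the target into [0, 1].
const clamp01 = (x) => Math.min(1, Math.max(0, x))

const sets = buildTdTargets(moves, valueOf, 0.85, clamp01)
console.log(sets)
```

With three recorded moves this yields two training pairs, because each target needs the value of the following state.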