@davidADSP
Last active December 1, 2019 21:04
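# MuZeroConfig, Node, Network and softmax_sample are not defined in this
# snippet; they are assumed to be provided by the surrounding code.


def select_action(config: MuZeroConfig, num_moves: int, node: Node,
                  network: Network):
    # Pair each child's visit count with the action that leads to it.
    visit_counts = [
        (child.visit_count, action) for action, child in node.children.items()
    ]
    # The temperature depends on the move number and the training progress.
    t = config.visit_softmax_temperature_fn(
        num_moves=num_moves, training_steps=network.training_steps())
    # Sample an action from the visit-count distribution at temperature t.
    _, action = softmax_sample(visit_counts, t)
    return action


def visit_softmax_temperature(num_moves, training_steps):
    # training_steps is accepted to match the call above but is unused here.
    if num_moves < 30:
        return 1.0
    else:
        return 0.0  # Play according to the max.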
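
The snippet calls `softmax_sample` without defining it. The following is a minimal sketch of what such a helper might look like, not the author's implementation. It assumes that the input is a list of `(visit_count, action)` pairs, that a temperature of 0 means "pick the most visited action", and that for a positive temperature each pair is drawn with probability proportional to `visit_count ** (1 / temperature)`. The optional `rng` parameter is a hypothetical addition to make sampling reproducible.

```python
import random
from typing import List, Optional, Tuple, TypeVar

A = TypeVar("A")


def softmax_sample(visit_counts: List[Tuple[int, A]],
                   temperature: float,
                   rng: Optional[random.Random] = None) -> Tuple[int, A]:
    """Return one (visit_count, action) pair chosen by visit count.

    Assumption: weights are visit_count ** (1 / temperature), and a
    temperature of 0 (or below) selects the most visited action.
    """
    if not visit_counts:
        raise ValueError("visit_counts must not be empty")

    # Temperature 0: play according to the max, with no randomness.
    if temperature <= 0.0:
        return max(visit_counts, key=lambda pair: pair[0])

    if rng is None:
        rng = random.Random()

    # Lower temperatures sharpen the distribution towards the largest count.
    weights = [count ** (1.0 / temperature) for count, _ in visit_counts]

    # random.choices rejects all-zero weights, so fall back to a uniform pick.
    if sum(weights) == 0:
        return rng.choice(visit_counts)

    return rng.choices(visit_counts, weights=weights, k=1)[0]
```

With `visit_softmax_temperature` as defined above, the first 30 moves would be sampled in proportion to the raw visit counts (temperature 1.0), and every later move would take the most visited child.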