@mlzxy
Last active December 26, 2020 08:08
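# In vanilla CFR, we recurse into every action available at information set I,
# scaling the acting player's reach probability by σ[t][I][a].
v[I] = {a: cfr(h + [a], {**π_i, P(h): π_i[P(h)] * σ[t][I][a]}, i, t)
        for a in A[I]}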
# In outcome-sampling MCCFR, we sample a single action a from A[I] and recurse only into it.
a = sample(A[I], σ[t][I])  # or sample from `ϵ * uniform + (1-ϵ) * σ[t][I]` to keep exploring
v[I][a] = mccfr(h + [a], {**π_i, P(h): π_i[P(h)] * σ[t][I][a]})
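The contrast above can be tried out on a single decision node. The following is only a minimal sketch, not the gist author's implementation: `sample_action`, `full_values`, `sampled_values`, and the `child_value` callback are hypothetical names introduced here, and the recursion into `cfr`/`mccfr` is replaced by a lookup of a fixed value per action. It also assumes the sampled value is divided by the probability the action was sampled with, which is the usual importance-weighting step that keeps a sampled estimate unbiased.

```python
import random


def sample_action(actions, strategy, epsilon=0.0, rng=random):
    """Sample one action from epsilon * uniform + (1 - epsilon) * strategy.

    Returns the action and the probability it was sampled with.
    """
    n = len(actions)
    probs = [epsilon / n + (1.0 - epsilon) * strategy[a] for a in actions]
    a = rng.choices(actions, weights=probs, k=1)[0]
    return a, probs[actions.index(a)]


def full_values(actions, child_value):
    """CFR-style: evaluate every action at the node."""
    return {a: child_value(a) for a in actions}


def sampled_values(actions, strategy, child_value, epsilon=0.0, rng=random):
    """Outcome-sampling style: evaluate one sampled action only.

    The sampled value is divided by its sampling probability (an assumption
    of this sketch), so the estimate matches full_values in expectation.
    """
    a, q = sample_action(actions, strategy, epsilon, rng)
    v = {b: 0.0 for b in actions}
    v[a] = child_value(a) / q
    return v


if __name__ == "__main__":
    # Hypothetical toy node: three actions with fixed child values.
    actions = ["fold", "call", "raise"]
    strategy = {"fold": 0.2, "call": 0.5, "raise": 0.3}
    values = {"fold": -1.0, "call": 0.5, "raise": 2.0}
    print(full_values(actions, values.get))
    print(sampled_values(actions, strategy, values.get, epsilon=0.1))
```

Averaging `sampled_values` over many draws should approach `full_values`, at the cost of higher variance per iteration; each draw touches only one child instead of all of them.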