@ForteXX-2020
Last active January 12, 2021 22:01
0002_reinforcement
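import numpy as np

# State space: the environment yields 1 four times out of five.
ssp = [1, 1, 1, 1, 0]


def epoch():
    # Action space: the agent starts out unbiased between 1 and 0.
    asp = [1, 0]
    tr = 0  # total reward collected in this epoch
    for _ in range(100):
        a = np.random.choice(asp)  # agent's guess
        s = np.random.choice(ssp)  # state produced by the environment
        if a == s:
            tr += 1
        # Record every observed state so later guesses follow the state frequencies.
        asp.append(s)
    return tr


# Average total reward over 15 independent epochs.
rl = np.array([epoch() for _ in range(15)])
print(rl.mean())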
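
The script above is unseeded, so each run prints a different mean. The following is a minimal sketch, not the author's code, that makes the experiment reproducible and adds a non-learning baseline for comparison. The names `run_epoch`, `steps`, and `learn` and the seed value are illustrative assumptions. It also assumes that the observed state is appended on every step. If the agent's guesses end up matching the 80/20 state distribution, the expected reward per step approaches 0.8 * 0.8 + 0.2 * 0.2 = 0.68, against 0.5 for an agent that never updates.

```python
import numpy as np


def run_epoch(rng, steps=100, learn=True):
    """Play one epoch and return the total reward (number of correct guesses).

    Hypothetical helper: `rng`, `steps`, and `learn` are not part of the
    original script; they are introduced here only for illustration.
    """
    state_space = [1, 1, 1, 1, 0]  # environment yields 1 with probability 0.8
    action_space = [1, 0]          # agent starts unbiased
    total_reward = 0
    for _ in range(steps):
        action = rng.choice(action_space)
        state = rng.choice(state_space)
        if action == state:
            total_reward += 1
        if learn:
            # Assumption: the agent records every observed state.
            action_space.append(state)
    return total_reward


rng = np.random.default_rng(seed=42)  # arbitrary seed, chosen for repeatability
learning = np.array([run_epoch(rng, learn=True) for _ in range(15)])
baseline = np.array([run_epoch(rng, learn=False) for _ in range(15)])
print(learning.mean(), baseline.mean())
```

Passing the generator in explicitly keeps the function free of global random state, so the same seed reproduces the same sequence of epochs.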