Oscar Bastardo OscarBastardo

def new_stack(size):
    # our stack will be a list of [list, int] for [contents, head]
    # so stack[0] is contents
    # stack[1] is head
    contents = [0] * size
    head = 0
    return [contents, head]

def is_full(stack):
    # this will not work
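
The preview above cuts off inside is_full, so the author's actual body is not shown. As a minimal sketch of how the [contents, head] layout could be used, the block below assumes that head is the index of the next free slot, which makes the stack full when head equals len(contents). The helpers is_empty, push and pop are hypothetical names added for illustration and are not part of the original gist.

```python
def new_stack(size):
    # Same layout as the gist: stack[0] is contents, stack[1] is head.
    contents = [0] * size
    head = 0
    return [contents, head]


def is_full(stack):
    # Assumption: head is the index of the next free slot,
    # so the stack is full once head reaches the capacity.
    contents, head = stack
    return head == len(contents)


def is_empty(stack):
    # Hypothetical helper: nothing has been pushed when head is 0.
    return stack[1] == 0


def push(stack, value):
    # Hypothetical helper. head is a plain int stored in the list, so it
    # must be updated through stack[1]; rebinding a local copy would be lost.
    if is_full(stack):
        raise OverflowError("stack is full")
    stack[0][stack[1]] = value
    stack[1] += 1


def pop(stack):
    # Hypothetical helper: step head back and return the value stored there.
    if is_empty(stack):
        raise IndexError("stack is empty")
    stack[1] -= 1
    return stack[0][stack[1]]
```

Because an int is immutable, unpacking with contents, head = stack and then incrementing head would leave the stack unchanged; the sketch writes to stack[1] directly for that reason.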
OscarBastardo / onestep_AC
Created April 20, 2022 07:50
Running 2 agents for 100 episodes
Total returns of episode 0: -570.6499741403661
Total returns of episode 1: -648.0553254342431
Total returns of episode 2: -278.59876059469025
Total returns of episode 3: -451.24529167272294
Total returns of episode 4: -252.09881225187402
Total returns of episode 5: -53.53698923783439
Total returns of episode 6: -303.4262905515709
Total returns of episode 7: -127.713942630245
Total returns of episode 8: -538.5784137058114
Total returns of episode 9: -503.7856493101774
OscarBastardo / DQN_returns
Created April 20, 2022 10:51
100 episodes
Episode #1 Epsilon:0.99 Episode Score:-124 Episode time:1.97s Replay time: 1.95s
Episode #2 Epsilon:0.98 Episode Score:-79 Episode time:2.37s Replay time: 2.36s
Episode #3 Epsilon:0.97 Episode Score:-82 Episode time:3.03s Replay time: 2.89s
Episode #4 Epsilon:0.96 Episode Score:-93 Episode time:2.35s Replay time: 2.29s
Episode #5 Epsilon:0.95 Episode Score:-209 Episode time:2.28s Replay time: 2.22s
Episode #6 Epsilon:0.94 Episode Score:-62 Episode time:2.55s Replay time: 2.48s
Episode #7 Epsilon:0.93 Episode Score:-330 Episode time:4.68s Replay time: 4.54s
Episode #8 Epsilon:0.92 Episode Score:-178 Episode time:5.29s Replay time: 5.01s
Episode #9 Epsilon:0.91 Episode Score:-252 Episode time:2.22s Replay time: 2.17s
Episode #10 Epsilon:0.90 Episode Score:-119 Episode time:4.03s Replay time: 3.76s