@renan-cunha
renan-cunha / Fill.tst
Created December 25, 2019 05:13
Hack Machine Language I/O
// This program is part of project 4 of Nand2Tetris
// https://www.nand2tetris.org/project04
// The program runs an infinite loop that listens to the keyboard input.
// When a key is pressed (any key), the program blackens the screen, i.e.
// writes "black" in every pixel; the screen should remain fully black as
// long as the key is pressed.
// When no key is pressed, the program clears the screen, i.e. writes "white"
// in every pixel; the screen should remain fully clear as long as no key is pressed.
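The loop described above can be sketched in Python (this is an illustration only; the actual gist is Hack assembly, and `fill_step` plus the flat `screen` list are hypothetical stand-ins for the Hack SCREEN and KBD memory maps):

```python
# The Hack screen memory map is 8192 16-bit words (512 x 256 pixels);
# the keyboard map is a single word that is 0 when no key is pressed.
SCREEN_WORDS = 8192  # 512 * 256 / 16

def fill_step(screen, key_code):
    """One pass of the Fill loop: blacken the whole screen if any key
    is down (key_code != 0), otherwise clear it to white."""
    value = -1 if key_code != 0 else 0  # -1 = all 16 bits set = black
    for i in range(SCREEN_WORDS):
        screen[i] = value
    return screen
```

The real Hack program does the same thing with a probe of RAM[KBD] and a write loop over RAM[SCREEN..SCREEN+8191].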
@renan-cunha
renan-cunha / bit_flipping_env.py
Created November 23, 2019 21:06
HER not working with modified version of bitflipping that has max_step = 1
from collections import OrderedDict
import numpy as np
from gym import GoalEnv, spaces

class BitFlippingEnv(GoalEnv):
    """
    Simple bit flipping env, useful to test HER.
    The goal is to flip all the bits to get a vector of ones.
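The gist preview is cut off after the docstring. To make the reported issue concrete, here is a minimal sketch of the bit-flipping dynamics without the gym scaffolding (class and attribute names are hypothetical, not from the gist): with `max_steps=1` an episode can only ever flip a single bit, which leaves HER almost no hindsight goals to relabel.

```python
import numpy as np

class MiniBitFlip:
    """Minimal bit-flipping dynamics (no gym scaffolding).

    Each step flips one chosen bit; the episode ends when all bits
    are 1 or when max_steps is reached. This is a sketch to illustrate
    the max_steps=1 corner case, not the gist's actual BitFlippingEnv.
    """
    def __init__(self, n_bits=4, max_steps=1):
        self.n_bits = n_bits
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        self.state = np.zeros(self.n_bits, dtype=int)
        self.steps = 0
        return self.state.copy()

    def step(self, action):
        self.state[action] ^= 1          # flip the chosen bit
        self.steps += 1
        success = bool(self.state.all())
        done = success or self.steps >= self.max_steps
        reward = 0.0 if success else -1.0
        return self.state.copy(), reward, done, {"is_success": success}
```

With `max_steps=1`, `done` is True after every single step, so a trajectory contains exactly one transition.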
@renan-cunha
renan-cunha / agent.py
Created August 30, 2019 13:49 — forked from edwardyu/agent.py
Implementation of policy gradient from scratch.
import random
import numpy as np
import scipy.stats

class LinearSoftmaxAgent(object):
    """Act with a softmax policy. Features are encoded so that
    phi(s, a) is a one-hot vector of states."""
    def __init__(self, state_size, action_size):
        self.state_size = state_size
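The forked gist is truncated after the constructor. A self-contained sketch in the same spirit, a REINFORCE-style agent with a linear softmax policy over one-hot state features, might look like the following (all names here are illustrative, not the gist's actual code):

```python
import numpy as np

class LinearSoftmaxSketch:
    """REINFORCE with a linear softmax policy. With one-hot state
    features, the parameters reduce to a (state_size, action_size)
    table of logits, one row per state."""
    def __init__(self, state_size, action_size, lr=0.1):
        self.theta = np.zeros((state_size, action_size))
        self.lr = lr

    def probs(self, state):
        logits = self.theta[state]
        logits = logits - logits.max()   # subtract max for numerical stability
        e = np.exp(logits)
        return e / e.sum()

    def act(self, state, rng):
        p = self.probs(state)
        return rng.choice(len(p), p=p)

    def update(self, episode, gamma=0.99):
        """episode: list of (state, action, reward) tuples.
        Walk backwards accumulating Monte-Carlo returns, then take a
        gradient step on log pi(a|s) scaled by the return."""
        G = 0.0
        for state, action, reward in reversed(episode):
            G = reward + gamma * G
            p = self.probs(state)
            grad = -p                    # d log softmax / d logits
            grad[action] += 1.0
            self.theta[state] += self.lr * G * grad
```

A positive return pushes probability toward the action taken in that state; a negative return pushes it away.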