Renan Cunha renan-cunha

renan-cunha / agent.py
Created August 30, 2019 13:49 — forked from edwardyu/agent.py
Implementation of policy gradient from scratch.
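As a rough illustration of what "softmax policy" over linear features means, here is a minimal sketch. It is not the gist's own code: the function name `softmax_policy`, the flat `theta` layout, and the assumption that phi(s, a) places the state vector in the slot belonging to action `a` are all hypothetical choices made for this example.

```python
import numpy as np


def softmax_policy(theta, state, action_size):
    """Return action probabilities for a linear softmax policy.

    Assumption (not taken from the gist): phi(s, a) puts the state vector
    in the block for action a, so theta can be read as an
    (action_size, state_size) matrix of per-action weights.
    """
    state = np.asarray(state, dtype=float)
    # One preference per action: theta . phi(s, a)
    prefs = np.asarray(theta, dtype=float).reshape(action_size, state.size) @ state
    # Subtract the max before exponentiating for numerical stability
    prefs = prefs - prefs.max()
    exp_prefs = np.exp(prefs)
    return exp_prefs / exp_prefs.sum()


# With all-zero weights every action is equally likely.
uniform_probs = softmax_policy(np.zeros(6), [1.0, 0.0, 0.0], action_size=2)

# Raising the weight tied to (action 0, state component 0) favours action 0.
biased_probs = softmax_policy(
    np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0]), [1.0, 0.0, 0.0], action_size=2
)
```

Sampling an action from the returned probabilities could then be done with `np.random.choice(action_size, p=probs)`.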
import random

import numpy as np
import scipy.stats


class LinearSoftmaxAgent(object):
    """Act with a softmax policy. Features phi(s, a) are encoded as
    a 1-hot vector of states."""

    def __init__(self, state_size, action_size):
        self.state_size = state_size