mashoujiang / pg-pong.py
Created March 21, 2018 01:31 — forked from karpathy/pg-pong.py
Training a Neural Network ATARI Pong agent with Policy Gradients from raw pixels
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
import numpy as np
try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3, where cPickle was merged into pickle
import gym
# hyperparameters
H = 200 # number of hidden layer neurons
batch_size = 10 # number of episodes between parameter updates
learning_rate = 1e-4
gamma = 0.99 # discount factor for reward