Andrej Karpathy

karpathy / gist:587454dc0146a6ae21fc
Last active Oct 27, 2020
An efficient, batched LSTM.
This is a batched LSTM forward and backward pass
import numpy as np
import code
class LSTM:
    @staticmethod
    def init(input_size, hidden_size, fancy_forget_bias_init = 3):
karpathy / gist:7bae8033dcf5ca2630ba
Created May 5, 2015
Efficient LSTM cell in Torch
Efficient LSTM in Torch using nngraph library. This code was optimized
by Justin Johnson (@jcjohnson) based on the trick of batching up the
LSTM GEMMs, as also seen in my efficient Python LSTM gist.
function LSTM.fast_lstm(input_size, rnn_size)
local x = nn.Identity()()
local prev_c = nn.Identity()()
local prev_h = nn.Identity()()
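The "batching up the LSTM GEMMs" trick mentioned above can be sketched in NumPy. This is a minimal illustration and not the gist's code: the gate ordering (input, forget, output, candidate), the weight layout, and the names `lstm_step_batched` and `sigmoid` are assumptions made for the example. The idea is that one matrix multiply produces the pre-activations of all four gates at once, instead of four separate multiplies.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_batched(x, prev_h, prev_c, W, b):
    """One LSTM step for a batch, with all gates computed by a single GEMM.

    x:      (n, input_size)
    prev_h: (n, hidden)
    prev_c: (n, hidden)
    W:      (input_size + hidden, 4 * hidden)  # assumed layout: [i | f | o | g]
    b:      (4 * hidden,)
    """
    hidden = prev_h.shape[1]
    # Single matrix multiply yields the pre-activations of all four gates.
    z = np.hstack([x, prev_h]) @ W + b
    i = sigmoid(z[:, 0 * hidden:1 * hidden])   # input gate
    f = sigmoid(z[:, 1 * hidden:2 * hidden])   # forget gate
    o = sigmoid(z[:, 2 * hidden:3 * hidden])   # output gate
    g = np.tanh(z[:, 3 * hidden:4 * hidden])   # candidate cell update
    c = f * prev_c + i * g
    h = o * np.tanh(c)
    return h, c
```

Slicing one wide result is typically faster than four narrow multiplies because the BLAS call amortizes its overhead over a larger matrix.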
karpathy / gist:f3ee599538ff78e1bbe9
Last active Jul 6, 2019
Batched L2 Normalization Layer for Torch nn package
This layer expects an [n x d] Tensor and normalizes each
row to have unit L2 norm.
local L2Normalize, parent = torch.class('nn.L2Normalize', 'nn.Module')
function L2Normalize:__init()
function L2Normalize:updateOutput(input)
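As a rough NumPy equivalent of the forward pass described above (each row of an [n x d] array scaled to unit L2 norm), the following sketch may help. The function name `l2_normalize_rows` and the `eps` guard against zero rows are assumptions for the example; they are not taken from the Torch layer.

```python
import numpy as np

def l2_normalize_rows(x, eps=1e-12):
    """Scale each row of an (n, d) array to unit L2 norm.

    eps is a hypothetical safeguard so all-zero rows do not divide by zero.
    """
    norms = np.sqrt(np.sum(x * x, axis=1, keepdims=True))  # shape (n, 1)
    return x / np.maximum(norms, eps)
```

Keeping `keepdims=True` lets the (n, 1) norms broadcast across the d columns without an explicit reshape.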
karpathy /
Last active Feb 25, 2021
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
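A character-level model of this kind needs a mapping between characters and integer indices before any training can happen. The sketch below shows one way to build it; the stand-in string, the use of `sorted` for a deterministic ordering, and the helper `one_hot` are assumptions for illustration rather than the gist's own code.

```python
import numpy as np

data = "hello world"            # stand-in for the contents of input.txt
chars = sorted(set(data))       # sorted so the index assignment is reproducible
vocab_size = len(chars)

char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for i, ch in enumerate(chars)}

def one_hot(ch):
    """Return a (vocab_size, 1) column vector with a 1 at the character's index."""
    v = np.zeros((vocab_size, 1))
    v[char_to_ix[ch]] = 1.0
    return v

# Encode the text as indices, then decode it back.
encoded = [char_to_ix[ch] for ch in data]
decoded = "".join(ix_to_char[i] for i in encoded)
```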
karpathy / gist:88701557e59199f16045
Last active Mar 6, 2019
Google Slides in present mode shows the next slide, but it is tiny and very difficult to see. This CSS hack makes the next slide large.
.punch-viewer-speakernotes-side-panel {
  width: 400px !important;
}
.punch-viewer-speakernotes-text-body-scrollable {
  left: 435px !important;
}
.punch-viewer-speakernotes-page svg {
  width: 400px !important;
  height: 300px !important;
}
karpathy /
Created May 30, 2016
Training a Neural Network ATARI Pong agent with Policy Gradients from raw pixels
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
import numpy as np
import pickle # Python 3 module name; the original Python 2 code imported cPickle
import gym
# hyperparameters
H = 200 # number of hidden layer neurons
batch_size = 10 # every how many episodes to do a param update?
learning_rate = 1e-4
gamma = 0.99 # discount factor for reward
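To make the role of the discount factor `gamma` concrete, here is a minimal sketch of computing discounted returns from a sequence of per-step rewards. The function name `discount_rewards` is an assumption for the example, and this plain version does not handle game boundaries or any other Pong-specific detail.

```python
import numpy as np

def discount_rewards(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every t."""
    rewards = np.asarray(rewards, dtype=float)
    discounted = np.zeros_like(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return already computed for t + 1.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        discounted[t] = running
    return discounted
```

With `gamma` close to 1, a reward received at the end of an episode still contributes noticeably to the returns of much earlier actions, which is what lets policy gradients assign credit across many time steps.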
karpathy /
Last active Jan 23, 2021
Natural Evolution Strategies (NES) toy example that optimizes a quadratic function
A bare-bones example of optimizing a black-box function (f) using
Natural Evolution Strategies (NES), where the parameter distribution is a
Gaussian of fixed standard deviation.
import numpy as np
# the function we want to optimize
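The description above can be sketched as follows. This is an illustrative version under stated assumptions, not the gist's code: the quadratic objective, its `target` vector, the function name `nes`, and the hyperparameter defaults (`npop`, `sigma`, `alpha`, `iters`, `seed`) are all chosen for the example. Each iteration samples Gaussian perturbations of the current parameters, scores them with the black-box function, and moves the parameters toward the perturbations that scored above average.

```python
import numpy as np

target = np.array([0.5, 0.1, -0.3])  # hypothetical optimum for the toy objective

def f(w):
    """Toy black-box objective: negative squared distance to `target` (higher is better)."""
    return -np.sum((w - target) ** 2)

def nes(f, w, npop=50, sigma=0.1, alpha=0.001, iters=300, seed=0):
    """Maximize f with a fixed-sigma Gaussian search distribution centered at w."""
    rng = np.random.default_rng(seed)
    w = np.array(w, dtype=float)
    for _ in range(iters):
        noise = rng.standard_normal((npop, w.size))        # one perturbation per row
        rewards = np.array([f(w + sigma * n) for n in noise])
        # Standardize rewards so the step size does not depend on the scale of f.
        advantages = (rewards - rewards.mean()) / rewards.std()
        # Reward-weighted sum of perturbations approximates the gradient direction.
        w = w + alpha / (npop * sigma) * (noise.T @ advantages)
    return w
```

A call such as `nes(f, np.zeros(3))` should return parameters close to `target`. Note that only evaluations of `f` are used, never its gradient.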