@nsaphra
nsaphra / lstm_internal_hook.py
Created July 16, 2019 11:23
Rerun an LSTM as a hook, so we can analyze the disassembled gate activations.
"""
Because PyTorch does not expose the internal activations of a module,
we must instead rerun the exact same function inside that module.
This is written specifically for a 1-layer LSTM with all default settings.
"""
import torch
import torch.nn as nn
from torch.autograd import Variable
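The preview above stops at the imports, so the following is only a dependency-free sketch of the computation such a hook would have to repeat for one time step, not the gist's own code. It assumes the standard LSTM equations and the gate order (input, forget, cell, output) that PyTorch documents for `weight_ih_l0` and `weight_hh_l0`; the function and helper names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    # Plain-list matrix-vector product.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def lstm_step_with_gates(x, h_prev, c_prev, w_ih, w_hh, b_ih, b_hh):
    """Run one LSTM step and also return the gate activations.

    Assumption: w_ih, w_hh, b_ih and b_hh each stack 4 * hidden rows in
    the order input, forget, cell (candidate), output.
    """
    hidden = len(h_prev)
    pre = [
        a + b + c + d
        for a, b, c, d in zip(matvec(w_ih, x), b_ih, matvec(w_hh, h_prev), b_hh)
    ]
    i = [sigmoid(z) for z in pre[0:hidden]]
    f = [sigmoid(z) for z in pre[hidden:2 * hidden]]
    g = [math.tanh(z) for z in pre[2 * hidden:3 * hidden]]
    o = [sigmoid(z) for z in pre[3 * hidden:4 * hidden]]
    # New cell state and hidden state, built from the separate gates.
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c, {"input": i, "forget": f, "cell": g, "output": o}
```

In a real hook the weights would come from the module's parameters and the inputs from the hook arguments; here they are passed in explicitly so the arithmetic is visible.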
nsaphra / token_type_counter.py
Created September 20, 2018 15:21
Count the types and tokens in a file.
import sys
types = set()
token_count = 0
for i, line in enumerate(sys.stdin):
    if i % 1000 == 0:
        print('.')
    line = line.strip().split()
    types.update(line)
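The preview is cut off before `token_count` is updated. A self-contained sketch of the same count, assuming whitespace tokenization as in the loop above, might look like this (the function name is hypothetical):

```python
def count_types_and_tokens(lines):
    """Return (number of distinct types, total number of tokens)."""
    types = set()
    token_count = 0
    for line in lines:
        tokens = line.split()
        types.update(tokens)        # a type is a distinct token string
        token_count += len(tokens)  # every occurrence counts as a token
    return len(types), token_count
```

Passing `sys.stdin` as `lines` would reproduce the command-line behaviour of reading a file from standard input.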
nsaphra / shuffle_corpus.py
Created July 9, 2018 14:59
If you have a corpus in a format where 1 file contains tokens and a different file has the corresponding POS tags, take the 2 files and shuffle them simultaneously so the tokens are still aligned with the correct tags.
# -*- coding: utf-8 -*-
import os
from random import shuffle
import argparse
parser = argparse.ArgumentParser(description='shuffle a corpus such that the tags and the original tokenized text still align')
parser.add_argument('--unshuffled_dir', type=str)
parser.add_argument('--shuffled_dir', type=str)
parser.add_argument('--tag_suffix', type=str, default='.tag')
args = parser.parse_args()
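The preview ends after argument parsing. The core idea in the description, shuffling two parallel files so that line k of the token file still matches line k of the tag file, can be sketched as follows. This is an illustration under the assumption of one sentence per line; `shuffle_aligned` and its `seed` parameter are hypothetical names, not part of the gist.

```python
import random


def shuffle_aligned(token_lines, tag_lines, seed=None):
    """Shuffle two parallel lists of lines with the same permutation."""
    if len(token_lines) != len(tag_lines):
        raise ValueError("token and tag files must have the same number of lines")
    # Pair each token line with its tag line, then shuffle the pairs.
    pairs = list(zip(token_lines, tag_lines))
    random.Random(seed).shuffle(pairs)
    if not pairs:
        return [], []
    tokens, tags = zip(*pairs)
    return list(tokens), list(tags)
```

Shuffling the pairs rather than each list separately is what keeps the alignment intact.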

Keybase proof

I hereby claim:

  • I am nsaphra on github.
  • I am nsaphra (https://keybase.io/nsaphra) on keybase.
  • I have a public key ASCpyzsqtJYqR6IjSCnoPwSjrInpOg35MPypGR9l_pvTcQo

To claim this, I am signing this object:

nsaphra / zipf.py
Created April 19, 2017 15:50
discrete log uniform power distribution
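import numpy as np
from scipy import stats

def zipf(size, exponent):
    # Ranks start at 1: a rank of 0 would make the reciprocal infinite.
    x = np.arange(1, size + 1, dtype='float')
    pmf = 1.0 / x ** exponent
    pmf /= pmf.sum()
    # rv_discrete expects values=(support, probabilities) as one tuple.
    return stats.rv_discrete(values=(np.arange(size), pmf))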
nsaphra / naughtandcrosses.py
Created March 6, 2017 19:14
Recurse Center interview code
class NoughtsAndCrosses:
    NOUGHT = "O"
    CROSS = "X"
    EMPTY = " "
    STALEMATE = "Nobody"

    def __init__(self):
        self.board = [[self.EMPTY] * 3, [self.EMPTY] * 3, [self.EMPTY] * 3]
nsaphra / tf.sh
Last active November 24, 2017 16:24
Activate a conda jupyter notebook in tmux, for use on a server with timeouts after each notebook start.
#!/bin/bash
if [ "$TERM" != "screen" ]
then
    if type tmux >/dev/null 2>&1
    then
        tmux att || tmux \
            new -s tensorflow -n shell \; \
            neww -n notebook "source activate tensorflow; cd Documents/dynamic_curriculum; jupyter notebook" \; \
            neww -n dir "cd Documents/dynamic_curriculum"
nsaphra / LispParser.jl
Last active March 2, 2016 14:49
Simple lisp parser for RC pair programming interview.
type SyntaxNode
    label::AbstractString
    parent::SyntaxNode
    children::Array{SyntaxNode}
    # TODO No error handling when going up a level with undefined parent.
    SyntaxNode() = (
        x = new();
        x.label = "";
        x.children = [];
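The Julia preview is truncated, so the parser itself is not visible. As a rough illustration of the same idea in Python (not a translation of the gist), a simple Lisp parser can track nesting with a stack of open lists instead of parent pointers. The sketch below assumes atoms are whitespace-separated and kept as strings, and it raises an error in the case the TODO above mentions, closing a level that was never opened.

```python
def tokenize(source):
    # Pad parentheses with spaces so split() isolates them.
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(source):
    """Parse a Lisp string into nested Python lists of string atoms."""
    stack = [[]]  # stack[0] collects the top-level expressions
    for token in tokenize(source):
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise SyntaxError("unexpected ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise SyntaxError("missing ')'")
    return stack[0]
```

For example, `parse("(first (list 1 (+ 2 3) 9))")` returns `[["first", ["list", "1", ["+", "2", "3"], "9"]]]`.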
nsaphra / concatenate_corpus.py
Created February 17, 2015 17:29
Concatenate all the files in a directory, recursively, and print their contents.
#!/usr/bin/python
from collections import defaultdict
import json
import os
import argparse
import gzip
import sys
import codecs
from time import asctime
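Only the imports of this script are shown. A minimal sketch of the recursive concatenation named in the description, assuming plain UTF-8 text files and a deterministic alphabetical traversal, could be written with `os.walk`. The function name is hypothetical, and whatever the original does with `gzip` or `json` is not reproduced here.

```python
import os


def iter_corpus_lines(root):
    """Yield every line of every file under root, walking recursively."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort makes os.walk visit subdirectories in order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as handle:
                for line in handle:
                    yield line
```

Writing each yielded line to `sys.stdout` would print the concatenated corpus.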
nsaphra / SparsePy.jl
Created December 16, 2014 23:01
SparsePy.jl
module SparsePy
# TODO this is only for CSC sparse matrix PyObjects and julia matrices.
# Add other types when julia releases them?
require("PyCall")
using PyCall
export jlmat2pymat, pymat2jlmat
@pyimport scipy.sparse as pysparse