Nathan Schneider nschneid

@nschneid
nschneid / iologreg.py
Created March 24, 2014 20:17
Preliminary attempt at sparse learning in creg2. Non-sparse counterpart code is included for comparison.
import numpy as np
import scipy
import random
import math
import sys
INFINITY = float('inf')
def logadd(a,b):
"""
nschneid / pre-commit
Created April 26, 2015 14:19
Prevent git commits that miss files included in a LaTeX project
#!/bin/bash
# Git pre-commit hook to look for untracked files mentioned in the LaTeX and BibTeX logs.
# Fail if any are found. Note that this is not foolproof, as included .tex files
# not generating any errors or warnings may not be mentioned in the log.
#
# Goes in file .git/hooks/pre-commit under the repository root.
#
# Nathan Schneider (nschneid@cs.cmu.edu), 2015-02-26
# Adapted from http://stackoverflow.com/a/10932301
#
nschneid / allcats.txt
Created June 25, 2012 15:52
Document-to-category mapping for NLTK ptb module (full Penn Treebank corpus reader)
WSJ/00/WSJ_0001.MRG news
WSJ/00/WSJ_0002.MRG news
WSJ/00/WSJ_0003.MRG news
WSJ/00/WSJ_0004.MRG news
WSJ/00/WSJ_0005.MRG news
WSJ/00/WSJ_0006.MRG news
WSJ/00/WSJ_0007.MRG news
WSJ/00/WSJ_0008.MRG news
WSJ/00/WSJ_0009.MRG news
WSJ/00/WSJ_0010.MRG news
nschneid / universal_tags.py
Created December 7, 2012 06:50
Utility for mapping to universal part-of-speech tagset
'''
Interface for converting POS tags from various treebanks
to the universal tagset of Petrov, Das, & McDonald.
The tagset consists of the following 12 coarse tags:
VERB - verbs (all tenses and modes)
NOUN - nouns (common and proper)
PRON - pronouns
ADJ - adjectives
nschneid / Lectures.html
Created February 13, 2016 16:55
Exported HTML from Quiver note
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<!-- common.css -->
<style>* {-webkit-tap-highlight-color: rgba(0,0,0,0);}html {-webkit-text-size-adjust: none;}body {font-family: Arial, Helvetica, sans-serif;margin: 0;color: #333;word-wrap: break-word;}h1, h2, h3, h4, h5, h6 {line-height: 1.1;}img {max-width: 100% !important;}blockquote {margin: 0;padding: 0 15px;color: #777;border-left: 4px solid #ddd;}hr {background-color: #ddd;border: 0;height: 1px;margin: 15px 0;}code {font-family: Menlo, Consolas, 'Ubuntu Mono', Monaco, 'source-code-pro', monospace;line-height: 1.4;margin: 0;padding: 0.2em 0;font-size: 85%;background-color: rgba(0,0,0,0.04);border-radius: 3px;}pre > code {margin: 0;padding: 0;font-size: 100%;word-break: normal;background: transparent;border: 0;}ol {list-style-type: decimal;}ol ol, ul ol {list-style-type: lower-latin;}ol ol ol, ul ol ol, ul ul ol, ol ul ol {list-style-type: lower-roman;}table {border-spacing: 0;border-coll
nschneid / Non-MWE non-hyphenated -ies nouns with a corresponding -y or -ie noun in WN
Last active January 3, 2017 17:56
Use fine-grained POS tags to avoid problems with WordNet morphy lemmatizer (e.g., synsets with plural lemmas)
NNS--stem to -y:
allies amenities authorities canaries contemporaries
flies follies formalities fries funnies liabilities
hostilities jimmies skivvies
NNS--stem to -y or -ie
hippies
NNPS--leave:
alleghenies
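
The grouping above suggests a tag-conditioned rule for -ies nouns. A minimal sketch, assuming Penn Treebank tags as input: the function name `ies_lemma_candidates` is hypothetical (it is not part of the gist or of NLTK), and choosing between the -y and -ie candidates would still need a WordNet lookup, which is not shown here.

```python
def ies_lemma_candidates(word, tag):
    """Return candidate lemmas for a noun, given its fine-grained PTB tag.

    Hypothetical helper for illustration only.
    """
    w = word.lower()
    if not w.endswith("ies"):
        return [w]              # rule only concerns -ies forms
    if tag == "NNPS":
        return [w]              # plural proper noun: leave unchanged
    if tag == "NNS":
        stem = w[:-3]
        # Common plural noun: try -y first, then -ie.
        return [stem + "y", stem + "ie"]
    return [w]                  # other tags: do not stem


print(ies_lemma_candidates("allies", "NNS"))
print(ies_lemma_candidates("alleghenies", "NNPS"))
```

Passing the tag explicitly keeps a plural-looking proper noun from being stemmed by the same rule that applies to common nouns.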
nschneid / GitHubLightToolbar.md
Last active February 14, 2017 01:08
Light-themed GitHub toolbar

GitHub.com recently changed its header (the toolbar at the top of every page) to a dark theme, which sticks out like a sore thumb against the rest of the page. Below is CSS to approximate the old theme. A browser add-on like Custom Style Script can be configured to inject the CSS into all pages served from the github.com domain.

nschneid / vivid-dark-hljs.css
Last active June 4, 2017 16:57
Vivid themes for highlight.js
/* Vivid Dark - Theme for highlight.js */
/* by Nathan Schneider (@nschneid) */
/* adapted from Atom theme: https://github.com/nschneid/vivid-syntax */
/* medium gray */
.hljs-comment {
color: hsl(220,9%, 55%)
}
nschneid / ptbpos2uni.py
Last active June 15, 2017 22:58
Given a new-style Penn Treebank English tree, produce the part-of-speech tags according to the Universal Dependencies project.
#!/usr/bin/env python2.7
'''
Converts new-style PTB POS tags to the English tagset from the Universal Dependencies project
(see universal-pos-en.html, from http://universaldependencies.github.io/docs/en/pos/all.html).
There are 17 such tags, expanded from the original 12 Universal POS tags of Petrov et al. 2011.
See "limitations" comment below for some details on our interpretation of the difficult-to-map
categories.
In new-style PTB, TO only applies to prepositional (not infinitival) "to".
nschneid / Purpose-fragment.txt
Created July 16, 2018 23:20
Xposition Markdown note
The following subcases serve to clarify the boundaries of [ss Purpose]:
1. A desired outcome that is separate from, but typically a motivation for (hence subtype of [ss Explanation]), the main event.
It is possible to complete the main event without realizing the purpose.
2. **Inanimate** thing or event which is aided/facilitated/addressed/achieved/acquired as a consequence of the main event:
- [ex 006 "We hired a caterer [p en/for Purpose] the party."]