richmarr /
Created March 28, 2012 10:24 — forked from kohlmeier/
Bayes net example in Python with Khan Academy data
#!/usr/bin/env python
from numpy import asmatrix, asarray, ones, zeros, mean, sum, arange, prod, dot, loadtxt
from numpy.random import random, randint
import pickle
MISSING_VALUE = -1 # a constant I will use to denote missing integer values
def impute_hidden_node(E, I, theta, sample_hidden):
richmarr / gist:3944934
Created October 24, 2012 08:56 — forked from mattb/gist:3888345
Some pointers for Natural Language Processing / Machine Learning

Here are the areas I've been researching, some things I've read and some open source packages...

Nearly all text processing starts by transforming text into vectors:
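As a minimal sketch of what "transforming text into vectors" can mean, the snippet below builds bag-of-words count vectors using only the standard library. The function names `build_vocabulary` and `vectorize`, the whitespace tokenizer, and the sample sentences are illustrative assumptions, not something taken from the packages these notes refer to.

```python
from collections import Counter

def build_vocabulary(docs):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(doc, vocab):
    """Return a list of term counts, one slot per vocabulary entry."""
    counts = Counter(doc.lower().split())
    # vocab was built in index order, so iterating it yields columns 0..n-1
    return [counts.get(tok, 0) for tok in vocab]

# Hypothetical sample documents, for illustration only
docs = ["the cat sat", "the cat ate the fish"]
vocab = build_vocabulary(docs)
vectors = [vectorize(d, vocab) for d in docs]
```

Each document becomes a fixed-length list of counts, so documents of different lengths can be compared column by column. Real pipelines usually add proper tokenization and sparse storage.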

The vectors are often reweighted with a transform such as TF-IDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms):
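To make the reweighting concrete, here is a rough sketch of one common TF-IDF variant: term frequency divided by document length, multiplied by the unsmoothed `log(N / df)`. Libraries differ in the exact smoothing and normalisation they apply, so treat the formula, the `tfidf` function name, and the sample sentences as assumptions for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {token: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each token appear?
    df = Counter(tok for toks in tokenized for tok in set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        weights.append({
            tok: (count / len(toks)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        })
    return weights

# Hypothetical sample documents, for illustration only
sample_docs = ["the cat sat", "the dog sat", "the cat ate"]
sample_weights = tfidf(sample_docs)
```

In this variant a word that appears in every document (here "the") gets a weight of zero, while a word confined to a single document gets the largest boost.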

Collocation detection is a technique for finding two or more words that occur together more often than they do separately (e.g. "wishy-washy" in English). I use this to group words into n-gram tokens, because many NLP techniques treat each word as if it's independent of all the others in a document, ignoring order:
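One simple way to score candidate collocations is pointwise mutual information (PMI) over adjacent word pairs. The sketch below is an assumption about how this could be done with the standard library, not the method used by any particular package; `bigram_pmi`, the `min_count` cutoff, and the sample sentence are all hypothetical.

```python
import math
from collections import Counter

def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by PMI = log2(P(a,b) / (P(a) * P(b)))."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores

# Hypothetical sample text, for illustration only
sample_tokens = "i like new york and you like new york but i like the new car".split()
pmi_scores = bigram_pmi(sample_tokens)
```

Pairs with a high score could then be merged into a single token such as `new_york` before vectorizing, which is one way to recover some of the word order that a bag-of-words model discards.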

$.fn.parsley.defaults = {
// basic data-api overridable properties here..
inputs: 'input, textarea, select' // Default supported inputs.
, excluded: 'input[type=hidden], :disabled' // Do not validate input[type=hidden] & :disabled.
, trigger: false // $.Event() that will trigger validation. eg: keyup, change..
, animate: true // fade in / fade out error messages
, animateDuration: 300 // fade in / fade out duration in ms
, focus: 'first' // 'first'|'last'|'none' which error field would have focus first on form validation
, validationMinlength: 3 // If a validation trigger is specified, validate only if value.length > validationMinlength
, successClass: 'has-success' // Class name on each valid input