Skip to content

Instantly share code, notes, and snippets.

@bishboria
bishboria / springer-free-maths-books.md
Last active June 8, 2024 06:39
Springer made a bunch of books available for free, these were the direct links
@mattb
mattb / gist:3888345
Created October 14, 2012 11:53
Some pointers for Natural Language Processing / Machine Learning

Here are the areas I've been researching, some things I've read and some open source packages...

Nearly all text processing starts by transforming text into vectors: http://en.wikipedia.org/wiki/Vector_space_model

Often it uses transforms such as TFIDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms): http://en.wikipedia.org/wiki/Tf%E2%80%93idf

Collocations is a technique to detect when two or more words occur more commonly together than separately (e.g. "wishy-washy" in English) - I use this to group words into n-gram tokens because many NLP techniques consider each word as if it's independent of all the others in a document, ignoring order: http://matpalm.com/blog/2011/10/22/collocations_1/

@coordt
coordt / check_for_updates.py
Created April 8, 2011 18:39
Check locally installed packages against one or more package indexes for updates and list them.
#!/usr/bin/env python
"""
Use pip to get a list of local packages to check against one or more package
indexes for updated versions.
"""
import pip
import sys, xmlrpclib
from cStringIO import StringIO
from distutils.version import StrictVersion, LooseVersion