@sankroh
sankroh / useful_pandas_snippets.py
Created April 22, 2016 15:12 — forked from bsweger/useful_pandas_snippets.md
Useful Pandas Snippets
import pandas as pd

# List unique values in a DataFrame column
df['column_name'].unique()
# Convert Series datatype to numeric; non-numeric values become NaN
df['col'] = pd.to_numeric(df['col'], errors='coerce')
# Grab DataFrame rows where column has certain values
value_list = ['value1', 'value2', 'value3']
df = df[df['column'].isin(value_list)]
sankroh / gist:3912514
Created October 18, 2012 15:21 — forked from mattb/gist:3888345
Some pointers for Natural Language Processing / Machine Learning

Here are the areas I've been researching, some things I've read and some open source packages...

Nearly all text processing starts by transforming text into vectors: http://en.wikipedia.org/wiki/Vector_space_model
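
As a rough illustration of the vector space idea, here is a minimal bag-of-words sketch in plain Python. The function name `bag_of_words` and the whitespace tokenizer are assumptions made for this example, not something taken from the linked article; real pipelines usually use a proper tokenizer and sparse vectors.

```python
from collections import Counter


def bag_of_words(docs):
    """Map each document to a term-count vector over a shared vocabulary.

    Hypothetical helper for illustration: tokenizes on whitespace only.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # The vocabulary fixes the meaning of each vector dimension.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


vocab, vectors = bag_of_words(["the cat sat", "the dog sat down"])
print(vocab)
print(vectors)
```

Each document becomes a list of counts with one position per vocabulary term, so documents of different lengths end up as vectors of the same dimension.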

Often it uses transforms such as TF-IDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms): http://en.wikipedia.org/wiki/Tf%E2%80%93idf
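
To make the weighting concrete, here is a small sketch of one common TF-IDF variant (relative term frequency times the log of inverse document frequency). The function name `tf_idf` and the whitespace tokenizer are assumptions for this example, and libraries often use smoothed variants of the formula. Note that this only shows the down-weighting of very frequent words; filtering very rare words usually needs a separate minimum-frequency cutoff.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Hypothetical helper: weight = (count / doc length) * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        weights.append({
            term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


weights = tf_idf(["the cat sat", "the dog sat", "the bird flew"])
print(weights[0])
```

A term that appears in every document (here "the") gets weight zero, while a term unique to one document gets the highest weight in it.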

Collocation detection finds cases where two or more words occur together more often than separately (e.g. "wishy-washy" in English). I use this to group words into n-gram tokens, because many NLP techniques treat each word as independent of all the others in a document, ignoring order: http://matpalm.com/blog/2011/10/22/collocations_1/
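
One way to sketch this is to score adjacent word pairs with pointwise mutual information (PMI) and then merge the high-scoring pairs into single tokens. PMI is just one common scoring choice and the linked post may use a different measure; the helper names `bigram_pmi` and `merge_collocations`, the `min_count` cutoff, and the toy sentence are all assumptions for this example.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by PMI, skipping pairs seen fewer than min_count times."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # Positive PMI: the pair co-occurs more often than independence predicts.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


def merge_collocations(tokens, pairs):
    """Join each occurrence of a chosen pair into one token like 'new_york'."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in pairs:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


tokens = "new york is big and new york is busy but the city is new".split()
scores = bigram_pmi(tokens)
merged = merge_collocations(tokens, {("new", "york")})
print(scores)
print(merged)
```

After merging, downstream steps that treat tokens as independent still see "new_york" as a single unit, which is the point of grouping words into n-gram tokens.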

sankroh / ajaxfileupload.js
Created April 26, 2012 20:24 — forked from HenrikJoreteg/ajaxfileupload.js
AJAX file uploading using jQuery and XMLHttpRequest 2.0 and adding listener for progress updates
// grab your file object from a file input
$('#fileInput').change(function () {
    sendFile(this.files[0]);
});
// can also be from a drag-from-desktop drop
$('#dropZone')[0].ondrop = function (e) {
    e.preventDefault();
    sendFile(e.dataTransfer.files[0]);
};
sankroh / gmaildomain.py
Created March 2, 2012 06:11 — forked from dstevensio/gmaildomain.py
Python Script To Add DNS Overrides For WebFaction Hosted Domains to use Google Apps Mail
'''
Run this script either by doing:
python gmaildomain.py
And you will be prompted for details, or you can save time by doing:
python gmaildomain.py WEBFACTION_USERNAME_HERE DOMAIN_TO_USE_GMAIL_FOR_HERE_WITHOUT_WWW WEBMAIL_DOMAIN
And you will only be prompted for your password.