@sankroh
sankroh / useful_pandas_snippets.py
Created April 22, 2016 15:12 — forked from bsweger/useful_pandas_snippets.md
Useful Pandas Snippets
import pandas as pd

# List unique values in a DataFrame column
df['column_name'].unique()
# Convert a Series to numeric; values that cannot be parsed become NaN
df['col'] = pd.to_numeric(df['col'], errors='coerce')
# Grab DataFrame rows where a column has certain values
value_list = ['value1', 'value2', 'value3']
df = df[df['column'].isin(value_list)]
NAME | EXPLANATION | EXAMPLES
--- | --- | ---
Common Name | The fully qualified domain name (FQDN) of your server. This must match exactly what you type in your web browser or you will receive a name mismatch error. | *.google.com
@sankroh
sankroh / install_puppet.sh
Last active August 29, 2015 14:10
install_puppet.sh
#!/bin/sh
# WARNING: REQUIRES /bin/sh
#
# Install Puppet with shell... how hard can it be?
#
# 0.0.1a - Here Be Dragons
#
# Set up colours
if tty -s;then
@sankroh
sankroh / .pylintrc
Created April 2, 2014 16:51
Pylintrc
[MASTER]
profile=no
persistent=yes
ignore=migrations
cache-size=500
[BASIC]
# Regular expression which should only match correct module names
module-rgx=([a-z][a-z0-9_]*)$
@sankroh
sankroh / blackhawks_schedule.py
Created September 24, 2013 05:47
Blackhawks schedule
import pprint
import requests


def get_blackhawks_schedule():
    url = "http://blackhawks.nhl.com/schedule/full.csv"
    response = requests.get(url)
    if response.status_code == 200:
        # Drop empty lines; build a list so it can be indexed on Python 3
        data = [line for line in response.text.split('\r\n') if line]
        headers = data[0].split(',')
        # Map each row's fields to the header names by position
        data = [dict((headers[i], d) for i, d in enumerate(dt.split(','))) for dt in data]
@sankroh
sankroh / dow.py
Created October 29, 2012 02:22
Converting schedule days of week
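from dateutil import rrule, relativedelta
from django.utils.timezone import now, get_default_timezone
import pytz


def main(datetime, weekdays):
    # Weekday (Monday=0) in the datetime's own timezone
    tz_day = datetime.weekday()
    print("TZ Day:", tz_day)
    # Weekday of the same instant expressed in UTC
    utc_day = datetime.astimezone(pytz.utc).weekday()
    print("UTC Day:", utc_day)
    # Look up the rrule weekday constants (MO, TU, ...) by name instead of eval()
    weekdays = [getattr(rrule, day).weekday for day in weekdays]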
@sankroh
sankroh / gist:3912514
Created October 18, 2012 15:21 — forked from mattb/gist:3888345
Some pointers for Natural Language Processing / Machine Learning

Here are the areas I've been researching, some things I've read and some open source packages...

Nearly all text processing starts by transforming text into vectors: http://en.wikipedia.org/wiki/Vector_space_model
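
As a rough illustration of the vector space idea, here is a minimal bag-of-words sketch using only the standard library. The function names (`tokenize`, `build_vocabulary`, `to_count_vector`) and the whitespace tokenizer are assumptions made for this example, not part of any package mentioned in these notes.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(documents):
    """Map each distinct token to a fixed vector index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def to_count_vector(text, vocabulary):
    """Return the term-count vector for one document."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:  # tokens outside the vocabulary are ignored
            vector[vocabulary[tok]] = n
    return vector


docs = ["the cat sat", "the dog sat on the cat"]
vocab = build_vocabulary(docs)
vectors = [to_count_vector(d, vocab) for d in docs]
```

Each document becomes a list of counts with one position per vocabulary word, so word order is discarded and documents of any length map to vectors of the same size.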

It often uses transforms such as TF-IDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms): http://en.wikipedia.org/wiki/Tf%E2%80%93idf
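
A small sketch of one common TF-IDF variant may help here: term frequency (count divided by document length) multiplied by `log(N / document frequency)`. The `tf_idf` function and the toy corpus are assumptions for illustration; real libraries usually add smoothing and normalisation on top of this.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency times log(N / document_frequency),
    which is only one of several TF-IDF formulations.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    weighted = []
    for doc in documents:
        counts = Counter(doc)
        weighted.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted


corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]
weights = tf_idf(corpus)
```

In this toy corpus "the" appears in every document, so its weight is zero everywhere, while a word that appears in only one document gets the largest inverse-document-frequency factor.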

Collocation detection is a technique for finding two or more words that occur together more often than they do separately (e.g. "wishy-washy" in English). I use this to group words into n-gram tokens, because many NLP techniques treat each word as independent of all the others in a document, ignoring order: http://matpalm.com/blog/2011/10/22/collocations_1/
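
To make the idea concrete, the sketch below scores adjacent word pairs with pointwise mutual information (PMI) and then merges the high-scoring pairs into single tokens. PMI is an assumption chosen for brevity and is not necessarily the measure used in the linked post; the function names, the `min_count` cutoff, and the score threshold of 1.0 are likewise illustrative.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI = log2(p(x, y) / (p(x) * p(y))); higher values mean the pair
    co-occurs more often than independence would predict.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (x, y), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


def merge_collocations(tokens, pairs):
    """Join each selected adjacent pair into a single 'x_y' token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in pairs:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


text = "new york is big and new york is busy but the park is new"
tokens = text.split()
scores = bigram_pmi(tokens)
collocations = {pair for pair, score in scores.items() if score > 1.0}
merged = merge_collocations(tokens, collocations)
```

Merging is greedy from left to right, so when two candidate pairs overlap, the earlier one wins. On real corpora the count cutoff matters a great deal, since PMI tends to overrate pairs built from very rare words.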

@sankroh
sankroh / requests_test.py
Created May 30, 2012 16:29
Requests test
import requests
import time


def run():
    output = []
    # Fetch the same URL 100 times and collect each response body
    for x in range(100):
        resp = requests.get("http://perf.herokuapp.com")
        output.append(resp.text)
    print(output)
@sankroh
sankroh / ajaxfileupload.js
Created April 26, 2012 20:24 — forked from HenrikJoreteg/ajaxfileupload.js
AJAX file uploading using jQuery and XMLHttpRequest 2.0 and adding listener for progress updates
// grab your file object from a file input
$('#fileInput').change(function () {
    sendFile(this.files[0]);
});
// can also be from a drag-from-desktop drop
// (assumes the drop target has id "dropZone"; the bare 'dropZone' selector would only match a <dropZone> tag)
$('#dropZone')[0].ondrop = function (e) {
    e.preventDefault();
    sendFile(e.dataTransfer.files[0]);
};
@sankroh
sankroh / gist:2346670
Created April 9, 2012 21:28
Nginx Init
### BEGIN INIT INFO
# Provides: nginx
# Required-Start: $all
# Required-Stop: $all
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: starts the nginx web server
# Description: starts nginx using start-stop-daemon
### END INIT INFO