Skip to content

Instantly share code, notes, and snippets.

@brandonb927
brandonb927 / osx-for-hackers.sh
Last active July 24, 2024 15:28
OSX for Hackers: Yosemite/El Capitan Edition. This script tries not to be *too* opinionated and any major changes to your system require a prompt. You've been warned.
#!/bin/sh
###
# SOME COMMANDS WILL NOT WORK ON macOS (Sierra or newer)
# For Sierra or newer, see https://github.com/mathiasbynens/dotfiles/blob/master/.macos
###
# Alot of these configs have been taken from the various places
# on the web, most from here
# https://github.com/mathiasbynens/dotfiles/blob/5b3c8418ed42d93af2e647dc9d122f25cc034871/.osx
@reorx
reorx / python_deployment.rst
Last active December 14, 2020 03:31
Python Deployment

This document is still a scratch

Python Deployment

Setup A Workplace

You’ll want a directory to do this in so that you don’t screw up your machine.

@arvearve
arvearve / gist:4158578
Created November 28, 2012 02:01
Mathematics: What do grad students in math do all day?

Mathematics: What do grad students in math do all day?

by Yasha Berchenko-Kogan

A lot of math grad school is reading books and papers and trying to understand what's going on. The difficulty is that reading math is not like reading a mystery thriller, and it's not even like reading a history book or a New York Times article.

The main issue is that, by the time you get to the frontiers of math, the words to describe the concepts don't really exist yet. Communicating these ideas is a bit like trying to explain a vacuum cleaner to someone who has never seen one, except you're only allowed to use words that are four letters long or shorter.

What can you say?

@neubig
neubig / crf.py
Created November 7, 2013 10:59
This is a script to train conditional random fields. It is written to minimize the number of lines of code, with no regard for efficiency.
#!/usr/bin/python
# crf.py (by Graham Neubig)
# This script trains conditional random fields (CRFs)
# stdin: A corpus of WORD_POS WORD_POS WORD_POS sentences
# stdout: Feature vectors for emission and transition properties
from collections import defaultdict
from math import log, exp
import sys
@swinton
swinton / AESCipher.py
Last active April 30, 2023 05:48
Encrypt & Decrypt using PyCrypto AES 256 From http://stackoverflow.com/a/12525165/119849
#!/usr/bin/env python
import base64
from Crypto import Random
from Crypto.Cipher import AES
BS = 16
pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS)
unpad = lambda s : s[0:-ord(s[-1])]

Text Classification

To demonstrate text classification with Scikit Learn, we'll build a simple spam filter. While the filters in production for services like Gmail will obviously be vastly more sophisticated, the model we'll have by the end of this chapter is effective and surprisingly accurate.

Spam filtering is the "hello world" of document classification, but something to be aware of is that we aren't limited to two classes. The classifier we will be using supports multi-class classification, which opens up vast opportunities like author identification, support email routing, etc… However, in this example we'll just stick to two classes: SPAM and HAM.

For this exercise, we'll be using a combination of the Enron-Spam data sets and the SpamAssassin public corpus. Both are publicly available for download and are retreived from the internet during the setup phase of the example code that goes with this chapter.

Loading Examples

var graph = 'graph.json'
var w = 960,
h = 700,
r = 10;
var vis = d3.select(".graph")
.append("svg:svg")
.attr("width", w)
.attr("height", h)
@matthewmueller
matthewmueller / osx-for-hackers.sh
Last active April 21, 2024 03:30
OSX for Hackers (Mavericks/Yosemite)
# OSX for Hackers (Mavericks/Yosemite)
#
# Source: https://gist.github.com/brandonb927/3195465
#!/bin/sh
# Some things taken from here
# https://github.com/mathiasbynens/dotfiles/blob/master/.osx
# Ask for the administrator password upfront
@seancoyne
seancoyne / fix-google-drive-menu-bar-icons.sh
Created October 21, 2014 15:34
Fix Google Drive Yosemite Dark Mode Menu Bar Icons
#!/usr/bin/env bash
# change to proper directory
cd /Applications/Google\ Drive.app/Contents/Resources/
# back up the files
sudo mkdir icon-backups
sudo cp mac-animate*.png icon-backups/
sudo cp mac-error*.png icon-backups/
sudo cp mac-inactive*.png icon-backups/
@mblondel
mblondel / multiclass_svm.py
Last active March 3, 2023 07:57
Multiclass SVMs
"""
Multiclass SVMs (Crammer-Singer formulation).
A pure Python re-implementation of:
Large-scale Multiclass Support Vector Machine Training via Euclidean Projection onto the Simplex.
Mathieu Blondel, Akinori Fujino, and Naonori Ueda.
ICPR 2014.
http://www.mblondel.org/publications/mblondel-icpr2014.pdf
"""