Stuart Axelbrooke soaxelbrooke

  • Update HISTORY.rst
  • Update version number in my_project/__init__.py
  • Update version number in setup.py
  • Install the package again for local development, but with the new version number:
python setup.py develop
  • Run the tests:
python setup.py test
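import gzip
import io


class GzipBuffer(object):
    def __init__(self):
        self.len = 0  # number of chunks appended so far
        self.buffer = io.BytesIO()
        # Everything written to the writer lands gzip-compressed in self.buffer.
        self.writer = gzip.GzipFile(fileobj=self.buffer, mode='wb')

    def append(self, thing):
        self.len += 1
        self.writer.write(thing)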
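
The preview stops before showing how the compressed bytes are read back out. A minimal sketch of one way to do that, assuming a hypothetical `getvalue` method that is not part of the original snippet:

```python
import gzip
import io


class GzipBuffer(object):
    """Accumulates byte chunks into an in-memory gzip stream."""

    def __init__(self):
        self.len = 0
        self.buffer = io.BytesIO()
        self.writer = gzip.GzipFile(fileobj=self.buffer, mode='wb')

    def append(self, thing):
        self.len += 1
        self.writer.write(thing)

    def getvalue(self):
        # Hypothetical helper: closing the GzipFile writes the gzip trailer
        # into the BytesIO, which itself stays open because we supplied it.
        self.writer.close()
        return self.buffer.getvalue()


buf = GzipBuffer()
for chunk in (b"first\n", b"second\n"):
    buf.append(chunk)
compressed = buf.getvalue()
restored = gzip.decompress(compressed)
```

Closing the writer before reading matters: until then the gzip trailer has not been written and the buffer does not hold a complete stream.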

Keybase proof

I hereby claim:

  • I am stuartaxelowen on github.
  • I am soaxelbrooke (https://keybase.io/soaxelbrooke) on keybase.
  • I have a public key whose fingerprint is F8B6 D6F2 A6A7 5C3C C49A 702F 9F22 8954 24AC 725A

To claim this, I am signing this object:

object implicits {
  implicit class ESFuture[Response <: ActionResponse](future: ListenableActionFuture[Response])
    extends Future[Response] {
    override def onComplete[U](f: (Try[Response]) => U)(implicit executor: ExecutionContext): Unit = {
      future.addListener(new ActionListener[Response] {
        // Hand failures to the callback (scala.util.Failure) instead of throwing on the listener thread.
        override def onFailure(e: Throwable): Unit = f(Failure(e))
        override def onResponse(response: Response): Unit = f(Try(response))
      })
soaxelbrooke / ec2ssh.sh
Last active August 24, 2016 17:51
SSH into the first found ec2 instance matching your name filter
#!/usr/bin/env bash
# Usage: $ ec2ssh cassandra-i-3
# Lists "Name<TAB>PrivateIpAddress" per instance, keeps the first line matching $1, and connects to its IP.
ssh "$(aws ec2 describe-instances --query 'Reservations[].Instances[].[Tags[?Key==`Name`].Value | [0], PrivateIpAddress]' --output text | grep -- "$1" | head -n 1 | cut -f2)"
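
The selection step in that pipeline (filter by name, take the first hit, pull out the second tab-separated field) can be sketched in isolation. This is only an illustration: `first_matching_ip` is a hypothetical helper, the sample text is made up, and it uses a plain substring test where `grep` would treat the filter as a regular expression.

```python
from typing import Optional


def first_matching_ip(describe_output: str, name_filter: str) -> Optional[str]:
    """Return the IP from the first "name<TAB>ip" line containing name_filter."""
    for line in describe_output.splitlines():
        if name_filter not in line:
            continue
        fields = line.split("\t")
        if len(fields) >= 2:
            return fields[1].strip()
    return None  # nothing matched


# Made-up sample shaped like the tab-separated `--output text` result.
sample = "cassandra-i-2\t10.0.0.12\ncassandra-i-3\t10.0.0.13\nweb-1\t10.0.1.5\n"
ip = first_matching_ip(sample, "cassandra-i-3")
```

Returning `None` when nothing matches makes the "no such instance" case explicit, which the one-liner leaves to `ssh` to report.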
soaxelbrooke / gensim_phrase_prefix_tree_export.py
Last active October 13, 2016 09:08
Script for exporting large Gensim Phrase models to prefix trees to save memory and CPU time.
import sys

from gensim.models import Phrases

assert len(sys.argv) > 2, "Need gensim model path and output filename!"
# sys.argv[0] is the script name, so the two arguments sit at indices 1 and 2.
in_path, out_path = sys.argv[1:3]


class PrefixTree(object):
    def __init__(self, words, impl=dict, suffix_impl=list):
        self.word = words[0]
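
The preview is cut off before the tree logic appears. As a rough illustration of the general idea only (not the gist's actual `PrefixTree`, and not tied to Gensim's internal format), a prefix tree over token sequences can be sketched with nested dictionaries. The names `build_prefix_tree`, `longest_phrase_at`, and the `_END` sentinel are all hypothetical:

```python
_END = object()  # hypothetical sentinel key marking "a phrase ends here"


def build_prefix_tree(phrases):
    """Build a nested-dict prefix tree from an iterable of token sequences."""
    root = {}
    for phrase in phrases:
        node = root
        for token in phrase:
            node = node.setdefault(token, {})
        node[_END] = True
    return root


def longest_phrase_at(tree, tokens, start):
    """Length of the longest stored phrase beginning at tokens[start], or 0."""
    node = tree
    best = 0
    for offset, token in enumerate(tokens[start:], 1):
        node = node.get(token)
        if node is None:
            break  # no stored phrase continues with this token
        if _END in node:
            best = offset
    return best


tree = build_prefix_tree([
    ("new", "york"),
    ("new", "york", "city"),
    ("machine", "learning"),
])
tokens = "i love new york city pizza".split()
match_len = longest_phrase_at(tree, tokens, 2)
```

Phrases that share a prefix share nodes, which is where the memory saving comes from, and a lookup walks at most as many nodes as the longest phrase has tokens.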
soaxelbrooke / beta_distribution_fit.scala
Last active October 24, 2016 19:56
Fitting beta distributions in Scala 😉
import scala.sys.process._

object BetaDistributionFit {
  val distName: String = "beta"

  def fitCommand(samples: Seq[Double]): Seq[String] =
    Seq("python", "-c",
      s"""
         |from scipy import stats
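
The embedded Python is truncated here, so the exact SciPy call is not visible. As a dependency-free illustration of what fitting a beta distribution involves, here is a method-of-moments sketch. This is a swapped-in technique, not what the gist or SciPy does, and `fit_beta_moments` is a hypothetical name:

```python
from statistics import mean, variance


def fit_beta_moments(samples):
    """Estimate (alpha, beta) by matching the sample mean and variance."""
    m = mean(samples)
    v = variance(samples)  # sample variance (n - 1 denominator)
    # A beta distribution needs 0 < mean < 1 and variance < mean * (1 - mean).
    if not 0 < m < 1 or v <= 0 or v >= m * (1 - m):
        raise ValueError("samples are not compatible with a beta distribution")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


alpha, beta = fit_beta_moments([0.2, 0.4, 0.5, 0.6, 0.8])
```

Method of moments is cheap and closed-form, though a maximum-likelihood fit generally gives better estimates on small or skewed samples.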
soaxelbrooke / elasticsearch_python_talk.md
Last active January 21, 2017 02:40
Transcript of a live-coded Python + Elasticsearch talk about text analytics

Text analytics engine!

Hey everyone! I'm @soaxelbrooke, and I'm here to show you how to create a basic text analytics engine with Elasticsearch.

Getting the data

Let's get the data first! These are product reviews from Amazon, and the command below downloads the archive.

$ curl http://times.cs.uiuc.edu/~wang296/Data/LARA/Amazon/AmazonReviews.zip -o reviews.zip
soaxelbrooke / callable_dict.py
Last active March 17, 2017 07:57
A callable dictionary useful for functional programming
from typing import Optional, Hashable, TypeVar

V = TypeVar('V')


class CallableDict(dict):
    """ A callable dictionary useful for functional programming """

    def __call__(self, key: Hashable, default: Optional[V]=None) -> Optional[V]:
        return self.get(key, default)
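
A short usage sketch, with made-up data, showing why a callable dictionary is handy: the instance can be passed anywhere a one-argument function is expected, such as `map`.

```python
from typing import Hashable, Optional, TypeVar

V = TypeVar('V')


class CallableDict(dict):
    """ A callable dictionary useful for functional programming """

    def __call__(self, key: Hashable, default: Optional[V] = None) -> Optional[V]:
        return self.get(key, default)


sizes = CallableDict(small=1, medium=2, large=3)

# The dict itself serves as the mapping function; missing keys yield None.
ranks = list(map(sizes, ["large", "small", "unknown"]))

# An explicit default can still be supplied on a direct call.
fallback = sizes("unknown", 0)
```

Because lookups go through `dict.get`, a missing key returns the default rather than raising `KeyError`, which keeps pipelines built on `map` or `filter` from failing midway.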