Andrew Montalenti amontalenti

database = "parsely_articles"
collections = range(10) # populate with collections similarly to before
script_cmds = []
for collection in collections:
    cmd =\
        "echo restoring {db}/{col}"\
        " && "\
        "lbzip2 --decompress {db}-{col}.bson.bz2"\
        " && "\
        "mongorestore -d {db} -c {col} {db}-{col}.bson".format(
<script>
/*
* $sf.ext.meta looks up metadata from parent window. Here's
* the relevant documentation from the SafeFrame standard:
*
* "Use to retrieve metadata about the SafeFrame position that
* was specified by the host. The host may specify additional
* metadata about this 3rd party content. The host specifies
* this metadata using the $sf.host.PosMeta class."
*
function getCurrentDateString() {
    // JavaScript Date API does not return zero-padded date strings, so we need this utility
    var pad = function(number) {
        if (number < 10) {
            return '0' + number;
        }
        return number;
    };
    // get the current day and format it as YYYY-MM-DD-HH
    var date = new Date(),
amontalenti / urltools.py
Last active August 29, 2015 13:57
example take home assignment using string splitting, joining, tuples, lists, dictionaries, and basic functions
"""urltools.py - parse and format web URLs.
HINT:
>>> "http://google.com".split("://")
['http', 'google.com']
>>> "google.com/hangout/parsely.com/am".split("/")
['google.com', 'hangout', 'parsely.com', 'am']

Hello again, growing Pythonistas!

Pleased and excited

So, first of all, I have to say that I was simultaneously pleased and impressed with the quality of the submissions I got from you guys for our little Python take-home assignment. Pleased, because the problem seemed accessible enough that each of you could work on it with little background in Python beyond the bits you've been exposed to over the last few months. Impressed, because all of your submissions passed all of my test cases!

I was also happy with how different the submissions were. We Pythonistas often like to brag that, unlike in other languages, in Python there is "rarely more than one way to do it". Yet even for this simple example, the solutions varied widely. (And indeed, in programming, no matter how simple the language, there's always more than one way to do it.) That gives us a nice window into various understandings of Python code style, architecture, and idioms.
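
To make the "more than one way" point concrete, here is a minimal sketch of two equally valid ways to pull a URL apart using only string methods. The function names and the (scheme, host, path) return shape are assumptions made for illustration; they are not taken from the assignment or from any submission.

```python
def split_url_with_split(url):
    """Split a URL into (scheme, host, path) using str.split and str.join."""
    scheme, rest = url.split("://")
    parts = rest.split("/")
    host = parts[0]
    # Re-join everything after the host into a path that starts with "/".
    path = "/" + "/".join(parts[1:])
    return scheme, host, path


def split_url_with_partition(url):
    """Produce the same tuple using str.partition instead of split/join."""
    scheme, _, rest = url.partition("://")
    host, _, tail = rest.partition("/")
    return scheme, host, "/" + tail


url = "http://google.com/hangout/parsely.com/am"
print(split_url_with_split(url))
print(split_url_with_partition(url))
```

Both functions return the same tuple for this input. The first leans on lists and joining, the second on tuple unpacking, which is roughly the kind of stylistic spread described above.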

Patterns among code

amontalenti / letter_editor.py
Created March 19, 2014 02:08
muckhacker letter to editor analysis for fun (NLTK, Pandas, utilities)
from nltk import FreqDist
from nltk.corpus import stopwords
from nltk import wordpunct_tokenize
# my little NLP utility library
import nlp2
import pandas as pd
df = pd.read_csv("hook_letters.csv")
from sst.actions import *
import time
import json
from settings import DASH_USERNAME, DASH_PASSWORD, APIKEYS
envs = {
"bri": "dash.parsely.com",
"ue1a": "ue1a-dash-web1.cogtree.com",
}
import time
import random
import requests
words = [word.strip() for word in open("/usr/share/dict/words")]
names = ["John", "Bob", "Peter", "Joe", "Sarah", "Clare"]
sections = ["politics", "entertainment", "life", "sports"]
apikeys = ["arstechnica_com", "foxnews_com"]
def measured_query(apikey, q):
amontalenti / storm_coroutines.py
Created May 14, 2014 18:04
experiment of patching streamparse spouts and bolts and assembling a Python coroutine pipeline to model a simple topology
import sys
import types
print("importing Spouts and Bolts...")
from spouts_pig_sample import PigSampleSpout
from bolts_sessionize import SessionizeBolt
from bolts_visit_classify import VisitClassifyBolt
from bolts_aggregate import AggregateBolt
from bolts_cassandra_store import CassandraStoreBolt
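
For readers unfamiliar with the idea, a coroutine pipeline can be sketched with plain Python generators. What follows is a minimal, self-contained sketch and not the code from this gist: it does not use streamparse, and the names `classify_bolt`, `aggregate_bolt`, and `sample_spout`, as well as the `seconds` field and the 30-second threshold, are hypothetical stand-ins.

```python
def coroutine(func):
    """Prime a generator-based coroutine so it is ready to accept send()."""
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first `yield`
        return gen
    return start


@coroutine
def classify_bolt(target):
    """Hypothetical bolt: tag each visit, then pass it downstream."""
    while True:
        visit = yield
        kind = "long" if visit["seconds"] >= 30 else "short"
        target.send(dict(visit, kind=kind))


@coroutine
def aggregate_bolt(counts):
    """Hypothetical terminal bolt: count visits per kind in a dict."""
    while True:
        visit = yield
        counts[visit["kind"]] = counts.get(visit["kind"], 0) + 1


def sample_spout(visits, target):
    """Hypothetical spout: push each input record into the pipeline."""
    for visit in visits:
        target.send(visit)


counts = {}
pipeline = classify_bolt(aggregate_bolt(counts))
sample_spout([{"seconds": 5}, {"seconds": 45}, {"seconds": 60}], pipeline)
print(counts)
```

Each stage only knows about the stage it sends to, so the wiring in the `pipeline = ...` line plays the role a topology definition would play in Storm.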
amontalenti / storm_stacktrace.txt
Created June 12, 2014 22:10
Storm stack trace that happens at Spout and similar traces throughout topo.
java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: Broken pipe
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:107)
at backtype.storm.utils.DisruptorQueue.consumeBatch(DisruptorQueue.java:66)
at backtype.storm.disruptor$consume_batch.invoke(disruptor.clj:74)
at backtype.storm.daemon.executor$eval3848$fn__3849$fn__3864$fn__3893.invoke(executor.clj:539)
at backtype.storm.util$async_loop$fn__384.invoke(util.clj:433)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.RuntimeException: java.io.IOException: Broken pipe
at backtype.storm.spout.ShellSpout.querySubprocess(ShellSpout.java:126)