@evz
evz / smallbell.py
Last active December 20, 2015 17:28
Tarbell guts for dumping a Google Spreadsheet out as JSON to an S3 bucket. Lovingly stolen from the NewsApps team at the Chicago Tribune.
import os
from ordereddict import OrderedDict
from gdata.spreadsheet.service import SpreadsheetsService
from gdata.spreadsheet.service import CellQuery
import json
import codecs
import shutil
# Source for this is here https://github.com/newsapps/flask-tarbell/blob/master/tarbell/slughifi.py
from slughifi import slughifi
@evz
evz / README
Created August 3, 2013 20:41 — forked from sasha-id/README
MongoDB upstart scripts for Ubuntu.
Run the following commands after installing the upstart scripts:
ln -s /lib/init/upstart-job /etc/init.d/mongoconf
ln -s /lib/init/upstart-job /etc/init.d/mongodb
ln -s /lib/init/upstart-job /etc/init.d/mongos
To start services use:
@evz
evz / json_query.py
Last active December 17, 2015 22:59
A class-based method of accessing and running basic queries against JSON files stored on a filesystem
import os
import json
from datetime import datetime
# first a super stupid exception to use
class QueryError(Exception):
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return repr(self.value)
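A minimal sketch of how an exception like QueryError might be raised and caught. The require_field helper and the sample record are hypothetical, made up for illustration; they are not part of the gist.

```python
class QueryError(Exception):
    """Same shape as the exception in the gist preview."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return repr(self.value)


def require_field(record, field):
    # Hypothetical helper (not from the gist): fail loudly when a
    # queried field is absent from a loaded JSON record.
    if field not in record:
        raise QueryError('missing field: %s' % field)
    return record[field]


try:
    require_field({'weather': {}}, 'meta')
except QueryError as err:
    # __str__ returns repr(value), so the message keeps its quotes.
    message = str(err)
```

Because __str__ returns repr(self.value), a string message is rendered with its surrounding quotes, which is worth keeping in mind when logging these errors.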
@evz
evz / 2007-10-24.json
Last active December 17, 2015 05:09
Chicago Crime vs. Weather: Under the hood.
{
    "weather": {
        "FAHR_MIN": 42.980000000000004,
        "CELSIUS_MIN": 6.1,
        "CELSIUS_MAX": 12.8,
        "FAHR_MAX": 55.040000000000006
    },
    "meta": {
        "total": {
            "key": "total",
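The FAHR_* values in the preview above are consistent with the standard Celsius-to-Fahrenheit conversion applied to the CELSIUS_* values, and the long trailing digits look like ordinary floating-point representation noise. That the file was generated this way is an assumption; the sketch below only shows the arithmetic.

```python
def celsius_to_fahrenheit(celsius):
    # Standard conversion: F = C * 9/5 + 32
    return celsius * 9.0 / 5.0 + 32.0


# CELSIUS_MIN and CELSIUS_MAX as they appear in the preview.
fahr_min = celsius_to_fahrenheit(6.1)
fahr_max = celsius_to_fahrenheit(12.8)

# Rounding before display hides the binary floating-point residue.
fahr_min_display = round(fahr_min, 2)
fahr_max_display = round(fahr_max, 2)
```

Rounding at serialization time, rather than storing the raw float, would keep values like 42.980000000000004 out of the JSON.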
@evz
evz / s3_update.py
Last active December 12, 2015 10:19
Loop over and apply a content-disposition header to PDFs in an S3 bucket
from boto.s3.connection import S3Connection
from boto.s3.key import Key
def doit(bucket_name, file_type):
    conn = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY)
    bucket = conn.get_bucket(bucket_name)
    for k in bucket.list():
        if k.name.lower().endswith('.%s' % file_type):
            print('Updating: %s' % k.name)
            k = k.copy(k.bucket.name, k.name, {'Content-Disposition':'attachment', 'Content-Type': 'application/pdf'}, preserve_acl=True)
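The key-selection test in the loop above can be tried without touching S3. This is a sketch only: matching_names is a hypothetical helper that mirrors the endswith check on plain strings, and the sample key names are made up.

```python
def matching_names(names, file_type):
    # Mirrors the gist's check: the key name is lowercased,
    # but file_type is used exactly as passed in.
    suffix = '.%s' % file_type
    return [name for name in names if name.lower().endswith(suffix)]


sample = ['reports/a.PDF', 'notes/b.txt', 'c.pdf']

lower_matches = matching_names(sample, 'pdf')
upper_matches = matching_names(sample, 'PDF')
```

Since only the key name is lowercased, passing an uppercase file_type such as 'PDF' matches nothing, so callers need to pass the extension in lowercase.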
@evz
evz / matching_traceback.txt
Created December 4, 2015 14:36
traceback
INFO:dedupe.api:reading training from file
^CTraceback (most recent call last):
  File "run_queue.py", line 2, in <module>
    queue_daemon()
  File "/home/eric/code/dedupe-api/api/queue.py", line 144, in queue_daemon
    processMessage()
  File "/home/eric/code/dedupe-api/api/queue.py", line 88, in processMessage
    upd_args['return_value'] = func(*args, **kwargs)
  File "/home/eric/code/dedupe-api/api/tasks/review_tasks.py", line 35, in bulkMarkClusters
    initializeMatching(session_id)
@evz
evz / nyc_loaddata_traceback.txt
Created October 8, 2015 13:49
NYC loaddata 2015-10-08
Traceback (most recent call last):
  File "/home/datamade/.virtualenvs/nyc/lib/python3.4/site-packages/django/db/models/query.py", line 405, in get_or_create
    return self.get(**lookup), False
  File "/home/datamade/.virtualenvs/nyc/lib/python3.4/site-packages/django/db/models/query.py", line 334, in get
    self.model._meta.object_name
core.models.DoesNotExist: Event matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
@evz
evz / ynr_traceback.txt
Created October 5, 2015 16:25
For Mark
Environment:
Request Method: GET
Request URL: http://127.0.0.1:8001/election/council-member-2015/post/ocd-division,country:us,state:mn,place:st_paul,ward:1/council-member-for-ward-1
Django Version: 1.8.3
Python Version: 2.7.9
Installed Applications:
('django.contrib.admin',
@evz
evz / models.py
Created May 14, 2012 14:32
SRP Maps: Under the Hood
class SRPLocality(models.Model):
    locality = models.CharField(primary_key=True, max_length=255)
    cluster = models.CharField(max_length=100, null=True)
    region = models.CharField(max_length=25, null=True)
    nsa = models.CharField(max_length=15, null=True)
    date_modified = models.DateTimeField(null=True)
    stats = DictField(null=True)
    file_id = ListField(max_length=25)
    loaded = models.BooleanField()

    objects = MongoDBManager()
@evz
evz / audio_search.py
Created May 10, 2012 17:22
Terrace under the hood, part 4
results = AudioFile.objects.filter(
    Q(recorded__lte=upload_to),
    Q(recorded__gte=upload_from)
).filter(
    Q(title__icontains=keyword)
    | Q(description__icontains=keyword)
    | Q(tags__tag_name__in=tags)
)
results = set(results)  # Make sure there are no duplicates