Matt MacDonald driki

@driki
driki / gist:1169914
Created August 25, 2011 03:27
A simple scraper for geo-coding Watertown, MA building permits
require 'rubygems'
require 'csv'
require 'typhoeus'
require 'json'
GOOGLE_GEO_URL = "http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address="
# Create the output file
CSV.open("geo-coded-permits.csv", "wb") do |csv|
driki / gist:1176346
Created August 28, 2011 07:13
Retrieve town council meeting notes.
require 'rubygems'
require 'typhoeus'
require 'nokogiri'
require 'uri'
require 'json'
require 'calais'
BASE_URL = 'http://www.ci.watertown.ma.us'
# the request object
driki / gist:1867250
Created February 20, 2012 02:10
Snippet to generate open-budget-data stubs for csv docs.
require 'fileutils'

year = Time.now.year
Organization.order(:state).each do |org|
  # Directory layout: <state>/<sanitized org name>/<year>.
  # Non-alphanumerics become "_"; runs of "_" are then collapsed to one.
  dir_name = org.name.downcase.gsub(/[^[:alnum:]]/, '_').gsub(/_{2,}/, '_')
  org_dir = "#{org.state.downcase}/#{dir_name}/#{year}"
  FileUtils.mkdir_p(org_dir)
  # Create empty expense and revenue stub files.
  %w[expense revenue].each do |kind|
    File.new("#{org_dir}/#{org.slug}-#{year}-#{kind}.csv", "w+").close
  end
end
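A minimal sketch of how the stub directory name might be derived, pulled out of the loop so it can be tried without a database. The helper name `stub_dir` and the sample inputs are hypothetical and not part of the gist. It also assumes the second substitution is meant to collapse runs of underscores, since any hyphens have already been replaced by that point.

```ruby
# Hypothetical helper (not in the original gist): builds the stub directory
# path from plain strings instead of an Organization record.
def stub_dir(state, name, year)
  # Replace every non-alphanumeric character with "_", then collapse
  # consecutive underscores into one (assumed intent of the second gsub).
  safe_name = name.downcase.gsub(/[^[:alnum:]]/, '_').gsub(/_{2,}/, '_')
  "#{state.downcase}/#{safe_name}/#{year}"
end

puts stub_dir("MA", "Town of Watertown", 2012)
# prints "ma/town_of_watertown/2012"
```

Keeping the sanitization in one small function makes it easy to check how names with punctuation, such as "St. Mary's", map to directory names before any files are created.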
driki / gist:3259436
Created August 4, 2012 19:16
Elasticsearch Mapping
{
municipalities: {
document: {
properties: {
classification: {
type: "string",
index: "not_analyzed"
},
content_url: {
type: "string",
driki / gist:3259474
Created August 4, 2012 19:26
Document for Somerville, MA
{
took: 3,
timed_out: false,
_shards: {
total: 5,
successful: 5,
failed: 0
},
hits: {
total: 1,
driki / How I cut my RCN internet bill
Created November 28, 2012 01:00
How I cut my monthly RCN internet bill
19:40 John B.: Hello Matt! How may I help you?
19:41 Matt M.: Hi John. I need to lower my bill. Comcast internet is much cheaper. Thinking of switching.
19:43 Matt M.: They are offering me $34.99/mo; my latest RCN bill is $49.99.
19:43 John B.: Let me see what I can do, sir. I'd be happy to help you. May I please have your address and phone number?
19:44 Matt M.: Sure. XXXXXXXXX, MA XXXXX and my phone number is XXX-XXX-XXXX.
19:45 John B.: Thank you sir. One moment please while I research what I can do.
19:45 Matt M.: Great thanks.
19:48 John B.: I tried to put you in the $34.99 plan, however the system knows you're an active customer and denies me.
19:49 John B.: But we do have another option!
19:49 Matt M.: OK. What are you thinking?
driki / gist:4390503
Last active December 10, 2015 05:49
require 'rubygems'
require 'tire'
Tire.configure { logger 'elasticsearch.log', :level => 'debug' }
class Municipality
include Tire::Model::Persistence
include Tire::Model::Search
include Tire::Model::Callbacks
driki / gist:4400443
Last active December 10, 2015 07:19
Net::HTTP.get appears to leak memory while HTTPClient.get does not, which I think affects Anemone: https://github.com/NearbyFYI/anemone/blob/next/lib/anemone/http.rb#L136
require 'net/http'
# Memory continues to climb.
idx = 0
loop do
begin
# Our sample website
url = "http://localhost:2000"
resp = Net::HTTP.get(URI.parse(url))
puts "run loop: #{idx}"
08:51:20 worker.1 | 2012-12-31T13:51:20Z 11451 TID-ovw3jav00 ExtractTextFromDocument JID-21e38454b9696092b4b8313c WARN: !!! Failure ExtractTextFromDocument(56410)
08:51:20 worker.1 | 2012-12-31T13:51:20Z 11451 TID-ovw3jav00 ExtractTextFromDocument JID-21e38454b9696092b4b8313c DEBUG: Failure! Retry 5 in 742 seconds
08:51:20 worker.1 | 2012-12-31T13:51:20Z 11451 TID-ovw3jav00 ExtractTextFromDocument JID-21e38454b9696092b4b8313c INFO: fail: 42.405 sec
08:51:20 worker.1 | 2012-12-31T13:51:20Z 11451 TID-ovw3jav00 WARN: {"retry"=>10, "queue"=>"high", "class"=>"ExtractTextFromDocument", "args"=>[56410, "Workflow.end_extract_text(arg)"], "jid"=>"21e38454b9696092b4b8313c", "error_message"=>"!!! Caught exception while executing ExtractTextFromDocument(56410): Text extraction service error: (503)", "error_class"=>"StandardError", "failed_at"=>"2012-12-30T20:28:13Z", "retry_count"=>5, "retried_at"=>2012-12-31 13:51:20 UTC}
08:51:20 worker.1 | 2012-12-31T13:51:20Z 11451 TID-ovw3jav00 WARN: !!! Caught exception while execu
driki / gist:4679255
Last active December 11, 2015 23:48
Municipal government topics.
Topic 00 4060.638381492015
number 131.84254459571682
dwelling 130.0942066586498
name 107.50313007741401
state 107.23074287437205
unit 94.60188337501059
address 83.35383631029445
code 75.011551118471
yes 68.04857715759431
owner 65.04524048536197