@mertonium
mertonium / custom_rss_parsers_with_feedjira.md
Last active August 29, 2015 14:00
Write-up of how we make sure our parsers work across the ridiculous world of RSS.

Custom Feed Parsers in a Rails App with Feedjira

Note: Just before I started writing this, I noticed that the Feedzirra gem had 1) changed its name to Feedjira (explanation here) and 2) pushed a bunch of updates. Everything I talk about below worked on 0.2.4 and appears to work just fine on 1.2.0. Also, judging by the CHANGELOG, none of the updates should affect what I talk about here.

Web feeds (RSS, Atom, etc., though I tend to call them all "RSS feeds"), while sometimes viewed as unfashionable or antiquated, are ubiquitous among web publishers. When you are building a system that needs to know when new content is published, RSS feeds are a perfect low-bar method for programmatically discovering new content.
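To make "discovering new content" concrete, here is a minimal sketch of the usual approach: remember the identifiers of entries already seen and report only the unseen ones on each poll. It assumes the feed has already been parsed into entry objects. The `Entry` struct, its `guid`/`title`/`published` fields, and the `NewContentDetector` class are hypothetical names made up for illustration; they are not part of Feedjira.

```ruby
require 'set'
require 'time'

# Hypothetical stand-in for a parsed feed entry. A real feed parser
# would hand back its own entry objects with similar attributes.
Entry = Struct.new(:guid, :title, :published)

# Remembers which entry GUIDs have been seen so that each poll
# reports only the entries that are new since the previous poll.
class NewContentDetector
  def initialize
    @seen = Set.new
  end

  # Returns the unseen entries, oldest first, and marks them as seen.
  # Set#add? returns nil when the GUID is already present, so repeats
  # (across polls or within one poll) are filtered out.
  def new_entries(entries)
    entries.select { |entry| @seen.add?(entry.guid) }
           .sort_by(&:published)
  end
end

detector = NewContentDetector.new

first_poll = [
  Entry.new('post-1', 'Hello', Time.parse('2014-05-01 09:00:00 UTC')),
  Entry.new('post-2', 'World', Time.parse('2014-05-02 09:00:00 UTC'))
]
# The next poll returns the same two entries plus one new one.
second_poll = first_poll + [
  Entry.new('post-3', 'Again', Time.parse('2014-05-03 09:00:00 UTC'))
]

puts detector.new_entries(first_poll).map(&:title).inspect  # ["Hello", "World"]
puts detector.new_entries(second_poll).map(&:title).inspect # ["Again"]
```

In a real system the set of seen GUIDs would probably live in a database rather than in memory, and feeds that omit GUIDs would need a fallback key such as the entry URL.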

In the Ruby world, Feedjira (formerly, Feedzirra) is the [most popular tool](https://www.

@mertonium
mertonium / Gemfile
Last active August 29, 2015 14:07
Nokogiri custom pseudo selector
source 'https://rubygems.org'
gem "nokogiri", "1.6.1"
gem "rspec", "~>2.14.0"
@mertonium
mertonium / sf_endpoint_creation.rb
Last active August 29, 2015 14:15
Blocks of Ruby code that add the Citygram-SF endpoints to citygram.org
Publisher.create! do |pub|
pub.title = "Street Tree List"
pub.endpoint = "https://citygram-sf-registries.herokuapp.com/tree-planting"
pub.active = true
pub.visible = true
pub.city = "San Francisco"
pub.icon = "leaf-collection.png"
pub.state = "CA"
pub.description = "List of DPW maintained street trees including: Planting date, species, and location"
pub.tags = ["san-francisco","san francisco","sf","trees"]
@mertonium
mertonium / couchapp_mass_installer.php
Created April 19, 2011 02:48
Little PHP script to install the same CouchApp on a bunch of DBs in the same instance
<?php
// Configure control variables
$couch_url = "http://yourcouchinstance.com";
$path_to_couchapp = "/local_path_to/couchapp/";
// Counter
$db_count = 0;
// Time check (we'll use this later)
$start_ts = time();
@mertonium
mertonium / couchdb_replicator.php
Created April 19, 2011 02:50
Replicates all the dbs from one couch instance to another
<?php
// Config variables
define("FULL_OVERWRITE", true);
$source_couch = "http://yourcouchinstance.com";
$destination_couch = "http://yourothercouchinstance.com";
// Counter
$db_count = 0;
// Time check (we'll use this later)
@mertonium
mertonium / philly311_scraper.rb
Created May 5, 2011 16:45
This script cycles through the Philly311 Knowledgebase, devouring FAQs.
#!/usr/bin/env ruby
###############################################################################
# Philly Open311 Knowledgebase Scraper
###############################################################################
require 'nokogiri'
require "net/http"
require "uri"
require "csv"
@mertonium
mertonium / categoryexcluder.php
Created August 13, 2011 05:34
Copy of the default WordPress category widget, but you can pass a list of IDs to exclude. (Child categories will be excluded as well.)
<?php
/*
Plugin Name: Category Excluder
Plugin URI: https://gist.github.com/1143511
Description: Adds a way to pass a list of IDs to be excluded from the default WordPress Categories widget.
Author: @mertonium
Version: 0.1
Author URI: http://mertonium.com
*/
/**
@mertonium
mertonium / tcase_transform.js
Created August 16, 2011 17:46
Recline transform to Title Case
function(doc) {
doc['clean_address'] = doc['ALL_meSSed_uP_addr'].toLowerCase().replace(/\b[a-z]/g, function() { return arguments[0].toUpperCase();});
return doc;
}
[
{
"rmt":"276/0",
"latitude":"42.325384",
"longitude":"-71.075201",
"markername":"orange",
"opts":{
"title":"Roxbury Friendship Fence"
},
"text":"<div class=\"gmap-popup\"></div>"
@mertonium
mertonium / massupload.js
Created September 12, 2011 03:20
Node.js script that loads a CSV into a Couch. I used it to upload a stop_times.txt file from the Twin Cities GTFS feed.
var csv = require('csv'),
cradle = require('cradle');
var docs = [];
var successCount = 0,
failCount = 0,
curproc = 0,
throttle = 5;