Xavier Noria fxn

#!/usr/bin/env ruby
# A simple script to gzip content recursively. I use it for the Nginx gzip_static module.
require 'find'
EXTENSIONS = %w(.js .html .css)
def gzname(file)
  "#{file}.gz"
end
require 'set'
require 'digest/md5'
# Let original be a collection with the names of the files in the
# original directory. Let target be the ones in the target directory.
#
# We want to find which files in original are equal to some in target
# except perhaps for the filename.
osizes, tsizes = classify_by_size(original, target)
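The preview above stops after the size classification step. As a rough sketch of how the comparison could continue, assuming files are first bucketed by size and only same-sized files are then compared by MD5 digest, something like the following might work. The names `group_by_size` and `same_content_pairs` are hypothetical and not taken from the original script:

```ruby
require 'digest/md5'

# Hypothetical helper: maps a size in bytes to the file names of that size.
def group_by_size(files)
  files.group_by { |f| File.size(f) }
end

# Returns [original_file, target_file] pairs whose contents have the same
# MD5 digest. Files are only digested when some file on the other side
# has the same size.
def same_content_pairs(original, target)
  osizes = group_by_size(original)
  tsizes = group_by_size(target)
  pairs  = []

  (osizes.keys & tsizes.keys).each do |size|
    tdigests = tsizes[size].group_by { |f| Digest::MD5.file(f).hexdigest }
    osizes[size].each do |o|
      matches = tdigests[Digest::MD5.file(o).hexdigest] || []
      matches.each { |t| pairs << [o, t] }
    end
  end

  pairs
end
```

Comparing sizes first is cheap and avoids reading most files at all; the digest is only computed for candidates that could actually be equal.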
#!/usr/bin/env ruby
#
# Joins paragraphs so that they span one long line.
# Quoted or indented text are left untouched.
# Useful for Gmail.
#
open('| pbcopy', 'w') do |pbcopy|
  pbcopy.write(`pbpaste`.gsub(/([^>\n])\n(\w)/, '\\1 \\2'))
end
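To see what the substitution does without touching the clipboard, the same regular expression can be applied to a plain string. This is only an illustration; the method name `join_paragraphs` and the sample text are made up:

```ruby
# Hypothetical wrapper around the same gsub call used by the script.
# A newline is replaced by a space only when the next line starts with a
# word character, so quoted ("> ...") and indented lines are left alone.
def join_paragraphs(text)
  text.gsub(/([^>\n])\n(\w)/, '\\1 \\2')
end

sample = "First line\nsecond line\n\n> quoted one\n> quoted two\n\n  indented one\n  indented two\n"
joined = join_paragraphs(sample)
```

Here `joined` keeps the quoted and indented blocks as they were, while the first two lines become the single line `First line second line`.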
fxn / config.ru
Created April 1, 2011 21:48
demonstrates early fetch of an asset in the HEAD, and the 1K threshold
class Html
  def initialize(env)
    @server_port = env['SERVER_PORT']
  end
  def first_chunk_prefix
    <<EOS
<!DOCTYPE html>
<html>
<head>
class Img
  def each
    puts "I WAS REQUESTED"
    yield ''
  end
end
map '/img' do
  run lambda { |env| [200, {'Content-Type' => 'image/png', 'Content-Length' => '0'}, Img.new(env)] }
end
fxn / compute_ancestry.pl
Created May 20, 2011 23:03
Computes the ancestry path of GeoPlanet places
use strict;
use warnings;
use constant {
    # Original Yahoo! TSV.
    GPP => 'geoplanet_places_7.6.0.tsv',
    # Output, same Yahoo! TSV with an extra ancestry column.
    ANC => 'geoplanet_places_with_ancestry_7.6.0.tsv'
};
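The rest of the Perl script is not shown in this preview. As a loose illustration of what computing an ancestry path can look like, here is a Ruby sketch that assumes each place has an id and a parent id and that a root has no parent entry. The `ancestry` method and the sample ids are hypothetical and are not GeoPlanet data:

```ruby
# Hypothetical helper: walks parent links upwards and returns the chain of
# ancestor ids, nearest ancestor first. parent_of maps a place id to its
# parent id; a place without an entry is treated as a root.
def ancestry(id, parent_of)
  path = []
  seen = { id => true } # guards against cycles in malformed data
  while (parent = parent_of[id]) && !seen[parent]
    path << parent
    seen[parent] = true
    id = parent
  end
  path
end

# Made-up sample: 3 is a child of 2, which is a child of 1.
parent_of = { 3 => 2, 2 => 1 }
chain = ancestry(3, parent_of)
```

With this sample, `chain` is `[2, 1]`. Storing such a chain next to each record is one way to answer "all ancestors of a place" without walking the tree at query time.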
fxn / post.md
Created May 23, 2011 21:08
GeoPlanet data with ancestor chain cache imported in 10 minutes


Yahoo! provides its GeoPlanet data as three separate TSV files, available for download here.

That's a database with some 17 million records:

  • 5.7 million records: locations (aka places).
  • 2.2 million records: alternative names for each place (aka aliases).
  • 9.6 million records: matrix of neighbourhoods per place (aka adjacencies).
# encoding: binary
require 'socket'
require 'zlib'
require 'stringio'
PADDING = 256
GZIP = true
CHUNKED = true
def compress(gzip, io, data)
;;
;; Custom Defuns ---------------------------------------------------------------
;;
(defun fxn-find-init-file ()
  (interactive)
  (find-file "~/.emacs.d/init.el"))
(global-set-key (kbd "C-c i") 'fxn-find-init-file)
(defun fxn-kill-whole-line ()
fxn / gist:1520171
Created December 26, 2011 00:52
Coprimality test with a regular expression
See https://github.com/fxn/math-with-regexps.
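A minimal sketch of the idea in Ruby, assuming both numbers are positive integers written in unary and separated by a comma. The method name `coprime?` and the exact pattern are illustrative and may differ from what the linked repository uses:

```ruby
# Hypothetical helper: true when positive integers n and m share no
# common divisor greater than 1.
def coprime?(n, m)
  unary = "#{'1' * n},#{'1' * m}"
  # (11+) guesses a divisor d >= 2 as a run of d ones;
  # \1* requires d to divide n, and \1+ requires d to divide m.
  # The numbers are coprime exactly when no such d exists.
  unary !~ /\A(11+)\1*,\1+\z/
end

coprime?(8, 9) # => true
coprime?(6, 9) # => false, both are divisible by 3
```

The regex engine does the search by backtracking over every candidate length for the captured group, so this is a curiosity rather than a practical test for large numbers.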