jronallo /
Created Feb 16, 2017 — forked from cazzerson/
Extracting HB2 Tweet IDs from multiple twarc datasets
# This script requires the jq utility
# Datasets created with twarc
mkdir -p NCHB2-ids
rm -f NCHB2-ids/NCHB2*
touch NCHB2-ids/NCHB2-ids-with-dupes.txt
# Create more relevant subset of "North Carolina" search
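The preview of this script is cut off here. As a rough illustration of the ID-extraction and de-duplication idea in Ruby rather than jq, the sketch below assumes the datasets are twarc's line-oriented JSON (one tweet per line) and that each tweet carries an `id_str` field. The helper name `unique_tweet_ids` is hypothetical and is not part of the original script.

```ruby
require 'json'
require 'set'

# Hypothetical helper: collect unique tweet IDs from line-oriented JSON.
# Assumes one JSON object per line with an "id_str" field (twarc-style output).
def unique_tweet_ids(lines)
  seen = Set.new
  lines.each do |line|
    line = line.strip
    next if line.empty?
    begin
      tweet = JSON.parse(line)
    rescue JSON::ParserError
      next # skip lines that are not valid JSON
    end
    next unless tweet.is_a?(Hash)
    id = tweet['id_str']
    seen << id if id
  end
  seen.to_a # first-seen order, duplicates removed
end
```

Something like `puts unique_tweet_ids(File.foreach('dataset.json'))` would print one ID per line, where `dataset.json` is a placeholder file name.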
jronallo / code4lib-vote
Created Nov 16, 2015
Quick Ruby script to get libnotify desktop notifications of the current vote tally of your talk
#!/usr/bin/env ruby
# To add this to cron do something like this to use the ruby wrapper script:
# */15 8-17 * * 1-5 env DISPLAY=:0.0 /home/jnronall/.rvm/wrappers/ruby-2.1.1/ruby /home/jnronall/bin/code4lib-vote > $HOME/tmp/code4lib-vote-cron.log 2>&1
require 'httpclient'
require 'json'
require 'date'
require 'libnotify'
require 'slop'
View net_ldap_overrides.rb
class Net::LDAP
def initialize(args = {})
@host = args[:host] || DefaultHost
@port = args[:port] || DefaultPort
@verbose = false # Make this configurable with a switch on the class.
@auth = args[:auth] || DefaultAuth
@base = args[:base] || DefaultTreebase
encryption args[:encryption] # may be nil
jronallo / commandline use
Last active Aug 29, 2015
pandoc does not convert poster images into data-URIs when using --self-contained
$ ~/.cabal/bin/pandoc --version
$ ~/.cabal/bin/pandoc -w dzslides --standalone --self-contained ~/tmp/ > ~/tmp/pandoc-poster-image-test.html
jronallo / dzslides2pdf.rb
Created Jul 18, 2013
Ruby script that uses capybara-webkit and imagemagick (convert actually) to turn a dzslides HTML slideshow into a PDF.
#! /usr/bin/env ruby
# dzslides2pdf.rb
# dzslides2pdf.rb http://localhost/presentation_root presentation.html
require 'capybara/dsl'
require 'capybara-webkit'
# require 'capybara/poltergeist'
require 'fileutils'
include Capybara::DSL
View dabblet.css
/* "Google Now" Card */
body {
  background: #e1e1e1;
  min-height: 100%;
  margin: auto;
}
ul.gNow {
width: 450px;
jronallo / item.json
Created Mar 26, 2013
elasticsearch example document
"type": [
"properties": {
"name": [
"url": [
jronallo / tesse
Created Mar 13, 2013
A toy command line utility for OCRing and cleaning OCR output.
#!/usr/bin/env ruby
# tesse: commandline tool for looking at tesseract OCR and cleaning the output
# Besides the following gem requirements it requires the following Linux programs:
# eog: for viewing the images
# wmctrl: for resizing and positioning the image viewing window
require 'tesseract'
require 'ffi/aspell'
jronallo /
Last active Dec 11, 2015
scripts for outputting some reports from the Web Data Commons NQuads
#!/usr/bin/env bash
# These steps will take a long time to download the data set.
# First, get the list of available NQuad files to download.
# We're only interested in the microdata set right now since that seems to be where is used more. So create a file list
grep html-microdata files.list > microdata_files.list
# OK, this will take a while depending on your connection. Let it run overnight.
wget -i microdata_files.list
View common_crawl_hostname_count.rb
#!/usr/bin/env ruby
# a quick, simple script to partially parse output from
# and output subdomains in order of count
url_counts = {}
total_urls = 0
File.readlines(ARGV[0]).each do |line|
url = line.split(' ').first
reverse_hostname = url.split('/').first
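The preview stops before the counting and output steps. A minimal sketch of how the tally might continue is below. It assumes each input line begins with a URL whose host is written in reversed order (for example `com.example.www/path`), which is a guess based on the `reverse_hostname` variable name. The helper `count_hostnames` is hypothetical and not taken from the original script.

```ruby
# Hypothetical helper: tally hostnames from lines whose first
# whitespace-separated field looks like "com.example.www/path".
# Returns [hostname, count] pairs, highest count first.
def count_hostnames(lines)
  counts = Hash.new(0)
  lines.each do |line|
    url = line.split(' ').first
    next if url.nil?
    reverse_hostname = url.split('/').first
    next if reverse_hostname.nil? || reverse_hostname.empty?
    # Flip "com.example.www" back to "www.example.com".
    hostname = reverse_hostname.split('.').reverse.join('.')
    counts[hostname] += 1
  end
  # Sort by descending count, then alphabetically for stable output.
  counts.sort_by { |host, count| [-count, host] }
end

if __FILE__ == $PROGRAM_NAME && ARGV[0]
  count_hostnames(File.readlines(ARGV[0])).each do |host, count|
    puts "#{count}\t#{host}"
  end
end
```

Sorting on `[-count, host]` keeps the order deterministic when several hostnames share the same count.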