Ryan Baumann ryanfb

@raws
raws / hector.conf
Created January 4, 2012 03:36
Hector Upstart config
description "Hector IRC server"
env LC_CTYPE=en_US.UTF-8
env RBENV_VERSION=1.9.3-p0
env HECTOR_ROOT=/home/ross/hector/blolol.hect
script
/home/ross/.rbenv/bin/rbenv exec hector daemon
end script
@ryanfb
ryanfb / README.md
Last active December 31, 2015 21:29
Lace hOCR + PDF recombination


Use the lace branch of my fork of HocrConverter: https://github.com/ryanfb/HocrConverter/tree/lace (make sure you git pull to get the latest changes)

Download and compile jbig2enc in your script path. Modify pdf.py to use 300 instead of 72 dpi.

Example run:

./lace2pdf.sh xenophon04xeno

@acairns
acairns / publish
Created January 5, 2014 11:58
Bash script to publish Jekyll post from _drafts into _posts
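#!/bin/sh
# Require the path to a draft file as the first argument.
if [ -z "$1" ]
then
    echo "No draft file found"
    exit 1
fi
# Move the draft into _posts/, prefixed with today's date.
mv "$1" "_posts/$(date +"%Y-%m-%d")-$(basename "$1")"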
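The same date-prefixed move can be sketched in Python. This is a minimal illustration, not part of the original gist: the function name `publish_draft` and its `posts_dir` and `today` parameters are hypothetical, and it assumes the `_posts` directory already exists, as the shell script does.

```python
import datetime
import shutil
from pathlib import Path


def publish_draft(draft, posts_dir="_posts", today=None):
    """Move a draft into posts_dir, prefixing its file name with a date.

    `publish_draft`, `posts_dir` and `today` are hypothetical names used
    for illustration; `today` defaults to the current date.
    """
    draft = Path(draft)
    today = today or datetime.date.today()
    # Jekyll post file names take the form YYYY-MM-DD-title.ext
    target = Path(posts_dir) / f"{today:%Y-%m-%d}-{draft.name}"
    shutil.move(str(draft), str(target))
    return target
```

Calling `publish_draft("_drafts/my-post.md")` from the site root would move the file to `_posts/` under a name that starts with today's date.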
<?xml version='1.0' encoding='utf-8'?>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<!--
Author: Rod Page
Source: http://iphylo.blogspot.com/2011/07/correcting-ocr-using-hocr-firefox.html#comment-400434491
-->
<xsl:output method='html' version='1.0' encoding='utf-8' indent='yes'/>
<xsl:variable name="scale" select="800 div //page/@width" />
@codingjester
codingjester / oauth_tumblr.py
Created April 3, 2012 23:33
OAuth Tumblr Getting Access Tokens
import urlparse
import oauth2 as oauth
consumer_key = 'consumer_key'
consumer_secret = 'consumer_secret'
request_token_url = 'http://www.tumblr.com/oauth/request_token'
access_token_url = 'http://www.tumblr.com/oauth/access_token'
authorize_url = 'http://www.tumblr.com/oauth/authorize'
@PonteIneptique
PonteIneptique / hocr_to_kraken_transcribe.xsl
Last active March 21, 2020 11:25
XSL for transforming hOCR output from Tesseract into a transcription file for Kraken (à la ketos prefill); requires Saxon-EE > 9.8
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:saxon="http://saxon.sf.net/"
xmlns:my="foo.bar"
exclude-result-prefixes="xs my saxon uuid"
xpath-default-namespace="http://www.w3.org/1999/xhtml"
version="2.0"
xmlns:uuid="java:java.util.UUID">
@acdha
acdha / ocr-file.py
Created March 17, 2014 22:49
Fragment of code used to process images with Tesseract OCR
def ocr_file(filename, languages, output_base, temp_dir):
    log.info("Launching tesseract on %s", filename)
    output = subprocess.check_output(['tesseract', filename, output_base,
                                      '-l', '+'.join(languages), TESSERACT_CONFIG],
                                     cwd=temp_dir,
                                     stderr=subprocess.STDOUT)
    with OCR_STORAGE.open('%s/%s/%s.log' % (item_id, group, index), 'w') as log_f:
        log_f.write(output)
@JPLeBreton
JPLeBreton / wadls.py
Last active March 7, 2021 23:53
wadls - list all map files within a WAD/PK3/ZIP
#!/usr/bin/python
import os, sys, zipfile, tempfile
# wadls (pronounced "waddles", thx joshthenesnerd) - list all maps in a wad/zip/pk3
# requires omgifol module, set path to it here or in env variable
OMG_PATH = os.environ.get('OMG_PATH', None) or '/home/jpl/projects/wadsmoosh'
sys.path.append(OMG_PATH)
import omg
@osnr
osnr / search-twitter-around-screenotate.md
Created February 9, 2021 03:40
Search your tweets around the time you took a screenshot.

If you have a Screenotate screenshot HTML file open in your browser, clicking this bookmarklet will search your old tweets from around the time you took the screenshot, so you can find your original tweet of it (if it exists).

Make a new bookmark with the URL below, replacing from:rsnous with from:YourTwitterUsername:

javascript:void%20function(){const%20a=new%20Date(document.body.innerHTML.match(/%3Cdd%3E(\d\d\d\d\-\d\d\-\d\d)/)[1]),b=new%20Date(a.getTime());b.setDate(b.getDate()-1);const%20c=new%20Date(a.getTime());c.setDate(c.getDate()+1);const%20d=`from:rsnous%20since:${b.toISOString().slice(0,10)}%20until:${c.toISOString().slice(0,10)}`;window.location.href=`https://twitter.com/search%3Fq=${encodeURIComponent(d)}`}();
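The search URL that the bookmarklet builds can be sketched in Python to make the logic easier to follow. This is only an illustration, not the gist's own source: the function name `tweet_search_url` is hypothetical, and it works on plain calendar dates, so it ignores any timezone handling the browser's `Date` object performs.

```python
import datetime
from urllib.parse import quote


def tweet_search_url(screenshot_date, username):
    """Build a Twitter search URL covering the day before and after a date.

    `tweet_search_url` is a hypothetical helper; `screenshot_date` is a
    YYYY-MM-DD string such as the one the bookmarklet reads from the page.
    """
    day = datetime.date.fromisoformat(screenshot_date)
    since = day - datetime.timedelta(days=1)
    until = day + datetime.timedelta(days=1)
    query = f"from:{username} since:{since.isoformat()} until:{until.isoformat()}"
    # safe="" percent-encodes ":" and spaces, as encodeURIComponent does
    return "https://twitter.com/search?q=" + quote(query, safe="")


print(tweet_search_url("2021-02-09", "rsnous"))
# → https://twitter.com/search?q=from%3Arsnous%20since%3A2021-02-08%20until%3A2021-02-10
```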

Or construct the bookmarklet URL yourself -- here's the source code:

@rrrodrigo
rrrodrigo / instagram-download.txt
Created April 10, 2012 11:45
How to download all your Instagram pictures in highest resolution without using any API
Following the news about Facebook buying Instagram, I decided to delete my Instagram account before Facebook claims ownership of my pictures.
The exporter that Instagram recommends in its FAQ, http://instaport.me/export, doesn't work for me (it probably can't cope with the high demand),
so here is a quick and dirty way to download all my Instagram pictures in their highest resolution in a few easy steps.
You will need: Firefox, Firebug, a text editor, wget
1. Go to http://statigr.am/yourlogin using Firefox with the Firebug extension active
2. Scroll down as many times as needed until all your picture thumbnails are displayed (I had some three hundred pictures, so it was not that much scrolling; YMMV)
3. In the Firebug JS console, run this JS code: $(".lienPhotoGrid a img").each(function(index) { console.log($(this).attr('src')) })
4. The JS console will now contain the URLs of all the thumbnail images, like this: http://distilleryimage1.s3.amazonaws.com/4ed46cf2801511e1b9f1123138140926_5.jpg