Rachel Donahue sheepeeh

sheepeeh / download_from_ia.rb
Last active August 29, 2015 13:58
For a given TXT file of URLs, download PDFs from archive.org
require 'mechanize'
require 'open-uri'
# Usage: download_from_ia Login Password
# Expects files to be named [name]_urls.txt. Change line 12 for a different naming scheme.
def get_pdf(from_file)
abort "#{$0} email password" if (ARGV.size != 2)
log = File.open("ia_pdf_downloads.log","a")
sheepeeh / get_omeka_ids.rb
Created April 10, 2014 19:34
For a given search URL, retrieve the Omeka item IDs for all results.
require 'mechanize'
require 'open-uri'
def get_ids(fname)
# Create Mechanize agent
a = Mechanize.new { |agent|
agent.follow_meta_refresh = true
}
sheepeeh / text_timeline.rb
Created April 10, 2014 20:04
For a given TXT file with Omeka item IDs, generate HTML for a text-only version of a Neatline Timeline.
# Using the same query used for your Neatline Timeline, generate a list of Omeka IDs with https://gist.github.com/sheepeeh/10415207
# When the script is finished, you will have an HTML file with the same name as your TXT file. It will be ugly; you should probably pretty-print it.
require 'nokogiri'
require 'open-uri'
Item = Struct.new(:id, :title, :date, :link, :year, :flag)
def get_metadata(from_file)
items = []
sheepeeh / chicago-library-list.csl
Last active August 29, 2015 14:01
Chicago (library list) with abstract and extra fields added; sorted by local identifier.
<?xml version="1.0" encoding="utf-8"?>
<style xmlns="http://purl.org/net/xbiblio/csl" class="note" version="1.0" demote-non-dropping-particle="never" page-range-format="chicago">
<info>
<title>Chicago Manual of Style 16th edition (library list)</title>
<id>http://www.zotero.org/styles/chicago-library-list</id>
<link href="http://www.zotero.org/styles/chicago-library-list" rel="self"/>
<link href="http://www.chicagomanualofstyle.org/tools_citationguide.html" rel="documentation"/>
<author>
<name>Julian Onions</name>
<email>julian.onions@gmail.com</email>
sheepeeh / text-only-past.ahk
Last active November 16, 2016 16:06
Autohotkey text-only paste and join lines paste
; Text-only paste (strips all formatting)
#v::
Clip0 = %ClipBoardAll%
ClipBoard = %ClipBoard%
Send ^v
Sleep 50
ClipBoard = %Clip0%
VarSetCapacity(Clip0, 0)
Return
sheepeeh / copy-to-notepad.ahk
Created May 9, 2014 15:13
Autohotkey copy to open Notepad window
#c::
Send, {CTRLDOWN}c{CTRLUP}
WinActivate, Untitled - Notepad
sleep, 300
Send, {CTRLDOWN}v{CTRLUP}{ENTER}{ALTDOWN}{TAB}{ALTUP}
return
sheepeeh / format-zotero.rb
Last active August 29, 2015 14:01
Format and sort Zotero HTML bibliographies
#encoding: UTF-8
require 'nokogiri'
# This script is intended to be used with the custom CSL at https://gist.github.com/sheepeeh/dbb7b02973644d397378
# as it relies on sorting by call number to work.
# Takes a directory of HTML bibliographies exported by Zotero and makes it a little nicer
# for display on an Omeka Simple Page.
# Expects files to be named CollectionNumber-BoxNumber.html
sheepeeh / os_dir.rb
Created June 20, 2014 14:58
Get the current working directory (used for many of my little command line utilities.)
require 'rbconfig'
def init
@os
@current_dir
end
def os
@os ||= (
host_os = RbConfig::CONFIG['host_os']
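The preview above is cut off right after it reads `RbConfig::CONFIG['host_os']`. As a rough illustration of how such a value is commonly mapped to a platform, here is a minimal sketch. It is not the gist's actual code: the helper names `classify_host_os` and `current_os`, the regular expressions, and the returned symbols are all assumptions.

```ruby
require 'rbconfig'

# Hypothetical helper (not from the gist): map a host_os string
# to a coarse platform symbol. The patterns are assumptions.
def classify_host_os(host_os)
  case host_os
  when /mswin|msys|mingw|cygwin|bccwin|wince|emc/ then :windows
  when /darwin|mac os/                            then :macosx
  when /linux/                                    then :linux
  when /solaris|bsd/                              then :unix
  else :unknown
  end
end

# Hypothetical memoized accessor, in the same spirit as the
# `@os ||= (...)` pattern in the preview.
def current_os
  @current_os ||= classify_host_os(RbConfig::CONFIG['host_os'])
end
```

Keeping the string matching in its own method makes it possible to check the mapping without depending on the machine the script runs on.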
sheepeeh / filerename.rb
Last active August 29, 2015 14:02
Rename files in the current directory based on a tab-delimited file.
# REQUIRES os_dir, available at https://gist.github.com/sheepeeh/39c25bd67ccc09ad78a0
# place in an easy to remember directory or add to your PATH. Expects (C/T)SV headings "oldname" and "newname."
# USAGE: filerename.rb file_with_filenames.csv
require 'csv'
require_relative 'os_dir'
def rename_files(source)
current_dir
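The preview stops before the renaming logic. Based only on the description (a tab-delimited file with "oldname" and "newname" headings), the core loop might look something like the sketch below. It is not the gist's code: `rename_from_table` and its parameters are hypothetical, and it takes the directory as an argument rather than relying on `os_dir`.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper (not from the gist): rename files in `dir`
# using a delimited table with "oldname" and "newname" headings.
# Returns the list of [old, new] pairs that were actually renamed.
def rename_from_table(table_path, dir, col_sep: "\t")
  renamed = []
  CSV.foreach(table_path, headers: true, col_sep: col_sep) do |row|
    old_path = File.join(dir, row['oldname'].to_s)
    new_path = File.join(dir, row['newname'].to_s)
    next unless File.file?(old_path) # skip rows whose source file is missing
    next if File.exist?(new_path)    # never overwrite an existing file
    File.rename(old_path, new_path)
    renamed << [row['oldname'], row['newname']]
  end
  renamed
end
```

For a comma-separated table, the same call would pass `col_sep: ","`. Skipping rows whose target already exists is a conservative choice made for this sketch; the original script may behave differently.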
sheepeeh / fix_sitemap_for_omeka.rb
Last active August 29, 2015 14:11
Add real last modified dates, priorities, and change frequencies to XML Sitemaps-generated sitemaps for Omeka.
#--------------------------------------------------------------------------------------------------------------
# This is the script I use to add real dates, change frequencies, and priorities to Omeka items,
# exhibits, and simple pages to sitemaps generated with the XML Sitemap tool.
# https://www.xml-sitemaps.com/standalone-google-sitemap-generator.html
#
# I use the following settings in order to keep the number of URLs down (only include simple pages, exhibits
# and exhibit pages, exhibit items, and collection pages).
#
# Exclude from sitemap extensions:
# divx flv zip m4a m4v rar tar bz2 tgz exe gif tif jpg png class jar mpeg mpg mp3 wav mp4 avi wmv