# USAGE: Hash.from_xml(YOUR_XML_STRING)
require 'nokogiri'
# modified from http://stackoverflow.com/questions/1230741/convert-a-nokogiri-document-to-a-ruby-hash/1231297#1231297
class Hash
  class << self
    def from_xml(xml_io)
      begin
        result = Nokogiri::XML(xml_io)
        return { result.root.name.to_sym => xml_node_to_hash(result.root) }
#!/bin/bash
. heritrix.conf
if [ -z "$1" ] || [ -z "$2" ]; then
  echo "usage: $0 jobname seedsfile"
  exit 1
fi
JOB=$1
#!/usr/bin/env python
import grp
import mimetypes
from optparse import OptionParser
import os
from pprint import pprint
import pwd
from stat import *
import sys
<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
     http://www.apache.org/licenses/LICENSE-2.0
<?php
// The gChart PHP library is required in order to make this work. You can download it from http://code.google.com/p/gchartphp/
// Make sure you put it in the same directory as this script
ini_set('display_errors','1');
$server_address = 'http://142.132.138.20:8983';
require ('gChart.php');
if ( isset($_GET['query'])) {
    $query = 'mimetype:' . strtolower($_GET['query']) . '*';
def ocr_file(filename, languages, output_base, temp_dir):
    log.info("Launching tesseract on %s", filename)
    output = subprocess.check_output(['tesseract', filename, output_base,
                                      '-l', '+'.join(languages), TESSERACT_CONFIG],
                                     cwd=temp_dir,
                                     stderr=subprocess.STDOUT)
    with OCR_STORAGE.open('%s/%s/%s.log' % (item_id, group, index), 'w') as log_f:
        log_f.write(output)
package itforarchivists
import (
	"encoding/json"
	"fmt"
	"net/http"
	"github.com/richardlehane/siegfried/pkg/core"
	"github.com/richardlehane/siegfried/pkg/pronom"
)
#!/bin/sh
# create a custom mapping
cat > /tmp/mapping.json << MAPPING
{
  "types": {
    "_default": {
      "properties": {
        "location": {
          "properties": {
#!/bin/bash
PREFIX=$(basename "$1" .pdf)
if [ -n "$TESSERACT_FLAGS" ]; then
  echo "Picked up TESSERACT_FLAGS: $TESSERACT_FLAGS"
fi
echo "Prefix is: $PREFIX"
echo "Converting to TIFF..."
if command -v parallel >/dev/null 2>&1; then
  LAST_PAGE=$(($(pdfinfo "$1"|grep '^Pages:'|awk '{print $2}') - 1))
#!/bin/sh
#set -x
# Usage: shibb-cas-get.sh {username} {password}
# If you get any errors, try removing the redirects to get more information.
# The service to be called, and a url-encoded version (the url encoding isn't perfect; if you're encoding complex stuff, you may wish to replace it with a different method)
DEST=https://myapp.example.com/
SP=https://myapp.example.com/index.php
IDP="https://myidp.example.com/idp/shibboleth&btn_sso=SSOok"
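The comment in this last script notes that its url encoding isn't perfect and suggests swapping in a different method. As one possible alternative, here is a minimal Python sketch that relies on the standard library's `urllib.parse.quote`. The helper name `encode_url_component` is hypothetical and not part of the original script; the `SP` and `IDP` values are copied from the lines above.

```python
from urllib.parse import quote


def encode_url_component(value: str) -> str:
    """Percent-encode a string for use as a single query-parameter value.

    safe="" makes quote() also encode '/', which it leaves alone by default.
    """
    return quote(value, safe="")


# Values copied from the shell script above.
SP = "https://myapp.example.com/index.php"
IDP = "https://myidp.example.com/idp/shibboleth&btn_sso=SSOok"

if __name__ == "__main__":
    print(encode_url_component(SP))
    print(encode_url_component(IDP))
```

Passing `safe=""` is a deliberate choice here: reserved characters such as `:`, `/`, `&` and `=` are all escaped, so the encoded value can sit inside another URL's query string without being misread as a delimiter.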