@richardlehane
richardlehane / bininstall
Last active August 29, 2015 14:18
Bintray install
wget -qO - https://bintray.com/user/downloadSubjectPublicKey?username=bintray | sudo apt-key add -
echo "deb http://dl.bintray.com/siegfried/debian wheezy main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get install siegfried
---
siegfried : 0.8.3
scandate : 2015-02-28T14:09:20+11:00
signature : pronom.gob
created : 2015-02-19T11:50:57+11:00
identifiers :
  - name : 'pronom'
    details : 'DROID_SignatureFile_V81.xml; container-signature-20150218.xml'
---
filename : 'debug.jpg'
richardlehane / sf.py
Last active August 29, 2015 14:15
Archivematica FPR script for sf
#!/usr/bin/env python
import base64
import json
import urllib2
import sys
sfurl = 'http://localhost:5138/identify/'
try:
richardlehane / droid_pronom_diff
Created November 16, 2014 10:59
Differences between pronom and droid
This shows the failures when I compare signatures from PRONOM report files with DROID signature file v79.
A little background:
In siegfried, signatures are lists of frames. Frames enclose patterns. Frames tell you where to test, patterns perform the test.
The test output below shows the text printout of failing signatures. Within the parentheses, pipes separate each frame.
In the first example, the "F" means it is a fixed frame. This implies a fixed offset from either the BOF (B) or EOF (E). The first example is therefore at a fixed offset of 0 bytes from the BOF. The enclosed pattern is a sequence (seq) of 4 bytes. The frame that follows is also a fixed frame (F), but it is anchored to the previous frame (P), not to the BOF. In EOF segments, frames to the left of the EOF frame are anchored to their succeeding (S) frames.
The last frame in the test output below is a WW which stands for window. In the example below, it should be a window of between 0 and 16000 bytes from the EOF. The DROID signatu
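To make the frame vocabulary above concrete, here is a minimal Python sketch. It is not siegfried's actual implementation: the `Frame` class, its field names, and the `matches` function are hypothetical. It handles only BOF-anchored (B) and previous-anchored (P) frames, treats a fixed frame as a window whose minimum and maximum offsets are equal, and takes the first match it finds without backtracking.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical model of a frame: where to test, plus the pattern to test."""
    anchor: str     # "B" = offset from BOF, "P" = offset from end of previous frame
    min_off: int    # smallest allowed offset from the anchor
    max_off: int    # largest allowed offset (equal to min_off for a fixed frame)
    pattern: bytes  # the byte sequence the frame encloses


def matches(sig, buf):
    """Return True if every frame in sig matches buf, testing left to right."""
    pos = 0  # end position of the previously matched frame
    for frame in sig:
        base = 0 if frame.anchor == "B" else pos
        for off in range(frame.min_off, frame.max_off + 1):
            start = base + off
            end = start + len(frame.pattern)
            if buf[start:end] == frame.pattern:
                pos = end
                break
        else:
            # no offset inside the frame's range produced a match
            return False
    return True


# A fixed frame at BOF offset 0, then a window of 0-4 bytes after it.
sig = [
    Frame("B", 0, 0, b"\x10\x00\x00\x00"),
    Frame("P", 0, 4, b"Word"),
]
```

With this signature, `matches(sig, b"\x10\x00\x00\x00xxWord")` succeeds because "Word" starts 2 bytes after the first frame, which is inside the 0-4 byte window. Frames anchored to a succeeding frame (S) or to the EOF (E) would need the same logic run right to left from the end of the buffer.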
package itforarchivists
import (
	"encoding/json"
	"fmt"
	"net/http"

	"github.com/richardlehane/siegfried/pkg/core"
	"github.com/richardlehane/siegfried/pkg/pronom"
)
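# Replace the quoted text in a signature string with its hex encoding.
sig = "10 00 00 00 'Word.Document.' ['6'-'7'] 00"
l = sig.split("'")
ns = ""
for i in range(len(l)):
    if i % 2 != 0:
        # odd-indexed pieces sat between quotes: hex-encode each character
        ns += "".join([hex(ord(x))[2:] for x in l[i]])
    else:
        # even-indexed pieces are already hex or syntax: copy unchanged
        ns += l[i]
print(ns)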
type keyFrame struct {
	Typ   OffType // defined in frames.go
	Min   int
	Max   int
	Alive func(int) (bool, int, int) // return L and R distances for tests
}
richardlehane / MODS_changes.js
Created June 3, 2011 01:34
Suggested changes to Zotero's MODS translator
if(mods.m::location.m::shelfLocator.length()) {
	// change behaviour
	newItem.archiveLocation = mods.m::location.m::shelfLocator.text().toString();
	newItem.archive = mods.m::location.m::physicalLocation.text().toString();
} else {
	// leave as is
	newItem.archiveLocation = mods.m::location.m::physicalLocation.text().toString();
}
richardlehane / timeline.rb
Created June 3, 2011 01:29
This script generates a CSV that can be used by Propublica's timeline-setter tool to make a nice timeline. It calls out to Wikipedia and Wragge's TROVE api to fill out the data provided by State Records NSW in the ministries.xml file.
#
# This script generates a CSV that can be used by Propublica's timeline-setter
# tool to make a nice timeline. It calls out to Wikipedia and Wragge's TROVE
# api to fill out the data provided by State Records NSW in the ministries.xml file.
#
#
require 'rubygems'
require 'nokogiri'
require 'net/http'
require 'date'
richardlehane / batch_network_drives.xsl
Created June 3, 2011 01:28
This XSL stylesheet demonstrates how a batch file can be generated to automatically create directory structures that correspond to the terms (functions and activities) of an XML authority.
<?xml version="1.0" encoding="UTF-8"?>
<!-- This XSL stylesheet demonstrates how a batch file can be generated to automatically create directory structures that correspond to the terms (functions and activities) of an XML authority. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:rda="http://www.records.nsw.gov.au/schemas/RDA" version="1.0" >
<xsl:output method="text" encoding="UTF-8"/>
<xsl:include href="include/utils.xsl"/>
<xsl:template match="/">
<xsl:apply-templates select="//rda:Term"/>
</xsl:template>
<xsl:template match="rda:Term">
<!-- the "%~dp0" creates the new directories in the directory where the batch file sits.