Johan van der Knijff (bitsgalore)

bitsgalore /
Created October 3, 2023 14:23
Create Jpylyzer testfiles with different profiles
# Location of Kakadu binaries
# Add Kakadu path to LD_LIBRARY_PATH
# Create TIFF from existing JP2
/Applications/kakadu/kdu_expand -i aware.jp2 -o aware.tif
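The same setup steps can be sketched in Python; the `/Applications/kakadu` location comes from the command above, but wrapping it in a script this way is an assumption, not the gist's actual code:

```python
import os
import shutil
import subprocess

# Assumed install location, matching the kdu_expand path used above
KAKADU_HOME = "/Applications/kakadu"

# Kakadu's command-line tools need their shared libraries on the loader path
env = dict(os.environ)
env["LD_LIBRARY_PATH"] = KAKADU_HOME + os.pathsep + env.get("LD_LIBRARY_PATH", "")

kdu_expand = os.path.join(KAKADU_HOME, "kdu_expand")
if shutil.which(kdu_expand):
    # Decompress an existing JP2 to TIFF
    subprocess.run([kdu_expand, "-i", "aware.jp2", "-o", "aware.tif"],
                   env=env, check=True)
```

The `shutil.which` guard means the script degrades gracefully on machines where Kakadu is not installed.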
bitsgalore / tweet-example.json
Created November 20, 2022 13:32
Example of JSON format used in Twitter archive for one single Tweet (from tweets.js)
"tweet" : {
  "edit_info" : {
    "initial" : {
      "editTweetIds" : [
      "editableUntil" : "2022-11-03T13:51:02.000Z",
      "editsRemaining" : "5",
      "isEditEligible" : false
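For reference, a `tweets.js` file wraps this JSON array in a JavaScript assignment, so it cannot be fed to a JSON parser directly. A minimal sketch of reading it (the field names follow the snippet above; the `window.YTD.tweets.part0 = ` prefix handling is an assumption about the standard archive layout):

```python
import json

# tweets.js wraps the JSON array in a JavaScript assignment; strip everything
# before the first "[" so the remainder is valid JSON (assumed layout)
raw = ('window.YTD.tweets.part0 = '
       '[{"tweet": {"edit_info": {"initial": '
       '{"editsRemaining": "5", "isEditEligible": false}}}}]')
payload = raw[raw.index("["):]
tweets = json.loads(payload)

for entry in tweets:
    initial = entry["tweet"]["edit_info"]["initial"]
    print(initial["editsRemaining"], initial["isEditEligible"])
```

Note that `editsRemaining` is stored as a string, not a number, so it needs an explicit `int()` conversion before arithmetic.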
bitsgalore / ht2jk-jpylyzer.xnl
Created August 3, 2022 18:52
Jpylyzer output for High Throughput JPEG 2000 codestream downloaded from
<?xml version='1.0' encoding='UTF-8'?>
<jpylyzer xmlns="" xmlns:xsi="" xsi:schemaLocation="">
bitsgalore /
Last active June 14, 2022 15:20
Storage media type detection using the Windows API and Python
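The gist's code is not shown here, but the general approach can be sketched with `ctypes`. The `DRIVE_TYPES` values follow the documented return codes of the Windows `GetDriveTypeW` API; the function wrapper itself is an assumption, not the gist's actual implementation:

```python
import ctypes
import sys

# Return codes of the Windows GetDriveTypeW API call
DRIVE_TYPES = {
    0: "unknown",
    1: "no root directory",
    2: "removable",
    3: "fixed",
    4: "remote",
    5: "cdrom",
    6: "ramdisk",
}

def drive_type(root):
    """Map a drive root like 'C:\\' to a media type name (Windows only)."""
    code = ctypes.windll.kernel32.GetDriveTypeW(root)
    return DRIVE_TYPES.get(code, "unknown")

if sys.platform == "win32":
    print(drive_type("C:\\"))
```

The call is guarded by a platform check because `ctypes.windll` only exists on Windows.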
bitsgalore / iso9660-withschema.xml
Created April 19, 2022 16:54
Isolyzer output with added namespace and XSD schema definitions
<?xml version="1.0" ?>
<isolyzer xmlns="" xmlns:xsi="" xsi:schemaLocation="">
bitsgalore / highsierra.xml
Created April 14, 2022 15:09
Isolyzer output for High Sierra file system
<?xml version="1.0" ?>
bitsgalore /
Created April 12, 2022 16:49
Cross-platform CLI file input with wildcards
import sys
import os
import glob
import platform
import codecs
import argparse
# Create parser
parser = argparse.ArgumentParser(
description="Test CLI input with wildcards, multiple platforms")
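The reason this needs special handling: POSIX shells expand wildcards before the script sees `sys.argv`, while `cmd.exe` on Windows passes patterns like `*.txt` through literally. A minimal sketch of the idea (the helper name and control flow are assumptions, not the gist's actual code):

```python
import glob
import platform
import sys

def expand_args(paths):
    """Expand wildcard patterns ourselves on Windows; on POSIX the shell
    has already expanded them before the script sees sys.argv."""
    if platform.system() == "Windows":
        expanded = []
        for p in paths:
            # Fall back to the literal argument if the pattern matches nothing
            expanded.extend(glob.glob(p) or [p])
        return expanded
    return list(paths)

if __name__ == "__main__":
    for f in expand_args(sys.argv[1:]):
        print(f)
```

This keeps behaviour identical across platforms: by the time the rest of the script runs, every argument is a concrete path.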
bitsgalore /
Last active April 22, 2021 22:01
Saves URLs (from either a list or a root URL) to the Internet Archive's Wayback Machine
#! /usr/bin/env python3
Save web pages to Wayback Machine. Argument urlsIn can either be
a text file with URLs (each line contains one URL), or a single
URL. In the first (input file) case it will simply save each URL.
In the latter case (input URL) it will extract all links from the URL, and
save those as well as the root URL (useful for saving a page with all
of its direct references). The optional --extensions argument can be used
to limit this to one or more specific file extensions. E.g. the following
#! /usr/bin/env python3
from warcio.capture_http import capture_http
import requests
def main():
# Existing warc.gz file (created with wget, then compressed using warcio's
# 'recompress' command)
with capture_http(""):
for indexOnder in range(1, 8):
for indexMidden in range(1, 8):

Instructions for omSipCreator


Use omSipCreator for the tests on copies of batches, and NOT on the original storage locations! This is mainly because omSipCreator in 'prune' mode (its clean-up function) modifies batches and deletes data in the process!!

In the examples below I assume that Python is installed under the following folder: