Etienne Posthumus epoz

@epoz
epoz / textbase.py
Created March 13, 2021 14:15
Textbase loader for visidata.org
from visidata import Column, TableSheet
def open_dmp(p):
    return TextBaseSheet(p.name, source=p)
class TextBaseSheet(TableSheet):
    rowtype = "records"  # rowdef: a list of collections.OrderedDict objects
epoz / codesigning.md
Last active February 14, 2021 13:41
Code signing certificates prices/availability
epoz / ndjson2nt.py
Created January 29, 2021 20:52
Scan an NDJSON file of JSON-LD data and convert it to Turtle
# Make sure you have the rdflib and rdflib-jsonld libraries installed
# This gist is the Python equivalent of Mark's repo from here: https://github.com/mightymax/ndjson2ttl
# As referenced in this tweet: https://twitter.com/markuitheiloo/status/1355252255327449090
import sys
from rdflib import Graph
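for line in open(sys.argv[1]):
    g = Graph().parse(data=line, format="json-ld")
    out = g.serialize(format="n3")
    # rdflib < 6 returns bytes here, rdflib >= 6 returns str
    print(out.decode("utf8") if isinstance(out, bytes) else out)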
epoz / scan_publishers.py
Created April 23, 2020 21:26
Scan Crossref data dump of 2020-04-14
import json
import gzip
import os
from progress.bar import Bar
# Scan the Crossref data dump as mentioned in: https://twitter.com/CrossrefOrg/status/1250146935861886976
# And parse out the publisher names, so you know where in the giant dump your own data can be found
# Note this script uses the progress library, so before running do a "pip install progress"
filenames = [filename for filename in os.listdir('.') if filename.endswith('.json.gz')]
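The preview stops after collecting the filenames. A minimal sketch of how the scan could continue is below. It is not the gist's actual code: the function name `scan_publishers` is hypothetical, and it assumes each `.json.gz` file holds a JSON object with an `items` list whose entries may carry a `publisher` string. The progress bar is left out to keep the sketch dependency-free.

```python
import gzip
import json
import os
from collections import defaultdict


def scan_publishers(directory="."):
    """Return {publisher name: set of dump filenames that mention it}."""
    found = defaultdict(set)
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith(".json.gz"):
            continue
        path = os.path.join(directory, filename)
        with gzip.open(path, "rt", encoding="utf8") as f:
            data = json.load(f)
        # Assumption: the file looks like {"items": [{"publisher": "..."}, ...]}
        for item in data.get("items", []):
            publisher = item.get("publisher")
            if publisher:
                found[publisher].add(filename)
    return found


if __name__ == "__main__":
    for publisher, files in sorted(scan_publishers(".").items()):
        print(publisher, len(files))
```

Keeping a set of filenames per publisher answers the question in the comment above directly: it tells you which files of the dump to open for a given publisher.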
epoz / notes.md
Last active January 9, 2020 11:46
How to expose a port on a running Docker container

Getting access to an internal port inside a running Docker container

Now that we have moved to a mostly Docker-based infrastructure, one of the tricky things is debugging when something goes pear-shaped. It used to be possible to just SSH into the machine with a local port-forward, and then, for example, access the Elasticsearch server via a handy browser extension to do debugging.

But what to do if your container is running in a Docker Swarm and has no ports forwarded by default? (Which is the right thing to do: keep it simple and closed...) Thanks to sterling help from https://github.com/eelkevdbos, here is the solution, and I am writing it up here so I can remember it in future, 'cause I sure am gonna forget the details...

First, create a new Docker overlay network that you can use to reach the container in question:

docker network create foobar

epoz / dzi_to_iiif.py
Created November 8, 2019 11:46
Converts a directory of Deepzoom images to IIIF Level-0 compliant static store
import os
import sys
if __name__ == '__main__':
    dzifile = sys.argv[1]
    for dirpath, dirnames, filenames in os.walk('.'):
        for filename in filenames:
            print(os.path.join(dirpath, filename))
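The preview only walks the directory, so the actual conversion is not shown. As a rough sketch of the core mapping such a conversion needs, the snippet below turns one Deep Zoom tile address (level, column, row) into an IIIF `region/size/rotation/quality.format` path. It is not the gist's method. The function names are hypothetical, and it assumes the tiles were cut with overlap 0 and that Image API 2 style `w,` sizes are wanted.

```python
def dzi_max_level(width, height):
    """Deep Zoom level at which the image is full size: ceil(log2(max side))."""
    return (max(width, height) - 1).bit_length()


def dzi_tile_to_iiif(width, height, tile_size, level, col, row):
    """Map a Deep Zoom tile to an IIIF path relative to the image identifier.

    Assumption: the DZI pyramid has overlap 0, so tiles do not share pixels.
    """
    # Each step down from the top level halves the image size.
    scale = 2 ** (dzi_max_level(width, height) - level)
    # Region covered by this tile, in full-resolution pixel coordinates.
    x = col * tile_size * scale
    y = row * tile_size * scale
    w = min(tile_size * scale, width - x)
    h = min(tile_size * scale, height - y)
    region = "full" if (x, y, w, h) == (0, 0, width, height) else f"{x},{y},{w},{h}"
    # Width of the tile as stored on disk, rounded up like Deep Zoom does.
    out_w = -(-w // scale)
    return f"{region}/{out_w},/0/default.jpg"


if __name__ == "__main__":
    print(dzi_tile_to_iiif(1000, 800, 256, 10, 3, 3))
```

A static store would then copy or link each tile file to the printed path underneath the image's identifier directory.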
epoz / linkall.py
Last active October 17, 2019 11:37
HardLink all files into single directory
#!/usr/bin/env python
u = """Link all files from a directory and its descendants into a specified destination directory.
This 'flattens' the source dir into the destination dir
Usage: %s source_dir destination_dir
"""
import os
import sys
from progress.bar import Bar
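The usage text describes flattening a directory tree by hard-linking every file into one destination. A minimal sketch of that idea, under stated assumptions, follows. The function name `link_all` and the choice to skip files whose name already exists in the destination are assumptions, not taken from the gist, and the progress bar is omitted.

```python
import os
import sys


def link_all(source_dir, destination_dir):
    """Hard-link every file under source_dir directly into destination_dir.

    Returns (linked, skipped): destination paths created, and source paths
    skipped because a file with the same name already existed.
    """
    os.makedirs(destination_dir, exist_ok=True)
    linked, skipped = [], []
    for dirpath, dirnames, filenames in os.walk(source_dir):
        for filename in filenames:
            src = os.path.join(dirpath, filename)
            dst = os.path.join(destination_dir, filename)
            # Assumption: on a name collision the first file seen wins.
            if os.path.exists(dst):
                skipped.append(src)
                continue
            os.link(src, dst)
            linked.append(dst)
    return linked, skipped


if __name__ == "__main__" and len(sys.argv) == 3:
    done, clashes = link_all(sys.argv[1], sys.argv[2])
    print(f"linked {len(done)} files, skipped {len(clashes)}")
```

Hard links only work within a single filesystem, and the destination should sit outside the source tree so the walk does not pick up its own output.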
epoz / thumbs.py
Created April 4, 2019 15:04
Read a directory of images and convert them into a single sqlite3 db of thumbs
# sqlite> .schema
# CREATE TABLE thumbs(filename, data);
from PIL import Image
import os
from io import BytesIO
import sys
import sqlite3
from progress.bar import Bar
### Keybase proof
I hereby claim:
* I am epoz on github.
* I am epoz (https://keybase.io/epoz) on keybase.
* I have a public key ASAX-cAXJOiDTnX1CA73U80bpWL-sbX1XNsQGvWVi3BLZAo
To claim this, I am signing this object:
epoz / uniq_ic.py
Created October 24, 2018 09:51
Make Iconclass notations unique for a given textbase file
import iconclass
import textbase
import sys
from progress.bar import Bar
d = textbase.parse(sys.argv[1])
bar = Bar('Processing', max=len(d))
def is_in_there(notation, notations):