Etienne Posthumus epoz

epoz / gist:f4a1024d89616df6fac5
Last active August 29, 2015 14:03
Elasticsearch Python client adaptor to be used with Django Paginator
class ElasticSearchPaginatorListException(Exception):
    pass

class ElasticSearchPaginatorList(object):
    def __init__(self, client, *args, **kwargs):
        self.client = client
        self.args = args
        self.kwargs = kwargs
        self._count = None
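The preview above stops at the constructor. As a rough sketch of the idea only (not the rest of the author's gist), a list-like adaptor for Django's Paginator needs a way to report the total number of results and to return a slice of them. The `FakeSearchClient` class, its `count`/`search` methods and the response shapes below are hypothetical stand-ins chosen for illustration; they are not the elasticsearch-py API.

```python
class FakeSearchClient(object):
    """Hypothetical in-memory stand-in for a search client (assumption)."""

    def __init__(self, docs):
        self.docs = docs

    def count(self, **kwargs):
        # Assumed response shape: {"count": <total>}
        return {"count": len(self.docs)}

    def search(self, from_=0, size=10, **kwargs):
        # Assumed response shape: {"hits": {"hits": [{"_source": ...}, ...]}}
        hits = [{"_source": d} for d in self.docs[from_:from_ + size]]
        return {"hits": {"hits": hits}}


class SearchResultList(object):
    """List-like wrapper: a paginator asks for the total and for slices."""

    def __init__(self, client, **kwargs):
        self.client = client
        self.kwargs = kwargs
        self._count = None  # cached total, fetched on first use

    def count(self):
        if self._count is None:
            self._count = self.client.count(**self.kwargs)["count"]
        return self._count

    def __len__(self):
        return self.count()

    def __getitem__(self, key):
        # Only slices are handled; a page is always requested as a slice.
        if not isinstance(key, slice):
            raise TypeError("only slices are supported")
        start = key.start or 0
        stop = self.count() if key.stop is None else key.stop
        size = max(stop - start, 0)
        response = self.client.search(from_=start, size=size, **self.kwargs)
        return [hit["_source"] for hit in response["hits"]["hits"]]
```

With a wrapper like this, each page request translates into one search call with an offset and a size, so only the documents for the current page are fetched.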
epoz / expand.py
Created September 14, 2015 10:08
How to expand texts for a given set of ICONCLASS codes
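import json
import urllib
import urllib2

codes = ['31D11222', '34B11', '45(+26)', '45C1', '45D12', '48C7341']
codes = [urllib.quote(x) for x in codes]
# Fetch the requested notations; collect each notation ('n') and its path entries ('p')
paths = set()
for obj in json.loads(urllib2.urlopen('http://iconclass.org/json/?notation='+'&notation='.join(codes)).read()):
    paths.update(obj.get('p'))
    paths.add(obj.get('n'))
txts = []
kws = set()
# Fetch every collected notation (quoted, like the initial codes) and keep its German text
paths = [urllib.quote(x) for x in paths]
for p in json.loads(urllib2.urlopen('http://iconclass.org/json/?notation='+'&notation='.join(paths)).read()):
    txts.append(p.get('txt').get('de', u''))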
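The snippet above is Python 2. A minimal sketch of the same two building blocks for Python 3 could look like the following, assuming the endpoint and the `n` and `p` keys behave as they are used in the snippet. The helper names `notation_url` and `collect_paths` are made up for this illustration.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the snippet above.
ICONCLASS_JSON_URL = "http://iconclass.org/json/"


def notation_url(notations):
    """Build a URL with one percent-encoded 'notation' parameter per code."""
    query = urlencode([("notation", n) for n in sorted(notations)])
    return ICONCLASS_JSON_URL + "?" + query


def collect_paths(objects):
    """Gather each object's notation ('n') and its path entries ('p')."""
    paths = set()
    for obj in objects:
        paths.update(obj.get("p") or [])  # tolerate a missing 'p' key
        if obj.get("n"):
            paths.add(obj["n"])
    return paths
```

Letting `urlencode` do the quoting means codes such as `45(+26)` are escaped the same way in both requests.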
epoz / BL_RDF_2json.py
Created January 18, 2012 16:56
Streaming RDF/XML to JSON converter for the BL catalog data using Python and iterparse
This gist has been replaced by: https://gist.github.com/1731588
epoz / Default (OSX).sublime-keymap
Created July 19, 2012 12:48
Sublime Text 2 command to get latest Iconclass Clipboard
[
{ "keys": ["ctrl+i"], "command": "icclipboard" }
]
epoz / convert.py
Created August 1, 2012 08:29
Converting BNE Bibliography ntriples to BibJSON
import sys
import json
import ntriples
from datetime import datetime
import httplib
ES_URL = "localhost:9200"
ES_PATH = "/bibserver/"
field_mapping = {
epoz / gist:3760964
Created September 21, 2012 11:26
Markdown Watcher and auto-regenerator
#!/usr/bin/env python
'''
Markdown Watcher and auto-regenerator
While sitting in an aeroplane, I found myself editing a bunch of Markdown
files and needing to regenerate the HTML and preview in a browser.
It was tedious re-typing the 'markdown' command every time, so I made
this little script to watch the *.markdown files and create the corresponding
.html flavour if the modification date of the markdown file is newer or the
html does not exist yet.
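The rebuild rule described above (regenerate when the HTML is missing or older than the Markdown file) can be sketched as follows. This is an illustration under assumptions, not the gist's own code: the function names are invented here, and the actual call to the `markdown` command is left out.

```python
import os


def needs_rebuild(markdown_path, html_path):
    """True if the HTML file is missing or older than the Markdown file."""
    if not os.path.exists(html_path):
        return True
    return os.path.getmtime(markdown_path) > os.path.getmtime(html_path)


def stale_markdown_files(directory):
    """List the *.markdown files in a directory whose .html needs regenerating."""
    stale = []
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        if ext != ".markdown":
            continue
        markdown_path = os.path.join(directory, name)
        html_path = os.path.join(directory, base + ".html")
        if needs_rebuild(markdown_path, html_path):
            stale.append(markdown_path)
    return stale
```

A watcher would call `stale_markdown_files` in a loop with a short sleep and regenerate whatever it returns.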
epoz / gimmesrc.py
Created October 1, 2012 19:02
Retrieves the full source of a title from Wikisource
#!/usr/bin/env python
# Example: python gimmesrc.py De_Cive > txt
import sys, urllib, urllib2
URL = 'http://en.wikisource.org/w/index.php?action=raw&title='
if __name__ == '__main__':
    title = sys.argv[1]
    title_parts = []
epoz / STCN raw parser
Last active December 27, 2015 19:39
Read a raw data dump from the STCN (http://picarta.pica.nl/xslt/DB=3.11/), parse it, and write it out as a tab-separated-value file with one column per field, which is easier to open in Excel
'''
Read in a STCN data dump file, and convert it to a CSV file (delimited with tabs)
The data looks something like this:
SET: S0 [10000] TTL: 5 PPN: 339722142 PAG: 1 .
Ingevoerd: 1996:31-01-12 Gewijzigd: 1996:07-02-12 09:12:25 Status: 1996:31-01-12
0500 Aav
# Working out number of quires from a STCN collation
examples = [
    {'coll': '[*]2 2*-4*4 A-3Q4 2*2 `︠LO`3Q2 3R-5S4 5T2 5V-5Y4, 2A-G4 2H2 2I4 (3Q4 blank; lacks 3*4, blank?)',
     'url': 'http://picarta.pica.nl/xslt/DB=3.11/XMLPRS=Y/PPN?PPN=318093766',
     'req': 121,
     # A-Z 23, A-Z 23, A-Q 16, Q 1, R-Z 7, A-Z 23, A-S 18, T 1, V-Y 3, A-G 7, H 1, I 1,
    },
    {'coll': 'A-V8 W8 X-Z8',  # if W found, then noted 'loose' like A-V8 W8 X-Z8
     'req': 24}
]
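The counts in the comments above (A-Z 23, R-Z 7, V-Y 3) are consistent with a 23-letter signature alphabet that omits J, U and W. A minimal sketch of counting quires for the simplest case, assuming that alphabet and handling only single-letter signatures such as `A-V8 W8 X-Z8`, might look like this. It does not attempt prefixed or repeated signatures such as `2*-4*4` or `A-3Q4`, and it is not the author's algorithm.

```python
import re

# Assumed 23-letter signature alphabet (no J, U or W), inferred from the
# "A-Z 23" style counts in the comments above.
SIGNATURE_ALPHABET = "ABCDEFGHIKLMNOPQRSTVXYZ"

# One token: a letter, an optional "-letter" range end, then the leaf count.
TOKEN = re.compile(r"^([A-Z])(?:-([A-Z]))?(\d+)$")


def count_quires(collation):
    """Count quires in a collation made only of single-letter signatures."""
    total = 0
    for token in collation.split():
        match = TOKEN.match(token)
        if match is None:
            raise ValueError("unsupported collation token: %r" % token)
        first, last, _leaves = match.groups()
        if last is None:
            # A lone signature, including a 'loose' W, is one quire.
            total += 1
        else:
            start = SIGNATURE_ALPHABET.index(first)
            end = SIGNATURE_ALPHABET.index(last)
            total += end - start + 1
    return total
```

For the second example above, `A-V` spans 20 signatures, `W` adds one and `X-Z` adds three, which gives the 24 recorded in `req`.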
epoz / old_planodo.py
Last active March 22, 2016 11:20
Planodo: turn a bunch of files into one big zoomable image
#!./bin/python
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A4
from reportlab.lib.units import cm
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
from PIL import Image
import PIL
import os
import json