Rainer Simon rsimon

View GitHub Profile
@rsimon
rsimon / gist:7784870
Created December 4, 2013 09:38
Scala transformation script to extract plaintext from TEI - Pliny Natural History
import scala.xml.XML
import java.io.FileWriter
import scala.xml.transform.RewriteRule
import scala.xml.Node
import scala.xml.NodeSeq
import scala.xml.Elem
import scala.xml.transform.RuleTransformer
import scala.xml.Text
object TEI extends App {
rsimon / gist:7906496
Created December 11, 2013 07:48
SBT Multi Project Build
import sbt._
import Keys._
object ScalagiosBuild extends Build {
lazy val core = Project(id = "scalagios-core",
base = file("scalagios-core"))
lazy val legacy = Project(id = "scalagios-legacy",
base = file("foo")) dependsOn(core)
rsimon / gist:8036508
Created December 19, 2013 09:11
Clustering Hello World
'''
Created on 02.05.2013
@author: simonr
'''
from pylab import plot,show
from numpy import vstack,array
from numpy.random import rand
from scipy.cluster.vq import kmeans,vq
rsimon / gist:2c050ba033e2f6881d91
Last active August 29, 2015 14:01
Getting top unidentified places from Recogito DB
SELECT
-- concat (toponym, toponym_corrected), count(*)
coalesce(toponym, toponym_corrected), count(*)
FROM annotations WHERE
(status = 'NOT_IDENTIFYABLE' OR status = 'NO_SUITABLE_MATCH' OR status = 'AMBIGUOUS' OR status = 'MULTIPLE')
-- GROUP BY concat(toponym, toponym_corrected)
GROUP BY coalesce(toponym, toponym_corrected)
ORDER BY count desc ;
rsimon / ptdoubles.csv
Created May 15, 2014 08:06
List of Peutinger Table data doubles
Segment 1 1C4 TPPlace98
Segment 10 10B4 TPPlace2650
Segment 4 4B2 TPPlace3550
Segment 3 3B4 TPPlace1122
Segment 3 3B5 TPPlace2953
Segment 4 4B1 TPPlace2958
Segment 5 5B2 TPPlace1293
Segment 10 10B3 TPPlace2616
Segment 8 8A1 TPPlace3121
Segment 4 4B2 TPPlace2966
rsimon / gist:411cffc81edbd9a3604e
Last active August 29, 2015 14:02
Adding columns to a Postgres table
sudo -u postgres psql recogito
ALTER TABLE gdocuments ADD COLUMN geo_origin character varying(254);
ALTER TABLE gdocuments ADD COLUMN geo_findspot character varying(254);
ALTER TABLE gdocuments ADD COLUMN geo_author_location character varying(254);
  • Install the Tile Layer Plugin
  • The Tile Layer Plugin needs to be configured with tile sources. Configuration works via tab-separated-value (.tsv) files. Locate the plugin's default .tsv directory. This depends on your platform, but will usually be somewhere like:
.qgis2/python/plugins/TileLayerPlugin/layers/
C:\Users\rsimon\.qgis2/python/plugins/TileLayerPlugin/layers
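A minimal Python sketch for locating that directory and listing the .tsv files it contains. It assumes, as in the example paths above, that the .qgis2 folder sits directly in the user's home directory; the function names tile_layer_dir and list_tsv_files are hypothetical helpers, not part of QGIS or the plugin.

```python
from pathlib import Path


def tile_layer_dir(home=None):
    """Build the assumed TileLayerPlugin layers directory under a home folder.

    Assumption: .qgis2 lives directly in the user's home directory, as in
    the example paths above. Other installations may differ.
    """
    base = Path(home) if home is not None else Path.home()
    return base / ".qgis2" / "python" / "plugins" / "TileLayerPlugin" / "layers"


def list_tsv_files(directory):
    """Return the sorted names of .tsv files in directory, or [] if it is missing."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.tsv"))


if __name__ == "__main__":
    layers = tile_layer_dir()
    print("Assumed layers directory:", layers)
    print("Tile source files found:", list_tsv_files(layers))
```

If the script prints an empty list, the plugin is probably installed somewhere else on your platform, and the directory has to be located by hand.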
rsimon / common.json
Created November 27, 2015 07:51
ElasticSearch Mapping for Pelagios 'Common Object Properties'
{
"object": {
"properties": {
"identifier": { "type": "string", "index": "not_analyzed" },
"object_type": { "type": "string", "index": "not_analyzed" },
"title": { "type": "string" },
"description": { "type": "string" },
"homepage": { "type": "string", "index": "no" },
"is_in_dataset": {
"type": "nested",
rsimon / annotation.json
Last active November 27, 2015 08:32
ElasticSearch Mapping for Pelagios annotations
{
"object": {
"properties": {
"identifier": { "type": "string", "index": "not_analyzed" },
"homepage": { "type": "string", "index": "no" },
"is_in_dataset": {
"type": "nested",
"properties": {
"identifier": { "type": "string", "index": "not_analyzed" },
"title": { "type": "string", "index": "no" }
{
"size" : 5,
"query" : {
"nested" : {
"path" : "is_conflation_of",
"query" : {
"bool": {
"should": [
{ "match" : { "is_conflation_of.title": "athenae" } },
{