Knud Möller knudmoeller

knudmoeller / ckan_installation.md
Last active January 29, 2024 08:57
CKAN 2.9.10: "Installing from Source" on Ubuntu 20.04 + "Deploying a source install"

Installing CKAN from Source + Deployment

The installation instructions on https://docs.ckan.org/en/2.9/maintaining/installing/install-from-source.html are fine, except:

Solr

  • I don't run Solr on the same machine, so I don't need to install Solr, Jetty or Java. The required packages are therefore installed like this:
$ sudo apt-get install python3-dev postgresql libpq-dev python3-pip python3-venv git-core redis-server
knudmoeller / get_geojson.md
Last active July 11, 2023 09:25
Get WGS84-GeoJSON from WFS with Soldner-Coordinates

The problem: get GeoJSON data from a WFS that uses a projection other than WGS84. This applies, for example, to all geodata in Berlin's FIS-Broker GIS. The data there uses EPSG:25833 (ETRS89 / UTM zone 33N), loosely called the "Soldner" projection here (the Soldner Berlin grid itself is EPSG:3068).

Download as XML

knudmoeller / planets_and_moons.rq
Created January 17, 2023 23:21
Wikidata Query to get all planets of the solar system (and Pluto) and their moons
# Planets of the solar system
SELECT DISTINCT ?planet ?planetLabel ?child ?childLabel
WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
{
# things that are instances of planets (or subclasses thereof)
# and are part of the solar system (or parts of parts)
?planet wdt:P31/wdt:P279* wd:Q634 .
knudmoeller / wordle_filter.py
Last active January 14, 2022 11:22
Wordle Filter
import subprocess
# read the dictionary file:
result = subprocess.run(['egrep', '^.{5}$', "/usr/share/dict/words"], stdout=subprocess.PIPE)
words = result.stdout.decode('utf-8').splitlines()
present = ['a', 'n', 'g', 't'] # letters that we know are in the word
not_present = ['o', 'i', 's', 'e', 'u'] # letters we know are not in the word
# filter words with all() and not any()
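The preview stops at the filtering step. A minimal sketch of what the final comment describes, assuming the filter is a plain list comprehension (the function name `filter_words` and the sample word list are hypothetical, not taken from the gist):

```python
def filter_words(words, present, not_present):
    """Keep words containing every letter in `present` and no letter in `not_present`."""
    return [
        word
        for word in words
        # all(): every known letter must occur somewhere in the word
        if all(letter in word for letter in present)
        # not any(): none of the excluded letters may occur in the word
        and not any(letter in word for letter in not_present)
    ]


# Hypothetical sample list standing in for the 5-letter words from the dictionary file
candidates = ["tango", "giant", "gnats", "thang", "grant"]
matches = filter_words(
    candidates,
    present=["a", "n", "g", "t"],
    not_present=["o", "i", "s", "e", "u"],
)
print(matches)
```

With the real script, `words` from the dictionary file would take the place of `candidates`. The check is case-sensitive, so capitalised dictionary entries may need lowercasing first.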
  • The input data are multiple JSON files in sorted/ with this minimal structure:
{
  "dump_finished": "2021-12-14T10:43:42+01:00",
  "datasets": [
    ...
  ]
}
knudmoeller / jq_multi_query.sh
Created December 21, 2021 13:59
Query multiple files with jq, slurp results into one large array
jq "{date: .dump_finished, count: .datasets | length}" sorted/*.json | jq -s
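For comparison, here is a rough Python equivalent of the jq pipeline above, as a sketch rather than a replacement. It reads every file matching `sorted/*.json`, extracts the dump date and the number of datasets, and collects the results in one list, which is what `jq -s` does with the stream of objects. The function name `summarize_dumps` is hypothetical.

```python
import glob
import json


def summarize_dumps(pattern="sorted/*.json"):
    """Return one {"date", "count"} record per dump file matching `pattern`."""
    summaries = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            dump = json.load(handle)
        summaries.append(
            {
                # .get() mirrors jq, which yields null / length 0 for missing keys
                "date": dump.get("dump_finished"),
                "count": len(dump.get("datasets", [])),
            }
        )
    return summaries


if __name__ == "__main__":
    print(json.dumps(summarize_dumps(), indent=2))
```

The files are processed in sorted path order, which matches the order in which the shell expands `sorted/*.json`.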
knudmoeller / bulk_purge.py
Created November 18, 2021 08:57
Bulk-purging Datasets in CKAN with ckanapi
from ckanapi import RemoteCKAN
import os
# require ckanapi: https://github.com/ckan/ckanapi
package_names = [
"versickerung-aus-niederschlagen-2017-umweltatlas-wfs-beb56dfa",
"versickerung-aus-niederschlagen-ohne-versiegelung-2017-umweltatlas-wfs-e4a931f6",
"versiegelung-2005-unkorrigierte-versiegelungsgrade-rasterdaten-atom-451f714b",
"versiegelung-2011-unkorrigierte-versiegelungsgrade-rasterdaten-atom-c973b948",
knudmoeller / drush_count_nodes.sh
Created October 21, 2020 15:26
Drush/SQL command to count number of nodes for each node type
drush sqlq 'select count(node.nid) as node_count, node_type.type from node inner join node_type on node.type = node_type.type group by node_type.type'
knudmoeller / or_queries_for_outlook_categories.md
Last active January 30, 2023 12:44
Perform OR queries for categories on Microsoft Outlook for Mac

Bizarrely, one cannot make OR queries in Microsoft Outlook for Mac, at least not easily. For example, I want a smart folder that contains all messages tagged with either the FIS Broker category or the Dubletten category.

The way to do it non-easily is to select Raw Query as the query rule type (in Search > Advanced), and then enter a Spotlight query string. The attribute to query for categories is com_microsoft_outlook_categories. That attribute takes a numeric id - but what the hell is the id for a category like FIS Broker? Outlook doesn't tell me, so I have to go even deeper down the rabbit hole. Here is how:

knudmoeller / recent_data_updates.sh
Created March 11, 2020 16:24
Get the most recent data updates in Berlin's Open Data Portal
#!/bin/bash
# The CKAN Action-API endpoint:
DATENREGISTER_API_BASE="https://datenregister.berlin.de/api/3/action/"
# Query filtering datasets that came in through the FIS-Broker harvester
QUERY="berlin_source:harvest-fisbroker"
# date_updated refers to when the data (not the metadata) was last updated
SORT="date_updated+desc"
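
# The preview ends here. As a sketch of how these variables could be combined,
# the following Python version builds a request URL for the CKAN Action API.
# Assumptions not taken from the script: the `package_search` action is used,
# a `rows` limit is passed, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Same endpoint as DATENREGISTER_API_BASE in the shell script
API_BASE = "https://datenregister.berlin.de/api/3/action/"


def build_search_url(query, sort, rows=10, api_base=API_BASE):
    """Build a package_search URL; urlencode turns the space in `sort` into '+'."""
    params = urlencode({"q": query, "sort": sort, "rows": rows})
    return f"{api_base}package_search?{params}"


def fetch_recent(query="berlin_source:harvest-fisbroker", sort="date_updated desc", rows=10):
    """Fetch the matching datasets (performs a network request)."""
    with urlopen(build_search_url(query, sort, rows)) as response:
        payload = json.load(response)
    # CKAN wraps search hits as {"result": {"results": [...]}}
    return payload["result"]["results"]


url = build_search_url("berlin_source:harvest-fisbroker", "date_updated desc")
print(url)
```

Only `build_search_url` runs when the block is executed; `fetch_recent` contacts the portal and is left for the caller to invoke.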