@knbknb
Last active January 26, 2023 10:12
dbpedia-snippets.sh
# some examples of querying DBpedia from the command line
# knb 20201218
#
# show serialization formats that dbpedia can return
# (other than text/turtle)
#
# curl -sL: silent, follow redirects
# curl -I : HTTP HEAD request
# RDF triples: all facts about Pudding
# Hypernyms (~broader terms, i.e. supertypes) of Pudding. For uni2utf8.pl, see the helper script below (source: Stack Overflow)
curl -sL --header "Accept: text/turtle" "http://dbpedia.org/resource/Pudding" \
| rapper -q -i turtle - http://dbpedia.org/resource \
| grep hypernym | sort | uni2utf8.pl
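# write it to a local file, then parse it and print the triples
# (rapper prints N-Triples by default)
curl -sL -H "Accept: text/turtle" "http://dbpedia.org/resource/Pudding" -o ./pudding.ttl
rapper -i turtle ./pudding.ttl
# count the triples (reported 917 in 2022)
rapper --count -i turtle ./pudding.ttl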
# HTTP HEAD request to DBpedia, Pudding. (/resource -> ...3 redirects... -> /data)
curl -sL --no-progress-meter --head --header "Accept: text/turtle" "http://dbpedia.org/resource/Pudding"
## shorter version of previous command:
## -s (--silent) already hides the progress meter, so --no-progress-meter is redundant; --head and -I are the same.
## same command as before (HTTP HEAD request), but grab the response header that lists
## the alternate formats (the awk pattern below matches "link:"), and make it more human-readable.
# HTTP HEAD request to DBpedia - show available formats
curl -sL -I --header "Accept: text/turtle" "http://dbpedia.org/resource/Pudding" \
| awk 'BEGIN {FS=": "}/^link:/{print $2}' \
| perl -pE "s/, ?/\n/g; s/0{3,}//g;" \
| sort -n -k2 -k1
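# A minimal offline sketch of the header-splitting step above, assuming the header of
# interest holds a comma-separated list. split_header_list and the sample headers are
# hypothetical (example.org, not real DBpedia output). A comma inside a quoted value
# would also be split, so treat this as an illustration only.

```shell
# split_header_list NAME: read HTTP response headers on stdin, pick the header
# called NAME (case-insensitive), and print each comma-separated entry on its own line.
split_header_list() {
    awk -v name="$1" '
        {
            sub(/\r$/, "")                    # curl prints headers with CRLF line endings
            i = index($0, ":")
            if (i == 0) next                  # status line or blank line
            if (tolower(substr($0, 1, i - 1)) != tolower(name)) next
            value = substr($0, i + 1)
            sub(/^[ \t]+/, "", value)         # drop whitespace after the colon
            n = split(value, parts, /, */)
            for (j = 1; j <= n; j++) print parts[j]
        }'
}

# Hand-written sample headers (illustrative only).
sample_headers='HTTP/1.1 200 OK
Content-Type: text/turtle
Link: <http://example.org/data/Thing.rdf>; rel="alternate", <http://example.org/data/Thing.json>; rel="alternate"'

printf '%s\n' "$sample_headers" | split_header_list link
```

# With live data, the same function could be fed from: curl -sL -I ... | split_header_list link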
# HTTP GET
# count different @prefixes used in result
curl -sL -H "Accept: text/turtle" "http://dbpedia.org/resource/Pudding" | grep @prefix | wc -l
# (n = 44)
# same request, to .../data/... URL not .../resource/... link
curl -sL "http://dbpedia.org/data/Pudding.ttl" | grep @prefix | wc -l
# (n = 24)
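# The prefix count can be tried without network access. A small sketch, assuming
# Turtle on stdin; count_prefixes and the sample document are hypothetical and are not
# DBpedia output. grep -c counts matching lines directly, so no wc -l is needed.

```shell
# count_prefixes: count lines that start with an @prefix declaration.
count_prefixes() {
    grep -c '^@prefix'
}

# Tiny hand-written Turtle sample (illustrative only).
sample_turtle='@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .
dbr:Pudding a dbo:Food .'

printf '%s\n' "$sample_turtle" | count_prefixes
```

# Note that grep -c exits with status 1 when nothing matches, which matters under set -e.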
# have a look at the actual triples returned -
# for uni2utf8.pl, see below
curl -sL -H "Accept: text/turtle" "http://dbpedia.org/resource/Pudding" | uni2utf8.pl
##!/usr/bin/env bash
# fetch JSON for one resource and list its top-level keys (URIs of related resources)
curl -sL -H "Accept: application/json" https://dbpedia.org/resource/Potsdam \
| jq -r 'keys_unsorted[]' \
| perl -pE "s#http://dbpedia.org/resource/##g" | sort | cat -n
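# The prefix-stripping and sorting at the end of that pipeline can be sketched offline,
# assuming one URI per line on stdin. list_local_names and the sample URIs are
# hypothetical, and sed stands in for the perl substitution.

```shell
# list_local_names: strip the DBpedia resource prefix and sort the remaining names.
list_local_names() {
    sed 's#^http://dbpedia.org/resource/##' | sort
}

# Hand-written sample input (illustrative only).
sample_uris='http://dbpedia.org/resource/Potsdam
http://dbpedia.org/resource/Berlin'

printf '%s\n' "$sample_uris" | list_local_names
```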
# from someone's Twitter timeline, Dec 2022
# Wikidata: search for emigrants during Nazi era
SELECT DISTINCT ?category ?person ?description WHERE {
  ?category a skos:Concept .
  ?category skos:prefLabel ?label .
  FILTER (CONTAINS(?label, "Zeit des Nationalsozialismus"))
  ?person dct:subject ?category .
  ?person rdfs:comment ?description .
  FILTER (LANG(?description) = 'de')
}
@knbknb
Author

knbknb commented Dec 17, 2020

uni2utf8.pl: helper script for converting \u1234 sequences to characters the terminal can display.
example: turn Tavuk_g\u00F6\u011Fs\u00FC into Tavuk_göğsü
example call: curl -sL -H "Accept: text/turtle" "http://dbpedia.org/resource/Pudding" | uni2utf8.pl
source: stackoverflow

#!/usr/bin/perl
# uni2utf8.pl
# helper script
# convert \u1234 sequences to characters the terminal can display.
# save this as ~/bin/uni2utf8.pl
# run example:
# curl -sL -H "Accept: text/turtle" "http://dbpedia.org/resource/Pudding"   | uni2utf8.pl
use strict;
use warnings;

binmode(STDOUT, ':utf8');

while (<>) {
    s/\\u([0-9a-fA-F]{4})/chr(hex($1))/eg;
    print;
}
