Philipp Geisler philippgeisler
@philippgeisler
philippgeisler / sum.awk
Created August 13, 2014 13:59
Simple Awk script that sums values (here $2) grouped by a key field (here $1)
#! /usr/bin/awk -f
# equivalent oneliner for aliases: awk -F "|" '{group[$1] += $2} END{for (g in group) print group[g], "\t", g;}'
BEGIN {
FS="|"
}
{
group[$1] += $2
}
END {
# print each total followed by its group key, as in the one-liner above
for (g in group)
print group[g], "\t", g
}
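A quick way to try the grouping logic is to feed it a few pipe-delimited lines. The sketch below is a variant of the one-liner from the comment above: it prints the key first and pipes the result through `sort`, because awk does not guarantee the iteration order of `for (g in group)`. The function name `sum_by_group` and the sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the gist): reads "key|value" lines from stdin
# and prints "key<TAB>sum", sorted by key.
sum_by_group() {
    awk -F "|" '{ group[$1] += $2 } END { for (g in group) print g "\t" group[g] }' | sort
}

# Made-up sample data: two groups, three records.
printf 'apples|3\npears|2\napples|4\n' | sum_by_group
```

With this input the two `apples` records are added together and `pears` is reported on its own line.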
philippgeisler / mysql2csv.awk
Last active August 29, 2015 14:05
Should create one .csv file for each table CREATEd in a MySQL export. The result may depend on the export parameters; this has been tested with only one particular export file.
#!/usr/bin/awk -f
BEGIN { RS=";\n"; }
{
if ( $1 == "CREATE" ) {
fn = substr($3,2,length($3)-2) ".csv";
print fn > fn;
}
if ( $1 == "INSERT" ) {
iv = index($0,"(")+1;
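The preview above is cut off, but its first two ideas can be tried in isolation: setting `RS` to `";\n"` makes every SQL statement one awk record, and `substr($3,2,length($3)-2)` strips the backticks around the table name. The sketch below only lists table names; the function name `list_tables` and the sample dump are made up, and it assumes an awk that accepts a multi-character `RS` (gawk and mawk do; POSIX leaves the behavior unspecified).

```shell
#!/bin/sh
# Hypothetical helper (not part of the gist): lists the table names found in
# CREATE statements of a dump read from stdin.
# Assumes an awk with multi-character RS support (e.g. gawk, mawk).
list_tables() {
    awk 'BEGIN { RS = ";\n" }
         $1 == "CREATE" { print substr($3, 2, length($3) - 2) }'
}

# Made-up two-statement dump; the table name is quoted with backticks.
printf 'CREATE TABLE `users` (\n  `id` int\n);\nINSERT INTO `users` VALUES (1);\n' | list_tables
```

Because the default field splitting also treats newlines as separators once `RS` is no longer a newline, `$3` is still the quoted table name even when the statement spans several lines.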
philippgeisler / start.html
Created September 10, 2014 18:52
HTML5 bare minimum starting point
<!DOCTYPE html>
<html lang="">
<head>
<meta charset="utf-8">
<title></title>
<link rel="stylesheet" href="">
<script src=""></script>
</head>
<body>
philippgeisler / hh_wohnlage.sh
Last active August 29, 2015 14:06
(unverified) Number of Hamburg house numbers by residential location category (Wohnlage), 2011, according to http://suche.transparenz.hamburg.de/dataset/hamburger-wohnlagenverzeichnis
awk -F";" 'NR>1{lage[$7] += $(($4-$2)/2);}END{for (l in lage) printf("%8s%7s\n", l, lage[l]);}' < wohnlagenverzeichnis2011-opendata.txt
#> gut 40168
#> normal 103333
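One detail in the command above is easy to misread: in awk, `$(expr)` selects the field whose index is the value of `expr`, while `(expr)` on its own is just the arithmetic result. The sketch below contrasts the two on a made-up record. The column layout and the function name `show_field_vs_value` are assumptions for illustration and are not taken from the Wohnlagenverzeichnis file.

```shell
#!/bin/sh
# Hypothetical demo (not part of the gist): prints the arithmetic value of
# ($4-$2)/2 next to the field that $(($4-$2)/2) refers to.
show_field_vs_value() {
    awk -F";" '{ printf("value=%d field=%s\n", ($4-$2)/2, $(($4-$2)/2)) }'
}

# Made-up record: fields 2 and 4 hold 2 and 8, so ($4-$2)/2 is 3 and $(3) is "x".
printf 'Teststr;2;x;8;y\n' | show_field_vs_value
```

Which of the two forms is wanted depends on what the columns of the real data file contain, so it is worth checking against the dataset documentation.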
philippgeisler / consolidate_headers.awk
Created July 15, 2015 21:01
Awk script that fixes header values and fills in missing headers in CSV/TSV data exported from WYSIWYG spreadsheets with merged cells.
#!/usr/local/bin/awk -f
BEGIN {
# adjust the following variables according to your needs:
FS = "\t" # the field separator: "\t" for tab-separated, "," for comma-separated data
first = 2 # number of the first header line to be processed
last = 4 # number of the last header line to be processed
string_delimiter = "\"" # character used for delimiting strings in the input file; set to ""
# if none is used (empty string)
# Please note that the script will not work as expected if there are additional characters/
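The rest of the script is not shown in this preview. As a rough illustration of what filling in missing headers can look like, the sketch below copies the previous cell into every empty cell on header lines `first` through `last`. This is an assumed reading of the description, not the original implementation; the function name `fill_headers` is made up, and the sketch ignores `string_delimiter` entirely.

```shell
#!/bin/sh
# Hypothetical sketch (not the gist's code): on lines first..last, replace each
# empty tab-separated cell with the value of the cell to its left.
# Usage: fill_headers FIRST LAST < input.tsv
fill_headers() {
    awk -v first="$1" -v last="$2" 'BEGIN { FS = OFS = "\t" }
        NR >= first && NR <= last {
            for (i = 2; i <= NF; i++) if ($i == "") $i = $(i - 1)
        }
        { print }'
}

# Made-up header row in which "2014" and "2015" each spanned two merged cells.
printf 'Year\t2014\t\t2015\t\nQuarter\tQ1\tQ2\tQ1\tQ2\n' | fill_headers 1 1
```

Lines outside the `first`..`last` range are printed unchanged, so data rows with legitimately empty cells are left alone.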
philippgeisler / selcol.sh
Created July 16, 2015 21:28
Interactively select a column from (here: tab-)separated value data by its title (expected in the 1st line) and view its values; expects a (TSV) file path as its first and only argument; requires percol https://github.com/mooz/percol
#!/bin/sh
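# List the header fields one per line, numbered; pick one with percol;
# then print the column with that number. "{}" is used as the xargs placeholder
# so that letters in the file name are never substituted.
head -n 1 "$1" | \
tr "\t" "\n" | \
cat -n | \
percol | \
awk '{print $1}' | \
xargs -I {} awk -v x={} 'BEGIN{FS="\t"}{print $x}' "$1" | \
less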
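When the column title is already known, the same lookup can be done without percol. The sketch below is a hypothetical non-interactive variant, not part of the gist: `selcol_by_title` and the sample file are made up, and it assumes the title matches a header cell exactly.

```shell
#!/bin/sh
# Hypothetical non-interactive variant: print the values of the column whose
# header (1st line) equals TITLE.
# Usage: selcol_by_title TITLE FILE
selcol_by_title() {
    awk -v title="$1" 'BEGIN { FS = "\t" }
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == title) col = i; next }
        col     { print $col }' "$2"
}

# Made-up TSV file with two columns.
tmp=$(mktemp)
printf 'name\tcity\nAnna\tHamburg\nBen\tBerlin\n' > "$tmp"
selcol_by_title city "$tmp"
rm -f "$tmp"
```

If no header matches, `col` stays unset and nothing is printed.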
philippgeisler / bboxes.sh
Created July 11, 2014 13:48
Shell script grabbing bounding box coordinates for an array of cities
#!/usr/bin/zsh
url='http://open.mapquestapi.com/nominatim/v1/search.php'
# url='http://nominatim.openstreetmap.org/search'
if [ -e locations ]
then
rm locations
fi
for c in Berlin Dortmund Frankfurt Hamburg München
philippgeisler / mdb-to-tsv.sh
Created August 30, 2016 19:48
export all tables from Access database file (.mdb) into individual .tsv files using brianb/mdbtools
mdb-tables -1 -t table database.mdb | xargs -I tablename sh -c "mdb-export -d \\\\t database.mdb \"tablename\" > \"tablename.tsv\""
philippgeisler / eg.sh
Created September 23, 2016 14:53
Example use of xmlstarlet with METS/MODS metadata
ls -f ./*/*.xml | xargs -I {} xml sel -T -t -m "//mets:structMap/mets:div[@ADMID='AMD']" -v "@LABEL" -n {} | sort | uniq -c