Marian Steinbach marians

#!/usr/bin/perl
# This script concatenates and minifies a given set of CSS and JavaScript files
# into a single JS file and a single CSS file.
#
# In order to make it work for your project, configure the path settings and the
# file names both for input and output files.
#
# by Marian Steinbach <marian@sendung.de>
marians / gist:879241
Created March 21, 2011 09:57
CSV dump from local database
#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
This script extracts CSV dumps from my local database
Author: Marian Steinbach, marian@sendung.de, http://www.sendung.de/japan-radiation-open-data/
'''
import csv
marians / german-porter-stemmer.js
Created April 26, 2011 14:06
German Porter Stemmer in JavaScript
/* by Joder Illi, Snowball mailing list */
function stemm(word) {
/*
Put u and y between vowels into upper case
*/
word = word.replace(/([aeiouyäöü])u([aeiouyäöü])/g, '$1U$2');
word = word.replace(/([aeiouyäöü])y([aeiouyäöü])/g, '$1Y$2');
/*
and then do the following mappings,
marians / rename.sh
Created July 7, 2011 20:22
Batch renaming files to create a numbered file sequence
#!/bin/bash
# Assuming that you have a number of
# PNGs in a folder pngs/, this script
# will rename them to a file sequence
# like 0001.png, 0002.png, ...
n=0
for f in pngs/*.png; do
n=$((n + 1))
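
The preview above is cut off before the actual rename step, so the following is only a minimal Python sketch of the same idea, not the author's script. The function names `numbered_names` and `rename_sequence` and the `.tmp` suffix are hypothetical. It assumes the files should be numbered in sorted filename order, and it renames in two phases so that a file already named like `0001.png` is not overwritten.

```python
import os


def numbered_names(filenames, width=4, ext=".png"):
    """Pair each matching filename (in sorted order) with a zero-padded name."""
    ordered = sorted(f for f in filenames if f.endswith(ext))
    return [(name, f"{i:0{width}d}{ext}") for i, name in enumerate(ordered, start=1)]


def rename_sequence(folder, width=4, ext=".png"):
    """Rename all files with the given extension to 0001.png, 0002.png, ..."""
    plan = numbered_names(os.listdir(folder), width, ext)
    # Phase 1: move every file to a temporary name so no target is clobbered.
    for old, new in plan:
        os.rename(os.path.join(folder, old), os.path.join(folder, new + ".tmp"))
    # Phase 2: strip the temporary suffix to get the final sequence names.
    for _, new in plan:
        os.rename(os.path.join(folder, new + ".tmp"), os.path.join(folder, new))
    return plan
```

Calling `rename_sequence("pngs")` would renumber the PNGs in `pngs/`; the sketch assumes no files ending in `.png.tmp` already exist in that folder.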
marians / database.sql
Created July 27, 2011 07:55
Snapshot of the script I wrote to generate http://vimeo.com/26157684
CREATE TABLE `stations` (
`id` varchar(9) COLLATE latin1_general_ci NOT NULL,
`postalcode` varchar(5) COLLATE latin1_general_ci NOT NULL,
`name` varchar(255) COLLATE latin1_general_ci NOT NULL,
`longitude` decimal(5,2) NOT NULL,
`latitude` decimal(5,2) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci COMMENT='BfS Sensor Stations';
CREATE TABLE `values_2h` (
marians / couchimport.py
Created November 11, 2011 11:57
Import of certain CSV data from STDIN to CouchDB
#!/opt/local/bin/python2.7
# encoding: utf-8
"""
couchimport.py
Created by Marian Steinbach on 2011-11-10.
"""
import sys
import os
import datetime
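
Only the imports of couchimport.py are visible here, so the following is a hedged sketch of how CSV rows from STDIN could be turned into a CouchDB bulk request, not the original code. It targets Python 3 rather than the Python 2.7 of the original. The function names, the database URL, and the assumption that the first CSV line is a header row are all hypothetical. CouchDB's `_bulk_docs` endpoint accepts a JSON body of the form `{"docs": [...]}`.

```python
import csv
import json
import sys
import urllib.request


def rows_to_bulk_payload(stream):
    """Read CSV with a header row and build a _bulk_docs JSON body."""
    docs = [dict(row) for row in csv.DictReader(stream)]
    return json.dumps({"docs": docs})


def post_bulk(payload, db_url="http://localhost:5984/mydb"):
    """POST the payload to <db_url>/_bulk_docs (db_url is a placeholder)."""
    request = urllib.request.Request(
        db_url + "/_bulk_docs",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(post_bulk(rows_to_bulk_payload(sys.stdin)))
```

All values are stored as strings in this sketch; any type conversion (dates, numbers) would depend on the actual CSV columns, which the preview does not show.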
#!/usr/bin/env python
# encoding: utf-8
"""
This script acquires statistics on usenet groups.
It first reads a list of groups from one or more usenet servers
and then gets monthly post statistics about these groups from
Google Groups.
"""
marians / getpostcounts.py
Created January 5, 2012 10:26
de. newsgroups and statistics about them
#!/usr/bin/env python
# encoding: utf-8
"""
This script acquires statistics on usenet groups.
It first reads a list of groups from a file and then
reads the Google Group info page about that group to
gather monthly post counts.
"""
marians / kill-old-processes.py
Created January 23, 2012 20:12
A rather simple script to kill stalled processes. Adapt the grep string in the PS_CMD variable to make it match your needs.
# coding: utf-8
"""
This script terminates certain scraper processes that
have been running for more than one day.
"""
PS_CMD = 'ps -eo pid,etime,cmd|grep scrape_updated_dose_values.py'
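
The rest of the script is not shown, so the following is only a sketch of how the `pid,etime,cmd` output of such a `ps` call could be checked against the one-day limit. The function names `etime_to_seconds` and `stale_pids` are hypothetical. It relies on the `etime` column having the form `[[dd-]hh:]mm:ss`.

```python
def etime_to_seconds(etime):
    """Convert a ps etime string of the form [[dd-]hh:]mm:ss to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    # Pad missing leading fields (hours) with zero.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def stale_pids(ps_output, max_age_seconds=86400):
    """Return the PIDs whose elapsed time exceeds max_age_seconds."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip header lines and blank lines
        if etime_to_seconds(fields[1]) > max_age_seconds:
            pids.append(int(fields[0]))
    return pids
```

The returned PIDs could then be passed to `os.kill`; that step is left out because the signal and error handling used by the original are not visible in the preview.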
marians / 292011.txt
Created February 16, 2012 18:39
OCR results with tesseract-ocr (current SVN version as of 2012-02-16)
CDU
CDU-Fraktion in der Bezirksvertretung
des Stadtbezirke Ehrenfeld
Henn Oberbürgennei „ _ ‚ erm Bezirksbürgenneister
Jürgen Roters * " _ „_ . ‚ osef Wrrges
Rathaus Eingaig 28 Feb 2m Bezirksrathaus Ehrenfeld