TimRepke /
Last active Jan 24, 2019
Spark vs Python doc2vec

Spark vs Single-core python

Question: can parallel pre- and postprocessing speed up Gensim Doc2Vec?

  • Spark: 349s
  • Vanilla: 373s

(only one run, so not a very scientific comparison)

Run on a single machine with 16GB RAM and Intel i7-8550U CPU @ 1.80GHz

TimRepke /
Last active Aug 24, 2018
Solr import/export

Solr import/export

You need to move from one Solr instance to another and can't be bothered with mismatching versions or whatever? These two scripts will help you :)

First you need to create a new core in the target instance. You may want to use the schema/configset from the originating instance though, as the default schema might not be ideal.

In my scenario I moved from Solr 5.5.5 to Solr 7.4. Therefore I had to (at least) update the solrconfig.xml, where the Lucene version is specified. The exact version you need can be found in the default configset ([solr_root]/server/solr/configsets/...)
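
For illustration only, here is a minimal sketch of what the export half could look like. It is not the gist's actual script: it pages through a core with Solr's cursorMark parameter, and the names `export_docs` and `http_fetch`, the unique key `id`, and the example core URL are all assumptions.

```python
import json
import urllib.parse
import urllib.request


def http_fetch(url):
    """Fetch a URL and decode the JSON response (assumes no authentication)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def export_docs(core_url, fetch=http_fetch, rows=500, unique_key="id"):
    """Yield every document of a core, page by page, using cursorMark.

    core_url is hypothetical, e.g. "http://localhost:8983/solr/my_core".
    cursorMark requires a sort that includes the core's unique key field.
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        params = urllib.parse.urlencode({
            "q": "*:*",
            "sort": f"{unique_key} asc",
            "rows": rows,
            "wt": "json",
            "cursorMark": cursor,
        })
        page = fetch(f"{core_url}/select?{params}")
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # Solr repeats the cursor once all docs are read
            break
        cursor = next_cursor
```

Passing the fetch function in as a parameter keeps the paging logic testable without a running Solr. The yielded documents could then be dumped to a file and posted to the new core.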

TimRepke /
Last active Dec 10, 2020
PST Archive to RFC822 (*.eml) script

PST Archive to RFC822

This script extracts all emails from an Outlook PST archive and saves them into some output folder as individual RFC822 compliant *.eml files.
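
As a rough sketch of the output side only (not the gist's code), this is one way to write a single message as an RFC822-style *.eml file with Python's standard library. The function name `write_eml` and its parameters are hypothetical; in practice the values would come from whatever the PST library returns for each message.

```python
from email import policy
from email.message import EmailMessage
from pathlib import Path


def write_eml(sender, recipient, subject, body, path):
    """Write one plain-text message to `path` as an .eml file.

    All arguments are hypothetical stand-ins for fields read from the PST.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)  # adds the text/plain Content-Type headers
    path = Path(path)
    # The SMTP policy serialises with CRLF line endings
    path.write_bytes(msg.as_bytes(policy=policy.SMTP))
    return path
```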

Installing the external dependency pypff may not be straightforward (it wasn't for me). I forked the original repository to make it work in Python 3. If you get errors, check their wiki pages for help or try my fork. Below are the steps that worked for me:


TimRepke /
Last active Jul 24, 2018
Solr Downloader

ElasticSearch Downloader

Very basic script for downloading a specific index from ElasticSearch into a file containing one document per line (json-formatted).

The argparse options should be pretty self-explanatory.

Call the script with:

python --url= --port=9200 --index=my_index --out=/path/to/output/
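
For readers curious how such a download might work, here is a minimal sketch based on the ElasticSearch scroll API. It is an assumption about the approach, not the gist's actual code: `download_index` and `http_post` are made-up names, and writing only the `_source` of each hit is a guess at the output format.

```python
import json
import urllib.request


def http_post(url, payload):
    """POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def download_index(base_url, index, out_file, post=http_post, size=1000):
    """Write every document of `index` to `out_file`, one JSON object per line.

    base_url is hypothetical, e.g. "http://localhost:9200".
    Returns the number of documents written.
    """
    # The first request opens a scroll context that is kept alive for 1 minute
    page = post(f"{base_url}/{index}/_search?scroll=1m",
                {"size": size, "query": {"match_all": {}}})
    count = 0
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            out_file.write(json.dumps(hit["_source"]) + "\n")
            count += 1
        # Ask for the next batch with the scroll id of the previous response
        page = post(f"{base_url}/_search/scroll",
                    {"scroll": "1m", "scroll_id": page["_scroll_id"]})
    return count
```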
View gist:f13bbb74f34d4c97d10742601a4a0406
tim@klapprechner ~/workspace/satnavpi/valhalla (git)-[2.1.8] % ./ :(
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, '.'.
libtoolize: copying file './'
libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
libtoolize: copying file 'm4/libtool.m4'
libtoolize: copying file 'm4/ltoptions.m4'
libtoolize: copying file 'm4/ltsugar.m4'
libtoolize: copying file 'm4/ltversion.m4'
libtoolize: copying file 'm4/lt~obsolete.m4'
installing './compile'
TimRepke /
Last active Apr 11, 2016
Use your ThinkPad 'i'-LED to morse stuff

ThinkPad morse (aka ThinkBlink)


$ sudo ./ "sos"
s . . . 
o - - - 
s . . .
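
The text-to-morse step behind output like the above can be sketched as follows. This is only an illustration with a hypothetical helper `morse_lines`; it covers the letters a-z and leaves out the LED control entirely, since the original script is not shown here.

```python
# International Morse code for the letters a-z
MORSE = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".", "f": "..-.",
    "g": "--.", "h": "....", "i": "..", "j": ".---", "k": "-.-", "l": ".-..",
    "m": "--", "n": "-.", "o": "---", "p": ".--.", "q": "--.-", "r": ".-.",
    "s": "...", "t": "-", "u": "..-", "v": "...-", "w": ".--", "x": "-..-",
    "y": "-.--", "z": "--..",
}


def morse_lines(text):
    """Return one line per letter: the letter, then its symbols separated by spaces."""
    lines = []
    for char in text.lower():
        if char in MORSE:  # characters outside a-z are skipped in this sketch
            lines.append(char + " " + " ".join(MORSE[char]))
    return lines


if __name__ == "__main__":
    print("\n".join(morse_lines("sos")))
```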
TimRepke / index.html
Last active Sep 4, 2015
ScholarlyArticle demo
<!DOCTYPE html>
<html lang="en">
<meta charset="UTF-8">
<title>RefMe Publications - Article viewer</title>
<div id="refme-cite-widget"></div>
<div itemscope itemtype="">
<strong>Title:</strong> <span itemprop="name">Reviewing the advantages of reference generators like RefME</span><br/>