Arne Neumann arne-cl

  • Potsdam
@strayer
strayer / youtube-dl-dash.bash
Last active March 24, 2021 18:00
youtube-dl wrapper script that downloads DASH video and audio and combines them with ffmpeg, with automatic best-format detection and a fallback to default youtube-dl behaviour for videos without DASH
#!/usr/bin/env bash
set -e
YOUTUBE_FORMATS=$(youtube-dl -F "$1")
if [[ "$YOUTUBE_FORMATS" == *"(DASH Video)"* ]]; then
VIDEO_NAME=$(youtube-dl --get-filename "$1")
VIDEO_NAME_TMP="$VIDEO_NAME.tmp"
echo "Filename: $VIDEO_NAME"
@syllog1sm
syllog1sm / gist:10343947
Last active November 7, 2023 13:09
A simple Python dependency parser
"""A simple implementation of a greedy transition-based parser. Released under BSD license."""
from os import path
import os
import sys
from collections import defaultdict
import random
import time
import pickle
SHIFT = 0
RIGHT = 1
LEFT = 2
@arne-cl
arne-cl / setup.py
Created May 4, 2014 13:02
minimal example setup.py for a single-module package
#!/usr/bin/env python
import sys
import os
try:
    from setuptools import setup
except ImportError:
    from distutils.core import setup
here = os.path.abspath(os.path.dirname(__file__))
@fginter
fginter / gist:2d4662faeef79acdb772
Last active August 31, 2020 06:55
Super-fast sort - uniq for ngram counting

The problem:

  • 1.3TB data with 5B lines in a 72GB .gz file
  • Need to sort the lines and get a count for each unique line, basically a sort | uniq -c
  • Have a machine with 24 cores, 128GB of memory, but not 1.3TB of free disk space
  • Solution: sort | uniq -c with lots of non-standard options and pigz to take care of compression

Here's the sort part; I used uniq as usual.

INPUT=$1

OUTPUT=${INPUT%.gz}.sorted.gz
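
For readers unfamiliar with the idiom, here is a minimal Python sketch of what `sort | uniq -c` computes. It is not the gist's method: it works entirely in memory and would not scale to the 1.3TB input described above. The function name `sort_uniq_count` is hypothetical and chosen only for illustration.

```python
from itertools import groupby


def sort_uniq_count(lines):
    """Return (count, line) pairs, mirroring `sort | uniq -c`."""
    # uniq only collapses *adjacent* duplicates, which is why the input
    # has to be sorted first; groupby has the same requirement.
    return [
        (sum(1 for _ in group), line)
        for line, group in groupby(sorted(lines))
    ]


if __name__ == "__main__":
    ngrams = ["b c", "a b", "b c", "a b", "b c"]
    for count, line in sort_uniq_count(ngrams):
        # Right-aligned count followed by the line, similar to uniq -c.
        print(f"{count:7d} {line}")
```

Python's `sorted` orders strings by code point, so the line order may differ from a locale-aware `sort`, but the counts per unique line are the same.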

@arne-cl
arne-cl / which.py
Created August 20, 2014 13:08
prints the install path of a given Python package
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Arne Neumann
#
# Purpose: prints the path where the given Python package is installed.
# This might be interesting if you're working with multiple environments
# and are unsure if/where a package was installed.
import os
import sys
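
The preview above stops at the imports. As a rough illustration of the idea (not the gist's actual code), one way to look up where a package is installed in current Python 3 is `importlib.util.find_spec`. The function name `package_path` is an assumption made for this sketch.

```python
import importlib.util


def package_path(name):
    """Return the install location of a top-level package or module.

    Returns None if the name cannot be found in the current environment.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages expose one or more directories; plain modules only
    # have an origin (the path of the .py file, or "built-in").
    if spec.submodule_search_locations:
        return list(spec.submodule_search_locations)[0]
    return spec.origin


if __name__ == "__main__":
    import sys

    for pkg in sys.argv[1:]:
        print(pkg, package_path(pkg))
```

Running it inside each environment in turn shows whether, and where, a given package resolves there.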
@kylebgorman
kylebgorman / treeify.py
Last active May 13, 2021 04:27
convert PTB-style parse tree (essentially an s-expression) to the format for LaTeX's `qtree`/`tikz-qtree` library
#!/usr/bin/env python
# treeify.py: convert PTB parse to LaTeX's `qtree`/`tikz-qtree` format
#
# NB: this only works for documents with a single tree, due to a limitation
# with `nltk.tree`.
import fileinput
from nltk import Tree
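
To show what the target format looks like, here is a minimal sketch that rewrites a bracketed PTB string into `qtree` bracket notation using plain string replacement. It is not the gist's implementation (which builds on `nltk.Tree`), it does no validation, and it assumes literal parentheses in the text are already encoded as `-LRB-`/`-RRB-`. The function name `ptb_to_qtree` is hypothetical.

```python
def ptb_to_qtree(ptb):
    """Convert a PTB-style bracketed parse into qtree notation."""
    # qtree labels a node as "[.LABEL" and needs whitespace before
    # each closing bracket, hence the " ]" replacement.
    return "\\Tree " + ptb.replace("(", "[.").replace(")", " ]")


if __name__ == "__main__":
    print(ptb_to_qtree("(S (NP I) (VP ran))"))
```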
@jpf
jpf / app.py
Created March 21, 2015 00:46
Example SAML SP using PySAML2. Can handle IdP-initiated requests and make SP-initiated (authn) requests
# -*- coding: utf-8 -*-
import logging
import os
import uuid
from flask import Flask
from flask import redirect
from flask import request
from flask import url_for
from flask_login import LoginManager  # flask.ext.* imports were removed in Flask 1.0
@ohanhi
ohanhi / frp.md
Last active May 6, 2024 05:17
Learning FP the hard way: Experiences on the Elm language

by Ossi Hanhinen, @ohanhi

with the support of Futurice 💚.

Licensed under CC BY 4.0.

Editorial note

@xiongchiamiov
xiongchiamiov / why.sh
Last active March 14, 2023 04:19
Use this when Amazon gives you an "Encoded authorization failure message" and you need to turn it into something readable. If you only get a request id... you're out of luck.
function decode-authorization-failure-message {
if [ $# -ne 1 ] || [ "$1" = -h ] || [ "$1" = --help ]; then
cat <<'EOT'
Usage: decode-authorization-failure-message <message>
Use this when Amazon gives you an "Encoded authorization failure message" and
you need to turn it into something readable.
EOT
return 1
fi
@Faheetah
Faheetah / Jenkinsfile.groovy
Last active May 21, 2024 02:11
Jenkinsfile idiosyncrasies with escaping and quotes
node {
echo 'Results included as an inline comment exactly how they are returned as of Jenkins 2.121, with $BUILD_NUMBER = 1'
echo 'No quotes, pipeline command in single quotes'
sh 'echo $BUILD_NUMBER' // 1
echo 'Double quotes are silently dropped'
sh 'echo "$BUILD_NUMBER"' // 1
echo 'Even escaped with a single backslash they are dropped'
sh 'echo \"$BUILD_NUMBER\"' // 1
echo 'Using two backslashes, the quotes are preserved'
sh 'echo \\"$BUILD_NUMBER\\"' // "1"