cupdike / shErrorCode255Tip.txt
Created Mar 27, 2019
sh.ErrorReturnCode_255 using Python sh package
If you are trying to run a script like this
import sh
myScriptCommand = sh.Command("/path/to/script")
myScriptCommand("my arg")
and you see this error:
cupdike / gist:c5554233e1dd6b233a9b6ec6adb05c5a
Created Nov 1, 2018
Python function to round down minutes to a user specified resolution
from datetime import datetime, timedelta
def round_minutes(dt, resolutionInMinutes):
    """round_minutes(datetime, resolutionInMinutes) => datetime rounded to lower interval
    Works for minute resolution up to a day (e.g. cannot round to nearest week).
    """
    # First zero out seconds and micros
    dtTrunc = dt.replace(second=0, microsecond=0)
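The preview above stops after the truncation step. As a minimal sketch of how the rounding itself could work (the name `round_minutes_sketch` and the modulo-on-minutes-since-midnight approach are assumptions, not the gist's actual remaining code):

```python
from datetime import datetime, timedelta


def round_minutes_sketch(dt, resolution_in_minutes):
    """Round dt down to the previous multiple of resolution_in_minutes.

    Hypothetical completion: multiples are counted from midnight, so this
    only makes sense for resolutions of up to one day.
    """
    # Zero out seconds and microseconds, as in the original preview
    dt_trunc = dt.replace(second=0, microsecond=0)
    # Minutes elapsed since midnight of the same day
    minutes_since_midnight = dt_trunc.hour * 60 + dt_trunc.minute
    # Minutes past the most recent interval boundary
    excess = minutes_since_midnight % resolution_in_minutes
    return dt_trunc - timedelta(minutes=excess)
```

For example, under these assumptions 17:52:05 rounded with a 15-minute resolution gives 17:45:00 on the same date.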
cupdike /
Created Sep 20, 2018
Use Airflow's ORM to delete all DagRuns. Could also use sqlalchemy filtering if desired. This was with Airflow 1.8.
from airflow.models import DagRun
from sqlalchemy import *
from airflow import settings
session = settings.Session()
cupdike / ConnectionSetup.txt
Last active Jun 14, 2018
Airflow Connection to Remote Kerberized Hive Metastore
# Let's say this is your kerberos ticket (likely from a keytab used for the remote service):
Ticket cache: FILE:/tmp/airflow_krb5_ccache
Default principal: hive/myserver.myrealm@myrealm
Valid starting       Expires              Service principal
06/14/2018 17:52:05  06/15/2018 17:49:35  krbtgt/myrealm@myrealm
        renew until 06/17/2018 05:49:33
cupdike / AirflowBeelineConnectionSample
Created Jun 13, 2018
Airflow Beeline Connection Using Kerberos via CLI
### There aren't many good examples of how to do this when also using kerberos
(venv) [airflow@cray01 dags]$ airflow connections --add \
    --conn_id beeline_hive \
    --conn_type 'beeline' \
    --conn_host '' \
    --conn_port 10000 \
    --conn_extra '{"use_beeline": true, "auth":"kerberos;principal=mysvcname/myservicehost@MYDOMAIN.COM;"}'
### Then, a sample DAG to use it
cupdike / BeelineJarDependencyFinder
Created Jul 12, 2017
Bash commands that will provide the list of jars needed to run beeline without installing hive
# If you want to run Beeline without installing Hive...
# This will help you find the jars that you need:
# Ref:
# Turn on verbose classloading
$ export _JAVA_OPTIONS=-verbose:class
# Run beeline and process out the needed jars.
# Below assumes the hadoop jars are under a 'cloudera' path (adjust accordingly)
$ /usr/bin/beeline | tr '[' '\n' | tr ']' ' ' | grep jar | grep cloudera | grep -v checksum | awk '{last=split($0,a,"/"); print a[last]}' | sort | uniq
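For readers who would rather post-process saved output than chain shell tools, the same filtering can be sketched in Python. The function name `jars_from_verbose_output`, the `path_filter` parameter, and the sample log lines are all made up for illustration; real `-verbose:class` output varies by JVM version.

```python
def jars_from_verbose_output(text, path_filter="cloudera"):
    """Mirror the tr/grep/awk/sort/uniq pipeline on captured JVM output."""
    names = set()
    # tr '[' '\n' | tr ']' ' '
    for line in text.replace("[", "\n").replace("]", " ").splitlines():
        # grep jar | grep cloudera | grep -v checksum
        if "jar" not in line or path_filter not in line or "checksum" in line:
            continue
        # awk: keep only the last '/'-separated component (whitespace trimmed here)
        names.add(line.split("/")[-1].strip())
    # sort | uniq
    return sorted(names)


# Hypothetical sample resembling Java 8 class-loading lines (paths are invented)
sample = (
    "[Loaded org.apache.hive.beeline.BeeLine from file:/opt/cloudera/parcels/CDH/jars/hive-beeline.jar]\n"
    "[Loaded org.apache.hadoop.conf.Configuration from file:/opt/cloudera/parcels/CDH/jars/hadoop-common.jar]\n"
    "[Loaded org.apache.hive.beeline.Commands from file:/opt/cloudera/parcels/CDH/jars/hive-beeline.jar]\n"
    "[Loaded java.lang.Object from /usr/java/jre/lib/rt.jar]\n"
)
print(jars_from_verbose_output(sample))
```

Unlike the awk step, this sketch trims trailing whitespace from each jar name.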
cupdike /
Created Oct 6, 2015
Polls a file hosted at a URL and downloads it initially and if it changes.
"""Polls a file hosted at a URL and downloads it initially and if it changes."""
# Should be fairly robust to web server issues (in fact, it would only
# be a handful of lines were it not for error handling)
import requests
import time
import sys
FILE_URL = "http://<mywebserver>/<myfile>"
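The preview ends before the polling logic, so how the gist detects a change is not visible here. One possible approach, sketched below, is to hash the downloaded bytes and compare against the previous hash. The name `download_if_changed` and the injected `fetch` callable are assumptions for illustration, and content hashing is a substitute technique, not necessarily what the gist does.

```python
import hashlib


def download_if_changed(fetch, last_digest):
    """Call fetch() once and report whether the content is new.

    fetch: hypothetical zero-argument callable returning the file as bytes.
    last_digest: SHA-256 hex digest from the previous poll, or None initially.
    Returns (digest, content) when the content is new or changed,
    otherwise (last_digest, None).
    """
    try:
        content = fetch()
    except Exception:
        # Treat any fetch failure as "no update" and retry on the next poll
        return last_digest, None
    digest = hashlib.sha256(content).hexdigest()
    if digest == last_digest:
        return last_digest, None
    return digest, content
```

With `requests`, `fetch` could be something like `lambda: requests.get(FILE_URL, timeout=10).content`, called in a loop with `time.sleep` between polls.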
# Inspired by:
def quicksort(l):
    if len(l) < 2:
        return l
    iSwap = 1
    pivot = l[0]  # left most value is the pivot
    for i, val in enumerate(l[1:], start=1):  # Skip the pivot cell
        if val < pivot:
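The preview is cut off inside the partition loop. A plausible completion, written as a separate function so it is not mistaken for the gist's own code, is sketched below. The swap step, the final pivot placement, and the recursive return are assumptions, and `quicksort_sketch` is a hypothetical name.

```python
def quicksort_sketch(l):
    """Sort a list with leftmost-pivot partitioning, then recurse on each side."""
    if len(l) < 2:
        return l
    i_swap = 1
    pivot = l[0]  # leftmost value is the pivot
    for i, val in enumerate(l[1:], start=1):  # skip the pivot cell
        if val < pivot:
            # Move the smaller value into the "less than pivot" region
            l[i], l[i_swap] = l[i_swap], l[i]
            i_swap += 1
    # Place the pivot between the smaller and the larger-or-equal values
    l[0], l[i_swap - 1] = l[i_swap - 1], l[0]
    return quicksort_sketch(l[:i_swap - 1]) + [pivot] + quicksort_sketch(l[i_swap:])
```

Note that this version reorders its input list while partitioning and returns a new list built from slices, so callers should use the return value.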