@cupdike
cupdike / InstallCaCerts.sh
Created February 3, 2023 15:07
Install SSL Proxy Certs Into Finch VM to Fix 'x509: certificate signed by unknown authority'
##### Scenario:
# You are getting 'x509: certificate signed by unknown authority' trying to
# run a simple finch container on your mac.
# $ finch run --rm public.ecr.aws/finch/hello-finch
# public.ecr.aws/finch/hello-finch:latest: resolving |--------------------------------------|
# elapsed: 0.1 s total: 0.0 B (0.0 B/s)
# INFO[0000] trying next host error="failed to do request: Head \"https://public.ecr.aws/v2/finch/hello-finch/manifests/latest\": x509: certificate signed by unknown authority" host=public.ecr.aws
# FATA[0000] failed to resolve reference "public.ecr.aws/finch/hello-finch:latest": failed to do request: Head "https://public.ecr.aws/v2/finch/hello-finch/manifests/latest": x509: certificate signed by unknown authority
# FATA[0000] exit status 1
cupdike / gist.customattributes.py
Created January 10, 2023 16:30
Multiprocessing Pool Using Process Subclass with Custom Attributes
import multiprocessing as mp
from multiprocessing.pool import Pool
# GOAL IN CONTEXT:
# Simulate using a multiprocessing pool to download a list of files synchronously
# from a set of servers where each worker in the pool targets a specific
# download server.
# Our Worker subclasses Process so the target server can be added as an attribute.
# A CustPool subclasses Pool so our Worker subclass is used instead of Process.
cupdike / zshrc_et_al
Created July 8, 2022 17:27
Python Disposable Virtual Environment Using Venv Package
function venvtemp {
# Inspired by: https://gist.github.com/csinchok/9714005#file-bash_profile
THROWAWAY_DIR=$(mktemp -d -t venv);
cd "$THROWAWAY_DIR";
python3 -m venv venv;
source venv/bin/activate;
printf "\n\nMaybe run:\n\tpython -m pip install --upgrade pip wheel setuptools\n\n"
printf "Consider:\n\texport PIP_DEFAULT_TIMEOUT=100 && python -m pip install <PACKAGES> \\\t

Set up a Dev Kubeflow Environment Using Kubeflow Manifests, Kustomize and Rancher Desktop

Rancher Desktop provides a solid Kubernetes cluster platform for developer workstations. Here, we'll use it to install Kubeflow via the Manifests project with Kustomize. For the best experience, use a release version and follow the README guidance specific to that version (including the prerequisite versions of Kustomize and Kubeflow).

Steps

cupdike / JupyterShellExperiments.txt
Last active December 3, 2021 19:17
Demystifying Jupyter Shell Variable Substitution SEO: magic %%bash %%sh bang ! OS
# TL;DR
# There is some tricky behavior lurking underneath Jupyter's shell variable substitution syntax.
# For the best experience, stick with single quotes with {} variable placeholders, e.g.:
python_variable='blah'
!echo somebashcommand '{python_variable} some extra gobblygook'
### Lessons Learned
# 1) Failed substitutions (including unintentional variables) cause silent failures preventing any substitution.
# 1.1) If your substitution isn't working:
# 1.1.1) Prefix with echo, and strip your statement down until you find the part that isn't working
cupdike / gist:2d3ce5b3aa31a77f6b27d400d7c531b9
Created March 27, 2020 14:24
Python string.partition() example
# Demonstrates string.partition() to split a string by a sequence of delimiters.
# Not terribly useful, can do with regex pretty easily.
s = "apple AND banana AND cherry AND date OR elderberry BUT fig"
delims = [" AND "]*3 + [" OR ", " BUT "]
# [' AND ', ' AND ', ' AND ', ' OR ', ' BUT ']
def splitByDelimList(str, delimList):
    delims = delimList.copy()
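The preview above stops before the function body. As a minimal sketch of the idea (not the author's implementation), splitting on a sequence of delimiters with `str.partition()` could look like the following. The name `split_by_delims` and the choice to stop at the first missing delimiter are assumptions made for illustration.

```python
def split_by_delims(text, delims):
    """Split text on each delimiter in order, consuming one delimiter per step.

    Hypothetical helper: stops early if a delimiter is not found.
    """
    parts = []
    rest = text
    for delim in delims:
        head, sep, rest = rest.partition(delim)
        parts.append(head)
        if not sep:
            # Delimiter not found: head already holds the remaining text.
            return parts
    parts.append(rest)
    return parts


s = "apple AND banana AND cherry AND date OR elderberry BUT fig"
delims = [" AND "] * 3 + [" OR ", " BUT "]
print(split_by_delims(s, delims))
# ['apple', 'banana', 'cherry', 'date', 'elderberry', 'fig']
```

`partition()` returns a `(head, separator, tail)` tuple, so an empty separator is a reliable signal that the delimiter was absent.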
cupdike / pyarrowKerberizedHdfsDebugger.py
Created December 12, 2019 17:16
Helps debug connecting Pyarrow to Kerberized HDFS. Took a bit of doing to get it working and the guidance found on the web isn't always helpful. Useful error messages aren't always bubbling out from the driver. This will let you experiment with drivers, LIBJVM_PATH, LD_LIBRARY_PATH, CLASSPATH, HADOOP_HOME.
import pyarrow
import os
import sh
# Get obscure error without this: pyarrow.lib.ArrowIOError: HDFS list directory failed, errno: 2 (No such file or directory)
os.environ['CLASSPATH'] = str(sh.hadoop('classpath','--glob'))
# Not needed
#os.environ['HADOOP_HOME'] = '/opt/cloudera/parcels/CDH-<your version>/'
cupdike / CombiningPythonGenerators.txt
Created October 17, 2019 14:30
Combine Python Generators Into One Generator
>>> def genX():
...     for i in range(3):
...         yield i
...
>>> for i in genX(): print(i)
...
0
1
2
>>> def genY():
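The session above is cut off at `genY`. A minimal sketch of two common ways to combine generators, assuming `genY` is a second simple generator (its body here is hypothetical, as are the snake_case names):

```python
from itertools import chain


def gen_x():
    for i in range(3):
        yield i


def gen_y():
    # Hypothetical body: the original genY is not shown in the preview.
    for i in range(10, 13):
        yield i


def combined():
    # Delegate to each generator in turn.
    yield from gen_x()
    yield from gen_y()


print(list(chain(gen_x(), gen_y())))  # [0, 1, 2, 10, 11, 12]
print(list(combined()))               # [0, 1, 2, 10, 11, 12]
```

`itertools.chain` is convenient for a one-off combination; a wrapper using `yield from` is useful when the combined sequence needs its own name or extra logic between sources.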
cupdike / shErrorCode255Tip.txt
Created March 27, 2019 21:15
sh.ErrorReturnCode_255 using Python sh package
If you are trying to run a script like this
import sh
myScriptCommand = sh.Command("/path/to/script")
myScriptCommand("my arg")
and you see this error:
sh.ErrorReturnCode_255
cupdike / gist:c5554233e1dd6b233a9b6ec6adb05c5a
Created November 1, 2018 20:59
Python function to round down minutes to a user specified resolution
from datetime import datetime, timedelta
def round_minutes(dt, resolutionInMinutes):
    """round_minutes(datetime, resolutionInMinutes) => datetime rounded to lower interval
    Works for minute resolution up to a day (e.g. cannot round to nearest week).
    """
    # First zero out seconds and micros
    dtTrunc = dt.replace(second=0, microsecond=0)
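The rest of the function is not shown in the preview. One possible way to finish the rounding, offered as a sketch rather than the author's code, is to floor the minutes elapsed since midnight to a multiple of the resolution. The name `round_minutes_down` is an assumption chosen to avoid implying this is the original body.

```python
from datetime import datetime


def round_minutes_down(dt, resolution_in_minutes):
    """Round dt down to the nearest multiple of resolution_in_minutes.

    Hypothetical completion; valid for resolutions from 1 minute up to a day (1440).
    """
    # First zero out seconds and microseconds.
    dt_trunc = dt.replace(second=0, microsecond=0)
    # Floor the minutes elapsed since midnight to the requested resolution.
    minutes_since_midnight = dt_trunc.hour * 60 + dt_trunc.minute
    floored = (minutes_since_midnight // resolution_in_minutes) * resolution_in_minutes
    return dt_trunc.replace(hour=floored // 60, minute=floored % 60)


print(round_minutes_down(datetime(2018, 11, 1, 20, 59, 42), 15))
# 2018-11-01 20:45:00
```

Working in minutes since midnight is what lets resolutions larger than an hour (for example 90 or 720 minutes) work without special cases.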