Malcolm Greaves malcolmgreaves

@malcolmgreaves
malcolmgreaves / local_temp_dir_rm_exit_trap.sh
Last active April 12, 2024 17:59
Reusable bash functions for creating a local temporary directory with rm exit trap.
#!/usr/bin/env bash
set -euo pipefail
####################################################################
#
# Reusable functions for creating a local temporary directory:
# - [mk_tmp_dir] create local directory with unique name
# - [cleanup] add exit trap to rm this directory
#!/usr/bin/env bash
apt update
#
# install lmdb
#
apt install -y liblmdb-dev
LMDB_FORCE_SYSTEM=1 LMDB_FORCE_CFFI=1 pip install cffi
malcolmgreaves / conda.Dockerfile
Last active April 5, 2024 19:18
A Dockerfile that installs conda. Uses a GPU-enabled image (with CUDA) as a base, but miniconda install & setup is portable.
# syntax=docker/dockerfile:1.3
ARG UBUNTU_VERSION=18.04
ARG CUDA_VERSION=11.3.1
# Or use a different image.
FROM nvidia/cuda:${CUDA_VERSION}-cudnn8-devel-ubuntu${UBUNTU_VERSION}
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# #
# system packages #
malcolmgreaves / bug_pandas_map_mangles_column_datatype.py
Created February 13, 2024 19:35
Demonstration showing a bug in Pandas: it automatically converts datetime columns into a different pandas-specific type, even when the original column has `dtype=object`.
from datetime import datetime, timedelta
import pandas as pd
now = datetime.now()
df = pd.DataFrame.from_dict(
{
"created_at": pd.Series([now, now - timedelta(seconds=100), now + timedelta(seconds=10)], dtype='object'),
}
malcolmgreaves / pandas_required_columns.py
Last active February 10, 2024 02:18
Conceptual framework for writing Pandas DataFrame code where required columns are not only documented, but parameterized. This establishes an interface between the name of a column in code vs. its name in the data.
from abc import ABC
from dataclasses import dataclass
from typing import List, NamedTuple, Sequence, Type, TypeVar
import pandas as pd
__all__: Sequence[str] = (
# main abstraction & utilities for columns required in a dataframe
"Columns",
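The preview above is cut off, so the following is only a minimal sketch of the idea stated in the description (separating a column's name in code from its name in the data), not the gist's actual implementation. The `EventColumns` class, its fields, and the `required`/`missing` helpers are hypothetical names chosen for illustration. The check takes any iterable of column names, so with pandas one would presumably pass `df.columns`.

```python
from dataclasses import dataclass, fields
from typing import Iterable, List


@dataclass(frozen=True)
class EventColumns:
    """Hypothetical example. Each field name is the name used in code;
    each field value is the column's actual name in the data."""

    created_at: str = "created_at"
    user_id: str = "user_id"

    def required(self) -> List[str]:
        # The data-side names of every required column, in field order.
        return [getattr(self, f.name) for f in fields(self)]

    def missing(self, available: Iterable[str]) -> List[str]:
        # Required data-side names that are absent from `available`.
        present = set(available)
        return [name for name in self.required() if name not in present]


# The code always refers to `cols.created_at`, whatever the data calls it.
cols = EventColumns(created_at="ts")
problems = cols.missing(["ts", "user_id", "extra"])  # nothing missing here
```

Because the mapping lives in one frozen object, renaming a column in the data means changing a constructor argument rather than every string literal that touches the frame.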
malcolmgreaves / env_var_secret.Dockerfile
Created January 11, 2024 21:35
Example passing a secret value via an env var to a docker build.
# Run this example:
#
# mysecret=SECRET_VALUE docker build --secret id=mysecret,env=mysecret -f Dockerfile -t deleteme .
#
FROM debian:trixie-slim
RUN <<EOF cat >> file
#!/bin/bash
if [[ -z "\${MYSECRET}" ]]; then
echo "No MYSECRET env var!!!"
malcolmgreaves / git-largest-files
Last active January 10, 2024 12:44 — forked from nk9/largestFiles.py
Python script to find the largest files in a git repository.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Updated to use Python 3 by Malcolm Greaves.
#
# Python script to find the largest files in a git repository.
# The general method is based on the script in this blog post:
# http://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/
#
# The above script worked for me, but was very slow on my 11GB repository. This version has a bunch
malcolmgreaves / requirements.txt--pyproject.toml
Created January 9, 2024 22:08
A pyproject.toml that uses setuptools & gets `dependencies` dynamically from a requirements.txt file.
[build-system]
requires = ["setuptools", "wheel", "setuptools_scm"]
build-backend = "setuptools.build_meta"
[project]
name = "mypackage"
requires-python = ">=3.10"
dynamic = ["dependencies"]
[tool.setuptools.dynamic]
malcolmgreaves / Dockerfile--cuda_117-torch_113-geometric_204
Last active January 5, 2024 20:58
Dockerfile based on Ubuntu 22.04 that has CUDA 11.7 dev libraries & drivers installed alongside PyTorch 1.13 and Torch-Geometric 2.0.4 libraries.
FROM nvidia/cuda:11.7.1-devel-ubuntu22.04
RUN DEBIAN_FRONTEND=noninteractive apt-get update && \
apt-get install -y software-properties-common && \
add-apt-repository -y ppa:deadsnakes/ppa && \
apt-get install -y \
python3-setuptools python3-dev swig \
wget git unzip tmux vim tree xterm \
build-essential gcc \
malcolmgreaves / testing_args_easier_debug_messages.py
Last active November 28, 2023 20:02
Exploring patterns for validating function arguments in Python.
"""
$ python testing_args_easier_debug_messages.py
Hello world, I can't believe you've have 42 birthdays! I hope you find time for crafting soon!
Hello universe, I can't believe you've have 117.0 birthdays! I hope you find time for crafting soon!
ValueError: Need positive numbers, not: age=whoops
ValueError: Need positive numbers, not: age=-1
ValueError: Need non-empty strings, not: name=
ValueError: Need non-empty strings, not: hobby=None
"""
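Only the expected output of this gist is visible, so here is one possible sketch of a validation pattern that would yield error messages in the same shape. It is not the author's code: `check_positive`, `check_non_empty`, and `greet` are hypothetical names, and the greeting text is deliberately different from the original. The technique assumed here is passing arguments as keywords so each error message can report the offending name alongside its value.

```python
from numbers import Real


def check_positive(**kwargs: object) -> None:
    """Raise ValueError listing every keyword whose value is not a positive number."""
    bad = {
        name: value
        for name, value in kwargs.items()
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, Real) or value <= 0
    }
    if bad:
        listing = ", ".join(f"{name}={value}" for name, value in bad.items())
        raise ValueError(f"Need positive numbers, not: {listing}")


def check_non_empty(**kwargs: object) -> None:
    """Raise ValueError listing every keyword whose value is not a non-empty string."""
    bad = {
        name: value
        for name, value in kwargs.items()
        if not isinstance(value, str) or len(value) == 0
    }
    if bad:
        listing = ", ".join(f"{name}={value}" for name, value in bad.items())
        raise ValueError(f"Need non-empty strings, not: {listing}")


def greet(name: object, age: object, hobby: object) -> str:
    # Validate up front; the keyword names end up in the error message.
    check_non_empty(name=name, hobby=hobby)
    check_positive(age=age)
    return f"Hello {name}, {age} birthdays already! Enjoy some {hobby} soon."
```

Calling `greet("world", "whoops", "crafting")` raises a `ValueError` whose message names `age`, which is the debugging convenience the gist title points at: the caller sees which argument failed without reading a traceback.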