Skip to content

Instantly share code, notes, and snippets.

View pmbaumgartner's full-sized avatar

Peter Baumgartner pmbaumgartner

View GitHub Profile
@pmbaumgartner
pmbaumgartner / cloud-init.yaml
Last active January 23, 2025 13:36
Multipass & Docker Setup
#cloud-config
package_upgrade: true
ssh_authorized_keys:
- <your key>
packages:
- apt-transport-https
- ca-certificates
- curl
@pmbaumgartner
pmbaumgartner / conda-pack-win.md
Last active January 7, 2025 13:54
Conda-Pack Windows Instructions

Packing Conda Environments

You must be using conda for this approach. You will need conda installed on the Source machine and the Target machine. The Source machine must have an internet connection, the Target does not. The OS in both environments must match; no going from macOS to Win10 for example.

1. (Source) Install conda-pack in your base python environment.

conda install -c conda-forge conda-pack
RELEASE="15.2.0"
mkdir /tmp/iosevka-font/v$RELEASE
cd /tmp/iosevka-font/v$RELEASE
wget https://github.com/be5invis/Iosevka/releases/download/v$RELEASE/ttf-iosevka-$RELEASE.zip
wget https://github.com/be5invis/Iosevka/releases/download/v$RELEASE/ttf-iosevka-aile-$RELEASE.zip
wget https://github.com/be5invis/Iosevka/releases/download/v$RELEASE/ttf-iosevka-curly-$RELEASE.zip
wget https://github.com/be5invis/Iosevka/releases/download/v$RELEASE/ttf-iosevka-curly-slab-$RELEASE.zip
wget https://github.com/be5invis/Iosevka/releases/download/v$RELEASE/ttf-iosevka-etoile-$RELEASE.zip
wget https://github.com/be5invis/Iosevka/releases/download/v$RELEASE/ttf-iosevka-fixed-$RELEASE.zip
@pmbaumgartner
pmbaumgartner / softie.py
Last active July 12, 2024 13:50
Create a soft label classifier from any scikit-learn regressor object
from sklearn.base import BaseEstimator, ClassifierMixin
from scipy.special import expit, logit
class SoftLabelClassifier(BaseEstimator, ClassifierMixin):
def __init__(self, regressor, eps=0.001):
self.regressor = regressor
self.eps = eps
def fit(self, X, y=None):
@pmbaumgartner
pmbaumgartner / digit_ngram_suggester.py
Last active July 1, 2024 16:08
A span candidate suggester function for spaCy that suggests spans containing a digit.
from typing import Optional, Iterable, cast, List
from thinc.api import get_current_ops, Ops
from thinc.types import Ragged, Ints1d
from spacy.pipeline.spancat import Suggester
from spacy.tokens import Doc
from spacy.util import registry
@registry.misc("ngram_digits_suggester.v1")
@pmbaumgartner
pmbaumgartner / _run_api.sh
Last active January 6, 2024 21:17
Mistal w/ vLLM. Run w/ a RTX 3090
# https://docs.mistral.ai/self-deployment/vllm/
export HF_TOKEN=<Huggingface Token>
docker run --gpus all \
-e HF_TOKEN=$HF_TOKEN -p 8000:8000 \
ghcr.io/mistralai/mistral-src/vllm:latest \
--host 0.0.0.0 \
--model mistralai/Mistral-7B-Instruct-v0.2
@pmbaumgartner
pmbaumgartner / ndict.py
Created February 21, 2023 20:27
create a dictionary based off of string variable names. based on rust struct shorthand init.
def ndict(*args):
"""Return a dictionary with the given keys and their values from globals."""
g = globals()
result = {}
for arg in args:
if arg in g:
result[arg] = g[arg]
else:
raise KeyError(f"Key '{arg}' not found in globals")
return result
@pmbaumgartner
pmbaumgartner / docx-cli-search.md
Created July 19, 2021 15:17
Search the contents of Word docs via CLI

Search Contents of Word Documents from the Terminal

You'll need ripgrep and pandoc to get started. You can read more about ripgrep here and pandoc here. I use both of these frequently and they're quite helpful.

You can install them both with homebrew:

brew install pandoc ripgrep
@pmbaumgartner
pmbaumgartner / performance_test.py
Last active November 10, 2022 17:24
Django/Postgres Data Load - Performance Comparison
from contextlib import closing
from io import StringIO
from time import time
import pandas as pd
from django.core.management.base import BaseCommand
from django.db import transaction
from faker import Faker
from core.models import Thing
from typing import TypedDict
DisplacyDepsWords = TypedDict(
"DisplacyDepsWords", {"text": str, "tag": str, "lemma": Optional[str]}
)
DisplacyDepsArcs = TypedDict(
"DisplacyDepsArcs", {"start": int, "end": int, "label": str, "dir": str}
)
DisplacyDepsData = TypedDict(
"DisplacyDepsData",