Stuart Axelbrooke (soaxelbrooke)
📈 Text ⇨ Understanding
soaxelbrooke / custom_sql_query_postgres.md
Last active March 18, 2024 14:35
Custom SQL Query Execution in PostgREST

PostgREST doesn't let you execute arbitrary queries, but you can work around this by defining a stored function that executes the query for you:

$ psql mydb
mydb=# create function custom_query(query text) returns setof json as $f$
    begin
        -- Run the caller's SQL inside a CTE and emit each result row as JSON.
        return query execute format('with tmp as (%s) select row_to_json(tmp.*) from tmp;', query);
    end
$f$ language plpgsql;
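
PostgREST exposes such functions as RPC endpoints at /rpc/<name>, so you can then POST the query text as a JSON argument. A minimal sketch in Python, assuming PostgREST runs on its default port 3000 and that a table like my_table exists (both are assumptions, not from the gist):

import requests

# Hypothetical endpoint and table; adjust host, port, and query to your setup.
resp = requests.post(
    "http://localhost:3000/rpc/custom_query",
    json={"query": "select id, name from my_table limit 10"},
)
rows = resp.json()  # a JSON array: one object per row, built by row_to_json

Note that this deliberately reintroduces arbitrary SQL execution, so restrict who can call the function with roles and grants.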
soaxelbrooke / example_file_input.rs
Created August 23, 2020 04:05
Example Handling Input Files With Yew
use yew::events::ChangeData;
use yew::web_sys::File;
use yew::prelude::*;

pub struct MyFileInput {
    link: ComponentLink<Self>, // lets the component send messages to itself
    file: Option<File>,        // the most recently chosen file, if any
}

pub enum Msg {
soaxelbrooke / main.py
Created August 7, 2023 01:31
Reading/Querying Parquet Datasets from Self-Hosted S3-Compatible Block Storage with s3fs + PyArrow + Polars
# Having already:
# export AWS_ACCESS_KEY_ID=youraccesskey
# export AWS_SECRET_ACCESS_KEY=yoursecretkey
import pyarrow.dataset as ds
import polars as pl
import s3fs
S3_ENDPOINT = "http://your.s3.endpoint:3900"
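
The preview cuts off here; the following is a sketch of how these pieces typically fit together, with bucket, path, and column names that are illustrative rather than taken from the gist:

# Point s3fs at the self-hosted endpoint; credentials come from the env vars above.
fs = s3fs.S3FileSystem(client_kwargs={"endpoint_url": S3_ENDPOINT})

# Open the Parquet dataset in place, without downloading it wholesale...
dataset = ds.dataset("my-bucket/my-dataset/", filesystem=fs, format="parquet")

# ...then query it lazily with Polars, materializing only what the query needs.
lazy = pl.scan_pyarrow_dataset(dataset)
print(lazy.filter(pl.col("year") == 2023).collect())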
soaxelbrooke / wvsqlite.py
Last active March 1, 2023 09:37
Script for converting txt word embedding files to SQLite databases for fast embedding lookup.
#!/usr/bin/env python3.6
"""
Example usage:
$ python3.6 wvsqlite.py glove.840B.300d.txt
Produces an SQLite database with byte strings of floats for each word vector, indexed by
token for fast lookup when the working vocab is much smaller than the embedding vocab (aka most real vocabs).
Float size can be set via FLOAT_BYTES env var, and can be 4 or 8, and LIMIT can be set to take
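
The core idea is packing each vector into a compact byte string keyed by its token. Here's a minimal sketch of that storage/lookup scheme, assuming 4-byte floats; the table and column names are illustrative, not necessarily those used by wvsqlite.py:

import sqlite3
import struct

db = sqlite3.connect("vectors.sqlite")
db.execute("create table if not exists vectors (token text primary key, vec blob)")

vec = [0.1, 0.2, 0.3]
packed = struct.pack("%df" % len(vec), *vec)  # floats -> compact byte string
db.execute("insert or replace into vectors values (?, ?)", ("hello", packed))
db.commit()

# Lookup is a single indexed fetch, then unpack the bytes back into floats.
blob = db.execute("select vec from vectors where token = ?", ("hello",)).fetchone()[0]
print(struct.unpack("%df" % (len(blob) // 4), blob))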
soaxelbrooke / adding-tailscale-to-edgerouter.md
Created January 9, 2023 18:14 — forked from lg/adding-tailscale-to-edgerouter.md
Add Tailscale to an EdgeRouter and survive system upgrades

Adding tailscale to an EdgeRouter (and surviving system upgrades)

I suggest you run `sudo bash` before all of these so you're the root user.

Installing

  1. Download tailscale and put the files in /config/. Find the latest stable or unstable version for your EdgeRouter's processor (e.g. the ER4 is mips and the ERX is mipsle)
sudo bash    # if you haven't already
soaxelbrooke / systemd-talk.md
Last active February 13, 2020 00:04
Instructions for setting up a systemd service!

systemd Talk

First, let's make ourselves a simple Python web server with Flask:

import os

from flask import Flask

app = Flask(__name__)
PORT = int(os.getenv('FLASK_PORT', 5000))
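
The preview stops before the unit file itself, so here is a minimal sketch of the kind of service unit such a setup might use. The service name, paths, and user are assumptions, not taken from the talk:

# /etc/systemd/system/myapp.service (hypothetical name and paths)
[Unit]
Description=Simple Flask web server
After=network.target

[Service]
# Assumes the app above was saved as /opt/myapp/app.py and calls app.run(port=PORT).
ExecStart=/usr/bin/python3 /opt/myapp/app.py
Environment=FLASK_PORT=5000
Restart=on-failure
User=www-data

[Install]
WantedBy=multi-user.target

With that in place, `systemctl daemon-reload` followed by `systemctl enable --now myapp.service` starts the server and keeps it running across reboots.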
#!/usr/bin/env python3
import sqlite3
import csv
import sys

quantize = '--quantize' in sys.argv  # crude CLI flag parsing
soaxelbrooke / quickjest.js
Last active August 25, 2019 23:30
quickjest.js - A quickcheck-style property-based test wrapper for Jest
// A property-based test wrapper based on generators.
// See original Haskell quickcheck paper: http://www.cs.tufts.edu/~nr/cs257/archive/john-hughes/quick.pdf
// --------------------------- //
// Scroll to bottom for usage! //
// --------------------------- //
import R from 'ramda';
const RUNS_PER_TEST = 50;
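
For readers unfamiliar with the quickcheck approach: each test states a property that is checked against many randomly generated inputs rather than a few hand-picked cases. A language-agnostic sketch of the idea in Python (the names here are illustrative, not quickjest's API):

import random

RUNS_PER_TEST = 50

def for_all(gen, prop, runs=RUNS_PER_TEST):
    # Check that the property holds for `runs` randomly generated values.
    for _ in range(runs):
        value = gen()
        assert prop(value), "property failed for %r" % (value,)

# Example property: reversing a list twice yields the original list.
random_list = lambda: [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
for_all(random_list, lambda xs: list(reversed(list(reversed(xs)))) == xs)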
soaxelbrooke / ml_utils.py
Last active October 13, 2018 23:34 — forked from zmjjmz/ml_utils.py
regexp match lookup layer
import keras
import tensorflow
import numpy
import re
# The capturing group is important so each match can be left-padded with a space (the token splitter)
token_pattern = r"([\w']+|[,\.\?;\-\(\)])"
substitution = r" \1"
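
The pattern splits punctuation into its own tokens by padding every match with a leading space and then splitting on whitespace. A quick illustration (the sample sentence is mine, not from the gist):

import re

token_pattern = r"([\w']+|[,\.\?;\-\(\)])"
text = "Hello, world? It's fine."
tokens = re.sub(token_pattern, r" \1", text).split()
print(tokens)  # ['Hello', ',', 'world', '?', "It's", 'fine', '.']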
soaxelbrooke / parallel_word_frequency_count.sh
Last active September 2, 2018 22:16
Counts word frequencies in parallel, combining them.
# Need wf - install with `cargo install wf`
mkdir splits wfs
echo 'Splitting file into parts...'
split -a 5 -l 200000 "$1" splits/split
ls splits/ | parallel 'echo "Counting {}..."; cat splits/{} | wf > wfs/{}_wf.txt'
echo 'Combining split counts...'
python - <<'EOF'
from collections import Counter
from glob import glob
from tqdm import tqdm

# Sum the per-split counts into a single Counter.
wf = Counter()
for fpath in tqdm(glob("wfs/*")):
    for line in open(fpath):
        parts = line.strip().split()
        wf[parts[0]] += int(parts[1])

# Write tokens ordered by descending frequency.
with open("wfs.txt", "w") as of:
    for token, count in sorted(wf.items(), key=lambda p: -p[1]):
        of.write("{} {}\n".format(token, count))
EOF
rm -rf wfs splits
echo 'Word frequencies written to wfs.txt.'