@pryce-turner
pryce-turner / caching.py
Created November 29, 2023 02:06
Node caching client side implementation
import os
import shutil
import hashlib
from time import sleep
from typing import List
from random import randint
from pathlib import Path
from flytekit import task, workflow, dynamic
from flytekit.types.file import FlyteFile
from flytekitplugins.pod import Pod
pryce-turner / flyte_tests.md
Last active August 14, 2023 18:02
Testing Tasks in Flyte

Motivation

There are a million ways to test code (and a million names to call those test types, often overlapping). Coacervate has come a long way and been through a few rewrites (when does a project earn a version number??) since I wrote my first tests. After settling on Flyte and modularizing things properly instead of cramming everything into a hacky Docker image, it came time to write my first tests... again... hopefully for the last time.

Flyte offers a few utilities to make mocking external services more ergonomic. There are also some great examples of unit tests in Flytekit's own repo that leverage the ubiquitous pytest framework. While these are great for writing

pryce-turner / filebased_wf.md
Last active February 21, 2024 20:29
Processing Files with Dynamic Container Tasks in Flyte

Motivation

I'm still very new to Flyte but it's quickly becoming one of my favorite tools. Between the modern architectural decisions and a welcoming and extremely helpful community, what's not to love?

Some of the abstractions, however, do require a slight shift in assumptions compared to more localized workflow frameworks. Additionally, while the documentation is excellent at explaining Flyte's different features in isolation, there aren't as many posts on how they work together as there are for more established projects. This short piece is my attempt to capture my journey so far with a rather specific use case that ties together a number of notable features. This is mostly to solidify my own understanding, but hopefully also a way to pay forward all the wonderful support I've received from the team.

Setup

  • Everything you need to run the code below is captured in the [Envi
pryce-turner / zfs_full_backup_send.sh
Created March 12, 2023 07:41
ZFS Deep Archive Backup
#!/bin/bash
# Take a snapshot of a given pool/dataset and send a replication stream to S3 Glacier Deep Archive
# AWS CLI must be configured with credentials granting access to the specified bucket
# NOTES:
# - Will only work for datasets that fit within the current S3 object size limit,
# currently 5TB.
# - Deep Archive requires objects be stored for 180 days. Recommend setting a lifecycle
# policy on the bucket to expire objects older than that.
# - Recommend running script as a cronjob with appropriate interval, at least monthly.
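
As a rough illustration of the flow described in the header comments above, here is a minimal Python sketch that only builds the two shell commands rather than running them. The function name, the object key layout, and the `.zfs` suffix are hypothetical, and the flags shown (`zfs send -R`, `aws s3 cp -` with `--storage-class DEEP_ARCHIVE` and `--expected-size`) are my assumption of a reasonable invocation, not necessarily what the script itself uses.

```python
import shlex
from datetime import date
from typing import List

# The notes above cite a 5TB object limit; decimal terabytes are assumed here
# as the more conservative reading.
S3_MAX_OBJECT_BYTES = 5 * 1000**4


def build_backup_commands(
    dataset: str, bucket: str, expected_size: int, today: date
) -> List[str]:
    """Return the shell commands for one full backup, without executing them."""
    if expected_size > S3_MAX_OBJECT_BYTES:
        raise ValueError("replication stream would exceed the S3 object size limit")

    # Hypothetical naming scheme: snapshot named after the date it was taken.
    snapshot = f"{dataset}@{today.isoformat()}"
    # Hypothetical key layout: slashes in the dataset name flattened to underscores.
    key = f"s3://{bucket}/{snapshot.replace('/', '_')}.zfs"

    snap_cmd = f"zfs snapshot {shlex.quote(snapshot)}"
    # Stream the replication package straight to S3 via stdin ("-").
    # --expected-size helps the AWS CLI pick multipart chunk sizes for large streams.
    send_cmd = (
        f"zfs send -R {shlex.quote(snapshot)} | "
        f"aws s3 cp - {shlex.quote(key)} "
        f"--storage-class DEEP_ARCHIVE --expected-size {expected_size}"
    )
    return [snap_cmd, send_cmd]
```

Building the commands as strings keeps the sketch safe to run anywhere; actually executing them would need a host with the pool imported and the AWS CLI configured as noted above.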
pryce-turner / golem_log_parser.py
Last active November 24, 2022 21:54
Golem Yapapi Log Summarizer
import json
import argparse
from pathlib import Path
from collections import defaultdict
from datetime import datetime
# Small log-parsing CLI with no external dependencies
# Save file somewhere, see options with `python golem_log_parser.py -h`
class LogLine:
pryce-turner / staking_zfs.md
Last active October 23, 2023 16:41
Ethereum POS Staking on ZFS

Staking on ZFS

Intro

I always staked on ZFS before the merge, using a number of SATA SSDs in a simple stripe configuration, adding more as my space requirements increased. The merge imposed additional load on my disks that meant my setup was no longer appropriate; this sent me down a long road of testing and optimization. Let me say this up front: there are definitely more performant setups for this than ZFS. I've heard of very good results using mdadm and a simple ext4 filesystem (XFS also works). However, there are so many useful features baked into ZFS (compression, snapshots) and the ergonomics are so good that I was compelled to make this work for my (aging) setup.

Benchmark

I settled on a single fio benchmark for comparing my different setups, based on sar/iostat analyses of working setups. It is as follows: sudo fio --name=randrw --rw=randrw --direct=1 --ioengine=libaio --bs=4k --numjobs=8 --rwmixread=20 --size=1G --runtime=600 --group_reporting. This will lay down several fil

pryce-turner / testing.md
Last active September 3, 2022 04:05
Baby's First Tests

I've heard that the best devs write tests that are born to fail. Then they implement the feature to make the test pass. Test-driven development (or even documentation-driven development if you're really into it) is a great way to write good code. Apart from forcing you to do your least-favorite part first, TDD also makes you take a step back and really think about what you're trying to implement. What are you really trying to achieve? What are the side-effects, what are the inputs and outputs that you will need to mock?

This is all well and good, but one time I feel like TDD doesn't actually work all that well is in the very early stages of building your greenfield project. In my experience, TDD leads to more work refactoring a nascent project than you save by automating the tests. If you barely have an idea how your project will be structured, let alone the code that fits into that structure, you're going

pryce-turner / requestor_start.md
Last active August 9, 2022 04:48
Writing (and re-writing) a requestor container start script

What are we making?

This little write-up describes a container image for a requestor on the Golem network, including a start-up script that cleanly handles the different ways we might want to spin up the container. I drew a lot of inspiration from a couple of excellent repos that cut out a lot of the initial legwork. They did things slightly differently from my approach, so I would recommend spending some time with them as well.

The Containerfile

I use podman for my containerization needs, and have created the following Containerfile for our requestor container.

FROM condaforge/mambaforge

RUN apt update && apt install screen jq -y
pryce-turner / airgap.md
Last active December 25, 2022 14:33
Air-gapped Raspberry Pi for eth2-deposit-cli

Motivation

The greatest strength of an airgapped machine is also its biggest headache - no way out! This is a short guide for configuring an old Raspberry Pi 2 (no radio cards!) to securely use the eth2-deposit-cli tool. Whether using an existing mnemonic or generating a new one, the security-conscious will appreciate doing so on a machine which never has touched and never will touch any network.


Requirements

  • Raspberry Pi
  • min 16GB microSD