Jacob Chapman chapmanjacobd

CREATE TABLE transport_intercity_guess_bus AS
WITH bus AS (
    SELECT
        cf.id AS city_from,
        ct.id AS city_to,
        0 AS transfers,
        avg(price_avg) AS price,
        avg(distance_avg) AS distance,
        NULL AS transfer_city
    FROM
chapmanjacobd / pylint-pyproject.toml
Created April 24, 2023 10:15
pylint pyproject.toml
[tool.pylint.main]
fail-under = 8.0
ignore = ["input"]
ignored-modules = ["pandas", "numpy"]
limit-inference-results = 100
persistent = true
suggestion-mode = true
[tool.pylint.basic]
argument-naming-style = "snake_case"
sqlite3 artist_similarity.db 'alter table artists add column artist_name text'
---
attach 'artist_similarity.db' AS sim;
attach 'track_metadata.db' AS tr;
UPDATE
    sim.artists SET artist_name = (
        SELECT
            artist_name
        FROM
chapmanjacobd / gdal_block_reading.py
Last active March 9, 2023 04:27
Actual speed impact of reading mismatched blocks
import math
import osgeo.gdal as gdal
import timeit
file_path = "1677269839.tif"
ds = gdal.Open(file_path)
file_block_size = ds.GetRasterBand(1).GetBlockSize()
xoff = 0
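The preview above stops right after it reads the band's block size. As a rough sketch of the aligned baseline such a comparison would need, the generator below enumerates read windows that follow the file's block grid. `block_windows` is a hypothetical helper, not part of GDAL; it only does the arithmetic, and the tuple order `(xoff, yoff, xsize, ysize)` is chosen on the assumption that each window would be passed on to something like `Band.ReadAsArray`.

```python
import math


def block_windows(width, height, block_w, block_h):
    """Yield (xoff, yoff, xsize, ysize) windows aligned to the block grid.

    Hypothetical helper: edge windows are clipped so they never extend
    past the raster bounds.
    """
    blocks_x = math.ceil(width / block_w)
    blocks_y = math.ceil(height / block_h)
    for by in range(blocks_y):
        for bx in range(blocks_x):
            xoff = bx * block_w
            yoff = by * block_h
            xsize = min(block_w, width - xoff)
            ysize = min(block_h, height - yoff)
            yield xoff, yoff, xsize, ysize


if __name__ == "__main__":
    # A 5x3 raster with 2x2 blocks needs 3 columns and 2 rows of windows.
    for window in block_windows(5, 3, 2, 2):
        print(window)
```

A mismatched read would use a window size other than `file_block_size`, so the same generator can produce both cases by varying `block_w` and `block_h`.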
import argparse
from multiprocessing import Pool
import btrfs
parser = argparse.ArgumentParser()
parser.add_argument("btrfs_fs_mountpoint")
args = parser.parse_args()
chapmanjacobd / filter_file.py
Created January 26, 2023 20:30
writelines() is faster than write() if your data can fit in RAM
import os
import tempfile

def filter_file(path, sieve):
    # Read everything into memory first; this only works if the file fits in RAM.
    with open(path, 'r') as fr:
        lines = fr.readlines()
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as temp:
        temp.writelines(l for l in lines if l.rstrip() not in sieve)
        temp.flush()
        os.fsync(temp.fileno())
    # Swap the filtered copy into place once the temp file is closed.
    os.replace(temp.name, path)
>>> timeit.timeit("filter_file('/tmp/t', ['abcnewsvideo 9758031'])", number=100, setup="from __main__ import filter_file")
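To check the title's claim on your own machine, a minimal sketch like the following could time the two approaches side by side. The names `make_sample`, `_temp_next_to`, `filter_with_writelines`, and `filter_with_write` are hypothetical and not from the gist, and the sample data is invented. Creating the temp file next to the target is also an assumption, made so that `os.replace` stays on one filesystem. No timings are shown because they depend on the disk and the data.

```python
import os
import tempfile
import timeit


def make_sample(path, n=10_000):
    """Write n short lines to path (hypothetical test data)."""
    with open(path, "w") as f:
        f.writelines(f"line {i}\n" for i in range(n))


def _temp_next_to(path):
    # Assumption: put the temp file in the target's directory so that
    # os.replace() never has to cross filesystems.
    return tempfile.NamedTemporaryFile(
        mode="w", delete=False, dir=os.path.dirname(os.path.abspath(path))
    )


def filter_with_writelines(path, sieve):
    """Read the whole file, then write the kept lines in one writelines() call."""
    with open(path, "r") as fr:
        lines = fr.readlines()
    with _temp_next_to(path) as temp:
        temp.writelines(l for l in lines if l.rstrip() not in sieve)
        temp.flush()
        os.fsync(temp.fileno())
    os.replace(temp.name, path)


def filter_with_write(path, sieve):
    """Stream the file and call write() once per kept line."""
    with open(path, "r") as fr, _temp_next_to(path) as temp:
        for line in fr:
            if line.rstrip() not in sieve:
                temp.write(line)
        temp.flush()
        os.fsync(temp.fileno())
    os.replace(temp.name, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "t")
        for fn in (filter_with_writelines, filter_with_write):
            make_sample(path)
            seconds = timeit.timeit(lambda: fn(path, {"line 42"}), number=20)
            print(f"{fn.__name__}: {seconds:.3f}s for 20 runs")
```

Passing the sieve as a set rather than a list keeps each membership test constant-time, which matters more than the write strategy once the sieve grows.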
chapmanjacobd / btrfs_is_fun.md
Last active November 26, 2023 18:26
BTRFS single mode evaluation

The experiment

Preparation

truncate -s20G d1.img
truncate -s20G d2.img
truncate -s20G d3.img
truncate -s20G d4.img
set ld1 (sudo losetup --show --find d1.img)
chapmanjacobd / aaronsw1.md
Last active November 22, 2022 01:59
aaronsw.com but as one markdown file but it's actually two because GitHub only shows so much on one page

Aaron Swartz

Aaron Swartz is the founder of Demand Progress, which launched the campaign against the Internet censorship bills (SOPA/PIPA) and now has over a million members. He is also a Contributing Editor to [The

chapmanjacobd / de9im.py
Last active November 21, 2022 19:48
Dimensionally Extended 9-Intersections Matrix utilities
# Author: Sean Gillies
# License: BSD
# https://pypi.org/project/de9im/
"""
>>> from de9im import pattern
>>> side_hug = pattern('FF*F0****')
>>> im = p.relate(q)
>>> print im
FF2F01212
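For readers unfamiliar with DE-9IM patterns, here is a minimal standalone sketch of how a pattern such as `FF*F0****` can be tested against a relate matrix such as `FF2F01212`. It follows the usual DE-9IM conventions (`*` matches anything, `T` matches any non-empty intersection, `F` and the digits match themselves). `matches_pattern` is a hypothetical name for illustration and is not the de9im package's API.

```python
def matches_pattern(pattern: str, matrix: str) -> bool:
    """Return True if a 9-character DE-9IM matrix satisfies the pattern.

    Hypothetical helper: '*' accepts any value, 'T' accepts dimensions
    0, 1, or 2, and 'F', '0', '1', '2' must match exactly.
    """
    if len(pattern) != 9 or len(matrix) != 9:
        raise ValueError("pattern and matrix must both have 9 characters")
    for p, m in zip(pattern.upper(), matrix.upper()):
        if p == "*":
            continue
        if p == "T":
            if m not in "012":
                return False
        elif p != m:
            return False
    return True


if __name__ == "__main__":
    print(matches_pattern("FF*F0****", "FF2F01212"))
```

With the matrix from the docstring above, every fixed position of `FF*F0****` agrees with `FF2F01212`, so the check succeeds.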