thehappycheese

  • Western Australia
@thehappycheese
thehappycheese / get_faster.py
Last active November 20, 2023 03:10
GET faster in Python. Maps a pandas Series of URL strings to the data returned from the web, using urllib3
import pandas as pd
import concurrent.futures
import urllib3
def _load_url(arg):
    url, http = arg
    response = http.request("GET", url)
    if response.status != 200:
        return f"ERROR: {response.status}"
    return response.data.decode("utf8")
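The preview stops before the part that actually runs the requests concurrently. A minimal sketch of how `_load_url` could be mapped over many URLs with a thread pool follows. The names `get_all`, `FakeHttp`, and `_FakeResponse` are hypothetical and not taken from the gist; the fake client only stands in for a real connection pool so the example runs without network access.

```python
import concurrent.futures


def _load_url(arg):
    # Same helper as in the preview: unpack the (url, http) pair and fetch.
    url, http = arg
    response = http.request("GET", url)
    if response.status != 200:
        return f"ERROR: {response.status}"
    return response.data.decode("utf8")


def get_all(urls, http, max_workers=8):
    """Fetch every URL concurrently and return the results in input order.

    `get_all` and `max_workers` are illustrative assumptions.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves the order of its input iterable.
        return list(executor.map(_load_url, ((url, http) for url in urls)))


class _FakeResponse:
    """Hypothetical stand-in exposing the two attributes _load_url reads."""

    def __init__(self, status, data):
        self.status = status
        self.data = data


class FakeHttp:
    """Hypothetical stand-in for an HTTP client with a request() method."""

    def request(self, method, url):
        if url.endswith("/missing"):
            return _FakeResponse(404, b"")
        return _FakeResponse(200, f"body of {url}".encode("utf8"))


results = get_all(
    ["https://example.invalid/a", "https://example.invalid/missing"],
    FakeHttp(),
)
```

With a real client, `http` would presumably be a `urllib3.PoolManager()`, and the returned list could be wrapped back into a pandas Series that reuses the index of the input Series.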
@thehappycheese
thehappycheese / file_structure.md
Last active May 31, 2024 03:08
Rust folder structure / file structure import example

I love ❤️ Rust but I hate 😞 how vague the beginner documentation is about splitting up your project into a practical structure. This is how I do it (for a library project, anyway):

.
└── project/
    ├── src/
    │   ├── lib.rs
    │   ├── top_level_module.rs
    │   └── util/
@thehappycheese
thehappycheese / python_version_used_by_terminal.ipynb
Last active February 24, 2022 07:41
Confirm the active Python version in your Jupyter notebook (the version used by ! terminal cell magics)
@thehappycheese
thehappycheese / ..Python Package Boilerplate.md
Last active September 14, 2022 01:52
Python Package Boilerplate

I like to use the following folder structure:

NicksPythonPackage/
├─ src/
│  ├─ examplepythonpackage/
│  │  ├─ __init__.py
│  │  ├─ some_module.py
│  │  ├─ some_sub_package/
│  │  │  ├─ __init__.py
@thehappycheese
thehappycheese / print_columns.py
Last active January 17, 2023 09:05
Print a list in nice columns
# inspired by https://stackoverflow.com/questions/1524126/how-to-print-a-list-more-nicely
# needs refinement before I post as answer though. I'll update this at some point
from typing import Iterable, Literal
def print_columns(data:Iterable, columns:int=3, sep:str=" ", alignment:Literal[">","<","^"]=">"):
    """Prints a list of objects in columns.
    `data` should be an iterable object, such as a list. Each element of `data` will be converted to a string using the built-in `str()`.
    `sep` is a string used to separate the columns. Defaults to `' '`.
@thehappycheese
thehappycheese / load_from_zip_of_csv.py
Created June 3, 2022 03:11
Load CSV files from Zip and Stack
import pandas
from zipfile import ZipFile
zip_file_path = "some_zip.zip"
# some_zip.zip/
# ├─ part1.csv
# ├─ part2.csv
# ├─ part3.csv
zip_file = ZipFile(zip_file_path)
extracted_data = pandas.concat([
    pandas.read_csv(
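The preview ends partway through the `read_csv` call. A minimal sketch of how the stacking step could be completed is shown here; the function name `load_zip_of_csv`, the `.csv` filter, and `ignore_index=True` are assumptions rather than details from the gist. The archive is built in memory so the example does not need `some_zip.zip` on disk.

```python
import io
from zipfile import ZipFile

import pandas


def load_zip_of_csv(zip_source):
    """Read every CSV member of a zip archive and stack them into one frame.

    Hypothetical helper; `zip_source` may be a path or a file-like object.
    """
    frames = []
    with ZipFile(zip_source) as zip_file:
        for name in zip_file.namelist():
            # Assumption: only members ending in .csv should be loaded.
            if name.lower().endswith(".csv"):
                with zip_file.open(name) as member:
                    frames.append(pandas.read_csv(member))
    # Assumption: a fresh 0..n-1 index is wanted for the stacked result.
    return pandas.concat(frames, ignore_index=True)


# Build a small archive in memory to stand in for some_zip.zip.
buffer = io.BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("part1.csv", "id,value\n1,a\n2,b\n")
    archive.writestr("part2.csv", "id,value\n3,c\n")
buffer.seek(0)

stacked = load_zip_of_csv(buffer)
```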
@thehappycheese
thehappycheese / overlap.py
Last active July 28, 2022 03:14
Interval overlap (signed distance) for numpy lists of intervals
import numpy as np
from numpy import typing as npt
def overlap(a:npt.NDArray, b:npt.NDArray, x:npt.NDArray, y:npt.NDArray):
    """Compute the signed distance between lists of intervals"""
    overlap_min = np.maximum(a, x.reshape(-1,1))
    overlap_max = np.minimum(b, y.reshape(-1,1))
    signed_overlap_len = overlap_max - overlap_min
    return signed_overlap_len
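A small worked example may make the broadcasting clearer. It assumes `a`/`b` hold the starts and ends of one list of intervals and `x`/`y` the starts and ends of another; the sample values are made up for illustration. Because `x` and `y` are reshaped into columns, the result has one row per `x`/`y` interval and one column per `a`/`b` interval, with negative entries giving the size of the gap between intervals that do not overlap.

```python
import numpy as np


def overlap(a, b, x, y):
    """Signed overlap length between intervals [a, b] and [x, y] (as in the gist)."""
    overlap_min = np.maximum(a, x.reshape(-1, 1))
    overlap_max = np.minimum(b, y.reshape(-1, 1))
    return overlap_max - overlap_min


# Two intervals in the first list: [0, 5] and [10, 20].
a = np.array([0.0, 10.0])
b = np.array([5.0, 20.0])

# Two intervals in the second list: [3, 12] and [6, 8].
x = np.array([3.0, 6.0])
y = np.array([12.0, 8.0])

result = overlap(a, b, x, y)
# Row 0 compares [3, 12] with each a/b interval: it overlaps both by 2.
# Row 1 compares [6, 8] with each a/b interval: it misses [0, 5] by 1
# and misses [10, 20] by 2, so both entries are negative.
print(result)
```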
@thehappycheese
thehappycheese / compile.py
Created July 28, 2022 03:04
Compile a python function from string, and return the function
# Takes a snippet of Python code that defines at least one function at the top level, and returns the last function defined.
import ast
def compile_function(source:str):
    # parse first, so we can automatically find the function name later
    parsed = ast.parse(source)
    # execute the compiled source in a fresh `scope` dictionary
    exec(compile(source, "", "exec"), scope:={})
    # return the last function definition
    for item in reversed(parsed.body):
        if isinstance(item, ast.FunctionDef):
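The preview is truncated inside the loop. One plausible completion, sketched under the assumption that the function object is looked up by name in the `scope` dictionary, is shown below. The `ValueError` for source without a top-level function and the `"<string>"` filename are additions for illustration, not details from the gist.

```python
import ast


def compile_function(source: str):
    """Execute `source` and return the last top-level function it defines."""
    # Parse first, so the function name can be found without guessing.
    parsed = ast.parse(source)
    # Execute the code in a fresh namespace so nothing leaks into globals.
    scope = {}
    exec(compile(source, "<string>", "exec"), scope)
    # Walk the top-level statements backwards to find the last `def`.
    for item in reversed(parsed.body):
        if isinstance(item, ast.FunctionDef):
            return scope[item.name]
    # Assumption: failing loudly is preferable to returning None.
    raise ValueError("source does not define a top-level function")


snippet = """
def helper(x):
    return x + 1

def double(x):
    return 2 * helper(x) - 2
"""

double = compile_function(snippet)
print(double(4))
```

Since `exec` runs arbitrary code, this approach is only appropriate for source strings that come from a trusted place.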
@thehappycheese
thehappycheese / using_fsspec.py
Created November 2, 2022 06:11
How to use fsspec with Azure Blob Storage Account
# Must use the async io variant of Azure Credentials
from azure.identity.aio import DefaultAzureCredential
# Rather than using fsspec directly, use this library, which implements it and gives better type hints and autocompletion
import adlfs
cloud_filesystem = adlfs.AzureBlobFileSystem(
    account_name="<STORAGE_ACCOUNT_NAME>",
    credential=DefaultAzureCredential()
)
@thehappycheese
thehappycheese / use pandas to read azure storage.py
Created November 2, 2022 06:13
Use pandas to read azure cloud storage
import pandas as pd
from azure.identity.aio import DefaultAzureCredential
CONTAINER = "..."
STORAGE_ACCOUNT_NAME = "..."
pd.read_parquet(
    path = f"abfss://{CONTAINER}@{STORAGE_ACCOUNT_NAME}.dfs.core.windows.net/some/path/example.parquet",
    storage_options = {"credential":DefaultAzureCredential()}
)