Michael J Clark wassname

@wassname
wassname / split_by_unique_col.py
Last active May 9, 2021 04:01
split_by_unique_col
from sklearn.model_selection import train_test_split
import pandas as pd


def shuffle_df(df, random_seed=42):
    return df.sample(frac=1, random_state=random_seed, replace=False)


def split_by_unique_col(df, col='patient_id', stratify_cols=[], random_seed=42):
    """
    Make a dataframe of unique ids, with our stratification data
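
The preview above is cut off, so the following is only a minimal sketch of the general idea (split on unique ids so that no id appears on both sides), not the gist's actual body. The function name `split_by_group`, the `test_size` parameter, and the assumption that the stratification columns are constant within each id are all hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_by_group(df, col="patient_id", stratify_cols=None, test_size=0.25, random_seed=42):
    """Split df so that every value of `col` lands in exactly one side."""
    stratify_cols = list(stratify_cols or [])
    # One row per unique id, carrying the stratification columns.
    # Assumption: those columns are constant within each id.
    ids = df[[col] + stratify_cols].drop_duplicates(subset=col)
    stratify = None
    if stratify_cols:
        # Collapse several columns into a single string key per id.
        stratify = ids[stratify_cols].astype(str).apply("_".join, axis=1)
    train_ids, test_ids = train_test_split(
        ids[col], test_size=test_size, random_state=random_seed, stratify=stratify
    )
    # Map the id split back onto the full frame.
    train = df[df[col].isin(train_ids)]
    test = df[df[col].isin(test_ids)]
    return train, test
```

Splitting the ids rather than the rows is what prevents rows of one patient from leaking across the train/test boundary.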
@wassname
wassname / dicom_over_http.py
Last active April 10, 2021 06:24
how to read only metadata from a dicom url
"""
How to read only the metadata from a DICOM URL to save bandwidth.
- Note: the server must support HTTP Range requests, e.g. S3 buckets or Azure blobs.
- Note: if you don't mind downloading the whole file, it is easier to read it all and pass it to pydicom as io.BytesIO.
url: https://gist.github.com/wassname/70106b2d66a7c6e83e4b0300c9d1d4d3
"""
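
As a rough illustration of the Range approach described in the docstring above, the sketch below fetches only the first bytes of a file and hands them to pydicom. The helper names `range_request` and `read_dicom_header` and the 64 KiB default are assumptions for illustration, not taken from the gist; if the metadata is longer than the fetched prefix, parsing may fail and a larger prefix is needed.

```python
import io
import urllib.request


def range_request(url, start=0, length=65536):
    """Build a GET request for bytes [start, start + length) of url."""
    end = start + length - 1  # the end of an HTTP byte range is inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


def read_dicom_header(url, length=65536):
    """Fetch a prefix of the file and parse it, stopping before pixel data."""
    import pydicom  # imported lazily so range_request works without pydicom

    with urllib.request.urlopen(range_request(url, length=length)) as resp:
        # A server that honours Range replies with status 206 (Partial Content);
        # a 200 reply means the whole file is being sent.
        data = resp.read()
    return pydicom.dcmread(io.BytesIO(data), stop_before_pixels=True)
```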
@wassname
wassname / rebalance_df.py
Last active October 16, 2022 02:06
split stratify pandas by unique
"""
If you want to split and sample at the same time, use something else.
But with time series you sometimes want to split by time first, then resample to get balanced weights.
@url:https://gist.github.com/wassname/f34321d4797a356a82802bdfb935e6cd/edit
@author:wassname
@lic: meh
"""
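
The gist body is not shown here, so this is only a minimal sketch of the resampling step the docstring describes: after a time-based split, draw the same number of rows from every class. The function name `rebalance_classes` and the `label` column are hypothetical, and sampling with replacement for small classes is one possible choice among several.

```python
import pandas as pd


def rebalance_classes(df, col="label", n_per_class=None, random_seed=42):
    """Resample df so every value of `col` appears the same number of times."""
    counts = df[col].value_counts()
    # Default: bring every class up to the size of the largest one.
    n = n_per_class or int(counts.max())
    parts = [
        # Only sample with replacement when a class has fewer than n rows.
        group.sample(n=n, replace=len(group) < n, random_state=random_seed)
        for _, group in df.groupby(col)
    ]
    # Shuffle so the classes are not left in contiguous blocks.
    return pd.concat(parts).sample(frac=1, random_state=random_seed)
```

Applied to the training portion only (for example `rebalance_classes(df[df.t < cutoff])`, with `t` and `cutoff` being placeholder names), this leaves the later test period with its natural class balance.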
@wassname
wassname / rich_tqdm.py
Created January 22, 2021 13:01
rich tqdm
from rich.progress import (
    ProgressColumn,
    BarColumn,
    DownloadColumn,
    TextColumn,
    TransferSpeedColumn,
    TimeRemainingColumn,
    Progress,
    TaskID,
    TimeElapsedColumn,
@wassname
wassname / leftKNN.py
Last active January 15, 2021 07:50
Causal k nearest neighbors (KNN) that only looks back
# %%
import numpy as np
from functools import partial
from pykdtree.kdtree import KDTree
class LeftKDTree(KDTree):
    """
    KNN that only looks left.
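
To make "only looks left" concrete, here is a brute-force NumPy sketch of a causal nearest-neighbour lookup: each row may only pick neighbours that come earlier in the array. It deliberately does not use a KD-tree, so it is not the gist's `LeftKDTree`; the function name `causal_knn` and the `-1` padding convention are assumptions for illustration.

```python
import numpy as np


def causal_knn(X, k=3):
    """For each row i, return indices of its k nearest rows among rows 0..i-1."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # -1 marks "no neighbour available" (early rows have fewer than k predecessors).
    idx = np.full((n, k), -1, dtype=int)
    for i in range(1, n):
        # Euclidean distance from row i to every earlier row.
        dist = np.linalg.norm(X[:i] - X[i], axis=1)
        order = np.argsort(dist, kind="stable")[:k]
        idx[i, : len(order)] = order
    return idx
```

This costs O(n²) distance evaluations, which is fine for small arrays; a tree-based variant such as the one the gist builds on would be the choice for large ones.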
@wassname
wassname / stablenormal.py
Created November 1, 2020 03:41
pytorch stable normal using log_scale
import math
import torch
from torch.distributions import Normal
from torch.distributions.utils import broadcast_all, _standard_normal
from torch.distributions.kl import register_kl
class StableNormal(Normal):
    """Modified version that uses log_scale for stability of grad."""
    def __init__(self, loc, log_scale):
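
The preview stops at the constructor, so the following is an assumption about the intent rather than the gist's code: if the scale is parameterised as $\sigma = e^{s}$ with $s$ = `log_scale`, the Normal log-density can be written without ever taking the logarithm of $\sigma$.

```latex
\log p(x \mid \mu, s)
  = -\frac{(x - \mu)^2}{2\,e^{2s}} - s - \tfrac{1}{2}\log(2\pi),
  \qquad \sigma = e^{s}
```

Under this parameterisation $\sigma$ is positive for any real $s$, and the $\log\sigma$ term reduces to $s$ itself, which is likely where the gradient stability mentioned in the docstring comes from.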
@wassname
wassname / exercise.md
Created September 28, 2020 07:40
exercise.md

Exercise

Description:

  1. first
  2. second
@wassname
wassname / aws_keepassxc.md
Created September 1, 2020 06:04
How to use KeePassXC and browser integration with AWS console sign-in

For your IAM user you get a CSV of credentials like this:

User name,Password,Access key ID,Secret access key,Console login link
USERNAME,PASSWORD,ACCESS_KEY,SECRET_KEY,https://0123456.signin.aws.amazon.com/console

If your region is Sydney (ap-southeast-2), in KeePassXC you enter:

Title: USERNAME/COMPANY

@wassname
wassname / azureml_py36_pytorch_conda.yaml
Created July 3, 2020 02:22
azure data science vm packages azureml_py36_pytorch
name: azureml_py36_pytorch
channels:
- pytorch
- conda-forge
- anaconda
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _py-xgboost-mutex=2.0=cpu_0
- _pytorch_select=0.2=gpu_0
@wassname
wassname / times.py
Created April 26, 2020 04:17
date utils
import pandas as pd


def row2date(row, tz="Australia/Perth"):
    """Parse time columns."""
    return pd.Timestamp(
        year=int(row.Year),
        month=int(row.Month),
        day=int(row.Day),
        hour=int(row.Hour),
        minute=int(row.Minute),
        second=int(row.Seconds),
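
The preview is truncated mid-call, so rather than guess at the rest, here is a separate, vectorised sketch of the same task: assembling timezone-aware timestamps from split date columns. The function name `frame2dates` is hypothetical, and the column names are assumed from the attributes used in the preview (`Year`, `Month`, `Day`, `Hour`, `Minute`, `Seconds`).

```python
import pandas as pd

# Map the assumed source column names onto the names pd.to_datetime expects.
COLUMN_MAP = {
    "Year": "year",
    "Month": "month",
    "Day": "day",
    "Hour": "hour",
    "Minute": "minute",
    "Seconds": "second",
}


def frame2dates(df, tz="Australia/Perth"):
    """Build a timezone-aware datetime Series from separate time columns."""
    parts = df[list(COLUMN_MAP)].rename(columns=COLUMN_MAP)
    # to_datetime assembles one timestamp per row from the component columns.
    naive = pd.to_datetime(parts)
    # Interpret the naive wall-clock times as local time in `tz`.
    return naive.dt.tz_localize(tz)
```

Working on whole columns avoids a Python-level loop over rows, which matters once the frame has more than a few thousand rows.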