Joseph Szymborski jszym

@jszym
jszym / validate_url.js
Created January 13, 2019 03:41
Function to validate URLs
/**
VALIDATE URL
------------------------------------------------------
Requires punycode.js found at https://mths.be/punycode
to handle UTF-8. Has a very high true-positive rate
and a low false-positive rate on this test suite:
https://mathiasbynens.be/demo/url-regex
**/
function validate_url(link){

Keybase proof

I hereby claim:

  • I am jszym on github.
  • I am jszym (https://keybase.io/jszym) on keybase.
  • I have a public key whose fingerprint is 9961 76AC EF9F 41DA EF59 8151 AAFD DADA 459E F326

To claim this, I am signing this object:

jszym / clean_trackers_url.py
Created September 2, 2019 01:20
A quick script to get rid of Google (UTM) tracking, as well as the tracking query strings on NYTimes URLs.
from urllib.parse import parse_qs, urlparse, urlencode, urlunparse
import copy
def clean_trackers_url(url):
    url_obj = urlparse(url)
    raw_query = parse_qs(url_obj.query)
    clean_query = copy.deepcopy(raw_query)
    # add query keys to ban (exact matches)
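The preview stops before the banned keys are listed. A minimal sketch of how the remaining steps might look, assuming the keys to drop are the `utm_*` parameters plus a small exact-match set. `BANNED_KEYS`, `BANNED_PREFIXES`, and `strip_tracking_params` are hypothetical names for illustration, not taken from the gist:

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

# Hypothetical ban lists; the gist's actual lists are not visible in the preview.
BANNED_KEYS = {"smid", "smtyp"}   # exact matches (assumed)
BANNED_PREFIXES = ("utm_",)       # prefix matches (assumed)


def strip_tracking_params(url: str) -> str:
    """Return `url` with banned query keys removed and everything else unchanged."""
    url_obj = urlparse(url)
    # parse_qs maps each key to a list of values; keep blank values so that
    # non-tracking parameters such as "?flag=" survive the round trip.
    query = parse_qs(url_obj.query, keep_blank_values=True)
    kept = {
        key: values
        for key, values in query.items()
        if key not in BANNED_KEYS and not key.startswith(BANNED_PREFIXES)
    }
    # doseq=True writes repeated keys back as key=a&key=b.
    clean_query = urlencode(kept, doseq=True)
    return urlunparse(url_obj._replace(query=clean_query))
```

Filtering into a new dict avoids mutating the parsed query while iterating over it, which is one reason a script like this might copy the query first.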
jszym / split.py
Created June 4, 2020 17:38
Given a class-per-folder structure, compute splits with sklearn
# a library for discovering paths
from glob import glob
from sklearn.model_selection import train_test_split
# you may need to look up the documentation for glob
# "*" is a stand-in for any string
# this assumes that the subfolders are in the same folder as the script
# if the subfolders were in a folder "data", the argument to glob would be
# "./data/*.png"
paths = glob("./*/*.png")
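The preview ends once the paths are collected. Since each class lives in its own folder, one plausible next step is to read the label off each path's parent folder. The sketch below uses only the standard library; `labels_from_paths` and the example paths are hypothetical, not part of the gist:

```python
from pathlib import Path
from typing import List


def labels_from_paths(paths: List[str]) -> List[str]:
    """Use each file's parent folder name as its class label."""
    return [Path(p).parent.name for p in paths]


# Made-up paths shaped like the output of glob("./*/*.png").
example_paths = ["./cat/001.png", "./cat/002.png", "./dog/001.png"]
example_labels = labels_from_paths(example_paths)
```

Labels built this way could then be handed to `train_test_split` alongside the paths, for instance with `stratify=labels` to keep class proportions similar across the splits. Whether the gist stratifies is not visible in the preview.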
jszym / dictlogger.py
Created February 13, 2023 00:46
Dictionary Logger for PyTorch Lightning
from pytorch_lightning.utilities import rank_zero_only
from pytorch_lightning.loggers import Logger
from pytorch_lightning.loggers.logger import rank_zero_experiment
from collections import defaultdict
class DictLogger(Logger):
    def __init__(self):
        super().__init__()
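The preview ends at the constructor. As a rough illustration of the idea hinted at by the `defaultdict` import, a logger of this kind might append each logged value to a per-metric list. The sketch below is plain Python with no Lightning dependency; `MetricStore` is a hypothetical stand-in rather than the gist's class, and its `log_metrics(metrics, step)` method only mirrors the shape of the hook a Lightning logger implements:

```python
from collections import defaultdict
from typing import Dict, List, Optional


class MetricStore:
    """Hypothetical in-memory store: one list of values per metric name."""

    def __init__(self) -> None:
        # defaultdict(list) creates the list the first time a metric is seen.
        self.metrics: Dict[str, List[float]] = defaultdict(list)
        self.steps: Dict[str, List[Optional[int]]] = defaultdict(list)

    def log_metrics(self, metrics: Dict[str, float], step: Optional[int] = None) -> None:
        """Append each value, and the step it was logged at, to its history."""
        for name, value in metrics.items():
            self.metrics[name].append(float(value))
            self.steps[name].append(step)


store = MetricStore()
store.log_metrics({"loss": 0.5}, step=0)
store.log_metrics({"loss": 0.25, "acc": 0.9}, step=1)
```

Keeping the history in ordinary dictionaries makes it straightforward to inspect or plot metrics after training without reading log files.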
jszym / quickdraw.py
Created February 20, 2023 16:34
PyTorch QuickDraw dataset
# adapted from https://github.com/nateraw/quickdraw-pytorch/blob/main/quickdraw.ipynb
from typing import List, Optional
import urllib.request
from tqdm.auto import tqdm
from pathlib import Path
import requests
import torch
import math
import numpy as np
jszym / namespaced_token.py
Created June 28, 2023 17:57
Namespaced, random token.
from base64 import urlsafe_b64encode
from blake3 import blake3
import krock32
from time import time
import secrets
import math
def generate_token(namespace: str) -> str:
"""