Collin Kindrom colllin

@colllin
colllin / log_profile.py
Last active June 4, 2023 11:01
Utility for logging system profile to tensorboardx during pytorch training.
import torch
import psutil
import numpy as np
def log_profile(summaryWriter, step, scope='profile', cpu=True, mem=True, gpu=torch.cuda.is_available(), disk=['read_time', 'write_time'], network=False):
    if cpu:
        cpu_usage = np.array(psutil.cpu_percent(percpu=True))
        summaryWriter.add_scalars(f'{scope}/cpu/percent', {
            'min': cpu_usage.min(),
            'avg': cpu_usage.mean(),
@colllin
colllin / Install NVIDIA Driver and CUDA.md
Last active November 2, 2019 18:02 — forked from wangruohui/Install NVIDIA Driver and CUDA.md
Install NVIDIA Driver and CUDA on Ubuntu / CentOS / Fedora Linux OS
@colllin
colllin / Readme.md
Last active July 9, 2023 11:40
Example startup script / boot script "user data" for running machine learning experiments on EC2 Spot Instances with git & dvc

Prerequisites

  • Write your training script so that it can be killed, and then automatically resumes from the beginning of the current epoch when restarted. (See train-example.py for an example training loop incorporating these recommendations.)
    • Save checkpoints at every epoch... (See utils.py for save_training_state helper function.)
      • model(s)
      • optimizer(s)
      • any hyperparameter schedules — I usually write the epoch number to a JSON file and compute the hyperparameter schedules as a function of the epoch number.
    • At the beginning of training, check for any saved training checkpoints and load all relevant info (models, optimizers, hyperparameter schedules). (See utils.py for load_training_state helper function.)
    • Consider using smaller epochs by limiting the number of batches pulled from your (shuffled) dataloader during each epoch.
  • This will cause your trai
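
The resume pattern described in the prerequisites can be sketched roughly as follows. This is a minimal illustration and not the gist's own utils.py: `save_epoch`, `load_epoch`, `learning_rate`, and `train` are hypothetical names, the step-decay schedule is an arbitrary example, and only the epoch counter is persisted. In a real PyTorch loop the model and optimizer state would be saved and restored alongside it.

```python
import itertools
import json
import os


def save_epoch(path, completed_epochs):
    """Record how many epochs have fully completed (hypothetical helper)."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"epoch": completed_epochs}, f)
    # Rename over the old file so a kill mid-write cannot leave a half-written state file.
    os.replace(tmp_path, path)


def load_epoch(path):
    """Return the number of completed epochs, or 0 if no state file exists."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["epoch"]


def learning_rate(epoch, base_lr=0.1, decay=0.5, every=10):
    """Example schedule computed purely from the epoch number (an assumption)."""
    return base_lr * decay ** (epoch // every)


def train(state_path, total_epochs, batches, run_batch, max_batches=None):
    """Resume from the first unfinished epoch and checkpoint after each one."""
    start_epoch = load_epoch(state_path)
    for epoch in range(start_epoch, total_epochs):
        lr = learning_rate(epoch)
        # Smaller epochs: pull at most max_batches batches (None means all of them).
        for batch in itertools.islice(batches, max_batches):
            run_batch(epoch, lr, batch)
        save_epoch(state_path, epoch + 1)
```

If the process is killed partway through an epoch, the state file still holds the count of fully completed epochs, so the next call to `train` restarts from the beginning of the interrupted epoch.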
@colllin
colllin / README.md
Created December 31, 2018 20:21
Deep Learning Base AMI setup script

Development Setup

$ sudo add-apt-repository ppa:jonathonf/python-3.6
$ sudo apt update
$ sudo apt install python3.6 python3.6-dev
$ wget https://bootstrap.pypa.io/get-pip.py
$ python3.6 get-pip.py
$ rm get-pip.py
$ sudo pip3.6 install pipenv
@colllin
colllin / ec2__setting-up-raid__useful-commands.sh
Last active October 29, 2018 18:59
Setting up raid on EC2 instance
# xvdg1 and xvdh1 are 2 attached volumes; md0 will be the (virtual) raid volume
mdadm -C /dev/md0 -l raid0 -c 64 -n 2 /dev/xvdg1 /dev/xvdh1
mdadm -E /dev/xvd[g-h]1
mdadm --detail /dev/md0
mkfs.ext4 /dev/md0
df -h
mount /dev/md0 /mnt/
@colllin
colllin / Readme.md
Last active May 28, 2019 22:36
Install NVIDIA drivers & CUDA
  1. Install NVIDIA drivers

    1. Find NVIDIA driver download link for your system at http://www.nvidia.com/Download/index.aspx?lang=en-us

    2. wget -P ~/Downloads/ http://us.download.nvidia.com/tesla/390.46/NVIDIA-Linux-x86_64-390.46.run

    3. sudo rm /etc/X11/xorg.conf # It's ok if this doesn't exist

    4. NVIDIA will clash with the nouveau driver so deactivate it:

      $ sudo vim /etc/modprobe.d/blacklist-nouveau.conf
      
@colllin
colllin / adamw.py
Last active January 1, 2020 15:26
PyTorch AdamW optimizer
# Based on https://github.com/pytorch/pytorch/pull/3740
import torch
import math
class AdamW(torch.optim.Optimizer):
    """Implements AdamW algorithm.
    It has been proposed in `Fixing Weight Decay Regularization in Adam`_.
    Arguments:
@colllin
colllin / install.md
Last active August 29, 2019 19:17
Setup EC2 for RL OpenAI Gym
@colllin
colllin / globalmaptiles.py
Last active November 29, 2017 19:51 — forked from tucotuco/globalmaptiles.py
Classes to calculate Tile coordinates
# From https://gist.github.com/colllin/c02319fe3202470cc4d0a0b73cdbd1a6
#!/usr/bin/env python
###############################################################################
# $Id$
#
# Project: GDAL2Tiles, Google Summer of Code 2007 & 2008
# Global Map Tiles Classes
# Purpose: Convert a raster into TMS tiles, create KML SuperOverlay EPSG:4326,
# generate a simple HTML viewers based on Google Maps and OpenLayers
@colllin
colllin / capsule_networks.py
Created November 21, 2017 23:37 — forked from kendricktan/capsule_networks.py
Clean Code for Capsule Networks
"""
Dynamic Routing Between Capsules
https://arxiv.org/abs/1710.09829
"""
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision.transforms as transforms