Jason Harrison drjasonharrison

@roylee0704
roylee0704 / dockergrep.sh
Created December 9, 2016 08:24
how to grep docker log
docker logs nginx 2>&1 | grep "127."
# ref: http://stackoverflow.com/questions/34724980/finding-a-string-in-docker-logs-of-container
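The `2>&1` redirect matters because `docker logs` replays the container's stderr stream on its own stderr, so a plain pipe would only filter stdout. A minimal Python sketch of the same idea follows, assuming a container named `nginx` as in the command above. The helper names `grep_lines` and `docker_logs_grep` are hypothetical, and the match is a literal substring (like `grep -F`), not a regular expression.

```python
import subprocess
from typing import Iterable, List


def grep_lines(lines: Iterable[str], needle: str) -> List[str]:
    """Return the lines that contain `needle` (literal substring match)."""
    return [line for line in lines if needle in line]


def docker_logs_grep(container: str, needle: str) -> List[str]:
    """Run `docker logs <container>` with stderr merged into stdout, then filter."""
    result = subprocess.run(
        ["docker", "logs", container],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as the shell's 2>&1
        text=True,
        check=True,
    )
    return grep_lines(result.stdout.splitlines(), needle)
```

Calling `docker_logs_grep("nginx", "127.")` would then return the matching log lines, provided Docker is installed and the container exists.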
@teasherm
teasherm / s3_multipart_upload.py
Last active February 4, 2024 04:44
boto3 S3 Multipart Upload
import argparse
import os
import boto3
class S3MultipartUpload(object):
    # AWS throws EntityTooSmall error for parts smaller than 5 MiB (the last part is exempt)
    PART_MINIMUM = 5 * 1024 * 1024
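To illustrate why the minimum part size matters, here is a small sketch of how a file could be split into parts before uploading. It is not the gist's implementation: `plan_parts` and `MAX_PARTS` are hypothetical names, and the sketch uses 5 MiB (5 * 1024 * 1024 bytes) as the minimum and 10,000 as the part-count limit, which are the limits S3 documents for multipart uploads.

```python
import math
from typing import List, Tuple

# S3 rejects any part except the last one if it is smaller than 5 MiB.
PART_MINIMUM = 5 * 1024 * 1024
# S3 allows at most 10,000 parts per multipart upload.
MAX_PARTS = 10_000


def plan_parts(total_size: int, part_size: int = PART_MINIMUM) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples covering `total_size` bytes."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    if part_size < PART_MINIMUM:
        raise ValueError("part_size is below the S3 minimum")
    count = math.ceil(total_size / part_size)
    if count > MAX_PARTS:
        raise ValueError("too many parts; use a larger part_size")
    parts = []
    for index in range(count):
        offset = index * part_size
        length = min(part_size, total_size - offset)  # only the last part may be short
        parts.append((index + 1, offset, length))  # S3 part numbers start at 1
    return parts
```

Each tuple maps onto one part upload: seek to `offset`, read `length` bytes, and send them under `part_number`.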
@zebulonj
zebulonj / .npmrc
Last active March 20, 2023 04:33
Dockerfile with Private npm Module Dependencies
//registry.npmjs.org/:_authToken=${NPM_TOKEN}
save-exact=true
loglevel=error
@StevenACoffman
StevenACoffman / Docker Best Practices.md
Last active April 29, 2024 08:36
Docker Best Practices

Mistakes to Avoid: Docker Antipatterns

Whichever route you take to implementing containers, you’ll want to steer clear of common pitfalls that can undermine the efficiency of your Docker stack.

Don’t run too many processes inside a single container

The beauty of containers, and an advantage they have over virtual machines, is that it is easy to make multiple containers interact with one another to compose a complete application. There is no need to run a full application inside a single container. Instead, break your application down as much as possible into discrete services, and distribute those services across multiple containers. This maximizes flexibility and reliability.

Don’t install operating systems inside Docker containers

It is possible to install a complete Linux operating system inside a container. In most cases, however, this is not necessary. If your goal is to host just a single application or part of an application in the container, you need to install only the essential

@ipan
ipan / diff-jq.md
Created January 16, 2018 04:47
compare two JSONs with jq #json #jq
@shtratos
shtratos / fetch-dev-secrets-from-vault.sh
Last active May 13, 2024 14:27
Bash script to fetch and store secrets from Azure KeyVault
#!/usr/bin/env bash
#
# Fetch secrets for local development from Azure KeyVault
# and print them to stdout as a bunch of env var exports.
# These secrets should be added to your local .env file
# to enable running integration tests locally.
#
KEY_VAULT=$1
function fetch_secret_from_keyvault() {
@kobybum
kobybum / unused_modules.sh
Last active April 5, 2024 00:44
Finding Unused Python Files
#!/bin/bash
# This is part of a Medium article about finding unused files:
# https://medium.com/@kobybum/finding-dead-python-files-with-snakefood-6c75a3e82294
# Generate a list of included dependencies
sfood -i example-project > /tmp/out.deps
# Get dependent filepath from each dependency, sort and get unique
cat /tmp/out.deps | \
grep -v test | \
@lisawolderiksen
lisawolderiksen / git-commit-template.md
Last active May 28, 2024 21:32
Use a Git commit message template to write better commit messages

Using Git Commit Message Templates to Write Better Commit Messages

The always enthusiastic and knowledgeable Mr. @jasaltvik shared with our team an article on writing (good) Git commit messages: How to Write a Git Commit Message. This excellent article explains why good Git commit messages are important and what constitutes a good commit message. I wholeheartedly agree with what @cbeams writes in his article. (Have you read it yet? If not, go read it now. I'll wait.) It's sensible stuff. So I decided to start following the

@Guitaricet
Guitaricet / reproducibility.md
Last active March 24, 2024 11:11
Notes on reproducibility in PyTorch

Reproducibility

ML experiments can be very hard to reproduce. You have a lot of hyperparameters, different dataset splits, different ways to preprocess your data, bugs, etc. Ideally, for every single model run you should log the data split (already preprocessed), all hyperparameters (including the learning rate schedule), the initial state of your model and optimizer, the random seeds used for initialization and dataset shuffling, and all of your code. Your GPU should also be in deterministic mode (which is not the default). This is a very hard task. A different random seed can significantly change your metrics, and even GPU-induced randomness can matter. We're not solving all of these problems, but we need to address at least what we can handle.
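As a rough sketch of the seeding and deterministic-mode part, a helper like the one below could be called at the start of every run. The name `seed_everything` is hypothetical, NumPy and PyTorch are treated as optional so the function still works without them, and the cuDNN flags reduce GPU nondeterminism but do not guarantee that every operation is deterministic.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed the common RNGs and request deterministic cuDNN kernels."""
    random.seed(seed)
    # Affects only Python subprocesses started later; the current
    # interpreter fixed its hash seed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; stop here
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    torch.backends.cudnn.deterministic = True  # pick deterministic kernels
    torch.backends.cudnn.benchmark = False  # disable autotuned kernel selection
```

Logging the value passed to `seed_everything` alongside the hyperparameters keeps the seed recoverable for each reported result.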

For every result you report in the paper you need (at least) to:

  1. Track your model and optimizer hyperparameters (including learning rate schedule)
  2. Save final model parameters
  3. Report all of the parameters in the pap
@premek
premek / mv.sh
Last active March 5, 2024 17:43
Rename files in linux / bash using mv command without typing the full name two times
# Put this function in your .bashrc file.
# Usage: mv oldfilename
# If you call mv without the second parameter it will prompt you to edit the filename on command line.
# Original mv is called when it's called with more than one argument.
# It's useful when you want to change just a few letters in a long name.
#
# Also see:
# - imv from renameutils
# - Ctrl-W Ctrl-Y Ctrl-Y (cut last word, paste, paste)
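The behavior described in the comments above can be sketched in Python as well. This is an illustration of the dispatch logic only, not the gist's bash function: `smart_mv` and `prompt_with_prefill` are hypothetical names, and the prefilled prompt assumes the standard `readline` module is available.

```python
import os
import subprocess
from typing import Callable, List


def prompt_with_prefill(text: str) -> str:
    """Ask for a new name with the old one already typed in, ready to edit."""
    import readline  # imported lazily; not available on every platform
    readline.set_startup_hook(lambda: readline.insert_text(text))
    try:
        return input("New name: ")
    finally:
        readline.set_startup_hook()  # remove the hook again


def smart_mv(args: List[str], ask: Callable[[str], str] = prompt_with_prefill) -> None:
    """With exactly one argument, edit the name interactively; otherwise call mv."""
    if len(args) != 1:
        subprocess.run(["mv", *args], check=True)  # delegate to the real mv
        return
    old = args[0]
    new = ask(old)
    if new and new != old:  # empty or unchanged input leaves the file alone
        os.rename(old, new)
```

Passing the prompt in as `ask` keeps the rename logic usable without a terminal, for example from a script that computes the new name itself.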