Guy Smoilovsky guysmoilov

@hamelsmu
hamelsmu / mlops.md
Last active July 30, 2020 13:35
The Git + Data tool I always wanted

install MLOps

brew install mlops

clone repo as usual

If a data.yaml metadata file is detected in the root of the repo, the mlops plugin automatically asks the user whether they would like to download the data.

> git clone https://github.com/david/ml
@guysmoilov
guysmoilov / Git pre-commit hook for large files.md
Last active February 14, 2024 23:46
Git pre-commit hook for large files

This hook warns you before you accidentally commit large files to git. It's very hard to reverse such an accidental commit, so it's better to prevent it in advance.

Since you will likely want this script to run in all your git repos, a script is attached to add this hook to all git repos you create / clone in the future.

Of course, you can also download it directly into the hooks directory of an existing git repo.

If you find this script useful, you might enjoy our more heavy-duty project FastDS, which aims to make it easier to work with versioning in data science projects.
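
As a rough illustration of the idea, here is a minimal Python sketch of such a check. It is not the script attached to this gist: the 5 MiB `LIMIT_BYTES` threshold and the function names are assumptions made for the example, it blocks the commit rather than only warning, and it measures the working-tree file rather than the staged blob, which is a simplification.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit check that flags large staged files."""
import os
import subprocess

# Hypothetical threshold chosen for this example only.
LIMIT_BYTES = 5 * 1024 * 1024


def staged_paths():
    """Return paths that are added, copied, or modified in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM", "-z"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [p for p in out.split("\0") if p]


def find_large_files(paths, limit_bytes=LIMIT_BYTES):
    """Return (path, size) pairs for existing files larger than limit_bytes."""
    large = []
    for path in paths:
        if os.path.isfile(path):
            size = os.path.getsize(path)
            if size > limit_bytes:
                large.append((path, size))
    return large


def main():
    """Return 1 to block the commit when a large file is staged, else 0."""
    large = find_large_files(staged_paths())
    for path, size in large:
        print(f"{path}: {size} bytes exceeds the {LIMIT_BYTES}-byte limit")
    return 1 if large else 0
```

To try it as a hook, one option is to save it as `.git/hooks/pre-commit`, make it executable, and end the file with `raise SystemExit(main())`; a non-zero exit status from a pre-commit hook aborts the commit.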

@guysmoilov
guysmoilov / .gitconfig
Last active September 13, 2022 10:48
My git aliases. Work in progress.
[alias]
br = branch
sh = show
rb = rebase
rbi = rebase -i
st = status
ci = commit
cim = commit -m
cia = commit -a -m
co = checkout
@benmccallum
benmccallum / _Instructions.md
Last active August 26, 2023 14:36
git pre-commit hook preventing large files

Usage

You can use it in two ways.

  1. Directly as the pre-commit hook in your .git/hooks folder.

  2. With Husky by updating your package.json with:

"husky": {
@guysmoilov
guysmoilov / gdrive_download.sh
Created February 28, 2019 12:56
Bash script to download large zip files from google drive while confirming the virus scan warning
#!/bin/bash
# Usage: gdrive_download 123-abc ./output.zip
function gdrive_download () {
  # First request: store the session cookies and extract the confirm token from the warning page.
  CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://drive.google.com/uc?export=download&id=$1" -O- | sed -En 's/.*confirm=([0-9A-Za-z_]+).*/\1/p')
  # Second request: reuse the cookies and pass the token to download the file itself.
  wget --load-cookies /tmp/cookies.txt "https://drive.google.com/uc?export=download&confirm=$CONFIRM&id=$1" -O "$2"
  rm -f /tmp/cookies.txt
}
@npearce
npearce / install-docker.md
Last active April 19, 2024 12:35
Amazon Linux 2 - install docker & docker-compose using 'sudo amazon-linux-extras' command

UPDATE (March 2020, thanks @ic): I don't know the exact AMI version, but yum install docker now works on the latest Amazon Linux 2. The instructions below may still be relevant depending on the vintage of the AMI you are using.

Amazon changed the install procedure in Amazon Linux 2. It no longer uses 'yum'. See: https://aws.amazon.com/amazon-linux-2/release-notes/

Docker CE Install

sudo amazon-linux-extras install docker
sudo service docker start
@mkocabas
mkocabas / nms_pytorch.py
Created June 1, 2018 04:56
Pytorch NMS implementation
import torch
# Original author: Francisco Massa:
# https://github.com/fmassa/object-detection.torch
# Ported to PyTorch by Max deGroot (02/01/2017)
def nms(boxes, scores, overlap=0.5, top_k=200):
    """Apply non-maximum suppression at test time to avoid detecting too many
    overlapping bounding boxes for a given object.
    Args:
        boxes: (tensor) The location preds for the img, Shape: [num_priors,4].
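
The greedy procedure this docstring describes can be sketched in plain Python without tensors. This is a simplified illustration, not the gist's PyTorch implementation: the names `iou` and `nms_plain` are made up for the example, and the `(x1, y1, x2, y2)` corner format for boxes is an assumption, since the box layout is not stated here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_plain(boxes, scores, overlap=0.5, top_k=200):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Consider only the top_k highest-scoring candidates.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= overlap]
    return keep
```

Each pass keeps the highest-scoring remaining box and discards the candidates whose IoU with it exceeds `overlap`, so near-duplicate detections of the same object collapse to one.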
@debasishg
debasishg / gist:8172796
Last active March 15, 2024 15:05
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
@piscisaureus
piscisaureus / pr.md
Created August 13, 2012 16:12
Checkout github pull requests locally

Locate the section for your github remote in the .git/config file. It looks like this:

[remote "origin"]
	fetch = +refs/heads/*:refs/remotes/origin/*
	url = git@github.com:joyent/node.git

Now add the line fetch = +refs/pull/*/head:refs/remotes/origin/pr/* to this section. Obviously, change the github url to match your project's URL. It ends up looking like this:

@masak
masak / explanation.md
Last active April 11, 2024 02:50
How is git commit sha1 formed

Ok, I geeked out, and this is probably more information than you need. But it completely answers the question. Sorry. ☺

Locally, I'm at this commit:

$ git show
commit d6cd1e2bd19e03a81132a23b2025920577f84e37
Author: jnthn <jnthn@jnthn.net>
Date:   Sun Apr 15 16:35:03 2012 +0200

When I added FIRST/NEXT/LAST, it was idiomatic but not quite so fast. This makes it faster. Another little bit of masak++'s program.
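
As a general illustration of how git derives an object ID in a SHA-1 repository, here is a minimal Python sketch. It is not taken from this gist; the function name `git_object_id` is made up for the example. Git hashes a header of the form `<type> <size>\0` followed by the object's raw bytes, and for a commit those bytes are what `git cat-file commit HEAD` prints (the tree, parent, author, and committer lines, a blank line, then the message).

```python
import hashlib


def git_object_id(obj_type, body):
    """SHA-1 of a git object: header '<type> <size>\\0' followed by the body."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


# A blob containing "hello world\n" hashes the same way
# `git hash-object` would hash a file with that content.
blob_id = git_object_id("blob", b"hello world\n")
```

Passing `"commit"` and the bytes from `git cat-file commit HEAD` should reproduce the ID shown by `git show`, assuming the repository uses SHA-1 rather than SHA-256.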