Romain Beaumont rom1504

@rom1504
rom1504 / slurm_stats.py
Last active August 7, 2023 02:05
slurm_users.py
import json
import pandas as pd
import subprocess
import sys
def get_msg(
    backticks=True  # whether to add backticks for Discord formatting or not
):
    "gets a list of cluster usage from squeue and creates a text message from it"
    a = json.loads(subprocess.check_output(['squeue','--json']).decode("utf8"))
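
The preview stops right after the `squeue --json` call. As a rough sketch of the aggregation step the docstring describes, the function below sums the nodes of running jobs per user from an already-parsed payload. The helper names (`summarize_usage`, `_state`, `_nodes`) are hypothetical, and the field names (`jobs`, `user_name`, `job_state`, `node_count`) are assumptions: the JSON layout differs between Slurm releases and none of this is taken from the gist itself.

```python
from collections import Counter


def _state(job):
    # Assumption: job_state is a plain string in some Slurm releases and a list in others.
    state = job.get("job_state", "")
    if isinstance(state, list):
        return state[0] if state else ""
    return state


def _nodes(job):
    # Assumption: node_count is an int in some releases and {"number": N, ...} in others.
    count = job.get("node_count", 0)
    if isinstance(count, dict):
        return int(count.get("number", 0))
    return int(count)


def summarize_usage(squeue_payload, backticks=True):
    """Build a per-user text summary of running jobs from parsed squeue JSON."""
    nodes_per_user = Counter()
    for job in squeue_payload.get("jobs", []):
        if _state(job) == "RUNNING":
            nodes_per_user[job.get("user_name", "unknown")] += _nodes(job)
    lines = [f"{user}: {nodes} nodes" for user, nodes in nodes_per_user.most_common()]
    msg = "\n".join(lines)
    fence = "`" * 3  # Discord code-block delimiter
    return f"{fence}\n{msg}\n{fence}" if backticks else msg
```

With the gist's own call, the payload would be the dict stored in `a`, so `summarize_usage(a)` would produce the message text under these assumptions.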
rom1504 / auto_eval_openclip.md
Last active July 30, 2022 22:54
auto eval openclip

Change the paths, then run keep_evaling.sh.

This keeps evaluating openclip during training.

To send the results to wandb, also run: while [ 1 ]; do python3 eval_to_wandb.py; sleep 300; done
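
The shell loop above can also be written in Python, which makes it easier to cap the number of runs while testing. This is only a sketch: `run_periodically` and its `max_runs` parameter are hypothetical names introduced here, not part of the gist.

```python
import subprocess
import time


def run_periodically(command, interval_seconds=300, max_runs=None):
    """Run `command` repeatedly, sleeping between runs; return the exit codes."""
    exit_codes = []
    runs = 0
    while max_runs is None or runs < max_runs:
        # check=False mirrors the shell loop, which keeps going after a failed run.
        result = subprocess.run(command, check=False)
        exit_codes.append(result.returncode)
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_seconds)
    return exit_codes
```

Calling `run_periodically(["python3", "eval_to_wandb.py"])` follows the same schedule as the one-liner: one upload attempt every 300 seconds, with no end condition.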

rom1504 / debug_init_process_group.md
Last active July 25, 2023 12:22
debug torch.distributed.init_process_group on slurm

create an env:

python3.8 -m venv .env
source .env/bin/activate
pip install -U pip
pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113

fix the paths in simple.sh
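
The contents of simple.sh are not shown in this preview. As a minimal sketch of what such a debug script might launch, the helper below reads the per-task variables that `srun` exports and passes them to `torch.distributed.init_process_group`. The function names `slurm_dist_env` and `init_from_slurm` are hypothetical, and the sketch assumes the launch script already exports `MASTER_ADDR` and `MASTER_PORT`.

```python
import os


def slurm_dist_env(environ=None):
    """Derive rank information from the variables srun sets for each task."""
    env = os.environ if environ is None else environ
    return {
        "rank": int(env["SLURM_PROCID"]),
        "world_size": int(env["SLURM_NTASKS"]),
        "local_rank": int(env["SLURM_LOCALID"]),
    }


def init_from_slurm(backend="nccl"):
    """Call init_process_group with the env:// rendezvous and print the rank."""
    # Imported lazily so slurm_dist_env can be used without torch installed.
    import torch.distributed as dist

    info = slurm_dist_env()
    # Assumption: MASTER_ADDR and MASTER_PORT are already in the environment.
    dist.init_process_group(
        backend=backend,
        init_method="env://",
        rank=info["rank"],
        world_size=info["world_size"],
    )
    print(f"rank {info['rank']}/{info['world_size']} initialized", flush=True)
    return info
```

Printing from every rank right after initialization is a cheap way to see which task, if any, hangs during rendezvous.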

rom1504 / open_clip_slurm.md
Last active August 7, 2023 02:01
open clip at slurm

Install

git clone https://github.com/mlfoundations/open_clip.git
cd open_clip
python3.8 -m venv .env
source .env/bin/activate
pip install -U pip
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
pip install -e .
rom1504 / Webdataset_ondisk.py
Created July 8, 2022 16:01
Using webdataset and on disk tags
class WDSDataset(data.IterableDataset):
    def __init__(self, min_size, transform=None, target_transform=None):
        self.min_size = min_size
        self.transform = transform if transform is not None else nn.Identity()
        self.target_transform = target_transform if target_transform is not None else nn.Identity()
        self.kv = OnDiskKV(file='/home/ubuntu/laion5B-watermark-safety-ordered', key_format='q', value_format='ee')
        self.kv_aesthetic = OnDiskKV(file='/home/ubuntu/laion5B-aesthetic-tags-kv', key_format='q', value_format='e')
        self.pwatermark_threshold = 0.8
        self.punsafe_threshold = 0.5
        self.aesthetic_threshold = 5.
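
The preview ends before the thresholds are used, so how the gist applies them is not visible here. One plausible reading, sketched below, is a predicate that keeps a sample only when its watermark and unsafe probabilities fall below their thresholds and its aesthetic score reaches the minimum. The function name `keep_sample` and the direction of each comparison are assumptions.

```python
def keep_sample(pwatermark, punsafe, aesthetic,
                pwatermark_threshold=0.8, punsafe_threshold=0.5,
                aesthetic_threshold=5.0):
    """Return True when a sample passes all three tag thresholds.

    Assumed semantics: low watermark and unsafe probabilities are good,
    a high aesthetic score is good.
    """
    return (
        pwatermark < pwatermark_threshold
        and punsafe < punsafe_threshold
        and aesthetic >= aesthetic_threshold
    )
```

In an iterable dataset such a check would typically run inside `__iter__`, skipping samples for which it returns False.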
rom1504 / distributed_dalle2_laion.md
Last active April 7, 2024 13:16
distributed dalle2 laion
rom1504 / display_aesthetics.py
Created June 9, 2022 01:37
Display aesthetic
import pandas as pd
df = pd.read_parquet("aethetic_multi/0000.parquet")
buckets = [(i, i+1) for i in range(10)]
html= "<h1>Aesthetic subsets in Laion2B-multi</h1>"
for [a,b] in buckets:
    total_part = df[(df["prediction"] >= a) & (df["prediction"] <= b)]
    count_part = len(total_part) / len(df) * 100
rom1504 / bench.txt
Created May 28, 2022 22:14
prompts
A red colored car.
A black colored car.
A pink colored car.
A black colored dog.
A red colored dog.
A blue colored dog.
A green colored banana.
A red colored banana.
A black colored banana.
A white colored sandwich.
rom1504 / dalle_mega_prompts.json
Last active May 14, 2022 15:04
dalle_mega_prompts.json
[
"t-shirt, size M",
"flower dress, size M",
"a t-shirt of an avocado",
"a rainbow hat",
"white snow covered mountain under blue sky during daytime",
"aerial view of the beach during daytime",
"aerial view of the beach at night",
"double rainbow over a lake",
"a beautiful sunset at a beach with a shell on the shore",
rom1504 / monitor_efa_aws.py
Last active July 25, 2022 17:59
monitor_efa_aws.py
from glob import glob
import time
import datetime
def get_read_bytes():
    return sum([int(open(f"{p}/ports/1/hw_counters/rdma_read_bytes", "r").read().strip()) for p in glob("/sys/class/infiniband/*")])
from os.path import expanduser
home = expanduser("~")
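
The rest of the script is cut off in this preview. Since `rdma_read_bytes` is a cumulative counter, a monitor usually needs the difference between two readings divided by the elapsed time. The sketch below shows that arithmetic; `throughput_gbit_per_s` and `measure` are hypothetical helpers, not functions from the gist.

```python
import time


def throughput_gbit_per_s(bytes_before, bytes_after, elapsed_seconds):
    """Convert two cumulative byte-counter readings into gigabits per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (bytes_after - bytes_before) * 8 / elapsed_seconds / 1e9


def measure(read_counter, interval_seconds=1.0):
    """Sample `read_counter` twice, `interval_seconds` apart, and return Gbit/s."""
    before = read_counter()
    start = time.monotonic()
    time.sleep(interval_seconds)
    after = read_counter()
    return throughput_gbit_per_s(before, after, time.monotonic() - start)
```

Passing the gist's `get_read_bytes` as `read_counter`, for example `measure(get_read_bytes)`, would report the aggregate RDMA read rate across the devices found under /sys/class/infiniband.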