Mehdi Cherti mehdidc

@mehdidc
mehdidc / pytorch_performance_profiling.md
Created February 16, 2023 08:44 — forked from mingfeima/pytorch_performance_profiling.md
How to do performance profiling on PyTorch

(Internal Training Material)

Profiling is usually the first step in performance optimization: it identifies the performance hotspots of a workload. This gist covers the basics of performance profiling on PyTorch and answers the following questions:

  • How to find the bottleneck operator?
  • How to trace source file of a particular operator?
  • How do I identify threading issues (oversubscription)?
  • How do I tell whether a specific operator is running efficiently?

This tutorial uses one of my recent projects, pssp-transformer, as an example to guide you through PyTorch CPU performance optimization. The focus is on Part 1 and Part 2.
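As a rough illustration of the first question (finding the bottleneck operator), the sketch below profiles a few forward passes on CPU with `torch.autograd.profiler` and prints the operators sorted by self CPU time. The toy model, tensor shapes, and iteration count are assumptions made for the example; they are not taken from pssp-transformer.

```python
import torch
from torch.autograd import profiler

# Hypothetical toy model standing in for the real workload.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
)
model.eval()
x = torch.randn(64, 256)

with torch.no_grad():
    model(x)  # warm-up run so one-time setup cost is not profiled
    with profiler.profile() as prof:
        for _ in range(10):
            model(x)

# Aggregate events by operator name and sort by time spent in the
# operator itself (excluding time spent in the operators it calls).
events = prof.key_averages()
report = events.table(sort_by="self_cpu_time_total", row_limit=10)
print(report)

# The intra-op thread count is a first thing to check when
# oversubscription is suspected.
num_threads = torch.get_num_threads()
print("intra-op threads:", num_threads)
```

The operator at the top of the table is the first candidate for a closer look. If the reported thread count exceeds the number of physical cores available to the process, oversubscription is worth investigating.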

model_fullname,model_fullname_pretty,model_arch,samples_seen,gmacs_per_sample,gmacs_total,upstream_dataset,downstream_dataset,acc1,acc5,mean_per_class_recall,image_retrieval_recall@5,text_retrieval_recall@5
ViT-g-14 /fsx/rom1504/open_clip/good_models/g_90.pt,g/14 2B,ViT-g-14,12208147020,290.74,3549396664594.8003,LAION-2B,vtab+,0.5654112282297443,0.8329414582676622,0.56279878057792,,
ViT-g-14 /fsx/rom1504/open_clip/good_models/g_90.pt,g/14 2B,ViT-g-14,12208147020,290.74,3549396664594.8003,LAION-2B,vtab/caltech101,0.8522353714661407,0.963346482577252,0.944284654839904,,
ViT-g-14 /fsx/rom1504/open_clip/good_models/g_90.pt,g/14 2B,ViT-g-14,12208147020,290.74,3549396664594.8003,LAION-2B,imagenet1k,0.76664,0.9485,0.76656,,
ViT-g-14 /fsx/rom1504/open_clip/good_models/g_90.pt,g/14 2B,ViT-g-14,12208147020,290.74,3549396664594.8003,LAION-2B,vtab/cifar100,0.8391,0.9729,0.8388,,
ViT-g-14 /fsx/rom1504/open_clip/good_models/g_90.pt,g/14 2B,ViT-g-14,12208147020,290.74,3549396664594.8003,LAION-2B,imagenetv2,0.6961,0.9086,0.6
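In the complete rows above, the `gmacs_total` column appears to equal `samples_seen` multiplied by `gmacs_per_sample`. The sketch below checks that relation with the standard `csv` module. The embedded text is a trimmed copy of three complete rows with only a subset of the columns, an arrangement chosen for the example rather than the layout of the original file.

```python
import csv
import io
import math

# Trimmed copy of three complete rows (subset of columns, for illustration).
CSV_TEXT = """model_arch,samples_seen,gmacs_per_sample,gmacs_total,downstream_dataset,acc1
ViT-g-14,12208147020,290.74,3549396664594.8003,vtab/caltech101,0.8522353714661407
ViT-g-14,12208147020,290.74,3549396664594.8003,imagenet1k,0.76664
ViT-g-14,12208147020,290.74,3549396664594.8003,vtab/cifar100,0.8391
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Recompute the total compute for each row and compare it to the stored value.
consistent = all(
    math.isclose(
        int(row["samples_seen"]) * float(row["gmacs_per_sample"]),
        float(row["gmacs_total"]),
        rel_tol=1e-9,
    )
    for row in rows
)
print("gmacs_total consistent:", consistent)

# Downstream dataset with the highest top-1 accuracy among these rows.
best = max(rows, key=lambda row: float(row["acc1"]))
print(best["downstream_dataset"], best["acc1"])
```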
mehdidc / example.sbatch
Created September 27, 2022 10:17
Content of the files
#!/bin/bash -x
#SBATCH --account=cstdl
#SBATCH --nodes=8
#SBATCH --gres=gpu:4
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=12
#SBATCH --wait-all-nodes=1
#SBATCH --time=00:30:00
#SBATCH --partition=batch
#SBATCH --job-name=open_clip
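The directives above fix the shape of the job. A small sketch of the resulting arithmetic follows. The variable names are hypothetical, and the one-GPU-per-task mapping is an assumption suggested by the matching `--gres=gpu:4` and `--ntasks-per-node=4` values, not something the script states.

```python
# Values copied from the #SBATCH directives above.
nodes = 8
gpus_per_node = 4      # --gres=gpu:4
tasks_per_node = 4     # --ntasks-per-node=4
cpus_per_task = 12     # --cpus-per-task=12

# Total number of tasks (processes) launched across the job.
total_tasks = nodes * tasks_per_node

# Total GPUs allocated, and GPUs available per task on each node.
total_gpus = nodes * gpus_per_node
gpus_per_task = gpus_per_node // tasks_per_node  # assumes an even split

# CPU cores requested on each node.
cpus_per_node = tasks_per_node * cpus_per_task

print(f"tasks={total_tasks} gpus={total_gpus} "
      f"gpus_per_task={gpus_per_task} cpus_per_node={cpus_per_node}")
```

Under these assumptions the job runs 32 tasks on 32 GPUs, one GPU per task, and requests 48 CPU cores on each node.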
{"info": {"description": "COCO 2014 Dataset", "url": "http://cocodataset.org", "version": "1.0", "year": 2014, "contributor": "COCO Consortium", "date_created": "2017/09/01"}, "images": [{"license": 3, "file_name": "COCO_val2014_000000391895.jpg", "coco_url": "http://images.cocodataset.org/val2014/COCO_val2014_000000391895.jpg", "height": 360, "width": 640, "date_captured": "2013-11-14 11:18:45", "flickr_url": "http://farm9.staticflickr.com/8186/8119368305_4e622c8349_z.jpg", "id": 391895}, {"license": 4, "file_name": "COCO_val2014_000000060623.jpg", "coco_url": "http://images.cocodataset.org/val2014/COCO_val2014_000000060623.jpg", "height": 427, "width": 640, "date_captured": "2013-11-14 17:24:15", "flickr_url": "http://farm7.staticflickr.com/6080/6113512699_37b4c98473_z.jpg", "id": 60623}, {"license": 3, "file_name": "COCO_val2014_000000483108.jpg", "coco_url": "http://images.cocodataset.org/val2014/COCO_val2014_000000483108.jpg", "height": 640, "width": 428, "date_captured": "2013-11-14 18:27:53", "flickr_u
import argparse

import matplotlib as mpl
mpl.use('Agg')  # select the non-interactive backend before importing pyplot
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
def plot_scaling_and_efficiency(df):
"""
mehdidc / config.yaml
Created September 24, 2021 05:32
Config example with diversity
lr: 0.001
epochs: 10
noise_dim: 128
dim: 128
depth: 8
vq_image_size: 16
dropout: 0
cutn: 16
batch_size: 2
repeat: 8
https://i.pinimg.com/736x/66/01/6c/66016c3ba27c0e04f39e2bd81a934e3e--anita-ekberg-bob-hope.jpg
http://www.standard.net/image/2015/02/04/800x_a16-9_b0_q81_p1/winter-fly-fishing.jpg
http://indianapolis-photos.funcityfinder.com/files/2009/12/Clearwater-Crossing-Shopping-Center-sign-Indianapolis-Indiana.jpg
http://www.abc.net.au/news/image/9066492-3x2-700x467.jpg
https://www.featurepics.com/StockImage/20090316/carrying-globe-stock-image-1115085.jpg
http://i.dailymail.co.uk/i/pix/2014/11/05/1415187324676_wps_31_Home_is_a_little_Deer_Ivy.jpg
http://www.waste360.com/sites/waste360.com/files/styles/article_featured_standard/public/Trista%2002%20007_0.jpg?itok=F1eJZsX3
https://media.gettyimages.com/photos/young-woman-seated-on-the-beach-picture-id97545987?s=612x612
https://worldjourneysdiscover.files.wordpress.com/2014/07/kyoto-07.jpg?w=860&h=645
http://piquemagazine.uk/wp-content/uploads/2017/10/LPO-24-Feb-Albrecht-Menzel-%C2%AE-Anne-Hornemann-300dpi.jpg
import os
from imageio import imread
import pandas as pd
import lmdb
from caffe2.proto import caffe2_pb2
# Folder should contain a set of images
image_folder = "flickr30k_images"
# CSV should contain image filenames with corresponding captions
dataframe_path = "flickr30k_images/results.csv"
#!/usr/bin/env bash
set -e
cd
case "$OSTYPE" in
darwin*) DOWNLOAD=https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh ;;
linux*) DOWNLOAD=https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh ;;
*) echo "unknown: $OSTYPE" >&2; exit 1 ;;
esac