
Mehdi Cherti mehdidc

@yrevar
yrevar / imagenet1000_clsidx_to_labels.txt
Last active April 25, 2024 01:57
text: imagenet 1000 class idx to human readable labels (Fox, E., & Guestrin, C. (n.d.). Coursera Machine Learning Specialization.)
{0: 'tench, Tinca tinca',
1: 'goldfish, Carassius auratus',
2: 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
3: 'tiger shark, Galeocerdo cuvieri',
4: 'hammerhead, hammerhead shark',
5: 'electric ray, crampfish, numbfish, torpedo',
6: 'stingray',
7: 'cock',
8: 'hen',
9: 'ostrich, Struthio camelus',
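
A minimal sketch of how such a mapping can be loaded, assuming the full gist file is the single Python dict literal excerpted above (the file name below is the gist's):

import ast

# Parse the dict literal mapping class index -> human-readable label.
with open("imagenet1000_clsidx_to_labels.txt") as f:
    idx_to_label = ast.literal_eval(f.read())

print(idx_to_label[2])  # 'great white shark, white shark, man-eater, ...'
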
@MohamedAlaa
MohamedAlaa / tmux-cheatsheet.markdown
Last active April 24, 2024 12:19
tmux shortcuts & cheatsheet

tmux cheatsheet

As configured in my dotfiles.

start new:

tmux

start new with session name:

tmux new -s myname

@hellerbarde
hellerbarde / latency.markdown
Created May 31, 2012 13:16 — forked from jboner/latency.txt
Latency numbers every programmer should know


L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
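
To get a feel for the relative gaps, here is a small Python sketch (not part of the gist) that rescales the figures above so that an L1 cache reference takes one second:

# Rescale each latency so that an L1 cache reference (0.5 ns) maps to 1 second.
latencies_ns = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 2K bytes over 1 Gbps network": 20_000,
    "SSD random read": 150_000,
    "Read 1 MB sequentially from memory": 250_000,
}

seconds_per_ns = 1.0 / 0.5  # human-scale seconds per nanosecond
for name, ns in latencies_ns.items():
    print(f"{name:<38} {ns * seconds_per_ns:>12,.0f} s (human scale)")
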

@rgreenjr
rgreenjr / postgres_queries_and_commands.sql
Last active April 23, 2024 19:14
Useful PostgreSQL Queries and Commands
-- show running queries (pre 9.2)
SELECT procpid, age(clock_timestamp(), query_start), usename, current_query
FROM pg_stat_activity
WHERE current_query != '<IDLE>' AND current_query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start desc;
-- show running queries (9.2)
SELECT pid, age(clock_timestamp(), query_start), usename, query
FROM pg_stat_activity
WHERE query != '<IDLE>' AND query NOT ILIKE '%pg_stat_activity%'
@karpathy
karpathy / min-char-rnn.py
Last active April 23, 2024 17:55
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
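
The preview cuts off here. As a rough illustration only (not the gist's own code), a character-level setup like this next needs char-to-index mappings and one-hot encoded inputs, along these lines:

import numpy as np

data = "hello world"                      # stand-in for the contents of input.txt
chars = sorted(set(data))
char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for i, ch in enumerate(chars)}

def one_hot(ix, vocab_size):
    # Encode a character index as a one-hot column vector.
    x = np.zeros((vocab_size, 1))
    x[ix] = 1
    return x

print(one_hot(char_to_ix['h'], len(chars)).ravel())
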
@aluo-x
aluo-x / fairscale_demo.py
Created September 6, 2021 22:20
Basic demo of fairscale FSDP & OSS state_dict saving and loading
import torch
import torch.distributed as dist
import torch.nn as nn
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from fairscale.nn.data_parallel import ShardedDataParallel as ShardedDDP
from fairscale.optim.oss import OSS
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP
import os
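
The preview shows only the imports. A minimal sketch of the OSS checkpointing pattern they suggest might look like the following; a CPU/gloo group of world size 1 is assumed purely so the sketch runs standalone, whereas the real demo spawns several ranks with torch.multiprocessing:

import os
import torch
import torch.distributed as dist
import torch.nn as nn
from fairscale.optim.oss import OSS

def main():
    # Single-rank process group so this sketch can run without mp.spawn.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)

    model = nn.Linear(8, 8)
    # OSS shards optimizer state across ranks while exposing a normal optimizer API.
    optimizer = OSS(params=model.parameters(), optim=torch.optim.SGD, lr=0.1)

    loss = model(torch.randn(4, 8)).sum()
    loss.backward()
    optimizer.step()

    # Gather the sharded optimizer state onto rank 0 before checkpointing.
    optimizer.consolidate_state_dict(recipient_rank=0)
    if dist.get_rank() == 0:
        torch.save({"model": model.state_dict(), "optim": optimizer.state_dict()},
                   "checkpoint.pt")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
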
@TengdaHan
TengdaHan / ddp_notes.md
Last active April 22, 2024 00:19
Multi-node-training on slurm with PyTorch


What's this?

  • A simple note on how to start multi-node training with the slurm scheduler and PyTorch.
  • Especially useful when the scheduler is so busy that you cannot get multiple GPUs allocated, or when you need more than 4 GPUs for a single job.
  • Requirement: you have to use PyTorch DistributedDataParallel (DDP) for this purpose (a minimal init sketch follows below).
  • Warning: you might need to refactor your own code.
  • Warning: you might be secretly condemned by your colleagues for using too many GPUs.
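
As a hedged sketch (not taken from the note), a training script can derive its rank and world size from slurm's environment and initialize DDP roughly like this; SLURM_PROCID, SLURM_NTASKS and SLURM_LOCALID are standard slurm exports, while MASTER_ADDR/MASTER_PORT are assumed to be exported by the sbatch script (e.g. set to the first node of the job):

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def init_distributed_from_slurm(backend="nccl"):
    rank = int(os.environ["SLURM_PROCID"])         # global rank of this task
    world_size = int(os.environ["SLURM_NTASKS"])   # total number of tasks in the job
    local_rank = int(os.environ["SLURM_LOCALID"])  # rank of this task within its node

    dist.init_process_group(backend, rank=rank, world_size=world_size)
    torch.cuda.set_device(local_rank)
    return rank, local_rank, world_size

# Typical usage inside the training script:
#   rank, local_rank, _ = init_distributed_from_slurm()
#   model = DDP(model.cuda(local_rank), device_ids=[local_rank])
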
@mingfeima
mingfeima / pytorch_performance_profiling.md
Last active April 21, 2024 16:44
How to do performance profiling on PyTorch

(Internal Training Material)

Usually the first step in performance optimization is profiling, e.g. identifying the performance hotspots of a workload. This gist covers the basics of performance profiling on PyTorch; you will learn:

  • How do I find the bottleneck operator?
  • How do I trace the source file of a particular operator?
  • How do I identify threading issues (oversubscription)?
  • How do I tell whether a specific operator is running efficiently or not?

This tutorial takes one of my recent projects, pssp-transformer, as an example to guide you along the path of PyTorch CPU performance optimization. The focus will be on Part 1 & Part 2.
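
As a minimal sketch (not taken from the gist) of the kind of operator-level CPU profiling discussed here, torch.autograd.profiler can rank the most expensive operators in a forward pass of a toy model:

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
).eval()
x = torch.randn(8, 3, 64, 64)

with torch.no_grad(), torch.autograd.profiler.profile() as prof:
    model(x)

# Sort by total CPU time to see which operators dominate the workload.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))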

@joepie91
joepie91 / vpn.md
Last active April 20, 2024 21:15
Don't use VPN services.


No, seriously, don't. You're probably reading this because you've asked what VPN service to use, and this is the answer.

Note: The content in this post does not apply to using VPNs for their intended purpose, that is, as a virtual private (internal) network. It only applies to using them as a glorified proxy, which is what every third-party "VPN provider" does.

  • A Russian translation of this article can be found here, contributed by Timur Demin.
  • A Turkish translation can be found here, contributed by agyild.
  • There's also this article about VPN services, which is honestly better written (and has more cat pictures!) than my article.