
fxmarty

import torch
import torch.nn.functional as F
def to_float8(x, dtype=torch.float8_e4m3fn):
    finfo = torch.finfo(dtype)
    # Calculate the scale as dtype max divided by absmax
    scale = finfo.max / x.abs().max().clamp(min=1e-12)
    # scale and clamp the tensor to bring it to
    # the representable range of the float8 data type
    # (as the default cast is unsaturated)
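The preview above is cut off before the scaling and clamping step. The arithmetic its comments describe can be sketched in plain Python, without PyTorch. This is an illustration only, not the rest of the gist: the helper name `scale_and_clamp` is hypothetical, the input is assumed to be a non-empty list of floats, and 448.0 is used as the largest finite value of `float8_e4m3fn`. No actual float8 cast is performed.

```python
# Largest finite value representable in float8_e4m3fn.
E4M3FN_MAX = 448.0


def scale_and_clamp(values, fmax=E4M3FN_MAX, eps=1e-12):
    """Hypothetical helper: scale values so that absmax maps to fmax, then clamp.

    Returns the scaled values and the inverse scale needed to undo the scaling.
    Assumes `values` is a non-empty list of floats.
    """
    # Guard against division by zero, mirroring clamp(min=1e-12) in the preview.
    absmax = max(max(abs(v) for v in values), eps)
    scale = fmax / absmax
    # Saturate explicitly, since an unsaturated cast would not do it for us.
    scaled = [min(max(v * scale, -fmax), fmax) for v in values]
    return scaled, 1.0 / scale


scaled, inv_scale = scale_and_clamp([0.5, -2.0, 1.0])
print(scaled)  # [112.0, -448.0, 224.0]
```

Multiplying the scaled values by `inv_scale` recovers the original magnitudes, which is why a quantization helper of this kind typically returns the inverse scale alongside the scaled data.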
@mingfeima
mingfeima / part_1_memory_format_and_channels_last_optimization.md
Last active May 6, 2024 05:46
PyTorch CPU Performance Optimization Tutorial - Section I
@mcarilli
mcarilli / nsight.sh
Last active May 8, 2024 18:11
Favorite Nsight Systems profiling commands for PyTorch scripts
# This isn't supposed to run as a bash script; I named it with ".sh" for syntax highlighting.
# https://developer.nvidia.com/nsight-systems
# https://docs.nvidia.com/nsight-systems/profiling/index.html
# My preferred nsys (command line executable used to create profiles) commands
#
# In your script, write
# torch.cuda.nvtx.range_push("region name")
# ...
@gonzaloplaza
gonzaloplaza / aws_ec2_ubuntu_userdata_docker.sh
Last active April 18, 2024 05:29
Script to automatically install Docker (latest version) on an AWS EC2/Ubuntu instance at launch time via User Data
#!/bin/bash
# Install docker
apt-get update
apt-get install -y cloud-utils apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
apt-get update
@barosl
barosl / add.c
Created July 26, 2015 07:26
Function overloading in C
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int addi(int a, int b) {
    return a + b;
}
char *adds(char *a, char *b) {
    char *res = malloc(strlen(a) + strlen(b) + 1);