@AnasKAN
Last active June 24, 2025 17:49
Materials for AI production acceleration

Open Neural Network Exchange (ONNX)

Background

ONNX (onnx.ai) is an open format built to represent machine learning models. ONNX defines a common set of operators (the building blocks of machine learning and deep learning models) and a common file format, so AI developers can use models with a variety of frameworks, tools, runtimes, and compilers.
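To make the "common operators plus common file format" idea concrete, here is a toy sketch in plain Python. It does not use the real ONNX library or file format: the `OPERATORS` table, `export_model`, and `run_model` are hypothetical names invented for illustration, and JSON stands in for the actual serialization. Only the operator names (`Mul`, `Add`, `Relu`) are borrowed from ONNX.

```python
import json

# Hypothetical shared operator set: any "runtime" that implements these
# names can execute any model that is expressed with them.
OPERATORS = {
    "Mul": lambda a, b: a * b,
    "Add": lambda a, b: a + b,
    "Relu": lambda a: max(a, 0.0),
}

def export_model(nodes):
    """Serialize the graph to a framework-neutral string (JSON stands in for a real model file)."""
    return json.dumps({"nodes": nodes})

def run_model(serialized, feeds):
    """A minimal 'runtime': load the graph and evaluate the nodes in order."""
    values = dict(feeds)
    for node in json.loads(serialized)["nodes"]:
        args = [values[name] for name in node["inputs"]]
        values[node["output"]] = OPERATORS[node["op"]](*args)
    return values

# y = Relu(x * w + b), written as a list of operator nodes.
nodes = [
    {"op": "Mul", "inputs": ["x", "w"], "output": "xw"},
    {"op": "Add", "inputs": ["xw", "b"], "output": "z"},
    {"op": "Relu", "inputs": ["z"], "output": "y"},
]

serialized = export_model(nodes)
print(run_model(serialized, {"x": 2.0, "w": 3.0, "b": 1.0})["y"])
```

The point of the sketch is the separation: the code that builds `nodes` and the code that runs them only need to agree on the operator names and the serialized layout, which is the role ONNX plays between training frameworks and inference runtimes.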

Intro to ONNX

Docs (very useful)

CUDA Kernels Low-level ML materials

CUDA acceleration background

This section lists resources for learning GPU acceleration techniques for PyTorch, TensorFlow, and JAX, which can speed up fine-tuning by up to 33-50%. Instead of waiting 12 hours for training to finish, you can trim it to around 6 hours, cutting costs and saving time. You'll learn Triton, JAX, Flash Attention, and much more.
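A quick back-of-the-envelope check of those numbers, assuming the 33-50% figure means a reduction in wall-clock training time (that reading is an assumption; the helper names below are made up for illustration):

```python
def trimmed_hours(baseline_hours, reduction):
    """Wall-clock time left after cutting `reduction` (a fraction in [0, 1)) off the baseline."""
    return baseline_hours * (1.0 - reduction)

def throughput_speedup(reduction):
    """Equivalent throughput multiplier for a given time reduction."""
    return 1.0 / (1.0 - reduction)

baseline = 12.0  # hours, the example run from the paragraph above
for reduction in (0.33, 0.50):
    hours = trimmed_hours(baseline, reduction)
    speedup = throughput_speedup(reduction)
    print(f"{reduction:.0%} less time: {hours:.2f} h ({speedup:.2f}x throughput)")
```

Under that reading, the 12-hour run lands at roughly 8 hours at the low end and 6 hours at the high end, so the "12 hours to 6 hours" example corresponds to the best case.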

The GPUMODE channel has many lectures that teach you everything you need, literally from the ground up. Their Discord server is here: GPUMODE discord

Unsloth is one of a kind. Have you heard of bug bounty hunters in the cybersecurity field? These guys are LLM bug hunters: they work non-stop to accelerate LLM inference and fine-tuning, and lately they have branched out to multimodal models, including TTS models and VLMs!

Docs

Valuable lecture

To accelerate diffusion model fine-tuning:

dreambooth

Efficient Finetuning

ML in production

Lightning # it pairs nicely with Hydra, by the way
