nadavrot / Matrix.md
Last active May 5, 2024 08:37
Efficient matrix multiplication

High-Performance Matrix Multiplication

This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of the Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).

Intro

Matrix multiplication is a mathematical operation that defines the product of