@bjacob
bjacob / a.diff
Last active March 18, 2024 18:37
diff --git a/home/benoit/iree-build/llvm-project/tools/mlir/include/mlir/IR/BuiltinDialectBytecode.cpp.inc b/tmp/BuiltinDialectBytecode.cpp.inc
index 26072dbe9c8f..015598623db4 100644
--- a/home/benoit/iree-build/llvm-project/tools/mlir/include/mlir/IR/BuiltinDialectBytecode.cpp.inc
+++ b/tmp/BuiltinDialectBytecode.cpp.inc
@@ -355,6 +355,27 @@ static void write(UnknownLoc attribute, DialectBytecodeWriter &writer) {
writer.writeVarInt(/* UnknownLoc */ 15);
}
+namespace {
+struct Logger {
builtin.module @calls attributes {
} {
func.func private @matmul_test.generate_random_matrix(%device: !hal.device, %dim0: i64, %dim1: i64, %element_type: i32, %seed: i32) -> !hal.buffer_view
func.func private @matmul_test.check_matmul_results(%device: !hal.device, %m: i64, %k: i64, %n: i64, %transpose_rhs: i32, %lhs: !hal.buffer_view, %rhs: !hal.buffer_view, %acc: !hal.buffer_view, %actual_result: !hal.buffer_view)
func.func private @module.matmul_accumulate_DYNxDYNxi8_times_DYNxDYNxi8_into_DYNxDYNxi32(%lhs: !hal.buffer_view, %rhs: !hal.buffer_view, %acc: !hal.buffer_view) -> !hal.buffer_view
func.func private @module.matmul_accumulate_1x1xi8_times_1x1xi8_into_1x1xi32(%lhs: !hal.buffer_view, %rhs: !hal.buffer_view, %acc: !hal.buffer_view) -> !hal.buffer_view
func.func private @module.matmul_DYNxDYNxi8_times_DYNxDYNxi8_into_DYNxDYNxi32(%lhs: !hal.buffer_view, %rhs: !hal.buffer_view) -> !hal.buffer_view
@bjacob
bjacob / README.md
Last active March 18, 2024 13:48
Download, compile and run OPT-1.3b on CPU with IREE

Trying to change the ukernels' calling convention back to "default" instead of ParameterStruct.

Problem: the ukernels return void, which triggers an assertion failure in ConvertToLLVM. The cause is a discrepancy in how a void llvm.call is represented: either as returning no results, or as returning one value of type !llvm.void.

Getting this:

iree-compile: /home/benoit/iree/third_party/llvm-project/mlir/lib/IR/PatternMatch.cpp:153: virtual void mlir::RewriterBase::replaceOp(Operation *, ValueRange): Assertion `op->getNumResults() == newValues.size() && "incorrect # of replacement values"' failed.

Thread 7 "llvm-worker-5" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffcf7fe6c0 (LWP 259929)]
@bjacob
bjacob / README.md
Last active February 16, 2024 20:30
Attempt at ukernel fallback to codegen

This documents a short-lived attempt at solving #15784 by implementing the idea laid out in the original issue description. It changes the mmt4d ukernel to return a second value, a status code, and changes the mmt4d-to-ukernel lowering to create an scf.if based on that status code:

%62:2 = iree_codegen.ukernel.generic "iree_uk_mmt4d" ins(%59, %60 : tensor<1x?x16x1xf32>, tensor<1x?x16x1xf32>) outs(%61 : tensor<1x1x16x16xf32>) (%c1, %c1, %dim, %c16_i32, %c16_i32, %c1_i32, %c1281_i32 : index, index, index, i32, i32, i32, i32) fn_def_attrs {hal.import.bitcode = true, hal.import.cconv = 1 : i32, hal.import.fields = ["processor_data"]} strided_outer_dims(1) -> tensor<1x1x16x16xf32>, i32
%63 = arith.cmpi eq, %62#1, %c0_i32 : i32
%64 = scf.if %63 -> (tensor<1x1x16x16xf32>) {
  scf.yield %62#0 : tensor<1x1x16x16xf32>
} else {
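
The preview above stops at the else branch. As a rough illustration of the control flow described in the paragraph (try the ukernel, check the status code, fall back to codegen otherwise), here is a minimal C sketch. All names (`ukernel_try`, `codegen_fallback`, `mmt4d_with_fallback`, the `UK_*` codes) and the "only 16x16 tiles are supported" rule are assumptions made for illustration, not IREE APIs; the only detail taken from the snippet is that a status of 0 means the ukernel result is used.

```c
#include <assert.h>

/* Hypothetical status codes; 0 means "ukernel handled it", as in the snippet. */
enum { UK_OK = 0, UK_UNSUPPORTED = 1 };

/* Shared reference loops: out[i][j] += sum over k of lhs[k][i] * rhs[k][j]. */
static void reference_tile(float* out, const float* lhs, const float* rhs,
                           int M0, int N0, int K) {
  for (int k = 0; k < K; ++k)
    for (int i = 0; i < M0; ++i)
      for (int j = 0; j < N0; ++j)
        out[i * N0 + j] += lhs[k * M0 + i] * rhs[k * N0 + j];
}

/* Stand-in for the ukernel: it declines shapes it has no fast path for
   (the 16x16 restriction is an assumption) and leaves `out` untouched. */
int ukernel_try(float* out, const float* lhs, const float* rhs,
                int M0, int N0, int K) {
  if (M0 != 16 || N0 != 16) return UK_UNSUPPORTED;
  reference_tile(out, lhs, rhs, M0, N0, K);
  return UK_OK;
}

/* Stand-in for the code the compiler would generate in the else branch. */
void codegen_fallback(float* out, const float* lhs, const float* rhs,
                      int M0, int N0, int K) {
  reference_tile(out, lhs, rhs, M0, N0, K);
}

/* Mirrors the scf.if: keep the ukernel result when status == 0,
   otherwise run the fallback. Returns the ukernel status for inspection. */
int mmt4d_with_fallback(float* out, const float* lhs, const float* rhs,
                        int M0, int N0, int K) {
  assert(M0 > 0 && N0 > 0 && K >= 0);
  int status = ukernel_try(out, lhs, rhs, M0, N0, K);
  if (status != UK_OK) codegen_fallback(out, lhs, rhs, M0, N0, K);
  return status;
}
```

In this sketch a 2x2 tile is rejected by `ukernel_try`, so `mmt4d_with_fallback` computes it through `codegen_fallback`, while a 16x16 tile stays on the ukernel path.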
@bjacob
bjacob / README.md
Created January 29, 2024 22:04
Putting the "LLVM loop unrolling for ukernel bitcode" idea to rest

Problem statement

Microkernels come in variants for different M0 tile sizes, such as M0={1,2,4,8,16}, and sometimes for a few other similar parameters.

We need to generate microkernel code for each such variant, with the fully unrollable for loops actually unrolled in each one.

Currently this is done in the microkernel source code, at the price of some boilerplate in the source and inflated bitcode embedded into iree-compile. For instance, here is how we generate 5 tile-functions differing only in the M0-value: https://github.com/openxla/iree/blob/1c83020136b9d3d56da692036e5bbcb2b4586ebf/runtime/src/iree/builtins/ukernel/arch/x86_64/mmt4d_x86_64_avx512_base.c#L12-L61
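
To make the source-level approach concrete, here is a minimal C sketch of the general pattern: one always-inlined generic function takes M0 as an argument, and a macro stamps out one wrapper per M0 value, so that each wrapper sees M0 as a compile-time constant and its loops become fully unrollable. This is not the code from the linked file; the names (`tile_generic`, `DEFINE_TILE_FUNC`, `tile_m0_*`), the scalar f32 arithmetic, the fixed N0=16 and the K0=1 tile layout are all assumptions chosen for illustration.

```c
#include <assert.h>

/* Hypothetical tile width along N; real kernels pick it per SIMD ISA. */
enum { N0 = 16, MAX_M0 = 16 };

/* Generic tile function. M0 is an ordinary argument, but since this function
   is force-inlined (GCC/Clang attribute) into wrappers that pass a literal,
   the compiler sees constant trip counts and may fully unroll the i-loops.
   Assumed layouts: lhs is K x M0, rhs is K x N0, out is M0 x N0. */
static inline __attribute__((always_inline)) void tile_generic(
    float* out, const float* lhs, const float* rhs, int K, int M0) {
  assert(M0 >= 1 && M0 <= MAX_M0);
  float acc[MAX_M0][N0];
  /* Load the accumulator tile. */
  for (int i = 0; i < M0; ++i)
    for (int j = 0; j < N0; ++j) acc[i][j] = out[i * N0 + j];
  /* Accumulate one outer product per k step. */
  for (int k = 0; k < K; ++k)
    for (int i = 0; i < M0; ++i)
      for (int j = 0; j < N0; ++j)
        acc[i][j] += lhs[k * M0 + i] * rhs[k * N0 + j];
  /* Store the accumulator tile back. */
  for (int i = 0; i < M0; ++i)
    for (int j = 0; j < N0; ++j) out[i * N0 + j] = acc[i][j];
}

/* The boilerplate: one exported wrapper per M0 value. */
#define DEFINE_TILE_FUNC(M0)                                          \
  void tile_m0_##M0(float* out, const float* lhs, const float* rhs,   \
                    int K) {                                          \
    tile_generic(out, lhs, rhs, K, M0);                               \
  }

DEFINE_TILE_FUNC(1)
DEFINE_TILE_FUNC(2)
DEFINE_TILE_FUNC(4)
DEFINE_TILE_FUNC(8)
DEFINE_TILE_FUNC(16)
```

Each `DEFINE_TILE_FUNC` line yields a separate compiled copy of the loop nest, which is where both the source boilerplate and the extra bitcode come from.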

@bjacob
bjacob / README.md
Last active June 5, 2024 16:35
IREE / MLIR / Linalg tutorial

Introduction

This tutorial is simultaneously about IREE, MLIR, and specifically the MLIR Linalg dialect.

What is MLIR?

MLIR is a programming language, but MLIR in itself is almost an empty shell. What it really provides is a framework for defining MLIR dialects, which are where the features come from.

@bjacob
bjacob / README.md
Last active January 23, 2024 15:55
%%{ init: {"theme": "neutral" } }%%
graph TD;
matmulontensors-- CPUMaterializeEncoding -->mmt4dontensors;
mmt4dontensors-- CPULowerToUKernels -->ukernelontensors;
ukernelontensors-- IREEComprehensiveBufferize -->ukernelonmemref;
ukernelonmemref-- LowerUKernelOpsToCalls -->ukernelcall;
ukernelcall-- ConvertToLLVM -->codegenll;
codegenll-->bitcodelinking;
genericsource-- clang -emit-llvm --> genericbitcode -- llvm-link --> ukernelbitcode;
@bjacob
bjacob / README.md
Last active January 22, 2024 20:11
%%{
  init: {
    'theme': 'base',
    'themeVariables': {
      'primaryColor': '#BB2528',
      'primaryTextColor': '#fff',
      'primaryBorderColor': '#7C0000',
      'lineColor': '#F8B229',
      'secondaryColor': '#006100',
@bjacob
bjacob / README.md
Last active February 29, 2024 17:18
Exploring IREE CPU microkernels on a simple matmul example

Basic setup, command lines

Source file: matmul.mlir:

func.func @matmul_dynamic(%lhs: tensor<?x?xf32>, %rhs: tensor<?x?xf32>, %acc: tensor<?x?xf32>) -> tensor<?x?xf32> {
  %result = linalg.matmul ins(%lhs, %rhs: tensor<?x?xf32>, tensor<?x?xf32>) outs(%acc: tensor<?x?xf32>) -> tensor<?x?xf32>
  return %result: tensor<?x?xf32>