Analysis of parallel matrix-matrix multiplication DAG

Target summary

size(A) = (256, 256)

work: 1.6 ms (single-thread run-time T₁)
span: 109 μs (theoretical fastest run-time T∞)
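
The two figures above can be combined using the standard work/span laws. The sketch below is an illustration only, not part of the original analysis: the function names are hypothetical, and the worker counts in the usage note are arbitrary examples. It computes the available parallelism T₁/T∞ (about 14.7 here), the lower bound max(T₁/P, T∞) on the run-time with P workers, and the upper bound T₁/P + T∞ that holds for a greedy scheduler.

```python
# Work and span copied from the summary above, both expressed in microseconds.
WORK_US = 1600.0  # T1: 1.6 ms single-thread run-time
SPAN_US = 109.0   # T_inf: 109 us critical-path length


def parallelism(work, span):
    """Average available parallelism, T1 / T_inf."""
    return work / span


def lower_bound(work, span, workers):
    """Work law and span law combined: T_P >= max(T1 / P, T_inf)."""
    return max(work / workers, span)


def greedy_upper_bound(work, span, workers):
    """Bound for a greedy scheduler: T_P <= T1 / P + T_inf."""
    return work / workers + span
```

For example, with 4 workers `lower_bound(WORK_US, SPAN_US, 4)` is limited by the work term, while with 64 workers it is limited by the span, since 64 exceeds the available parallelism.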
llvm.patch
diff --git a/llvm/lib/CodeGen/DwarfEHPrepare.cpp b/llvm/lib/CodeGen/DwarfEHPrepare.cpp
index 5ca1e91cc5f4..fde7b942665d 100644
--- a/llvm/lib/CodeGen/DwarfEHPrepare.cpp
+++ b/llvm/lib/CodeGen/DwarfEHPrepare.cpp
@@ -1,350 +1,355 @@
//===- DwarfEHPrepare - Prepare exception handling for code generation ----===//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
2021-09-29-201752.txt
julia> Threads.nthreads()
julia> suite = SyncBarriersBenchmarks.BenchUniformLoops.setup();
julia> results = run(suite["dissemination"]["spin"])
2-element BenchmarkTools.BenchmarkGroup:
A script to locate libpython associated with the given Python executable.
# This file is machine-generated - editing it directly is not advised
deps = ["LinearAlgebra"]
git-tree-sha1 = "485ee0867925449198280d4af84bdb46a2a404d0"
uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c"
version = "1.0.1"
deps = ["LinearAlgebra"]