Skip to content

Instantly share code, notes, and snippets.

@minstrel271
minstrel271 / Matrix.md
Created May 22, 2018 23:57 — forked from nadavrot/Matrix.md
Efficient matrix multiplication

High-Performance Matrix Multiplication

This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of the Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).

Intro

Matrix multiplication is a mathematical operation that defines the product of

static INPUT: &str = "179 2358 5197 867 163 4418 3135 5049 187 166 4682 5080 5541 172 4294 1397
2637 136 3222 591 2593 1982 4506 195 4396 3741 2373 157 4533 3864 4159 142
1049 1163 1128 193 1008 142 169 168 165 310 1054 104 1100 761 406 173
200 53 222 227 218 51 188 45 98 194 189 42 50 105 46 176
299 2521 216 2080 2068 2681 2376 220 1339 244 605 1598 2161 822 387 268
1043 1409 637 1560 970 69 832 87 78 1391 1558 75 1643 655 1398 1193
90 649 858 2496 1555 2618 2302 119 2675 131 1816 2356 2480 603 65 128
2461 5099 168 4468 5371 2076 223 1178 194 5639 890 5575 1258 5591 6125 226
204 205 2797 2452 2568 2777 1542 1586 241 836 3202 2495 197 2960 240 2880
560 96 336 627 546 241 191 94 368 528 298 78 76 123 240 563
fn func(seq: &[u8]) -> u64 {
let n = seq.len();
let mut sum = 0u64;
for i in 0..n {
let j = (i + 1) % n;
if seq[i] == seq[j] {
sum += seq[i] as u64
}
}
fn func(seq: &[u8]) -> u64 {
let n = seq.len();
let mut sum = 0u64;
for i in 0..n {
let j = (i + 1) % n;
if seq[i] == seq[j] {
sum += seq[i] as u64
}
}
from bisect import bisect_left
from collections import Mapping
from functools import total_ordering
@total_ordering
class frozendict(Mapping):
"""
Sorted immutable mapping with hash and total ordering.
from string import ascii_letters, digits
def checker(c, good=set('-_.() '+ascii_letters+digits)):
return c in good
def valid_filename(filename):
return ''.join(filter(checker, filename)).strip()
import pickle
from itertools import count
from collections import Mapping
class PickleRecorder:
def __init__(self, path):
self.path = path
from itertools import product
def named_product(**kwargs):
"""
Cartesian product of input iterables with keys.
Due to how python handle kwargs, the order of the keys is not conserved.
>>> dict(x=1, y='a') in list(named_product(x=[1, 2], y=['a', 'b']))
True
"""
from collections import Iterable, Mapping
from operator import methodcaller
def flatten(it, map_iter='values'):
"""
Iterator that walk through nested iterable.
If the initial item is not iterable, only itself will be yielded.
Args:
import os
def ensure_dir(path):
d = os.path.dirname(path)
if not os.path.exists(d):
os.makedirs(d)