@nschloe
Created January 25, 2022 15:28
perfplot: multi dotproduct comparison
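import numpy as np
import perfplot

# Number of column pairs whose dot products are computed.
k = 20


def setup(n):
    x = np.random.rand(n, k)
    y = np.random.rand(n, k)
    # Contiguous copies of the transposes, so each vector is a contiguous row.
    xt = np.ascontiguousarray(x.T)
    yt = np.ascontiguousarray(y.T)
    return x, y, xt, yt


def dot_diag(x, y, *_):
    # Forms the full k-by-k product and keeps only its diagonal.
    return np.dot(x.T, y).diagonal()


def einsum(x, y, *_):
    return np.einsum("ij,ij->j", x, y)


def dot_for(x, y, *_):
    # Loops over the columns of x and y (non-contiguous views).
    return np.array([np.dot(xx, yy) for xx, yy in zip(x.T, y.T)])


def dot_for_T(_, __, xt, yt):
    # Loops over the rows of the contiguous transposes.
    return np.array([np.dot(xx, yy) for xx, yy in zip(xt, yt)])


def sum_axis(x, y, *_):
    return np.sum(x * y, axis=0)


def sum_axis_T(_, __, xt, yt):
    return np.sum(xt * yt, axis=1)


b = perfplot.bench(
    setup=setup,
    kernels=[dot_diag, einsum, dot_for, dot_for_T, sum_axis, sum_axis_T],
    # Loop variable renamed from k to i so it is not confused with the global k.
    n_range=[2 ** i for i in range(22)],
)
b.save("out.png")
b.show()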
nschloe commented Jan 25, 2022

Simply iterating over the arrays and forming the individual dot products is fastest if the arrays are contiguous, e.g., if the vectors to be multiplied are the rows of a contiguous NumPy array. From about 10 rows on, einsum becomes competitive, too.
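
As a quick sanity check that the six kernels really compute the same column-wise dot products, here is a minimal sketch on small arrays. The sizes `n = 5`, `k = 3` and the seed are arbitrary illustrative choices, not values from the benchmark, and the kernels are inlined rather than called through perfplot.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n, k = 5, 3                     # small illustrative sizes (assumption)

x = rng.random((n, k))
y = rng.random((n, k))
# Contiguous copies of the transposes, as in setup() of the benchmark.
xt = np.ascontiguousarray(x.T)
yt = np.ascontiguousarray(y.T)

results = {
    "dot_diag": np.dot(x.T, y).diagonal(),
    "einsum": np.einsum("ij,ij->j", x, y),
    "dot_for": np.array([np.dot(xx, yy) for xx, yy in zip(x.T, y.T)]),
    "dot_for_T": np.array([np.dot(xx, yy) for xx, yy in zip(xt, yt)]),
    "sum_axis": np.sum(x * y, axis=0),
    "sum_axis_T": np.sum(xt * yt, axis=1),
}

# Compare every variant against the einsum result.
reference = results["einsum"]
all_agree = all(np.allclose(r, reference) for r in results.values())
print(all_agree)

# x.T is only a strided view; xt owns a C-contiguous copy of the same data.
print(x.T.flags["C_CONTIGUOUS"], xt.flags["C_CONTIGUOUS"])
```

The last line shows the difference that the `_T` kernels rely on: `x.T` shares memory with `x` and is not C-contiguous, whereas `xt` is.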

[Image: out.png, the plot saved by the script]
