arghdos/test.py

Created Jun 2, 2017
broken leafs
import loopy as lp
import pyopencl as cl
import numpy as np
from loopy.kernel.data import temp_var_scope as scopes

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

limit_max = 10
k_lim = 1000
l_lim = 100

ref_knl = lp.make_kernel([
    '{{[k]: 0 <= k < {}}}'.format(k_lim),
    '{{[l]: 0 <= l < {}}}'.format(l_lim),
    '{[i]: start <= i < end}',
    '{[j]: start <= j < end}'],
    """
    for k
        for l
            <> start = limits[k, l]
            <> end = limits[k + 1, l]
            out[k, l] = 1
            for j
                for i
                    out[k, l] = out[k, l] * a[i, j]
                end
            end
        end
    end
    """,
    [lp.GlobalArg('a', shape=(limit_max, limit_max), dtype=np.float64),
     lp.TemporaryVariable('limits', shape=(k_lim * 2, l_lim), dtype=np.int64,
                          initializer=np.random.randint(0, high=limit_max, size=(k_lim * 2, l_lim)),
                          scope=scopes.PRIVATE, read_only=True),
     lp.GlobalArg('out', shape=(k_lim, l_lim))]
)
print(lp.generate_code(ref_knl)[0])

arghdos commented Jun 2, 2017

Generates the error:

loopy.diagnostic.LoopyError: sanity check failed--implemented and desired domain for instruction 'insn_2' do not match

implemented: [end, start] -> { [k, l, j, i] : 0 <= k <= 999 and 0 <= l <= 99 and start <= j < end and start <= i < end }

desired:[end, start] -> { [k, l, j, i] : 0 <= k <= 999 and 0 <= l <= 99 and start <= j < end }

sample point in desired but not implemented: start=1, end=2, i=0, k=0, j=1, l=0
gist of constraints in implemented but not desired: [end, start] -> { [k, l, j, i] : start <= i < end }
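
To make the message easier to read, here is a small plain-Python sketch (no loopy or islpy involved) that encodes the two sets printed above as predicates and checks the reported sample point. The function names `in_desired` and `in_implemented` are made up for this illustration; the constraints are copied from the error text. It only restates what the error says and is not a diagnosis of the cause.

```python
def in_desired(p):
    """'desired' set from the error text: note that it places no bound on i."""
    return (0 <= p["k"] <= 999 and 0 <= p["l"] <= 99
            and p["start"] <= p["j"] < p["end"])


def in_implemented(p):
    """'implemented' set from the error text: same bounds plus start <= i < end."""
    return in_desired(p) and p["start"] <= p["i"] < p["end"]


# Sample point reported by the sanity check.
sample = {"start": 1, "end": 2, "i": 0, "k": 0, "j": 1, "l": 0}
print(in_desired(sample), in_implemented(sample))  # prints: True False
```

The point fails only the `start <= i < end` test (i=0 with start=1), which is exactly the constraint listed in the "gist of constraints" line.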

arghdos commented Jun 2, 2017

Changing the domain order to

[
    '{[i]: start <= i < end}',
    '{[j]: start <= j < end}',
    '{{[k]: 0 <= k < {}}}'.format(k_lim),
    '{{[l]: 0 <= l < {}}}'.format(l_lim)]

generates the expected code:

for (int l = 0; l <= 99; ++l)
  for (int k = 0; k <= 999; ++k)
  {
    out[100 * k + l] = 1.0;
    end = limits[100 * (1 + k) + l];
    start = limits[100 * k + l];
    for (int j = start; j <= -1 + end; ++j)
      for (int i = start; i <= -1 + end; ++i)
        out[100 * k + l] = out[100 * k + l] * a[10 * i + j];
  }
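
For checking results by hand, the loop nest above can be transliterated into plain Python. This is only a sketch: `reference_out` is a hypothetical helper, it uses nested lists instead of the flattened arrays in the generated code, and the tiny inputs below are made up rather than taken from the script.

```python
def reference_out(limits, a, k_lim, l_lim):
    """Plain-Python version of the generated loop nest (hypothetical helper)."""
    out = [[1.0] * l_lim for _ in range(k_lim)]
    for l in range(l_lim):
        for k in range(k_lim):
            start = limits[k][l]
            end = limits[k + 1][l]
            # Both inner loops run over [start, end); an empty range leaves out[k][l] at 1.
            for j in range(start, end):
                for i in range(start, end):
                    out[k][l] *= a[i][j]
    return out


# Made-up inputs: 2 x 1 output, 3 x 3 'a', limits with one extra row for k + 1.
limits = [[0], [2], [1]]
a = [[2.0, 3.0, 1.0],
     [5.0, 7.0, 1.0],
     [1.0, 1.0, 1.0]]
out = reference_out(limits, a, k_lim=2, l_lim=1)
print(out)  # prints: [[210.0], [1.0]]
```

For k=0 the range is [0, 2), so the result is 2 * 5 * 3 * 7 = 210; for k=1 start exceeds end, the loops do not execute, and the entry stays at 1.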