merrymercy/feature.md

## feature.md

      
Features

Loop-based Feature

Each IterVar (i.e., each loop axis) has three kinds of features:

axis attribute
arithmetic feature
touch feature

Axis Attribute

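The axis attribute describes the basic properties of an axis.
It is a tuple(length, nest_level, topdown, bottomup, annotation_type)

length is int, the extent of the loop
nest_level is int, the depth of the loop in the loop nest
topdown is int. topdown(x) = np.prod([a.length for a in x and its outer loops])
bottomup is int. bottomup(x) = np.prod([a.length for a in x and its inner loops])
annotation_type is enum {BLOCK_X, BLOCK_Y, BLOCK_Z, THREAD_X, THREAD_Y, THREAD_Z, PARALLEL, UNROLL, VECTORIZE, SERIAL},
stored in one-hot encoding

Both products include x itself, as the GEMM output in the API section shows: the loop at nest_level 1 has topdown = 128, and the loop at nest_level 3 has bottomup = 128.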
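As a rough illustration of the definitions above, the following sketch computes the axis attribute for every loop of a simple, perfectly nested loop nest. The helper name `axis_attributes` and the `(name, length, annotation)` input format are made up for this example and are not part of the feature API. It assumes that nest_level starts at 1 for the outermost loop and that both products include the axis itself, which is what the GEMM output in the API section suggests.

```python
import math

# Same order as the annotation_type enum in the text
ANNOTATIONS = ["BLOCK_X", "BLOCK_Y", "BLOCK_Z", "THREAD_X", "THREAD_Y",
               "THREAD_Z", "PARALLEL", "UNROLL", "VECTORIZE", "SERIAL"]


def axis_attributes(loops):
    """Hypothetical helper. loops is a list of (name, length, annotation), outermost first."""
    lengths = [length for _, length, _ in loops]
    rows = {}
    for i, (name, length, annotation) in enumerate(loops):
        topdown = math.prod(lengths[:i + 1])   # this axis and all outer axes
        bottomup = math.prod(lengths[i:])      # this axis and all inner axes
        one_hot = [int(a == annotation) for a in ANNOTATIONS]
        # Assumption: nest_level is 1-based, outermost loop first
        rows[name] = (length, i + 1, topdown, bottomup, *one_hot)
    return rows


rows = axis_attributes([("outer", 128, "SERIAL"),
                        ("middle", 128, "SERIAL"),
                        ("inner", 128, "SERIAL")])
for name, attr in rows.items():
    print("%-8s %s" % (name, list(attr)))
```

For three serial loops of length 128, the resulting tuples have the same shape and values as the `_attr_` entries in the GEMM output.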

Arithmetic Feature

The arithmetic feature counts the number of float add/sub, mul, and div/mod operations under the loop scope.
It is a tuple(n_add, n_mul, n_div)

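To make the counting rule concrete, here is a small sketch that counts the three kinds of operations in an expression written as a Python string. It uses Python's `ast` module rather than the real IR, and the helper name `count_float_ops` is invented for this example. Arithmetic inside `[...]` is skipped on the assumption that it is integer index arithmetic, not float computation.

```python
import ast


def count_float_ops(expr):
    """Hypothetical helper: return (n_add, n_mul, n_div) for an expression string."""
    counts = {"n_add": 0, "n_mul": 0, "n_div": 0}

    def visit(node):
        if isinstance(node, ast.Subscript):
            return  # skip index arithmetic such as x*128 + y
        if isinstance(node, ast.BinOp):
            if isinstance(node.op, (ast.Add, ast.Sub)):
                counts["n_add"] += 1
            elif isinstance(node.op, ast.Mult):
                counts["n_mul"] += 1
            elif isinstance(node.op, (ast.Div, ast.FloorDiv, ast.Mod)):
                counts["n_div"] += 1
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(expr, mode="eval"))
    return counts["n_add"], counts["n_mul"], counts["n_div"]


# Right-hand side of a GEMM update statement
print(count_float_ops("C[x*128 + y] + A[x*128 + k] * B[y + k*128]"))
```

For the GEMM update this gives one add and one multiply, the same counts as the `_arith_` entry of the innermost loop in the GEMM output.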
Touch Feature

The memory touch feature records the access pattern of every buffer accessed under the loop scope.
We define
touch_pattern = tuple(stride, mod, count, reuse, T_count, T_reuse)
touch_feature = dict(buf -> touch_pattern)

stride and mod are extracted from the index pattern buf[(stride * var) % mod + other]. In the GEMM output in the API section, where no index contains a modulo, mod is reported as -1.
count is the number of touched elements, and reuse is the reuse ratio: reuse = bottomup / count
T_count is valid only for parallel axes. It is the number of elements touched by all threads in one round.
It is equal to count if we reorder the thread axes to be innermost.
T_reuse is the reuse ratio for T_count.
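The definitions of stride, count and reuse can be checked by brute force on a small loop nest. The sketch below is only an illustration under stated assumptions: `touch_stats` and `stride_of` are hypothetical helpers, the index is assumed to be affine in the loop variables with no modulo, and the outer loop variables are fixed at 0 while the loops from the given depth inward are enumerated. It is not how the feature extractor works internally.

```python
from itertools import product


def touch_stats(index_fn, extents, depth):
    """Hypothetical helper: return (count, reuse) for one access at one loop depth.

    index_fn maps a tuple of loop variables (outermost first) to a flat index,
    extents holds the loop lengths (outermost first), depth 0 is the outermost loop.
    """
    outer = (0,) * depth  # fix the outer loop variables
    inner_ranges = [range(e) for e in extents[depth:]]
    touched = {index_fn(outer + inner) for inner in product(*inner_ranges)}
    count = len(touched)
    bottomup = 1
    for e in extents[depth:]:
        bottomup *= e
    return count, bottomup / count


def stride_of(index_fn, n_vars, pos):
    """Index change when the variable at position pos increases by 1 (affine index assumed)."""
    base = [0] * n_vars
    step = list(base)
    step[pos] = 1
    return index_fn(tuple(step)) - index_fn(tuple(base))


N = 4  # small size so brute force stays cheap


def index_A(v):
    return v[0] * N + v[2]  # A[x*N + k] with loop order (x, y, k)


for depth in range(3):
    print(depth, stride_of(index_A, 3, depth), touch_stats(index_A, (N, N, N), depth))
```

With N = 4 the three depths give count and reuse of (16, 4), (4, 4) and (4, 1), which scale to (16384, 128), (128, 128) and (128, 1) for N = 128.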

API


feature.get_itervar_feature(s, args)

The return value is a list with one entry per IterVar. The format is
(
(
  ('_itervar_',  var),
  ('_attr_',     length, nest_level, topdown, bottomup, one_hot_annotation),
  ('_arith_',    add_ct, mul_ct, div_ct),
  ('data_vec_0', stride, mod, count, reuse, thread_count, thread_reuse),
  ('conv_0',     stride, mod, count, reuse, thread_count, thread_reuse),
),
(
  ('_itervar_',    var2),
  ('_attr_',       length, nest_level, topdown, bottomup, one_hot_annotation),
  ('_arith_',      add_ct, mul_ct, div_ct),
  ('kernel_vec_0', stride, mod, count, reuse, thread_count, thread_reuse),
  ('conv_1',       stride, mod, count, reuse, thread_count, thread_reuse),
)
...
)
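Since every entry starts with a tag followed by its values, a row is easy to turn into a dictionary for lookup by tag. The sketch below uses a hand-written sample row rather than a real call to `get_itervar_feature`; the helper name `row_to_dict` is made up, and the loop variable is represented by a plain string here, which is an assumption made only to keep the example self-contained.

```python
def row_to_dict(row):
    """Hypothetical helper: map each tag ('_attr_', '_arith_', buffer name, ...) to its values."""
    return {pair[0]: list(pair[1:]) for pair in row}


# Hand-written sample row in the format described above (not produced by the API)
sample_row = (
    ("_itervar_", "k"),
    ("_attr_", 128, 3, 2097152, 128, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1),
    ("_arith_", 1, 1, 0),
    ("A_0", 1, -1, 128, 1, 0, 0),
    ("B_0", 128, -1, 128, 1, 0, 0),
)

info = row_to_dict(sample_row)
stride, mod, count, reuse, thread_count, thread_reuse = info["A_0"]
print(info["_itervar_"], "A_0 stride:", stride, "count:", count, "reuse:", reuse)
```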

For the GEMM example (see code), the IR is
produce C {
  for (x, 0, 128) {
    for (y, 0, 128) {
      # C_0
      C[((x*128) + y)] = 0.000000f
      for (k, 0, 128) {
        # C_1               C_2                 A_0              B_0
        C[((x*128) + y)] = (C[((x*128) + y)] + (A[((x*128) + k)]*B[(y + (k*128))]))
      }
    }
  }
}

Print the features:
fea = feature.get_itervar_feature(s, [A, B, C])

for row in fea:
    print("\n---------------------------")
    for pair in row:
        print("%-10s %s" % (pair[0], pair[1:]))

The output is

---------------------------
_itervar_  [y]
_attr_     [128, 1, 128, 2097152, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
_arith_    [0, 0, 0]
A_0        [128, -1, 16384, 128, 0, 0]
B_0        [0, -1, 16384, 128, 0, 0]
C_0        [128, -1, 16384, 1, 0, 0]
C_1        [128, -1, 16384, 128, 0, 0]
C_2        [128, -1, 16384, 128, 0, 0]

---------------------------
_itervar_  [x]
_attr_     [128, 2, 16384, 16384, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
_arith_    [0, 0, 0]
A_0        [0, -1, 128, 128, 0, 0]
B_0        [1, -1, 16384, 1, 0, 0]
C_0        [1, -1, 128, 1, 0, 0]
C_1        [1, -1, 128, 128, 0, 0]
C_2        [1, -1, 128, 128, 0, 0]

---------------------------
_itervar_  [k]
_attr_     [128, 3, 2097152, 128, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
_arith_    [1, 1, 0]
A_0        [1, -1, 128, 1, 0, 0]
B_0        [128, -1, 128, 1, 0, 0]
C_1        [0, -1, 1, 128, 0, 0]
C_2        [0, -1, 1, 128, 0, 0]


feature.flatten_itervar_feature(fea)
feature.get_flatten_name(fea)

Both functions take the result of get_itervar_feature. We can flatten the features and get their names with

fea = feature.get_itervar_feature(s, [A, B, C])

flatten = feature.flatten_itervar_feature(fea)   # Array of float
names = feature.get_flatten_name(fea)            # Array of str
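To show what flattening means for this nested format, here is a plain-Python sketch that walks the rows, skips the `_itervar_` entry (which holds a variable, not a number), and concatenates the remaining values. The helper `flatten_rows` and the `row<i>.<tag>.<j>` naming scheme are assumptions made for this illustration; the names returned by `get_flatten_name` may be formatted differently.

```python
def flatten_rows(fea):
    """Hypothetical helper: return (values, names) for a nested feature list."""
    values, names = [], []
    for i, row in enumerate(fea):
        for pair in row[1:]:  # skip the ('_itervar_', var) entry
            tag = pair[0]
            for j, value in enumerate(pair[1:]):
                values.append(float(value))
                names.append("row%d.%s.%d" % (i, tag, j))
    return values, names


# Hand-written sample in the format returned by get_itervar_feature
sample_fea = [
    (
        ("_itervar_", "k"),
        ("_arith_", 1, 1, 0),
        ("A_0", 1, -1, 128, 1, 0, 0),
    ),
]

values, names = flatten_rows(sample_fea)
for name, value in zip(names, values):
    print("%-16s %s" % (name, value))
```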