njsmith/miniNEP1-where.rst

## miniNEP1-where.rst

      
    Raw
  

              miniNEP1-where.rst
            
          
    A mini-NEP for the where= argument to ufuncs

To try and make more progress on the whole missing values/masked arrays/... debate, it seems useful to have a more technical discussion of the pieces which we can agree on. This is the first, which attempts to nail down the details of the new where= argument to ufuncs.
Rationale

It is often useful to apply operations to a subset of your data, and numpy provides a rich interface for accomplishing this, by combining indexing operations with ufunc operations, e.g.:
a[10, mymask] += b
np.sum(a[which_indices], axis=0)
But any kind of complex indexing necessarily requires making a temporary copy of (parts of) the underlying array, which can be quite expensive, and this copying could be avoided by teaching the ufunc loop to `index as it goes'.
There are strong arguments against doing this. There are tons of cases like this where one can save some memory by avoiding temporaries, and we can't build them all into the core -- especially since we also have more general solutions like numexpr or writing optimized routines C/Fortran/Cython. Furthermore, this case is a clear violation of orthogonality -- we already have indexing and ufuncs as separate things, so adding a second, somewhat crippled implementation of indexing to ufuncs themselves is a bit ugly. (It would be better if we could make sure that anything that could be passed to ndarray.__getitem__ could also be passed to ufuncs with the same semantics, but this would require substantial refactoring and seems unlikely to be implemented any time soon.)
However,
API

A new optional, keyword argument named where= will be added to all ufuncs.
Error checking

If given, this argument must be a boolean array. If f is a ufunc, then given a function call like:
f(a, b, where=mymask)
the following occurs. First, mymask is coerced to an array if necessary, but no type conversion is performed. (I.e., we do np.asarray(mymask).) Next, we check whether mymask is a boolean array. If it is not, then we raise an exception. (In the future it would be nice to support other forms of indexing as well, such as lists of slices or arrays of integer indices. In order to preserve this option, we do not want to coerce integers into booleans.) Next, a and b are broadcast against each other, just as now; this determines the shape of the output array. Then mymask is broadcast to match this output array shape. (The shape of the output array cannot be changed by this process -- for example, having a.shape == (10, 1, 1), b.shape = (1, 10, 1), mymask.shape == (1, 1, 10) will raise an error rather than returning a new array with shape (10, 10, 10).)
Semantics: ufunc __call__

When simply calling a ufunc with an output argument, e.g.:
f(a, b, out=c, where=mymask)
then the result is equivalent to:
c[mymask] = f(a[mymask], b[mymask])
On the other hand, if no output argument is given:
f(a, b, where=mymask)
then an output array is instantiated as if by calling np.empty(shape, dtype=dtype), and then treated as above:
c = np.empty(shape_for(a, b), dtype=dtype_for(f, a, b))
f(a, b, out=c, where=mymask)
return c
Note that this means that the output will, in general, contain uninitialized values.
Semantics: ufunc .reduce

Take an expression like:
f.reduce(a, axis=0, where=mymask)
This performs the given reduction operation along each column of a, but simply skips any elements where the corresponding entry in mymask is false. (For ufuncs which have an identity, this is equivalent to treating the given elements as if they were the identity.) For example, if a is a 2-dimensional array and skipping over the details of broadcasting, dtype selection, etc., the above operation produces the same result as:
out = np.empty(a.shape[1])
for i in xrange(a.shape[1]):
  out[i] = f.reduce(a[mymask[:, i], i])
return out
Semantics: ufunc .accumulate

Accumulation is similar to reduction, except that .accumulate saves the intermediate values generated during the reduction loop. Therefore we use the same semantics as for .reduce above. If a is 2-d etc., then this expression:
f.accumulate(a, axis=0, where=mymask)
is equivalent to:
out = np.empty(a.shape)
for i in xrange(a.shape[1]):
  out[mymask[:, i], i] = f.accumulate(a[mymask[:, i], i])
return out
Notice that once again, elements of out which correspond to False entries in the mask are left unitialized.
Semantics: ufunc .reduceat

I've never used .reduceat, and 30 seconds staring at the documentation was not sufficient to wrap my head around it. It's not obvious to me that a where= argument even make sense for this? If not we could just say that it's not supported.
Semantics: ufunc .outer

This is more complicated -- .outer takes two arrays, which do not necessarily have the same shape. Therefore we need to handle multiple masks, one for each of the arrays. The syntax is:

f.outer(a, b, where=(mymask_for_a, mymask_for_b))

That is, the where= argument is an arbitrary Python sequence, which must contain exactly one mask for each array argument. (For the current implementation, I believe that there are always exactly 2 arrays given, but this way we're prepared in case someone extends the implementation later.) If you don't actually want to mask out anything from one of the arrays, just use a sequence like (True, mymask_for_b); the True will be broadcast to a full array of Trues.
This produces an output array with shape a.shape + b.shape, in which those pairs of elements in a for which the corresponding entry in mymask_for_a is True, and in b for which both corresponding entry in mymask_for_b is True, are combined by f and placed into the appropriate spot. All other entries are left uninitialized.
Unresolved questions

Does it make sense to support where= in .reduceat operations?
How does this interact with iterator support (e.g., the new nditer)? A section should be added, but I'm not the one to write it.