I've tried to make SequentialDataset
support Cython fused types, but it seems really expensive.
You can find the modified code in this branch.
tl;dr - seq_dataset.pyx
is heavily bound with sag_fast.pyx
, sgd_fast.pyx
.
After I modified seq_dataset.pyx
, this line in sag_fast.pyx
requires to change as well since this pointer is passed into SequentialDataset
's function. However, my past experience is that one can only declare local floating
variable when at least one of the function's argument variable also belongs to floating
type. Nonetheless, that's not the case here, unless we make this function's arguments
np.ndarray[double, ndim=2, mode='c'] weights_array
np.ndarray[double, ndim=1, mode='c'] intercept_array
become floating
types as well, i.e., changed double
to floating
.
However, doing so will require a comprehensive changes in three Cython files I mentioned above.
Further, it can become more complicated since ArrayDataset
inherits from SequentialDataset
and we know that inheritance doen't work well when combing with fused types.
Also it's is a little bit weird to degrade the precision of weights
and y
just because we need to use fused types in this function.
Considering above reasons, I guess I'll have a look at Neighbors Trees first.
In which branch?