I've tried to make SequentialDataset support Cython fused types, but the change seems really expensive. You can find the modified code in this branch.

tl;dr - seq_dataset.pyx is tightly coupled with sag_fast.pyx and sgd_fast.pyx.

After I modified seq_dataset.pyx, this line in sag_fast.pyx needs to change as well, since this pointer is passed into one of SequentialDataset's functions. However, in my experience, one can only declare a local variable of type floating when at least one of the function's arguments is also of type floating. That's not the case here, unless we make this function's arguments

np.ndarray[double, ndim=2, mode='c'] weights_array
np.ndarray[double, ndim=1, mode='c'] intercept_array

floating as well, i.e., change double to floating.
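
To illustrate the restriction, here is a minimal Cython sketch. The function names (scaled_sum, bad_local) are made up for this example and do not come from sag_fast.pyx; only the floating fused type from the cython module is real.

```cython
# cython: language_level=3
# Hypothetical example, not code from scikit-learn.
from cython cimport floating


# OK: `x` and `scale` are fused-typed arguments, so `floating` is
# specialized inside the body (float in one version, double in the other)
# and can be used to declare local variables.
cdef floating scaled_sum(floating[::1] x, floating scale):
    cdef floating total = 0
    cdef Py_ssize_t i
    for i in range(x.shape[0]):
        total += scale * x[i]
    return total


# Not OK: no argument has type `floating`, so Cython has nothing to
# specialize on and rejects the local declaration at compile time.
# Left commented out so that the file still compiles.
#
# cdef double bad_local(double[::1] w):
#     cdef floating tmp = 0
#     return tmp
```

This is why the double arguments would have to become floating before any floating local can be declared in that function.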

However, doing so will require comprehensive changes in the three Cython files I mentioned above.

Further, it can become more complicated, since ArrayDataset inherits from SequentialDataset and we know that inheritance doesn't work well when combined with fused types.

Also, it is a little bit weird to degrade the precision of weights and y just because we need to use fused types in this function.

Considering the above reasons, I guess I'll have a look at the Neighbors Trees first.

In which branch?

Owner

Sorry for the very late reply, this one.
