@FedericoV
Created May 28, 2015 08:06
Cosine Similarity that handles NaN with Numba
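import numba
import numpy as np


@numba.njit  # nopython mode on the CPU, equivalent to jit(nopython=True)
def fast_cosine(u, v):
    # Cosine similarity of two 1-D arrays, skipping positions where either is NaN.
    m = u.shape[0]
    # Float accumulators keep the variable types stable for Numba.
    udotv = 0.0
    u_norm = 0.0
    v_norm = 0.0
    for i in range(m):
        if np.isnan(u[i]) or np.isnan(v[i]):
            continue
        udotv += u[i] * v[i]
        u_norm += u[i] * u[i]
        v_norm += v[i] * v[i]

    u_norm = np.sqrt(u_norm)
    v_norm = np.sqrt(v_norm)
    # Convention used here: a zero norm yields a similarity of 1.0.
    if u_norm == 0 or v_norm == 0:
        ratio = 1.0
    else:
        ratio = udotv / (u_norm * v_norm)
    return ratio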
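
To sanity-check the loop above, a vectorized NumPy version of the same logic can be handy. The sketch below is not part of the gist: `nan_cosine_reference` is a hypothetical helper name, and it assumes 1-D inputs of equal length. It drops every position where either vector is NaN and keeps the gist's convention of returning 1.0 when a norm is zero.

```python
import numpy as np


def nan_cosine_reference(u, v):
    """NumPy-only cross-check (hypothetical helper, not from the gist)."""
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    # Keep only positions where both vectors hold a real number.
    valid = ~(np.isnan(u) | np.isnan(v))
    u_valid, v_valid = u[valid], v[valid]
    u_norm = np.sqrt(np.dot(u_valid, u_valid))
    v_norm = np.sqrt(np.dot(v_valid, v_valid))
    # Mirror the gist's convention: a zero norm yields 1.0.
    if u_norm == 0 or v_norm == 0:
        return 1.0
    return float(np.dot(u_valid, v_valid) / (u_norm * v_norm))


u = np.array([3.0, np.nan, 4.0, 1.0])
v = np.array([4.0, 1.0, 3.0, np.nan])
# Indices 1 and 3 are skipped, leaving (3, 4) and (4, 3): 24 / (5 * 5).
print(nan_cosine_reference(u, v))
```

Comparing this against `fast_cosine(u, v)` with `np.isclose` on a few random arrays containing NaNs is one way to confirm that the compiled loop and the masked version agree.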
@LucaCappelletti94

Have you considered adding this to scikit-learn?
