@jbochi
Last active October 18, 2023 07:52
Recommending GitHub repositories with Google BigQuery and the implicit library: https://medium.com/@jbochi/recommending-github-repositories-with-google-bigquery-and-the-implicit-library-e6cce666c77
@seb799

seb799 commented Jan 17, 2019

@joddm @antonioalegria
From my understanding, p in LeavePOutByGroup() should be <= (minimum number of items per user) / 2.
For example, if your dataset has a user with activity for only 4 items, p should be <= 2.

Either rebuild your dataset to include only users with activity for more products, or filter out users with fewer than p*2 products from the test sets.

Hope that makes sense

It resolved the index out-of-bounds error on my end.
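
A minimal sketch of the second option (dropping users with fewer than p*2 items before splitting), in case it helps. It assumes the interactions are available as plain (user, item) pairs; the helper names and the data layout are hypothetical and are not part of the gist or of the implicit library.

```python
from collections import Counter


def items_per_user(interactions):
    """Count distinct items per user from (user, item) pairs."""
    # De-duplicate first so repeated events on the same item count once.
    return Counter(user for user, _ in set(interactions))


def max_safe_p(interactions):
    """Largest p satisfying p <= (minimum items per user) / 2."""
    counts = items_per_user(interactions)
    return min(counts.values()) // 2 if counts else 0


def filter_users_for_leave_p_out(interactions, p):
    """Keep only interactions from users with at least 2 * p distinct items."""
    if p < 1:
        raise ValueError("p must be a positive integer")
    interactions = list(interactions)
    counts = items_per_user(interactions)
    return [(u, i) for u, i in interactions if counts[u] >= 2 * p]


# Hypothetical toy data: ann has 4 items, bob has only 2.
events = [
    ("ann", "a"), ("ann", "b"), ("ann", "c"), ("ann", "d"),
    ("bob", "a"), ("bob", "b"),
]
kept = filter_users_for_leave_p_out(events, p=2)
```

With p=2, only ann's four interactions survive, since bob has fewer than 4 items. Without filtering, the largest p that this rule allows on the toy data is 1.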

@DaStapo

DaStapo commented Aug 23, 2020

If my dataset has mostly just 2 items per user, I assume LeavePOutByGroup is not the way to go? If I understand correctly, each split would then have mostly 1 item per user, so the model would have nothing to learn from.

@kylemcmearty

@jbochi what is the license on this gist?
