@vanatteveldt
Last active October 5, 2022 15:49
library(tibble)  # as_tibble()
library(tidyr)   # pivot_longer()

#' Run an SVD for collaborative filtering and process the results to be more tidyverse-friendly
#' @param ratingsmatrix A user-item ratings matrix, with users in rows and items in columns
#' @param ndimensions the number of dimensions to use, defaults to 10
#' @return a list with the u, d, and v components from the svd function (d truncated to ndimensions) and
#' item_values - a long-format tibble with the values per item per dimension
#' user_values - a long-format tibble with the values per user per dimension
#' predictions - a long-format tibble with the predictions per user per item
#' @note (c) 2022 Wouter van Atteveldt, license: CC-0
run_svd = function(ratingsmatrix, ndimensions=10) {
  # Run SVD, keeping ndimensions left (u) and right (v) singular vectors
  udv = svd(ratingsmatrix, nu=ndimensions, nv=ndimensions)
  # Set rownames (users, items) and colnames (dimensions V1, V2, ...) on u and v
  rownames(udv$u) = rownames(ratingsmatrix)
  rownames(udv$v) = colnames(ratingsmatrix)
  colnames(udv$u) = paste0("V", seq_len(ndimensions))
  colnames(udv$v) = paste0("V", seq_len(ndimensions))
  # Convert the u and v matrices to long-format tibbles
  udv$item_values = as_tibble(udv$v, rownames='item') |>
    pivot_longer(-item, names_to="dimension", values_to="item_value")
  udv$user_values = as_tibble(udv$u, rownames='user') |>
    pivot_longer(-user, names_to="dimension", values_to="user_value")
  # Compute the predictions by re-multiplying the result of the decomposition
  # and convert to long-format tibble.
  # svd() returns all singular values, so keep only the first ndimensions;
  # diag(x, nrow=) avoids getting an identity matrix when ndimensions is 1
  udv$d = udv$d[seq_len(ndimensions)]
  p = udv$u %*% diag(udv$d, nrow=ndimensions) %*% t(udv$v)
  udv$predictions = as_tibble(p, rownames='user') |>
    pivot_longer(-user, names_to="item", values_to="prediction")
  return(udv)
}
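
# Usage sketch (not part of the original gist), assuming run_svd() above has
# been sourced and the tibble and tidyr packages are attached. The ratings
# matrix, its user and item names, and ndimensions = 2 are all made up for
# illustration; users are in rows and items in columns.
ratings = matrix(
  c(5, 4, 1, 1,
    4, 5, 2, 1,
    1, 1, 5, 4,
    2, 1, 4, 5),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("user", 1:4), paste0("item", 1:4))
)
result = run_svd(ratings, ndimensions = 2)
# One row per user-item pair (4 x 4 = 16 rows), holding the rank-2
# reconstruction of each rating
result$predictions
# One row per user (or item) per retained dimension (4 x 2 = 8 rows each).
# Note that svd() only determines each dimension up to a sign flip.
result$user_values
result$item_values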