Author: Srikanth KS
You might need minor modifications
For Ubuntu 16.04:
sudo apt-get install build-essential cmake
sudo apt-get install libgtk-3-dev
# dataset: https://zenodo.org/records/2594012
df = arrow::read_parquet("personal/Avazu/test.parquet") |>
  tibble::as_tibble()
dim(df) # 4,218,938 X 24
res =
  bench::mark(
    # using `duckplyr`
    duckplyr = {
--------------------------------------------------------------------
Instructions to make an R script/program executable on *nix machines
--------------------------------------------------------------------
Author: Srikanth KS
Date : 12th February 2018
- We assume that R, Rscript and required R packages are installed.
- 'Rscript' is a program to execute scripts/programs written in the R language.
- Add this shebang (without quotes) as the first line of the script: '#!/usr/bin/env Rscript'.
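The shebang step above can be sketched in a shell session as follows. This is only an illustration under assumptions: the file name 'hello.R', its one-line body and the temporary directory are hypothetical, and the script is run only if Rscript is found on the PATH.

```shell
# Sketch: create a tiny R script with the shebang and mark it executable.
# 'hello.R' and the temporary directory are hypothetical names for illustration.
set -eu

workdir=$(mktemp -d)
script="$workdir/hello.R"

# The shebang must be the very first line of the file.
cat > "$script" <<'EOF'
#!/usr/bin/env Rscript
cat("hello from R\n")
EOF

# Grant execute permission to the owner of the file.
chmod u+x "$script"

# Run the script directly only if Rscript is available on this machine.
if command -v Rscript >/dev/null 2>&1; then
  "$script" || echo "could not run $script directly"
else
  echo "Rscript not found on PATH; skipping the run"
fi
```

Using '/usr/bin/env Rscript' rather than a hard-coded path lets the system pick whichever Rscript comes first on the PATH.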
# Read 20newsgroups data as a datatable (dataframe)
# Author: Srikanth KS
# license: GPL-3
#
# download data from here:
# https://archive.ics.uci.edu/ml/machine-learning-databases/20newsgroups-mld/20_newsgroups.tar.gz
# extract it and provide its location to `baseDir` on line 9
baseDir = "Downloads/20_newsgroups"
newsGroupNames = list.files(baseDir, full.names = TRUE)
#' @title cutq
#' @description Discretize a numeric vector along quantiles
#' @param vec numeric/integer vector
#' @param n number of buckets (at least two)
#' @param ... extra named arguments passed to `cut`
#' @return A factor
#' @details By passing extra arguments to `cut`, output can be styled
cutq = function(vec, n = 10, ...){
  stopifnot(inherits(vec, "numeric") || inherits(vec, "integer"))
###############################################################################
#
# cor2
# -- Compute correlations of columns of a dataframe of mixed types
#
###############################################################################
#
# author : Srikanth KS (talegari)
# license : GNU AGPLv3 (http://choosealicense.com/licenses/agpl-3.0/)
#
#' @title require2
#'
#' @author Srikanth KS (talegari), gmail at sri dot teach GNU AGPLv3
#' (http://choosealicense.com/licenses/agpl-3.0/)
#'
#' @param pkgname a string (character vector of length 1)
#' @param similar a positive integer indicating the number of similar package
#' names to be suggested, if the match is not found
#'
#' @description The function attaches and loads an R library, if present and
sudo add-apt-repository ppa:edd/misc
sudo apt-get update
sudo apt-get install libmlpack-dev libboost-all-dev libboost-program-options-dev libboost-serialization-dev libarmadillo-dev r-cran-rcpp r-cran-rcpparmadillo
In R:
devtools::install_github("eddelbuettel/rcppmlpack2")
A list of R libraries for Recommender systems. Most of the libraries are good for quick prototyping.
Maintainer: Srikanth KS (talegari) Email: gmail me at sri dot teach (do write to me about omitted packages)
Package | Dev page | Description |
---|---|---|
recommenderlab | github | Provides a research infrastructure to test and develop recommender algorithms including UBCF, IBCF, FunkSVD and association rule-based algorithms |
rrecsys | github | Implementations of several popular recommendation systems like Global/Item/User-Average baselines, Item-Based KNN, FunkSVD, BPR and weighted ALS for rapid prototyping |
# how much memory did an execution use in MBs
mem_usage <- function(...){
  exprs <- as.list(match.call(expand.dots = FALSE)$...)
  invisible(gc(reset = TRUE))
  start_mem <- sum(gc()[,2])
  lapply(exprs, eval, parent.frame())
  max_mem <- sum(gc()[,6])