@martinctc
Last active November 1, 2021 23:57
[Rank a data frame with a grouping variable using entirely base R] #R
#' @title
#' Rank a data frame by grouping variable using base R
#'
#' @description
#' Ranks a specified column in a data frame within each group, using only base R functions.
#' The values of the ranked column are replaced by their within-group ranks.
#' The underlying function is `rank()`; additional arguments (e.g. `ties.method`) can be passed with `...`.
#' The grouping variable is specified as a string using the argument `group_var`, and the variable to rank is
#' specified as a string using the argument `rank_var`. The operation is analogous to using `group_by()`
#' followed by `mutate()` in {dplyr}.
#' See the example below using the base dataset `iris`.
#'
#' @param data data frame to be passed through
#' @param group_var string containing the name of grouping column
#' @param rank_var string containing the name of the column to rank
#' @param ... additional arguments to pass to `rank()`
#'
#' @examples
#' rank_by_group(data = iris, group_var = "Species", rank_var = "Sepal.Length")
rank_by_group <- function(data, group_var, rank_var, ...) {
  # Split into a list of data frames, one per value of the grouping column.
  # `f` needs one entry per row, so use the full grouping column.
  l_df <- split(x = data, f = data[[group_var]])

  # Replace the ranked column with its within-group ranks
  l_df_ranked <-
    lapply(l_df,
           function(df_x) {
             df_x[[rank_var]] <- rank(df_x[[rank_var]], ...)
             df_x
           })

  # Row bind list of data frames
  Reduce(rbind, l_df_ranked)
}
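
The same within-group ranking can also be sketched with base R's `ave()`, which applies a function per group and returns the results in the original row positions. The helper name `rank_within_group` and the `<rank_var>_rank` column name are assumptions made for illustration and are not part of the gist; unlike the function above, this variant adds a new column instead of overwriting `rank_var`, and it keeps the original row order.

```r
# Hypothetical helper (not part of the original gist): adds a rank column
# instead of overwriting `rank_var`, and keeps the original row order.
rank_within_group <- function(data, group_var, rank_var, ...) {
  # Name of the new column, e.g. "Sepal.Length_rank" (illustrative choice)
  new_col <- paste0(rank_var, "_rank")

  # ave() applies FUN to the values of `rank_var` within each level of the
  # grouping column and puts the results back in the original row positions.
  data[[new_col]] <- ave(
    data[[rank_var]],
    data[[group_var]],
    FUN = function(x) rank(x, ...)
  )
  data
}

ranked <- rank_within_group(iris, "Species", "Sepal.Length", ties.method = "min")
head(ranked[, c("Species", "Sepal.Length", "Sepal.Length_rank")])
```

Keeping the original column makes it easier to check the ranks against the raw values, at the cost of one extra column in the result.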