@jeroenjanssens
Created April 12, 2017 10:12
R function to keep the rows that belong to the top n most frequent groups of a given column
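library(tidyverse)

# Keep only the rows of `df` whose value in `col` is among the `n` most frequent values.
keep_top_n <- function(df, col, n = 10) {
  # Count rows per group, most frequent first, and keep the first n groups
  top_groups <- head(count(df, {{ col }}, sort = TRUE), n)
  # Join on the grouping column only, so the count column is never used as a key
  semi_join(df, top_groups, by = rlang::as_name(rlang::enquo(col)))
}

data(mpg)
# Number of rows for all manufacturers
mpg %>% nrow()
# Number of rows for the three most frequent manufacturers only
mpg %>% keep_top_n(manufacturer, 3) %>% nrow()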
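
# A quick check, not part of the original function: to see which manufacturers
# count as the "top three", the same count-then-head step can be run on its own.
# This sketch assumes the tidyverse is already loaded as above and that ties are
# broken by whatever order count() returns.
mpg %>% count(manufacturer, sort = TRUE) %>% head(3)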