@thisisnic
Last active November 23, 2018 15:06
dplyr provides several ranking functions. Here are two examples: ntile() splits values into n roughly equal-sized buckets, and percent_rank() ranks values on a scale from 0 to 1.

Code:

library(dplyr)

# Split the cars into 5 buckets by mpg (1 = lowest mpg, 5 = highest)
mtcars %>%
  mutate(efficiency_group = ntile(mpg, 5)) %>%
  head()

Output:

   mpg cyl disp  hp drat    wt  qsec vs am gear carb efficiency_group
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4                3
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4                3
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1                4
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1                4
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2                3
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1                3
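
One detail worth keeping in mind: ntile() assigns buckets by position in the sorted order rather than by value, so tied values can end up in different buckets. Below is a minimal sketch of that behaviour. The vector x is made up for illustration and is not part of the original example.

```r
library(dplyr)

# Hypothetical data: the two 20s are tied
x <- c(10, 20, 20, 40, 50, 60)

# Three buckets of two values each; the tied 20s straddle buckets 1 and 2
buckets <- ntile(x, 3)
buckets
# 1 1 2 2 3 3
```

If ties need to stay together, a value-based ranking such as min_rank() or percent_rank() may be a better fit.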

Code:

# Rank each car's mpg on a 0-1 scale (0 = lowest mpg, 1 = highest)
mtcars %>%
  mutate(efficiency = percent_rank(mpg)) %>%
  head()

Output:

   mpg cyl disp  hp drat    wt  qsec vs am gear carb efficiency
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4  0.5806452
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4  0.5806452
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1  0.7419355
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1  0.6451613
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2  0.4516129
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1  0.4193548
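
To see where these numbers come from: percent_rank() counts how many values are smaller than each value and divides by the number of observations minus 1. For the cars with 21.0 mpg, 18 of the 32 cars have a lower mpg, giving 18 / 31 = 0.5806452. The sketch below checks this against a manual calculation with min_rank(). The vector x is made up for illustration and the calculation assumes there are no missing values.

```r
library(dplyr)

# Hypothetical data with one tie
x <- c(10, 20, 20, 40, 50, 60)

pr <- percent_rank(x)

# Manual equivalent, assuming no NAs: rescale min_rank() to the range [0, 1]
manual <- (min_rank(x) - 1) / (length(x) - 1)

pr
# 0.0 0.2 0.2 0.6 0.8 1.0
all.equal(pr, manual)
# TRUE
```

Unlike ntile(), tied values always receive the same percent_rank().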