@thisisnic
thisisnic / dplyr::lag.md
Last active November 23, 2018 15:05
A lesser-known dplyr function: lag(). Use lag() to get the previous value in a vector. The equivalent for getting the next value is lead(). I've not used them often, but when I have they've saved me from writing horrible workarounds!

Code:

library(dplyr)
BOD

Output:

  Time demand
1    1    8.3
2    2   10.3
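
As a minimal sketch of how lag() and lead() might be used on this dataset, the block below adds the previous and next `demand` values as new columns and computes the change between consecutive rows. The column names `previous_demand`, `next_demand` and `change` are illustrative and not part of the original gist.

```r
library(dplyr)

# BOD is a small built-in dataset with the columns Time and demand
bod_changes <- BOD %>%
  mutate(
    previous_demand = lag(demand),   # NA in the first row, nothing comes before it
    next_demand     = lead(demand),  # NA in the last row, nothing comes after it
    change          = demand - previous_demand
  )

bod_changes
```

Both functions also take an `n` argument (for example `lag(demand, n = 2)`) to look further back or ahead.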
@thisisnic
thisisnic / dplyr::ntile and dplyr::percent_rank.md
Last active November 23, 2018 15:06
There are a few different ranking functions in dplyr. Here are 2 examples: use ntile() to rank values into n buckets, or percent_rank() to rank values on a scale between 0 and 1.

Code:

library(dplyr)
mtcars %>%
  mutate(efficiency_group = ntile(mpg, 5)) %>%
  head()

Output:
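
percent_rank() can sit alongside ntile() in the same mutate() call. A minimal sketch, where the names `ranked` and `efficiency_rank` are illustrative choices rather than anything from the original gist:

```r
library(dplyr)

ranked <- mtcars %>%
  mutate(
    efficiency_group = ntile(mpg, 5),     # integer bucket, 1 (lowest mpg) to 5 (highest)
    efficiency_rank  = percent_rank(mpg)  # 0 for the lowest mpg, 1 for the highest
  ) %>%
  select(mpg, efficiency_group, efficiency_rank) %>%
  arrange(mpg)

head(ranked)
```

ntile() gives roughly equal-sized groups, whereas percent_rank() keeps the relative position of every individual value.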

@thisisnic
thisisnic / dplyr::filter_all, dplyr::filter_if, dplyr::all_vars, dplyr::any_vars.md
Last active November 23, 2018 15:07
If you want to apply the same filter to multiple columns, you can use filter_all() and filter_if() with all_vars() and any_vars().

Code:

library(dplyr)
filter_if(iris, is.numeric, all_vars(. > 2.4))

Output:

  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1          6.3         3.3          6.0         2.5 virginica
2          7.2         3.6          6.1         2.5 virginica
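
To show the difference between all_vars() and any_vars(), here is a small sketch on the same dataset. The threshold of 7.5 and the names `all_rows` and `any_rows` are arbitrary choices for illustration.

```r
library(dplyr)

# all_vars(): keep rows where EVERY numeric column is above 2.4
all_rows <- filter_if(iris, is.numeric, all_vars(. > 2.4))

# any_vars(): keep rows where AT LEAST ONE numeric column is above 7.5
any_rows <- filter_if(iris, is.numeric, any_vars(. > 7.5))

nrow(all_rows)
nrow(any_rows)
```

filter_all() works the same way but applies the predicate to every column, so it is best suited to data frames whose columns all share a type.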
@thisisnic
thisisnic / lubridate::%m+%.md
Last active November 23, 2018 15:18
A favourite overlooked tidyverse function of mine is %m+% from #lubridate which allows you to do maths with dates, whilst accounting for the fact that time periods are not uniform (not all years have 365 days, months differ in length etc).

Adding a month on with a '+' is fairly straightforward.

Code:

library(lubridate)
ymd("2018-01-01") + months(1)

Output:

[1] "2018-02-01"
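
The difference shows up when the target day does not exist in the resulting month. A minimal sketch, with variable names chosen here for illustration:

```r
library(lubridate)

jan_31 <- ymd("2018-01-31")

# Plain '+' lands on the non-existent 31 February and returns NA
naive <- jan_31 + months(1)

# %m+% rolls back to the last real day of the target month
rolled <- jan_31 %m+% months(1)

# The same applies to leap days when adding years
leap <- ymd("2016-02-29") %m+% years(1)

naive
rolled
leap
```

Here `naive` is NA, `rolled` is 2018-02-28 and `leap` is 2017-02-28. The matching operator for subtraction is %m-%.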
@thisisnic
thisisnic / lubridate::rollback.md
Created November 23, 2018 15:22
Another lubridate function that has saved me from writing some really janky code is rollback() which allows you to roll a date back to the last day of the previous month or first day of the month!

Code:

library(lubridate)
my_date <- ymd("2018-11-23")
rollback(my_date)

Output:

[1] "2018-10-31"
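
For the "first day of the month" case mentioned above, rollback() takes a `roll_to_first` argument. A short sketch using the same date:

```r
library(lubridate)

my_date <- ymd("2018-11-23")

# Default: last day of the previous month
last_of_previous <- rollback(my_date)

# roll_to_first = TRUE: first day of the current month
first_of_current <- rollback(my_date, roll_to_first = TRUE)

last_of_previous
first_of_current
```

The second call returns 2018-11-01.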
@thisisnic
thisisnic / tidyr::separate_rows.md
Last active November 27, 2018 08:17
If you have repeated observations in one column, you could transform these into separate rows by using tidyr::separate(), tidyr::gather(), and dplyr::select(). However, in this specific case, tidyr::separate_rows() is a simpler solution!

Code:

library(tidyr)
library(dplyr)
test_scores <- tibble(student = c("Amy", "Belle", "Candice"),
                      score = c("75-81-86", "77-70-82", "90-91-91"))
test_scores

Output:
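
A minimal sketch of the separate_rows() step itself, rebuilding the same small table so the block runs on its own. The name `long_scores` and the use of `convert = TRUE` are choices made here for illustration.

```r
library(tidyr)
library(dplyr)

test_scores <- tibble(student = c("Amy", "Belle", "Candice"),
                      score = c("75-81-86", "77-70-82", "90-91-91"))

# Split each "a-b-c" string on "-" and give every piece its own row;
# convert = TRUE turns the resulting strings into numbers
long_scores <- separate_rows(test_scores, score, sep = "-", convert = TRUE)

long_scores
```

Each student now has three rows, one per score, so the result has nine rows in total.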

@thisisnic
thisisnic / tibble::glimpse.md
Last active November 29, 2018 08:54
Use tibble::glimpse() instead of str() for a quick view of a data frame; unlike str(), it stays compact when the data frame contains list columns.

Code:

library(dplyr)  # provides the starwars dataset
str(starwars)

Output:

# Lots of output up here
...
  ..$ : chr 
  ..$ : chr 
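
For comparison, a sketch of the glimpse() call on the same dataset. It prints one line per column, including the list columns, instead of expanding every list element the way str() does. The name `glimpse_lines` is only used here to count the printed lines.

```r
library(dplyr)  # re-exports glimpse() and provides the starwars dataset

# One line per column: name, type, and the first few values
glimpse(starwars)

# Capture the printed lines to compare the volume of output with str()
glimpse_lines <- capture.output(glimpse(starwars))
length(glimpse_lines)
```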
@thisisnic
thisisnic / tidyr::unnest.md
Created November 29, 2018 09:01
When working with list-columns you can use the parameters to tidyr::unnest() to specify whether to keep or drop other list-columns.

Unnest - unpacking list columns

library(dplyr)
library(tidyr)

# Use this to show list columns
glimpse(starwars)

# Unnest 'films' column, drop other list column
by_film <- unnest(starwars, films)
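
How unnest() treats the other list-columns depends on the tidyr version: releases from 1.0.0 onwards keep them, while older releases could drop them by default. A version-independent sketch, assuming tidyr 1.0.0 or later for the `cols` argument, is to select the columns you want before unnesting. The name `by_film_explicit` is illustrative.

```r
library(dplyr)
library(tidyr)

# Keep only the identifier and the list-column of interest, then unnest it;
# the other list-columns are dropped explicitly by select()
by_film_explicit <- starwars %>%
  select(name, films) %>%
  unnest(cols = films)

head(by_film_explicit)
```

The result has one row per character and film, with `films` as an ordinary character column.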
@thisisnic
thisisnic / tidyr::uncount.md
Last active December 7, 2018 18:41
tidyr::uncount() might come in handy if you want to transform a summary table into individual rows.

Code:

library(tidyr)
library(tibble)
df <- tibble(animal = c("cat", "dog"), toy = c("ball", "stick"), total = c(5, 6))
df

Output:

# A tibble: 2 x 3
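
A minimal sketch of the uncount() call on this table, rebuilt here so the block runs on its own. The names `toys`, `expanded`, `numbered` and the id column `toy_number` are illustrative.

```r
library(tidyr)
library(tibble)

toys <- tibble(animal = c("cat", "dog"), toy = c("ball", "stick"), total = c(5, 6))

# Repeat each row 'total' times; the weights column is removed by default
expanded <- uncount(toys, total)

# Optionally number the copies within each original row
numbered <- uncount(toys, total, .id = "toy_number")

expanded
numbered
```

The expanded table has 5 + 6 = 11 rows: five cat/ball rows followed by six dog/stick rows.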
@thisisnic
thisisnic / dplyr::left_join, dplyr::inner_join, dplyr::anti_join, dplyr::semi_join, dplyr::full_join.md
Last active December 7, 2018 18:41
It took me a long time to wrap my head around the different types of joins when I first started learning them, so here are a few examples with some excellent mini datasets from dplyr designed specifically for this purpose!

Datasets

Code:

library(dplyr)
band_members

Output:
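
As a compact sketch of the five joins named in the title, the block below applies each one to `band_members` and `band_instruments`, which ship with dplyr and share the key column `name`. The result names are illustrative.

```r
library(dplyr)

# band_members:     Mick, John, Paul   (name, band)
# band_instruments: John, Paul, Keith  (name, plays)

inner <- inner_join(band_members, band_instruments, by = "name")  # names in both tables
left  <- left_join(band_members, band_instruments, by = "name")   # all members, NA where no instrument
full  <- full_join(band_members, band_instruments, by = "name")   # every name from either table
semi  <- semi_join(band_members, band_instruments, by = "name")   # members with a match, no new columns
anti  <- anti_join(band_members, band_instruments, by = "name")   # members without a match

inner
left
full
semi
anti
```

inner_join(), left_join() and full_join() add the `plays` column, while semi_join() and anti_join() only filter `band_members` and leave its columns unchanged.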