@yabyzq
Created October 24, 2016 13:15
Basic dplyr
library(dplyr)
library(nycflights13)
#look at the data
as_tibble(flights) # tbl_df() is deprecated in current dplyr; as_tibble() replaces it
#filter
filter(flights, month == 1, day == 1, !is.na(month))
#arrange
arrange(flights, desc(year))
#select
select(flights, year:day, starts_with("day"))
#distinct
distinct(select(flights, origin, dest))
#mutate/#transform
mutate(flights, gain = arr_delay - dep_delay, speed = distance/air_time * 60)
#summarise
summarise(flights, delay = mean(dep_delay, na.rm = TRUE),
count = n(), count_distinct = n_distinct(tailnum)) # n_distinct() needs a column; tailnum is an assumed choice
#group_by
group_by(flights, tailnum)
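# A minimal sketch, not part of the original gist: group_by() on its own only
# marks the grouping, so the effect shows up once a summary is applied.
# The names by_tailnum and delay_by_plane are illustrative assumptions.
by_tailnum <- group_by(flights, tailnum)
delay_by_plane <- summarise(by_tailnum,
count = n(),                           # flights per plane
dist = mean(distance, na.rm = TRUE),   # mean distance per plane
delay = mean(arr_delay, na.rm = TRUE)) # mean arrival delay per plane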
#%>%
flights %>% filter(month == 1, day == 1) %>% select(year, day)