group_by illustrative examples
library(dplyr)
library(hflights)  # provides the hflights data frame

# group_by() -------------------------------------------------------------------------
# Generate a per-carrier summary of hflights with the following variables: n_flights,
# the number of flights flown by the carrier; n_canc, the number of cancelled flights;
# p_canc, the percentage of cancelled flights; avg_delay, the average arrival delay of
# flights whose delay is not NA. Next, order the carriers in the summary from
# low to high by their average arrival delay. Use percentage of flights cancelled to
# break any ties. Which airline scores best based on these statistics?
hflights %>%
  group_by(UniqueCarrier) %>%
  summarise(n_flights = n(), n_canc = sum(Cancelled), p_canc = 100 * n_canc / n_flights,
            avg_delay = mean(ArrDelay, na.rm = TRUE)) %>%
  arrange(avg_delay, p_canc)  # p_canc breaks ties in avg_delay
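
# A minimal sketch of the tie-breaking step on made-up data. The carrier codes and
# numbers below are hypothetical, not taken from hflights. arrange() sorts by its
# first argument and uses later arguments only to order rows that tie on earlier ones.
library(dplyr)
toy <- data.frame(
  carrier   = c("AA", "BB", "CC"),
  avg_delay = c(5, 5, 3),        # AA and BB tie on average delay
  p_canc    = c(2.0, 1.0, 4.0),  # BB has the lower cancellation percentage
  stringsAsFactors = FALSE
)
toy_ranked <- toy %>% arrange(avg_delay, p_canc)
toy_ranked$carrier  # "CC" "BB" "AA"
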
# Generate a per-day-of-week summary of hflights with the variable avg_taxi,
# the average total taxiing time. Pipe this summary into an arrange() call such
# that the day with the highest avg_taxi comes first.
hflights %>%
  group_by(DayOfWeek) %>%
  summarize(avg_taxi = mean(TaxiIn + TaxiOut, na.rm = TRUE)) %>%
  arrange(desc(avg_taxi))