@sebastiansauer
Created November 27, 2015 12:02
Automated reporting: one-pager with LaTeX + R + knitr
\documentclass{article}
\usepackage[margin=2cm,noheadfoot]{geometry}
%\usepackage{fancyhdr}
%\pagestyle{fancyplain}
\usepackage{eso-pic}
%\rhead{\includegraphics[height=5cm]{logo_dd}} % right logo
%\lhead{\includegraphics[height=3cm]{logo_dd}} % left logo
%\renewcommand{\headrulewidth}{0pt} % remove rule below header
\date{}
\pagenumbering{gobble}
\renewcommand\floatpagefraction{.9}
\renewcommand\dblfloatpagefraction{.9} % for two column documents
\renewcommand\topfraction{.9}
\renewcommand\dbltopfraction{.9} % for two column documents
\renewcommand\bottomfraction{.9}
\renewcommand\textfraction{.1}
\setcounter{totalnumber}{50}
\setcounter{topnumber}{50}
\setcounter{bottomnumber}{50}
<<init, echo = FALSE, include = FALSE>>=
library(nycflights13)
data(flights)
# data(planes)
library(dplyr)
library(ggplot2)
library(tidyr)
library(lubridate)
library(gridExtra)
library(xtable)
library(knitr)
# choose quarter
active_quarter <- 4
# add date, week, quarter (1-4), and ordered month abbreviations to the data frame
flights <- flights %>%
  mutate(date = ymd(paste(year, month, day, sep = "-")),
         week = week(date),
         quarter = quarter(date),
         month_abb = factor(month.abb[month], levels = month.abb, ordered = TRUE)) %>%
  filter(quarter == active_quarter) %>%
  filter(week != 53) %>% # week 53 holds only Dec 31
  na.omit() # delete rows with NAs
# flights now holds only the active quarter, so no further quarter filter is needed here
flights_active_quarter <- nrow(flights)
flights_jfk <- nrow(flights %>% filter(origin == "JFK"))
flights_lga <- nrow(flights %>% filter(origin == "LGA"))
flights_ewr <- nrow(flights %>% filter(origin == "EWR"))
@
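% place content at 80% of the page width and height, measured from the
% lower-left corner (despite the macro name, this is near the upper-right corner)
\newcommand\AtPagemyUpperLeft[1]{\AtPageLowerLeft{%
\put(\LenToUnit{0.8\paperwidth},\LenToUnit{0.8\paperheight}){#1}}}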
\AddToShipoutPictureFG{
\AtPagemyUpperLeft{{\includegraphics[width=5cm,keepaspectratio]{logo_dd}}}
}%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\title{\vspace{-2cm}Analysis of flights from NYC in 2013;\\
QUARTER: \Sexpr{active_quarter}}
\maketitle
\section*{Background and Objective}
This report analyses the number of flights departing from the
NYC airports (JFK, LGA, and EWR).
\section*{Main figures}
In the present quarter (\textbf{Quarter: \Sexpr{active_quarter}}) there were a total of
\textbf{\Sexpr{flights_active_quarter}} flights. Of these,
\textbf{\Sexpr{flights_jfk}} flights departed from \textbf{JFK},
\textbf{\Sexpr{flights_lga}} from \textbf{LGA}, and
\textbf{\Sexpr{flights_ewr}} from \textbf{EWR}.
Non eram nescius,
Brute, cum, quae summis ingeniis exquisitaque doctrina philosophi
Graeco sermone tractavissent, ea Latinis litteris mandaremus,
fore ut hic noster labor in varias reprehensiones incurreret.
<<figs_define, echo = FALSE>>=
# how many flights per origin airport and per month?
f_1 <- flights %>%
  filter(quarter == active_quarter) %>%
  select(origin, month_abb) %>%
  group_by(month_abb, origin) %>%
  summarise(flight_count = n())
p_f_5 <- f_1 %>%
  ggplot(aes(x = origin, y = flight_count)) + geom_bar(stat = "identity") +
  facet_wrap(~ month_abb) +
  # ggtitle("Number of flights per month from each NYC origin airport") +
  theme(axis.title.y = element_text(angle = 0)) +
  coord_flip() +
  scale_y_continuous(breaks = c(10000))
by_week <- flights %>%
  filter(quarter == active_quarter) %>%
  group_by(week = week(date), origin) %>%
  summarize(n = n(), delay = mean(arr_delay), n_dests = n_distinct(dest))
p_f_7 <- by_week %>%
  ggplot(aes(week, n, colour = origin)) + geom_line() +
  # ggtitle(paste("Number of flights per week of the year per origin airport; quarter", active_quarter)) +
  theme(axis.title.y = element_text(angle = 0))
@
\begin{figure}[h!]
\centering
<<figs_print, fig.height=2, out.width='1\\linewidth', echo = FALSE>>=
grid.arrange(p_f_5, p_f_7, ncol = 2)
@
\caption{Number of flights per month (left panel) and week (right panel)
from each NYC origin airport}
\end{figure}
<<compute_top10, echo = FALSE>>=
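# top 10 delayed flights
# note: delay is the arrival delay minus the departure delay,
# i.e. the minutes lost (or gained) between departure and arrival
top_10_delayed_flights <-
  flights %>%
  mutate(delay = arr_delay - dep_delay) %>%
  select(carrier, tailnum, flight, origin, dest, air_time, date, delay) %>%
  arrange(desc(delay)) %>%
  top_n(10, delay) # may return more than 10 rows if delays are tied
top_delay <- max(top_10_delayed_flights$delay)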
@
\section*{Top 10 delayed flights in this quarter}
In this quarter, the \textbf{maximum delay} was \textbf{\Sexpr{top_delay} minutes}.
nam quibusdam, et iis quidem non admodum indoctis, totum hoc displicet philosophari.
quidam autem non tam id reprehendunt, si remissius agatur, sed tantum studium
tamque multam operam ponendam in eo non arbitrantur.
<<print_table_top10, echo = FALSE, results = "asis", warning = FALSE>>=
# kable(top_10_delayed_flights)
options(xtable.comment = FALSE)
suppressMessages(print(xtable(top_10_delayed_flights), type="latex"))
@
\end{document}
@sebastiansauer
Use this code to create an automated, data-based report. LaTeX is used for page layout; R for analysis. knitr brings the folks together...
[Image: nyc_flights_automated_reporting_q2]
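
To turn the file into a PDF, something along these lines should work. This is a minimal sketch, not part of the gist: the file name `flights_onepager.Rnw` and the helpers `tex_name()` and `render_quarter()` are made up for illustration. The loop also assumes that the hard-coded `active_quarter <- 4` in the `init` chunk has been changed to `if (!exists("active_quarter")) active_quarter <- 4`, so that a value set from outside is not overwritten. It needs a working LaTeX installation and the image `logo_dd` next to the `.Rnw` file.

```r
# Hypothetical file name; the gist does not say what the .Rnw file is called.
rnw_file <- "flights_onepager.Rnw"

# Build the .tex file name for a given quarter,
# e.g. "flights_onepager_q2.tex" for quarter 2.
tex_name <- function(rnw, q) {
  sub("\\.Rnw$", paste0("_q", q, ".tex"), rnw)
}

# Knit the .Rnw to .tex and compile it to PDF for one quarter.
# ASSUMPTION: the init chunk keeps an already existing `active_quarter`
# (see the note above); otherwise every run reports on quarter 4.
render_quarter <- function(q, rnw = rnw_file) {
  active_quarter <- q  # picked up by the chunks through `envir`
  knitr::knit2pdf(rnw, output = tex_name(rnw, q), envir = environment())
}

# Only run if the report file is actually present.
if (file.exists(rnw_file)) {
  for (q in 1:4) render_quarter(q)
}
```

For a single report with the quarter left as set in the file, `knitr::knit2pdf("flights_onepager.Rnw")` on its own is enough.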
