
Last active Nov 19, 2020
How-To: Leverage R Markdown output files to create simple ETL processes on RStudio Connect (These are two separate data assets)
library(shiny)
library(DT)

# URL of the data.csv output file written by the R Markdown report (left empty in the source)
data <- read.csv('')

ui <- fluidPage(
    titlePanel("Basic Data Filter Application"),
    h4("This application presents data generated by a scheduled R Markdown process: ", tags$a(href="", "See it here!")),
    p("Use this framework to build out your own R Markdown-based ETL jobs hosted on RStudio Connect."),
    # One drop-down per attribute; the input IDs match the columns filtered in the server
    fluidRow(
        column(4, selectInput("a", "Attribute A:",
                              c("All","Positive Values","Negative Values"))),
        column(4, selectInput("b", "Attribute B:",
                              c("All","Positive Values","Negative Values"))),
        column(4, selectInput("c", "Attribute C:",
                              c("All","Positive Values","Negative Values")))
    ),
    DT::dataTableOutput("table")
)

server <- function(input, output) {
    # Filter data based on selections
    output$table <- DT::renderDataTable(DT::datatable({
        if (input$a != "All") {
            if (input$a == "Positive Values") {
                data <- data[data$a >= 0,]
            } else data <- data[data$a < 0,]
        }
        if (input$b != "All") {
            if (input$b == "Positive Values") {
                data <- data[data$b >= 0,]
            } else data <- data[data$b < 0,]
        }
        if (input$c != "All") {
            if (input$c == "Positive Values") {
                data <- data[data$c >= 0,]
            } else data <- data[data$c < 0,]
        }
        # Return the filtered rows for display
        data
    }))
}

# Run the application
shinyApp(ui = ui, server = server)
---
title: "Output File Framework for R Markdown ETL on RStudio Connect"
output: html_document
rmd_output_metadata:
  rsc_output_files:
    - "data.csv"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
A common use case for [Scheduling on RStudio Connect]() is automating scheduled data updates as part of an R-based process. This example report outputs a CSV file that other assets hosted on RStudio Connect can consume.
## R Markdown Output Metadata and Output Files
- [User Guide Reference]()
The purpose of this R Markdown document is to make an output data file (updated on a schedule) available over HTTP on my RStudio Connect server.
### Extract/Transform Data
```{r}
df <- data.frame(a=rnorm(50), b=rnorm(50), c=rnorm(50), d=rnorm(50), e=rnorm(50))
```
Every time this report is executed, it creates a new random data frame. _Creating dummy data is not representative of a typical ETL process._ You'll likely want to replace this section with code that pulls data from a database or API.
- Best practices for working with databases can be found at []()
- The `httr` package is a [good place to start]() when working with REST APIs and the HTTP protocol
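As a rough illustration of swapping the dummy data for an API call, here is a minimal sketch built on `httr` and `jsonlite`. The endpoint, the JSON shape (an array of flat records), and the function names `parse_records` and `fetch_records` are assumptions made for this example, not part of the original report.

```r
# Sketch only: the endpoint, its JSON shape, and both function names are
# assumptions made for illustration.

# Convert a JSON array of flat records, e.g. [{"a": 1.2, "b": -0.4}, ...],
# into a data frame with one row per record.
parse_records <- function(json_text) {
  jsonlite::fromJSON(json_text)
}

# Request the records over HTTP and stop with an error on a 4xx/5xx response.
fetch_records <- function(url) {
  resp <- httr::GET(url)
  httr::stop_for_status(resp)
  parse_records(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Offline check of the parsing step with a small hand-written payload
sample_json <- '[{"a": 1.5, "b": -2.0}, {"a": -0.5, "b": 3.25}]'
df_sample <- parse_records(sample_json)
```

In the report, a call such as `df <- fetch_records("https://api.example.com/records")` (a placeholder address) could then take the place of the `data.frame()` line.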
### Show a nice table preview (optional)
```{r message=FALSE, warning=FALSE}
library(dplyr)  # provides %>% and sample_n()
library(gt)
df %>%
  sample_n(6) %>%
  gt() %>%
  tab_header(
    title = "Current Data Sample"
  )
```
### Write data (CSV file) Important!
```{r}
write.csv(df, "data.csv", row.names=FALSE)
```
This is the step that creates the data.csv output file. There are two ways to specify output files:
- List file names in the R Markdown YAML header under `rmd_output_metadata` and `rsc_output_files` _(done above)_
- List the output files from within the R code chunk
Reference: [How to work with output files]()
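The second option, declaring the output file from inside a code chunk, might look like the sketch below. It relies on `rmarkdown::output_metadata`; the two-column data frame and the `out_file` variable are stand-ins invented for this example.

```r
# Stand-in data for illustration; in the report this would be the real `df`.
df <- data.frame(a = rnorm(50), b = rnorm(50))

# Write the file relative to the report's working directory ...
out_file <- "data.csv"
write.csv(df, out_file, row.names = FALSE)

# ... then register it as an output file for this rendering,
# instead of listing it in the YAML header.
rmarkdown::output_metadata$set(rsc_output_files = list(out_file))
```

Setting the file list in code can be handy when the file name is only known at render time, for example when it includes a date.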
### Download data
#### Here is the data generated from this report: [data.csv](data.csv)
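Because the link above is relative to the report, another asset can presumably fetch the same file by appending the file name to the report's URL. The sketch below only builds that address; the helper name `output_file_url` and the `connect.example.com` address are placeholders, and it assumes the reader is allowed to access the report without extra authentication.

```r
# Hypothetical helper: join a report's content URL and an output file name,
# tolerating a trailing slash on the content URL.
output_file_url <- function(content_url, file_name) {
  paste0(sub("/+$", "", content_url), "/", file_name)
}

# Placeholder address; replace with the URL of your own published report.
report_url <- "https://connect.example.com/content/42/"
csv_url <- output_file_url(report_url, "data.csv")

# In a consuming asset (not run here, since the address above is a placeholder):
# data <- read.csv(csv_url)
```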

Owner Author

@kellobri kellobri commented Apr 22, 2019



@cderv cderv commented Apr 25, 2019

Really nice example ! thanks !
