@davebraze
Last active June 30, 2019 17:47
Access yaml variables within rmarkdown file

Given a yaml block like this:

---
title: My Data Summary (DRAFT)
subtitle: Fall 2019 Data
author: David Braze
institute: Haskins Laboratories
date: "`r format(Sys.time(), '%B %d, %Y')`"

fontsize: 11pt
geometry: margin=1in
documentclass: article
toc: false

output:
  html_document:
    highlight: tango
    pandoc_args:
    - --output
    - !expr paste0("data-summary-", format(Sys.time(), "%Y%m%d"), ".html")

  pdf_document:
    highlight: tango
    latex_engine: xelatex
    number_sections: false
    pandoc_args:
    - --output
    - !expr paste0("data-summary-", format(Sys.time(), "%Y%m%d"), ".pdf")

  word_document:
    highlight: tango
    pandoc_args:
    - --reference-doc
    - resources/FDB-styles.docx
    - --output
    - !expr paste0("data-summary-", format(Sys.time(), "%Y%m%d"), ".docx")

---

We can access the entire set of yaml definitions from within R, as a list, like so:

rmarkdown::metadata

This yields a nested list that can be indexed to pull out the desired components:

## $title
## [1] "My Data Summary (DRAFT)"
## 
## $subtitle
## [1] "Fall 2019 Data"
## 
## $author
## [1] "David Braze"
## 
## $institute
## [1] "Haskins Laboratories"
## 
## $date
## [1] "`r format(Sys.time(), '%B %d, %Y')`"
## 
## $fontsize
## [1] "11pt"
## 
## $geometry
## [1] "margin=1in"
## 
## $documentclass
## [1] "article"
## 
## $toc
## [1] FALSE
## 
## $output
## $output$html_document
## $output$html_document$highlight
## [1] "tango"
## 
## $output$html_document$pandoc_args
## [1] "--output"                   "data-summary-20190630.html"
## 
## 
## $output$pdf_document
## $output$pdf_document$highlight
## [1] "tango"
## 
## $output$pdf_document$latex_engine
## [1] "xelatex"
## 
## $output$pdf_document$number_sections
## [1] FALSE
## 
## $output$pdf_document$pandoc_args
## [1] "--output"                  "data-summary-20190630.pdf"
## 
## 
## $output$word_document
## $output$word_document$highlight
## [1] "tango"
## 
## $output$word_document$pandoc_args
## [1] "--reference-doc"            "resources/FDB-styles.docx" 
## [3] "--output"                   "data-summary-20190630.docx"

So, the title can be accessed as a string as follows (returns "My Data Summary (DRAFT)"):

rmarkdown::metadata$title

For single-element, non-nested parts of the yaml definitions, this works either in an R code block or in inline R code. For values that are nested lists, it only works in a code block, not in inline code.
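Inline code needs a value that prints as a single string, so one possible workaround for nested values is to collapse them first. This is a minimal sketch, not something from the original setup: `meta` is a hypothetical, trimmed-down stand-in for rmarkdown::metadata (which is only populated while a document is being rendered), and `inline_safe()` is a hypothetical helper name.

```r
# Hypothetical stand-in for rmarkdown::metadata, trimmed from the listing above
meta <- list(
  title = "My Data Summary (DRAFT)",
  output = list(
    html_document = list(
      highlight = "tango",
      pandoc_args = c("--output", "data-summary-20190630.html")
    )
  )
)

# Hypothetical helper: flatten any (possibly nested) value to one
# comma-separated string that is safe to print from inline code
inline_safe <- function(x) toString(unlist(x, use.names = FALSE))

inline_safe(meta$title)
# "My Data Summary (DRAFT)"
inline_safe(meta$output$html_document)
# "tango, --output, data-summary-20190630.html"
```

Inside a document, the same helper could be called inline on rmarkdown::metadata$output instead of on `meta`.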

One of the reasons I'm interested in accessing yaml definitions from R code is so that I can tailor the formatting of, for example, tables or graphics to the particular output format (e.g., pdf vs. html vs. docx). A call to rmarkdown::render() with the rmarkdown file in question as its input argument, and no explicit output format, will use the first format specification in the output list. The name of that specification can be accessed as shown below, which for the yaml block above returns "html_document":

names(rmarkdown::metadata$output)[1]
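One wrinkle is that `output` is not always a nested list: with a header such as `output: pdf_document`, it is a plain string and names() returns NULL. The sketch below covers both shapes and then picks a table style from the result. The function name `first_output_format()`, the stand-in lists, and the style labels are all hypothetical and only illustrate the idea.

```r
# Hypothetical helper: name of the first output format in a metadata list
first_output_format <- function(meta) {
  out <- meta$output
  if (is.null(out)) return(NA_character_)    # no output field at all
  if (is.character(out)) return(out[[1]])    # e.g. output: pdf_document
  names(out)[[1]]                            # nested list, as in the block above
}

# Hypothetical stand-ins for rmarkdown::metadata
meta_nested <- list(output = list(
  html_document = list(highlight = "tango"),
  pdf_document  = list(highlight = "tango")
))
meta_plain <- list(output = "pdf_document")

fmt <- first_output_format(meta_nested)

# Choose a (made-up) table style label based on the format name
table_style <- switch(fmt,
  html_document = "html",
  pdf_document  = "latex",
  "plain"                                    # fallback, e.g. word_document
)
```

Note that this only reflects the order of entries in the yaml block. If render() is called with an explicit output format, the first entry need not be the one being produced; knitr::is_html_output() and knitr::is_latex_output() query the actual target and may be a better fit in that case.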

To access the content of the first list in "output", you would call:

rmarkdown::metadata$output[1]

which returns:

## $html_document
## $html_document$highlight
## [1] "tango"
## 
## $html_document$pandoc_args
## [1] "--output"                   "data-summary-20190630.html"

One thing to notice is that some of the R code in the yaml block is evaluated, and some isn't. I don't fully understand the difference, but observe that for R code embedded as `r X`, X shows up unevaluated in rmarkdown::metadata, whereas for R code embedded as !expr X, X is evaluated and replaced with its return value.
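The same behavior can be reproduced outside rmarkdown with the yaml package, which suggests (as far as I can tell, and this is an assumption about rmarkdown's internals) that !expr is handled by the yaml parser itself, while `r X` is just an ordinary string to that parser. A minimal sketch with made-up field names and expressions:

```r
library(yaml)

# Two hypothetical fields: one with inline-style R code, one tagged with !expr
txt <- 'date: "`r 1 + 1`"\nfile: !expr paste0("a", 1 + 1)'

# eval.expr = TRUE asks the yaml parser to evaluate !expr-tagged values
parsed <- yaml::yaml.load(txt, eval.expr = TRUE)

parsed$date   # left as the literal string "`r 1 + 1`"
parsed$file   # evaluated by the parser: "a2"
```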
