library(reticulate)
library(here)
library(glue)
library(fs)
library(stringr)

#' Turn a Jupyter notebook into a blogdown post
#'
#' This function makes several key assumptions:
#'
#' - The Jupyter notebook you want to turn into a blog post
#'   lives in `content/jupyter`.
#'
#' - The name of the `.ipynb` file is `{slug}.ipynb`, using
#'   `glue` notation.
#'
#' - `conda` is on your path, and your `base` conda environment
#'   has jupyter installed.
#'
#' First executes the notebook (this can take a moment!) from
#' scratch, then turns the rendered notebook into an `.md` file
#' in `content/blog`. Automatically adds YAML frontmatter
#' needed by blogdown, and moves supporting files (such as images)
#' to `static/jupyter_support`, fixing paths along the way.
#'
#' The idea is that you only ever modify the `.ipynb` file for your
#' post, and then you call this function to turn that `.ipynb` into
#' a `.md` post file.
#'
#' This function extends ideas from Timothy Lin at
#' https://www.timlrx.com/2018/03/25/uploading-jupyter-notebook-files-to-blogdown/
#'
rmarkdownify <- function(slug, title) {

  # work inside content/jupyter, restoring the old working directory on exit
  original_wd <- getwd()
  setwd(here("content/jupyter"))
  on.exit(setwd(original_wd))

  nb_path <- glue("{slug}.ipynb")

  # execute the notebook from scratch and convert it to markdown
  render_md <- glue(
    "conda activate base",
    "jupyter nbconvert --to markdown --execute {nb_path}",
    .sep = " && "
  )

  # note: shell() is only available on Windows
  shell(render_md)

  # nbconvert writes images and other outputs to `{slug}_files`
  support_path <- paste0(slug, "_files")
  blogdown_support_dir <- here(glue("static/jupyter_support/{support_path}"))

  dir_copy(support_path, blogdown_support_dir, overwrite = TRUE)

  new_post_path <- here(glue("content/blog/{slug}.md"))

  frontmatter <- glue(
    "---",
    "title: '{title}'",
    "author: 'Alex Hayes'",
    "date: '{Sys.Date()}'",
    "slug: {slug}",
    "---\n\n",
    .sep = "\n"
  )

  post_md_path <- glue("{slug}.md")
  post_md_text <- readLines(post_md_path)

  # point image links at the copy under static/jupyter_support
  new_support_path <- glue("/jupyter_support/{support_path}")

  new_post_text <- c(
    frontmatter,
    str_replace_all(post_md_text, support_path, new_support_path)
  )

  writeLines(new_post_text, new_post_path)

  # clean up the intermediate files left in content/jupyter
  file_delete(post_md_path)
  dir_delete(support_path)
}

rmarkdownify(
  slug = "many-models-workflows-in-python-part-i",
  title = "many models workflows in python: part i"
)
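
The frontmatter and path-rewriting step of `rmarkdownify()` can be tried on its own, without conda or a real notebook. The following is a minimal sketch in base R only: the slug, title, and the two markdown lines are made-up inputs, and `gsub(..., fixed = TRUE)` stands in for `str_replace_all()` so that the support path is matched literally rather than as a regular expression.

```r
# Sketch of the frontmatter + path-rewriting step, in base R only.
# `slug`, `title`, and the sample markdown lines are made-up inputs.
slug <- "my-post"
title <- "my post"

support_path <- paste0(slug, "_files")
new_support_path <- paste0("/jupyter_support/", support_path)

# Roughly what nbconvert might emit for a notebook with one plot
# (illustrative only).
post_md_text <- c(
  "Some text.",
  "![png](my-post_files/my-post_3_0.png)"
)

frontmatter <- c(
  "---",
  sprintf("title: '%s'", title),
  sprintf("slug: %s", slug),
  "---",
  ""
)

# fixed = TRUE treats the path as a literal string, not a regex.
new_post_text <- c(
  frontmatter,
  gsub(support_path, new_support_path, post_md_text, fixed = TRUE)
)

cat(new_post_text, sep = "\n")
```

After the rewrite, the image link is a site-root path, which matches where the function copies the supporting files under `static/`.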
Thanks, this is much nicer on the whole! I ended up running into rstudio/rstudio#4182 after I switched to this solution. I'm going to end up using my own variant of `convert_ipynb` so that I can have more control over the frontmatter. In particular, I'm going to emulate @machow's strategy of a raw cell with YAML frontmatter at the top of the `.ipynb`. Once I get that working I can share it here or potentially make a PR to `rmarkdown` to increase flexibility around frontmatter.
Okay @yihui, I'm struggling to make `rmarkdown:::convert_ipynb()` work for me, mostly because I don't want to keep the intermediate `.Rmd` -- I exclusively want to work in the original `.ipynb` with no double record keeping. So I tried including the following in my `build.R`:
monkey_patch <- paste0(
  "```{python, echo = FALSE}\n",
  "import os\n",
  "os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = ",
  "'C:/Users/alex/Anaconda3/Library/plugins/platforms'\n",
  "```\n"
)

# based on https://github.com/rstudio/rmarkdown/blob/master/R/jupyter.R#L41
render_ipynb <- function(
  input,
  output = xfun::with_ext(input, 'Rmd'),
  output_dir = NULL) {

  json <- jsonlite::fromJSON(input, simplifyDataFrame = FALSE)
  lang <- json$metadata$kernelspec$language

  # assumes first cell is a raw cell with yaml frontmatter
  frontmatter_cell <- json$cells[[1]]
  frontmatter_source <- paste0(frontmatter_cell$source, collapse = "")

  res <- c(frontmatter_source, "", monkey_patch, "")

  for (cell in json$cells[-1]) {
    if (length(src <- unlist(cell$source)) == 0) {
      next
    }
    src <- gsub("\n$", "", src)
    src <- switch(
      cell$cell_type,
      code = rmarkdown:::cell_chunk(
        src,
        lang, cell$metadata
      ),
      src
    )
    res <- c(res, src, "")
  }

  xfun::write_utf8(res, output)
  rmarkdown::render(output, output_dir = output_dir)

  # strip out the intermediate Rmd so it doesn't get
  # built during the standard site build process
  file.remove(output)
}

post_directory <- here("content/blog")
draft_directory <- here("content/drafts")

# `pattern` is a regular expression; anchor it so that only files
# ending in .ipynb match
ipynb_posts <- list.files(
  post_directory,
  pattern = "\\.ipynb$",
  full.names = TRUE
)

ipynb_drafts <- list.files(
  draft_directory,
  pattern = "\\.ipynb$",
  full.names = TRUE
)

for (post in ipynb_posts) {
  render_ipynb(post, output_dir = post_directory)
}
This produces a `.html` file, but that file doesn't seem to be playing well with `blogdown` or registering with Hugo. My YAML frontmatter in the `.ipynb` is:
---
title: "many models workflows in python: part i"
author: "Alex Hayes"
date: "2020-08-22"
output: html_document
slug: "many-models-workflows-in-python-part-i"
---
I'm assuming I can make this work if I somehow call the right renderer that will talk to `blogdown`?
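
The `render_ipynb()` listing above assumes that the first notebook cell is a raw cell holding YAML frontmatter. A minimal sketch of how that assumption could be checked before rendering follows. `extract_frontmatter()` is a hypothetical helper, not part of rmarkdown or blogdown, and the `nb` list is a hand-built stand-in for what `jsonlite::fromJSON(path, simplifyDataFrame = FALSE)` would return.

```r
# Hypothetical helper: return the frontmatter from a parsed notebook,
# or stop with a clear message if the first cell is not usable.
# `nb` is the list obtained by parsing the .ipynb JSON.
extract_frontmatter <- function(nb) {
  if (length(nb$cells) == 0) {
    stop("notebook has no cells")
  }

  first <- nb$cells[[1]]
  if (!identical(first$cell_type, "raw")) {
    stop("first cell must be a raw cell holding YAML frontmatter")
  }

  src <- paste0(unlist(first$source), collapse = "")

  # Check that the non-empty lines start and end with a `---` delimiter.
  lines <- trimws(strsplit(src, "\n", fixed = TRUE)[[1]])
  lines <- lines[nzchar(lines)]
  if (length(lines) < 2 || lines[1] != "---" || lines[length(lines)] != "---") {
    stop("first cell is not delimited by '---' lines")
  }

  src
}

# Hand-built stand-in for a parsed notebook (illustrative only).
nb <- list(cells = list(
  list(cell_type = "raw", source = c("---\n", "title: \"demo\"\n", "---")),
  list(cell_type = "code", source = "print('hi')")
))

cat(extract_frontmatter(nb), "\n")
```

Failing early like this gives a readable error for a notebook that lacks the raw cell, rather than silently writing a code or markdown cell into the top of the `.Rmd`.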
I don't see problems with this YAML frontmatter, and don't know why it doesn't work with Hugo.
I added `rmarkdown:::convert_ipynb()` to rmarkdown more than a year ago, but didn't export it because I don't use Jupyter myself, and don't really know if this simple converter could be useful. I won't be surprised if you need to tweak it (please feel free to). I was just looking for a tester of this function, and wasn't really sure if it could be helpful. Anyway, thanks for testing!
Have you experimented with `rmarkdown:::convert_ipynb()`? You may not need to have Conda or Jupyter installed.