@alexpghayes
Created August 10, 2020 01:38
library(reticulate)
library(here)
library(glue)
library(fs)
library(stringr)
#' Turn a Jupyter notebook into a blogdown post
#'
#' This function makes several key assumptions:
#'
#' - The Jupyter notebook you want to turn into a blog post
#' lives in `content/jupyter`.
#'
#' - The name of the `.ipynb` file is `{slug}.ipynb`, using
#' `glue` notation.
#'
#' - `conda` is on your path, and your `base` conda environment
#' has jupyter installed.
#'
#' First executes the notebook (this can take a moment!) from
#' scratch, then turns the rendered notebook into an `.md` file
#' in `content/blog`. Automatically adds YAML frontmatter
#' needed by blogdown, and moves supporting files (such as images)
#' to `static/jupyter_support`, fixing paths along the way.
#'
#' The idea is that you only ever modify the `.ipynb` file for your
#' post, and then you call this function to turn that `.ipynb` into
#' a `.md` post file.
#'
#' This function extends ideas from Timothy Lin at
#' https://www.timlrx.com/2018/03/25/uploading-jupyter-notebook-files-to-blogdown/
#'
rmarkdownify <- function(slug, title) {
  # work from the notebook directory and restore the old one on exit
  original_wd <- getwd()
  setwd(here("content/jupyter"))
  on.exit(setwd(original_wd))

  # execute the notebook from scratch and convert it to markdown
  nb_path <- glue("{slug}.ipynb")
  render_md <- glue(
    "conda activate base",
    "jupyter nbconvert --to markdown --execute {nb_path}",
    .sep = " && "
  )
  # note: shell() only exists on Windows
  shell(render_md)

  # copy supporting files (such as images) into the static directory
  support_path <- paste0(slug, "_files")
  blogdown_support_dir <- here(glue("static/jupyter_support/{support_path}"))
  dir_copy(support_path, blogdown_support_dir, overwrite = TRUE)

  new_post_path <- here(glue("content/blog/{slug}.md"))

  # YAML frontmatter needed by blogdown
  frontmatter <- glue(
    "---",
    "title: '{title}'",
    "author: 'Alex Hayes'",
    "date: '{Sys.Date()}'",
    "slug: {slug}",
    "---\n\n",
    .sep = "\n"
  )

  # point references to supporting files at their new location
  post_md_path <- glue("{slug}.md")
  post_md_text <- readLines(post_md_path)
  new_support_path <- glue("/jupyter_support/{support_path}")
  new_post_text <- c(
    frontmatter,
    str_replace_all(post_md_text, support_path, new_support_path)
  )
  writeLines(new_post_text, new_post_path)

  # clean up the intermediate nbconvert output
  file_delete(post_md_path)
  dir_delete(support_path)
}

rmarkdownify(
  slug = "many-models-workflows-in-python-part-i",
  title = "many models workflows in python: part i"
)
@yihui

yihui commented Aug 12, 2020

Have you experimented with rmarkdown:::convert_ipynb()? You may not need to have Conda or Jupyter installed.
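As a rough sketch of what trying this could look like, assuming the `rmarkdown` package (with its `jsonlite` and `xfun` dependencies) is installed. The tiny demo notebook and the file paths are made up for illustration, and `convert_ipynb()` is reached with `:::` because the comment above refers to it that way:

```r
# Write a tiny two-cell notebook to a temporary file (made-up content).
nb_path <- tempfile(fileext = ".ipynb")
writeLines(c(
  '{"cells": [',
  '  {"cell_type": "markdown", "metadata": {}, "source": ["# A demo post"]},',
  '  {"cell_type": "code", "metadata": {}, "execution_count": null,',
  '   "outputs": [], "source": ["print(1)"]}',
  ' ],',
  ' "metadata": {"kernelspec": {"language": "python"}},',
  ' "nbformat": 4, "nbformat_minor": 4}'
), nb_path)

# Convert the notebook to an .Rmd file next to it. The notebook is only
# read as JSON here; nothing is executed by Jupyter at this step.
rmd_path <- xfun::with_ext(nb_path, "Rmd")
rmarkdown:::convert_ipynb(nb_path, output = rmd_path)

# Inspect the generated R Markdown source.
rmd_lines <- readLines(rmd_path)
cat(rmd_lines, sep = "\n")
```

The resulting `.Rmd` would still need to be rendered (for example with `rmarkdown::render()`), and running Python chunks at that stage would rely on a working Python setup.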

@alexpghayes
Author

Thanks, this is much nicer on the whole! I ended up running into rstudio/rstudio#4182 after I switched to this solution. I'm going to end up using my own variant of convert_ipynb so that I can have more control over the frontmatter. In particular, I'm going to emulate @machow's strategy of a raw cell with YAML frontmatter at the top of the .ipynb. Once I get that working I can share it here or potentially make a PR to rmarkdown to increase flexibility around frontmatter.
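A minimal sketch of the raw-cell idea, assuming the frontmatter lives in the first cell of the notebook and that cell has type `raw`. `read_ipynb_frontmatter()` is a hypothetical helper written for illustration; it is not part of rmarkdown, and the demo notebook is made up:

```r
library(jsonlite)

# Hypothetical helper: return the YAML frontmatter lines stored in the
# first raw cell of a notebook, or NULL when no such cell is found.
read_ipynb_frontmatter <- function(path) {
  json <- fromJSON(path, simplifyDataFrame = FALSE)
  cells <- json$cells
  if (length(cells) == 0) return(NULL)

  first <- cells[[1]]
  if (!identical(first$cell_type, "raw")) return(NULL)

  # cell sources are stored as a vector of line fragments
  source <- paste0(unlist(first$source), collapse = "")
  lines <- strsplit(source, "\n", fixed = TRUE)[[1]]

  # frontmatter must open and close with a line of three dashes
  n <- length(lines)
  if (n < 2 || lines[1] != "---" || lines[n] != "---") return(NULL)
  lines
}

# Demo notebook whose first cell is a raw cell holding YAML.
demo_path <- tempfile(fileext = ".ipynb")
writeLines(c(
  '{"cells": [',
  '  {"cell_type": "raw", "metadata": {},',
  '   "source": ["---\\n", "title: demo\\n", "---"]},',
  '  {"cell_type": "code", "metadata": {}, "execution_count": null,',
  '   "outputs": [], "source": ["print(1)"]}',
  ' ],',
  ' "metadata": {"kernelspec": {"language": "python"}},',
  ' "nbformat": 4, "nbformat_minor": 4}'
), demo_path)

fm <- read_ipynb_frontmatter(demo_path)
print(fm)
```

Returning `NULL` rather than raising an error leaves the caller free to fall back to default frontmatter when a notebook has no raw YAML cell.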

@alexpghayes
Author

Okay @yihui, I'm struggling to make rmarkdown:::convert_ipynb() work for me, mostly because I don't want to keep the intermediate .Rmd. I want to work exclusively in the original .ipynb, with no double record keeping. So I tried including the following in my build.R:

monkey_patch <- paste0(
  "```{python, echo = FALSE}\n",
  "import os\n",
  "os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = ",
  "'C:/Users/alex/Anaconda3/Library/plugins/platforms'\n",
  "```\n"
)

# based on https://github.com/rstudio/rmarkdown/blob/master/R/jupyter.R#L41
render_ipynb <- function(
  input,
  output = xfun::with_ext(input, 'Rmd'),
  output_dir = NULL) {
  json <- jsonlite::fromJSON(input, simplifyDataFrame = FALSE)
  lang <- json$metadata$kernelspec$language

  # assumes first cell is a raw cell with yaml frontmatter
  frontmatter_cell <- json$cells[[1]]
  frontmatter_source <- paste0(frontmatter_cell$source, collapse = "")

  res <- c(frontmatter_source, "", monkey_patch, "")

  for (cell in json$cells[-1]) {
    if (length(src <- unlist(cell$source)) == 0) {
      next
    }
    src <- gsub("\n$", "", src)

    src <- switch(
      cell$cell_type,
      code = rmarkdown:::cell_chunk(
        src,
        lang, cell$metadata
      ),
      src
    )

    res <- c(res, src, "")
  }
  xfun::write_utf8(res, output)
  rmarkdown::render(output, output_dir = output_dir)
  # strip out the intermediate Rmd so it doesn't get
  # built during the standard site build process
  file.remove(output)
}

post_directory <- here("content/blog")
draft_directory <- here("content/drafts")

ipynb_posts <- list.files(
  post_directory,
  pattern = "\\.ipynb$",
  full.names = TRUE
)

ipynb_drafts <- list.files(
  draft_directory,
  pattern = "\\.ipynb$",
  full.names = TRUE
)

for (post in ipynb_posts) {
  render_ipynb(post, output_dir = post_directory)
}

This produces a .html file, but that file doesn't seem to play well with blogdown or register with Hugo. My YAML frontmatter in the .ipynb is:

---
title: "many models workflows in python: part i"
author: "Alex Hayes"
date: "2020-08-22"
output: html_document
slug: "many-models-workflows-in-python-part-i"
---

I'm assuming I can make this work if I somehow call the right renderer that will talk to blogdown?

@yihui

yihui commented Aug 24, 2020

I don't see problems with this YAML frontmatter, and don't know why it doesn't work with Hugo.

I added rmarkdown:::convert_ipynb() to rmarkdown more than a year ago, but didn't export it because I don't use Jupyter myself, and don't really know if this simple converter could be useful. I won't be surprised if you need to tweak it (please feel free to). I was just looking for a tester of this function, and wasn't really sure if it could be helpful. Anyway, thanks for testing!
