@simonpcouch
Created May 9, 2024 15:16

Supporting older R versions in broom

In preparing the most recent release of the broom package, I’ve run into some headaches: the large number of packages in Suggests makes it very difficult to support older versions of R. I’m considering revising the package’s approach to managing dependencies so that broom can “support” a package’s output without including it in Suggests.

Setup

library(broom)
library(desc)
library(dplyr)

In the current structure of the package, broom Suggests a given “model-supplying” package (say, pkg), and then defines three methods for each model object supported from the package (say, mod): tidy.mod(), glance.mod(), and augment.mod(). Some documentation may refer to that Suggested package, and some of those three methods may dispatch to methods from the model-supplying package like summary(), predict(), etc. We also load pkg in tests.
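As a rough illustration of that structure, the sketch below defines the three methods for a hypothetical class `mod` in base R. The generics are defined locally so the example is self-contained (broom re-exports them from the generics package), and `new_mod()` and the returned columns are assumptions made for illustration, not broom's actual code.

```r
# Local stand-ins for the tidy(), glance(), and augment() generics.
tidy    <- function(x, ...) UseMethod("tidy")
glance  <- function(x, ...) UseMethod("glance")
augment <- function(x, ...) UseMethod("augment")

# Hypothetical constructor: wraps an lm fit in an object of class "mod".
new_mod <- function(formula, data) {
  fit <- stats::lm(formula, data = data)
  structure(list(fit = fit, data = data), class = "mod")
}

# One row per coefficient; relies on the summary() method for the fit.
tidy.mod <- function(x, ...) {
  coefs <- summary(x$fit)$coefficients
  data.frame(
    term = rownames(coefs),
    estimate = coefs[, "Estimate"],
    std.error = coefs[, "Std. Error"],
    row.names = NULL
  )
}

# One row per model.
glance.mod <- function(x, ...) {
  data.frame(
    r.squared = summary(x$fit)$r.squared,
    nobs = stats::nobs(x$fit)
  )
}

# One row per observation; relies on the predict() and residuals() methods.
augment.mod <- function(x, ...) {
  cbind(
    x$data,
    .fitted = stats::predict(x$fit),
    .resid = stats::residuals(x$fit)
  )
}

m <- new_mod(mpg ~ wt, data = mtcars)
tidy(m)
glance(m)
head(augment(m))
```

Here `summary()`, `predict()`, and `residuals()` dispatch to methods from stats. For a Suggested package, the same calls would dispatch to methods that only exist once that package's namespace is loaded.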

broom supports many packages in this way via Suggests:

desc_get_deps(system.file("DESCRIPTION", package = "broom")) %>% count(type)
      type  n
1  Depends  1
2  Imports 10
3 Suggests 78

Many of those Suggests supply more than one supported model object, and broom supports model objects from stats, too, so there are many more methods than suggested packages:

length(methods("tidy"))
[1] 133

When I started working on the package in 2020, we switched to an approach where we urged model-supplying packages to implement their own tidiers and newly exported some tools to make it easy to do so, so the number of Suggests hasn’t increased since then.

The number of Suggests actually falls over time when model-supplying packages are archived from CRAN or otherwise become unsupportable. Here’s a NEWS entry on the current dev version of the package giving some examples of how that might happen:

Soft-deprecated tidiers for margins, lsmeans, and emmeans. Each package has been removed from Suggests and is no longer tested—their tidiers will raise a deprecation warning but return the same results as before.

  • margins was archived from CRAN. In the case that the package is back on CRAN before the next package release, broom will once again Suggest and test support for the package (#1200).
  • lsmeans and emmeans have a dependency requiring R 4.3.0 or higher. To maintain compatibility with at least 4 previous minor versions of R, broom won’t test support for these packages until the release of R 4.7.x (or until lsmeans and emmeans are compatible with the R version 4 minor releases previous, #1193).

This approach of 1) removing the package from Suggests, 2) no longer testing it, and 3) deprecating its tidiers, possibly temporarily, is labor-intensive on my end. It may also lead to most tidiers being deprecated in the upcoming release due to dependency issues with the Matrix package (GH issue, Slack thread with Gabor).
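The deprecation step in that workflow can be sketched in base R as follows. This is a minimal sketch, not broom's implementation: `warn_deprecated_tidier()` and `tidy_margins_sketch()` are hypothetical names, and the `as.data.frame()` call stands in for whatever tidying logic already exists.

```r
# Hypothetical helper: warn that a tidier is deprecated and untested.
warn_deprecated_tidier <- function(pkg) {
  warning(
    "Tidiers for the ", pkg, " package are deprecated and no longer tested.",
    call. = FALSE
  )
}

# Hypothetical tidier: raises the warning, then returns the same result
# it would have returned before the deprecation.
tidy_margins_sketch <- function(x, ...) {
  warn_deprecated_tidier("margins")
  as.data.frame(x)  # placeholder for the existing tidying logic
}
```

Every tidier deprecated this way needs its own warning call, NEWS entry, and later reversal, which is where the maintenance cost accumulates.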

Roll-your-own dependency manager

I’m considering reworking broom’s approach to managing dependencies with the goals of:

  • Continuing to support older versions of R,

  • Continuing to test tidiers against CRAN model-supplying package versions,

  • No longer needing to (possibly temporarily) deprecate tidiers when the model-supplying package introduces some dependency issue with older versions of R, and

  • Reducing the maintenance burden of the package.

The new approach could look like:

  • Suggest no model-supplying packages.

  • In, e.g., tidy.mod(), use some helper that checks that pkg is installed (and ensures that its methods are available if so).

  • Remove all .Rd cross-refs to each package, e.g. transitioning #' [pkg::mod()] to #' `pkg::mod()`.

  • Use the extra-packages input of the setup-r-dependencies GitHub Action to install model-supplying packages for CI testing.
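The installed-package check in the second bullet could look something like the sketch below. `check_model_pkg()` is a hypothetical name, not an existing broom function, and `"pkg"` and `mod` are the placeholder package and class used throughout this note. The sketch relies on base R's `requireNamespace()`, which loads the package's namespace when it is installed and so registers the S3 methods that package defines.

```r
# Hypothetical helper: error informatively if `pkg` is not installed.
# When the package is installed, requireNamespace() loads its namespace,
# which makes its registered S3 methods available for dispatch.
check_model_pkg <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "The ", pkg, " package must be installed to tidy this object. ",
      "Install it with install.packages(\"", pkg, "\").",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Hypothetical tidier for objects of class "mod" supplied by package "pkg".
tidy.mod <- function(x, ...) {
  check_model_pkg("pkg")
  # With pkg's namespace loaded, summary() can dispatch to pkg's own method.
  summary(x)
}

# The helper passes silently for a package that is always available.
check_model_pkg("stats")
```

Because the check runs inside the method, broom would only touch pkg when a user actually passes in one of its model objects.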

Pros:

  • broom becomes much more dependency-light—users need only install the model-supplying package they use to use its tidiers.

Cons:

  • broom no longer gets reverse dependency checks. :/

  • The usual mental model of “how to install all needed packages” that users and developers have is disrupted.
