In preparing the most recent release of the broom package, I’ve run into
some headaches related to the large number of Suggests that make it
very difficult to support older versions of R. I’m considering revising
the package’s approach to managing dependencies such that broom can
“support” a package’s output without including it in Suggests.
library(broom)
library(desc)
library(dplyr)
In the current structure of the package, broom Suggests a given
“model-supplying” package (say, pkg), and then defines three methods for
each model object supported from the package (say, mod): tidy.mod(),
glance.mod(), and augment.mod(). Some documentation may refer to
that Suggested package, and some of those three methods may dispatch
to methods from the model-supplying package like summary(),
predict(), etc. We also load pkg in tests.
broom supports many packages in this way via Suggests:
desc_get_deps(system.file("DESCRIPTION", package = "broom")) %>% count(type)
type n
1 Depends 1
2 Imports 10
3 Suggests 78
Many of those Suggests supply more than one supported model object,
and broom supports model objects from stats, too, so there are many
more methods than suggested packages:
length(methods("tidy"))
[1] 133
When I started working on the package in 2020, we switched to an
approach where we urged model-supplying packages to implement their own
tidiers and newly exported some tools to make it easy to do so, so the
number of Suggests hasn’t increased since then.
The number of Suggests actually falls over time when model-supplying
packages are archived from CRAN or otherwise become unsupportable.
Here’s a NEWS entry on the current dev version of the package giving
some examples of how that might happen:
Soft-deprecated tidiers for margins, lsmeans, and emmeans. Each package has been removed from Suggests and is no longer tested—their tidiers will raise a deprecation warning but return the same results as before.
- margins was archived from CRAN. In the case that the package is back on CRAN before the next package release, broom will once again Suggest and test support for the package (#1200).
- lsmeans and emmeans have a dependency requiring R 4.3.0 or higher. To maintain compatibility with at least 4 previous minor versions of R, broom won’t test support for these packages until the release of R 4.7.x (or until lsmeans and emmeans are compatible with the R version 4 minor releases previous, #1193).
This approach of 1) removing a package from Suggests, 2) no longer testing
it, and 3) deprecating its tidiers, possibly temporarily, is labor-intensive
on my end, and also seems like it may lead to most tidiers being deprecated
in the upcoming release due to dependency issues with the Matrix package
(GH issue, Slack thread with Gabor).
I’m considering reworking broom’s approach to managing dependencies with the goals of:
- Continuing to support older versions of R,
- Continuing to test tidiers against CRAN model-supplying package versions,
- No longer needing to (possibly temporarily) deprecate tidiers when the model-supplying package introduces some dependency issue with older versions of R, and
- Reducing the maintenance burden of the package.
The new approach could look like:
- Suggest no model-supplying packages.
- In, e.g., tidy.mod(), use some helper that checks that pkg is installed (and ensures that its methods are available if so).
- Remove all .Rd cross-refs to each package, e.g. transitioning #' [pkg::mod()] to #' `pkg::mod()`.
- Use setup-r-dependencies’ extra-packages to install model-supplying packages to use in CI testing.
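As a rough illustration of the helper mentioned in the second item, here is a minimal sketch in base R. The name check_model_pkg() is hypothetical and is not an existing broom function; the tidy() generic is redefined only so the example runs without broom attached, and stats stands in for the model-supplying pkg.

```r
# Hypothetical helper (not part of broom): fail early and informatively
# when the model-supplying package is not installed.
check_model_pkg <- function(pkg) {
  # requireNamespace() returns FALSE rather than erroring when pkg is missing.
  # When it succeeds, it loads pkg's namespace, which registers the S3
  # methods that pkg declares (e.g. for summary() or predict()).
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' is required to tidy this object. ",
      "Install it with install.packages(\"", pkg, "\").",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Stand-in generic so the sketch runs on its own.
tidy <- function(x, ...) UseMethod("tidy")

# Hypothetical tidier for class "mod"; "stats" stands in for pkg here.
tidy.mod <- function(x, ...) {
  check_model_pkg("stats")
  data.frame(
    term = names(x$coefficients),
    estimate = unname(x$coefficients)
  )
}

# A toy object of class "mod" to exercise the tidier.
fit <- structure(list(coefficients = c(a = 1, b = 2)), class = "mod")
tidy(fit)
```

With no model-supplying packages in Suggests, tests for each tidier would presumably also need a guard such as testthat::skip_if_not_installed("pkg") so that they are skipped wherever pkg is unavailable.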
Pros:
- broom becomes much more dependency-light—users need only install the model-supplying package they use to use its tidiers.
Cons:
- broom no longer gets reverse dependency checks. :/
- The usual mental model of “how to install all needed packages” that users and developers have is disrupted.