Some gems require external dependencies, such as gems containing C extensions
that compile against external C libraries (ex: pg) or gems that wrap around
external command-line utilities (ex: graphviz). Presently there is no
mechanism to automatically install these external dependencies. Instead, it is
the user's responsibility to figure out the correct package name and install the
external dependency via their system's package manager (ex: apt or brew)
and attempt installing the gem again. Users often become frustrated when a C
extension fails to build and they then have to search Google and Stack Overflow
to find out which package needs to be installed.
Gem authors very rarely list external dependencies via the gemspec
requirements attribute, and when they do, they usually list the canonical
project name of the external dependency (ex: sqlite3) and not the package name
(ex: libsqlite3-dev).
To work around the problem of external dependencies, some popular gems (ex:
nokogiri) have begun vendoring their external dependencies directly into the
gem and building both the C library and any C extensions during installation.
There are two downsides to this approach:
- Increased compilation time.
- Security Advisories: each time a new security advisory is published for the vendored library, the gem maintainer has to update the vendored copy of the C library, publish an updated version of the gem, and then publish their own security advisory warning users that previous versions of their gem contain a vulnerable copy of the vendored library.
It should be possible to list the external dependencies required by a gem and have RubyGems automatically install them from the system's package manager during the installation process.
This external_dependencies metadata could be embedded in the gemspec's
metadata attribute. In order to support naming differences between different
package managers, the package names for the external dependencies would need to
be listed for each popular package manager.
gemspec.metadata['external_dependencies'] = {
  'apt'  => %w[libsqlite3-dev],
  'dnf'  => %w[sqlite-devel],
  'brew' => %w[sqlite],
  # ...
}
If only one external dependency needs to be specified, then the values of the
external_dependencies Hash could also be single package name Strings which
would later be automatically coerced into an Array:
gemspec.metadata['external_dependencies'] = {
  'apt'  => 'libsqlite3-dev',
  'dnf'  => 'sqlite-devel',
  'brew' => 'sqlite',
  # ...
}
However, this might be a bit confusing?
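The String-to-Array coercion described above could look roughly like the following. This is only a sketch: `normalize_external_dependencies` is a hypothetical helper name, not an existing RubyGems method.

```ruby
# Hypothetical helper (not part of RubyGems): normalizes the values of an
# external_dependencies Hash so that every package manager name maps to an
# Array of package name Strings.
def normalize_external_dependencies(deps)
  deps.each_with_object({}) do |(manager, packages), normalized|
    # Array() wraps a single String and leaves an existing Array untouched.
    normalized[manager.to_s] = Array(packages).map(&:to_s)
  end
end

normalized = normalize_external_dependencies(
  'apt'  => 'libsqlite3-dev',
  'dnf'  => %w[sqlite-devel],
  'brew' => 'sqlite'
)
```

With this, code that consumes the metadata only ever has to deal with Arrays, regardless of which form the gem author used.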
If all of the package names are the same for each package manager,
then external_dependencies could be specified as an Array of Strings:
gemspec.metadata['external_dependencies'] = %w[nmap]
Multiple package managers may be installed on the same system. To determine the primary system package manager, RubyGems can select it based on the OS/distro/flavor.
macOS is a unique edge-case since it does not have a default package manager. So there should be a prioritized list of macOS package managers to check for in order of popularity:
- brew
- ports
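One possible shape for this detection logic is sketched below. All names here are hypothetical, the distro-to-package-manager table is an assumption that covers only a few distros, and the sketch looks for the `port` executable because that is the name of MacPorts' command-line tool.

```ruby
require 'rbconfig'

# Assumed priority list for macOS, most popular first.
MACOS_PACKAGE_MANAGERS = %w[brew port].freeze

# Assumed mapping from the ID field of /etc/os-release to a package manager.
LINUX_PACKAGE_MANAGERS = {
  'debian' => 'apt', 'ubuntu' => 'apt',
  'fedora' => 'dnf', 'rhel'   => 'dnf',
  'arch'   => 'pacman'
}.freeze

# Returns true if an executable with the given name is found in PATH.
def command_available?(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    file = File.join(dir, name)
    File.file?(file) && File.executable?(file)
  end
end

# Extracts the ID field from the contents of /etc/os-release, or nil.
def os_release_id(contents)
  line = contents.each_line.find { |l| l.start_with?('ID=') }
  line && line.split('=', 2).last.strip.delete('"')
end

# Picks a package manager from the OS; returns nil when none is known.
def detect_package_manager(host_os: RbConfig::CONFIG['host_os'],
                           os_release: nil,
                           available: method(:command_available?))
  case host_os
  when /darwin/
    # macOS has no default package manager, so check the priority list.
    MACOS_PACKAGE_MANAGERS.find { |name| available.call(name) }
  when /linux/
    os_release ||= File.exist?('/etc/os-release') ? File.read('/etc/os-release') : ''
    LINUX_PACKAGE_MANAGERS[os_release_id(os_release)]
  end
end
```

The keyword arguments exist mainly so the logic can be exercised without touching the real system, for example `detect_package_manager(host_os: 'linux', os_release: "ID=fedora\n")`.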
If gems can specify package names that will be installed via an
apt-get install or brew install command, special care must be taken to
prevent arbitrary command injection. system() with multiple arguments
(ex: system('apt-get', 'install', ...)) or Shellwords.shellescape must be
used to prevent command injection.
In order to prevent option injection via the gemspec's package names, an
argument of '--' or '-' can be specified before the package names to
terminate option parsing early, so that the package names are not
accidentally parsed as options.
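The two precautions above can be combined when building the install command. The following is a minimal sketch rather than a definitive implementation: `INSTALL_COMMANDS`, `PACKAGE_NAME`, and `install_command` are hypothetical names, the `-y` flag and the placement of `--` are assumptions about how each tool would be invoked, and the allow-list pattern is deliberately conservative (it would reject, for example, tap-qualified brew formula names containing `/`).

```ruby
require 'shellwords'

# Hypothetical install commands per package manager. The trailing '--' is
# meant to end option parsing before any package name is seen.
INSTALL_COMMANDS = {
  'apt'  => %w[apt-get install -y --],
  'dnf'  => %w[dnf install -y --],
  'brew' => %w[brew install --]
}.freeze

# Conservative allow-list for package names (an assumption of this sketch).
# Names must start with a letter or digit, so they cannot look like options.
PACKAGE_NAME = /\A[A-Za-z0-9][A-Za-z0-9+._@-]*\z/

# Builds the argument vector for installing the given packages.
def install_command(manager, packages)
  base = INSTALL_COMMANDS.fetch(manager) do
    raise ArgumentError, "unsupported package manager: #{manager.inspect}"
  end
  packages.each do |name|
    unless PACKAGE_NAME.match?(name)
      raise ArgumentError, "invalid package name: #{name.inspect}"
    end
  end
  base + packages
end

argv = install_command('apt', %w[libsqlite3-dev])

# Passing multiple arguments to system() bypasses the shell entirely, so
# shell metacharacters in package names are never interpreted:
#   system(*argv)

# If a single command String is unavoidable, escape every argument first.
command_line = Shellwords.join(argv)
```

The `system(*argv)` call is left commented out so that the sketch can be run without installing anything. A real implementation would also have to deal with privilege escalation, which is out of scope here.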
Some users may wish to customize which package manager is used on their system if they have multiple package managers installed alongside each other. It may be necessary to add a configuration option or environment variable to control which package manager is used by default, although this feature request seems to be very rare.
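As an illustration of the environment variable approach, the override could be consulted before falling back to the detected package manager. The variable name `GEM_PACKAGE_MANAGER` and the list of supported names are made up for this sketch; neither exists in RubyGems today.

```ruby
# Hypothetical environment variable name; not an existing RubyGems setting.
OVERRIDE_ENV_VAR = 'GEM_PACKAGE_MANAGER'

# Assumed set of package managers that an override is allowed to select.
SUPPORTED_PACKAGE_MANAGERS = %w[apt dnf brew port pacman].freeze

# Returns the user's override if one is set, otherwise the detected default.
def preferred_package_manager(detected, env = ENV)
  override = env[OVERRIDE_ENV_VAR]
  return detected if override.nil? || override.empty?

  unless SUPPORTED_PACKAGE_MANAGERS.include?(override)
    raise ArgumentError, "unsupported package manager: #{override.inspect}"
  end
  override
end
```

Validating the override against a fixed list keeps the environment variable from being used to run an arbitrary command.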
If we allow annotating the external dependencies and automatically
installing them along with the gem, the gemspec's requirements attribute
might no longer be necessary and could be deprecated in the future?
@flavorjones I should have probably put this into a Google Doc to allow for better commenting...
Naming
OK I can see how external_dependency might be confusing, considering that other gem dependencies are technically "external". However, "external" here is trying to imply that the dependency is outside of the RubyGems ecosystem. Some other possible names:
- system_dependencies
- package_dependencies
- packages
- system_packages

metadata
Good catch. A quick workaround would be to just add the package lists as space-separated Strings to metadata.
Alternatively, we could call the metadata keys apt_pkgs or something similar?
I would argue against extending gemspec.requirements as it's already an arbitrary String, so there's no real way to validate it, which also makes parsing and extracting the package metadata from it error-prone. Also, embedding JSON into a Ruby String, which then later gets converted into YAML, seems kind of hacky.
Ideally, we should be able to just add new attributes to Gem::Specification, but that wasn't thought of when the .gem package format was originally designed. I think allowing nested Hashes within metadata is the second best thing, assuming RubyGems can figure out how to recursively validate metadata and catch infinitely self-referential Hashes.

Determining Already Installed Packages
Good news. Most package managers already ignore previously installed packages. The exceptions are brew and pacman (Arch). Luckily, ruby-install already includes some code to filter out previously installed packages when using brew or pacman.

Opt-In
This is a good idea, especially considering it's a new feature, and we probably will need to beta-test it before turning it on for everyone. Although, it should be a one-time opt-in, possibly done via ~/.gemrc or some command that adds a flag file to ~/.gem/.

Multiple Optional Dependencies
Eh, this gets complex. I believe both Debian (deb) and Red Hat (rpm) packages have generic meta package names which will resolve to whichever package is installed? This allows other packages to depend on either mariadb or mysqlclient, but that forces the user to decide which one to install first. I would err on the side of caution and only support hard dependencies first, that way no user interaction is required during installation.

Toolchains
Gems could define explicit package dependencies on gcc or clang, or simply use the generic package group of build-essential (apt) or C Development Tools and Libraries (dnf). No need to re-invent package meta-groups as package managers already provide us with them.