Some gems require external dependencies, such as gems containing C extensions
that compile against external C libraries (ex: pg) or gems that wrap around
external command-line utilities (ex: graphviz). Presently there is no
mechanism to automatically install these external dependencies. Instead, it is
the user's responsibility to figure out the correct package name and install the
external dependency via their system's package manager (ex: apt or brew)
and attempt installing the gem again. This often frustrates users: a C
extension fails to build, and they then have to search Google and
StackOverflow to find out which package needs to be installed.
Gem authors very rarely list external dependencies via the gemspec
requirements attribute, and when they do, they usually list the canonical
project name of the external dependency (ex: sqlite3) and not the package name
(ex: libsqlite3-dev).
To work around the problem of external dependencies, some popular gems (ex:
nokogiri) have begun vendoring their external dependencies directly into the
gem and building both the C library and any C extensions during installation.
There are two downsides to this approach:
- Increased compilation time.
- Security Advisories: each time a new security advisory is published for the vendored library, the gem maintainer has to update the vendored copy of the C library, publish an updated version of the gem, and then publish their own security advisory warning users that the previous versions of their gem contain a vulnerable copy of the vendored library.
It should be possible to list the external dependencies required by a gem and have RubyGems automatically install them from the system's package manager during the installation process.
This external_dependencies metadata could be embedded in the gemspec's
metadata attribute. In order to support naming differences between different
package managers, the package names for the external dependencies would need to
be listed for each popular package manager.
gemspec.metadata['external_dependencies'] = {
'apt' => %w[libsqlite3-dev],
'dnf' => %w[sqlite-devel],
'brew' => %w[sqlite],
# ...
}
If only one external dependency needs to be specified, then the values of the
external_dependencies Hash could also be single package name Strings which
would later be automatically coerced into an Array:
gemspec.metadata['external_dependencies'] = {
'apt' => 'libsqlite3-dev',
'dnf' => 'sqlite-devel',
'brew' => 'sqlite',
# ...
}
However, this might be a bit confusing?
If all of the package names are the same for each package manager,
then external_dependencies could be specified as an Array of Strings:
gemspec.metadata['external_dependencies'] = %w[nmap]
Multiple package managers may be installed on the same system. In order to determine the primary system package manager, the system's package manager can be selected based on the OS/distro/flavor.
macOS is an edge case since it does not have a default package manager. So there should be a prioritized list of macOS package managers to check for, in order of popularity:
- brew
- ports
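The selection logic described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the distro-to-package-manager table is a small hypothetical sample, the method names are invented, and Linux detection is assumed to go through the ID and ID_LIKE fields of /etc/os-release. Note that the MacPorts executable is named port, so the sketch maps the "ports" entry to that command.

```ruby
require 'rbconfig'

# Hypothetical sample mapping from /etc/os-release IDs to package managers.
DISTRO_PACKAGE_MANAGERS = {
  'debian' => 'apt', 'ubuntu' => 'apt',
  'fedora' => 'dnf', 'rhel'   => 'dnf'
}.freeze

# macOS candidates in priority order: package manager name => executable.
MACOS_PACKAGE_MANAGERS = { 'brew' => 'brew', 'ports' => 'port' }.freeze

# True when an executable with the given name exists in one of the PATH dirs.
def executable_in_path?(name, path: ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Parses KEY=value lines (optionally double-quoted) into a Hash.
def parse_os_release(text)
  text.each_line.with_object({}) do |line, fields|
    key, value = line.strip.split('=', 2)
    next if key.nil? || key.empty? || key.start_with?('#') || value.nil?
    fields[key] = value.delete_prefix('"').delete_suffix('"')
  end
end

# Returns the package manager name for this system, or nil if none is known.
def select_package_manager(host_os:, os_release: {},
                           available: ->(exe) { executable_in_path?(exe) })
  if host_os =~ /darwin/
    # First macOS package manager whose executable is actually installed.
    name, _exe = MACOS_PACKAGE_MANAGERS.find { |_name, exe| available.call(exe) }
    name
  else
    # Try the distro's own ID first, then the distros it declares itself like.
    ids = [os_release['ID'], *os_release.fetch('ID_LIKE', '').split]
    ids.compact.map { |id| DISTRO_PACKAGE_MANAGERS[id] }.compact.first
  end
end

# Example: a derivative distro that declares itself "like" ubuntu and debian.
mint = parse_os_release("ID=linuxmint\nID_LIKE=\"ubuntu debian\"\n")
manager = select_package_manager(host_os: 'linux-gnu', os_release: mint)
```

On a real system, host_os would come from RbConfig::CONFIG['host_os'] and the os_release Hash from reading /etc/os-release when that file exists.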
If gems can specify package names that will be installed via an
apt-get install or brew install command, special care must be taken to
prevent arbitrary command injection. system() with multiple arguments
(ex: system('apt-get', 'install', ...)) or Shellwords.shellescape must be
used to prevent command injection.
In order to prevent option injection via the gemspec's package names, an
argument of '--' or '-' can be specified before the package names to
terminate option parsing early, so that the package names are not
accidentally parsed as options.
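Both precautions can be combined when building the install command. The sketch below is illustrative only: the helper name, the command prefixes, and the conservative package-name pattern are assumptions, and whether every package manager honors '--' would need to be verified per tool. Rejecting names that start with '-' gives a second line of defense in case one does not.

```ruby
# Conservative allow-list for package names (an assumption; real package
# managers accept slightly different character sets).
SAFE_PACKAGE_NAME = /\A[A-Za-z0-9][A-Za-z0-9+._@-]*\z/

# Hypothetical argv prefixes for each supported package manager.
INSTALL_COMMANDS = {
  'apt'  => %w[apt-get install -y],
  'dnf'  => %w[dnf install -y],
  'brew' => %w[brew install]
}.freeze

# Builds the argv Array for installing the given packages.
def build_install_command(manager, packages)
  prefix = INSTALL_COMMANDS.fetch(manager) do
    raise ArgumentError, "unknown package manager: #{manager}"
  end

  packages.each do |name|
    # Rejects shell metacharacters, whitespace, and names beginning with '-'.
    unless SAFE_PACKAGE_NAME.match?(name)
      raise ArgumentError, "suspicious package name: #{name.inspect}"
    end
  end

  # '--' ends option parsing so no package name can be read as an option.
  [*prefix, '--', *packages]
end

command = build_install_command('apt', %w[libsqlite3-dev])

# Passing each argument separately to Kernel#system bypasses the shell, so
# nothing in a package name is ever interpreted as shell syntax:
#
#   system(*command)
```

The command is deliberately returned as an Array rather than a String, so that callers are steered towards the multi-argument form of system().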
Some users may wish to customize which package manager is used on their system, if they have multiple package managers installed alongside each other. It may be necessary to add a configuration option or environment variable to control which package manager is used by default. That said, requests for this feature seem to be very rare.
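An environment variable override could be as small as the following sketch. The variable name GEM_EXTERNAL_PACKAGE_MANAGER is purely hypothetical (it is not an existing RubyGems setting), as are the method name and the list of supported values.

```ruby
# Hypothetical environment variable name; not an existing RubyGems setting.
OVERRIDE_ENV_VAR = 'GEM_EXTERNAL_PACKAGE_MANAGER'

# Returns the user's override when one is set, otherwise the detected default.
def preferred_package_manager(detected, env: ENV, supported: %w[apt dnf brew ports])
  override = env[OVERRIDE_ENV_VAR]
  return detected if override.nil? || override.strip.empty?

  override = override.strip
  unless supported.include?(override)
    raise ArgumentError, "#{OVERRIDE_ENV_VAR}=#{override} is not a supported package manager"
  end
  override
end

# A plain Hash stands in for ENV here to keep the example deterministic.
choice = preferred_package_manager('brew', env: { OVERRIDE_ENV_VAR => 'ports' })
```

Validating the override against a known list keeps a typo from silently falling through to an unrelated command.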
If we allow annotating the external dependencies and automatically
installing them along with the gem, the gemspec's requirements attribute
might no longer be necessary and could be deprecated in the future?
I've been thinking about this quite a bit over the last few days, and I'm less excited by a declarative gemspec approach now having played out a few scenarios in my head.
Circling back on detection
Specifically I want to point out that I didn't communicate clearly in my first comment about "detecting already-installed packages". What I meant by that is manually installed libraries, not previously-installed-by-package-manager packages.
For example, imagine a user who is insanely concerned with performance and so has compiled their own libxml2 with some compiler options like -march and -O3; and has set the env var PKG_CONFIG_PATH so that the pkg-config file libxml-2.0.pc can be found.
Or consider a Mac user who's already got a version installed through macports, and won't want to install another via homebrew.
This is part of the reason I suggested the syntax in my previous comment -- wrapping a set of declarations with a pkg-config name. Does that suggestion make more sense now?
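To make the idea concrete, here is a rough sketch of "ask pkg-config first, and only fall back to the package manager when it finds nothing". The method names are hypothetical and this is not how RubyGems or any existing gem does it. It assumes the pkg-config executable is on the PATH, and relies on the fact that pkg-config itself reads PKG_CONFIG_PATH from the environment.

```ruby
# True when pkg-config can already resolve the named module, for example a
# hand-compiled library exposed through PKG_CONFIG_PATH.
def pkg_config_found?(pc_name)
  # Kernel#system returns nil when the executable is missing; treat that
  # the same as "not found".
  system('pkg-config', '--exists', pc_name) == true
end

# Returns the packages that still need to be installed: none when the
# library is already detectable, otherwise the declared package names.
# The detector is injectable so the decision can be exercised without
# pkg-config being installed.
def packages_to_install(pc_name, packages, detector: method(:pkg_config_found?))
  detector.call(pc_name) ? [] : packages
end

# Simulates a system where libxml-2.0.pc is already discoverable.
needed = packages_to_install('libxml-2.0', %w[libxml2-dev], detector: ->(_name) { true })
```

In the simulated case above nothing is installed, which is the behaviour the hand-compiled libxml2 user and the macports user would both expect.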
I think any solution needs to be able to handle situations like this -- it needs to use pkg-config and not blindly rely on the package managers as the first and only option. Do you agree?
Scenarios
I like to go through concrete scenarios as a thought experiment for new APIs.
Scenario 1: runtime-only dependency: ffi-libarchive
The proposed solution seems perfect for a gem that has no Gem::Specification#extensions defined but has a runtime dependency on a system library.
For example, something like ffi-libarchive which uses FFI to bind to libarchive.
and then the gem at runtime calls (file):
Scenario 2: Simple C extension: rcairo
I can imagine a solution like this being valid for straightforward integrations that are insensitive to version and don't have many install-time options.
For example, something like rcairo which currently uses native-package-installer (extconf.rb):
(The code above is taken from the extconf.rb, and does both a detection and an installation phase for the libcairo2 library.)
Under the proposed solution, the dependencies would be declared as follows in the gemspec:
In the extconf.rb the "native package installation" code can be deleted, however the pkg-config/detection bit must remain so the extension knows the compiler and linker flags (see https://github.com/rcairo/rcairo/blob/master/ext/cairo/extconf.rb#L42), which means it's not a big advantage to use the declarative gemspec syntax. And there's some risk that the detection done in the extconf doesn't match the detection done by RubyGems -- I worry that we'd be duplicating that logic.
I'm not sure I like the separation of the dependency declaration from the code where those dependencies are used. Previously the extconf.rb was the canonical place to look, but we'd be placing the names into the gemspec, and the logic around detection is in two places (RubyGems and extconf.rb).
This scenario would be supported, but it doesn't feel like a clear win to me.
Scenario 3.a: C extension with "use whichever" optionality
Let's look at the mysql2 gem which is pretty straightforward except for the optionality around the client library (supports mariadb and mysqlclient).
I'm not sure how to express this optionality with the proposed solution. The extconf.rb somewhat relies on a system only having one or the other installed -- so if it finds either, it uses it. How do we express this detection logic in the gemspec? Which one should be installed if no match is found? Are you suggesting just listing both?
My earlier suggestion of listing multiple pkg-config names might help, but I still worry about complex detection being done in both RubyGems and in the extconf.rb.
You mentioned generic package names, but to me this means we're taking something that can be expressed easily in Ruby in an extconf.rb and moving that responsibility into the distro package manager (assuming it supports the functionality and a suitable virtual package) which feels less obvious and more brittle.
Scenario 3.b: C extension with "choose one at install time" optionality
A gem with a different flavor of optionality is sqlite3-ruby which can use either libsqlite or libsqlcipher; and the difference here from mysql2 is that both libsqlite and libsqlcipher can be installed on the system at the same time.
The extension defaults to look for (and use) libsqlite, but with a runtime option --with-sqlcipher, the gem installation will instead use libsqlcipher.
I don't think we're able to implement an extension like this with a declarative gemspec syntax. Only at install-time is it possible to know which of the two libraries the user wants to use.
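The reason a static declaration falls short here can be seen in a tiny sketch of the decision itself. This is a hypothetical stand-in, not the gem's actual extconf.rb: the method name and the returned library names are illustrative, and a real extconf.rb would more likely use mkmf's with_config helper than inspect the arguments by hand.

```ruby
# Hypothetical stand-in for the choice an extconf.rb makes at install time.
# The returned names are illustrative only.
def library_for(install_args)
  install_args.include?('--with-sqlcipher') ? 'sqlcipher' : 'sqlite3'
end

# The same gem resolves to different external dependencies depending on
# arguments that only exist once the user runs the install command.
default_choice = library_for([])
cipher_choice  = library_for(['--with-sqlcipher'])
```

Because the input is the install command's argument list, a gemspec written ahead of time has no way to know which of the two results applies.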
Maybe there's a cheaper alternative
In short, it feels like declaring dependencies in the gemspec isn't the silver bullet I originally imagined it would be. It also feels complex enough that getting it right for the use cases where it's a good solution might take some iteration, and RubyGems feels like a heavyweight solution that will be challenging to iterate within (but maybe I'm wrong). Finally, the detection phase still needs to be duplicated in both RubyGems and in the extconf.rb.
Instead of a declarative syntax in the gemspec, though, what if we provided a great set of tools for C extension authors to use in their extconf.rb scripts? The combination of native-package-installer and the pkg-config gem feels like a good start, though I'd want to add:
Why not use these today?
I think we could! -- in fact as part of a recent overhaul of the sqlite3-gem I had an experimental branch that used native-package-installer and it mostly worked; but we instead decided to go with a precompiled library.
One obstacle to using native-package-installer and pkg-config (the gem) is that they're additional gem dependencies which aren't actually needed at runtime (only at install-time). Further, in the past people have objected to introducing gem dependencies which are LGPL-licensed -- specifically the pkg-config gem and native-package-installer are both released under LGPL.
(To work around that objection, we might propose a change to RubyGems that introduces a new "install-time dependency" in addition to the current "runtime dependency" and "development dependency", which would allow folks to delete install-time dependencies once the gem is installed, or to ignore the licenses of install-time dependencies. Would love to hear @kou's thoughts about that.)
We can start experimenting with this approach today, without having to first get approval from RubyGems maintainers.
If this approach works well, we could later propose moving the code into Ruby itself as companions to MakeMakefile for C extensions.