Michael Snoyman's personal take on PVP version upper bounds

In response to a request on Reddit, I'm writing up my thoughts on PVP upper bounds. I'm putting this in a Gist since I don't want to start yet another debate on the matter, just provide some information to someone who asked for it. Please don't turn this into a flame war :)

For those unaware: the Package Versioning Policy is a set of recommendations covering how to give version numbers to your packages, and how to set upper and lower bounds on your dependencies.

I'll start by saying: I completely agree with the PVP's recommendations on how to assign version numbers. While there are plenty of points in this design space, the PVP is an unobjectionable one, and consistency in the community is good. On multiple occasions, I have reached out to package authors to encourage compliance with this. (However, I've always done so privately, as opposed to making a statement on Reddit, as I believe that is more likely to be convincing.)

The issue around upper and lower bounds is where contention lies. Again, where I agree: if you know for certain that your package will not work with transformers 0.4 and earlier, you absolutely should put a transformers >= 0.5 in your .cabal file. Similarly, if you know it won't work with transformers 0.5, you should put a transformers < 0.5 in your .cabal file.
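To make the two known cases concrete, here is a minimal Python sketch of how a lower bound (`>= 0.5`) and an upper bound (`< 0.5`) include or exclude versions. The helpers `parse_version` and `satisfies` are hypothetical names introduced for illustration. This is not how cabal is implemented; it only models versions as tuples of integers compared component by component.

```python
import operator

# Hypothetical helpers for illustration; not part of cabal.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
}

def parse_version(text):
    """Turn '0.5.2' into (0, 5, 2) so versions compare component by component."""
    return tuple(int(part) for part in text.split("."))

def satisfies(version, constraint):
    """Check a version against a single constraint such as '>= 0.5'."""
    op, bound = constraint.split()
    return OPS[op](parse_version(version), parse_version(bound))

# Known not to work with transformers 0.4 and earlier: require >= 0.5.
print(satisfies("0.4.3", ">= 0.5"))  # False
print(satisfies("0.5.2", ">= 0.5"))  # True

# Known not to work with transformers 0.5: require < 0.5.
print(satisfies("0.5", "< 0.5"))    # False
print(satisfies("0.4.3", "< 0.5"))  # True
```

Both bounds here encode something the author has actually observed, which is the uncontroversial case.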

The issue comes down to the unknowns: you haven't tested with older package versions, and you can't test with unreleased package versions. The argument in favor of preemptively putting in bounds is to prevent the cabal dependency solver from choosing a build plan which may fail. I won't elaborate on the topic, since the PVP itself discusses this, and many people who are ardent version bounds supporters will likely say more than I ever could.

Here are my thoughts on the matter:

  • There are much better ways to solve this problem than version ranges. I wrote a blog post about my proposal. Tracking information on which versions of dependencies a package has successfully built with is far more reliable, trivial to automate, and introduces no ambiguity. I honestly have no idea why there would be opposition to making this change.
  • Historically, extra version bounds can cause problems with cabal's dependency solver. This was a bigger issue in the past, and was the original reason I stopped following PVP bounds. Many people dispute this claim, but it's an easily verifiable fact: if you look through the Yesod issue tracker, you'll see countless examples of people reporting that cabal gave up on calculating a build plan, when a valid build plan was available.
    • Since this point seems to cause confusion, I'll elaborate a bit more. There are two reasons why a dependency-solved build will fail: it will either fail to find a build plan, or find an incorrect build plan. Putting in version bounds will prevent the second case. But the first case is in fact exacerbated by version bounds.
  • Using version bounds does in fact avoid many common failure cases, but not all. Changes in the Cabal library for custom build scripts and discontinuities in version ranges (e.g., the foo function was added in version 1.2.3 and backported to 1.1.4) can make it highly unlikely that authors will get things right.
  • Since version bounds are manually stated, there's a high degree of error possible with creating them.
  • They take a lot of work to maintain correctly, including a lot of busy-work of manually relaxing them. In my experience as a Stackage curator, the vast majority of restrictive upper bounds do not actually prevent a failing build.
  • Curation is a better solution for the majority of use cases. True, people looking to use dependency solving can't use that, but like I said, there are better solutions for that use case.
  • It leads to brittle code. I've received reports of production build systems being taken down by lack of correct version bounds on a package. This is an inherent flaw in the build system! Making your production software dependent upon an upstream contributor (1) following a specific policy, and (2) not making a mistake when following that policy, is simply a recipe for disaster. For production build systems, reproducible build plans (via Stackage snapshots or freezing bounds) are the only reliable method I'm aware of.
  • Trying to solve a technical problem through social means leads to unhealthy behavior. The fact that there have been flame wars and personal attacks about PVP bounds for years in the Haskell community is terrible. The fact that we consider this behavior acceptable is disgusting. And it all comes about from an attitude where it's OK to blame someone else for breaking your build. As I state in the previous point: if your work is blocked by someone else's behavior, your workflow is broken.
  • One caveat: I do not believe version bounds serve any purpose other than documentation when used in a closed-source application. If you control the build system and will be delivering a binary or shipping a service, just freeze your dependencies or use a Stackage snapshot; playing with cabal version bounds is just a waste of time. I think Greg Weber wrote a good post about this once, but I can't find it right now.
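The two failure modes of a dependency-solved build mentioned above (no build plan found versus an incorrect build plan found) can be sketched with a toy solver. Everything here is a made-up assumption for illustration: the package names `lib_a` and `lib_b`, the version list, and the `solve` function, which bears no resemblance to cabal's real solver.

```python
# Toy model: every name and version below is made up for illustration.
AVAILABLE = ["1.0", "1.1", "1.2"]  # released versions of a shared dependency

def parse_version(text):
    """Turn '1.2' into (1, 2) so versions compare component by component."""
    return tuple(int(part) for part in text.split("."))

def allowed(version, lower=None, upper=None):
    """True if lower <= version < upper, treating a missing bound as open."""
    v = parse_version(version)
    if lower is not None and v < parse_version(lower):
        return False
    if upper is not None and v >= parse_version(upper):
        return False
    return True

def solve(bounds_per_package):
    """Return the newest shared-dependency version every package accepts, or None."""
    candidates = [
        v for v in AVAILABLE
        if all(allowed(v, **bounds) for bounds in bounds_per_package.values())
    ]
    return max(candidates, key=parse_version) if candidates else None

# lib_a guessed an upper bound before 1.2 existed; lib_b needs something from 1.2.
with_guess = {"lib_a": {"upper": "1.2"}, "lib_b": {"lower": "1.2"}}
print(solve(with_guess))  # None: no build plan is found

# Without the guessed upper bound a plan is found. Whether lib_a really
# builds against 1.2 is only known once someone has tried it.
without_guess = {"lib_a": {}, "lib_b": {"lower": "1.2"}}
print(solve(without_guess))  # 1.2
```

If `lib_a` does in fact work with 1.2, the first call is the "gave up when a valid plan existed" case; if it does not, the second call is the "incorrect build plan" case.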

There are probably other points I've raised or thought of over the years, but I think this covers the main ones. Again: I'm just sharing my thoughts, and I'm actively avoiding getting into more flame wars. I will repeat my request that led to this though: let's please stop as a community condoning the unhealthy behavior of rehashing arguments in every possible thread.

johnmeacham commented May 30, 2016

I completely agree. Cabal is the only packaging program that literally requires precognitive ability to work properly and is utterly frustrating to deal with. Human-curated version numbers are opaque and a waste of developers' time. The work scales with the square of potential estimated conflicts rather than the actual conflicts.

Remember when there were alternate implementations of Haskell, namely jhc? Cabal's PVP killed the ability to create alternate Haskell compilers, they killed Haskell as a language independent of GHC, and they conflate mechanism with policy to a huge degree. By tying things to opaque version numbers rather than feature tests, there is no way to tell if an alternate compiler or implementation will work; if the requirements were feature based, it would be trivial. A developer knows whether they used higher order polymorphism. They don't know which version of ghc introduced it, which version of jhc did, or which version of a future compiler that hasn't been written yet does, and they certainly don't know which future version will drop it.

When someone specifies ghc > 4.8 or whatever, I don't know which feature that only existed past 4.8 they need. It could be something jhc supports, or not, but a version number is useless. Saying '2nd order polymorphism' or 'existential types' is something I can actually use, and something developers can actually specify without knowing the whole history of ghc to know when they were implemented. Or they can write a test case to test for the feature if it does not have a name yet.
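As a rough illustration of feature-based requirements, here is a small Python sketch in which a package lists the features it needs and each compiler lists the features it provides. The compiler names `compiler_a` and `compiler_b`, the feature sets, and the function `missing_features` are all hypothetical; they are not real support data for ghc or jhc.

```python
# Hypothetical feature tables; the compilers and their feature sets are
# illustrative assumptions, not real support data for any implementation.
COMPILER_FEATURES = {
    "compiler_a": {"existential types", "rank-2 polymorphism", "type families"},
    "compiler_b": {"existential types", "rank-2 polymorphism"},
}

def missing_features(required, compiler):
    """List the required features that the given compiler does not provide."""
    return sorted(set(required) - COMPILER_FEATURES[compiler])

# The package author states what they used, not which release introduced it.
package_needs = {"existential types", "type families"}

print(missing_features(package_needs, "compiler_a"))  # []
print(missing_features(package_needs, "compiler_b"))  # ['type families']
```

An empty list means the package can be attempted on that compiler, and a non-empty one names exactly what is lacking, which a bare version number cannot do.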

Oddly enough, the way jhc builds Cabal packages is to ignore the version numbers because they are nonsense. It's the way it always did. Whoever wrote the package never tested the lower bounds and just guessed at the upper bound. Jhc can mix-n-match preludes to a degree ghc cannot, so it can usually find a way to compile anyway.

Sorry, I'm fairly bitter towards Cabal; its decisions have been nonsensical and completely contrary to well established knowledge about package management at every turn. It enforces a single implementation world without taking advantage of anything that might allow. Its version number annotations say nothing about what actually is required, and using them properly assumes a bug-free library (who remembers to increment a version number when they introduce a bug?) and a user of the library that perfectly researches its history and accurately guesses its future. Sigh. It just sort of depresses me is all. Haskell is a great language, yet it comes with such a backwards system that it is hard to take seriously any more.

massysett commented Jul 8, 2016

"And it all comes about from an attitude where it's OK to blame someone else for breaking your build. As I state in the previous point: if your work is blocked by someone else's behavior, your workflow is broken"

This is the best summary of this whole issue. I'm amazed that people somehow think that I should endure a bunch of busywork so that they can ensure that they can build their software using whatever menagerie of dependency versions they see fit. I am always amazed at the sense of entitlement people get when someone gives them software.

seagreen commented Dec 3, 2016

I think there's a case for keeping the PVP, only modified.

Preemptive upper bounds should be optional.

Some people may need to explicitly approve later versions of their dependencies, even if they type check. These people should use preemptive upper bounds.

For instance, a library called markdown may have a function htmlFromMarkdown :: Text -> Text. The transformation that function performs may change slightly each release (since Markdown is notorious for having no spec, lots of things are ambiguous, etc).

Most other libraries using that function won't care that it changes slightly between releases, but some might. The ones that do should set a preemptive upper bound on their markdown dependency, and approve each release explicitly by bumping their upper bound and doing a release of their own.

More commonly though, library authors want to approve each future version of their dependencies that type checks & passes tests. If that's the case they shouldn't have to make a release of their library each time one of their dependencies releases a new version that doesn't actually affect them. To handle this situation we should use automation to gather information about what dependency releases actually are non-breaking, and this information should be stored outside of the packages themselves. This would save library authors the tedium of acting like human CI tools.
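One possible shape for that externally stored information is sketched below in Python. The `BuildReports` class, its methods, and the package `mylib` are hypothetical names invented for this example, and the recorded outcomes are made up; an automated builder would fill in the records instead of a person.

```python
from collections import defaultdict

class BuildReports:
    """Build outcomes stored outside the packages themselves (hypothetical)."""

    def __init__(self):
        # (package, version) -> {(dependency, version): succeeded}
        self._results = defaultdict(dict)

    def record(self, package, dependency, succeeded):
        """Store whether this package release built and passed tests."""
        self._results[package][dependency] = succeeded

    def known_good(self, package, dependency_name):
        """Dependency versions this package release is known to work with."""
        return sorted(
            version
            for (name, version), ok in self._results[package].items()
            if name == dependency_name and ok
        )

    def status(self, package, dependency):
        """Return 'ok', 'broken', or 'untested' for one exact combination."""
        outcome = self._results[package].get(dependency)
        if outcome is None:
            return "untested"
        return "ok" if outcome else "broken"

# Made-up results, as a CI job might report them. Versions are tuples.
reports = BuildReports()
mylib = ("mylib", (1, 0))
reports.record(mylib, ("markdown", (0, 9)), True)
reports.record(mylib, ("markdown", (0, 10)), True)
reports.record(mylib, ("markdown", (0, 11)), False)

print(reports.known_good(mylib, "markdown"))          # [(0, 9), (0, 10)]
print(reports.status(mylib, ("markdown", (0, 11))))   # broken
print(reports.status(mylib, ("markdown", (0, 12))))   # untested
```

Because the records live outside `mylib`, a new markdown release that builds cleanly only adds a record; `mylib` itself needs no new release.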

Basically, I think the current way we handle unknown/unreleased packages is right for some situations, but not all.
