A rant about pain points in Haskell, written as a response to https://redd.it/7rwuxb

I started writing this polemic to answer your question, but I ended up touching on most of my gripes with Haskell in general, not just in a corporate context.

GHC

GHC is a modern compiler with an amazing RTS and tons of features, but I have some issues with it.

Monolithic and Hard to Contribute To

I think GHC is a really messy codebase that would benefit greatly from being cleaned up, refactored, and split into multiple tools rather than remaining one monolithic compiler (in other words, I want GHC to be written more like LLVM). Ideally, a cleaned-up GHC would be written in "modern Haskell", using the full facilities of Hackage, though I think this might cause some issues when bootstrapping GHC. I also think it might be worth getting rid of some of the more antiquated thesisware GHC extensions, depending on how large the transitive reverse-dependency closure of the Hackage libraries that use them is, and how hard it would be to rewrite those libraries to not use those extensions.

Bootstrapping

I don't think bootstrapping GHC from an environment containing only a C compiler should require multiple versions of GHC. As a Nix user, my view is that any policy about how you build your code should be part of your codebase, and that extends to how you bootstrap a compiler. Versions of GHC are a "meta-level" notion not contained within the codebase, so the fact that you have to manually retrieve them and build them in order is problematic in my view. Personally, I would prefer a situation where we "compile" a reasonably modern GHC into a single file containing GHCi bytecode, write an interpreter for GHCi bytecode in portable C, and then check that GHCi bytecode file and interpreter into the GHC codebase. From an auditability point-of-view, I think that this solution isn't so great for preventing trusting-trust attacks (since the bytecode file is basically an opaque blob), but practically everyone already bootstraps GHC starting from an old binary distribution of GHC, so we're already living in a world susceptible to those kinds of attacks. The real solution to bootstrapping without being as susceptible to trusting-trust is to write a tower of compilers and interpreters for successively more complex languages, with the bottom of the tower written in auditable C, but that's such a huge effort that AFAIK no one ever does it.

Template Haskell

The fact that Template Haskell is built into GHC, as opposed to being a separate preprocessing step executed via another binary (distributed with GHC), is a complete travesty. It really does not make any sense to couple the GHC that is used to evaluate Template Haskell to the GHC that is used to compile the library or executable; this has caused massive problems with cross-compilation and with GHCJS. There is a workaround for this issue called iserv, but I don't think it is philosophically the right way to deal with it.

Admittedly, there is a fairly significant problem with separating TH out this way, which is that it requires some way of pretty-printing the generated Haskell. I think that problem should be dealt with by creating a versioned binary representation of the GHC Haskell AST, and then modifying the GHC frontend so that it can accept this binary format rather than a Haskell source file. The TH execution utility could then just output this binary format rather than Haskell source, sidestepping the issue of pretty-printing a TH-modified AST (though admittedly a separate tool for pretty-printing these binary AST files would also be pretty useful).

Going back to my point about splitting GHC up into smaller tools, this binary AST would also mean that we could make a ghc-parse executable that converts a (potentially TH-containing) Haskell source file into a binary AST file, which would be extremely useful for editor tooling, though as usual the devil is in the details of versioning, stability, and ease of parsing this binary AST format. I would probably advocate shipping a tool along with GHC that, in addition to (slowly) pretty-printing the binary AST as Haskell source, can convert the AST to a few non-binary formats like s-expressions and JSON. All of these tasks would be eminently doable if it weren't for the messiness of the GHC codebase and the general conservatism of the project, which decreases the number of people who are willing to contribute to it.
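To make the proposal concrete, here is a minimal sketch of the interfaces such tools might expose; every name and type below is invented for illustration, not an existing GHC API:

import Data.ByteString (ByteString)

-- Hypothetical versioned binary AST format.
data BinaryAst = BinaryAst
  { astVersion :: Int        -- bumped whenever the AST representation changes
  , astPayload :: ByteString -- serialized GHC AST, opaque to consumers
  }

-- ghc-parse: parse a (possibly TH-using) source file, run the splices,
-- and emit the expanded program in the binary AST format.
parseAndExpand :: FilePath -> IO BinaryAst
parseAndExpand = error "sketch only"

-- Companion converters for tooling that wants a textual view.
toHaskellSource :: BinaryAst -> String
toHaskellSource = error "sketch only"

toSExpression :: BinaryAst -> String
toSExpression = error "sketch only"

The point of the version field is that editor tooling could fail fast on a mismatched format instead of misparsing it.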

Cross-Compilation

I wish GHC were a native cross-compiler. At the very least, I would like to know how to "cross-compile" Haskell programs to Windows binaries (air quotes because it's the same architecture, just a different binary format and libc). I really wish the Nix GHC wrapper machinery supported Windows cross-compilation, assuming it is possible (and if it isn't, why not? The code for generating Windows PE files is in the GHC codebase; there's no good reason to disable it just because you're not building GHC for Windows, is there?).

Miscellaneous

I wish GHC had a WebAssembly backend (this is different from WebGHC, which involves translating GHC's LLVM output to WebAssembly using emscripten).

Infrastructure / Nix

I am fairly satisfied with the Nix/Haskell situation, but I have a few gripes. I think it would be extremely useful if we could build Haskell projects incrementally using Nix, as /u/dmjio mentioned in his comment, and in fact I spent most of summer 2017 working on implementing that using Nix's import-from-derivation (IFD) feature. However, I concluded that although it is in principle possible to implement this feature using IFD, it would be very difficult to do in a way that isn't brittle, and the least brittle way of implementing it would require fairly significant reengineering of Cabal to make it generate a static description of its build plan at configure time. Ultimately, I've come to the conclusion that we will need Recursive Nix, the ability to run nix-build inside a nix-build sandbox à la recursive make, before we can start incrementalizing Nix build processes, including Haskell's. I wrote a long comment on the Recursive Nix issue describing this experience, if you want to read more about it.

As I said in the GHC section, the cross-compilation situation with Nix + Haskell is pretty non-existent, though that's not to imply that it's particularly existent with other toolchains. I know that /u/Sonarpulse is working on fixing the nixpkgs cross-compilation infrastructure in general, and I hope that the Nix GHC wrapper gets the same treatment at some point.

I think the corporate users of Nix + Haskell should probably band together to create a single Hydra build farm that is better suited to our needs than the main NixOS build farm. In particular, I think it would be really nice if we had a fork of nixpkgs in which we could easily fix upstream Haskell packages (e.g.: by adding native dependencies or haskell.lib.dontCheck or whatever), and in which we could be proactive about security issues (since we only need to care about server use-cases). Then peti (or, if he doesn't want to, someone else) could upstream our fixes every so often, reducing the workload on both ends. I also think it'd be really nice to have GHC 8.2 Haskell package sets compiled both with and without profiling/debugging.

Nix itself has some warts, particularly in the performance of its evaluator, but that is worth a whole other rant.

Module system

Haskell has no way to make nested modules, no way to do qualified exports, and not even C++-style namespaces.

Any of these features would allow you to, for example, export a symbol called Text from Data.Text that can be indexed like a module (e.g.: Text.pack), allowing you to simply import Data.Text without any qualification. Moreover, this would allow custom preludes to subsume all the stuff you normally import; as it stands, you can add as much as you want to a custom prelude, but it all has to live in one namespace, or else every use of the prelude requires multiple imports.

I really want a world where you can just import MyPrelude and have BS.*, LBS.*, Text.*, LText.*, Set.*, Map.*, etc. already in scope. Plenty of languages already have this feature; I don't understand why Haskell is so antiquated in this regard.
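As a sketch, here is the kind of thing I want; the as-clause in the export list is invented syntax that no GHC implements today:

-- NOT valid Haskell: "module M as N" in an export list is hypothetical
-- syntax for re-exporting a module under a qualified name.
module MyPrelude
  ( module Data.Text as Text
  , module Data.Map.Strict as Map
  ) where

import qualified Data.Text as Text
import qualified Data.Map.Strict as Map

With something like this, a single import MyPrelude would put Text.pack, Map.lookup, and so on in scope.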

There are three main workarounds for this issue (i.e.: solutions that don't involve modifying GHC).

The first is to make all the functions that are likely to clash polymorphic enough that they can be used in all contexts. This is the approach taken by things like mono-traversable and most custom preludes. It works for some functions, but taken to an extreme I think it weakens type inference, makes code more complicated, slows down compile times, and slows down code at runtime (due to dictionary passing).
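For illustration, this is roughly the shape of the abstraction mono-traversable uses, heavily simplified from the actual package:

{-# LANGUAGE TypeFamilies #-}

import qualified Data.Text as Text

-- Maps each monomorphic container to its element type.
type family Element mono
type instance Element Text.Text = Char
type instance Element [a] = a

-- One omap works for both Text and lists, at the cost of an extra
-- dictionary and weaker type inference at call sites.
class MonoFunctor mono where
  omap :: (Element mono -> Element mono) -> mono -> mono

instance MonoFunctor Text.Text where
  omap = Text.map

instance MonoFunctor [a] where
  omap = map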

The second solution is to maintain a strict import discipline. This is what I currently do, and it results in insanely long import lists that contain a lot of pairs of lines like

import           Data.Foo (Foo)
import qualified Data.Foo as Foo

This is obnoxious and repetitive, but it's the most sensible solution given the current library ecosystem.

The third workaround is to adopt an OCaml-style convention where each module exports one type or typeclass, always named T or C respectively. This means you could just write import qualified Data.Foo as Foo without also having to import the type named Foo, since it would now be named Foo.T rather than the obnoxious Foo.Foo. The only Haskell programmer I know of who has adopted this convention is Henning Thielemann (example: numeric-prelude). The main problem with this approach, besides some people generally not liking it, is that Haddock generates really confusing documentation, since AFAIK Haddock never displays the qualification of a symbol. If the Haddock problem were fixed, I'd be willing to adopt this convention, but it seems like a real uphill battle to convince everyone else to do it too, so it is probably best suited to company-internal code where the style can be enforced.
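Concretely, the convention looks like this (module and field names are mine, for illustration):

-- Data/Foo.hs: the principal type is always named T.
module Data.Foo (T (..), new) where

data T = T
  { bar :: Int
  , baz :: Bool
  }

new :: Int -> Bool -> T
new = T

-- At a use site, one qualified import suffices:
--   import qualified Data.Foo as Foo
--   x :: Foo.T
--   x = Foo.new 42 True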

It is also worth mentioning that nested modules / namespaces would be pretty useful for Haskell's record problem, given that they would allow you to more easily namespace record accessors. For the most part, though, this would only fix Haskell records for the consumer of a record; it would be fairly boilerplatey to write records this way. For example, you would define Data.Foo like:

module Data.Foo (type Foo.Foo, module Foo) where

module Foo (Foo (..), new) where
  data Foo
    = Foo
      { bar  :: Bar
      , baz  :: Baz
      , quux :: Quux
      }

  new :: Bar -> Baz -> Quux -> Foo
  new = Foo

and then you would import Data.Foo to use the following names: Foo (type), Foo.new, Foo.bar, Foo.baz, and Foo.quux. This isn't perfect, but it's better in some ways than the current situation, and I can imagine that the changes that would need to be made in GHC to support nested modules would be fairly conducive to adding Agda-style support for records that automatically generate modules like this.

Records

The current situation with records in Haskell is kind of nightmarish. I really wish we could just have row polymorphism like PureScript. There's been quite a bit of research on the subject, and I don't really understand why the GHC team is so conservative about adding it to the type system, especially given that I'm pretty sure most implementations of it can be completely eliminated into GHC Core; the existence of vinyl and union certainly implies this, although the fact that those necessarily have linear-time accesses in present-day Haskell might mean that we need to extend Core in some way to make an implementation of row polymorphism efficient (I honestly don't know).
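To make the linear-time point concrete, here is a from-scratch miniature of the vinyl/union style of encoding; this is my own simplified sketch, not the actual vinyl API:

{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}

import GHC.TypeLits (Symbol)

-- An anonymous record is a heterogeneous list indexed by a type-level
-- list of label/type pairs.
data Rec :: [(Symbol, *)] -> * where
  RNil :: Rec '[]
  (:&) :: t -> Rec fields -> Rec ('(label, t) ': fields)
infixr 5 :&

-- Field lookup walks the spine one cell at a time, which is why these
-- encodings give linear-time access when compiled to today's Core.
class Has (label :: Symbol) (fields :: [(Symbol, *)]) t where
  get :: Rec fields -> t

instance {-# OVERLAPPING #-} Has label ('(label, t) ': rest) t where
  get (x :& _) = x

instance Has label rest t => Has label ('(other, s) ': rest) t where
  get (_ :& xs) = get xs

-- A row-polymorphic-flavored function: it accepts any record that has
-- at least a "name" field of type String.
greet :: Has "name" fields String => Rec fields -> String
greet r = "Hello, " ++ get @"name" r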

One other thing: in addition to making the record situation easier, row polymorphism can be extremely useful in other ways. For instance, we could have instances of Aeson's FromJSON and ToJSON typeclasses for a generic record type^1 like Rec from vinyl. This is useful because most of the time, when you wrap an API that uses JSON, you want two different "levels" of types: one level is a straightforward translation of the JSON format described in the API documentation (which requires comparatively little effort to completely wrap the API), and the second level is a more high-level, Haskell-appropriate translation of those types (which is almost never an up-to-date, complete description of the API). Since the FromJSON and ToJSON instances for the low-level types are so trivial, you really want them to be automatically generated. Sure, you can do that with GHC.Generics, but then you have to either use DuplicateRecordFields or prefix all your fields, and that ends up being much worse than the situation I'm describing. I know this because there is already a package that uses vinyl for this workflow, called composite-aeson, and I've used it to wrap APIs (e.g.: the Bittrex API).

In general, I think a lot of the things we currently use GHC.Generics for are better served by adding instances for anonymous row-polymorphic records/unions, since I don't think it's reasonable for the semantics of your program to depend on the identifiers you chose for your record accessors (in the anonymous-record world, these are type-level strings or empty data types equipped with instances of an open type family into Symbol, so it is much less surprising that program behavior changes if the type-level string or open type family instance is changed).
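To illustrate the field-prefix tax of the GHC.Generics route (the type and field names here are mine, not from any real API):

{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson (FromJSON (..), ToJSON (..), genericParseJSON, genericToJSON)
import Data.Aeson.Types (Options (..), defaultOptions)
import Data.Char (toLower)
import GHC.Generics (Generic)

-- Low-level wire type: fields must be prefixed with "ticker" so they
-- don't clash with every other record that also has a bid and an ask.
data Ticker = Ticker
  { tickerBid :: Double
  , tickerAsk :: Double
  } deriving (Show, Generic)

-- ...and then the prefix has to be stripped back off to match the JSON.
instance FromJSON Ticker where
  parseJSON = genericParseJSON defaultOptions
    { fieldLabelModifier = lowerFirst . drop (length "ticker") }

instance ToJSON Ticker where
  toJSON = genericToJSON defaultOptions
    { fieldLabelModifier = lowerFirst . drop (length "ticker") }

lowerFirst :: String -> String
lowerFirst (c : cs) = toLower c : cs
lowerFirst []       = []

With anonymous records, the field name is a type-level string like "bid" shared by every record that has such a field, so none of this renaming machinery is needed.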

Type system

Haskell doesn't have quantified class constraints. This means that some typeclasses, like MonadTrans, cannot be written in a way that restricts instances to the desired behavior: the way you would want to write the MonadTrans class is that any instance t should have the property that, for any monad m, t m also has a Monad instance, but you can't express this kind of superclass constraint without quantified class constraints.
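Concretely, using the syntax from the "Quantified Class Constraints" paper (not available in any released GHC at the time of writing), the class could state the law directly:

{-# LANGUAGE QuantifiedConstraints #-}

-- A sketch of MonadTrans with the superclass the prose asks for:
-- every instance t must turn any Monad m into a Monad (t m).
class (forall m. Monad m => Monad (t m)) => MonadTrans t where
  lift :: Monad m => m a -> t m a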

Haskell doesn't have dependent types, though they are on the way. I don't think dependent types are necessarily something you should use all the time, but they are pretty nice to have when needed.
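The closest approximations today go through DataKinds and GADTs; the standard length-indexed vector sketch gives the flavor:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

-- Type-level naturals, promoted with DataKinds.
data Nat = Z | S Nat

-- A vector whose length is tracked in its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- The type rules out calling vhead on an empty vector, so no runtime
-- check (and no Maybe) is needed.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x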

GUI

The biggest gap in the Haskell library ecosystem is definitely that of a good GUI library. I'm not talking about FRP or high-level wrappers or whatever, I think the situation there is mostly decent (reflex is best-in-class IMO for that, though I think reflex-dom is a bit of a crazy codebase). Instead, I'm talking about the ability to write cross-platform (by which I mean Windows, Mac OS, and Linux; I'm not convinced that writing the same GUI for PCs and mobile devices is satisfactorily possible) GUI applications that look good and perform well.

There are basically four options in that realm currently, listed from least-promising (IMO) to most-promising:

  1. FLTK, which has Haskell bindings in the form of /u/deech's fltkhs package. I haven't had a chance to look at the quality of these bindings, but they aim to be complete, which is admirable. However, I've never gotten fltkhs to build on NixOS, so I can't use them, and as far as I can tell FLTK isn't a very good UI toolkit anyway (much like GTK), though it is at least nominally cross-platform.
  2. The Haskell GTK bindings. These are extremely good, well-maintained bindings, and I have actually managed to build them, which is more than I can say for most of the other options (see the hello-world sketch after this list). However, GTK itself is a really bad UI toolkit, and although it is nominally cross-platform, GTK applications look pretty much the same wherever you run them (by default, at least), so they aren't a solution I'd bet a big project on.
  3. Qt, which has two bindings: hsqml and Qtah. I can pretty much immediately throw out hsqml because Qt Quick is really not very usable for making complicated GUIs (I have tried; it is far less developed and well-documented than the rest of the Qt ecosystem). Qtah, on the other hand, is far more impressive to me. Until today, I hadn't been able to compile it, but it ticks off most of my boxes, so I am generally impressed. Now someone just needs to figure out how to cross-compile Haskell programs using Qtah from Linux to Mac OS and Windows with Nix, and I will be much more satisfied with the state of the Haskell GUI library ecosystem. After that, if someone (perhaps me) writes a reflex-qtah, we will live in a world where it is possible to write a cross-platform, performant Haskell application that can be built entirely through Nix without any Windows or Mac OS licenses in the mix. There could even be a repository like reflex-platform that makes this kind of workflow completely plug-and-play! The only unfortunate thing about this is that making good-looking GUIs with Qt is generally more difficult than making good-looking GUIs with HTML + CSS, owing to the man-decades of effort that have been poured into the latter activity.
  4. Compiling your code to JavaScript using GHCJS and running it in some kind of browser (e.g.: electron). This is the most promising solution in some ways, since I have no doubt about my ability to make HTML + CSS look decent (it is painful, but doable), and it is definitely cross-platform, but there are real issues with the performance of code generated by GHCJS and the memory usage of modern browser engines. My hope is that by compiling to WebAssembly using WebGHC, and by making something like electron that uses Mozilla's Servo browser engine, we can overcome these issues, but I don't know how likely that is to happen. FWIW, this is the approach I have taken most seriously, as I put a bunch of work into GHCJS bindings for the Electron API: ghcjs-electron (they are still a work-in-progress unfortunately).
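For what it's worth, here is roughly what a minimal program against the GTK bindings from option 2 looks like, written from memory against the gtk2hs-style API, so treat it as a sketch:

import Graphics.UI.Gtk

main :: IO ()
main = do
  _ <- initGUI                   -- initialize the toolkit
  window <- windowNew            -- create a top-level window
  set window [windowTitle := "Hello from Haskell"]
  _ <- on window objectDestroy mainQuit
  widgetShowAll window
  mainGUI                        -- hand control to the GTK main loop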

Footnotes

  1. If your system has anonymous record types like vinyl's Rec type, just use those; if it doesn't, you can make instances of those typeclasses on a wrapper type defined like

    newtype Wrapped t = Wrapped (∀ ρ. t ⋄ ρ)

    where t is a type variable representing the required field types and ⋄ is row combination (with vinyl, ⋄ would be a closed type family that does type-level list concatenation).

@andrewthad

I agree with a lot of these pain points. I wanted to mention that, concerning quantified class constraints and row polymorphism, there haven't been any proposals made in the ghc-proposals repo for them. It's possible to write up a proposal without offering to actually implement it (I've done this). Both of these are things where SPJ would want to see the expected impact on Core as a part of the proposal. If anyone wants to write up proposals for these (especially quantified class constraints, since to my understanding it presents far fewer opportunities for bikeshedding), it would be a great service to the community at large.
