DanielG/GSoC19-final.md

Final GSoC19 Report for: A stronger foundation for interactive Haskell tooling

Throughout the summer I've been working on ecosystem improvements in order
to allow Haskell tooling initiatives, mainly Haskell-IDE-Engine, to unfurl
their whole potential.
Here we're going to look at what work was done during the summer and how
that relates to the original proposal (pdf). The proposal had this to say:

This proposal will substantially improve the reliability, performance and
maintainability of tooling efforts.
This proposal consists of three main areas:

Improvements in GHC to reduce friction for downstream tooling efforts (Task 1 and 2)
Work on cabal-helper to enable easy new-build support (Task 3)
Integration of the above into Haskell IDE Engine (Task 4)


Significant progress was made with respect to the first two points; however,
not enough time was left over to fully finish cabal-helper and integrate it
into HIE during GSoC.

Journey through submitted issues and pull-requests

GHC

The first two PRs nicely correspond to Task 1 from the proposal. I had
originally allotted 2 weeks for this, but it ended up taking around 4 weeks
due to code review and needing to get back into the GHC development process
first. The third PR is a backport of the other two onto the 8.8 release
branch, which we had just missed. Thankfully that got merged and the API
changes will be part of the 8.8.1 release.

Allow using targetContents for modules needing preprocessing
Allow API clients to use GhcMake.downsweep directly
Backport recent changes relevant for HIE to 8.8 branch

At this point, since my schedule was very tight, I figured I'd have to drop
at least one of the tasks so I skipped right ahead to Task 2.2 as I knew
we could handle multiple components downstream instead.
The original idea of Task 2.2 was to share data (in-process) across
multiple GHC sessions in order to reduce the overall memory usage. Since
Task 2.1 was scrapped this morphed into simply investigating why and where
GHC is using so much memory to begin with.

Use laziness for FastString's z-encoding memoization

On the side I was also thinking about whether we could share memory across
multiple processes, hence the first PR above. I later scrapped that idea
because it's just a bit too much work.
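
To illustrate the general technique named in the FastString PR title, here is a minimal sketch of memoising a derived value through a lazy record field. It is not GHC's code: the `Name` type, `mkName`, and the three-character `toyZEncode` are hypothetical stand-ins chosen for this example, and the real z-encoding covers many more characters.

```haskell
-- A string paired with its encoded form. The second field is lazy, so it
-- starts out as an unevaluated thunk; the first use forces it and every
-- later use reuses the result. No mutable reference is needed.
data Name = Name
  { nameText :: String
  , nameZEnc :: String  -- computed on first demand, then shared
  }

-- Smart constructor: stores the thunk for the encoding next to the text.
mkName :: String -> Name
mkName s = Name { nameText = s, nameZEnc = toyZEncode s }

-- Toy stand-in for z-encoding (assumption: only three characters handled).
toyZEncode :: String -> String
toyZEncode = concatMap enc
  where
    enc 'z' = "zz"
    enc 'Z' = "ZZ"
    enc '.' = "zi"
    enc c   = [c]
```

With these definitions, `nameZEnc (mkName "Data.List")` evaluates to `"DataziList"`, and a `Name` whose encoding is never requested never pays for computing it.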

rts: Fix retainerProfile early return with TREC_CHUNK
rts: Fix STATIC_INLINE macro
rts: Fix -hT option with profiling rts
rts: Divorce init of Heap profiler from CCS profiler / rts: Rename the nondescript initProfiling2 to refreshProfilingCCSs
Generalise Profiling Heap Traversal code from Retainer profiler
Introduce heap profiling by user specified roots

About 50 commits and about 5000 lines of RTS C code later:

I confirmed my hypothesis that GHC could really use a better (well any)
cache eviction strategy for the ExternalPackageState datatype or, well,
more sharing.
As a side-effect of the investigation I had also produced a new heap
profiling mode which allows more directed quantitative analysis about the
heap memory usage of Haskell code. Please see the "Introduce heap profiling
by user specified roots" PR description for more details.
This code is still under code-review and I hope to get it merged in time
for GHC 8.10.
Unfortunately this took quite a bit longer than expected, 5 weeks or so,
but once I'd started I saw the potential to make it easier to debug and
reduce the memory consumption not just of GHC but of literally any Haskell
program!
I figured that while it's not what we originally planned for, I should
finish this anyway because there likely aren't many people in the Haskell
ecosystem who know about RTS implementation details and like to touch
gnarly C code.

haskell-ide-engine

After finishing with the GHC work I changed gears and finally started
looking into how we can best integrate cabal v2-build support via
cabal-helper in HIE. This also involved looking through Matthew's work on
hie-bios.
Just getting HIE set up for development took much longer than expected,
mainly because it currently only works when built via Stack, which is not
my preferred build tool.
I tried getting the tests working with cabal v2-test but there seem to be
some problems there that need more debugging.

Towards testing with cabal #1346

cabal-helper

Meanwhile I was also working on getting cabal-helper ready for release.
I was able to remove some very old ghc-mod-era hacks in the way
cabal-helper deals with multiple interdependent components in a package:
Remove crusty old helper code (commit).
This is a very important change as it essentially reduces the impedance
mismatch between cabal-helper and Cabal-3.0's new show-build-info command
to zero allowing us to take advantage of that going forward.
Other notable commits:


Flesh out project discovery API: this will hopefully allow replacing ghc-mod's automatic project discovery with something more principled in the future.


Introduce Package abstracton


Add exported interface for running build-tools


All changes:

cabal-helper, 20 commits (diff)

I'm planning on releasing a new version of cabal-helper in the next couple
of weeks. There are still a handful of fixes left to do, but it's
certainly ready for preliminary integration now.

Cabal

On a rainy weekend I implemented an idea I had floating around my head for
a while and also produced some fixes for cabal-install in the process:

[RFC] Support for entering/leaving annotations and concurrent log linearization #6196

The idea here is to make build output fully sequential, and incidentally
reproducible, in the presence of build concurrency but still allow for live
output. Turns out this is also useful for HIE since I realised for proper
UX we might have to parse and present build output from the build-tool as
diagnostics in the future.
This is still not finished but once I get some feedback from the other
cabal devs I'll see to getting it merged since I'm already actively using
it in Emacs myself.
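
To make the linearization idea concrete, here is a minimal sketch as a pure function. It is not the design from #6196; it assumes that jobs are numbered 0, 1, 2, ... in a fixed order, that each job emits its lines and then one completion event, and that no lines arrive for a job after it has completed. The names `Event`, `Line`, `Done` and `linearize` are hypothetical.

```haskell
import Data.List (partition)

type JobId = Int

-- What concurrent build jobs report, in arrival order.
data Event
  = Line JobId String  -- a job wrote one line of output
  | Done JobId         -- a job finished

-- Emit output strictly in job order. Lines of the current job pass through
-- immediately (live output); lines of later jobs are buffered until every
-- earlier job has finished.
linearize :: [Event] -> [String]
linearize = go 0 [] []
  where
    -- cur: job whose output is live; buf: buffered lines, newest first;
    -- done: jobs that finished while they were not the current job.
    go :: JobId -> [(JobId, String)] -> [JobId] -> [Event] -> [String]
    go _ _ _ [] = []
    go cur buf done (Line j s : es)
      | j == cur  = s : go cur buf done es
      | otherwise = go cur ((j, s) : buf) done es
    go cur buf done (Done j : es)
      | j == cur  = advance (cur + 1) buf done es
      | otherwise = go cur buf (j : done) es

    -- Make `cur` the live job: flush its buffered lines in arrival order,
    -- and keep advancing if that job has already finished.
    advance :: JobId -> [(JobId, String)] -> [JobId] -> [Event] -> [String]
    advance cur buf done es =
      let (mine, rest) = partition ((== cur) . fst) buf
          flushed      = map snd (reverse mine)
      in flushed ++ if cur `elem` done
                      then advance (cur + 1) rest done es
                      else go cur rest done es
```

Because the result depends only on the per-job line order and the fixed job numbering, two different interleavings of the same jobs produce the same output, which is where the reproducibility comes from. The result list is produced lazily, so lines of the current job can be printed as soon as they arrive.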

Fix v2-install ProgramDb confusion #6195
Fix cabal-install fighting over index cache with old versions #6164
Solver failure on 3.0 with distsdir from 2.4.1.0 due to outdated "compiler" cache #6163

Future work

In conclusion, as always, there's still a lot to do, especially on the
Haskell-IDE-Engine front. I will likely keep on working in this space even
after GSoC, but getting some hands-on experience with HIE has shown me just
how complex it is.
I think the main focus for the future should be making HIE easier to
contribute to. We really need more people working on it to tame the
inherent complexities of LSP and the downstream tools coming together.
I experienced some major friction in this area: the test coverage leaves
something to be desired and the granularity of the tests we do have is
simply way too coarse. If any of them fail you really have no idea what's
going on at a high level.
Only supporting one build system for building HIE itself is also not
helping to encourage contributions. My guess would be that there is a 50/50
split of cabal vs. stack in the Haskell world so we're just disgruntling
half the possible contributors right there.
Other than that we should continue the current trend of upstream-first
development rather than hacking around problems downstream -- that never
goes well in the long run.