@zwilias
Last active September 25, 2018 11:36
Elm compiler performance

Things that influence compile times

Complex/large case .. of expressions

Tracked here: elm/compiler#1362

This makes the "exhaustiveness" checker a little weary. You know, the thing that tells you you forgot to add a branch.

If you're facing compilation performance issues, and notice a large-ish case .. of expression, you can replace the entire expression with a Debug.crash "perf experiment" to check if that is, indeed, having a negative influence on your compilation times.

If you're pattern matching on a tuple, using nested case .. of expressions instead may help.

🐌 Potentially slow:

case (foo, bar) of
  (A, A) -> ..
  (A, B) -> ..
  (B, A) -> ..
  (B, B) -> ..

✈️ Potentially faster:

case foo of
  A ->
    case bar of
      A -> ..
      B -> ..
  B ->
    case bar of
      A -> ..
      B -> ..

Note 1: the exhaustiveness checker has undergone a lot of work in preparation for the 0.19 release, and its impact on compilation times will be drastically lower once 0.19 is released.

Note 2: with a heavily nested TEA setup, your code is bound to end up with a whole bunch of large-ish case .. of expressions in all of those update functions. While the impact of each one on its own won't be huge, together they will make things perceivably slower to compile.

Multicore Linux systems and CI (Travis, CircleCI, ...)

The Elm compiler can, in some cases, suffer from heavy parallelization. On CI systems like Travis and CircleCI, it's advisable to use sysconfcpus to limit the compiler to one or two cores. A secondary avenue (setting flags to influence the parallel GC settings of the Haskell runtime) is also being investigated.

Webpack

When using webpack, you'll want to check 2 things:

  • The maxInstances flag, which is set to 1 by default in the most recent version of elm-webpack-loader. On older versions, you'll want to check whether setting this to 1 influences your compile times.
  • If you're running on Linux, the "multicore Linux issue" may also apply. To use sysconfcpus with elm-webpack-loader, you can create a wrapper script and use that as your elm-make path via ?pathToMake=scripts/elm-make-wrapper:
#! /bin/bash

sysconfcpus -n 2 elm-make "$@"

Note that you can use a similar trick for elm-test, which also accepts a path to elm-make.

Many imports, interlinked module structure

When module A imports module B, module A is recompiled every time module B needs to be recompiled. Note that this also holds for transitive dependencies: A -> B -> C (so A depends on B, which depends on C) means that any change to C will recompile B and A as well.

By having highly interlinked modules, that means loads of files need to be recompiled whenever any file is touched.
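To make the transitive effect concrete, here is a minimal sketch in Python. It is not how elm-make is implemented; the function name and the dictionary representation of the import graph are assumptions made purely for illustration.

```python
from collections import defaultdict


def modules_to_recompile(imports, changed):
    """Return `changed` plus every module that imports it, directly or transitively.

    `imports` maps a module name to the list of modules it imports
    (a hypothetical in-memory stand-in for the real import graph).
    """
    # Invert the graph: module -> modules that import it.
    imported_by = defaultdict(set)
    for module, deps in imports.items():
        for dep in deps:
            imported_by[dep].add(module)

    affected = {changed}
    stack = [changed]
    while stack:
        current = stack.pop()
        for dependent in imported_by[current]:
            if dependent not in affected:
                affected.add(dependent)
                stack.append(dependent)
    return affected


# A imports B, B imports C: touching C drags B and A along.
imports = {"A": ["B"], "B": ["C"], "C": []}
print(sorted(modules_to_recompile(imports, "C")))  # ['A', 'B', 'C']
print(sorted(modules_to_recompile(imports, "A")))  # ['A']
```

The more modules sit "above" a file in the import graph, the larger this set gets for every edit to that file.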

As far as compilation speed is concerned, a very flat module-hierarchy is what you should aim for. As an added bonus, this tends to lead to nicer code than highly-nested TEA.

Many small files

In addition to the above, every extra file adds some constant overhead for the compiler. If code that lives in three files could live in a single file without impeding reusability, and without making it possible to break invariants your code needs to maintain, this may just be a case of over-eagerly splitting things up.

Evan's talk is a great resource for understanding "the life of a file".

Lacking type signatures

This is unlikely to play a significant role, but adding type signatures everywhere is a good idea in general.

Things that do not impact performance

  • import X exposing (..) vs import X exposing (foo) vs import X
  • probably other things but having a hard time coming up with something

zwilias commented Sep 11, 2017

Some notes on how elm-make decides which files to compile, i.e. the "planning" phase:

The module structure in Elm can be expressed as a directed, acyclic graph. For example, consider 3 files:

module Main exposing (..)

import Model exposing (..)
import View exposing (..)
[...]
module Model exposing (Model, init)

[...]
module View exposing (view)

import Model exposing (Model)

These dependencies can be expressed as a graph like so:
[Image: dependency graph in which Main depends on Model and View, and View depends on Model]

The arrows express a dependency: Main imports View, so Main depends on the API of View. Changes to the API of View may impact Main.

Now, in this setup, any change to View will result in both View and Main being recompiled. Imagine I changed the signature of the view function: compilation of Main should now fail. Similarly, changing Model requires recompiling both View and Main.

Conversely, changing Main does not necessitate recompiling View or Model.


The first step of the planning phase is to create this graph. For each entrypoint, its dependencies are added to the graph, and this happens recursively until there are no more dependencies to add. Since cyclic imports in Elm aren't allowed, this results in a DAG.
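As a rough sketch of that recursive discovery (in Python, not the compiler's actual code), assume a hypothetical `read_imports` callback that returns the modules imported by a given module:

```python
def build_graph(entrypoints, read_imports):
    """Collect module -> imported modules, starting from the entrypoints.

    `read_imports` is a hypothetical callback; a real implementation would
    parse the import lines of the module's source file.
    """
    graph = {}
    pending = list(entrypoints)
    while pending:
        module = pending.pop()
        if module in graph:
            continue  # already visited via another import path
        graph[module] = list(read_imports(module))
        pending.extend(graph[module])
    return graph


# The three-file example from above, with imports given as plain data.
sources = {"Main": ["Model", "View"], "Model": [], "View": ["Model"]}
graph = build_graph(["Main"], lambda module: sources[module])
print(sorted(graph))  # ['Main', 'Model', 'View']
```

Model is reached through both Main and View but only recorded once, which is why the visited check matters.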

Next step is checking if a module needs to be recompiled. A module will be added to the list of modules that need to be compiled when:

  • it hasn't been compiled before (no corresponding interface file)
  • it was changed since it was last compiled (last-modified time of source file > last-modified time of interface file)
  • it depends on a module whose interface-file was created after the interface file of the module itself (*)
  • it depends on a module that needs to be recompiled according to the above rules
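The four rules above can be sketched as a single predicate. This is an illustrative Python model under stated assumptions, not elm-make's code: timestamps are plain numbers, and the dictionaries `source_mtime` and `interface_mtime` are hypothetical stand-ins for file-system lookups.

```python
def stale_modules(imports, source_mtime, interface_mtime):
    """Return the set of modules that need to be (re)compiled.

    imports:         module -> list of imported modules (acyclic)
    source_mtime:    module -> last-modified time of the source file
    interface_mtime: module -> last-modified time of the interface file;
                     a module that was never compiled has no entry
    """
    decided = {}

    def is_stale(module):
        if module in decided:
            return decided[module]
        iface = interface_mtime.get(module)
        stale = (
            iface is None                        # rule 1: never compiled
            or source_mtime[module] > iface      # rule 2: source changed since
            or any(
                interface_mtime.get(dep, 0) > iface  # rule 3: dependency's interface is newer
                or is_stale(dep)                     # rule 4: dependency itself is stale
                for dep in imports[module]
            )
        )
        decided[module] = stale
        return stale

    return {module for module in imports if is_stale(module)}


imports = {"Main": ["Model", "View"], "Model": [], "View": ["Model"]}
source_mtime = {"Main": 1, "Model": 1, "View": 5}      # View was edited at t=5
interface_mtime = {"Main": 2, "Model": 2, "View": 2}   # everything compiled at t=2
print(sorted(stale_modules(imports, source_mtime, interface_mtime)))  # ['Main', 'View']
```

Editing only View marks View stale by rule 2 and Main stale by rule 4, while Model is left alone.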

During this processing, the information is transformed into a list where each item represents a module and its "blockers" - modules imported by this module, that need to be compiled.

Compilation is then a simple process of looping through this list, searching for a module where blockers is empty, and compiling that. After successfully compiling a module, it is removed from all blockers, and the process starts all over.
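A minimal Python sketch of that loop follows, assuming the blockers are held in a dictionary from module name to a set of module names. The function name is hypothetical, and appending to `order` stands in for actually compiling a module.

```python
def compile_order(blockers):
    """Return the order in which modules would be compiled.

    blockers: module -> set of imported modules that still need compiling
    """
    remaining = {module: set(deps) for module, deps in blockers.items()}
    order = []
    while remaining:
        # Search for a module whose blockers are all gone.
        ready = next((m for m, deps in remaining.items() if not deps), None)
        if ready is None:
            raise ValueError("every remaining module is blocked")
        order.append(ready)  # stand-in for compiling `ready`
        del remaining[ready]
        # The freshly compiled module no longer blocks anyone.
        for deps in remaining.values():
            deps.discard(ready)
    return order


blockers = {"Main": {"Model", "View"}, "View": {"Model"}, "Model": set()}
print(compile_order(blockers))  # ['Model', 'View', 'Main']
```

Because the import graph is acyclic, some module always has an empty blocker set until the list is exhausted, so the error branch only guards against malformed input.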


(*): Since, during normal operations, a module can only be compiled after all of its dependencies have been compiled, a module whose dependencies were compiled after the module itself means something fishy happened and this module is potentially stale. Realistically, this may happen in applications with multiple entrypoints where elm-make is invoked several times with different entrypoints.


julianjelfs commented Sep 21, 2017

Hi @zwilias, I have a couple of questions on this:

Firstly, how do we determine whether module A depends on module B? Is it just done based on import statements or is it done on actual usage? If it is just import statements then I guess it becomes very important to remove unused imports.

Secondly, in the above you state that "Changes to the API of View may impact Main". This is fine, and this clearly requires Main to be recompiled. But is that really what happens? Isn't it the case that Main will be recompiled when View is changed in any way? It seems that a major avenue for improvement would be to distinguish between changes to View that did or did not affect its external API. If I only change View's internal code but do not change any of the exposed types, then surely only View itself needs to be compiled.

Am I making sense?


zwilias commented Sep 28, 2017

@julianjelfs sorry I didn't respond before, didn't notice there was a comment here.

Your understanding is correct on both counts. a imports b == a depends on b; regardless of usage.

As for your second point: that would be an improvement, yes. The way things currently happen, this would be a fairly impactful change. Evan is aware of this possibility, and it is on his personal list of possible improvements to investigate after 0.19 is released.
