
@tomekowal
Last active April 19, 2018 06:56

Exploring the Elixir compiler

The Elixir language is a love child of Ruby and Erlang, giving you the best of both worlds: easy concurrency and programmer friendliness with advanced metaprogramming capabilities. Running tests concurrently shortens the development cycle. Nothing comes without drawbacks, though: compilation is usually quite slow. That doesn't matter much during development, because only changed files are recompiled, but building from scratch takes time, which can hurt CI times.

What makes compiling a larger Elixir project slow? Is it inherent to the problem, or is there room for improvement?

Measuring compilation speed

I'm going to perform measurements on my work laptop: a 13-inch MacBook Pro (early 2015), 3.1 GHz Intel Core i7 with 16 GB of RAM. Let's measure how much time, CPU and memory Elixir compilation takes. I'm using the latest master commit 5947ed751eb0276cb7fda666c319dba451a5e9e7

time make compile
...
make compile  116,55s user 10,70s system 153% cpu 1:22,98 total

I was watching htop in another window, and here is what I learned:

  • apart from one brief moment, the compilation didn't use all CPUs
  • memory usage was above 5 GB before the compilation started and never went over 7 GB

So the compilation is neither memory nor CPU bound. Does that automatically mean it is IO bound? Let's retry the experiment on a RAM disk.

time make compile
...
make compile  112,84s user 9,92s system 154% cpu 1:19,36 total

The results look very similar, and the couple of seconds of difference might be just statistical noise. If compilation is neither CPU, nor memory, nor IO bound, maybe it can be parallelized more?

Understanding Elixir parallel compiler

José wrote a great article on the inner workings of the parallel compiler: https://elixir-lang.org/blog/2012/04/24/a-peek-inside-elixir-s-parallel-compiler/ It explains the main idea while leaving out all the optimizations.

The main idea behind it is this:

  • there is one main process called the "parent" or "elixir_compiler"
  • for each file, a new compiling process is spawned
  • when a compiling process finishes, it reports to the parent
  • when a compiling process needs another module, it reports that to the parent and waits for the module to be compiled
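
The scheme above can be exercised directly from Elixir itself. A minimal sketch, assuming the `Kernel.ParallelCompiler.compile/2` entry point (public since Elixir 1.6; earlier versions exposed `files/2`) and a throwaway module name of my own:

```elixir
# Sketch: driving the parallel compiler by hand. The caller plays the
# "parent" role; one compiling process is spawned per file.
path = Path.join(System.tmp_dir!(), "pc_demo.ex")

File.write!(path, """
defmodule PcDemo do
  def hello, do: :world
end
""")

# Returns {:ok, modules, warnings} once every file has compiled.
{:ok, modules, _warnings} = Kernel.ParallelCompiler.compile([path])
IO.inspect(modules)

File.rm!(path)
```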

Very often, performance problems in Erlang systems are caused by wrong communication patterns: lock contention or a single process becoming a bottleneck. Let's dive into the source some more.

How to run the compile task from IEx

Firstly, I wanted to run the compilation task from IEx, so that I could apply any profiling code. I started with Observer, because it allows checking whether any process has a growing message queue, which would mean it can't keep up and has become a bottleneck.

#!/usr/bin/env iex
Mix.start() # start Mix
:observer.start() # start Observer before the compilation begins
{time, retval} = :timer.tc(fn ->
  Mix.CLI.main(["compile"])
end)
IO.inspect(time / 1_000_000) # convert from microseconds to seconds
IO.inspect(retval)

I dropped that script into my work project to make sure that most of the compilation time is spent on Elixir code. The Elixir source itself is first bootstrapped with Erlang.

Using Observer, I checked the process queue lengths, and none of them jumped higher than two. This rules out my single-process-bottleneck theory.
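
The same check can be scripted without Observer. A sketch of my own (a single snapshot; a suspicious process would need repeated sampling):

```elixir
# Snapshot the message-queue length of every live process; a process
# whose queue keeps growing between snapshots is a likely bottleneck.
# Process.info/2 returns nil for processes that died mid-scan, which
# the pattern in the generator silently skips.
queues =
  for pid <- Process.list(),
      {:message_queue_len, len} <- [Process.info(pid, :message_queue_len)],
      len > 0 do
    {pid, len}
  end

# Longest queues first.
queues
|> Enum.sort_by(fn {_pid, len} -> -len end)
|> IO.inspect()
```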

htop crashes Terminal.app and Activity Monitor

I wanted to check real-time CPU usage while building, but htop doesn't work, so I had to go with Activity Monitor.

Increasing the number of schedulers doesn't have an impact
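
For reference, the online scheduler count can be inspected and varied at runtime, which is how the claim above can be tested without restarting the VM. A sketch, not the exact commands I ran:

```elixir
# The emulator starts one scheduler per logical core by default.
# system_flag/2 changes how many of them are online at runtime,
# which lets us rerun the measurement under different counts.
nsched = :erlang.system_info(:schedulers)
IO.inspect({:schedulers, nsched, :online, :erlang.system_info(:schedulers_online)})

# Halve the online schedulers for one experiment, then restore them.
:erlang.system_flag(:schedulers_online, max(div(nsched, 2), 1))
:erlang.system_flag(:schedulers_online, nsched)
```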

Lock counting doesn't work with a kerl build

Another common problem is lock contention. The Erlang VM uses locks internally to protect shared resources, and it is possible that contention on those locks slows the VM down. http://erlang.org/doc/apps/tools/lcnt_chapter.html
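
Once you have a lock-counting emulator (built with `--enable-lock-counter`, which kerl does not enable by default, or started with `-emu_type lcnt` on OTP 21+), the collection looks roughly like this. The sketch guards on the build type so it is safe to paste into a stock VM:

```elixir
# Lock counting is only available in an emulator built with lock
# counting support, so guard on the build type before touching :lcnt.
if :erlang.system_info(:build_type) == :lcnt do
  :lcnt.start()
  :lcnt.clear()       # reset the counters
  # ... run the compilation under test here ...
  :lcnt.collect()     # snapshot the counters from the runtime
  :lcnt.conflicts()   # print locks sorted by collisions
  :lcnt.stop()
else
  IO.puts("lock counting not available in this emulator")
end
```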

Is the Lexical Tracker guilty? https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/kernel/lexical_tracker.ex

  %                                               CNT       ACC       OWN
  [{ "<0.874.0>",                                1050,undefined,   20.921},    %%
   { spawned_by, "<0.535.0>"},
   { spawned_as, {proc_lib,init_p,
                           ["<0.535.0>",[],gen,init_it,
                            [gen_server,"<0.535.0>","<0.535.0>",
                             'Elixir.Kernel.LexicalTracker',nil,[]]]}},
   { initial_calls, [{proc_lib,init_p,5},{erlang,put,2}]}].
  
  {[{undefined,                                     1,20977.726,    0.037}],
   { {proc_lib,init_p,5},                           1,20977.726,    0.037},     %
   [{{proc_lib,init_p_do_apply,3},                  1,20848.597,    0.081},
  
  
   :lcnt.conflicts
             lock   id  #tries  #collisions  collisions [%]  time [us]  duration [%]
            -----  --- ------- ------------ --------------- ---------- -------------
        proc_link 1399  755506        59627          7.8923    1537139        7.6958
        run_queue    6 3439147        49701          1.4452     948310        4.7478
         atom_tab    1 2803735         1881          0.0671      67882        0.3399

 :lcnt.conflicts
                   lock  id  #tries  #collisions  collisions [%]  time [us]  duration [%]
                  ----- --- ------- ------------ --------------- ---------- -------------
            code_server   6 1393762        73801          5.2951    1155458        5.7849
              run_queue   6 3439147        49701          1.4452     948310        4.7478
               atom_tab   1 2803735         1881          0.0671      67882        0.3399
          process_table   1   89258         5607          6.2818      32691        0.1637

:lcnt.inspect(:proc_link)
      lock                    id  #tries  #collisions  collisions [%]  time [us]  duration [%] histogram [log2(us)]
     -----                   --- ------- ------------ --------------- ---------- ------------- ---------------------
 proc_link           code_server  149268        59609         39.9342    1140048        5.7078 |X        ...............      |
 proc_link <nonode@nohost.487.0>     774            0          0.0000      13260        0.0664 |         X  ..    .    .      |
 proc_link <nonode@nohost.332.0>    1227            0          0.0000      11648        0.0583 |X        .  .. xx Xx.x        |