Skip to content

Instantly share code, notes, and snippets.

@countvajhula
Last active September 30, 2022 18:59
Show Gist options
  • Save countvajhula/5b40ecc9ce53755df90f4046a310a01f to your computer and use it in GitHub Desktop.
Save countvajhula/5b40ecc9ce53755df90f4046a310a01f to your computer and use it in GitHub Desktop.
Understanding Racket module load time contributions

The Setup

This uses the codeloader tool to generate lots of dummy code. The same aggregate generated code is either contained in a single module or distributed across N=500 modules, and there are two client modules, main-combined.rkt and main-separate.rkt that require either the former or the (myriad) latter. The tests below are run on these "main" modules.

$ ./filecreator 500

Results

Require Latencies

All dummy modules required normally

$ raco require-latency -f main-combined.rkt 
12 ms
$ raco require-latency -f main-separate.rkt 
351 ms

339ms difference, about 29X

All dummy modules required for-syntax

$ raco require-latency -f main-combined.rkt
2 ms
$ raco require-latency -f main-separate.rkt
174 ms

  • 172ms difference, about 87X
  • in the combined case, this is 1/6th, and in the separate case it is about half of the value when the modules are required normally.

The remainder of these tests do not use for-syntax, i.e. the modules are required normally.

Hyperfine

$ hyperfine "racket main-combined.rkt"
Benchmark 1: racket main-combined.rkt
  Time (mean ± σ):     239.3 ms ±   3.6 ms    [User: 151.4 ms, System: 83.4 ms]
  Range (min … max):   233.4 ms … 246.0 ms    12 runs
	 
$ hyperfine "racket main-separate.rkt"
Benchmark 1: racket main-separate.rkt
  Time (mean ± σ):     622.7 ms ±  40.2 ms    [User: 438.8 ms, System: 161.9 ms]
  Range (min … max):   578.8 ms … 713.0 ms    10 runs

  • 383ms difference
  • this is 44ms higher than the require latency difference of 339ms

PLT_LINKLET_TIMES

$ racket main-combined.rkt 
;; read-linklet             9 [2] ms ; 76 times
;; run                     14 [5] ms
;;   instantiate           14 [5] ms ; 313 times
;; read                    15 [0] ms
;;   read-bundle            0 [0] ms
;;   faslin-literals        3 [0] ms ; 313 times
;;   faslin-code           12 [0] ms ; 313 times
;; boot                   125 [0] ms
;; total                  163 [7] ms
;;
;; faslin-code              1 MB

$ racket main-separate.rkt 
;; run                      7 [ 0] ms
;;   instantiate            7 [ 0] ms ; 1810 times
;; read-linklet            28 [ 7] ms ; 575 times
;; boot                   123 [ 0] ms
;; read                   145 [72] ms
;;   read-bundle            0 [ 0] ms
;;   faslin-literals       20 [ 0] ms ; 1810 times
;;   faslin-code          125 [72] ms ; 1810 times
;; total                  303 [79] ms
;;
;; faslin-code              4 MB

  • 140 ms total difference
  • 130 ms difference in read time
  • eval time is apparently higher for the combined module, but is small
  • The difference comes mainly from faslin-code.

PLT_EXPANDER_TIMES

$ racket main-combined.rkt 
REGION               MSECS     MEMK   COUNT
  eval               59.75   24,425       0
    resolver         20.41    4,805     968
    instantiate      18.38   10,179      64
    module           16.81    7,445      76
    requires          4.14    1,995    1293

  read               14.71    4,888     141

  expand              0.00        0       0
    require           0.00        0       1

$ racket main-separate.rkt 
REGION               MSECS      MEMK   COUNT
  eval              272.09   124,993       0
    resolver        138.89    21,507    4461
    instantiate      68.39    79,909     563
    module           54.42    18,412     575
    requires         10.38     5,163    2790

  read               38.06    16,079     640

  expand              0.00         0       0
    require           0.00         0       1

  • 212 ms difference in eval time
  • the difference comes mainly from resolver, but the others are clearly higher as well
  • 23 ms difference in read time

Compile times

$ time raco make main-combined.rkt

real    0m0.657s
user    0m0.491s
sys     0m0.161s

$ time raco make main-separate.rkt 

real    0m18.463s
user    0m12.626s
sys     0m5.270s

main-separate.rkt takes significantly longer to compile than main-combined.rkt.

See Also

Qi Compiler Sync Sept 23 2022

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment