Skip to content

Instantly share code, notes, and snippets.

@chasinglogic
Last active August 20, 2019 19:44
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save chasinglogic/d43a1e12d43a7d8ca052fe2aa76f16a1 to your computer and use it in GitHub Desktop.
Save chasinglogic/d43a1e12d43a7d8ca052fe2aa76f16a1 to your computer and use it in GitHub Desktop.
Ninja perf results, interesting findings but causes some correctness problems in the current state (conftests are treated as build nodes)

Table of Contents

  1. Ninja SCons Performance Notes and Results
    1. Base SCons Profile Results
    2. Caching SCons.Node.get_contents_file
    3. Caching SCons.Node.FS.File.get_csig
    4. Caching SCons.Node.FS.File.get_csig using standard memoization
    5. Initializing NodeInfo at SCons.Node.FS.Base construction time
    6. Initializing NodeInfo at SCons.Node.Node construction time

Ninja SCons Performance Notes and Results

I seeded the local file system cache by performing the build 3 times on the system before collecting results of the first profiling. I ran each profiled command three times and I averaged the time command output to determine the values shown here. The cProfile results are not as easily combinable so I used the profile output from the last run.

Unless otherwise specified, I tested each patch in isolation. Meaning that the patch shown is the changes tested and the subsequent patches do not build on one another.

The command line used to generate these results is:

time python -m cProfile -o profile.out ./buildscripts/scons.py --ninja --install-mode=hygienic ninja-install-servers

I chose the ninja-install-servers due to it's, relative to other targets, faster build.ninja generation time. It's worth noting that I perceived more noticeable gains with some changes on larger targets. For example, I was in one result able to get ninja-install-servers down to 1 minute and 50 seconds from the initial value of 3 minutes and 45 seconds. But ninja-install-meta which had an original build time of greater than 5 minutes was reduced to 2 minutes and 30 seconds.

The time results include running all the configure tests as well. Since this is a no_exec build the SConsign is never written so we never benefit from the caching of configure tests or SConsign signatures. If we can find a way to cache these even on a no_exec build then we will certainly see a performance gain.

Base SCons Profile Results

A run with no performance modifications takes about 206 seconds. The results of the time command are:

real	3m43.572s
user	3m45.924s
sys	0m5.683s

SCons is spending most of it's time in acquiring thread locks and reading files.

This indicates that SCons is doing File IO as part of walking the node graph. The actual File IO that the ninja builder does takes about 0.106 seconds and so is not the reason for the nearly 30 seconds spent in BufferedReader. The total ninja_builder execution time is about 14 seconds. This means the majority of the build time is spent reaching the ninja builder, so we should look for optimizations in SCons itself.

Welcome to the profile statistics browser.
profile.out.no_get_contents_file_cache% sort tottime
profile.out.no_get_contents_file_cache% stats 20
Tue Aug 13 09:53:10 2019    profile.out.no_get_contents_file_cache

         151797044 function calls (148685579 primitive calls) in 206.826 seconds

   Ordered by: internal time
   List reduced from 3232 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     7050   31.537    0.004   31.537    0.004 {method 'acquire' of '_thread.lock' objects}
    11076   22.198    0.002   22.198    0.002 {method 'read' of '_io.BufferedReader' objects}
      261    7.345    0.028    7.345    0.028 {built-in method posix.waitpid}
  1513241    7.151    0.000   24.345    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:402(find_include)
 23648061    5.707    0.000    7.880    0.000 {built-in method builtins.isinstance}
    85728    4.495    0.000    4.495    0.000 {built-in method posix.stat}
   324804    3.988    0.000   37.705    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:360(scan)
8645/5696    3.908    0.000   93.694    0.016 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:966(get_implicit_deps)
  5259563    3.613    0.000    4.663    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1572(to_bytes)
1030213/1030065    3.521    0.000   78.283    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:2929(get_found_includes)
  1893588    3.237    0.000    5.303    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:313(decide_source)
  1513847    2.346    0.000   11.797    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3655(find_file)
975079/428072    2.313    0.000    7.143    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3617(filedir_lookup)
  4539735    2.162    0.000    3.876    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1584(to_str)
    13508    2.100    0.000  114.612    0.008 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Taskmaster.py:770(_find_next_ready_node)
329800/329600    2.058    0.000   73.510    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:198(__call__)
  1054039    2.057    0.000    3.760    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:105(splitext)
   186311    2.007    0.000    2.007    0.000 {method 'findall' of 're.Pattern' objects}
  1358487    1.973    0.000    9.045    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:248(select)
    14164    1.948    0.000    1.956    0.000 {built-in method io.open}


profile.out.no_get_contents_file_cache% 

Caching SCons.Node.get_contents_file

I believed that the place the expensive file reads were occurring in the get\contents\file function. The File node uses this when calculating content signatures for the SConsign.

I added a simple attribute based cached for this function but it didn't appear to have a major affect.

Welcome to the profile statistics browser.
profile.out% sort tottime
profile.out% stats 20
Tue Aug 13 10:06:32 2019    profile.out

         151779546 function calls (148666573 primitive calls) in 189.705 seconds

   Ordered by: internal time
   List reduced from 3230 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    11039   22.565    0.002   22.565    0.002 {method 'read' of '_io.BufferedReader' objects}
     7006   14.765    0.002   14.765    0.002 {method 'acquire' of '_thread.lock' objects}
      261    7.800    0.030    7.800    0.030 {built-in method posix.waitpid}
  1511427    6.651    0.000   24.048    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:402(find_include)
 23636776    5.694    0.000    7.898    0.000 {built-in method builtins.isinstance}
    85633    4.694    0.000    4.694    0.000 {built-in method posix.stat}
   324598    4.042    0.000   37.016    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:360(scan)
8402/5453    3.877    0.000   93.773    0.017 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:972(get_implicit_deps)
1032545/1032397    3.636    0.000   78.396    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:2929(get_found_includes)
  5259568    3.581    0.000    4.622    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1572(to_bytes)
  1893583    3.230    0.000    5.179    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:319(decide_source)
  1512033    2.480    0.000   11.896    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3655(find_file)
  4534293    2.230    0.000    3.952    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1584(to_str)
329592/329392    2.221    0.000   73.559    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:198(__call__)
975079/428072    2.122    0.000    6.902    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3617(filedir_lookup)
  1360613    2.076    0.000    9.200    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:248(select)
  1055885    2.045    0.000    3.757    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:105(splitext)
    13505    1.992    0.000  113.978    0.008 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Taskmaster.py:770(_find_next_ready_node)
   186212    1.948    0.000    1.948    0.000 {method 'findall' of 're.Pattern' objects}
     7165    1.885    0.000    6.446    0.001 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Action.py:537(<listcomp>)


profile.out% 

Diff applied for adding the cache:

diff --git a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py
index ece0ab7312..6f78baf1b6 100644
--- a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py
+++ b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py
@@ -220,10 +220,16 @@ def get_contents_dir(node):
 def get_contents_file(node):
     if not node.rexists():
         return b''
+
+    cached = getattr(node.attributes, "contents", None)
+    if cached is not None:
+        return cached
+
     fname = node.rfile().get_abspath()
     try:
         with open(fname, "rb") as fp:
             contents = fp.read()
+            setattr(node.attributes, "contents", contents)
     except EnvironmentError as e:
         if not e.filename:
             e.filename = fname

Caching SCons.Node.FS.File.get_csig

I moved on the caller of get_contents_hash and attempted added caching to the get_csig method of the File node. This cut our two most expensive calls in half and shaved an entire minute off of the build as seen in the time results:

real	1m58.251s
user	1m52.486s
sys	0m6.195s

The python profiler shows us where we saved all that time:

Welcome to the profile statistics browser.
profile.out% sort tottime
profile.out% stats 20
Tue Aug 13 10:21:17 2019    profile.out

         151684002 function calls (148572685 primitive calls) in 124.873 seconds

   Ordered by: internal time
   List reduced from 3229 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    11039   12.962    0.001   12.962    0.001 {method 'read' of '_io.BufferedReader' objects}
     6998   10.911    0.002   10.911    0.002 {method 'acquire' of '_thread.lock' objects}
  1514454    5.131    0.000   16.523    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:402(find_include)
 23641015    3.651    0.000    5.077    0.000 {built-in method builtins.isinstance}
      261    2.976    0.011    2.976    0.011 {built-in method posix.waitpid}
1024335/1024187    2.882    0.000   51.330    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:2936(get_found_includes)
    85731    2.872    0.000    2.872    0.000 {built-in method posix.stat}
   324987    2.727    0.000   25.387    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:360(scan)
8624/5675    2.640    0.000   62.000    0.011 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:972(get_implicit_deps)
  5259561    2.213    0.000    2.882    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1572(to_bytes)
  1893509    2.168    0.000    3.492    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:319(decide_source)
  1515060    1.773    0.000    7.833    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3662(find_file)
    13506    1.385    0.000   75.851    0.006 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Taskmaster.py:770(_find_next_ready_node)
  1048122    1.379    0.000    2.555    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:105(splitext)
975079/428072    1.375    0.000    4.534    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3624(filedir_lookup)
  4543374    1.374    0.000    2.467    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1584(to_str)
329979/329779    1.349    0.000   47.644    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:198(__call__)
  1352789    1.342    0.000    6.208    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:248(select)
    14127    1.340    0.000    1.344    0.000 {built-in method io.open}
   186154    1.240    0.000    1.240    0.000 {method 'findall' of 're.Pattern' objects}


profile.out% 

Patch to apply the get_csig attribute based caching:

diff --git a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
index 352aeaebcc..0dc842c7c6 100644
--- a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
+++ b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
@@ -2732,6 +2732,11 @@ class File(Base):
         """
         if not self.rexists():
             return SCons.Util.MD5signature('')
+
+        cs = getattr(self.attributes, "csig_cached", None)
+        if cs is not None:
+            return cs
+
         fname = self.rfile().get_abspath()
         try:
             cs = SCons.Util.MD5filesignature(fname,
@@ -2740,6 +2745,8 @@ class File(Base):
             if not e.filename:
                 e.filename = fname
             raise
+
+        setattr(self.attributes, "csig_cached", cs)
         return cs

     @SCons.Memoize.CountMethodCall

Caching SCons.Node.FS.File.get_csig using standard memoization

Using an attributes based cache is fine for a simple test. However, the File node already supports memoization in the standard SCons way. The standard memoization uses a plain old dictionary which is generally faster than attribute access as there is less overhead.

Changing the implementation to use self._memo yields the following results.

real	1m56.616s
user	1m51.211s
sys	0m5.791s

As expected the profiling results are only about 10 seconds faster.

Welcome to the profile statistics browser.
profile.out% sort tottime
profile.out% stats 20
Tue Aug 13 11:06:14 2019    profile.out

         151819942 function calls (148704010 primitive calls) in 114.820 seconds

   Ordered by: internal time
   List reduced from 3217 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    11076   12.017    0.001   12.017    0.001 {method 'read' of '_io.BufferedReader' objects}
     7034   10.115    0.001   10.115    0.001 {method 'acquire' of '_thread.lock' objects}
  1512321    4.538    0.000   15.630    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:402(find_include)
 23645198    3.475    0.000    4.773    0.000 {built-in method builtins.isinstance}
      261    3.038    0.012    3.038    0.012 {built-in method posix.waitpid}
   324709    2.489    0.000   23.796    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:360(scan)
1032322/1032174    2.431    0.000   46.699    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:2936(get_found_includes)
8622/5673    2.412    0.000   56.303    0.010 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:966(get_implicit_deps)
    85706    2.361    0.000    2.361    0.000 {built-in method posix.stat}
  5259566    2.139    0.000    2.767    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1572(to_bytes)
  1893646    2.047    0.000    3.317    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:313(decide_source)
  1512927    1.513    0.000    7.704    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3662(find_file)
975079/428072    1.353    0.000    4.445    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3624(filedir_lookup)
  4536975    1.347    0.000    2.411    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1584(to_str)
    13507    1.283    0.000   69.219    0.005 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Taskmaster.py:770(_find_next_ready_node)
  1056103    1.275    0.000    2.346    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:105(splitext)
329708/329508    1.273    0.000   43.518    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:198(__call__)
  1360500    1.231    0.000    5.651    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:248(select)
   186246    1.200    0.000    1.200    0.000 {method 'findall' of 're.Pattern' objects}
    14164    1.179    0.000    1.183    0.000 {built-in method io.open}


profile.out% 

The patch for this version is:

diff --git a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
index 352aeaebcc..ef12756877 100644
--- a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
+++ b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
@@ -2726,12 +2726,18 @@ class File(Base):
                 return contents.decode('utf-8', error='backslashreplace')


+    @SCons.Memoize.CountMethodCall
     def get_content_hash(self):
         """
         Compute and return the MD5 hash for this file.
         """
         if not self.rexists():
             return SCons.Util.MD5signature('')
+
+        cs = self._memo.get("get_content_hash", None)
+        if cs is not None:
+            return cs
+
         fname = self.rfile().get_abspath()
         try:
             cs = SCons.Util.MD5filesignature(fname,
@@ -2740,6 +2746,8 @@ class File(Base):
             if not e.filename:
                 e.filename = fname
             raise
+
+        self._memo["get_content_hash"] = cs
         return cs

     @SCons.Memoize.CountMethodCall

Initializing NodeInfo at SCons.Node.FS.Base construction time

I discovered after speaking to Bill Deegan that SCons already attempts to cache the csig on the NodeInfo for =File=s which is subsequently stored in the SConsign file. This information is not actually loaded however until the node is executed by a Taskmaster job. These jobs are running in threads and so accessing the NodeInfo when a node can be in multiple threads creates a lot of lock contention around the SConsign and other shared resources.

Additionally each thread is loading / creating the NodeInfo per-thread. Causing further lock contention and wasting CPU cycles on duplicate work. By front loading this work we can get the same benefits as Memoizing but with better performance characteristics as the control flow around the signature subsystem is exception based and, so is optimized for the success case. Since we optimized lower in this stack we were paying the penalty for catching exceptions every time.

The time output shows:

real	1m50.664s
user	1m45.792s
sys	0m5.304s

Our python profiling shows that we're now spending under 10 seconds locking threads down from 30 seconds before this change and _hashlib.Hash no longer registers in the top 20 tottime:

   Welcome to the profile statistics browser.
profile.out% sort tottime
profile.out% stats 20
Tue Aug 13 13:19:19 2019    profile.out

         151697302 function calls (148583891 primitive calls) in 108.919 seconds

   Ordered by: internal time
   List reduced from 3229 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    11075   11.606    0.001   11.606    0.001 {method 'read' of '_io.BufferedReader' objects}
     7042    9.494    0.001    9.494    0.001 {method 'acquire' of '_thread.lock' objects}
  1510526    3.939    0.000   14.355    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:402(find_include)
 23634287    3.324    0.000    4.582    0.000 {built-in method builtins.isinstance}
      261    2.847    0.011    2.847    0.011 {built-in method posix.waitpid}
    85728    2.451    0.000    2.451    0.000 {built-in method posix.stat}
   324397    2.394    0.000   22.288    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:360(scan)
1027213/1027065    2.354    0.000   44.552    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:2939(get_found_includes)
8637/5688    2.294    0.000   53.662    0.009 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:966(get_implicit_deps)
  5259569    2.055    0.000    2.673    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1572(to_bytes)
  1893645    1.970    0.000    3.126    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:313(decide_source)
  1511132    1.498    0.000    7.235    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3665(find_file)
  4531590    1.263    0.000    2.280    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1584(to_str)
329399/329199    1.246    0.000   41.479    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:198(__call__)
975079/428072    1.245    0.000    4.160    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3627(filedir_lookup)
    28371    1.200    0.000    3.923    0.000 {built-in method builtins.eval}
    14165    1.188    0.000    1.192    0.000 {built-in method io.open}
  1051020    1.187    0.000    2.225    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:105(splitext)
    13506    1.163    0.000   65.297    0.005 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Taskmaster.py:770(_find_next_ready_node)
  1355083    1.156    0.000    5.361    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:248(select)


profile.out% 

The patch used to make this change is:

diff --git a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
index 352aeaebcc..e02a1c4e38 100644
--- a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
+++ b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py
@@ -2643,6 +2643,16 @@ class File(Base):
         Base.__init__(self, name, directory, fs)
         self._morph()

+        try:
+            ssign = self.dir.sconsign()
+            if ssign is None:
+                raise KeyError
+            stored_info = ssign.get_entry(self.name)
+            self.ninfo = stored_info.ninfo
+            self.binfo = stored_info.binfo
+        except (KeyError, EnvironmentError):
+            self.ninfo = self.new_ninfo()
+
     def Entry(self, name):
         """Create an entry node named 'name' relative to
         the directory of this file."""

Initializing NodeInfo at SCons.Node.Node construction time

While the performance gains from the above patch had significant performance wins it also produced some unintended side effects. The SConsign module was getting imported much earlier in the build by things like conftest*.c files and the SConscript File nodes themselves. This led to two SConsign files existing one at the root of the project using the default name and one at the location indicated by SConsignFile. We also don't really need to load the SConsign info we just need the NodeInfo object to be constructed for File nodes.

With these changes in mind I made the following patch that has both correct behavior with no side effects and all of the performance gains of the above change.

   commit 526a6f9755644fc04f7662a3d63bb35299ee8c80
Author: Mathew Robinson <chasinglogic@gmail.com>
Date:   Mon Aug 19 13:38:37 2019 -0400

    SERVER-42264 Improve Ninja generation time

diff --git a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py
index ece0ab7312..2eb9c048ef 100644
--- a/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py
+++ b/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py
@@ -630,6 +630,7 @@ class Node(object, with_metaclass(NoSlotsPyPy)):
         self._func_rexists = 1
         self._func_get_contents = 0
         self._func_target_from_source = 0
+        self.ninfo = None

         self.clear_memoized_values()

@@ -1133,11 +1134,10 @@ class Node(object, with_metaclass(NoSlotsPyPy)):
         return ninfo

     def get_ninfo(self):
-        try:
-            return self.ninfo
-        except AttributeError:
-            self.ninfo = self.new_ninfo()
+        if self.ninfo is not None:
             return self.ninfo
+        self.ninfo = self.new_ninfo()
+        return self.ninfo

     def new_binfo(self):
         binfo = self.BuildInfo()

The performance results show we are still in getting the threading benefits from before:

real	1m50.637s
user	1m49.697s
sys	0m1.805s

Pstats shows we continued to save the time in the same functions:

Welcome to the profile statistics browser.
profile.out% sort tottime
profile.out% stats 30
Tue Aug 20 15:16:52 2019    profile.out

         172581567 function calls (169405361 primitive calls) in 108.667 seconds

   Ordered by: internal time
   List reduced from 3285 to 30 due to restriction <30>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1695973    4.875    0.000   16.725    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:402(find_include)
    74097    4.488    0.000    4.488    0.000 {built-in method posix.stat}
23514516/23514515    3.974    0.000    8.221    0.000 {built-in method builtins.isinstance}
    14111    2.917    0.000    2.920    0.000 {built-in method io.open}
   354255    2.890    0.000   28.592    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:360(scan)
     6976    2.829    0.000    2.829    0.000 {method 'acquire' of '_thread.lock' objects}
25877/23115    2.707    0.000   52.468    0.002 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:967(get_implicit_deps)
  3136050    2.509    0.000    4.247    0.000 /home/chasinglogic/.virtualenvs/mongo/lib/python3.6/abc.py:180(__instancecheck__)
1115601/1115597    2.389    0.000   41.684    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:2929(get_found_includes)
    17863    2.267    0.000    2.267    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Executor.py:408(<listcomp>)
  5451572    2.241    0.000    2.831    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1572(to_bytes)
  2154224    1.895    0.000    3.050    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/__init__.py:313(decide_source)
  1696319    1.790    0.000    7.765    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3655(find_file)
  6205537    1.729    0.000    1.729    0.000 /home/chasinglogic/.virtualenvs/mongo/lib/python3.6/_weakrefset.py:70(__contains__)
  5087931    1.528    0.000    2.902    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:1584(to_str)
354824/354791    1.505    0.000   38.583    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:198(__call__)
  1174296    1.470    0.000    2.625    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:105(splitext)
   170583    1.432    0.000    1.432    0.000 {method 'findall' of '_sre.SRE_Pattern' objects}
1021451/449741    1.399    0.000    4.577    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Node/FS.py:3617(filedir_lookup)
  1472859    1.340    0.000    6.611    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Scanner/__init__.py:248(select)
  7034627    1.289    0.000    3.027    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Util.py:406(is_String)
  1092108    1.176    0.000    2.036    0.000 /home/chasinglogic/.virtualenvs/mongo/lib/python3.6/posixpath.py:104(split)
     6879    1.092    0.000    3.881    0.001 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Action.py:537(<listcomp>)
233613/29767    1.089    0.000    5.572    0.000 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Subst.py:649(expand)
    13522    1.048    0.000   66.036    0.005 /home/chasinglogic/Work/mongo/src/third_party/scons-3.0.4/scons-local-3.0.4/SCons/Taskmaster.py:770(_find_next_ready_node)
487929/116692    0.944    0.000    8.525    0.000 {method 'sub' of '_sre.SRE_Pattern' objects}
  2466128    0.938    0.000    3.358    0.000 /home/chasinglogic/.virtualenvs/mongo/lib/python3.6/_collections_abc.py:879(__iter__)
    26434    0.929    0.000    4.245    0.000 {built-in method builtins.eval}
    11017    0.894    0.000    0.894    0.000 {method 'read' of '_io.BufferedReader' objects}
67502/67455    0.850    0.000    1.270    0.000 {built-in method builtins.__build_class__}

Now that the ninfo attribute is not being dynamically created in each thread it is shared reducing time spent locking and removing redundant MD5 hashing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment