@andrewb
Created March 25, 2019 12:05

Quick guide to debugging Node.js memory leaks

The basics

Run the app locally using the --inspect flag, e.g. node --inspect index.js. Note: if you are debugging a complex Next.js app (or similar) you need to use a production build, otherwise you'll get stuck on "Building dominator tree…" (see this issue for some info). To inspect Oxygen, run yarn build and then NODE_ENV=production node --inspect server.js.

Open DevTools for Node. You can do this by visiting chrome://inspect/#devices and looking for your app under "Remote Target".

[screenshot]

Or, you can open DevTools in the browser and click the Node icon.

[screenshot]

Select the "Memory" tab and take a "Heap snapshot".

[screenshot]

The snapshot will be saved as "Snapshot 1", with the amount of allocated memory shown below its name.

[screenshot]

You can select the snapshot to see detailed info about memory allocation.
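Aside: if attaching DevTools is inconvenient (for example on a remote machine), Node 11.13+ can also write a snapshot from code using the built-in v8 module. The sketch below is just an illustration of that API, not part of the workflow above; the resulting .heapsnapshot file can be loaded back into the Memory tab.

const v8 = require("v8");

// Writes a .heapsnapshot file and returns its name; load the file
// into the DevTools Memory tab to inspect it.
const file = v8.writeHeapSnapshot();
console.log(`Heap snapshot written to ${file}`);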

Perform an action that you think might be causing a leak, e.g. reload a page.

Take another snapshot (use the record icon at the top left, above the snapshot list). Compare its size with the previous snapshot's to see whether it has grown.

[screenshot]

Select the second snapshot and select "Comparison" to compare it with "Snapshot 1".

[screenshot]

You can sort by "# Delta" and "Size Delta" to see changes in memory allocation. This might not reveal much if the leak is small. Also, don't be alarmed if some additional memory is retained; it may simply not have been garbage collected yet.

We can get a better sense of what's happening by doing some load testing with a tool like ApacheBench (ab). Note: before you load test, make sure you will not CRUSH PRODUCTION SERVICES. If testing an app like Oxygen, you can configure your local oxygen-gql instance to use mocks instead of hitting Flow.

tl;dr: ab -A admin:thegoodplace -k -n 400 http://localhost:9000/ makes 400 keep-alive requests to http://localhost:9000/ and includes a basic auth header (-A username:password). See the ab docs for details.

Go back to DevTools, take another snapshot, and compare it with the previous one. If there is a leak you'll see a significant increase in "# Delta" and "Size Delta".
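For a rough numeric sanity check while the load test runs, you can also log Node's built-in process.memoryUsage() periodically. This is only a sketch to complement the snapshots (the 5-second interval is arbitrary):

// Log memory usage every 5 seconds (interval is arbitrary)
setInterval(() => {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  const mb = (n) => `${(n / 1024 / 1024).toFixed(1)} MB`;
  console.log(`rss=${mb(rss)} heapTotal=${mb(heapTotal)} heapUsed=${mb(heapUsed)}`);
}, 5000);

If heapUsed keeps climbing across ab runs and never settles back down, that supports the leak hypothesis.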

Finding the leak

Now that we have data we can find the leak.

The following uses Leaky as an example. This is a very simple app with a big flaw.

Here are the guts:

const http = require("http");

// The request log lives in memory for the life of the process
const myLog = [];

const server = http.createServer((req, res) => {
  // Every request pushes an entry that is never released
  myLog.push({ url: req.url, date: new Date() });
  res.end("Hello!");
});

server.listen(9000);

Two snapshots have been taken using the method described above. The first was taken after initializing the app, and the second was taken after reloading once.

[screenshot]

We could use these snapshots to find the leak; however, it might be difficult to pinpoint the source since the changes in allocation are small.

Run ab -k -n 400 http://localhost:9000/ to load test the app. Note: 400 is arbitrary and should be adjusted for your application.

Now it's a lot easier to see what's happening when comparing snapshots. We have 400 new Date objects, which is concerning since we made 400 requests in the example above. We also have 401 new number and object entries.

[screenshot]

Note: you can also use the "Summary" view and select "Objects allocated between Snapshot 2 and Snapshot 3" instead of "Comparison".

Expand "Object" in the "Constructor" column to take a close look at the entries. Some of them might look familiar.

[screenshot]

Sort by "Alloc. Size" or "Size Delta" and inspect the array constructor. Looking at the first entry, we can see it contains the same objects we saw earlier.

[screenshot]

Select the array to find additional info, such as its variable name.

[screenshot]

This reveals that pushing to myLog is causing the leak: each client request pushes to a shared array that keeps growing until the memory limit is reached. If this were a real application, the solution would be to write to a database or file instead of keeping the log in memory. Note: you can also select the object instead (like earlier) to see that it is an entry in myLog.
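For illustration, here is a minimal sketch of one possible fix, assuming appending to a local file is an acceptable home for the log (the file name is arbitrary; a real app might use a database or a proper logger instead):

const fs = require("fs");
const http = require("http");

// Append each entry to a file instead of holding it in process memory
const logStream = fs.createWriteStream("requests.log", { flags: "a" });

const server = http.createServer((req, res) => {
  logStream.write(`${new Date().toISOString()} ${req.url}\n`);
  res.end("Hello!");
});

server.listen(9000);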

To confirm this is causing the leak, comment out the suspect code and re-run the test.
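One more hedged aside: heap snapshots force a garbage collection before they are taken, but if you are watching process.memoryUsage() numbers instead, uncollected garbage can muddy the before/after comparison. Starting Node with the --expose-gc flag exposes global.gc(), which lets you force a collection before reading the numbers:

// Only available when the process was started with: node --expose-gc server.js
if (global.gc) {
  global.gc();
  console.log("heapUsed after GC:", process.memoryUsage().heapUsed);
}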
