Some thoughts on using node-postgres in a web application

This is the approach I've been using for the past year or so. I'm sure it will change as I grow & am exposed to more ideas, but it's worked alright for me so far.

Pooling:

I would definitely use a single pool of clients throughout the application. node-postgres ships with a pool implementation that has always met my needs, but it's also fine to use the require('pg').Client prototype directly and implement your own pool if you know what you're doing & have custom requirements for the pool.
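
If you do roll your own pool, a rough sketch with the generic-pool module (which node-postgres itself used under the hood at the time) might look like this - the connection string & settings here are hypothetical:

//a hand-rolled pool of pg.Client instances using generic-pool (v2 API)
var pg = require('pg');
var genericPool = require('generic-pool');

var pool = genericPool.Pool({
  name: 'postgres',
  max: 10, //most clients we will keep open
  create: function(cb) {
    var client = new pg.Client('postgres://localhost/mydb');
    client.connect(function(err) {
      cb(err, client);
    });
  },
  destroy: function(client) {
    client.end();
  }
});

//usage: acquire a client, run a query, & always release it
pool.acquire(function(err, client) {
  if(err) throw err;
  client.query('SELECT NOW()', function(err, result) {
    pool.release(client);
  });
});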

The drawback to using a pool for each piece of middleware (or multiple pools in general) is that you have to control how many open clients you have connected to the backend, it's more code to maintain, and it likely won't improve performance over a single pool. If you find requests often waiting on an available client from the pool, you can increase the size of the built-in pool with pg.defaults.poolSize = 100 or something. I've set the default at 20, which I think is a sane default. If you have long-running queries during web requests you probably have bigger problems than increasing your pool size is going to solve. 😉

Each client connection takes a small but non-trivial amount of RAM on the PostgreSQL server backend. Once, we opened 1,500 clients to our server & the server went unresponsive - all its resources were consumed servicing clients & queries wouldn't run. I've heard a good soft limit to stick to is 1,000.

So if you have N servers each running M processes with the cluster module, make sure (pg.defaults.poolSize * N * M) < 1000. Here is a more in-depth guide about connection count.
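
For example (hypothetical numbers): 4 servers each running 8 cluster workers is 32 processes, so a pool size of 25 tops out at 800 open connections - safely under the soft limit:

//4 servers * 8 cluster workers * 25 clients per pool = 800 < 1,000
//(the deployment numbers here are made up - plug in your own)
require('pg').defaults.poolSize = 25;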

Code Organization:

I usually make a single file in the application that exports a 'query' method. This method uses the pool: it checks out a client, runs the query, returns the client to the pool, and then calls the callback with the results. I do this because it's much easier to audit SQL or add logging in one file than to go "hmm...can we log every SQL statement?" three months into a project when you've already sprinkled pg.connect calls throughout 50 route handlers and 20 model files across the application. Also, it's easy to forget to release your client when you're finished with it. I've helped on apps which had a client leak, and just like a memory leak, the app stops responding some time in the future as the pool gets exhausted. That's hard to track down if you're checking out clients throughout a 100+ file code base.

Usually the file looks something like this:

//./lib/query.js
var pg = require('pg');

//I have omitted logging, but my function is usually sprinkled with logs
module.exports = function(queryText, queryValues, cb) {
   pg.connect(function(err, client, release) {
     //connection failure
     //we don't need to release anything
     //because we were never handed a client in this case
     if(err) return cb(err);
     
     client.query(queryText, queryValues, function(err, result) {
       //always release the client
       release();
       
       if(err) return cb(err);
       
       //i like to return the rows directly since 99% of the time
       //I don't care about the other properties on the result object
       return cb(null, result.rows, result);
     });
   });
};

This pattern is so common for me that I created pg-query to abstract it into a module I drop into most projects. I still generally create a ./lib/query file that exports the pg-query module anyway, in case I want to replace it with a bespoke implementation later.
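
The indirection file can then be a one-liner - a sketch, assuming pg-query picks up its connection settings the usual way (environment variables or pg defaults):

// ./lib/query.js
//re-export pg-query; if I ever swap in a bespoke implementation,
//this is the only file that has to change
module.exports = require('pg-query');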

I also like to keep longer, non-dynamic queries in separate .sql files because it's hard to do multi-line strings in JavaScript & separating them out keeps the code a bit more readable. I know in web apps you often need to build queries dynamically, and for those I still build them in code, but I aim for static queries where I can, and when I have them I like to use querybox. I usually have a folder at the application root level with the sql files in it, and an index.js file that looks like this:

// ./sql/index.js

//our custom 'query' helper
var query = require('../lib/query');
module.exports = require('querybox')(__dirname, query);

So we end up with a tree structure that looks like this...

| app.js
| routes/
|-- data.js
|-- photos.js
| lib/
|-- query.js
| sql/
|-- index.js
|-- some-big-query.sql
|-- get-user-by-email.sql
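
Each .sql file is just plain SQL with the usual $1-style parameter placeholders - get-user-by-email.sql might contain something like this (hypothetical table layout):

-- ./sql/get-user-by-email.sql
SELECT * FROM "user" WHERE email = $1;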

So a very simplified version of a route handler with no input validation and no logging would look something like this:

// routes/data.js
var sql = require('../sql');
var query = require('../lib/query');
var router = require('express').Router();

module.exports = router;

router.get('/user/:email', function(req, res, next) {
  var email = req.params.email;
  sql.run('get-user-by-email', [email], function(err, rows) {
    if(err) return next(err);
    if(!rows.length) return next(404);
    
    return res.json(rows[0]);
  });
});

router.post('/user', function(req, res, next) {
  var queryText = 'INSERT INTO "user"(email) VALUES ($1)';
  var queryValues = [req.body.email];
  //note: don't name the callback's result argument 'res' -
  //it would shadow the express response object
  query(queryText, queryValues, function(err, rows) {
    if(err) return next(err);
    
    res.status(204).end();
  });
});

Transactions:

Funnily enough, the number of times I've actually needed multi-query transactions is pretty low, so I haven't had the need to look at modules that help with this or to write my own. In the handful of places where I've done a transaction I've just rolled it by hand...it's quite a bit of code, and you absolutely cannot use the aforementioned ./lib/query helper because you must use the same client for every step of the transaction. This is one of the places where I still use require('pg').connect directly. This is (unfortunately) usually how it looks:

var pg = require('pg');

var transferFunds = function(fromId, toId, amount, cb) {

  pg.connect(function(err, client, release) {
    if(err) return cb(err);
    
    var curryError = function(dispose, next) {
      return function(err, res) {
        if(err) {
          //if we pass 'truthy' to release it will
          //close & throw away the client & replace it with a new one
          //in the pool - i do this for 'critical' query errors
          release(dispose);
          return cb(err);
        }
        return next(res.rows, res);
      };
    };
    
    client.query('BEGIN', curryError(true, function() {
    
      var queryText = 'SELECT sum(money) as "availableFunds" FROM account WHERE id = $1';
      client.query(queryText, [fromId], curryError(true, function(rows) {
        if(rows[0].availableFunds < amount) {
          //we're mid-transaction here, so roll back
          //before handing the client back to the pool
          return client.query('ROLLBACK', curryError(true, function() {
            release();
            cb(new Error('not enough money to transfer'));
          }));
        }
        
        var updateFunds = 'UPDATE account SET money = money + $2 WHERE id = $1';
        
        //debit & credit accounts
        client.query(updateFunds, [fromId, -amount], curryError(true, function() {
        
          client.query(updateFunds, [toId, amount], curryError(true, function() {
            
            //finally commit our transaction & return the client to the pool
            client.query('COMMIT', curryError(true, function() {
              release();
              cb(null);
            }));
          }));
        }));
      }));
    }));
  });
  
};

That's actually pretty horrible looking & there's probably a better way - I just haven't needed to find one yet!
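
One shape a cleanup could take (an untested sketch, not something I actually ship) is a small helper that owns BEGIN/COMMIT/ROLLBACK and hands the raw client to a worker function:

// ./lib/transaction.js - a hypothetical sketch, not battle-tested
var pg = require('pg');

//work(client, done) runs its queries on the single client it is given;
//call done(err) to trigger a ROLLBACK, or done(null, result) to COMMIT
module.exports = function(work, cb) {
  pg.connect(function(err, client, release) {
    if(err) return cb(err);

    client.query('BEGIN', function(err) {
      if(err) { release(true); return cb(err); }

      work(client, function(err, result) {
        if(err) {
          //roll back & throw the client away - it may be in a weird state
          return client.query('ROLLBACK', function() {
            release(true);
            cb(err);
          });
        }

        client.query('COMMIT', function(err) {
          //dispose the client if COMMIT itself failed
          release(err ? true : undefined);
          cb(err, result);
        });
      });
    });
  });
};

With something like that, transferFunds shrinks to a single worker function that only worries about its three queries.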

Testing:

I usually use mocha for testing, and I have a pretty simple way to integration-test routes, library functions, or whatever else hits the database. This assumes everything is using the query method in ./lib/query we talked about above.

//some-test.js

//set the pool size to ONE in tests
//this means we will always get the same client...
//this lets us use a BEGIN/ROLLBACK in our tests so the database
//is never left in a bad state

require('pg').defaults.poolSize = 1;
var query = require('../lib/query');
var ok = require('okay');
var assert = require('assert');

describe('test with database access', function() {
  beforeEach(function(done) {
query('BEGIN', [], done);
  });
  
  afterEach(function(done) {
query('ROLLBACK', [], done);
  });
  
  it('does something awesome but does not disturb the database', function(done) {
    query('INSERT INTO "user"(email) VALUES($1)', ['brian@test.com'], ok(done, function(rows) {
      query('SELECT COUNT(*) as user_count FROM "user"', ok(done, function(rows) {
        assert.equal(rows[0].user_count, 1);
        done();
      }));
    }));
  });
  
  it('has clean database state', function(done) {
    query('SELECT COUNT(*) as user_count FROM "user"', ok(done, function(rows) {
      //the insert from the previous test was rolled back
      assert.equal(rows[0].user_count, 0);
      done();
    }));
  });
});

I banged this out quickly so it has some holes in it for sure, but I hope this helps...

@mshanu

mshanu commented Nov 18, 2014

It's a good start, and I'm also trying to get a web app out with node and pg. I'm curious to know more about the sql folder - do you have this code checked in somewhere?

@abenhamdine

Thx! It helped me a lot.
Actually, I didn't even think about .sql files, which is a much simpler solution than multi-line strings in JavaScript...

+1 also for the BEGIN/ROLLBACK in tests. Didn't think about that either - useful for integration tests.

@kevinburkeshyp

Think there's a slight error here - I might be reading this wrong, but in one instance you call curryError without a boolean for the first argument.

@DJviolin

DJviolin commented Sep 26, 2016

I followed this guide. Is my implementation OK? It's returning what it should, but I'm not sure I'm following best practices:

// ./sql/config/pgConfig
const pg = require('pg');

const pgConfig = {
  user: process.env.PGUSER,
  database: process.env.PGDATABASE,
  password: process.env.PGPASSWORD,
  port: process.env.PGPORT || 5432,
  max: 10,
  idleTimeoutMillis: 30000,
};
const pool = new pg.Pool(pgConfig);

module.exports = pool;

// ./sql/config/pgQuery
const pool = require('./pgConfig');

module.exports = (text, values, cb) => {
   pool.connect((err, client, done) => {
     if (err) return cb(err);
     client.query(text, values, (err, result) => {
       done();
       if (err) return cb(err);
       return cb(null, result.rows, result);
     });
   });
};

// ./routes/pg
const express = require('express');
const router = express.Router();
const query = require('../sql/config/pgQuery');

router.post('/', (req, res, next) => {
  const text = `
    SELECT * FROM users;
  `;
  const values = [];
  query(text, values, (err, rows) => {
    if (err) return next(err);
    console.log('RETURN QUERY: ', rows);
    // respond once, inside the callback, after the query finishes
    res.redirect('back');
  });
});

@styxali

styxali commented Nov 19, 2017

Well, it's okay, but you didn't handle disposing the connection on error. Follow the doc and look at these lines for reference:

...

if(err) {
//if we pass 'truthy' to release it will
//close & throw away the client & replace it with a new one
//in the pool - i do this for 'critical' query errors
release(dispose);
return cb(err);
}
...

@Unionindesign

Thank you for this! Any updates for 2019? I'm trying to structure the backend of a React application, and I haven't been sure (until now) where to keep my SQL queries - very helpful!
