Skip to content

Instantly share code, notes, and snippets.

@pramsey
Last active April 13, 2018 12:50
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save pramsey/2990afc4d17f247db5e491c342194eea to your computer and use it in GitHub Desktop.
Save pramsey/2990afc4d17f247db5e491c342194eea to your computer and use it in GitHub Desktop.
SQL API COPY

So, it looks like COPY support will have to be divided into two parts, /copyfrom and /copyto, which is fine. They are both interesting, in that since the idea is to support scaling, the implementations absolutely must stream, rather than taking files or holding data in memory.

The parts needed are

The parts do in fact all fit together and work, as can been seen in this small proof-of-concept:

In the proof-of-concept everything of interest happens in the custom storage engine. Database connect, stream handling, etc. In the real implementation in the SQL API, we need at least a pg client, and the COPY sql we want to run. Getting these into the custom storage handler seems quite hard.

A basic implementation within the SQL API is on this branch:

The /copyfrom end point is set up by the copy_controller.js implementation.

The copy controller adds in two extra parts to the standard middleware chain used by the query_controller.js. A standard npm body parser, and the multer in single-file mode, with a custom storage handler, utils/multer-pg-copy.js.

The custom storage handler doesn't yet try to do a COPY, as that requires getting our hands on a pg client, which is tricky since the handler is only fed the "req" object, while the "authParams" needed to connect to the database live on "res.locals". Putting a stupid middleware ahead of the multer that copies the "authParams" onto the request object works to pass the database connection information through, although it seems hacky.

The blocking problem is this: the 'sql' parameter, that I want to access in the custom storage handler, because it will be required once we start doing a real COPY into the database is not available on the "req.body.sql" object at that stage. It is available after the handler is finished, in later stages, but not while the custom multer storage engine is running. By the time the terminal "handleCopy" function in "copy_controller.js" is called, the value is there, but it is not there for the multer storage engine, even though the body parser precedes the multer in the middleware chain.

Note that the proof-of-concept does not display this problem, even though the set-up seems identical: end point bound to POST, with a middleware chain that first runs body parser and then runs the multer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment