Following on Chris Whong's excellent writeup of how to make calls directly to NYC's geosupport client, here's a basic way to call it as a child process in Node.js.
- Follow Chris's instructions on installing the desktop edition on Linux.
- Instead of his test.c, use this geocode.c, which is modified to treat the first command line argument as the contents of Work Area 1, and the (optional) second argument as the contents of Work Area 2. It prints the two resulting work areas to stdout, separated by a newline.
- Compile geocode.c with the same instructions.
- Use something like this geocode.js to call it repeatedly as a child process, constructing the proper work area string as needed by following the user guide.
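To give a feel for the child process step, here's a rough sketch, not the actual geocode.js. It assumes the compiled binary sits at a path you pass in (something like ./geocode) and that it prints Work Area 1, a newline, then Work Area 2, as described above. The names callGeosupport and splitWorkAreas are made up for illustration.

```javascript
const { execFile } = require('child_process');

// Split the binary's stdout into the two work areas.
// Only the separating newline and one trailing newline are removed,
// so the fixed-width space padding inside each work area is preserved.
function splitWorkAreas(stdout) {
  const newline = stdout.indexOf('\n');
  if (newline === -1) return { wa1: stdout, wa2: '' };
  const wa1 = stdout.slice(0, newline);
  const wa2 = stdout.slice(newline + 1).replace(/\r?\n$/, '');
  return { wa1, wa2 };
}

// Run the compiled geocode binary once.
// binaryPath is an assumption (e.g. './geocode'); wa2 is optional,
// matching the optional second command line argument.
function callGeosupport(binaryPath, wa1, wa2, callback) {
  const args = wa2 === undefined ? [wa1] : [wa1, wa2];
  // execFile skips the shell, so spaces in the work area need no quoting.
  execFile(binaryPath, args, { maxBuffer: 1024 * 1024 }, (err, stdout) => {
    if (err) return callback(err);
    callback(null, splitWorkAreas(stdout));
  });
}

// Usage (needs the compiled binary, so it is left commented out):
// callGeosupport('./geocode', workArea1, undefined, (err, result) => {
//   if (err) throw err;
//   console.log(result.wa1.length, result.wa2.length);
// });
```

Using execFile rather than exec matters here because the work area is mostly spaces, and passing it through a shell would mean quoting it carefully.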
This geocode.js constructs the work area for geocoding an address into a lat/lng, census tract, and census block based on address components (Geosupport function 1). But by changing the input string you send (line 20) and how you parse the output (lines 27 - 46), you could call any other function.
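Since the work areas are fixed-width strings, building the input and parsing the output mostly comes down to padding and slicing. Here's a minimal sketch of that idea. The helpers buildWorkArea and readField are hypothetical, and the offsets in the usage example are placeholders, NOT the real Geosupport layout; the real positions and lengths have to come from the user guide.

```javascript
// Build a fixed-width, space-padded string from a list of fields.
// Each field is { start, length, value } with a zero-based start offset.
function buildWorkArea(totalLength, fields) {
  const chars = new Array(totalLength).fill(' ');
  for (const { start, length, value } of fields) {
    if (start < 0 || start + length > totalLength) {
      throw new RangeError('field does not fit in the work area');
    }
    // Truncate values that are too long and pad short ones with spaces.
    const text = String(value).slice(0, length).padEnd(length, ' ');
    for (let i = 0; i < length; i++) chars[start + i] = text[i];
  }
  return chars.join('');
}

// Read one field back out of a returned work area, trimming the padding.
function readField(workArea, start, length) {
  return workArea.slice(start, start + length).trim();
}

// Placeholder layout, for illustration only:
const example = buildWorkArea(20, [
  { start: 0, length: 2, value: '1' },        // pretend function code
  { start: 2, length: 8, value: '120' },      // pretend house number
  { start: 10, length: 10, value: 'BROADWAY' } // pretend street name
]);
console.log(JSON.stringify(example));
console.log(readField(example, 10, 10));
```

Swapping in a different Geosupport function would then mean swapping the field list, not the helpers.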
Spawning a new child process for every address introduces a lot of unnecessary overhead, but in a Docker container on my machine this still does ~250 addresses per second. There are at least two alternatives that would be much faster:
- Use node-ffi to link the library directly. I'll give this a shot at some point.
- Do the work natively in C instead. I bet this would be one or two orders of magnitude faster. But manipulating strings in C makes me want to tear my hair out.
Notes
- This script also reprojects the results, which are in NY Long Island State Plane, into latitude/longitude.
- If the first attempt returns support code EE, which suggests a corrected street name (e.g. if you supplied BRADWAY it might suggest you try BROADWAY instead), this script tries again with the suggested name.
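For the reprojection note, here's one dependency-free way the conversion could be done. This is a hand-rolled inverse Lambert Conformal Conic, not necessarily what geocode.js does (a projection library would be the more usual choice). It assumes the coordinates are NAD83 / New York Long Island in US survey feet (EPSG:2263), and it ignores the small difference between NAD83 and WGS84. The function name statePlaneToLatLng is made up.

```javascript
// Assumed projection: EPSG:2263 (NAD83 / New York Long Island, US survey feet).
const US_SURVEY_FOOT = 1200 / 3937;          // metres per US survey foot
const A = 6378137;                           // GRS80 semi-major axis, metres
const INV_F = 298.257222101;                 // GRS80 inverse flattening
const E2 = 2 / INV_F - 1 / (INV_F * INV_F);  // eccentricity squared
const E = Math.sqrt(E2);
const toRad = (d) => (d * Math.PI) / 180;
const toDeg = (r) => (r * 180) / Math.PI;

const LAT1 = toRad(41 + 2 / 60);   // first standard parallel, 41°02' N
const LAT2 = toRad(40 + 40 / 60);  // second standard parallel, 40°40' N
const LAT0 = toRad(40 + 10 / 60);  // latitude of origin, 40°10' N
const LON0 = toRad(-74);           // central meridian
const FALSE_EASTING = 300000;      // metres
const FALSE_NORTHING = 0;          // metres

// Helper functions from the standard ellipsoidal LCC formulas.
const mFn = (lat) => Math.cos(lat) / Math.sqrt(1 - E2 * Math.sin(lat) ** 2);
const tFn = (lat) => {
  const es = E * Math.sin(lat);
  return Math.tan(Math.PI / 4 - lat / 2) / Math.pow((1 - es) / (1 + es), E / 2);
};

const N = (Math.log(mFn(LAT1)) - Math.log(mFn(LAT2))) /
          (Math.log(tFn(LAT1)) - Math.log(tFn(LAT2)));
const BIG_F = mFn(LAT1) / (N * Math.pow(tFn(LAT1), N));
const RHO0 = A * BIG_F * Math.pow(tFn(LAT0), N);

// Convert State Plane x/y in feet to latitude/longitude in degrees.
function statePlaneToLatLng(xFeet, yFeet) {
  const x = Number(xFeet) * US_SURVEY_FOOT - FALSE_EASTING;
  const y = Number(yFeet) * US_SURVEY_FOOT - FALSE_NORTHING;
  const rho = Math.sqrt(x * x + (RHO0 - y) ** 2); // N is positive for this zone
  const theta = Math.atan2(x, RHO0 - y);
  const t = Math.pow(rho / (A * BIG_F), 1 / N);
  // Latitude has no closed form on the ellipsoid; iterate to convergence.
  let lat = Math.PI / 2 - 2 * Math.atan(t);
  for (let i = 0; i < 10; i++) {
    const es = E * Math.sin(lat);
    lat = Math.PI / 2 - 2 * Math.atan(t * Math.pow((1 - es) / (1 + es), E / 2));
  }
  return { lat: toDeg(lat), lng: toDeg(theta / N + LON0) };
}
```

Number() is applied to the inputs because the coordinates come out of the work area as text.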
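The retry on EE could look roughly like the sketch below. Everything about the result shape is an assumption: geocodeOnce is a hypothetical synchronous function that runs one Geosupport call and returns an object with returnCode and suggestedStreet fields, which you'd have to fill by parsing the returned work area yourself.

```javascript
// Try once, and if Geosupport answers EE with a suggested street name,
// try exactly one more time using that suggestion.
// geocodeOnce(address) -> { returnCode, suggestedStreet, ... } is hypothetical.
function geocodeWithRetry(geocodeOnce, address) {
  const first = geocodeOnce(address);
  if (first.returnCode !== 'EE' || !first.suggestedStreet) {
    return first;
  }
  // Copy the address rather than mutating the caller's object.
  const corrected = Object.assign({}, address, { street: first.suggestedStreet });
  return geocodeOnce(corrected);
}

// Usage with a stand-in for the real call:
// const result = geocodeWithRetry(myGeocodeOnce, { houseNumber: '120', street: 'BRADWAY' });
```

Retrying only once keeps a bad suggestion from turning into a loop.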