@chriswhong
Created October 2, 2016 14:39
#Using the NYC Geosupport Linux shared library from Ubuntu 16.04

I have been trying to understand more about geosupport, specifically geosupport desktop edition for linux which contains a linux .so shared library. I would like to eventually write node.js bindings for it so that I can write geocoding scripts that don't require a ton of network traffic.

I am a C noob and this was my first time messing with C and gcc on linux. I was able to write and compile a simple C program that calls the Geosupport shared library with hard-coded arguments.

##What is geosupport?

"Geosupport is a data processing system originally designed to run on IBM mainframes to support geographic processing needs common to New York City agencies." Basically, it's an NYC-specific geocoder released by the NYC department of city planning. It does many things, but at its simplest it can take human-readable address fields and return a point coordinate.

Geosupport desktop edition includes the entire geosupport package as a library, so there is no network traffic happening to look up addresses.

NYC's IT department, DoITT, wrote and open-sourced a Java-based web API on top of the Geosupport shared libraries in order to bring this functionality to the web. I have been using this API for bulk geocoding from node.js scripts, but managing thousands of requests ends up being inefficient.

##API Overview

Geosupport's API has a single function called geo(wa1, wa2). wa1 and wa2 (work areas one and two) are char buffers whose lengths depend on which Geosupport function you are calling. The simplest function is function 1, in which work area 1 has a length of 1200 characters. Those 1200 characters account for both input and output data, and you must provide the required values at their appropriate positions in the string.

Refer to the Geosupport User Guide for details about building work area strings. The chart below, from page 570 of the user guide PDF, outlines the first 142 characters of the work area 1 string.

[Image: geosupport_system_user_guide_16c, the work area 1 chart from the user guide]
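Rather than typing the padding by hand, the work area can be blank-filled and each value copied to its position. The sketch below is only an illustration: put_field and build_wa1 are hypothetical helper names, and the positions and widths are assumptions based on my reading of the work area 1 layout, so check every one of them against the chart in the User Guide before relying on it.

```c
#include <stddef.h>
#include <string.h>

#define WA1_LEN 1200

/* 1-based start positions and widths of the function 1 input fields.
   ASSUMPTIONS: verify each value against the User Guide chart. */
#define POS_FUNCTION_CODE    1
#define LEN_FUNCTION_CODE    2
#define POS_HOUSE_NUMBER     3
#define LEN_HOUSE_NUMBER     16
#define POS_BOROUGH_CODE     57
#define LEN_BOROUGH_CODE     1
#define POS_STREET_NAME      69
#define LEN_STREET_NAME      32
#define POS_FORMAT_INDICATOR 213
#define LEN_FORMAT_INDICATOR 1

/* Copy value into a fixed-width field, left-justified. Characters beyond
   the field width are dropped; unused bytes keep their blank padding. */
void put_field(char *wa, size_t pos, size_t width, const char *value)
{
    size_t n = strlen(value);
    if (n > width)
        n = width;
    memcpy(wa + (pos - 1), value, n);
}

/* Blank-fill wa1 (WA1_LEN bytes, no NUL terminator is written) and set
   the inputs for a function 1 call. */
void build_wa1(char *wa1, const char *house_number,
               const char *borough_code, const char *street_name)
{
    memset(wa1, ' ', WA1_LEN);
    put_field(wa1, POS_FUNCTION_CODE, LEN_FUNCTION_CODE, "1");
    put_field(wa1, POS_HOUSE_NUMBER, LEN_HOUSE_NUMBER, house_number);
    put_field(wa1, POS_BOROUGH_CODE, LEN_BOROUGH_CODE, borough_code);
    put_field(wa1, POS_STREET_NAME, LEN_STREET_NAME, street_name);
    put_field(wa1, POS_FORMAT_INDICATOR, LEN_FORMAT_INDICATOR, "C");
}
```

With a `char wa1[WA1_LEN];` buffer, calling `build_wa1(wa1, "120", "1", "BROADWAY");` should produce the same kind of string as a hand-padded literal, assuming the positions above are right.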

##Environment

I used a Docker image for Ubuntu 16.04 running on a MacBook Air. Once I had the container running, I updated the apt package lists and installed curl, unzip, and gcc.

To download Geosupport Desktop Edition for Linux, run:

curl -o geosupport.zip http://www1.nyc.gov/assets/planning/download/zip/data-maps/open-data/gdelx_16c.zip

It will unzip into /version-16c_16.3

Set environment variables LD_LIBRARY_PATH and GEOFILES with the full paths to /version-16c_16.3/lib and /version-16c_16.3/fls respectively.
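A missing or malformed variable is easy to overlook, so it can help to check both before calling geo(). This is a minimal sketch, not part of Geosupport: has_trailing_slash and check_geosupport_env are hypothetical helper names, and the trailing-slash requirement for GEOFILES is taken from a comment on this gist rather than from anything I have verified in the documentation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if path is non-empty and ends with '/', otherwise 0. */
int has_trailing_slash(const char *path)
{
    size_t n = path ? strlen(path) : 0;
    return n > 0 && path[n - 1] == '/';
}

/* Return 0 if both variables look usable; otherwise report the problem
   on stderr and return -1. */
int check_geosupport_env(void)
{
    const char *geofiles = getenv("GEOFILES");
    const char *libpath = getenv("LD_LIBRARY_PATH");
    int status = 0;

    if (libpath == NULL || libpath[0] == '\0') {
        fprintf(stderr, "LD_LIBRARY_PATH is not set\n");
        status = -1;
    }
    if (geofiles == NULL || geofiles[0] == '\0') {
        fprintf(stderr, "GEOFILES is not set\n");
        status = -1;
    } else if (!has_trailing_slash(geofiles)) {
        /* Assumption: GEOFILES must end with '/' (reported by a commenter). */
        fprintf(stderr, "GEOFILES should end with a trailing slash\n");
        status = -1;
    }
    return status;
}
```

Calling check_geosupport_env() at the top of main() and exiting on a non-zero result would surface a bad setup before the library is ever called.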

Per some reading on the GitHub repo for GeoClient, I modified geo.h to get rid of the logic for Windows, etc. It declares the geo function:

//---------------------------------------------------------------------------

#ifndef NYCgeoH
#define NYCgeoH
//---------------------------------------------------------------------------

extern void geo(char *ptr_wa1, char *ptr_wa2);
#endif 


I wrote a simple C program to call the geo function and print the results to the console. I set Geosupport function code 1, street number 120, borough code 1 for Manhattan, street name BROADWAY, and the Work Area Format Indicator C at the appropriate positions in the work area 1 string.


#include "./include/foruser/geo.h"
#include <stdio.h>


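int main(void)
{
        /* Work area 1: function code 1, house number 120, borough code 1
           (Manhattan), street name BROADWAY, and format indicator C, each
           padded with spaces to its fixed position. */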
        char wa1[1200]="1 120                                                   1          BROADWAY                                                                                                                                         C ";
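        /* Work area 2 starts out empty; geo() fills it with the results. */
        char wa2[1200]="";

        char *wa1ptr;
        char *wa2ptr;

        wa1ptr=wa1;
        wa2ptr=wa2;

        /* geo() writes its output into both work areas in place. */
        geo(wa1ptr, wa2ptr);

        /* Dump both work areas to the console. */
        printf("%s",wa1ptr);
        printf("%s",wa2ptr);
        return 0;
}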



Compile:

gcc test.c -L ./lib -lgeo -lm -o test

Run:

./test

Output:

1 120                                                   1          BROADWAY                                                                                                                                         C                                                                                                                                                   MANHATTAN120             000120000AA11361001010BROADWAY                                                                                                                                                                                                                                                                                                             00                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  011136102000120000AA  000104000AA000120000AA011128870                        1132390                        1075501070 R0012409811760197397       I 11011027109265         157700101          10010101E004 02303MS UX1  1   7     7  1001    7  3002 MN25P           5                      113610010112589 011136102000120000AA  000104000AA000120000AA011128870                        1132390                        1075501070 R0012409811760197397       I 11011027109265         157700101          10010101E004 02303MS UX1  1   7     7  1001    7  3002 MN25P           5                      113610010112589 

This looks like a big mess, but it is working as expected! The geo() function writes its results into wa1 and wa2. The input fields at the start of wa1 are unchanged, though output such as the borough name MANHATTAN now follows them, and wa2, which was originally "", now contains the response from Geosupport. You can see the string 10271, which is the zip code, mixed in with the results. You will need to parse this out into its individual fields, and I am pretty sure there is already a C header for that in the package, but I don't know enough to make use of it at this point.
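Until that header is figured out, fields can be pulled out by position. The sketch below is an assumption-laden illustration rather than anything from the Geosupport package: get_field and print_work_area are hypothetical helpers, and the position and width of each field (the zip code, for example) have to come from the User Guide. It also prints a work area by explicit length, because printf("%s") stops only at a NUL byte and a completely filled work area is not guaranteed to contain one, which may be why the work area 2 text appears twice in the output above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copy the field that starts at 1-based position pos and spans width
   bytes into out, strip trailing blanks, and NUL-terminate it.
   out must have room for width + 1 bytes. Returns the trimmed length. */
size_t get_field(const char *wa, size_t pos, size_t width, char *out)
{
    size_t n = width;

    memcpy(out, wa + (pos - 1), width);
    while (n > 0 && (out[n - 1] == ' ' || out[n - 1] == '\0'))
        n--;
    out[n] = '\0';
    return n;
}

/* Write exactly len bytes of a work area followed by a newline, without
   relying on a NUL terminator. */
void print_work_area(FILE *stream, const char *wa, size_t len)
{
    fwrite(wa, 1, len, stream);
    fputc('\n', stream);
}
```

For example, if the guide places a 5-character field at position p of work area 2, `char zip[6]; get_field(wa2, p, 5, zip);` would leave it as an ordinary C string.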

##TODO

Now that I have a basic understanding of what to pass into the geo() function, I can work on building a node.js addon that makes use of Geosupport. When I have a repo up for it, I'll update this gist. Any help is appreciated!

@veltman
veltman commented Oct 7, 2016

This is awesome! One note so far: I believe GEOFILES has to include a trailing slash to work properly (but LD_LIBRARY_PATH doesn't).
