Skip to content

Instantly share code, notes, and snippets.

@jsanz
Last active April 24, 2019 18:14
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jsanz/c4253fafd118a7f91a20632a198008cc to your computer and use it in GitHub Desktop.
Save jsanz/c4253fafd118a7f91a20632a198008cc to your computer and use it in GitHub Desktop.
CARTO new COPY endpoint & ogr2ogr

These are the steps to test the new CARTO COPY endpoint (more details here and here.

  1. Downloaded a Brooklyn PLUTO shapefile from here as a data testing example.
  2. Generated a 1 row dataset to upload to CARTO the definition of the shapefile
  3. Uploaded it to CARTO using normal Import API (with a client I have for convenience)
  4. Removed the row
  5. Converted the original shapefile into a PGDump file
  6. Generated the COPY command using the third line of bkmappluto.txt
  7. Removed first 4 and last 2 lines of the resulting $file.
  8. Encoding the COPY command to be used on a URL, I run curl command to upload the dataset

This means that with my 50Mbps broadband connection I uploaded a dataset of 276.715 polygons (200MB) in less than 6 minutes to my CARTO cloud account.

The carto_dataset and carto_sql commands are part of carto_cli, a set of utilities I maintain here. At some point I should publish that for pip as I've been using them for months without big issues.

Commands

[2] $ ogr2ogr -f "GPKG" -dialect sqlite -sql "select * from BKMapPLUTO limit 1" BKMapPLUTO.gpkg BKMapPLUTO.shp 
[3] $ carto_dataset upload bkmappluto.gpkg
[4] $ carto_sql run "delete from bkmappluto" 
[5] $ ogr2ogr -f PGDUMP -lco CREATE_TABLE=NO -lco SRID=4326 -lco GEOMETRY_NAME=the_geom \
-nlt POLYGON --config PG_USE_COPY YES  -t_srs "EPSG:4326" \
bkmappluto.txt BKMapPLUTO.shp
[7a] $ sed -i '1,4d;$d' $file
[7b] $ tail -n 2 "$file" | wc -c | xargs -I {} truncate "$file" -s -{}

-- 6
COPY "bkmappluto" ("the_geom", "borough", "block", "lot", "cd", "ct2010", "cb2010", "schooldist", 
"council", "zipcode", "firecomp", "policeprct", "healthcent", "healtharea", "sanitboro", "sanitdistr", 
"sanitsub", "address", "zonedist1", "zonedist2", "zonedist3", "zonedist4", "overlay1", "overlay2", 
"spdist1", "spdist2", "spdist3", "ltdheight", "splitzone", "bldgclass", "landuse", "easements", "ownertype",
 "ownername", "lotarea", "bldgarea", "comarea", "resarea", "officearea", "retailarea", "garagearea", 
"strgearea", "factryarea", "otherarea", "areasource", "numbldgs", "numfloors", "unitsres", "unitstotal", 
"lotfront", "lotdepth", "bldgfront", "bldgdepth", "ext", "proxcode", "irrlotcode", "lottype", "bsmtcode", 
"assessland", "assesstot", "exemptland", "exempttot", "yearbuilt", "yearalter1", "yearalter2", "histdist", 
"landmark", "builtfar", "residfar", "commfar", "facilfar", "borocode", "bbl", "condono", "tract2010", "xcoord",
 "ycoord", "zonemap", "zmcode", "sanborn", "taxmap", "edesignum", "appbbl", "appdate", "plutomapid", 
"firm07_fla", "pfirm15_fl", "version", "mappluto_f", "shape_area", "shape_len") FROM STDIN
[8]$ time curl -X POST \                                                                                                                            
    -H "Transfer-Encoding: chunked" \
    -H "Content-Type: application/octet-stream" \
    --data-binary @bkmappluto.txt \
    "https://jsanz.carto.com/api/v2/sql/copyfrom?api_key=MY_API_KEY&q=COPY%20%22bkmappluto%22%20(%22the_geom%22%2C%20%22borough%22%2C%20%22block%22%2C%20%22lot%22%2C%20%22cd%22%2C%20%22ct2010%22%2C%20%22cb2010%22%2C%20%22schooldist%22%2C%20%22council%22%2C%20%22zipcode%22%2C%20%22firecomp%22%2C%20%22policeprct%22%2C%20%22healthcent%22%2C%20%22healtharea%22%2C%20%22sanitboro%22%2C%20%22sanitdistr%22%2C%20%22sanitsub%22%2C%20%22address%22%2C%20%22zonedist1%22%2C%20%22zonedist2%22%2C%20%22zonedist3%22%2C%20%22zonedist4%22%2C%20%22overlay1%22%2C%20%22overlay2%22%2C%20%22spdist1%22%2C%20%22spdist2%22%2C%20%22spdist3%22%2C%20%22ltdheight%22%2C%20%22splitzone%22%2C%20%22bldgclass%22%2C%20%22landuse%22%2C%20%22easements%22%2C%20%22ownertype%22%2C%20%22ownername%22%2C%20%22lotarea%22%2C%20%22bldgarea%22%2C%20%22comarea%22%2C%20%22resarea%22%2C%20%22officearea%22%2C%20%22retailarea%22%2C%20%22garagearea%22%2C%20%22strgearea%22%2C%20%22factryarea%22%2C%20%22otherarea%22%2C%20%22areasource%22%2C%20%22numbldgs%22%2C%20%22numfloors%22%2C%20%22unitsres%22%2C%20%22unitstotal%22%2C%20%22lotfront%22%2C%20%22lotdepth%22%2C%20%22bldgfront%22%2C%20%22bldgdepth%22%2C%20%22ext%22%2C%20%22proxcode%22%2C%20%22irrlotcode%22%2C%20%22lottype%22%2C%20%22bsmtcode%22%2C%20%22assessland%22%2C%20%22assesstot%22%2C%20%22exemptland%22%2C%20%22exempttot%22%2C%20%22yearbuilt%22%2C%20%22yearalter1%22%2C%20%22yearalter2%22%2C%20%22histdist%22%2C%20%22landmark%22%2C%20%22builtfar%22%2C%20%22residfar%22%2C%20%22commfar%22%2C%20%22facilfar%22%2C%20%22borocode%22%2C%20%22bbl%22%2C%20%22condono%22%2C%20%22tract2010%22%2C%20%22xcoord%22%2C%20%22ycoord%22%2C%20%22zonemap%22%2C%20%22zmcode%22%2C%20%22sanborn%22%2C%20%22taxmap%22%2C%20%22edesignum%22%2C%20%22appbbl%22%2C%20%22appdate%22%2C%20%22plutomapid%22%2C%20%22firm07_fla%22%2C%20%22pfirm15_fl%22%2C%20%22version%22%2C%20%22mappluto_f%22%2C%20%22shape_area%22%2C%20%22shape_len%22)%20FROM%20STDIN"

{"time":351.587,"total_rows":276715}curl -X POST -H "Transfer-Encoding: chunked" -H  --data-binary @bkmappluto.tx  0,92s user 0,76s system 0% cpu 5:52,53 total
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment