@jsanz
Last active April 4, 2024 17:27
[Elasticsearch & GDAL] Using an API Key to upload a dataset

Loading geospatial data into Elasticsearch

Resources:

Note

Assumes a recent Elastic Stack (Elasticsearch at least) running on Docker, attached to the es_esnet network under the elasticsearch hostname, with Elasticsearch exposed to the host on port 9203.

  • Create an API key using Kibana or the REST API
  • Create a headers.txt file with the API Key value
echo "Authorization: apikey YjN6ZHFZNEJRRHh0T2hiY1hnS2E6RWp5OE1faWJRNHlMSDlQZUFmWGVQUQ==" > headers.txt
  • Delete the ne_pop_places index if it exists
curl --user elastic:changeme -XDELETE http://localhost:9203/ne_pop_places
  • Generate the mapping file (note that this run also creates the index)
docker run --rm -u $(id -u ${USER}):$(id -g ${USER}) \
	-v ${PWD}:/tmp/ogr2ogr --network es_esnet \
	ghcr.io/osgeo/gdal:alpine-small-latest ogr2ogr \
	--config GDAL_HTTP_HEADER_FILE /tmp/ogr2ogr/headers.txt \
	-lco INDEX_NAME=ne_pop_places \
	-lco NOT_ANALYZED_FIELDS={ALL} \
	-lco WRITE_MAPPING=/tmp/ogr2ogr/ne_pop_places.json \
	-f Elasticsearch \
	http://elasticsearch:9200 \
	/tmp/ogr2ogr/ne_10m_populated_places.shp
  • Edit ne_pop_places.json to tweak the mapping if needed ...
  • Delete the index created by the step generating the mappings file
curl --user elastic:changeme -XDELETE http://localhost:9203/ne_pop_places
  • Transfer the geospatial data content
docker run --rm -u $(id -u ${USER}):$(id -g ${USER}) \
	-v ${PWD}:/tmp/ogr2ogr --network es_esnet \
	ghcr.io/osgeo/gdal:alpine-small-latest ogr2ogr \
	--config GDAL_HTTP_HEADER_FILE /tmp/ogr2ogr/headers.txt \
	-lco INDEX_NAME=ne_pop_places \
	-lco NOT_ANALYZED_FIELDS={ALL} \
	-lco MAPPING=/tmp/ogr2ogr/ne_pop_places.json \
	-f Elasticsearch \
	http://elasticsearch:9200 \
	/tmp/ogr2ogr/ne_10m_populated_places.shp
  • Check the data created
docker run --rm -u $(id -u ${USER}):$(id -g ${USER}) \
	-v ${PWD}:/tmp/ogr2ogr --network es_esnet \
	ghcr.io/osgeo/gdal:alpine-small-latest ogrinfo \
	--config GDAL_HTTP_HEADER_FILE /tmp/ogr2ogr/headers.txt \
	-summary \
	ES:http://elasticsearch:9200 \
	ne_pop_places
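
For the first two steps above: per the Elasticsearch API key documentation, the value after the scheme in the Authorization header is the Base64 encoding of the key's id and api_key joined by a colon (Kibana and the create-key response also offer it ready-made as the encoded value). A minimal sketch for when you only have the id and the secret at hand; make_api_key_header is a hypothetical helper name, and base64 and tr are assumed to be installed:

```shell
# Hypothetical helper (not part of Elasticsearch or GDAL).
# Usage: make_api_key_header <id> <api_key>
# Prints the header line expected in the file passed as GDAL_HTTP_HEADER_FILE.
make_api_key_header() {
  # Encode "<id>:<api_key>" as Base64; tr removes the trailing newline
  # (and any line wrapping) that base64 adds to its output.
  printf 'Authorization: apikey %s\n' \
    "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

# Example with placeholder values; replace them with the id and api_key
# returned when the key was created:
# make_api_key_header "my-key-id" "my-key-secret" > headers.txt
```

Redirecting the output to headers.txt produces the same kind of file as the echo step.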

If you need to debug, add a few config options. For example, this generic call summarizes the datasets in the cluster with verbose cURL and GDAL debug output:

docker run --rm -u $(id -u ${USER}):$(id -g ${USER}) \
	-v ${PWD}:/tmp/ogr2ogr --network es_esnet \
	ghcr.io/osgeo/gdal:alpine-small-latest ogrinfo \
	--config GDAL_HTTP_HEADER_FILE /tmp/ogr2ogr/headers.txt \
	--config CPL_CURL_VERBOSE YES \
	--config CPL_DEBUG ON \
	--config DEBUG ON \
	ES:http://elasticsearch:9200
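
The docker run calls above all share the same prelude, so a small wrapper can cut the repetition and turn the debug options into a toggle. This is only a sketch: gdal_es, GDAL_ES_DEBUG and GDAL_ES_DRY_RUN are hypothetical names, not part of GDAL or Docker, and it uses plain id -u and id -g for the current user.

```shell
# Hypothetical wrapper around the shared "docker run ... gdal" prelude.
# Usage: gdal_es <ogr2ogr|ogrinfo> [tool arguments...]
#   GDAL_ES_DEBUG=1    adds the CPL_* debug options from the call above
#   GDAL_ES_DRY_RUN=1  prints the command instead of running it
gdal_es() {
  gdal_es_tool="$1"
  shift

  # Prepend the debug options to the tool arguments when requested.
  if [ -n "${GDAL_ES_DEBUG:-}" ]; then
    set -- --config CPL_CURL_VERBOSE YES --config CPL_DEBUG ON "$@"
  fi

  # Rebuild the argument list as the full docker command line.
  set -- docker run --rm -u "$(id -u):$(id -g)" \
    -v "${PWD}:/tmp/ogr2ogr" --network es_esnet \
    ghcr.io/osgeo/gdal:alpine-small-latest "$gdal_es_tool" \
    --config GDAL_HTTP_HEADER_FILE /tmp/ogr2ogr/headers.txt \
    "$@"

  if [ -n "${GDAL_ES_DRY_RUN:-}" ]; then
    # Print the command joined by spaces, for inspection only.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

With this in place, the debug call becomes GDAL_ES_DEBUG=1 gdal_es ogrinfo ES:http://elasticsearch:9200, and the data check becomes gdal_es ogrinfo -summary ES:http://elasticsearch:9200 ne_pop_places.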