Last active November 1, 2020 12:41
Ingest GeoTIFF into HDFS using GeoTrellis spark (0.10 Snapshot)
### INGEST GEOTIFFS INTO HDFS ###
# Path to the geotrellis-spark JAR. You should not need to change this when running from the
# repository root (remember to build it first with: ./sbt "project spark" assembly)
JAR=spark/target/scala-2.10/geotrellis-spark-assembly-0.10.0-SNAPSHOT.jar
# Amount of memory for the driver
DRIVER_MEMORY=3G
# Amount of memory per executor. If in local mode, change the DRIVER_MEMORY instead.
EXECUTOR_MEMORY=512G
# MASTER
# For a local ingest, the options are "local" or "local[K]", where K is the number of worker threads, e.g. "local[8]"
# Otherwise specify the Spark master, such as spark://207.184.161.138:7077 or mesos://192.168.1.2:5050
MASTER="local[8]"
# Directory with the input tiled GeoTIFFs
INPUT=file:/Users/rob/data/nlcd/clipped_tiles
# Catalog directory on HDFS
CATALOG=hdfs://localhost/catalog
# Name of the layer. This will be used in conjunction with the zoom level to reference the layer (see LayerId)
LAYER_NAME=nlcd
# This defines the destination spatial reference system we want to use
# (in this case, Web Mercator)
CRS=EPSG:3857
# true means we want to pyramid the raster up to coarser (lower-numbered) zoom levels,
# so if our input rasters are at a resolution that maps to zoom level 11, pyramiding will also save
# off levels 10, 9, ..., 1.
PYRAMID=true
# true will delete the HDFS data for the layer if it already exists.
CLOBBER=true
# We need to remove some bad signatures from the assembled JAR. We're working on excluding these
# files as part of the build step; until then, this is a workaround.
zip -d "$JAR" META-INF/ECLIPSEF.RSA
zip -d "$JAR" META-INF/ECLIPSEF.SF
# Run the spark-submit job
spark-submit \
  --class geotrellis.spark.ingest.HadoopIngestCommand \
  --master "$MASTER" \
  --driver-memory "$DRIVER_MEMORY" \
  --executor-memory "$EXECUTOR_MEMORY" \
  "$JAR" \
  --crs "$CRS" \
  --pyramid "$PYRAMID" \
  --clobber "$CLOBBER" \
  --input "$INPUT" \
  --catalog "$CATALOG" \
  --layerName "$LAYER_NAME"
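
The MASTER comments in the script describe a few accepted forms ("local", "local[K]", or a spark:// or mesos:// URL with a port). A minimal sketch of a pre-flight check for those forms is shown below. The function name `is_valid_master` is hypothetical and not part of the original script or of Spark; it also accepts "local[*]", and it deliberately rejects other master values Spark itself supports (such as "yarn"), so treat it as an illustration rather than a complete validator.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original script):
# returns 0 when the argument looks like one of the master forms named in
# the script's comments, and 1 otherwise.
is_valid_master() {
  case "$1" in
    local)
      return 0 ;;
    "local["*"]")
      # Strip the leading "local[" and the trailing "]" to get K.
      _k=${1#local?}
      _k=${_k%?}
      case "$_k" in
        "*") return 0 ;;            # local[*]: one thread per core
        ''|*[!0-9]*) return 1 ;;    # empty or non-numeric K
        *) return 0 ;;
      esac ;;
    spark://?*:?*|mesos://?*:?*)
      # Everything after the last colon must be a numeric port.
      _port=${1##*:}
      case "$_port" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
      esac ;;
    *)
      return 1 ;;
  esac
}
```

One way to use it would be to call `is_valid_master "$MASTER" || exit 1` just before the `spark-submit` line, so that a mistyped master stops the script before the JAR is modified or the job is submitted.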