@palmerj
Last active August 16, 2023 17:33
Creating BigTiff COGS for raster RGB photos from a tile mosaic directory using GDAL

Creating Cloud Optimised GeoTIFFs (COGs) for raster photo imagery

This document outlines the process for creating Cloud Optimised GeoTIFFs suitable for hosting in services such as AWS S3. COGs enable more efficient workflows for use cases such as fast access from Functions as a Service (e.g. AWS Lambda) or consumption by desktop GIS clients (e.g. QGIS). For more details on COGs, please see https://www.cogeo.org/in-depth.html

1. Create a mosaic

First create the virtual mosaic from the directory of tiles, ensuring that an alpha band is created in the VRT to set transparency where there is no source raster.

gdalbuildvrt -addalpha mosaic.vrt *.tif
gdal_translate -b 1 -b 2 -b 3 -mask 4 mosaic.vrt rgbmask.vrt
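One way to check that the alpha band made it into the VRT is to look for a band reported as `ColorInterp=Alpha` in the `gdalinfo` output. The following is a minimal sketch; the helper name `has_alpha_band` is hypothetical and not part of GDAL.

```shell
# Hypothetical helper: succeeds if the gdalinfo report read from stdin
# lists at least one band with the alpha colour interpretation.
has_alpha_band() {
  grep -q 'ColorInterp=Alpha'
}

# Intended use (requires GDAL and the mosaic.vrt built in this step):
#   gdalinfo mosaic.vrt | has_alpha_band && echo "alpha band present"
```

The helper only inspects text, so it can be reused on the report for any other raster produced later in the process.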

2. Create a BigTiff

Create a BigTIFF with lossless compression to avoid quality loss. Use all available CPU cores (the DEFLATE compression method can use multi-threading). The GeoTIFF has an internal 1-bit mask band to provide transparency for parts of the mosaic raster extent that contain no source data.

gdal_translate \
  -b 1 -b 2 -b 3 -mask 4 \
  -of GTiff \
  -co BIGTIFF=YES \
  -co TILED=YES \
  -co COMPRESS=DEFLATE \
  -co PREDICTOR=2 \
  -co NUM_THREADS=ALL_CPUS \
  --config GDAL_CACHEMAX 4096 \
  -co ALPHA=YES \
  --config GDAL_TIFF_INTERNAL_MASK YES \
  mosaic.vrt output.tif

3. Create Overviews

Create overviews for the mosaic.

Note: gdaladdo has a known issue where generating multiple overviews in the same TIFF file is slow and causes TIFF directory thrashing. The libtiff library has to go back and forth between multiple TIFF internal images, loading and unloading the TIFF indexes each time. For a huge file, this involves a lot of I/O. The workaround, which is especially suitable for the COG case, is to generate each overview level in its own file by cascading calls to gdaladdo. See https://trac.osgeo.org/gdal/ticket/5067#comment:2 for more info.


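OVERVIEW=output.tif
# Each pass builds one factor-2 overview of the previous file into an external
# .ovr file (-ro), so the nine passes correspond to levels 2 through 512.
for LEVEL in 2 4 8 16 32 64 128 256 512
do
  gdaladdo \
    --config GDAL_CACHEMAX 4096 \
    --config COMPRESS_OVERVIEW DEFLATE \
    -ro \
    -r average \
    "$OVERVIEW" 2
  # No spaces around "=": the next pass reads the overview file just written.
  OVERVIEW="${OVERVIEW}.ovr"
done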
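To make the cascade easier to follow, the sketch below prints which file each pass would read and which external `.ovr` file it would write, without calling GDAL. The function name `list_overview_chain` and the choice of three passes are illustrative assumptions.

```shell
# Hypothetical helper: print the input and output file of each cascaded pass.
# Usage: list_overview_chain BASE_FILE NUMBER_OF_PASSES
list_overview_chain() {
  src=$1
  passes=$2
  i=1
  while [ "$i" -le "$passes" ]; do
    # With -ro, gdaladdo writes the overview next to its input as <input>.ovr
    echo "pass $i: reads ${src}, writes ${src}.ovr"
    src="${src}.ovr"
    i=$((i + 1))
  done
}

list_overview_chain output.tif 3
```

Each pass halves the resolution of the file produced by the previous pass, which is why every call uses a factor of 2 rather than the cumulative level.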

4. Create Cloud Optimised Geotiff (COGS)

Create the COG, applying the final JPEG compression and copying the previously generated overviews into the output. The IFD (Image File Directory) indexes are placed in the header of the file so that they can be fetched efficiently via cloud web APIs. The GeoTIFF is created with internal tiles of 256x256 for the main resolution and 128x128 tiles for the overviews.

NOTES:

  • When compressing with JPEG, multi-threading cannot be used.
  • Increasing the block size can reduce the size of the IFD, but larger blocks can cause more bytes to be pulled for random access if the compression rate is not high. Going from the default of 256 to 512 will reduce the index by a factor of 4. The size of the TIFF index arrays, for each pyramid level, is: 2 * ceil(xsize / blockxsize) * ceil(ysize / blockysize) * 8 bytes. Because we use an internal mask, this value has to be multiplied by 2.

gdal_translate \
  -of GTiff \
  -co BIGTIFF=YES \
  -co TILED=YES \
  -co BLOCKXSIZE=256 \
  -co BLOCKYSIZE=256 \
  -co COMPRESS=JPEG \
  -co JPEG_QUALITY=85 \
  -co PHOTOMETRIC=YCBCR \
  -co COPY_SRC_OVERVIEWS=YES \
  -co ALPHA=YES \
  --config GDAL_TIFF_INTERNAL_MASK YES \
  --config GDAL_TIFF_OVR_BLOCKSIZE 128 \
  --config GDAL_CACHEMAX 4096 \
  output.tif output_cogs.tif
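The index-size formula from the notes above can be evaluated with plain shell arithmetic. This is a rough sketch: the function name `index_bytes` and the 100000x100000 pixel raster are hypothetical, and it covers a single pyramid level with square blocks only.

```shell
# Hypothetical helper: estimate the TIFF index size in bytes for one level.
# Usage: index_bytes XSIZE YSIZE BLOCKSIZE
index_bytes() {
  xsize=$1
  ysize=$2
  block=$3
  # Integer ceiling division: number of tiles across and down
  tiles_x=$(( (xsize + block - 1) / block ))
  tiles_y=$(( (ysize + block - 1) / block ))
  # 2 * tiles * 8 bytes per the formula, then * 2 for the internal mask
  echo $(( 2 * tiles_x * tiles_y * 8 * 2 ))
}

index_bytes 100000 100000 256
index_bytes 100000 100000 512
```

For this hypothetical raster, the 512 block size gives an index roughly a quarter the size of the 256 one, in line with the factor of 4 stated in the notes.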

5. Validate the COG GeoTIFF

Check that the following script reports no errors or warnings:

python validate_cloud_optimized_geotiff.py output_cogs.tif
@mahendraprateik

I am running this command:
gdal_translate input.vrt output.tif -co TILED=YES -co COPY_SRC_OVERVIEWS=YES -co COMPRESS=DEFLATE -co BIGTIFF=YES -co NUM_THREADS=ALL_CPUS

And I get a warning "Warning 6: Driver GTiff does not support NUM_THREADS creation option"

I am not able to fix why I am not able to distribute it. Are you working with a special version/driver?

@palmerj
Author

palmerj commented Apr 4, 2020

I am not able to fix why I am not able to distribute it. Are you working with a special version/driver?

It's supported from GDAL 2.1: https://gdal.org/drivers/raster/gtiff.html#open-options

But I would recommend GDAL 2.3+ for this: https://gist.github.com/palmerj/2e4a46fbcf0c97212e6e77fced22e885

In GDAL 3.1 (which will be released in May) you can do all of this much easier with the COG driver: https://gdal.org/drivers/raster/cog.html

@ftrastour

Hi,
I'm wondering about this command line :

gdal_translate -b 1 -b 2 -b 3 -mask 4 mosaic.vrt rgbmask.vrt

as the resulting VRT is not used after.
Could you clarify please ?
