@AsgerPetersen
Last active October 19, 2020 13:15
Test rasterstats with or without boundless reading

Install modified rasterstats from https://github.com/AsgerPetersen/python-rasterstats/tree/boundless to use this script as is.

Or just modify the bool in this line https://github.com/perrygeo/python-rasterstats/blob/master/src/rasterstats/io.py#L319

Script output:

time python test.py
Using boundless: True
python test.py  72.11s user 9.82s system 99% cpu 1:21.96 total
time python test.py
Using boundless: False
python test.py  1.51s user 0.24s system 90% cpu 1.926 total

I think the issue is more pronounced for TIFF files with a lot of tile offsets, probably because it takes a relatively long time to open such a file. This particular file is special in that it has a pretty large header but a small size on disk.
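To check whether a given file falls into this category, one rough option is to count the entries in its TileOffsets tag (TIFF tag 324). The sketch below is only an illustration and not part of rasterstats: `count_tile_offsets` is a hypothetical helper, it reads only the first IFD, and it assumes a classic TIFF (magic number 42), so it raises an error for BigTIFF files.

```python
import struct

TILE_OFFSETS_TAG = 324  # TIFF tag that stores one offset per tile


def count_tile_offsets(path):
    """Return the number of TileOffsets entries in the first IFD (0 if untiled).

    Hypothetical helper: handles classic TIFF only, not BigTIFF.
    """
    with open(path, "rb") as f:
        header = f.read(8)
        if len(header) < 8 or header[:2] not in (b"II", b"MM"):
            raise ValueError("not a TIFF file")
        # "II" means little-endian, "MM" means big-endian
        endian = "<" if header[:2] == b"II" else ">"
        magic, ifd_offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            raise ValueError("only classic TIFF (magic 42) is handled here")
        f.seek(ifd_offset)
        (num_entries,) = struct.unpack(endian + "H", f.read(2))
        # Each classic IFD entry is 12 bytes: tag, type, count, value/offset
        for _ in range(num_entries):
            tag, _type, count, _value = struct.unpack(endian + "HHII", f.read(12))
            if tag == TILE_OFFSETS_TAG:
                return count
    return 0
```

Calling `count_tile_offsets("./gdal_everywhere2.tif")` would then give a rough indication of how much tile index data has to be read each time the file is opened.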

from rasterstats import gen_zonal_stats
from shapely.geometry import Point

raster_file = "./gdal_everywhere2.tif"


def geoms_generator():
    # Yield the same circular polygon num_geometries times
    pix_size = 38
    radius_pixels = 10
    num_geometries = 1000
    for _ in range(num_geometries):
        yield Point(0, 0).buffer(radius_pixels * pix_size)


# Toggle between boundless and regular reads
boundless = True
print(f"Using boundless: {boundless}")
list(gen_zonal_stats(geoms_generator(), raster_file, boundless=boundless, nodata=-1))