Evan Blaisdell evan-blaisdell

jakebrinkmann / download_m2m.py
Last active April 12, 2024 12:42
Python module for straightforward EarthExplorer Machine-to-Machine
"""
Data download script for EarthExplorer Machine-to-Machine
download_m2m('/path/to/downloads', username='user1234', dataset='ARD_TILE',
products='TOA,BT,SR,PQA', threads=40,
fields={'Region': 'CU', 'Spacecraft': 'LANDSAT_8'})
More M2M documentation: https://earthexplorer.usgs.gov/inventory/documentation
Author: Jake Brinkmann <jacob.brinkmann.ctr@usgs.gov>

Maxar Open Data GeoParquet STAC Catalog

GeoParquet is an experimental standard for storing geospatial data in the Parquet format. Because Parquet's columnar architecture allows for efficient reads over HTTP, it is considered an initial attempt at a "cloud-optimized" vector format.

While it lacks the ability to do optimized spatial reads, it can filter on a small fraction of a feature's fields, so it fits well with storing and querying STAC Items. A STAC Item inherits from GeoJSON, so it has a spatial component, but it can also store a large number of metadata fields, many of which may be redundant and rarely useful to query. GeoParquet lets us query for features with simple filter requirements like "all images with a low cloud cover percentage".
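As a rough illustration of that kind of query, here is a minimal sketch using `pyarrow.parquet.read_table` with a column projection and a row filter. The column name `eo:cloud_cover` is an assumption borrowed from the STAC EO extension and may not match the actual schema of the file; the helper names `cloud_cover_filter` and `read_low_cloud_items` are hypothetical and exist only for this example.

```python
def cloud_cover_filter(max_cloud_cover, column="eo:cloud_cover"):
    """Build a pyarrow-style filter keeping rows at or below a cloud cover threshold.

    The default column name is an assumption (STAC EO extension field);
    check the real schema of the Parquet file before relying on it.
    """
    return [(column, "<=", max_cloud_cover)]


def read_low_cloud_items(source, max_cloud_cover=10.0, columns=None, filesystem=None):
    """Read only the requested columns of rows that pass the cloud cover filter.

    `source` is a path or URI to a Parquet file. `columns` limits which
    fields are read, which is where the columnar layout pays off.
    """
    # Imported lazily so the filter helper above works without pyarrow installed.
    import pyarrow.parquet as pq

    return pq.read_table(
        source,
        columns=columns,
        filters=cloud_cover_filter(max_cloud_cover),
        filesystem=filesystem,
    )
```

Calling `read_low_cloud_items(path, max_cloud_cover=5, columns=["id", "eo:cloud_cover"])` on a local copy of the catalog would return a `pyarrow.Table` with just those two fields, assuming both columns exist in the file. Reading straight from S3 may additionally need a `filesystem` object configured for the bucket.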

The first attempt at converting the full Open Data Catalog to GeoParquet is at:

s3://maxar-opendata/events/maxar-opendata.parquet