@jsignell
Created May 22, 2023 20:33
{
"cells": [
{
"cell_type": "markdown",
"id": "7ceca385-2fbb-4c7f-ab8e-1dbcbb25f7ed",
"metadata": {},
"source": [
"# New STAC collection\n",
"\n",
"This notebook is a starting point for data providers who want to add a new dataset to the STAC API.\n",
"\n",
"Additional resources: https://github.com/NASA-IMPACT/delta-backend/issues/29/"
]
},
{
"cell_type": "markdown",
"id": "24a07118-cea2-41e3-9f85-d64b7ba720ee",
"metadata": {},
"source": [
"## Run this notebook\n",
"\n",
"This notebook is designed to run on a VEDA JupyterHub instance: either https://nasa-veda.2i2c.cloud or https://daskhub.veda.smce.nasa.gov/\n",
"\n",
"We'll start by installing and then importing some packages."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "0f8e7c3a-1ec3-4e82-8c9c-790be48da090",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pystac in /srv/conda/envs/notebook/lib/python3.10/site-packages (1.7.3)\n",
"Requirement already satisfied: python-dateutil>=2.7.0 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from pystac) (2.8.2)\n",
"Requirement already satisfied: six>=1.5 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from python-dateutil>=2.7.0->pystac) (1.16.0)\n"
]
}
],
"source": [
"!pip install -U pystac"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "e16eb677-1ac6-4860-9230-d2a403f0d8a2",
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime, timezone\n",
"import pystac"
]
},
{
"cell_type": "markdown",
"id": "ff855143-dfef-4318-aeb2-e2108bceef90",
"metadata": {
"tags": []
},
"source": [
"## Create `pystac.Collection`\n",
"\n",
"In this section we will create a `pystac.Collection` object. This is the part of the notebook that you should update.\n",
"\n",
"### Declare constants\n",
"\n",
"Start by declaring some string and boolean fields."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "dcf5702f-ed7c-4844-9fcd-a2e693f5ec05",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"COLLECTION_ID = \"no2-monthly-diff\"\n",
"TITLE = \"NO₂ (Diff)\"\n",
"DESCRIPTION = (\n",
"    \"This layer shows changes in nitrogen dioxide (NO₂) levels. Redder colors \"\n",
"    \"indicate increases in NO₂. Bluer colors indicate lower levels of NO₂. \"\n",
"    \"Missing pixels indicate areas of no data most likely associated with \"\n",
"    \"cloud cover or snow.\"\n",
")\n",
"DASHBOARD__IS_PERIODIC = True\n",
"DASHBOARD__TIME_DENSITY = \"month\"\n",
"LICENSE = \"CC0-1.0\""
]
},
{
"cell_type": "markdown",
"id": "56c0d7e1-80c3-4f84-a8a0-b6a20e0ee80e",
"metadata": {
"tags": []
},
"source": [
"### Extents\n",
"\n",
"The extents indicate the start (and potentially end) times of the data as well as the footprint of the data."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "99d23b77-e337-4fb6-8a41-da51b96e139c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Time must be in UTC\n",
"demo_time = datetime.now(tz=timezone.utc)\n",
"\n",
"extent = pystac.Extent(\n",
"    pystac.SpatialExtent([[-180.0, -90.0, 180.0, 90.0]]),\n",
"    pystac.TemporalExtent([[demo_time, None]]),\n",
")"
]
},
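{
"cell_type": "markdown",
"id": "sketch-fixed-temporal-extent-md",
"metadata": {},
"source": [
"If your dataset has a known start and end, a closed interval may describe it better than the open-ended demo interval above. The next cell is a minimal sketch under that assumption: the dates and the names `start_time`, `end_time`, and `fixed_extent` are placeholders for illustration, not properties of the NO₂ dataset. It does not overwrite `extent`, so the rest of the notebook is unaffected."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-fixed-temporal-extent-code",
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime, timezone\n",
"import pystac\n",
"\n",
"# Placeholder dates for illustration only; replace with your dataset's actual range.\n",
"start_time = datetime(2020, 1, 1, tzinfo=timezone.utc)\n",
"end_time = datetime(2020, 12, 31, tzinfo=timezone.utc)\n",
"\n",
"# Same global bounding box as above, but with a closed temporal interval.\n",
"fixed_extent = pystac.Extent(\n",
"    pystac.SpatialExtent([[-180.0, -90.0, 180.0, 90.0]]),\n",
"    pystac.TemporalExtent([[start_time, end_time]]),\n",
")"
]
},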
{
"cell_type": "markdown",
"id": "9c0b2c12-a44e-4fea-97de-c068e621d22c",
"metadata": {},
"source": [
"### Providers\n",
"\n",
"Here the data host, processor, and producer are all \"VEDA\", but you can include other providers that fill other roles in the data creation pipeline."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6956b82f-75a4-46c8-9006-85eed5123ebe",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"providers = [\n",
"    pystac.Provider(\n",
"        name=\"VEDA\",\n",
"        roles=[\n",
"            pystac.ProviderRole.PRODUCER,\n",
"            pystac.ProviderRole.PROCESSOR,\n",
"            pystac.ProviderRole.HOST,\n",
"        ],\n",
"        url=\"https://github.com/nasa-impact/veda-data-pipelines\",\n",
"    )\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "6f0b36ad-f5a0-4201-a64e-20dd63d6f565",
"metadata": {},
"source": [
"### Put it together\n",
"\n",
"Now combine your constants, extents, and providers to create a `pystac.Collection`."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f762b2c8-997a-40b0-8e30-526155048946",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"collection = pystac.Collection(\n",
"    id=COLLECTION_ID,\n",
"    title=TITLE,\n",
"    description=DESCRIPTION,\n",
"    extra_fields={\n",
"        \"dashboard:is_periodic\": DASHBOARD__IS_PERIODIC,\n",
"        \"dashboard:time_density\": DASHBOARD__TIME_DENSITY,\n",
"    },\n",
"    license=LICENSE,\n",
"    extent=extent,\n",
"    providers=providers,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "daf17ade-d2f4-42d4-ba6a-916fcc41a435",
"metadata": {},
"source": [
"### Try it out!\n",
"\n",
"Now that you have a collection, check that it looks the way you expect and that it passes validation."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5d53bbe2-cbfd-4993-b98f-17d8a63e0e13",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"['https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json']"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"collection.validate()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "880af165-0b1a-4f20-835e-f8eed28f100a",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"{'type': 'Collection',\n",
" 'id': 'no2-monthly-diff',\n",
" 'stac_version': '1.0.0',\n",
" 'description': 'This layer shows changes in nitrogen dioxide (NO₂) levels. Redder colors indicate increases in NO₂. Bluer colors indicate lower levels of NO₂. Missing pixels indicate areas of no data most likely associated with cloud cover or snow.',\n",
" 'links': [],\n",
" 'dashboard:is_periodic': True,\n",
" 'dashboard:time_density': 'month',\n",
" 'title': 'NO₂ (Diff)',\n",
" 'extent': {'spatial': {'bbox': [[-180.0, -90.0, 180.0, 90.0]]},\n",
" 'temporal': {'interval': [['2023-05-22T20:23:49.291888Z', None]]}},\n",
" 'license': 'CC0-1.0',\n",
" 'providers': [{'name': 'VEDA',\n",
" 'roles': [<ProviderRole.PRODUCER: 'producer'>,\n",
" <ProviderRole.PROCESSOR: 'processor'>,\n",
" <ProviderRole.HOST: 'host'>],\n",
" 'url': 'https://github.com/nasa-impact/veda-data-pipelines'}]}"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"collection.to_dict()"
]
},
{
"cell_type": "markdown",
"id": "7bfa02fd-e6d8-4b03-94b3-6109a95214f5",
"metadata": {},
"source": [
"## Save it"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "adc3e38b-cc1a-4646-bdee-6b9ce6b08ac2",
"metadata": {},
"outputs": [],
"source": [
"collection.save_object(include_self_link=False, dest_href=f\"{COLLECTION_ID}/collection.json\")"
]
}
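,
{
"cell_type": "markdown",
"id": "sketch-reload-collection-md",
"metadata": {},
"source": [
"To confirm that the file was written as expected, you can read it back and validate it again. This is a minimal sketch, assuming the save cell above has already run in the same working directory and that `COLLECTION_ID` is still defined; the names `saved_path` and `reloaded` are illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-reload-collection-code",
"metadata": {},
"outputs": [],
"source": [
"import pystac\n",
"\n",
"# Path written by the save cell above (assumes the same working directory).\n",
"saved_path = f\"{COLLECTION_ID}/collection.json\"\n",
"\n",
"# Read the collection back from disk and re-run schema validation.\n",
"reloaded = pystac.Collection.from_file(saved_path)\n",
"reloaded.validate()\n",
"\n",
"# The reloaded collection should carry the same ID that was saved.\n",
"assert reloaded.id == COLLECTION_ID"
]
}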
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}