I hereby claim:
- I am youssef-harby on github.
- I am yharby (https://keybase.io/yharby) on keybase.
- I have a public key ASBmkyd87Ju7hwN_Uej1lBn8r2SbYZsl12bXghbTDcxPNwo
To claim this, I am signing this object:
import csv
import re

def extract_lat_long(csv_file, url_column):
    pattern = r".+!3d(-?\d+\.\d+)!4d(-?\d+\.\d+).+"
    data = []
    # Read the CSV file and extract latitude and longitude values
    with open(csv_file, 'r') as f:
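
The preview above stops before the pattern is applied, so here is a minimal sketch of how a `!3d<lat>!4d<lng>` pair could be pulled out of a single URL. It assumes the URLs are Google Maps place links (the source does not say so), and it uses an anchor-free variant of the pattern with `re.search`, because the trailing `.+` in the original can silently drop the last digit when the longitude ends the string. The helper name `parse_lat_long` and the sample URL are hypothetical.

```python
import re

# Anchor-free variant of the pattern above (an assumption, not the original):
# matches "!3d<lat>!4d<lng>" anywhere in the string.
LAT_LNG = re.compile(r"!3d(-?\d+\.\d+)!4d(-?\d+\.\d+)")


def parse_lat_long(url):
    """Return (lat, lng) as floats, or None if no !3d/!4d pair is present."""
    match = LAT_LNG.search(url)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Hypothetical URL, made up for illustration.
sample = "https://www.google.com/maps/place/Example/data=!3m1!4b1!8m2!3d30.0444!4d31.2357"
print(parse_lat_long(sample))  # (30.0444, 31.2357)
```

Returning `None` for non-matching rows lets the caller decide whether to skip or log them while iterating over the CSV.
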
import duckdb

con = duckdb.connect('./duckdb.duckdb')
data = """
SET memory_limit = '32GB';
SET threads TO 16;
SET enable_progress_bar = true;
SET enable_progress_bar_print = true;
INSTALL httpfs;
INSTALL spatial;
# dataset ref : https://www.kaggle.com/datasets/max-mind/world-cities-database/code
# pip install duckdb
import duckdb

con = duckdb.connect()
data = """
-- Environment setup
SET enable_progress_bar = true;
import httpx
import datetime
import hashlib
import hmac

# AWS credentials
access_key = "secret"
secret_key = "secret"
region = "eu-central-1"  # e.g. 'us-west-1'
bucket = "bucket-name"
import pyarrow as pa
import pyarrow.parquet as pq
from pathlib import Path
import json
import pandas as pd

def process_parquet_file(parquet_path):
    # Read the Parquet file into a PyArrow Table
    table = pq.read_table(parquet_path)
Overture Maps Data Downloader with Optional GeoJSON Clipping
This Python script downloads geospatial data from Overture Maps, based on user-defined themes and data types (e.g., buildings, transportation). The script can download specific data within a given bounding box (bbox) and export it in GeoParquet or GeoPackage format.
Features:
- Allows downloading multiple themes/types (e.g., buildings, transportation) using the overturemaps Python library.
- Bounding box (bbox) filtering to limit data to a specific geographic extent.
- Optionally clips geometries to the exact boundaries of a GeoJSON file, if provided.
- Supports output in GeoParquet or GeoPackage formats.
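
When a GeoJSON boundary is supplied, a bounding box for the download step can be derived from it before any clipping happens. The following is a minimal standard-library sketch of that idea, not the script's actual implementation. The function name `geojson_bbox` is hypothetical, and the `(west, south, east, north)` ordering is an assumption about what the download step expects.

```python
def geojson_bbox(geojson):
    """Return (west, south, east, north) covering every position in a GeoJSON object."""
    xs, ys = [], []

    def walk(node):
        if isinstance(node, (list, tuple)):
            # A position is a list whose first two items are numbers: [lon, lat, ...].
            if len(node) >= 2 and all(isinstance(v, (int, float)) for v in node[:2]):
                xs.append(node[0])
                ys.append(node[1])
            else:
                for child in node:
                    walk(child)
        elif isinstance(node, dict):
            # Handles Geometry, GeometryCollection, Feature and FeatureCollection.
            for key in ("coordinates", "geometry", "geometries", "features"):
                if key in node:
                    walk(node[key])

    walk(geojson)
    if not xs:
        raise ValueError("no coordinates found in GeoJSON object")
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical boundary, made up for illustration.
boundary = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[31.2, 30.0], [31.3, 30.0], [31.3, 30.1], [31.2, 30.1], [31.2, 30.0]]],
    },
}
print(geojson_bbox(boundary))  # (31.2, 30.0, 31.3, 30.1)
```

The bbox only narrows what gets downloaded. Features that fall inside the box but outside the polygon still need the separate clipping step.
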
__doc__ = """
# Overview
This example shows how to open High Resolution Rapid Refresh (HRRR) meteorology
"surfaces." These surfaces are slices in vertical space that yield single-level
maps of variables like temperature, wind speed components, pressure, etc. Each
surface is opened using the Zarr archive listed in Amazon's open data registry
at https://registry.opendata.aws/noaa-hrrr-pds/.
# Examples