Skip to content

Instantly share code, notes, and snippets.

@Youssef-Harby
Created December 24, 2023 08:41
Show Gist options
  • Save Youssef-Harby/727d07c27f8e090408756188ff56f5b1 to your computer and use it in GitHub Desktop.
Save Youssef-Harby/727d07c27f8e090408756188ff56f5b1 to your computer and use it in GitHub Desktop.
Converting CSV to Any Geospatial Format Using DuckDB and DuckDB Spatial Extension.
# dataset ref : https://www.kaggle.com/datasets/max-mind/world-cities-database/code
# pip install duckdb
import duckdb
con = duckdb.connect()
data = """
-- Environment setup
SET enable_progress_bar = true;
SET enable_progress_bar_print = true;
INSTALL httpfs; -- Only if HTTP file system is needed
INSTALL spatial; -- Required for spatial functions
LOAD httpfs; -- Only if HTTP file system is needed
LOAD spatial; -- Required for spatial functions
-- Creating a table schema matching the CSV structure
CREATE TABLE worldcities (
Country VARCHAR,
City VARCHAR,
AccentCity VARCHAR,
Region VARCHAR,
Population DOUBLE,
Latitude DOUBLE,
Longitude DOUBLE
);
-- Loading data from CSV into the table
COPY worldcities FROM 'https://link.storjshare.io/raw/jwh4kesndug5ooocnxqr3hytw4wa/truemaps-public%2Fpublic-datasets%2Fmaxmind%2Fworldcitiespop.csv.gz' (DELIMITER ',', HEADER, IGNORE_ERRORS);
-- Creating and exporting GeoJSON (or other formats as specified by gdal_driver)
COPY (
SELECT
*, -- Selects all fields from the table
ST_GeomFromText('POINT(' || Longitude || ' ' || Latitude || ')') AS geom
FROM worldcities
) TO 'worldcitiespop.geojson' WITH (FORMAT GDAL, DRIVER 'GeoJSON', SRS 'EPSG:4326');
"""
results = con.execute(data).fetchall()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment