Skip to content

Instantly share code, notes, and snippets.

What would you like to do?
How to turn a list of JSON objects into a Datasette

How to turn a list of JSON objects into a Datasette

This repository has a dataset of 184.879 crimes committed in Buenos Aires:

Download the raw data like this:

wget ''

Now use Pandas to load that into a dataframe:

>>> import pandas as pd
>>> df = pd.read_json(open('delitos.json'))
>>> df
                   barrio  cantidad_vehiculos  cantidad_victimas     comuna  \
0                FLORESTA                   2                  0  Comuna 10   
1                 PALERMO                   1                  0  Comuna 14   
2            VILLA CRESPO                   3                  0  Comuna 15   
3       PARQUE AVELLANEDA                   1                  0   Comuna 9   
4        VILLA GRAL MITRE                   1                  0  Comuna 11   
5                FLORESTA                   2                  0  Comuna 10   
6        PARQUE CHACABUCO                   1                  0   Comuna 7   
7                RECOLETA                   4                  0   Comuna 2   
8                  FLORES                   2                  0   Comuna 7   
9           NUEVA POMPEYA                   2                  0   Comuna 4   

Pandas can save that file to a SQLite database like so:

>>> import sqlite3
>>> conn = sqlite3.connect('delitos.db')
>>> df.to_sql('delitos', conn)

Now you can deploy that SQLite file as a web-based API using datasette publish. You'll need to install Zeit Now and Datasette first.

Include source information with command-line options:

datasette publish now delitos.db --source_url="" --source="ramadis/delitos-caba"
> Deploying /private/var/folders/jj/fngnv0810tn2lt_kd3911pdc0000gp/T/tmpcagpkk5f/datasette under simonw
> Ready! (copied to clipboard) [6s]
> Synced 2 files (399B) [0ms] 
> ...

A few seconds later:

Let's give it a nice URL:

now alias

Now you can visit it at

Here's an example SQL query showing the most common barrios:

Copy link

Slowly commented Feb 17, 2020

Yes, you can use pandas read_json - IF it is heeeasssyy and sssiiiiiimple. But if you've got a very deeply nested json datasource that isn't a file and has lists mixed in with dictionaries - it won't work. For example, a graphql query result from a Tableau metadata API call.

Copy link

simonw commented Sep 19, 2022

Since writing this my sqlite-utils CLI tool has grown the ability to do this as a one-liner:

curl -L '' | \
 sqlite-utils insert delitos.db delitos - --pk id

You can see the schema like this:

sqlite-utils schema delitos.db
CREATE TABLE [delitos] (
   [comuna] TEXT,
   [barrio] TEXT,
   [latitud] FLOAT,
   [longitud] FLOAT,
   [fecha] TEXT,
   [hora] TEXT,
   [uso_arma] TEXT,
   [uso_moto] TEXT,
   [lugar] TEXT,
   [origen_dato] TEXT,
   [tipo_delito] TEXT,
   [cantidad_vehiculos] INTEGER,
   [cantidad_victimas] INTEGER

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment