Skip to content

Instantly share code, notes, and snippets.

@lukassup
Last active October 5, 2016 20:36
Show Gist options
  • Save lukassup/43306d0accc2a1eebd4a8dcfaab7a1e2 to your computer and use it in GitHub Desktop.
Save lukassup/43306d0accc2a1eebd4a8dcfaab7a1e2 to your computer and use it in GitHub Desktop.

SQLite3 CSV import tutorial

Start up the sqlite3 shell. I am going to use fps.db as my database file:

$ sqlite3 fps.db
SQLite version 3.14.2 2016-09-12 18:50:49
Enter ".help" for usage hints.
sqlite> 

Enable csv mode:

sqlite> .mode csv

Fetch the dataset. Skip this if you have already downloaded it beforehand:

sqlite> .shell curl -sSO https://raw.githubusercontent.com/washingtonpost/data-police-shootings/master/fatal-police-shootings-data.csv

Import the dataset to a table called fps:

sqlite> .import fatal-police-shootings-data.csv fps

Set a more user-friendly table display mode (similar to the one in MariaDB shell) and set null value display to NULL:

sqlite> .mode column
sqlite> .header on
sqlite> .null NULL

Verify that the data was imported:

sqlite> SELECT * FROM fps LIMIT 10;
id          name        date        manner_of_death  armed       age         gender      race        city        state       signs_of_mental_illness  threat_level  flee         body_camera
----------  ----------  ----------  ---------------  ----------  ----------  ----------  ----------  ----------  ----------  -----------------------  ------------  -----------  -----------
3           Tim Elliot  2015-01-02  shot             gun         53          M           A           Shelton     WA          True                     attack        Not fleeing  False      
4           Lewis Lee   2015-01-02  shot             gun         47          M           W           Aloha       OR          False                    attack        Not fleeing  False      
5           John Paul   2015-01-03  shot and Tasere  unarmed     23          M           H           Wichita     KS          False                    other         Not fleeing  False      
8           Matthew Ho  2015-01-04  shot             toy weapon  32          M           W           San Franci  CA          True                     attack        Not fleeing  False      
9           Michael Ro  2015-01-04  shot             nail gun    39          M           H           Evans       CO          False                    attack        Not fleeing  False      
11          Kenneth Jo  2015-01-04  shot             gun         18          M           W           Guthrie     OK          False                    attack        Not fleeing  False      
13          Kenneth Ar  2015-01-05  shot             gun         22          M           H           Chandler    AZ          False                    attack        Car          False      
15          Brock Nich  2015-01-06  shot             gun         35          M           W           Assaria     KS          False                    attack        Not fleeing  False      
16          Autumn Ste  2015-01-06  shot             unarmed     34          F           W           Burlington  IA          False                    other         Not fleeing  True       
17          Leslie Sap  2015-01-06  shot             toy weapon  47          M           B           Knoxville   PA          False                    attack        Not fleeing  False      
changes:   1   total_changes: 1723

Let's calculate some interesting stats from the dataset:

sqlite> SELECT armed, COUNT(*) FROM fps GROUP BY armed ORDER BY COUNT(*) DESC LIMIT 10;
armed       COUNT(*)  
----------  ----------
gun         970       
knife       251       
unarmed     136       
vehicle     98        
undetermin  75        
toy weapon  63        
unknown we  14        
machete     12        
box cutter  7         
sword       7         
changes:   1   total_changes: 1723
sqlite> SELECT race, COUNT(*) FROM fps GROUP BY race ORDER BY COUNT(*) DESC LIMIT 10;
race        COUNT(*)  
----------  ----------
W           826       
B           436       
H           286       
            110       
A           23        
O           22        
N           20        
changes:   1   total_changes: 1723

NOTE

There is one gotcha with the CSV imports in SQLite -- the default schema sets TEXT for all columns so there are no real NULL values imported, even where is says NULL in the CSV field.

sqlite> SELECT DISTINCT flee, TYPEOF(flee) FROM fps;
flee         TYPEOF(flee)
-----------  ------------
Not fleeing  text        
Car          text        
Foot         text        
Other        text        
             text        
changes:   0   total_changes: 0
sqlite> .schema fps
CREATE TABLE fps(
  "id" TEXT,
  "name" TEXT,
  "date" TEXT,
  "manner_of_death" TEXT,
  "armed" TEXT,
  "age" TEXT,
  "gender" TEXT,
  "race" TEXT,
  "city" TEXT,
  "state" TEXT,
  "signs_of_mental_illness" TEXT,
  "threat_level" TEXT,
  "flee" TEXT,
  "body_camera" TEXT
);

To work around this issue you have to define a schema manually and/or replace empty column fields with NULL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment