Skip to content

Instantly share code, notes, and snippets.

@ian29
Created January 19, 2012 04:25
Show Gist options
  • Save ian29/1637850 to your computer and use it in GitHub Desktop.
Save ian29/1637850 to your computer and use it in GitHub Desktop.

Converting to SQLite

Importing the spatially-explicit tables (TMsch.csv and TMlea.csv, in the form of simple lat/long coordinates) into QGIS via the'Add Delimited Text Layer' plugin, allows for a first look at the data. Selecting 'Save layer as vector file..' from the Layers menu. In the dialog box, select the following parameters:

Format: SQLite
Encoding: UTF-8
CRS: Google Mercator EPSG:900913

Import the non-spatial .csv tables into SQLite database files using a database inspector like Base or from the command line using SQL for SQLite. Be sure to import all six tables into one database.

Cleaning the Data

Non-spatial tables

Once the tables have been imported, you will need to process and clean the datable to make it both useable and consumeable. First you will need to execute a query to join each year-specific table into one table.

See the query I used to aggregate school districts into one layer here.

Using combination of sub-select and left joins accomplishes this without losing records (schools in this case) that don't exist in all years. Note that we are selecting only 4th, 8th and 12th grades. Additionally, we are changing raw percentages into more legible numbers, and casting them as strings in order to concatenate a percentage sign. Lastly an alias is used to distinguish that this is the processed column.

A side-effect of the above query is that all NULL values are shown as '0.0%'. To correct this run the these updates, iterated for every affected column.

###Spatial tables

In order to display similar formatting in Tmsch.sqlite, use the following set of SQL statements:

ALTER TABLE tmsch ADD COLUMN pct_gradrate07;  
ALTER TABLE tmsch ADD COLUMN pct_gradrate08;  
ALTER TABLE tmsch ADD COLUMN pct_gradrate09;  

UPDATE tmsch SET pct_gradrate07 = cast(round((all_gradrate07 * 100), 1) as text);  
UPDATE tmsch SET pct_gradrate08 = cast(round((all_gradrate08 * 100), 1) as text);  
UPDATE tmsch SET pct_gradrate09 = cast(round((all_gradrate09 * 100), 1) as text);  

UPDATE tmsch SET pct_gradrate07 = 'n/a' where pct_gradrate07 = '0.0%';   
UPDATE tmsch SET pct_gradrate08 = 'n/a' where pct_gradrate08 = '0.0%';  
UPDATE tmsch SET pct_gradrate09 = 'n/a' where pct_gradrate09 = '0.0%';   

Making an dynamic, interactive map.

Conditional grade levels

In order to ensure that only elementary schools on the map display only 4th grade reading and math scores and not 8th or 12th scores (which presumably be null), we need to create columns that flag if either reading or math exist for each grade level for the 09-10 school year. In ed_data.sqlite:

ALTER TABLE clean_sch ADD COLUMN e04;  
ALTER TABLE clean_sch ADD COLUMN e08;  
ALTER TABLE clean_sch ADD COLUMN e12;  
UPDATE clean_sch SET e04 = (all_math04_0910||all_read04_0910);  
UPDATE clean_sch SET e04 = (all_math04_0910||all_read04_0910);  
UPDATE clean_sch SET e04 = (all_math04_0910||all_read04_0910);   

This should create a flag where either reading or math scores exist that will instruct the tooltip to display information for that grade level.

###Classes based on

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment