biggie
refine slides
carto slides
big is by Tom MacWright
greetings
tonight’s civic hacking topic: parsing messy data with OpenRefine
(and mapping it!)
I’m Elliott
OpenRefine, fka Google Refine (history chatter)
still branded Google Refine
but you can download here
So, govt data is usually exported out of a database or old mainframe (advance through slides while talking)
It could be ALL CAPS, have mashed-up columns
Have weird formatting (like dates) and wonky location data
Refine can fix all of these easily
What is it?
Runs as a local server
Input is in the browser
Processing is local
Fast, free
Open URL to arrests data
download CSV (CSV is good)
open Refine
create project
go over import screen
- ignore rows
- headings
talk about interface
show how to easily
- change case
- create facets
do example with blank district
numeric facet on age
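For anyone reading these notes without Refine open: a facet is roughly a count of the distinct values in a column. The sketch below imitates a text facet and a numeric facet in Python. The rows and the column names (district, age) are made-up stand-ins for the arrests CSV, and this is not how Refine itself implements facets.

```python
from collections import Counter

# Hypothetical rows standing in for the arrests CSV; column names are assumptions.
rows = [
    {"district": "CENTRAL", "age": "34"},
    {"district": "", "age": "19"},
    {"district": "central", "age": "52"},
    {"district": "NORTHERN", "age": ""},
]

def text_facet(rows, column):
    """Count each distinct value in a column, labelling empty cells as (blank)."""
    return Counter(row[column].strip() or "(blank)" for row in rows)

def numeric_facet(rows, column, bin_size=10):
    """Bucket whole-number cells into bins; everything else is counted separately."""
    bins = Counter()
    for row in rows:
        cell = row[column].strip()
        if cell.isdigit():
            low = int(cell) // bin_size * bin_size
            bins[f"{low}-{low + bin_size - 1}"] += 1
        else:
            bins["(non-numeric or blank)"] += 1
    return bins

# Change case first so CENTRAL and central land in the same facet bucket.
for row in rows:
    row["district"] = row["district"].title()

district_facet = text_facet(rows, "district")
age_facet = numeric_facet(rows, "age")
```

The (blank) bucket is the one to click on in the blank district example.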
Now, there are some programming things with refine but you can easily google + copy/paste to get most things done
combine date and time into ISO 8601, which will be used for mapping
check for blanks because we're doing a temporal map
convert date to ISO
Combine time and date to new column
create new column based on new column for refine date
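The date steps above, sketched outside Refine in Python. The input formats (MM/DD/YYYY dates and 24-hour HH:MM times) are assumptions about the export, so adjust the format string to whatever the file really contains. Blank cells return None, which mirrors the check for blanks before building the temporal map.

```python
from datetime import datetime

def to_iso8601(date_text, time_text):
    """Combine separate date and time cells into one ISO 8601 string.

    Assumes MM/DD/YYYY dates and 24-hour HH:MM times (an assumption,
    not something confirmed by the data). Returns None for blank cells.
    """
    date_text = date_text.strip()
    time_text = time_text.strip()
    if not date_text or not time_text:
        return None
    combined = datetime.strptime(f"{date_text} {time_text}", "%m/%d/%Y %H:%M")
    # No time zone designator is emitted, so the value is a naive local time.
    return combined.isoformat()

example = to_iso8601("01/05/2013", "23:30")
blank = to_iso8601("01/05/2013", "")
```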
Show timeline facet (no time zone designator support)
Capitalize CDS
Split charge column
Remove whitespace
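A rough Python equivalent of splitting a column and then cleaning up whitespace. The separator and the sample cell are hypothetical, since the real charge column may be delimited differently.

```python
import re

def split_and_trim(cell, separator, max_parts=2):
    """Split a cell on a separator, then trim and collapse whitespace in each part."""
    parts = cell.split(separator, max_parts - 1)
    return [re.sub(r"\s+", " ", part).strip() for part in parts]

# Hypothetical cell and separator, for illustration only.
parts = split_and_trim("  CODE 123 |  Some   Charge  Description ", "|")
```

Trimming after the split matters because the split usually leaves stray spaces on both sides of the separator.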
Now show cluster and edit!
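What cluster and edit is doing, roughly: values that reduce to the same key get grouped so they can be merged into one spelling. The sketch below is loosely modelled on the key collision (fingerprint) method: trim, lowercase, strip accents and punctuation, then sort the unique tokens. It is an approximation for illustration, not Refine's actual code, and the sample charge strings are made up.

```python
import re
import unicodedata
from collections import defaultdict

def fingerprint(value):
    """Reduce a string to a normalized key so near-duplicates collide."""
    value = value.strip().lower()
    # Decompose accented characters and drop the combining marks.
    value = unicodedata.normalize("NFKD", value)
    value = "".join(ch for ch in value if not unicodedata.combining(ch))
    # Remove punctuation, keeping word characters and whitespace.
    value = re.sub(r"[^\w\s]", "", value)
    # Sort the unique tokens so word order does not matter.
    return " ".join(sorted(set(value.split())))

def cluster(values):
    """Group distinct values sharing a fingerprint; keep groups with 2+ members."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]

# Made-up sample values.
clusters = cluster([
    "Assault, Second Degree",
    "second degree assault",
    "ASSAULT SECOND DEGREE ",
    "Burglary",
])
```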
split incidence
split location column
- del address by splitting on (
- split by comma
value.replace(")","")
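The location steps above, as one Python function. It assumes each cell looks like an address followed by "(lat, lon)", which is a guess at the layout. The sample address and coordinates are invented.

```python
def parse_location(cell):
    """Pull (lat, lon) out of a cell shaped like 'ADDRESS (lat, lon)'.

    The cell layout is an assumption. Returns None when the cell
    does not match it.
    """
    if "(" not in cell:
        return None
    # Split on the last "(" and drop the address part.
    _, _, coords = cell.rpartition("(")
    # Same idea as value.replace(")","") in Refine.
    coords = coords.replace(")", "")
    parts = [part.strip() for part in coords.split(",")]
    if len(parts) != 2:
        return None
    try:
        return float(parts[0]), float(parts[1])
    except ValueError:
        return None

# Invented sample cell.
point = parse_location("100 Example St (39.2904, -76.6122)")
```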
Export to CSV
go to cartodb.com
new table jazz
convert fields to number / date / etc
cool things are the viz wizards