The NYPD just released their 2014 Stop and Frisk data, and in CSV format too! Here's some Bash commands to get the data and scope it out (see the previous guide here on how to graph it with R)
Using bash:
baseurl="http://www.nyc.gov/html/nypd/downloads/zip/analysis_and_planning/YYYY_sqf_csv.zip"
for year in $(seq 2003 2014); do
url=$(echo $baseurl | sed "s/YYYY/$year/")
echo $url
curl -sO $url
done
unzip -o "*.zip"
wc -l *.csv
2014 comes in at 45,787 reports, which is nearly 5 times fewer than in 2013 (and more than 10 times fewer than 2012):
160852 2003.csv
313524 2004.csv
398192 2005.csv
506492 2006.csv
472097 2007.csv
540303 2008.csv
581169 2009.csv
601286 2010.csv
685725 2011.csv
532912 2012.csv
191852 2013.csv
45788 2014.csv
5030192 total
Let's see if they made the headers all uniform, which would make this an easy import into a database:
for year in $(seq 2003 2014); do
fname="$year.csv"
head -n 1 $fname | tr ',' $'\n' | tr '[:upper:]' '[:lower:]'
done | sort | uniq -c | sort -rn
BLAAARGH!!!!
12 zip
12 year
12 ycoord
12 xcoord
12 weight
12 typeofid
12 trhsloc
12 timestop
12 sumoffen
12 sumissue
12 state
12 sex
12 ser_num
12 sector
12 searched
12 sb_outln
12 sb_other
12 sb_hdobj
12 sb_admis
12 riflshot
12 rf_verbl
12 rf_vcrim
12 rf_vcact
12 rf_rfcmp
12 rf_othsw
12 rf_knowl
12 rf_furt
12 rf_bulg
12 rf_attir
12 revcmd
12 repcmd
12 recstat
12 radio
12 race
12 post
12 pistol
12 pf_wall
12 pf_ptwep
12 pf_pepsp
12 pf_other
12 pf_hcuff
12 pf_hands
12 pf_grnd
12 pf_drwep
12 pf_baton
12 perstop
12 perobs
12 pct
12 othrweap
12 othpers
12 othfeatr
12 offverb
12 offunif
12 offshld
12 officrid
12 machgun
12 knifcuti
12 inout
12 ht_inch
12 ht_feet
12 haircolr
12 frisked
12 eyecolor
12 explnstp
12 dob
12 datestop
12 cs_vcrim
12 cs_other
12 cs_objcs
12 cs_lkout
12 cs_furtv
12 cs_drgtr
12 cs_descr
12 cs_cloth
12 cs_casng
12 cs_bulge
12 crimsusp
12 contrabn
12 compyear
12 comppct
12 city
12 build
12 beat
12 asltweap
12 arstoffn
12 arstmade
12 aptnum
12 age
12 adtlrept
12 addrtyp
12 ac_time
12 ac_stsnd
12 ac_rept
12 ac_proxm
12 ac_other
12 ac_inves
12 ac_incid
12 ac_evasv
12 ac_cgdir
12 ac_assoc
11 stname
11 stinter
11 rescode
11 premtype
11 premname
11 linecm
11 dettypcm
11 detailcm
11 crossst
11 addrpct
11 addrnum
4 forceuse
1 wepfound
1 strname
1 strintr
1 rescod
1 prenam
1 premtyp
1 dettyp_c
1 details_
1 detail1_
1 crossst
1 adrpct
1 adrnum
I'll deal with this tomorrow. It'll probably use the same transformations as the ones I did on the previous files