JP Florent gave me an XLS file containing the statistic of the waste collected from 2010, 2011 and 2012. The initial file is not really machine friendly as the spreadsheet is used for human reading. This is another important point with open data. Usually a lot of claimed to be "open data" is in an format that you cannot easily parse and need a lot of conversion to be understandable by a machine.
First you need to clean-up the file to export it in a proper CSV format. The data contains the unit (To) in each cells. So you don't really need it as you need a numerical value as input and not a string. Here is an approach to automatically convert the exported "messy CSV" for each active sheet (one per year) with ";" as separator:
sed -e "s/To//g" dechet2012.csv | sed -e "s/\,//g" | sed -e "s/;/,/g" | sed -e "s/\"//g" >clean-dechet2012.csv
The conversion process is the following. First we must get rid of the units (To) in all the cells then we must remove the comma used for the representation of large units then we replace the default separator ";" with a "," and finally we remove the string escaping as we treat all the values as numerical except the first column as this is the name of the commune. We add a first line with the label of the column.
Final data looks like this:
Commune,Verres Commune,Papier-carton (porte), Dechet commune melange, Fraction seche (porte),Matiere organique (porte),encombrants (porte),baches agricoles,Parcs conteneurs ARLON,706.84 ,805.82 ,0.00 ,3307.66 ,1207.04 ,379.88 ,21.76 ,10954.69 ATTERT,74.95 ,41.64 ,0.00 ,455.86 ,149.48 ,5.86 ,21.30 ,3098.84 AUBANGE,328.70 ,121.54 ,168.68 ,1814.24 ,1329.66 ,7.36 ,12.66 ,6303.18 BASTOGNE,406.06 ,178.52 ,5.24 ,2369.96 ,592.50 ,101.70 ,104.56 ,8330.01 BERTOGNE,37.37 ,39.30 ,0.00 ,220.72 ,108.66 ,3.92 ,49.44 ,1271.37 BERTRIX,153.79 ,80.36 ,83.82 ,999.60 ,629.12 ,11.78 ,40.96 ,3573.76 BOUILLON,215.52 ,138.24 ,172.26 ,742.04 ,256.00 ,86.70 ,23.26 ,2459.41 CHINY,93.37 ,72.78 ,0.00 ,632.16 ,142.04 ,62.10 ,16.98 ,2545.57
With the provincial waste data, you can show the current repartition of waster per commune. You can also compare by commune but you will need the number of citizen per commune in order to have comparable dataset.
For comparing the repartition of waste collection, I wanted to show all communes into something comparable. I based my output on the following example from D3 http://bl.ocks.org/mbostock/3888852.
- http://transparency.ge/en/ten-open-data-guidelines - Ten Open Data Guidelines - Transparency International Georgia