@jrmoran
Created September 20, 2011 04:48
;; Jaime Moran 2011
(ns parser
  (:use [clojure.contrib.duck-streams :only (read-lines)]
        clojure.contrib.json))
;; This processes a CSV file containing energy consumption data for
;; all US states from 1960 to 2009:
;;
;;   MSN,StateCode,Year,Data
;;   "ABICB","AK","1960",0
;;   "ABICB","AK","1961",0
;;   "ABICB","AK","1962",0
;;   ....
;;
;; and produces the following JSON structure per code:
;;
;;   "ABICB" : { 1960 : { AK: 0,
;;                        AL: 0,
;;                        AR: 0 },
;;               1961 : { ... }
;;             }
;;
;; Only the codes listed in `codes` are kept. In the output, each code
;; is replaced by its label from `codes` and each value is divided by 1000.
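(defn process-file
  "Reads the CSV file and returns a nested map of label -> year -> state -> value."
  [file-name codes initial-year]
  ;; letfn keeps the helpers local; a nested defn would create global vars.
  (letfn [(process-line [line]
            ;; Strip the quotes and split the row into its four fields.
            (let [[code state year datum] (.split (.replace line "\"" "") ",")
                  year (Integer/parseInt year)]
              ;; Rows with an unknown code or an earlier year yield nil.
              (when (and (contains? codes code) (>= year initial-year))
                {:code  (codes code)
                 :year  year
                 :state state
                 :datum (/ (Float/parseFloat datum) 1000)})))
          (add-to-data [data dm]
            (assoc-in data [(:code dm) (:year dm) (:state dm)] (:datum dm)))]
    ;; Skip the header row, drop the filtered rows, and build the nested map.
    (reduce add-to-data {}
            (remove nil? (map process-line (rest (read-lines file-name)))))))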
(def codes {"TETCB" "Total"
            "FFTCB" "Fossil Fuels"
            "CLTCB" "Coal"
            "NGTCB" "Natural Gas"
            "PMTCB" "Petroleum"
            "NUETB" "Nuclear Energy"
            "RETCB" "Renewable Energy"
            "ELISB" "Flow"
            "ELIMB" "Imports"
            "TERCB" "Residential"
            "TECCB" "Commercial"
            "TEICB" "Industrial"
            "TEACB" "Transportation"})
(if (.exists (java.io.File. "Complete_SEDS.csv"))
  (clojure.contrib.duck-streams/spit
   "data.json"
   (json-str (process-file "Complete_SEDS.csv" codes 2000)))
  (println (str "Make sure to place the file Complete_SEDS.csv "
                "in the same directory as this program")))