@brweber2
Forked from djKianoosh/gist:2648751
Created May 10, 2012 00:09
Some Clojure functions to help read IIS log files into maps
(defn comment? [s]
  (.startsWith s "#"))

(defn not-comment? [s]
  (not (comment? s))) ; (-> s comment? not) also works; just showing an alternative, what you had originally is fine.

(defn remove-comments [file-contents]
  (filter not-comment? file-contents))

(defn nil-if-hyphen [s]
  (if (not= s "-") s)) ; there is also if-not, but I prefer the way you wrote it

(defn str->int
  "Returns an int if the string parses as an int, otherwise returns input unaltered"
  [s] ; s rather than str, so the parameter does not shadow clojure.core/str
  (if (re-matches #"\d+" s)
    (read-string s) ; the regex match guards this call, but be careful: if a non-integer ever gets here you've got exploit heaven... :)
    s))
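
;; A possible alternative to the read-string call above, as a sketch rather than
;; the original author's code: parse with Long/parseLong so the Clojure reader is
;; never run on log data. The name parse-long-or-self is hypothetical.
;; It also behaves differently on leading zeros: read-string reads "010" as
;; octal 8 and throws on "08", while Long/parseLong returns 10 and 8.
(defn parse-long-or-self
  "Returns a long if s is all digits and fits in a long, otherwise returns s unaltered."
  [s]
  (if (re-matches #"\d+" s)
    (try
      (Long/parseLong s)
      ;; digit strings too large for a long fall back to the original string
      (catch NumberFormatException _ s))
    s))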
;; #Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken

(defn is-format-line? [s]
  (.startsWith s "#Fields:"))

(defn find-first-format-line [lines]
  (first (filter is-format-line? lines))) ; the take 1 should be unnecessary: filter returns a lazy seq and first already takes just the one element

(defn read-format-into-keywords [s]
  (map keyword (filter not-comment? (.split s " "))))

(defn read-format-from-file [f]
  (with-open [rdr (clojure.java.io/reader f)] ; with-open closes the reader once the format line has been found
    (read-format-into-keywords (find-first-format-line (line-seq rdr)))))

(defn zipmap-line-data
  "Returns a map with the keywords mapped to data from a log line."
  [form line]
  (let [line-data (map str->int (.split line " "))]
    (zipmap form line-data))) ; I'd suggest calling form something more like column-names
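
;; IIS writes "-" for a field that has no value, and nil-if-hyphen above is never
;; called anywhere in this gist. A sketch of where it could fit, written
;; self-contained under hypothetical names (clean-field, line->map): turn "-"
;; into nil before zipping the values onto the column names.
(defn clean-field [s]
  (when (not= s "-") s))

(defn line->map [column-names line]
  (zipmap column-names (map clean-field (.split line " "))))

(comment
  (line->map [:cs-username :sc-status] "- 200")
  ;; => {:cs-username nil, :sc-status "200"}
  )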

(defn read-data-from-file [file]
  (let [form (read-format-from-file file)
        file-without-comments (remove-comments (line-seq (clojure.java.io/reader file)))]
    (map (partial zipmap-line-data form) file-without-comments))) ; partial works nicely when the varying argument goes on the end. Not sure if this is more or less readable...
; you are reading the file twice, once just for the format line... if it is always early in the file that's fine...
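
;; One way to avoid the second read, as a sketch rather than the original
;; author's code: open the file once, pick up the #Fields: line as it goes by,
;; and build the rows eagerly so nothing lazy outlives with-open.
;; read-log-once is a hypothetical name. This assumes the #Fields: line comes
;; before the data lines, and it leaves every value as a string.
(require '[clojure.java.io :as io]
         '[clojure.string :as string])

(defn read-log-once [file]
  (with-open [rdr (io/reader file)]
    (loop [lines (line-seq rdr), columns nil, rows []]
      (if-let [line (first lines)]
        (cond
          ;; header line: drop the "#Fields:" token and keep the column names
          (.startsWith line "#Fields:")
          (recur (rest lines) (map keyword (rest (string/split line #" "))) rows)

          ;; any other comment line is skipped
          (.startsWith line "#")
          (recur (rest lines) columns rows)

          ;; data line: zip the values onto the most recent column names
          :else
          (recur (rest lines) columns (conj rows (zipmap columns (string/split line #" ")))))
        rows))))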