Rough Clojure regex for parsing CacheFly logs
(def delimited-str "\"([^\"]*)\"")
(def not-whitespace-group "(\\S+)")
(def not-whitespace "\\S+")
(def digits "([0-9]+)")
(def bracketed-group "\\[([^\\]]+)\\]") ;; stop at the first closing bracket
(def rx-pieces [
not-whitespace-group ;; host %h
not-whitespace ;; identd logname %l (unused, not captured)
not-whitespace-group ;; user %u
bracketed-group ;; time %t
delimited-str ;; request "%r"
digits ;; status %>s
not-whitespace-group ;; size %b (careful, can be '-')
delimited-str ;; referer "%{Referer}i"
delimited-str ;; user agent "%{User-agent}i"
delimited-str ;; cachefly uid
delimited-str ;; cachefly POP
delimited-str ;; cachefly time in seconds
delimited-str ;; cachefly request country for source IP
delimited-str ;; cachefly isp/provider for source IP
delimited-str ;; conn type
])
;; Join the pieces with single spaces and compile them into one pattern.
(def line-parser-rx
  (re-pattern (clojure.string/join \space rx-pieces)))
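
A possible way to use the pattern, sketched under a few assumptions: the `field-names` keys, the `parse-line` helper, and the sample log line are all made up for illustration and are not taken from real CacheFly output. The regex pieces are restated so the snippet runs on its own.

```clojure
(require '[clojure.string :as str])

;; Pieces restated from the listing above so this snippet is self-contained.
(def delimited-str "\"([^\"]*)\"")
(def not-whitespace-group "(\\S+)")
(def not-whitespace "\\S+")
(def digits "([0-9]+)")
(def bracketed-group "\\[([^\\]]+)\\]")

(def line-parser-rx
  (re-pattern
   (str/join \space
             (concat [not-whitespace-group   ; host
                      not-whitespace         ; %l, not captured
                      not-whitespace-group   ; user
                      bracketed-group        ; time
                      delimited-str          ; request
                      digits                 ; status
                      not-whitespace-group]  ; size
                     ;; referer, user agent, and the six CacheFly fields
                     (repeat 8 delimited-str)))))

;; Hypothetical key names, one per capture group, in order.
(def field-names
  [:host :user :time :request :status :size :referer :user-agent
   :uid :pop :seconds :country :isp :conn-type])

(defn parse-line
  "Returns a map of fields for a matching line, or nil when the line
  does not match. :size becomes nil when the log shows '-'."
  [line]
  (when-let [[_ & groups] (re-matches (re-pattern line-parser-rx) line)]
    (-> (zipmap field-names groups)
        (update :status #(Integer/parseInt %))
        (update :size #(when-not (= "-" %) (Long/parseLong %))))))

;; Invented sample line, shaped to follow the field order above.
(def sample-line
  (str "203.0.113.7 - - [10/Oct/2012:13:55:36 +0000] "
       "\"GET /file.zip HTTP/1.1\" 200 - \"-\" \"curl/7.24.0\" "
       "\"abc123\" \"ord\" \"0.042\" \"US\" \"Example ISP\" \"broadband\""))

(parse-line sample-line)
```

`re-matches` is used rather than `re-find` so that a line with trailing, unexpected fields is rejected instead of being partially parsed; whether that strictness is wanted depends on how stable the log format is.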