Rough Clojure regex for parsing CacheFly logs

gistfile1.clj
Clojure
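;; clojure.string is used below to join the pieces; load it explicitly.
(require 'clojure.string)

;; A double-quoted field; captures the text between the quotes.
(def delimited-str "\"([^\"]*)\"")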
(def not-whitespace-group "(\\S+)")
(def not-whitespace "\\S+")
(def digits "([0-9]+)")
;; A [bracketed] field; captures everything up to the closing bracket.
(def bracketed-group "\\[([^\\]]+)\\]")
 
;; One piece per space-separated log field, in order.
(def rx-pieces
  [not-whitespace-group ;; host %h
   not-whitespace       ;; ident %l (unused, so not captured)
   not-whitespace-group ;; user %u
   bracketed-group      ;; time %t
   delimited-str        ;; request "%r"
   digits               ;; status %>s
   not-whitespace-group ;; size %b (careful, can be '-')
   delimited-str        ;; referer "%{Referer}i"
   delimited-str        ;; user agent "%{User-agent}i"
   delimited-str        ;; cachefly uid
   delimited-str        ;; cachefly POP
   delimited-str        ;; cachefly time in seconds
   delimited-str        ;; cachefly request country for source IP
   delimited-str        ;; cachefly isp/provider for source IP
   delimited-str])      ;; conn type
 
 
;; join already returns a single string, so no (apply str ...) is needed.
(def line-parser-rx
  (re-pattern
    (clojure.string/join \space rx-pieces)))
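
A minimal sketch of how the pattern might be used, assuming each log line must match in full. The names `field-names`, `parse-line`, `size->bytes` and `sample-line` are hypothetical and not part of the gist, and the sample line is made up for illustration rather than taken from real CacheFly output. The pieces are repeated in compact form so the sketch runs on its own.

```clojure
(require '[clojure.string :as string])

;; Same pieces as above, repeated so this sketch is self-contained.
(def delimited-str "\"([^\"]*)\"")
(def not-whitespace-group "(\\S+)")
(def not-whitespace "\\S+")
(def digits "([0-9]+)")
(def bracketed-group "\\[([^\\]]+)\\]")

(def line-parser-rx
  (re-pattern
    (string/join \space
      (concat [not-whitespace-group not-whitespace not-whitespace-group
               bracketed-group delimited-str digits not-whitespace-group]
              (repeat 8 delimited-str)))))

;; Hypothetical keys, one per capture group, in the order of the pieces.
(def field-names
  [:host :user :time :request :status :size :referer :user-agent
   :uid :pop :seconds :country :isp :conn-type])

(defn parse-line
  "Returns a map of field name to captured string, or nil if the line
  does not match the pattern."
  [line]
  (when-let [[_ & groups] (re-matches line-parser-rx line)]
    (zipmap field-names groups)))

(defn size->bytes
  "Converts the %b field to a number; returns nil when it is '-'."
  [s]
  (when-not (= "-" s)
    (Long/parseLong s)))

;; Made-up sample line, not real CacheFly output.
(def sample-line
  (str "203.0.113.7 - - [10/Oct/2012:13:55:36 -0700] "
       "\"GET /file.mp3 HTTP/1.1\" 200 2326 "
       "\"-\" \"ExampleAgent/1.0\" "
       "\"uid123\" \"pop1\" \"0.042\" \"US\" \"Example ISP\" \"broadband\""))

(def parsed (parse-line sample-line))
```

All captured values are strings, so numeric fields such as the status or the size need an explicit conversion; `size->bytes` shows one way to handle the `-` case noted in the comments above.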
