@biggert
Created September 5, 2013 17:49
Get a Clojure reader that uses the encoding indicated by the file's BOM
(ns util
  (:require [clojure.java.io :as io])
  (:import (org.apache.commons.io ByteOrderMark)
           (org.apache.commons.io.input BOMInputStream)))

;; BOMs to detect, longest first, so a UTF-32LE BOM (FF FE 00 00) is never
;; matched by the UTF-16LE BOM (FF FE) that it starts with.
(def bom-array
  (into-array ByteOrderMark
              [ByteOrderMark/UTF_32BE
               ByteOrderMark/UTF_32LE
               ByteOrderMark/UTF_8
               ByteOrderMark/UTF_16BE
               ByteOrderMark/UTF_16LE]))

(defn bom-reader
  "Returns a reader on `file` that decodes with the charset named by the
  file's BOM, falling back to UTF-8 when no BOM is present. The BOM itself
  is skipped. The caller must close the returned reader."
  [file]
  (let [in       (BOMInputStream. (io/input-stream file)
                                  false      ; do not pass the BOM bytes through
                                  bom-array)
        bom      (.getBOM in)                ; nil when the stream has no BOM
        encoding (if bom (.getCharsetName bom) "UTF-8")]
    (io/reader in :encoding encoding)))
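
;; Usage sketch, not part of the original gist. It assumes Apache Commons IO is
;; on the classpath and this namespace is loaded. "data.txt" is a hypothetical
;; path. The caller owns the returned reader, so with-open closes it, and
;; doall realizes the lazy line-seq before that happens.
(comment
  (with-open [rdr (bom-reader "data.txt")]
    (doall (line-seq rdr))))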