@amalloy
Forked from mattdeboard/extract.clj
Last active December 18, 2015 18:49
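;; Assumption: the method names used below match iText 5, whose
;; RenderListener lives in com.itextpdf.text.pdf.parser.
(import '(com.itextpdf.text.pdf.parser RenderListener))
(require 'clojure.java.io)

(defn extract-render-listener
  "A RenderListener implementation that extracts images from a PDF and
  writes them to disk."
  [path]
  (reify RenderListener
    (renderImage [_ render-info]
      (let [image (.getImage render-info)
            ref-number (-> render-info (.getRef) (.getNumber))
            file-type (.getFileType image)
            ;; path is used as a plain string prefix, so it should end
            ;; with a file separator.
            file-name (str path ref-number "." file-type)]
        (with-open [os (clojure.java.io/output-stream file-name)]
          (.write os (.getImageAsBytes image))
          (.flush os))))
    (beginTextBlock [_] nil)
    (endTextBlock [_] nil)
    (renderText [_ render-info] nil)))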
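;; `coords` is assumed to be a ref holding a map, defined elsewhere,
;; for example (def coords (ref {})). It is not shown in this gist.
(defn matrix-render-listener
  "A RenderListener implementation that updates shared mutable state with
  nested vectors of image coordinate data."
  [page]
  (reify RenderListener
    (renderImage [_ render-info]
      (let [matrix (.getImageCTM render-info)
            ;; Matrix indices 6, 7, 0, 4 give x, y, width, and height for
            ;; an unrotated image. The local is named image-coords so it
            ;; does not shadow the shared `coords` ref.
            image-coords (mapv #(.get matrix %) [6 7 0 4])
            pagekey (keyword (str page))]
        ;; fnil starts each page with a vector, so the result is nested
        ;; vectors as the docstring promises.
        (dosync (alter coords update-in [pagekey] (fnil conj []) image-coords))))
    (beginTextBlock [_] nil)
    (endTextBlock [_] nil)
    (renderText [_ render-info] nil)))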
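
The shared-state update inside matrix-render-listener can be tried at a REPL without a PDF or iText. The following is only a minimal sketch: the names `page-coords` and `record-coords!` are hypothetical and do not come from the gist, and the coordinate values are made up. Note that the ref needs a name distinct from any local binding inside the `let`, since a local with the same name would shadow it.

```clojure
;; Hypothetical stand-in for the shared state (name is an assumption).
(def page-coords (ref {}))

(defn record-coords!
  "Appends one [x y w h] vector under the keyword for `page`."
  [page coords]
  (dosync
    ;; fnil supplies an empty vector the first time a page is seen.
    (alter page-coords update-in [(keyword (str page))] (fnil conj []) (vec coords))))

;; Made-up sample data: two images on page 1, one on page 2.
(record-coords! 1 [10.0 20.0 100.0 50.0])
(record-coords! 1 [0.0 0.0 30.0 30.0])
(record-coords! 2 '(5.0 5.0 8.0 8.0))

@page-coords
;; => {:1 [[10.0 20.0 100.0 50.0] [0.0 0.0 30.0 30.0]], :2 [[5.0 5.0 8.0 8.0]]}
```

Converting each entry with `vec` matters here: a lazy sequence stored in a ref would only be realized when first read, possibly long after the transaction that stored it.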