@coyotespike
Last active February 13, 2016 20:51
(require '[clj-http.client :as client]
         '[clj-http.cookies :as cookies]
         '[net.cgrand.enlive-html :as html])

(defn login []
  (let [;;;; Use clj-http's cookie-store.
        cookie-store (cookies/cookie-store)
        ;;;; Get the CSRF token and cookies from the login page.
        login-page (client/get
                    "https://www.generic-website.com/login"
                    {:cookie-store cookie-store})
        body (html/html-resource (java.io.StringReader. (:body login-page)))
        token-div (html/select body [(html/attr= :name "authenticity_token")])
        ;;;; html/select returns a sequence of nodes; take the first match.
        token (get-in (first token-div) [:attrs :value])
        ;;;; Use that CSRF token, cookies, and credentials to actually log in.
        ;;;; The URL that accepts the login POST differs slightly from the login page URL.
        logmein (client/post "https://www.generic-website.com/sessions"
                             {:form-params {:email_address "me@clojure.com"
                                            :password "12345"
                                            :authenticity_token token}
                              :cookie-store cookie-store})]
    ;;;; (:status logmein) shows 302 (a redirect), which suggests the login succeeded.
    logmein))

(defn fetch-url [url]
  (-> url clj-http.client/get :body java.io.StringReader. html/html-resource))

;;;; Below:
;;;; First log in, then go get the secured page.
;;;; Why? Because clj-http's docs *seem* to say future GET requests will use the
;;;; acquired cookies automatically.
;;;; See https://github.com/dakrone/clj-http#cookies
(defn profile-scraper []
  (let [login (login)
        dom (fetch-url "http://www.generic-website.com/secured-page")
        address (html/select dom [(html/attr= :itemprop "address") html/text-node])]
    address))
;;;; Sadly, this function returns only non-secured information.
;;;; I have tried including {:cookie-store cookie-store} with the GET request as well.
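
;;;; A possible fix, as an untested sketch rather than a confirmed solution:
;;;; The cookie-store created inside `login` is local to its `let`, and
;;;; `fetch-url` makes its GET without any cookie store, so the session cookie
;;;; never reaches the secured page. As far as I can tell from the clj-http
;;;; docs, cookies are reused only when the same store is passed as
;;;; :cookie-store on each request (or bound to clj-http.core/*cookie-store*).
;;;; The names `login-with-store`, `fetch-url-with-store`, and
;;;; `profile-scraper-shared` are hypothetical, and the secured-page URL below
;;;; assumes the same host and scheme as the login page.
(require '[clj-http.client :as client]
         '[clj-http.cookies :as cookies]
         '[net.cgrand.enlive-html :as html])

(defn login-with-store
  "Logs in using the given cookie store, which collects the session cookies.
  Returns the login response."
  [cookie-store]
  (let [login-page (client/get "https://www.generic-website.com/login"
                               {:cookie-store cookie-store})
        body       (html/html-resource (java.io.StringReader. (:body login-page)))
        token      (-> (html/select body [(html/attr= :name "authenticity_token")])
                       first
                       (get-in [:attrs :value]))]
    (client/post "https://www.generic-website.com/sessions"
                 {:form-params  {:email_address      "me@clojure.com"
                                 :password           "12345"
                                 :authenticity_token token}
                  :cookie-store cookie-store})))

(defn fetch-url-with-store
  "GETs url with the shared cookie store and parses the body with Enlive."
  [url cookie-store]
  (-> (client/get url {:cookie-store cookie-store})
      :body
      java.io.StringReader.
      html/html-resource))

(defn profile-scraper-shared []
  ;;;; Create the store once, outside both calls, so the login request and the
  ;;;; secured GET share the same cookies.
  (let [cookie-store (cookies/cookie-store)
        _            (login-with-store cookie-store)
        dom          (fetch-url-with-store
                      "https://www.generic-website.com/secured-page"
                      cookie-store)]
    (html/select dom [(html/attr= :itemprop "address") html/text-node])))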