@eldritchideen
Last active April 2, 2024 16:00
Web scraping in Clojure with Jsoup
(ns scraping.core
  (:gen-class)
  (:import (org.jsoup Jsoup)
           (org.jsoup.select Elements)
           (org.jsoup.nodes Element)))

;; Page listing stocks at 52-week highs.
(def URL "http://www.smh.com.au/business/markets/52-week-highs?page=-1")

(defn get-page
  "Fetch URL over HTTP and parse the response into a Jsoup Document."
  []
  (.get (Jsoup/connect URL)))

(defn get-elems
  "Return the elements of page that match the CSS selector css."
  [page css]
  (.select page css))

(defn -main
  "Fetch the list of stocks that have made new highs"
  [& args]
  (let [html (get-page)
        elems (get-elems html "#content > section > table > tbody > tr > th > a")]
    ;; Print the link text of every matched element as a single sequence.
    (println (for [e elems] (.text e)))))

;; project.clj
(defproject scraping "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [org.jsoup/jsoup "1.7.3"]]
  :main ^:skip-aot scraping.core
  :target-path "target/%s"
  :profiles {:uberjar {:aot :all}})
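
The selector passed to `get-elems` in `-main` depends on the live page's markup, which may change or disappear. A minimal sketch for trying the same selector offline follows. It assumes the same Jsoup dependency is on the classpath; the namespace `scraping.example`, the `sample-html` fragment, and the `link-texts` helper are hypothetical names introduced here for illustration and are not part of the gist.

```clojure
(ns scraping.example
  (:import (org.jsoup Jsoup)
           (org.jsoup.nodes Element)))

;; Hypothetical HTML fragment, shaped only to match the selector used in -main.
;; It is not a copy of the real page.
(def sample-html
  (str "<div id=\"content\"><section><table><tbody>"
       "<tr><th><a href=\"/quote/ABC\">ABC</a></th><td>1.23</td></tr>"
       "<tr><th><a href=\"/quote/XYZ\">XYZ</a></th><td>4.56</td></tr>"
       "</tbody></table></section></div>"))

(defn link-texts
  "Parse an HTML string and return a vector holding the text of every
  element matched by the CSS selector css."
  [^String html ^String css]
  ;; Jsoup/parse builds a Document from a string, so no network call is made.
  ;; mapv is eager, so the result is a realized vector rather than a lazy seq.
  (mapv (fn [^Element e] (.text e))
        (.select (Jsoup/parse html) css)))

(link-texts sample-html "#content > section > table > tbody > tr > th > a")
;; => ["ABC" "XYZ"]
```

Returning a vector instead of printing inside the function keeps the extraction step easy to check at the REPL; `-main` could print the result of such a helper if desired.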