@mahozad
Last active December 3, 2021 12:46
A Kotlin script to scrape a phrase from a dynamic page using Selenium and jsoup (mentioned in a Stack Overflow post)
#!/usr/bin/env kotlin
/**
 * A Kotlin script for extracting (scraping) a phrase from a dynamic page.
 *
 * NOTE: Download the [Chrome driver](https://chromedriver.storage.googleapis.com/index.html) executable
 * that matches your installed Chrome version and place it beside this script.
 * Alternatively, use the [WebDriverManager](https://github.com/bonigarcia/webdrivermanager) library
 * to download the driver automatically.
 * See [this Stack Overflow post](https://stackoverflow.com/a/69974518/8583692) for more information.
 */
@file:JvmName("Scraper")
@file:CompilerOptions("-jvm-target", "11")
// Both dependencies below are published on Maven Central, so no other repository is needed
// (JCenter has since been shut down)
@file:Repository("https://repo.maven.apache.org/maven2")
@file:DependsOn("org.jsoup:jsoup:1.14.3")
@file:DependsOn("org.seleniumhq.selenium:selenium-java:4.0.0")
import org.jsoup.Jsoup
import org.openqa.selenium.chrome.ChromeDriver
import java.io.File
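
// Point Selenium at the driver binary placed beside this script
// ("chromedriver.exe" on Windows; plain "chromedriver" on Linux and macOS)
System.setProperty("webdriver.chrome.driver", "chromedriver.exe")

val result = File("output.html")
val driver = ChromeDriver() // OR FirefoxDriver(); download its driver and set "webdriver.gecko.driver" instead
try {
    driver.get("https://www.singaporepools.com.sg/en/product/sr/Pages/toto_results.aspx")
    result.writeText(driver.pageSource)
} finally {
    // quit() ends the whole session and the driver process; close() only closes the current window
    driver.quit()
}
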
// Could also have used Jsoup.parse(driver.pageSource)
// instead of writing to and then reading from a file
val document = Jsoup.parse(result, "UTF-8")
val targetElement = document
    .body()
    .children()
    .select(":containsOwn(Next Jackpot)")
    .single()
    .parent()!!
val phrase = targetElement.text()
val prize = targetElement.select("span").text().removeSuffix(" est")
println(phrase) // Next Jackpot $8,000,000 est
println(prize) // $8,000,000
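
The comment in the script notes that the page source could be passed straight to `Jsoup.parse` instead of going through `output.html`. A minimal sketch of that variant follows, written as a separate `.main.kts` script. The helper name `extractPrize` and the sample markup are hypothetical and only for illustration; the real page's structure may differ, and in the original script the argument would be `driver.pageSource`.

```kotlin
@file:DependsOn("org.jsoup:jsoup:1.14.3")

import org.jsoup.Jsoup

// Hypothetical helper (not part of the original script): parses an HTML string
// in memory and returns the prize shown next to the "Next Jackpot" label.
fun extractPrize(html: String): String {
    val label = Jsoup.parse(html)
        .select(":containsOwn(Next Jackpot)") // elements whose own text contains the label
        .single()                             // throws unless exactly one element matches
    val container = label.parent() ?: error("Label element has no parent")
    return container.select("span").text().removeSuffix(" est")
}

// Made-up markup that only imitates the shape the original selector relies on:
// a label element and a <span> with the amount sharing one parent.
val sampleHtml = """
    <div>
      <div>Next Jackpot</div>
      <span>$8,000,000 est</span>
    </div>
""".trimIndent()

println(extractPrize(sampleHtml))
```

Parsing in memory avoids the temporary file and the explicit charset argument; writing to a file, as the original does, can still be handy for inspecting the rendered HTML when a selector stops matching.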