A simple Groovy script to scrape all URLs from a given string and download the content from those URLs
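import java.util.regex.Matcher
import java.util.regex.Pattern

// Matches http(s), ftp, and file URLs; the final character class keeps trailing punctuation out of the match.
Pattern urlPattern = Pattern.compile("\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]", Pattern.CASE_INSENSITIVE)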
String urlString = """This is a big string with lots of image URLs like: http://i.istockimg.com/file_thumbview_approve/69656987/3/stock-illustration-69656987-vector-of-flat-icon-life-buoy.jpg and
http://i.istockimg.com/file_thumbview_approve/69943823/3/stock-illustration-69943823-beach-ball.jpg and a few others below
http://i.istockimg.com/file_thumbview_approve/40877104/3/stock-photo-40877104-pollen-floating-on-water.jpg
http://i.istockimg.com/file_thumbview_approve/68944343/3/stock-illustration-68944343-ship-boat-flat-icon-with-long-shadow.jpg
"""
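Matcher matcher = urlPattern.matcher(urlString)

while (matcher.find()) {
    String address = matcher.group()
    println("Got URL: " + address)

    // Save each download in the current directory, named after the last URL path segment.
    new File("./" + address.tokenize("/").last()).withOutputStream { out ->
        // withInputStream closes the connection's stream once the copy finishes.
        new URL(address).withInputStream { input -> out << input }
    }
}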
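
// A minimal sketch, not part of the original script: tokenize("/").last() keeps any
// query string or fragment in the file name. The hypothetical helper fileNameFor
// derives the name from the URL path only, and the "download" fallback is an assumption.
String fileNameFor(String address) {
    // URL.getPath() excludes the "?query" and "#fragment" parts of the address.
    List<String> segments = new URL(address).path.tokenize("/")
    // An empty list is falsy in Groovy, so last() is never called on an empty path.
    return segments ? segments.last() : "download"
}
// Example: fileNameFor("http://example.com/img/a.jpg?size=3") returns "a.jpg"
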
// References:
// 1. http://stackoverflow.com/questions/5713558/detect-and-extract-url-from-a-string
// 2. http://stackoverflow.com/questions/4674995/groovy-download-image-from-url