@kinduff
Last active April 24, 2021 22:48
[4chan WebM Downloader] #4chan
#
# 4chan WebM Downloader
# Fetches a thread page, keeps only the links
# that point to .webm files, and downloads them
# to a custom path using wget.
#
# Requires
# nokogiri: gem install nokogiri
# wget available on the PATH
#
# Usage
# ruby 4chan_webm_downloader.rb URL PATH
#
require 'nokogiri'
require 'open-uri'

input_url, path = ARGV
abort "Usage: ruby 4chan_webm_downloader.rb URL PATH" unless input_url && path

puts "Start"
# Kernel#open no longer opens URLs on Ruby 3.0+, so call URI.open explicitly.
website = Nokogiri::HTML(URI.open(input_url))
links = website.css('.fileText a')
links.each do |link|
  url = link['href'].to_s
  name = link.content
  next unless File.extname(url) == ".webm"
  # Pass the arguments as a list so the path and file name never go through a shell.
  system('wget', '-nd', '-O', File.join(path, name), "http:#{url}")
end
puts "Done"
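
The script decides what to download by checking the extension of each `href` and prefixing the protocol-relative link with a scheme. A minimal sketch of that step as a standalone function, assuming the links are protocol-relative (`//host/path`); the helper name `webm_url`, the `scheme:` parameter, and the `example.com` host are hypothetical and not part of the gist:

```ruby
require 'uri'

# Hypothetical helper (not in the original script): turns an href into an
# absolute URL and returns nil unless its path ends in ".webm".
def webm_url(href, scheme: 'https')
  # Protocol-relative links ("//host/path") need a scheme before parsing.
  absolute = href.start_with?('//') ? "#{scheme}:#{href}" : href
  uri = URI.parse(absolute)
  # Check only the path, so a query string cannot hide the extension.
  return nil unless File.extname(uri.path.to_s).casecmp?('.webm')
  uri.to_s
rescue URI::InvalidURIError
  nil
end

puts webm_url('//example.com/gif/123.webm')
puts webm_url('//example.com/gif/123.jpg').inspect
```

Checking the parsed path rather than the raw string is a design choice of this sketch: `File.extname` on a raw URL would miss a `.webm` link that carries a query string.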

ghost commented Feb 28, 2015

I wrote a command-line Linux script for the same purpose, with similar usage.
Maybe it will be useful for you.

curl "$1" | sed -e "s/</\n/g" | sed -e "s/a\ href=\"/https:/g" | grep cdn.*webm | grep -v class | grep -v title | sed -e "s/\".*$//g" | xargs wget -nc
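
For comparison, the same text-based approach (no HTML parser, just pattern matching on the page source) could be sketched in Ruby roughly as follows. This is an illustration, not a port of the one-liner: the helper name `extract_webm_urls`, the regular expression, and the `cdn.example.com` host are assumptions, and it only handles protocol-relative `href` values.

```ruby
# Hypothetical helper: pulls protocol-relative .webm links out of raw HTML
# with a regular expression and prefixes them with a scheme.
def extract_webm_urls(html, scheme: 'https')
  html.scan(%r{href="(//[^"]+\.webm)"}i) # one capture group => array of 1-element arrays
      .flatten
      .uniq                              # the same file may be linked more than once
      .map { |href| "#{scheme}:#{href}" }
end

sample = '<a href="//cdn.example.com/b/1.webm" target="_blank">clip.webm</a>' \
         '<a class="thumb" href="//cdn.example.com/b/1.webm"><img></a>' \
         '<a href="//cdn.example.com/b/2.jpg">pic.jpg</a>'
puts extract_webm_urls(sample)
```

Each returned URL could then be handed to something like `system('wget', '-nc', url)`, mirroring the `xargs wget -nc` at the end of the pipeline.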

fire-hawk-86 commented Oct 5, 2017

I changed line 29 (the wget call)
from
wget -P #{path} http:#{url}
to
wget -nd -O "#{path}/#{name}" http:#{url}

So it keeps the original name of the file.

Edit: added -nd because of the possibility of identical filenames:
https://stackoverflow.com/questions/28133885/wget-keep-the-files-with-same-name
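
Since `-O` writes to exactly the path it is given, two files with the same displayed name would, as far as I can tell, end up at the same output path. One possible way to avoid that is to pick a distinct path before calling wget. A minimal sketch, where the helper name `unique_path` and the `_1`, `_2` suffix scheme are assumptions rather than anything from the gist:

```ruby
require 'set'

# Hypothetical helper: returns a path inside `dir` that is neither already
# handed out (tracked in `taken`) nor present on disk, adding "_1", "_2", ...
# before the extension when the plain name is unavailable.
def unique_path(dir, name, taken)
  name = File.basename(name)          # drop any directory part of the name
  ext  = File.extname(name)
  base = File.basename(name, ext)
  candidate = File.join(dir, name)
  counter = 0
  while taken.include?(candidate) || File.exist?(candidate)
    counter += 1
    candidate = File.join(dir, "#{base}_#{counter}#{ext}")
  end
  taken << candidate
  candidate
end

taken = Set.new
puts unique_path('downloads', 'clip.webm', taken)
puts unique_path('downloads', 'clip.webm', taken)
```

In the script, the result would replace the `#{path}/#{name}` argument passed to `-O`, with one `taken` set shared across the loop.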

kinduff commented Oct 21, 2017

@fire-hawk-86 Updated with your change, thank you.

kinduff commented Oct 21, 2017

@kinazarov Love it! Thanks for sharing.
