
@jamiew
Created July 13, 2011 17:46
Download all the images from a Tumblr blog
# Usage:
#   [sudo] gem install mechanize
#   ruby tumblr-photo-ripper.rb

require 'rubygems'
require 'fileutils'
require 'mechanize'
require 'nokogiri'

# Your Tumblr subdomain, e.g. "jamiew" for "jamiew.tumblr.com"
site = "doctorwho"
FileUtils.mkdir_p(site)

concurrency = 8  # number of images to download in parallel
num = 50         # posts requested per API page
start = 0        # pagination offset

loop do
  puts "start=#{start}"
  url = "http://#{site}.tumblr.com/api/read?type=photo&num=#{num}&start=#{start}"
  page = Mechanize.new.get(url)
  doc = Nokogiri::XML.parse(page.body)

  # Each post lists several sizes; keep only the largest (max-width 1280)
  images = (doc/'post photo-url').select { |x| x['max-width'].to_i == 1280 }
  image_urls = images.map { |x| x.content }

  # Download in batches of `concurrency` threads
  image_urls.each_slice(concurrency) do |group|
    threads = []
    group.each do |url|
      threads << Thread.new {
        puts "Saving photo #{url}"
        begin
          file = Mechanize.new.get(url)
          filename = File.basename(file.uri.to_s.split('?')[0])
          file.save_as("#{site}/#{filename}")
        rescue Mechanize::ResponseCodeError
          puts "Error getting file, #{$!}"
        end
      }
    end
    threads.each { |t| t.join }
  end

  puts "#{images.count} images found (num=#{num})"
  if images.count < num
    # A short page means we've reached the end of the blog
    puts "our work here is done"
    break
  else
    start += num
  end
end
@Meroje

Meroje commented Oct 14, 2011

Awesome, exactly what I needed, thanks!

@meneguinha

Very nice code, it is exactly what I need!

I'll give you just some advice, if you want to spread this beautiful piece of code:

  1. Make a .exe.

  2. Add a very simple graphical interface.

  3. Be happy.

@vxbinaca

Why hardcode constants when you could append the blog link to the end of the command and have the script strip off everything but the blog name?
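
A minimal sketch of that idea, assuming the blog URL (or bare name) is passed as the first command-line argument; the regex and usage message here are illustrative, not part of the original script:

# e.g. ruby tumblr-photo-ripper.rb http://doctorwho.tumblr.com
arg = ARGV[0] or abort "Usage: ruby tumblr-photo-ripper.rb <blog-url-or-name>"
# Strip the scheme and keep only the leading subdomain
site = arg.sub(%r{\Ahttps?://}, '').split('.').first
puts "Ripping #{site}.tumblr.com"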

@fengs

fengs commented Aug 1, 2015

Nice tool. Helped me save 500+ images. Thanks.

@kestel

kestel commented May 19, 2016

Great script! Thank you so much!

@AleXoundOS

Successfully done for 16290 images (3.8G). Thank you!

@v3rba

v3rba commented Apr 13, 2018

Thanks a lot. Downloaded 5800 images.

@lulububalus

lulububalus commented Dec 6, 2018

Hi, I'm really sorry, I don't know anything about programming, but I'm desperately looking for some software or a way to download images from a Tumblr blog. Is there some way I can use this script despite my ignorance? Thanks in advance.

@sverzel

sverzel commented Dec 14, 2018

A version in Perl that grabs every image by matching URLs in the raw response, skipping the XML parsing entirely:

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

my $site = 'foo';  # your Tumblr subdomain

mkdir 'pictures' unless -d 'pictures';

# Page through the API 50 posts at a time and grab every .jpg URL
foreach (my $i = 0; $i < 9999; $i += 50) {
    my $url = "http://$site.tumblr.com/api/read?type=photo&num=50&start=$i";
    warn "Retrieving $url\n";

    my $src = get($url);
    next unless defined $src;  # skip pages that fail to download

    # Stop matching at whitespace, quotes, or tags so adjacent URLs don't merge
    foreach my $image ($src =~ m{https?://[^\s"<]+\.jpg}g) {
        my ($filename) = $image =~ m{/([^/]+\.jpg)};
        print "found $filename\n";
        getstore($image, "pictures/$filename");
    }
}
