
@jeffaudi
Last active June 15, 2022 01:25
This Python script downloads all the photos liked by your Tumblr account. This is usually more useful than downloading the photos from a specific blog. Updated to also download videos and to store content in folders. Please note that currently the Tumblr API only returns the first 1000 likes (https://groups.google.com/forum/#!searchin/tumblr-api/li…
# Note: this script uses Python 2 syntax (print statements, xrange, urllib.urlretrieve)
import pytumblr
import os
import code
import oauth2 as oauth
from pprint import pprint
import json
import urllib
import codecs
import sys

# Number of likes to fetch in one request
limit = 20

# Directory where to save the images
directory = "tumblr-likes"

# List of [id, reblog_key] pairs already downloaded
downloaded = []

try:
    # Authenticate via OAuth (fill in your own application keys)
    client = pytumblr.TumblrRestClient(
        'CONSUMER_KEY',
        'CONSUMER_SECRET',
        'OAUTH_TOKEN',
        'OAUTH_TOKEN_SECRET'
    )

    # Get the info on the user
    info = client.info()

    # Get the user name and the number of likes
    name = info["user"]["name"]
    number = int(info["user"]["likes"])

    # Currently the Tumblr API returns no more than 1000 likes
    pages = min(number // limit, 50)

    # Display the number of likes and pages of 20
    print "Tumblr user {0} has {1} likes".format(name, number)
    print "{0} pages will be fetched".format(pages)

    posts = 0
    total = 0

    for page in xrange(0, pages):
        # For testing
        #if page == 1:
        #    break

        # Get the likes
        offset = page * limit
        likes = client.likes(offset=offset, limit=limit)["liked_posts"]

        # Parse the likes
        for liked in likes:
            # Only the photos
            if "photos" in liked:
                downloaded.append([liked["id"], liked["reblog_key"]])
                photos = liked["photos"]
                count = 0
                # Parse photos
                for photo in photos:
                    # Get the original size
                    url = photo["original_size"]["url"]
                    imgname = url.split('/')[-1]
                    # Store in a directory based on blog name
                    blog_dir = directory + "/" + liked["blog_name"]
                    if not os.path.isdir(blog_dir):
                        os.mkdir(blog_dir)
                    # Create a unique name
                    filename = blog_dir + "/" + str(liked["id"]) + "-"
                    # Add numbers if more than one image
                    if count > 0:
                        filename += str(count) + "-"
                    filename += imgname
                    # Check if image is already on local disk
                    if os.path.isfile(filename):
                        print "File already exists : " + imgname
                    else:
                        print "Downloading " + imgname + " from " + liked["blog_name"]
                        urllib.urlretrieve(url, filename)
                    count += 1
                posts += 1
                total += count
            elif "video_url" in liked:
                # Get the video name
                url = liked["video_url"]
                vidname = url.split('/')[-1]
                count = 0
                # Create a unique name
                filename = directory + "/" + liked["blog_name"] + "-" + str(liked["id"]) + "-" + vidname
                # Check if video is already on local disk
                if os.path.isfile(filename):
                    print "File already exists : " + vidname
                else:
                    print "Downloading " + vidname + " from " + liked["blog_name"]
                    urllib.urlretrieve(url, filename)
                    count += 1
                posts += 1
                total += count
            elif "body" in liked:
                # Text post: save the body as a small HTML file
                filename = directory + "/" + liked["blog_name"] + "-" + str(liked["id"]) + ".htm"
                with codecs.open(filename, "w", "utf-8") as ds:
                    ds.write('<!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"><title></title></head><body>')
                    ds.write(liked["body"])
                    ds.write('</body></html>')
            else:
                # If not a photo, a video or a text post, dump the JSON
                with open(str(liked["id"]) + "-" + str(liked["blog_name"]) + ".json", "w") as f:
                    json.dump(liked, f)

    with open("downloaded.json", "w") as f:
        json.dump(downloaded, f)

    # Display some stats
    print "Total posts parsed : " + str(posts)
    print "Total images or videos downloaded : " + str(total)

except:
    print "Unexpected error:", sys.exc_info()[0]

raw_input("Press Enter to close window...")
@ScootyBooty

How do we use this...?

@KlfJoat

KlfJoat commented Dec 12, 2017

Thanks for this!!!

FYI, to get this working I had to...

  1. Add #!/usr/bin/env python to the start.

  2. Run `which python` pip install pytumblr oauth2 pprint

  3. Add import sys.

  4. Manually create the directory name. (I don't know enough Python to suggest a command to do this; see the sketch just below this comment.)

I would create a PR, but this is just a gist, so I'm commenting.
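
For step 4, a minimal sketch (using the script's default directory name, tumblr-likes) that creates the output folder from Python instead of by hand:

import os

# Output directory used by the script (same default as in the gist)
directory = "tumblr-likes"

# Create it only if it does not exist yet
if not os.path.isdir(directory):
    os.makedirs(directory)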

@jaro-m

jaro-m commented Aug 6, 2018

A check for the directory is already in the code:

if not os.path.isdir(blog_dir):
    os.mkdir(blog_dir)

but it only checks for blog_dir, not for the top-level directory.
Try joining paths like this: os.path.join(directory, file_name)
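
For illustration, a minimal sketch of that suggestion; blog_name, post_id and imgname are placeholders standing in for liked["blog_name"], liked["id"] and the image file name inside the gist's loop:

import os

directory = "tumblr-likes"
blog_name = "some-blog"              # placeholder for liked["blog_name"]
post_id = 123456789                  # placeholder for liked["id"]
imgname = "tumblr_abc123_1280.jpg"   # placeholder for the image file name

# Build the per-blog folder with os.path.join instead of string concatenation
blog_dir = os.path.join(directory, blog_name)

# os.makedirs also creates the parent "tumblr-likes" folder if it is missing,
# so it no longer has to be created by hand
if not os.path.isdir(blog_dir):
    os.makedirs(blog_dir)

# Unique file name for the downloaded image
filename = os.path.join(blog_dir, str(post_id) + "-" + imgname)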

@cloudjumper2000

Hello, I'd really like to see this in action. I'm getting the following error:

Tumblr user xxxxx has xxxxx likes
50 pages will be fetched
Unexpected error: <class 'NameError'>

Process finished with exit code 0

I'm still new to Python, but it appears the code gets to:
for page in xrange(0, pages):
Then moves to the except line.

Any thoughts?
Thx!

@souliaq

souliaq commented Aug 29, 2018

Confirmed working today, 8/28/2018. Remember to fill in these fields:

client = pytumblr.TumblrRestClient(
    'CONSUMER_KEY',
    'CONSUMER_SECRET',
    'OAUTH_TOKEN',
    'OAUTH_TOKEN_SECRET'
)

@SlickRickEm

> Hello, I'd really like to see this in action. I'm getting the following error:
>
> Tumblr user xxxxx has xxxxx likes
> 50 pages will be fetched
> Unexpected error: <class 'NameError'>
>
> Process finished with exit code 0
>
> I'm still new to Python, but it appears the code gets to:
> for page in xrange(0, pages):
> Then moves to the except line.
>
> Any thoughts?
> Thx!

Change 'xrange' to 'range'; basically, omit the letter x.

@StefRe

StefRe commented Nov 28, 2018

According to the API docs, you should be able to get all likes by using the before or after parameters instead of offset:

> When using the offset parameter the maximum limit on the offset is 1000. If you would like to get more results than that use either before or after.

So I guess you could take the timestamp of the last like in the current request and request the next 20 likes before that timestamp.
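
For what it's worth, a minimal sketch of that approach in the same Python 2 style as the gist, assuming pytumblr passes the before parameter through to /user/likes and that each returned post carries a liked_timestamp field as the API docs describe (the credential placeholders are the same as in the gist):

import pytumblr

client = pytumblr.TumblrRestClient(
    'CONSUMER_KEY',
    'CONSUMER_SECRET',
    'OAUTH_TOKEN',
    'OAUTH_TOKEN_SECRET'
)

limit = 20
before = None      # None means "start from the most recent like"
all_likes = []

while True:
    # Request the next page of likes, older than the last one seen so far
    if before is None:
        response = client.likes(limit=limit)
    else:
        response = client.likes(limit=limit, before=before)
    liked_posts = response["liked_posts"]
    if not liked_posts:
        break
    all_likes.extend(liked_posts)
    # The liked_timestamp of the oldest like on this page becomes the
    # "before" value for the next request
    before = liked_posts[-1]["liked_timestamp"]

print "Fetched {0} likes in total".format(len(all_likes))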

@mra1984

mra1984 commented Dec 7, 2018

, line 40
    print "Tumblr user {0} has {1} likes".format(name, number)
                                                              ^
SyntaxError: invalid syntax

On running I hit this error...

@talkinggoat

I have an issue where it worked for a while and downloaded some of my likes, probably about 1/5th. Now it says I have 6679 likes and 333 pages, but the total posts parsed and images downloaded are 0.

@johnwhenry

> , line 40 print "Tumblr user {0} has {1} likes".format(name, number) ^ SyntaxError: invalid syntax
>
> On running I hit this error...

Me too. Very annoying.

@talkinggoat

> , line 40 print "Tumblr user {0} has {1} likes".format(name, number) ^ SyntaxError: invalid syntax
> On running I hit this error...
>
> Me too. Very annoying.

Comment it out like this:
#print "Tumblr user {0} has {1} likes".format(name, number)

@spoonyfork

I forked this script over at https://gist.github.com/spoonyfork/d06e524df5dbe2b547f7d0e95fc9c37d with changes to get it to run on Python 3.7. I also included instructions for use.

I also had success using this program to back up even more likes: https://github.com/neuro-sys/tumblr-likes-downloader
