Ffffind (downloads every image from a given FFFFOUND! account)
"""
ash_ffffind.py
v1.1 (September 14, 2015)
by me@aaronhildebrandt.com
Automatically downloads all images from ffffound saved by a specific user.
Will first try to download the image from the original source (to get the highest quality possible).
If that fails, it'll download the cached version from ffffound.
Prerequisites:
    Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/)
Usage:
    python ffffind.py username
"""
import os, sys, urllib, imghdr
from BeautifulSoup import BeautifulSoup
from urlparse import urlparse
from posixpath import basename, dirname

class URLOpener(urllib.FancyURLopener):
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'

urllib._urlopener = URLOpener()
def main(user):
    offset = 0
    page = 1
    while True:
        print "Capturing page "+str(page)+" ..."
        print
        f = urllib.urlopen("http://ffffound.com/home/"+user+"/found/?offset="+str(offset))
        s = f.read()
        f.close()
        if "<div class=\"description\">" in s:
            images = []
            offset += 25
            count = 0
            soup = BeautifulSoup(s)
            # Each description div starts with the original source URL.
            for i in soup.findAll("div", { "class" : "description" }):
                images.append({"url": urlparse("http://" + str(i).split("<br />")[0].replace("<div class=\"description\">", ""))})
            # Medium-size ("_m.") images are ffffound's cached copies, used as backups.
            for i in soup.findAll("img"):
                if str(i).find("_m.") != -1:
                    images[count]["backup"] = str(i).split("src=\"")[1].split("\"")[0]
                    count += 1
            for i in images:
                if os.path.exists(user+"/"+basename(i["url"].path)):
                    print basename(i["url"].path) + " exists, stopping."
                    print
                    sys.exit()
                else:
                    print "Downloading " + basename(i["url"].path),
                    try:
                        urllib.urlretrieve(i["url"].geturl(), user+"/"+basename(i["url"].path))
                        print "... done."
                        if not imghdr.what(user+"/"+basename(i["url"].path)) in ["gif", "jpeg", "png", None]:
                            print "... unfortunately, it seems to be a bad image.\nDownloading backup",
                            try:
                                urllib.urlretrieve(i["backup"], user+"/"+basename(i["url"].path))
                                print "... which seems to have worked."
                            except:
                                print "... which also failed."
                        # Treat very small files as failed downloads so the backup is tried.
                        if os.path.getsize(user+"/"+basename(i["url"].path)) < 5000:
                            raise IOError("downloaded file is suspiciously small")
                    except:
                        print "... failed. Downloading backup",
                        try:
                            urllib.urlretrieve(i["backup"], user+"/"+basename(i["url"].path))
                            print "... which seems to have worked."
                        except:
                            print "... which also failed."
                    print
            page += 1
        else:
            print "Reached the end of the list, stopping."
            break

if __name__ == '__main__':
    print
    print("ffffound image downloader")
    print
    if len(sys.argv) < 2:
        print "Usage:\n\t python ffffind.py username"
        print
    else:
        try:
            if not os.path.exists("./"+sys.argv[1]):
                os.mkdir(sys.argv[1])
        except:
            print "Error creating directory."
            sys.exit()
        user = sys.argv[1]
        print "Downloading all pictures from user '"+user+"'"
        print
        main(user)
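
The strategy described in the docstring (original source first, then ffffound's cached copy) can be sketched separately from the scraping code. This is only an illustration, not the script's actual implementation: `download_with_fallback` and its `fetch` parameter are hypothetical names, and unlike the script it applies the 5000-byte size check to every candidate rather than only to the original source.

```python
def download_with_fallback(urls, fetch, min_size=5000):
    """Try each URL in order and return (url, data) for the first good result.

    `fetch` is any callable that takes a URL and returns the raw bytes,
    raising an exception on failure. It is a hypothetical interface,
    injected so the logic can be exercised without network access.
    """
    for url in urls:
        try:
            data = fetch(url)
        except Exception:
            continue  # this source failed, move on to the next candidate
        if len(data) >= min_size:
            return url, data
        # Too small to be a real image (likely an error page); keep trying.
    return None, None
```

In the script's terms, `urls` would hold the original source URL followed by the `_m.` backup URL, and `fetch` would wrap whatever download call is in use.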

jwmacias Apr 14, 2017

@ashildebrandt When I run this, it stops on page six of my 491-page account. Any reason you can think of? I am trying to archive my account images since the site is going offline in May. It also seems to be saving quite a few images without file extensions. Any thoughts?

graphictheory Apr 15, 2017

Hi Aaron,

I am getting the same problems that jwmacias is experiencing. Is there a way to fix the file extension issue? That would be great.
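
One possible workaround for the missing extensions, sketched here under assumptions rather than taken from the script: inspect the first bytes of each saved file and append a matching extension. `guess_extension` and `add_missing_extension` are hypothetical helpers, and only the three formats the script accepts (PNG, GIF, JPEG) are recognized.

```python
import os

# Leading "magic" bytes for the three formats the script accepts.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"\xff\xd8\xff", ".jpg"),
]

def guess_extension(header):
    """Return an extension for the given leading bytes, or None if unknown."""
    for magic, ext in SIGNATURES:
        if header.startswith(magic):
            return ext
    return None

def add_missing_extension(path):
    """Rename `path` to carry an extension if it has none; return the final path."""
    if os.path.splitext(path)[1]:
        return path  # already has an extension, leave it alone
    with open(path, "rb") as f:
        ext = guess_extension(f.read(8))
    if ext is None:
        return path  # unrecognized content, do not guess
    new_path = path + ext
    os.rename(path, new_path)
    return new_path
```

This could be run over the download directory after the script finishes. Note that a name containing a dot (for example `12345.6789`) is treated as already having an extension.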

monsieurm Apr 17, 2017

@ashildebrandt Hi, I have installed BS4 via pip and I get this error when I run your script:

$ python ffffind.py monsieurm
Traceback (most recent call last):
  File "ffffind.py", line 22, in <module>
    from BeautifulSoup import BeautifulSoup
ImportError: No module named BeautifulSoup

However:

$ pip install beautifulsoup4
Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python2.7/site-packages

Any idea?

monsieurm Apr 18, 2017

OK, just install the previous version of BeautifulSoup:

pip install beautifulsoup, not pip install beautifulsoup4
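
The error arises because the script imports the BeautifulSoup 3 module name (`BeautifulSoup`), while `beautifulsoup4` installs a module named `bs4`. As a rough sketch, an import that tolerates either package could look like the following; `import_first` is a hypothetical helper, not something the script provides.

```python
import importlib

def import_first(names):
    """Return the first module in `names` that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # not installed, try the next candidate
    raise ImportError("none of these modules could be imported: " + ", ".join(names))
```

With it, the script's import line could become `BeautifulSoup = import_first(["BeautifulSoup", "bs4"]).BeautifulSoup`. Be aware that this only addresses the import: the script parses tags by splitting their string form, and bs4 may serialize tags differently from BeautifulSoup 3, so installing the older package as described above remains the safer route.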

PhilVD Apr 18, 2017

Hey. I have never worked with Python before, but I want to save my ffffounds so badly. Are there any instructions on what to do with this script? I downloaded Python already and tried to use the code, but I failed. Please, can you help?

BG - Phil :)

philgyford Apr 22, 2017

I updated the script to also create HTML pages in which you can view your images, plus fixed a couple of edge cases I came across: https://github.com/philgyford/ffffound-export

G-9999 Jun 15, 2017

Anyone have a torrent of the html page dump? I totally learned about this too late :(

fish2000 May 11, 2018

@triune the entirety of ffffound.com is in fact available, via torrent, from archive.org:

https://archive.org/details/ffffound.com-warc-archive-2017-05-07

Enjoy!
