@guillefix
Last active February 27, 2020 02:36
scrape621
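import os
import sys
import time
import urllib.request

import pandas as pd

# Create the output directory if it does not exist yet.
os.makedirs("floofs", exist_ok=True)

# Load the metadata CSV named on the command line and keep only posts rated "s".
d = pd.read_csv("e621_" + sys.argv[1] + ".csv")
d = d[d["rating"] == "s"]

# Install a global opener so every request carries a descriptive User-Agent.
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'FurGAN/1.0 (by guillefix on e621)')]
urllib.request.install_opener(opener)

t0 = time.time()
for j, (post_id, sample_url, file_ext) in d[["id", "sample_url", "file_ext"]].iterrows():
    # print(post_id, sample_url)
    print(j)  # row index, printed as a progress indicator
    try:
        urllib.request.urlretrieve(sample_url, "floofs/" + str(post_id) + "." + file_ext)
    except Exception:
        # Skip posts whose download fails (for example, a missing URL or a network error).
        continue
t1 = time.time()
print(t1 - t0)  # total elapsed time in seconds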
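
The download loop above starts from scratch on every run and issues requests back to back. As a rough sketch only (not part of the original script), the loop could be wrapped in a helper that skips files already on disk and pauses between requests. The function name `download_all` and its `fetch` and `delay` parameters are hypothetical names introduced for illustration, and the one-second default pause is an arbitrary assumption rather than a documented limit.

```python
import os
import time
import urllib.request


def download_all(rows, out_dir="floofs", fetch=urllib.request.urlretrieve, delay=1.0):
    """Download each (post_id, url, ext) row into out_dir, skipping existing files.

    `fetch` and `delay` are hypothetical parameters added for this sketch:
    `fetch(url, path)` performs one download, and `delay` is the pause in
    seconds after each attempted download.
    Returns a (downloaded, skipped, failed) tuple of counts.
    """
    os.makedirs(out_dir, exist_ok=True)
    downloaded = skipped = failed = 0
    for post_id, url, ext in rows:
        path = os.path.join(out_dir, f"{post_id}.{ext}")
        if os.path.exists(path):
            skipped += 1  # already fetched on a previous run
            continue
        try:
            fetch(url, path)
            downloaded += 1
        except Exception:
            failed += 1  # same policy as the script: skip and move on
        time.sleep(delay)  # pause after every attempted download
    return downloaded, skipped, failed
```

With the DataFrame from the script, this might be called as `download_all(d[["id", "sample_url", "file_ext"]].itertuples(index=False))`. Passing `fetch` as a parameter keeps the helper testable without network access.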