@jinyu121 jinyu121/README.md
Created Mar 7, 2019

Batch download of NetEase Blog images

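NetEase Blog has been discontinued, and it offers no bulk download option. However, a certain page on the site lets you download an XML file containing the text of all your blog posts, as well as a separate XML file listing the blog's images. Using that image XML with the accompanying script, you can quickly download all the images.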
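For orientation, the script only depends on the image-list XML having a `root` element with repeated `photo` children whose text is an image URL. That layout is inferred from the keys the script reads, not from any official description of the export. A minimal sketch of reading such a file, using the standard library's `xml.etree.ElementTree` rather than `xmltodict`, with a made-up sample document and hypothetical URLs:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real export's exact layout is an assumption
# inferred from the keys the script reads (root -> photo).
SAMPLE_XML = """<root>
  <photo>http://img.example.com/photo/abc.jpg?height=96&amp;width=96</photo>
  <photo>http://img.example.com/photo/def.jpg?height=96&amp;width=96</photo>
</root>"""


def list_photo_urls(xml_text):
    """Return the text of every <photo> element directly under the root."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.findall("photo") if el.text]


urls = list_photo_urls(SAMPLE_XML)
```

`findall` always returns a list, so a file with a single `photo` entry is handled the same way as one with many. With `xmltodict`, a lone child element is returned as a plain string instead of a list, which is worth checking if your export contains only one image.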

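import os

import requests
import xmltodict
from tqdm import tqdm

# Read the image-list XML as bytes so the parser handles the encoding itself.
with open("网易博客图片列表.xml", "rb") as f:
    data = xmltodict.parse(f.read())['root']['photo']

# Create the output directory up front; open() below will not create it.
os.makedirs("images", exist_ok=True)

for url in tqdm(data):
    tqdm.write(url)
    try:
        # Strip the 96x96 size parameters so the original image is requested.
        url = url.replace("?height=96&width=96", "")
        filename = os.path.join("images", os.path.split(url)[-1])
        r = requests.get(url, stream=True, timeout=60)
        r.raise_for_status()
        with open(filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024):
                if chunk:  # filter out keep-alive new chunks
                    f.write(chunk)
    except Exception:
        # Report the failure and continue with the next image.
        tqdm.write("==> Error {}".format(url))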
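The script removes one exact query string, `?height=96&width=96`, before deriving the local file name. If the list contains other size parameters, the query would end up in the file name. The following is a hedged sketch of a more general approach with `urllib.parse`; the helper names `strip_query` and `local_filename` are hypothetical, and whether the server returns the original image for a URL without any query is an assumption.

```python
import os
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def local_filename(url, out_dir="images"):
    """Map a URL to a path inside out_dir, using the last path segment."""
    name = os.path.basename(urlsplit(url).path)
    return os.path.join(out_dir, name)
```

For example, `strip_query("http://img.example.com/a/b.jpg?height=200&width=200")` gives `http://img.example.com/a/b.jpg`, and `local_filename` on the same URL gives `b.jpg` inside the `images` directory.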