@JichunMa
Last active April 3, 2018 04:43
Toutiao detail page image URLs
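import re

# Detail page URL: https://www.toutiao.com/a6539663078324175367/
# I saved the Toutiao detail page locally as toutiao_detai.html
with open('toutiao_detai.html', 'r', encoding='utf-8') as f:
    detail_data = f.read()

# Grab the body of each "articleInfo: {...}" object embedded in the page
data_list = re.findall(r"articleInfo: \{(.*?)\}", detail_data, re.S)
for data in data_list:
    # Non-greedy match from "http" up to the first "com" that follows it
    url_list = re.findall(r'(http.*?com)', data, re.S)
    for url in url_list:
        print(url)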
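
Note that the pattern `(http.*?com)` stops at the first "com", so each printed value ends at the host name rather than at the end of the image path. If the full image URLs are wanted, something like the following might work. It is only a sketch: it assumes the article body is stored as HTML-escaped `<img src="...">` tags, which has not been checked against the saved page, and `extract_image_urls` and the `sample` string are made-up names and data for illustration.

```python
import html
import re


def extract_image_urls(article_info):
    """Return the full src URL of every <img> tag found in article_info."""
    # Undo HTML entity escaping (&lt;, &quot;, &#x3D;, ...) so tags become plain markup.
    decoded = html.unescape(article_info)
    # Capture everything between the quotes of src="...", not just the host name.
    return re.findall(r'<img[^>]*?\bsrc=["\'](http[^"\']+)["\']', decoded)


# Hypothetical sample, not copied from the real page.
sample = "content: '&lt;img src&#x3D;&quot;http://example.com/large/abc123&quot;&gt;'"
for url in extract_image_urls(sample):
    print(url)
```

In the loop above, `extract_image_urls(data)` could stand in for the second `re.findall` call, provided the saved page really uses that escaped format.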