Skip to content

Instantly share code, notes, and snippets.

@luzihang123
Created September 9, 2019 10:58
Show Gist options
  • Save luzihang123/5f5f29c3191b50ce5673ae8799745c07 to your computer and use it in GitHub Desktop.
Save luzihang123/5f5f29c3191b50ce5673ae8799745c07 to your computer and use it in GitHub Desktop.
智能解析文章
import requests
import json
headers = {
'Accept': 'application/json',
'Referer': 'https://scrapinghub.com/autoextract',
'Origin': 'https://scrapinghub.com',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36',
'Sec-Fetch-Mode': 'cors',
'Content-Type': 'application/json',
}
url = "http://www.bjd.com.cn/jx/tp/201909/03/t20190903_11104419.html"
data = json.dumps({"url": url, "page_type": "article"})
response = requests.post('https://xod-beta-forwarder.scrapinghub.com/extract', headers=headers, data=data)
from pprint import pprint
pprint(response.text)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment