@Xifeng2009
Created January 8, 2019 05:20
# Documentation: https://beautifulsoup.readthedocs.io/zh_CN/v4.4.0/
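import requests
from bs4 import BeautifulSoup

all_url = 'http://www.mzitu.com/all/'
# Placeholder request headers (assumed value; adjust as needed)
headers = {'User-Agent': 'Mozilla/5.0'}
start_html = requests.get(all_url, headers=headers)
# Available parsers: html.parser, lxml, lxml-xml, xml, html5lib
soup = BeautifulSoup(start_html.text, "lxml")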
# Get the link of every <a> tag
for link in soup.find_all('a'):
    print(link.get('href'))  # or link['href'], which raises KeyError if the attribute is missing
# Get all text content
soup.get_text()
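
The two lookups above can be tried offline. The following is a minimal sketch, assuming a small hand-written HTML string (`SAMPLE_HTML` is hypothetical, not taken from the site) and the built-in `html.parser`, so neither `requests` nor `lxml` is needed. It also resolves relative links with `urljoin`, which the original snippet does not do.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample markup, used so the example needs no network access.
SAMPLE_HTML = """
<html><body>
  <h1>Gallery</h1>
  <a href="/all/page1">First</a>
  <a href="http://example.com/page2">Second</a>
  <a name="anchor-only">No link here</a>
</body></html>
"""

BASE_URL = "http://www.mzitu.com/all/"  # same base URL as all_url above

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# href=True skips <a> tags without an href attribute,
# so a["href"] cannot raise KeyError here.
links = [urljoin(BASE_URL, a["href"]) for a in soup.find_all("a", href=True)]
print(links)

# separator and strip drop the whitespace-only strings between tags.
text = soup.get_text(separator=" ", strip=True)
print(text)
```

Filtering with `href=True` is one possible way to avoid `None` values in the output; checking the result of `link.get('href')` inside the loop would work as well.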