@mrgarita
Last active February 21, 2020 14:44
Python: Extracting an element with a given class from inside another class with BS4
# -*- coding: utf-8 -*-
'''
beautifulsoup4: extract an element with a given class from inside another class
'''
from bs4 import BeautifulSoup
# Sample HTML data to parse
html = """
<ol>
<li class="best1"><p class="waku">Grimm's Fairy Tales</p><p class="waku">Many of the stories are scary</p></li>
<li class="best2"><p class="waku">Greek mythology</p><p class="waku">The names of the gods sometimes show up as manga characters</p></li>
</ol>
"""
# Pass the HTML to BeautifulSoup for parsing
soup = BeautifulSoup(html, 'html.parser')
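# Extract only the first p tag with class "waku" inside the li with class "best2"
# (the names item and waku avoid shadowing the built-in str)
item = soup.find("li", class_="best2")
waku = item.find("p", class_="waku")
# Print to the screen
print(waku.text)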
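The two chained `find` calls above can also be written as a single CSS selector. The sketch below is one possible alternative, not part of the original gist: it assumes a Beautiful Soup version that ships `select` / `select_one` (backed by the soupsieve package), and it uses a simplified, well-formed copy of the sample list with English text as illustrative data.

```python
from bs4 import BeautifulSoup

# Simplified sample data (an assumption for this sketch, not the gist's exact markup)
html = """
<ol>
<li class="best1"><p class="waku">Grimm's Fairy Tales</p><p class="waku">Many of the stories are scary</p></li>
<li class="best2"><p class="waku">Greek mythology</p><p class="waku">The names of the gods sometimes show up as manga characters</p></li>
</ol>
"""

soup = BeautifulSoup(html, "html.parser")

# "li.best2 p.waku" means: any p.waku that is a descendant of li.best2.
# select_one returns the first match, or None when nothing matches.
first = soup.select_one("li.best2 p.waku")
if first is not None:
    print(first.get_text())

# select returns every match as a list, so both paragraphs come back here.
texts = [p.get_text() for p in soup.select("li.best2 p.waku")]
print(texts)
```

Checking for `None` before reading the text matters in real scraping, where the target class may be missing from the page; the chained `find` version would raise `AttributeError` in that case.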