@xihuny
Last active February 8, 2018 04:20
Simple Python script to scrape the latest stories from the sun.mv website.
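# Requires the requests, beautifulsoup4 and html5lib packages.
from bs4 import BeautifulSoup as bs
from requests import get

# Send a browser-like User-Agent with every request.
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}

# Fetch the story listing page and parse it.
listing = bs(get("https://sun.mv/story", headers=headers).content, "html5lib")

for link in listing.find_all("a", class_="feetha-news"):
    # Print the story URL and its headline.
    print(link['href'])
    print(link.find("strong", class_="thaana_bold").text)

    # Fetch the story page and print each paragraph of the article body.
    story = bs(get(link['href'], headers=headers).content, "html5lib")
    article = story.find("div", class_="article-text")
    for paragraph in article.find_all("p"):
        print(paragraph.text)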
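
The link-extraction step above can also be tried offline, without any network access or third-party packages. The following is a minimal sketch that swaps BeautifulSoup for the standard library's html.parser module; it is not the gist's method. The class name StoryLinkParser, the function extract_stories, and the SAMPLE_HTML snippet are all hypothetical, and the sample markup only mirrors the "feetha-news" and "thaana_bold" class names used by the script, not the real page structure of sun.mv.

```python
from html.parser import HTMLParser


class StoryLinkParser(HTMLParser):
    """Collect (href, headline) pairs from <a class="feetha-news"> elements."""

    def __init__(self):
        super().__init__()
        self.stories = []        # finished (href, headline) tuples
        self._href = None        # href of the story link currently open
        self._in_title = False   # True while inside <strong class="thaana_bold">
        self._title_parts = []   # text chunks of the current headline

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "feetha-news" in classes:
            self._href = attrs.get("href")
            self._title_parts = []
        elif tag == "strong" and self._href is not None and "thaana_bold" in classes:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "strong":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            self.stories.append((self._href, "".join(self._title_parts).strip()))
            self._href = None


def extract_stories(html):
    """Return a list of (href, headline) tuples found in an HTML string."""
    parser = StoryLinkParser()
    parser.feed(html)
    parser.close()
    return parser.stories


# Hypothetical markup for illustration only; not copied from sun.mv.
SAMPLE_HTML = """
<a class="feetha-news" href="https://example.com/1"><strong class="thaana_bold">First headline</strong></a>
<a class="other" href="https://example.com/skip">Ignored</a>
<a class="feetha-news big" href="https://example.com/2"><strong class="thaana_bold">Second headline</strong></a>
"""

for href, headline in extract_stories(SAMPLE_HTML):
    print(href, headline)
```

Keeping the parsing in a function that takes a plain string makes it possible to check the selectors against a saved copy of the page before sending any requests to the live site.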