@bradtraversy
Created July 29, 2018 12:02
Simple scraping of a blog
import requests
from bs4 import BeautifulSoup
from csv import writer

# Fetch the page and parse its HTML
response = requests.get('http://codedemos.com/sampleblog/')
soup = BeautifulSoup(response.text, 'html.parser')

# Each blog post preview is an element with the class "post-preview"
posts = soup.find_all(class_='post-preview')

# newline='' keeps the csv module from writing blank rows on Windows
with open('posts.csv', 'w', newline='') as csv_file:
    csv_writer = writer(csv_file)
    headers = ['Title', 'Link', 'Date']
    csv_writer.writerow(headers)

    # Write one row per post: title, link, date
    for post in posts:
        title = post.find(class_='post-title').get_text().replace('\n', '')
        link = post.find('a')['href']
        date = post.select('.post-date')[0].get_text()
        csv_writer.writerow([title, link, date])
@Manojaditya

Hi, I am trying to scrape the Google Play Store with this, but I am just getting a CSV file with only the headers. I just changed the links and tags according to the page.
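A header-only CSV most likely means the loop over `posts` never ran, that is, `find_all` matched nothing in the HTML that `requests` actually downloaded. Below is a rough sketch of how to check that. The `count_matches` helper and the sample HTML are hypothetical and not part of the gist. One possible cause, which is an assumption and not verified for the Play Store, is that the page builds its content with JavaScript, which `requests` does not execute.

```python
from bs4 import BeautifulSoup


def count_matches(html, class_name):
    """Count the elements in html that carry the given CSS class."""
    soup = BeautifulSoup(html, 'html.parser')
    return len(soup.find_all(class_=class_name))


# Invented HTML standing in for response.text from a real request.
html = '<div class="card"><span class="name">Example app</span></div>'

# A class that is present in the HTML
print(count_matches(html, 'card'))
# A class that is absent: the gist's for loop would never run
print(count_matches(html, 'post-preview'))
```

With a live page, pass `response.text` instead of the sample string and also look at `response.status_code`. If the count is 0, the class names you chose do not occur in the HTML that was returned.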

@felix4webscience

felix4webscience commented Nov 1, 2018

Hi, the webpage "http://codedemos.com/sampleblog/" is out of date and the domain is for sale. Have you moved the page by chance, or are you able to provide another dummy page like example.com?
Thanks in advance.

@fredcodee

Thanks, you just simplified web scraping. You are the best.

@DankSteelMemes

For anyone requesting sample pages, your best bet is to just put the HTML into the code yourself and try and scrape there. You will have to kind of skip over the requests part but that's fine because it is the easy part. Pretend whatever variable your HTML is under is the returned request. It works out the same since this is just an example for testing.
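The suggestion above could look roughly like this. The HTML string is made up for illustration: its class names mirror the ones the gist's script looks for, but the titles, links, and dates are invented. `SAMPLE_HTML` and `extract_posts` are hypothetical names, not part of the original gist.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for response.text; the class names mirror
# the ones used in the gist, the content itself is invented.
SAMPLE_HTML = """
<div class="post-preview">
  <a href="/posts/first.html">
    <h2 class="post-title">First post</h2>
  </a>
  <p class="post-date">January 1, 2020</p>
</div>
<div class="post-preview">
  <a href="/posts/second.html">
    <h2 class="post-title">Second post</h2>
  </a>
  <p class="post-date">January 2, 2020</p>
</div>
"""


def extract_posts(html):
    """Return a [title, link, date] row for each post preview in html."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for post in soup.find_all(class_='post-preview'):
        title = post.find(class_='post-title').get_text().strip()
        link = post.find('a')['href']
        date = post.select('.post-date')[0].get_text().strip()
        rows.append([title, link, date])
    return rows


rows = extract_posts(SAMPLE_HTML)
for row in rows:
    print(row)
```

Once the selectors work against the inline string, swapping `SAMPLE_HTML` for `response.text` from a real request is the only change needed.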

@nathantbissell

nathantbissell commented Jan 8, 2020

For anyone looking to find good test sites, webscraper.io should do the trick

@BekBrace

BekBrace commented Mar 1, 2020

Thank you Brad for the ultimate awesomeness
