Simple scraping of a blog
import requests
from bs4 import BeautifulSoup
from csv import writer

# Fetch the page and parse the returned HTML
response = requests.get('http://codedemos.com/sampleblog/')
soup = BeautifulSoup(response.text, 'html.parser')

# Collect every element that carries the "post-preview" class
posts = soup.find_all(class_='post-preview')

# newline='' keeps the csv module from writing blank rows on Windows
with open('posts.csv', 'w', newline='') as csv_file:
    csv_writer = writer(csv_file)
    headers = ['Title', 'Link', 'Date']
    csv_writer.writerow(headers)

    # One CSV row per post: title text, first link, date text
    for post in posts:
        title = post.find(class_='post-title').get_text().replace('\n', '')
        link = post.find('a')['href']
        date = post.select('.post-date')[0].get_text()
        csv_writer.writerow([title, link, date])

@Manojaditya commented Sep 21, 2018

Hi, I am trying to scrape the Google Play Store with this, but I am just getting a CSV file with only the headers. I just changed the links and tags according to the page.
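
If the script runs without an error and the CSV holds only the header row, the `for` loop never ran, which means `find_all` returned an empty list. Below is a small sketch of how one might check that before writing anything. `count_matches` is a hypothetical helper and both HTML strings are made up for illustration. One possible cause (an assumption, not confirmed for the Play Store) is that the page builds its content with JavaScript, so the HTML that `requests` receives does not contain the elements visible in the browser.

```python
from bs4 import BeautifulSoup

def count_matches(html, css_class):
    """Hypothetical helper: count the elements in html that carry css_class."""
    soup = BeautifulSoup(html, 'html.parser')
    return len(soup.find_all(class_=css_class))

# Made-up snippets for illustration only.
matching_html = '<div class="post-preview"><a href="/a">A</a></div>'
other_html = '<div class="card"><a href="/a">A</a></div>'

# A count of zero means the scraping loop would write nothing but headers.
print(count_matches(matching_html, 'post-preview'))
print(count_matches(other_html, 'post-preview'))
```

Running the same kind of check against `response.text` from the real page would show whether the class names being searched for are present in the downloaded HTML at all.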

@felix4webscience commented Nov 1, 2018

Hi, the webpage "http://codedemos.com/sampleblog/" is out of date and up for sale. Have you moved the page by chance, or are you able to provide another dummy page like example.com?
Thanks in advance.

@fredcodee commented May 6, 2019

Thanks, you just simplified web scraping. You are the best.

@DankSteelMemes commented Jun 26, 2019

For anyone requesting sample pages, your best bet is to put the HTML into the code yourself and try to scrape that. You will have to skip over the requests part, but that's fine because it is the easy part. Pretend that whatever variable holds your HTML is the returned response. It works out the same, since this is just an example for testing.
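
A minimal sketch of that approach, assuming a hand-written HTML snippet. `SAMPLE_HTML` and `parse_posts` are hypothetical names, and the markup only mimics the class names the gist looks for (`post-preview`, `post-title`, `post-date`). It is not the original sample blog. The string stands in for `response.text`, so the `requests` call is skipped entirely.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, written by hand to mimic the class names used in the gist.
SAMPLE_HTML = """
<div class="post-preview">
  <a href="/posts/first.html"><h2 class="post-title">First post</h2></a>
  <p class="post-date">January 1, 2020</p>
</div>
<div class="post-preview">
  <a href="/posts/second.html"><h2 class="post-title">Second post</h2></a>
  <p class="post-date">January 2, 2020</p>
</div>
"""

def parse_posts(html):
    """Return [title, link, date] rows using the same selectors as the gist."""
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for post in soup.find_all(class_='post-preview'):
        title = post.find(class_='post-title').get_text().replace('\n', '')
        link = post.find('a')['href']
        date = post.select('.post-date')[0].get_text()
        rows.append([title, link, date])
    return rows

# SAMPLE_HTML plays the role of response.text here.
rows = parse_posts(SAMPLE_HTML)
for row in rows:
    print(row)
```

The rows could then be passed to `csv_writer.writerow` exactly as in the gist. Keeping the parsing in a function that takes a string makes it easy to swap the inline HTML for a real response later.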

@nathantbissell commented Jan 8, 2020

For anyone looking to find good test sites, webscraper.io should do the trick

@BekBrace commented Mar 1, 2020

Thank you Brad for the ultimate awesomeness
