@mikeckennedy
Created May 23, 2023 04:27
Result of asking ChatGPT for code to scrape the titles of Talk Python to Me episodes

MK: Write a python function to use requests and beautiful soup to scrape the titles of episodes of talk python to me

ChatGPT: Sure, here's a basic example of how you might use requests and BeautifulSoup to scrape the titles of episodes of Talk Python to Me. This code assumes that each title sits in an HTML element matching a particular CSS selector, which is a guess rather than something checked against the site. You may need to adjust the code based on the actual structure of the web page.

import requests
from bs4 import BeautifulSoup

def get_episode_titles():
    url = 'https://talkpython.fm/episodes/all' # replace with the actual URL

    response = requests.get(url, timeout=10)  # time out instead of hanging on a stalled connection
    response.raise_for_status()  # ensure we notice bad responses

    soup = BeautifulSoup(response.text, 'html.parser')

    # You may need to adjust the selector based on the actual structure of the page.
    # This example assumes that each episode title is in a 'h1' tag with a class 'title'.
    title_tags = soup.select('h1.title') 

    for tag in title_tags:
        print(tag.get_text())

get_episode_titles()

This script will print out the titles of the episodes. If you want to return them as a list, you can replace the print statement with titles.append(tag.get_text()) and initialize titles = [] before the loop, then return titles at the end of the function.
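The list-returning variant described above could look roughly like the sketch below. Splitting the parsing into a separate extract_titles helper is my own choice, not part of the original answer, and both the 'h1.title' selector and the 10-second timeout are still assumptions rather than values verified against the site.

```python
import requests
from bs4 import BeautifulSoup


def extract_titles(html, selector="h1.title"):
    """Return the text of every element in `html` that matches `selector`.

    The default selector is the unverified guess from the original example.
    """
    soup = BeautifulSoup(html, "html.parser")
    titles = []  # initialized before the loop, as described above
    for tag in soup.select(selector):
        titles.append(tag.get_text(strip=True))
    return titles


def get_episode_titles(url="https://talkpython.fm/episodes/all", selector="h1.title"):
    """Download the page at `url` and return the matching titles as a list."""
    response = requests.get(url, timeout=10)  # assumed timeout value
    response.raise_for_status()  # ensure we notice bad responses
    return extract_titles(response.text, selector)
```

Keeping the parsing separate from the download means the selector can be tried against a saved copy of the page without making repeated requests. Calling get_episode_titles() returns the list instead of printing it, so an empty list signals that the selector matched nothing.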

Again, please note that you will need to adjust the CSS selector ('h1.title' in this case) to match the actual structure of the web page you are scraping. You can inspect the web page's source code to figure out how the titles are structured.
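One way to narrow down the right selector, sketched here as a hypothetical helper, is to count how many elements each candidate matches in the downloaded HTML. The sample markup and the candidate selectors below are invented for illustration and do not describe the real page.

```python
from bs4 import BeautifulSoup


def count_selector_matches(html, selectors):
    """Map each candidate CSS selector to the number of elements it matches."""
    soup = BeautifulSoup(html, "html.parser")
    return {selector: len(soup.select(selector)) for selector in selectors}


# Made-up markup standing in for a saved copy of the page.
sample_html = """
<table>
  <tr><td><a href="/episodes/show/1">First episode</a></td></tr>
  <tr><td><a href="/episodes/show/2">Second episode</a></td></tr>
</table>
"""

candidates = ["h1.title", "td a", 'a[href^="/episodes/show/"]']
counts = count_selector_matches(sample_html, candidates)
for selector, count in counts.items():
    print(f"{selector!r}: {count} match(es)")
```

A selector that matches zero elements is wrong for the page, and one whose count is close to the number of episodes you expect is a good candidate to inspect further.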

Also, be aware that websites' terms of service often have rules about web scraping, so make sure that what you're doing is allowed.
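A site's robots.txt file is not the same thing as its terms of service, but it is one machine-readable signal of what the site allows crawlers to fetch. The sketch below uses the standard library's urllib.robotparser to check a URL against robots.txt rules. The is_allowed helper and the sample rules are hypothetical and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only.
sample_rules = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(sample_rules, "https://example.com/episodes/all"))
print(is_allowed(sample_rules, "https://example.com/private/data"))
```

To check a live site, RobotFileParser also offers set_url() and read() to download the rules directly. A permissive robots.txt does not replace reading the site's actual terms.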
