@monspo1
monspo1 / proj3_webscrape1.py
Last active August 25, 2019 01:12
Open the connection to indeed.com using BeautifulSoup
# load the libraries
import re
import urllib

import pandas as pd
import requests
from bs4 import BeautifulSoup as Soup
# indeed.com url
base_url = 'http://www.indeed.com/jobs?q=data+scientist&jt=fulltime&sort='
sort_by = 'date' # sort by date
start_from = '&start=' # query parameter for the result offset (used for paging)
pd.set_option('display.max_colwidth', 500) # raise the column width limit to 500 characters (otherwise long text is truncated when displayed)
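
The snippet defines the URL pieces but stops before combining them. A minimal sketch of how they might fit together is shown below. The helper name `build_page_url` is hypothetical, and the step of 10 between offsets is an assumption about how many postings each results page holds; neither comes from the original gist.

```python
# Values copied from the snippet above.
base_url = 'http://www.indeed.com/jobs?q=data+scientist&jt=fulltime&sort='
sort_by = 'date'
start_from = '&start='


def build_page_url(offset):
    """Return the search URL whose results begin at `offset` (hypothetical helper)."""
    return f"{base_url}{sort_by}{start_from}{offset}"


# Assumption: each results page holds 10 postings, so offsets advance by 10.
page_urls = [build_page_url(offset) for offset in range(0, 30, 10)]

for url in page_urls:
    print(url)
```

Each URL could then be fetched with `requests.get(url)` and the response text parsed with `Soup(response.text, 'html.parser')`, assuming the site permits automated requests.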