@shaungt1
Forked from Yankim/top20list
Created May 4, 2021 20:04
import time
import numpy as np
from selenium import webdriver
from selenium.webdriver.common.by import By

# yearurls, i, year and scrape_recipe are defined elsewhere in the original script
br = webdriver.Firefox()  # open Firefox
# Each year's page is addressed by an ID number, e.g. 1997 has an ID of 14486
br.get('https://www.allrecipes.com/recipes/' + str(yearurls[i]))
html_list = br.find_element(By.ID, "grid")
hearts = html_list.find_elements(By.CLASS_NAME, "favorite")
# All top 20 recipes have a heart associated with them.
# The heart's data-id attribute holds the unique ID number of the recipe.
recipe_ids = [e.get_attribute('data-id') for e in hearts]
# Remove any repeats (np.unique also sorts the IDs)
recipe_ids = np.unique(recipe_ids)
# Build the list of the 20 recipe URLs for the given year
urls = ['https://allrecipes.com/recipe/' + str(rid) for rid in recipe_ids]
# Go to each individual recipe to scrape
for rid, url in zip(recipe_ids, urls):
    br.get(url)
    time.sleep(3)  # give the page time to load
    scrape_recipe(br, year, rid)
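
The de-duplication step above can also be done without NumPy. The following is only a sketch, not part of the original gist: the helper name `build_recipe_urls`, its `base` parameter, and the sample ID values are hypothetical. It drops repeated and missing IDs while keeping them in the order they appeared on the page, and returns each ID paired with its URL so the two cannot drift out of step.

```python
def build_recipe_urls(raw_ids, base="https://allrecipes.com/recipe/"):
    """Return (recipe_id, url) pairs, de-duplicated in first-seen order.

    Hypothetical helper: raw_ids is assumed to be the list of data-id
    strings read from the page; None or empty entries are skipped.
    """
    # dict.fromkeys keeps the first occurrence of each key and preserves order
    unique_ids = list(dict.fromkeys(rid for rid in raw_ids if rid))
    return [(rid, base + str(rid)) for rid in unique_ids]


# Made-up sample IDs, including a repeat and a missing value
pairs = build_recipe_urls(["101", "202", "101", None, "9"])
for recipe_id, url in pairs:
    print(recipe_id, url)
```

Each pair could then be passed to the scraping loop in place of the separate ID and URL lists.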