0 verbose cli C:\Program Files\nodejs\node.exe C:\Program Files\nodejs\node_modules\npm\bin\npm-cli.js
1 info using npm@10.2.4
2 info using node@v20.11.0
3 timing npm:load:whichnode Completed in 5ms
4 timing config:load:defaults Completed in 6ms
5 timing config:load:file:C:\Program Files\nodejs\node_modules\npm\npmrc Completed in 25ms
6 timing config:load:builtin Completed in 27ms
7 timing config:load:cli Completed in 6ms
8 timing config:load:env Completed in 1ms
9 timing config:load:file:G:\Other computers\My Laptop\Documents\Work\Coding\st_component-template_2\template\my_component\frontend\.npmrc Completed in 1ms
def scrape_article_page(govuk_string, article_partial_url):
    target_url = target_url_stub + article_partial_url
    r = requests.get(target_url, headers={'User-agent': 'Mozilla/5.0'})  # Gov.uk might require headers on the request (unconfirmed) # noqa: E501
    if r.status_code == 200:
        soup = BeautifulSoup(r.content, features='html.parser')
        # Grab article type and title from the title section
        title_section = soup.find('div', 'gem-c-title')
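
The preview above stops right after locating the `gem-c-title` div, so how the article type and title are read out of it is not shown. As a rough illustration only, the sketch below pulls the text fragments out of such a div. It deliberately swaps BeautifulSoup for the standard-library `html.parser` so it runs without third-party packages. `TitleSectionParser`, `extract_title_parts`, and the sample markup (a context `<span>` followed by an `<h1>`) are all hypothetical and are not taken from the original script or confirmed against gov.uk pages.

```python
from html.parser import HTMLParser


class TitleSectionParser(HTMLParser):
    """Collect text inside the first <div> whose class list has a target class."""

    def __init__(self, target_class="gem-c-title"):
        super().__init__()
        self.target_class = target_class
        self.depth = 0       # <div> nesting depth while inside the target div
        self.done = False    # True once the target div has been closed
        self.chunks = []     # non-empty text fragments found in the target div

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth > 0:
            # A nested div inside the target: track it so we close correctly.
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.chunks.append(text)


def extract_title_parts(html):
    """Return the text fragments found in the title section, in document order."""
    parser = TitleSectionParser()
    parser.feed(html)
    return parser.chunks


# Hypothetical markup, assumed for illustration only.
sample = (
    '<div class="gem-c-title">'
    '<span class="gem-c-title__context">Press release</span>'
    '<h1>Example title</h1>'
    '</div><div>Other content</div>'
)
parts = extract_title_parts(sample)
```

With BeautifulSoup, as in the original, the same idea would be a matter of calling `find` on `title_section` for the child elements; which child tags and classes the real pages use would need checking against a live page.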
def scrape_news_and_communications_page(
    govuk_string, page_number, article_date_min, article_date_max
):
    target_url = (
        target_url_stub +
        target_url_newscommssnippet +
        target_url_pagesnippet +
        str(page_number) +
        target_url_peoplesnippet +
        govuk_string +
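
The function above assembles the search URL by concatenating string snippets, and the preview cuts off before the expression ends. As an alternative sketch, not the author's method, the same kind of URL can be built with `urllib.parse.urlencode`, which handles escaping of the query string. The values of `BASE_URL` and `SEARCH_PATH`, the query parameter names `page` and `people[]`, and the helper `build_search_url` are assumptions made for illustration; the original `target_url_*` constants are not shown in the source, and the date-range arguments are left out because their URL form is not visible either.

```python
from urllib.parse import urlencode

# Assumed values: the original target_url_* snippets are not visible in the source.
BASE_URL = "https://www.gov.uk"
SEARCH_PATH = "/search/news-and-communications"


def build_search_url(govuk_string, page_number):
    """Build a paginated search URL for articles tagged with one person.

    The parameter names 'page' and 'people[]' are assumptions, not confirmed
    against the original script.
    """
    query = urlencode({"page": page_number, "people[]": govuk_string})
    return f"{BASE_URL}{SEARCH_PATH}?{query}"


# 'jane-doe' is a made-up identifier used only to show the call.
example_url = build_search_url("jane-doe", 2)
```

Building the query from a dict keeps each parameter in one place and avoids hand-escaping characters such as the square brackets in `people[]`.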
# %%
# #!/usr/bin/env python
# -*- coding: utf-8 -*-
'''
    Purpose
        Scrape articles in which ministers are tagged and save them to SQL
    Inputs
        - SQL: temp.minister_govukid
        - Web: gov.uk pages
height: 740
scrolling: yes
border: no