View tableau-scraper.py
import requests
from bs4 import BeautifulSoup
import json
url_stub = "https://results.mo.gov"
workbook_url = f"{url_stub}/t/COVID19/views/VaccinationsDashboard/Vaccinations"
workbook_params = {
':embed': 'y',
@gordonje
gordonje / scrape_in_parallel.py
Last active Feb 19, 2020
A scraping script that runs in multiple, parallel processes
View scrape_in_parallel.py
import requests
from time import sleep
from multiprocessing import Pool
session = None
def set_global_session():
    global session
    if not session:
        session = requests.Session()
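
The preview above is cut off after the lazy, per-process session setup. As a minimal sketch of how such a function is commonly wired into `multiprocessing.Pool` as a worker initializer, consider the code below. It is an assumption about the pattern, not the remainder of the author's script: `set_global_opener`, `build_url`, `cache_page`, `scrape_all` and the `example.com` URL are hypothetical, and the standard library's `urllib.request` opener stands in for `requests.Session` so the sketch has no third-party dependency.

```python
import urllib.request
from multiprocessing import Pool
from time import sleep

# One opener per worker process; stands in for requests.Session (assumption).
opener = None


def set_global_opener():
    """Create this process's opener once; Pool calls this in each worker."""
    global opener
    if opener is None:
        opener = urllib.request.build_opener()
    return opener


def build_url(identifier):
    # Hypothetical URL pattern, for illustration only.
    return f"https://example.com/detail?ID={identifier}"


def cache_page(identifier, delay=3):
    """Fetch one page with the per-process opener and save it to disk."""
    sleep(delay)  # throttle each worker between requests
    with opener.open(build_url(identifier), timeout=30) as response:
        body = response.read()
    with open(f"{identifier}.html", "wb") as f:
        f.write(body)
    return identifier


def scrape_all(identifiers, processes=4):
    """Fan the identifiers out across worker processes."""
    with Pool(processes, initializer=set_global_opener) as pool:
        return pool.map(cache_page, identifiers)
```

Calling `scrape_all(range(1, 11))` under an `if __name__ == "__main__":` guard would fetch ten pages with four workers. Each worker sleeps independently, so the overall request rate is roughly `processes` times that of a single process.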
@gordonje
gordonje / scrape.py
Last active Feb 19, 2020
A scraping script that runs as a single, synchronous process.
View scrape.py
import requests
from time import sleep
session = requests.Session()
def cache_page(identifier):
    sleep(3)
    url = f'https://mycourts.in.gov/PORP/Search/Detail?ID={identifier}'
    r = session.get(url)
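
The synchronous approach amounts to a throttled loop over identifiers. The sketch below is a hedged illustration rather than the rest of the author's script: `scrape_sequentially` and its `fetch` and `delay` parameters are hypothetical names, and `fetch` is injected so the loop can be exercised without network access (in the script itself it would presumably wrap `session.get`).

```python
from time import sleep


def scrape_sequentially(identifiers, fetch, delay=3.0):
    """Fetch pages one at a time, pausing `delay` seconds before each request.

    `fetch` is any callable that takes an identifier and returns the page
    body (hypothetical; e.g. a thin wrapper around session.get).
    """
    pages = {}
    for identifier in identifiers:
        sleep(delay)  # same politeness pause as the preview's sleep(3)
        pages[identifier] = fetch(identifier)
    return pages
```

With a stand-in fetcher, `scrape_sequentially([1, 2], lambda i: f"page {i}", delay=0)` returns `{1: 'page 1', 2: 'page 2'}`. Total runtime grows linearly with the number of identifiers, which is the cost a multi-process variant trades away.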
@gordonje
gordonje / warrenmayer-youtube-dl-best-merge.sh
Last active Oct 23, 2019
Command, options and arguments for downloading "best" video and audio file formats from warrenmayer channel on YouTube (and merging if necessary)
View warrenmayer-youtube-dl-best-merge.sh
youtube-dl --write-info-json --all-subs --write-all-thumbnails \
-o '~/Desktop/warrenmayer-best-merge/%(title)s-%(id)s/%(title)s-%(id)s.%(ext)s' \
https://www.youtube.com/user/warrenmayer/
@gordonje
gordonje / warrenmayer-youtube-dl-best-mp4.sh
Last active Oct 23, 2019
Command, options and arguments for downloading "best" compatible file formats from warrenmayer channel on YouTube (but only in mp4 container)
View warrenmayer-youtube-dl-best-mp4.sh
youtube-dl --write-info-json --all-subs --write-all-thumbnails \
-f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]' \
-o '~/Desktop/warrenmayer-best-mp4/%(title)s-%(id)s/%(title)s-%(id)s.%(ext)s' \
https://www.youtube.com/user/warrenmayer/
@gordonje
gordonje / warrenmayer-youtube-dl-best-single.sh
Last active Oct 23, 2019
Command, options and arguments for downloading "best" quality media, served as a single file, from warrenmayer channel on YouTube
View warrenmayer-youtube-dl-best-single.sh
youtube-dl --write-info-json --all-subs --write-all-thumbnails -f best \
-o '~/Desktop/warrenmayer-best-single/%(title)s-%(id)s/%(title)s-%(id)s.%(ext)s' \
https://www.youtube.com/user/warrenmayer/
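
The three youtube-dl commands above differ only in the `-f` format selector and the output directory. As a hedged sketch of that relationship, the Python below builds the argument list for each variant without executing anything; `build_command`, `VARIANTS` and `CHANNEL_URL` are hypothetical names introduced for illustration. When `-f` is omitted, youtube-dl's documented default selector is `bestvideo+bestaudio/best`, which merges the two streams when ffmpeg or avconv is available.

```python
CHANNEL_URL = "https://www.youtube.com/user/warrenmayer/"

# Variant name -> format selector (None means "omit -f and use the default").
VARIANTS = {
    "best-merge": None,
    "best-mp4": "bestvideo[ext=mp4]+bestaudio[ext=m4a]",
    "best-single": "best",
}


def build_command(variant):
    """Return the youtube-dl argument list for one of the three variants."""
    selector = VARIANTS[variant]
    command = ["youtube-dl", "--write-info-json", "--all-subs",
               "--write-all-thumbnails"]
    if selector is not None:
        command += ["-f", selector]
    # Output template mirrors the shell commands; only the folder name varies.
    template = (f"~/Desktop/warrenmayer-{variant}/"
                "%(title)s-%(id)s/%(title)s-%(id)s.%(ext)s")
    command += ["-o", template, CHANNEL_URL]
    return command
```

Passing the list to `subprocess.run(build_command("best-mp4"), check=True)` would run the same download as the shell version, with no shell quoting needed.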
View link_types_of_duplicate_links.sql
@gordonje
gordonje / create_candidate_committees.sql
Created Sep 22, 2016
Create distinct candidate committees links
View create_candidate_committees.sql
CREATE TABLE calaccess_processed_candidate_committees AS
SELECT
    @link_type as link_group_id,
    lu."CODE_DESC" AS link_type_description,
    cand_filer_id,
    committee_filer_id,
    MIN(session) AS first_session,
    MAX(session) AS last_session,
    MIN(effective_date) AS first_effective_date,
    MAX(effective_date) AS last_effective_date,
@gordonje
gordonje / filer_pairs_where_other_is_multiple_types.sql
Created Sep 22, 2016
Filer pairs where the non-candidate filer has multiple types
View filer_pairs_where_other_is_multiple_types.sql
SELECT
    @link_type as link_group,
    cand_filer_id,
    other_filer_id,
    COUNT(DISTINCT other_filer_type) as filer_type_count
FROM (
    SELECT
        links."FILER_ID_A" AS cand_filer_id,
        links."FILER_ID_B" AS other_filer_id,
        "LINK_TYPE" as link_type,
@gordonje
gordonje / filer_a_and_b_type_combos.sql
Created Sep 21, 2016
Combinations of filer A and filer B types
View filer_a_and_b_type_combos.sql
SELECT
    a_types."DESCRIPTION" AS filer_a_type,
    b_types."DESCRIPTION" AS filer_b_type,
    COUNT(DISTINCT links.id)
FROM "FILER_LINKS_CD" links
JOIN (
    SELECT "FILER_ID", "DESCRIPTION"
    FROM "FILER_TO_FILER_TYPE_CD" f2ft
    JOIN "FILER_TYPES_CD" ft
    ON f2ft."FILER_TYPE" = ft."FILER_TYPE"