miraculixx/example.MD

## example.MD

      
How to run this

(output as of September 29, 2023)
$ python scraper.py
Does flying slower actually save fuel?
Is non-consented video recording admissable evidence in a civil trial in Maryland?
Iteration counts of AMG solver changes in parallel
Two switch flyback converter MOSFETs voltage stress
Inconsistency in index contraction
Airline forcibly changed return flight destination city over a month in advance. Are we eligible for compensation?
Blender Accessibility Features
Daisy chaining APs or connect them into the central router?
I (rev)?(pal)? the source code, you (rev)?(pal)? the input!
Does a company have to have your login information to verify your identity?
Is it ok to use std::ignore in order to discard a return value of a function to avoid any related compiler warnings?
verifying the Taylor expansion of ln(1+x) satisfies the properties of logarithm
What do we know about Andy Kaufman's SNL audition?
Science fiction story where a human is searching for immortality and meets an alien that has been searching for thousands of years
"bieten" with the meaning of "to ensure"
Funny Numbers :D
What is meant by software and hardware implementations of cryptograpic schemes? How to do it?
Does interspecies breastfeeding occur in the wild?
Where should I stop in this intersection when turning left?
Why does ranges::for_each return the function?
Do any power loads require both power lines disconnected by the "off" switch?
Probability Puzzle from a Quant Interview
Is there a resource for learning to read mathematical notation/equations/formulae?
My Medieval kingdom has birth control, why is the population so high?
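
scraper.py keeps one headless Chrome instance per worker thread by storing it on a `threading.local()` object, so the five `ThreadPool` workers never share a driver. The sketch below isolates that pattern so it can be run without Selenium or a browser. `FakeDriver`, `visit`, `run`, and the `created` list are hypothetical names introduced only for this illustration; the original script uses `webdriver.Chrome` and does not track or quit its drivers.

```python
import threading
from multiprocessing.pool import ThreadPool

thread_local = threading.local()
created = []                      # every stand-in driver ever built (assumption: not in scraper.py)
created_lock = threading.Lock()   # guards `created`, which is shared across threads


class FakeDriver:
    """Hypothetical stand-in for webdriver.Chrome; not a Selenium class."""

    def __init__(self):
        self.current_url = None
        self.closed = False

    def get(self, url):
        # A real driver would load the page here.
        self.current_url = url

    def quit(self):
        self.closed = True


def get_driver():
    # Reuse the driver already attached to the calling thread, if there is one.
    driver = getattr(thread_local, "driver", None)
    if driver is None:
        driver = FakeDriver()
        thread_local.driver = driver
        with created_lock:
            created.append(driver)
    return driver


def visit(url):
    driver = get_driver()
    driver.get(url)
    return driver.current_url


def run(urls, workers=5):
    # map() returns results in input order, even though threads interleave.
    with ThreadPool(workers) as pool:
        results = pool.map(visit, urls)
    # Close every driver once all work is done.
    for driver in created:
        driver.quit()
    return results


if __name__ == "__main__":
    pages = ["https://example.com/q/%d" % i for i in range(20)]
    print(len(run(pages)), "pages visited with", len(created), "drivers")
```

Because each worker thread builds its driver lazily on first use, the number of drivers never exceeds the pool size, however many URLs are mapped. Recording the drivers in a shared list is one possible way to quit them at the end; with real Chrome instances, skipping that step can leave browser processes running after the script exits.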


## requirements.txt
beautifulsoup4
lxml
requests
selenium
urllib3

## scraper.py
# answer to https://stackoverflow.com/q/53475578/890242
import requests
from urllib.parse import urljoin
from multiprocessing.pool import ThreadPool
from bs4 import BeautifulSoup
from selenium import webdriver
import threading

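def get_links(link):
  # Fetch the listing page and return absolute URLs of the questions on it.
  res = requests.get(link)
  soup = BeautifulSoup(res.text,"lxml")
  # Resolve each href against the page that was fetched (`link`), not a global.
  links = [str(urljoin(link,items.get("href"))) for items in soup.select(".question-hyperlink")]
  return links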

threadLocal = threading.local()

def get_driver():
  # Return this thread's headless Chrome driver, creating it on first use.
  driver = getattr(threadLocal, 'driver', None)
  if driver is None:
    chromeOptions = webdriver.ChromeOptions()
    chromeOptions.add_argument("--headless")
    driver = webdriver.Chrome(options=chromeOptions)
    setattr(threadLocal, 'driver', driver)
  return driver


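def get_title(url):
  # Load the question page in this thread's driver and print its title.
  driver = get_driver()
  driver.get(url)
  sauce = BeautifulSoup(driver.page_source,"lxml")
  item = sauce.select_one("h1 a")
  # Skip pages without the expected title link instead of raising AttributeError.
  if item is not None:
    print(item.text)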

if __name__ == '__main__':
  url = "https://stackoverflow.com/questions/tagged/web-scraping"
  ThreadPool(5).map(get_title, get_links(url))