@brianpursley
brianpursley / scrape.py
Last active April 30, 2021 20:57
Python script to extract a price from a product web page
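from bs4 import BeautifulSoup
from urllib.request import Request, urlopen  # Python 3 home of Request/urlopen (urllib2 in Python 2)
import decimal

def findPrice(url, selector):
    # Send a browser-like User-Agent; some sites reject the default Python one.
    userAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36"
    req = Request(url, None, {'User-Agent': userAgent})
    html = urlopen(req).read()
    soup = BeautifulSoup(html, "lxml")
    # Take the first element matching the CSS selector and parse its leading text as a price.
    return decimal.Decimal(soup.select(selector)[0].contents[0].strip().strip("$"))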
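The final line of findPrice assumes the matched text is a bare number with an optional dollar sign, so a value such as "$1,299.00" would fail to convert. Below is a minimal sketch of a more tolerant parsing step. The helper name parse_price is hypothetical and not part of the gist, and it assumes a "." decimal point with "," used only as a thousands separator.

```python
from decimal import Decimal, InvalidOperation
import re


def parse_price(text):
    """Hypothetical helper: turn text such as ' $1,299.00 ' into a Decimal.

    Assumes '.' is the decimal point and ',' is only a thousands separator.
    """
    # Keep digits and the decimal point; this drops '$', commas and whitespace.
    cleaned = re.sub(r"[^0-9.]", "", text)
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        # Raised for an empty string or malformed numbers such as '1.2.3'.
        raise ValueError("no price found in %r" % text)
```

It could stand in for the .strip().strip("$") chain, for example parse_price(soup.select(selector)[0].contents[0]), provided the first child of the matched element is a text node.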
@azhawkes
azhawkes / spider.sh
Created January 13, 2014 18:00
Really simple wget spider to obtain a list of URLs on a website, by crawling n levels deep from a starting page.
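#!/bin/bash
# Starting page for the crawl. Named START_URL so the script does not clobber the shell's own $HOME.
START_URL="http://www.yourdomain.com/some/page"
DOMAINS="yourdomain.com"
DEPTH=2
OUTPUT="./urls.csv"
# wget logs each request as "--<date> <time>--  <url>"; keep those lines, print the URL field,
# drop static assets, and de-duplicate.
wget -r --spider --delete-after --force-html -D "$DOMAINS" -l "$DEPTH" "$START_URL" 2>&1 \
| grep '^--' | awk '{ print $3 }' | grep -v '\.\(css\|js\|png\|gif\|jpg\)$' | sort | uniq > "$OUTPUT"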
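The filtering half of the pipeline (grep, awk, grep -v, sort, uniq) can also be read as a small text-processing function. The sketch below is a rough Python equivalent, assuming wget's log has already been captured as a string and that request lines look like "--<date> <time>--  <url>". The name extract_urls is hypothetical, and Python's sort order may differ from the locale-aware order of sort.

```python
import re

# Same static-asset extensions the shell pipeline filters out.
ASSET_RE = re.compile(r"\.(css|js|png|gif|jpg)$")


def extract_urls(log_text):
    """Hypothetical helper: sorted, unique page URLs from captured wget log text."""
    urls = set()
    for line in log_text.splitlines():
        if not line.startswith("--"):      # grep '^--'
            continue
        fields = line.split()
        if len(fields) < 3:                # request lines have date, time and URL
            continue
        url = fields[2]                    # awk '{ print $3 }'
        if not ASSET_RE.search(url):       # grep -v on asset extensions
            urls.add(url)
    return sorted(urls)                    # sort | uniq
```

Writing one URL per line from the returned list reproduces what the script stores in urls.csv.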

tmux cheatsheet

As configured in my dotfiles.

start new:

tmux

start new with session name: