jadedgnome

## createAccount.sh
#!/bin/bash

clear

trap "" SIGHUP SIGINT SIGTERM SIGTSTP

# get username, check whether it's taken and whether it's the proper length
while true
do
	echo -n "Create username: "

## gist:02749f02f9f795a5c80f

      
jadedgnome / gist:02749f02f9f795a5c80f
Last active August 29, 2015 14:17

    This American Life limits their podcast feed to only the most recently aired episode, but you can download every episode (or a range) using a one-liner like this:
for i in {1..600};do wget http://audio.thisamericanlife.org/jomamashouse/ismymamashouse/$i.mp3 ;done
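
A slightly more flexible variant of the one-liner above, as a minimal sketch: it only generates the URLs for an inclusive episode range, so the download step can be resumed or swapped out. The function name `episode_urls` is made up for illustration; the URL pattern is the one from the one-liner.

```shell
#!/bin/bash
# Print one download URL per episode number in the inclusive range START..END.
# episode_urls is a hypothetical helper name, not part of any tool.
episode_urls() {
    local start=$1 end=$2 i
    for ((i = start; i <= end; i++)); do
        echo "http://audio.thisamericanlife.org/jomamashouse/ismymamashouse/$i.mp3"
    done
}

# Example (needs network access, so it is left commented out):
# wget reads the URL list from stdin and --continue resumes partial files.
# episode_urls 1 600 | wget --continue --input-file=-
```

Keeping URL generation separate from the download means the list can be inspected, or piped to another downloader, before anything is fetched.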

  
## gist:dada7e90a67cdb4704a4

      
jadedgnome / gist:dada7e90a67cdb4704a4
Last active August 29, 2015 14:15
Forked from ChickenProp/gist:3037292

    After installing Arch on my Raspberry Pi, internet worked out of the box: I could plug it into the router, turn it on, ssh in and start downloading things. But the router is in my housemate's bedroom, which isn't ideal. If I want the Pi to be connected to the internet in my room, I need it to be connected to my laptop. (Another option would be a USB wifi dongle, of course.) This is how I did it. Much credit goes to the Ubuntu wiki's Connection sharing page.
I should disclaim that I don't fully understand networking stuff, and some of what I say might be wrong. I also didn't write this as I was going; so while I've consulted my browser and shell histories, it's possible I've forgotten some steps.
My laptop is running Gentoo, and this is where most of the work has to be done. It connects to the internet through wifi, on interface wlan0. The ethernet port is eth0, and eth0 is also the name of the ethernet port on the Pi.
Step zero: plug ev

  
## node-and-npm-in-30-seconds.sh
echo 'export PATH=$HOME/local/bin:$PATH' >> ~/.bashrc
. ~/.bashrc
mkdir -p ~/local
mkdir -p ~/node-latest-install
cd ~/node-latest-install
curl http://nodejs.org/dist/node-latest.tar.gz | tar xz --strip-components=1
./configure --prefix="$HOME/local" # use $HOME: bash does not expand ~ after --prefix=
make install # ok, fine, this step probably takes more than 30 seconds...
curl https://www.npmjs.org/install.sh | sh

## rsync_parallel.sh
#!/bin/bash
set -e

# Usage:
#   rsync_parallel.sh [--parallel=N] [rsync args...]
#
# Options:
#   --parallel=N	Use N parallel processes for transfer. Defaults to 10.
#
# Notes:

## download_site.sh
wget \
     --recursive \
     --no-clobber \
     --page-requisites \
     --html-extension \
     --convert-links \
     --restrict-file-names=windows \
     --domains domain.com \
     --no-parent \
         domain.com

## wget-spider.md

      
jadedgnome / wget-spider.md
Last active August 29, 2015 14:13
Forked from fedecarg/wget-spider.md

    Extract links from a BBC responsive site

DOMAIN="m.bbc.co.uk"
SERVICE="hindi"
HTTP_USER_AGENT="Mozilla/5.0 (iPhone; Mobile; AppleWebKit; Safari)"
EXCLUDE_EXTENSIONS="\.\(txt\|css\|js\|png\|gif\|jpg\)$"
MAX_DEPTH="3"

wget --spider --no-directories --no-parent --force-html --recursive \
--level="$MAX_DEPTH" --no-clobber \

  
## wget_spider_https
http://addictivecode.org/FrequentlyAskedQuestions

To spider a site as a logged-in user:

1. Post the form data required to log in (--post-data): _every_ named input in the form, even if it has no value.
2. Save the cookies that get generated (--save-cookies), including session cookies (--keep-session-cookies), which are not saved when --save-cookies alone is specified.
3. Load the cookies, continue saving the session cookies, and recursively (-r) spider (--spider) the site, ignoring (-R) /logout.

# log in and save the cookies
wget --post-data='username=my_username&password=my_password&next=' --save-cookies=cookies.txt --keep-session-cookies https://foobar.com/login
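
The third step in the list above could look roughly like the following. This is a sketch under assumptions, not the original author's command: it only prints the wget arguments, one per line, so they can be checked before anything is run. `spider_command` is a hypothetical helper name, and `cookies.txt` and `https://foobar.com/` are the placeholders from the login example.

```shell
#!/bin/bash
# Print the wget invocation for the spidering step, one argument per line.
# spider_command is a hypothetical helper; it does not contact the network.
spider_command() {
    local cookie_jar=$1 start_url=$2
    printf '%s\n' wget \
        "--load-cookies=$cookie_jar" \
        "--save-cookies=$cookie_jar" \
        --keep-session-cookies \
        --recursive \
        --spider \
        -R /logout \
        "$start_url"
}

# Example (needs network access and a prior login, so it is commented out):
# mapfile -t cmd < <(spider_command cookies.txt https://foobar.com/)
# "${cmd[@]}"
```

Whether `-R /logout` actually keeps wget away from the logout link depends on how the site builds its URLs, so it is worth checking the log after a first run.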

## gist:5c8592d48108e18d3de0
wget --spider -o wget.log -e robots=off --wait 3 -r -p -S http://

grep -i 'http://' wget.log | grep -E -v '(files/|\.jpg|\.jpeg|\.gif|\.css|\.js|\.pdf|\.png|\.xls)' | awk '{print $3}' | sort -u > site_map.txt

grep -i -E -v '(\.jpg|\.jpeg|\.gif|\.css|\.js|\.pdf|\.png|\.xls|\.ico|\.txt|\.doc|yandexbot|googlebot|YandexDirect|\/upload\/|" 404 |" 301 |" 302 )' "$1" | perl -MURI::Escape -lne 'print uri_unescape($_)' | grep yandsearch | awk '{print $1}' | sort -u | wc -l

## spider.sh
#!/bin/bash

# Named START_URL rather than HOME so the shell's own $HOME is not overwritten.
START_URL="http://www.yourdomain.com/some/page"
DOMAINS="yourdomain.com"
DEPTH=2
OUTPUT="./urls.csv"

wget -r --spider --delete-after --force-html -D "$DOMAINS" -l "$DEPTH" "$START_URL" 2>&1 \
    | grep '^--' | awk '{ print $3 }' | grep -v '\.\(css\|js\|png\|gif\|jpg\)$' | sort | uniq > "$OUTPUT"
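
The filtering half of the pipeline above can be tried without touching the network. A minimal sketch, assuming wget's usual log lines of the form `--DATE TIME--  URL`; `extract_urls` is a hypothetical name, and it uses `grep -E` instead of the backslash-escaped alternation.

```shell
#!/bin/bash
# Read wget log text on stdin and print the unique page URLs,
# dropping common static assets. extract_urls is a hypothetical helper.
extract_urls() {
    grep '^--' \
        | awk '{ print $3 }' \
        | grep -E -v '\.(css|js|png|gif|jpg)$' \
        | sort -u
}

# Example (needs network access, so it is commented out):
# wget -r --spider --force-html -l 2 "http://www.yourdomain.com/some/page" 2>&1 \
#     | extract_urls > urls.csv
```

Splitting the filter out this way makes it easy to feed it a saved log and adjust the extension list before running a full crawl.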