ColeMundus / spider.sh
Last active November 12, 2017 05:32 — forked from azhawkes/spider.sh
A really simple wget spider that obtains a list of URLs on a website by crawling n levels deep from a starting page.
#!/bin/bash
# Crawl a site with wget in spider mode and collect the unique /album/
# URLs it discovers into a CSV file.
START_URL="http://listen.tidal.com"   # starting page (avoid reusing HOME, which wget consults for ~/.wgetrc)
DOMAINS="listen.tidal.com"
OUTPUT="./urls.csv"
wget -r --spider --delete-after --force-html -D "$DOMAINS" "$START_URL" 2>&1 \
| grep '^--' | awk '{ print $3 }' | grep -v '\.\(css\|js\|png\|gif\|jpg\)$' | grep '/album/' | sort -u > "$OUTPUT"
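
The description mentions crawling n levels deep, but the script never sets a depth, so wget falls back to its default recursion depth of 5. A minimal sketch of pinning the depth explicitly with wget's -l flag (the depth of 3 and the output path are just example values):

#!/bin/bash
# Same spider, with the crawl depth capped explicitly; -l 3 stops wget
# from following links more than three levels from the start page.
wget -r -l 3 --spider --delete-after --force-html \
     -D "listen.tidal.com" "http://listen.tidal.com" 2>&1 \
| grep '^--' | awk '{ print $3 }' | sort -u > urls.csv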
ColeMundus / TheEyeFAQ.md
Last active November 5, 2017 23:44 — forked from PurpleBooth/README-Template.md
A template to make a good README.md

The Eye FAQ

The Eye is a website dedicated to archiving and serving publicly available information.

We currently host large-scale datasets such as Reddit archives, old console video games, operating systems, and old software installation files.

How do I download these files?

The easiest way is to use wget; you can find a guide for using wget here
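
For example, mirroring a single directory from the archive might look like the sketch below; the URL is a placeholder for whatever directory you actually want, and the flags are standard wget options (-np keeps wget from climbing to the parent directory, -nH drops the hostname from the local path, -R skips the server-generated index pages):

#!/bin/bash
# Hedged sketch: recursively download one directory listing.
# The URL is a placeholder; substitute the real directory path.
wget -r -np -nH -R "index.html*" "https://the-eye.eu/public/example-directory/"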