@gabemarshall
Created August 23, 2017 17:53
Script to download results from the Wayback Machine and do some rough parsing
#!/bin/bash
# Requires httpie and jq
#### Settings ####
read -r -p "What domain would you like to search the wayback machine for? " domain
# Ask the CDX API for every URL captured under the domain, excluding 4xx/5xx captures
http --download --output="$domain.json" "https://web.archive.org/cdx/search?url=$domain%2F&matchType=prefix&collapse=urlkey&output=json&fl=original%2Cmimetype%2Ctimestamp%2Cendtimestamp%2Cgroupcount%2Cuniqcount&filter=!statuscode%3A%5B45%5D..&_=1498608272486"
# Keep the values that contain a URL, drop common static assets, and strip the JSON quotes
jq '.[][]' "$domain.json" | grep 'http' | grep -v -i -e '\.js' -e '\.gif' -e '\.png' -e '\.jpg' -e '\.jpeg' -e '\.css' | cut -d '"' -f2 > "$domain.txt"
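
The parsing step can also be split into two small functions, which makes the filter easier to adjust. This is a minimal sketch, not part of the original gist: the names `extract_urls` and `drop_static_assets` are hypothetical, and it assumes the downloaded JSON is an array of rows whose first row holds the field names and whose first field is `original`, as requested by the `fl=` parameter above.

```shell
#!/bin/bash
# Sketch only: extract_urls and drop_static_assets are hypothetical helper names.

# Print the first field ("original") of every CDX row, skipping the header row.
# Assumes the JSON is an array of arrays whose first row holds the field names.
extract_urls() {
  jq -r '.[1:][] | .[0]' "$1"
}

# Read URLs on stdin and drop those that end in a static-asset extension.
# The extension must be followed by end of line, "?" or "#", so ".json" is kept.
drop_static_assets() {
  grep -E -i -v '\.(js|gif|png|jpe?g|css)([?#]|$)'
}

# Usage, assuming example.com.json was downloaded by the script above:
#   extract_urls example.com.json | drop_static_assets > example.com.txt
```

Unlike the `grep -v` chain in the script, this filter anchors the extension, so URLs such as `data.json` or `page.jsp` are kept. Whether that is desirable depends on what you are looking for.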