import json
import sys

import requests


def waybackurls(host, with_subs):
    """Return the URLs the Wayback Machine CDX API has archived for host."""
    if with_subs:
        url = 'http://web.archive.org/cdx/search/cdx?url=*.%s/*&output=json&fl=original&collapse=urlkey' % host
    else:
        url = 'http://web.archive.org/cdx/search/cdx?url=%s/*&output=json&fl=original&collapse=urlkey' % host
    r = requests.get(url)
    results = r.json()
    # The first row of the JSON output is the header row, so skip it.
    return results[1:]


if __name__ == '__main__':
    argc = len(sys.argv)
    if argc < 2:
        print('Usage:\n\tpython3 waybackurls.py <url> <include_subdomains:optional>')
        sys.exit()

    host = sys.argv[1]
    # Any second argument enables the subdomain search. The original check
    # (argc > 3) only triggered when a third argument was also given.
    with_subs = False
    if argc > 2:
        with_subs = True

    urls = waybackurls(host, with_subs)
    json_urls = json.dumps(urls)
    if urls:
        filename = '%s-waybackurls.json' % host
        with open(filename, 'w') as f:
            f.write(json_urls)
        print('[*] Saved results to %s' % filename)
    else:
        print('[-] Found nothing')
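The query construction and response handling in the script can be separated into two small helpers, which makes them easier to test without touching the network. This is only a sketch: `build_cdx_url` and `extract_urls` are hypothetical names that are not part of the gist, and it assumes the CDX server accepts a percent-encoded `url` parameter the same way it accepts the raw pattern.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(host, with_subs=False):
    """Build the CDX query URL with a URL-encoded query string (hypothetical helper)."""
    # "*.host/*" matches subdomains as well; "host/*" matches only the host itself.
    pattern = "*.%s/*" % host if with_subs else "%s/*" % host
    params = {
        "url": pattern,
        "output": "json",
        "fl": "original",
        "collapse": "urlkey",
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def extract_urls(rows):
    """Drop the header row and flatten [[url], [url], ...] into [url, url, ...]."""
    return [row[0] for row in rows[1:]]
```

With these in place, the body of `waybackurls` would reduce to something like `extract_urls(requests.get(build_cdx_url(host, with_subs)).json())`. Note that this returns a flat list of strings, whereas the original returns a list of one-element lists.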
How do I fix this issue?
kali@kali:~/Desktop/tools/waybackurls$ python3 WAYBACKTEST.py evil.com
Traceback (most recent call last):
  File "WAYBACKTEST.py", line 1, in <module>
    import requests
ModuleNotFoundError: No module named 'requests'
The requests module is not installed. Install it with:
pip3 install requests
A bash function that uses jq (not for subdomain search, but it works for any URL prefix). It prints the full Web Archive URL, which generally has the format https://web.archive.org/web/$TIMESTAMP/$ORIGINAL:
wb ()
{
    if [[ -z $1 ]]; then
        # Inside a function $0 is the shell's name, so print the function name instead.
        echo "Usage: wb URL";
    else
        curl "http://web.archive.org/cdx/search/cdx?url=$1/*&output=json&fl=original,timestamp" 2> /dev/null | jq '.[1:][] |"https://web.archive.org/web/" +.[1] + "/" + .[0]' 2> /dev/null;
    fi
}
This can be added to the ~/.bashrc or relevant shell profile.
Usage: wb gist.github.com/mhmdiaa
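For anyone who prefers to stay in Python, the jq filter above can be approximated as follows. This is a minimal sketch: `archive_urls` is a hypothetical helper name, and it assumes the CDX response was requested with `fl=original,timestamp`, so each row is an `[original, timestamp]` pair preceded by one header row.

```python
def archive_urls(rows):
    """Turn CDX rows of [original, timestamp] into full Wayback Machine URLs."""
    # rows[0] is the header row (["original", "timestamp"]), so skip it,
    # mirroring the `.[1:]` slice in the jq filter.
    return [
        "https://web.archive.org/web/%s/%s" % (timestamp, original)
        for original, timestamp in rows[1:]
    ]
```

The rows would come from the parsed JSON of the same CDX request the bash function makes with curl.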
Hi,
Just wanted to let you know that I used your idea in https://github.com/akamhy/waybackpy. [commit]
Usage:
pip3 install waybackpy
waybackpy --url akamhy.github.io --user_agent "my-user-agent" --known_urls
Output:
http://akamhy.github.io
https://akamhy.github.io/favicon.ico
https://akamhy.github.io/robots.txt
https://akamhy.github.io/waybackpy/
https://akamhy.github.io/waybackpy/assets/css/style.css?v=a418a4e4641a1dbaad8f3bfbf293fad21a75ff11
https://akamhy.github.io/waybackpy/assets/css/style.css?v=f881705d00bf47b5bf0c58808efe29eecba2226c
6 URLs found and saved in ./akamhy.github.io-6-urls.txt
Flags:
- '--alive' fetches only URLs that are not dead. It is slower for websites with many archived URLs, e.g. google.
- '--subdomain' includes URLs from subdomains.
See live use @ https://repl.it/@akamhy/Waybackpy-Known-Urls#main.sh
bruh how are you