@KathanP19
Created July 19, 2020 04:23
One-liner that crawls a list of sub-domains, extracts URLs with parameters, trims the result to at most five URLs per endpoint, and filters out URLs with unwanted file extensions.
cat sub-domains.txt | hakrawler | grep 'http' | cut -d ' ' -f 2 > crawling.txt && gau -subs http://domain.com >> crawling.txt && waybackurls http://domain.com >> crawling.txt && cat crawling.txt | grep "?" | unfurl --unique format "%s://%d%p" > base.txt ; cat base.txt | parallel -j 4 grep {} -m5 crawling.txt | tee final1.txt; cat final1.txt | egrep -iv "\.(jpg|jpeg|gif|css|tif|tiff|png|ttf|woff|woff2|ico|pdf|svg|txt|js)" > final.txt && rm -f base.txt final1.txt
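
For readability, here is the same pipeline expanded into commented steps. This is a sketch of what the one-liner does, not a drop-in replacement: tool output formats and flags (hakrawler's columns, gau's -subs) have changed across versions, so adjust to your installed builds.

# 1. Crawl every sub-domain in the list and keep only the URL column of hakrawler's output
cat sub-domains.txt | hakrawler | grep 'http' | cut -d ' ' -f 2 > crawling.txt

# 2. Add archived URLs for the target (gau queries several sources; waybackurls queries the Wayback Machine)
gau -subs http://domain.com >> crawling.txt
waybackurls http://domain.com >> crawling.txt

# 3. Keep URLs that carry parameters and reduce them to unique scheme://domain/path endpoints
grep "?" crawling.txt | unfurl --unique format "%s://%d%p" > base.txt

# 4. For each endpoint, pull back at most 5 full URLs from the crawl (grep -m5), 4 greps at a time
cat base.txt | parallel -j 4 grep {} -m5 crawling.txt | tee final1.txt

# 5. Drop static-asset extensions (note: the pattern is unanchored, so e.g. ".js" also drops ".json" URLs)
egrep -iv "\.(jpg|jpeg|gif|css|tif|tiff|png|ttf|woff|woff2|ico|pdf|svg|txt|js)" final1.txt > final.txt
rm -f base.txt final1.txt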

secfb commented Jul 26, 2020

I think it is more useful like this:
domain=testfire.net && cat sub-domains.txt | hakrawler | grep 'http' | cut -d ' ' -f 2 > crawling.txt && gau -subs "$domain" >> crawling.txt && waybackurls "$domain" >> crawling.txt && cat crawling.txt | grep "?" | unfurl --unique format "%s://%d%p" > base.txt ; cat base.txt | parallel -j 4 grep {} -m5 crawling.txt | tee final1.txt; cat final1.txt | egrep -iv "\.(jpg|jpeg|gif|css|tif|tiff|png|ttf|woff|woff2|ico|pdf|svg|txt|js)" > final.txt && rm -f base.txt final1.txt
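
If you are testing more than one root domain, the same variable extends naturally to a loop. A minimal sketch, assuming a hypothetical targets.txt with one apex domain per line:

# collect archived URLs for every domain listed in targets.txt (hypothetical file name)
while read -r domain; do
  gau -subs "$domain" >> crawling.txt
  waybackurls "$domain" >> crawling.txt
done < targets.txt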

KathanP19 (Author) commented

nice


brutexploiter commented Sep 6, 2022

Getting an error like this:
unknown shorthand flag: 's' in -subs

Edit: fixed by using --subs
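
For anyone hitting the same error: newer gau releases reject the single-dash form of long flags, so the archive step from the one-liner becomes (a sketch; $domain as set in the earlier comment):

gau --subs "$domain" >> crawling.txt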
