@emaia
Last active December 28, 2019 15:28
#!/bin/bash
# Usage: ./parse-url.sh URL/IP
# Downloads a page, extracts the hosts its links point to, and resolves them.
if [ -z "$1" ]
then
    echo "Usage: ./parse-url.sh URL/IP"
else
    # Get the HTML for the URL
    wget -O "$1.html" "$1"
    # Check that the download produced a non-empty file
    # (wget -O creates the output file even when the request fails)
    if [ -s "$1.html" ]
    then
        echo 'Starting scan and filter URLs...'
        # Filter the HTML to get a clean list of hosts
        grep -Eoi '<a [^>]+' "$1.html" |
        grep -Eoi 'href="[^"]+"' |
        grep -Eo 'https?://[^/"]+' |
        sed -E 's#^https?://##' |
        # Remove duplicates and save the filtered list in a file
        sort -u > tmp-url-list
        if [ -f tmp-url-list ]
        then
            # Delete the downloaded HTML file
            rm -f "$1.html"
            echo 'URL list saved successfully.'
            echo 'Starting to get hosts from the list...'
            # Run the host command for every item in the list
            while read -r url; do host "$url"; done < tmp-url-list |
            # Clean the output and save the results in a file
            grep "has address" | sed 's/ has address /;/' > "hosts-$1"
            if [ -f "hosts-$1" ]
            then
                rm -f tmp-url-list
                # Show the file content on the screen
                cat "hosts-$1"
                echo "Job done. Check file hosts-$1"
            fi
        fi
    fi
fi
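
The link-filtering step of the script can also be tried on its own, without downloading anything. The following is a minimal sketch under a few assumptions: the function name extract_hosts and the sample markup are made up for illustration and are not part of the original script, and only absolute http(s) links in double-quoted href attributes are considered.

```shell
#!/bin/bash
# extract_hosts: hypothetical helper, not part of the original script.
# Reads HTML on stdin and prints the unique hosts found in absolute
# http(s) links inside <a ... href="..."> attributes, one per line.
extract_hosts() {
    grep -Eoi '<a [^>]+' |
    grep -Eoi 'href="[^"]+"' |
    grep -Eo 'https?://[^/"]+' |
    sed -E 's#^https?://##' |
    sort -u
}

# Example input (made-up markup, for illustration only)
sample='<a href="http://example.com/page">one</a>
<a class="x" href="https://example.org">two</a>
<a href="http://example.com/other">three</a>
<a href="/relative/link">four</a>'

printf '%s\n' "$sample" | extract_hosts
```

For the sample above this prints example.com and example.org, one per line: the duplicate host is collapsed by sort -u and the relative link is skipped because it has no scheme.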