@asumansenol
Last active April 9, 2022 15:40
#!/bin/bash
# Script to check error rate of crawls
# Exit if we hit any errors
set -e
if [ "$#" -lt 1 ]; then
    echo "Usage: ./check_crawl.sh <server_location>"
    exit 1
fi
key=/home/asuman/.ssh
LOG_FILE=/root/web-inspector/220401_advanced_matching_100K*.log
INDEX_FILE=/root/web-inspector
HIDE_SINGLE_ERRORS=true
# Quote $1 so an empty or multi-word argument cannot break the test
if [ "$1" = 'nyc' ]
then
    host=root@143.198.182.175
else
    host=root@178.128.192.249
fi
printf '\nTail of log: \n'
printf '==================== \n'
ssh -o StrictHostKeyChecking=no -i $key $host 'tail -n 30 '$LOG_FILE
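# Note on the counts that follow: LOG_FILE contains a glob, so it may match
# several log files on the server. In that case `grep -c` prints one
# "file:count" line per file rather than a single number. A minimal sketch of
# a helper that adds such lines up; sum_counts is a hypothetical name, not part
# of the original script, and nothing in the script calls it.
sum_counts() {
    # Use the last colon-separated field of each input line as the count and
    # print the total; a bare number without a "file:" prefix works as well.
    awk -F: '{ total += $NF } END { print total + 0 }'
}
# Possible usage (an assumption, not the author's method):
#   ssh -o StrictHostKeyChecking=no -i $key $host 'grep -c "Crawl failed" '$LOG_FILE | sum_counts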
printf '\n\nTotal number of visits: \n'
printf '========================= \n'
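# grep -c exits with status 1 when nothing matches; accept that so set -e does not abort on a zero count
ssh -o StrictHostKeyChecking=no -i $key $host 'cat '$LOG_FILE' | grep -c "emailPasswordFields init took" || [ $? -eq 1 ]'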
printf '\n\nTotal number of successful visits: \n'
printf '========================= \n'
ssh -o StrictHostKeyChecking=no -i $key $host 'cat '$LOG_FILE' | grep "Processing" | grep "took" | wc -l'
printf '\n\nTotal number of sites where email was filled: \n'
printf '========================= \n'
# A zero count makes grep -c exit 1; tolerate it so set -e keeps the script running
ssh -o StrictHostKeyChecking=no -i $key $host 'grep -c "Successfully filled the email field" '$LOG_FILE' || [ $? -eq 1 ]'
printf '\n\nTotal number of sites where password was filled: \n'
printf '========================= \n'
ssh -o StrictHostKeyChecking=no -i $key $host 'grep -c "Successfully filled the password field" '$LOG_FILE' || [ $? -eq 1 ]'
printf '\n\nTotal number of sites with FB leaks: \n'
printf '========================= \n'
ssh -o StrictHostKeyChecking=no -i $key $host 'find . -name "*.json" | xargs grep "udff" | grep "SubscribedButtonClick" | grep "postData" | wc -l'
printf '\n\nTotal number of sites with TIKTOK leaks: \n'
printf '========================= \n'
ssh -o StrictHostKeyChecking=no -i $key $host 'find . -name "*.json" | xargs grep "auto_email" | grep "EnrichAM" | grep "postData" | wc -l'
printf '\n\nTotal number of timeouts: \n'
printf '========================= \n'
# A crawl with no timeouts or failures makes grep -c exit 1; do not let set -e treat that as an error
ssh -o StrictHostKeyChecking=no -i $key $host 'grep -c "Crawl failed Operation timed out Error: Operation timed out" '$LOG_FILE' || [ $? -eq 1 ]'
printf '\n\nTotal number of errors: \n'
printf '========================= \n'
ssh -o StrictHostKeyChecking=no -i $key $host 'grep -c "Crawl failed" '$LOG_FILE' || [ $? -eq 1 ]'
printf '\nDisk Usage:\n'
printf '=========== \n'
ssh -o StrictHostKeyChecking=no -i $key $host 'df -h | grep -v tmpfs | grep -v snap | grep -v efi | grep -v udev'
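# The header describes this as an error-rate check, but the script only prints
# raw counts. A minimal sketch of turning two of those counts into a
# percentage; error_rate is a hypothetical helper and is not called anywhere
# in the original script.
error_rate() {
    # $1 = number of failed visits, $2 = total number of visits.
    # Prints the failure percentage with two decimals, or "n/a" if total is 0.
    awk -v failed="$1" -v total="$2" 'BEGIN {
        if (total + 0 == 0) { print "n/a"; exit }
        printf "%.2f%%\n", 100 * failed / total
    }'
}
# Example: error_rate 25 1000 prints 2.50%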