#!/bin/bash
# Find a public Google group for a particular host.
# Some of these groups contain sensitive information.
# The tool runs against a list of hosts and returns all public groups.
# Usage: pass a file with one domain per line as the first argument.
while read -r domain; do
    # Follow the redirects; "overview" shows up in a Location header for public groups.
    if curl -LIs "https://groups.google.com/a/$domain" | grep -q "overview"; then
        echo "[+] https://groups.google.com/a/$domain/forum/#!overview"
    fi
done < "$1"
The results above are with 100 domains that are either private or 404. For those, curl makes 2 requests (1 redirect), not 3.
If you take 100 public domains, then the difference in run time is higher (3 requests / domain vs. 1 request).
Before:
./googlegroups_old.sh 100public 6,29s user 1,50s system 3% cpu 3:48,46 total
After:
./googlegroups.sh 100public 6,20s user 1,56s system 7% cpu 1:38,67 total
3:48 > 1:38
200 requests and more than 2 minutes saved, only on 100 domains.
Nice work, @milangfx! I wrote this merely as a proof of concept — not focusing on performance. If you really want performance, don't write a while loop in the script itself. Just have the script issue the requests and then run it using GNU parallel.
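A minimal sketch of that approach, assuming GNU parallel is installed. The script name check_group.sh and the function name check_domain are hypothetical choices for illustration; the check itself is the same redirect-and-grep test as the loop body of the original script.

```shell
#!/bin/bash
# check_group.sh (hypothetical name): check ONE domain per invocation,
# so that an external tool such as GNU parallel can do the looping.

check_domain() {
    local domain="$1"
    # Same test as the original loop body: follow redirects and look for "overview".
    if curl -LIs "https://groups.google.com/a/$domain" | grep -q "overview"; then
        echo "[+] https://groups.google.com/a/$domain/forum/#!overview"
    fi
}

# Only run a check when a domain is passed on the command line.
if [ "$#" -ge 1 ]; then
    check_domain "$1"
fi
```

It could then be run as something like `parallel -j 10 ./check_group.sh :::: domains.txt`, where domains.txt (a hypothetical file name) holds one domain per line and the job count of 10 is an arbitrary starting point.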
Thanks for the feedback. I was focusing on efficiency in the single-threaded case. Running the same requests in parallel would be faster, but each check would still be inefficient.
This script can run a lot faster.
Currently curl makes 3 requests per domain because of 2 redirects.
curl -LIs "https://groups.google.com/a/python.org"
This is not necessary. You can go straight to "/a/$domain/forum/#!overview" in 1 request:
curl -Is "https://groups.google.com/a/python.org/forum/#\!overview"
Without the redirects we don't even need the -L option.
How do you check if a Google Groups instance is public?
Grep for the "Set-Cookie" header instead of "overview", which appears in the "Location: " header during the redirect.
The Set-Cookie header is only present for public domains, for example:
github.com - exists, but private >> no output after grep
curl -Is "https://groups.google.com/a/github.com/forum/#\!overview" | grep "Set-Cookie"
github123.com - 404 >> no output after grep
curl -Is "https://groups.google.com/a/github123.com/forum/#\!overview" | grep "Set-Cookie"
python.org - public >> grep finds the Set-Cookie header
curl -Is "https://groups.google.com/a/python.org/forum/#\!overview" | grep "Set-Cookie"
Set-Cookie: NID=195=Xpe8xxxxxxxxxxxxxxxx9xir2CevaRUv-IpIT_4anvlJ-PUOISCeoRMJVvoRL5y1RDq8f_wEVxYhU; expires=Thu, 09-Jul-2020 20:16:05 GMT; path=/; domain=.google.com; HttpOnly
I've also changed the echoed URL to https://groups.google.com/a/$domain/forum/#!forumsearch/ so you go straight to the list of groups and don't have to click "Browse all" on the overview page.
So the final script looks something like this:
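Put together from the steps above, a minimal sketch could read as follows. It relies on the observation above that a Set-Cookie header appears only for public instances; the function name check_public is a hypothetical choice, and the exact original may differ.

```shell
#!/bin/bash
# Sketch: list public Google Groups instances for a file of domains.
# Assumption (from the discussion above): only public instances send Set-Cookie.

check_public() {
    local domain="$1"
    # Single HEAD request; -L is not needed because there are no redirects to follow.
    # grep -i: HTTP/2 responses carry lowercase header names, so match case-insensitively.
    if curl -Is "https://groups.google.com/a/$domain/forum/#!overview" | grep -qi "^set-cookie:"; then
        # Link straight to the list of groups instead of the overview page.
        echo "[+] https://groups.google.com/a/$domain/forum/#!forumsearch/"
    fi
}

# Read one domain per line from the file given as the first argument.
if [ "$#" -ge 1 ]; then
    while read -r domain; do
        check_public "$domain"
    done < "$1"
fi
```

Keeping the check in a function leaves the loop trivial and makes it easy to swap the loop for another driver later.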
Time test on 100 domains:
Before:
After:
2:36 > 1:34, shaved off a minute.