@Stantheman
Created March 31, 2012 17:08
One-Liner to get approximate size of remote Apache directory listing using wget and perl
wget -r -nd -np --spider http://URL_GOES_HERE 2>&1 | perl -ne '$size += $1 if /^Length: (\d+)/; END { print $size + 0, "\n" }'

This assumes that the remote Apache directory is served by the standard index module (mod_autoindex). Wget issues a HEAD request to every URL found in the listing, and Perl sums the Content-Length that wget reports on each "Length:" line.

I compared the output of this one-liner with the output of 'du -sb' on the target directory, and the two totals agreed to within 2 MB. The difference comes from the sorting links that a default Apache index offers, which wget also follows; it could be removed with a few additional lines. The target remote directory was nearly 3 GB in size and had 29 subdirectories.
