@bjornjohansen
bjornjohansen / sitemap-crawler.php
Last active August 14, 2023 18:19
Basic sitemap crawler to warm up a full page cache
#!/usr/bin/php
<?php
/**
* @license http://www.wtfpl.net/txt/copying/ WTFPL
*/
date_default_timezone_set( 'UTC' );
$sitemaps = array(
@suzannealdrich
suzannealdrich / wget.txt
Last active December 11, 2023 15:12
wget spider cache warmer
wget --spider -o wget.log -e robots=off -r -l 5 -p -S --header="X-Bypass-Cache: 1" --limit-rate=124k www.example.com
# Options explained
# --spider: Request pages without saving them to disk
# -o wget.log: Write wget's log messages to wget.log
# -e robots=off: Ignore robots.txt
# -r: Download recursively, following links
# -l 5: Maximum recursion depth. E.g. 1 means 'crawl the homepage and all pages it links to', 2 means 'also crawl the pages those pages link to', and so on
# -p: Also request the images, CSS, etc. needed to display each HTML page
# -S: Print the server's response headers
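
Because -S puts each server response in wget.log, the log can be summarized after a warm-up run. The following is a minimal sketch, not part of the original gist: `count_statuses` is a hypothetical helper name, and it assumes the status lines in the log look like `HTTP/1.1 200 OK`, which is how wget normally prints them with -S.

```shell
# Hypothetical helper (not from the original gist): count the HTTP status
# codes recorded in a wget log that was written with -S and -o.
# Assumption: status lines contain text such as "HTTP/1.1 200 OK".
count_statuses() {
    log_file="$1"
    # Extract "HTTP/<version> <code>" from each status line, tally the codes,
    # and sort so the output order is stable.
    grep -Eo 'HTTP/[0-9.]+ [0-9]{3}' "$log_file" \
        | awk '{ count[$2]++ } END { for (code in count) print code, count[code] }' \
        | sort
}
```

Calling `count_statuses wget.log` prints one line per status code followed by how many times it appeared, which makes it easy to spot 404s or 5xx errors hit during the crawl.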