Last active
November 4, 2020 04:17
PowerShell function to crawl sitemaps, see https://www.swimburger.net/blog/powershell/powershell-snippet-crawling-a-sitemap
Function CrawlSitemap
{
    Param(
        [parameter(Mandatory=$true)]
        [string] $SiteMapUrl
    );

    # Download the sitemap and parse it as XML.
    $SiteMapXml = Invoke-WebRequest -Uri $SiteMapUrl -UseBasicParsing -TimeoutSec 180;
    # Each child of <urlset> is a <url> entry.
    $Urls = ([xml]$SiteMapXml).urlset.ChildNodes
    ForEach ($Url in $Urls){
        $Loc = $Url.loc;
        try{
            # Request the page and print its status code next to the URL.
            $result = Invoke-WebRequest -Uri $Loc -UseBasicParsing -TimeoutSec 180;
            Write-Host $result.StatusCode - $Loc;
        }catch [System.Net.WebException] {
            # Windows PowerShell raises WebException for non-success responses; warn with the status code.
            Write-Warning (([int]$_.Exception.Response.StatusCode).ToString() + " - " + $Loc);
        }
    }
}
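
The function relies on PowerShell's XML adapter to walk from `urlset` to each `loc` value. As a minimal offline sketch of that step only, the snippet below parses a hypothetical two-entry sitemap (the `example.com` URLs and the `$SampleSitemap` and `$Locs` names are placeholders, not part of the original gist) and lists the URLs that would be requested.

```powershell
# Hypothetical sample sitemap; the example.com URLs are placeholders.
$SampleSitemap = @"
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>
"@

# Same navigation the function uses: cast to [xml], then read each <url> entry's <loc>.
$Locs = ([xml]$SampleSitemap).urlset.ChildNodes | ForEach-Object { $_.loc }

# Emit the URLs, one per line.
$Locs
```

To crawl a live site, the function itself would be called along the lines of `CrawlSitemap -SiteMapUrl "https://www.example.com/sitemap.xml"`, again with a placeholder URL.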