@AdamDimech
Last active May 24, 2023 17:50
Download all PDFs from a web page via PowerShell
# More information at https://code.adonline.id.au/download-all-pdfs-from-a-web-page/
function Grab-PDFs {
    # Load Windows Forms for the folder picker and message boxes
    Add-Type -AssemblyName System.Windows.Forms
    [System.Windows.Forms.Application]::EnableVisualStyles()
    $browse = New-Object System.Windows.Forms.FolderBrowserDialog
    $browse.SelectedPath = "C:\"
    $browse.ShowNewFolderButton = $false
    $browse.Description = "Select a directory"
    $loop = $true
    while ($loop)
    {
        if ($browse.ShowDialog() -eq "OK")
        {
            $loop = $false
            Set-Location $browse.SelectedPath
            # Scrape web page for PDFs
            $pageUrl = "http://www.example.com/path/to/pdfs"
            $psPage = Invoke-WebRequest -Uri $pageUrl -UseBasicParsing
            # Resolve each link against the page URL so relative hrefs become absolute URLs
            $urls = $psPage.Links | Where-Object { $_.href -like "*.pdf" } | ForEach-Object { [System.Uri]::new([System.Uri]$pageUrl, $_.href) }
            # Save each PDF under its own file name in the selected directory
            $urls | ForEach-Object { Invoke-WebRequest -Uri $_ -OutFile (Split-Path -Leaf $_.LocalPath) -UseBasicParsing }
            Write-Host "... PDF downloading is complete."
            [System.Windows.Forms.MessageBox]::Show("Your PDFs have been downloaded.", "Job Complete") | Out-Null
        } else
        {
            $res = [System.Windows.Forms.MessageBox]::Show("You clicked Cancel. Would you like to try again or exit?", "Select a location", [System.Windows.Forms.MessageBoxButtons]::RetryCancel)
            if ($res -eq "Cancel")
            {
                # Ends script, releasing the dialog first
                $browse.Dispose()
                return
            }
        }
    }
    $browse.SelectedPath
    $browse.Dispose()
}
Grab-PDFs
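
Links on a page are often relative (for example `docs/report.pdf`), and they only download correctly once they have been resolved against the page address. The snippet below is a minimal sketch of that resolution step on its own. `Resolve-PdfLink` and the sample href are hypothetical names used for illustration and are not part of the original gist; the page URL is the placeholder from the script.

```powershell
# Hypothetical helper (an assumption, not part of the original gist):
# turns an href found on a page into an absolute URI plus a local file name.
function Resolve-PdfLink {
    param(
        [Parameter(Mandatory)] [string] $PageUrl,
        [Parameter(Mandatory)] [string] $Href
    )
    # Uri(baseUri, relativeString) resolves relative hrefs against the page
    # and leaves hrefs that are already absolute unchanged.
    $absolute = [System.Uri]::new([System.Uri]$PageUrl, $Href)
    [pscustomobject]@{
        Uri      = $absolute
        FileName = [System.IO.Path]::GetFileName($absolute.LocalPath)
    }
}

# Example with a made-up relative link
$link = Resolve-PdfLink -PageUrl "http://www.example.com/path/to/pdfs" -Href "docs/report.pdf"
$link.Uri.AbsoluteUri
$link.FileName
```

The resulting `Uri` can be passed to `Invoke-WebRequest -Uri`, and `FileName` to `-OutFile`.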
@jkengland

Does this work recursively for child pages?

@ilanthendral

Good one.

@jersam

jersam commented Jun 24, 2020

I get the following error:

Invoke-WebRequest : The URI prefix is not recognized.
At C:\Users\Administrator\Desktop\PDF_Grabber.ps1:25 char:27
+ ... ach-Object {Invoke-WebRequest -Uri $_ -OutFile ($_ | Split-Path -Leaf ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotImplemented: (:) [Invoke-WebRequest], NotSupportedException
    + FullyQualifiedErrorId : WebCmdletIEDomNotSupportedException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
