@mauvilsa
Last active September 10, 2020 14:52
Remove all text from PDF using Apache PDFBox in a Groovy script (pdfbox-RemoveAllText.groovy)
#!/usr/bin/env groovy
// Copyright (c) 2020-present, Mauricio Villegas <mauricio_ville@yahoo.com>
// MIT License <https://badges.mit-license.org/>
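// Fetch the PDFBox examples artifact and its dependencies via Grape.
@Grab('org.apache.pdfbox:pdfbox-examples:2.0.20')
import org.apache.pdfbox.examples.util.RemoveAllText
// Delegate to the example's main method; args are <input.pdf> <output.pdf>.
RemoveAllText.main(args)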
mauvilsa commented Sep 8, 2020

Usage instructions: Save the gist to a file named pdfbox-RemoveAllText.groovy, make it executable, and place it in a bin directory that is on your PATH. Install Groovy if you do not have it yet; on Ubuntu, for example, run sudo apt-get install groovy. You can then run pdfbox-RemoveAllText.groovy <input.pdf> <output.pdf> from any directory.
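
The steps above could be scripted roughly as follows. This is only a sketch: the install_script helper, the BIN_DIR default of $HOME/bin, and the SCRIPT variable are assumptions for illustration, not part of the gist, and the Groovy check only prints a hint rather than installing anything.

```shell
#!/bin/sh
# Sketch: install the gist as a command on your PATH.
# SCRIPT and BIN_DIR are assumed locations; adjust them to your setup.
SCRIPT="${SCRIPT:-pdfbox-RemoveAllText.groovy}"
BIN_DIR="${BIN_DIR:-$HOME/bin}"

# Hypothetical helper: copy a script into a directory and mark it executable.
# $1 = script file, $2 = target directory
install_script() {
    mkdir -p "$2" &&
    cp "$1" "$2/" &&
    chmod +x "$2/$(basename "$1")"
}

if [ -f "$SCRIPT" ]; then
    install_script "$SCRIPT" "$BIN_DIR"
else
    echo "Save the gist as $SCRIPT first." >&2
fi

# Groovy is required to run the script; only report if it is missing.
if ! command -v groovy >/dev/null 2>&1; then
    echo "groovy not found; on Ubuntu: sudo apt-get install groovy" >&2
fi
```

Once the script is installed and BIN_DIR is on your PATH, the call is the same as in the instructions: pdfbox-RemoveAllText.groovy input.pdf output.pdf. Expect the first run to take longer, since Grape has to download PDFBox before the script can start.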
