@andyrbell
Last active December 17, 2024 06:59
Make a pdf look scanned using ImageMagick
# use ImageMagick convert
# Order matters: -density applies to input.pdf as it is read; the options after input.pdf (rotate, noise, colorspace) apply to the rasterized pages written to output.pdf
convert -density 90 input.pdf -rotate 0.5 -attenuate 0.2 +noise Multiplicative -colorspace Gray output.pdf
@m3nu

m3nu commented May 8, 2020

Very useful. Added as function:

function pdf.like_scanned () {
    OUT=$(basename "$1" .pdf)
    convert -density 150 "$1" -rotate "$([ $((RANDOM % 2)) -eq 1 ] && echo -)0.$((RANDOM % 4 + 5))" \
        -attenuate 0.4 +noise Multiplicative -attenuate 0.03 +noise Multiplicative -sharpen 0x1.0 \
        -colorspace Gray "${OUT}_scanned.pdf"
}
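
The `-rotate` argument in the function above packs a random sign and a random first decimal digit into one string. A minimal sketch that isolates that step, assuming Bash (for `RANDOM`); `random_angle` is a hypothetical helper name, not part of the original function:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a tilt between 0.5 and 0.8 degrees,
# in either direction with equal probability.
random_angle() {
  local sign=""
  # RANDOM % 2 is 0 or 1; a 1 tilts the page the other way.
  if [ $((RANDOM % 2)) -eq 1 ]; then
    sign="-"
  fi
  # RANDOM % 4 + 5 is one of 5, 6, 7, 8 and becomes the first decimal digit.
  printf '%s0.%d\n' "$sign" $((RANDOM % 4 + 5))
}

random_angle
```

It could then be passed as `-rotate "$(random_angle)"` in place of the inline expression.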

@baicunko

I just developed www.scanyourpdf.com for everyone to use. Code is open source if you'd like to contribute!

@kidsil

kidsil commented May 13, 2020

convert -density 130 input.pdf -rotate 0.33 -attenuate 0.15 +noise Multiplicative -colorspace Gray output.pdf

Closest to a modern scanner in my opinion.

@shanecp

shanecp commented Jun 7, 2020

Sometimes you'll have to replace a few pages with real scanned pages, for example the signature page.

The flow will be:

  • Convert input as a scanned PDF.
  • Split the sections that should be replaced with a real scan.
  • Merge everything back to the output.

You'll need qpdf and img2pdf installed.

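# Fake-scan the whole document.
convert -density 130 input.pdf -rotate -0.33 -attenuate 0.15 +noise Multiplicative -colorspace Gray output.pdf
# Keep pages 1-5 of the fake scan.
qpdf --empty --pages output.pdf 1-5 -- output_1.pdf
# Turn the real scan of the signed page (page 6 here) into a one-page PDF.
img2pdf --pagesize A4 --auto-orient signed.jpg -o output_2.pdf
# Keep page 7 of the fake scan.
qpdf --empty --pages output.pdf 7 -- output_3.pdf
# Merge the parts; the glob sorts output_1, output_2, output_3 in order.
qpdf --empty --pages output_*.pdf -- final_scan.pdf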
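
The page ranges above are hard-coded for a seven-page document with the signature on page 6. A rough sketch of how the ranges could be computed for any page, assuming a one-page replacement PDF; `replace_page_args` is a hypothetical helper that only prints the arguments and does not call qpdf itself:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the arguments for `qpdf --pages` that swap
# page $2 of $1 (which has $3 pages) for the one-page PDF $4.
replace_page_args() {
  local scanned=$1 page=$2 total=$3 real=$4
  # Pages before the replaced one, if any.
  if [ "$page" -gt 1 ]; then
    printf '%s\n' "$scanned" "1-$((page - 1))"
  fi
  # The real scan takes the place of the fake page.
  printf '%s\n' "$real" "1"
  # Pages after the replaced one, if any.
  if [ "$page" -lt "$total" ]; then
    printf '%s\n' "$scanned" "$((page + 1))-$total"
  fi
}

replace_page_args output.pdf 6 7 output_2.pdf
```

The printed lines are meant to go between `qpdf --empty --pages` and `-- final_scan.pdf`, for example via a Bash array. The total page count can be read with `qpdf --show-npages output.pdf`.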

@turkeyphant

I get the following:

C:\Program Files\ImageMagick-7.0.10-Q16-HDRI>magick convert -density 150 input.pdf -rotate "$([ $((RANDOM % 2)) -eq 1 ] && echo -)0.$(($RANDOM % 4 + 5))" -attenuate 0.4 +noise Multiplicative -attenuate 0.03 +noise Multiplicative -sharpen 0x1.0 -colorspace Gray output.pdf
convert: FailedToExecuteCommand `"gswin32c.exe" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r150x150"  "-sOutputFile=C:/Users/TURKEY~1/AppData/Local/Temp/magick-9420wtSmlrXSBcfh%d" "-fC:/Users/TURKEY~1/AppData/Local/Temp/magick-9420IJ2RKxHzcTQf" "-fC:/Users/TURKEY~1/AppData/Local/Temp/magick-9420t7VINjcK97Pq"' (The system cannot find the file specified.
) @ error/delegate.c/ExternalDelegateCommand/475.
convert: PDFDelegateFailed `The system cannot find the file specified.
' @ error/pdf.c/ReadPDFImage/662.
convert: invalid argument for option '-rotate': $([ $((RANDOM % 2)) -eq 1 ] && echo -)0.$(($RANDOM % 4 + 5)) @ error/convert.c/ConvertImageCommand/2643.

@muellermartin

@turkeyphant: You seem to be using Windows. Most commands here use features of UNIX shells like Bash (e.g. command substitution via $(), the $RANDOM variable, arithmetic expressions, or conditionals). These features are not available in the default Windows command line, so you need to find another way: either remove the UNIX shell features from the command (e.g. pass a fixed angle to -rotate) or use a UNIX-like environment on Windows such as Cygwin, Git Bash, or WSL.

@turkeyphant

@muellermartin: oops, good point. However, I'm still having issues on an OS X machine when running brew install imagemagick. It seems to be either an issue with curl (I don't know how to sub in a different version) or a 301 redirect at kernel.org:

curl: (60) SSL certificate problem: Invalid certificate chain
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.
Error: Failed to download resource "gnu-getopt"
Download failed: https://www.kernel.org/pub/linux/utils/util-linux/v2.35/util-linux-2.35.2.tar.xz

@muellermartin

@turkeyphant Nice that you have macOS at hand; these ImageMagick commands should work there :) You seem to have bad luck with Homebrew, though. The error indicates that curl (probably used by brew to download the dependencies for ImageMagick) can't validate the SSL certificate (the redirect is likely not an issue). As the certificate for www.kernel.org looks valid from my end, there is likely some issue with the certificate bundle used by your curl. This is odd, as the pre-installed version of curl should use the system certificates and therefore should work. Maybe you have "overwritten" the curl command via Homebrew (which is not recommended). You can check that by running which curl, which should output /usr/bin/curl. If the output is something like /usr/local/bin/curl or /usr/local/opt/curl/bin/curl, then you might have linked the version from Homebrew (or another tool). With Homebrew you can try brew unlink curl to undo this.

@turkeyphant

I haven't messed with curl.

$ which curl
/usr/bin/curl

Any other workaround for this download?

@muellermartin

@turkeyphant: Hm, I wonder why Homebrew tries to install gnu-getopt from source instead of using a prebuilt bottle. Maybe you could try to install gnu-getopt explicitly to work around this issue: brew install gnu-getopt

@turkeyphant

turkeyphant commented Jun 12, 2020

No dice I'm afraid. Still get Error: Failed to download resource "gnu-getopt" Download failed: https://www.kernel.org/pub/linux/utils/util-linux/v2.35/util-linux-2.35.2.tar.xz

Given I'm able to download the file manually, there must be a workaround? Is there any way to tell brew to make curl use -k, or to use wget --no-check-certificate instead?

@turkeyphant

Seem to have solved it (it's slowly building as I write) by applying Homebrew/legacy-homebrew#6103 (comment) for each and every invalid cert.

I do think there must be a way to update my machine's certs so that curl can work correctly, though.

@muellermartin

@turkeyphant: Well, if these SSL errors are not only related to curl then something is really off. Sometimes an utterly wrong system time causes such errors (because the certificates seem to be expired/not valid yet) or you're in a shitty corporate network that uses some kind of HTTPS-Interception and thus breaks security or you're the victim of a MITM attack.

@turkeyphant

turkeyphant commented Jun 12, 2020

It seems to be a common macOS issue, to be honest. System time is correct, there is no VPN or other network issue, and I'm fairly certain there's no MITM going on (I have tested various Internet connections and other macOS machines, for example). It's 10.11, so the certificates might just be out of date?

@DavidWuthier

DavidWuthier commented Oct 5, 2020

Nice! The other day, I had 19 pages to sign with unique signatures. First, I used xournal on Ubuntu 20.04 with a stylus, and then I ran the following script:

#!/usr/bin/env bash

# Dependencies
sudo apt install pdftk imagemagick -y

# Output folder
mkdir -p output

# Zero-pad the page number so the files sort in page order
for i in {1..19}; do
  printf -v j '%02d' "$i"

  # Extract page i, then rasterize it to a noisy, slightly rotated JPEG
  pdftk input.pdf cat "$i" output "output/$j.pdf"
  convert -density 200 "output/$j.pdf" -trim -flatten -attenuate 0.15 +noise Multiplicative -rotate 0.01 -quality 80 "output/$j.jpg"
  # Wrap the JPEG back into a one-page PDF
  convert "output/$j.jpg" "output/$j.pdf"
  rm "output/$j.jpg"
done

pdftk output/*.pdf cat output result.pdf

The conversion to .jpg prevents the file from bloating.

@michaelrkn

The +noise Multiplicative argument created a dappled background behind where I had text but not in other places. Using Gaussian, Laplacian, or Uniform instead of Multiplicative produced better results for me.

@dazhbog

dazhbog commented Dec 2, 2020

If you get this error:

convert-im6.q16: not authorized `input.pdf' @ error/constitute.c/ReadImage/412.
convert-im6.q16: no images defined `output.pdf' @ error/convert.c/ConvertImageCommand/3258.

you can run

sudo mv /etc/ImageMagick-6/policy.xml /etc/ImageMagick-6/policy.xml.off

to disable the policy. When done, you can restore the original with

sudo mv /etc/ImageMagick-6/policy.xml.off /etc/ImageMagick-6/policy.xml

Taken from here
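
Moving policy.xml aside switches off every restriction in it. A narrower alternative is to relax only the PDF rule. This is a sketch that assumes the file contains the line `<policy domain="coder" rights="none" pattern="PDF" />`, as stock Debian/Ubuntu packages typically do; `allow_pdf` is a hypothetical helper name, and the demo runs on a scratch file rather than on /etc:

```shell
#!/usr/bin/env bash
# Hypothetical helper: grants read/write on PDFs in the policy file $1,
# keeping a backup next to it as "$1.bak".
allow_pdf() {
  sed -i.bak 's/rights="none" pattern="PDF"/rights="read|write" pattern="PDF"/' "$1"
}

# Demo on a scratch copy so nothing under /etc is touched.
demo=$(mktemp)
printf '%s\n' '<policy domain="coder" rights="none" pattern="PDF" />' > "$demo"
allow_pdf "$demo"
cat "$demo"
```

On a real system the same sed command would be run with sudo against /etc/ImageMagick-6/policy.xml. If the PDF line in that file is worded differently, the substitution changes nothing.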

@Hoodie2389

Guys, complete newbie here.
I downloaded Visual Studio and Git as per install-windows.txt.
Then how do I run the scanner.sh file?
Do I add this file to a folder somewhere?

Thanks.

@Pezmc

Pezmc commented Feb 17, 2021

I've improved upon this script slightly (having used it for a while now) by:

  • splitting the PDF into one file per page
  • applying a slightly different rotation to each page
  • recombining the files
  • adding support for macOS Automator quick actions
  • fixing the noise so it appears across the whole document

See: https://gist.github.com/Pezmc/38017cb03daccb17d3835280c568dc0f
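
The split, rotate, and recombine steps listed above could look roughly like the sketch below. It is not the linked script: the helper names, the deterministic angle pattern, and the use of qpdf for counting and merging pages are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: zero-padded name for page $1, so that a glob
# such as page_*.pdf sorts in page order.
page_name() {
  printf 'page_%03d.pdf\n' "$1"
}

# Hypothetical helper: tilt for page $1, alternating direction and
# cycling through 0.2 to 0.5 degrees (a stand-in for a random angle).
page_angle() {
  local sign=""
  if [ $(($1 % 2)) -eq 0 ]; then sign="-"; fi
  printf '%s0.%d\n' "$sign" $(($1 % 4 + 2))
}

# Sketch of the pipeline; needs ImageMagick, Ghostscript and qpdf.
scan_per_page() {
  local src=$1 dst=$2 pages i tmp
  pages=$(qpdf --show-npages "$src")
  tmp=$(mktemp -d)
  for i in $(seq 1 "$pages"); do
    # ImageMagick page indexes are zero-based, hence i - 1.
    convert -density 130 "${src}[$((i - 1))]" -rotate "$(page_angle "$i")" \
      -attenuate 0.15 +noise Multiplicative -colorspace Gray \
      "$tmp/$(page_name "$i")"
  done
  # Recombine the single-page files in sorted order.
  qpdf --empty --pages "$tmp"/page_*.pdf -- "$dst"
  rm -r "$tmp"
}

page_name 7
page_angle 7
```

With the tools installed, `scan_per_page input.pdf output.pdf` would run the whole pipeline. The two calls at the end only show the file name and angle chosen for page 7.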

@vwkd

vwkd commented Sep 24, 2021

Thanks @Pezmc. To have the noise only at the edges instead of across the whole document is a feature IMO, and also keeps the file size much smaller. Unfortunately, I couldn't figure out how to get your script to use noise only at the edges.

I ended up modifying the original script to use a higher density, which makes the output sharper. Got to keep up with the increasing quality of the scanners in the 3 years since then. 😉

convert -density 130 input.pdf -rotate 0.2 -attenuate 0.2 +noise Multiplicative -colorspace Gray output.pdf

@MartinDevillers

For those on Windows: make sure to install Ghostscript as well, or else you'll get errors like

convert: FailedToExecuteCommand `"gswin32c.exe" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r150x150"  "-sOutputFile=C:/Users/TURKEY~1/AppData/Local/Temp/magick-9420wtSmlrXSBcfh%d" "-fC:/Users/TURKEY~1/AppData/Local/Temp/magick-9420IJ2RKxHzcTQf" "-fC:/Users/TURKEY~1/AppData/Local/Temp/magick-9420t7VINjcK97Pq"' (The system cannot find the file specified.
) @ error/delegate.c/ExternalDelegateCommand/475.
convert: PDFDelegateFailed `The system cannot find the file specified.
' @ error/pdf.c/ReadPDFImage/662.
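
A quick pre-flight check can catch a missing Ghostscript before ImageMagick fails with the delegate error above. A minimal sketch for a Bash-like shell (Git Bash or WSL on Windows); `first_available` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the first of the given commands found on PATH,
# or fails if none of them is installed.
first_available() {
  local cmd
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
      return 0
    fi
  done
  return 1
}

# ImageMagick 7 installs `magick`; ImageMagick 6 installs `convert`.
im=$(first_available magick convert) || echo "ImageMagick not found" >&2
# Ghostscript is `gs` on Unix-like systems and gswin64c/gswin32c on Windows.
gs=$(first_available gs gswin64c gswin32c) || echo "Ghostscript not found" >&2
echo "ImageMagick: ${im:-missing}, Ghostscript: ${gs:-missing}"
```

One caveat: Windows ships its own unrelated `convert` utility, so finding `convert` on a Windows PATH does not prove that ImageMagick is installed.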

@restyler

restyler commented Jan 3, 2022

Thank you! I have used some of these commands to build https://oakpdf.com, which not only applies the scanner effect but also lets you insert an image of a signature or draw one.
My observations regarding the -density parameter: 200 is good enough in most cases, while 300 gives the best quality, but the build time gets catastrophically slow.
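
Some arithmetic on why higher densities cost so much: -density is in dots per inch, so the pixel count of a rasterized page grows with the square of the value. A small sketch, assuming A4 pages (210 x 297 mm); `a4_pixels` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: pixel size of one A4 page (210 x 297 mm, an assumption)
# rasterized at the given density in dots per inch.
a4_pixels() {
  awk -v dpi="$1" 'BEGIN {
    w = 210 / 25.4 * dpi   # page width in inches times dpi
    h = 297 / 25.4 * dpi   # page height in inches times dpi
    printf "%dx%d\n", w + 0.5, h + 0.5   # round to whole pixels
  }'
}

for d in 90 150 200 300; do
  printf '%s dpi: %s\n' "$d" "$(a4_pixels "$d")"
done
```

Going from 200 to 300 multiplies the pixel count by (300/200)² = 2.25, and the noise and rotation steps have to touch every pixel. That is a plausible contributor to the slowdown, though it has not been measured here.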

@EarlGeorge

Great

@fewaltix

fewaltix commented Mar 16, 2022

Thank you!
I used zenity to add graphical input and output prompts:
convert -density 150 "$(zenity --file-selection --title="Select Input File" --file-filter='*[PpDdFf]')" -rotate "$([ $((RANDOM % 2)) -eq 1 ] && echo -)0.$((RANDOM % 4 + 5))" -attenuate 0.4 +noise Multiplicative -attenuate 0.03 +noise Multiplicative -sharpen 0x1.0 -colorspace Gray "$(zenity --file-selection --save --title="Select Output File" --filename ".pdf")"

Can also be found here as a .desktop file, so the script can be started from the starter on Linux machines:
https://gist.github.com/fewaltix/c1437171d16671741aafe146751dbf9f

@leeeeeeeee2

work
