Skip to content

Instantly share code, notes, and snippets.

@jersobh
Forked from scf37/simulate_scanned_PDF.md
Created October 16, 2020 01:34
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jersobh/ffc601fb8eac45ff0a2e9af6d4afdf4e to your computer and use it in GitHub Desktop.
Save jersobh/ffc601fb8eac45ff0a2e9af6d4afdf4e to your computer and use it in GitHub Desktop.
Simulate a scanned PDF with ImageMagick

Simulate a scanned PDF with ImageMagick

  1. Download and install Ghostscript and ImageMagick + Convert Module
  2. Execute the following command in the console (change the filename)
$ convert -density 200 INPUT.pdf -rotate 0.3 +noise Multiplicative -format pdf  -quality 85 -compress JPEG -colorspace gray OUTPUT.pdf

Description of the options

Batch-Scripts

Python 3

#!/usr/bin/env python3
import os
import sys
import subprocess

'''
If you want use Drag and Drop, add this to the windows registry (or include into a .reg file and execute the file):

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\Python.File\shellex\DropHandler]
@="{86C86720-42A0-1069-A2E8-08002B30309D}"
'''

# Important for Drag and Drop! Change current work directory to the directory of the script
os.chdir(os.path.abspath( os.path.dirname(os.path.realpath(__file__))))

default_input_dir = os.path.abspath('input')
output_dir = os.path.abspath('output')

def process(input_filename, output_filename):
    if not os.path.isdir(output_dir):
        os.makedirs(output_dir)

    s = 'convert -density 200 "%s" -rotate 0.3 +noise Multiplicative -format pdf -quality 85 -compress JPEG -colorspace gray "%s"' % (input_filename, output_filename)
    err, out = subprocess.Popen(s, stdout=subprocess.PIPE, shell=True).communicate()
    if err:
        sys.stderr.write(err)
        sys.stderr.flush()
    elif out:
        if sys.stdout.isatty():
            print(out)

    sys.stdout.write(output_filename+"\n")
    sys.stdout.flush()

def walkDir(dir):
    for root, dirs, files in os.walk(dir):
        for file in filter(lambda x: x.endswith(".pdf"), files):
            process(os.path.join(root, file), os.path.join(output_dir, file))

def main():
    if sys.stdin.isatty() and len(sys.argv) == 1:
        walkDir(default_input_dir)
    else:
        files = sys.argv[1:]
        if sys.stdin.isatty() is False:
            files += [line.rstrip('\n') for line in sys.stdin.readlines()]

        for x in files:
            x = os.path.abspath(x)
            if os.path.isdir(x):
                walkDir(x)
            elif os.path.isfile(x) and x.endswith(".pdf"):
                process(x, os.path.join(output_dir, os.path.basename(x)))

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        pass
    except Exception as e:
        sys.stderr.write(e)
        sys.stderr.flush()

PowerShell

$scriptPath = split-path -parent $MyInvocation.MyCommand.Definition

$input_directory = $scriptPath + "\Input\"
$output_directory = $scriptPath + "\Output\"

New-Item -ItemType Directory -Force -Path $output_directory

$files = Get-ChildItem $input_directory -Filter *.pdf

ForEach ($file in $files) {
    "Process: " + $file.Name 
    $command = "convert -density 200 '" + $input_directory + $file.Name  + "' -rotate 0.3 +noise Multiplicative -format pdf -quality 85 -compress JPEG -colorspace gray '" + $output_directory + $file.Name + "'"
    Invoke-Expression -command $command
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment