@lamlion
Created June 11, 2018 08:04
Bulk PYPDFOCR Python script
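import glob
import os
import subprocess

# Find every testfile.pdf exactly one directory below the current working directory.
search_pattern = os.path.join(os.getcwd(), '*', 'testfile.pdf')
files = glob.glob(search_pattern)
print('There are %s files to be converted' % len(files))

for path in files:
    print('Converting file: %s' % path)
    # Pass the path as a list argument so spaces or quotes in it need no shell escaping.
    subprocess.call(['pypdfocr', path])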
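
The loop above ignores whether each conversion succeeded. A minimal sketch of one way to collect failures follows. It assumes that pypdfocr exits with a non-zero status when a conversion fails, which the script itself does not confirm. The names `convert_all` and `runner` are hypothetical and are not part of the original script.

```python
import subprocess


def convert_all(paths, runner=subprocess.call):
    """Run pypdfocr on each path and return the paths that appear to have failed."""
    failed = []
    for path in paths:
        # Assumption: a non-zero exit code means the conversion failed.
        if runner(['pypdfocr', path]) != 0:
            failed.append(path)
    return failed
```

Calling `convert_all(files)` after the glob step would return the list of PDFs to retry or inspect by hand. The `runner` parameter exists only so the function can be exercised without pypdfocr installed.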