James Villemarette jvillemare

## convert.py
import os        # for magick and tesseract commands
import time      # for epoch time
import calendar  # for epoch time
from PyPDF2 import PdfFileMerger

dir_files = [f for f in os.listdir(".") if os.path.isfile(os.path.join(".", f))]
epoch_time = int(calendar.timegm(time.gmtime()))
print(dir_files)

for file in dir_files: # look at every file in the current directory

## readme.md

      
        
          
            
              
jvillemare / readme.md, last active September 26, 2021 18:29

OCR images on MacOS with one command and open-source Tesseract

### OCR scan images on macOS, free and easy

Scanning images with OCR (Optical Character Recognition) is immensely helpful: it lets
you find an image later just by searching for the text it contains.

OCR is big money, so of course there's no easy way to do it with a nice UI. Many of
these apps cost $10, $20, or more, which is unreasonable.

Tesseract is a free, open-source OCR application that many of the paid apps "borrow",
repackage, and sell at a high markup. Unfortunately, when I say application, I mean
a command-line interface, so it's not terribly intuitive. But we can simplify it.

	import os # for magick and tesseract commands
	import time # for epoch time
	import calendar # for epoch time
	from PyPDF2 import PdfFileMerger

	dir_files = [f for f in os.listdir(".") if os.path.isfile(os.path.join(".", f))]
	epoch_time = int(calendar.timegm(time.gmtime()))
	print(dir_files)

	for file in dir_files: # look at every file in the current directory
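
The script is cut off at the loop in this copy, so the following is only a rough sketch of what a per-image OCR step could look like. It is not the original script. The helper names (`find_images`, `tesseract_command`, `ocr_to_pdf`) and the list of image extensions are assumptions made for illustration. The one thing it relies on is Tesseract's standard command form, `tesseract <image> <output_base> pdf`, which writes a searchable `<output_base>.pdf`.

```python
import subprocess
from pathlib import Path

# Assumption: the image types worth sending to Tesseract.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def find_images(directory="."):
    """Return the image files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def tesseract_command(image, output_base):
    """Build the argument list for `tesseract <image> <output_base> pdf`.

    Tesseract appends ".pdf" to the output base on its own.
    """
    return ["tesseract", str(image), str(output_base), "pdf"]


def ocr_to_pdf(image, output_base):
    """Run Tesseract on one image and return the path of the PDF it writes.

    Raises subprocess.CalledProcessError if Tesseract exits with an error,
    and FileNotFoundError if the tesseract binary is not installed.
    """
    subprocess.run(tesseract_command(image, output_base), check=True)
    return Path(f"{output_base}.pdf")
```

With Tesseract installed, calling `ocr_to_pdf(image, image.with_suffix(""))` for each path returned by `find_images(".")` would leave a searchable PDF next to every image. Passing the command as a list to `subprocess.run` (rather than a string to `os.system`) avoids trouble with spaces in file names.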