kspeeckaert/projectoxford.py

## readme.md

      
    Purpose

Copy an image to the clipboard, then execute an Alfred workflow to trigger OCR on the copied image. The text returned by the web service is then copied to the clipboard.
Notes

Binary clipboard data

Retrieving anything but text from the (OS X) clipboard in Python seems troublesome. The CLI command pbpaste and the Python libraries xerox and pyperclip only support text. Judging from posts on Stack Overflow, it should be possible using PyObjC or Tkinter, but I couldn't get the former to install and the latter seems a bit much just to get access to the clipboard.
Instead, I used the CLI utility pngpaste, which aims to do what pbpaste does, but for binary data. By passing - as the parameter instead of a filename, I can read the binary data from stdout into Python.
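As a minimal sketch of that idea, the snippet below captures a command's stdout as raw bytes. The helper names `read_binary_stdout` and `clipboard_png` are made up for illustration, and it assumes pngpaste is reachable on the PATH (the script itself calls `./pngpaste` from the working directory).

```python
import subprocess


def read_binary_stdout(cmd):
    """Run cmd (a list of arguments) and return its stdout as raw bytes."""
    result = subprocess.run(cmd,
                            check=True,  # raise CalledProcessError on a non-zero exit
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    return result.stdout


def clipboard_png(pngpaste_path="pngpaste"):
    """Return the clipboard image as PNG bytes.

    Assumption: pngpaste_path points to an installed pngpaste binary;
    the "-" argument makes it write to stdout instead of a file.
    """
    return read_binary_stdout([pngpaste_path, "-"])
```

Passing the command as a list avoids `shell=True`, so no shell quoting is involved; the returned bytes can be wrapped in `io.BytesIO` for PIL.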
Image dimensions

The API requires the image to be at least 40x40 pixels; otherwise, the server returns an HTTP 500 error. The script therefore uses PIL to check the image dimensions and extends the image when either dimension is smaller than 40 pixels.
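The size check can be sketched roughly as follows. The function names `min_size_box` and `ensure_min_size` are hypothetical, and the sketch relies on the assumption that PIL's `Image.crop` accepts a box larger than the image and fills the extra area with empty (black or transparent) pixels, which is also what the script depends on.

```python
def min_size_box(width, height, minimum=40):
    """Return a crop box (left, upper, right, lower) that is at least
    `minimum` pixels wide and high, anchored at the top-left corner."""
    return (0, 0, max(width, minimum), max(height, minimum))


def ensure_min_size(img, minimum=40):
    """Return img unchanged if it is large enough, otherwise an extended copy.

    img is expected to be a PIL Image (anything with .size and .crop works).
    """
    width, height = img.size
    if min(width, height) >= minimum:
        return img
    # Cropping to a box that exceeds the image bounds pads the image
    # rather than scaling it, so the text keeps its original size.
    return img.crop(min_size_box(width, height, minimum))
```

Padding instead of resizing keeps the glyphs at their original scale, which seems preferable for OCR on small snippets.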

  
## projectoxford.py
import requests
import subprocess
from PIL import Image
from io import BytesIO

api_url = 'https://api.projectoxford.ai/vision/v1/ocr'

# Fill in your own subscription key before running
header = {'Ocp-Apim-Subscription-Key': '',
          'Content-Type': 'application/octet-stream'}

# 'unk' lets the service detect the language itself
params = {'language': 'unk'}

try:
    # Retrieve the binary image data from the clipboard
    p = subprocess.run('./pngpaste -',
                       shell=True,
                       check=True,
                       stdout=subprocess.PIPE,
                       stderr=subprocess.PIPE)
    img_data = p.stdout

    img = Image.open(BytesIO(img_data))
    # Ensure the image is at least 40x40
    if min(img.size) < 40:
        img = img.crop((0, 0, max(img.size[0], 40), max(img.size[1], 40)))

    bin_img = BytesIO()
    img.save(bin_img, format='PNG')
    img.close()

    img_data = bin_img.getvalue()
    bin_img.close()

    r = requests.post(api_url,
                      params=params,
                      headers=header,
                      data=img_data)

    r.raise_for_status()

    data = r.json()

    # Join the words of each recognised line with spaces, one output row per line
    lines = []
    for region in data['regions']:
        for line in region['lines']:
            lines.append(' '.join(word['text'] for word in line['words']))
    print('\n'.join(lines))

except subprocess.CalledProcessError as e:
    print('Could not get image from clipboard: {}'.format(e))

except requests.HTTPError as e:
    print('HTTP error occurred: {}'.format(e))

except Exception as e:
    print('Error occurred: {}'.format(e))