@kortina
Created June 8, 2013 23:21
Hacked together a little tool to search images from your camera uploads for a string of text.

Today, I was looking for some screenshots I wanted to use for a presentation, and rather than looking through all 1478 of my uploaded camera photos ( ls -1 ~/Dropbox/Camera\ Uploads/ | wc -l ), I decided to write a quick bash script to use the tesseract OCR tool to help me out.

I wanted to use https://github.com/jbochi/python-tesseract so first I installed the dependencies.

sudo pip install PIL
brew install tesseract
cd ~/Dropbox/git/
git clone git@github.com:jbochi/python-tesseract.git
~/Dropbox/git/python-tesseract/tesseract.py ~/Dropbox/Camera\ Uploads/2013-06-06\ 21.55.03.png

A quick test to make sure this is working:

[Image: Tesseract example]
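
Alongside the quick test above, a small check that the required commands are on the PATH can catch a failed install early. This is only a sketch: `missing_tools` is a hypothetical helper name, not something from python-tesseract or the original setup, and the list of tools to check is an assumption.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original gist): print each named
# command that cannot be found on the PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools tesseract python` should print nothing when both are installed, and the name of whichever one is missing otherwise.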

Then I wrote this script:

/Users/kortina/Desktop/find-images-with-text.sh

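#!/bin/bash
# Usage: find-images-with-text.sh QUERY DIRECTORY

query=$1
directory_to_search=$2

cd "$directory_to_search" || exit 1
for f in *; do
    # OCR the file; discard tesseract's diagnostics on stderr.
    txt=$(~/Dropbox/git/python-tesseract/tesseract.py "$f" 2>/dev/null)
    # Print the filename and its text when the query matches (case-insensitive).
    printf '%s\n' "$txt" | grep -i -q -- "$query" && printf '%s\n%s\n' "$f" "$txt"
done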
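
The same loop could also be wrapped in a function that only looks at image files and lets the OCR command be swapped out, which makes it easy to try without tesseract installed. This is a rough sketch under assumptions: `find_images_with_text` and the `OCR_CMD` variable are illustrative names that do not appear in the original script, and the extension list (`.png`, `.jpg`, `.jpeg`, lowercase only) is a guess at what a Camera Uploads folder contains.

```shell
#!/bin/bash
# Sketch only: OCR_CMD and find_images_with_text are illustrative names.
# OCR_CMD is assumed to be an executable that takes an image path and
# prints the recognized text on stdout.
OCR_CMD=${OCR_CMD:-"$HOME/Dropbox/git/python-tesseract/tesseract.py"}

find_images_with_text() {
    local query=$1 dir=$2 f txt
    for f in "$dir"/*.png "$dir"/*.jpg "$dir"/*.jpeg; do
        # Skip patterns that matched nothing, and anything that is not a regular file.
        [ -f "$f" ] || continue
        # Skip files the OCR command fails on instead of aborting the search.
        txt=$("$OCR_CMD" "$f" 2>/dev/null) || continue
        # Case-insensitive match; print only the path of each matching image.
        if printf '%s\n' "$txt" | grep -i -q -- "$query"; then
            printf '%s\n' "$f"
        fi
    done
}
```

With the function defined, `find_images_with_text ride ~/Dropbox/Camera\ Uploads` would list the matching image paths, one per line. Unlike the script above, it does not `cd` into the directory, so the printed paths include the directory prefix.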

Next, I made the script executable:

chmod 700 /Users/kortina/Desktop/find-images-with-text.sh

And ran it:

~/Desktop/find-images-with-text.sh ride ~/Dropbox/Camera\ Uploads

Pretty sweet that these tools existed and I could do all of this in like 15 minutes.