@mnofresno
Last active July 1, 2024 11:56
This script lets you perform simple OCR tasks on a Linux desktop PC, like those you can do on Android with Google Lens.

Google Lens For Ubuntu

For those who use Google Lens on Android as an OCR tool to capture text from images, screenshots, etc., here is a bash script that does the same on Ubuntu.

Sometimes we need to copy a telephone number from an app that has no copy-to-clipboard feature, or to copy text from a picture.

This can be done with OCR: the script lets you drag to select a region of the screen, then copies the text recognized in that region to the clipboard.

#!/bin/bash
# Dependencies: tesseract-ocr imagemagick scrot xsel

# OCR language; join several with '+', e.g. eng+ita.
tesseract_lang=eng

SCR_IMG=$(mktemp)
# Remove the temp file and its .png/.txt siblings on exit.
trap 'rm -f "$SCR_IMG"*' EXIT

# Select a region with the mouse; -q 100 raises image quality from the default 75.
scrot -s -q 100 "$SCR_IMG.png"

# Convert to grayscale and upscale 4x, which should increase the detection rate.
mogrify -modulate 100,0 -resize 400% "$SCR_IMG.png"

# tesseract writes its output to $SCR_IMG.txt.
tesseract "$SCR_IMG.png" "$SCR_IMG" -l "$tesseract_lang" &> /dev/null
xsel -bi < "$SCR_IMG.txt"

exit
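
The script assumes every tool from the Dependencies comment is installed. A minimal sketch of a preflight check, assuming the packages provide the commands tesseract, mogrify, scrot and xsel; the helper name missing_deps is hypothetical and not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print each command
# from the argument list that cannot be found on PATH.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Commands assumed to come from the listed packages:
# tesseract-ocr -> tesseract, imagemagick -> mogrify, scrot, xsel.
missing=$(missing_deps tesseract mogrify scrot xsel)
if [ -n "$missing" ]; then
    # Word splitting of $missing is intentional: one name per word.
    echo "Missing commands:" $missing >&2
fi
```

Placed near the top of the script, the check could be followed by an exit 1 when anything is missing, so the failure is reported before the region selection starts.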

Usage

If you save the script as /usr/bin/lens and make it executable, usage looks like this:

  1. ALT+F2, type 'lens' and hit ENTER
  2. Select the region you want to capture with the mouse
  3. Paste the OCR'ed text from the clipboard wherever you need it
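
Saving the script under /usr/bin also requires that it be executable. A small sketch of the install step, assuming the script was downloaded as lens.sh (a hypothetical file name); install_lens is likewise a hypothetical helper, wrapping the coreutils install command:

```shell
#!/bin/bash
# Hypothetical helper: copy the script given as $1 into directory $2
# under the name "lens", with mode 755 (executable by everyone).
install_lens() {
    install -m 755 "$1" "$2/lens"
}
```

Writing to /usr/bin needs root, so in practice the equivalent one-liner would be sudo install -m 755 lens.sh /usr/bin/lens.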
@pritomshad

man you are a lifesaver

@Devbrat-Dev

I updated the script to support Sway, KDE, and GNOME by dynamically selecting the appropriate screenshot tool based on the current Window Manager or Desktop Environment in use. As of 2024-07-01, KDE does not support wlr-screencopy-unstable-v1.

#!/bin/bash

# Dependencies: tesseract-ocr, imagemagick, (grim, slurp[Sway]), spectacle[KDE], gnome-screenshot[GNOME], wl-clipboard, libnotify-bin

# Add more if you need other languages. Example: eng+ita
tesseract_lang=eng

SCR_IMG=$(mktemp)

notify() {
    notify-send "OCR Script" "$1"
}

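take_screenshot() {
    # XDG_CURRENT_DESKTOP may be a colon-separated list (e.g. ubuntu:GNOME),
    # so match on substrings rather than the exact value.
    case "$XDG_CURRENT_DESKTOP" in
        *KDE*)
            spectacle --region --background --nonotify --output "$SCR_IMG"
            ;;
        *GNOME*)
            gnome-screenshot -a -f "$SCR_IMG"
            ;;
        *)
            grim -g "$(slurp)" "$SCR_IMG"
            ;;
    esac
}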

trap 'rm -f "$SCR_IMG"*' EXIT

take_screenshot || { notify "Screenshot cancelled or failed."; exit 1; }

notify "Processing image for text extraction..."

# Enhance the image for better OCR
mogrify -modulate 100,0 -resize 400% "$SCR_IMG"

# Perform the OCR on the screenshot
tesseract "$SCR_IMG" "$SCR_IMG" -l "$tesseract_lang" &> /dev/null

# Copy the OCR result to the clipboard and report success only if that worked
if wl-copy < "${SCR_IMG}.txt"; then
    notify "Text successfully copied to clipboard."
else
    notify "OCR failed; nothing was copied."
fi
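
XDG_CURRENT_DESKTOP can hold a colon-separated list of names (Ubuntu, for example, reports ubuntu:GNOME), so substring matching is more robust than comparing against a single exact value. A minimal sketch of the tool selection as a standalone function that can be tested without taking a screenshot; the name screenshot_tool_for is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: map an XDG_CURRENT_DESKTOP value to the name of
# the screenshot tool the script uses for that environment.
screenshot_tool_for() {
    case "$1" in
        *KDE*)   echo spectacle ;;
        *GNOME*) echo gnome-screenshot ;;
        *)       echo grim ;;   # wlroots compositors such as Sway
    esac
}

# Show which tool would be picked in the current session.
screenshot_tool_for "${XDG_CURRENT_DESKTOP:-}"
```

Anything that is neither KDE nor GNOME falls through to grim, which mirrors the default branch of the script.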

@mnofresno
Author

Great @Devbrat-Dev ! thank you for the addition! 👏
