@scrapehero
Last active September 26, 2024 12:37
import pytesseract
import sys
import argparse
try:
    import Image
except ImportError:
    from PIL import Image
from subprocess import check_output


def resolve(path):
    # Resample the image to 600 DPI in place with ImageMagick's `convert`,
    # then run Tesseract OCR on the resampled file.
    print("Resampling the Image")
    check_output(['convert', path, '-resample', '600', path])
    return pytesseract.image_to_string(Image.open(path))


if __name__ == "__main__":
    # The captcha image path is a required positional command-line argument.
    argparser = argparse.ArgumentParser()
    argparser.add_argument('path', help='Captcha file path')
    args = argparser.parse_args()
    path = args.path
    print('Resolving Captcha')
    captcha_text = resolve(path)
    print('Extracted Text', captcha_text)

@ayinot

ayinot commented Sep 14, 2017

Hi,
There are two paths mentioned in this code:

if __name__ == "__main__":
    argparser = argparse.ArgumentParser()
    argparser.add_argument('path', help='Captcha file path')
    args = argparser.parse_args()
    path = args.path
    print('Resolving Captcha')
    captcha_text = resolve(path)

When I give it the path of my image, it throws this error:

Traceback (most recent call last):

File "", line 1, in
args = argparser.parse_args()

File "C:\ProgramData\Anaconda3\lib\argparse.py", line 1730, in parse_args
args, argv = self.parse_known_args(args, namespace)

File "C:\ProgramData\Anaconda3\lib\argparse.py", line 1762, in parse_known_args
namespace, args = self._parse_known_args(args, namespace)

File "C:\ProgramData\Anaconda3\lib\argparse.py", line 1997, in _parse_known_args
', '.join(required_actions))

File "C:\ProgramData\Anaconda3\lib\argparse.py", line 2389, in error
self.exit(2, _('%(prog)s: error: %(message)s\n') % args)

File "C:\ProgramData\Anaconda3\lib\argparse.py", line 2376, in exit
_sys.exit(status)

SystemExit: 2
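
A note on the traceback above: argparse exits with status 2 when a required argument is missing, and the `required_actions` frame suggests that is what happened here. The script expects the image path on the command line (for example `python script.py captcha.png`, where both names are placeholders), so running the lines one by one in an interactive console leaves `path` unset. A minimal sketch, assuming you want to stay in the console, is to pass the argument list explicitly; `build_parser` and `captcha.png` are illustrative names, not part of the gist:

```python
import argparse


def build_parser():
    # Same single positional argument as the gist's parser.
    parser = argparse.ArgumentParser()
    parser.add_argument('path', help='Captcha file path')
    return parser


# Passing a list makes argparse parse it instead of sys.argv,
# which is handy inside an interactive session or notebook.
args = build_parser().parse_args(['captcha.png'])  # hypothetical file name
print(args.path)
```

Calling `parse_args([])` with no path reproduces the same `SystemExit: 2`.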

@neha1909

Hi,
I tried the above code and I am getting this error: Invalid Parameter - -resample.
Can anyone help me resolve this error?

@BAFurtado

Hi,
Same here:
subprocess.CalledProcessError: Command '['convert', 'captcha1.png', '-resample', '600', 'captcha1.png']' returned non-zero exit status 4.

@BAFurtado

BAFurtado commented Nov 20, 2017

My guess is that 'convert' is not directly available as an ImageMagick program on Windows. It probably works on Linux.
Indeed, you have to install ImageMagick and point the command at it on Windows.
Even then, the captcha solving worked very poorly, recognizing nearly nothing...

@sauldom102

@neha1909 did you solve the error?

@abhaygc

abhaygc commented Jun 16, 2018

The code didn't work for me.
I got the following error.

Invalid Parameter - -resample
Traceback (most recent call last):
File "ca.py", line 22, in <module>
    captcha_text = resolve(path)
  File "ca.py", line 13, in resolve
    check_output(['convert', path, '-resample', '600', path])
  File "C:\Users\Gupta Niwas\AppData\Local\Programs\Python\Python36-32\lib\subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "C:\Users\Gupta Niwas\AppData\Local\Programs\Python\Python36-32\lib\subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['convert', 'cap.jpg', '-resample', '600', 'cap.jpg']' returned non-zero exit status 4.

Also, can anyone explain where ImageMagick is used here?

@TheRealPolatic

Hello,

I am experiencing the same error:
Invalid Parameter - -resample

I do have ImageMagick installed and added to PATH. When I manually run convert cap.jpg -resample 600 cap_resampled.jpg, it does the job perfectly.

@triptigupta346

Can you please explain how you installed ImageMagick and manually converted the image?

@sluge

sluge commented Feb 4, 2019

I've installed it, but it doesn't work for the captcha :(

@oliwin

oliwin commented Mar 2, 2019

It does not work; I have the same problem.

@DaZhu

DaZhu commented Apr 24, 2019

magick convert "1.png" -resample 600 "2.png" on windows?

@ScDor

ScDor commented Aug 24, 2019

FIX: change the first argument passed to check_output so that it is the full path to magick.exe.
For example:
check_output([r"C:\Program Files\ImageMagick-7.0.8-Q16\magick.exe", path, '-resample', '600', path])
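
Some background on why this helps: Windows ships its own convert.exe (a disk utility for converting FAT volumes to NTFS), and a bare 'convert' in check_output most likely resolves to that tool rather than ImageMagick, which would explain the "Invalid Parameter - -resample" message. A minimal sketch, assuming ImageMagick 7's `magick` launcher is on PATH, that avoids hard-coding the install directory; `resample_command` and `resample` are hypothetical helper names, not part of the gist:

```python
import shutil
from subprocess import check_output


def resample_command(src, dst, executable=None, dpi=600):
    # Prefer ImageMagick 7's `magick` launcher when it is on PATH.
    # Falling back to `convert` is an assumption that suits Linux/macOS
    # installs of ImageMagick 6; on Windows it would hit the system tool.
    if executable is None:
        executable = shutil.which('magick') or 'convert'
    return [executable, src, '-resample', str(dpi), dst]


def resample(src, dst, executable=None):
    # Writes to a separate output file so the original image is preserved.
    check_output(resample_command(src, dst, executable))
```

Passing `executable=r"C:\Program Files\ImageMagick-7.0.8-Q16\magick.exe"` reproduces the fix above; writing to a separate `dst` is a design choice so a failed run does not clobber the source image.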

@chan18

chan18 commented Sep 21, 2019

Resolving Captcha
Resampling the Image
Traceback (most recent call last):
File "CaptchaSolver.py", line 22, in
captcha_text = resolve(path)
File "CaptchaSolver.py", line 13, in resolve
check_output(["C:\Program Files\ImageMagick-7.0.8-Q16", path, '-resample', '600', path])
File "C:\Python27\lib\subprocess.py", line 216, in check_output
process = Popen(stdout=PIPE, *popenargs, **kwargs)
File "C:\Python27\lib\subprocess.py", line 394, in init
errread, errwrite)
File "C:\Python27\lib\subprocess.py", line 644, in _execute_child
startupinfo)
WindowsError: [Error 5] Access is denied
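
In the traceback above, the first element of the command is the ImageMagick install directory rather than magick.exe inside it, and asking Windows to execute a directory is a likely cause of "Access is denied". A small sketch, assuming the executable is named magick.exe and sits directly in the install directory; `magick_executable` is a hypothetical helper:

```python
import os


def magick_executable(install_dir):
    # Build the full path to the executable and fail early, with a
    # clearer message, if it is not a regular file.
    candidate = os.path.join(install_dir, 'magick.exe')
    if not os.path.isfile(candidate):
        raise FileNotFoundError('magick.exe not found in ' + install_dir)
    return candidate
```

The returned path is what belongs in the first position of the check_output list, ideally written as a raw string (r"...") so the backslashes are not treated as escape sequences.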

@shekaryenagandula

Hi,
I got the error below.
Please help.

Resolving Captcha
Resampling the Image
Traceback (most recent call last):
File "C:\Users\yenag\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pytesseract\pytesseract.py", line 226, in run_tesseract
proc = subprocess.Popen(cmd_args, **subprocess_args())
File "C:\Users\yenag\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py", line 854, in init
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Users\yenag\AppData\Local\Programs\Python\Python38-32\lib\subprocess.py", line 1307, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "OCR_image_processing.py", line 22, in
captcha_text = resolve(path)
File "OCR_image_processing.py", line 14, in resolve
return pytesseract.image_to_string(Image.open(path))
File "C:\Users\yenag\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pytesseract\pytesseract.py", line 344, in image_to_string
return {
File "C:\Users\yenag\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pytesseract\pytesseract.py", line 347, in
Output.STRING: lambda: run_and_get_output(*args),
File "C:\Users\yenag\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pytesseract\pytesseract.py", line 258, in run_and_get_output
run_tesseract(**kwargs)
File "C:\Users\yenag\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pytesseract\pytesseract.py", line 230, in run_tesseract
raise TesseractNotFoundError()
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

@vamsi6696

sudo apt-get install tesseract-ocr
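
The apt-get command above applies to Debian/Ubuntu. The traceback it answers comes from Windows, where Tesseract has to be installed separately and pytesseract told where to find it if it is not on PATH. A minimal sketch, assuming pytesseract is installed; `locate_tesseract` and `configure_pytesseract` are hypothetical helpers, and any fallback path you pass is your own install location:

```python
import shutil


def locate_tesseract(fallback=None):
    # Look for the tesseract binary on PATH; otherwise return the
    # caller-supplied fallback (for example a Windows install path).
    return shutil.which('tesseract') or fallback


def configure_pytesseract(fallback=None):
    # Imported here so the lookup helper works without pytesseract installed.
    import pytesseract

    found = locate_tesseract(fallback)
    if found is None:
        raise RuntimeError('tesseract not found on PATH and no fallback given')
    # pytesseract reads the binary location from this module attribute.
    pytesseract.pytesseract.tesseract_cmd = found
    return found
```

Call `configure_pytesseract(...)` once before `pytesseract.image_to_string`.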

@Shonty10

The code runs fine but doesn't extract the captcha text.
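
Several comments in this thread report poor recognition. One thing that may help on simple captchas is to clean the image before OCR. The sketch below is not the gist author's method: it assumes Pillow and pytesseract are installed, converts the image to grayscale, applies a fixed black/white threshold, and asks Tesseract to treat the image as a single word (`--psm 8`). `threshold_table`, `read_captcha`, and the cutoff of 128 are illustrative choices:

```python
def threshold_table(cutoff=128):
    # 256-entry lookup table: values above the cutoff become white (255),
    # everything else becomes black (0).
    return [255 if value > cutoff else 0 for value in range(256)]


def read_captcha(path, cutoff=128):
    # Imported here so the lookup-table helper has no third-party dependency.
    from PIL import Image
    import pytesseract

    # Grayscale first, then binarize with the lookup table.
    image = Image.open(path).convert('L').point(threshold_table(cutoff))
    # Page segmentation mode 8 tells Tesseract to expect a single word.
    return pytesseract.image_to_string(image, config='--psm 8')
```

The cutoff usually needs tuning per captcha style, and heavily distorted captchas are designed to resist OCR, so this may still return little or nothing.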
