@christianroman
Created May 30, 2013 16:02
Bypass Captcha using 10 lines of code with Python, OpenCV & Tesseract OCR engine
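# Written for Python 2 and the legacy cv2.cv API, which exists only in OpenCV 2.x.
import cv2.cv as cv
import tesseract
# Load the captcha as a single-channel grayscale image.
gray = cv.LoadImage('captcha.jpeg', cv.CV_LOAD_IMAGE_GRAYSCALE)
# Binarize in place: pixels brighter than 231 become 255 (white), the rest become 0 (black).
cv.Threshold(gray, gray, 231, 255, cv.CV_THRESH_BINARY)
api = tesseract.TessBaseAPI()
# Initialize the English model, using "." as the data path.
api.Init(".","eng",tesseract.OEM_DEFAULT)
# Restrict recognition to digits and lowercase letters.
api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyz")
# Treat the whole image as a single word.
api.SetPageSegMode(tesseract.PSM_SINGLE_WORD)
tesseract.SetCvImage(gray,api)
print api.GetUTF8Text()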
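
The snippet above depends on the `cv2.cv` module, which was removed in OpenCV 3, so it will not run on a current install. A rough sketch of the same three steps (grayscale load, binary threshold at 231, single-word OCR with a character whitelist) on current tooling might look like the following. It assumes the `opencv-python` and `pytesseract` packages and a Tesseract binary are installed; the function names `tesseract_config` and `read_captcha` are made up for this illustration and are not part of the original gist.

```python
def tesseract_config(whitelist, psm=8):
    """Build a Tesseract config string (hypothetical helper).

    psm 8 tells Tesseract to treat the image as a single word, the
    counterpart of PSM_SINGLE_WORD in the original snippet.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_captcha(path, threshold=231):
    """Threshold a captcha image and OCR it (hypothetical helper)."""
    # Imported lazily so tesseract_config() can be used without
    # OpenCV or pytesseract being installed.
    import cv2
    import pytesseract

    # Load as a single-channel grayscale image; imread returns None on failure.
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise IOError(f"could not read image: {path}")

    # Pixels brighter than `threshold` become 255, everything else 0.
    _, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)

    config = tesseract_config("0123456789abcdefghijklmnopqrstuvwxyz")
    return pytesseract.image_to_string(binary, lang="eng", config=config).strip()
```

Calling `read_captcha("captcha.jpeg")` would return the recognized text as a string. The threshold of 231 is carried over from the original and will likely need tuning for other captcha styles.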

ytrezq commented Feb 1, 2024

@NotTrueFalse my intent wasn't to generate the dataset myself.

@NotTrueFalse

You want me to do all the steps to create the model? At this point I could just create a website and offer a cheap API for people.

ytrezq commented Feb 1, 2024

@NotTrueFalse I lack the knowledge to train an AI, and I'm bad at advanced database scenarios.
