@zbroyar
Last active December 29, 2015 04:19
Simple OCaml wrapper for tesseract-ocr
#include <string>
#include <iostream>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
extern "C" {
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
#include <caml/fail.h>
#include <caml/callback.h>
CAMLprim value ocr(value fn)
{
CAMLparam1(fn);
CAMLlocal1(v_res);
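/* Unpack the input parameters. The pointer stays valid here because nothing
   below allocates on the OCaml heap before its last use. */
const char *path = String_val(fn);
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
/* Release the API object before raising, otherwise it leaks. */
if (api->Init(NULL, "ukr")) { delete api; caml_failwith("Can't initialize tesseract"); }
Pix *image = pixRead(path);
if (image == NULL) { api->End(); delete api; caml_failwith("Can't read image"); }
api->SetImage(image);
/* GetUTF8Text returns a new[]-allocated buffer, or NULL on failure. */
char *outText = api->GetUTF8Text();
/* Declared only after the last point that can raise, so no C++ destructor is skipped. */
std::string text = outText ? outText : "";
delete [] outText;
api->End();
delete api;
pixDestroy(&image);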
text = std::string("OCR result: ") + text;
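/* caml_copy_string allocates an OCaml string and copies the NUL-terminated text
   into it. This avoids writing through String_val, which newer OCaml headers
   declare as returning a const pointer. */
v_res = caml_copy_string(text.c_str());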
CAMLreturn(v_res);
}
}
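The stub above has to free the buffer returned by GetUTF8Text with delete [] on every path. A minimal sketch of one way to make that automatic is shown below, using std::unique_ptr<char[]>. The names make_buffer and take_text are hypothetical helpers introduced for illustration only; make_buffer merely stands in for a call that returns a new[]-allocated, possibly NULL, C string, and neither is part of the tesseract or OCaml APIs.

```cpp
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for a call such as GetUTF8Text(): returns a
// new[]-allocated copy of `s`, or NULL when `s` is NULL.
static char *make_buffer(const char *s) {
    if (s == nullptr) return nullptr;
    char *buf = new char[std::strlen(s) + 1];
    std::strcpy(buf, s);
    return buf;
}

// Takes ownership of a new[]-allocated buffer (possibly NULL) and copies it
// into a std::string. The unique_ptr<char[]> releases the buffer with
// delete[] when it goes out of scope, including on early returns.
static std::string take_text(char *raw) {
    std::unique_ptr<char[]> owned(raw);
    return owned ? std::string(owned.get()) : std::string();
}
```

Under these assumptions, the stub could write something like `std::string text = take_text(api->GetUTF8Text());` and drop the explicit delete [] call.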