@NRBPerdijk
Created April 10, 2020 13:57
package tika.example

import java.io.ByteArrayOutputStream
import java.nio.charset.Charset

import org.apache.tika.metadata.Metadata
import org.apache.tika.sax.BodyContentHandler

import scala.util.{Failure, Success, Using}

object TikaOCRApplication extends App {
  // Image to run OCR on, loaded from the classpath.
  val input = getClass.getResourceAsStream("/ExampleOCR.jpg")
  // BodyContentHandler writes the extracted text into this in-memory buffer.
  val outputStream = new ByteArrayOutputStream()

  // Using closes the input stream afterwards and wraps the outcome in a Try.
  val attemptedOCR = Using(input) { inputStream =>
    // TikaOCRParser is not defined in this file; presumably it lives in the same package.
    TikaOCRParser.parse(inputStream, new BodyContentHandler(outputStream), new Metadata())
  }.map { _ =>
    // Decode the buffered bytes into the recognised text.
    new String(outputStream.toByteArray, Charset.defaultCharset())
  }

  attemptedOCR match {
    case Success(value) => println(s"OCR result was: $value")
    case Failure(exception) => println(s"OCR has failed, exception message was: ${exception.getMessage}")
  }
}