@peaeater
Last active June 7, 2023 18:16
OCRs image files to plain text with Tesseract.
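# OCR image files (tif/png) to plain text, one .txt file per image.
# Requires Tesseract; pass -tesseract to point at tesseract.exe if it is not in the default location.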
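#
# Example invocation, as a rough sketch. The script file name (ocr.ps1) and the
# folder paths below are assumptions for illustration; the gist does not name them.
#
#   .\ocr.ps1 -ext png -indir C:\scans -outdir C:\scans\txt
#
# With no arguments, the script reads *.tif from the current directory and
# writes the text files next to the images.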
Param(
    [string]$ext = "tif",
    [string]$indir = ".",
    [string]$outdir = $indir,
    [string]$tesseract = "C:\utils\tesseract\tesseract.exe"
)
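# Create the output directory if it does not exist yet.
if (!(Test-Path $outdir)) {
    New-Item -ItemType Directory -Path $outdir | Out-Null
}

# Collect the matching image files (files only, no directories).
$files = Get-ChildItem -Path $indir -Filter "*.$ext" -File
foreach ($file in $files) {
    # Output base name; tesseract appends ".txt" itself.
    $o = Join-Path $outdir $file.BaseName
    # Use the full path so the script also works when $indir is not the current directory.
    $arguments = "`"$($file.FullName)`" `"$o`""
    Start-Process -FilePath $tesseract -ArgumentList $arguments -Wait -NoNewWindow
}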