Nobu C. Shirai nobucshirai

doraTeX / pdf2text.sh
Created July 5, 2023 08:44
A shell script to extract text from PDF on macOS
#!/bin/bash
SCRIPTNAME=$(basename "$0")
function realpath () {
    f=$@;
    if [ -d "$f" ]; then
        base="";
        dir="$f";
    else
woxtu / ocr.js
Last active May 3, 2024 04:45 — forked from doraTeX/ocr.sh
A JavaScript (JXA) to perform OCR on images/PDFs using macOS built-in OCR engine
#!/usr/bin/osascript -l JavaScript
ObjC.import("stdlib");
ObjC.import("AppKit");
ObjC.import("PDFKit");
ObjC.import("Vision");
const scriptName = $.NSProcessInfo.processInfo.arguments.objectAtIndex(3).lastPathComponent.js;
console.error = (obj) => {
doraTeX / ocr.m
Last active November 22, 2023 02:31
Original Swift / Objective-C / AppleScriptObjC codes from which ocr.sh (https://gist.github.com/doraTeX/da9a1a26dffbf3fe5d6ec12a9c79267c) is converted
#import <Quartz/Quartz.h>
#import <Vision/Vision.h>
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSString *target = @"test.pdf";
        CGFloat dpi = 200;
        PDFDocument *doc = [[PDFDocument alloc] initWithURL:[NSURL fileURLWithPath:target]];
        NSUInteger pageCount = [doc pageCount];
doraTeX / ocr.sh
Last active April 19, 2024 13:04
A shell script to perform OCR on images/PDFs using macOS built-in OCR engine
#!/bin/bash
SCRIPTNAME=$(basename "$0")
function realpath () {
    f=$@;
    if [ -d "$f" ]; then
        base="";
        dir="$f";
    else
        base="/$(basename "$f")";
mkakh / ipm.py
Last active November 11, 2018 02:16
#!/usr/bin/env python
import sys
import json
from optparse import OptionParser
# parseArgs :: (IO (), Dict, [String])
def parseArgs():
    parser = OptionParser(
        usage="Usage: %prog input_file.ipynb -o output_file.ipynb")
    parser.add_option(
mkakh / backer.sh
Created October 6, 2018 08:41 — forked from nobucshirai/backer.sh
A simple backup script.
#!/bin/bash
# AUTHOR
#   2013/10/08 Nobu C. Shirai
#   2018/10/06 Akira Hasegawa
backer(){
    # Quote the expansions so arguments containing spaces stay intact.
    check_options "$@"
    Ind=1
    for TargetFile in "${ArgFiles[@]}"
    do