@thomasgoodfellow
thomasgoodfellow / ItemsFromInvoice.md
Last active April 1, 2024 22:42
Extracting item list from invoice PDFs

Use case: collating previous orders

When composing a new order for DM, we need the total quantities from previous orders, minus the stock on hand, to work out what to order so that we don't run out in the coming months.

DM issues invoices as PDFs with the items in a "soft grid": two rows of text describing the item, then price per unit, number of units, and total price. I used Tabula (https://tabula.technology/) to extract these items and saved them as a TSV file. Importing that file into Excel with code page 65001 (Unicode UTF-8) gave two rows of description for each item. I merged each pair into a new cell (=VERKETTEN(A1;" ";A2), which is CONCATENATE in English-language Excel), copied and pasted those values along with the quantities into a new table, and sorted the table so that duplicate and similar items were grouped (item descriptions proved a little volatile).
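The merge-and-group step could also be scripted instead of done in Excel. The following is a minimal sketch, not the method used above. It assumes a hypothetical TSV layout in which the columns are description, unit price, quantity, and total, and in which the first row of each item carries the numbers while the second row carries only the rest of the description. The function names are illustrative.

```python
import csv
import io
from collections import defaultdict


def merge_item_rows(tsv_text):
    """Join each item's two description rows into (description, quantity) pairs.

    Assumed layout (hypothetical): description, unit price, quantity, total.
    The first row of an item holds the numbers; the second row holds only
    the remainder of the description.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # Drop completely empty rows so that the pairing below stays aligned.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    items = []
    for first, second in zip(rows[0::2], rows[1::2]):
        description = f"{first[0].strip()} {second[0].strip()}".strip()
        items.append((description, int(first[2])))
    return items


def total_quantities(items):
    """Sum quantities per description, sorted so similar items sit together."""
    totals = defaultdict(int)
    for description, quantity in items:
        totals[description] += quantity
    return dict(sorted(totals.items()))
```

For a file exported from Tabula, the text could be read with `open(path, encoding="utf-8").read()`. Descriptions that differ slightly between invoices are not merged by this sketch; the sorted output only places them next to each other for manual review, as in the Excel approach.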

thomasgoodfellow / HandlingFamilyPhotos.md
Last active October 9, 2025 16:53
Handling Family Photos

Organisation

Stored in directories by photo date (format YYYY-MM-DD) under B:\pix\Pix_mk4
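As an illustration of the naming scheme, the sketch below builds the directory for a given photo date. It is only an example: the helper name `directory_for` is hypothetical, and how the photo date is obtained (EXIF data, file timestamp, or by hand) is not covered.

```python
from datetime import date
from pathlib import PureWindowsPath

# Root directory as given in the note above.
ROOT = PureWindowsPath(r"B:\pix\Pix_mk4")


def directory_for(photo_date, root=ROOT):
    """Return the directory for a photo date, named YYYY-MM-DD under the root."""
    # date.isoformat() yields the YYYY-MM-DD form used for the directory names.
    return root / photo_date.isoformat()
```

For example, `directory_for(date(2024, 4, 1))` gives `B:\pix\Pix_mk4\2024-04-01`.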

Access

Still using Picasa

Transfer from cameras

thomasgoodfellow / PagesToText.md
Last active January 31, 2024 22:55
Converting book pages to text file

Record page images

Scroll through the book on screen, saving an image of each page, e.g. via ShareX, to an empty directory (by default the file names are prefixed with the window name, here "Firefox"). For best results set the monitor to portrait orientation, display single pages rather than a two-page spread, and zoom in until the page is just short of being cropped.

Prepare the images for OCR

For good OCR, crop out non-text content (page numbers, frames, etc.) and reduce the images to black and white, numbered sequentially in date/time order. Create a subdirectory "crops", edit the crop values in cropem.cmd, then run cropem.cmd with the image directory as the current directory.
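The sequential numbering can be sketched separately from the cropping. The example below is not the contents of cropem.cmd, which are not shown here. It only plans the output names, and it assumes PNG captures, file modification time as the date/time order, and four-digit numbers; all three are assumptions, as is the function name.

```python
from pathlib import Path


def numbering_plan(image_dir, pattern="*.png", width=4):
    """Pair each captured image with a sequential file name under "crops".

    Images are ordered by modification time (assumed to match capture order),
    with the file name as a tie-breaker. Nothing is cropped, converted, or
    written; the function only returns (source, destination) pairs.
    """
    image_dir = Path(image_dir)
    images = sorted(
        image_dir.glob(pattern),
        key=lambda p: (p.stat().st_mtime, p.name),
    )
    return [
        (src, image_dir / "crops" / f"{number:0{width}d}{src.suffix}")
        for number, src in enumerate(images, start=1)
    ]
```

Each pair can then be handed to whatever tool performs the crop and black-and-white conversion, so that the OCR step sees the pages in reading order.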

OCR with Tesseract

Run with "crops" as the current directory:

thomasgoodfellow / Stream2Episodes.md
Last active January 28, 2024 22:48
Converting stream of a series into episode files

Record a stream as MP4

Play the stream in a maximised browser window and record it with ShareX (screen recorder settings: show cursor = OFF, FPS = 30; screen capture options: video = screen-capture-recorder, audio = virtual-audio-capturer). Start recording from a batch file:

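rem Bring the browser window whose title contains the given text to the front, then maximise it
d:\bin\nircmd.exe win activate ititle "distinctive text from browser tab title"
d:\bin\nircmd.exe win max ititle "distinctive text from browser tab title"
rem Start the ShareX screen recorder
"d:\Program Files\ShareX\ShareX.exe" -StartScreenRecorder
rem Wait 3 seconds (key presses do not cut the wait short), then send F11 (full screen in most browsers)
timeout /nobreak /t 3
d:\bin\nircmd.exe sendkeypress F11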