@peaeater
peaeater / tif2jpg.ps1
Last active Jun 22, 2022
Produces one JPG per TIF. Requires ImageMagick (http://www.imagemagick.org/).
<#
1. Leaf
Given a text file of TIF/TIFF filenames, create JPGs from TIFs
and create a mirror directory structure for JPG file outputs.
* Handles filenames with entry separators
* Ignores TIF older than its JPG mirror unless -force param is used
* Requires imagemagick
.\tif2jpg.ps1 -in c:\dev\abc\extract\extracted\tifs\abc-tifs-1.txt `
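
The staleness rule in the description above (skip a TIF that is older than its JPG mirror unless -force is given) could be sketched roughly as follows. The function name `Test-NeedsConversion` and its parameters are hypothetical and do not come from the gist; the comparison on last-write time is an assumption about how "older" is decided.

```powershell
# Hypothetical helper (not from the original gist): decides whether a TIF
# should be (re)converted, comparing last-write times. Assumption: "older"
# means the TIF's last-write time is not later than the JPG's.
function Test-NeedsConversion {
    param(
        [Parameter(Mandatory=$true)][string]$TifPath,
        [Parameter(Mandatory=$true)][string]$JpgPath,
        [switch]$Force
    )
    # -Force always converts, regardless of timestamps
    if ($Force) { return $true }
    # No JPG mirror yet: convert
    if (!(Test-Path -LiteralPath $JpgPath)) { return $true }
    $tif = Get-Item -LiteralPath $TifPath
    $jpg = Get-Item -LiteralPath $JpgPath
    # Convert only when the TIF is newer than the existing JPG
    return ($tif.LastWriteTimeUtc -gt $jpg.LastWriteTimeUtc)
}
```

-LiteralPath is used so that filenames containing wildcard characters such as square brackets are not interpreted as patterns.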
@peaeater
peaeater / date-fix.ps1
Last active Aug 29, 2015
Transforms a variety of kooky date strings into y-m-d. This example assumes Record Created in an Inmagic Tagged Format dump.
<#
Fix ambiguous and non-strict dates in dbtext dump data.
#>
param(
[string]$in = ".",
[string]$out = $in
)
@peaeater
peaeater / jp22jpg.ps1
Created Mar 19, 2014
Produces a JPG per JP2, given an input directory. Output size defaults to 1000px width, and output name mirrors source JP2s. Requires imagemagick.
# convert .jp2 to .jpg
# requires imagemagick
Param(
[int]$size = 1000,
[string]$indir = ".",
[string]$outdir = ".\jpg"
)
if (!(test-path $outdir)) {
@peaeater
peaeater / meta2manifest.ps1
Created Mar 19, 2014
Converts Internet Archive XML metadata about a digitized publication to an XML manifest prepped for Solr ingest. Elements are mapped to Andi fields, and sometimes need transformation (e.g. dates to decades).
# convert IA metadata XML to Solr-ready manifest XML
<#
metadata.imagecount - 2 => pagecount
metadata.identifier => WebSafe($1) => id
metadata.title => title, freetext
metadata.date => toDecade($1) => date, date_free, freetext
metadata.creator => name, name_free, freetext
metadata.publisher => name, name_free, freetext
metadata.year => date_free, freetext
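
Two of the mappings above involve a small transformation: `metadata.imagecount - 2 => pagecount` and `metadata.date => toDecade($1)`. A minimal sketch of what those might look like is below. The function names `ConvertTo-Decade` and `Get-PageCount` are hypothetical, and the decade output format ("1920s") is an assumption; the gist preview does not show how `toDecade` is actually implemented.

```powershell
# Hypothetical sketch (not the gist's own code).
# Assumption: a decade is rendered like "1920s", taken from the first
# four-digit run found in the date string.
function ConvertTo-Decade {
    param([string]$Date)
    if ($Date -match '(\d{4})') {
        $year = [int]$Matches[1]
        # Round the year down to the start of its decade
        return ('{0}s' -f ($year - ($year % 10)))
    }
    # No recognizable year in the input
    return $null
}

# pagecount is imagecount minus 2, as stated in the mapping list
function Get-PageCount {
    param([int]$ImageCount)
    return ($ImageCount - 2)
}
```

For instance, under these assumptions `ConvertTo-Decade '1923-05-01'` yields `1920s`, and `Get-PageCount 120` yields `118`.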
@peaeater
peaeater / raw-ia.ps1
Last active Aug 29, 2015
Processes Internet Archive packages, producing 1 txt, djvu xml, jpg per page of a digitized publication, plus an XML manifest. The output is intended for ingest by Solr through Andi's DIH handler. Jobs are broken into subscript dependencies. Requires imagemagick and djvulibre.
# processes Internet Archive packages, producing per page: 1 txt, 1 ocrxml, 1 jpg
# requires djvulibre, imagemagick
param(
[string]$indir = ".",
[string]$outbase = $indir
)
# Load the zip support assembly; Add-Type replaces the obsolete LoadWithPartialName
Add-Type -AssemblyName System.IO.Compression.FileSystem
@peaeater
peaeater / solr-dih-ingest-with-polling.ps1
Last active Aug 29, 2015
Triggers Solr DIH update and monitors its status. Writes exceptions or final success message to Windows Application Event Log.
<#
Trigger Solr update and poll for status.
- Writes events to Application Event Log; log source must already have been added
Peter Tyrrell
#>
param(
[Parameter(Mandatory=$false,Position=0)]
@peaeater
peaeater / add-eventlog-source.ps1
Created Jun 19, 2014
Adds supplied source value to the Windows Application Event Log. Requires admin privileges and will warn the user if elevation is required.
<#
Add log source to Application Event Log if not already there - REQUIRES ADMIN privileges
Peter Tyrrell
#>
param(
[Parameter(Mandatory=$false,ValueFromPipeline=$true,Position=0)]
[string]$logsrc = "Andi Solr Update"
@peaeater
peaeater / delete-eventlog-source.ps1
Created Jun 19, 2014
Removes a source value from Windows Application Event Log (careful!). Requires admin privileges and warns user if elevation is required.
<#
Remove log source from Application Event Log - requires ADMIN PRIVILEGES
Peter Tyrrell
#>
param(
[Parameter(Mandatory=$true,ValueFromPipeline=$true,Position=0)]
[string]$logsrc
@peaeater
peaeater / push-dir-to-remote.ps1
Last active Aug 29, 2015
Syncs a directory to a remote server via WinSCP in SFTP mode. Writes exceptions or a success message to the Windows Application Event Log.
<#
Sync a directory to remote server via WinSCP in SFTP mode.
- Uses SFTP instead of SCP to deny any shell commands to the user, just file transfer.
- Use a chrooted user on remote server (user jailed to their root dir).
- Writes to Application Event Log; log source must already have been added
Peter Tyrrell
#>
param(
@peaeater
peaeater / extractomatic.ps1
Last active Aug 29, 2015
Sample Andi extractomatic script that creates its own ODBC connection string and logs to the Windows Application Event Log.
<#
Extracts data from the named textbase as files like {tn}-{0}.xml in output folder.
Logs to Application event log - source must already have been added.
#>
param(
[Parameter(Mandatory=$false,Position=0)]
[string]$logsrc = "Andi Solr Update"
)