<#
Split csv files with or without header rows, multiline columns with newlines ok.
#>
param (
    [string]$in,
    [string]$outdir = [System.IO.Path]::GetDirectoryName($in),
    [int]$count = 1000,
    [string]$delimiter = ',',
    [string[]]$header = @()
<fieldType name="text_suggest_edge" class="solr.TextField">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:_-])" replacement=" " replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="30" minGramSize="1"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
  </analyzer>
  <analyzer type="query">
# Creates a new self-signed certificate for testing purposes
New-SelfSignedCertificate -DnsName "*.domain.local" -FriendlyName "*.domain.local Development Certificate" -CertStoreLocation "cert:\LocalMachine\My" -NotAfter (Get-Date).AddYears(100)
<#
Downsample PDF and convert to gray if necessary.
Requires Ghostscript (gswin64c).
#>
param (
    [string]$indir,
    [string]$outdir = $indir,
    [string]$gs = "gswin64c",
    [string]$dpi = "150"
# convert pdf to png
# requires imagemagick w/ ghostscript
param (
    [Parameter(Mandatory=$true,ValueFromPipeline=$true,Position=0)]
    [ValidateScript({[System.IO.Path]::GetExtension($_) -eq ".pdf"})]
    [string]$in,
    [string]$magick = "C:\utils\imagemagick\ImageMagick-7.1.1-Q16-HDRI\magick.exe"
)
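
# A minimal sketch of the conversion step, NOT the original script body (which is
# not shown here). Assumptions: -density 150 is an arbitrary rasterization
# resolution, and the PNG is written next to the source PDF under the same name.
$out = [System.IO.Path]::ChangeExtension($in, ".png")
& $magick -density 150 $in $out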
# ocr tif/png to txt
# requires tesseract
Param(
    [string]$ext = "tif",
    [string]$indir = ".",
    [string]$outdir = $indir,
    [string]$tesseract = "C:\utils\tesseract\tesseract.exe"
)
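
# A minimal sketch of the OCR loop, NOT the original script body (which is not
# shown here). Assumption: each image is handed to tesseract together with an
# output base name, and tesseract itself appends the .txt extension.
Get-ChildItem -Path $indir -Filter "*.$ext" -File | ForEach-Object {
    $outbase = Join-Path $outdir $_.BaseName
    & $tesseract $_.FullName $outbase
}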
<#
sqlps dependency
If module sqlps does not exist, install from:
Microsoft SQL Server 2016 Feature Pack (https://www.microsoft.com/en-us/download/details.aspx?id=52676)
- SQLSysClrTypes.msi
- SharedManagementObjects.msi
- PowershellTools.msi
#>
param( |
<#
Sync a directory to remote server in FTP mode
Peter Tyrrell
#>
param(
    [Parameter(Mandatory = $false, Position = 0)]
    [string]$logsrc = "",
<fieldType name="alphaNumericSort" class="solr.TextField" sortMissingLast="false" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer does no actual tokenizing, so the entire
         input string is preserved as a single token
    -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- The LowerCase TokenFilter does what you expect, which can be
         useful when you want your sorting to be case-insensitive
    -->
    <filter class="solr.LowerCaseFilterFactory" />
<#
1. Leaf
Given a text file of PDF filenames, extract content from PDFs recursively
and create mirror directory structure for text file outputs.
* Handles filenames with entry separators.
* Ignores PDF older than its text file mirror unless -force param is used.
* Requires poppler pdftotext.exe
.\text-mirror.ps1 -in C:\dev\abc\extract\extracted\pdfs\abc-pdfs-1.txt ` |