Joe Wicentowski joewiz

@joewiz
joewiz / check-text-for-ocr-typo-patterns.xq
Last active Apr 29, 2020
Check a text for OCR typo patterns, using XQuery
xquery version "3.1";
(:~
: Find possible OCR errors in a text by checking for patterns that an OCR
: process is known to misread, e.g., "day" misread as "clay", or "France"
: misread as "Prance." If the OCR engine just misread some instances of these
: words but got other instances correct, then this query will highlight
: candidates for correction.
:
: The query lets you configure a source text and define pattern sets to be used.
@joewiz
joewiz / generate-xconfs.xq
Created Apr 27, 2020
Generate eXist facet definitions for xconf index configuration files programmatically
xquery version "3.1";
(: WIP! :)
declare boundary-space preserve;
let $xconfs :=
array {
map {
"qname": "tei:div",
@joewiz
joewiz / migrating-from-old-to-new-indexes.md
Last active Mar 12, 2020
Converting an eXist application from old-style fields to new, Lucene-based facets and fields


This article walks through the process of migrating an eXist application from using old-style fields to using the new, Lucene-based facets and fields. For more information, see the eXist documentation's Lucene article.

Old-style approach

In the old-style approach to fields, fields were constructed and maintained manually via the ft:index() function. To add or update fields for a document, a <doc> element containing <field> elements was passed to this function, along with the URI of the resource to be indexed.

For example, in one application, fields were constructed in the hsa/modules/index.xq library module, whose index:index-one-document() function built the <field> elements and passed them to the ft:index() function:

@joewiz
joewiz / tokenize-sentences-nlp.xq
Last active Feb 6, 2020
Split (or "tokenize") a string into "sentences", with XQuery. See https://gist.github.com/joewiz/5889711
xquery version "3.1";
(: Use the eXist Stanford NLP package for sentence tokenization.
: Compared to my original "naïve" approach, this approach takes a quarter the number of lines of XQuery code.
: See https://gist.github.com/joewiz/5889711 :)
import module namespace nlp="http://exist-db.org/xquery/stanford-nlp";
declare function local:tokenize-sentences($text as xs:string) {
local:tokenize-sentences($text, map{})
@joewiz
joewiz / zip-barebones.xq
Created Jan 17, 2020
Construct a zip file and stream it to a browser, with XQuery & eXist
xquery version "3.1";
let $node := <root><x/></root>
let $entry := <entry name="test.xml" type="xml">{$node}</entry>
let $zip := compression:zip($entry, true())
let $name := "test.zip"
return
response:stream-binary($zip, "media-type=application/zip", $name)
@joewiz
joewiz / pay-periods-between-dates.xq
Created Jan 6, 2020
Generate a list of pay periods (two per month) between two dates, using XQuery
xquery version "3.1";
import module namespace functx="http://www.functx.com";
(: Calculate the number of months between two dates, rounding down :)
declare function local:months-between-dates-floor($start-date as xs:date, $end-date as xs:date) {
local:months-between-dates-floor($start-date, $end-date, xs:yearMonthDuration("P0M"))
};
(: A helper function for local:months-between-dates-floor :)
@joewiz
joewiz / export-eXide-tabs.xq
Last active Sep 18, 2020
Save eXide editor tabs to disk
xquery version "3.1";
(:
# Save eXide editor tabs to disk
1. Install "LocalStorage Manager" Chrome extension
https://chrome.google.com/webstore/detail/localstorage-manager/fkhoimdhngkiicbjobkinobjkoefhkap
@joewiz
joewiz / group-by.xq
Last active Dec 31, 2019
How variables in XQuery FLWOR expressions change when using the "group by" clause
xquery version "3.1";
(:
## How variables in XQuery FLWOR expressions change when using the `group by` clause
Sometimes, when working with a `group by` clause, an XQuery FLWOR expression
might suddenly seem to act strangely, or at least unintuitively. In particular,
variables defined before the `group by` clause might suddenly seem to go haywire.
@joewiz
joewiz / show-http-request-headers.xq
Created Oct 22, 2019
Display all HTTP request headers for the current request (eXist-db)
xquery version "3.1";
array {
request:get-header-names() ! map { . : request:get-header(.) }
}