Joe Wicentowski joewiz

  • Arlington, Virginia
@joewiz
joewiz / merge-fragments.xq
Last active Jul 25, 2021
Merge XML fragments into a composite document
xquery version "3.1";
(:~ Merge XML fragments with the same root node into a composite document.
: Based on a question from the exist-open mailing list.
:
: @see https://markmail.org/message/gdrhscaobyya44we
:)
(: Compare the fragment to the composite document. If the node names match,
@joewiz
joewiz / Second Tuesdays.md
Last active Jun 21, 2021
Second Tuesdays of the year, with XQuery

Today in exist-open, Slav asked:

Hi, anyone have function for calculate all second Tuesdays for year?

(That is, Patch Tuesdays.)

Below are two approaches: one that solves the problem directly (finding only 2nd Tuesdays) and one that solves it more generally (finding any combination of week of the month and day of the week, e.g., 5th Sundays).
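
As a rough illustration of the more general approach, here is a minimal sketch in plain XQuery 3.1. It is not the code from the gist itself: the function name `local:nth-weekday`, its parameters, and the weekday numbering (0 = Monday through 6 = Sunday) are all assumptions made for this example. It finds the weekday of the 1st of the month by counting days from a known Monday (1901-01-07), so it only handles years from 1901 onward.

```xquery
xquery version "3.1";

(: Hypothetical helper, not from the original gist.
 : Returns the $nth occurrence of $weekday (0 = Monday ... 6 = Sunday)
 : in the given month, or the empty sequence if the month has no such day
 : (e.g., a 5th Sunday in a month with only four Sundays).
 : Assumes $year is 1901 or later. :)
declare function local:nth-weekday(
    $year as xs:integer,
    $month as xs:integer,
    $nth as xs:integer,
    $weekday as xs:integer
) as xs:date? {
    let $one-day := xs:dayTimeDuration("P1D")
    let $first := xs:date(
        format-number($year, "0000") || "-" || format-number($month, "00") || "-01"
    )
    (: Whole days since Monday 1901-01-07, mod 7, gives the weekday of the 1st :)
    let $first-weekday := xs:integer(($first - xs:date("1901-01-07")) div $one-day) mod 7
    (: Days from the 1st to the first occurrence of the requested weekday :)
    let $offset := ($weekday - $first-weekday + 7) mod 7
    let $candidate := $first + $one-day * ($offset + 7 * ($nth - 1))
    return
        (: Discard the result if it spilled over into the following month :)
        $candidate[month-from-date(.) eq $month]
};

(: Second Tuesdays (weekday 1) for every month of 2021 :)
for $month in 1 to 12
return local:nth-weekday(2021, $month, 2, 1)
```

Swapping the last two arguments for `5, 6` would list only the months that have a 5th Sunday, since months without one return the empty sequence.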

@joewiz
joewiz / chmod-recursive.xq
Created Jun 3, 2021
Batch change permissions on resources and collections in eXist-db
xquery version "3.1";
import module namespace dbutil="http://exist-db.org/xquery/dbutil";
dbutil:scan(
xs:anyURI("/db/apps/airlock-data"),
function($col, $res) {
if ($res) then
(: Set permissions on resources here :)
(
@joewiz
joewiz / list-packages-with-dependency.xq
Created Jun 2, 2021
Find which EXPath packages declare a particular dependency, in eXist-db
xquery version "3.1";
declare namespace pkg="http://expath.org/ns/pkg";
array {
for $app in xmldb:get-child-collections("/db/apps")
let $package-metadata := doc("/db/apps/" || $app || "/expath-pkg.xml")
where $package-metadata//pkg:dependency[@package eq "http://exist-db.org/apps/shared"]
order by $app
return
@joewiz
joewiz / fib-exist.xq
Last active Oct 10, 2020 — forked from apb2006/fib.xq
XQuery tail recursive Fibonacci function, with timing, for eXist-db
xquery version "3.1";
(: forked from https://gist.github.com/apb2006/4eef5889017be4a50685a467b2754d27
: with tests returned in the style of https://so.nwalsh.com/2020/10/09-fib :)
declare function local:fib($n as xs:integer, $a as xs:integer, $b as xs:integer){
switch ($n)
case 0 return $a
case 1 return $b
default return local:fib($n - 1, $b, $a + $b)
@joewiz
joewiz / check-text-for-ocr-typo-patterns.xq
Last active Apr 9, 2021
Check a text for OCR typo patterns, using XQuery
xquery version "3.1";
(:~
: Find possible OCR errors in a text by checking for patterns that an OCR
: process is known to misread, e.g., "day" misread as "clay", or "France"
: misread as "Prance." If the OCR engine just misread some instances of these
: words but got other instances correct, then this query will highlight
: candidates for correction.
:
: The query lets you configure a source text and define pattern sets to be used.
@joewiz
joewiz / generate-xconfs.xq
Created Apr 27, 2020
Generate eXist facet definitions for xconf index configuration files programmatically
xquery version "3.1";
(: WIP! :)
declare boundary-space preserve;
let $xconfs :=
array {
map {
"qname": "tei:div",
@joewiz
joewiz / migrating-from-old-to-new-indexes.md
Last active Mar 12, 2020
Converting an eXist application from old-style fields to new, Lucene-based facets and fields

Converting an eXist application from old-style fields to new, Lucene-based facets and fields

This article walks through the process of migrating an eXist application from using old-style fields to using the new, Lucene-based facets and fields. For more information, see the eXist documentation's Lucene article.

Old-style approach

In the old-style approach to fields, fields were constructed and maintained manually via the ft:index() function. To add or update fields for a document, a <doc> element containing <field> elements was passed to this function, along with the URI of the resource to be indexed.
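
To make the shape of such a call concrete, here is a minimal sketch for eXist-db. It is not taken from the application discussed here: the document path, field names, and field values are hypothetical, and the `store="yes"` attribute is an assumption about the old-style field syntax that should be checked against the Lucene documentation for your eXist version.

```xquery
xquery version "3.1";

import module namespace ft="http://exist-db.org/xquery/lucene";

(: Hypothetical resource path; the resource must already exist in the database :)
let $doc-uri := "/db/apps/example/data/doc1.xml"
(: Hypothetical fields; the store attribute is assumed, not confirmed by the article :)
let $fields :=
    <doc>
        <field name="title" store="yes">An example title</field>
        <field name="author" store="yes">An example author</field>
    </doc>
return
    (: Attach the fields to the resource in the Lucene index :)
    ft:index($doc-uri, $fields)
```

Because the fields live outside the document, a call like this has to be repeated whenever the document changes, which is the manual maintenance described above.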

For example, in one application, fields were constructed in the hsa/modules/index.xq library module, whose index:index-one-document() function constructed the <field> elements and passed them to the ft:index() function:

@joewiz
joewiz / tokenize-sentences-nlp.xq
Last active Feb 6, 2020
Split (or "tokenize") a string into "sentences", with XQuery. See https://gist.github.com/joewiz/5889711
xquery version "3.1";
(: Use the eXist Stanford NLP package for sentence tokenization.
: Compared to my original "naïve" approach, this approach takes a quarter the number of lines of XQuery code.
: See https://gist.github.com/joewiz/5889711 :)
import module namespace nlp="http://exist-db.org/xquery/stanford-nlp";
declare function local:tokenize-sentences($text as xs:string) {
local:tokenize-sentences($text, map{})