Joe Wicentowski joewiz

## maps-to-csv.xq
xquery version "3.1";

(: Convert a sequence of XQuery maps into CSV or TSV.
 : Each map becomes one row.
 : The entries' keys become column headers.
 :)

declare variable $local:default-options :=
    map {
        (: Character to separate cells with :)

## chmod-recursive.xq
xquery version "3.1";

import module namespace dbutil="http://exist-db.org/xquery/dbutil";

dbutil:scan(
    xs:anyURI("/db/apps/airlock-data"),
    function($col, $res) {
        if ($res) then
            (: Set permissions on resources here :)
            (

## an-introduction-to-recursion-in-xquery.md

      
Last active: January 3, 2024 15:30

An introduction to recursion in XQuery


Created: Nov 28, 2017
Updated: Nov 29, 2017: Now covers transformation of XML documents

Recursion is a powerful programming technique, but the idea is simple: instead of performing a single operation, a function calls itself repeatedly to whittle through a larger task. In XQuery, recursion can accomplish complex tasks that a plain FLWOR expression (which iterates through a sequence) cannot, such as transforming an entire XML document from one format into another, like TEI or DocBook into HTML, EPUB, LaTeX, or XSL-FO. Transforming a document is well-suited to recursion because each of the document's nodes may need to be examined and manipulated based on the node's type, name, and location in the document; and once a node has been processed, the transformation must continue processing the node's children and descendants until the deepest leaf node has been processed. But learning the technique of recursion is often hard for a beginning program
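
The node-by-node traversal described above can be sketched as a single function that inspects each node with a typeswitch expression and then calls itself on that node's children. This is a minimal illustration rather than the article's own code: the function name local:transform and the element names para and emph are hypothetical, chosen only to show the pattern.

```xquery
xquery version "3.1";

(: Hypothetical sketch: local:transform, para, and emph are illustrative
 : names, not taken from the article. :)
declare function local:transform($nodes as node()*) as item()* {
    for $node in $nodes
    return
        typeswitch ($node)
            (: Text nodes are leaves: copy them through unchanged :)
            case text() return $node
            (: Known elements: emit the target element, then recurse into the children :)
            case element(para) return <p>{ local:transform($node/node()) }</p>
            case element(emph) return <em>{ local:transform($node/node()) }</em>
            (: Any other element: drop the wrapper but keep processing its children :)
            case element() return local:transform($node/node())
            (: Comments, processing instructions, etc. are discarded :)
            default return ()
};

local:transform(<doc><para>Hello <emph>recursive</emph> world</para></doc>)
```

Under these assumptions the query should return a p element containing the text "Hello ", an em element wrapping "recursive", and the text " world". The recursion stops on its own because every call receives only the children of the current node, so it eventually reaches leaf nodes that have no children left to process.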

  
## post-mortem.md

      
Last active: September 3, 2023 11:57

Recovery from nginx "Too many open files" error on Amazon AWS Linux

On Tue Oct 27, 2015, history.state.gov began buckling under load, intermittently issuing 500 errors. Nginx's error log was sprinkled with the following errors:

2015/10/27 21:48:36 [crit] 2475#0: accept4() failed (24: Too many open files)
2015/10/27 21:48:36 [alert] 2475#0: *7163915 socket() failed (24: Too many open files) while connecting to upstream...

An article at http://www.cyberciti.biz/faq/linux-unix-nginx-too-many-open-files/ provided directions that mostly worked. Below are the steps we followed. The steps that diverged from the article's directions are marked with an *.

* Instead of using su to run ulimit on the nginx account, use ps aux | grep nginx to locate nginx's process IDs. Then query each process's file handle limits using cat /proc/pid/limits (where pid is the process ID retrieved from ps). (Note: depending on your system, sudo may be necessary for the cat command.)
Added fs.file-max = 70000 to /etc/sysctl.conf


## csv-to-xml.xq
xquery version "3.1";

(: XQuery adaptation of https://github.com/digital-preservation/csv-tools/blob/master/csv-to-xml_v3.xsl.
   See also the thread on basex-talk https://mailman.uni-konstanz.de/pipermail/basex-talk/2016-September/011272.html.
:)

declare function local:get-cells($row as xs:string) {
    (: workaround for lack of lookahead support: append comma to end of row :)
    let $string-to-analyze := $row || ","
    let $analyze := fn:analyze-string($string-to-analyze, '(("[^"]*")+|[^,]*),')

## fix-straight-quotes.xq
xquery version "3.1";

(: This uses the eXist cache module to mimic xsl:accumulator approach described
 : by Norm Walsh at https://so.nwalsh.com/2023/08/08-accumulators :)

declare function local:initiate-cache() {
    cache:destroy("quotes"),
    cache:create("quotes", map{}),
    cache:put("quotes", "counter", 1)
};

## web-scraping-with-xquery.md

      
Last active: March 9, 2023 07:18

Web Scraping with XQuery

Overview

Learning how to web scrape empowers you to apply your XQuery skills to any data residing on the web. You can fetch data from remote sites and services—for example, entire web pages or just the pieces of a page that matter to you. Once fetched, you can perform further analysis on the data, clean it up, mash it up with other data, transform it into different formats, etc.

Built-in functions for making HTTP requests

XPath-based languages like XQuery offer a standard function for accessing remote documents, the fn:doc() function.
However, this function has a limitation: it only works if the URI returns a well-formed XML document.
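
One way to work around this limitation with standard functions alone is to test the URI before reading it. The sketch below is an illustration under assumptions, not the article's own code: the URL is a placeholder, and whether remote URIs can be fetched at all depends on the XQuery processor and its configuration.

```xquery
xquery version "3.1";

(: Hypothetical URL, used only for illustration :)
let $uri := "https://example.com/page.html"
return
    (: fn:doc-available() returns false instead of raising an error
     : when the resource is not a well-formed XML document :)
    if (fn:doc-available($uri)) then
        fn:doc($uri)
    (: Otherwise fall back to reading the resource as plain text, if possible :)
    else if (fn:unparsed-text-available($uri)) then
        fn:unparsed-text($uri)
    else
        ()
```

The result is a document node when the response parses as XML, a string when it can only be read as text, and the empty sequence when neither succeeds. Text retrieved this way still has to be parsed or cleaned up before XPath can be applied to it.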

  
## CompareTools.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
    "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <array>
        <dict>
            <key>ApplicationIdentifier</key>
            <string>ro.sync.exml.DiffDirs</string>
            <key>ApplicationName</key>
            <string>Diff Directories</string>

## json-ignore-whitespace-text-nodes-param.xq
xquery version "3.1";

(: @see https://github.com/eXist-db/exist/commit/53bdb54c664a8063e43392e0b0fb9eac57baf67d#diff-fc5225a3545e6b6807e7810bc9e40219500842a68dbb49a7eb83670016b160ab :)

declare boundary-space preserve;

let $json-ignore-whitespace-text-nodes-param-xml :=
    <output:serialization-parameters xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
        <output:method value="json"/>
        <exist:json-ignore-whitespace-text-nodes value="yes"/>

## restxq-template.xqm
xquery version "3.0";

module namespace my-app = "http://my/app";

(: A starter template for a RestXQ module.
    Save this file into eXist anywhere (e.g., /db/restxq-template.xqm or /db/apps/my-app/modules/restxq-template.xqm),
    And access at http://localhost:8080/exist/restxq/index.html.
    (Why? Because /restxq is the URL space for RestXQ, and the function's %rest:path annotation is for requests to "/index.html".)
    Note that the collection configuration file where you store the module must invoke the RestXQ trigger:
        <collection xmlns="http://exist-db.org/collection-config/1.0">