Joe Wicentowski joewiz

@joewiz
joewiz / post-mortem.md
Last active Jul 7, 2020
Recovery from nginx "Too many open files" error on Amazon AWS Linux
View post-mortem.md

On Tue Oct 27, 2015, history.state.gov began buckling under load, intermittently issuing 500 errors. Nginx's error log was sprinkled with the following errors:

2015/10/27 21:48:36 [crit] 2475#0: accept4() failed (24: Too many open files)
2015/10/27 21:48:36 [alert] 2475#0: *7163915 socket() failed (24: Too many open files) while connecting to upstream...

An article at http://www.cyberciti.biz/faq/linux-unix-nginx-too-many-open-files/ provided directions that mostly worked. Below are the steps we followed. The steps that diverged from the article's directions are marked with an *.

    * Instead of using `su` to run `ulimit` on the nginx account, use `ps aux | grep nginx` to locate nginx's process IDs. Then query each process's file handle limits using `cat /proc/pid/limits` (where pid is the process ID retrieved from `ps`). (Note: depending on your system, `sudo` may be necessary for the `cat` command.)
  1. Added fs.file-max = 70000 to /etc/sysctl.conf
  2. Added `nginx soft nofile 1
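
The limit check described in the starred step can also be scripted. The following is a minimal Python sketch, not part of the original post-mortem: it assumes the usual Linux layout of `/proc/<pid>/limits`, where a "Max open files" row lists the soft and hard limits, and the function names `parse_open_files_limit` and `read_open_files_limit` are hypothetical.

```python
from pathlib import Path

ROW_LABEL = "Max open files"


def parse_open_files_limit(limits_text):
    """Return (soft, hard) from the text of a /proc/<pid>/limits file.

    Values are returned as strings because a limit may be "unlimited".
    Returns None if the row is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith(ROW_LABEL):
            # The columns after the label are: soft limit, hard limit, units.
            soft, hard = line[len(ROW_LABEL):].split()[:2]
            return soft, hard
    return None


def read_open_files_limit(pid):
    """Read the limits of a running process (may require root privileges)."""
    return parse_open_files_limit(Path(f"/proc/{pid}/limits").read_text())
```

Calling `read_open_files_limit(2475)` for an nginx worker PID found via `ps` would return the soft and hard limits that process is actually running with, which may differ from what `ulimit` reports in a login shell.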
joewiz / strip-diacritics.xq
Last active Jun 24, 2020
Strip diacritics, with XQuery
View strip-diacritics.xq
xquery version "3.1";
declare function local:strip-diacritics($string as xs:string) as xs:string {
    let $normalized := normalize-unicode($string, 'NFD')
    let $stripped := replace($normalized, '\p{IsCombiningDiacriticalMarks}', '')
    return
        $stripped
};
declare function local:inspect-diacritics($string as xs:string) as element() {
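
For readers more familiar with Python, the `local:strip-diacritics` approach above (decompose to NFD, then delete combining marks) can be sketched as follows. This is an illustrative translation, not part of the gist; the name `strip_diacritics` is hypothetical, and the character range U+0300 to U+036F is used to mirror the Combining Diacritical Marks block that the XQuery regex targets.

```python
import re
import unicodedata

# The Unicode "Combining Diacritical Marks" block, U+0300 through U+036F.
COMBINING_MARKS = re.compile("[\u0300-\u036f]")


def strip_diacritics(string):
    """Decompose characters (NFD), then remove the combining marks."""
    normalized = unicodedata.normalize("NFD", string)
    return COMBINING_MARKS.sub("", normalized)
```

Like the XQuery version, this only removes marks in that one block, so letters that have no decomposition (such as "ø") pass through unchanged.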
joewiz / map-find.xqm
Last active Jun 10, 2020
An implementation of XQuery 3.1's map:find function for eXist
View map-find.xqm
xquery version "3.1";
(:~
: An implementation of XQuery 3.1's map:find function for eXist, which does not support it natively as of 3.4.0.
:
: @author Joe Wicentowski
: @see https://www.w3.org/TR/xpath-functions-31/#func-map-find
:)
module namespace mf="http://joewiz.org/ns/xquery/map-find";
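
To illustrate what a `map:find`-style lookup does, here is a minimal Python sketch over nested dicts and lists. It is an analogy under stated assumptions rather than the module's implementation: dicts stand in for XQuery maps, lists for arrays and sequences, and the name `map_find` is hypothetical.

```python
def map_find(value, key):
    """Recursively collect every value stored under `key`, at any depth."""
    found = []
    if isinstance(value, dict):
        for k, v in value.items():
            if k == key:
                found.append(v)
            # Keep searching inside the value, whether or not the key matched.
            found.extend(map_find(v, key))
    elif isinstance(value, (list, tuple)):
        for item in value:
            found.extend(map_find(item, key))
    return found
```

For example, `map_find({"a": 1, "b": [{"a": 2}, {"c": {"a": 3}}]}, "a")` collects the three values stored under `"a"` regardless of how deeply they are nested.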
joewiz / roman-numerals.xqm
Last active May 30, 2020
Convert Roman numerals to integers, with XQuery
View roman-numerals.xqm
xquery version "3.0";
module namespace r = "http://joewiz.org/ns/xquery/roman-numerals";
(: Converts standard Roman numerals to integers.
Handles additive and subtractive but not double subtractive.
Case insensitive.
Doesn't attempt to validate a numeral other than a naïve character check.
See discussion of standard modern Roman numerals at http://en.wikipedia.org/wiki/Roman_numerals.
Adapted from an XQuery 1.0 module at
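
The behavior described in the comment above (additive and subtractive notation, case insensitive, with only a naive character check) can be sketched in Python as follows. This is an illustrative sketch rather than a port of the module; the names `VALUES` and `roman_to_int` are hypothetical.

```python
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a standard Roman numeral to an integer (case insensitive)."""
    symbols = numeral.upper()
    # Naive validation: only checks that every character is a known symbol.
    if not symbols or any(ch not in VALUES for ch in symbols):
        raise ValueError(f"not a Roman numeral: {numeral!r}")
    total = 0
    for i, ch in enumerate(symbols):
        value = VALUES[ch]
        next_value = VALUES[symbols[i + 1]] if i + 1 < len(symbols) else 0
        # A smaller symbol before a larger one is subtracted (e.g. IV, XC).
        total += -value if value < next_value else value
    return total
```

As with the module described above, malformed but character-valid input such as "IIII" or "IC" is not rejected; it is simply summed by the same rule.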
joewiz / check-text-for-ocr-typo-patterns.xq
Last active Apr 29, 2020
Check a text for OCR typo patterns, using XQuery
View check-text-for-ocr-typo-patterns.xq
xquery version "3.1";
(:~
: Find possible OCR errors in a text by checking for patterns that an OCR
: process is known to misread, e.g., "day" misread as "clay", or "France"
: misread as "Prance." If the OCR engine just misread some instances of these
: words but got other instances correct, then this query will highlight
: candidates for correction.
:
: The query lets you configure a source text and define pattern sets to be used.
joewiz / generate-xconfs.xq
Created Apr 27, 2020
Generate eXist facet definitions for xconf index configuration files programmatically
View generate-xconfs.xq
xquery version "3.1";
(: WIP! :)
declare boundary-space preserve;
let $xconfs :=
array {
map {
"qname": "tei:div",
joewiz / an-introduction-to-recursion-in-xquery.md
Last active Apr 17, 2020
An introduction to recursion in XQuery
View an-introduction-to-recursion-in-xquery.md

An introduction to recursion in XQuery

  • Created: Nov 28, 2017
  • Updated: Nov 29, 2017: Now covers transformation of XML documents

Recursion is a powerful programming technique, but the idea is simple: instead of performing a single operation, a function calls itself repeatedly to whittle through a larger task. In XQuery, recursion can be used to accomplish complex tasks on data that a plain FLWOR expression (which iterates through a sequence) cannot, such as transforming an entire XML document from one format into another, like TEI or DocBook into HTML, EPUB, LaTeX, or XSL-FO. Transforming a document is well suited to recursion because each of the document's nodes may need to be examined and manipulated based on the node's type, name, and location in the document; and once a node has been processed, the transformation must continue processing the node's children and descendants until the deepest leaf node has been processed. But learning the technique of recursion is often hard for a beginning program
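
The recursive document transformation described above can be sketched outside XQuery as well. The following minimal Python example is an illustration under stated assumptions, not code from the article: the element names in `TAG_MAP` are a made-up, TEI-like vocabulary, and the `transform` function is hypothetical. It converts each node, then calls itself on that node's children until the leaves are reached.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source element names to HTML element names.
TAG_MAP = {"div": "div", "head": "h1", "p": "p", "hi": "em"}


def transform(node):
    """Convert one element, then recurse into each of its children."""
    out = ET.Element(TAG_MAP.get(node.tag, "span"))
    out.text = node.text
    for child in node:
        new_child = transform(child)  # the recursive call
        new_child.tail = child.tail   # keep text that follows the child
        out.append(new_child)
    return out


source = ET.fromstring(
    "<div><head>Title</head><p>Some <hi>bold</hi> text.</p></div>"
)
html = ET.tostring(transform(source), encoding="unicode")
```

The function never needs to know how deep the document is: each call handles exactly one element and delegates everything beneath it to further calls.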

joewiz / exist-xpath-functions.xq
Last active Apr 1, 2020
Compare XPath functions in W3C spec vs. eXist 3.4.0
View exist-xpath-functions.xq
xquery version "3.1";
element modules {
    util:registered-modules()[starts-with(., 'http://www.w3')] !
        element module {
            element namespace-uri {.},
            util:registered-functions(.) !
                element function {.}
        }
}
joewiz / get-latest-created-document.xq
Last active Mar 21, 2020
Get the most recently created document in an eXist collection, using XQuery
View get-latest-created-document.xq
xquery version "3.1";
(: See discussion at http://markmail.org/message/hpu7toznx3fvdiei :)
import module namespace util="http://exist-db.org/xquery/util";
import module namespace xmldb="http://exist-db.org/xquery/xmldb";
declare function local:get-latest-created-document($collection-uri as xs:string) as map(*) {
if (xmldb:collection-available($collection-uri)) then
let $documents := xmldb:xcollection($collection-uri) ! util:document-name(.)