joewiz /
Last active May 12, 2021
Recovery from nginx "Too many open files" error on Amazon AWS Linux

On Tue Oct 27, 2015, our site began buckling under load, intermittently issuing 500 errors. Nginx's error log was sprinkled with the following errors:

2015/10/27 21:48:36 [crit] 2475#0: accept4() failed (24: Too many open files)

2015/10/27 21:48:36 [alert] 2475#0: *7163915 socket() failed (24: Too many open files) while connecting to upstream...

An article provided directions that mostly worked. Below are the steps we followed. The steps that diverged from the article's directions are marked with an *.

  1. * Instead of using su to run ulimit on the nginx account, use ps aux | grep nginx to locate nginx's process IDs. Then query each process's file handle limits using cat /proc/pid/limits (where pid is the process ID retrieved from ps). (Note: depending on your system, the cat command may require sudo.)
  2. Added fs.file-max = 70000 to /etc/sysctl.conf
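The two steps above can be sketched as a small shell script. This is a minimal sketch rather than the exact commands we ran: it assumes a Linux /proc filesystem, swaps in pgrep for ps aux | grep nginx, and open_file_limits is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the soft and hard "Max open files" limits
# found in a limits file such as /proc/<pid>/limits.
open_file_limits() {
    # The relevant line looks like: "Max open files  1024  4096  files",
    # so the soft limit is field 4 and the hard limit is field 5.
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Report the limits of every running nginx process.
# Assumption: pgrep is available; sudo may be needed to read the files.
for pid in $(pgrep nginx); do
    printf '%s: ' "$pid"
    open_file_limits "/proc/$pid/limits"
done

# Show the kernel-wide ceiling that fs.file-max in /etc/sysctl.conf sets.
if [ -r /proc/sys/fs/file-max ]; then
    cat /proc/sys/fs/file-max
fi
```

If the soft limit reported for the nginx processes is far below fs.file-max, the per-process limit, not the kernel-wide one, is the likely bottleneck.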
joewiz / yaml-to-xml.xq
Created Aug 22, 2016
Convert YAML to XML, with XQuery
xquery version "3.0";
(: doesn't support YAML indentation yet - just a start :)
declare function local:process-yaml-value($value) {
let $single-quote := "^'(.+)'$"
let $double-quote := '^"(.+)"$'
return
if (matches($value, $single-quote) or matches($value, $double-quote)) then
let $pattern := "^['""](.+)['""]$"
joewiz / check-text-for-ocr-typo-patterns.xq
Last active Apr 9, 2021
Check a text for OCR typo patterns, using XQuery
xquery version "3.1";
(:~
 : Find possible OCR errors in a text by checking for patterns that an OCR
: process is known to misread, e.g., "day" misread as "clay", or "France"
: misread as "Prance." If the OCR engine just misread some instances of these
: words but got other instances correct, then this query will highlight
: candidates for correction.
: The query lets you configure a source text and define pattern sets to be used.
joewiz / enrich-dates-in-mixed-content.xq
Created Nov 21, 2017
Enrich dates in mixed content, with XQuery
xquery version "3.1";
(: Turning "December 7, 1941" into <date>December 7, 1941</date> isn't too hard, with XPath 3.0's
fn:analyze-string() function, but if the date string occurs in mixed text, such as:
<p>Pearl Harbor was attacked on <em>December</em> 7, 1941.</p>
and you want to preserve the existing element structure to return:
<p>Pearl Harbor was attacked on <date><em>December</em> 7, 1941</date>.</p>
it's quite a bit more challenging.
This query uses string processing to align the results of fn:analyze-string() with the input's
joewiz / date-parser.xqm
Created Aug 26, 2018
Parse various formats of date strings, in XQuery
xquery version "3.1";
(:
Various Date String Parser
- Parses various flavors of date strings, returns as xs:dateTime or xs:date
- Key functions: dates:parseDateTime() and dates:parseDate()
- Adapted by Joe Wicentowski from
- Adapted to standard XQuery (instead of the MarkLogic 0.9-ml flavor)
- TODO: test against
joewiz / adaptive-serialization.xq
Created Sep 15, 2018
Boilerplate for declaring Adaptive serialization in XQuery
xquery version "3.1";
declare namespace output="";
declare option output:method "adaptive";
declare option output:indent "yes";
map { "reference": xs:anyURI("") }
joewiz / group-by.xq
Last active Apr 9, 2021
How variables in XQuery FLWOR expressions change when using the "group by" clause
xquery version "3.1";
(:~
## How variables in XQuery FLWOR expressions change when using the `group by` clause
Sometimes, when working with a `group by` clause, an XQuery FLWOR expression
might suddenly seem to act strangely, or at least unintuitively. In particular,
variables defined before the `group by` clause might suddenly seem to go haywire.
joewiz / zip-barebones.xq
Created Jan 17, 2020
Construct a zip file and stream it to a browser, with XQuery & eXist
xquery version "3.1";
let $node := <root><x/></root>
let $entry := <entry name="test.xml" type="xml">{$node}</entry>
let $zip := compression:zip($entry, true())
let $name := ""
response:stream-binary($zip, "media-type=application/zip", "")
joewiz / tokenize-sentences.xq
Last active Apr 9, 2021
Split (or "tokenize") a string into "sentences", with XQuery. See
xquery version "1.0";
(: A naive approach to sentence tokenization inspired by
: Works well with edited text like newspapers. Parameters like punctuation can/should be edited;
: see the section below called "criteria".
: For a more sophisticated approach, see Tibor Kiss and Jan Strunk, "Unsupervised Multilingual
: Sentence Boundary Detection", Computational Linguistics, Volume 32, Issue 4, December 2006,
: pp. 485-525. Also, see these discussions of sentence tokenization:
joewiz / json-xml.xqm
Last active Jan 19, 2021
An implementation of XQuery 3.1's fn:json-to-xml and fn:xml-to-json functions for eXist
xquery version "3.1";
(:~
 : An implementation of XQuery 3.1's fn:json-to-xml and fn:xml-to-json functions for eXist, which does not support them natively as of 4.3.0.
 : @author Joe Wicentowski
 : @version 0.4
 : @see
 :)
module namespace jx = "";