Joe Wicentowski joewiz

  • Arlington, Virginia
joewiz / check-links.xq
Created April 21, 2014 13:51
Check a remote webpage for broken links, with XQuery and EXPath HTTP Client
<?xml-model href="../../schemas/frus.rnc" type="application/relax-ng-compact-syntax"?>
joewiz / generate-xfdf.xq
Created February 13, 2014 05:40
Generate XFDF data for use in filling a PDF form, with XQuery; see http://joewiz.org/2014/02/13/filling-pdf-forms-with-pdftk-xfdf-and-xquery/
xquery version "3.0";
import module namespace functx="http://www.functx.com";
(: Prepare XFDF data to use with PDFtk to populate a blank form SF702,
e.g., http://www.archives.gov/isoo/security-forms/sf702.pdf,
with data for each month of the year :)
let $data-collection := xmldb:create-collection('/db', 'sf702')
let $year := 2014
joewiz / save-collection-to-disk.xq
Created October 30, 2013 22:02
Save a collection of XML documents from eXist-db onto the file system, with XQuery
xquery version "3.0";
let $file-system-target-base-directory :=
(: Mac :)
(: '/Users/Joe/workspace/paho-trunk' :)
(: Windows :)
'C:\Users\wicentowskijc\oxygensvn\paho-trunk'
let $source-collection := '/db/cms/apps/tei-content/data/short-history'
for $doc in collection($source-collection)
let $target :=
joewiz / milestones-to-tsv.xq
Created October 28, 2013 19:10
Create a tab-separated values (TSV) file, similar to a comma-separated values (CSV) file, from a collection of TEI documents, with XQuery
xquery version "3.0";
declare namespace tei="http://www.tei-c.org/ns/1.0";
let $site-base-url := 'http://history.state.gov/milestones'
let $milestones-files := collection('/db/cms/apps/tei-content/data/milestones')/tei:TEI
let $tab-delimited-cells :=
for $file in $milestones-files
let $filename := substring-before(util:document-name($file), '.xml')
return
joewiz / nixon-chiefs-of-mission.xq
Last active December 22, 2015 20:19
Chiefs of Mission appointed during the presidency of Richard Nixon, using XQuery
xquery version "3.0";
let $roles := collection('/db/cms/apps/principals-chiefs/code-tables/roles/data')/role
let $countries := collection('/db/cms/apps/countries/data')/country
let $start-date := '1969-01-20'
let $end-date := '1974-08-09'
let $appointments := collection('/db/cms/apps/principals-chiefs/data')//event[@type='appointed' and @when gt $start-date and @when lt $end-date]/parent::role[@class='chief' and @type ne 'charge-daffaires-ad-interim']
return
<div>
<p>{count($appointments)} Chiefs of Mission who were appointed between {format-date(xs:date($start-date), "[MNn] [D], [Y]")} and {format-date(xs:date($end-date), "[MNn] [D], [Y]")}.</p>
joewiz / principals-and-chiefs-most-postings.xq
Created September 5, 2013 13:16
Show ambassadors ordered by number of postings, most first, using XQuery
xquery version "3.0";
for $ambassador in collection('/db/cms/apps/principals-chiefs/data')/person
let $postings := $ambassador/role
let $how-many-postings := count($postings)
group by $how-many-postings
order by $how-many-postings descending
return
<group postings="{$how-many-postings}" people-in-this-group="{count($ambassador)}">
{
joewiz / dehyphenate.xq
Last active March 3, 2016 04:54
Dehyphenate text suffering from improper hyphenation, using XQuery
xquery version "3.0";
(: Functions to dehyphenate a word or a paragraph suffering from improper hyphenation.
Uses a dictionary (a list of known words), such as those available at:
https://github.com/marklogic/dictionaries/tree/master/dictionaries
:)
declare namespace fn="http://www.w3.org/2005/xpath-functions";
declare namespace spell="http://marklogic.com/xdmp/spell";
joewiz / get-tei-articles-collection-summary.xq
Last active December 20, 2015 17:29
Find the shortest and longest article in a collection of TEI XML articles by word count, and calculate the average word count, using XQuery
xquery version "3.0";
(: find the shortest and longest article and get the average word count of a collection of TEI XML articles :)
declare namespace tei="http://www.tei-c.org/ns/1.0";
(: in our case, 'articles' are TEI divs that have @xml:id attributes and no descendant divs;
we filter out the foreword since it is not a full article. :)
let $milestone-articles := collection('/db/cms/apps/tei-content/data/milestones')//tei:div[@xml:id and not(.//tei:div)][@xml:id ne 'foreword']
let $article-infos :=
joewiz / fix-name-capitalization.xq
Last active December 20, 2015 16:18
Fix problems with mis-capitalized names, with XQuery
xquery version "3.0";
declare namespace fn="http://www.w3.org/2005/xpath-functions";
(: Fix problems with mis-capitalized names. For example:
Before: MACARTHUR, Douglas II
After: MacArthur, Douglas II
:)
declare function local:fix-name-capitalization($name as xs:string) {
(: