@CliffordAnderson
Last active July 17, 2022 01:56
XQuery Puzzles from Week 1, Day 5, Session 4 of the Advanced Digital Editions NEH Institute at the University of Pittsburgh
xquery version "3.1";

(: Puzzle #1: convert a date such as "July 15, 2022" to xs:date :)
let $date := "July 15, 2022"
let $tokens := fn:tokenize($date, " ")
let $month :=
  switch ($tokens[1])
    case "January" return "01"
    case "February" return "02"
    case "March" return "03"
    case "April" return "04"
    case "May" return "05"
    case "June" return "06"
    case "July" return "07"
    case "August" return "08"
    case "September" return "09"
    case "October" return "10"
    case "November" return "11"
    case "December" return "12"
    default return "01"
let $day := $tokens[2] => translate(",", "")
let $year := $tokens[3]
return xs:date($year || "-" || $month || "-" || $day)

xquery version "3.1";

(: Puzzle #2: convert several dates, padding single-digit days with a leading zero :)
for $date in ("July 15, 2022", "January 5, 2023")
let $tokens := fn:tokenize($date, " ")
let $month :=
  switch ($tokens[1])
    case "January" return "01"
    case "February" return "02"
    case "March" return "03"
    case "April" return "04"
    case "May" return "05"
    case "June" return "06"
    case "July" return "07"
    case "August" return "08"
    case "September" return "09"
    case "October" return "10"
    case "November" return "11"
    case "December" return "12"
    default return "01"
let $day := ($tokens[2] => translate(",", "")) ! (if (fn:string-length(.) < 2) then "0" || . else .)
let $year := $tokens[3]
return xs:date($year || "-" || $month || "-" || $day)

xquery version "3.1";

(: Puzzle #3: convert a Roman numeral to an integer :)

(: Map a single Roman symbol to its value; unknown symbols count as 0. :)
declare function local:char-to-int($roman as xs:string) as xs:integer {
  switch ($roman)
    case "I" return 1
    case "V" return 5
    case "X" return 10
    case "L" return 50
    case "C" return 100
    case "D" return 500
    case "M" return 1000
    default return 0
};

(: Add each symbol's value, negating it when a larger symbol follows. :)
declare function local:roman-to-integer($roman as xs:string) as xs:integer {
  fn:sum(
    let $symbols := string-to-codepoints($roman) ! codepoints-to-string(.) ! local:char-to-int(.)
    for $symbol at $index in $symbols
    return
      if ($index lt fn:count($symbols)) then
        let $next-symbol := $symbols[$index + 1]
        return
          if ($symbol ge $next-symbol) then $symbol
          else -$symbol
      else $symbol
  )
};

local:roman-to-integer("MDXCIX")
@djbpitt

djbpitt commented Jul 15, 2022

Cool! Puzzles!

Here's an alternative to #1 that uses maps and substring-before() instead of translate() and string-join() instead of ||. It is in no way better than the solution above.

xquery version "3.1";
let $date as xs:string := "July 15, 2022"
let $tokens as xs:string+ := tokenize($date, " ")
let $months as map(*) := map {
    "January": "01",
    "February": "02",
    "March" : "03",
    "April" : "04",
    "May" : "05",
    "June" : "06",
    "July" : "07",
    "August" : "08",
    "September" : "09",
    "October" : "10",
    "November" : "11",
    "December": "12"
}
let $month as xs:string := $months($tokens[1])
let $day as xs:string := $tokens[2] => substring-before(',')
let $year as xs:string := $tokens[3]
return string-join(($year, $month, $day), '-')

@djbpitt

djbpitt commented Jul 15, 2022

And here's another poke at #2. The only difference (beyond those in #1) is format-number() instead of string surgery. I'd prefer format-integer(), but that has not yet been implemented in eXist-db.

xquery version "3.1";

for $date in ("July 15, 2022", "January 5, 2023", "November 12, 2024")
let $tokens as xs:string+ := tokenize($date, " ")
let $months as map(*) := map {
    "January": "01",
    "February": "02",
    "March" : "03",
    "April" : "04",
    "May" : "05",
    "June" : "06",
    "July" : "07",
    "August" : "08",
    "September" : "09",
    "October" : "10",
    "November" : "11",
    "December": "12"
}
let $month as xs:string := $months($tokens[1])
let $day as xs:string := $tokens[2] => substring-before(',') => xs:integer() => format-number('00')
let $year as xs:string := $tokens[3]
return string-join(($year, $month, $day), '-')
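
On a processor that does implement format-integer(), the padding step might look like the sketch below. The sample day tokens are made up for illustration, and this has not been tried in eXist-db.

```xquery
xquery version "3.1";

(: Sketch only: pad day numbers to two digits with format-integer().
   The input strings stand in for $tokens[2] from the query above. :)
for $day in ("15,", "5,")
return $day => substring-before(',') => xs:integer() => format-integer('00')
```

The cast to xs:integer is there because format-integer() expects an integer rather than a string.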

@djbpitt

djbpitt commented Jul 15, 2022

My first introduction to the problem behind #3, and to fold-right() in general, was https://joewiz.org/2021/05/30/converting-roman-numerals-with-xquery-xslt/.

@CliffordAnderson
Author

@djbpitt, I like your alternative versions of the first two queries. The map provides an elegant alternative to switch. And your string functions will no doubt prove more robust than my "string surgery."

@CliffordAnderson
Author

As for fold-left, good to see Joe's version for Roman numerals. As I mentioned to you and @gabikeane, here's my version of the Luhn algorithm in XQuery. I wrote the first version in 2013 in XQuery 3.0 and had to rewrite it a few years later after the signature changed in XQuery 3.1.

@hcayless

Just for fun, here's a recursive version of the Roman numeral conversion:

xquery version "3.1";

module namespace r = "https://philomousos.com/roman-numerals";

declare variable $r:numerals := map {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000 };

declare function r:convert($num as xs:string, $sum as xs:integer) as xs:integer {
  if (string-length($num) eq 0) then 
    $sum
  else
    if ($r:numerals(substring($num, string-length($num))) le $sum div 5) then
      r:convert(substring($num, 1, string-length($num) - 1), $sum - $r:numerals(substring($num, string-length($num))))
    else
      r:convert(substring($num, 1, string-length($num) - 1), $sum + $r:numerals(substring($num, string-length($num))))
};

declare function r:convert($num as xs:string) as xs:integer { 
  r:convert($num, 0)
};

Roman numerals can most easily be summed from right to left. If the next number is much smaller than the sum so far (to be exact, less than or equal to the sum divided by 5), then it should be subtracted from the sum, otherwise added. So, with "XIV", you start with 5, then 1 is less than or equal to 5/5, so you subtract 1 from 5 and get 4, then 10 is greater, so you add 4 + 10, and you're done.
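
The same right-to-left rule can also be written as a fold, which ties in with the fold-right() mention earlier in the thread. This is only a sketch: the name local:roman-fold is hypothetical and not part of the module above, and like the recursive version it does no error handling.

```xquery
xquery version "3.1";

(: Hypothetical helper: applies the "subtract if value <= sum div 5" rule
   with fold-right(), which visits the symbol values from right to left. :)
declare function local:roman-fold($num as xs:string) as xs:integer {
  let $numerals := map {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
  let $values := string-to-codepoints($num) ! codepoints-to-string(.) ! $numerals(.)
  return
    fold-right($values, 0,
      function($value as xs:integer, $sum as xs:integer) as xs:integer {
        if ($value le $sum div 5) then $sum - $value
        else $sum + $value
      }
    )
};

local:roman-fold("XIV")
```

For "XIV" the fold goes through the same steps as the walkthrough above (5, then 5 - 1, then 4 + 10) and returns 14.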

(For the students) Recursion is what it's called when a function calls itself. In this case, r:convert() calls r:convert(), each time adding to the sum and whittling down the number string, character by character, from the right.

Recursion is fun, and a little bit dangerous. Fun because it allows you to write much simpler code. Notice how short the function is. Dangerous because you have to be sure there's a way out, or you'll cause an impressive-looking error. The computer manages code execution by keeping what's called a "stack" that tracks what function has called what. Stack depth is limited, so if your recursive function gets too deep, it will throw an error. There always has to be an "exit condition". Here, it's when the $num parameter has zero length: that means we're done. Also note that I haven't added any error handling, which is left as an exercise for the reader.
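
As one possible start on the error-handling exercise, a small check could reject unexpected characters before the conversion runs. The function name local:check-roman and the error code local:invalid-roman are hypothetical, and the check only looks at which characters appear, not whether they are in a valid order.

```xquery
xquery version "3.1";

(: Hypothetical helper: returns the input unchanged if it contains only
   Roman numeral symbols, and raises an error otherwise. :)
declare function local:check-roman($num as xs:string) as xs:string {
  if (matches($num, "^[IVXLCDM]+$")) then
    $num
  else
    error(xs:QName("local:invalid-roman"), "Not a Roman numeral: " || $num)
};

local:check-roman("XIV")
```

The conversion function could then be called on the result of the check, so that an empty string or a stray lowercase letter fails with a readable message instead of a type error from the map lookup.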

@CliffordAnderson
Author

Thanks for this excellent write-up of a recursive solution, @hcayless !

@CliffordAnderson
Author

CliffordAnderson commented Jul 17, 2022

Just to add to the fun, here's a solution that uses windowing. As I mentioned briefly yesterday, XQuery 3.0 added window as a clause to FLWOR expressions. A window allows you to partition your tuple streams. You can use windows to inflect your data, giving structure to data that would otherwise be flat. I use a window clause in this function to capture any pairwise Roman numerals like IV or IX.

xquery version "3.1";

declare function local:convert-roman($roman as xs:string) as xs:integer {
  fn:sum(
    let $symbols as map(xs:string, xs:integer) := map {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000 }
    let $tokens as xs:string+ := string-to-codepoints($roman) ! codepoints-to-string(.)
    let $values as xs:integer+ :=  $tokens ! $symbols(.)
    for tumbling window $w in fn:reverse($values)
        start $start when true() 
        end $end next $next when $next ge $end
    return
      typeswitch($w)
        case xs:integer return $w
        default return $w[1] - $w[2]
  )
};

local:convert-roman("DCCLXXIV")
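
To see what the window clause is doing, it may help to look at the windows themselves rather than their sum. This sketch feeds in the reversed values of "XIV" by hand (5, 1, 10) and prints each window as a hyphen-separated string.

```xquery
xquery version "3.1";

(: Sketch: show how the tumbling window groups the reversed values of "XIV".
   A window closes as soon as the next value is at least as large as the
   current one, so a subtractive pair such as (5, 1) stays together. :)
for tumbling window $w in (5, 1, 10)
    start when true()
    end $end next $next when $next ge $end
return string-join($w ! string(.), "-")
```

This yields two windows, "5-1" and "10". The first is the pair that the function above turns into 5 - 1, and the second is a single value that is added as is.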

Right now, window clauses are available in BaseX and Saxon, but not eXist-db. But hopefully, they'll be coming to eXist-db soon!
