Better filename sort
Typically, strings are sorted by the Unicode values of their characters. That produces some odd behavior, which shows up in naive sorts of filenames:
- "Zoo.txt" is sorted before "academy.zip"
- "12 Final chapter.txt" is sorted before "2 Second chapter.txt"
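To see both problems at a REPL, here is a small illustration using Clojure's built-in sort with its default comparator. Nothing here is specific to filenames; it is plain string comparison.

```clojure
;; The default comparator orders strings character by character.
;; Uppercase "Z" (90) has a lower value than lowercase "a" (97),
;; and the character "1" (49) comes before the character "2" (50).
(sort ["academy.zip" "Zoo.txt"])
;=> ("Zoo.txt" "academy.zip")

(sort ["2 Second chapter.txt" "12 Final chapter.txt"])
;=> ("12 Final chapter.txt" "2 Second chapter.txt")
```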
We should be able to find something a bit smarter. Let's try two rules:
- Case insensitivity — a and A should be equivalent
- Numbers in order — strings of digits should sort in numerical order, not string order
Write a function that you can pass to sort-by that will sort filenames correctly.
Examples
(sort-by filename-order ["Zoo.txt" "academy.zip"]) ;=> ("academy.zip" "Zoo.txt")
(sort-by filename-order ["12 Final chapter.txt" "2 Second chapter.txt"]) ;=> ("2 Second chapter.txt" "12 Final chapter.txt")
Bonus
It's also unfortunate that "elephant.txt" is sorted so far away from "éléphant.txt" (the French version). Make them sort next to each other. "e" should come before "é".
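One possible way to approach the bonus, sketched under assumptions: decompose each string with java.text.Normalizer (NFD form), strip the combining marks to get an unaccented primary key, and fall back to the original string to break ties. The names strip-accents and accent-key are hypothetical, and this sketch handles only accents, not case or digit runs.

```clojure
(require '[clojure.string :as str])
(import 'java.text.Normalizer 'java.text.Normalizer$Form)

(defn strip-accents
  "Hypothetical helper: decomposes accented characters (NFD) and
  removes the combining marks, so \"é\" becomes \"e\"."
  [s]
  (-> (Normalizer/normalize s Normalizer$Form/NFD)
      (str/replace #"\p{M}+" "")))

(defn accent-key
  "Hypothetical sort key: the unaccented text first, then the original
  string as a tie-breaker so that \"e\" sorts before \"é\"."
  [s]
  [(strip-accents s) s])

(sort-by accent-key ["zebra.txt" "éléphant.txt" "dog.txt" "elephant.txt"])
;=> ("dog.txt" "elephant.txt" "éléphant.txt" "zebra.txt")
```

The tie-breaker works here because both keys are two-element vectors, and plain "e" has a lower character value than "é".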
Email submissions to eric@purelyfunctional.tv before September 06, 2020. You can discuss the submissions in the comments below.
Here's a variant of my solution which attempts to account for accented characters:
The spec is a little vague on how to sort strings that have runs of digits other than at the start. My approach partitions a string to a vector of alternating string and integer elements. For example, "Chapter 4 Part 3" becomes [" chapter " 4 " part " 3]. I add a single space character at the start to ensure the first element in the vector is a string, which allows the vectors to be compared (during sorting) without mishap.
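The partitioning described above might look something like the following. This is a minimal sketch, not the original solution: the regex, the use of Long/parseLong, and the compare-chunks helper are all assumptions, and accent handling is left out.

```clojure
(require '[clojure.string :as str])

(defn filename-order
  "Sketch of a sort key: lower-cases the name, prepends a space so the
  first chunk is always a string, then splits it into alternating
  non-digit strings and integers."
  [filename]
  (->> (str " " (str/lower-case filename))
       (re-seq #"\d+|\D+")
       (mapv (fn [chunk]
               (if (re-matches #"\d+" chunk)
                 (Long/parseLong chunk) ; assumes the digit run fits in a long
                 chunk)))))

(filename-order "Chapter 4 Part 3")
;=> [" chapter " 4 " part " 3]

(sort-by filename-order ["Zoo.txt" "academy.zip"])
;=> ("academy.zip" "Zoo.txt")

(sort-by filename-order ["12 Final chapter.txt" "2 Second chapter.txt"])
;=> ("2 Second chapter.txt" "12 Final chapter.txt")

(defn compare-chunks
  "Hypothetical comparator: compares two chunk vectors element by
  element, and only falls back to length when one is a prefix of the other."
  [a b]
  (or (first (remove zero? (map compare a b)))
      (compare (count a) (count b))))

(sort-by filename-order compare-chunks ["b.txt" "a1.txt"])
;=> ("a1.txt" "b.txt")
```

One caveat worth knowing: Clojure's default compare orders vectors by length before looking at their elements, so with the key function alone, names that split into different numbers of chunks are ordered by chunk count first. Passing a comparator such as compare-chunks as the second argument to sort-by is one way around that.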