Skip to content

Instantly share code, notes, and snippets.

@Util
Created June 10, 2021 16:43
Show Gist options
  • Save Util/2d6a03a76021e5da4074574da8e3e78e to your computer and use it in GitHub Desktop.
Save Util/2d6a03a76021e5da4074574da8e3e78e to your computer and use it in GitHub Desktop.
Raku docs - Guide to Sorting - for public comment during TPRC conference BoF
=begin pod :kind("Language") :subkind("Language") :category("tutorial")
=TITLE Sorting
=SUBTITLE Guide to sorting in Raku
=head1 Quick solutions to common tasks
Simple sorts of list of values:
=begin code
@s = @numbers.sort(+*); # Numeric, low to high
@s = @numbers.sort(-*); # Numeric, high to low
@s = @strings.sort(~*); # Alphabetic, A to Z
@s = @strings.sort(~*).reverse; # Alphabetic, Z to A
=end code
Sort by method call on each value (C<.method-name>):
=begin code
@s = @strings.sort(+ *.chars); # By string "length"
@s = @strings.sort(~ *.fc); # Alphabetic, case-insensitive ("folded" case)
# Do NOT use .lc or .uc; See "Ignoring case" below.
=end code
Sort by subroutine call on each value (C<.&sub-name>):
=begin code
sub last_digit ( Int $n ) { return $n % 10 }
sub last_character ( Str $s ) { return substr( $s, *-1 ) }
@s = @numbers.sort(+ *.&last_digit); # Numeric
@s = @strings.sort(~ *.&last_character); # Alphabetic
=end code
Sort files by size, with same-sizes ordered by name (multiple keys):
=begin code
@IO_Path_objects = dir().sort({ - .s, ~ .basename });
=end code
Sort a hash:
=begin code
my %bread_prices = Wheat => 1.39, Rye => 1.19;
@key_value_pairs = %bread_prices.sort; # By hash key
@key_value_pairs = %bread_prices.sort( + *.value ); # By hash value
say "{.key} costs {.value} per loaf" for @key_value_pairs;
=end code
Sort a hash's keys (when you do not need the sorted hash values):
=begin code
my %bread_prices = Wheat => 1.39, Rye => 1.19;
@keys = %bread_prices.keys.sort; # By hash key
@keys = %bread_prices.keys.sort({ + %bread_prices{$_} }); # By hash value
say "$_ costs %bread_prices{$_} per loaf" for @keys;
=end code
Translating a sort from Perl:
=begin code
@s = sort { $a->[3] <=> $b->[3] } @array_of_arrays; # Perl
@s = sort { $^a.[3] <=> $^b.[3] }, @array_of_arrays; # Raku
# ^ ^ ^ ^ ^
# $a and $b become $^a and $^b, arrow becomes dot, and add a comma after the code block.
=end code
=head1 Contraindications
When to not sort, and How not to sort:
=item DO NOT use .sort to get only the least or greatest value. See Minimums and Maximums below.
=item DO NOT use .sort to do "partial" sorts; See Topological Sorting below.
=item DO NOT sort to just do a one-time binary search. Use a hash lookup, or .first; they are faster, and much harder to get wrong.
=item DO NOT write your own Heapsort, Quicksort, Shellsort, Bubblesort, Bogosort, or any other sort of sort taught in Computer Science. The modern way is to write only a comparitor function and pass that function to Raku's sort.
=head1 Intro / Basics
=item Raku, like other languages, does not require you to write code to move elements around
=item Humans expect numbers to sort differently from strings. Alphabetically, 42 comes before 8, because 4 comes before 8; there is no need to compare a second character position. So, string comparison will be different from number comparison.
=item Raku has separate comparison operators for numbers and strings.
=begin table
* | equal | not equal | less than | less than or equal to | greater than | greater than or equal to
=========================================================================================================
Numbers | == | != | < | <= | > | >=
Strings | eq | ne | lt | le | gt | ge
=end table
There are also three-way operators, and Unicode and generic versions of the above; we will cover those later.
=head1 How it all works
Starting with 2-arity
=head1 Special Needs of Datatypes
=head2 Strings
=head2 Numbers
=head2 Lists
=head2 Arrays
=head2 Hashes
=head2 Pairs
DO NOT sort hashes... (Leave out, because the more important advice is to sort anytime you are getting the whole hash for consistancy. Or, maybe OK to say both in the Hash section)
my %soda_counts = Cola => 25, Orange => 11, Grape => 17;
my @low_stock = %soda_counts.grep( *.value <= 20 );
say "We need: ", @low_stock.sort if @low_stock;
=head2 Ignoring case
For case-insensitive sorting:
=item DO use L<C<fc>|/routine/fc>, as either the C<fc()> subroutine or the C<.fc> method.
=item DO NOT use L<C<uc>|/routine/uc> or L<C<lc>|/routine/lc> or L<C<tc>|/routine/tc>.
Outside of sorting:
=item DO use L<C<uc>|/routine/uc> or L<C<lc>|/routine/lc> or L<C<tc>|/routine/tc> wherever you need them. They work as expected in Unicode, except when used for case-insensitive sorting.
=item DO use L<C<fc>|/routine/fc> for case-insensitive comparison, like C<say 'Found!' if fc($name) eq fc($wanted);>.
=item DO NOT use L<C<fc>|/routine/fc> for anything else. It was created for case-insensitive sorting, and is not expected to be useful outside that limited domain. Specifically, you can not rely on the output of L<C<fc>|/routine/fc> to be always uppercase nor always lowercase. If you discover a use case to the contrary, please inform us.
Before Unicode, the common method for case-insensitive alphabetic sorting was to coerce all the input to all be the same case (all-lowercase or all-uppercase, at the programmer's whim).
That coercion might be done during the comparisons, or (if the case-changed strings were useful later) performed on all the input values before the sort began executing.
=item That lc/uc technique is broken in Unicode
=item All Raku strings are Unicode strings
To correctly support all languages, Unicode separates the concept of "converting case" from the concept of "comparing without case".
Unicode defines the exact process to achieve case-insensitive comparison via "case folding".
Raku's implementation of that process is L<C<fc>|/routine/fc>.
If you are translating an existing sort where L<C<lc>|/routine/lc> or L<C<uc>|/routine/uc> was used in the comparator, you can just replace the L<C<lc>|/routine/lc> or L<C<uc>|/routine/uc> with L<C<fc>|/routine/fc>.
XXX Speak to the broader cases? lc/uc transforms before the sort are iffy, but removing them in favor of .fc comparator will change the case uniformity of the output! If you are translating or refactoring code that used L<C<lc>|/routine/lc> or L<C<uc>|/routine/uc> before the sort, and ...
For deeper details, links to the relevant parts of the Unicode standard, and modules that might (untested!) give finer control of the folding, see the L<Perl language documentation about fc|https://perldoc.perl.org/functions/fc> .
# XXX Change @a to @nums and @words. @a and @b will clash with $^a and $^b
# sort(~*.&last_digit).reverse (reverse reverse) Note that r-r is needed!
# Do use +-~, so your don't have to think about the return value of the method
=head2 Default sorting / bare C<.sort>
=begin item
DO use the default .sort for generic ordering.
For example, a module may need to be generic, and not assume anything about its input, even whether values are string or number.
=end item
=begin item
DO use the default .sort for ordering objects that have a custom C<cmp> defined.
=end item
=begin item
DO use the default .sort when you do not care what the order is as long as that order is consistant across runs.
This often occurs when writing tests.
=end item
=begin item
DO NOT use the default .sort otherwise.
Even when the default would work correctly becuase are certain that your input will B<always> be numbers, use C<.sort(+*)>;
Your intent will be clearer to future readers, your code will not malfunction with quoted numbers (like "42"), and you will get a warning if you were wrong about "always".
Similarly, using C<.sort(~*)> when you are expecting all strings will make your intent clear. Further, the ordering of 42 and 8 in C<('apple', 8, 42).sort(~*)> will not suddenly change just because someone removed the apple.
=end item
=head1 Related tasks
=head2 Is this list sorted? Show with comparitor function too!
[<]
TODO
=head2 Minimums and Maximums
TODO
=head2 Topological sorting
Also called a "partial" ordering or Dependency List.
You might have information that:
=item Hammurabi came before Caesar
=item Caesar came before Napoleon
=item Napoleon came before Churchill
=item Napoleon came before Lincoln
If you want to put these leaders into a chronological order with that information, .sort will fail for two reasons:
The ordering of Lincoln vs Churchill is ambiguous; it is not in the table, and cannot be calculated from the table, so their ordering will be undefined.
The ordering of Hammurabi vs (Napoleon,Lincoln,Churchill) are clear to a human reading the table, but they are not explicitly given, and .sort may call the comparitor with Hammurabi-vs-Napoleon. If the comparitor is only relying on the table as given, it cannot return the correct value.
The solution is to use a completely different method of ordering
See http://rosettacode.org/wiki/Topological_sort#Raku
=begin code
sub topo_sort ( %deps ) {
my %ba;
# Build %ba, the before-after ordered Hash of Hashes (working as Sets).
for %deps.kv -> $before, @afters {
for @afters -> $after {
%ba{$before}{$after} = True if $before ne $after;
%ba{$after} //= {};
}
}
# Find all the nodes with no dependents, harvest them, and remove them.
# That removal frees up the next level of nodes.
# Loop to harvest the next level, until nothing is left.
my @r = gather while %ba.grep( not *.value )».key -> @afters {
.take for @afters;
%ba{@afters}:delete;
.{ @afters}:delete for %ba.values;
}
warn "Cycle found! {%ba.keys.sort}" if %ba;
return @r;
}
my %deps =
Hammurabi => ( 'Caesar', ),
Caesar => ( 'Napoleon', ),
Napoleon => ( 'Churchill', 'Lincoln' ),
;
say topo_sort(%deps).reverse;
# OUTPUT: (Hammurabi Caesar Napoleon Lincoln Churchill)
# or
# OUTPUT: (Hammurabi Caesar Napoleon Churchill Lincoln)
=end code
=head2 Comparators
=head2 Oddball / specific / rare-but-useful
XXX Must have section on Speeding things up; benchmarking, hotspots
How not to sort
Unfiled notes about the document in general:
XXX In the `=begin pod` line, I am using :category("tutorial"), but I question the word. More like Task-guide.
XXX will also need to add pointers back to this doc, from all the pieces of https://docs.raku.org/routine/sort
XXX In lots of headings, I have multiple thoughts written. The final version might not be any of them; right now I am trying to express the idea.
Here, I am discussing them as general structure of future Guides, like Guide to Shortcuts and Guide to Rounding.
: =head1 Quick solutions / grab-and-go / most common tasks / for fast copy-and-paste / no explanations given yet
Single lines of code. (Maybe double OK?) Clustered by similarity. Ready to copy-and-paste. No OUTPUT section needed, because it should be clear from the context of the task.
No explaination needed here, because anything we are tempted to cover here should be explained lower in this document.
: =head1 Basics / Intro / preface
What we want to say about what we are going to say.
, and O(N) beats O(N log N).
Task of Rounding - in Perlfaq. Who every knew to look for it there?
If you read the FAQ last year, who has forgotten the asnswer, but remembers the location?
=end pod
# vim: expandtab softtabstop=4 shiftwidth=4 ft=raku
<!doctype html>
<html lang="en">
<head>
<title>Sorting</title>
<meta charset="UTF-8"/>
<style>hr,
img {
box-sizing: content-box
}
body::after,
body::before,
hr::after,
hr::before {
display: table;
content: &quot;&quot;
}
a,
a:not([href]) {
text-decoration: none
}
hr,
svg:not(:root) {
overflow: hidden
}
img,
table tr {
background-color: #fff
}
pre,
table {
overflow: auto
}
dl,
dl dt,
hr,
pre code,
pre&gt;code,
td,
th {
padding: 0
}
input,
pre code {
overflow: visible
}
pre,
pre code {
word-wrap: normal
}
body {
-ms-text-size-adjust: 100%;
-webkit-text-size-adjust: 100%;
color: #333;
font-family: &quot;Segoe UI&quot;, Roboto, Helvetica, Arial, sans-serif, &quot;Apple Color Emoji&quot;, &quot;Segoe UI Emoji&quot;, &quot;Segoe UI Symbol&quot;;
font-size: 16px;
line-height: 1.5;
word-wrap: break-word;
width: 820px;
margin: 2em auto;
}
a {
background-color: transparent;
-webkit-text-decoration-skip: objects;
color: #4078c0
}
a:active,
a:hover {
outline-width: 0;
text-decoration: underline
}
h1 {
margin: .67em 0
}
img {
border-style: none;
max-width: 100%
}
h1,
h2 {
padding-bottom: .3em;
border-bottom: 1px solid #eee
}
input {
font: inherit;
margin: 0;
font-family: inherit;
font-size: inherit;
line-height: inherit
}
* {
box-sizing: border-box
}
strong {
font-weight: 600
}
body::after,
hr::after {
clear: both
}
table {
border-spacing: 0;
border-collapse: collapse;
display: block;
width: 100%
}
blockquote {
margin: 0;
padding: 0 1em;
color: #777;
border-left: .25em solid #ddd
}
ol ol,
ul ol {
list-style-type: lower-roman
}
ol ol ol,
ol ul ol,
ul ol ol,
ul ul ol {
list-style-type: lower-alpha
}
dd {
margin-left: 0
}
code {
font-family: Consolas, &quot;Liberation Mono&quot;, Menlo, Courier, monospace
}
pre {
font: 12px Consolas, &quot;Liberation Mono&quot;, Menlo, Courier, monospace
}
input {
-webkit-font-feature-settings: &quot;liga&quot; 0;
font-feature-settings: &quot;liga&quot; 0
}
body&gt;:first-child {
margin-top: 0!important
}
body&gt;:last-child {
margin-bottom: 0!important
}
a:not([href]) {
color: inherit
}
blockquote,
dl,
ol,
p,
pre,
table,
ul {
margin-top: 0;
margin-bottom: 16px
}
hr {
background: #e7e7e7;
height: .25em;
margin: 24px 0;
border: 0
}
blockquote&gt;:first-child {
margin-top: 0
}
blockquote&gt;:last-child {
margin-bottom: 0
}
h1,
h2,
h3,
h4,
h5,
h6 {
margin-top: 24px;
margin-bottom: 16px;
font-weight: 600;
line-height: 1.25
}
dl dt,
table th {
font-weight: 700
}
h1 code,
h1 tt,
h2 code,
h2 tt,
h3 code,
h3 tt,
h4 code,
h4 tt,
h5 code,
h5 tt,
h6 code,
h6 tt {
font-size: inherit
}
h1 {
font-size: 2em
}
h2 {
font-size: 1.5em
}
h3 {
font-size: 1.25em
}
h4 {
font-size: 1em
}
h5 {
font-size: .875em
}
h6 {
font-size: .85em;
color: #777
}
ol,
ul {
padding-left: 2em
}
ol ol,
ol ul,
ul ol,
ul ul {
margin-top: 0;
margin-bottom: 0
}
li&gt;p {
margin-top: 16px
}
li+li {
margin-top: .25em
}
dl dt {
margin-top: 16px;
font-size: 1em;
font-style: italic
}
dl dd {
padding: 0 16px;
margin-bottom: 16px
}
table td,
table th {
padding: 6px 13px;
border: 1px solid #ddd
}
table tr {
border-top: 1px solid #ccc
}
table tr:nth-child(2n) {
background-color: #f8f8f8
}
code {
padding: .2em 0;
margin: 0;
font-size: 85%;
background-color: rgba(0, 0, 0, .04);
border-radius: 3px
}
code::after,
code::before {
letter-spacing: -.2em;
content: &quot;\00a0&quot;
}
pre&gt;code {
margin: 0;
font-size: 100%;
word-break: normal;
white-space: pre;
background: 0 0;
border: 0
}
pre {
padding: 16px;
font-size: 85%;
line-height: 1.45;
background-color: #f7f7f7;
border-radius: 3px
}
pre code {
display: inline;
max-width: auto;
margin: 0;
line-height: inherit;
background-color: transparent;
border: 0
}
pre code::after,
pre code::before {
content: normal
}
kbd {
display: inline-block;
padding: 3px 5px;
font: 11px Consolas, &quot;Liberation Mono&quot;, Menlo, Courier, monospace;
line-height: 10px;
color: #555;
vertical-align: middle;
background-color: #fcfcfc;
border: 1px solid #ccc;
border-bottom-color: #bbb;
border-radius: 3px;
box-shadow: inset 0 -1px 0 #bbb
}
hr {
border-bottom-color: #eee
}
</style>
</head>
<body class="pod">
<div id="___top"></div>
<nav class="indexgroup">
<table id="TOC">
<caption><h2 id="TOC_Title">Table of Contents</h2></caption>
<tr class="toc-level-1"><td class="toc-number">1</td><td class="toc-text"><a href="#Quick_solutions_to_common_tasks">Quick solutions to common tasks</a></td></tr>
<tr class="toc-level-1"><td class="toc-number">2</td><td class="toc-text"><a href="#Contraindications">Contraindications</a></td></tr>
<tr class="toc-level-1"><td class="toc-number">3</td><td class="toc-text"><a href="#Intro_/_Basics">Intro / Basics</a></td></tr>
<tr class="toc-level-1"><td class="toc-number">4</td><td class="toc-text"><a href="#How_it_all_works">How it all works</a></td></tr>
<tr class="toc-level-1"><td class="toc-number">5</td><td class="toc-text"><a href="#Special_Needs_of_Datatypes">Special Needs of Datatypes</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">5.1</td><td class="toc-text"><a href="#Strings">Strings</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">5.2</td><td class="toc-text"><a href="#Numbers">Numbers</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">5.3</td><td class="toc-text"><a href="#Lists">Lists</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">5.4</td><td class="toc-text"><a href="#Arrays">Arrays</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">5.5</td><td class="toc-text"><a href="#Hashes">Hashes</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">5.6</td><td class="toc-text"><a href="#Pairs">Pairs</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">5.7</td><td class="toc-text"><a href="#Ignoring_case">Ignoring case</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">5.8</td><td class="toc-text"><a href="#Default_sorting_/_bare_.sort">Default sorting / bare <code class="pod-code-inline">.sort</code></a></td></tr>
<tr class="toc-level-1"><td class="toc-number">6</td><td class="toc-text"><a href="#Related_tasks">Related tasks</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">6.1</td><td class="toc-text"><a href="#Is_this_list_sorted?_Show_with_comparitor_function_too!">Is this list sorted? Show with comparitor function too!</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">6.2</td><td class="toc-text"><a href="#Minimums_and_Maximums">Minimums and Maximums</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">6.3</td><td class="toc-text"><a href="#Topological_sorting">Topological sorting</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">6.4</td><td class="toc-text"><a href="#Comparators">Comparators</a></td></tr>
<tr class="toc-level-2"><td class="toc-number">6.5</td><td class="toc-text"><a href="#Oddball_/_specific_/_rare-but-useful">Oddball / specific / rare-but-useful</a></td></tr>
</table>
</nav>
<div class="pod-body">
<h1 id="Quick_solutions_to_common_tasks"><a class="u" href="#___top" title="go to top of document">Quick solutions to common tasks</a></h1>
<p>Simple sorts of list of values:</p>
<pre class="pod-block-code">@s = @numbers.sort(+*); # Numeric, low to high
@s = @numbers.sort(-*); # Numeric, high to low
@s = @strings.sort(~*); # Alphabetic, A to Z
@s = @strings.sort(~*).reverse; # Alphabetic, Z to A
</pre>
<p>Sort by method call on each value (<code>.method-name</code>):</p>
<pre class="pod-block-code">@s = @strings.sort(+ *.chars); # By string &quot;length&quot;
@s = @strings.sort(~ *.fc); # Alphabetic, case-insensitive (&quot;folded&quot; case)
# Do NOT use .lc or .uc; See &quot;Ignoring case&quot; below.
</pre>
<p>Sort by subroutine call on each value (<code>.&amp;sub-name</code>):</p>
<pre class="pod-block-code">sub last_digit ( Int $n ) { return $n % 10 }
sub last_character ( Str $s ) { return substr( $s, *-1 ) }
@s = @numbers.sort(+ *.&amp;last_digit); # Numeric
@s = @strings.sort(~ *.&amp;last_character); # Alphabetic
</pre>
<p>Sort files by size, with same-sizes ordered by name (multiple keys):</p>
<pre class="pod-block-code">@IO_Path_objects = dir().sort({ - .s, ~ .basename });
</pre>
<p>Sort a hash:</p>
<pre class="pod-block-code">my %bread_prices = Wheat =&gt; 1.39, Rye =&gt; 1.19;
@key_value_pairs = %bread_prices.sort; # By hash key
@key_value_pairs = %bread_prices.sort( + *.value ); # By hash value
say &quot;{.key} costs {.value} per loaf&quot; for @key_value_pairs;
</pre>
<p>Sort a hash&#39;s keys (when you do not need the sorted hash values):</p>
<pre class="pod-block-code">my %bread_prices = Wheat =&gt; 1.39, Rye =&gt; 1.19;
@keys = %bread_prices.keys.sort; # By hash key
@keys = %bread_prices.keys.sort({ + %bread_prices{$_} }); # By hash value
say &quot;$_ costs %bread_prices{$_} per loaf&quot; for @keys;
</pre>
<p>Translating a sort from Perl:</p>
<pre class="pod-block-code">@s = sort { $a-&gt;[3] &lt;=&gt; $b-&gt;[3] } @array_of_arrays; # Perl
@s = sort { $^a.[3] &lt;=&gt; $^b.[3] }, @array_of_arrays; # Raku
# ^ ^ ^ ^ ^
# $a and $b become $^a and $^b, arrow becomes dot, and add a comma after the code block.
</pre>
<h1 id="Contraindications"><a class="u" href="#___top" title="go to top of document">Contraindications</a></h1>
<p>When to not sort, and How not to sort:</p>
<ul><li><p>DO NOT use .sort to get only the least or greatest value. See Minimums and Maximums below.</p>
</li>
<li><p>DO NOT use .sort to do &quot;partial&quot; sorts; See Topological Sorting below.</p>
</li>
<li><p>DO NOT sort to just do a one-time binary search. Use a hash lookup, or .first; they are faster, and much harder to get wrong.</p>
</li>
<li><p>DO NOT write your own Heapsort, Quicksort, Shellsort, Bubblesort, Bogosort, or any other sort of sort taught in Computer Science. The modern way is to write only a comparitor function and pass that function to Raku&#39;s sort.</p>
</li>
</ul>
<h1 id="Intro_/_Basics"><a class="u" href="#___top" title="go to top of document">Intro / Basics</a></h1>
<ul><li><p>Raku, like other languages, does not require you to write code to move elements around</p>
</li>
<li><p>Humans expect numbers to sort differently from strings. Alphabetically, 42 comes before 8, because 4 comes before 8; there is no need to compare a second character position. So, string comparison will be different from number comparison.</p>
</li>
<li><p>Raku has separate comparison operators for numbers and strings. </p>
</li>
</ul>
<table class="pod-table">
<thead><tr>
<th>*</th> <th>equal</th> <th>not equal</th> <th>less than</th> <th>less than or equal to</th> <th>greater than</th> <th>greater than or equal to</th>
</tr></thead>
<tbody>
<tr> <td>Numbers</td> <td>==</td> <td>!=</td> <td>&lt;</td> <td>&lt;=</td> <td>&gt;</td> <td>&gt;=</td> </tr> <tr> <td>Strings</td> <td>eq</td> <td>ne</td> <td>lt</td> <td>le</td> <td>gt</td> <td>ge</td> </tr>
</tbody>
</table><p>There are also three-way operators, and Unicode and generic versions of the above; we will cover those later.</p>
<h1 id="How_it_all_works"><a class="u" href="#___top" title="go to top of document">How it all works</a></h1>
<p>Starting with 2-arity</p>
<h1 id="Special_Needs_of_Datatypes"><a class="u" href="#___top" title="go to top of document">Special Needs of Datatypes</a></h1>
<h2 id="Strings"><a class="u" href="#___top" title="go to top of document">Strings</a></h2>
<h2 id="Numbers"><a class="u" href="#___top" title="go to top of document">Numbers</a></h2>
<h2 id="Lists"><a class="u" href="#___top" title="go to top of document">Lists</a></h2>
<h2 id="Arrays"><a class="u" href="#___top" title="go to top of document">Arrays</a></h2>
<h2 id="Hashes"><a class="u" href="#___top" title="go to top of document">Hashes</a></h2>
<h2 id="Pairs"><a class="u" href="#___top" title="go to top of document">Pairs</a></h2>
<p>DO NOT sort hashes... (Leave out, because the more important advice is to sort anytime you are getting the whole hash for consistancy. Or, maybe OK to say both in the Hash section) my %soda_counts = Cola =&gt; 25, Orange =&gt; 11, Grape =&gt; 17; my @low_stock = %soda_counts.grep( *.value &lt;= 20 ); say &quot;We need: &quot;, @low_stock.sort if @low_stock;</p>
<h2 id="Ignoring_case"><a class="u" href="#___top" title="go to top of document">Ignoring case</a></h2>
<p>For case-insensitive sorting:</p>
<ul><li><p>DO use <a href="/routine/fc"><code>fc</code></a>, as either the <code>fc()</code> subroutine or the <code>.fc</code> method.</p>
</li>
<li><p>DO NOT use <a href="/routine/uc"><code>uc</code></a> or <a href="/routine/lc"><code>lc</code></a> or <a href="/routine/tc"><code>tc</code></a>.</p>
</li>
</ul>
<p>Outside of sorting:</p>
<ul><li><p>DO use <a href="/routine/uc"><code>uc</code></a> or <a href="/routine/lc"><code>lc</code></a> or <a href="/routine/tc"><code>tc</code></a> wherever you need them. They work as expected in Unicode, except when used for case-insensitive sorting.</p>
</li>
<li><p>DO use <a href="/routine/fc"><code>fc</code></a> for case-insensitive comparison, like <code>say &#39;Found!&#39; if fc($name) eq fc($wanted);</code>.</p>
</li>
<li><p>DO NOT use <a href="/routine/fc"><code>fc</code></a> for anything else. It was created for case-insensitive sorting, and is not expected to be useful outside that limited domain. Specifically, you can not rely on the output of <a href="/routine/fc"><code>fc</code></a> to be always uppercase nor always lowercase. If you discover a use case to the contrary, please inform us.</p>
</li>
</ul>
<p>Before Unicode, the common method for case-insensitive alphabetic sorting was to coerce all the input to all be the same case (all-lowercase or all-uppercase, at the programmer&#39;s whim). That coercion might be done during the comparisons, or (if the case-changed strings were useful later) performed on all the input values before the sort began executing.</p>
<ul><li><p>That lc/uc technique is broken in Unicode</p>
</li>
<li><p>All Raku strings are Unicode strings</p>
</li>
</ul>
<p>To correctly support all languages, Unicode separates the concept of &quot;converting case&quot; from the concept of &quot;comparing without case&quot;. Unicode defines the exact process to achieve case-insensitive comparison via &quot;case folding&quot;. Raku&#39;s implementation of that process is <a href="/routine/fc"><code>fc</code></a>.</p>
<p>If you are translating an existing sort where <a href="/routine/lc"><code>lc</code></a> or <a href="/routine/uc"><code>uc</code></a> was used in the comparator, you can just replace the <a href="/routine/lc"><code>lc</code></a> or <a href="/routine/uc"><code>uc</code></a> with <a href="/routine/fc"><code>fc</code></a>.</p>
<p>XXX Speak to the broader cases? lc/uc transforms before the sort are iffy, but removing them in favor of .fc comparator will change the case uniformity of the output! If you are translating or refactoring code that used <a href="/routine/lc"><code>lc</code></a> or <a href="/routine/uc"><code>uc</code></a> before the sort, and ...</p>
<p>For deeper details, links to the relevant parts of the Unicode standard, and modules that might (untested!) give finer control of the folding, see the <a href="https://perldoc.perl.org/functions/fc">Perl language documentation about fc</a> .</p>
<p># XXX Change @a to @nums and @words. @a and @b will clash with $^a and $^b # sort(~*.&amp;last_digit).reverse (reverse reverse) Note that r-r is needed! # Do use +-~, so your don&#39;t have to think about the return value of the method</p>
<h2 id="Default_sorting_/_bare_.sort"><a class="u" href="#___top" title="go to top of document">Default sorting / bare <code>.sort</code></a></h2>
<ul><li><p>DO use the default .sort for generic ordering.</p>
<p>For example, a module may need to be generic, and not assume anything about its input, even whether values are string or number.</p>
</li>
<li><p>DO use the default .sort for ordering objects that have a custom <code>cmp</code> defined.</p>
</li>
<li><p>DO use the default .sort when you do not care what the order is as long as that order is consistant across runs.</p>
<p>This often occurs when writing tests.</p>
</li>
<li><p>DO NOT use the default .sort otherwise.</p>
<p>Even when the default would work correctly becuase are certain that your input will <strong>always</strong> be numbers, use <code>.sort(+*)</code>; Your intent will be clearer to future readers, your code will not malfunction with quoted numbers (like &quot;42&quot;), and you will get a warning if you were wrong about &quot;always&quot;.</p>
<p>Similarly, using <code>.sort(~*)</code> when you are expecting all strings will make your intent clear. Further, the ordering of 42 and 8 in <code>(&#39;apple&#39;, 8, 42).sort(~*)</code> will not suddenly change just because someone removed the apple.</p>
</li>
</ul>
<h1 id="Related_tasks"><a class="u" href="#___top" title="go to top of document">Related tasks</a></h1>
<h2 id="Is_this_list_sorted?_Show_with_comparitor_function_too!"><a class="u" href="#___top" title="go to top of document">Is this list sorted? Show with comparitor function too!</a></h2>
<p>[&lt;]</p>
<p>TODO</p>
<h2 id="Minimums_and_Maximums"><a class="u" href="#___top" title="go to top of document">Minimums and Maximums</a></h2>
<p>TODO</p>
<h2 id="Topological_sorting"><a class="u" href="#___top" title="go to top of document">Topological sorting</a></h2>
<p>Also called a &quot;partial&quot; ordering or Dependency List.</p>
<p>You might have information that:</p>
<ul><li><p>Hammurabi came before Caesar</p>
</li>
<li><p>Caesar came before Napoleon</p>
</li>
<li><p>Napoleon came before Churchill</p>
</li>
<li><p>Napoleon came before Lincoln</p>
</li>
</ul>
<p>If you want to put these leaders into a chronological order with that information, .sort will fail for two reasons: The ordering of Lincoln vs Churchill is ambiguous; it is not in the table, and cannot be calculated from the table, so their ordering will be undefined. The ordering of Hammurabi vs (Napoleon,Lincoln,Churchill) are clear to a human reading the table, but they are not explicitly given, and .sort may call the comparitor with Hammurabi-vs-Napoleon. If the comparitor is only relying on the table as given, it cannot return the correct value. The solution is to use a completely different method of ordering See http://rosettacode.org/wiki/Topological_sort#Raku</p>
<pre class="pod-block-code">sub topo_sort ( %deps ) {
my %ba;
# Build %ba, the before-after ordered Hash of Hashes (working as Sets).
for %deps.kv -&gt; $before, @afters {
for @afters -&gt; $after {
%ba{$before}{$after} = True if $before ne $after;
%ba{$after} //= {};
}
}
# Find all the nodes with no dependents, harvest them, and remove them.
# That removal frees up the next level of nodes.
# Loop to harvest the next level, until nothing is left.
my @r = gather while %ba.grep( not *.value )».key -&gt; @afters {
.take for @afters;
%ba{@afters}:delete;
.{ @afters}:delete for %ba.values;
}
warn &quot;Cycle found! {%ba.keys.sort}&quot; if %ba;
return @r;
}
my %deps =
Hammurabi =&gt; ( &#39;Caesar&#39;, ),
Caesar =&gt; ( &#39;Napoleon&#39;, ),
Napoleon =&gt; ( &#39;Churchill&#39;, &#39;Lincoln&#39; ),
;
say topo_sort(%deps).reverse;
# OUTPUT: (Hammurabi Caesar Napoleon Lincoln Churchill)
# or
# OUTPUT: (Hammurabi Caesar Napoleon Churchill Lincoln)
</pre>
<h2 id="Comparators"><a class="u" href="#___top" title="go to top of document">Comparators</a></h2>
<h2 id="Oddball_/_specific_/_rare-but-useful"><a class="u" href="#___top" title="go to top of document">Oddball / specific / rare-but-useful</a></h2>
<p>XXX Must have section on Speeding things up; benchmarking, hotspots How not to sort</p>
<p>Unfiled notes about the document in general:</p>
<p>XXX In the `=begin pod` line, I am using :category(&quot;tutorial&quot;), but I question the word. More like Task-guide.</p>
<p>XXX will also need to add pointers back to this doc, from all the pieces of https://docs.raku.org/routine/sort</p>
<p>XXX In lots of headings, I have multiple thoughts written. The final version might not be any of them; right now I am trying to express the idea. Here, I am discussing them as general structure of future Guides, like Guide to Shortcuts and Guide to Rounding. : =head1 Quick solutions / grab-and-go / most common tasks / for fast copy-and-paste / no explanations given yet Single lines of code. (Maybe double OK?) Clustered by similarity. Ready to copy-and-paste. No OUTPUT section needed, because it should be clear from the context of the task. No explaination needed here, because anything we are tempted to cover here should be explained lower in this document. : =head1 Basics / Intro / preface What we want to say about what we are going to say.</p>
<p>, and O(N) beats O(N log N).</p>
<p>Task of Rounding - in Perlfaq. Who every knew to look for it there? If you read the FAQ last year, who has forgotten the asnswer, but remembers the location?</p>
</div>
</body>
</html>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment