gitonthescene/apl-rank.org

## apl-rank.org

      
    Raw
  

              apl-rank.org
            
          
    Rank absurdity

In this note I’m going to try to wrap my head around Dyalog APL’s rank operator.  Some basic
  knowledge of APL is assumed but not much more than can be attained by browsing an introduction to
  the language.
I’m still pretty new to APL, so you’re welcome to watch my process but if you’re a stickler for
  details this may not be for you.  I am open to criticism though, so please let me know if I’ve got
  something wrong.
Also, this may be a long one because I think it pulls together lots of different aspects of array
  programming.
Arrays

Let’s first have another look waaay at the beginning of this.  Arrays.  I think a lot of the
  literature approaches these as fairly abstract things with sizes and shapes and hides a bit of
  the realities of computer programming, But for understanding rank, I think it’s helpful to
  consider what’s actually implemented here.
Arrays are held in contiguous memory, meaning the address of the second element comes after the
  address of the first element and the third after that and there’s no room to put anything between
  these elements.  Address space is one dimensional, though, even when the array being represented
  is multi-dimensional.  How does that work?  Simple.  Take for example a 2x3 array.  That’s two
  rows of three elements each for a total of six elements.  The first row goes in the first 3
  elements and the next row in the next 3.  Similarly if we had a 7x3x9 dimensional array.  It can
  be seen as a list of seven 3x9 matrices which can each be seen as three rows of nine elements
  each.  In memory they’re laid out in order.
Another way to say this is that any array is laid out in memory in ravel order, which is to say
  if you ravel an array, you’ll get the elements in the order that they’re laid out in memory.
But notice that the description above of how multi-dimensional arrays are laid out is recursive.
  To layout a 7x3x9 dimensional array, you lay out seven 3x9 matrices exactly as they would be
  laid out if they were their own multi-dimensional array.
Thus we can think of a 7x3x9 multi-dimensional array as a sort of list of seven 3x9 matrices
  without the overhead of having identifiers for each of them.  Granted it’s a little unclear what
  I mean by identifiers here.  I’ll give some annecdotal evidence and leave it at that.
If you down rank a matrix, you get an honest to goodness list of rows.
      ↓3 3⍴⍳6
┌─────┬─────┬─────┐
│1 2 3│4 5 6│1 2 3│
└─────┴─────┴─────┘

You can assign this list of elements names like so:
      ⊢(a b c)←↓3 3⍴⍳27
┌─────┬─────┬─────┐
│1 2 3│4 5 6│7 8 9│
└─────┴─────┴─────┘

But if you don’t down rank first to have an honest to goodness list, then you can’t assign these
  names.
      (a b c)←3 3⍴⍳27
RANK ERROR

I don’t want to get too bogged down here, but the point is that the multi-dimensional array is
  like a list but isn’t an actual list.
But if the 7x3x9 matrix is like a list of 3x9 matrices and each 3x9 matrix is like a list of 3
  9-element rows, then the 7x3x9 matrix is also like a list of lists of 1-dimensional rows.  In
  other words it’s a matrix of 1 dimensional rows where each of those rows is laid out in memory
  just exactly as they would be if they were their own array in isolation.
The rank operator

Before exploring the rank operator in any generality, I want to focus on one particular use case.
  Namely using it with enclose.
With enclose, what the rank operator allows us to do is to split out these psuedo-lists into
  actual lists.  The rank operator lets us describe the rank of arrays we want to extract and
  generates the lists we’ve just described above.
E.g.
      ⊂⍤2⊢3 3 3⍴⍳27
┌─────┬────────┬────────┐
│1 2 3│10 11 12│19 20 21│
│4 5 6│13 14 15│22 23 24│
│7 8 9│16 17 18│25 26 27│
└─────┴────────┴────────┘

This is the rank operator with two operands.  On the left is enclose and on the right is the
  dimension of the arrays we’re extracting.  What if we extract rank 1 arrays?
      ⊂⍤1⊢3 3 3⍴⍳27
┌────────┬────────┬────────┐
│1 2 3   │4 5 6   │7 8 9   │
├────────┼────────┼────────┤
│10 11 12│13 14 15│16 17 18│
├────────┼────────┼────────┤
│19 20 21│22 23 24│25 26 27│
└────────┴────────┴────────┘

Or even rank 0 arrays.
      ⊂⍤0⊢3 3 3⍴⍳27
 1  2  3
 4  5  6
 7  8  9
        
10 11 12
13 14 15
16 17 18
        
19 20 21
22 23 24
25 26 27

Hmm..  This looks very different.  That’s because Dyalog uses a “floating array model”, which
  means that you can’t enclose items of rank 0.  i.e. items of rank zero are the same as their
  enclosure.  We won’t pursue this too much because it won’t get in our way here.  If it helps to
  think of these as enclosed rank 0 items, then feel free.
In truth, the rank operator can be used with many other functions and with more sophisticated
  descriptions of which subarrays we’re trying to extract, but using enclose will help us to
  understand what’s going on by realizing these psuedo-subarrays.
Let’s have a closer look at the rank 1 example above.  Notice that this matrix is listed out in
  the order that it is laid out in memory.  Thus the first row is the first matrix as a list of
  rows, which is different than the first row read across from the rank 2 example.
Axes

The next thing we’ll look at carefully is how axes work.  There are different aspects of this,
  but we’ll start with the simplest.  Rank polymorphism.
One of the first cool things you learn about array languages is that if arithmetic operations on
  scalars extends to operations on arrays of the same shape by performing the operation element by
  element.
         7 2 13+9 1 ¯3
16 3 10

One dimensional arrays have only one axis.  This addition is being performed along the axis which
  means we travel across both arrays in parallel doing the addition on each pair of scalars we meet
  along the way.  It’s often helpful to think of this abstractly as the addition of two one
  dimensional arrays but what’s actually happening is that we’re pairing up the individual scalars.
In fact, as we’ve seen above each of these arrays is laid out in memory the same way so we can
  travel along the elements of both in the same fashion.
What about if we try to add two two dimensional arrays?
      2 2⍴6 9 3 1
6 9
3 1
      2 2⍴¯3 0 ¯7 5
¯3 0
¯7 5
      (2 2⍴6 9 3 1)+(2 2⍴¯3 0 ¯7 5)
 3 9
¯4 6

These two arrays are also laid out in memory the same way and the addition property is similiarly
  done on the scalar level as we travel through them.  But another way to look at this is that we
  line up the rows of these two matrices and use rank polymorphism to add those rows.  So we first
  add up the first rows of each and then the second rows of each.
Another feature is scalar extension.  That allows us to add a constant to an array.  The way this
  works is similar in a way to the examples above only the scalar stays put as we travel along the
  other array.  I.e. we pair up the same scalar with each element of the row we’re adding it to and
  then perform the addition.
Another aspect of axes is folding.  We can use reduce to fold along an axis.
      +/⍳5
15

Here, we have a single array and instead of simply moving from one value to the next we fold up
  each value we meet into the result.  We can do the same across an axis of a higher dimensional
  array.
      3 3⍴⍳9
1 2 3
4 5 6
7 8 9
      +/3 3⍴⍳9
6 15 24

Here we’re doing the same operation along the first axis, but across the last axis.  In
  other words we can think of this as taking the first column added to the second column and the
  result added to the third column, along the first axis.  Or we can think of this as performing
  this operation to the first row and the second row and the third row in parallel, similar to
  how rank polymorphism has us traveling along two arrays in parallel.
Dyalog APL has the reduce first operator.
      3 3⍴⍳9
1 2 3
4 5 6
7 8 9
      +⌿3 3⍴⍳9
12 15 18

Here we do our reduction across the first axis, but along the second axis.  Notice that
  there’s something nicer about this version.  Abstractly, columns are just transposed rows, but
  practically rows are laid out contiguously exactly as they would be if they were their own
  arrays.  So here we’re doing the reduction but with something closer to real arrays by adding
  the first row to the second and then the result of that to the third.  The same can’t be said of
  the first example.
Another nice thing about this is that we’re reducing across rows and ending up with a row
  whereas the first example was across the columns, but leaves you with a row.
Now the magic

If you’ve done any programming in Dyalog at all you probably knew all of that but hopefully I
  didn’t lose you there.
What the rank operator does is it allows you operate on these psuedo subarrays.  So starting with
  the enclose example above. ⊂⍤2⊢3 3 3⍴⍳27 applies the enclose function to each of the rank 2
  subarrays and collects the results into an array. ⊂⍤1⊢3 3 3⍴⍳27 collects each of the rank 1
  subarrays and collects the results into a matrix.
Let’s introduce some terminology and pray it doesn’t confuse us more.  The rank 2 subarrays are
  called 2-cells and the rank 1 subarrays are called 1-cells.  Notice that 2-cells are not any two
  dimensional subarray, but just those subarrays which are laid out in contiguous memory.
I haven’t really latched on to the next bit, but here goes nothing.  The 2-cells here form a list
  which means they have a frame of one dimension with shape 3.  The one-cells form a matrix with
  shape 3 3.  The shape of the frame is comprised of the dimensions of the original array not used
  by the cells.  So in our hypothetical 7x3x9 array, the 2-cells have a shape of 3x9 and a frame of
  7, while the 1-cells have a shape of 9 and a frame of 7x3.
Said differently, the 1-cells as units are laid out in memory in exactly the same shape as they
  would be if we made a matrix of their enclosures.  (Note this is the same shape though not
  exactly the same.  That’s because the 1-cells are not arrays in their own right the way their
  enclosures are each individual items.)
Okay, let’s have some fun.  Rank polymorphism lets us do the following:
      ⊂⍤1⊢3 3 3⍴⍳27
┌────────┬────────┬────────┐
│1 2 3   │4 5 6   │7 8 9   │
├────────┼────────┼────────┤
│10 11 12│13 14 15│16 17 18│
├────────┼────────┼────────┤
│19 20 21│22 23 24│25 26 27│
└────────┴────────┴────────┘
      (3 3⍴⍳9)×(⊂⍤1⊢3 3 3⍴⍳27)
┌───────────┬───────────┬───────────┐
│1 2 3      │8 10 12    │21 24 27   │
├───────────┼───────────┼───────────┤
│40 44 48   │65 70 75   │96 102 108 │
├───────────┼───────────┼───────────┤
│133 140 147│176 184 192│225 234 243│
└───────────┴───────────┴───────────┘

Here we have two matrices of the same shape, namely the enclosed 1-cells of our three dimensional
  array and a 3x3 matrix, and we simply multiply them.  Notice that we’re matching scalars on the
  left with enclosed 1-cells on the right.  We can multiply the two using scalar extension.
The rank operator in its fuller generality lets us do the same thing but without having to use
  enclose to get to arrays which have exactly the same shape but rather have the same shape when
  considering its cells.
      (3 3⍴⍳9)×⍤0 1⊢3 3 3⍴⍳27
  1   2   3
  8  10  12
 21  24  27
           
 40  44  48
 65  70  75
 96 102 108
           
133 140 147
176 184 192
225 234 243

Here the rank operator takes mutiplication as its left operand and 0 1 as its right operand.  This
  says to take the 0-cells of the left argument and the 1-cells of the right argument and then
  perform the operation on the two just the way we did above with actual array arguments above.
  Notice also that the results are assembled in exactly the same shape as the operation was
  performed along.  I.e. we’re performing many operations in parallel along a given shape and the
  result is the same shape.
Pretty cool, right?  The rank operator allows us to perform operations along cells as if we were
  operating on actual arrays with the results returned in the shape of the frame of those cells.
It gets cooler..
Let’s look at our reduce example above again.
      3 3⍴⍳9
1 2 3
4 5 6
7 8 9
      +⌿3 3⍴⍳9
12 15 18

Remember this version does the fold across the first axis of our two dimensional array.  We
  can be explicit by saying we’re doing this to the 2-cell of this matrix (which amounts to
  everything and has an empty frame).
      +⌿⍤2⊢3 3⍴⍳9
12 15 18

But what if we did the fold across the first axis to each of our 1-cells?
      +⌿⍤1⊢3 3⍴⍳9
6 15 24

That’s just exactly the other fold above!!  1-cells have only dimension and so folding just folds
  over it.  But folding each 1-cell and then assembling the results in the frame of rank 1 is
  exactly the same thing as folding along the columns.
Let that sink in.  Folding along the columns is exactly the same as folding each of the
  1-cells and assembling the results into the rank 1 frame.
[Pause]
You may wonder, why couldn’t we do the same by reversing the roles of columns and rows?  In the
  abstract we can, but what tips the scales here is that 1-cells are laid out exactly the same as
  if they were arrays in isolation.  So making that work is not only easier but more efficient.
Again, operating on each 1-cell along the frame is the same thing as folding across the second
  axis.  This still holds when we have higher dimensional arrays as well.
        +⌿3 3 3⍴⍳27
30 33 36
39 42 45
48 51 54
      +⌿⍤3⊢3 3 3⍴⍳27
30 33 36
39 42 45
48 51 54
      +⌿⍤2⊢3 3 3⍴⍳27
12 15 18
39 42 45
66 69 72
      +⌿⍤1⊢3 3 3⍴⍳27
 6 15 24
33 42 51
60 69 78

Operating on the 3-cells of a three dimensional array doesn’t change much, but operating on the
  2-cells does.  Across the first axis of the 2-cells is the same as across the second axis of the
  original array and across the first axis of the 1-cells is the same as across the third axis of
  the original array.  So using the rank operator we can sum across any axis of the original
  array.
Once more, summing across the first axis of the 2-cells and assembling the results in the rank 1
  frame is the same thing as summing the original across the second axis.
Maybe now is a good time to introduce some more notation.  If you use negative indices with the
  rank operator, then the absolute value of that number indicates the rank of the frame and not the
  cells.
In this array rank 2-cells have a rank 1 frame and vice versa.
      +⌿⍤2⊢3 3 3⍴⍳27
12 15 18
39 42 45
66 69 72
      +⌿⍤¯1⊢3 3 3⍴⍳27
12 15 18
39 42 45
66 69 72

And for the 2-d version:
      +⌿⍤1⊢3 3⍴⍳9
6 15 24
      +⌿⍤¯1⊢3 3⍴⍳9
6 15 24

With this notation in place we see that along the ¯1 frame is the same as across the 2nd axis.
      +⌿[2]⊢3 3 3⍴⍳27
12 15 18
39 42 45
66 69 72
      +⌿[2]⊢3 3⍴⍳9
6 15 24

Are we done?

Nope.  But this seems like a good stopping place for one post.  I hope you found it useful.