This is just a short article explaining some things I thought about trying to make sense of the term train in Dyalog APL. Some basic knowledge of APL is assumed but not much more than can be attained by browsing an introduction to the language.
There are plenty of rigorous treatments of various aspects of APL out there, but to be clear, this isn’t one. I may get some stuff wrong (please comment below if I do!) but it’s meant to offer an honest view of how one person really goes about thinking in APL.
If you’ve poked around any of the forums or casually read some articles about APL you’ve probably heard the term “train”. What’s a train? Let’s try to figure this out.
First off, in English a train of items is just a sequence of items based on the metaphor of the sequence of cars in a locomotive train. More than a collection of random things in a row, though it evokes a list of similar things just like those cars.
The APL Wiki describes a train as “A train is a series of functions in isolation. An isolated function is either surrounded by parentheses or named.” So, it seems it’s about functions “in isolation”. It’s not yet clear what the “in isolation” part is meant to indicate, but let’s run with the series of functions for a bit.
Let’s take a look at this example. Don’t worry if you’re not familiar with all the parts yet. We’re just trying to get a handle on the terminology here.
This is a sequence of functions followed by an argument. The first function (from the right) is
~’ACGT’∘,~. If you’re new to APL you might see this as a sequence of glyphs, which it is, but
it’s also a single function, constructed from an operator, namely
∘. Operators take operands
to produce functions. Functions produced from operators are known as “derived
functions”. Operands to operators are like arguments to functions but moved to a whole different
arena. The two operands here are ~’ACGT’~ and ~,~ and when applying the operator to the operands
you get a single function which catenates ~’ACGT’~ on the left of its argument.
But we’re here to understand trains. The next function is Key which is a little funky in that
⌸ glyph and the function to its left combine to form the function which itself takes
arguments. It takes a single function, which here is itself sequence of functions! Hmm.. Not
really. 1 is not a function. Pretty close though! We should say that the full function here is
(⊂1↓⊢)⌸⍨ which is the funky Key function as a left operand to the
⍨ operator which makes a
function which takes a single argument and passes it as both the left and right argument to its
functional operand. (Sidebar: is
⌸ actually an operator? It somehow feels different than the
Finally we have,
≢¨ which is the Tally function as the operand to the each operator.
So three functions strung together and applied to an argument, which works exactly as you’d expect. The first (rightmost) is applied to the arguments and the next to the left is applied to the results of that and the last to the results of the previous application. It’s worth noting that with function application everything after the first applies to the single result of the previous application and is therefore applied monadically.
Are these functions “in isolation”? They certainly aren’t surrounded by parentheses nor is the sequence itself named. So we’d have to say no. More importantly the behavior of these functions here is defined by use as successive application. We know that the first will be applied to the argument and that every other function in the chain will be applied monadically to the results of the previous one in the chain.
How is this different from a sequence of functions in isolation. My preferred definition for “in isolation” is “without reference to arguments”. Let’s take a single glyph ~,~. Is this a function? Yes. More specifically it’s a primitive function. How does it behave? Well that depends on the context. When you give it two arguments it’s known as “Catenate” but when you give it only one it’s known as “Ravel”.
In short, functions exist without reference to how they can be applied, but their behavior depends on the context in which they are applied.
Let’s take a look at a short sequence
|-. Two functions in a row. This becomes a single
function which can be referred to as a 2-train. Note that this is not a single function created
by an operator, but rather a single function created by the simple juxtoposition of functions.
How does it work as a function? I.e. how is it applied? Well like most functions it can be
either monadic or dyadic. When applied it must be surrounded by parentheses or named (Oh!! so
that’s where the definition came from!), as in
3(|-)7. The parentheses are needed to show
that this 2-train acts as a single function and not simple Function application described above.
The way a 2-train works is similar to successive application though, but it operates on all its
arguments rather than starting from the right. That is to say the right part of the 2-train is
applied to all the arguments first and left part is applied to the result of that. This is also
known as an Atop.
What if we have three functions in a row? 3-trains are known as forks and behave very differently than successive application. As described in the link, a fork is made up of three successive functions (known as tines) and like 2-trains become a single function, which applies its arguments first to both the outer tines independently and then applies the results to the middle tine.
Perhaps the most famous fork is
+/÷≢ for calcuating the arithmetic mean, which takes a list of
numbers and then takes both the sum and the count of them and finally divides the former by the
Okay, we have 2-trains and 3-trains. How do 4-trains behave? Or 5-trains?
It turns out that all other trains can be broken down into 2-trains and 3-trains. How does that
work? Well first we know that 3-trains form a unit.
+/÷≢ is not two 2-trains as us
So anytime you have 3 functions in a row, you’ll get a single function. In particular, if you
stack on two more functions to the left of a 3-train, then you get another 3-train with the
right most tine being the 3-train on the far right. And so on and so on.
So adding two functions to a 3-train gives you a 5-train which behaves like a 3-train with it’s right most tine being the 3-train on the far right. Add two more gives a 7-train. Two more a 9-train. This covers all the odd numbered trains.
For even trains, like a 4-train, this is simply a 2-train with its right part being an odd numbered train. So adding one function to the left of any odd numbered train or just a single function produces a 2-train. This one function is the “one free function” I refer to in my blog post on forks. Adding two functions makes a fork which behaves very different than successive application.
So in any length train there is at most one 2-train application and the rest are 3-trains strung together on the right tine.
Odds and ends
Remember at the beginning when we said that
⊂1↓⊢ was pretty close to being a succession of
functions but for that pesky 1? Well it turns out that you can use literals for the left tine of
a fork. Instead of applying arguments to that tine, it simply becomes the value that you pass in
to the middle tine. So
1↓⊢ is still called a fork even though it’s not a succession of
functions. To clean up the mess, the parts of a fork are more precisely called “carriages” to cover
the case of literals.
With this more precise understanding, it’s fair game to call
⊂1↓⊢ a 4-train.
Creating functions from other functions before introduction arguments is what’s known as “tacit programming”. Just to reiterate that tacit programming is about function creation but sometimes you want the behavior of function application. I.e. sometimes all you need is a function, but that function is comprised of a sequence of functions each applied the the results of the previous function and not these fancy forks and trains.
To get the behavior of sucessive function application you need to introduce arguments. If you
don’t have arguments handy, you can make a dfn which let’s you represent the arguments as