In The APL Orchard, ngn said:
the or-dot-and [∨.∧] trick is one of the most beautiful apl expressions ever, worth staring at :)
Well I had better learn it then...
Apparently it computes a transitive closure of a binary relation. Let's hit Wikipedia.
"A relation R on a set X is transitive if, for all x, y, z in X, whenever x R y and y R z then x R z."
"A set is closed under an operation if performance of that operation on members of the set always produces a member of that set."
OK - I understand that in an abstract sense, but let's see it in action, using the latest Shakti k.
Take the problem of finding, for each node in a graph, all nodes that node can access via 1+ jumps.
Concretely, let's say we have a 2D matrix m, where m[x] (row x) shows which places x can reach directly. If m[x;y] is 1, then x can get to y directly; if it's 0, then it can't.
Let's assume the edges are one-way (if A can get to B, that doesn't mean B can get to A), and let's make m really simple for now.
m:(0 1 0;0 0 1;0 0 0)
m
0 1 0 / A can get to B, but not to C
0 0 1 / B can get to C, but not to A
0 0 0 / C can't get to anything
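For readers following along outside k, here's a sketch of the same adjacency matrix in Python (the variable layout mirrors the k version; the comments are mine):

```python
# m[x][y] == 1 means node x has a direct, one-way edge to node y.
m = [
    [0, 1, 0],  # A can get to B, but not to C
    [0, 0, 1],  # B can get to C, but not to A
    [0, 0, 0],  # C can't get to anything
]
print(m[0][1])  # 1: the A -> B edge exists
```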
To see whether A can get to C via B, we'd want to see a 1 in at least two places: for A -> B, and B -> C.
To see where A can get to via B (C or otherwise), we can multiply the A -> B bit (m[0;1]) by B's row (m[1]):
m[0;1]
1
m[1]
0 0 1
m[0;1] * m[1] / multiplies all of B's row by the "A -> B" bit
0 0 1
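The same scaling in Python (names are mine, not from the k session):

```python
m = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]

# Scale B's row by the "A -> B" bit, mirroring m[0;1] * m[1].
via_b = [m[0][1] * y for y in m[1]]
print(via_b)  # [0, 0, 1]
```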
To see where A can get to in two hops (via anywhere), we need to do this check for all of A's bits, and all 'intermediary' hops. If a 1 appears in a column, that means there is at least one way that A can get to that place indirectly.
(m[0;0] * m[0]; m[0;1] * m[1]; m[0;2] * m[2])
0 0 0
0 0 1
0 0 0
We can do that 'is there a 1 in the column?' check by doing an element-wise max (|):
(m[0;0] * m[0]) | (m[0;1] * m[1]) | (m[0;2] * m[2])
0 0 1
Or more concisely, using 'max-over' (|/) instead of inserting the max verbs in between manually:
|/ (m[0;0] * m[0]; m[0;1] * m[1]; m[0;2] * m[2])
0 0 1
Or more concisely, as a function:
|/ {m[0;x] * m[x]} 0 1 2
0 0 1
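Spelled out in Python, the 'scale each intermediary's row, then max the columns' step looks like this (a sketch; the list names are mine):

```python
m = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]

# For each intermediary k, scale row k by the A -> k bit,
# then take an element-wise max down each column,
# mirroring |/ (m[0;0]*m[0]; m[0;1]*m[1]; m[0;2]*m[2]).
scaled = [[m[0][k] * y for y in m[k]] for k in range(3)]
two_hops_from_a = [max(col) for col in zip(*scaled)]
print(two_hops_from_a)  # [0, 0, 1]
```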
Or more concisely, by exploiting the fact that multiplication in k is element-wise:
m[0] * m
0 0 0 / 0 * 0 1 0
0 0 1 / 1 * 0 0 1
0 0 0 / 0 * 0 0 0
|/ m[0] * m
0 0 1
So that's everywhere A can get in two hops. If we want to see where each node can get to in two hops, we need to run this for each m[x] on the left. So we can use 'each-left', which has the form x v\ y, where v is a verb like *. (I'm going to tweak the appearance of k9's output slightly.)
m *\ m
0 0 0 / m[0] * m
0 0 1
0 0 0

0 0 0 / m[1] * m
0 0 0
0 0 0

0 0 0 / m[2] * m
0 0 0
0 0 0
As before, we can compress this down using |/, but this time we need to apply it to each 3x3 matrix:
|/' m *\ m
0 0 1
0 0 0
0 0 0
We can also write this as:
m (|/*)\ m
0 0 1
0 0 0
0 0 0
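That (|/*)\ pattern is the 'or-dot-and' of the opening quote: each output bit is an 'or' of 'and's. As a Python sketch (the helper name is mine):

```python
# out[i][j] = max over k of a[i][k] * b[k][j] - an "or" of "and"s,
# i.e. the boolean matrix product, mirroring (|/*)\ in k.
def or_dot_and(a, b):
    return [[max(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for row in a]

m = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]

print(or_dot_and(m, m))  # [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
```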
Now in doing this calc, we lost info about where each node can get to in one hop. We can either add that back in using max:
m | m (|/*)\ m
0 1 1 / OK! A can get to B and C
0 0 1 / B can only get to C
0 0 0 / C still can't get to anything
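The 'add the one-hop info back in with max' step, as a Python sketch (helper name mine):

```python
def or_dot_and(a, b):
    return [[max(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for row in a]

m = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]

two = or_dot_and(m, m)
# Element-wise max with m itself restores the one-hop edges,
# mirroring m | m (|/*)\ m.
one_or_two = [[max(x, y) for x, y in zip(r1, r2)]
              for r1, r2 in zip(m, two)]
print(one_or_two)  # [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
```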
Or we can assume each node can 'hop' to itself, so the 'one hop' routes don't get wiped out by the 'two hops' computation:
m: (1 1 0; 0 1 1;0 0 1)
m (|/*)\ m
1 1 1
0 1 1
0 0 1
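The self-loop trick, checked in Python (helper name mine):

```python
def or_dot_and(a, b):
    return [[max(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for row in a]

# 1s on the diagonal mean every node can "hop" to itself,
# so one-hop routes survive the two-hop product.
m = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]

print(or_dot_and(m, m))  # [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
```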
That was a pretty easy example, so let's make a more complex network to see how it performs. Let's assume all nodes can get to themselves directly, and also that:
- A -> B
- B -> C and D
- D -> E
In other words:
m:(1 1 0 0 0;0 1 1 1 0;0 0 1 0 0;0 0 0 1 1;0 0 0 0 1)
m
1 1 0 0 0
0 1 1 1 0
0 0 1 0 0
0 0 0 1 1
0 0 0 0 1
/ Or if we're lazy, we can tweak an identity matrix (updated via leprechaun1066 - thanks!):
/ https://www.reddit.com/r/apljk/comments/f0m5ff/transitive_closure_in_k/fgvblq2/
.[;;:;1]/({x=\x}@!5;0 1;1 2;1 3;3 4)
1 1 0 0 0
0 1 1 1 0
0 0 1 0 0
0 0 0 1 1
0 0 0 0 1
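The same 'tweak an identity matrix' shortcut in Python, per the edge list above (A -> B, B -> C and D, D -> E), with self-loops on the diagonal:

```python
n = 5
# Identity matrix: every node can reach itself.
m = [[int(i == j) for j in range(n)] for i in range(n)]
# Set the direct edges: A -> B, B -> C, B -> D, D -> E.
for i, j in [(0, 1), (1, 2), (1, 3), (3, 4)]:
    m[i][j] = 1
for row in m:
    print(row)
```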
And let's apply our process from before:
m:(1 1 0 0 0;0 1 1 1 0;0 0 1 0 0;0 0 0 1 1;0 0 0 0 1)
m (|/*)\ m
1 1 1 1 0
0 1 1 1 1
0 0 1 0 0
0 0 0 1 1
0 0 0 0 1
We can now see that A can get to C and D (because B can get to them directly), and that B can get to E (because D can get to E directly).
We know A should be able to get to E in three hops (via B and D). We can see this if we apply our process twice; so let's do that by wrapping up the process in a function, f, and applying it twice:
f: {x (|/*)\ x}
f@m
1 1 1 1 0
0 1 1 1 1
0 0 1 0 0
0 0 0 1 1
0 0 0 0 1
f@f@m
1 1 1 1 1 <- top right bit changed!
0 1 1 1 1
0 0 1 0 0
0 0 0 1 1
0 0 0 0 1
Instead of explicitly applying the function twice, we can apply it repeatedly until the output stops changing, using 'fixedpoint-over' (/:):
f/:m
1 1 1 1 1
0 1 1 1 1
0 0 1 0 0
0 0 0 1 1
0 0 0 0 1
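The whole pipeline translates into a short Python sketch (helper names are mine, not from the post):

```python
def or_dot_and(a, b):
    # Boolean matrix product: out[i][j] = max over k of a[i][k] * b[k][j].
    return [[max(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for row in a]

def transitive_closure(a):
    # Keep squaring (in the or-and sense) until the matrix stops
    # changing - a fixed point, the analogue of f/:m.
    while True:
        nxt = or_dot_and(a, a)
        if nxt == a:
            return a
        a = nxt

m = [[1, 1, 0, 0, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 0, 0, 0, 1]]

for row in transitive_closure(m):
    print(row)
```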
And that result is the transitive closure.
- What about for sparse graphs?
- Connection to matrix multiplication?
- https://www.jsoftware.com/papers/tot.htm 3.4 directed graphs
- https://www.jsoftware.com/papers/50/
- https://dfns.dyalog.com/n_Graphs.htm
- https://github.com/quantbin/kdb/blob/master/kdb-flex-integration/kui/src/qidioms.txt
- https://www.jsoftware.com/papers/50/50_35.htm [via gitonthescene - thanks!]
TODO: add graphviz-s?