chrispsn/domain.md Secret

## domain.md

      
    Raw
  

              domain.md
            
          
    k 'domain' adverb design

tl;dr

A 'domain' adverb could let you:

specify whether to filter a list or dict by keys or values
shuffle list elements via arithmetic, without referencing the list twice
replicate k7's 'step functions' via binary search.

It also obviates having separate 'grade' (<>) and 'sort' (^) verbs.
Background

Some verbs treat lists and dicts the same: they use the domain (indexing @., where &, find ?, match =). For dicts, the domain is its keys; for lists, its indices (!#).
For other verbs, traditionally, k dialects have made pragmatic choices about whether an operation works on keys or values. Searches work on values; arithmetic and filters (take, drop) work on values for lists and keys for dicts.
It'd be nice to get something that worked 100% of the time instead of 80%. So what if you could just specify which part of the 'map' you wish to work on?
For example, perhaps we want to take every second element of a list - something like:
 (2 mod) # !8          / pseudosyntax
1 3 5 7

Or perhaps we want to index into a dict based on the closest lower bound (ie a step function):
 (1 3 6!`a`b`c)': !8   / pseudosyntax
``a`a`b`b`b`c`c 

For reasons that become clear, we're going to use the symbol ^ for this adverb.
How would the adverb behave?

Here is a work-in-progress model:
 domain1: {$[isList y; .; swap] x swap y}
 domain2: {$[isLhsFn x; x[swap y;z]; $[isList z; !<(#z)mod'; swap] x[y; swap z]]}
 
 isLhsFn: {|/x~/:(@; ?; '; ':; !; binDict)}  / searches and indexing
 isList:  {&/(~`.~@x;1~#$@x;~x~*x)}
 swap:    {x[]!$[isList x; !#x; !x]}

Arithmetic (+-*% and mod)

Domain arithmetic is straightforward for dicts:
 1 +^ 1 2!`a`b   / pseudosyntax
2 3!`a`b

But how should it work for lists? My view is that the range (values) should be unaffected, but shuffled. The data structure should remain a list, and no values should be dropped or nested.
What about repeats? We already deal with this with grade - the copy that comes later is assigned a higher index. < 0 1 0 => 0 2 1. This means we get easy weaves, for example 'take every even element, then every odd element':
 2 mod'^ `a`b`c`d`e`f   / pseudosyntax
`a`c`e`b`d`f

What about out-of-bound indices? Grade already brings the result back into a permutation of !#L - in theory that's enough. But I suggest that before the grade is applied, we (#L)mod the indices. This means we get easy rotates:
 1 +^ `a`b`c`d   / pseudosyntax
`d`a`b`c         / notice `d looped around

How to justify this? Well, we already use clock arithmetic for out-of-bounds:
 5#`a`b`c
`a`b`c`a`b

Admittedly the mod step was a practical choice, but I wonder if there's a theoretical basis? Perhaps a difference between indexing and counting?
To be clear, all of this can already be done via indexing, but it saves having to reference the list twice. Compare:
 x (#x)mod -1+!#x:`a`b`c`d
`d`a`b`c

 1 +^ `a`b`c`d   / pseudosyntax
`d`a`b`c

Sort (grades <>, sort ^)

k traditionally used grade to sort a list via indexing:
 {x@<x}`c`a`b
`a`b`c

Later k versions introduce sort as ^:
 ^`c`a`b
`a`b`c

With a domain adverb, <> can become 'sort values ascending/descending', and we can get back the grades as:
 <^ `c`a`b   / pseudosyntax
1 2 0

It felt like a waste to have three primitives for sorting, and weird to have an 'asc' symbol with no 'dsc' symbol. With a domain adverb, we can reduce the real estate use back to two and regain symmetry.
Filters (take x#M, drop x_M)

No need to decide whether 'filter by domain' is more/less common than 'filter by range' for lists vs dicts - just make them all 'filter by range' and get domain variants via v^:
 0 7 4 #^ `c`a`b`a`d   / pseudosyntax
`c`d
 (2 mod) # !8          / every second element
1 3 5 7

Search (find M?, has M', binary search M':)

The biggest win here is that domain-ised binary search replicates k7's step functions:
 d: 1 3 6!`a`b`c
 d':^ !8   / pseudosyntax
``a`a`b`b`b`c`c

Which symbol?

Perhaps it is a good fit for a light symbol like ^. We don't lose much:

Monadic: The 'sort' behavior is already moved to < and > (see above)
Dyadic: We could keep the i^ and I^ behaviours but would lose f^ (which I suspect is of limited use anyway [2020-11-08 - my view is changing...]).

Given it's an adverb, we could also lose the assignment variation in favour of using I^: as reshape (I^:) and perhaps even f^: as commute (though query whether that's worth the reduction in readability).
Why not do this?


It special-cases behaviour depending on the verb. 'Searches' assume the 'main' map is on the left-hand side, while 'filters' assume it's on the right-hand side.
How does it work for lambdas and compositions?
Is it hard to implement efficiently?
Is it useless or meaningless for some verbs (such as group =)?
Can get much of the same benefit with a 'swap' verb that exchanges dict domain and range (even if you have to swap back)?

Digression: lists vs dicts

Trivially, a list's domain is !#L. But a list sometimes behaves differently to its dict equivalent ((!#L)!L), even if we leave aside domain/range targeting. Let's look at merge (,) and addition:
 (`a`b!1 2) , `b`c!3 7
a|1
b|3
c|7

 1 2 , 3 4 5
1 2 3 4 5

 (`a`b!1 2) + `b`c!3 7
a|1
b|5
c|7

 1 2 + 3 4 5
1 2 + 3 4 5
^
:length

So then, what does it mean to talk about the 'domain' of a list? It's an ordering - but what else? A count?
On one view, lists don't have a domain. It's just one implied by their ordering. But dicts also have ordering. It could be interesting if there was a way to 'index' into a dict using its ordering alone; filter (#) could be that way.
Test script - works in 2020.09.17:

/ SETUP

isLhsFn: {|/x~/:(@; ?; '; ':; !; binDict)}  / searches and indexing
isList:  {&/(~`.~@x;1~#$@x;~x~*x)}
swap:    {x[]!$[isList x; !#x; !x]}

/ ?D currently undefined. 'unique-keyed entries'
uniqDict: {k:!x; !/(k;x[])@\:?k?k}
(`a`b`c!1 2 1) ~ uniqDict `a`b`a`c!1 2 3 1

/ D':L currently rank error
binDict: {(!x)@x[]':y}


/ ADVERB MODELS

domain1: {$[isList y; .; swap] x swap y}
domain2: {$[isLhsFn x; x[swap y;z]; $[isList z; !<(#z)mod'; swap] x[y; swap z]]}


/ MONADS

/ sort becomes grade. (<> would change to 'sort (by) values')
/ nb: ^ is 'sort by domain' for dicts in current k9 - this would change to be 'sort by range'
(<L) ~ domain1[^] L: `c`a`b`a`d
(<D) ~ domain1[^] D: `a`b`a`c!1 2 3 1
(^D) ~ domain1[<] D: `a`b`a`c!1 2 3 1

/ unique becomes:
/ list: 'indices of unique elements'
/ dict: 'unique-valued entries'
0 1 2 4        ~ domain1[uniqDict] L
(`a`b`a!1 2 3) ~ domain1[uniqDict] D


/ DYADS

/ addition is rotate
`c`d`a`b ~ domain2[+][2;`a`b`c`d]
`b`c`d`a ~ domain2[+][-1;`a`b`c`d]

/ mod is 'weave'
`a`d`b`e`c`f ~ domain2[mod'][3;`a`b`c`d`e`f]

/ take works on range for lists, but domain for dicts.
/ so we force 'always range' via the workaround below.
rangeTake: {swap x # swap y}
`c`d       ~ domain2[rangeTake][0 7 4; L]              / L#: like indexing but skips missing elements
1 3 5 7    ~ domain2[rangeTake][`b 2 mod; !8]          / F#: take every second element
(1 3!`a`c) ~ domain2[rangeTake][1 2 3; 1 4 3!`a`b`c]   / D#: keep entries where values match

/ drop-domain would drop elements at specific indices, rather than those that appear in the x-arg

/ index becomes the find primitive, but with 'not found' as null
0 4 0 1 ~ domain2[@][L; `c`d`z`a]  / nb: will change if out-of-domain results in 0N instead of 0
`a``b   ~ domain2[@][D; 3 7 2]

/ find just becomes index
`c`d`a` ~ domain2[?][L; 0 4 3 7]
2 0 1   ~ domain2[?][D; `b`z`a]

/ bin becomes, for dicts, step fn indexing
``a`a`b`b`b`c`c ~ domain2[binDict][1 3 6!`a`b`c; !8]

\\