Skip to content

Instantly share code, notes, and snippets.

@mlochbaum
Last active January 26, 2024 05:14
Show Gist options
  • Save mlochbaum/401e379ff09d422e2761e16fedbcd506 to your computer and use it in GitHub Desktop.
Save mlochbaum/401e379ff09d422e2761e16fedbcd506 to your computer and use it in GitHub Desktop.
Iridescence?

What Iridescence do

The idea for Iridescence is to make a high-level array language, where APL, K, and BQN are low-level. Those languages leave type and axis information implicit, forcing you to work out how axes correspond in your head. But Iridescence:

matmul(A[i,j],B[j,k]) = over i,j/+,k: A*B

This matrix multiply implementation indicates the argument shapes including the constraint that A's trailing axis matches B's leading one. This fact about the shape can then be checked statically. The result shape could also be specified with matmul(A[i,j],B[j,k])[i,k].

More information is presented, and in a more human-digestible way, so Iridescence code will be longer than the equivalent BQN, much like C often takes up more space than equivalent machine code.

Type basics

Iridescence has multi-dimensional arrays and multi-argument functions, with identical syntax for both. One of these is called/indexed with parentheses f(a,b,c) or UFCS a.f(b,c), and its type is written with arguments/axis types then result type like [a,b,c]t.

The apparently opposite ordering is needed for consistency. A function is not its result, in fact it's the furthest thing away from it! See how the function type smoothly transitions to a result type as we apply arguments—an alternate ordering would break this.

fun:[a,b][c,d,e]t  # function/array type declaration
fun(a,b):[c,d,e]t  # applied once with type annotation
fun(a,b)(c,d,e):t

The : separating value from type is optional if the type starts with [, and the result type can be unspecified, which allows a type annotation like A[i,j] in matmul above. And when calling a function with 1 or 0 arguments, UFCS can be simplified, so that arg.fn means fn(arg) and .fn means fn().

Numeric types Nat (0, 1, ...), Int, and Float exist. A natural number also indicates the set of numbers less than it as a type (the von Neumann convention). So 2 is the boolean type. A lone . indicates any atomic (not array or function) type. So [3]Float is the type of arrays with 3 floating-point numbers, perhaps 3D vectors.

To specify a list of unknown length, the constraint is that the domain could be any natural number. This is written list[i<Nat], which annotates list as having length i (not too sure about < but I don't think : can be used). And list[Nat] is an infinite list, which could be defined as a function instead since infinite hard drives aren't so cheap. How about a function that takes and returns a list, say APL's Grade?

grade:[[i<Nat,*]][i]i

Given an array of length i—the * indicates any number of extra axes—you get a list [i]i, so a list of indices. Index-of is a more complicated example, showing how names are used to straighten out multiple types.

find:[[i<Nat,*]t, [j]t][j](i+1)

Type inference?

In order to allow partial types like list[10] for an argument, there needs to be either some heavy-duty type inference, or gradual typing, so that anything after the [10] is checked dynamically and not statically. I like gradual typing better for this, as it handles anything at all without much hassle. Singeli only has forward type inference (an operation's return type can be inferred from arguments), and that seems practical enough. And it may make dependent typing more sensible: you don't have to statically sort out what it means for your types to have types and so on ad infinitum. Instead you evaluate with dynamic types at the highest levels and work down from there.

Something that nominally has a specific type can be used where a less specific one is needed, like [i<Nat][j]t could be used as [i], or passed to grade above. When importing from another file, should the type be specified in the import, something like import reverse[[i]t][i]t from ...? In that case, you could also use a more vague type than the file defines, if you don't care about all the details.

With dynamic checking you could also cast [i] to [i<Nat][j]t, although maybe this should have to be explicit. You'd get a runtime error if execution gets to the cast and the value doesn't fit the stricter type.

For-each

The declared type of a variable doesn't just constrain what axes it can have, but names them as well. This powers the for expression, which maps over axes.

each(fn:[a]b, list:[t]a):[t]b = for list over t: list.fn

(Hope I can use Python-like syntax but some parts like the : might be ambiguous)

The result of the for loop gets its axes from the over part, [t] here. So it shares them with the values that it maps over (list). Inside this loop, list is shadowed, and indicates a single element. This has worked out well in Singeli; there will also be an in syntax to avoid shadowing, but not from/to. To map over a range, build the range and then map over it.

The shortcut syntax over indicates that all variables having those axes should be mapped over. Here are a few examples with matrices.

diag(mat[i,i]) = over i: mat

transpose(mat[i,j]) = over j,i: mat

addrows(mat[i,j], row[j]) = over i,j: mat + row

If you want a different mapping, for example if you have mat[i,i] but want to treat the axes separately, you can cast in the for loop, such as for mat[i,j] over i,j.

Mapping over only some axes would be nice but presents some inconsistencies: if it's used, full mapping over [*]t would act on []t values instead of t and that's inconvenient.

for mat[i,j] over i:   mat.reverse  # Reverse each row
for mat[i,j] over i,j: mat.reverse  # Reverse each element?

Some sort of annotation, such as for& mat& over i, might be okay. The mat& indicates mapping over some axes, and for& indicates the mapped and unmapped axes are merged. That is, for& results in type [i,j] while plain for would give [i][j].

Mapping with functions

for works on functions too! Given f[a,b] and g[a,b], these two define equivalent functions:

over a,b: f + g
anon(a,b) = f(a,b) + g(a,b)

It's a more explicit form of a train.

Mixing functions and arrays like this should yield an array; for instance adding a function to a matrix in this way would evaluate the function at each valid index and add it to the matrix's value. It's likely that a function used this way would have a domain larger than the matrix, such as fn[Float,Float] plus mat[5,6]. For this to work, the function at some point needs to be restricted to the domain of the matrix. A convention that allows thing(a:Float, b:Float) = b + 3*a somewhere and thing + mat later might be nice, but if instead it's a matrix thing[5,2] then the implicit cast is definitely unwanted.

Multi-arguments

Functions have any number of arguments, and can also take a variable number of them. Builtins that do this should be defined as a unified whole, unlike APL-style overloading. For example + would add all its arguments. To sum a list, apply + to all arguments:

list/+

This is nicer than a fold because it doesn't specify a slow and inaccurate evaluation order (although the actual evaluation order is implementation-defined, a bit weird). The / syntax is meant to mirror the . for a UFCS call, although maybe that symbol won't work out.

The for syntax allows a function to be applied along a result axis with /+. Here's matmul again, and an equivalent expression:

matmul(A[i,j],B[j,k]) = over i,j/+,k: A*B

matmul(A[i,j],B[j,k]) = (over j,i,k: A*B)/+

So the reduced axis is automatically pulled to the front, and + is applied. Here we clearly want the reduction that works on cells, but this again leaves us with an unwanted [] much of the time. Should I have actually written /&?

The type of +, on floats at least, is [*Float]Float. Other cases might need more sophisticated typing because they care about how many arguments a function has. For example, the type of each should look something like this, using k*t to indicate k copies of t? But generally we don't want to force all these types to be the same. See next section.

each[[k*a]t, k*[i]a][i]t

Dependent types

Consider the function take, that if given an array and a length returns a prefix of the array. Here's how I'd like to write the type of this function:

take[[i<Nat,*a]t, j:i+1][j,*a]t

Unlike i, which is only ever a type, j is the value of the second argument, and also the type of take's result. The syntax j:i+1 to indicate this is different from i<Nat, which is just a constraint on the current type. Both of these might appear together, for example in the type of a triangular nested array:

tri[j:i<Nat][j].

This makes [j] a dependent type, which is a little scary. It seems like languages that support these have to devote a lot of effort to it.

What Iridescence brings to the table is not actually type dependence but independence: the various types in [a,b,c] shouldn't depend on each other. This is what makes arrays homogeneous!

What about the two is in take's type? I think this is polymorphism rather than dependent typing. The function take is supported on those types for every natural number i. But the way to figure out which type j belongs to is to inspect the first argument—specifically, check its length—which is concerning. Not sure, hope it works out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment