@raiph
Last active May 6, 2022
Raku's "core"

Then mathematical neatness became a goal and led to pruning some features from the core of the language. ... Another way to show that LISP was neater than Turing machines was to write a universal LISP function and show that it is briefer and more comprehensible than the description of a universal Turing machine.

— John McCarthy, History of Lisp

TL;DR Many presume the mathematical theory of functions is the sole right foundation for programming theory and languages. In contrast, Raku adopts the mathematical theory of actors (cf Scala's Akka) as its core primitive (though the capacity of an actor to act as an object is leaned on for less demanding settings than arbitrary heterogeneous/concurrent/distributed computing). This article drills down to that core: a metamodel ("model of a model") of a model of computation ("how units of computations, memories, and communications are organized").1

Why I wrote this article

This article began with u/faiface's reddit post "I'm impressed with Raku"2. One sentence in particular stood out for me:

I still generally prefer languages with a small, orthogonal core

I took this to mean they thought Raku did not have a small, orthogonal core. So I wrote a personal response to them that has morphed into this article.3

From here to the core

Feel free to start and end your read of the rest of this article wherever you prefer, but I've tried to make it fun, and recommend those who don't know Raku start at the start.

  • What is Raku's CORE? Raku's CORE is the symbol table containing functions like print and operators like +. This so-called CORE is not the "inner core" we seek.

  • What language is Raku's CORE written in? Raku isn't just one language. Instead it's a mutable "braid" of mutually embedded sub-languages. In fact they're so mutable that they can even all disappear! So there's no fixed syntax in Raku. The only thing that's constant is an underlying "single semantic model".

  • What is Raku's "single semantic model"? I touch on some answers, ending with one that leads us to drill down through the lower layers of Raku, and down through Rakudo, the Raku compiler.

  • What is nqp? Rakudo is built atop a subset of Raku named nqp. nqp is a programming language for writing compilers. The nqp compiler, which is written in nqp, targets NQP, an abstract VM that targets various concrete backends. NQP is also written in nqp, so to get to the inner core we must drill down another level, focusing on a concrete NQP backend.

  • What is MoarVM? MoarVM, short for "Metamodel on a runtime VM", is Rakudo's concrete backend for use in production settings. It implements Raku's single semantic model directly "on the metal" as it were, via a tiny singleton data structure called KnowHOW.

  • What does KnowHOW know how to do? KnowHOW is a self-describing structure that knows how to do just one thing: to be the KnowHOW from which both it and all else can be bootstrapped.

  • Have we truly arrived at Raku's core? Good question. Suffice to say, KnowHOW is Raku's core primitive.

What is Raku's CORE?

As with any PL (programming language), there are cores within cores within cores until you hit electrons. Let's start by identifying Raku's equivalent of a standard library. We can safely presume the core we seek is more primitive than the standard library.4

Consider this four line Raku program:

print 42 + 99;          # 141
print &print.file ;     # ...src/core.c/io_operators.rakumod
print &infix:<+>.file;  # ...src/core.c/Numeric.rakumod
print ?CORE::<&print>;  # True 

The first line demonstrates that both print and + can be used without any explicit import. The second and third lines call the .file method on the symbols &print and &infix:<+> to reveal the source files in which the print function and the + operator are defined.5 The final line shows that the &print symbol is stored in a symbol table named CORE (as is &infix:<+>).6
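
As a minimal sketch of that last point (the variable name &core-print is made up for illustration), the routine fetched from the CORE symbol table is the very same object that a bare print resolves to:

```raku
# Fetch the routine stored under the key '&print' in the CORE symbol table.
my &core-print = CORE::<&print>;

say &core-print.^name;          # Sub
say &core-print.name;           # print
say &core-print === &print;     # True
```

Nothing here is special-cased for built-ins; the lookup is an ordinary symbol table access.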

All of Raku's surface features are technically "user-defined", including all of its CORE. These surface features are not the core we seek. We seek something more primitive. To move toward that we next look at the language in which the CORE code is written.

What language is Raku's CORE written in?

Informally speaking, Raku's CORE is written in a simpler "core" language. But here's where things start to get a bit trickier to explain.

This simpler "core" language is actually a braid of even simpler sub-languages aka "slangs". Slangs are DSLs specifically designed to work together to comprise a language made of relatively small parts.7 (Where by "small" I mean relative to the "core" language they are each a piece of, and very small compared to the surface CORE that's built atop the "core" language.)


The "standard" braid (that ships with a Raku compiler) currently includes a half dozen slangs (a GPL, plus DSLs for strings, regexes, embedded documentation, and so on) that mutually embed each other.
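
A small sketch of what "mutually embed" looks like in practice, using only standard syntax: a string (the quoting slang) can embed main-language code in braces, and a regex (the regex slang) can embed main-language code as an assertion. The variable $n and the threshold 40 are arbitrary choices for illustration.

```raku
my $n = 3;

# Quoting slang embedding main-language code in { ... }
say "n doubled is { $n * 2 }";               # n doubled is 6

# Regex slang embedding main-language code in a <?{ ... }> assertion
say so "42" ~~ / (\d+) <?{ $0 > 40 }> /;     # True
say so "7"  ~~ / (\d+) <?{ $0 > 40 }> /;     # False
```

In each case the parser switches sub-language at the opening delimiter and switches back at the closing one.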

Devs can replace, alter, add or remove individual syntax rules or semantics of any slang, and thus the "core" language.

Devs can go further, adding entire slangs of their own to the braid. Or remove slangs; devs can even remove all of the slangs in the braid if they wanted to go to the extreme of yielding zero surface syntax. (Such that, after the last slang/syntax is removed, any program with any further code after that -- even just comments or whitespace -- would fail to compile, because it would be guaranteed to be invalid.)
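
The most modest everyday form of altering syntax is declaring a new operator, which extends the grammar of the main slang for the rest of the enclosing lexical scope. A sketch, with avg as a made-up operator name:

```raku
# Declaring this sub also teaches the parser a new infix token, 'avg'.
sub infix:<avg> (Numeric \a, Numeric \b) { (a + b) / 2 }

say 4 avg 10;      # 7
say 1 avg 2;       # 1.5
```

This is far short of adding or removing a whole slang, but it relies on the same underlying mechanism: the language being parsed changes as the program is compiled.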

Where's the core in a Ship of Theseus that can completely vanish en-voyage?


Larry Wall, Raku's lead designer, wrote in his 2001 Apocalypse #1:

Raku will support multiple syntaxes that map onto a single semantic model.

It turns out that Raku's syntax is entirely arbitrary and mutable. So Raku's real inner core isn't anything to do with syntax. Instead, it's some part of Raku's "single semantic model".

What is Raku's "single semantic model"?

As part of writing this article I asked "What is the “semantic model” introduced in Apocalypse #1?" on StackOverflow.8 The answer by jnthn, the lead dev of the Rakudo compiler toolchain, covered a range of options, all of which are interesting.

This option sounded enticing:

We could see RakuAST as an alternative syntax for Raku expressed as an object graph. Given it will also be the representation the compiler frontend uses for Raku code, we can also see it as a kind of syntax-independent gateway to the Raku semantic model.

This is why jnthn has written "RakuAST will be found at the very heart of Rakudo". But the truth is, it will still only be a "gateway" to what we seek (albeit a "syntax-independent" one).

Instead, to continue our journey to the inner core, we'll go with another of jnthn's options:

An [interpreter or compiler] implemented in some other language (in which case we lean on its semantic model)

Most of Raku, and the Raku compiler Rakudo, is written in Raku. But it's bootstrapped9 from lower levels. And the next level down is nqp.

What is nqp?

nqp is:

  • A subset of Raku, the middle "doll" in Rakudo's stack of three self-similar systems.10

  • A programming language / system focused on constructing and compiling programming languages.11

The nqp compiler and its standard libraries are written in nqp, so focusing on nqp isn't really going to help progress toward the inner core of things. So instead we next look at what nqp targets: NQP, an abstract VM that runs nqp (and Raku). But NQP is almost entirely written in... nqp!

Have we run out of road? No, we just need to figure out how nqp/NQP maps to "the metal", to machine code running on hardware.


The final steps in our journey to the center of Raku are clearly marked on the map Larry Wall was sketching out back in 2001. Immediately following the important first sentence I quoted above, and repeat below, he wrote a second even more important sentence:

First, Raku will support multiple syntaxes that map onto a single semantic model. Second, that single semantic model will in turn map to multiple platforms.

So, to continue our journey, we recognize that running Rakudo, or the nqp/NQP subset of Raku(do), means running with a selected backend appropriate for a given platform. It's one of these backends that's actually running code on an underlying platform, so we need to look at one of these backends to see Raku's real core, where Raku's ultimate underlying semantics meets "the metal".

Rakudo/nqp/NQP have experimental, partially working support for JVM and JS backends, but we're going to focus on the only backend that currently has production status and runs on a wide range of OS/hardware combinations: MoarVM12.

What is MoarVM?

At the start I wrote:

Raku adopts the mathematical theory of actors (cf Scala's Akka) as its core primitive (though the capacity of an actor to act as an object is leaned on for less demanding settings than arbitrary heterogeneous/concurrent/distributed computing). This article drills down to that core: a metamodel ("model of a model") of a model of computation ("how units of computations, memories, and communications are organized").

Raku's metamodel -- its model of Raku's single semantic model -- is known as 6model. It can be built atop existing platforms (and is, for the JVM and JS backends), but it can also be implemented directly on the "metal". And MoarVM -- Metamodel on a runtime VM -- does just that, in C.

6model is implemented as a single data structure with associated code. On MoarVM the data structure is a C struct declared in about 30 lines of C code.

Saying it's 30 lines of code is cheating in the sense that this struct makes use of other declarations and setup code. But I think it's fair to say it's pretty small. And it's definitely the core primitive; the entirety of Raku is bootstrapped from this one data structure, by creating copies of it with different initial values, and fanning messages out to the copies, which themselves create more copies of themselves, and so on, in an ever widening system.

This primordial "actor/object", a singleton "self-describing" datum that combines data/state and code/behaviour, is named KnowHOW.

What does KnowHOW know how to do?

Let's momentarily zoom out to the 30,000 foot view and then, with all the setup done so far in this nearly finished article, rapidly drill back down.

We can zoom out and then back in with these four lines of Raku code:

This Raku code...            | Displays name of...                     | Which is...               | As computed by...
-----------------------------|-----------------------------------------|---------------------------|-----------------------------
say 42.^name                 | 42's WHAT object                        | Int                       | Raku code, which calls...
say 42.HOW.^name             | Int's HOW object                        | Raku::Metamodel::ClassHOW | nqp code, which calls...
say 42.HOW.HOW.^name         | Raku::Metamodel::ClassHOW's HOW object  | NQPClassHOW               | nqp code, which calls...
say 42.HOW.HOW.HOW.^name     | The core primitive                      | KnowHOW                   | backend code

  • 42 is just a random Raku value I chose as a starting point. The drill down through the layers will arrive at the same core primitive regardless of whether we started with an int32 value, an exception, type object, operator, function, keyword, whatever. WHAT is Raku's macro/method for returning a value's corresponding type object.

  • 42.HOW returns a How Objects Work object (aka "metaobject"). 42.HOW knows how objects of Raku's Int class work. The name of this HOW is Raku::Metamodel::ClassHOW. (If I'd chosen, say, a subset as the value -- subset foo -- the HOW would have been Raku::Metamodel::SubsetHOW.) HOWs can live in ordinary Raku userspace, but all HOWs shipped with the Raku compiler Rakudo are written in nqp. (While they look like Raku code, they're written in a subset of full Raku, stored in files with a .nqp file extension, and compiled directly by the nqp compiler.) So this Raku::Metamodel::ClassHOW is an instance of an nqp class that implements the mechanics of a Raku class in general, abstracted from the specifics of any particular Raku class.

  • 42.HOW.HOW is also an nqp object -- we're closing in on Raku's core and are now deep below its CORE, in code that's unaware of (full) Raku. (Though note that Raku and nqp remain 100% compatible due to them sharing the same metamodel.) 42.HOW.HOW is named NQPClassHOW. It knows how the nqp instance named Raku::Metamodel::ClassHOW works. It implements, in nqp, the mechanics of an nqp class in general, abstracted from the specifics of any particular nqp class.

  • 42.HOW.HOW.HOW is KnowHOW. This is the single data structure we reached earlier in prose in the previous What is MoarVM? section. Now we've met it in code. If Raku code is being run with the MoarVM backend, 42.HOW.HOW.HOW corresponds to the C struct I described earlier in abstract terms ("primordial ... self-describing ... datum that combines data/state and code/behaviour"). Now I'll be more concrete: it's MoarVM's C struct and code implementation of 6model that knows how MoarVM's C struct and code implementation of 6model works. For example it implements, in C, calling a function associated with a 6model data structure (aka a "method", though whether "objects" and "methods" exist at this bootstrap stage is like asking which came first, the chicken or the egg).
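
To sketch the claim in the first bullet that the starting value doesn't matter, the same three-step drill down can be tried from other kinds of values. The subset name Small is made up for the example. Expected output is deliberately omitted; if the claim above holds, every line should print the same name as the 42 case.

```raku
subset Small of Int where * < 10;     # hypothetical subset, for illustration

say "a string".HOW.HOW.HOW.^name;     # starting from a Str value
say Int.HOW.HOW.HOW.^name;            # starting from a type object
say &say.HOW.HOW.HOW.^name;           # starting from a function
say &infix:<+>.HOW.HOW.HOW.^name;     # starting from an operator
say X::AdHoc.new.HOW.HOW.HOW.^name;   # starting from an exception
say Small.HOW.HOW.HOW.^name;          # starting from a subset type
```

The intermediate layers can differ (a subset's first HOW is not a class's HOW), but the chain is expected to converge on the same primitive.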

Have we truly arrived at Raku's core?

In one respect, no. Because this is just C code that targets underlying hardware.

But for the purposes of this article, we have the following satisfying result:

say 42.HOW.HOW.HOW.HOW.^name ;           # KnowHOW
say 42.HOW.HOW.HOW.HOW.HOW.^name ;       # KnowHOW
say 42.HOW.HOW.HOW.HOW.HOW.HOW.^name ;   # KnowHOW
...

That is to say, calling .HOW on 42.HOW.HOW.HOW returns its invocant, i.e. itself, i.e. this innermost KnowHOW knows how it itself works. In a self-similar fashion, it includes a slot declaring its type, and another declaring its type's constructor's kind, and both these slots point to itself. The upshot is that code that it includes for calling a function on a metamodel object, or accessing a slot's data, can be applied to itself.

So this ultimate KnowHOW -- the abstract conceptual singleton heart of 6model -- is Raku's core primitive, and there's a concrete implementation of it in each backend.

Footnotes

1 u/codesections has written in reddit sub r/rakulang "I don't fully understand why you say that Raku is built on an actor (rather than object) model.". I've written an initial response there.

2 The 13th highest ever scoring post in /r/programminglanguages at the time it was posted.

3 My original version of this article began: "Me too from many standpoints including: initial attraction and comfort zone; instinctual sense of formal aesthetics; deep debugging of a thorny problem; modifying a language; working on its compiler; and discussing a language's bowels with folk pointing out they prefer a small core. :)" But I didn't elaborate on any of those notions. Attraction, comfort, and aesthetics are too subjective. Debugging and working on a compiler are more tractable but I skipped those topics too. One thing I did write but have now elided from the body of the article, but want to keep here in this footnote for posterity, is a question I asked about PLs that reflects what I view as nice small FP cores: "Which is more your type of poison, Kernel or Frank?" (Kernel's author John Shutt sadly passed away earlier this year (2021) but I hope to see vau rise again.)

4 The "batteries included" distribution offered to newbies is "Rakudo Star". It includes the Rakudo compiler package plus additional docs, tools, and a collection of useful libraries. I ignore those libraries. The Rakudo compiler package includes a few libraries that have to be explicitly imported if their features are to be used. For example, the Test module requires you write use Test; to use its features. I ignore those too.
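
For instance, a minimal sketch of a test file using that module (the assertion itself is arbitrary):

```raku
use Test;      # without this line, plan and is are not available

plan 1;
is 42 + 99, 141, 'addition works without any import';
```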

5 The .c in core.c stands for Raku Christmas, the first version of Raku released on Christmas Day 2015. The second major version, Raku Diwali, was released in November 2018, and there's a corresponding core.d folder. Modules in core.d are lexically concatenated with modules in core.c to form the pre-populated lexical scope ("setting") of Rakudo Diwali programs that is accessible via the symbol CORE. cf Haskell's Prelude (though Raku's setting is broken into two parts -- prologue and epilogue -- that form a sandwich with user code in the middle).

6 CORE isn't just for conventional functions, but operators too. Consider this one line Raku program that adds a factorial operator to the language:

sub postfix:<!> (Int \n where * > 0) { n == 1 ?? 1 !! n * (n-1)! }
              ^                                                ^

This postfix operator is added at the time the compiler parses the > (indicated with the first ^), so that it's immediately available in subsequent code (note how it's used in the operator definition body, indicated by the second ^). Click this tio.run link to see the above code fail. Note how the first line in the tio code (say 'program starts to run';) does not execute despite being the first line. This is because compilation fails -- the postfix ! is not yet part of the language when it's used by the say 5!; line. Next, cut that first line, paste it as the last line instead, and click the run/play button again; the code now successfully compiles and executes. This ability to extend the language within the language is used to ship a CORE that's full of functions like print and operators like +.

7 Raku is a general purpose language-oriented programming language. This is very useful in its own right, but in addition, Raku includes slangs that make it easy to apply language-oriented programming to the interesting problem of constructing a language. Not only that, but it includes those slangs in such a way that it's easy to apply language-oriented programming to the problem of constructing a language that makes it easy to apply language-oriented programming to the problem of making it easy to...

8 Only my 3rd ever SO question -- unlike my 300+ raku answers. In case it wasn't already obvious, I 💓 Raku. :)

9 Raku(do) is bootstrapped in several ways. ("Bootstrapping" is defined by Wikipedia as "a self-starting process that is supposed to proceed without external input".) This includes Raku culture itself, as seen in its conception and gestation, as well as aspects of:

  • Linguistic bootstrapping. Raku's design aims at the full range of programming language facility from a young child's acquisition via "baby steps" to advanced practitioners creating their own language extensions, DSLs, or entirely new languages.

  • Compiler bootstrapping. Rakudo is bootstrapped in several ways. At the outer level there's CORE. As already explained, this isn't the starting point, and neither is the Raku language, or rather the collection of sub-languages of which it's comprised. Rakudo compiles Raku before a user's program is compiled, and it does that via nqp/NQP10, which is a bootstrapping compiler. It goes further than that too, but I'm getting ahead of the story.

10 nqp is a subset of Raku. It has the same braided architecture as Raku, but drops some of the sub-languages that Raku has. While its grammar (parsing) sub-language is a large chunk of Raku's (which inherits from it), nqp's other sub-languages are much smaller. nqp's equivalent of Raku's standard library is also tiny in comparison to Raku's. Similarly, NQP's concrete backends implement a subset of Raku that corresponds to A) the single semantic model and B) features that are best implemented at a low level for performance reasons.

11 For those interested in technical arcana, nqp is a bootstrapping, self-hosted, meta-compiler, a modern day retelling of META II.

12 An Erlang/Elixir/BEAM enthusiast wrote a fairly popular brief intro to MoarVM. They described it as "a fantastic piece of technology".

@samebchase
samebchase commented Jul 15, 2020

I /love/ your SO answers. Have learnt a lot from them, thanks for all your efforts.

@uzluisf
uzluisf commented Feb 3, 2021

Unfortunately I cannot say I understood everything but that's on me and I def need to reread it a few more times. However this is a highly informative write-up and I really liked the onion-layered view you gave of the language we know as Raku.

Yes, I've taken you on a long and perhaps silly journey. Hopefully you enjoyed reading it as much as I enjoyed writing it. :)

This is anything but silly. It suffices to say I always enjoy reading your write-ups about Raku and programming language design at large I come across on SO, Reddit, Github/Gitlab issues/gist, etc.
