Skip to content

Instantly share code, notes, and snippets.

@hashmal
Last active July 3, 2020 14:03
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save hashmal/5263188 to your computer and use it in GitHub Desktop.
Save hashmal/5263188 to your computer and use it in GitHub Desktop.
Design of the Shirka programming language.

Shirka

Automatic garbage collection has made possible the implementation of many flexible, high-level languages in which memory management can simply be ignored most of the time. Unfortunately, garbage collectors have some properties that make their use in several fields (namely real-time applications) unpractical or even impossible:

  • Program execution is suspended during garbage collection cycles ("GC pauses")
  • GC pauses can happen at any time

A considerable amount of work has been invested in making garbage collectors compatible with real-time use by mitigating their drawbacks.

In this essay I describe Shirka, an attempt to make a programming language that does not require the use of a garbage collector, but does not ask the programmer to manage memory either. Many desirable properties of usual languages are not considered (such as code readability, ease of use, etc) in order to be free to bring ideas and interesting features to the table.

Underlying concept

An object can be deallocated safely when it is guaranteed that it won't be used in the future. Determining this is the basis of all memory management strategies. It is possible to make this task incredibly simple by making all objects useable only once, thus guaranteeing that when it is used, it can safely be deallocated. This is the foundation of Shirka.

The data-stack and what it can contain

When objects can only be used once, accessing them via variable names becomes unpractical. A data-stack is more appropriate to manage such objects. As there is already many documents describing stack-based languages, I won't explain this in detail. In the rest of this essay I assume the reader to be familiar with stacks and concatenative languages.

The data-stack is used to store values. All values have a type (the language is dynamically typed), such as Number, Character, Atom or List. Numbers and characters hopefully doesn't need to be explained here.

Atoms are similar to Lisp's atoms. They are mainly used to trigger the execution of operations.

Lists are sequences of values (lists are themselves values, so nesting lists is possible).

-- Examples of lists:
[1 2 3]
[42 foo :bar]
[]

-- Syntactic sugar allow the creation of strings
-- (strings are just lists of characters)
['a 'b 'c]
"abc"

Reserving operations

Reserving operations are used to "put aside" values from the data-stack. Set-aside values can be retrieved by name (using an atom).

42 -> x -- 42 is not on the stack anymore. It's in `x'
x       -- 42 is copied from x. 42 is on the stack but another 42 is in x
<- x    -- 42 is moved from x to the stack. x is now empty (undefined)

Note

The syntax of reserving operations somewhat breaks the postfix nature of the language. 42 -> x could arguably be represented as 42 :x -> making it easier to manipulate reserving operations programmatically. There are several reasons for this choice:

  • stack-manipulation operations use "visual hints" and 42 -> x fits well in this while 42 :x -> doesn't (and looks more cluttered)
  • given the dynamic scoping nature of the language, things are error-prone enough so encouraging magic via dynamic definition of words does not seem right
  • reserving operations are different enough from regular operations to justify a different syntax

-> x can be viewed as a single token, restoring a fully postfix syntax. Maybe in the future I will add operations to create reserved values programmatically, but it's not a priority.

Execution of operations and Homoiconicity

Lists can be executed, making Shirka homoiconic.

Making the language having this property is not gratuitous: Homoiconicity makes possible to add constructs and powerful abstractions to the language, helping a long way in making the language feel "high level".

As side-benefits, it also keeps the language small and simple, both in terms of syntax and implementation.

-- Executing a list using the `!' operation

[3 2 +] !

-- The stack now consists of the value `5'.

Actually, a file containing Shirka code can be viewed as a list with implicit brackets and an automatic ! operation.

[...]

-- Atoms
+ - * /
<< >> ><
foo bar
:foo :bar

-- Numbers
0 1 2 -1
3.14

-- Characters
'a 'b 'c

[...]

-- Stack manipulation

>>  -- duplicates the value at the top of the stack
<<  -- destroys the value at the top of the stack
><  -- swaps two values at the top of the stack

[using the data-stack] [constants]

Scoping in Shirka is dynamic in nature. This is because lexical scoping, given the set of features, would make closures possible and referenced objects would lose their linear property.

42 -> val

(=> foo)
  [ 3.24 -> val
    [ val ] ]

foo puts --> 42

Exception handling

In "real" programs, things can go wrong. A language construct to recover from errors is desirable, but problematic if the language is stack-based. Indeed, when an exception occurs, the stack can be in an undetermined state. Some intermediate values can clutter it, or some values may be missing. Shirka's approach to the problem is to use a local stack on which an operation that can fail is executed. If the execution succeeds, the local stack is sent back to the "main" stack, and if it fails, some information about the error can be placed on the stack. A symbol indicating the status of the computation is then pushed so a following operation can take an appropriate action.

Defining Operations

Operations are just lists so there is no absolute need to define them in a special way. However, it is possible to define them using => instead of ->. Operations defined this way are automatically executed when referenced.

"Looping" is achieved through recursion. It means that tail optimisation is required by the language.

(=> loop-forever)
  [ [ op tail ] => tail
    => op
    tail ]

["hello" puts] loop-forever

Conditional execution uses standard (but intrinsic) operation calls.

(when)
  [ true  [ ... ]
    false [ ... ] ]

Syntax note

It is possible to simulate a prefix syntax using parenthesis:

foo bar
(bar) foo
-- The two above lines are equivalent.

It is particularly useful for operation definitions:

(=> foo) [
  -- operation body
]

Implementation note

Tail optimisation can be implemented by replacing the current operation on the call-stack with the last operation in the list. There is a subtle difference between a normal call and a tail optimised call: while in the former, parent definitions cannot be altered, they can be in the latter. Hard-to-find are likely to appear in there :)

Polymorphism

Parametric polymorphism can enhance readability, succinctness and flexibility. It also encourages code reuse. All these properties help a long way in making a language feel dynamic and high-level. In Shirka I currently three possible ways to achieve this:

  • provide several intrinsic operations checking for type (e.g is_number?) and let every operation perform the desired checks.
  • define operations for a type (similarly to OOP) and let the interpreter dispatch based on the top-most stack value.
  • let the interpreter choose operations based on the type of multiple stack values

The last option requires that all operations provide how many "arguments" they need. I feel it wouldn't be a good fit.

The "OO-like" option is tempting but it's still unnatural to define such operations given the current language.

The first one is not particularly elegant but it gets the job done and is not too cumbersome as the user can't define new types. We'll choose this option which is also simple to implement.

-- Following operations consume a single value and producean identical
-- value and a boolean.

Number?
Character?
List?
Atom?

Concurrency

The language discussed so far doesn't require a garbage collector and arguably feels high-level. It is, however, impractical because there can't be global, mutable state (amon other things).

This issue can be solved by bringing a concurrency model inspired from Erlang.

Processes

A process is an execution unit with its own stack. A process can be spawned like this:

[ 42 puts ] !!

The operation !! will return immediately, leaving a reference to the spawned process on the stack. Processes are special in Shirka in that way (they are not linear). However, it does not make things as bad as dangling pointers & co. because they are not "raw" references.

Messages

Data (and execution sequences!) can be passed between processes.

(-> foo)
  [ (receive)
      [ get [ ... ]
        set [ ... ] ]
    foo ]
  • asynchronous
  • synchronous

Let it crash

[...]

This concurrency model unfortunately brings back unpredictable pauses in programs (the whole point of Shirka is to avoid that!). I believe this can be solved by giving directives to the scheduler, for example "timeouts" would force the scheduler to prioritise process that are near expiration times. I'm still unsure about this.


  1. Conclusion
    • impractical because of stack and dynamic scoping
    • stack can be mitigated using reserving operations
    • however the language solves the originally stated problem: avoiding GC while still providing a "high level, script" feel.
    • topics not discussed: polymorphism
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment