hashmal/shirka_draft.mkd

## shirka_draft.mkd

      
    Raw
  

              shirka_draft.mkd
            
          
    Shirka

Automatic garbage collection has made possible the implementation of many
flexible, high-level languages in which memory management can simply be
ignored most of the time. Unfortunately, garbage collectors have some
properties that make their use in several fields (namely real-time
applications) unpractical or even impossible:

Program execution is suspended during garbage collection cycles ("GC
pauses")
GC pauses can happen at any time

A considerable amount of work has been invested in making garbage collectors
compatible with real-time use by mitigating their drawbacks.
In this essay I describe Shirka, an attempt to make a programming language
that does not require the use of a garbage collector, but does not ask the
programmer to manage memory either. Many desirable properties of usual
languages are not considered (such as code readability, ease of use, etc) in
order to be free to bring ideas and interesting features to the table.
Underlying concept

An object can be deallocated safely when it is guaranteed that it won't be
used in the future. Determining this is the basis of all memory management
strategies. It is possible to make this task incredibly simple by making all
objects useable only once, thus guaranteeing that when it is used, it can
safely be deallocated. This is the foundation of Shirka.
The data-stack and what it can contain

When objects can only be used once, accessing them via variable names becomes
unpractical. A data-stack is more appropriate to manage such objects. As
there is already many documents describing stack-based languages, I won't
explain this in detail. In the rest of this essay I assume the reader to be
familiar with stacks and concatenative languages.
The data-stack is used to store values. All values have a type (the
language is dynamically typed), such as Number, Character, Atom or
List. Numbers and characters hopefully doesn't need to be explained here.
Atoms are similar to Lisp's atoms. They are mainly used to trigger the
execution of operations.
Lists are sequences of values (lists are themselves values, so nesting
lists is possible).
-- Examples of lists:
[1 2 3]
[42 foo :bar]
[]

-- Syntactic sugar allow the creation of strings
-- (strings are just lists of characters)
['a 'b 'c]
"abc"

Reserving operations

Reserving operations are used to "put aside" values from the data-stack.
Set-aside values can be retrieved by name (using an atom).
42 -> x -- 42 is not on the stack anymore. It's in `x'
x       -- 42 is copied from x. 42 is on the stack but another 42 is in x
<- x    -- 42 is moved from x to the stack. x is now empty (undefined)

Note

The syntax of reserving operations somewhat breaks the postfix nature of the
language. 42 -> x could arguably be represented as 42 :x -> making it
easier to manipulate reserving operations programmatically. There are several
reasons for this choice:

stack-manipulation operations use "visual hints" and 42 -> x fits well in
this while 42 :x -> doesn't (and looks more cluttered)
given the dynamic scoping nature of the language, things are error-prone
enough so encouraging magic via dynamic definition of words does not seem
right
reserving operations are different enough from regular operations to justify
a different syntax

-> x can be viewed as a single token, restoring a fully postfix syntax.
Maybe in the future I will add operations to create reserved values
programmatically, but it's not a priority.
Execution of operations and Homoiconicity

Lists can be executed, making Shirka homoiconic.
Making the language having this property is not gratuitous: Homoiconicity
makes possible to add constructs and powerful abstractions to the language,
helping a long way in making the language feel "high level".
As side-benefits, it also keeps the language small and simple, both in terms
of syntax and implementation.
-- Executing a list using the `!' operation

[3 2 +] !

-- The stack now consists of the value `5'.

Actually, a file containing Shirka code can be viewed as a list with implicit
brackets and an automatic ! operation.
[...]
-- Atoms
+ - * /
<< >> ><
foo bar
:foo :bar

-- Numbers
0 1 2 -1
3.14

-- Characters
'a 'b 'c

[...]
-- Stack manipulation

>>  -- duplicates the value at the top of the stack
<<  -- destroys the value at the top of the stack
><  -- swaps two values at the top of the stack

[using the data-stack]
[constants]
Scoping in Shirka is dynamic in nature. This is because lexical scoping,
given the set of features, would make closures possible and referenced objects
would lose their linear property.
42 -> val

(=> foo)
  [ 3.24 -> val
    [ val ] ]

foo puts --> 42

Exception handling

In "real" programs, things can go wrong. A language construct to recover from
errors is desirable, but problematic if the language is stack-based. Indeed,
when an exception occurs, the stack can be in an undetermined state. Some
intermediate values can clutter it, or some values may be missing. Shirka's
approach to the problem is to use a local stack on which an operation that
can fail is executed. If the execution succeeds, the local stack is sent back
to the "main" stack, and if it fails, some information about the error can be
placed on the stack. A symbol indicating the status of the computation is then
pushed so a following operation can take an appropriate action.
Defining Operations

Operations are just lists so there is no absolute need to define them in a
special way. However, it is possible to define them using => instead of
->. Operations defined this way are automatically executed when referenced.
"Looping" is achieved through recursion. It means that tail optimisation is
required by the language.
(=> loop-forever)
  [ [ op tail ] => tail
    => op
    tail ]

["hello" puts] loop-forever

Conditional execution uses standard (but intrinsic) operation calls.
(when)
  [ true  [ ... ]
    false [ ... ] ]

Syntax note

It is possible to simulate a prefix syntax using parenthesis:
foo bar
(bar) foo
-- The two above lines are equivalent.

It is particularly useful for operation definitions:
(=> foo) [
  -- operation body
]

Implementation note

Tail optimisation can be implemented by replacing the current operation on
the call-stack with the last operation in the list. There is a subtle
difference between a normal call and a tail optimised call: while in the
former, parent definitions cannot be altered, they can be in the latter.
Hard-to-find are likely to appear in there :)
Polymorphism

Parametric polymorphism can enhance readability, succinctness and flexibility.
It also encourages code reuse. All these properties help a long way in making
a language feel dynamic and high-level. In Shirka I currently three possible
ways to achieve this:

provide several intrinsic operations checking for type (e.g is_number?)
and let every operation perform the desired checks.
define operations for a type (similarly to OOP) and let the interpreter
dispatch based on the top-most stack value.
let the interpreter choose operations based on the type of multiple stack
values

The last option requires that all operations provide how many "arguments" they
need. I feel it wouldn't be a good fit.
The "OO-like" option is tempting but it's still unnatural to define such
operations given the current language.
The first one is not particularly elegant but it gets the job done and is not
too cumbersome as the user can't define new types. We'll choose this option
which is also simple to implement.
-- Following operations consume a single value and producean identical
-- value and a boolean.

Number?
Character?
List?
Atom?

Concurrency

The language discussed so far doesn't require a garbage collector and
arguably feels high-level. It is, however, impractical because there can't
be global, mutable state (amon other things).
This issue can be solved by bringing a concurrency model inspired from Erlang.
Processes

A process is an execution unit with its own stack. A process can be spawned
like this:
[ 42 puts ] !!

The operation !! will return immediately, leaving a reference to the spawned
process on the stack. Processes are special in Shirka in that way (they are
not linear). However, it does not make things as bad as dangling pointers &
co. because they are not "raw" references.
Messages

Data (and execution sequences!) can be passed between processes.
(-> foo)
  [ (receive)
      [ get [ ... ]
        set [ ... ] ]
    foo ]


asynchronous
synchronous

Let it crash

[...]
This concurrency model unfortunately brings back unpredictable pauses in
programs (the whole point of Shirka is to avoid that!). I believe this can be
solved by giving directives to the scheduler, for example "timeouts" would
force the scheduler to prioritise process that are near expiration times. I'm
still unsure about this.


Conclusion

impractical because of stack and dynamic scoping
stack can be mitigated using reserving operations
however the language solves the originally stated problem: avoiding GC
while still providing a "high level, script" feel.
topics not discussed: polymorphism