@lovely-error
Last active August 4, 2020 07:23
Klim PL Black Paper

Introduction

The Rant

Since the beginning of the second half of the twentieth century, computer science has gone through many discoveries and refinements (and still does), and the same transformations have been happening to programming languages. Dozens of them, of different flavors, have emerged for diverse purposes: from constructing artificial intelligence to translating mathematical formulas; there were some other wild applications too. Whether fortunately or not, the graveyard now holds many of these blessed souls that died because of unfitness (yep, DST is present here as well). As history unfolded, many better alternatives were offered, and since the 90s some of them (Java, Python, C++, C#, C, JavaScript) have held their place on the market's main stage, become the de facto standard in many areas, and received as much love as hatred. But while these have shown that complex functioning systems can be built, over the years of their existence many disadvantages were revealed. Java has crappy generics and relatively slow development time (mostly because of the poorly constructed syntax. Yikes!). C++ has acquired many features that were not originally intended for it, which turned the language into a stitched-together monstrosity in terms of both syntax and semantics; to say a word in its defense, its most famous trait is that it is among the fastest of the existing and supported langs. Python and JS obviously make you suffer because of their dynamic, loosely checked type systems, which make developing complex solutions hard or outright impossible; these two are also DEAD slow and need to be coupled with C++ to handle something like fast math operations (yeah, you heard it right), not to mention that the development/maintenance schemes for these awful crutches exhaust people in the profession. It may also simply be off-putting and unwanted: why know 2 languages instead of 1 when all of them are fundamentally identical and serve the purpose of instructing a Turing machine?

The Software Crisis

Over the last 20 years, rapid technological growth in the new digital age has caused software complexity to grow almost linearly and has drastically broadened code base heterogeneity. Code bases now consist of multiple interconnected frameworks, each many thousands of lines long, sometimes in several different languages. There is less supply and much larger demand.

The crisis manifested itself in several ways:

The main cause is that improvements in computing power have outpaced the ability of programmers to effectively utilize those capabilities. Various processes and methodologies have been developed over the last few decades to improve software quality management, such as procedural programming and object-oriented programming. However, software projects that are large, complicated, poorly specified, and involve unfamiliar aspects are still vulnerable to large, unanticipated problems. Not to mention that even these complexity-reducing techniques have turned out to have limitations of their own.

Multitude of Solutions

As a counter to these problems, multiple approaches to the discipline of applied programming languages have been proposed. To name a few:

  • SD Paradigms
  • Software metrics
  • Software quality
  • Contract programming
  • Multiple visualization techniques of software structure
  • Automation of certain aspects (automatic documentation, ML-driven doc-gen, semantic-aware system architecting assistant, auto test, and such)

In our view, none of these is mature enough, and some are not present at all, in the major programming languages and Integrated Development Environments.

What We Offer

The need for a revision has clearly arisen. Our language, Klim PL, aims to address all of the named problems as well as many others that remain behind the scenes.

  • Functions, Objects, Variables, Categories, Actors, Typespaces, Structural conversion, ADTs both named and unnamed, Operators, Macros
  • Transition system
  • AST introspection
  • Better (better than C++🤮) unified operational semantics
  • Linter rewriting. Write less code and let the compiler supply the missing details
  • Aspect-oriented programming (mostly as a consequence of the type system and std library)
  • Cross-language code (bridging for now, a migrator eventually)
  • Easily extensible syntax (write your own compiler extensions, connect them through unobtrusive type flags)
  • Multi-threading without alien constructs
  • Better interaction with OSs (all operations are safe by default)
  • Memory safety, multiple cross-code guarantees
  • Virtual memory instead of raw memory
  • The New IDE (with auto-gens named oracles)
  • Numeric coercion
  • Execution graph traversal and variable passing analysis formalisms
  • Explicit mutability annotation
  • Automatic runtime parallelization
  • Zero-overhead resource management
  • Intermediate layer to interact with OS services
  • Multiple inheritance without diamonds (suppress/route/override/hide keywords, use & promote inheritance directives)
  • Categories allow putting constraints/requirements on all entities in the language: actors, objects, even packages
  • Klim's categories support complexity requirements (big O notation) for speed and size
  • Klim automatically traces applied categories and, when one is violated, emits errors and recommendations. For example, it might suggest that you implement a function as throwing in order to operate on some types.
  • Optionals are traced through the execution graph to confirm the validity of nullable behaviour and remove run-time checks
  • Klim uses a custom container to store source code. It can contain source files, karl representation, resources (.png, .json, etc.), stored constexprs, compiler extensions and other libs; encryption of files must be supported, as well as access restriction
  • Supports a custom serialization format, rezus/bire (BInary REpr), which can store bit patterns of values
  • Architecture for compiler plugins
  • Due to some specific data flow peculiarities, the lang noticeably relies on two software-engineering patterns: the Coordinator pattern and the Spawner pattern
  • Proactive IDE analysis of the execution time of functions
  • Names with escaping literals ( var some variable = ... )
  • Two compilation strategies: osize, to aggressively minimize the size of a program at the cost of speed, and ospeed, to maximize execution speed at the cost of increased size. The cool thing is that the appropriate strategy can be picked automatically based on information retrieved from the target platform (memory amount and CPU score; these are provided by the compiler, which runs some benchmarks during installation and compares the size of specialised program code against unspecialised)

Noticeable Differences

Klim is influenced by many features from multiple languages; the most notable are Pony with its ownership system and Lisp's everything-is-an-expression philosophy, along with many others: Plaid, Erlang, Haskell, and a bunch of academic papers that have been collecting dust since the 2000s.

"In short, it is a language for the post-C era"

What You Gonna Have

Ownership system

For more than 50 years the problem of shared mutability has been addressed with some pathetic methods such as locks and atomics; the former do not by themselves guarantee correct behaviour, and the latter can be pretty awkward. Our lang provides a subtyping capability (stolen from the Pony lang) for every type in order to allow sound ownership. The work behind Pony argues that, in order to construct automatically correct multi-threaded programs without any locks and with parallelization, each type should expose 6 subtypes of mutability. (Actually there are more, depending on circumstances.) The type relationship is represented by the following hierarchy:

  • T Metatype (contains only static members)
  • iso T (only the owner can read from and write to it)
  • trn T (only one item in scope can write to it, many can read from it)
  • val T (no one can write to it, anyone can read from it)
  • ref T (anyone can read from and write to the object, but only within the owner's scope. The object can never escape the scope in which it was declared)
  • box T (anyone can safely read from it)
  • tag T (this is the metatype)

There is also a prefix to these subtypes: ~. It is used with generics and extensions and means "visible as". There is also a subtype range, written iso..tag, which covers all subtypes from iso to tag. For example, an object declaration in klim looks like this: object Box {} It has type iso..tag T, which means it can be instantiated with any mutcap (mutation capability).
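Klim has no public toolchain to run, so here is a rough Python model of the list above, just to make the read/write permissions and the iso..tag range concrete. The names `Cap`, `PERMISSIONS` and `cap_range` are made up for this sketch, and treating tag as neither readable nor writable is an assumption borrowed from Pony, not something stated above.

```python
from enum import Enum
from typing import List


class Cap(Enum):
    """Mutability subtypes, in the order they are listed above."""
    ISO = "iso"
    TRN = "trn"
    VAL = "val"
    REF = "ref"
    BOX = "box"
    TAG = "tag"


# (holder may read, holder may write), transcribed from the descriptions above.
# The tag row is an assumption taken from Pony.
PERMISSIONS = {
    Cap.ISO: (True, True),
    Cap.TRN: (True, True),
    Cap.VAL: (True, False),
    Cap.REF: (True, True),
    Cap.BOX: (True, False),
    Cap.TAG: (False, False),
}

ORDER = list(Cap)  # Enum iteration follows definition order


def cap_range(lo: Cap, hi: Cap) -> List[Cap]:
    """Expand a range such as iso..tag into the capabilities it covers."""
    start, end = ORDER.index(lo), ORDER.index(hi)
    if start > end:
        raise ValueError(f"empty range {lo.value}..{hi.value}")
    return ORDER[start:end + 1]


def can_instantiate(declared: List[Cap], requested: Cap) -> bool:
    """An object declared with a range can be created with any mutcap inside it."""
    return requested in declared
```

With this model, `object Box {}` corresponds to `cap_range(Cap.ISO, Cap.TAG)`, so `can_instantiate` accepts every mutcap for it.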

Mathematically Predictable State

Rich ownership annotations contribute to klim by enabling always-traceable state, which opens a whole new world of optimizations, such as 0-cost parallelization.

Reachable State Analysis

UNDER CONSTRUCTION. WAIT.

Entity flags

Each expression in the language has a spot for setting entity flags. These have the following syntax: '-' name | '-' name ':' argument. They are used to specify certain behaviors and features. For example, objects have the -implemented: ... flag, which indicates that an object inherits from certain objects, and a function can have the -no_side_effects flag, meaning it must not mutate anything outside of its scope. Here they are:

  • -non_supressable a member marked with this flag cannot be suppressed
  • -non_routable functions marked with this flag cannot be routed
  • -seald functions and behaviours marked with this flag cannot pass their arguments to an outer scope
  • -stateless objects marked with this flag must not have interaction between variables and functions/behaviours
  • -no_side_effects functions and behaviours marked with this flag cannot reference objects outside of their scope
  • -type_member TBR? type members cannot be modified, thus can be visible through a tag reference
  • -instance_member TBR? instance members can be augmented with route and suppress
  • -non_extendable this flag forbids type extension
  • -final final objects cannot be subclassed
  • -no_structural_construction an object that should always be initialized through a related type constructor
  • -designated_initializer attribute for a function
  • -returns
  • -throws
  • -compiler_insrinsic
  • -implemented: using | promouting
  • -cases
  • -satisfies
  • -bridges
  • -numeric_record objects marked with this flag should contain only numeric variables, which lets the compiler optimize them for numeric computation
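To make the flag grammar ('-' name | '-' name ':' argument) a bit more tangible, here is a small Python sketch of a parser for a flag list. It is not the klim parser: `parse_flags` is a made-up helper, and it assumes that the colon is directly attached to the flag name and that an argument never contains a whitespace-separated token that itself looks like a flag.

```python
import re
from typing import List, Optional, Tuple

# A token that opens a flag: '-' name, optionally followed by ':'.
FLAG_START = re.compile(r"^-([A-Za-z_][A-Za-z0-9_]*)(:?)$")


def _finish(current) -> Tuple[str, Optional[str]]:
    name, takes_argument, tokens = current
    if takes_argument and not tokens:
        raise ValueError(f"flag -{name}: expects an argument")
    return (name, " ".join(tokens) if takes_argument else None)


def parse_flags(text: str) -> List[Tuple[str, Optional[str]]]:
    """Split a flag list into (name, argument) pairs; argument is None if absent."""
    flags = []
    current = None  # [name, takes_argument, argument_tokens] of the flag being read
    for token in text.split():
        match = FLAG_START.match(token)
        if match:
            if current is not None:
                flags.append(_finish(current))
            current = [match.group(1), match.group(2) == ":", []]
        elif current is not None and current[1]:
            current[2].append(token)  # part of the argument of the open flag
        else:
            raise ValueError(f"unexpected token {token!r}")
    if current is not None:
        flags.append(_finish(current))
    return flags
```

For example, `parse_flags("-implemented: using Gas -final")` yields the pair `("implemented", "using Gas")` followed by `("final", None)`.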

Variables

Declaration pattern

'var' name ':' type flags*

We'll use a simpler notation for now.

Every variable is declared with the keyword var, no matter whether it is mutable or not, because mutability is indicated by various annotation keywords.

var epoch = "Y2K"

There is type inference, so you don't need to write types explicitly. The linter will add the missing bits when you hit that compile button ;) The expr above evaluates to

var epoch: val String.UTF8View = "Y2K"

Another example:

var 'mocha is js': iso Number.Signed.Int32 = 143;

Don't be afraid of the superfluous type decl part. You can declare name aliases, which the linter will expand.

This is the same code before translation:

var 'mocha is js': SInt32 = 143;

Vars can have entity attributes. This particular one is named a computed property:

var 'indirect var': val Number.Signed.Float16 -property: {
	get: {return 0.0}
}

Optionality

Many languages support optionality in the form of ADTs, but we decided that it would be better to bake it into the core of the type system. Optional values are marked with ? following the type name:

fn `make new int`(from `code point sequence`: box~ |Element: box~ `Code Point`|Collection) -returns: iso SInt32?

When an expression contains optional values, it can be paired with the or operator to cover the situation when it produces nil.

<some nil producing expr> `or` <another expr if the first one is evaluated to nil> 
SInt32.`make new`?(from: "123") or ();
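For readers who think better in a runnable language, the or fallback can be approximated in Python like this. `make_new_int` and `or_else` are hypothetical stand-ins, None plays the role of nil, and evaluating the right-hand side lazily is an assumption of the sketch (the text above does not say whether klim does).

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def make_new_int(text: str) -> Optional[int]:
    """Stand-in for SInt32.`make new`?: returns None (nil) when parsing fails."""
    try:
        value = int(text)
    except ValueError:
        return None
    # Reject values that do not fit into a signed 32-bit integer.
    return value if -2**31 <= value < 2**31 else None


def or_else(value: Optional[T], fallback: Callable[[], T]) -> T:
    """Mimics `<expr> or <expr>`: the fallback runs only when value is nil."""
    return value if value is not None else fallback()
```

Note that Python's built-in `or` is not a drop-in replacement here, since it would also discard a perfectly valid 0; the explicit `is not None` check avoids that.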

Objects

Declaration pattern

'object' name genericdecl attributes* body?

Objects can contain instance members (fields and methods) and type members (consts and functions).

object Box {
	var value: val Number.Signed.Int32
}
var `mocha is js` = iso Box{value=32}.value

Because each type is governed by the mutability subtyping relation, it is necessary to indicate its lowest and highest bounds. In the example above there is no explicit type range specification because it doesn't matter: since all val members are embedded into the metatype, any iso..tag instantiation will have the value member. If instead you needed a Box type that holds a value whose mutability depends on the instance, you would write it like this:

object iso..tag Box {
	var value: this Number.Signed.Int32
}

Now the mutability of the value field depends on the mutability of the type instance.

var mut = iso Box{value=0}
mut.value += 1 //ok
var nonmut = val Box{value=0}
nonmut.value += 1 //error
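The same behaviour can be imitated in Python with a runtime guard, which may help to see what the this annotation buys you. This is only an analogy: klim is meant to reject the bad write at compile time, while the sketch below raises at run time, and `CapabilityError` and `WRITABLE` are names invented for the example.

```python
class CapabilityError(Exception):
    """Raised when a write goes through a capability that forbids writing."""


# Capabilities whose holder may write, following the list in the Ownership section.
WRITABLE = {"iso", "trn", "ref"}


class Box:
    def __init__(self, cap: str, value: int):
        # Bypass the guard below while the instance is being constructed.
        object.__setattr__(self, "cap", cap)
        object.__setattr__(self, "value", value)

    def __setattr__(self, name, new_value):
        # The field inherits the capability of the instance (`this` in klim).
        if self.cap not in WRITABLE:
            raise CapabilityError(f"cannot write {name!r} through a {self.cap} Box")
        object.__setattr__(self, name, new_value)
```

So `Box("iso", 0)` accepts `value += 1`, while the same statement on `Box("val", 0)` raises `CapabilityError`.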

Inheritance

The lang supports diamond-free multiple inheritance. Suppose you're a game designer and have two types, object 'Collision Responder' and object 'Reflection Responder', and you want to model object 'Glass Object', which can be broken and reacts beautifully to light producers in the scene. In conventional single-inheritance OO programming languages you would need to implement some sort of delegation, which is less appropriate because delegation does not promote the properties of a delegate; interfaces solve only part of the problem because they cannot have state and, in many languages, cannot have default implementations either. In klim you can just write this:

object iso..tag 'Glass Object' -implemented: using this 'Collision Responder' & this 'Reflection Responder'

You can also control each member of a type with precision. For example, if you don't want a type, say Glass Object, to expose certain members of its ascendants, you can use the hide and suppress keywords to disable further propagation.

... {
	//no use is allowed
	suppress self.`Collision Responder`.methodName(argName:argName2:);
	//no descendant of Self can use this method, still can be used internally
	hide self.`Reflection Responder`.fieldName;
}

You override the behaviour of methods by using the route keyword.

You can mix in as many classes as you want, but there are certain peculiarities: there is a distinction between subclassing extension (extending) and mixin addition (using).

  • using forbids the usage of all members of the inherited type as such; all members are referred to with the self keyword
  • extending allows it, but all members are accessed through the super keyword

One thing to reassure you: you will rarely use extending inheritance ;)

Type' Cases

All methods of an object can be declared in cases. A case can contain only methods.

object Helium -implemented: using Gas {var name: val = "Helium";}
case Liquid of not Gaseous Helium {fn evaporate()[self -> Gaseous]{...};}
case Gaseous of not Liquid Helium {fn condensate()[self -> Liquid]{...};}

object iso Condensator with |Substance: Gas|{
	var substance: some this Gaseous Substance;
	fn condensate()[self.substance -> Liquid]{
		self.substance.condensate();
	};
}
var `gaseous helium`: iso = Helium{};
var `helium condensator`: iso = Condensator{
	substance = take `gaseous helium`;
};
`helium condensator`.condensate()

Even though type cases look similar to categories, the former can change at runtime, while the latter cannot.
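If the case syntax feels foreign, the Helium example can be read as a typestate machine. Below is a rough Python rendering of it, under the assumption that a case is simply a tag on the instance that gates which methods may be called; the `CaseError` exception and the `_require` helper are made up for the sketch, and the check happens at run time here, whereas klim presumably verifies the [self -> ...] transitions statically.

```python
class CaseError(Exception):
    """Raised when a method is called outside the case that declares it."""


class Helium:
    name = "Helium"

    def __init__(self, case: str = "Gaseous"):
        self.case = case  # current case of this instance; it changes at runtime

    def _require(self, case: str) -> None:
        if self.case != case:
            raise CaseError(f"{self.name} is {self.case}, expected {case}")

    def evaporate(self) -> None:
        """Declared in case Liquid; transition self -> Gaseous."""
        self._require("Liquid")
        self.case = "Gaseous"

    def condensate(self) -> None:
        """Declared in case Gaseous; transition self -> Liquid."""
        self._require("Gaseous")
        self.case = "Liquid"


class Condensator:
    def __init__(self, substance: Helium):
        substance._require("Gaseous")  # the field is declared as a Gaseous substance
        self.substance = substance

    def condensate(self) -> None:
        self.substance.condensate()
```

After `Condensator(helium).condensate()` the helium instance is in the Liquid case, and a second `condensate()` on it is rejected.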

Typespaces

In order to provide an effective type system with dependent types as well as mutability subtyping, we decided that there should not be any freestanding types, but rather a certain space, like C++ namespaces, where all types live. This approach is called Type Family. In klim, TFs also help to establish type state and type transition semantics. (We also decided that the term dependent types is loose, and instead we refer to this concept as Rigour Typing :) ). Families are declared with typespace. Take a look:

typespace Number {
	object iso.. 'Binary Number Value'
	-private
	-implemented: using this `Memory Buffer`.`Raw Data` {...};
	object iso.. Signed {...};
	object iso.. Int32 
	-implemented: using this Self.`Binary Number Value` {...};
}

typespaces are needed to scope types so that Self can be used to refer to them. To further describe the usability of ts, let's continue the simulation theme :) . With this strict behaviour of ts, it becomes possible to pass only valid types to functions. Consider this:

fn 'fill'(texture: iso Texture.Fillable)
-no_side_effects {...}

Notice how the type family bound here prevents passing the wrong types to this function. Another thing to notice is the -no_side_effects entity flag, which ensures at compile time that the function does not mutate objects outside of its scope; and because the function takes ownership of the object, certain optimizations become possible.

Enumeration

Declaration pattern

'enum' generic_decl? name -attribute*

Lang supports two forms of ADT: tagged unions and anonymous unions. TUs are discriminated unions whose inhabitants are referred to by named tags.

enum iso.. |A: this.. Any, B: this.. Any|Either -states: {a: A,b: B} //tagged
enum iso.. |A: this.. Any, B: this.. Any|Either -states: {A,B} //untagged

var _ = iso|A: SInt32, B: SInt16|Either{a=0}
var _ = iso|A: SInt32, B: String.UTF8View|Either{"I am B"}

In this example all inhabitants must have the same mutability subtype as Either or a lower one. It is a violation to store an iso member in a val enum instance. Trivial stuff u know ;-)

When working with sum types, you can use the if var, match, or or patterns to access the fields of an enum instance.

var a: iso = |A: SInt32, B: String.UTF8View|Either {"I am B"}
if var temp = a as String.UTF8View { ... }
match a {
	case var str = $ as String.UTF8View { ... } => {
		...
	}
	otherwise => {...}
}
print(a.`in upper case`?) or print("This is not a string :0")

Type Extensions

Klim adopts the amazing retroactive modeling paradigm, which implies that new members can be added to existing types. Consider this trivial declaration: object Person. There is not much it can do, but you can add more to it later. Let's add a greeting function and a default name.

extend iso.. Person {
	var `default name`: val String.UTF8View -type_member -property: {
		get: {
			match self.sex {
				case $ is .male => {return "John Doe"};
				case $ is .female => {return "Maryll Abrahams"}
			}
		}
	};
	fn `greet the world`() -type_member {
		#print: "Hi! I am \(self.`default name`).";
	}
}

Notice the -type_member attribute, which is the opposite of the -instance_member attribute. The first ensures that a member is part of the metatype, while the second makes members part of dynamically changeable instances.

Recall the example that involved the numeric type family. That one can be extended as well:

extend Numeric.`Binary Number Value` {
	var `zero value`: iso Self -type_member -property: {
		get: {return self.`make new`?(`with exact value`: 0) 
		or `fatal error`("Cannot create a numeric instance from \(self)")}
	}
}

Now all descendants of Binary Number Value provide a way to get an instance that is initialized to 0. Neat :)

Categories

Categories are used to ensure that language entities have certain properties. They can be applied to packages as well. In short, a category consists of groups of first-order predicates whose requirements must be met in order to prove program correctness. It operates on the AST and on runtime values. If it can prove correctness at compile time, it does so. If not, the error must be handled explicitly.

category 'Not greater than 10 number'{
	applicant $;
	$.value < 10;
}
fn `picky func`(_ arg: box~ |N|) with |N: Number.`Binary Number Value` -satisfies: `Not greater than 10 number`|

The compiler should prove that this statement is respected. If it cannot do so at compile time, it defers the check until invocation, in which case the user must explicitly handle the potential error. There is more to it: it is possible to inspect the AST.
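The deferred half of that story (the check moves to invocation and the caller handles the failure) can be sketched in Python with a decorator. This covers only the run-time fallback, not the compile-time proof; `satisfies`, `CategoryViolation` and `below_ten` are names invented for the example, and the predicate mirrors the `$.value < 10` line of the category.

```python
import functools
from typing import Callable


class CategoryViolation(Exception):
    """Raised at invocation when an argument fails the category predicate."""


def satisfies(predicate: Callable[[object], bool], name: str):
    """Attach a category to a one-argument function as a check at call time."""
    def decorate(func):
        @functools.wraps(func)
        def checked(arg):
            if not predicate(arg):
                raise CategoryViolation(f"{arg!r} does not satisfy {name!r}")
            return func(arg)
        return checked
    return decorate


# Mirrors the predicate `$.value < 10` from the category above.
below_ten = satisfies(lambda n: n < 10, "below ten")


@below_ten
def picky_func(n: int) -> int:
    return n * 2
```

A caller that cannot guarantee the predicate up front wraps the call in `try`/`except CategoryViolation`, which is the explicit handling the paragraph above asks for.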

category `Without fatal errors` {
	applicant func;
	func.`ast view`.type == "fn";
	func.`ast view`.body.`find first`?(where: fn{ $0 == Function(named: "fatal error") }) == nil;
}
fn `never gonna let you down` () -satisfies: `Without fatal errors` {}
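Python happens to ship an `ast` module, so the same idea can be tried out there: treat a category as a set of predicates over the syntax tree of the applicant. The sketch below is an analogy rather than klim semantics; `satisfies_without_fatal_errors`, `GOOD` and `BAD` are made up, and only direct calls to a plain name `fatal_error` are detected.

```python
import ast

GOOD = "def never_gonna_let_you_down():\n    return 1\n"
BAD = "def oops():\n    fatal_error('boom')\n"


def satisfies_without_fatal_errors(source: str) -> bool:
    """Check both predicates of the category against a piece of source code."""
    tree = ast.parse(source)
    # Counterpart of func.`ast view`.type == "fn": a single function definition.
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.FunctionDef):
        return False
    # Counterpart of the `find first` predicate: no call to fatal_error anywhere.
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "fatal_error"):
            return False
    return True
```

Here `satisfies_without_fatal_errors(GOOD)` holds, while `BAD` is rejected because its body calls `fatal_error`.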

Structural type conversion

You might have noticed that, since each descendant of any type can differ not only by subtyping but also by rigor typing, the transformation rules were not clear up to this point. This section fills those voids :) Suppose you have two types that both have two identically named fields with identical (or compatible) types:

object A {var a: val SInt16; var b: val SInt32}
object B {var a: val SInt16; var b: val SInt32; var c: val UInt32}
//let's instantiate them
var a: iso = A{a=0,b=0}
//now you can transform A into B because their shared fields are structurally identical
var b = a as iso B{c=0}
//you must explicitly initialize missing fields
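Here is roughly the same conversion written with Python dataclasses, as a sketch of the rule "copy the same-named fields, demand the missing ones". `structural_cast` is a hypothetical helper; it matches fields by name only and does not check that the field types are compatible, which a real implementation would have to do.

```python
from dataclasses import dataclass, fields


@dataclass
class A:
    a: int
    b: int


@dataclass
class B:
    a: int
    b: int
    c: int


def structural_cast(value, target, **missing):
    """Build `target` from the same-named fields of `value`.

    Fields that `value` lacks must be supplied explicitly via `missing`.
    """
    source = {f.name: getattr(value, f.name) for f in fields(value)}
    wanted = [f.name for f in fields(target)]
    absent = [n for n in wanted if n not in source and n not in missing]
    if absent:
        raise TypeError(f"must explicitly initialize: {', '.join(absent)}")
    return target(**{n: missing.get(n, source.get(n)) for n in wanted})
```

`structural_cast(A(a=0, b=0), B, c=0)` plays the role of `a as iso B{c=0}`; leaving out `c` raises a `TypeError`, and casting a `B` down to `A` simply drops `c`.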

Structural conversion is needed to address the multiple, potentially contravariant rigor types. So, for example, you can cast an instance (if you own it) to another type.

//let's say you have the fn from above and some texture
fn read(texture: box Texture.Atlass)
var texture: iso = Texture.Renormalized{...}
//but you got a different type, so you cannot pass it into the func
//luckily you own this instance, so you can cast
fill(texture: texture as Texture.Atlass{})

//since conversion from Texture.Renormalized to Texture.Atlass does not expose uninitialized fields, no such initialization is required

Casting becomes particularly interesting when it comes to structural rather than nominal conversion, because it is possible to cast an instance to an anonymous type:

object Person -implemented: using this `Homo Sapience` {...}
var `john`: iso = Person{...}
var `sentient creature`: iso = john as? object {
    var `intelegence index`: iso Number.Signed;
    var `lifetime span`: val Number.Signed;    
}
//Note that in the example above the original type of `john` is erased at cast.
//So if you want to revert it to Person, you'd need to fill in the missing fields,
//like name, average salary ;), etc 

On the contrary, when a variable is declared with the dyn keyword, a whole new world of possibilities opens:

object Person -implemented: using this `Homo Sapience` {...}
var `john`: dyn iso = Person{...} //ii is something like 10 points
var `sentient creature`: iso = john as? object {
    var `intelegence index`: iso Number.Signed;
    var `lifetime span`: box~ Number.Signed;    
}
`sentient creature` -> {
    $.(`intelegence index` as? Int32) += $.`get relative ii baseline` or ();
}
var `back to john` = `sentient creature` as? Person{}
//notice that even though this instance has gone through a conversion,
//because it is declared dynamic it retains its original type, 
//with which it was born

It is worth saying that not all types are trivially convertible; some must be constructed through their respective constructor. That is possible too: all types that want such behaviour must be attributed with -no_structural_construction.

Type manipulation

The problem that was haunting me for some time about the expressivity of PLs is the inability to change type representation at runtime. While it is possible to check and change stuff at compile time, if a program wants to change types based on information that is unknown at compile time, there is almost nothing on offer. Klim provides that. Any code that manipulates types must be explicitly marked with the meta keyword.

var type: meta iso `HTML Parser` with |`Valid Elements`: enum {String, L?}, Resolver: _||L: Any|;
specialize type to |Resolver: `Custom Resolver`|;
fn func () -returns: meta iso Box with |Value: _|{}
fn inject (
	type: meta |T|, //any type
	into target: meta iso Result with |Success: _, Error: iso |`Some Error`|| //only unique types are valid
) with |`Some Error`: Exception, T: Any| 
-returns: meta iso Result with |Success: T, Error: _| {
	return specialize target to |Success: type, Error: _|;
}
var sl: ref meta = inject (type: meta iso String, into: meta Result)
//sl now has the type (meta iso Result with |Success: String, Error: _|)
//to use it as a regular type, it is necessary to fully specialize it
//ideally this should be separated automatically into either runtime eval 
//or ct eval

var `meta result`: iso meta = Result with |Success: _, Failure: _|
specialize `meta result` to |Success: String, Failure: Nothing|
`meta result`{success="Everything's fine"}
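A very rough way to picture partial specialization is a record of type parameters in which `_` marks a hole that has not been filled yet. The Python sketch below models only that bookkeeping; `MetaType`, `HOLE` and this `inject` are inventions of the example, and `type(None)` stands in for Nothing.

```python
from dataclasses import dataclass
from typing import Dict, Optional

HOLE = None  # plays the role of `_`: a parameter that is not chosen yet


@dataclass
class MetaType:
    name: str
    params: Dict[str, Optional[type]]

    def specialize(self, **chosen: type) -> "MetaType":
        """Return a copy with some holes filled; the original stays untouched."""
        unknown = set(chosen) - set(self.params)
        if unknown:
            raise KeyError(f"{self.name} has no parameters {sorted(unknown)}")
        return MetaType(self.name, {**self.params, **chosen})

    def is_fully_specialized(self) -> bool:
        """Only a type without holes can be used as a regular type."""
        return all(p is not HOLE for p in self.params.values())


def inject(type_: type, target: MetaType) -> MetaType:
    """Counterpart of the inject function above: fill only the Success slot."""
    return target.specialize(Success=type_)
```

`inject(str, MetaType("Result", {"Success": HOLE, "Failure": HOLE}))` still has a hole in Failure, so it is not usable as a regular type until a second `specialize(Failure=...)` fills it.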

Functions

You get the standard stuff: loops, branching, and sequential execution. This triple is introduced with the respective constructs:

if <expr> { <expr> } else { <expr> }
while <expr> { <expr> }
<consecutive exprs :)>

Func decl:

'fn' generic_decl? name argument_decl attributes* body
fn `greet us!`(times: box UInt8){
	if times > 5 {
		print("Wow, that's weird :|")
	}
	for _ in 1..times {
		#print: "Hi"
	}
}

Argument subtyping: You have already seen how complex objects and inheritance are. This is the time to complicate things even further! Consider a func with this signature:

fn negate(_ argument: iso SInt32)

Looks simple, right: take a uniquely owned signed integer of bit length 32. But we want this to work on any signed integer descendant, not just this one. The solution is the some keyword.

fn negate(_ argument: some iso Numeric.Signed)

Now this function can be applied to any type that is a descendant of the Numeric.Signed object. The body of the function, however, receives a trimmed value: the concrete type of the argument is erased. If there is a need to retain the actual type, the dyn keyword must be used.

fn some(_ argument: dyn iso Numeric.Signed)

dyn is mostly used to pass around type information. It is also possible to constrain the type of an argument with a structural object:

fn |T: object {var age: Numeric.Unsigned -satisfies: `Non zero number`}|
`take anything that has an age field`(_ arg: some box T)

This func takes any object that has an age field that is greater than zero. As you can see, code can be completely self-documenting. This is used by the (currently planned) ML-driven docgen :)

Higher-orderness is present:

extend iso..box Collection with |Element: T||T: Any|{
	fn map (_ transformer: fn (element: box~ Self.Element) -returns: iso |T|) with |T: AnyNominal| 
	-returns: some iso |Element: iso T|Collection
}

Complexity is considered to be part of a function type among other signature forming properties

Actors

Parallel code in klim is modelled by actors; these islands of sequentiality in the ocean of concurrency provide great opportunities for developers by keeping intricate details under the hood and offering a relatively simple model. For example, the compiler can automatically optimize your concurrent code by tracking various runtime metrics, leading to more performant solutions without developer intervention. Some call these stateful functors :). Messaging between actors should be implemented with atomics and transactional memory. There is an intent to implement message queues as fixed-size arrays or vectors of pointers to functions, which are to be specified by call sites. Further investigation into sequential calculus is required.

'actor' name attributes* generic_decl? body 

Since actors are not allowed to expose their internal state, the only valid variable type for holding an instance of one is tag.

actor `HTTP Server` with |`Request Type`: enum -states: {String, Int}| {
	bh `receive request`(_ request: `Request Type`)
}
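For a feel of the model, here is a minimal actor in Python: a private mailbox, one worker thread that handles messages strictly one at a time, and a behaviour that only enqueues. It swaps in `queue.Queue` (which uses locks internally) for the atomics and transactional memory mentioned above, so treat it as an illustration of the programming model rather than of the intended implementation; `HttpServerActor` and its `stop` method are made up for the sketch.

```python
import queue
import threading


class HttpServerActor:
    """Handles one message at a time on its own thread; state stays private."""

    _STOP = object()  # sentinel message that shuts the actor down

    def __init__(self):
        self._mailbox = queue.Queue()
        self._handled = []  # internal state, never handed out directly
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def receive_request(self, request):
        """Behaviour: enqueue the message and return immediately."""
        self._mailbox.put(request)

    def stop(self):
        """Drain the mailbox, stop the worker, and return a copy of the log."""
        self._mailbox.put(self._STOP)
        self._thread.join()
        return list(self._handled)

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            self._handled.append(f"handled {message}")
```

Callers only ever hold the actor object and send it messages, which is the Python-shaped equivalent of holding a tag reference.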

Metaprogramming

AST Introspection

TODO

Syntax Rewriting

TODO

  • All syntax rewriting happens at compile time
  • There should be a frontend for manipulating syntax in a very convenient way
  • Macros can contain custom syntax, but should always translate it into klim code