@PhilipWitte
Last active December 13, 2015 20:59
The "Perfect" Programming Language

Assertions

  1. The easier to learn, the better.
  2. Best if easy to mentally parse "at a glance".
  3. Easy to understand structure "at a glance".
  4. Easy to type.

Conclusions

1.a) The number of syntactical symbols and keywords should be minimized.

1.b) The keywords and syntax should be consistent. Syntax rules should be minimized.
     Syntax which expands well (supports advanced features similarly) is ideal.

1.c) The Standard-Lib names should be descriptive and non-ambiguous.

1.d) Common operations are mapped to common words. The more universally
     understood words are top priority (eg. "write" vs "echo").

2.a) The syntax should have a distinct signature, where (ideally) each separate
     concept/entity is semantically separate.

2.b) Syntax sugar is allowed, but sparingly, only when the cost of learning the alternative
     doesn't outweigh the annoyance of definition stuttering or lengthy expressions.
     
     Syntax which can represent sugar in a similar way to non-sugar code AND is consistent
     in sugar rules for all areas is ideal.

3.a) Variable types should, ideally, be seen "at a glance" where they're defined.

3.b) There should (ideally) be ONLY ONE way to express something in both the language
     and the standard-lib.

3.c) Symbols should hold consistent meaning and be visually distinct.

3.d) Scope bodies should have both beginning and ending marks, as white-space is more
     difficult to distinguish, and it's conceptually consistent with parameter definition.
     
     (NOTE: This point is highly debatable!)

4.a) Abbreviations are allowed, but only when not ambiguous (see 4.c)

4.b) Operators are allowed, but only mathematical ones, since others are not commonly
     understood and would be better written as functions with deducible names.

4.c) Abbreviations should only be used for very commonly used things, and only when they're
     unambiguous. For instance, "var" & "int" instead of "variable" & "integer" make sense,
     because the cost of learning them doesn't outweigh the annoyance of typing the longer
     names all over the place.
     
     However, library functions & types should have descriptive names, since they're not used
     everywhere consistently like keywords. That doesn't mean they can't have abbreviations, it
     only means they should be very clear as to what they do, and readability should take
     priority over writability, especially since you have the ability to alias names locally,
     and IDE Code-Completion helps with spelling.

Examples

Hello World:

func main() {
    Console.write("Hello, world!")
}

Types, and Type aliasing:

type Foo {
    ...
}

type Bar = Foo

func main() {
    var foo : Foo # foo is type Foo
    var bar : Bar # bar is type Foo
}

Type init/var defaults:

type Person
{
    var name : text
    var age : int
    
    init {
        name = "Unknown"
        age = 0
    }
}

type Person {
    var name : text = "Unknown"
    var age : int = 0
}

type Person {
    var name = "Unknown"
    var age = 0
}

Named init:

type Company
{
    var worth : Dollars
    var employees : uint
    
    init new(worth, employees) {
        this.worth = worth
        this.employees = employees
    }
    
    init load(path) {
        # Load Company data from path
    }
}

func main() {
    var c = Company.new(200_000, 24)
    var c = Company.load("data.xml")
    
    # or like this:
    
    var c : Company
    c.new(4000, 3)
    c.load("data.xml")
}

Parameter & 'this' sugar:

type Foo {
    var a, b, c = 0
    
    init new(.a, .b, c) {
        .c = c
    }
}

func main() {
    var f = Foo.new(1, 2, 3) # sets a, b, & c
}

Func parameter deduction and specialization:

func foo(x)         # 'x' is deduced
func foo(x:int)     # 'x' can only be 'int'
func foo(x:int|dec) # 'x' can be either 'int' or 'dec'

# examples...

func foo(x) {
    return x + 5
}

func main() {
    var i = 0
    var d = 0.0
    var t = "text"
    
    foo(i) # works
    foo(d) # works
    foo(t) # ERROR: foo.x requires an internal: 'func + (int)'
}
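
As a rough analogy only (Python, not the proposed language), the deduced and restricted parameter forms might be approximated as below. Python checks at run time rather than at compile time, so the failure shows up as a TypeError instead of a compiler error. The names `foo_any`, `foo_num` and `accepts` are hypothetical, and `float` stands in for 'dec'.

```python
from typing import Union


def foo_any(x):
    # 'x' is deduced: anything that supports `x + 5` is accepted
    return x + 5


def foo_num(x: Union[int, float]):
    # 'x' restricted to int|dec; Python does not enforce annotations,
    # so the restriction is checked explicitly here
    if not isinstance(x, (int, float)):
        raise TypeError("x must be int or float")
    return x + 5


def accepts(func, arg):
    # True if the call succeeds, False if it is rejected with a TypeError
    try:
        func(arg)
        return True
    except TypeError:
        return False
```

Here `accepts(foo_any, "text")` is False because text has no `+` that takes an int, which mirrors the error in the example above.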

Tuples can group operations:

type Person {
    func greet() { ... }
}

func main() {
    var andrew = Person
    var philip = Person
    
    (andrew, philip).greet() # greet both
}
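
One way to picture what `(andrew, philip).greet()` might do is a small forwarding wrapper. This is a Python sketch under assumptions: the `Group` helper, the `name` field and the returned strings are hypothetical and not part of the design above.

```python
class Group:
    """Forwards a method call to every member (hypothetical helper)."""

    def __init__(self, *members):
        self._members = members

    def __getattr__(self, name):
        # Only reached for attributes Group itself lacks, e.g. 'greet'
        def call_all(*args, **kwargs):
            return [getattr(m, name)(*args, **kwargs) for m in self._members]
        return call_all


class Person:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello from {self.name}"


andrew, philip = Person("Andrew"), Person("Philip")
greetings = Group(andrew, philip).greet()  # greet both
```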

"Operator" functions don't require the '()' brackets?:

func main() {
    var a, b = 0
    
    (a, b) += 1 # works
    a, b += 1   # works
}

Tuples can be used for type access:

type Foo {
    var x, y = 0
}

func main() {
    var foo, bar = Foo
    
    foo.(x, y) += 1         # adds both
    foo.(x, y) = bar.(y, x) # swizzle
}
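
The two tuple-access lines could be read as "get several fields, then set several fields". A minimal Python sketch of that reading follows; `get_fields` and `set_fields` are hypothetical helpers, and `bar` is given non-zero values here only so the swizzle is visible.

```python
class Foo:
    def __init__(self, x=0, y=0):
        self.x, self.y = x, y


def get_fields(obj, *names):
    # Read several fields at once, in the order requested
    return tuple(getattr(obj, n) for n in names)


def set_fields(obj, names, values):
    # Assign several fields at once, pairing names with values by position
    for n, v in zip(names, values):
        setattr(obj, n, v)


foo, bar = Foo(), Foo(1, 2)

# foo.(x, y) += 1
set_fields(foo, ("x", "y"), tuple(v + 1 for v in get_fields(foo, "x", "y")))
after_add = get_fields(foo, "x", "y")

# foo.(x, y) = bar.(y, x)
set_fields(foo, ("x", "y"), get_fields(bar, "y", "x"))
after_swizzle = get_fields(foo, "x", "y")
```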

Tuples are used as "initializer lists" (which is consistent & doesn't conflict with Type params or need extra brackets):

type Point(T = int) {
    var x, y = T
}

func main() {
    var p = Point            # default
    var p = Point.(1, 2)     # sets data with Tuple
    var p = Point.(y=3, x=4) # sets by named
    
    var p = Point(dec)
    var p = Point(dec).(1, 2)
    
    p.(y, x) = Point.(10, 20)
}

Arrays are considered "core types" (like int/dec/text/etc.). Arrays are either static or dynamic, and have init sugar (HIGHLY DEBATABLE!):

func main() {
    var a = (1, 2, 3) # static array
    var b = [1, 2, 3] # dynamic list
    
    a += 2 # ERROR: static-arrays don't have func '+='
    b += 2 # works! b == [1, 2, 3, 2]
    
    a[] += 2 # works! a == (3, 4, 5)
    b[] += 2 # works! b == [3, 4, 5, 4]
}

type Person {
    func greet() { ... }
}

func main() {
    var people : Person[] = [ ... ]
    people[].greet() # greet every Person
}

Type Kinds:

type Foo : global { ... }
type Bar : proto { ... }

func main() {
    var foo = Foo # ERROR: Can't create Foo (it's global.. ie 'static')
    var bar = Bar # ERROR: Can't create Bar (it's proto.. ie 'abstract')
    ref faz : Foo # ERROR: Can't reference global types
    ref baz : Bar # works: Can create proto-type references
}

type Foo : final { ... }
type Bar : Foo # ERROR: Can't derive from 'final' type

type Foo : nonil { ... }

func main() {
    var foo : Foo # ERROR: Foo can't be nil
    
    var bar = Foo # works: allocates a Foo
    bar = nil     # ERROR: Foo can't be nil
    
    var i : int = nil # ERROR: int can't be nil (it's nonil)
    var d : dec = nil # ERROR: dec can't be nil (ditto)
}

Type Multi-Hierarchy (Order is relevant here and in other areas of the design):

type A { var x = 0 }
type B { var y = 0 }

type C : A, B
{
    init {
        x = 100
        y = 200
    }
}

type X { var a = 0 }
type Y { var a = 0 }

type Z : X, Y
{
    init {
        a = 100 # X.a
    }
}

type W : Y, X
{
    init {
        a = 100 # Y.a
        
        # or, explicitly...
        
        X.a = 200
        Y.a = 300
    }
}
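
Python's method resolution order gives a similar "first listed parent wins" rule, which may help in picturing the behaviour. It is only an analogy: in the design above both `X.a` and `Y.a` exist as separate fields, whereas the Python classes below simply shadow one class attribute with another.

```python
class X:
    a = "X.a"


class Y:
    a = "Y.a"


class Z(X, Y):  # X is listed first, so an unqualified 'a' resolves to X.a
    pass


class W(Y, X):  # Y is listed first, so an unqualified 'a' resolves to Y.a
    pass


z_a = Z.a
w_a = W.a
explicit = (X.a, Y.a)  # explicit access is still possible
```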

Sub-Types:

type Ship
{
    type Laser {
        var pos = nextLaserPos()
    }
    
    var laserPoses = [ ... ]
    var cursor = -1
     
    func nextLaserPos() {
        cursor += 1
        return laserPoses[cursor]
    }
}

func main() {
    var xwing = Ship
    var lasers = Ship.Laser[]
    
    lasers += xwing.Laser # works: adds new xwing laser
    lasers += Ship.Laser  # ERROR: needs parent instance
}
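
The "needs parent instance" rule can be modelled by making the sub-type's constructor take its parent explicitly. The Python sketch below is an approximation under assumptions: `new_laser` stands in for `xwing.Laser`, and the coordinate values are made up.

```python
class Ship:
    def __init__(self, laser_poses):
        self.laser_poses = laser_poses
        self.cursor = -1

    def next_laser_pos(self):
        self.cursor += 1
        return self.laser_poses[self.cursor]

    def new_laser(self):
        # Stands in for `xwing.Laser`: the new Laser is tied to this Ship
        return Laser(self)


class Laser:
    def __init__(self, ship):
        # A Laser cannot be built without a parent Ship instance
        self.pos = ship.next_laser_pos()


xwing = Ship([(-1, 0), (1, 0)])
lasers = [xwing.new_laser(), xwing.new_laser()]
positions = [laser.pos for laser in lasers]
```

Calling `Laser()` with no ship fails, which corresponds to the `Ship.Laser` error in the example above.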

Compiling & Symbol Rules

Types are the top-level encapsulation (no modules). There's no need to "import" files, therefore you only need to remember ONE name in order to use an object, and it's immediately available. Types can be prefixed with a pseudo-hierarchy, which is used in resolving symbol conflicts.

"Floating" funcs (and vars?) are allowed, but CANNOT be seen by other files (therefore they are only useful for local utility functions, 'main', etc).

Examples

Multiple Files:

# File "foo.rs"

type Foo : global
{
    func bar() {
        Console.write("Hi from Foo")
    }
    
    func baz() {
        foofunc()
    }
}

func foofunc() {
    Console.write("bazzy")
}

# File "main.rs"

func main() {
    foofunc() # ERROR: 'foofunc' not found
    Foo.bar() # works: writes "Hi from Foo"
    Foo.baz() # works: writes "bazzy"
}

# Compile like this:

$ reign main.rs foo.rs

Pseudo-Hierarchy:

type Screen.Point {
    var x, y = int
}

type OpenGL.Point {
    var x, y, z = dec
}

func main() {
    var a : Point # uses what is FIRST FOUND
    var b : Screen.Point
    var c : OpenGL.Point
}

To prefer a specific thing, we can specify:

use System
use Console

func main() {
    var a = Point        # System.Point
    var b = OpenGL.Point # explicit
    
    write(a.x) # Console.write(a.x)
}

Or, to alias:

type Sys = System
func echo = Console.write

func main() {
    var p = Sys.Point
    echo(p)
}

Or, you can isolate a file, and only use specific things:

use nil
use Console

type Sys = System         # ERROR: No type 'System' found
func echo = Console.write # works: can find 'Console'

func main() {
    echo("...") # works
}

The 'use' keyword has sugar:

use nil, System, Console
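
The lookup rules in this section (explicit names, `use` priority, first-found fallback, and `use nil` isolation) might be sketched as a small resolver. This Python version is an assumption about how they fit together, not the actual compiler logic; `resolve`, `known` and the `isolated` flag are hypothetical names.

```python
def resolve(name, known, use_order=(), isolated=False):
    """Return the fully qualified symbol that `name` refers to.

    known:     qualified symbols in the order the compiler found them
    use_order: prefixes named by `use`, highest priority first
    isolated:  True models `use nil` (only used prefixes are visible)
    """
    visible = [q for q in known
               if not isolated or q.split(".")[0] in use_order]
    if name in visible:                  # explicit, e.g. OpenGL.Point
        return name
    for prefix in use_order:             # preferred via `use`
        candidate = f"{prefix}.{name}"
        if candidate in visible:
            return candidate
    if not isolated:
        for q in visible:                # otherwise: what is FIRST FOUND
            if q.split(".")[-1] == name:
                return q
    raise NameError(f"No symbol '{name}' found")


known = ["Screen.Point", "OpenGL.Point", "System.Point", "Console.write"]

first_found = resolve("Point", known)
preferred = resolve("Point", known, ["System", "Console"])
write_func = resolve("write", known, ["System", "Console"])
```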

Memory Management

Memory management is extremely important to get right. Memory safety is vital for productivity, but techniques like Garbage Collection are often a big concern for people who use the language to write highly optimized and competitive real-time applications like Game Engines.

My design is completely safe, but does not require any ref-counting or cyclic-pointer scanning that is typically found in other memory management techniques used today (at least none that I'm aware of). It relies on four types of "variable" declarations:

var - A memory "owner". It can never point to memory it did not allocate.
ref - A weak reference to a var. It can never allocate or "own" memory.
def - A compile-time constant variable. It has no runtime memory.
ptr - An unchecked (unsafe) pointer, useful for graphs, hot-loops, and advanced things.

The compiler injects auto-cleanup code in the appropriate areas to ensure that vars get deallocated and refs get set to 'nil':

type Foo {
    var i = 0
}

func main() {
    ref f : Foo # reference of type Foo
    
    scope {
        var foo = Foo # allocates a Foo
        
        f => foo  # f refers to foo
        f.i = 100 # foo.i = 100
        
        # At the end of each scope, the local vars
        # are automatically cleaned up:
        #
        #   Memory.delete(foo)
        #   f => nil
        # 
    }
    
    f.i = 2 # ERROR: f is nil
}

Refs which are used dynamically (to reference dynamic-array items, for example) need invisible 'next/prev' (linked-list node) pointers, which are used when the dynamic object they point to is cleaned up (to set them to nil and make sure they're not pointing to invalid memory). These 'next/prev' pointers are private and stored in reverse memory space, so they don't affect the size of the memory being passed around and, since function parameters are non-modifiable by default, normally don't need to be passed around.
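
A minimal Python sketch of the 'next/prev' idea: each ref is threaded onto a doubly linked list headed by the object it points to, so cleanup can walk the list and nil every ref without ref-counting or scanning. The `Owner` and `Ref` classes and their method names are hypothetical, and the "reverse memory space" storage is not modelled.

```python
class Owner:
    """Models a dynamic `var`: owns a value and heads the list of refs to it."""

    def __init__(self, value):
        self.value = value
        self.first_ref = None

    def delete(self):
        # Models the injected cleanup: nil every ref, then free the value
        ref = self.first_ref
        while ref is not None:
            nxt = ref.next
            ref.target = ref.prev = ref.next = None
            ref = nxt
        self.first_ref = None
        self.value = None


class Ref:
    """Models a `ref`: never owns memory, only links itself to an Owner."""

    def __init__(self):
        self.target = self.prev = self.next = None

    def point_to(self, owner):           # models `f => foo`
        self.unlink()
        if owner is not None:
            self.target = owner
            self.next = owner.first_ref  # push onto the front of the list
            if owner.first_ref is not None:
                owner.first_ref.prev = self
            owner.first_ref = self

    def unlink(self):                    # O(1) removal via prev/next
        if self.target is None:
            return
        if self.prev is not None:
            self.prev.next = self.next
        else:
            self.target.first_ref = self.next
        if self.next is not None:
            self.next.prev = self.prev
        self.target = self.prev = self.next = None


foo = Owner({"i": 0})
f, g = Ref(), Ref()
f.point_to(foo)
g.point_to(foo)
f.target.value["i"] = 100  # f.i = 100
g.unlink()                 # g goes out of scope first
foo.delete()               # end of foo's scope: f is set to nil
```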

Container types like Lists, Stacks, Queues, etc. use 'ptr's internally to represent any inter-linking (graph) references. The idea is that the standard library should, ideally, define most of the data structures used for 99% of tasks. Therefore most programmers will rarely, if ever, need to "roll their own" or even use a ptr unless they explicitly want to. It also needs to be said that, due to the distinct limitations of each variable type, the compiler can often optimize local or private/readonly 'ref's to be identical to 'ptr's in terms of performance.

A common criticism of The D Programming Language by C++ programmers is that D's standard lib (Phobos) data structures rely on the Garbage Collector and thus make programming in D without the GC very impractical. By using raw, unchecked ptr types for these structures' internals we ensure that the language's provided collection structures are useful even in the most critical performance areas of industrial real-time applications (like AAA Game Engines).

Because there is no ref-counting or cyclic-scanning involved, in theory this technique should be significantly more optimized than competing techniques, and significantly simpler in design (it doesn't require thread-specific GC instances or impose Stop-the-World concerns). Memory is stored on a single heap (not separate managed & unmanaged heaps), and specific optimizations or security features (like memory compression) can be applied to this just as easily as to any other technique.


Vars can be 'nil' as well:

func main() {
    var foo : Foo # nil & dynamic
    var bar = Foo # allocates & non-dynamic (nonil)
    
    var baz : Foo # nil & non-dynamic because...
    baz = Foo     # allocates directly afterwards
    
    var bax = Foo # allocates & dynamic because...
    bax = nil     # set to nil afterwards
}

Vars which are defined as nil, or have the potential to be set to nil (determined by semantic analysis by the compiler) are considered "dynamic" and require a similar sort of special dynamic-ref management as refs which point to dynamic array items. Vars which can be determined to never be set to nil (or enforced not to be through the 'nonil' attribute) don't need any dynamic management on refs which point to them.


Defines (def) are compiler constants, and can be used for many things. For instance I'm not entirely sure they shouldn't be used for all type/func/var aliasing. They can be complex aliasing:

type Foo {
    var bar = Bar
    def flag => bar.flag
}

func main() {
    var foo = Foo
    var flg = foo.flag # foo.bar.flag
}

However, it still might be best (easier to read, easier to type, more distinguished, etc) to have special syntax for more complicated properties, like:

type Foo
{
    var bar : int {
        get { return bar + 1 }
        set { bar = value - 1 }
    } 
}
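
For comparison, Python's built-in `property` expresses the same get/set pair. This is only an analogy for the syntax above; the private `_bar` backing field is an assumption needed to avoid the getter calling itself.

```python
class Foo:
    def __init__(self):
        self._bar = 0  # hypothetical backing storage

    @property
    def bar(self):
        # get { return bar + 1 }
        return self._bar + 1

    @bar.setter
    def bar(self, value):
        # set { bar = value - 1 }
        self._bar = value - 1


foo = Foo()
initial = foo.bar   # reads the stored 0 through the getter
foo.bar = 10        # stores 9 through the setter
round_trip = foo.bar
```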