hawkw/HawkLang.md

## HawkLang.md

      
            
          
### Thoughts and Rationale

This isn't a comprehensive language design, it's just ideas for syntactical constructs I'd really
like to see some day. It'd probably be some kind of object/functional hybrid a la Scala - I really
like the recent trend of "post-functional" languages that take a lot of ideas/influence from
functional programming, but aren't fascist about it, or so scary that only math Ph.D.s can learn
them. The idea is to fuse OOP and FP into a language which gives you a high level of expressiveness
and power, but is actually usable for Getting Real Things Done.

#### Compiled vs Interpreted


"Often people, especially computer engineers, focus on the machines. They think, "By doing this, the machine will run faster. By doing this, the machine will run more effectively. By doing this, the machine will something something something." They are focusing on machines. But in fact we need to focus on humans, on how humans care about doing programming or operating the application of the machines. We are the masters. They are the slaves."
~ Yukihiro "Matz" Matsumoto, The Philosophy of Ruby

This would DEFINITELY be a compiled language (JVM?), mostly because there's a bunch of stuff
that basically says "the compiler should figure this out for you" and if it was an interpreted
language it would probably be painfully slow. I LIKE compiled languages - I like the idea of
Having The Compiler Figure It Out For You, and if you put too much of that in an interpreted
language, it makes parsing real slow. I'm willing to have longer compile times if it gets me
a language that's more versatile & expressive.
There's a lot I don't like about Ruby (although that's mostly because I don't know it well), but I totally agree with the philosophy suggested by Matz in the above quote - the computer should have to do the work. Language designers should write languages to make programmers happy, not programs. However, if you include lots of structures that are easy for people but hard for machines, you impact performance. In an interpreted language, you parse a difficult structure every time you run the program, wasting time every run. In a compiled language, you parse that structure once, and therefore, you only waste time once.
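
The parse-once-versus-parse-every-run point can be seen in miniature inside Python itself. This is only a rough sketch: the toy expression stands in for a "difficult structure", and the names `SOURCE`, `run_interpreted_style`, and `run_compiled_style` are made up for illustration.

```python
# Toy illustration: pay the parsing cost on every run vs. only once.
SOURCE = "sum(i * i for i in range(n))"  # stand-in for an expensive-to-parse construct


def run_interpreted_style(n):
    # eval() on a string parses and compiles the source on every call.
    return eval(SOURCE, {"n": n})


# compile() parses the source once and returns a reusable code object.
CODE = compile(SOURCE, "<hawklang-sketch>", "eval")


def run_compiled_style(n):
    # Evaluating a code object skips the parsing step entirely.
    return eval(CODE, {"n": n})
```

Both functions compute the same value; the difference is only where the parsing work happens, which is the trade-off described above.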
The other nice thing about compiled languages is, of course, faster execution, and I'm personally all for a compiler that takes a really, really long time to compile the code if the binaries I get out of it are really tense. Execution speed and efficiency are beginning to matter again, with Big Data, mobile computing, and the Internet of Things becoming A Thing nowadays. Just because your laptop has a quad-core, 2 GHz CPU and 8 GB of RAM doesn't mean your code will be running on a machine with those stats. Differences in performance between the development environment and target environment often mean that it's better to do the Hard Work on the development machine rather than on the machine you're developing for.

"The #1 Programmer's Excuse For Slacking Off: "My Code's Compiling"."
~ one of the most famous xkcd comics

Compiling your code used to be slow. Like, get-up-and-make-yourself-a-sandwich slow. Nowadays, it's much faster, especially because a lot of us write primarily/entirely in interpreted languages or for the Web. So why am I advocating making compiling your code a long, painfully slow process again? Well, there are some significant differences in programming today that make a long compile step less of a pain than it was in the Days of Yore:

- Continuous integration: Nowadays, most professional developers work in a shop that uses CI and external build servers. This offloads the compile-time slowdown from your machine, so you can keep programming while the CI box builds the project.
- Incremental compilation: A lot of programming languages now have incremental compilers, like Scala's Zinc. These have the capability to watch a source directory and compile only the changes in source code, keeping the already-compiled binaries for everything that hasn't changed. That means if the compiler has parsed that big, expensive language construct or control structure once, it doesn't need to do that every time you change some little thing. This technology is a HUGE boon for languages with long compile steps.
- Background compilation: A related but slightly different technology is background compilation. Pretty much all developers have a machine with at least two CPU cores nowadays, so you can be compiling your code WHILE you write code, check emails, or waste time on Facebook. A lot of IDEs have background compilation baked in - they build the project every n minutes, or after you've changed n lines of code. Combine this with incremental compilation, and you can have a theoretically huge compile step, but never actually even have to hit the compile button.
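
The core trick behind incremental compilation can be sketched in a few lines of Python. This is a toy model under loud assumptions: `IncrementalCompiler`, the `.hawk` file names, and the use of `str.upper` as the "compiler" are all hypothetical, and real tools also track dependencies between files, which this ignores.

```python
import hashlib


class IncrementalCompiler:
    """Toy model: recompile a source only when its content hash changes."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn  # hypothetical expensive compile step
        self.hashes = {}              # file name -> hash of last compiled source
        self.artifacts = {}           # file name -> cached compile output

    def build(self, sources):
        """Compile changed entries of {name: source}; return the names rebuilt."""
        rebuilt = []
        for name, text in sources.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.hashes.get(name) != digest:
                self.artifacts[name] = self.compile_fn(text)
                self.hashes[name] = digest
                rebuilt.append(name)
        return rebuilt


compiler = IncrementalCompiler(compile_fn=str.upper)  # str.upper stands in for real work
first = compiler.build({"a.hawk": "def f", "b.hawk": "def g"})
second = compiler.build({"a.hawk": "def f", "b.hawk": "def g2"})
```

The first build compiles both files; the second only touches the file whose contents changed, so the expensive step is paid once per change rather than once per build.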

I predict that these technologies will have even more widespread adoption in the future. This is a huge boon for languages which do the Hard Work at compile-time rather than at runtime.
I think we need a resurgence in compiled languages these days. Maybe it's the Web's fault, but it seems
like every new language that tries to do the "post-functional" thing I mentioned above is an interpreted language.
This kinda disappoints me. As I said, I like compiled languages.

#### Readable Programming vs Literate Programming


"Properly written code doesn't need documentation."
~ Radu Creanga

In the '80s, Donald Knuth advocated for the idea of literate programming, which "represents a move away from writing programs in the manner and order imposed by the computer, and instead enables programmers to develop programs in the order demanded by the logic and flow of their thoughts" (Wikipedia). The idea is, essentially, that you write a bunch of natural language macros for programming language constructs and then string them together. Basically, you are writing documentation and then extracting source from it (yes, this is a butchering of Knuth's idea; I need it for the sake of comparison).
I personally am more into the idea of readable programming (or "self-documenting code"), probably
because I'm lousy at writing (non-sarcastic) comments. The idea here is that the source itself should be readable to people as well as machines (Python is a BIG influence here - I've heard it referred to as "runnable pseudocode"). In comparison with literate programming, you are writing source, but the source is essentially its own documentation, or documentation is rendered unnecessary because the source makes sense - both to the author and to other programmers who have to work with or maintain their code. I personally think that the ideal programming language would be one where the code is completely self-explanatory, but that's like an impossible Holy Grail that we should constantly strive for.
Why is readability important? Well, I think we all know that most of the time, if a computer does something wrong, it's not because of a hardware issue, it's because a programmer did something dumb. The only thing programmers do more than write code is make mistakes. Perhaps even more importantly, the only thing harder than writing code on your own is doing it with your friends - the number of mistakes made in any given project frequently seems like an exponential function of the number of programmers in the project. My experience, and that of many others, is that this is mostly because of issues in communication between programmers. This is decidedly not a new observation (see Fred Brooks' seminal text The Mythical Man-Month). Generally, the issues arise when programmers have to make sense of each other's code - remember the old adage "Code like the guy who's gonna be maintaining it is a homicidal maniac who knows where you live."

"Properly written documentation doesn't need code."
~ Hawk Weisman

I'm not necessarily arguing in favor of natural-language programming, mind you. Even if we set aside the fact that natural-language programming would be incredibly difficult, if not impossible, to implement (considering that the grammars of most natural languages are so ambiguous as to be almost unparseable, a compiler for natural-language programming would also basically Solve AI Forever), I think natural languages aren't the ideal way to communicate with computers (at least, in the general case). I think it would actually be harder for a programmer to read & parse a computer program written in natural language than it would be to parse a program written in an artificial language that's optimized for readability. Note how pseudocode has syntactic elements not generally present in natural languages.
In order to implement readable code, you have to make sacrifices somewhere. In Python, they essentially sacrificed programmer freedom for code readability. I think that the Python vs Perl holy war ("runnable pseudocode" vs "runnable line noise") is one of the most significant events in the history of programming language design. Perl let you do whatever you wanted, and therefore let you produce programs that just looked like a bunch of random characters. Python, on the other hand, adopted a philosophy of "there's only one way to do it", which seems pretty fascist, but the next major plank in the Python philosophy was "...and that way should be beautiful." It's the old debate of freedom versus conceptual integrity that Brooks writes about in The Mythical Man-Month.

"Documented code doesn't need properly written."
~ Tristan Challener

I'd like to avoid sacrificing programmer freedom, or at least not have to sacrifice it as much as Python does. I think code is art, and (this is where I wax philosophical for a bit) you can't have true aesthetic beauty in art without freedom & diversity. I like how, for example, Scala lets you write Haskell-esque functional code, "Java without the semicolons", or pretty much anywhere in between, depending on your background, skills, experience, and the needs of the situation. But you have to make the sacrifice somewhere, and I'd rather sacrifice compile time and compiler complexity than programmer freedom, whenever that's possible, going back to the idea of "let the compiler do the work" I mentioned earlier. Of course, this is easy for me to say since I'm just coming up with programming language ideas I'll never actually have to implement.
Once again, as I said above, this is all from the perspective of a programmer rather than a
language implementor, and it's basically Constructs I Wish I Had Access To While Programming.
There's probably a lot of stuff in here that would be really hard to implement and I just don't
realize that because, as I said, I'm Just A Dumb Programmer.
With all of that said, essentially, my goals are to maximize:

- Expressiveness: I view this as the ratio of Stuff Accomplished:LoC.
- Readability: I view this as the ratio of Comments:Code necessary for a program to be understandable to people other than its programmer.
- Versatility: I'd really LIKE to have a language that is just as useful to a scientist in some Python-esque data-analysis applications but could also be used by a systems programmer writing an OS, but that's probably impossible.

Major influences:

- Scala
- Python

Languages that would be influences if I knew more about them:

- Go
- Haskell
- CoffeeScript

### SOME EXAMPLE FUNCTIONS

#### An example function showing if/else

def f(int n): # def <name>(<args>) defines functions, def object <name> defines objects

    if n = 1: # single-equals is the equality operator (although == is also acceptable)
                            
        f(n) <- n   # assigning function(arg) to a value is the return keyword
                    # this is cool because you can define conditional returns easily 
                    # (see example 2)
    else if n > 0:  # (there is also a return keyword for the old-fashioned)
                            
        if n > 2:
            n <- (n - 2)    # '<-' is the assignment operator
        if n < 8:           # pronounced "gets"
            n <- n + 1
        f(n) <- n
        
    else:          # colons denote blocks
        f(n) <- 1  # whitespace denotes ends of blocks

#### An example showing function-assignment returns

(this returns exactly the same things as the previous function, but allows programmers to think about return values differently)
def f(int n):
    f(n = 1)        <- n
    f(n = 2)        <- n + 1
    f(10 > n > 2)   <- (n - 2) + 1
    f(n >= 10)      <- n - 2
    f(n <= 0)       <- 1
The idea of this paradigm is that it allows you to simplify functions that are mostly conditional logic, and think about them as a mapping of input values to output values (which is what they are in math). This should be able to make big conditional functions much shorter, but still clear. You could probably think of this as being like a Scala-style pattern match but with what I think of as a cleaner syntax.
One main flaw in this whole idea is that it probably would require significant levels of Compiler Wizardry to parse this in a Happy way.
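
To make the "mapping of input values to output values" idea concrete, here is a rough Python sketch. It is not how HawkLang would be implemented; `f_if_else`, `CASES`, and `f_cases` are made-up names, and the guards are worked out from the if/else example so that the two versions agree on every integer.

```python
def f_if_else(n):
    # Direct transcription of the if/else example.
    if n == 1:
        return n
    elif n > 0:
        if n > 2:
            n = n - 2
        if n < 8:
            n = n + 1
        return n
    else:
        return 1


# Each row pairs a guard on the input with the output for that case.
CASES = [
    (lambda n: n == 1,     lambda n: n),
    (lambda n: n == 2,     lambda n: n + 1),
    (lambda n: 2 < n < 10, lambda n: (n - 2) + 1),
    (lambda n: n >= 10,    lambda n: n - 2),
    (lambda n: n <= 0,     lambda n: 1),
]


def f_cases(n):
    # Return the output of the first row whose guard accepts n.
    for guard, result in CASES:
        if guard(n):
            return result(n)
    raise ValueError(f"no case matches {n!r}")
```

A table like this also makes gaps visible: if no guard matches, the function has no defined output, which is something a compiler for this notation would presumably have to check.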
Will (@yarbroughw) suggested this alternate notation for applying a function:
def f(int x, int y):
    f(x > 0, y > 0) <- x + y
    f(x <= 0, y<=0) <- x - y

int z <- f of x, y

sumOf <- sum of List[1, 2, 3, 4]
The idea is to enforce a consistent tense for function names.

### OTHER WEIRD/FUN IDEAS

#### Slicing

# everything is sliceable (slice bounds are inclusive):
String s <- "Hello Max"
print(s[6:8])       # prints "Max"

# numbers can be 'sliced' by power of ten
int n <- 19221
print(n[10])        # prints "2" (the digit at the tens place)
print(n[1000:10])   # prints "922"
Integer slicing is Probably a Bad Idea - as @arcticlight pointed out, this kills the parser.
It's just an idea that I had one night and basically I went "MAN THIS WOULD BE COOL IF IT WAS A REAL THING".
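
For what it's worth, the arithmetic behind power-of-ten slicing is simple. A minimal Python sketch, assuming non-negative integers and places that are powers of ten; `digit_at` and `digit_slice` are hypothetical helper names, not proposed syntax:

```python
def digit_at(n, place):
    """Digit of n at a power-of-ten place, e.g. place=10 is the tens digit."""
    return (n // place) % 10


def digit_slice(n, high, low):
    """Digits of n from the `high` place down to the `low` place, inclusive."""
    if high < low:
        raise ValueError("high place must not be below low place")
    # Drop everything below `low`, then keep only the digits up to `high`.
    return (n // low) % (high * 10 // low)


n = 19221
tens = digit_at(n, 10)             # tens digit of 19221
middle = digit_slice(n, 1000, 10)  # thousands through tens
```

The hard part is not the math but the parsing, since `n[10]` looks exactly like ordinary indexing.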
Max (@arcticlight) pointed out that somebody could say:
int n <- 2
int m <- n[n:n]
and that the parser would freak the fuck out - it'd probably throw a compile error "Error: Recursive Integer Slice" or something.
But detecting that might be hard, IDK?
(also I don't really see a circumstance where you'd actually want to write something like that)

#### "English-style" keywords

I personally think that replacing symbol-based operators with word-based operators
allows for code that is more readable and understandable, leading towards my ideal
paradigm of a language which is so readable that it is completely self-documenting.
# 'or', 'xor', 'and', and 'not' are keywords
# but you can use '||', '|', '&&' and '!' if you want

if x = 3 or x = 9:
    print ("X is 3 or 9, but why you care, I dunno.")
    
if x % 3 = 0 and x is not 3:
    print ("X is a multiple of three that is not 3.")
    
# all of this stuff can be used in python-like list comprehensions
# 'is' is also a keyword, it may be used in place of '=' for ints
list multiples_of_3_or_5 <- [x for x in range(3 < x < 30) if (x % 3 is 0) or (x % 5 is 0)]

# is does some other fun stuff...
if x is int:                    #...like testing type for variables...
    print ("X is an int")
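
Python already gets a fair way toward this, which is a big part of why it's an influence. A rough sketch of the same checks in real Python, reading the comprehension as "multiples of 3 or 5 strictly between 3 and 30" (that reading is an assumption). Note that Python's own `is` tests object identity, not equality, so it is not a drop-in for the `is` keyword described here.

```python
x = 9

# Word-based boolean operators, as in the examples above.
is_3_or_9 = x == 3 or x == 9
multiple_not_3 = x % 3 == 0 and x != 3

# A chained comparison reads like the math: 3 < k < 30.
multiples = [k for k in range(1, 100) if 3 < k < 30 and (k % 3 == 0 or k % 5 == 0)]

# Type test; Python spells `x is int` as isinstance(x, int).
x_is_int = isinstance(x, int)
```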
    
#### Implicit boolean-ness
def exists(object o):
    print("it exists!") if o

# unassigned variables are false

list[int] listOne
exists listOne # does nothing

listOne <- [] # empty data structures are false
exists listOne # does nothing

listOne <- [1,2,3,4]
exists listOne # prints "it exists!"

# empty indices of a data structure are false, too
exists listOne[5] # does nothing
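
Python's truthiness rules cover most of this already. A minimal sketch, assuming `None` stands in for an unassigned variable and using a hypothetical helper `item_or_none` for the out-of-range case (plain Python raises `IndexError` there instead):

```python
def exists(o):
    """Mirror the HawkLang sketch: report whether `o` counts as true."""
    return bool(o)


def item_or_none(seq, index):
    """Hypothetical helper: an out-of-range index yields None instead of raising."""
    return seq[index] if 0 <= index < len(seq) else None


list_one = None            # stands in for a declared-but-unassigned variable
unassigned = exists(list_one)

list_one = []              # empty data structures are false
empty = exists(list_one)

list_one = [1, 2, 3, 4]
filled = exists(list_one)
missing = exists(item_or_none(list_one, 5))
```

One wrinkle worth knowing: Python also treats `0` as false, so a list slot that legitimately holds zero would look "empty" under this rule.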

#### Logical operators fun

doSomething() if aCondition # CoffeeScript-style postfix if/unless are an option
doSomethingElse() unless aDifferentCondition


if aCondition and aDifferentCondition: # written-out and, && is also OK
    doSomething()
    doSomethingElse()

#### Object-Oriented features

# defining an object
def object House extends Building:
    int doors
    int windows 
Defining variables without a visibility makes them global variables. Global variables are automatically made constructor parameters if they aren't assigned a value in the class definition - this might be a bad thing, I just like the idea of reducing the amount of boilerplate code that OOP generally forces you to write.
h <- new House(doors <- 2, windows <- 5) 
All arguments can be treated as keyword args OR positional args; this is another of those things that Probably Make The Parser Way Too Slow.
if h is House:  #is keyword works on objects...
    print("H is a house")

if h is Building:
    print("H is a building")

if House is Building:           #...and classes...
    print ("Houses are buildings")
    
o <- new House(doors <- 3, windows <- 9)

# if two objects are passed, it will test the equality of fields 
# (unless they have an overridden equality operator)
print(h is o)    # this prints "False"
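
Python's `dataclasses` module gives a similar low-boilerplate feel, so it makes a reasonable point of comparison. A rough sketch under that assumption (`isinstance` and `issubclass` play the role of the `is` keyword here; the variable names are made up for illustration):

```python
from dataclasses import dataclass


@dataclass
class Building:
    pass


@dataclass
class House(Building):
    # Fields without defaults become constructor parameters automatically.
    doors: int
    windows: int


h = House(doors=2, windows=5)   # keyword arguments
o = House(3, 9)                 # the same parameters, passed positionally

h_is_house = isinstance(h, House)
h_is_building = isinstance(h, Building)
house_is_building = issubclass(House, Building)
same_fields = h == o            # dataclasses compare field by field
```

Here `same_fields` comes out false because the door and window counts differ, matching the `h is o` example above.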

### Type system thoughts


"Make illegal states unrepresentable." ~ Yaron Minsky

Okay, let's start this out by saying that type systems are Hard and I don't know enough about type theory to design a good type system all on my own.
It may seem weird that the language seems fairly strongly and statically typed, since I've said over and over that I want the compiler to do the heavy lifting. I actually made that choice because I think dynamically typed languages are actually harder on the programmer - consider a Python or JavaScript variable that starts out as an int, is cast to a double by some operation, and then is reassigned to, like, a list or something. The programmer is the one who has to keep track of what type that variable is, and that's more state they have to hold in their working memory. A compiler can hold lots of state in working memory - most developers these days have laptops with between 4 and 18 GB of RAM. A programmer can hold between five and nine digits in working memory at any given time - that's like a byte of RAM (yes I get that this is a bad analogy shut up). Programmers shouldn't have to keep a lot of program state in working memory if they don't have to, and I think static typing actually is good for programmers.
I really like Scala's type system. I really like how Scala's type inference finds a balance between static and dynamic typing - Scala is statically typed throughout, but thanks to inference it feels strict when you need it to be and dynamic when you want it to be. This is another case of the compiler doing the work for you when it can.
I also really like the way Scala handles type parameters - type parameters are, I think, really important, especially for an object-oriented language. I would definitely want to handle parameterized types via type reification rather than type erasure (which Java does mostly because they added parameterized types in Java 5 and had to maintain backwards-compatibility). I really, really like Scala's variance system for type parameters - compared to Java's <? extends SomeClass> wildcards, it seems a little intimidating, but once you actually learn how it works, it's actually both extremely simple and extremely powerful. It might be more in line with my design goals to provide a keyword-based alternative to Scala's [+T] and [-T] syntax (which is probably what makes it seem like rocket science to Scala newbies and Java refugees), but I'm not really sure how to do that succinctly.
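
For one data point on keyword-based variance: Python's `typing` module spells it with keyword arguments on the type variable. A rough sketch, where `Source` and `Sink` are made-up classes; note that Python only checks variance in static type checkers such as mypy, not at runtime, so this shows the spelling rather than the enforcement.

```python
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)              # plays the role of Scala's [+T]
T_contra = TypeVar("T_contra", contravariant=True)  # plays the role of Scala's [-T]


class Source(Generic[T_co]):
    """Only produces values of type T, so it can safely be covariant."""

    def __init__(self, value: T_co) -> None:
        self._value = value

    def get(self) -> T_co:
        return self._value


class Sink(Generic[T_contra]):
    """Only consumes values of type T, so it can safely be contravariant."""

    def __init__(self) -> None:
        self.received = []

    def put(self, value: T_contra) -> None:
        self.received.append(value)
```

It is wordier than `[+T]`, which is roughly the succinctness problem mentioned above.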

### Wishlist

These are language features that are Important To Me but that aren't syntax I can give you pretty code examples for:

- Operator overloading: The expressive power of operator overloading makes it a really, really useful tool.
- Higher-order functions: I don't ever wanna code in a language without higher-order functions again. Seriously.
- CoffeeScript-style YAML-like object syntax: this is a Cool Thing and there's some awesome inspiration to be taken from it.
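
To show what the first two wishlist items buy you, here is a small Python sketch. `Vec2` and `compose` are illustrative names only, not proposed HawkLang features.

```python
from functools import reduce


class Vec2:
    """Tiny 2D vector used only to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Called for `a + b`.
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)


def compose(*functions):
    """Higher-order function: compose(f, g)(x) computes f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)


# Overloaded + lets a generic fold work on vectors just like on numbers.
total = reduce(lambda a, b: a + b, [Vec2(1, 2), Vec2(3, 4), Vec2(5, 6)])
add_one_then_double = compose(lambda x: x * 2, lambda x: x + 1)
```

The two features reinforce each other: because `+` works on `Vec2`, the same higher-order fold that sums numbers sums vectors with no extra code.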