VictorTaelin/why.md

## why.md

      
What is wrong with the web and why we need Moon (draft)

A few days ago, I published an article about Moon, a fundamental building block of a decentralized browser that aims to solve many of Mist's problems. I showed off some fancy features such as its decentralized package manager and a generalized monadic notation. I guess that made some people angry, wondering why the hell I made yet another programming language when we have so many of them. If you're in that group: you're right. I'm sorry. Believe me when I say I'm as tired of new languages as you are, and I'm as pissed at myself as you are. But I wouldn't have done this if I didn't have a very good reason. Give me, thus, a chance to justify my sins. For one, I didn't actually invent a programming language. At least, not according to several definitions around. From Wikipedia,

A programming language is a formal language that specifies a set of instructions that can be used to produce various kinds of output. (...) The description of a programming language is usually split into the two components of syntax (form) and semantics (meaning).

Moon has no side-effects and, as such, it can't do trivial things such as "outputting" data to the console. There is no "hello world". Moreover, Moon has no official "syntax": it lets programmers pick whatever syntax they like. Finally, I didn't invent it! So, again, WTF is Moon?


Top-down answer: Moon is just the minimal subset of JavaScript that removes as much as possible while still leaving enough things to remain practical.


Bottom-up answer: Moon is the minimal extension of the λ-calculus that adds just enough things to make it practical.


Concrete answer: Moon is just an algebraic datatype,
```haskell
data Term
  = App Term Term
  | Lam String Term
  | Var String
  | Let String Term Term
  | Fix String Term
  | Pri String [Term]
  | Num Number
  | Str String
  | Map [[String, Term]]
```
plus a normalForm : Term -> Term function, following the semantics you'd expect from the functional-programming literature, as explained in moon-core.js.


And that's it. Note that I didn't invent Moon. I just selected a very interesting subset of JavaScript, gave it a name, love and a compiler, and decided I'd use it instead of JavaScript as my browser's main scripting language. But still, why? Why is this subset interesting and important? Why do I care? And, mostly, how on Earth dare I defy the omnipresent, almighty lord JavaScript and its sacred goddess HTML5?

Short answer: that micro subset is, given the way processors are built, given the way JIT compilers currently work, and as far as I could tell, the most concise format to pass arbitrary pieces of code around in a way that is safe and sufficiently efficient. JavaScript would make several critical optimizations and safety measures impractical and, as such, it had to be discarded. Note that developers need not be aware of Moon. In fact, JS itself could be compiled to it. Moon is more an internal thing than anything else.

Now: that pretty much ends the article. You could stop here. If you, because reasons, want to read a looong explanation, covering my entire line of thought from the history of the web to WASM to parallel lambda evaluators, then move ahead to the long answer.

Long answer: because that's how the web should've worked, to begin with.

Pt.1: What is wrong with the Web?

To understand what is wrong with the web as it stands, let's first go through an ultra-quick overview of its history. When it started, and for a long time afterwards, the web consisted mostly of static pages with some text and images. It used to be fast, safe and robust, but also extremely restricted. To circumvent that, people invented JavaScript: a Turing-complete programming language that could be executed in response to events such as window.onclick or timers such as setTimeout, allowing a web developer to select and modify the contents of a web page dynamically. That was extremely permissive, but equally problematic.
As the web grew, web pages became more and more interactive and complex. Libraries such as jQuery were invented to make the task of "selecting" and "modifying" those pages' contents easier. Soon, people noticed that this coding style didn't scale. The reason wasn't obvious at first, and they made Backbone.js, but, eventually, the problem was figured out: by keeping half of your application's logic and state inside the DOM, the other half in JavaScript, and mutating both chaotically, programmers often ended up with different sources of truth and confusing, unmaintainable code; the so-called "jQuery spaghetti". Angular attempted to solve that by making HTML a programming language, so you didn't need that much JavaScript anymore. React attempted to solve it the other way around, by putting HTML inside JavaScript. Those lines of thought fought for a while but, nowadays, it is quite obvious that the React style won.
So, what about Moon? Hold on, we will get there.
React wasn't only a clever lib, it revealed something deeper than that. React showed us a well-principled and natural way to build any visual application. A React app (or component) combines 3 essential ingredients: a state, which holds all data on your app that can change, a render() function, which translates that state into a visible user-interface, and events, which talk to the outside world and update that state. There are many slightly different ways to make that spirit concrete, but all of them share one thing in common: they're much more restricted than normal JavaScript apps.
That's important, so, let me elaborate on that. If you're reading a properly-coded React render() function, you know for sure it isn't doing any HTTP request. If you're reading a "pure component", you know for sure it has no state or events. If you're reading a reducer, you know for sure it is not writing stuff to a database. And so on. Compare that to the jQuery era, where every piece of code could do anything, and you'll understand why it is so comforting to maintain well-made React apps. React achieved most of its success not by inventing a ton of great features, but by removing as many bad ones as possible.
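To make the three ingredients concrete, here is a minimal sketch in plain JavaScript. It is not React's actual API: the names `counterApp` and `createHost`, and the shape of the object returned by `render`, are assumptions made up for illustration. The point is only that the app itself is inert data plus pure functions, and that the host is the single place where state changes.

```javascript
// Hypothetical shape of a React-like app: plain data plus pure functions.
const counterApp = {
  // The state: all data that can change.
  initialState: 0,
  // render: a pure function from state to a description of the UI.
  render: (count) => ({ tag: "div", onClick: "inc", text: String(count) }),
  // events: pure functions from the current state to the next state.
  events: {
    inc: (count) => count + 1,
  },
};

// A toy host: it owns the state and is the only place where anything changes.
function createHost(app) {
  let state = app.initialState;
  return {
    view: () => app.render(state),
    dispatch: (name) => {
      state = app.events[name](state);
    },
  };
}

const host = createHost(counterApp);
host.dispatch("inc");
host.dispatch("inc");
console.log(host.view().text); // prints "2"
```

Nothing inside `counterApp` can perform an HTTP request or touch a database, because nothing inside it performs any action at all; it only describes what should happen.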
Now, mind the following: despite React restricting JavaScript so much, Facebook, one of the most complex web-apps in the world, was built with it. All those restrictions, thus, were not limiting, at all. That's what is wrong with the web: JavaScript is overpowered. And that's what makes React so brilliant: it restricts JS as much as possible while still keeping it equally powerful; or, alternatively, it adds, to static pages, just the right amount of JS to make them arbitrarily dynamic, but no more than that.
Now, I'm not sure I convinced you the React style is sufficient to express any visual application, but, for a moment, assume it is, and consider the following: what would happen if browsers, instead of running a Turing-complete language in a dedicated thread, acted as mere interpreters of React-like apps? In other words, what if they received just initialState, render() and similar, and worked with that? Okay, you're probably thinking about all the things you think you couldn't do. Forget about them for just a moment, okay? As it turns out, doing things that way would enable browsers to perform amazing optimizations. And when I say amazing, I mean mind-blowing, world-changing. From diffing virtual DOM natively, to coordinating external-data pooling, to completely revamping how memory is managed, the possibilities are endless. It is not hard to imagine how such a browser could have hundreds or thousands of tabs open with a tiny fraction of the memory usage of a normal one.
Now, suppose a browser actually decided to adopt that crazy idea and implement React-like apps natively. How would such a React-like app be specified? Would it be normal HTML? Would it be a JavaScript file? Would the browser eval() it? Well, yes, technically, that could work. But we can do much better. React was made for JavaScript, and JavaScript wasn't made for that. JavaScript was made in 1995 to run in a single thread that imperatively mutates DOM elements in a jQuery-like style. As such, it has an enormous amount of things that make it imperfect for this goal. I'm not talking about stupid stuff such as that "wat" video, I'm talking about stuff that downright ruins it as an option: eval, setInterval, with, global side-effects and so on. Those, and many other JavaScript features, break a ton of assumptions that the browser could make if the language were less chaotic.
Pt.2: How do we fix it?

Once you accept that React-like specifications are sufficient and complete, it becomes obvious that what we're missing is merely a way to communicate those apps without all the overwhelmingly stupid JS baggage. We need a format to send code around the web that is:


Fast to parse, compile and optimize just-in-time. Imagine we received plain-text Rust code and had to type-check it?


Optimizable and performant. How ironic would it be if we did all that only to use Ruby?


Minifiable. Doesn't it bother you how inefficient minified JS bundles are?


Safe. You don't want apps to access your disk midway through their render() function, do you?


Expressive. Nobody wants to write code in brainfuck.


Small. This is not mandatory, but if a feature can be implemented outside of the core language, there is no reason for it to be a primitive.


Pure. This is the main deal-breaker. Suppose the browser decides to optimize the render() function by memoizing it, but then somebody writes code like function render() { return <div>{++GLOBAL_VAR}</div> }. That'd break everything, leading to inconsistent behavior. This is just one of billions and billions of reasons such a language would need to be side-effect-free.
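The purity requirement can be seen in a few lines of plain JavaScript. This is only a sketch: `memoize` is an assumed helper standing in for whatever caching a browser might apply, and the render functions return strings instead of real elements.

```javascript
// Assumed helper: cache a one-argument function by its argument.
function memoize(fn) {
  const cache = new Map();
  return (arg) => {
    if (!cache.has(arg)) cache.set(arg, fn(arg));
    return cache.get(arg);
  };
}

// Pure: the output depends only on the input, so caching is invisible.
const pureRender = (count) => `<div>${count}</div>`;

// Impure: mutates a global on every call, like the ++GLOBAL_VAR example.
let GLOBAL_VAR = 0;
const impureRender = (count) => `<div>${++GLOBAL_VAR}</div>`;

const fastPure = memoize(pureRender);
const fastImpure = memoize(impureRender);

console.log(pureRender(1) === fastPure(1));    // true
console.log(impureRender(1), impureRender(1)); // <div>1</div> <div>2</div>
console.log(fastImpure(1), fastImpure(1));     // <div>3</div> <div>3</div>
```

With the pure function, the memoized and the plain version are indistinguishable. With the impure one, the optimization silently changes what the user sees.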


Most existing programming languages don't pass half of those requisites. In fact, the last one disqualifies pretty much all of them: Python, Rust, C, PHP, Java, even Scheme, they all have side-effects. Problem is, those languages were not really designed to be used as a lightweight code-interchange format as we envisioned. WebAssembly is, perhaps, the closest thing to that. In fact, that is my #2 option, and I'm still wondering if it should be the #1. There are, though, many points that make it sub-optimal. I'll elaborate on those at the end of this article.
So, back to the point, JavaScript is far away from being ideal, and no existing "programming language" seems to satisfy all my requisites. Am I, then, looking for something impossible? No, not at all. There are some things quite like that around. Haskell's Core, for example, is very inspiring. It is the intermediate language to which Haskell programs are compiled before being converted to machine code. This is its definition:
```haskell
type CoreExpr = Expr Var

data Expr b  -- "b" for the type of binders
  = Var   Id
  | Lit   Literal
  | App   (Expr b) (Arg b)
  | Lam   b (Expr b)
  | Let   (Bind b) (Expr b)
  | Case  (Expr b) b Type [Alt b]
  | Cast  (Expr b) Coercion
  | Tick  (Tickish Id) (Expr b)
  | Type  Type

type Arg b = Expr b
type Alt b = (AltCon, [b], Expr b)

data AltCon = DataAlt DataCon | LitAlt Literal | DEFAULT

data Bind b = NonRec b (Expr b) | Rec [(b, (Expr b))]
```
No matter how complex, every single Haskell program is eventually compiled to that tiny language. That seems very close to what we need, no? Core is fast, performant, has no side-effects, is pure and safe. Moreover, Haskell is a practical language with high-level features such as loops, data-structures and so on, demonstrating that a high-level language can be compiled to Core without loss of performance. In fact, if Core wasn't designed so specifically with Haskell in mind, it'd be perfect. Let's, thus, take it as an inspiration and design a core-language that is suitable for our purposes. Let's start with the lambda-calculus, which is the most primitive subset of every functional language:
```haskell
data Term
  = Var String
  | Lam String Term
  | App Term Term
```
This gives us functions, variables and function application, which are obviously needed. While recursion and variable assignments can be expressed without further additions (with the Y-combinator and function calls), it has been shown that adding those two as primitives can provide essential performance benefits in many cases, so we add Let (assignments) and Fix (recursion):
```haskell
data Term
  = Var String
  | Lam String Term
  | App Term Term
  | Let String Term Term
  | Fix String Term
```
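For readers who haven't seen how recursion and assignment fall out of functions alone, here is a small sketch in plain JavaScript. Because JavaScript is strict, it uses the Z combinator, the call-by-value variant of Y. The names `Z`, `factorial` and `letAsCall` are just illustrative.

```javascript
// Z combinator: the call-by-value variant of Y. (Plain Y would loop forever
// in a strict language such as JavaScript.)
const Z = (f) => ((x) => f((v) => x(x)(v)))((x) => f((v) => x(x)(v)));

// Recursion without any named self-reference: `self` is supplied by Z.
const factorial = Z((self) => (n) => (n === 0 ? 1 : n * self(n - 1)));

// "let x = 21 in x + x" expressed as an immediately applied function.
const letAsCall = ((x) => x + x)(21);

console.log(factorial(5)); // 120
console.log(letAsCall);    // 42
```

Both encodings work, but every recursive call goes through extra closures, which hints at why having Let and Fix as primitives is easier on a compiler.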
Now, surprisingly, this is almost good enough for our needs! Since there are ways to compile any high-level features and syntaxes to this subset (which is quite obvious, since it is Turing-complete), we don't lose any expressivity. Problem is, how to do so efficiently? To represent data-structures on that core, we could use lambda-encodings: it has been shown that those can be pretty much as fast as native structs. For control-flow, Lisp has shown us tail call optimization is sufficient. Most other features can be expressed as syntactic sugar - monads are very handy for that. That leaves us with a few holes: native numbers (representing them with lambda-encodings would be prohibitive), a map/array-like structure with O(1) read/write, C-like structs and strings. All of those can, amazingly, be solved by extending this language with... JSON.
```haskell
data Term
  = Var String
  | Lam String Term
  | App Term Term
  | Let String Term Term
  | Fix String Term
  | Pri String [Term]
  | Num Number
  | Str String
  | Map [[String, Term]]
```
Here, Num represents an IEEE 754 double, Str represents a UTF-8 string, and Map maps strings to anything else, i.e., like a JavaScript Object. Pri performs a primitive operation on those types (addition, string concatenation, etc.). As it turns out, that is all we need to take Moon all the way from slower-than-Python terrains to being faster than JavaScript itself in many cases. With Num, Moon can perform pretty much any number-crunching algorithm as fast as native JS - JIT engines can optimize them to Ints when suitable. Map allows us to express 3 different things efficiently: C-like structs, arrays and, obviously, maps, all with O(1) reads and writes. This is possible because JIT engines can easily pick the right representation based on usage. Moreover, despite being statically untyped, Moon has well-defined types which can be inferred at compile time, generating extremely efficient machine code. Finally, since it is pure, it can perform a wide range of optimizations that impure languages can't; stream fusion, for one, is promising.
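To give a feel for how small such a core is, here is a rough sketch of an evaluator for the nine constructors, written in plain JavaScript. It is not moon-core.js and it makes several assumptions: terms are written as JSON arrays of the form `[constructor, ...fields]`, the primitive names (`add`, `sub`, `mul`, `concat`, `get`, `if`) are made up, and it evaluates closed programs to JavaScript values instead of computing full normal forms under lambdas.

```javascript
// Hypothetical primitive table; the real set lives in moon-core.js.
const PRIMS = {
  add: (a, b) => a + b,
  sub: (a, b) => a - b,
  mul: (a, b) => a * b,
  concat: (a, b) => a + b,
  get: (map, key) => map[key],
};

function evaluate(term, env = {}) {
  const [tag, a, b, c] = term;
  switch (tag) {
    case "Num":
    case "Str":
      return a;
    case "Var":
      if (!Object.prototype.hasOwnProperty.call(env, a)) {
        throw new Error(`unbound variable: ${a}`);
      }
      return env[a];
    case "Lam": // a = parameter name, b = body
      return (arg) => evaluate(b, { ...env, [a]: arg });
    case "App": // a = function term, b = argument term
      return evaluate(a, env)(evaluate(b, env));
    case "Let": // a = name, b = bound term, c = body
      return evaluate(c, { ...env, [a]: evaluate(b, env) });
    case "Fix": { // a = name, b = body that may refer to itself as `a`
      const recEnv = { ...env };
      recEnv[a] = evaluate(b, recEnv); // closures read the binding later
      return recEnv[a];
    }
    case "Map": // a = list of [key, term] pairs
      return Object.fromEntries(a.map(([k, t]) => [k, evaluate(t, env)]));
    case "Pri": { // a = primitive name, b = list of argument terms
      if (a === "if") {
        // Evaluate only the chosen branch, so recursion can terminate.
        return evaluate(evaluate(b[0], env) !== 0 ? b[1] : b[2], env);
      }
      return PRIMS[a](...b.map((t) => evaluate(t, env)));
    }
    default:
      throw new Error(`unknown constructor: ${tag}`);
  }
}

// fact = fix fact. \n. if n then n * fact (n - 1) else 1
const fact = ["Fix", "fact", ["Lam", "n",
  ["Pri", "if", [
    ["Var", "n"],
    ["Pri", "mul", [["Var", "n"],
      ["App", ["Var", "fact"], ["Pri", "sub", [["Var", "n"], ["Num", 1]]]]]],
    ["Num", 1]]]]];

// let point = {x: 3, y: fact 4} in point.x + point.y
const program = ["Let", "point",
  ["Map", [["x", ["Num", 3]], ["y", ["App", fact, ["Num", 4]]]]],
  ["Pri", "add", [
    ["Pri", "get", [["Var", "point"], ["Str", "x"]]],
    ["Pri", "get", [["Var", "point"], ["Str", "y"]]]]]];

console.log(evaluate(program)); // 27
```

Note that `program` is ordinary JSON data, so it survives `JSON.stringify` and `JSON.parse` unchanged.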
So, this language is for sure looking optimizable; at least as much as JS, possibly much more. As such, it is clearly practical. We covered expressivity already. Safety comes naturally from its purity: Moon can only access as many system resources as you allow it to. It is at least as fast to compile as JavaScript; probably much faster. As a bonus, it can, before specific compilers are built, use existing JS JIT engines as if they were made for Moon. Finally, Moon is so close to the λ-calculus, we're able to compress it in a way similar to John Tromp's BLC, an extremely compact format designed to pack programs as much as their inherent entropies allow. As such, you can expect Moon bundles to be significantly smaller than JS bundles for equivalent programs, which is very relevant to a browser.
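As an illustration of the BLC idea, the sketch below encodes the pure part of a term (Var, Lam and App only) as a bit string: an abstraction becomes `00`, an application becomes `01`, and a variable becomes its 1-based de Bruijn index written in unary and closed with a `0`. The JSON-array term format and the function name `encodeBLC` are assumptions for this example. How Moon's own compressed format treats Num, Str, Map and the other constructors isn't covered here.

```javascript
// Encode a pure lambda term in the style of binary lambda calculus.
// `scope` lists the binder names in effect, innermost first.
function encodeBLC(term, scope = []) {
  const [tag, a, b] = term;
  switch (tag) {
    case "Lam": // a = parameter name, b = body
      return "00" + encodeBLC(b, [a, ...scope]);
    case "App": // a = function term, b = argument term
      return "01" + encodeBLC(a, scope) + encodeBLC(b, scope);
    case "Var": {
      const index = scope.indexOf(a); // 0 means the innermost binder
      if (index < 0) throw new Error(`free variable: ${a}`);
      return "1".repeat(index + 1) + "0";
    }
    default:
      throw new Error(`not a pure lambda term: ${tag}`);
  }
}

const identity = ["Lam", "x", ["Var", "x"]];
const konst = ["Lam", "x", ["Lam", "y", ["Var", "x"]]];

console.log(encodeBLC(identity)); // 0010
console.log(encodeBLC(konst));    // 0000110
```

The identity function takes 4 bits this way, while its JSON spelling takes a couple of dozen characters, since variable names disappear entirely.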
So, that's it. We have it all. There is no reason to add more primitives, because every other feature can be implemented on the language itself. Moon is, if not perfect, at least very close to what is needed here.
Note that, despite being somewhat related to it, there is nothing tying Moon to JavaScript. Thanks to its simplicity, you could compile and implement Moon anywhere else, easily. You could trivially translate Moon to Haskell and use all the power of the marvelous GHC and get rid of V8/WebKit entirely (in fact, chances are this will be how Mist-Lite will work). You could compile a DSL on it to C and have C-like performance. In a platonic future, you could compile it to Lamping's abstract-algorithm and run arbitrary programs in a massively parallel manner.
tldr


Moon is not a new programming language that you'll have to learn. It is just a small subset of JS which is, I believe, a much more suitable format to pass code around the web.


It was carefully selected to be perfect for its purpose as a code-interchange format: it has extremely compressible bundles, is fast to parse and compile, generates efficient machine code, is 100% safe, etc.


I'm using it instead of JavaScript because JS is a gigantic, defective mess that makes critical optimizations and safety measures impossible.


It is meant to be used like JSON: stringify, parse, pass around the wire, mix with a host language as if it were native.


It will be used on Mist-Lite and will, hopefully, enable it to be mindblowingly faster than existing browsers in several senses.


I didn't "invent" Moon. Really, I just picked a good subset of JS and gave it a name, love and a compiler.


Moon is ridiculously easy. If you know JSON, functions and basic algebra, you know Moon.


You probably hate the syntax I made for it. Sorry about that. Please make a better one. Moon can have many competing syntaxes.


The current ADT is, though, not final and I could've gotten some details wrong.


In an ideal world, we'd use something with even fewer primitives and an expressive type system, e.g., Morte. I don't think this is practical right now for a few reasons.


Yes, I know you love JS, 99% of the web is made with JS and nobody is gonna drop it. See, 99% of my repos are JS. I understand.


Yes, I know I'll probably be the only one using Moon. If it helps me make a great product with great DApps, I'm ok with that.


I could be wrong about all of this and if you have a good reason to believe so, please destroy me (with love).


For illustration, here is a hypothetical, complete counter app in Moon:
```
// counterDApp.moon
// A simple counter application
{
  // Initial state of the App is just 0
  "initialState": 0,

  // The view of the App is just a clickable
  // div with a border showing the counter
  "render": count. 
    style: {"border": "1px solid black"}
    (div {"style": style, "onClick": ["inc"]}
      (numberToString count))

  // When you click the div, the counter increases
  "events": {
    "inc": count. (setState (inc count))
  }

};
```
No HTML, no JS, that's all. More complex behaviors could be achieved with extensible effects. Again, the syntax is unimportant and you could make it look identical to Python, Java or Brainfuck if you wanted to. Also, the final format for DApps will probably be slightly different from that, I guess.

The problem with WASM is that it is not as inspection-friendly as Moon, and, as such, it doesn't mix as well with the host language. Moon is very similar to JSON in the sense that Moon terms can be stringified, parsed, moved around and used like first-class native values. For example, you can do something like Moon.parse("...").apps[0].render to get the render function of the first app of a collection of apps. The browser can make good use of that flexibility and it is not obvious how I'd have that with WASM. Moreover, WASM is not as small as Moon, which can be fully implemented in 200 or so lines of code. Finally, Moon is expressive enough to allow you to write WASM inside it with a proper DSL, letting the browser compile it to the same machine code that WASM would generate, similar to how Haskell optimizes the ST dialect. As such, I now believe WASM isn't quite the solution here, although it can certainly be a part of it.
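As a rough idea of what "inspection-friendly" can buy, the sketch below ships a term as JSON text, pulls out one app's render function, and lists its free variables, which is the kind of check a host could run before deciding what an app is allowed to reference. It does not use the real Moon.parse: plain `JSON.parse` stands in for it, and the JSON-array term format, the `apps` layout and the `freeVars` helper are all assumptions for this example.

```javascript
// Collect the free variables of a term written as [constructor, ...fields].
function freeVars(term, bound = new Set(), out = new Set()) {
  const [tag, a, b, c] = term;
  switch (tag) {
    case "Var":
      if (!bound.has(a)) out.add(a);
      break;
    case "Lam":
    case "Fix": // a = bound name, b = body
      freeVars(b, new Set(bound).add(a), out);
      break;
    case "App":
      freeVars(a, bound, out);
      freeVars(b, bound, out);
      break;
    case "Let": // a = name, b = bound term, c = body
      freeVars(b, bound, out);
      freeVars(c, new Set(bound).add(a), out);
      break;
    case "Pri": // b = list of argument terms
      b.forEach((t) => freeVars(t, bound, out));
      break;
    case "Map": // a = list of [key, term] pairs
      a.forEach(([, t]) => freeVars(t, bound, out));
      break;
    // Num and Str contain no variables.
  }
  return out;
}

// Code arrives over the wire as text, exactly like JSON data.
const wire = JSON.stringify({
  apps: [{ render: ["Lam", "count", ["App", ["Var", "div"], ["Var", "count"]]] }],
});

const render = JSON.parse(wire).apps[0].render;
console.log([...freeVars(render)]); // [ 'div' ]
```

Here the host learns, without running anything, that this render function depends on exactly one outside name, `div`.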