drernie/theory_of_programming.md

## theory_of_programming.md

      
    Raw
  

              theory_of_programming.md
            
          
    Proposed Components for a Simple, Scientific Theory of Programming

Draft 3
cc: @jonathoda  @jamiiecb @jeresig @twleung @lindsey @a_cowley @kmett @ezyang @lambda_calculus
Inspired by The Future Programming Manifesto
Premise

Contemporary programming is complicated because it is the unholy union of a purely mathematical discipline (computer science) and a purely technical discipline (software engineering).  As a result, the languages we use are an idiosyncratic mish-mash of formal concepts and opportunistic hacks.  The end result is ureliable and unreadable software that requires programmers to master endless minituae and arcana in order to achieve the results they want.
Thesis

To solve this problem, we need to go back to first principles.  That is, adopt the scientific attitude of examining the actual practical problems real programs need to solve, then determine the minimal set of concepts necessary to accomplish that task.
Proposal

The following is my first cut at summarizing what those might be. This is list is almost certainly wrong -- but that's what makes it scientific!  The goal is to state our assumptions as clearly as possible so that we can rigorously work out the consequences, compare it against reality, and iterate. Tragically, existing ideas about programming are inextricably tied up with the actual implementations, making it impossible to coherently critique the underlying concepts.

Semantic Model: Stateful Circuits sending Switchable Signals cf. http://www.cs.virginia.edu/~evans/greatworks/shannon38.pdf
Fundamental Primitives: Nil, Name, and Abstraction
Composite Abstractions:  Property List, Closure & Expression
Operations: Creating, Naming & Applying abstractions
Runtime: Assigning and Resolving names to caches while ensuring read/write Access Control
Dynamics: Sound and complete Scope, Dependency & Effect Typing  cf. http://www.cs.jhu.edu/~swaroop/aplas.pdf
Syntax: Expressed via a trivial Data Format rather than a complicated parser cf. http://blog.mongolab.com/2011/03/why-is-json-so-popular-developers-want-out-of-the-syntax-business/

I look forward to your feedback and counter-proposals!
A Note on BitC

cc @geoffreyirving
It is true that Shapiro et al abandoned the approach used by their paper (cited in #6). However, his brilliant post-mortem clearly identifies the four reasons why:


The compilation model.
The insufficiency of the current type system w.r.t. by-reference and
reference types.
The absence of some form of inheritance.
The instance coherence problem.

The first three could be solved by install-time compilation, true ByRef types, and first-class object types.

The first two issues are in my opinion solvable, thought the second
requires a nearly complete re-implementation of the compiler.

The only hard problem is  instance coherence:

The last (instance coherence) does not appear to admit any general solution, and it
raises conceptual concerns about the use of type classes for method
overload in my mind. It's sufficiently important that I'm going to deal
with the first three topics here and take up the last as a separate note.

Which he blames on scoping:

the instance coherence problem exists because the language
design performs method resolution in what amounts to a non-scoped way.

which comes from the way they handle type classes:

At the end of the day, type classes just don't seem to work out very well
as a mechanism for overload resolution without some other form of support.

because:

The type class notion (more precisely: qualified types) is seductive, but
absent a reasonable approach for instance coherence and lexical resolution
it provides an unsatisfactory basis for operator overloading.

In his follow-up paper, he even identifies a possible route for a solution:

The main problem is to switch from an ambient instance resolution process
to a lexical instance resolution process.

This sounds similar to what Rust is doing with coherent and modular type classes.
That is why I claim proper Scoping and Dependency rules (which Lisp never fully embraced) are essential for properly implementing Effect Typing system.  This should also have the side-effect of Right-Time Compilation -- resolving names only and always when you have sufficient information to resolve them unambiguosly (and ensuring you carry around the extra metadata until you do).
The other problem, IMHO, is that the Lisp-inspired syntax made it awkward to properly track effect and constantness everywhere, including the type system:

as the expressiveness of the types becomes too rich, humans
cease to be able to read them

If we can strip away all the other complexity, and require the syntax to only deal with issues of effect and naming, I think we can make it manageable again.
But if I've misunderstood or missed anything, please let me know.  I've been dying to discuss Shapiro's results for years!