luqui/fist-sketch.mkd

## fist-sketch.mkd

      
    Raw
  

              fist-sketch.mkd
            
          
    I find myself wanting an image-based functional environment, like a persistent ghci. I want to use it to track my TODO list, do quick one-off computations, store miscellaneous data and analyze it.  Basically it's an environment where I can store my stuff that has the power of a Turing complete functional language.  Like spencertipping's self-modifying perl scripts, but functional instead.
I'd also like to design it well.  It's kind of a hacky purpose, but I might as well put some thought into its design and apply some of my recent ideas about modularity.  In such an environment everything is constantly changing, and yet it ought to be purely functional, because pure functions are cool.  So the module system has to be able to handle such a dynamic environment -- reusing old code in new contexts.
In an interactive environment, we are not usually designing for modularity.  Instead we just enter expressions, save values away, inspect them, and enter more expressions.  I would like to reuse any value I have given a name in a new context.  Say I wanted a histogram of some data:
<session 2011-01-13>
>>> data = [ ... ]
>>> histogram = Map.assocs . Map.fromListWith (+) . map (,1) $ data

I have written that code once, which required a certain amount of brainpower.  I would like to search for it in my "history", replace data with some new data, and use it to get a histogram of that new data.  The session might look like this:
>>> :search histogram
<2011-01-13> histogram = Map.assocs . Map.fromListWith (+) . map (,1) $ data
>>> histogram d = <2011-01-13>.histogram { data = d }

Now that I am in a place where I am thinking about reuse, I just grabbed that code that I wrote before and promoted it into a proper function.  I would also like to be able to rearrange my namespaces so I can collect and refine useful collections over time.  Also I should be able to merge sessions together so I can work "in parallel" (in serial, but disconnected) and reuse all of that work later.
So essentially I want a good dynamic module system together with a history of everything I have ever done, collected into automatically-named modules.  This seems to draw on the ideas of record-calculus, but in a less puristic light.  Also, I don't think I want types (just yet), so that should make things significantly easier.
I will eschew a global namespace, instead collecting everything by session (which is a type of module).  Modules can be members of modules (sessions), so that is where I will keep my refined modules.  A good search procedure will allow the convenience of a global namespace without the modularity issues.  Eg.
>>> :open AssocList
opening <2011-01-13>.AssocList

It will open the most recent one unless otherwise specified.
Running console commands is also useful, and we'll do that the Haskell way: by making a monad for console commands.   Then an interpreter metacommand can run them:
>>> files = unwords <$> system "ls"
>>> :run files
["foo", "bar", "baz"]

To reuse values we got from the system, we just expand at the interpreter level.  So
>>> :run fs <- files

is exactly equivalent to
>>> fs = ["foo", "bar", "baz"]

even if fs is referenced from a later session in a different directory, fs will still be ["foo", "bar", "baz"].  For better composability, use the monad.  The reason for this is just to maintain purity, so we can reuse code that once depended on an environment in a place where we don't have access to that environment.
The functional core is just like Haskell.  We'll do something about operator precedence and associativity, not exactly sure yet and don't think it's important enough to think about right now.  In general, I think that should be specified at binding time, eg.
>>> \(+)[infixl 4] -> 1+1
>>> (*)[infixl 5] x y = primrec (+ y) 0 x

But numeric precedence levels are annoying and we should find a better way.
That leaves the module system.  We know from Haskell's foibles that record accessors should be first class lenses, so they can be programmatically composed and manipulated.  I am wary of indexing modules with strings, because that leads to a class of dynamic programming that is hard to analyze, and analysis and recombination is what I'm going for: an interactive system that is also capable of creating maintainable code.  Do Foo.foo and Bar.foo really share something in common?  That seems to place too much importance on naming.  But on the other hand, we would like to substitute two implementations of the same module, and in a typeless language names are the only thing that can correlate the modules.
So I think modules should be indexed by symbols, which are basically strings.  foo = 1 defines an entry for 'foo in the current module, and writing 'foo elsewhere is the same symbol.  We can achieve modularity by having a good module manipulation infrastructure, which I think is essentially the interface to Data.Map.
Modules may be opened into a scope.  This lexically resolves symbols to that module, and of course the module itself is abstractable in that scope.  So
>>> :open Map
>>> fromList [(1,x), (2,y)]

is equivalent to
>>> Map = <2011-01-13>.Map
>>> foo = Map.fromList [(1,x), (2,y)]

where x and y are resolved elsewhere.  If you later take foo from this session:
>>> <2011-01-14>.foo { Map = <2011-01-13>.Map `union` { x = 42 } }

then x in foo does not get resolved to the one from the new Map.  So symbols are resolved statically, though the references are abstract.  This is somewhere between late and early binding.
Module opening takes the place of typeclasses in terms of convenience, since in a dynamically typed language we don't have the capability for typeclasses.   So instead of writing:
>>> openFile "foo" >>= bar

and relying on type interface to find the correct monad, you have to specify the monad:
>>> open Shell in openFile "foo" >>= bar

Alas, that's as good as we can do without resorting to runtime ad-hoc polymorphism, which is usually just a notational hack.  This at least gets the semantics right.
This design seems cool to me, but there is a nagging technical consideration which is probably quite important.  How far do normal forms evaluate?  To demonstrate, say you are rearranging some modules:
session <2011-01-14>
>>> Map1 = <2011-01-12>.Map
>>> Map2 = <2011-01-13>.Map
>>> Map = {
>>> fromList = Map1.fromList
>>> fromListWith = Map2.fromListWith
>>> fromListHistogram = fromListWith (+)
>>> }

And then you would like to use fromListHistogram with something substituted for something else.  Maybe it's (+), but maybe it is something in the definition of Map2.fromListWith.  It seems to me that those are different ideas.  In particular, if we inspect Map.fromListHistogram, we might see
>>> Map.fromListHistogram = Map.fromListWith (+)

because we'd like to see things at the appropriate level of abstraction.  How do we "dive in" to fromListWith to substitute, and still reuse the code written in fromListHistogram  (assuming it is not so trivial).  Perhaps we trace the path of construction:
>>> Map { Map2 = <2011-01-14>.Map2 { fromListWith = <2011-01-14>.Map2.fromListWith { foo = bar } } }.fromListHistogram

Man that's a pain.  Oh wait, but we have lenses.  So maybe it's more like:
>>> (modify ['Map2, 'fromListWith, 'foo] bar Map).fromListHistogram

That is not so bad.  It still might be kind of a pain.  But poking around in other people's code is a complex thing to do, so that complexity has to be encoded somewhere.  Hmm, but this code is brittle -- it does not withstand a substitution for a different Map unless that Map was constructed in the same way.  I think code-poking will always have that property.  So maybe this is not such a good idea, and we should flatten the abstraction mechanism a bit.
What if we take from record-calculus and let the module be the fundamental unit of abstraction.  So when you name something it can be substituted.  In the above Map example, I can substitute Map1 and Map2 from the session <2011-01-14>, I can substitute fromList fromListWith and fromListHistogram from <2011-01-14>.Map.  I guess that's kind of what I was saying before.  I guess I mean a module is a concrete entity, so <2011-01-14>.Map is not abstract, but its members are abstract with respect to its other members.  It's more of a psychological distinction: members are the things that are meant to be substituted.  When we are in an interactive development, we will tend to name things that are semantically significant (otherwise the cost of coming up with a name is too great).  And you can do that deep substitution above if you want to... but, bleh.  It sucks.  Don't write sucky code.  Or more like this: you can write sucky code, but don't reuse sucky code.
I think two levels of inspection is sufficient to do most things we would want to do.  Normal form: reduce as much as possible in the current context, for when you want the answer.  Definition form: just get the definition of a symbol as I wrote it before, for when I want to understand or reuse the code.
I think that is enough specification to build a prototype.  Fun :-)