davidad/proposal.txt

## proposal.txt
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Homoiconic Propagator Networks
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

An MMP Mini-Proposal by David Dalrymple
=======================================

============
Who are you?
============

|  David Dalrymple
|  davidad@mit.edu
|  +1 443 668 6698
|  MIT MMP Research Assistant

======
Budget
======

RA support for one year

==============================
Proposed Area of Investigation
==============================

In summary, we propose to develop a model of computation which incorporates the
partial information structures of propagator networks, the parallelization of
RALA, and the reconfigurability of Minsky's critic-selector model, along with
such features as higher-order functions, true macros, algebraic data types,
and other useful means of combination and abstraction.

If fully successful, this programming model will enable the feasible construction
of large artificial-intelligence programs that would have taken tens of man-years
to build in existing sequential programming languages, such as semantically
statistical parsers which handle broad classes of natural language; learning
systems with versatile cross-modal feedback; and vertically integrated control
systems which can appropriately interchange information at the level of leaf
nodes, with central intervention intermittent at best.

In addition, such a model would support the development of bug-free (formally
verifiable) programs for simpler tasks, such as version control systems, or
possibly even web browers.  Finally, it would enable the independent
development of programs which can later be composed by an end-user, with
verifiable results that are robust to, e.g., library version updates.

Computation as a graph
----------------------

We use the word "graph" to refer to a mathematical structure consisting of a
set of objects, with some pairs of objects connected by edges. (Do not
confuse this sense of "graph" with the graph of a function, also known as a
plot.)

We observe that graph structures emerge naturally from the notion of a
computational process in three major ways.

 state graph
   Most clearly, any finite computational system can be modeled as a finite
   state automaton, which consists of a directed graph of states, with edges
   representing the possible transitions from one state to another. (A
   *control flow graph* is closely related to this represenation.)

 connection graph
   A computational medium composed of a number of objects which have some
   connections to one another can clearly be represented as a graph.

 dependency graph
   A computation to be performed on a stream of data can often be decomposed
   into a directed graph, where each object represents a transformation from a
   set of inputs to a set of outputs, and edges represent composition of these
   transforms. (A *dataflow graph* is a type of dependency graph in which the
   values manipulated are typically lazy streams.)

A physical analogy would be that a state graph is primarily timelike, with
the entire spatial extent of the system being compressed to a single point
at each state note; a connection graph is primarily spacelike, with the
entire temporal history of each component compressed to a single point at
its location; and the dependency graph captures aspects of both, without
fully specifying either.

We see that the dependency graph, as distinct from any combination of state
and connection graphs, satisfies the following desirable properties:

1. It captures the computation itself: the mapping from inputs to outputs.

2. It is deterministic: given an input set, there is exactly one possible
   output set.

3. It is parallelizable: without any additional information, it is possible to
   make use of as many processors as available, up to the number of nodes in
   the graph.

We can see that, in general, state graphs are not parallelizable and
connection graphs are not deterministic, but a dependency graph necessarily
entails a separation of concerns with each intermediate value uniquely
determined by the inputs. However, this model lacks two further desirable
properties:

4. It is possible to represent partial information (sufficient for stream
   processing).

5. It is possible for the graph to self-reflect and manipulate itself
   (necessary for scalable data structures, higher-order functions).

The next two sections describe extensions which integrate these properties.

Propagator networks
-------------------

The model of propagator networks, developed by Alexey Radul and Gerry Sussman,
adds property (4). It does this by replacing all edges which contain values by
edges which contain information *about* values, which may be partial
information. In particular, the partial information structure for a given
propagator network consists of a lattice (in the group-theoretic sense, not the
geometric sense): there is a partial ordering on partial values, and for any
two, there exists a unique "least common multiple" and a "greatest common
denominator," and these operations are commutative and associative.

This ensures determinism while enabling more flexibility in the flow of
information. That is, the intermediate *values* - and the final output -
are uniquely determined by the input, but the intermediate partial values
are not. For example, we can now do lazy stream processing in the traditonal
dataflow style: where previously the unique, immutable values would have to
be lists of values, such that nothing is output until all the input is present,
but now, we can do our computations with partial values, representing only the
sequence of input until the current time, and each new datum in the stream
will propagate to the output as additional partial information. Although the
sequence of changes to the partial value at the output is no longer determinstic,
the value represented still is - namely, the unbounded stream of outputs. (The
only possible variation in the sequence of output updates is in the order with
which values in the stream are determined.)

Homoiconic propagator networks
------------------------------

However, without property (5), we cannot represent data structures within the
model (even the partial information structure itself!). Instead, the structures
are defined in an external model, such as Scheme or Java, and used opaquely by
the propagator network. To support dynamic data structures - a must for large,
practical software systems - further investigation is needed.  This represents
the core of the proposal.

The first step toward a self-modifying propagator network is to implement some
graph rewriting functionality, such that the network can at least dynamically
allocate new nodes. There exist many schemes for graph-rewriting abstract
machines, but the primary challenges here are to reconcicle such systems with
the partial information and deterministic parallelism of propagator networks
(as they are generally tied to the standard von Neumann infrastructure).

The bigger question is how such a rewriting propagator network can be
homoiconic: such that the rewriting rules of the network are represented within
the network itself, and can be indefinitely self-reflective.  There is some
existing work on self-hosting graph rewriting systems, but it is in the context
of pure theory (much like the body of cellular automata work before ALA),
rather than anything approaching practical computations.

Throughout this process, we will ground our developments in Minsky's chosen
central problem of MMP: understanding natural-language stories. We will begin
by building simple parsers for context-free grammars (as calculator programs
and the like), and as more fundamental questions are resolved, work up to
natural language processing, changing focus from parse trees to semantic nets.
Because our system will allow the resolution of problems and the addition of
features with extraordinarily low effort, we expect to be able to replicate and
overtake the Katz/Winston system in less than one year with a system based upon
homoiconic propagator networks.

===============================
Intellectual Property Statement
===============================

I hereby agree to the following:

 MMP intellectual property will be managed as part of the CBA intellectual property pool, which provides royalty-free non-exclusive access to its sponsors. There is a two year lockout on licensing to non-sponsors from the date of disclosure, and sponsors may elect to take out a royalty-bearing license that permits sub-licensing and excludes future licensing to non-sponsors at the date of disclosure. MMP rights can co-exist with other access to this work, and inventors will be considered to have the same rights as sponsors to their work.
	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	Homoiconic Propagator Networks
	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

	An MMP Mini-Proposal by David Dalrymple
	=======================================

	============
	Who are you?
	============

	\| David Dalrymple
	\| davidad@mit.edu
	\| +1 443 668 6698
	\| MIT MMP Research Assistant

	======
	Budget
	======

	RA support for one year

	==============================
	Proposed Area of Investigation
	==============================

	In summary, we propose to develop a model of computation which incorporates the
	partial information structures of propagator networks, the parallelization of
	RALA, and the reconfigurability of Minsky's critic-selector model, along with
	such features as higher-order functions, true macros, algebraic data types,
	and other useful means of combination and abstraction.

	If fully successful, this programming model will enable the feasible construction
	of large artificial-intelligence programs that would have taken tens of man-years
	to build in existing sequential programming languages, such as semantically
	statistical parsers which handle broad classes of natural language; learning
	systems with versatile cross-modal feedback; and vertically integrated control
	systems which can appropriately interchange information at the level of leaf
	nodes, with central intervention intermittent at best.

	In addition, such a model would support the development of bug-free (formally
	verifiable) programs for simpler tasks, such as version control systems, or
	possibly even web browers. Finally, it would enable the independent
	development of programs which can later be composed by an end-user, with
	verifiable results that are robust to, e.g., library version updates.

	Computation as a graph
	----------------------

	We use the word "graph" to refer to a mathematical structure consisting of a
	set of objects, with some pairs of objects connected by edges. (Do not
	confuse this sense of "graph" with the graph of a function, also known as a
	plot.)

	We observe that graph structures emerge naturally from the notion of a
	computational process in three major ways.

	state graph
	Most clearly, any finite computational system can be modeled as a finite
	state automaton, which consists of a directed graph of states, with edges
	representing the possible transitions from one state to another. (A
	control flow graph is closely related to this represenation.)

	connection graph
	A computational medium composed of a number of objects which have some
	connections to one another can clearly be represented as a graph.

	dependency graph
	A computation to be performed on a stream of data can often be decomposed
	into a directed graph, where each object represents a transformation from a
	set of inputs to a set of outputs, and edges represent composition of these
	transforms. (A dataflow graph is a type of dependency graph in which the
	values manipulated are typically lazy streams.)

	A physical analogy would be that a state graph is primarily timelike, with
	the entire spatial extent of the system being compressed to a single point
	at each state note; a connection graph is primarily spacelike, with the
	entire temporal history of each component compressed to a single point at
	its location; and the dependency graph captures aspects of both, without
	fully specifying either.

	We see that the dependency graph, as distinct from any combination of state
	and connection graphs, satisfies the following desirable properties:

	1. It captures the computation itself: the mapping from inputs to outputs.

	2. It is deterministic: given an input set, there is exactly one possible
	output set.

	3. It is parallelizable: without any additional information, it is possible to
	make use of as many processors as available, up to the number of nodes in
	the graph.

	We can see that, in general, state graphs are not parallelizable and
	connection graphs are not deterministic, but a dependency graph necessarily
	entails a separation of concerns with each intermediate value uniquely
	determined by the inputs. However, this model lacks two further desirable
	properties:

	4. It is possible to represent partial information (sufficient for stream
	processing).

	5. It is possible for the graph to self-reflect and manipulate itself
	(necessary for scalable data structures, higher-order functions).

	The next two sections describe extensions which integrate these properties.

	Propagator networks
	-------------------

	The model of propagator networks, developed by Alexey Radul and Gerry Sussman,
	adds property (4). It does this by replacing all edges which contain values by
	edges which contain information about values, which may be partial
	information. In particular, the partial information structure for a given
	propagator network consists of a lattice (in the group-theoretic sense, not the
	geometric sense): there is a partial ordering on partial values, and for any
	two, there exists a unique "least common multiple" and a "greatest common
	denominator," and these operations are commutative and associative.

	This ensures determinism while enabling more flexibility in the flow of
	information. That is, the intermediate values - and the final output -
	are uniquely determined by the input, but the intermediate partial values
	are not. For example, we can now do lazy stream processing in the traditonal
	dataflow style: where previously the unique, immutable values would have to
	be lists of values, such that nothing is output until all the input is present,
	but now, we can do our computations with partial values, representing only the
	sequence of input until the current time, and each new datum in the stream
	will propagate to the output as additional partial information. Although the
	sequence of changes to the partial value at the output is no longer determinstic,
	the value represented still is - namely, the unbounded stream of outputs. (The
	only possible variation in the sequence of output updates is in the order with
	which values in the stream are determined.)

	Homoiconic propagator networks
	------------------------------

	However, without property (5), we cannot represent data structures within the
	model (even the partial information structure itself!). Instead, the structures
	are defined in an external model, such as Scheme or Java, and used opaquely by
	the propagator network. To support dynamic data structures - a must for large,
	practical software systems - further investigation is needed. This represents
	the core of the proposal.

	The first step toward a self-modifying propagator network is to implement some
	graph rewriting functionality, such that the network can at least dynamically
	allocate new nodes. There exist many schemes for graph-rewriting abstract
	machines, but the primary challenges here are to reconcicle such systems with
	the partial information and deterministic parallelism of propagator networks
	(as they are generally tied to the standard von Neumann infrastructure).

	The bigger question is how such a rewriting propagator network can be
	homoiconic: such that the rewriting rules of the network are represented within
	the network itself, and can be indefinitely self-reflective. There is some
	existing work on self-hosting graph rewriting systems, but it is in the context
	of pure theory (much like the body of cellular automata work before ALA),
	rather than anything approaching practical computations.

	Throughout this process, we will ground our developments in Minsky's chosen
	central problem of MMP: understanding natural-language stories. We will begin
	by building simple parsers for context-free grammars (as calculator programs
	and the like), and as more fundamental questions are resolved, work up to
	natural language processing, changing focus from parse trees to semantic nets.
	Because our system will allow the resolution of problems and the addition of
	features with extraordinarily low effort, we expect to be able to replicate and
	overtake the Katz/Winston system in less than one year with a system based upon
	homoiconic propagator networks.

	===============================
	Intellectual Property Statement
	===============================

	I hereby agree to the following:

	MMP intellectual property will be managed as part of the CBA intellectual property pool, which provides royalty-free non-exclusive access to its sponsors. There is a two year lockout on licensing to non-sponsors from the date of disclosure, and sponsors may elect to take out a royalty-bearing license that permits sub-licensing and excludes future licensing to non-sponsors at the date of disclosure. MMP rights can co-exist with other access to this work, and inventors will be considered to have the same rights as sponsors to their work.