joeytwiddle/Deriving values

# Derivative Values

Often when programming we want access to values which are actually the result
of calculations on other values.

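(Such values are often called "derived" or "computed" properties.  The
relationships that define them, such as `right = left + width`, are the
"invariants".)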

There are more complex examples, but let's start with a simple one.

We have a rectangle, defined by its `top`, `left`, `width` and `height`.

The interface gives us access to these four properties, but also to three other
derived values:

- `right`
- `bottom`
- `area`

In Java these might be accessed by `getRight()`, `getBottom()` and `getArea()`.

Now if we alter the `width` property, this will affect the values of `right`
and `area`.  There are three techniques we could use to ensure this happens:

1. Calculate the value on-demand whenever it is accessed.

    setWidth(newWidth) => @width = newWidth

    getRight() => @left + @width

    getArea() => @width * @height

2. Store the derived values internally, and update them every time a property
is set:

    setWidth(newWidth) =>
        @width = newWidth
        @right = @left + @width
        @area  = @width * @height

    getArea() => @area

    getRight() => @right

3. Cache the derived values internally.  Invalidate the cache when a relevant
property is set.  When reading a value, recalculate it only if the cache is
invalid.

    setWidth(newWidth) =>
        @width = newWidth
        @right = undefined   # invalidate
        @area  = undefined    # invalidate

    getArea() =>
        @ensureAreaIsCached()
        return @area

    ensureAreaIsCached() =>
        if @area == undefined
            @area = @width * @height

    # getRight() would follow a similar pattern
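
The three techniques could be sketched in Python roughly as follows.  This is only an illustration, not a definitive implementation: the class names are made up for this example, and (like the pseudocode above) only `width` is wired up as a property that triggers updates.

```python
class OnDemandRect:
    """Technique 1: derive the value on every read."""

    def __init__(self, top, left, width, height):
        self.top, self.left = top, left
        self.width, self.height = width, height

    @property
    def right(self):
        return self.left + self.width

    @property
    def area(self):
        return self.width * self.height


class EagerRect:
    """Technique 2: store derived values and refresh them on every write."""

    def __init__(self, top, left, width, height):
        self.top, self.left = top, left
        self._width, self.height = width, height
        self._refresh()

    def _refresh(self):
        self.right = self.left + self._width
        self.area = self._width * self.height

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, new_width):
        self._width = new_width
        self._refresh()


class CachedRect:
    """Technique 3: cache derived values and invalidate them on write."""

    def __init__(self, top, left, width, height):
        self.top, self.left = top, left
        self._width, self.height = width, height
        self._right = None  # None marks an invalid cache entry
        self._area = None

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, new_width):
        self._width = new_width
        self._right = None  # invalidate
        self._area = None   # invalidate

    @property
    def right(self):
        if self._right is None:
            self._right = self.left + self._width
        return self._right

    @property
    def area(self):
        if self._area is None:
            self._area = self._width * self.height
        return self._area
```

All three expose the same interface to the consumer (`rect.width = 6`, then `rect.area`), which is the point: the choice of technique is an implementation detail.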

To extend this to the general case, we should consider a case when calculations
are heavy.  For the sake of this argument, consider that `+` and `*` operations
are very difficult on the target architecture, compared to reading and writing
values, and comparing them.

Each of these techniques is optimal in some circumstances and inefficient in
others.

1. Always recalculating on demand (without caching) is inefficient on every
read, but optimal on every write.  Worst scheme for doing a lot of reads, best
for doing a lot of writes.

2. Always recalculating in advance is optimal on every read, but inefficient on
every write.  Worst scheme when doing a lot of writes before reading.  (E.g.
changing width and height before reading area would recalculate the area once
unnecessarily.)

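3. Caching with invalidation has a slight overhead on both reading and writing
(a validity check on each read, an invalidation on each write), but it never
calculates a value more than once per change.  It could be considered the best
solution in the general case.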

Whichever scheme we choose will be optimal for some tasks, but sub-optimal for
some other tasks.
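
To make the trade-off concrete, here is a small sketch that counts how many multiplications each technique performs for a given workload.  The `CountingRect` class and the `count_multiplications` helper are hypothetical names for this illustration, and the count deliberately ignores the cheap bookkeeping (comparisons and invalidations) that technique 3 adds.

```python
class CountingRect:
    """Tracks the area of a rectangle and counts the multiplications used."""

    def __init__(self, width, height, strategy):
        self.strategy = strategy  # "on_demand", "eager" or "cached"
        self.multiplications = 0
        self._width, self._height = width, height
        self._area = self._multiply() if strategy == "eager" else None

    def _multiply(self):
        self.multiplications += 1
        return self._width * self._height

    def _after_write(self):
        if self.strategy == "eager":
            self._area = self._multiply()  # recalculate in advance
        elif self.strategy == "cached":
            self._area = None              # invalidate only

    def set_width(self, width):
        self._width = width
        self._after_write()

    def set_height(self, height):
        self._height = height
        self._after_write()

    def get_area(self):
        if self.strategy == "on_demand":
            return self._multiply()
        if self._area is None:  # can only happen with "cached"
            self._area = self._multiply()
        return self._area


def count_multiplications(strategy, writes, reads):
    """Perform `writes` width changes followed by `reads` area reads."""
    rect = CountingRect(1, 1, strategy)
    rect.multiplications = 0  # ignore the cost of construction
    for i in range(writes):
        rect.set_width(i + 2)
    for _ in range(reads):
        rect.get_area()
    return rect.multiplications


for strategy in ("on_demand", "eager", "cached"):
    print(strategy,
          count_multiplications(strategy, writes=100, reads=1),
          count_multiplications(strategy, writes=1, reads=100))
```

With 100 writes followed by 1 read, the eager scheme multiplies 100 times and the other two once.  With 1 write followed by 100 reads, the on-demand scheme multiplies 100 times and the other two once.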

So then, what should we do?

Our ideal language would let us express the relationships between assigned
properties and derived values in an abstract way.  At compile time, or runtime,
developers should then be able to request a version of the `Rect` class that
will use one of the specific techniques above.  (It might even be desirable to
ask a class to switch technique at runtime, somewhat like the way the V8 engine
switches between interpreted and optimised code while a program runs.)
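
No such language is assumed here, but the idea can be approximated in Python.  In the sketch below, the relationships are declared once as data, and a hypothetical `make_type` factory (invented for this example) produces a class that uses whichever technique is requested.  It assumes derived values depend only on assigned properties, not on each other.

```python
def make_type(name, fields, derived, strategy):
    """Build a class from one declarative description.

    fields:   names of the directly assigned properties
    derived:  {derived_name: (dependency_names, function)}
    strategy: "on_demand", "eager" or "cached"
    """
    if strategy not in ("on_demand", "eager", "cached"):
        raise ValueError(f"unknown strategy: {strategy}")

    class Generated:
        def __init__(self, **values):
            # Bypass our own __setattr__ to create the backing store.
            object.__setattr__(self, "_store", {f: values[f] for f in fields})
            if strategy == "eager":
                for d in derived:
                    self._store[d] = self._compute(d)

        def _compute(self, d):
            deps, fn = derived[d]
            return fn(*(self._store[dep] for dep in deps))

        def __getattr__(self, attr):
            # Only reached for names not found by normal lookup.
            if attr == "_store" or (attr not in fields and attr not in derived):
                raise AttributeError(attr)
            if attr in derived and strategy == "on_demand":
                return self._compute(attr)
            if attr not in self._store:  # "cached" with an invalidated entry
                self._store[attr] = self._compute(attr)
            return self._store[attr]

        def __setattr__(self, attr, value):
            if attr not in fields:
                raise AttributeError(f"{attr} cannot be assigned")
            self._store[attr] = value
            for d, (deps, _) in derived.items():
                if attr in deps:
                    if strategy == "eager":
                        self._store[d] = self._compute(d)  # update now
                    elif strategy == "cached":
                        self._store.pop(d, None)           # invalidate

    Generated.__name__ = name
    return Generated


# The rectangle is described once...
RECT_FIELDS = ("top", "left", "width", "height")
RECT_DERIVED = {
    "right":  (("left", "width"),   lambda left, width: left + width),
    "bottom": (("top", "height"),   lambda top, height: top + height),
    "area":   (("width", "height"), lambda width, height: width * height),
}

# ...and the developer picks the technique that suits the task.
CachedRect = make_type("Rect", RECT_FIELDS, RECT_DERIVED, "cached")
rect = CachedRect(top=1, left=10, width=4, height=5)
rect.width = 6
print(rect.right, rect.bottom, rect.area)
```

Switching technique at runtime is not covered by this sketch; it would need the instance to carry its strategy rather than having it fixed when the class is built.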


# Bi-directional (and multi-directional) associations

In fact our ideal language would go further, by making associations work both
ways:

- A consumer of the type would be able to set the value of `right` and the
  instance would update to change its value of `left`, keeping `width` the
  same.  (Performing a translation, as would happen if you had set `left`.)

But now we must consider some details:

- There may be a case when the consumer wants to resize the rectangle rather
  than translate it: by setting the value of `right`, the `width` would change
  but `left` would remain unchanged.  (In fact this was already an
  issue/assumption when we allowed `setWidth` on our rectangle earlier!)

- A stronger example would be when the consumer wants to set the area.  Then
  they must also indicate whether the width should adapt to fit, or the height,
  or both, and about which axis the scaling should take place.
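
One way to surface these choices is to make the consumer state them explicitly when setting a derived value.  The sketch below is an assumption about how that might look: the `keep` and `adapt` parameters are invented for this example, and scaling is always anchored at the top-left corner, which sidesteps the question of the axis rather than answering it.

```python
import math
from dataclasses import dataclass


@dataclass
class Rect:
    top: float
    left: float
    width: float
    height: float

    @property
    def right(self):
        return self.left + self.width

    @property
    def area(self):
        return self.width * self.height

    def set_right(self, new_right, keep="width"):
        """keep="width" translates the rectangle; keep="left" resizes it."""
        if keep == "width":
            self.left = new_right - self.width
        elif keep == "left":
            self.width = new_right - self.left
        else:
            raise ValueError(f"unknown keep option: {keep}")

    def set_area(self, new_area, adapt="width"):
        """adapt is "width", "height", or "both" (keeps the aspect ratio)."""
        if adapt == "width":
            self.width = new_area / self.height
        elif adapt == "height":
            self.height = new_area / self.width
        elif adapt == "both":
            scale = math.sqrt(new_area / self.area)
            self.width *= scale
            self.height *= scale
        else:
            raise ValueError(f"unknown adapt option: {adapt}")


rect = Rect(top=0, left=10, width=4, height=5)
rect.set_right(20)               # translate: left becomes 16
rect.set_right(30, keep="left")  # resize: width becomes 14
```

Note that `set_area` fails with a division by zero when the fixed side is zero, which is one small instance of a reverse derivation that has no solution.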

Extending this to more complex situations would require much harder mathematics
when deriving in reverse.  In such cases we might want to use a system like
Maxima or Mathematica to find solutions for us.

In some cases, no solution will exist.  In other cases, a solution might only
be achievable to a specified accuracy, using an iterative approach (e.g.
Newton's method).  Consumers of the class should be made aware of this, and
given options.  (This would suit an interactive programming environment,
although a text-based compiler could emit suppressible warnings to achieve
something similar.)
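
As a rough illustration of the iterative case, the sketch below inverts a forward derivation numerically using Newton's method with a finite-difference slope.  The function name `solve_in_reverse` and its parameters are assumptions made for this example.  The caller chooses the accuracy, and the function raises an error when it cannot find a solution.

```python
def solve_in_reverse(forward, target, guess, tolerance=1e-9, max_steps=50):
    """Find x such that forward(x) is within `tolerance` of `target`.

    Only the forward derivation is supplied; its slope is estimated
    numerically with a central difference.
    """
    x = guess
    for _ in range(max_steps):
        error = forward(x) - target
        if abs(error) <= tolerance:
            return x
        h = 1e-6 * max(1.0, abs(x))
        slope = (forward(x + h) - forward(x - h)) / (2 * h)
        if slope == 0:
            raise ValueError("slope is zero, cannot continue")
        x -= error / slope  # Newton step
    raise ValueError(f"no solution found within {max_steps} steps")


def area_from_width(width):
    """Forward derivation: area of a rectangle with a fixed 2:1 aspect ratio."""
    height = width / 2
    return width * height


# Which width gives an area of 50?  (The exact answer is 10.)
width = solve_in_reverse(area_from_width, target=50, guess=1.0)
print(round(width, 6))

# A negative area has no solution, so the consumer is told about it.
try:
    solve_in_reverse(area_from_width, target=-4, guess=1.0)
except ValueError as problem:
    print("warning:", problem)
```

A symbolic system could of course solve this particular case exactly; the numeric approach is shown only because it still works when no closed form is available.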


# The goal

The goal, which I should perhaps have stated at the start, is DRY (Don't Repeat
Yourself).

We should only have to define once what a rectangle is, and how its properties
relate to each other.

After that, it should be the task of the machine to present the rectangle type
in a form that is most desirable to the developer for a given task.

I can envisage such a system as having a specification language something like
the Z notation ("Zed"), followed by transformation rules that will turn the
specification into a working implementation.