@nothings
Last active August 29, 2015 14:20
Concerning "Modeling Real-World Terrain with Exponentially Distributed Noise"
http://jcgt.org/published/0004/02/01/
Comments by Sean Barrett (sean at nothings dot org, @nothings on twitter)
Table of contents:
1. Proof of "mu" in section 3 loses a negation partway
   through, thus giving an obviously nonsensical result.
2. I'm doubtful that this noise actually gives an exponential
   distribution.
3. Some comments on the conclusion while we're at it.
4. Do we really need to change the noise function?
Discussion:
(1)
The paper claims that any value of mu < 1.1637 is small
enough for mu^i to work, but obviously 1.1636^255
is much larger than 1, not much smaller than 1. Even
the hypothesis that this was once mu^-i doesn't fit:
1.1636^-255 = 1.65979e-17
2^-126 = 1.75494e-37
Those are not even close, so I don't think this was
just a transcription/notation error, but an actual
error in the proof (well, several).
The error in the proof is as follows:
"The smallest normalized floating point value
is 2^-(2^f-2), where f..."
This is correct.
However, in the following step, the negation of the exponent is lost:
"We therefore want mu^(B-1) <= 2^(2^f-2)"
There follow two *separate* errors:
"ensure that mu <= 2^(126/256)"
which should be 2^(126/255), and
"mu < 2^(126/256) < 1.1637"
but 2^(126/256) is 1.40657..., not 1.1637...
Correct calculation is as follows:
mu^(B-1) <= 2^-(2^f-2)
mu <= 2^(-(2^f-2)/(B-1))
mu <= 2^(-126/255)
mu < 0.709995
Of course, if mu doesn't match this constraint, the final
output may still be perfectly acceptable. (The paper doesn't
actually say what mu values were used; the generated images
in the supplementals 1 & 2 don't indicate what mu values are
used; and no attempt to compute the mu from the captured
terrain data is given; thus this paragraph of the paper
is the only actual information provided about mu, but is wrong.)
(2)
The m[] table is constructed by sampling the exponential
function mu^i at regular intervals (and then interpolating
smoothly between them in the noise function).
However, this is not (on the face of it) the same thing
as building a table that when uniformly sampled produces
an exponential distribution. To do that, you need to take
the cumulative probability distribution function, invert
it, and regularly sample that.
Now, exponentials have funny properties (like the derivative
being an exponential), so maybe somehow this works for the
exponential case, but nobody else on the net appears to think
so. And let's cut straight to the supplemental materials:
"Exponential Distribution: To verify the code for generating
exponentially distributed random numbers, compile and run
the program in this folder"
However, this program does *not* use the m[] table approach
described in the previous section as used in the noise. Instead,
this program tests this function:
float ExpRand() {
    static const float scale = 1/log(0.5f*((float)RAND_MAX + 2.0f));
    return -scale * log(0.5f*rand() + 1.0f) + 1.0f;
} //ExpRand
whereas m is effectively computed as
    m[i] = pow(mu, i)
which would correspond roughly to a function like:
float ExpRand???() {
    return pow(mu, a*rand() + b);
}
That is, the correct generator transforms uniformly sampled
numbers into exponentially distributed numbers by transforming
them with log(), whereas the m[] table is effectively seeded
with pow(mu, i), so that function is basically transforming
them with exp(), which is quite different from log()!
Maybe there is some justification for doing it this way; perhaps
the rest of the Internet is wrong and this is a way of producing
exponentially-distributed numbers, but in that case the text
should probably justify it.
My best guess is that we have the following:
A) perlin noise is (essentially) uniformly distributed
B) actual terrain is exponentially-distributed
(low magnitudes are more common than high magnitudes)
C) this paper provides non-uniformly (but not exponentially-distributed) noise
(low magnitudes are more common than high magnitudes)
Thus it is still producing a useful result, and the distribution
is still guided by the results for measuring real-world terrain,
so the results still (somewhat) mimic reality.
It would be good to try an actual exponential distribution and
see if the results are significantly different; maybe they
basically look the same. Also, accurately matching the statistics
of the real world may not be that useful, due to the issue
from the next section.
(3)
"Some interesting open questions remain."
A crucial issue with using summed octaves of noise in generating
terrain is that the octaves are generated _independently_. If
we look at the real world, we observe things like steep cliffs.
A vertical cliff will produce a steep gradient at every octave,
_correlated_. (Well, decaying by half as you go up each octave.)
Presumably the actual processes that produce landforms (erosion
etc) are not working independently in frequency space; we can't
expect to generate cliffs at the rate observed in the real world
just by the summed noise functions happening to align. The question
then is to what degree the statistics observed here reflect
correlated outcomes or uncorrelated outcomes, since the noise
generation will only produce the latter.
Of course, if the program produces "interesting/believable" terrain
this is fine, so I'm not knocking this paper for doing this, but
I do think the open questions raised in this section call for
caution, because if you delve further and further down into fine
statistical details while ignoring the correlation, I suspect
you're unlikely to find much that's useful.
It's possible that even just the core exponential result of this
paper might be explained primarily by correlated, not decorrelated
processes.
Consider a 1D terrain consisting of a step function of height S.
If we use this paper's method for measuring the gradients of the
terrain, we will see:
At highest (1 unit) resolution, a single gradient of size S, the rest 0
At 2-unit resolution, 2 gradients of size S/2
At 4-unit resolution, 4 gradients of size S/4
At 8-unit resolution, 8 gradients of size S/8
At 16-unit resolution, 16 gradients of size S/16
Note that here we see an exponential distribution (although it is
spread across multiple octaves that we actually measure separately).
Still, I hypothesize that some distribution of cliffs of varying
scales and orientations might produce the kind of exponential
gradient statistics observed in the paper, yet we know this is
nothing like what the output of exponential gradient noise is
going to be like, because the latter is all smooth functions with
no correlation between octaves.
(To put it a different way: the output of the program from this
paper is interesting, but does it in any way resemble Utah?)
(4) An implementation note
To what degree does this implementation really produce different
results from something like:
1. compute n = noise()
2. compute m = noise() // offset to be different
3. m = uniform_distribution_to_exponential_distribution(m)
4. return m*n
I.e., do we really need to change the noise function itself?
(I realize this doesn't compute *exactly* the same thing as
when the hash process is inside the noise; the question is
whether it produces something that differs in an important
way.)