@nothings
Last active August 29, 2015 14:20
Concerning "Modeling Real-World Terrain with Exponentially Distributed Noise"
http://jcgt.org/published/0004/02/01/
Comments by Sean Barrett (sean at nothings dot org, @nothings on twitter)
Table of contents:
1. Proof of "mu" in section 3 loses a negation partway
   through, thus giving an obviously nonsensical result.
2. I'm doubtful that this noise actually gives an exponential
   distribution.
3. Some comments on the conclusion while we're at it.
4. Do we really need to change the noise function?
Discussion:
(1)
The paper claims that any value of mu < 1.1637 is small
enough for mu^i to work, but obviously 1.1636^255
is much larger than 1, not much smaller than 1. Even
the hypothesis that this was once mu^-i doesn't fit:
1.1636^-255 = 1.65979e-17
2^-126 = 1.75494e-37
Those are not even close, so I don't think this was
just a transcription/notation error, but an actual
error in the proof (well, several).
The error in the proof is as follows:
"The smallest normalized floating point value
is 2^-(2^f-2), where f..."
This is correct.
However, in the following step, the negation of the exponent is lost:
"We therefore want mu^(B-1) <= 2^(2^f-2)"
There follow two *separate* errors:
"ensure that mu <= 2^(126/256)"
which should be 2^(126/255), and
"mu < 2^(126/256) < 1.1637"
but 2^(126/256) is 1.40657..., not 1.1637...
Correct calculation is as follows:
mu^(B-1) <= 2^-(2^f-2)
mu <= 2^(-(2^f-2)/(B-1))
mu <= 2^(-126/255)
mu < 0.709995
Of course, if mu doesn't match this constraint, the final
output may still be perfectly acceptable. (The paper doesn't
actually say what mu values were used; the generated images
in the supplementals 1 & 2 don't indicate what mu values are
used; and no attempt to compute the mu from the captured
terrain data is given; thus this paragraph of the paper
is the only actual information provided about mu, but is wrong.)
(2)
The m[] table is constructed by sampling the exponential
function mu^i at regular intervals (and then interpolating
smoothly between them in the noise function).
However, this is not (on the face of it) the same thing
as building a table that when uniformly sampled produces
an exponential distribution. To do that, you need to take
the cumulative probability distribution function, invert
it, and regularly sample that.
Now, exponentials have funny properties (like the derivative
being an exponential), so maybe somehow this works for the
exponential case, but nobody else on the net appears to think
so. And let's cut straight to the supplemental materials:
"Exponential Distribution: To verify the code for generating
exponentially distributed random numbers, compile and run
the program in this folder"
However, this program does *not* use the m[] table approach
described in the previous section as used in the noise. Instead,
this program tests this function:
float ExpRand() {
    static const float scale = 1/log(0.5f*((float)RAND_MAX + 2.0f));
    return -scale * log(0.5f*rand() + 1.0f) + 1.0f;
} //ExpRand
whereas m is effectively computed as
    m[i] = pow(mu, i)
which would correspond roughly to a function like:
float ExpRand???() {
    return pow(mu, a*rand() + b);
}
That is, the correct generator transforms uniformly sampled
numbers into exponentially distributed numbers by transforming
them with log(), whereas the m[] table is effectively seeded
with pow(mu, i), so that function is basically transforming
them with exp(), which is quite different from log()!
Maybe there is some justification for doing it this way; perhaps
the rest of the Internet is wrong and this is a way of producing
exponentially-distributed numbers, but in that case the text
should probably justify it.
My best guess is that we have the following:
A) perlin noise is (essentially) uniformly distributed
B) actual terrain is exponentially-distributed
(low magnitudes are more common than high magnitudes)
C) this paper provides non-uniformly (but not exponentially-distributed) noise
(low magnitudes are more common than high magnitudes)
Thus it is still producing a useful result, and the distribution
is still guided by the results for measuring real-world terrain,
so the results still (somewhat) mimic reality.
It would be good to try an actual exponential distribution and
see if the results are significantly different; maybe they
basically look the same. Also, accurately matching the statistics
of the real world may not be that useful, due to the issue
from the next section.
(3)
"Some interesting open questions remain."
A crucial issue with using summed octaves of noise in generating
terrain is that the octaves are generated _independently_. If
we look at the real world, we observe things like steep cliffs.
A vertical cliff will produce a steep gradient at every octave,
_correlated_. (Well, decaying by half as you go up each octave.)
Presumably the actual processes that produce landforms (erosion
etc) are not working independently in frequency space; we can't
expect to generate cliffs at the rate observed in the real world
just by the summed noise functions happening to align. The question
then is to what degree the statistics observed here reflect
correlated outcomes or uncorrelated outcomes, since the noise
generation will only produce the latter.
Of course, if the program produces "interesting/believable" terrain
this is fine, so I'm not knocking this paper for doing this, but
I do think the open questions raised in this section call for
caution, because if you delve further and further down into fine
statistical details while ignoring the correlation, I suspect
you're unlikely to find much that's useful.
It's possible that even just the core exponential result of this
paper might be explained primarily by correlated, not decorrelated
processes.
Consider a 1D terrain consisting of a step function of height S.
If we use this paper's method for measuring the gradients of the
terrain, we will see:
At highest (1 unit) resolution, a single gradient of size S, the rest 0
At 2-unit resolution, 2 gradients of size S/2
At 4-unit resolution, 4 gradients of size S/4
At 8-unit resolution, 8 gradients of size S/8
At 16-unit resolution, 16 gradients of size S/16
Note that here we see an exponential distribution (although it is
spread across multiple octaves that we actually measure separately).
Still, I hypothesize that some distribution of cliffs of varying
scales and orientations might produce the kind of exponential
gradient statistics observed in the paper, yet we know this is
nothing like what the output of exponential gradient noise is
going to be like, because the latter is all smooth functions with
no correlation between octaves.
(To put it a different way: the output of the program from this
paper is interesting, but does it in any way resemble Utah?)
(4) An implementation note
To what degree does this implementation really produce different
results from something like:
1. compute n = noise()
2. compute m = noise() // offset to be different
3. m = uniform_distribution_to_exponential_distribution(m)
4. return m*n
I.e., do we really need to change the noise function itself?
(I realize this doesn't compute *exactly* the same thing as
when the hash process is inside the noise; the question is
whether it produces something that differs in an important
way.)