kriegsman/random8

## random8
In order to evaluate the true randomness of "random8", I generated 64K of random bytes using random8, and ran it through "ent" to analyze the entropy:

  Entropy = 8.000000 bits per byte.

  Optimum compression would reduce the size
  of this 65536 byte file by 0 percent.

  Chi square distribution for 65536 samples is 0.00, and randomly
  would exceed this value more than than 99.99 percent of the times.

  Arithmetic mean value of data bytes is 127.5000 (127.5 = random).
  Monte Carlo value for Pi is 3.343709943 (error 6.43 percent).
  Serial correlation coefficient is 0.083085 (totally uncorrelated = 0.0).

In other words, it has VERY high entropy, and by those measures is almost TOTALLY random.  Howeverm there's some serial-correlation showing up there which I didn't like, and I decided to do what I always like to do: just look at the data with my eyes and brain.  Notice anything non-random about the COLUMNS of data?

  00000000  36 27 dc 65 12 73 58 d1  2e ff 14 7d 8a cb 10 69
  00000010  26 d7 4c 95 02 23 c8 01  1e af 84 ad 7a 7b 80 99
  00000020  16 87 bc c5 f2 d3 38 31  0e 5f f4 dd 6a 2b f0 c9
  00000030  06 37 2c f5 e2 83 a8 61  fe 0f 64 0d 5a db 60 f9
  00000040  f6 e7 9c 25 d2 33 18 91  ee bf d4 3d 4a 8b d0 29
  00000050  e6 97 0c 55 c2 e3 88 c1  de 6f 44 6d 3a 3b 40 59
  00000060  d6 47 7c 85 b2 93 f8 f1  ce 1f b4 9d 2a eb b0 89
  00000070  c6 f7 ec b5 a2 43 68 21  be cf 24 cd 1a 9b 20 b9
  00000080  b6 a7 5c e5 92 f3 d8 51  ae 7f 94 fd 0a 4b 90 e9
  00000090  a6 57 cc 15 82 a3 48 81  9e 2f 04 2d fa fb 00 19
  000000a0  96 07 3c 45 72 53 b8 b1  8e df 74 5d ea ab 70 49
  000000b0  86 b7 ac 75 62 03 28 e1  7e 8f e4 8d da 5b e0 79
  000000c0  76 67 1c a5 52 b3 98 11  6e 3f 54 bd ca 0b 50 a9
  000000d0  66 17 8c d5 42 63 08 41  5e ef c4 ed ba bb c0 d9
  000000e0  56 c7 fc 05 32 13 78 71  4e 9f 34 1d aa 6b 30 09
  000000f0  46 77 6c 35 22 c3 e8 a1  3e 4f a4 4d 9a 1b a0 39

Yeah.  So even if the overall DISTRIBUTION is "totally random",
there's still a strong NONRANDOM pattern in the low bits.


So what to do?  Well, random8 actually uses a 16-bit pseudorandom number generator, and has just been returning the low 8 bits.  Changing it to return the SUM of the low 8 bits and the high 8 bits resulted in MUCH better sequential non-correlation:

  Monte Carlo value for Pi is 3.136055667 (error 0.18 percent).
  Serial correlation coefficient is -0.000140 (totally uncorrelated = 0.0).

Seria correllation was 0.08, and is now 0.0001, which is about a hundred times better.  But what about 'visual inspection'?

  00000000  a1 65 c4 46 19 9f 45 ab  72 3a ac 90 47 43 36 1f
  00000010  74 04 5e c6 15 0b e0 30  87 23 b8 49 65 58 22 3d
  00000020  8b f7 9b 79 15 8b de a8  60 e0 e7 b6 07 02 f3 cf
  00000030  e6 3d 7d 60 19 1e 41 14  fd 70 3b d7 2d 40 a7 d6
  00000040  85 d8 02 7b 21 c6 07 74  5e d5 b2 ac d6 11 40 50
  00000050  68 c7 2b ca 2d 82 31 c8  83 0e 4e 35 04 77 bc 3e
  00000060  8e 0a f9 4e 3d 51 c0 10  6b 1a 0d 73 b6 71 1d a0
  00000070  f9 a0 6a 05 51 35 b2 4d  18 fb f1 64 ec fe 61 76
  00000080  a8 8b 80 f0 69 2d 09 7d  89 b0 f8 09 a6 20 89 c0
  00000090  9b ca 39 0f 85 38 c3 a1  be 38 23 62 e4 d6 96 7f
  000000a0  d2 5c 97 62 a5 58 e2 b9  b7 95 73 6f a6 1f 86 b1
  000000b0  4d 43 98 e9 c9 8c 64 c5  74 c6 e6 31 ec fd 5b 57
  000000c0  0c 7e 3e a5 f0 d3 4b c6  f5 cb 7e a6 b6 6f 13 71
  000000d0  0f 0c 87 94 1c 2f 95 ba  3a a3 39 cf 04 75 b0 ff
  000000e0  56 ef 74 b7 4c 9f 43 a2  43 50 19 ac d6 0e 30 02
  000000f0  e1 26 06 0e 80 23 56 7e  10 d1 1c 3d 2c 3c 95 78

Looks good.  And the way I implemented it introduces only ONE more assembly language instruction, it seems like a good trade-off.


And what about random16?  Well, random16 still has a 'pattern' with the lowest four bits repeating every sixteen calls, but given the scaling built in to random16, that pattern will only START to be present in return values when random16 called with arguments greater than 4095, and then only in the lowest bits.  As always, ideally, you want a larger pool of random bits than you're using at any given moment, and we see that here: random16 has a pool of 16 random bits, and if you use more than 12 of them at a time, you start being able to detect subtle patterns.  However, given the speed of random16 compared to Arduino 'random', I think the tradeoff is still definitely worth it.

If you want to add MORE entropy to the pool without too much overhead, put this line in your 'loop' function:

  random16_add_entropy( random() );

I had been including this in most of my animations to overcome the weaknesses in random8, which I use a ton because of it's speed.  But now that random8 has a hundred time more non-sequential randomness in it, I'm not even sure if I need to.

If you want, for whatever reason, you can always add more entropy to the random pool using random16_add_entropy, and you can get it from whatever source you like: analog inputs, high-resolution timers, or even (interestingly!), the jitter found in low-resolution timers.  See https://sites.google.com/site/astudyofentropy/project-definition/timer-jitter-entropy-sources/entropy-library for some great code and analysis if you want cryptographic-grade entropy from your Arduino.

Anyway, the new random8() code has been pushed to the FastLED 3.1 branch; enjoy, and let me know if you see any significant changes in your animations; random things should be more non-sequentially random now, but with the same overall numeric distribution.

http://imgur.com/K9d57FN
	In order to evaluate the true randomness of "random8", I generated 64K of random bytes using random8, and ran it through "ent" to analyze the entropy:

	Entropy = 8.000000 bits per byte.

	Optimum compression would reduce the size
	of this 65536 byte file by 0 percent.

	Chi square distribution for 65536 samples is 0.00, and randomly
	would exceed this value more than than 99.99 percent of the times.

	Arithmetic mean value of data bytes is 127.5000 (127.5 = random).
	Monte Carlo value for Pi is 3.343709943 (error 6.43 percent).
	Serial correlation coefficient is 0.083085 (totally uncorrelated = 0.0).

	In other words, it has VERY high entropy, and by those measures is almost TOTALLY random. Howeverm there's some serial-correlation showing up there which I didn't like, and I decided to do what I always like to do: just look at the data with my eyes and brain. Notice anything non-random about the COLUMNS of data?

	00000000 36 27 dc 65 12 73 58 d1 2e ff 14 7d 8a cb 10 69
	00000010 26 d7 4c 95 02 23 c8 01 1e af 84 ad 7a 7b 80 99
	00000020 16 87 bc c5 f2 d3 38 31 0e 5f f4 dd 6a 2b f0 c9
	00000030 06 37 2c f5 e2 83 a8 61 fe 0f 64 0d 5a db 60 f9
	00000040 f6 e7 9c 25 d2 33 18 91 ee bf d4 3d 4a 8b d0 29
	00000050 e6 97 0c 55 c2 e3 88 c1 de 6f 44 6d 3a 3b 40 59
	00000060 d6 47 7c 85 b2 93 f8 f1 ce 1f b4 9d 2a eb b0 89
	00000070 c6 f7 ec b5 a2 43 68 21 be cf 24 cd 1a 9b 20 b9
	00000080 b6 a7 5c e5 92 f3 d8 51 ae 7f 94 fd 0a 4b 90 e9
	00000090 a6 57 cc 15 82 a3 48 81 9e 2f 04 2d fa fb 00 19
	000000a0 96 07 3c 45 72 53 b8 b1 8e df 74 5d ea ab 70 49
	000000b0 86 b7 ac 75 62 03 28 e1 7e 8f e4 8d da 5b e0 79
	000000c0 76 67 1c a5 52 b3 98 11 6e 3f 54 bd ca 0b 50 a9
	000000d0 66 17 8c d5 42 63 08 41 5e ef c4 ed ba bb c0 d9
	000000e0 56 c7 fc 05 32 13 78 71 4e 9f 34 1d aa 6b 30 09
	000000f0 46 77 6c 35 22 c3 e8 a1 3e 4f a4 4d 9a 1b a0 39

	Yeah. So even if the overall DISTRIBUTION is "totally random",
	there's still a strong NONRANDOM pattern in the low bits.


	So what to do? Well, random8 actually uses a 16-bit pseudorandom number generator, and has just been returning the low 8 bits. Changing it to return the SUM of the low 8 bits and the high 8 bits resulted in MUCH better sequential non-correlation:

	Monte Carlo value for Pi is 3.136055667 (error 0.18 percent).
	Serial correlation coefficient is -0.000140 (totally uncorrelated = 0.0).

	Seria correllation was 0.08, and is now 0.0001, which is about a hundred times better. But what about 'visual inspection'?

	00000000 a1 65 c4 46 19 9f 45 ab 72 3a ac 90 47 43 36 1f
	00000010 74 04 5e c6 15 0b e0 30 87 23 b8 49 65 58 22 3d
	00000020 8b f7 9b 79 15 8b de a8 60 e0 e7 b6 07 02 f3 cf
	00000030 e6 3d 7d 60 19 1e 41 14 fd 70 3b d7 2d 40 a7 d6
	00000040 85 d8 02 7b 21 c6 07 74 5e d5 b2 ac d6 11 40 50
	00000050 68 c7 2b ca 2d 82 31 c8 83 0e 4e 35 04 77 bc 3e
	00000060 8e 0a f9 4e 3d 51 c0 10 6b 1a 0d 73 b6 71 1d a0
	00000070 f9 a0 6a 05 51 35 b2 4d 18 fb f1 64 ec fe 61 76
	00000080 a8 8b 80 f0 69 2d 09 7d 89 b0 f8 09 a6 20 89 c0
	00000090 9b ca 39 0f 85 38 c3 a1 be 38 23 62 e4 d6 96 7f
	000000a0 d2 5c 97 62 a5 58 e2 b9 b7 95 73 6f a6 1f 86 b1
	000000b0 4d 43 98 e9 c9 8c 64 c5 74 c6 e6 31 ec fd 5b 57
	000000c0 0c 7e 3e a5 f0 d3 4b c6 f5 cb 7e a6 b6 6f 13 71
	000000d0 0f 0c 87 94 1c 2f 95 ba 3a a3 39 cf 04 75 b0 ff
	000000e0 56 ef 74 b7 4c 9f 43 a2 43 50 19 ac d6 0e 30 02
	000000f0 e1 26 06 0e 80 23 56 7e 10 d1 1c 3d 2c 3c 95 78

	Looks good. And the way I implemented it introduces only ONE more assembly language instruction, it seems like a good trade-off.


	And what about random16? Well, random16 still has a 'pattern' with the lowest four bits repeating every sixteen calls, but given the scaling built in to random16, that pattern will only START to be present in return values when random16 called with arguments greater than 4095, and then only in the lowest bits. As always, ideally, you want a larger pool of random bits than you're using at any given moment, and we see that here: random16 has a pool of 16 random bits, and if you use more than 12 of them at a time, you start being able to detect subtle patterns. However, given the speed of random16 compared to Arduino 'random', I think the tradeoff is still definitely worth it.

	If you want to add MORE entropy to the pool without too much overhead, put this line in your 'loop' function:

	random16_add_entropy( random() );

	I had been including this in most of my animations to overcome the weaknesses in random8, which I use a ton because of it's speed. But now that random8 has a hundred time more non-sequential randomness in it, I'm not even sure if I need to.

	If you want, for whatever reason, you can always add more entropy to the random pool using random16_add_entropy, and you can get it from whatever source you like: analog inputs, high-resolution timers, or even (interestingly!), the jitter found in low-resolution timers. See https://sites.google.com/site/astudyofentropy/project-definition/timer-jitter-entropy-sources/entropy-library for some great code and analysis if you want cryptographic-grade entropy from your Arduino.

	Anyway, the new random8() code has been pushed to the FastLED 3.1 branch; enjoy, and let me know if you see any significant changes in your animations; random things should be more non-sequentially random now, but with the same overall numeric distribution.

	http://imgur.com/K9d57FN