Skip to content

Instantly share code, notes, and snippets.

@kriegsman
Last active August 29, 2015 14:09
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save kriegsman/97b55ffa29c86294da50 to your computer and use it in GitHub Desktop.
Save kriegsman/97b55ffa29c86294da50 to your computer and use it in GitHub Desktop.
Analysis of "random8"
In order to evaluate the true randomness of "random8", I generated 64K of random bytes using random8, and ran it through "ent" to analyze the entropy:
Entropy = 8.000000 bits per byte.
Optimum compression would reduce the size
of this 65536 byte file by 0 percent.
Chi square distribution for 65536 samples is 0.00, and randomly
would exceed this value more than than 99.99 percent of the times.
Arithmetic mean value of data bytes is 127.5000 (127.5 = random).
Monte Carlo value for Pi is 3.343709943 (error 6.43 percent).
Serial correlation coefficient is 0.083085 (totally uncorrelated = 0.0).
In other words, it has VERY high entropy, and by those measures is almost TOTALLY random. Howeverm there's some serial-correlation showing up there which I didn't like, and I decided to do what I always like to do: just look at the data with my eyes and brain. Notice anything non-random about the COLUMNS of data?
00000000 36 27 dc 65 12 73 58 d1 2e ff 14 7d 8a cb 10 69
00000010 26 d7 4c 95 02 23 c8 01 1e af 84 ad 7a 7b 80 99
00000020 16 87 bc c5 f2 d3 38 31 0e 5f f4 dd 6a 2b f0 c9
00000030 06 37 2c f5 e2 83 a8 61 fe 0f 64 0d 5a db 60 f9
00000040 f6 e7 9c 25 d2 33 18 91 ee bf d4 3d 4a 8b d0 29
00000050 e6 97 0c 55 c2 e3 88 c1 de 6f 44 6d 3a 3b 40 59
00000060 d6 47 7c 85 b2 93 f8 f1 ce 1f b4 9d 2a eb b0 89
00000070 c6 f7 ec b5 a2 43 68 21 be cf 24 cd 1a 9b 20 b9
00000080 b6 a7 5c e5 92 f3 d8 51 ae 7f 94 fd 0a 4b 90 e9
00000090 a6 57 cc 15 82 a3 48 81 9e 2f 04 2d fa fb 00 19
000000a0 96 07 3c 45 72 53 b8 b1 8e df 74 5d ea ab 70 49
000000b0 86 b7 ac 75 62 03 28 e1 7e 8f e4 8d da 5b e0 79
000000c0 76 67 1c a5 52 b3 98 11 6e 3f 54 bd ca 0b 50 a9
000000d0 66 17 8c d5 42 63 08 41 5e ef c4 ed ba bb c0 d9
000000e0 56 c7 fc 05 32 13 78 71 4e 9f 34 1d aa 6b 30 09
000000f0 46 77 6c 35 22 c3 e8 a1 3e 4f a4 4d 9a 1b a0 39
Yeah. So even if the overall DISTRIBUTION is "totally random",
there's still a strong NONRANDOM pattern in the low bits.
So what to do? Well, random8 actually uses a 16-bit pseudorandom number generator, and has just been returning the low 8 bits. Changing it to return the SUM of the low 8 bits and the high 8 bits resulted in MUCH better sequential non-correlation:
Monte Carlo value for Pi is 3.136055667 (error 0.18 percent).
Serial correlation coefficient is -0.000140 (totally uncorrelated = 0.0).
Seria correllation was 0.08, and is now 0.0001, which is about a hundred times better. But what about 'visual inspection'?
00000000 a1 65 c4 46 19 9f 45 ab 72 3a ac 90 47 43 36 1f
00000010 74 04 5e c6 15 0b e0 30 87 23 b8 49 65 58 22 3d
00000020 8b f7 9b 79 15 8b de a8 60 e0 e7 b6 07 02 f3 cf
00000030 e6 3d 7d 60 19 1e 41 14 fd 70 3b d7 2d 40 a7 d6
00000040 85 d8 02 7b 21 c6 07 74 5e d5 b2 ac d6 11 40 50
00000050 68 c7 2b ca 2d 82 31 c8 83 0e 4e 35 04 77 bc 3e
00000060 8e 0a f9 4e 3d 51 c0 10 6b 1a 0d 73 b6 71 1d a0
00000070 f9 a0 6a 05 51 35 b2 4d 18 fb f1 64 ec fe 61 76
00000080 a8 8b 80 f0 69 2d 09 7d 89 b0 f8 09 a6 20 89 c0
00000090 9b ca 39 0f 85 38 c3 a1 be 38 23 62 e4 d6 96 7f
000000a0 d2 5c 97 62 a5 58 e2 b9 b7 95 73 6f a6 1f 86 b1
000000b0 4d 43 98 e9 c9 8c 64 c5 74 c6 e6 31 ec fd 5b 57
000000c0 0c 7e 3e a5 f0 d3 4b c6 f5 cb 7e a6 b6 6f 13 71
000000d0 0f 0c 87 94 1c 2f 95 ba 3a a3 39 cf 04 75 b0 ff
000000e0 56 ef 74 b7 4c 9f 43 a2 43 50 19 ac d6 0e 30 02
000000f0 e1 26 06 0e 80 23 56 7e 10 d1 1c 3d 2c 3c 95 78
Looks good. And the way I implemented it introduces only ONE more assembly language instruction, it seems like a good trade-off.
And what about random16? Well, random16 still has a 'pattern' with the lowest four bits repeating every sixteen calls, but given the scaling built in to random16, that pattern will only START to be present in return values when random16 called with arguments greater than 4095, and then only in the lowest bits. As always, ideally, you want a larger pool of random bits than you're using at any given moment, and we see that here: random16 has a pool of 16 random bits, and if you use more than 12 of them at a time, you start being able to detect subtle patterns. However, given the speed of random16 compared to Arduino 'random', I think the tradeoff is still definitely worth it.
If you want to add MORE entropy to the pool without too much overhead, put this line in your 'loop' function:
random16_add_entropy( random() );
I had been including this in most of my animations to overcome the weaknesses in random8, which I use a ton because of it's speed. But now that random8 has a hundred time more non-sequential randomness in it, I'm not even sure if I need to.
If you want, for whatever reason, you can always add more entropy to the random pool using random16_add_entropy, and you can get it from whatever source you like: analog inputs, high-resolution timers, or even (interestingly!), the jitter found in low-resolution timers. See https://sites.google.com/site/astudyofentropy/project-definition/timer-jitter-entropy-sources/entropy-library for some great code and analysis if you want cryptographic-grade entropy from your Arduino.
Anyway, the new random8() code has been pushed to the FastLED 3.1 branch; enjoy, and let me know if you see any significant changes in your animations; random things should be more non-sequentially random now, but with the same overall numeric distribution.
http://imgur.com/K9d57FN
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment