Skip to content

Instantly share code, notes, and snippets.

@atoponce
Created July 30, 2018 20:30
Show Gist options
  • Save atoponce/e8cd8cc5ddfa43fea02e0e1ed7940ff0 to your computer and use it in GitHub Desktop.
Save atoponce/e8cd8cc5ddfa43fea02e0e1ed7940ff0 to your computer and use it in GitHub Desktop.
Extracting entropy from mouse movement events

Extracing Entropy From Mouse Movement Events

Here are my findings of entropy extraction estimates from mouse movement events in the browser. Tables below show the results sorted by the minimum entropy extraction. Timing events, keyboard events, and other potential sources of entropy that can be collected from the user are not considered here.

A visual representation of slow, medium, and fast mouse movements can help visualize why the entropy estimation increases as the mouse velocity increases. The recorded data was plotted with Gnuplot as follows:

set term png;
set output 'mouse-slow.png';
set key off;
set xlabel "Browser client x-coordinate";
set ylabel "Browser client y-coordinate";
# the browser window size on my Pinebook- adjust as necessary
set xrange [426:938];
set yrange [10:522];
set term png size 512,512;
plot 'mouse-slow.txt' linetype 1 pointsize .5 linecolor black;

The plots were then drawn with:

$ gnuplot -c plot.gp

At the bottom of this document is an HTML proof-of-concept to record (X, Y) coordinate pairs for your own testing.

Compression As An Entropy Estimator

To estimate the entropy inherent in the mouse movement events, I am using compression to compress the recorded (X, Y) coordinate pairs, using the tightest compression possible for each compression algorithm. The goal is to get an approximation of the minimum entropy as a base to build from.

Per NIST, compression can be used as a good entropy estimator:

The compression estimate, proposed by Hagerty and Draper [HD12], computes the entropy rate of a dataset, based on how much the dataset can be compressed. This estimator is based on the Maurer Universal Statistic [Mau92]. The estimate is computed by generating a dictionary of values, and then computing the average number of samples required to produce an output, based on the dictionary. One advantage of using the Maurer statistic is that there is no assumption of independence. When sequences with dependencies is tested with this statistic, the compression rate is affected (and therefore the entropy), but an entropy estimate is still obtained. A calculation of the Maurer statistic is efficient, as it requires only one pass through the dataset to provide an entropy estimate.

Another advantage of using compression is the availability of various compression algorithms in userspace. This makes it trivial to extract entropy estimates without writing possibly complex tools for calculating collision estimates or lag prediction estimates.

The following command line switches were passed to each compression algorithm:

  • Bzip2: $ bzip --best $file
  • Gzip: $ gzip --best $file
  • LZ4: $ lz4 -9 $file
  • LZMA: $ lzma --best $file
  • LZO: $ lzop --best $file
  • XZ: $ xz --best $file
  • ZPAQ: $ zpaq -a ${file}.zpaq $file -m5
  • lrzip: $ lrz --best $file

As compression algorithms improve, also per the NIST SP800-90B document, it's recommended to reduce the minimum entropy by a factor of 2, just to be safe.

The tables

Because the <div> is 512x512 pixels square, every (X, Y) coordinate pair can fit within 18 bits. This fact was used when calculating how much entropy can be extracted knowing the "bits per byte" result.

Slow movements

From painfully slow mouse movements across the <div> element, trying to get (X, Y) coordinates as sequential as possible, the following entropy extraction estimates are recommended:

  • Min: 2.75-bits
  • NIST: 1.375-bits

The original file size was 7936 bytes. Sorted by compressed bytes:

Comp. Alg. Comp. Bytes Comp. % Ent. Bits/Byte Ent. Extracted
LZMA 1209 84.77 1.22 2.75
XZ 1256 84.17 1.27 2.86
lrzip 1280 83.87 1.29 2.90
ZPAQ 2200 72.28 2.22 5.00
Gzip 2224 71.98 2.24 5.04
BZip2 2366 70.19 2.38 5.36
LZO 3402 57.13 3.43 7.72
LZ4 4771 39.88 4.81 10.82

Medium movements

From moving the mouse around the <div> element as I would normally. No rhyme or reason was made in drawing anything in the browser. As such, the following entropy extraction estimates would be recommended:

  • Min: 5.24-bits
  • NIST: 2.62-bits

The original file size was 7791 bytes. Sorted by compressed bytes:

Comp. Alg. Comp. Bytes Comp. % Ent. Bits/Byte Ent. Extracted
LZMA 2270 70.86 2.33 5.24
XZ 2316 70.27 2.38 5.36
lrzip 2340 69.97 2.40 5.40
BZip2 2831 63.66 2.91 6.55
ZPAQ 3025 61.71 3.06 6.89
Gzip 3426 56.03 3.52 7.92
LZO 4439 43.02 4.56 10.26
LZ4 5964 23.45 6.12 13.77

Fast movements

I moved the mouse across the browser window as insanely fast as possible. No surprise, this approach took the longest to gather all 1,000 events, as most mouse movement was likely happing outside of the <div>, and as such, not getting recorded. These entropy extraction estimates would be:

  • Min: 6.62-bits
  • NIST: 3.31-bits

The original file size was 7775 bytes. Sorted by compressed bytes:

Comp. Alg. Comp. Bytes Comp. % Ent. Bits/Byte Ent. Extracted
BZip2 2862 63.19 2.94 6.62
LZMA 3014 61.23 3.10 6.98
XZ 3060 60.64 3.15 7.09
lrzip 3083 60.35 3.17 7.13
ZPAQ 3423 55.97 3.52 7.92
Gzip 3557 54.25 3.66 8.24
LZO 4537 41.65 4.67 10.51
LZ4 6033 22.41 6.21 13.97

Conclusion

You should run your own tests, and many of them, then record the minimum entropy extraction per event reduced by a factor of 2, per NIST's recommendation. At the bare minimum, you can extract 1-bit per event, also per NIST.

Once collected, run the data through a cryptographically secure key derivation function (KDF) with arbitrary output. Recommended is PBKDF2, scrypt, or Argon2. An extensible output function (XOF) would be sufficient also, such as SHAKE128, SHAKE256, & BLAKE2,

Via my findings, if I were to use 1.375-bits for me extracted estimate per event, then based on the security margin I need in bits, the following table shows the minimum number of event necessary to capture to make that possible:

Security Needed Events Needed
64 47
70 51
80 59
96 70
128 94
224 163
256 187
384 280
512 373
521 379
1024 745
2048 1490
4096 2979
8192 5958

HTML Source Code

Use this to test mouse entropy extraction yourself. The (X, Y) coordinate pairs are logged to the JavaScript console.

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>JavaScript entropy collector</title>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
        <style>
            #ent {
            background-color: white;
            border: 1px solid black;
            height: 512px;
            width: 512px;
            }
        </style>
    </head>
    <body id="body">
        <center>
            <div id="ent"></div>
        </center>
        <script>
            var i=0, limit=1000;
            var logCoords = function(e) { console.log(e.clientX, e.clientY); };
            var ent = document.getElementById("ent");
            var body = document.getElementById("body");
            ent.onmousemove = function(e) {
            // console.log({i, limit}); // debug
            if(++i <= limit) { return logCoords(e); }
            body.style.backgroundColor = "red";
            return (ent.onmousemove = null);
            };
        </script>
    </body>
</html>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment