Skip to content

Instantly share code, notes, and snippets.

@dchest
Created June 20, 2017 17:59
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save dchest/50d52015939a5772497815dcd33a7983 to your computer and use it in GitHub Desktop.
Save dchest/50d52015939a5772497815dcd33a7983 to your computer and use it in GitHub Desktop.
Buzhash with secret key
import { wipe } from "@stablelib/wipe";
/**
* Buzhash implements cyclic polymomial rolling hash function.
* It is a custom developed keyed variant with protections against plain text
* recovery from chunk lengths.
*
* Reading:
*
* http://www.serve.net/buz/Notes.1st.year/HTML/C6/rand.012.html
* http://www.serve.net/buz/hash.adt/java.002.html
* https://en.wikipedia.org/wiki/Rolling_hash#Cyclic_polynomial
*
* Buzhash is used to split data into content-dependent chunks.
*
* To make it difficult to guess plain text by watching split points,
* we apply the following measures using a secret key:
*
* - Substitution table is pseudorandomly permuted.
* - Initial 32-bit state is derived from key.
* - Window size slightly varies depending on key.
* - Digest is scrambled with 32-bit block cipher.
*
* To further enhance protection, it's a good idea to pad chunks
* before encrypting them to hide their original length.
*/
export class Buzhash {
private _initialState: number;
private _state: number;
private _rotn: number;
private _windowSize: number;
private _buf: Uint8Array;
private _bufpos: number;
private _filled = false;
private _table: Uint32Array;
private _hashKey: Uint8Array;
constructor(windowSize: number, key: Uint8Array) {
if (key.length !== 32) {
throw new Error("Buzhash: key must be 32 bytes");
}
// Use first 16 bytes of key to generate table and parameters.
const seed = key.subarray(0, 16);
this._table = generateTable(seed);
// Set the initial state to a pseudorandom value.
this._initialState = scramble(seed, 0xFFFFFF01);
// Pseudorandomly vary window size by ~1/4.
const wmin = windowSize - Math.floor(windowSize / 4);
const wmax = windowSize + Math.ceil(windowSize / 4);
const rnd = scramble(seed, 0xFFFFFF02);
windowSize = rnd % (wmax - wmin + 1) + wmin; // ignoring mod bias
// Make window size divisible by 32.
windowSize = Math.ceil(windowSize / 32) * 32;
// Use last 16 bytes of key as a hash scrambling key.
this._hashKey = new Uint8Array(key.subarray(16, 32));
this._rotn = windowSize % 32;
this._buf = new Uint8Array(windowSize);
this._windowSize = windowSize;
this.reset();
}
reset(): this {
this._state = this._initialState;
this._bufpos = 0;
this._filled = false;
return this;
}
update(b: number) {
if (this._bufpos === this._windowSize) {
this._bufpos = 0;
this._filled = true;
}
let s = this._state << 1 | this._state >>> (32 - 1);
if (this._filled) {
let v = this._table[this._buf[this._bufpos]];
s ^= v << this._rotn | v >>> (32 - this._rotn);
}
this._buf[this._bufpos++] = b;
this._state = (s ^ this._table[b]) >>> 0;
}
digest(): number {
return scramble(this._hashKey, this._state);
}
/**
* Returns true if splitting is needed, that is when the current digest
* reaches the given number of the same consecutive low bits.
*
* (Maximum 32, but usually lower, for example 13).
*/
split(bits: number): boolean {
return (this.digest() & ((1 << bits) - 1)) === ((~0) & ((1 << bits) - 1));
}
clean() {
this._initialState = 0;
this._windowSize = 0;
this.reset();
wipe(this._buf);
wipe(this._table);
wipe(this._hashKey);
}
}
/**
* Generates a 256-number table of pseudorandom 32-bit unsigned integers such
* that every bit position in the table has exactly 128 zeros and 128 ones.
*/
function generateTable(seed: Uint8Array): Uint32Array {
const bits = new Uint8Array(256);
const table = new Uint32Array(256);
let ctr = 1;
// Fill bits table with alternating 0 and 1.
for (let i = 0; i < 256; i++) {
bits[i] = i & 1;
}
// Generate table.
for (let i = 0; i < 32; i++) {
// Permute bits table.
let pi = 0;
while (pi < 256) {
// Generate 4 pseudorandom bytes.
let rnd = scramble(seed, ctr++);
// Take each pseudorandom byte as index
// and swap bit table value at this index.
for (let k = 0; k < 4; k++) {
let pj = rnd & 0xff;
let tmp = bits[pi];
bits[pi] = bits[pj];
bits[pj] = tmp;
rnd >>= 8;
pi++;
}
}
// Set bit in each integer in the table
// according to the value in bits table.
for (let j = 0; j < 256; j++) {
table[j] = (table[j] << 1) | bits[j];
}
}
wipe(bits);
return table;
}
/**
* Scramble encrypts 32-bit number v with 16-byte key k
* and returns 32-bit number.
*
* Based on JP Aumasson's 4-byte block cipher
* https://github.com/veorq/ipcrypt
*/
export function scramble(k: Uint8Array, v: number): number {
let a = (v >>> 0) & 0xff;
let b = (v >>> 8) & 0xff;
let c = (v >>> 16) & 0xff;
let d = (v >>> 24) & 0xff;
a ^= k[0]; b ^= k[1]; c ^= k[2]; d ^= k[3];
// Permute
a = (a + b) & 0xff;
c = (c + d) & 0xff;
b = rotl8(b, 2);
d = rotl8(d, 5);
b ^= a;
d ^= c;
a = rotl8(a, 4);
a = (a + d) & 0xff;
c = (c + b) & 0xff;
b = rotl8(b, 3);
d = rotl8(d, 7);
b ^= c;
d ^= a;
c = rotl8(c, 4);
a ^= k[4]; b ^= k[5]; c ^= k[6]; d ^= k[7];
// Permute
a = (a + b) & 0xff;
c = (c + d) & 0xff;
b = rotl8(b, 2);
d = rotl8(d, 5);
b ^= a;
d ^= c;
a = rotl8(a, 4);
a = (a + d) & 0xff;
c = (c + b) & 0xff;
b = rotl8(b, 3);
d = rotl8(d, 7);
b ^= c;
d ^= a;
c = rotl8(c, 4);
a ^= k[8]; b ^= k[9]; c ^= k[10]; d ^= k[11];
// Permute
a = (a + b) & 0xff;
c = (c + d) & 0xff;
b = rotl8(b, 2);
d = rotl8(d, 5);
b ^= a;
d ^= c;
a = rotl8(a, 4);
a = (a + d) & 0xff;
c = (c + b) & 0xff;
b = rotl8(b, 3);
d = rotl8(d, 7);
b ^= c;
d ^= a;
c = rotl8(c, 4);
a ^= k[12]; b ^= k[13]; c ^= k[14]; d ^= k[15];
return ((a << 0) | (b << 8) | (c << 16) | (d << 24)) >>> 0;
}
function rotl8(x: number, n: number): number {
return ((x << n) & 0xff) | (x >>> (8 - n));
}
@tasn
Copy link

tasn commented Jul 6, 2020

Hey,

Thanks for the gist, though I took a look at it and found something a bit odd.
According to the borg people, window size should not be divisible by 32 because it cancels out the seed, and yet here it is forced to be divisible by 32. Is there any reason why you did it this way?

More info: borgbackup/borg#903

@dchest
Copy link
Author

dchest commented Jul 6, 2020

@tasn hm, unfortunately I don't remember my reasoning for this.

@tasn
Copy link

tasn commented Jul 6, 2020

No worries, thanks!

@tasn
Copy link

tasn commented Jul 7, 2020

For future reference, I'm sure I found a mistake relating to this value:

        windowSize = Math.ceil(windowSize / 32) * 32;

        // ... snip ...

        this._rotn = windowSize % 32;

windowSize is made to always be divisible by 32 so rotn is always 0. This means that rotation never happens. Just FYI.

@dchest
Copy link
Author

dchest commented Jul 7, 2020

Thanks, indeed! 🤦

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment