Skip to content

Instantly share code, notes, and snippets.

@sairion
Last active February 9, 2017 03:42
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save sairion/78431de4a8826dfc387e486946212cb4 to your computer and use it in GitHub Desktop.
Save sairion/78431de4a8826dfc387e486946212cb4 to your computer and use it in GitHub Desktop.
function benchDedupe(inputSize = 10000000) {
var fixtureNumbers = Array.from({ length: inputSize }).map(() => (Math.random() * 10) | 0)
var fixtureStrings = fixtureNumbers.map(e => String(e))
var fixtureObjects = (function(){
var res = [];
for (var i = 0, rando; i < inputSize; i++) {
rando = (Math.random() * 6) | 0
switch(rando) {
case 0: res[i] = new Object(); break;
case 1: res[i] = res[(Math.random() * res.length) | 0]; break;
case 2: res[i] = new Set(); break;
case 3: res[i] = res[(Math.random() * res.length) | 0]; break;
case 4: res[i] = new Array(); break;
case 5: res[i] = res[(Math.random() * res.length) | 0]; break;
}
}
return res
})();
/*
function dedupe1(arr) {
var d = {}
arr.forEach(el => { d[el] = el })
return Object.values(d)
}
// Optimized version from 1
function dedupe1Optimized(arr) {
var d = {}
var res = []
for (var el, i = 0, len = arr.length; i < len; i++) {
el = arr[i]
if (d[el] === undefined) {
res.push(el)
}
d[el] = true
}
return res
}
*/
var OBJECT = 'object'
// Object support version
function dedupe1Object(arr) {
var d = {}
var wm
var res = []
for (var el, i = 0, len = arr.length; i < len; i++) {
el = arr[i]
if (typeof el === OBJECT) {
if (wm == null) wm = new WeakMap()
if (wm.get(el) === undefined) {
res.push(el)
wm.set(el, true)
}
} else if (d[el] === undefined) {
res.push(el)
d[el] = true
}
}
return res
}
function dedupe2(arr) {
return [...new Set(arr)]
}
function run(label, fixture){
console.time(`${label} dedupe1`)
dedupe1Object(fixture)
console.timeEnd(`${label} dedupe1`)
console.time(`${label} dedupe2`)
dedupe2(fixture)
console.timeEnd(`${label} dedupe2`)
console.time(`${label} _.uniq`)
_.uniq(fixture)
console.timeEnd(`${label} _.uniq`)
}
run('fixtureNumbers', fixtureNumbers)
run('fixtureStrings', fixtureStrings)
run('fixtureObjects', fixtureObjects)
// future tests: mixed element types, sparse array, etc...
}
benchDedupe()
/*
fixtureNumbers dedupe1: 88.203ms
fixtureNumbers dedupe2: 1737.279ms
fixtureNumbers _.uniq: 1683.065ms
fixtureStrings dedupe1: 117.545ms
fixtureStrings dedupe2: 941.787ms
fixtureStrings _.uniq: 970.883ms
fixtureObjects dedupe1: 7221.313ms
fixtureObjects dedupe2: 11861.207ms
fixtureObjects _.uniq: 11418.789ms
*/
@sairion
Copy link
Author

sairion commented Feb 9, 2017

After seeing https://twitter.com/DaniStefanovic/status/828499976745545728 I wrote a basic dedupe function (which does a bit different, but fits for many use cases. This is not a strict benchmark or not saying my function is best approach, just seeing how much each approaches can be fast) and wrote very simple benchmark for similar ones.

After seeing https://twitter.com/jdalton/status/828759221562859521 I added _.uniq too

It seems Set() approach gets much more boost when elements are just string. But when they are just number, it gets critically slower, which is a bit odd.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment