-
-
Save ralphcrisostomo/3141412 to your computer and use it in GitHub Desktop.
/** | |
Problem: | |
You have a javascript array that likely has some duplicate values and you would like a count of those values. | |
Solution: | |
Try this schnippet out. | |
*/ | |
function compressArray(original) { | |
var compressed = []; | |
// make a copy of the input array | |
var copy = original.slice(0); | |
// first loop goes over every element | |
for (var i = 0; i < original.length; i++) { | |
var myCount = 0; | |
// loop over every element in the copy and see if it's the same | |
for (var w = 0; w < copy.length; w++) { | |
if (original[i] == copy[w]) { | |
// increase amount of times duplicate is found | |
myCount++; | |
// sets item to undefined | |
delete copy[w]; | |
} | |
} | |
if (myCount > 0) { | |
var a = new Object(); | |
a.value = original[i]; | |
a.count = myCount; | |
compressed.push(a); | |
} | |
} | |
return compressed; | |
}; | |
// It should go something like this: | |
var testArray = new Array("dog", "dog", "cat", "buffalo", "wolf", "cat", "tiger", "cat"); | |
var newArray = compressArray(testArray); | |
/* | |
console: [ | |
Object { value="dog", count=2}, | |
Object { value="cat", count=3}, | |
Object { value="buffalo", count=1}, | |
Object { value="wolf", count=1}, | |
Object { value="tiger", count=1} | |
] | |
*/ |
If anyone wants a tiny one-liner that does the same exact thing:
arr.reduce((b,c)=>((b[b.findIndex(d=>d.el===c)]||b[b.push({el:c,count:0})-1]).count++,b),[]);
Example usage:
Example #1
[5, 5, 2, 9, 4, 2].reduce((b,c)=>((b[b.findIndex(d=>d.el===c)]||b[b.push({el:c,count:0})-1]).count++,b),[]);
Should return: [{el: 5, count: 2}, {el: 2, count: 2}, {el: 9, count: 1}, {el: 4, count: 1}]
Example #2
let arr =["dog", "dog", "cat", "buffalo", "wolf", "cat", "tiger", "cat"];
arr.reduce((b,c)=>((b[b.findIndex(d=>d.el===c)]||b[b.push({el:c,count:0})-1]).count++,b),[]);
Should return: [{el: "dog", count: 2}, {el: "cat", count: 3}, {el: "buffalo", count: 1}, {el: "wolf", count: 1}, {el: "tiger", count: 1}]
Bonus: For anyone curious about how it works (and/or if you want to improve it)
Here's the one-liner broken-down into something a bit more readable (before changing variable names and changing it to a single line arrow function expression and with comments explaining it):
array.reduce((accumulator, currentValue) => {
(
// Get the element in the accumulator array with the same value as the current value, or if none found it'll result in a falsey value which will trigger a short-circuit evaluation using the ||
accumulator[accumulator.findIndex(item => item.value === currentValue)]
||
// Array.push returns the length of the new array
// So, a[a.push(foo) - 1] is like saying: push foo to a, and then retrieve that new element (or the element one less then the length of the newly updated array, which is our new element since push always pushes to the end of the array)
accumulator[accumulator.push({value: currentValue, count: 0}) - 1]
).count++; // Now take the result of the expression in the parenthesis (which is one way or the other an element in the accumulator array), and increment its count property
// Return the accumulator (reduce makes you return the accumulator)
return accumulator;
}, [])
On a high level, the above is essentially either finding an existing entry in the accumulator array, and then incrementing the counter property, OR it's short-circuiting to the next expression which will make a brand new entry in the accumulator with count 0, and then increment that count, this way we combine the two expressions and reduce the need for any conditionals.
The second I posted my previous comment, an even better idea popped into my head: use JS Maps! This makes everything much more efficient (perhaps even more efficient than the original gist), and makes it even smaller:
arr.reduce((a,b)=>a.set(b,a.get(b)+1||1),new Map)
Compared to the previous one-liner:
arr.reduce((b,c)=>((b[b.findIndex(d=>d.el===c)]||b[b.push({el:c,count:0})-1]).count++,b),[]);
The only difference that this new one does not return an array of objects, but a Map
datastucture. For almost all use cases this shouldn't be an issue, since Maps
are supper easy to work with and even itterate.
The two huge advantages of using this new version are:
a) readability
b) performance
Honestly, the previous version was completely unreadable. This version actually makes a lot of sense. Also, Maps are WAY more efficient since instead of iterating through the accumulator array every time, we simply rely on maps to do that for us using very efficient algorithms and data structures under the hood.
Example usage:
Example #1
[5, 5, 2, 9, 4, 2].reduce((a,b)=>a.set(b,a.get(b)+1||1),new Map)
Should return: Map {5 => 2, 2 => 2, 9 => 1, 4 => 1}
Example #2
let arr =["dog", "dog", "cat", "buffalo", "wolf", "cat", "tiger", "cat"];
arr.reduce((a,b)=>a.set(b,a.get(b)+1||1),new Map)
Should return: Map {"dog" => 2, "cat" => 3, "buffalo" => 1, "wolf" => 1, "tiger" => 1}
Here's a non-minified version
array.reduce((countsMap, item) => countsMap.set(item, countsMap.get(item) + 1 || 1), new Map())
From this:
function compressArray(original) {
var compressed = [];
// make a copy of the input array
var copy = original.slice(0);
// first loop goes over every element
for (var i = 0; i < original.length; i++) {
var myCount = 0;
// loop over every element in the copy and see if it's the same
for (var w = 0; w < copy.length; w++) {
if (original[i] == copy[w]) {
// increase amount of times duplicate is found
myCount++;
// sets item to undefined
delete copy[w];
}
}
if (myCount > 0) {
var a = new Object();
a.value = original[i];
a.count = myCount;
compressed.push(a);
}
}
return compressed;
}
To this:
function compressArray(original) {
return array.reduce((countsMap, item) => countsMap.set(item, countsMap.get(item) + 1 || 1), new Map())
}
It's smaller, more efficient, and arguably just as readable 😃 😇
Also, after some further testing and analysis, I found that both of my ideas are significantly faster than the original gist.
I ran all three on an array of 100,000 random numbers between 0-7
my map idea: 1.974 ms
my original idea: 9.172 ms
gist idea: 16.987644 SECONDS!!!
The performance difference was staggering...
@davidnagli ty man.
//THIS IS JUST FOR FUN, ALTHOUGH IT IS WORKING AWESOMELY!!
let bag={};
["NA","NA","NA","NA","NA","NA","NA","NA","NA","BATMAN"].forEach((item)=>{
bag[item] = bag[item]?++bag[item]:1;
})
return bag;
Hi,
I'm using the original counter up there: array_dupplicate_counter.js
The script produce an object not an array.
I'd like to sort the object based on the value property not on the count property.
I tried many things with no avail.
Is someone know how to sort that kind of object?
Thanks
François
great work
How to use this code in typescript ?? I am unable to use as it is.
Please help me....
Thanks for the work :D ! Handy and works great!
Typescript version:
let compressArray = (original: Array) => {
let compressed = [];
// make a copy of the input array
let copy = original.slice(0);
// first loop goes over every element
for (let i = 0; i < original.length; i++) {
let myCount = 0;
// loop over every element in the copy and see if it's the same
for (let w = 0; w < copy.length; w++) {
if (original[i] == copy[w]) {
// increase amount of times duplicate is found
myCount++;
// sets item to undefined
delete copy[w];
}
}
if (myCount > 0) {
let a = {};
a['value'] = original[i];
a['count'] = myCount;
compressed.push(a);
}
}
return compressed;
};
@davidnagli thanks a bunch man! I learned quite a few things from you today.
thanks for both author and @davidnagli
God bless you :)
Very useful!
Thanks both of you @ralphcrisostomo
and @davidnagli for the gain-performance implementation and the explanation bonus!
function countDup(arr = []) {
const set = new Set(arr);
const uniqueArr = Array.from(set);
if (uniqueArr.length === arr.length) return;
const map = new Map();
uniqueArr.forEach((uniqueVal) => {
let n = 0;
arr.forEach((val) => {
if (val === uniqueVal) {
n++;
}
});
map.set(uniqueVal, n);
});
console.log('newMap ', map);
for (const [key, val] of map) {
console.log(key, val);
}
}
array = ['Apples', 'Oranges', 'Apples', 'Strawberries', 'Apples'];
For a one liner to get unique values and their counts and put them in a dictionary :
reducedArray = array.reduce((acc, datum) => {
let count = acc[datum];
acc[datum] = count === undefined ? 1 : ++count;
return acc;
}, {});
acc = accumulator that indexes according to datum
and returns a key value pair with key as datum and count as the number of times the datum appears in the array.
o/p: (try console.log(reducedArray))
Apples: 3
Oranges: 1
Strawberries: 1
array = ['abc', '123', 'abc', 'xyz', 'abc'];
For a one liner using reduce to count unique distinct values and put them in a dictionary:
reducedArray= array.reduce((acc, datum) => {
let count= acc[datum];
acc[datum] = count === undefined ? 1 : ++count;
return acc;
}, {});
acc: indexes the dictionary using datum and increments the count if it finds a duplicate
console:
abc: 3
123: 1
xyz: 1
@davidnagli Thanks Buddy.. it really helped me with just one line code change.
appreciated
Thanks for this @ralphcrisostomo! I implemented one change, though. I needed to use data repeatedly in different ways. The "delete copy[w]" on line 27 line really threw my data off. I changed it for "i++" and was successful in keeping my data intact. Thanks again!!
Had to search high and low for this solution, every tutorial and answer before was only outputting in the console. You are a legend! Thanks you!
@williamlsh those first three lines can be reduced to one via ES6 spread syntax!
let uniqueArr = [...new Set(arr)];
Thank you!
Thanks man, You saved me