@lsauer
Created September 16, 2011 06:09
JavaScript word frequency counter - word histogram
//l.sauer 2011, public domain
//returns a hash table with the word as index and frequency as value; good for SVG / canvas plotting or other experiments
//[:punct:] Punctuation symbols . , " ' ? ! ; : # $ % & ( ) * + - / < > = @ [ ] \ ^ _ { } | ~
var wordcnt = function(id){
  //split the element's text on whitespace, digits and the punctuation in the character class
  var hist = {}, words = document.getElementById(id).innerText.split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/);
  //declare i with var so it does not leak into the global scope
  for (var i in words) {
    //skip empty strings and single characters
    if (words[i].length > 1)
      hist[words[i]] = (hist[words[i]] || 0) + 1;
  }
  return hist;
};
wordcnt('res'); //id of the element, e.g. res is the div containing the results of a Google search
----
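//Solution in one continuous line of code; text is the input string, the counts accumulate in words, and map() itself only returns an array of undefined:
var words = {}; text.split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/).map( function(k){ words[k] = (words[k] || 0) + 1; } );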
@lsauer
Author

lsauer commented Sep 16, 2011

My original interest was in writing a word-counter function as one continuous line of JS code, which is somewhat possible with Array.filter and Array.map. In the end, however, the closures passed to those methods disqualify the result from being considered continuous.
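
For illustration, the filter-based idea might be sketched as a single chained expression. This is only a sketch under assumptions: `countWords` is a made-up name, `reduce` is swapped in for `map` so that the expression itself returns the histogram, and the separator regex is copied from the gist.

```javascript
// Hypothetical helper, not part of the gist.
var countWords = function (text) {
  return text
    // same separator class as the gist
    .split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/)
    // same rule as the gist: drop empty strings and single characters
    .filter(function (w) { return w.length > 1; })
    // fold the remaining words into a prototype-free object used as the histogram
    .reduce(function (hist, w) {
      hist[w] = (hist[w] || 0) + 1;
      return hist;
    }, Object.create(null));
};

var counts = countWords('to be or not to be');
console.log(counts.to, counts.be, counts.or, counts.not); // 2 2 1 1
```

Starting from `Object.create(null)` rather than `{}` is a design choice: words such as "constructor" then cannot collide with inherited properties.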

@lsauer
Author

lsauer commented Oct 6, 2011

I just figured it out ...

@malavbhavsar

Please update this to

for(var i in  words)

because of this reason.

This gist is the first result for "javascript word frequency" on Google. You don't want newbies making mistakes.
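
A small sketch of what the `var` changes, assuming no global named `i` exists already. The function names `loopWithoutVar` and `loopWithVar` are invented for this example. In strict mode, assigning to the undeclared loop variable throws; in sloppy mode the same loop would silently create a global `i` instead.

```javascript
// Hypothetical demo functions, not part of the gist.
function loopWithoutVar(words) {
  'use strict';
  try {
    for (i in words) { /* loop body is irrelevant here */ }
    return 'ok';
  } catch (err) {
    return err.name; // strict mode rejects assignment to an undeclared variable
  }
}

function loopWithVar(words) {
  'use strict';
  var keys = [];
  for (var i in words) { keys.push(i); } // i is local to this function
  return keys;
}

console.log(loopWithoutVar(['a', 'b'])); // ReferenceError
console.log(loopWithVar(['a', 'b']));    // the keys, as strings: '0' and '1'
```

Note that `for...in` visits the array's keys as strings, which is why the gist indexes back into `words` with `words[i]`.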

@JonasNo

JonasNo commented Feb 8, 2017

This code skips single-letter words such as "I" and "a", is case-sensitive, etc. Not good.
The one-liner doesn't even work.

Example 1:

(function(){
  var hist = {}, words = 'I\'m I ice bucket I iPhone is overpriced garbage throw in a bucket Ice'.split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/)
  for( var i in  words)
    if(words[i].length >1 )
      hist[words[i]] ? hist[words[i]]+=1 : hist[words[i]]=1;
  return hist;
})();

Result 1:
{I'm: 1, Ice: 1, bucket: 2, garbage: 1, iPhone: 1, ice: 1, in: 1, is: 1, overpriced: 1, throw: 1}

Example 2:
'I\'m I ice bucket I iPhone is overpriced garbage throw in a bucket Ice'.split(/[\s*\.*\,\;\+?\#\|:\-\/\\\[\]\(\)\{\}$%&0-9*]/).map( function(k,v){ words||(words={});words[k]++||(words[k]=1); } )

Result 2:
[undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined]

Tested in Chrome (stable, Version 56.0.2924.87)
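
One possible way to address both complaints, offered as a sketch rather than a fix to the gist: lowercase the text first and keep every token, including single letters. The name `wordFrequency` is made up, and the pattern assumes plain ASCII words in which an apostrophe counts as part of the word.

```javascript
// Hypothetical variant, not part of the gist.
function wordFrequency(text) {
  var hist = Object.create(null); // no inherited keys to collide with words
  // Assumption: a word is a run of ASCII letters and apostrophes.
  var words = text.toLowerCase().match(/[a-z']+/g) || [];
  for (var n = 0; n < words.length; n++) {
    hist[words[n]] = (hist[words[n]] || 0) + 1;
  }
  return hist;
}

var freq = wordFrequency("I'm I ice bucket I iPhone is overpriced garbage throw in a bucket Ice");
console.log(freq.i, freq.ice, freq.a, freq["i'm"]); // 2 2 1 1
```

Text with accented or non-Latin letters would need a different pattern, so treat the character class as a placeholder.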
