American Soundex

American Soundex in 136 bytes (down from 177)

We did it! Thanks to snowlord and especially to p01 for pointing me in the right direction.

/**
@param s the string you want to get the soundex value of
@param i placeholder for counter
@param j placeholder for ordinal value of char
@param r placeholder for string concatenation
*/
function(s,i,j,r){
//Init i with 0 and r with the first character of s.
//Iterate over the remaining characters of s, starting from the second one.
//Read the char code of the current character into j; the loop ends when charCodeAt returns NaN past the end of s.
for(r=s[i=0];j=s.charCodeAt(++i);)
//Append to r
r +=
//the digit for the current character ('b' has char code 98, so j-98 indexes the lookup string), but only if the character differs from the previous character.
s[i] != s[i-1] && +'1230120022455012623010202'[j-98]
//Otherwise (repeated character, digit 0, or no entry in the lookup string) append the empty string.
|| '';
return(r+'000').slice(0,4)
}
function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)}
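For stepping through in a debugger, the golfed function can be written out with named variables. This is only a sketch that should behave the same way on the test inputs; the names `CODES` and `soundexReadable` are illustrative and not part of the gist.

```javascript
// Lookup string from the gist: CODES[k] is the digit for the letter
// with char code 98 + k, i.e. 'b' through 'z'. 'a' has no entry.
var CODES = '1230120022455012623010202';

// Illustrative expansion of the 136-byte function (hypothetical name).
function soundexReadable(s) {
  var r = s[0]; // keep the first character as-is
  for (var i = 1; i < s.length; i++) {
    // Number(undefined) is NaN, so letters outside 'b'..'z' are falsy here.
    var digit = Number(CODES[s.charCodeAt(i) - 98]);
    // Skip repeated characters and letters coded 0 (vowels, h, w, y).
    if (s[i] !== s[i - 1] && digit) r += digit;
  }
  // Pad with zeros and cut to one letter plus three digits.
  return (r + '000').slice(0, 4);
}

console.log(soundexReadable('Robert'));     // R163
console.log(soundexReadable('Anna'));       // A500
console.log(soundexReadable('Bananarama')); // B556
```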
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
Copyright (C) 2011 YOUR_NAME_HERE <YOUR_URL_HERE>
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. You just DO WHAT THE FUCK YOU WANT TO.
{
"name": "americanSoundex",
"description": "An implementation of the American Soundex algorithm.",
"keywords": [
"soundex",
"american"
]
}
<!DOCTYPE html>
<title>American Soundex</title>
<div>Expected value: <b>R163, A500, B556</b></div>
<div>Actual value: <b id="ret"></b></div>
<script>
// write a small example that shows off the API for your example
// and tests it in one fell swoop.
var myFunction = function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)}
document.getElementById( "ret" ).innerHTML = [myFunction('Robert'), myFunction('Anna'), myFunction('Bananarama')].join(', ');
</script>

peterjaric commented Oct 3, 2011

I just tried your example ('Robert'), but I think I managed to remove 2 bytes (haha, not quite enough) by moving the regex object into the for loop:

var myFunction = function(a,b,c){for(c in b={aehiouwy:"",bfpv:1,cgjkqsxz:2,dt:3,l:4,mn:5,r:6,"]|(\d)\1+|[":"$1"})a=a[0]+a.substr(1).replace(RegExp("["+c+"]","g"),b[c])+0;return a.substr(0,4)};

Owner Author

Prinzhorn commented Oct 3, 2011

Good idea. Will commit.
But you removed one of the two backslashes in the regex ("\d" and "\1"). They are needed. I will add a test case for that.

Edit: I guess GitHub removed them, just as in my comment. Maybe we should use the appropriate Markdown for code in future.


p01 commented Oct 3, 2011

164 bytes using a LUS ( Look Up String :p )

function(s,i,j,r){s=s.toUpperCase();for(r=s[i=0];j=s.charCodeAt(++i);)r+=+'1230120022455012623010202'[j-66]||'';return(r.replace(/(\d)\1+/g,'$1')+'000').slice(0,4)}

Prinzhorn commented Oct 3, 2011

Looks interesting.
Maybe you should fork my gist so we can golf on two courses, because both approaches seem fundamentally different. And don't forget the "annotated.js" file :-D

Edit: One more thing. The algorithm says "Two adjacent letters with the same number are coded as a single number.". I thought "555" should get "5" but obviously "55" is correct. So we both can strip the plus sign in our regex.

Edit2: LUS ftw!
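The quoted rule is about adjacent letters that share a number, not adjacent identical characters. Below is a minimal sketch of that reading, assuming lowercase input after the first letter. The function name `soundexByCode` is hypothetical. It collapses runs of any length, which is how Soundex is usually described ('Jackson' is commonly coded J250), and it seeds the comparison with the first letter's own digit. The special handling of letters separated by h or w is left out.

```javascript
var CODES = '1230120022455012623010202'; // digits for 'b'..'z', as in the gist

// Hypothetical variant: a letter is skipped when its digit equals the
// digit of the letter directly before it.
function soundexByCode(s) {
  var r = s[0];
  // Digit of the first letter, or 0 if it has none (e.g. 'a').
  var prev = Number(CODES[s.toLowerCase().charCodeAt(0) - 98]) || 0;
  for (var i = 1; i < s.length; i++) {
    var digit = Number(CODES[s.charCodeAt(i) - 98]) || 0;
    if (digit && digit !== prev) r += digit;
    prev = digit; // a vowel resets prev to 0, so 'nan' yields two 5s
  }
  return (r + '000').slice(0, 4);
}

console.log(soundexByCode('Jackson'));    // J250 (c, k and s all map to 2)
console.log(soundexByCode('Pfister'));    // P236 (f shares its digit with P)
console.log(soundexByCode('Bananarama')); // B556
```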


Prinzhorn commented Oct 3, 2011

What was the idea behind "toUpperCase"? Remove it and subtract 98 instead and BAM 146 bytes.


p01 commented Oct 3, 2011

I wanted to make my function case insensitive but, yes this is Spa^W140bytes and surely I can get away with that. Thanks
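For anyone who needs case insensitivity outside the byte limit, one option is to normalize the input before calling the golfed function. A sketch follows. The wrapper name `soundexAnyCase` is made up for illustration; `soundex` is the 136-byte function from the gist assigned to a variable.

```javascript
// The 136-byte function from the gist, unchanged.
var soundex = function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)};

// Hypothetical wrapper: upper-case the first letter, lower-case the rest,
// because the lookup offset 98 is the char code of lowercase 'b'.
function soundexAnyCase(s) {
  return soundex(s[0].toUpperCase() + s.slice(1).toLowerCase());
}

console.log(soundex('ROBERT'));        // R000 (uppercase letters miss the lookup string)
console.log(soundexAnyCase('ROBERT')); // R163
console.log(soundexAnyCase('robert')); // R163
```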


Prinzhorn commented Oct 3, 2011

136 bytes

function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)}

p01 commented Oct 3, 2011

:) That was fast! Nice move getting rid of the replace(...)
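Worth noting for anyone reusing either version: dropping the replace(...) saves bytes, but the two functions do not agree on every input. The regex version collapses equal digits after vowels have already been dropped, while the 136-byte version only skips a character identical to the one before it. Here is a quick comparison with both functions copied verbatim from the comments above; the names `byRegex` and `byChar` are just labels for this sketch.

```javascript
// p01's 164-byte version (dedupes digits with a regex).
var byRegex = function(s,i,j,r){s=s.toUpperCase();for(r=s[i=0];j=s.charCodeAt(++i);)r+=+'1230120022455012623010202'[j-66]||'';return(r.replace(/(\d)\1+/g,'$1')+'000').slice(0,4)};

// The 136-byte version (skips repeated characters).
var byChar = function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)};

console.log(byRegex('Robert'), byChar('Robert'));         // R163 R163
console.log(byRegex('Jackson'), byChar('Jackson'));       // J250 J222
console.log(byRegex('Bananarama'), byChar('Bananarama')); // B565 B556
```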


peterjaric commented Oct 4, 2011

A little too late (things have moved on, I see), but for the record: I did not intend to remove the backslashes. It was probably a copy-and-paste error. Sorry about that!


sylvinus commented Oct 5, 2011

I got it down to 133 : https://gist.github.com/1263293
