@Prinzhorn
Forked from 140bytes/LICENSE.txt
Created October 3, 2011 08:50
American Soundex

American Soundex in 136 bytes (down from 177)

We did it! Thanks to snowlord and especially to p01 for pointing us in the right direction.

/**
@param s the string you want to get the soundex value of
@param i placeholder for counter
@param j placeholder for ordinal value of char
@param r placeholder for string concatenation
*/
function(s,i,j,r){
//Init r with the first character of s, which is kept as-is.
//Init i with 0 and pre-increment it, so the loop starts at the second character.
//Read the char code of the current character into j; past the end it is NaN, which ends the loop.
for(r=s[i=0];j=s.charCodeAt(++i);)
//Append to r:
r +=
//the digit for the current character ('b' is index 0, hence j-98), but only if the character differs from the previous one.
s[i] != s[i-1] && +'1230120022455012623010202'[j-98]
//Otherwise (repeated character, digit 0, or no digit at all, as for 'a') append an empty string.
|| '';
return(r+'000').slice(0,4)
}
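For readers who find the golfed form hard to follow, here is a minimal, unofficial expansion of the same lookup-string idea. The names `CODES` and `soundexSketch` are made up for this sketch and are not part of the gist; the behavior is intended to match the annotated function for ordinary input such as the names in the test page.

```javascript
// Hypothetical readable expansion of the golfed function (names are illustrative).
// Digits for the letters 'b' through 'z'; '0' marks letters that are dropped.
var CODES = '1230120022455012623010202';

function soundexSketch(s) {
  var result = s[0]; // the first character is kept verbatim
  for (var i = 1; i < s.length; i++) {
    // 'b' has char code 98, so it maps to index 0; 'a' maps to index -1 (undefined -> NaN).
    var digit = +CODES[s.charCodeAt(i) - 98];
    // Skip a character identical to its predecessor, and skip digits that are 0 or NaN.
    if (s[i] != s[i - 1] && digit) {
      result += digit;
    }
  }
  // Pad with zeros, then cut to one letter plus three digits.
  return (result + '000').slice(0, 4);
}

console.log(soundexSketch('Robert'));     // "R163"
console.log(soundexSketch('Anna'));       // "A500"
console.log(soundexSketch('Bananarama')); // "B556"
```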
function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)}
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
Copyright (C) 2011 YOUR_NAME_HERE <YOUR_URL_HERE>
Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. You just DO WHAT THE FUCK YOU WANT TO.
{
"name": "americanSoundex",
"description": "An implementation of the American Soundex algorithm.",
"keywords": [
"soundex",
"american"
]
}
<!DOCTYPE html>
<title>American Soundex</title>
<div>Expected value: <b>R163, A500, B556</b></div>
<div>Actual value: <b id="ret"></b></div>
<script>
// write a small example that shows off the API for your example
// and tests it in one fell swoop.
var myFunction = function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)}
document.getElementById( "ret" ).innerHTML = [myFunction('Robert'), myFunction('Anna'), myFunction('Bananarama')].join(', ');
</script>
@peterjaric

I just tried your example ('Robert'), but I think I managed to remove 2 bytes (haha, not quite enough) by moving the regex object into the for loop:

var myFunction = function(a,b,c){for(c in b={aehiouwy:"",bfpv:1,cgjkqsxz:2,dt:3,l:4,mn:5,r:6,"]|(\d)\1+|[":"$1"})a=a[0]+a.substr(1).replace(RegExp("["+c+"]","g"),b[c])+0;return a.substr(0,4)};

@Prinzhorn
Author

Good idea. Will commit.
But you removed one of the two backslashes in the regex ("\d" and "\1"). They are needed. I will add a test case for that.

Edit: I guess GitHub removed them, just as in my comment. Maybe we should use the appropriate Markdown for code in future.
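Since the comment rendering apparently swallowed one backslash of each pair, here is a sketch of what the map-based version presumably looks like with both backslashes restored. This is a reconstruction, not the commenter's verified code, and it is ungolfed slightly for readability; `soundexByMap` and `rules` are names invented here.

```javascript
// Reconstruction (assumption): the map-based approach with "\\d" and "\\1" restored.
function soundexByMap(name) {
  var rules = {
    aehiouwy: "", bfpv: 1, cgjkqsxz: 2, dt: 3, l: 4, mn: 5, r: 6,
    // Wrapped in "[" ... "]" below, this key becomes the regex []|(\d)\1+|[]
    // where the empty classes never match and (\d)\1+ matches a run of one digit.
    "]|(\\d)\\1+|[": "$1"
  };
  for (var chars in rules) {
    // Keep the first character, recode the rest, and append a padding zero each pass.
    name = name[0] + name.substr(1).replace(RegExp("[" + chars + "]", "g"), rules[chars]) + 0;
  }
  return name.substr(0, 4);
}

console.log(soundexByMap('Robert')); // "R163"
console.log(soundexByMap('Anna'));   // "A500"
```

With a single backslash, the string literal `"\d"` is just `"d"`, so the last rule would look for the letter d instead of a digit.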

@p01

p01 commented Oct 3, 2011

164 bytes using a LUS ( Look Up String :p )

function(s,i,j,r){s=s.toUpperCase();for(r=s[i=0];j=s.charCodeAt(++i);)r+=+'1230120022455012623010202'[j-66]||'';return(r.replace(/(\d)\1+/g,'$1')+'000').slice(0,4)}

@Prinzhorn
Author

Looks interesting.
Maybe you should fork my gist so we can golf on two courses, because both approaches seem fundamentally different. And don't forget the "annotated.js" file :-D

Edit: One more thing. The algorithm says "Two adjacent letters with the same number are coded as a single number." I thought "555" should become "5", but obviously "55" is correct. So we can both strip the plus sign from our regexes.

Edit2: LUS ftw!
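To make the effect of dropping the plus sign concrete, here is a small sketch comparing the two patterns on a string of digits. The helper names `collapseRuns` and `collapsePairs` are invented for illustration.

```javascript
// With the plus sign: every run of identical digits collapses to one digit.
function collapseRuns(digits) {
  return digits.replace(/(\d)\1+/g, '$1');
}

// Without the plus sign: each non-overlapping pair of identical digits collapses.
function collapsePairs(digits) {
  return digits.replace(/(\d)\1/g, '$1');
}

console.log(collapseRuns('55'));   // "5"
console.log(collapsePairs('55'));  // "5"
console.log(collapseRuns('555'));  // "5"
console.log(collapsePairs('555')); // "55"
```

The two only differ for three or more equal digits in a row. Note that the commonly cited coding of Jackson as J250 treats c, k and s as a single 2, which would correspond to the run-collapsing variant, so the choice is worth double-checking against the rule set you want to follow.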

@Prinzhorn
Author

What was the idea behind "toUpperCase"? Remove it and subtract 98 instead and BAM 146 bytes.

@p01

p01 commented Oct 3, 2011

I wanted to make my function case insensitive but yes, this is Spa^W140bytes and surely I can get away with that. Thanks.
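The offsets 66 and 98 are simply the char codes of 'B' and 'b', the first letter covered by the lookup string. A rough sketch of what dropping `toUpperCase` costs: `codeFor` is a hypothetical helper, and `fromGist` is the 136-byte function from the gist's main file.

```javascript
var CODES = '1230120022455012623010202';

// Hypothetical helper: look up the digit for one character with a given offset.
function codeFor(ch, offset) {
  return +CODES[ch.charCodeAt(0) - offset] || '';
}

console.log(codeFor('B', 66)); // 1  (upper-case input, offset 66)
console.log(codeFor('b', 98)); // 1  (lower-case input, offset 98)
console.log(codeFor('B', 98)); // "" (index -32 is out of range)

// The 136-byte function from the gist, which uses offset 98:
var fromGist = function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)};

console.log(fromGist('Robert')); // "R163"
console.log(fromGist('ROBERT')); // "R000" (upper-case letters after the first are ignored)
```

So with offset 98 the function expects every letter after the first to be lower case.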

@Prinzhorn
Author

136 bytes

function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)}

@p01

p01 commented Oct 3, 2011

:) That was fast! Nice move getting rid of the replace(...)
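Dropping the `replace(...)` is not purely cosmetic: comparing adjacent characters and collapsing runs of digits are different deduplication strategies. The sketch below puts them side by side. `byDigits` is a reconstruction of the 146-byte variant suggested above (p01's function without `toUpperCase` and with offset 98), so treat it as an assumption; `byChars` is the 136-byte function.

```javascript
// Assumed 146-byte variant: build all digits, then collapse runs of equal digits.
var byDigits = function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=+'1230120022455012623010202'[j-98]||'';return(r.replace(/(\d)\1+/g,'$1')+'000').slice(0,4)};

// 136-byte version: skip a character only if it equals the previous character.
var byChars = function(s,i,j,r){for(r=s[i=0];j=s.charCodeAt(++i);)r+=s[i]!=s[i-1]&&+'1230120022455012623010202'[j-98]||'';return(r+'000').slice(0,4)};

console.log(byDigits('Anna'), byChars('Anna'));       // "A500 A500"
console.log(byDigits('Jackson'), byChars('Jackson')); // "J250 J222"
console.log(byDigits('Banana'), byChars('Banana'));   // "B500 B550"
```

The two agree on doubled letters such as "nn", but differ when distinct letters share a digit ("cks") and when equal digits are separated by a vowel ("nan"). As far as I understand the usual rules, the first case should collapse and the second should not, so each variant trades some accuracy for bytes.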

@peterjaric

A little too late (things have moved on, I see), but for the record: I did not intend to remove the backslashes. It was probably a copy-and-paste error. Sorry about that!

@sylvinus

sylvinus commented Oct 5, 2011

I got it down to 133 bytes: https://gist.github.com/1263293
