@roblogic
Last active March 25, 2019 05:05
Crazy golf attempt at Jaro-Winkler algorithm (untested v0.01), per https://codegolf.stackexchange.com/questions/95619/string-similarity-using-jaro-winkler
#!/usr/local/bin/bash
#set -x
a="${1,,}" b="${2,,}" x="$a" y="$b"
[ ${#b} -gt ${#a} ]&&x="$b" y="$a"; # y is shorter
h=$((${#x}/2-1)) # h = match window: floor(longer length / 2) - 1
((h<0))&&h=0 # clamp for strings shorter than 2 characters
# Substring matching routine
r(){ k=${#1} l=${#2} g="$2" s="" # k,l = length of strings 1,2
for((i=0;i<k;i++)){ # iterate over string1
p="${1:i:1}" # p is the i'th character of $1
v=$((i-h<0?0:i-h)) # first index of string2 to explore
w=$((i+h+1>l?l:i+h+1)) # one past the last index of string2 to explore
for((j=v;j<w;j++)){ # iterate over part of string2
q="${g:j:1}" # q is the j'th character of g, aka $2
# match found
[ "$p" = "$q" ]&&{
s="$s$p"
# mark the matched position in $2 as used; a control character is used
# rather than a dot (.) so that a literal dot in the input cannot match it
g="${g::j}"$'\001'"${g:j+1}"
break
}
}
}
printf '%s\n' "$s" # quoted, so spaces and glob characters (*) survive intact
}
m=`r "$x" "$y"`
n=`r "$y" "$x"`
#check length of returned substrings
d=${#m}
e=${#n}
o=$((d==0|e==0|d!=e?0:1))
[ $o -eq 0 ]&&{ echo 0.0;exit;}
# Transposition routine
t(){ u=0
for((i=0;i<${#1};i++)){
[ "${1:i:1}" != "${2:i:1}" ]&&((u++));
}
echo $((u/2))
}
# count transpositions between the two matched sequences
z=`t "$m" "$n"`
# now compute the Jaro similarity (no Winkler prefix bonus is applied)
bc -l<<<"( $d/${#y} + $e/${#x} + ($d-$z)/$d )/3"
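
The final `bc` expression in the script is the plain Jaro similarity: the Winkler part of Jaro-Winkler, which rewards a shared prefix, is not applied. A minimal sketch of that last step follows, assuming the usual definition `jw = j + l * p * (1 - j)`, where `l` is the common prefix length capped at 4 and `p = 0.1`. The function name `winkler`, its argument order, and the use of `awk` instead of `bc` are assumptions made for this illustration, not part of the gist.

```bash
# Hypothetical helper, not part of the original gist: apply the Winkler
# prefix bonus to an already computed Jaro score.
# Usage: winkler JARO STRING1 STRING2
winkler() {
  # awk (rather than bc) is an assumption made here for the float maths
  awk -v j="$1" -v a="$2" -v b="$3" 'BEGIN {
    a = tolower(a); b = tolower(b)   # case-insensitive, like the script
    l = 0                            # common prefix length, capped at 4
    while (l < 4 && l < length(a) && l < length(b) && substr(a, l + 1, 1) == substr(b, l + 1, 1)) l++
    p = 0.1                          # standard Winkler scaling factor
    printf "%.4f\n", j + l * p * (1 - j)
  }'
}

winkler 0.9444 martha marhta   # prints 0.9611
```

For the textbook pair `martha` / `marhta` there are 6 matches and 1 transposition, so the Jaro score is (6/6 + 6/6 + 5/6)/3 ≈ 0.9444. The shared prefix `mar` has length 3, which lifts the score to about 0.9611.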