@Daenyth
Created November 12, 2010 14:45
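#!/bin/bash
# Hash every .class file in two directories with simhash, then report the
# closest match in <dir2> for each hash file in <dir1>.
shingle_size=4
feature_count=1024
if [[ ! -d $1 || ! -d $2 ]]; then
    echo "Usage: $0 <dir1> <dir2>" >&2
    exit 1
fi
# Write a hash (.sim) file for each .class file. The directory arguments are
# quoted so that paths containing spaces survive word splitting.
simhash -f "$feature_count" -s "$shingle_size" -w "$1"/*.class
simhash -f "$feature_count" -s "$shingle_size" -w "$2"/*.class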
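for A in "$1"/*.sim; do
    match=""
    maxresult=0
    for B in "$2"/*.sim; do
        # Skip comparing a file with itself (possible when both arguments
        # name the same directory). Quoting "$B" forces a literal comparison
        # instead of a glob pattern match.
        if [[ $A != "$B" ]]; then
            result=$(simhash -c "$A" "$B")
            # Keep the best-scoring candidate seen so far. (( )) compares
            # integers, so this assumes simhash -c prints a whole number.
            if (( result > maxresult )); then
                maxresult=$result
                match=$B
            fi
        fi
    done
    # % strips the trailing .sim (# would only match a leading prefix).
    echo "${A%.sim} --> ${match%.sim} ($maxresult)"
done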
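
The final echo is meant to print each file name without its .sim extension. As a minimal sketch of how the shell's prefix (`#`) and suffix (`%`) removal behave on such a path, the snippet below uses a hypothetical helper, `strip_sim`, and a made-up sample path; neither is part of the script above.

```shell
# Hypothetical helper (not part of the script): drop a trailing ".sim".
strip_sim() {
    # ${1%.sim} removes the shortest suffix matching ".sim", if present.
    printf '%s\n' "${1%.sim}"
}

sample="dir2/Foo.sim"      # made-up path, for illustration only

# '#' anchors at the start of the value, so ".sim" never matches here
# and the path is printed unchanged.
echo "${sample#.sim}"

# '%' anchors at the end of the value, so the extension is removed.
strip_sim "$sample"
```

Run as written, the first command prints the path unchanged and the second prints it without the extension.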