def sørensen_index(string_a, string_b)
  matches_a = get_bigrams string_a.dup
  matches_b = get_bigrams string_b.dup
  similarities = matches_a & matches_b
  sum_bigrams = matches_a.count + matches_b.count
  2 * similarities.count / sum_bigrams.to_f
end

def get_bigrams(str)
  bigrams = []
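
The snippet above is cut off before `get_bigrams` is complete, so its exact behaviour is unknown. As a rough, self-contained sketch of the same idea (the Sørensen-Dice coefficient over character bigrams), something like the following could work. The names `bigrams_of` and `dice_coefficient` are hypothetical and not taken from the original, and the sketch assumes Ruby 2.7+ for `tally`. It also swaps the array intersection (`&`, which drops duplicates) for a multiset intersection, on the assumption that repeated bigrams should count once per occurrence; with a plain `&`, a string with repeated bigrams compared against itself would score below 1.0.

```ruby
# Hypothetical helper: split a string into overlapping two-character pairs.
# Returns an empty array for strings shorter than two characters.
def bigrams_of(str)
  str.each_char.each_cons(2).map(&:join)
end

# Sørensen-Dice coefficient: 2 * |A ∩ B| / (|A| + |B|),
# where the intersection is taken over bigram multisets.
def dice_coefficient(string_a, string_b)
  bigrams_a = bigrams_of(string_a)
  bigrams_b = bigrams_of(string_b)
  total = bigrams_a.size + bigrams_b.size
  return 0.0 if total.zero? # neither string has a bigram

  counts_b = bigrams_b.tally
  # For each distinct bigram in A, count how many occurrences B can match.
  shared = bigrams_a.tally.sum { |bigram, n| [n, counts_b.fetch(bigram, 0)].min }
  2.0 * shared / total
end
```

For example, `dice_coefficient("night", "nacht")` compares `ni ig gh ht` with `na ac ch ht`; one bigram is shared out of eight in total, so the score is 0.25.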
#!/bin/bash
# list unreachable objects that haven't been garbage collected yet
git fsck --cache --unreachable $(git for-each-ref --format="%(objectname)") > badfiles
# dump each object to a numbered file, addressing it by abbreviated hash
# (columns 18-24 assume lines of the form "unreachable blob <sha>")
FILENUM=1
cut -c 18-24 badfiles | while read -r cur
do
    git show "$cur" > "$FILENUM"