@jewel12
Created May 18, 2011 13:21
Word Error Rate
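# coding:utf-8
# Word Error Rate (assumes words are separated by spaces)
module Eval
  # c = [insertion, deletion, substitution] costs.
  # The result is normalized by the number of words in str1,
  # so str1 is treated as the reference and str2 as the hypothesis.
  def self.wer(str1, str2, c = [1, 1, 1])
    s1 = str1.split(' ')
    s2 = str2.split(' ')
    d = Array.new(s1.size + 1) { Array.new(s2.size + 1) { 0 } }
    (0..s1.size).each { |i| d[i][0] = i * c[1] } # delete every word of s1
    (0..s2.size).each { |j| d[0][j] = j * c[0] } # insert every word of s2
    (1..s1.size).each do |i|
      (1..s2.size).each do |j|
        # i indexes s1 (rows of d), j indexes s2 (columns of d)
        cost = s1[i - 1] == s2[j - 1] ? 0 : c[2]
        d[i][j] = [d[i][j - 1] + c[0],     # insertion
                   d[i - 1][j] + c[1],     # deletion
                   d[i - 1][j - 1] + cost  # substitution
                  ].min
      end
    end
    d[s1.size][s2.size] / s1.size.to_f
  end
end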
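
As a worked illustration of the computation, here is a minimal standalone sketch. It assumes unit costs for insertion, deletion, and substitution, and it keeps only two rows of the distance table instead of the full matrix. The names `word_edit_distance` and `wer_sketch` are hypothetical helpers introduced for this example and are not part of the gist's `Eval` module.

```ruby
# Hypothetical helper (not from the gist): word-level edit distance
# with unit costs, computed with two rolling rows.
def word_edit_distance(ref_words, hyp_words)
  # Row for the empty reference prefix: j insertions reach hyp_words[0, j].
  prev = (0..hyp_words.size).to_a
  ref_words.each_with_index do |rw, i|
    curr = [i + 1] # i + 1 deletions reach the empty hypothesis prefix
    hyp_words.each_with_index do |hw, j|
      sub = prev[j] + (rw == hw ? 0 : 1) # match or substitution
      ins = curr[j] + 1                  # extra word in the hypothesis
      del = prev[j + 1] + 1              # reference word missing
      curr << [sub, ins, del].min
    end
    prev = curr
  end
  prev.last
end

# Hypothetical helper: edit distance normalized by the reference length.
def wer_sketch(reference, hypothesis)
  ref_words = reference.split(' ')
  hyp_words = hypothesis.split(' ')
  word_edit_distance(ref_words, hyp_words) / ref_words.size.to_f
end

# "sat" -> "sit" is one substitution and the second "the" is deleted,
# so the distance is 2 over a 6-word reference.
puts wer_sketch("the cat sat on the mat", "the cat sit on mat")
```

Because the score is divided by the reference length, it can exceed 1.0 when the hypothesis contains many extra words, and it is undefined for an empty reference.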