Compute Coleman–Liau index
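The Coleman–Liau index is 0.0588 * L - 0.296 * S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words; the Ruby snippet below computes both quantities directly from the raw text.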
# Count sentence terminators: occurrences of ". ", "? ", and "! ".
# A space is appended so a terminator at the very end of the text is counted too.
def count_sentence_number(content)
  content += " "
  '.?!'.chars.sum { |c| content.scan("#{c} ").length }
end

def coleman_liau_readability_score(content)
  sentence_number = count_sentence_number(content)
  word_number     = content.split.size
  letter_number   = content.count('0-9a-zA-Z') # letters and digits

  # puts sentence_number
  # puts word_number
  # puts letter_number

  l = 100.0 * letter_number / word_number   # letters per 100 words
  s = 100.0 * sentence_number / word_number # sentences per 100 words
  0.0588 * l - 0.296 * s - 15.8
end
puts coleman_liau_readability_score("Existing computer programs that measure readability are based largely upon subroutines which estimate number of syllables, usually by counting vowels. The shortcoming in estimating syllables is that it necessitates keypunching the prose into the computer. There is no need to estimate syllables since word length in letters is a better predictor of readability than word length in syllables. Therefore, a new readability formula was computed that has for its predictors letters per 100 words and sentences per 100 words. Both predictors can be counted by an optical scanning device, and thus the formula makes it economically feasible for an organization such as the U.S. Office of Education to calibrate the readability of all textbooks for the public school system.")
# 14.281680672268909
puts coleman_liau_readability_score("Hello! How are you? I live in U.S. ")
# -9.995
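# Quick check of the second example: it has 23 letters, 8 words, and 3 sentence
# terminators, so l = 100 * 23 / 8 = 287.5 and s = 100 * 3 / 8 = 37.5, giving
# 0.0588 * 287.5 - 0.296 * 37.5 - 15.8 = -9.995; very short words and very
# short sentences push the index below zero.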