@agarie
Created April 14, 2015 18:19
Generate {uni,bi,tri}grams from a token list.
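# Tallies how many times each key has been seen.
class Counter
  # Expose the underlying Hash so the tallies can be read back.
  attr_reader :counts

  def initialize
    @counts = Hash.new(0) # unseen keys start at 0
  end

  # Increment the tally for key.
  def <<(key)
    @counts[key] += 1
  end
end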
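# Count the unigrams, bigrams and trigrams in a list of tokens.
# Returns [unigrams, bigrams, trigrams] as Counter objects.
def create_counts(tokens)
  unigrams = Counter.new
  bigrams = Counter.new
  trigrams = Counter.new

  # Sliding windows, padded with nil until enough tokens have been read,
  # so the first bigram is [nil, tokens[0]].
  unigram = [nil]
  bigram = [nil, nil]
  trigram = [nil, nil, nil]

  tokens.each do |token|
    # Drop the oldest token and append the newest one.
    unigram.shift; unigram.push token
    bigram.shift; bigram.push token
    trigram.shift; trigram.push token

    # Store frozen copies so later shifts do not mutate the hash keys.
    unigrams << unigram.clone.freeze
    bigrams << bigram.clone.freeze
    trigrams << trigram.clone.freeze
  end

  [unigrams, bigrams, trigrams]
end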
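
To make the windowing above concrete, here is a minimal sketch of the same nil-padded sliding window for a single n-gram size. `ngram_counts` is a hypothetical helper, not part of the gist, and it tallies into a plain Hash rather than a Counter so the results can be read directly. The sample tokens are made up for illustration.

```ruby
# Hypothetical helper (not part of the gist): counts n-grams of one size n
# using a nil-padded sliding window, as create_counts does for n = 1, 2, 3.
def ngram_counts(tokens, n)
  counts = Hash.new(0)
  window = Array.new(n) # starts as [nil, nil, ...]
  tokens.each do |token|
    # Build a fresh window each step: drop the oldest entry, append the new token.
    window = window.drop(1) << token
    # Freeze the array so it is safe to use as a hash key.
    counts[window.freeze] += 1
  end
  counts
end

tokens = %w[the cat sat on the cat]

unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)

unigrams[["the"]]        # => 2
bigrams[["the", "cat"]]  # => 2
bigrams[[nil, "the"]]    # => 1 (the leading, nil-padded bigram)
```

Because of the padding, a list of k tokens always yields k n-grams for every n, and the first n - 1 of them contain nil. If the padded entries are not wanted, they can be filtered out afterwards, for example with `bigrams.reject { |gram, _| gram.include?(nil) }`.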