@vabenjamin
Created February 18, 2013 21:26
# Creates a bag of words from the input text.
# Returns a Hash mapping each word to the number of times it occurs,
# with keys in order of first appearance.
def self.toBOW(text)
  # wordTokenize is defined elsewhere in the enclosing class/module.
  words = wordTokenize(text)
  bow = {}
  words.each do |word|
    key = word.to_s
    # A single hash lookup replaces the repeated Array#include?/#index
    # scans, so counting is linear in the number of words.
    bow[key] = bow.fetch(key, 0) + 1
  end
  bow
end
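
A minimal, self-contained sketch of how the method might be used. The `TextUtils` module name and the body of `wordTokenize` are assumptions made for illustration; the original snippet shows neither its enclosing class nor its tokenizer, so the real tokenization rules may differ.

```ruby
# Hypothetical container module; the original does not show its enclosing class.
module TextUtils
  # Assumed tokenizer: lowercases the text and extracts runs of word characters.
  # The real wordTokenize is not part of the snippet and may behave differently.
  def self.wordTokenize(text)
    text.downcase.scan(/\w+/)
  end

  # Counts occurrences of each token; Hash.new(0) makes unseen words start at 0.
  def self.toBOW(text)
    wordTokenize(text).each_with_object(Hash.new(0)) do |word, bow|
      bow[word] += 1
    end
  end
end

bow = TextUtils.toBOW("the cat saw the other cat")
puts bow["the"]   # count for "the"
puts bow["other"] # count for "other"
```

With this assumed tokenizer, "the" and "cat" each map to 2 and the remaining words to 1. On Ruby 2.7 or later, `wordTokenize(text).tally` would produce the same counts in one call.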