@psylone
Created November 22, 2016 23:09
NLP example
class NLPProcessor
  # Words dropped from the text before indexing.
  STOP_WORDS = %w[
    is
    a
    of
    the
  ].freeze

  attr_reader :invert_index

  def initialize(text)
    @text = text
    # Maps each word to the list of line numbers it appears on.
    @invert_index = Hash.new { |hash, key| hash[key] = [] }
  end

  def process
    remove_stop_words!
    puts @text
    # TODO: remove_endings!
    build_invert_index!
  end

  private

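  def remove_stop_words!
    # \b matches whole words only, so "is" is not cut out of "briskly".
    stop_words = /\b(?:#{STOP_WORDS.join('|')})\b/i
    # gsub always returns a string; gsub! returns nil when nothing matches.
    @text = @text.each_line.map { |line| line.gsub(stop_words, '') }.join
  end
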
  def build_invert_index!
    @text.each_line.with_index(1) do |line, line_number|
      # scan skips the empty token that split(/\W+/) yields for a leading space.
      line.scan(/\w+/).each do |word|
        @invert_index[word].push(line_number)
      end
    end
  end
end
text = <<~TEXT
It is a briskly blowing wind that blows
from the north, the North of my youth.
The wind is cold too, colder than the
winds of yesteryear.
TEXT
processor = NLPProcessor.new(text)
processor.process
p processor.invert_index
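
The `remove_endings!` step is left as a TODO above. One way it might look is a naive suffix stripper, sketched below as standalone helpers. The names `SUFFIXES`, `strip_ending`, and `remove_endings` and the suffix list itself are assumptions for illustration, not part of the original gist, and this is far cruder than a real stemmer.

```ruby
# Hypothetical suffix list (an assumption); checked in order, first match wins.
SUFFIXES = %w[ing ly ed er es s].freeze

# Strips the first matching suffix, but only if at least three
# characters of the word remain afterwards.
def strip_ending(word)
  suffix = SUFFIXES.find do |s|
    word.end_with?(s) && word.length - s.length >= 3
  end
  suffix ? word[0...-suffix.length] : word
end

# Applies strip_ending to every word, leaving spacing and punctuation intact.
def remove_endings(text)
  text.gsub(/\w+/) { |word| strip_ending(word) }
end

puts remove_endings("blowing wind blows colder winds")
# prints "blow wind blow cold wind"
```

Running this before `build_invert_index!` would fold "winds" into "wind" and "blows" into "blow" in the index, at the cost of over-stripping some words.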