@localshred
Created December 21, 2011 23:35
Evented Sentence Parser
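require 'eventually'

# Splits a document into lines and words, emitting an event for each one.
class SentenceParser
  include Eventually

  def initialize(document)
    @document = document
  end

  # Emits :line for every line and :word for every whitespace-separated word.
  # Returns self so calls can be chained.
  def parse!
    lines = @document.split(/\r?\n/)
    lines.each do |line|
      emit(:line, line)
      words = line.split(/\s+/)
      words.each do |word|
        emit(:word, word)
      end
    end
    self
  end
end
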
document = %Q{Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus dapibus elit et ligula vestibulum porttitor. Vestibulum tristique suscipit sem eu cursus. Aenean sit amet ligula elit. Morbi venenatis scelerisque viverra. Cras at nisl quis libero rutrum accumsan.
Aenean et nisl felis, nec convallis erat. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Vivamus nec purus nunc, sit amet ornare purus. Vestibulum laoreet mattis sem non malesuada. Nunc vitae lectus neque. Duis sit amet velit non nulla facilisis sodales.
Aenean ultrices sapien ac enim lacinia euismod eleifend pulvinar urna. Nulla leo metus, viverra non lacinia at, posuere at leo. Nullam dictum venenatis tristique. Fusce pellentesque felis vitae libero gravida at interdum est lacinia. Nam rhoncus, diam at gravida dictum, odio velit rutrum erat, vitae laoreet nisl tortor at magna.}

parser = SentenceParser.new(document)

# Register listeners before parsing; events fire during parse!.
parser.on(:line) do |line|
  puts 'Found line with %d characters' % line.length
  puts 'Line = %s' % line
end

parser.on(:word) do |word|
  puts 'Found word = %s' % word
end

parser.parse!

# output
#
# Found line with 264 characters
# Line = Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus dapibus elit et ligula vestibulum porttitor. Vestibulum tristique suscipit sem eu cursus. Aenean sit amet ligula elit. Morbi venenatis scelerisque viverra. Cras at nisl quis libero rutrum accumsan.
# Found word = Lorem
# Found word = ipsum
# Found word = dolor
# ...
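
The parser above relies on `on` and `emit` coming from the included `Eventually` module. To show the general idea without the gem installed, here is a minimal sketch of a listener registry. `TinyEvents` and `TinySentenceParser` are hypothetical names made up for this illustration, and the code is an assumption about how such a mixin could work, not the gem's actual implementation.

```ruby
# Hypothetical stand-in for an event mixin; NOT the eventually gem's code.
module TinyEvents
  # Register a block to run whenever `event` is emitted.
  def on(event, &block)
    listeners[event] << block
    self
  end

  # Call every block registered for `event`, in registration order.
  def emit(event, *payload)
    listeners[event].each { |listener| listener.call(*payload) }
    self
  end

  private

  # Lazily built map of event name => array of registered blocks.
  def listeners
    @listeners ||= Hash.new { |hash, key| hash[key] = [] }
  end
end

# Same parsing loop as SentenceParser, but using the sketch mixin.
class TinySentenceParser
  include TinyEvents

  def initialize(document)
    @document = document
  end

  def parse!
    @document.split(/\r?\n/).each do |line|
      emit(:line, line)
      line.split.each { |word| emit(:word, word) }
    end
    self
  end
end

# Collect events in an array instead of printing them.
seen = []
tiny = TinySentenceParser.new("one two\nthree")
tiny.on(:line) { |line| seen << [:line, line] }
tiny.on(:word) { |word| seen << [:word, word] }
tiny.parse!
```

Listeners run synchronously inside `parse!`, so they have to be registered before it is called; anything attached afterwards would miss the events already emitted.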