@brylor
Created January 7, 2016 22:25
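require 'ruby_nlp/ngram'

# A collection of documents matched by a glob pattern. Each matching file is
# wrapped in an instance of the given class, which must respond to #sentences.
class Corpus
  def initialize(glob, klass)
    @glob = glob
    @klass = klass
  end

  # Document objects, one per matching file; built once and memoized.
  def files
    @files ||= Dir[@glob].map { |file| @klass.new(file) }
  end

  # Placeholder; currently only prints a stub message.
  def pos
    puts 'some unigram pos'
  end

  # All sentences from every file, as a single flat list.
  def sentences
    files.map(&:sentences).flatten
  end

  # All n-grams of size n across the corpus. flat_map flattens only one
  # level, so an n-gram returned as an array stays intact.
  def ngrams(n)
    sentences.flat_map { |sentence| Ngram.new(sentence).ngrams(n) }
  end

  def unigrams
    ngrams(1)
  end

  def bigrams
    ngrams(2)
  end

  def trigrams
    ngrams(3)
  end
end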
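
The one-level flatten in `ngrams` is easier to see with a small standalone sketch. `SketchNgram` below is a hypothetical stand-in for the `Ngram` class from `ruby_nlp/ngram`, whose actual implementation is not shown here. It assumes sentences are plain strings tokenized on whitespace and that each n-gram is returned as an array of tokens.

```ruby
# Hypothetical stand-in for Ngram from ruby_nlp/ngram (an assumption, not the
# library's actual code): splits a sentence on whitespace and slides a window
# of size n over the tokens.
class SketchNgram
  def initialize(sentence)
    @tokens = sentence.split
  end

  # Returns an array of n-grams, each an array of n tokens.
  def ngrams(n)
    @tokens.each_cons(n).to_a
  end
end

sentences = ['the cat sat', 'dogs bark']

# flat_map (map followed by a one-level flatten) merges the per-sentence
# lists while keeping each n-gram as its own array.
bigrams = sentences.flat_map { |sentence| SketchNgram.new(sentence).ngrams(2) }
# bigrams == [["the", "cat"], ["cat", "sat"], ["dogs", "bark"]]

# A full flatten would instead dissolve the n-grams into one token list.
flattened = sentences.map { |sentence| SketchNgram.new(sentence).ngrams(2) }.flatten
# flattened == ["the", "cat", "cat", "sat", "dogs", "bark"]
```

Under these assumptions, flattening a single level is what keeps bigrams and trigrams usable as units. A full `flatten` is harmless only while the flattened items, such as sentence strings, are not themselves arrays.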