@calebhearth
Created July 15, 2021 21:54
# The n most frequent words, most frequent first.
def top_words(n)
  words.tally.sort_by(&:last).reverse.map(&:first).first(n)
end
# All whitespace-separated words in alice.txt, read once and cached.
def words
  @words ||= File.read("alice.txt").split(/\s+/)
end
# Every run of n consecutive words, as an enumerator of n-element arrays.
def ngram(n)
  words.each_cons(n)
end
def ngram_frequency(n)
  ngram(n).tally.sort_by(&:last).reverse
end
# bigram_freq is a list of [bigram, count] pairs, most frequent first, e.g.
# [[["Alice", "said"], 10], [["Alice", "did"], 2]]
def markov(root: "Alice", length: 10, bigram_freq: ngram_frequency(2))
  chain = [root]
  length.times do
    # Bigrams whose first word is the last word of the chain so far.
    root_bigrams = bigram_freq.select do |bigram, _|
      bigram.first == chain.last
    end
    # Repeat each follower once per occurrence so that a uniform sample
    # is weighted by bigram frequency.
    followers = root_bigrams.flat_map do |bigram, freq|
      [bigram.last] * freq
    end
    # Stop early if the current word never appears as the start of a bigram.
    break if followers.empty?
    chain << followers.sample
  end
  chain.join(" ")
end
puts markov(root: "The")
puts markov(root: "Alice")