Skip to content

Embed URL

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
An implementation of an O(n + p) work algorithm for selecting p items with replacment from a weighted set of n items
class WeightedSelection
# input is an Enumerable of (item,frequency) pairs
#
# (takes O(n) time)
def initialize(input)
# set up the alias method for selection
# http://prxq.wordpress.com/2006/04/17/the-alias-method/
@n = input.size
@m = input.inject(0) { |sum,(item,freq)| sum + freq }
# find the items with frequency lesser and greater than average
lessers, greaters = input.map do |item,freq|
# pad the frequency so we can keep it integral
# when subdivided
[ item, freq*@n ]
end.partition do |item,adj_freq|
adj_freq <= @m
end
@partitions = Array.new(@n) do
item, adj_freq = lessers.shift
other_item = if adj_freq < @m
# use part of another item's frequency to pad
# out the partition
other_item, other_adj_freq = greaters.shift
other_adj_freq -= (@m - adj_freq)
(other_adj_freq <= @m ? lessers : greaters) << [ other_item, other_adj_freq ]
other_item
end
[ item, other_item , adj_freq ]
end
end
# pick an item from the original input with probability
# proportional to its relative frequency among all items
# P(word) = frequency / sum_of_frequencies
#
# (takes O(1) time)
def select
# pick a partition at random
item, other_item, adj_freq = @partitions[ rand(@n) ]
# select the first item in the partition with appropriate
# probability
if rand(@m) < adj_freq
item
else
other_item
end
end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.