public
Last active

An implementation of an O(n + p) work algorithm for selecting p items with replacment from a weighted set of n items

  • Download Gist
weighted_selection.rb
Ruby
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52
class WeightedSelection
# input is an Enumerable of (item,frequency) pairs
#
# (takes O(n) time)
def initialize(input)
# set up the alias method for selection
# http://prxq.wordpress.com/2006/04/17/the-alias-method/
@n = input.size
@m = input.inject(0) { |sum,(item,freq)| sum + freq }
 
# find the items with frequency lesser and greater than average
lessers, greaters = input.map do |item,freq|
# pad the frequency so we can keep it integral
# when subdivided
[ item, freq*@n ]
end.partition do |item,adj_freq|
adj_freq <= @m
end
 
@partitions = Array.new(@n) do
item, adj_freq = lessers.shift
other_item = if adj_freq < @m
# use part of another item's frequency to pad
# out the partition
other_item, other_adj_freq = greaters.shift
other_adj_freq -= (@m - adj_freq)
(other_adj_freq <= @m ? lessers : greaters) << [ other_item, other_adj_freq ]
other_item
end
 
[ item, other_item , adj_freq ]
end
end
# pick an item from the original input with probability
# proportional to its relative frequency among all items
# P(word) = frequency / sum_of_frequencies
#
# (takes O(1) time)
def select
# pick a partition at random
item, other_item, adj_freq = @partitions[ rand(@n) ]
 
# select the first item in the partition with appropriate
# probability
if rand(@m) < adj_freq
item
else
other_item
end
end
end

Please sign in to comment on this gist.

Something went wrong with that request. Please try again.