Created
October 29, 2011 16:17
Probability calculator
# Naive Bayes classifier with Laplace (add-k) smoothing.
# Prints the class priors, the per-word likelihoods, and the posterior
# probability of each class given the query words.
require 'set'

k = 1.0 # smoothing constant

# Training data: the words of every example in each class, flattened.
list1 = %w(a perfect world my perfect woman pretty woman)
list2 = %w(a perfect day electric storm another rainy day)
ncases1 = 3 # number of training examples in list1
ncases2 = 3 # number of training examples in list2
list1name = 'movie'
list2name = 'song'
words = %w(perfect storm) # query to classify

# Alternative data set (spam vs. ham):
# list1 = %w(offer is secret click secret link secret sports link)
# list2 = %w(play sports today went play sports secret sports event sports is today sports costs money)
# ncases1 = 3
# ncases2 = 5
# list1name = 'spam'
# list2name = 'ham'
# words = %w(today is secret)

# Vocabulary: the distinct words across both classes.
x = Set.new list1
x.merge list2

n1 = list1.length.to_f
n2 = list2.length.to_f
ncases = ncases1 + ncases2

# Smoothed priors; the 2 in the denominator is the number of classes.
plist1 = (ncases1 + k) / (ncases + 2 * k)
plist2 = (ncases2 + k) / (ncases + 2 * k)
puts "P(#{list1name}) = #{plist1}"
puts "P(#{list2name}) = #{plist2}"

# Smoothed likelihood of each query word under each class.
pword1 = {}
pword2 = {}
words.each do |w|
  pword1[w] = (list1.count(w).to_f + k) / (n1 + k * x.size)
  pword2[w] = (list2.count(w).to_f + k) / (n2 + k * x.size)
  puts "P(\"#{w}\"|#{list1name}) = #{pword1[w]}"
  puts "P(\"#{w}\"|#{list2name}) = #{pword2[w]}"
end

# Unnormalized posteriors (prior times likelihoods), then normalize.
factor1 = words.inject(1.0) {|a,e| a * pword1[e]} * plist1
factor2 = words.inject(1.0) {|a,e| a * pword2[e]} * plist2
puts "P(#{list1name}|\"#{words.join(' ')}\") = #{factor1 / (factor1 + factor2)}"
puts "P(#{list2name}|\"#{words.join(' ')}\") = #{factor2 / (factor1 + factor2)}"
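As a cross-check on the smoothed script above, the same arithmetic can be carried out exactly with Ruby's `Rational`. This is only a minimal sketch and not part of the original gist: the helper name `smoothed_posterior` and its `k:` keyword are hypothetical, and it assumes two classes and an integer `k`, as in the script's default settings.

```ruby
require 'set'

# Hypothetical helper: posterior of class 1 given the query words,
# using add-k smoothing and exact rational arithmetic (two classes assumed).
def smoothed_posterior(list1, list2, ncases1, ncases2, words, k: 1)
  vocab = Set.new(list1).merge(list2).size # distinct words in both classes
  ncases = ncases1 + ncases2

  prior1 = Rational(ncases1 + k, ncases + 2 * k)
  prior2 = Rational(ncases2 + k, ncases + 2 * k)

  # Start from the prior and multiply in each smoothed word likelihood.
  factor1 = words.inject(prior1) do |a, w|
    a * Rational(list1.count(w) + k, list1.length + k * vocab)
  end
  factor2 = words.inject(prior2) do |a, w|
    a * Rational(list2.count(w) + k, list2.length + k * vocab)
  end

  factor1 / (factor1 + factor2)
end

movie = %w(a perfect world my perfect woman pretty woman)
song  = %w(a perfect day electric storm another rainy day)
p_movie = smoothed_posterior(movie, song, 3, 3, %w(perfect storm))
puts "P(movie|\"perfect storm\") = #{p_movie}"
```

With the movie/song data this comes out to exactly 3/7 for the movie class, and therefore 4/7 for the song class.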
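# Least-squares fit of a line y = w0 + w1 * x to sample points.

# Array#sum helper for Ruby versions before 2.4; on newer versions the
# built-in Array#sum takes precedence over this included method.
module Inject
  def sum
    self.inject(0) {|a,e| a + e}
  end
end

class Array
  include Inject
end

x = [0,1,2,3,4].collect(&:to_f)
y = [3,6,7,8,11].collect(&:to_f)
xy = x.zip(y).collect {|a,b| a*b}
x2 = x.collect {|e| e*e}
y2 = y.collect {|e| e*e} # not used below
m = x.length.to_f # number of samples

# Slope: (m * sum(xy) - sum(x) * sum(y)) / (m * sum(x^2) - sum(x)^2)
w1 = (m * xy.sum - x.sum * y.sum) / (m * x2.sum - x.sum * x.sum)
# Intercept: mean(y) - w1 * mean(x)
w0 = y.sum / m - w1 / m * x.sum
puts "w0 = #{w0}, w1 = #{w1}"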
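The fit above can be sanity-checked by confirming that the residuals of the fitted line sum to zero, which holds for any least-squares line with an intercept. The sketch below is an illustration rather than the gist author's code: `fit_line` is a hypothetical helper, and it assumes integer inputs so that `Rational` keeps the arithmetic exact.

```ruby
# Hypothetical helper: least-squares intercept and slope for paired samples.
# Assumes integer xs and ys so Rational arithmetic stays exact.
def fit_line(xs, ys)
  m = xs.length
  sx = xs.inject(0) { |a, e| a + e }
  sy = ys.inject(0) { |a, e| a + e }
  sxy = xs.zip(ys).inject(0) { |a, (x, y)| a + x * y }
  sxx = xs.inject(0) { |a, e| a + e * e }

  w1 = Rational(m * sxy - sx * sy, m * sxx - sx * sx) # slope
  w0 = Rational(sy, m) - w1 * Rational(sx, m)         # intercept
  [w0, w1]
end

xs = [0, 1, 2, 3, 4]
ys = [3, 6, 7, 8, 11]
w0, w1 = fit_line(xs, ys)

# Residual = observed y minus the value predicted by the fitted line.
residuals = xs.zip(ys).map { |x, y| y - (w0 + w1 * x) }
residual_sum = residuals.inject(0) { |a, e| a + e }

puts "w0 = #{w0}, w1 = #{w1}"
puts "sum of residuals = #{residual_sum}"
```

For the sample points used in the script this gives w0 = 17/5 and w1 = 9/5, matching the floating-point 3.4 and 1.8 up to rounding.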
# Naive Bayes classifier using maximum-likelihood estimates (no smoothing).
# Same data and output as the smoothed version, but with raw frequencies.
require 'set'

# Training data: the words of every example in each class, flattened.
list1 = %w(a perfect world my perfect woman pretty woman)
list2 = %w(a perfect day electric storm another rainy day)
ncases1 = 3 # number of training examples in list1
ncases2 = 3 # number of training examples in list2
list1name = 'movie'
list2name = 'song'
words = %w(perfect storm) # query to classify

# Alternative data set (spam vs. ham):
# list1 = %w(offer is secret click secret link secret sports link)
# list2 = %w(play sports today went play sports secret sports event sports is today sports costs money)
# ncases1 = 3
# ncases2 = 5
# list1name = 'spam'
# list2name = 'ham'
# words = %w(today is secret)

# Vocabulary of both classes (not needed without smoothing).
x = Set.new list1
x.merge list2

n1 = list1.length.to_f
n2 = list2.length.to_f
ncases = (ncases1 + ncases2).to_f

# Priors as raw class frequencies.
plist1 = ncases1 / ncases
plist2 = ncases2 / ncases
puts "P(#{list1name}) = #{plist1}"
puts "P(#{list2name}) = #{plist2}"

# Likelihoods as raw word frequencies; a word absent from a class gets 0.
pword1 = {}
pword2 = {}
words.each do |w|
  pword1[w] = list1.count(w).to_f / n1
  pword2[w] = list2.count(w).to_f / n2
  puts "P(\"#{w}\"|#{list1name}) = #{pword1[w]}"
  puts "P(\"#{w}\"|#{list2name}) = #{pword2[w]}"
end

# Unnormalized posteriors (prior times likelihoods), then normalize.
factor1 = words.inject(1.0) {|a,e| a * pword1[e]} * plist1
factor2 = words.inject(1.0) {|a,e| a * pword2[e]} * plist2
puts "P(#{list1name}|\"#{words.join(' ')}\") = #{factor1 / (factor1 + factor2)}"
puts "P(#{list2name}|\"#{words.join(' ')}\") = #{factor2 / (factor1 + factor2)}"
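One consequence of the unsmoothed estimates above is worth seeing concretely: a query word that never occurs in a class drives that class's factor to zero, and if every class ends up with a zero factor the final division becomes 0.0 / 0.0. The sketch below shows both cases. The helper `mle_posterior` is hypothetical (its name and signature are not from the gist), and the query word "sunset" is an assumed example of a word missing from both lists.

```ruby
# Hypothetical helper: unsmoothed posterior of class 1, mirroring the
# script's float arithmetic for two classes.
def mle_posterior(list1, list2, ncases1, ncases2, words)
  ncases = (ncases1 + ncases2).to_f
  # Start from the prior and multiply in each raw word frequency.
  factor1 = words.inject(ncases1 / ncases) do |a, w|
    a * list1.count(w) / list1.length.to_f
  end
  factor2 = words.inject(ncases2 / ncases) do |a, w|
    a * list2.count(w) / list2.length.to_f
  end
  factor1 / (factor1 + factor2)
end

movie = %w(a perfect world my perfect woman pretty woman)
song  = %w(a perfect day electric storm another rainy day)

# "storm" never occurs in the movie list, so the movie factor is zero.
seen_in_one = mle_posterior(movie, song, 3, 3, %w(perfect storm))

# "sunset" occurs in neither list, so both factors are zero.
seen_in_none = mle_posterior(movie, song, 3, 3, %w(perfect sunset))

puts "P(movie|\"perfect storm\") = #{seen_in_one}"
puts "P(movie|\"perfect sunset\") = #{seen_in_none}"
```

The first call returns 0.0 and the second returns NaN, which is the behavior that add-k smoothing is meant to avoid.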