@henrygarner
Created February 1, 2011 12:04
A simple Ruby demonstration of Ted Dunning's log-likelihood statistical measure, via Paul Rayson http://ucrel.lancs.ac.uk/llwizard.html
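require 'matrix'

module LLR
  # Log-likelihood ratio of a contingency table given as a Matrix.
  def self.calculate(m)
    2 * (m.to_a.flatten.h - m.row_vectors.map(&:sum).h - m.column_vectors.map(&:sum).h)
  end

  # Sums the elements, passing each through the block first if one is given.
  def sum
    to_a.inject(nil) { |sum, x| x = yield(x) if block_given?; sum ? sum + x : x }
  end

  # Sum of x * log(x / total) over the elements, treating 0 * log(0) as 0.
  def h
    total = sum.to_f
    sum { |x| x.zero? ? 0 : x * Math.log(x / total) }
  end
end

[Vector, Array].each { |klass| klass.send :include, LLR }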

require 'rubygems'
require 'rspec'
require 'matrix'
# Ruby 1.9.2+ no longer has the current directory on the load path.
require_relative 'llr'

describe LLR do
  it "should calculate the correct LLR" do
    llr = LLR.calculate Matrix[[1,2],[3,4]]
    llr.should be_within(1e-8).of(0.08043486)

    llr = LLR.calculate Matrix[[1,0],[0,1]]
    llr.should be_within(1e-6).of(2.772589)

    llr = LLR.calculate Matrix[[10,0],[0,10]]
    llr.should be_within(1e-5).of(27.72589)

    llr = LLR.calculate Matrix[[2,0],[1,10000]]
    llr.should be_within(1e-5).of(34.25049)

    llr = LLR.calculate Matrix[[2,8],[1,10000]]
    llr.should be_within(1e-5).of(24.24724)
  end
end

@tdunning

tdunning commented Feb 1, 2011

There is a simpler implementation as well based on the close relationship between mutual information and the LLR.

In R, it goes like this:

llr = function(k) { 2 * (llr.H(k) - llr.H(rowSums(k)) - llr.H(colSums(k)) ) }
llr.H = function(k) { total = sum(k) ; sum( k * log(k / total + (k == 0)) ) }

I don't speak Ruby well enough to translate this, but I am sure that something comparable is possible.
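
The relationship between mutual information and the LLR mentioned above can be spelled out roughly as follows. This is a sketch only: the symbols N, r_i, c_j, p_ij and I(X;Y) are notation introduced here rather than taken from the comment, natural logarithms are assumed, and 0 log 0 is taken to be 0.

```latex
% Notation (introduced here): k_{ij} are the cell counts,
% r_i = \sum_j k_{ij} (row sums), c_j = \sum_i k_{ij} (column sums),
% N = \sum_{i,j} k_{ij}, and p_{ij} = k_{ij} / N.
\begin{align*}
H(k) &= \sum_{i,j} k_{ij} \log\frac{k_{ij}}{N} \\
H(k) - H(r) - H(c)
  &= \sum_{i,j} k_{ij}\left[\log\frac{k_{ij}}{N} - \log\frac{r_i}{N} - \log\frac{c_j}{N}\right] \\
  &= \sum_{i,j} k_{ij} \log\frac{k_{ij}\,N}{r_i\,c_j}
   = N \sum_{i,j} p_{ij} \log\frac{p_{ij}}{p_{i\cdot}\,p_{\cdot j}}
   = N\, I(X;Y) \\
\mathrm{LLR} &= 2\,\bigl(H(k) - H(r) - H(c)\bigr) = 2N\, I(X;Y)
\end{align*}
```

So the three calls to llr.H in the R one-liner are the three sums on the second line, and the result is 2N times the mutual information (in nats) of the row and column variables.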

@henrygarner
Author

Thank you for getting in touch with this. I'm not familiar with R but I downloaded it to have a play.

Here's an attempt to generate something comparable. It's not as terse as your R version - Ruby matrices don't by default have row- and column-summing capabilities.

module LLR
  def self.calculate(m)
    2 * (m.to_a.flatten.h - m.row_vectors.map(&:sum).h - m.column_vectors.map(&:sum).h)
  end

  def sum
    to_a.inject(nil) { |sum, x| x = yield(x) if block_given?; sum ? sum + x : x }
  end

  def h
    total = sum.to_f
    sum { |x| x.zero? ? 0.0 : x * Math.log(x / total) }
  end
end

require 'matrix'
[Vector, Array].each { |klass| klass.send :include, LLR } 

llr = LLR.calculate Matrix[[1, 2], [3, 4]]
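
For comparison, the same calculation can be sketched without mixing methods into Array and Vector. This is only an illustration, not the gist's code: the module name PlainLLR is hypothetical, and it assumes Ruby 2.4 or later, where Array#sum is built in.

```ruby
require 'matrix'

# Hypothetical module (not from the gist): the same calculation written as
# plain functions, so nothing is added to Array or Vector.
module PlainLLR
  module_function

  # Sum of x * log(x / total) over the counts, treating 0 * log(0) as 0.
  def h(counts)
    total = counts.sum.to_f
    counts.sum { |x| x.zero? ? 0.0 : x * Math.log(x / total) }
  end

  # 2 * (H(cells) - H(row sums) - H(column sums)) for a Matrix of counts.
  def calculate(m)
    cells    = m.to_a.flatten
    row_sums = m.row_vectors.map { |v| v.to_a.sum }
    col_sums = m.column_vectors.map { |v| v.to_a.sum }
    2 * (h(cells) - h(row_sums) - h(col_sums))
  end
end

puts PlainLLR.calculate(Matrix[[1, 2], [3, 4]])
```

The trade-off is a little more code in calculate in exchange for leaving the core classes untouched.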

@tdunning

tdunning commented Feb 2, 2011

It is definitely hard for me to read, but it looks plausible.

Here are some test vectors for you:
> llr(matrix(c(1,2,3,4), nrow=2))
[1] 0.08043486
> llr(matrix(c(1,0,0,1), nrow=2))
[1] 2.772589
> llr(matrix(c(10,0,0,10), nrow=2))
[1] 27.72589
> llr(matrix(c(2,0,1,10000), nrow=2))
[1] 34.25049
> llr(matrix(c(2,8,1,10000), nrow=2))
[1] 24.24724
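
The two diagonal vectors above can be checked by hand, which makes them a handy sanity test. A sketch of the arithmetic, using the same llr.H as in the R code and writing n for the diagonal count (n is notation introduced here):

```latex
% Table [[n, 0], [0, n]]: total N = 2n, both row sums and both column sums equal n.
\begin{align*}
H(k) &= n\log\frac{n}{2n} + n\log\frac{n}{2n} = -2n\log 2 \\
H(\text{rows}) &= H(\text{cols}) = -2n\log 2 \\
\mathrm{LLR} &= 2\,\bigl(-2n\log 2 + 2n\log 2 + 2n\log 2\bigr) = 4n\log 2
\end{align*}
% n = 1:  4 \log 2  \approx 2.772589
% n = 10: 40 \log 2 \approx 27.72589
```

This also shows that multiplying every count by the same factor multiplies the LLR by that factor.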

@henrygarner
Author

Below is the RSpec test I wrote to check the results against your vectors. I'm pleased to say that all assertions pass.

The initial gist does not generate the same results at all, although it seems to be a correct implementation of the formula at http://ucrel.lancs.ac.uk/llwizard.html

require 'rubygems'
require 'rspec'
require 'matrix'
require_relative 'llr'

describe LLR do
  it "should calculate the correct LLR" do

    llr = LLR.calculate Matrix[[1,2],[3,4]]
    llr.should be_within(1e-8).of(0.08043486)

    llr = LLR.calculate Matrix[[1,0],[0,1]]
    llr.should be_within(1e-6).of(2.772589)

    llr = LLR.calculate Matrix[[10,0],[0,10]]
    llr.should be_within(1e-5).of(27.72589)

    llr = LLR.calculate Matrix[[2,0],[1,10000]]
    llr.should be_within(1e-5).of(34.25049)

    llr = LLR.calculate Matrix[[2,8],[1,10000]]
    llr.should be_within(1e-5).of(24.24724)

  end
end
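
On the remark that the initial gist gave different results: the initial code is not shown in this thread, so the following is only one plausible explanation, not a diagnosis. It assumes the initial version used a two-term form of the statistic, in which only the two word frequencies a and b contribute and c and d are the corpus sizes, whereas the test vectors come from a calculation over all four cells of the table. The function names below are hypothetical.

```ruby
# Hypothetical helpers, not taken from either version of the gist.

# Two-term form (an assumption about what the initial gist computed):
# only the word frequencies a and b contribute; c and d are corpus sizes.
def two_term_ll(a, b, c, d)
  e1 = c * (a + b) / (c + d).to_f   # expected frequency in corpus 1
  e2 = d * (a + b) / (c + d).to_f   # expected frequency in corpus 2
  term = ->(obs, exp) { obs.zero? ? 0.0 : obs * Math.log(obs / exp) }
  2 * (term.(a, e1) + term.(b, e2))
end

# Four-cell form: every cell of the 2x2 table [[a, b], [c - a, d - b]]
# contributes observed * log(observed / expected).
def four_cell_ll(a, b, c, d)
  table    = [[a, b], [c - a, d - b]]
  total    = (c + d).to_f
  row_sums = table.map(&:sum)
  col_sums = table.transpose.map(&:sum)
  ll = 0.0
  table.each_with_index do |row, i|
    row.each_with_index do |obs, j|
      next if obs.zero?               # 0 * log(0) is treated as 0
      expected = row_sums[i] * col_sums[j] / total
      ll += obs * Math.log(obs / expected)
    end
  end
  2 * ll
end

# The table [[1, 2], [3, 4]] read as a = 1, b = 2 with corpus sizes c = 4, d = 6.
puts two_term_ll(1, 2, 4, 6)
puts four_cell_ll(1, 2, 4, 6)
```

Under these assumptions the four-cell value agrees with the first test vector, while the two-term value is smaller because it leaves out the two "other words" cells.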
