@aquach
Created September 4, 2015 01:07
Cheap pairwise correlation of two binary variables with confusion matrix
# Requires the Hirb gem and Ruby 2.2+ (Object#itself was added in 2.2).
# data is an array of [test_positive, condition_positive] boolean pairs,
# one pair per data point.
require 'hirb'

def correlate(data, test_event: 'Test', condition_event: 'Truth')
  # Count how many data points fall into each [test, condition] bucket.
  counts = data.group_by(&:itself).map { |bucket, v| [ bucket, v.length ] }.to_h

  # 2x2 confusion matrix: rows are the test outcome, columns the condition.
  table = [
    [ counts[[true, true]] || 0, counts[[true, false]] || 0 ],
    [ counts[[false, true]] || 0, counts[[false, false]] || 0 ]
  ]

  # Row and column totals.
  all_test_positive = table[0][0] + table[0][1]
  all_test_negative = table[1][0] + table[1][1]
  all_condition_positive = table[0][0] + table[1][0]
  all_condition_negative = table[0][1] + table[1][1]

  # Phi coefficient of the 2x2 table. Any zero row or column total makes
  # the denominator zero, so the result is NaN in that case.
  correlation = (table[0][0] * table[1][1] - table[1][0] * table[0][1]).to_f / Math.sqrt(
    all_test_positive * all_test_negative * all_condition_positive * all_condition_negative)

  # Predictive values (per row) and sensitivity/specificity (per column).
  ppv = table[0][0].to_f / all_test_positive
  npv = table[1][1].to_f / all_test_negative
  sensitivity = table[0][0].to_f / all_condition_positive
  specificity = table[1][1].to_f / all_condition_negative

  headers = [ "Corr: #{correlation.round(2)}", condition_event, "NOT #{condition_event}", '' ]
  Hirb::Helpers::Table.render([
    [ test_event, table[0][0], table[0][1], "PPV: #{ppv.round(2)}" ],
    [ "NOT #{test_event}", table[1][0], table[1][1], "NPV: #{npv.round(2)}" ],
    [ '', "Sensitivity: #{sensitivity.round(2)}", "Specificity: #{specificity.round(2)}", '' ]
  ], headers: headers)
end
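
The arithmetic inside `correlate` can be checked without Hirb. The sketch below is not part of the gist: `confusion_stats` is a hypothetical helper that repeats the same formulas (phi coefficient, PPV, NPV, sensitivity, specificity) and returns them in a Hash. It also assumes that returning `nil` is an acceptable stand-in when a denominator is zero, which is a design choice the original does not make. The sample counts are made up for illustration.

```ruby
# Hypothetical helper (not in the gist): same formulas as correlate,
# but returns a Hash instead of rendering a table, so it needs no gems.
def confusion_stats(data)
  counts = Hash.new(0)
  data.each { |pair| counts[pair] += 1 }

  tp = counts[[true, true]]    # test positive, condition positive
  fp = counts[[true, false]]   # test positive, condition negative
  fn = counts[[false, true]]   # test negative, condition positive
  tn = counts[[false, false]]  # test negative, condition negative

  # Assumption: report nil instead of NaN when a denominator is zero.
  ratio = ->(num, den) { den.zero? ? nil : num.to_f / den }

  # Square root of the product of the four row/column totals.
  denom = Math.sqrt((tp + fp) * (fn + tn) * (tp + fn) * (fp + tn))

  {
    correlation: ratio.call(tp * tn - fn * fp, denom),
    ppv:         ratio.call(tp, tp + fp),
    npv:         ratio.call(tn, fn + tn),
    sensitivity: ratio.call(tp, tp + fn),
    specificity: ratio.call(tn, fp + tn)
  }
end

# Made-up sample: 40 true positives, 10 false positives,
# 5 false negatives, 45 true negatives.
sample = [[true, true]] * 40 + [[true, false]] * 10 +
         [[false, true]] * 5 + [[false, false]] * 45

stats = confusion_stats(sample)
puts stats[:correlation].round(2)  # prints 0.7
```

With this sample the row totals are 50 and 50 and the column totals are 45 and 55, so PPV is 40/50 and NPV is 45/50. Passing the same `sample` to `correlate` should show matching figures in the rendered table.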