@kieranrcampbell
Created March 16, 2017 15:39
Calculate true positive rate, false positive rate & false discovery rate from contingency table in R
## Suppose we have a contingency table tbl formed by the table(...) command in R, with
## a logical vector of discoveries as the first argument and a logical vector of
## the ground truth as the second, e.g. tbl <- table(discoveries, ground_truth). Then
## this function calculates the true positive rate, false positive rate and false discovery rate
## as per the Wikipedia definitions at https://en.wikipedia.org/wiki/Sensitivity_and_specificity
calculate_statistics <- function(tbl) {
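  # Rows are discoveries, columns are ground truth; FALSE sorts before TRUE
  P <- sum(tbl[, 2])  # condition positives (ground truth TRUE)
  N <- sum(tbl[, 1])  # condition negatives (ground truth FALSE)
  TP <- tbl[2, 2]
  TN <- tbl[1, 1]
  FP <- tbl[2, 1]  # discovered, but ground truth is FALSE
  FN <- tbl[1, 2]  # not discovered, but ground truth is TRUE
  TPR <- TP / P
  FPR <- FP / N
  FDR <- FP / (TP + FP)
  data.frame(P = P, N = N, TP = TP, TN = TN, FP = FP,
             FN = FN, TPR = TPR, FPR = FPR, FDR = FDR)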
}
@iainsproat

I think the off-diagonals (false-positive, false-negative) are not correct per the confusion matrix, and should instead be:

  FP <- tbl[2,1]
  FN <- tbl[1,2]
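
One way to check the orientation is to build a small table by hand. The sketch below uses made-up toy vectors (they are not from the gist) and assumes the layout described in the header comment: `table(discoveries, ground_truth)`, with discoveries as rows and ground truth as columns. It indexes cells by their dimnames rather than by position, so the meaning of each cell is explicit.

```r
# Hypothetical toy data: 3 true positives, 2 false positives,
# 1 false negative and 4 true negatives.
discoveries  <- c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
ground_truth <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)

# Rows are discoveries, columns are ground truth.
tbl <- table(discoveries, ground_truth)

# Indexing by dimnames avoids relying on row/column positions.
TP <- tbl["TRUE", "TRUE"]
FP <- tbl["TRUE", "FALSE"]   # discovered, but not actually true
FN <- tbl["FALSE", "TRUE"]   # actually true, but not discovered
TN <- tbl["FALSE", "FALSE"]

# Because FALSE sorts before TRUE, these are the positional equivalents.
stopifnot(FP == tbl[2, 1], FN == tbl[1, 2])

TPR <- TP / (TP + FN)   # true positive rate
FPR <- FP / (FP + TN)   # false positive rate
FDR <- FP / (TP + FP)   # false discovery rate
```

Name-based indexing fails with a "subscript out of bounds" error if one of the vectors contains only `TRUE` or only `FALSE`, which is arguably preferable to silently reading the wrong cell.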
