@evilying
Created July 7, 2018 20:44
Precision-Recall

Precision-Recall is a useful measure of prediction success when the classes are very imbalanced. In information retrieval, precision is a measure of result relevancy, while recall is a measure of how many truly relevant results are returned.
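As a concrete illustration of the two definitions, here is a minimal pure-Python sketch (the labels and predictions below are made-up, not from the gist): precision is the fraction of predicted positives that are truly positive, and recall is the fraction of true positives that were found.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = relevant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy data: 3 relevant items, the classifier returns 3 results, 2 of them correct.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
p, r = precision_recall(y_true, y_pred)
# p = 2/3 (two of three returned results are relevant)
# r = 2/3 (two of three relevant items were returned)
```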

The precision-recall curve shows the tradeoff between precision and recall at different thresholds. A high area under the curve represents both high recall and high precision, where high precision relates to a low false positive rate, and high recall relates to a low false negative rate. High scores for both show that the classifier is returning accurate results (high precision), as well as returning a majority of all positive results (high recall).
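To make the threshold tradeoff concrete, the following sketch traces the curve by hand: it sweeps a decision threshold over classifier scores and records precision and recall at each one. The scores and labels are invented for illustration.

```python
def pr_curve(y_true, scores):
    """Trace precision and recall at each distinct score threshold, descending."""
    points = []
    for thr in sorted(set(scores), reverse=True):
        preds = [1 if s >= thr else 0 for s in scores]
        tp = sum(1 for p, t in zip(preds, y_true) if p and t)
        fp = sum(1 for p, t in zip(preds, y_true) if p and not t)
        fn = sum(1 for p, t in zip(preds, y_true) if not p and t)
        points.append((thr, tp / (tp + fp), tp / (tp + fn)))
    return points

# Hypothetical classifier scores for five examples (1 = relevant).
y_true = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
curve = pr_curve(y_true, scores)
# Lowering the threshold trades precision for recall:
# thr=0.8 -> precision 1.0,  recall 2/3
# thr=0.6 -> precision 0.75, recall 1.0
```

In practice scikit-learn provides `sklearn.metrics.precision_recall_curve` and `sklearn.metrics.average_precision_score` for this, operating on scores such as `decision_function` or `predict_proba` output.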

A system with high recall but low precision returns many results, but most of its predicted labels are incorrect when compared to the training labels. A system with high precision but low recall is just the opposite, returning very few results, but most of its predicted labels are correct when compared to the training labels. An ideal system with high precision and high recall will return many results, with all results labeled correctly.
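The two extremes described above can be shown side by side with made-up predictions over the same labels: a "greedy" system that flags most items, and a "cautious" one that flags only its surest case.

```python
def pr(y_true, y_pred):
    """Precision = TP / predicted positives; recall = TP / actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    return tp / sum(y_pred), tp / sum(y_true)

y_true   = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
greedy   = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]  # flags most items
cautious = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # flags only the surest item

# greedy:   high recall, low precision -> precision 3/7, recall 1.0
# cautious: high precision, low recall -> precision 1.0, recall 1/3
```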
