@dusenberrymw
Last active January 3, 2024

Interesting Machine Learning / Deep Learning Scenarios

This gist aims to explore interesting scenarios that may be encountered while training machine learning models.

Increasing validation accuracy and loss

Let's imagine a scenario where the validation accuracy and loss both begin to increase. Intuitively, it seems like this scenario should not happen, since loss and accuracy seem like they would have an inverse relationship. Let's explore this a bit in the context of a binary classification problem in which a model parameterizes a Bernoulli distribution (i.e., it outputs the "probability" of the true class) and is trained with the associated negative log likelihood as the loss function (i.e., the "logistic loss" == "log loss" == "binary cross entropy").
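
Concretely, for a single example with label y ∈ {0, 1} and predicted probability p, this loss is

loss(p, y) = -[ y·log(p) + (1 - y)·log(1 - p) ]

so a confidently correct prediction (p near 1 for y = 1) has a loss near 0, while a barely correct prediction (p just above 0.5) still incurs a loss near log(2) ≈ 0.693.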

Imagine that when the model is predicting a probability of 0.99 for a "true" class, the model is both correct (assuming a decision threshold of 0.5) and has a low loss since it can't do much better for that example. Now, imagine that the model starts to predict a probability of 0.51 for that same example. In this case, the model is still correct, but the loss will be much higher. That covers the case of a flattened accuracy alongside increasing loss. Let's now add in another example for which the model was originally incorrectly predicting 0.49 for a true class and is now correctly predicting 0.51. For this individual example, there will only be a small decrease in loss. If we imagine that both changes occur at the same time, the model will have both a higher accuracy and a higher loss (assuming the earlier loss increase is greater than the small decrease for this example).

Example:

import numpy as np

def log_loss(pred, y):
  # mean binary cross-entropy (negative log likelihood of a Bernoulli)
  n = len(pred)
  losses = -y*np.log(pred) - (1-y)*np.log(1-pred)
  loss = np.sum(losses) / n
  return loss

def accuracy(pred, y, threshold=0.5):
  # percent of thresholded predictions that match the labels
  pred = pred >= threshold
  acc = np.mean(pred == y)
  return acc * 100

# almost perfect
pred = np.array([0.99])
y = np.array([1])
loss = log_loss(pred, y)
acc = accuracy(pred, y)
print(loss, acc)  # 0.0100503358535 100.0

# barely correct -- no change in accuracy, much higher loss
pred = np.array([0.51])
y = np.array([1])
loss = log_loss(pred, y)
acc = accuracy(pred, y)
print(loss, acc)  # 0.673344553264 100.0

# one barely incorrect prediction, one very correct prediction
pred = np.array([0.49, 0.99])
y = np.array([1, 1])
loss = log_loss(pred, y)
acc = accuracy(pred, y)
print(loss, acc)  # 0.361700111865 50.0

# two barely correct predictions -- higher accuracy, higher loss
pred = np.array([0.51, 0.51])
y = np.array([1, 1])
loss = log_loss(pred, y)
acc = accuracy(pred, y)
print(loss, acc)  # 0.673344553264 100.0
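
Putting the two cases together: a minimal sketch (reusing the functions above, with made-up probabilities for a hypothetical four-example validation set) of two consecutive epochs in which every prediction drifts toward the decision boundary. Accuracy rises from 50% to 100% while the mean loss also rises.

# hypothetical validation set: four positive examples
y = np.array([1, 1, 1, 1])

# epoch 1: two confident correct predictions, two barely incorrect ones
pred_epoch1 = np.array([0.99, 0.99, 0.49, 0.49])
print(log_loss(pred_epoch1, y), accuracy(pred_epoch1, y))  # 0.361700111865 50.0

# epoch 2: every prediction drifts to just above the 0.5 threshold
pred_epoch2 = np.array([0.51, 0.51, 0.51, 0.51])
print(log_loss(pred_epoch2, y), accuracy(pred_epoch2, y))  # 0.673344553264 100.0

# accuracy went up (50% -> 100%), yet the mean loss nearly doubled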
@muween commented Jan 2, 2019

Is there any solution for this?

@Jasonsey commented Jul 1, 2019

Great hypothesis!

@FengYuanhao

I have a different case that is beyond my understanding: my training AUC is decreasing while the loss is decreasing too, and the AUC went down from 0.6 to 0.48.

@wll199566 commented Apr 21, 2020

If this happens (both validation loss and accuracy are increasing), and assuming the training loss is still decreasing, should we consider it an overfitting problem and stop training early? Or should we wait until the accuracy begins to decrease and pick the checkpoint with the best accuracy?

@mathshangw

> If this happens (both validation loss and accuracy are increasing), and assuming the training loss is still decreasing, should we consider it an overfitting problem and stop training early? Or should we wait until the accuracy begins to decrease and pick the checkpoint with the best accuracy?

I'm facing the same problem. How can I solve it?

@ORainn commented Mar 11, 2022

Now I also encounter this problem. I am training MViT on a very large dataset (7M images). The training loss decreases normally, and the accuracy of both the training and test sets increases normally, but the test loss stays at a high level, which is very strange. Like this:
epoch 1  train: Prec@1 37.562  Prec@5 66.880  Loss 2.68560
epoch 2  train: Prec@1 50.227  Prec@5 78.993  Loss 1.92638  test: Prec@1 56.141  Prec@5 83.280  Loss 4.53577
epoch 3  train: Prec@1 54.411  Prec@5 81.937  Loss 1.74247
epoch 4  train: Prec@1 56.737  Prec@5 83.578  Loss 1.63814  test: Prec@1 59.701  Prec@5 85.663  Loss 4.48434

@GANG370 commented Oct 11, 2022

Replying to @ORainn's comment above (https://gist.github.com/dusenberrymw/89bc12a8f9a9afaacdb91668abe4065d?permalink_comment_id=4093604#gistcomment-4093604):
I have the same problem. Please help me!
