Last active
March 11, 2024 17:22
About the miscalibration of logistic regression models
Note that the model is severely miscalibrated despite the balance property. This property only informs us about marginal calibration, not about auto-calibration.
This is not the whole story. The balance property only holds with respect to the columns of the design matrix of the logistic regression. If the design matrix is badly chosen, the balance property is (very) weak. For instance, if it contains only random features that are uncorrelated with the target, the balance property reduces to the marginal one (the conditioning drops out), which is weak.
On the other hand, for a "correct" design matrix, the balance property is stronger than auto-calibration.
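To make the point about the design matrix concrete, here is a minimal sketch on synthetic data. It is not taken from the notebook: the data-generating process, the helper `fit_logistic` (a plain Newton solver for unpenalized logistic regression) and all variable names are assumptions made for illustration. It checks that the score equations `X.T @ (y - p) = 0` hold for whatever columns are in the design matrix, and that with noise-only columns this says little beyond matching the marginal mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data: the target depends on a single informative feature.
x_informative = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-2 * x_informative)))


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def fit_logistic(X, y, n_iter=50):
    """Unpenalized logistic regression fitted with Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (y - p)  # score vector
        hess = (X * (p * (1 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return beta


# Badly chosen design matrix: intercept plus pure-noise columns.
X_noise = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
p_noise = sigmoid(X_noise @ fit_logistic(X_noise, y))

# "Correct" design matrix: intercept plus the informative feature.
X_info = np.column_stack([np.ones(n), x_informative])
p_info = sigmoid(X_info @ fit_logistic(X_info, y))

# Balance property: residuals are orthogonal to every design column.
print("score (noise design):", X_noise.T @ (y - p_noise) / n)
print("score (info design): ", X_info.T @ (y - p_info) / n)

# Compare observed and predicted rates on a slice that the noise
# design knows nothing about.
mask = x_informative > 0
print("observed rate for x > 0:   ", y[mask].mean())
print("noise design prediction:   ", p_noise[mask].mean())
print("info design prediction:    ", p_info[mask].mean())
```

Both models satisfy the balance property for their own columns up to numerical precision, but only the second one gives predictions that track the observed rate on the `x_informative > 0` slice. The slice check is only a rough diagnostic of calibration, not a full test of auto-calibration.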
Another possibility would be that the solutions are comparable but with a large variation (as shown on the validation curve), and we pick another C just because of this.