\n", "This notebook supports [this blog article](https://mkffl.github.io/2022/03/02/Decisions-Part-3.html)" ], "metadata": { "id": "KPhuGzHOTnPv" } }, { "cell_type": "markdown", "metadata": { "id": "k0iuDR861Vsi" }, "source": [ "## Objective\n", "- Visual validation of ML model score calibration\n", "  - Calibration of log-likelihood ratios using logistic regression\n", "  - Validation using known score distributions to calculate the true ratios\n", "  - Visual inspection of the calibration fit on a separate dataset of scores\n", "\n", "## Logistic regression calibration\n", "- Estimate the log-likelihood ratio (llr) of a score $x$\n", "$$\\text{llr}(x ; \\omega) = \\log \\frac{ p(x \\vert \\omega_1)}{p(x \\vert \\omega_0)}$$\n", "- Use logistic regression, which by default estimates the target-class log-odds, i.e.\n", "$$\\text{log-odds}(\\omega; x) = \\log \\frac{ p(\\omega_1 \\vert x)}{p(\\omega_0 \\vert x)}$$\n", "\n", "- Convert log-odds to llr with Bayes' rule: log-odds = llr + effective prior (ep), where ep is the prior log-odds $\\log \\frac{p(\\omega_1)}{p(\\omega_0)}$ in the training data, hence llr = log-odds - ep\n", "\n", "## Score distributions\n", "- I test two assumptions for the class-conditional score densities\n", "  - Gaussian: logistic regression calibration works well, as expected\n", "  - Skew-normal: the calibration shows a lack of fit\n", "\n", "The main source is *Tutorial on logistic-regression calibration and fusion: Converting a score to a likelihood ratio* by G. S. Morrison.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "do-mtxdB1Vss" }, "outputs": [], "source": [ "# Stats\n", "import numpy as np\n", "import seaborn as sns\n", "from scipy.special import logit, expit\n", "from scipy.stats import norm as f_norm, skewnorm\n", "\n", "# Off the shelf stats models\n", "from sklearn.linear_model import LogisticRegression as lr\n", "\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "from IPython.display import Image, HTML" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Zm67U0vf1Vs9" }, "outputs": [], "source": [ 
"# Utils\n", "def reshape_to_1d(array):\n", "    \"\"\" Reshape a 1-D array [] into a 2-D column [[]], a requirement from sklearn. \"\"\"\n", "    return np.reshape(array, (-1, 1))\n", "\n", "def get_logistic_estimates(clf):\n", "    \"\"\" Return the fitted logistic regression intercept and slope \"\"\"\n", "    return clf.intercept_[0], clf.coef_[0][0]\n", "\n", "def get_effective_prior(tarN, nontarN):\n", "    \"\"\" Prior log-odds of the target class given the class counts \"\"\"\n", "    return logit(tarN/(tarN+nontarN))\n", "\n", "def logistic_logodds(x, β_0, β_1):\n", "    return β_0 + β_1*x\n", "\n", "def log_likelihood_ratio_density(tar_density, nontar_density):\n", "    \"\"\" Build the true llr function from the two class-conditional densities \"\"\"\n", "    def func(x):\n", "        return np.log(tar_density(x)/nontar_density(x))\n", "\n", "    return func\n", "\n", "# Data\n", "def generate_data(nontar_rv, tar_rv, nontarN, tarN):\n", "    \"\"\" Simulate the scores from two classes\n", "\n", "    Args:\n", "        nontar_rv (scipy.continuous_dist): rv for w0 class\n", "        tar_rv (scipy.continuous_dist): rv for w1 class\n", "        nontarN (int): number of w0 scores to draw\n", "        tarN (int): number of w1 scores to draw\n", "    \"\"\"\n", "    w0_sample = nontar_rv.rvs(nontarN)\n", "\n", "    w1_sample = tar_rv.rvs(tarN)\n", "\n", "    X = np.concatenate([w0_sample, w1_sample])\n", "\n", "    # Labels: 0 for w0 (non-target), 1 for w1 (target)\n", "    y = [0 for _ in w0_sample] + [1 for _ in w1_sample]\n", "\n", "    return X, y\n", "\n", "# Validation\n", "def logistic_calibration_validation(x, y, test_x, nontarN, tarN, nontar_rv, tar_rv):\n", " \"\"\" Compare the llr estimated by logistic regression with the true llr from the known densities.\n", "\n", " Args:\n", "   x (np.array): raw scores\n", "   y (np.array): labels w0 or w1\n", "   test_x (np.array): validation dataset to visually inspect the calibrated scores\n", "   nontarN (int): number of w0 scores in x\n", "   tarN (int): number of w1 scores in x\n", "   nontar_rv (scipy.continuous_dist): rv for w0\n", "   tar_rv (scipy.continuous_dist): rv for w1\n", " \"\"\"\n", " # True llr\n", " llr_density = log_likelihood_ratio_density(tar_rv.pdf, nontar_rv.pdf)\n", " \n", " llr_true = [llr_density(x) for x in test_x]\n", " \n", " # Estimated llr\n", " x_1d = reshape_to_1d(x)\n", " \n", " # Note: sklearn's LogisticRegression applies L2 regularisation by default\n", " clf = lr(random_state=0).fit(x_1d, y)\n", " \n", " β_0, β_1 = get_logistic_estimates(clf)\n", " \n", " lo_preds = [logistic_logodds(x, β_0, β_1) for x in 
test_x]\n", " \n", " effective_prior = get_effective_prior(tarN, nontarN)\n", " \n", " # llr = log-odds - effective prior\n", " llr_preds = [(lo - effective_prior) for lo in lo_preds]\n", " \n", " return llr_true, llr_preds\n", "\n", "\n", "def plot_true_and_predicted_llr(test_x, llr_true, llr_preds, title):\n", "    (\n", "        sns.lineplot(\n", "            x='x',\n", "            y='value',\n", "            hue='variable',\n", "            data=(\n", "                pd.DataFrame({\n", "                    \"x\": test_x,\n", "                    \"llr_true\": llr_true,\n", "                    \"llr_preds\": llr_preds})\n", "                .melt(\"x\")\n", "            )\n", "        )\n", "        .set_title(title)\n", "    )\n", "\n", "# main\n", "def run(nontar_rv, tar_rv, nontarN, tarN, title):\n", "    X, y = generate_data(nontar_rv, tar_rv, nontarN, tarN)\n", "\n", "    # histplot replaces the deprecated sns.distplot\n", "    sns.histplot(X[:nontarN], kde=True, stat=\"density\")\n", "    sns.histplot(X[nontarN:], kde=True, stat=\"density\")\n", "    plt.title(\"Class-conditional score distributions\")\n", "    plt.show()\n", "\n", "    # Validation sample scores\n", "    test_x = np.linspace(-1, 1, 15)  # wider grid: np.linspace(-3, 3, 40)\n", "\n", "    llr_true, llr_preds = logistic_calibration_validation(X, y, test_x, nontarN, tarN, nontar_rv, tar_rv)\n", "\n", "    plot_true_and_predicted_llr(test_x, llr_true, llr_preds, title)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6fB-NYZj1VtF", "outputId": "fb3c2a39-24ad-494f-8802-aee06ef3b26b" }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "