@jtrive84
Created June 5, 2024 21:29
Severity Distributions
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Estimating Size of Loss Distributions\n",
"\n",
"**Author: James D. Triveri**\n",
"<br>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Severity distributions are used in order to describe the loss process associated with a group of homogeneous claims, \n",
"and in ERM settings to quantify operational risk. \n",
"<br> \n",
"\n",
"* Empirical data can be useful in producing central estimates, but tend to fall short when attempting to quantify \n",
"variability and/or tail behavior. \n",
"<br> \n",
"* Actuaries typically implement separate models for frequency and severity to obtain an aggregate loss distribution \n",
"(compositie model). \n",
"<br>\n",
"* Losses are assumed to originate from specific heavy-tailed probability distributions, which are positively skewed \n",
"and very often have relatively high probabilities in the right tails. \n",
"<br>\n",
"* Candidate distributions include: \n",
"<br> \n",
" - **exponential/gamma** \n",
" - **lognormal** \n",
" - **pareto/lomax** \n",
" - **weibull** \n",
" - **inverse gaussian** \n",
" - **burr/log-logistic** \n",
" - **gumbel** \n",
" - **exponential mixture** \n",
"<br>\n",
"* Candidate models must have support on $[0, +\\infty)$.\n"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 720x360 with 3 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Example of long-tailed right-skewed distribution.\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"from scipy import stats\n",
"%matplotlib inline\n",
"\n",
"rv0 = stats.gamma(a=2.5, scale=1000)\n",
"x0 = np.linspace(rv0.ppf(.0001), rv0.ppf(.9999), 100)\n",
"vals0 = rv0.pdf(x0)\n",
"\n",
"rv1 = stats.lognorm(s=.5, scale=10)\n",
"vals1 = rv1.pdf(np.arange(rv1.ppf(.0001), rv1.ppf(.9999)))\n",
"\n",
"rv2 = stats.lomax(c=3)\n",
"vals2 = rv2.pdf(np.arange(rv2.ppf(.0001), rv2.ppf(.9999)))\n",
"\n",
"\n",
"rv3 = stats.weibull_min(c=1.5)\n",
"vals3 = rv3.pdf(np.arange(rv3.ppf(.0001), rv3.ppf(.9999)))\n",
"\n",
"rv4 = stats.invgauss(mu=.145)\n",
"vals4 = rv4.pdf(np.arange(rv4.ppf(.0001), rv4.ppf(.9999)))\n",
"\n",
"fig, ax = plt.subplots(1, 3, tight_layout=True, figsize=(10, 5))\n",
"\n",
"ax[0].plot(vals0, lw=3, c=\"r\")\n",
"ax[0].set_yticks([]); ax[0].set_xticks([])\n",
"ax[0].set_title(\"gamma\")\n",
"\n",
"ax[1].plot(vals1, lw=3, c=\"b\")\n",
"ax[1].set_yticks([]); ax[1].set_xticks([])\n",
"ax[1].set_title(\"lognormal\")\n",
"\n",
"ax[2].plot(vals2, lw=3, c=\"g\")\n",
"ax[2].set_yticks([]); ax[2].set_xticks([])\n",
"ax[2].set_title(\"lomax\")\n",
"\n",
"plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Considerations\n",
"\n",
"* Define the target metric (ground-up losses, paid losses, with or without ALAE)? \n",
"* How far back should we look? \n",
"* Treatment of open claims: How to bring to ultimate? \n",
"* Trend considerations\n",
"* How should claims be grouped (LOB, niche, sub-niche, AG)? \n",
"* Individual or grouped data? \n",
"\n",
"\n",
"<br> \n",
"\n",
"\n",
"\n",
"* Define the target metric: **ground-up loss** \n",
"* How far back should we look?: **Claims that have closed within the last 3 years** \n",
"* Treatment of open claims: **Only focus on closed claims** \n",
"* Trend considerations: **No trend** \n",
"* How should claims be grouped: **AG** \n",
"* Individual or grouped data: **Individual**\n",
"\n",
"<br> \n",
"\n",
"<br> \n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Severity Modeling Process\n",
"\n",
"The severity modeling process can be summarized in 3 steps:\n",
"\n",
"0. Assume claims arise as realizations from a certain distribution or family of distributions. \n",
"\n",
" \n",
"1. Estimate the parameters of the selected parametric distribution using **maximum likelihood** using available loss data. \n",
"\n",
"\n",
"2. Test whether the distribution provides an adequate fit to the data using likelihood, Kolmogorov-Smirnov, AIC and minimum $\\chi^{2}$. \n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br> \n",
"\n",
"\n",
"## Modeling Approach: Probability vs. Likelihood\n",
"\n",
"**Probability quantifies anticipation of an outcome**: \n",
"\n",
"> *If a fair coin is flipped 10 times, what is the probability of observing 8 heads?*\n",
"\n",
"<br>\n",
"\n",
"\n",
"$$\n",
"P(k=8) = {10\\choose 8} .50^{8} .50^{2} = 0.087890625\n",
"$$\n",
"\n",
"\n",
"**Likelihood quantifies trust in a model**: \n",
"\n",
"> *A coin is flipped 10 times resulting in 8 heads. What is the most likely value of $p$? \n",
"Is the coin fair? To what extent does the sample support the hypothesis that \n",
"$P(H) = P(T) = .50$?* (e.g., a binomial distribution with $n=10$ and $p=.50$). \n",
"\n",
"\n",
"\n",
"<br>\n",
"\n",
"Probability and likelihood are in a sense inverses of each other. For the former, we start with a known \n",
"parameter, and estimate the probability of a given sample. The latter starts with the sample, and \n",
"determines which parameter makes the sample most likely. \n",
"\n",
"\n",
"When modeling real life stochastic processes, we often do not know $\\theta$. We observe process outcomes, \n",
"and the goal is to arrive at the most plausible estimate for $\\theta$ given the data. \n",
"\n",
"<br>\n",
"\n",
"\n",
"<br>\n"
]
},
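{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough numerical companion to the coin example above, the sketch below first evaluates the probability of \n",
"8 heads in 10 flips of a fair coin, then holds the sample fixed and scans a grid of candidate values of $p$ to find \n",
"the one under which the sample is most likely. The grid and the variable names are illustrative choices. \n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy import stats\n",
"\n",
"n_flips, n_heads = 10, 8\n",
"\n",
"# Probability: the parameter is known (p = 0.5), the outcome is uncertain.\n",
"prob_fair = stats.binom.pmf(n_heads, n_flips, 0.5)\n",
"\n",
"# Likelihood: the outcome is fixed (8 heads), the parameter varies.\n",
"p_grid = np.linspace(0.01, 0.99, 99)\n",
"likelihood = stats.binom.pmf(n_heads, n_flips, p_grid)\n",
"p_hat = p_grid[np.argmax(likelihood)]\n",
"\n",
"print(f'P(8 heads | p = 0.5) = {prob_fair:.6f}')\n",
"print(f'Grid value of p maximizing the likelihood: {p_hat:.2f}')\n",
"```\n",
"\n",
"The likelihood peaks at the sample proportion, $8/10$, rather than at the fair-coin value of $.50$. \n"
]
},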
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Maximum Likelihood Estimation\n",
"\n",
"<br>\n",
"\n",
"\n",
"Our goal is to fit the following property losses to an exponential distribution:\n",
"<br>\n",
"\n",
"```\n",
"12750 15250 17000 21200 50000* 50000*\n",
"```\n",
"\n",
"<br> \n",
"\n",
"The exponential distribution has a single parameter, $\\lambda$, and density given by: \n",
"\n",
"\n",
"$$\n",
"f(x) = \\lambda e^{-\\lambda x}, \\hspace{.50em} \\lambda > 0, x \\geq 0\n",
"$$\n",
"\n",
"<br>\n",
"\n",
"\n",
"Which $\\lambda$ makes our sample of property losses \n",
"most likely, assuming they originate from an exponential distribution. \n",
"\n",
"<br> \n",
"\n",
"\n",
"\n",
"## Derivation\n",
"\n",
"I'll first present the setup for uncensored data. The maximum likelihood estimator for i.i.d. random \n",
"variables can be expressed as a product of the proposed densities for each observation in the sample. \n",
"For a sample consisting of $n$ observations, we have:\n",
"\n",
"$$\n",
"L(\\lambda) = \\prod_{i=1}^{n} f(x_{i}|\\lambda)\n",
"$$\n",
"\n",
"<br>\n",
"\n",
"It is generally easier to work with sums rather than products when computing derivatives. Therefore it is \n",
"common (but not required) to obtain an expression for the *log-likelihood*. Recall that the log of the product \n",
"of terms is equivalent to the sum of the logs, therefore our expression for the log-likelihood is:\n",
"\n",
"\n",
"$$\n",
"l(\\lambda) = \\mathrm{Ln}\\big(\\prod_{i=1}^{n} f(x_{i}|\\lambda)\\big) = \\sum_{i=1}^{n} \\mathrm{Ln}(f(x_{i}|\\lambda)), \n",
"\\hspace{.50em} \\mathrm{for} \\hspace{.25em} 1 \\leq i \\leq n.\n",
"$$\n",
"\n",
"<br>\n",
"\n",
"The expression for the log-likelihood is then set to 0, the derivative is computed, and parameters of interest \n",
"solved for. In all but the simplest models, direct solutions are not available, and numerical techniques must \n",
"be used to arrive at a solution.\n",
"\n",
"\n",
"$$\n",
"\\begin{align*} \n",
"L(\\lambda) &= \\prod_{i=1}^{n} f(x_{i}|\\lambda)\\\\\n",
"&=\\prod_{i=1}^{6} \\lambda e^{-\\lambda x_{i}}\\\\\n",
"&=\\lambda^{6} \\cdot e^{-12750 \\lambda} \\cdot e^{-15250 \\lambda} \\cdot e^{-17000 \\lambda} \n",
"\\cdot e^{-21200 \\lambda} \\cdot e^{-50000 \\lambda} \\cdot e^{-50000 \\lambda}\\\\\n",
"&=\\lambda^{6} \\cdot e^{-\\lambda(12750 + 15250 + 17000 + 21200 + 50000 + 50000)}\\\\\n",
"&=\\lambda^{6} \\cdot e^{-166200 \\lambda}\\\\\n",
"&6 \\cdot \\mathrm{Ln}(\\lambda) - 166200 \\cdot \\lambda\\\\\n",
"&\\frac{\\partial{l}}{\\partial{\\lambda}} = \\frac{6}{\\lambda} - 166200\\\\\n",
"\\end{align*}\n",
"$$\n",
"\n",
"\n",
"\n",
"Setting equal to 0 and rearranging provides a direct solution for $\\lambda$, the \n",
"maximum likelihood estimator $\\hat{\\lambda}$:\n",
"\n",
"\n",
"$$\n",
"\\begin{align*} \n",
"0 &= \\frac{6}{\\lambda} - 166200,\\\\\n",
"166200 &= \\frac{6}{\\lambda}, \\\\\n",
"\\hat{\\lambda} &= 6 / 166200 \\approx 3.61e-05\n",
"\\end{align*}\n",
"$$\n",
"\n",
"<br>\n",
"\n",
"\n",
"<br>\n",
"\n",
"\n",
"## Accounting for Censored Data\n",
"\n",
"Because insurance losses are subject to policy limits, we need to make an adjustment to the likelihood specification \n",
"for loss amounts equal to the policy limit. Specifically, for losses at the policy limit, we replace the probability \n",
"density with the probability of exceeding the limit, or $1 - F(x)$, where $F(x)$ represents the cumulative distribution \n",
"function. The likelihood for a sample with censored observations becomes: \n",
"\n",
"$$\n",
"L(\\lambda) = \\prod_{i=1}^{m} f(x_{i}|\\lambda) \\prod_{j=1}^{n} 1 - F(x_{j}|\\lambda),\n",
"$$\n",
"\n",
"<br>\n",
"\n",
"\n",
"where $i$ indexes uncensored observations and $j$ censored observations. \n",
"\n",
"For the exponential distribution, $F(x) = 1 - e^{-\\lambda x}$, therefore $1 - F(x) = e^{-\\lambda x}$. \n",
"\n",
"<br>\n",
"\n",
"Our goal is to fit the following property losses to an exponential distribution:\n",
"<br>\n",
"\n",
"```\n",
"12750 15250 17000 21200 50000* 50000*\n",
"```\n",
"\n",
"<br>\n",
"\n",
"\n",
"the likelihood becomes:\n",
"\n",
"$$\n",
"\\begin{align*} \n",
"L(\\lambda) &= \\prod_{i=1}^{4} f(x_{i}|\\lambda) \\prod_{j=1}^{2} 1 - F(x_{j}|\\lambda)\\\\\n",
"&=\\lambda^{4} \\cdot e^{-12750 \\lambda} \\cdot e^{-15250 \\lambda} \\cdot e^{-17000 \\lambda} \n",
"\\cdot e^{-21200 \\lambda} \\cdot e^{-50000 \\lambda} \\cdot e^{-50000 \\lambda}\\\\\n",
"&=\\lambda^{4} \\cdot e^{-166200 \\lambda}\\\\\n",
"&4 \\cdot \\mathrm{Ln}(\\lambda) - 166200 \\cdot \\lambda\\\\\n",
"&\\frac{\\partial{l}}{\\partial{\\lambda}} = \\frac{4}{\\lambda} - 166200 = 0\\\\\n",
"0 &= \\frac{4}{\\lambda} - 166200,\\\\\n",
"166200 &= \\frac{4}{\\lambda}, \\\\\n",
"\\hat{\\lambda} &= 4 / 166200 \\approx \\boldsymbol{2.407e-05}\\\\\n",
"\\end{align*}\n",
"$$\n",
"\n",
"<br>\n",
"<br> \n",
"\n",
"\n",
"* Expected loss *not* accounting for censored observations: 1 / 3.61e-05 = **27,700**. \n",
"<br>\n",
"* Expected loss accounting for censored observations: 1 / 2.407e-05 = **41,550**. \n",
"\n",
"<br> \n",
"This is intuitive, since when accounting for censoring, we're explicitly incorporating the probability that \n",
"our censored losses exceed the the point at which they are censored. \n",
"\n",
"\n",
"\n"
]
},
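{
"cell_type": "markdown",
"metadata": {},
"source": [
"In most models the censored likelihood has no closed-form maximizer, so it is maximized numerically. The sketch below \n",
"does this for the exponential example as a check on the derivation above. It is one possible setup, not necessarily the \n",
"one used in practice: the optimization is over the mean $\\theta = 1/\\lambda$ (an assumption made so that the search \n",
"interval is on the scale of the losses), and the search bounds are arbitrary. \n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy import optimize\n",
"\n",
"losses = np.array([12750, 15250, 17000, 21200, 50000, 50000], dtype=float)\n",
"censored = np.array([False, False, False, False, True, True])\n",
"\n",
"def neg_loglik(theta):\n",
"    lam = 1.0 / theta\n",
"    # Uncensored losses contribute Ln f(x) = Ln(lam) - lam * x.\n",
"    ll_exact = np.sum(np.log(lam) - lam * losses[~censored])\n",
"    # Censored losses contribute Ln S(x) = -lam * x.\n",
"    ll_cens = np.sum(-lam * losses[censored])\n",
"    return -(ll_exact + ll_cens)\n",
"\n",
"res = optimize.minimize_scalar(neg_loglik, bounds=(1e3, 1e6), method='bounded')\n",
"theta_hat = res.x\n",
"\n",
"print(theta_hat)      # close to 41,550\n",
"print(1 / theta_hat)  # close to 4 / 166200\n",
"```\n",
"\n",
"The same function structure carries over to other severity distributions by swapping in their log-density and log-survival functions. \n"
]
},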
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 800x500 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"\"\"\"\n",
"Graphical comparison of censored vs. uncensored distributions.\n",
"\"\"\"\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from scipy import stats\n",
"\n",
"\n",
" \n",
"v = np.asarray([12750, 15250, 17000, 21200, 50000, 50000])\n",
"\n",
"\n",
"rv0 = stats.expon(scale=27700.83)\n",
"rv1 = stats.expon(scale=41545.49)\n",
"\n",
"x0 = np.linspace(rv0.ppf(.0), rv0.ppf(.999), 500)\n",
"x1 = np.linspace(rv1.ppf(.0), rv1.ppf(.999), 500)\n",
"fig, ax = plt.subplots(1, 1, figsize=(8, 5), dpi=100, tight_layout=True)\n",
"\n",
"\n",
"ax.plot(x0, rv0.pdf(x0), color=\"red\", linewidth=1.5, linestyle=\"--\", label=\"uncensored\")\n",
"ax.plot(x1, rv1.pdf(x1), color=\"green\", linewidth=1.5, label=\"censored\")\n",
"ax.set_ylim(bottom=0), ax.set_xlim(left=0)\n",
"ax.tick_params(\n",
" axis=\"x\", which=\"both\", top=False, bottom=True, labeltop=False, \n",
" labelbottom=True, direction=\"out\", labelsize=7\n",
" )\n",
"ax.tick_params(\n",
" axis=\"y\", which=\"both\", left=False, right=False, labelleft=False,\n",
" labelright=False\n",
" )\n",
"ax.set_title(\"Effect of accounting for censored observations\", loc=\"left\", size=10)\n",
"ax.legend()\n",
"ax.grid()\n",
"\n",
"plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Assessment of Fit\n",
"\n",
"> *With four parameters I can fit an elephant, and with five I can make him wiggle his trunk...*\n",
"\n",
"(Attributed to John von Neumann, relayed by Enrico Fermi when asked about the legitimacy of a result that used four \n",
"free parameters in fitting experimental results). \n",
"\n",
"<br>\n",
"\n",
"\n",
"We rely on a number of metrics to assess the overall fit of the proposed model:\n",
"\n",
"\n",
"* Maximum log-likelihood \n",
"<br>\n",
"* Minimum AIC: $-2l + 2p$ \n",
"<br>\n",
"* Minimum BIC: $-2l + p * \\mathrm{Ln}(n)$ \n",
"<br>\n",
"* Minimum Chi-Square: $\\sum_{j=1}^{n} \\frac{(O_{j} - E_{j})^{2}}{E_{j}}$ \n",
"<br>\n",
"* Kolmogorov-Smirnov test \n",
"<br>\n",
"* Visual assessment (histograms, qq/pp plots) \n",
"\n",
"<br>\n",
"\n",
" \n",
"\n",
"**ks.test**: If p-value exceeds alpha (.05, .025, .01), then we cannot reject the null hypothesis that the \n",
"data originate from the specified distribution. There are caveats when using the ks.test if parameters are estimated \n",
"from the data, but we're using it more as a means to compare distribution adequacy across a number of estimates \n",
"models and parameters. Generally, the higher the kspval p-value the more adequate the model. \n",
"\n",
"**pchisq**: The Chi-Square test may be thought of as a formal comparison of a histogram (empirical data) with \n",
"the fitted density. \n",
"<br>\n",
" - $H_0$: the data follow a specified distribution \n",
" - $H_a$: the data do not follow the specified distribution\n",
"<br> \n",
"The hypothesis that the data are from a population with the specified distribution is accepted if chi-square statistic \n",
"is lower than the chi-square percent point function with k-p-1 dof and a significance level alpha. The chi-square test \n",
"is sensitive to the number of bins. \n",
" "
]
},
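{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two tests described above can be sketched with `scipy.stats`. The losses below are simulated purely for \n",
"illustration, the fitted model is an exponential, and the use of 10 equal-probability bins for the chi-square \n",
"test is an assumption rather than a recommendation. \n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy import stats\n",
"\n",
"# Hypothetical losses simulated purely for illustration.\n",
"rng = np.random.default_rng(516)\n",
"sample = rng.exponential(scale=25000, size=500)\n",
"\n",
"# Fit an exponential by maximum likelihood (location fixed at 0).\n",
"loc, scale = stats.expon.fit(sample, floc=0)\n",
"\n",
"# Kolmogorov-Smirnov test against the fitted distribution.\n",
"ks_stat, ks_pval = stats.kstest(sample, 'expon', args=(loc, scale))\n",
"\n",
"# Chi-square test: compare observed and expected counts in k bins\n",
"# whose edges are quantiles of the fitted model.\n",
"k, p = 10, 1\n",
"edges = stats.expon.ppf(np.linspace(0, 1, k + 1), loc=loc, scale=scale)\n",
"cum_counts = np.searchsorted(np.sort(sample), edges[1:-1], side='right')\n",
"observed = np.diff(np.concatenate(([0], cum_counts, [sample.size])))\n",
"expected = np.full(k, sample.size / k)\n",
"chi2_stat = np.sum((observed - expected) ** 2 / expected)\n",
"critical = stats.chi2.ppf(0.95, df=k - p - 1)\n",
"\n",
"print(f'KS p-value: {ks_pval:.3f}')\n",
"print(f'chi-square statistic {chi2_stat:.2f} vs. critical value {critical:.2f}')\n",
"```\n",
"\n",
"Changing `k` changes both the statistic and the critical value, which is the bin sensitivity noted above. \n"
]
},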
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Implementation Details\n",
"\n",
"1. Acquire data for target AG, add deductible and identify right-censored losses. \n",
"<br>\n",
"2. Fit each candidate distribution to data, evaluating the goodness of fit of each. \n",
"<br>\n",
"3. Select the distribution which minimizes AIC. \n",
"<br>\n",
"4. Generate 140-point tabular CDF using selected distribution and associated parameter estimates. \n",
"<br>\n",
"\n",
"<br>\n",
"\n",
"\n",
"\n",
"With a selected parametric form, we can answer questions like:\n",
"\n",
"<br> \n",
"<br> \n",
"\n",
"\n",
"\n",
"* ***What is the expected value for all losses under 100K?***\n",
"\n",
"$$\n",
"E[X \\wedge 100,000] =\\int_{0}^{100,000} x \\ f_{X}(x) \\ dx + 100,000 \\cdot S_{X}(100,000) \n",
"$$\n",
"\n",
"<br>\n",
"<br> \n",
"\n",
"* ***What percentage of losses are eliminated by the introduction of a 5,000 deductible?***\n",
"\n",
"$$\n",
"\\mathrm{LER} = \\frac{E[X] - E[Y^{L}]}{E[X]}\n",
"$$\n",
"\n",
"where:\n",
"\n",
"$$\n",
"E[X] =\\int_{0}^{\\infty} x \\ f_{X}(x) \\ dx, \\hspace{.50em} \\mathrm{and} \\hspace{.50em} E[Y^{L}] = \\int_{d}^{\\infty} [1 - F_{X}(x)]dx.\n",
"$$\n",
"\n",
"<br>\n",
"<br>\n",
"\n",
"\n",
"\n",
"Relationship between expected payment per loss $E[Y^{L}]$ and payment per payment $E[Y^{P}]$:\n",
"\n",
"<br>\n",
"\n",
"\n",
"\n",
"$$\n",
"\\begin{align*}\n",
"E[Y^{L}] &= E[(X - d_{+})] = \\int_{d}^{\\infty} [1 - F_{X}(x)]dx\\\\\n",
"E[Y^{P}] &= E[(X - d_{+})] / S(d)\\\\\n",
"E[Y^{L}] &= E[Y^{P}] \\cdot S(d)\\\\\n",
"\\end{align*}\n",
"$$\n",
"\n",
"<br>\n",
"<br>\n",
"\n"
]
},
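{
"cell_type": "markdown",
"metadata": {},
"source": [
"The quantities above can be evaluated numerically for any fitted severity distribution. As a minimal sketch, the code \n",
"below uses the exponential fit from the censoring example (mean of 41,550) purely for illustration and evaluates the \n",
"integrals with `scipy.integrate.quad`. The variable names are illustrative. \n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy import integrate, stats\n",
"\n",
"theta = 41550.0                # fitted exponential mean (1 / lambda-hat)\n",
"rv = stats.expon(scale=theta)\n",
"\n",
"# Limited expected value at u: integral of x f(x) on [0, u] plus u * S(u).\n",
"u = 100_000\n",
"lev_u = integrate.quad(lambda x: x * rv.pdf(x), 0, u)[0] + u * rv.sf(u)\n",
"\n",
"# Expected payment per loss with deductible d: integral of S(x) on [d, inf).\n",
"d = 5_000\n",
"e_yl = integrate.quad(rv.sf, d, np.inf)[0]\n",
"e_yp = e_yl / rv.sf(d)         # expected payment per payment\n",
"ler = (rv.mean() - e_yl) / rv.mean()\n",
"\n",
"print(f'E[X ^ 100,000] = {lev_u:,.0f}')\n",
"print(f'LER at d = 5,000: {ler:.4f}')\n",
"print(f'E[Y^L] = {e_yl:,.0f}, E[Y^P] = {e_yp:,.0f}')\n",
"```\n",
"\n",
"For the exponential distribution these quantities also have closed forms, e.g. $E[X \\wedge u] = \\theta(1 - e^{-u/\\theta})$ \n",
"and $E[Y^{P}] = \\theta$ by the memoryless property, which makes a convenient check on the numerical integrals. \n"
]
},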
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Advantages of Current Approach\n",
"\n",
"0. Scalability. Any group of losses can be combined to generate a new severity distribution. \n",
"<br>\n",
"1. Easy to add new distributions. \n",
"<br>\n",
"2. Explicitly accounts for censored losses. \n",
"<br> \n",
" \n",
"\n",
"\n",
"## Shortcomings of Current Approach\n",
"\n",
"0. Relies on point estimates (MLEs). A more robust approach leveraging MCMC may be a superior alternative \n",
"(i.e., assume distribution parameters follow some pre-specified distribution, then sample from likelihood to obtain posterior \n",
"samples of estimated severity distribution). \n",
"<br> \n",
"1. Not a lot of consideration given to tail. \n",
"<br>\n",
"2. No consideration given to cofactors. \n",
"<br>\n",
"3. Unable to discern between latent heterogeneities within a group. \n",
"\n",
"\n",
"## Future Enhancements\n",
"\n",
"0. Leverage industry data. \n",
"<br> \n",
"1. Assess a regression approach to provide a more accurate whole-book view, by aggregating predictions from individual risks. \n",
"<br> \n",
"2. Use existing work as the basis for a \"large loss\" framework. \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Questions?"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}