Chapter 5 Excess of Loss for Two Risks

Chapter Preview. Risk owners want risk transfer agreements that limit the amount of retained risk. This is true for a single risk, where upper limits were introduced in Section 2.2, as well as for several risks, where multivariate excess of loss contracts were introduced in Section 4.1.2. Not only is this feature intuitively appealing, it also has some theoretical basis, as seen in Section 3.1. Furthermore, as Chapter 1 noted, upper limit contracts serve as a “basic building block of insurance” (Mildenhall and Major (2022)), making them a fitting starting point for our investigations.

For analysts, upper limits are challenging because of the discrete jumps induced by the minimum operator. For example, even with a single risk we saw in Figure 2.3 how the distribution function changes dramatically above and below an upper limit. These jumps indicate that uncertainty measures might be non-linear and non-smooth (that is, not differentiable). This chapter develops approaches for handling these complications in the context of excess of loss policies.

To focus on fundamental issues, this chapter considers only two risks. Specifically, we now consider two risks \(X_1, X_2\) whose dependence is quantified through a copula \(C\). With \(u_1, u_2\) as the upper limits, the random variable of interest is the retained risk, \(S(u_1, u_2) = X_1 \wedge u_1 + X_2 \wedge u_2\). By focusing on only two risks, this chapter leverages deterministic methods to provide solutions for risk retention problems. For problems with more than two risks, beginning in Chapter 7 we will focus on solutions that employ simulation techniques.

Even for these more complex problems, the foundations using deterministic methods presented in this chapter can be useful. For example, you might be working with a risk retention problem with ten risks that is far too complex to evaluate using deterministic methods. A common procedure is to combine risks so that only two risk categories remain. These two risks can be evaluated using both deterministic and simulation methods. In this way, the deterministic approach serves as a check on the accuracy of the simulation approach. When both methods agree closely, then the simulation approach can be extended to address important practical problems involving more than two risks.

Because the bivariate excess of loss policy is fundamental to multivariate risk retention problems, this chapter examines it in detail. Section 5.1 begins by establishing that this is not a convex problem, meaning that one can only employ numerical approximation methods designed to yield local solutions. Section 5.2 develops the distribution of retained risk, and Section 5.3 examines differential changes in the associated risk measures. The visualization in Section 5.4 helps one interpret the optimization problem.

Section 5.5 examines the excess of loss risk retention problem by looking at a special case when the budget constraint is binding. As will be seen, the analysis is very complex even with only two risks. Because of this complexity, it is unlikely that practicing analysts will take an interest in implementing this approach in problems with more than two risks. As another consequence of this complexity, many readers can safely skip this section on their first reading. Nonetheless, this section is important because the dimensionality reduction enjoyed by imposing active constraints may be appealing in some situations. So, by learning about the potential pitfalls described in this section, analysts will be in a better position to guide consumers on appropriate strategies for minimizing risk uncertainty when subject to budget constraints.

5.1 Lack of Convexity

As we learned in Section 4.4.1, the task of constrained optimization becomes immensely simplified if the problem at hand is convex. With convex optimization problems, solutions are globally optimal, not merely locally optimal. Moreover, there exists a rich suite of algorithms for convex optimization (cf. Boyd and Vandenberghe (2004)) that are reliable and work well on large-dimensional problems.

Our discussion of convexity began in Section 2.4. Recall that a function \(h\) is convex if a convex combination of the function's values at two points is greater than or equal to the function evaluated at the same combination of the points. That is, \[ c~ h[{\bf u}_a] + (1-c)h[{\bf u}_b] \ \ge \ h[c {\bf u}_a + (1-c){\bf u}_b] \ \ \text{ for } \ \ \ 0 \le c \le 1 . \] The inequality must hold for every pair of points \({\bf u}_a\) and \({\bf u}_b\) in the domain; reversing it defines a concave function. This section demonstrates by means of examples that risk retention problems involving upper limits (and by extension, deductibles) are neither convex nor concave. This means, as summarized in Section 4.4, that we will require general numerical methods that do not assume convexity.
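
The convexity inequality can be probed numerically along a single line segment, which is how the examples in this section are built. The following is a minimal Python sketch (the book's own code is in R); the function names `chord_gap` and `classify_on_segment` and the toy functions are illustrative assumptions, not part of the book's code. A "convex" label on one segment does not establish global convexity, whereas a single "neither" or "concave" label is enough to rule it out.

```python
def chord_gap(h, ua, ub, c):
    """Chord value minus function value at the point c*ua + (1-c)*ub."""
    point = [c * a + (1.0 - c) * b for a, b in zip(ua, ub)]
    return c * h(ua) + (1.0 - c) * h(ub) - h(point)


def classify_on_segment(h, ua, ub, steps=50, tol=1e-9):
    """Label h along the segment joining ua and ub.

    Returns "convex" if the chord is never below the function,
    "concave" if it is never above, and "neither" otherwise.
    A function that is linear on the segment is reported as "convex".
    """
    gaps = [chord_gap(h, ua, ub, k / steps) for k in range(steps + 1)]
    if all(g >= -tol for g in gaps):
        return "convex"
    if all(g <= tol for g in gaps):
        return "concave"
    return "neither"


if __name__ == "__main__":
    bowl = lambda u: u[0] ** 2 + u[1] ** 2      # convex everywhere
    dome = lambda u: -(u[0] ** 2 + u[1] ** 2)   # concave everywhere
    cubic = lambda u: u[0] ** 3                 # changes curvature at 0
    print(classify_on_segment(bowl, (0.0, 1.0), (2.0, -1.0)))
    print(classify_on_segment(dome, (0.0, 1.0), (2.0, -1.0)))
    print(classify_on_segment(cubic, (-1.0, 0.0), (1.0, 0.0)))
```

In the setting of this chapter, `h` would be a risk measure such as \(VaR_{0.9}({\bf u})\) evaluated at upper limits \({\bf u}\).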

Example 5.1. Lack of Convexity of Value at Risk for Excess of Loss. Suppose that \(X_1\) has a gamma distribution with shape parameter 2 and scale parameter 5,000 so that the mean is 10,000. Suppose that \(X_2\) has a Pareto distribution with shape parameter 5 and scale parameter 25,000 so that the mean is 6,250. The two variables are related through a Gaussian copula with parameter \(\rho = -0.3\). We seek to evaluate the value at risk with confidence level \(\alpha = 0.90\) for an excess of loss random variable \(S(u_1, u_2) = X_1 \wedge u_1 + X_2 \wedge u_2\), denoted as \(VaR_{0.9}(u_1,u_2)\). We plot \(VaR_{0.9}(u_1,u_2)=VaR_{0.9}({\bf u})\) for various values of \({\bf u} =(u_1,u_2)\).

For the left-hand panel of Figure 5.1, define \({\bf u}_a =(F_1^{-1}(0.7),F_2^{-1}(0.7))\) and \({\bf u}_b =(F_1^{-1}(0.9),F_2^{-1}(0.9))\). Recall the notation that \(F_1^{-1}(0.7)\) denotes the 0.7 quantile of the first distribution (that turns out to be 12,196 for this example). This panel shows a plot of \(VaR_{0.9}[(1-c){\bf u}_a + c {\bf u}_b]\) versus \(c\); the linear interpolation \((1-c)VaR_{0.9}[{\bf u}_a]+ c VaR_{0.9}[{\bf u}_b]\) versus \(c\) is superimposed. Because the graph is above the linear interpolation, the \(VaR\) function is concave over this region.

For the right-hand panel of Figure 5.1, in the same way define \({\bf u}_a =(F_1^{-1}(0.9),F_2^{-1}(0.7))\) and \({\bf u}_b =(F_1^{-1}(0.7),F_2^{-1}(0.9))\). Because the graph is below the linear interpolation, the \(VaR\) function is convex over this region.

In sum, the value at risk function \(VaR_{0.9}({\bf u})\) is convex over some regions of \({\bf u} =(u_1,u_2)\) and concave over others, so it is neither convex nor concave overall. Consequently, we cannot use optimization methods that rely on the convexity of the objective function.

Figure 5.1: Lack of Convexity of Value at Risk for Excess of Loss. The left-hand panel shows the value at risk to be concave for selected upper limits. In contrast, for other upper limit values, the right-hand panel shows the function to be convex.


Our interest is in the convexity, or lack thereof, of summary measures of a retained risk. It is important to note that we are thinking of these as functions of risk retention parameters, not as functions of potential losses. That is, it is tempting to think of the realized excess of loss random variable, \(x_1 \wedge u_1 + x_2 \wedge u_2\), as a function of realized losses \(x_1\) and \(x_2\), as is common in applied statistics. In addition, both Examples 5.1 and 5.2 utilize continuous distributions (gamma and Pareto) for losses, mitigating discreteness issues when confirming convexity in the space of potential losses. However, because the interest is in optimization with respect to the risk retention parameter space, we specifically write the excess of loss random variable as \(S(u_1,u_2) =\) \(X_1 \wedge u_1 + X_2 \wedge u_2\) to emphasize our interest in retained loss distributions as a function of risk retention parameters.

Example 5.2. Three Measures of Uncertainty. This example shows that each of the three measures of uncertainty (the distribution function, quantile, and \(ES\)) is neither convex nor concave for all values of dependence parameters.

To demonstrate this, assume that both \(X_1\) and \(X_2\) have gamma distributions with mean 10,000 and standard deviation 7,071. Their relationship is governed by a normal copula with parameter \(\rho\) that is allowed to vary.

We consider two points, \({\bf u}_a = (8000, 14000)\) and \({\bf u}_b = (16000, 7000)\) and evaluate the distribution function, quantile, and expected shortfall at convex combinations given by \(c {\bf u}_a + (1-c){\bf u}_b\). The distribution function is evaluated at 18,000 and the quantile and expected shortfall use \(\alpha = 0.75\).

Figure 5.2 shows that the convexity depends on the level of dependence. That is, for each measure, sometimes the function is convex and sometimes concave, depending on the dependence. Further, if you do not know about the dependence, then you do not know whether the function that you are trying to optimize is convex or concave (or neither). As a consequence you may think you are maximizing a function and wind up minimizing it, or vice versa.

Figure 5.2: Measures of Uncertainty Evaluated at Convex Combinations. From the left to right, the panels display the distribution function (at 18,000), the value at risk (at 0.75), and the expected shortfall (at 0.75). The middle solid black line corresponds to the case of independence. For the left-hand panel, the solid red curve is for a positive Spearman correlation 0.9 and the dashed red curve is for a positive Spearman correlation 0.3. Further, the dotted and solid blue curves are for negative correlations -0.3 and -0.9, respectively. This pattern is reversed for the middle and right-hand panels where the negative associations are on the upper portions and the positive ones are on the lower portions of each figure.

5.2 Excess of Loss Distribution

In the bivariate case, there are two risks \(X_1, X_2\). For \(j=1,2\), the distribution of \(X_j\) is \(F_j\) and, when available, the density function is denoted as \(f_j\). The excess of loss retained risk function is \(g(X_1,X_2; \boldsymbol \theta)=\) \(X_1 \wedge u_1 + X_2 \wedge u_2\). For simplicity, this chapter uses the limited sum expression \(S(u_1,u_2) = X_1 \wedge u_1 + X_2 \wedge u_2\) for retained risks.

5.2.1 Distribution Function and Quantiles

Distribution Function. In some detail, I develop the distribution function of \(S(u_1,u_2)\) that can be evaluated at a generic \(y\). First note that if \(y \ge u_1+u_2\), then \(\Pr[S(u_1,u_2) \le y] =1\). So, now consider the case where \(y < u_1+u_2\). On this set, one has \[\begin{equation} \begin{array}{ll} \Pr[S(u_1,&u_2) \le y] \\ ~\\ &= \int_0^{F_1(u_1)} \int_0^{F_2(u_2)} I[F_1^{-1}(z_1) + F_2^{-1}(z_2) \le y]~ c(z_1, z_2) ~ dz_2 dz_1 \\ & \ \ \ + F_2(y-u_1) - C[F_1(u_1),F_2(y-u_1)] \\ & \ \ \ + F_1(y - u_2) - C[F_1(y - u_2),F_2(u_2)] \\ \\ &= \int^{u_1}_{-\infty} C_1[F_1(x), F_2(\min(y-x, u_2))] ~ f_1(x)dx \\ & \ \ \ + F_2(y-u_1) - C[F_1(u_1),F_2(y-u_1)] \\ & \ \ \ + F_1(y - u_2) - C[F_1(y - u_2),F_2(u_2)] .\\ \end{array} \tag{5.1} \end{equation}\] The partial derivative of the copula distribution function, \(C_1(u,v) =\partial_u C(u,v)\), is described in more detail in Appendix Chapter 14. Equation (5.1) is valid for random variables with a domain on the entire real line. Note that if \(X_1\) has a domain on the non-negative portion of the real line and \(u_2>y\), then \(F_1(y-u_2) = 0\), and similarly for \(X_2\). This observation can simplify the calculation.

\(Under~the~Hood.\) Verify the Excess of Loss Distribution Function

Example 5.3. Bivariate Excess of Loss Distribution. To show the flexibility of this set-up, we use different distributions for the two risks. Specifically, assume that \(X_1\) has a gamma distribution with shape parameter 2 and scale parameter 2,000 and that \(X_2\) has a Pareto distribution with shape parameter 3 and scale parameter 2,000. Their relationship is governed by a normal copula with parameter \(\rho = 0.5\). The code uses \(u_1 = 5,000\) and \(u_2 =1,500\).

Illustrative code shows how to determine the distribution function in two ways, one via integration following the formula in equation (5.1) and one via simulation. Figure 5.3 summarizes the excess of loss distribution function. Even though the two random variables are continuous, note the point of discreteness at \(y=6,500\) where the jump in the distribution function occurs.
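
The two routes described here, integration via equation (5.1) and simulation, can be sketched in standard-library Python (the book's own code is in R and is not reproduced here). Several choices below are assumptions made for illustration: the Pareto distribution is taken in the form \(F_2(x) = 1-\{\theta/(x+\theta)\}^{\alpha}\), the copula \(C\) is obtained by one-dimensional Simpson integration because no bivariate normal distribution function is available in the standard library, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()          # standard normal, used for the Gaussian copula
SC1 = 2000.0                # gamma scale (shape is fixed at 2 in F1 and f1)
SH2, SC2 = 3.0, 2000.0      # Pareto shape and scale (assumed Lomax form)

def F1(x):
    """Gamma (shape 2) distribution function in closed form."""
    if x <= 0.0:
        return 0.0
    t = x / SC1
    return 1.0 - math.exp(-t) * (1.0 + t)

def f1(x):
    """Gamma (shape 2) density."""
    return x * math.exp(-x / SC1) / SC1 ** 2 if x > 0.0 else 0.0

def F2(x):
    """Pareto distribution function."""
    return 1.0 - (SC2 / (x + SC2)) ** SH2 if x > 0.0 else 0.0

def clip(p, eps=1e-12):
    """Keep probabilities strictly inside (0, 1) so inv_cdf is defined."""
    return min(max(p, eps), 1.0 - eps)

def simpson(g, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0

def C1(u, v, rho):
    """Gaussian copula partial derivative dC/du = Pr[U2 <= v | U1 = u]."""
    zu, zv = STD.inv_cdf(clip(u)), STD.inv_cdf(clip(v))
    return STD.cdf((zv - rho * zu) / math.sqrt(1.0 - rho * rho))

def copula(u, v, rho):
    """Gaussian copula C(u, v), integrating the conditional cdf over z1."""
    zu, zv = STD.inv_cdf(clip(u)), STD.inv_cdf(clip(v))
    r = math.sqrt(1.0 - rho * rho)
    return simpson(lambda z: STD.cdf((zv - rho * z) / r) * STD.pdf(z), -8.0, zu)

def cdf_S(y, u1, u2, rho=0.5):
    """Pr[S(u1, u2) <= y] following the second form of equation (5.1)."""
    if y >= u1 + u2:
        return 1.0
    if y <= 0.0:
        return 0.0
    g = lambda x: C1(F1(x), F2(min(y - x, u2)), rho) * f1(x)
    top = min(u1, y)                      # the integrand vanishes beyond y
    kink = min(max(y - u2, 0.0), top)     # where min(y - x, u2) switches
    total = simpson(g, 0.0, kink) + simpson(g, kink, top)
    if y > u1:                            # X1 above its limit, X2 <= y - u1
        total += F2(y - u1) - copula(F1(u1), F2(y - u1), rho)
    if y > u2:                            # X2 above its limit, X1 <= y - u2
        total += F1(y - u2) - copula(F1(y - u2), F2(u2), rho)
    return total

def simulate_S(u1, u2, rho=0.5, n=20000, seed=1):
    """Draw retained losses S = min(X1, u1) + min(X2, u2) under the copula."""
    rng = random.Random(seed)
    r = math.sqrt(1.0 - rho * rho)
    draws = []
    for _ in range(n):
        x1 = rng.gammavariate(2.0, SC1)
        z1 = STD.inv_cdf(clip(F1(x1)))              # normal score of X1
        z2 = rho * z1 + r * rng.gauss(0.0, 1.0)     # correlated normal score
        x2 = SC2 * ((1.0 - clip(STD.cdf(z2))) ** (-1.0 / SH2) - 1.0)
        draws.append(min(x1, u1) + min(x2, u2))
    return draws

if __name__ == "__main__":
    sims = simulate_S(5000.0, 1500.0)
    for y in (2000.0, 4000.0, 6000.0):
        empirical = sum(s <= y for s in sims) / len(sims)
        print(y, round(cdf_S(y, 5000.0, 1500.0), 4), round(empirical, 4))
```

The integral is split at \(x = y - u_2\) because the integrand has a kink there, which would otherwise degrade the accuracy of Simpson's rule. The two estimates should agree up to simulation error, and `cdf_S` returns 1 at \(y = u_1 + u_2\), reflecting the jump noted above.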

Figure 5.3: Excess of Loss Distribution Function. Based on upper limit parameters \(u_1=5,000\) and \(u_2=1,500\).

Quantiles. The R code shows how to determine quantiles (value at risk) for the excess of loss distribution in three ways:

  • using the formula in equation (5.1) with the R uniroot function,
  • via simulation, and
  • as a constrained optimization approach that will be presented in Section 7.2.

I use the distribution and parameter values from Example 5.3 and \(\alpha = 0.85\). It turns out that the quantile is approximately 6,116. Figure 5.4 summarizes the quantile as a function of the two upper limits with a three-dimensional figure, shown from several perspectives.
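
The first of the three approaches, root finding on the distribution function, can be sketched as follows. This is a standard-library Python sketch in which plain bisection stands in for R's `uniroot`; the Pareto form \(F_2(x) = 1-\{\theta/(x+\theta)\}^{\alpha}\), the Simpson-rule copula, and all function names are assumptions made for illustration. If the book's parameterization matches these assumptions, the result for \(\alpha = 0.85\) should be close to the value reported above.

```python
import math
from statistics import NormalDist

STD = NormalDist()
SC1 = 2000.0                # gamma scale (shape fixed at 2)
SH2, SC2 = 3.0, 2000.0      # Pareto shape and scale (assumed Lomax form)

def F1(x):
    """Gamma (shape 2) distribution function."""
    return 1.0 - math.exp(-x / SC1) * (1.0 + x / SC1) if x > 0.0 else 0.0

def f1(x):
    """Gamma (shape 2) density."""
    return x * math.exp(-x / SC1) / SC1 ** 2 if x > 0.0 else 0.0

def F2(x):
    """Pareto distribution function."""
    return 1.0 - (SC2 / (x + SC2)) ** SH2 if x > 0.0 else 0.0

def clip(p, eps=1e-12):
    return min(max(p, eps), 1.0 - eps)

def simpson(g, a, b, n=400):
    """Composite Simpson rule; returns 0 for an empty interval."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0

def C1(u, v, rho):
    """Gaussian copula partial derivative dC/du."""
    zu, zv = STD.inv_cdf(clip(u)), STD.inv_cdf(clip(v))
    return STD.cdf((zv - rho * zu) / math.sqrt(1.0 - rho * rho))

def copula(u, v, rho):
    """Gaussian copula C(u, v) by one-dimensional integration."""
    zu, zv = STD.inv_cdf(clip(u)), STD.inv_cdf(clip(v))
    r = math.sqrt(1.0 - rho * rho)
    return simpson(lambda z: STD.cdf((zv - rho * z) / r) * STD.pdf(z), -8.0, zu)

def cdf_S(y, u1, u2, rho=0.5):
    """Pr[S(u1, u2) <= y] from equation (5.1)."""
    if y >= u1 + u2:
        return 1.0
    if y <= 0.0:
        return 0.0
    g = lambda x: C1(F1(x), F2(min(y - x, u2)), rho) * f1(x)
    top = min(u1, y)
    kink = min(max(y - u2, 0.0), top)
    total = simpson(g, 0.0, kink) + simpson(g, kink, top)
    if y > u1:
        total += F2(y - u1) - copula(F1(u1), F2(y - u1), rho)
    if y > u2:
        total += F1(y - u2) - copula(F1(y - u2), F2(u2), rho)
    return total

def var_S(alpha, u1, u2, rho=0.5, tol=1e-4):
    """Smallest y with Pr[S <= y] >= alpha, found by bisection on cdf_S."""
    top = u1 + u2
    if cdf_S(top - 1e-9, u1, u2, rho) < alpha:
        return top                        # alpha falls in the jump at u1 + u2
    lo, hi = 0.0, top
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_S(mid, u1, u2, rho) >= alpha:
            hi = mid
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    print(var_S(0.85, 5000.0, 1500.0))
```

The explicit check just below \(u_1 + u_2\) handles the point mass at the sum of the limits: for confidence levels that fall inside the jump, the quantile is \(u_1 + u_2\) itself and no root exists in the continuous part.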

Figure 5.4: Value at Risk as a Function of Upper Limits. Four views of a three-dimensional plot of the value at risk \(VaR\) as a function of upper limits \(u_1\) and \(u_2\). These different viewpoints show that the \(VaR\) is less sensitive to the value of \(u_2\) compared to \(u_1\).

5.2.2 Partial Derivatives of Distribution Functions

Partial Derivatives of the Distribution Function. I next develop partial derivatives of the distribution function. Based on equation (5.1), one can establish \[\begin{equation} \begin{array}{ll} \partial_{u_1} &\Pr[S(u_1,u_2) \le y] \\ & = - f_2(y-u_1)\left\{1-C_2[F_1(u_1),F_2(y-u_1)] \right\} .\\ \end{array} \tag{5.2} \end{equation}\]

\(Under~the~Hood.\) Verify the Excess of Loss Distribution Function Derivative

Note that this partial derivative is 0 for \(u_1>y\) and non-negative \(X_2\). Intuitively, for large upper limits \(u_1\), changes in \(u_1\) do not affect the distribution function. Now, in the same way, \[\begin{equation} \begin{array}{ll} \partial_{u_2} & \Pr[S(u_1,u_2) \le y] \\ & = - f_1(y-u_2) \left\{1-C_2[F_2(u_2),F_1(y-u_2)] \right\} .\\ \end{array} \tag{5.3} \end{equation}\] This presentation assumes a copula symmetric in its arguments so that \(C(u_1,u_2)=C(u_2,u_1)\). For symmetric copulas, one has \[ C_1(u_1,u_2)=\partial_{u_1}C(u_1,u_2)=\partial_{u_1}C(u_2,u_1)=C_2(u_2,u_1). \] This assumption serves to ease some of the computational aspects, as follows.

Example 5.4. Partial Derivatives of Bivariate Excess of Loss Distribution Function. This is a continuation of Example 5.3. To illustrate, a block of code is available that shows how to evaluate the partial derivatives. The resulting values are corroborated using numerical approximations from the R function grad. Using the distribution and parameter values from Example 5.3 and \(\alpha = 0.85\), Figure 5.5 summarizes the behavior of these partial derivatives.
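
The corroboration step can be sketched by comparing equations (5.2) and (5.3) with central finite differences of the distribution function from equation (5.1). This is a standard-library Python sketch in which a hand-written central difference stands in for R's `grad`; the Pareto form \(F_2(x) = 1-\{\theta/(x+\theta)\}^{\alpha}\) and all function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD = NormalDist()
SC1 = 2000.0                # gamma scale (shape fixed at 2)
SH2, SC2 = 3.0, 2000.0      # Pareto shape and scale (assumed Lomax form)

def F1(x):
    return 1.0 - math.exp(-x / SC1) * (1.0 + x / SC1) if x > 0.0 else 0.0

def f1(x):
    return x * math.exp(-x / SC1) / SC1 ** 2 if x > 0.0 else 0.0

def F2(x):
    return 1.0 - (SC2 / (x + SC2)) ** SH2 if x > 0.0 else 0.0

def f2(x):
    return SH2 * SC2 ** SH2 / (x + SC2) ** (SH2 + 1.0) if x >= 0.0 else 0.0

def clip(p, eps=1e-12):
    return min(max(p, eps), 1.0 - eps)

def simpson(g, a, b, n=400):
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0

def C1(u, v, rho):
    """Gaussian copula dC/du; by symmetry, C2(u, v) = C1(v, u)."""
    zu, zv = STD.inv_cdf(clip(u)), STD.inv_cdf(clip(v))
    return STD.cdf((zv - rho * zu) / math.sqrt(1.0 - rho * rho))

def copula(u, v, rho):
    zu, zv = STD.inv_cdf(clip(u)), STD.inv_cdf(clip(v))
    r = math.sqrt(1.0 - rho * rho)
    return simpson(lambda z: STD.cdf((zv - rho * z) / r) * STD.pdf(z), -8.0, zu)

def cdf_S(y, u1, u2, rho=0.5):
    """Pr[S(u1, u2) <= y] from equation (5.1)."""
    if y >= u1 + u2:
        return 1.0
    if y <= 0.0:
        return 0.0
    g = lambda x: C1(F1(x), F2(min(y - x, u2)), rho) * f1(x)
    top = min(u1, y)
    kink = min(max(y - u2, 0.0), top)
    total = simpson(g, 0.0, kink) + simpson(g, kink, top)
    if y > u1:
        total += F2(y - u1) - copula(F1(u1), F2(y - u1), rho)
    if y > u2:
        total += F1(y - u2) - copula(F1(y - u2), F2(u2), rho)
    return total

def dF_du1(y, u1, u2, rho=0.5):
    """Equation (5.2): -f2(y - u1) * {1 - C2[F1(u1), F2(y - u1)]}."""
    return -f2(y - u1) * (1.0 - C1(F2(y - u1), F1(u1), rho))

def dF_du2(y, u1, u2, rho=0.5):
    """Equation (5.3): -f1(y - u2) * {1 - C2[F2(u2), F1(y - u2)]}."""
    return -f1(y - u2) * (1.0 - C1(F1(y - u2), F2(u2), rho))

def central_diff(fn, x, h=5.0):
    """Two-sided finite difference approximation to fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

if __name__ == "__main__":
    y, u1, u2 = 6000.0, 5000.0, 1500.0
    print(dF_du1(y, u1, u2), central_diff(lambda u: cdf_S(y, u, u2), u1))
    print(dF_du2(y, u1, u2), central_diff(lambda u: cdf_S(y, u1, u), u2))
```

Both derivatives are non-positive because raising either upper limit can only increase the retained loss, which lowers \(\Pr[S(u_1,u_2) \le y]\). The comparison is meaningful only for \(y < u_1 + u_2\), the region in which equations (5.2) and (5.3) apply.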

Figure 5.5: Partial Derivatives of an Excess of Loss Distribution Function. Based on upper limit parameters \(u_1=5,000\) and \(u_2=1,500\). The solid black curve provides the derivative with respect to \(u_1\), the red dashed curve is for \(u_2\).


Second-Order Partial Derivatives of the Distribution Function. Note that the partial derivatives in equations (5.2) and (5.3) each depend on only a single upper limit, so the mixed partial derivative \(\partial_{u_1}\partial_{u_2} \Pr[S(u_1,u_2) \le y]\) is zero. One can also determine \[\begin{equation} {\small \begin{array}{lll} & \partial^2_{u_1} \Pr[S(u_1,u_2) \le y] \\ & = -\partial_{u_1} \left[f_2(y-u_1)\left\{1-C_2[F_1(u_1),F_2(y-u_1)] \right\} \right]\\ & = \left[f_2^{\prime}(y-u_1)\left\{1-C_2[F_1(u_1),F_2(y-u_1)] \right\} \right.\\ & \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ + f_2(y-u_1)\{c[F_1(u_1),F_2(y-u_1)]f_1(u_1) \\ & \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \left. -C_{22}[F_1(u_1),F_2(y-u_1)]f_2(y-u_1)\} \right] .\\ \end{array} } \tag{5.4} \end{equation}\] Here, \(c\) is the copula density and \(C_{22}(u,v) = \partial^2_v C(u,v)\). Switch the indices to get an expression for \(\partial^2_{u_2} \Pr[S(u_1,u_2) \le y]\).

Example 5.5. Second Partial Derivatives of Bivariate Excess of Loss Distribution Function. This is a continuation of Example 5.4. To illustrate, a block of code is available that shows how to evaluate the partial second derivatives. The resulting values are corroborated using numerical approximations from the R functions grad and hessian. Using the distribution and parameter values from Example 5.3 and \(\alpha = 0.85\), Figure 5.6 shows the behavior of these partial second derivatives.

Figure 5.6: Partial Second Derivatives of an Excess of Loss Distribution Function. Based on upper limit parameters \(u_1=5,000\) and \(u_2=1,500\). The solid black curve provides the derivative with respect to \(u_1\), the red dashed curve is for \(u_2\).

5.3 Risk Measure Sensitivities

As described in Section 2.1, the uncertainty of retained risks can be summarized using a risk measure. Further, Section 2.3.2 showed how to compute \[ \partial_{\theta} ~VaR_{\alpha}[g(X; \theta)] ~~~\text{and}~~~\partial_{\theta} ~ES_{\alpha}[g(X; \theta)], \] that is, the differential change in these risk measures per unit change in the parameter \(\theta\). The motivation is that it seems reasonable to want to know how such summary measures change in response to (small) changes in the risk retention parameters. The problems in Section 2.3.2 were rather specialized, with only a single random variable \(X\) and parameter \(\theta\). By making some mild additional assumptions to ensure smoothness, we are able to handle many additional situations. Let us first introduce a general framework, and then specialize it to the two-risk excess of loss case.

5.3.1 Quantile Sensitivities

Specifically, now consider a generic random vector \(\mathbf{X}\) and let \(g(\mathbf{X}; \boldsymbol \theta)\) be a retention function that depends on outcomes in \(\mathbf{X}\) and a vector of parameters \(\boldsymbol \theta\). Let \(\Pr[ g(\mathbf{X};\boldsymbol \theta) \le y] = F(\boldsymbol \theta; y)\) be the distribution function of the random variable \(g(\mathbf{X};\boldsymbol \theta)\) with corresponding density function \(f(\boldsymbol\theta; \cdot)\). For fixed \(\boldsymbol \theta\), denote the inverse function at \(\alpha\) to be \(VaR_{\alpha}(\boldsymbol \theta)=VaR(\boldsymbol \theta)\), the value at risk (suppressing the explicit dependence on \(\alpha\)).

Assume sufficient local smoothness of \(F(\boldsymbol \theta; y)\) in both arguments \(y\) and \(\boldsymbol \theta\) to permit taking partial derivatives with respect to these arguments. Then, one can use calculus to develop the quantile sensitivity, \[\begin{equation} \begin{array}{ll} \partial_{\boldsymbol \theta}VaR(\boldsymbol \theta) &= \frac{-1}{F_y[\boldsymbol \theta; VaR(\boldsymbol \theta)]} F_{\boldsymbol \theta}[\boldsymbol \theta; VaR(\boldsymbol \theta)] . \end{array} \tag{5.5} \end{equation}\] Here, I use \(F_{\boldsymbol \theta}[\boldsymbol \theta; y] = \partial_{\boldsymbol \theta}F[\boldsymbol \theta; y]\) for the vector of partial derivatives with respect to \(\boldsymbol \theta\) and similarly for \(F_y\). Further, the matrix of second derivatives of the value at risk turns out to be
\[\begin{equation} {\small \begin{array}{ll} &\partial_{\boldsymbol \theta} \partial_{\boldsymbol \theta^{\prime}} VaR(\boldsymbol \theta) \\ &= \frac{-1}{F_{y}[\boldsymbol \theta; VaR(\boldsymbol \theta)]} \left\{F_{\boldsymbol \theta\boldsymbol \theta^{\prime}}[\boldsymbol \theta; VaR(\boldsymbol \theta)]+ F_{y y}[\boldsymbol \theta; VaR(\boldsymbol \theta)] \times \partial_{\boldsymbol \theta}VaR(\boldsymbol \theta) \partial_{\boldsymbol \theta^{\prime}}VaR(\boldsymbol \theta) \right. \\ & ~~~~~~~ + \left. \partial_{\boldsymbol \theta}VaR(\boldsymbol \theta) \times F_{y \boldsymbol \theta^{\prime}}[\boldsymbol \theta; VaR(\boldsymbol \theta)] + F_{y \boldsymbol \theta}[\boldsymbol \theta; VaR(\boldsymbol \theta)] \times \partial_{\boldsymbol \theta^{\prime}}VaR(\boldsymbol \theta) \right\} .\\ \end{array} } \tag{5.6} \end{equation}\]
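
Equation (5.5) is an application of implicit differentiation to \(F[\boldsymbol \theta; VaR(\boldsymbol \theta)] = \alpha\), and it can be illustrated on a toy model where the answer is known in closed form. The Python sketch below assumes a hypothetical scalar retention function \(g(X;\theta) = \theta X\) with \(X\) a standard exponential, so that \(F(\theta; y) = 1 - e^{-y/\theta}\) and \(VaR(\theta) = -\theta \log(1-\alpha)\). This toy model, the function names, and the finite-difference step are assumptions for illustration, not the excess of loss case.

```python
import math


def var_from_cdf(F, theta, alpha, lo, hi, tol=1e-10):
    """Solve F(theta, y) = alpha for y by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(theta, mid) >= alpha:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def var_sensitivity(F, theta, alpha, lo, hi, h=1e-5):
    """Equation (5.5) for scalar theta: -F_theta / F_y evaluated at y = VaR.

    Both partial derivatives are approximated by central differences.
    """
    q = var_from_cdf(F, theta, alpha, lo, hi)
    F_theta = (F(theta + h, q) - F(theta - h, q)) / (2.0 * h)
    F_y = (F(theta, q + h) - F(theta, q - h)) / (2.0 * h)
    return -F_theta / F_y


def scale_cdf(theta, y):
    """Distribution function of theta * X with X standard exponential."""
    return 1.0 - math.exp(-y / theta) if y > 0.0 else 0.0


if __name__ == "__main__":
    alpha, theta = 0.85, 2.0
    print(var_sensitivity(scale_cdf, theta, alpha, 0.0, 100.0))
    print(-math.log(1.0 - alpha))   # closed-form derivative of VaR(theta)
```

The same recipe applies to a vector \(\boldsymbol \theta\) by differencing one component at a time. It requires \(F_y > 0\) at the quantile, which is the smoothness assumption stated above.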

\(Under~the~Hood.\) Show the Verification of the VaR Derivative Expressions

Special Case. Two Risk Excess of Loss. In this case, the vector of parameters is \(\boldsymbol \theta = (u_1,u_2)^{\prime}\) and the retained risk function is the limited sum \(S(u_1,u_2)= X_1 \wedge u_1 + X_2 \wedge u_2\). The distribution function, density, density derivative, and quantile are: \[ \begin{array}{ll} F(\boldsymbol \theta; y) &= F(u_1, u_2;y) = \Pr[S(u_1,u_2) \le y] \\ F_y(\boldsymbol \theta; y) &= \partial_y ~ F(\boldsymbol \theta; y) = \partial_y ~ F(u_1, u_2;y) \\ F_{yy}(\boldsymbol \theta; y) &= \partial_y^2 ~ F(\boldsymbol \theta; y) = \partial_y^2 ~ F(u_1, u_2;y) \\ VaR(\boldsymbol \theta) &=F^{-1}_{\alpha}(u_1, u_2). \end{array} \] From equations (5.2) and (5.3), the vector of first derivatives with respect to parameters of the distribution function is \[ \begin{array}{ll} F_{\boldsymbol \theta}(\boldsymbol \theta; y) & =F_{u_1, u_2}(u_1, u_2;y) \\ & = \left( \begin{array}{c} - f_2(y-u_1)\left\{1-C_2[F_1(u_1),F_2(y-u_1)] \right\}\\ - f_1(y-u_2) \left\{1-C_2[F_2(u_2),F_1(y-u_2)] \right\} \end{array} \right) . \end{array} \] Taking another partial derivative, one has \[ \begin{array}{ll} F_{\boldsymbol \theta y}(\boldsymbol \theta; y) & = \partial_y F_{u_1, u_2}(u_1, u_2; y) \\ &= \left( \begin{array}{c} - \left[ f_2^{\prime}(y-u_1)\left\{1-C_2[F_1(u_1),F_2(y-u_1)] \right\} \right.\\ \left. - f_2^2(y-u_1)C_{22}[F_1(u_1),F_2(y-u_1)] \right]\\ - \left[ f_1^{\prime}(y-u_2) \left\{1-C_2[F_2(u_2),F_1(y-u_2)] \right\} \right.\\ -\left. f_1^2(y-u_2) C_{22}[F_2(u_2),F_1(y-u_2)] \right]\\ \end{array} \right) . \end{array} \] The matrix of second derivatives with respect to parameters of the distribution function is \[ F_{\boldsymbol \theta\boldsymbol \theta^{\prime}}(\boldsymbol \theta; y) = \left( \begin{array}{cc} \partial^2_{u_1} F(u_1,u_2;y) & 0 \\ 0 & \partial^2_{u_2} F(u_1,u_2;y)\\ \end{array} \right) , \] where the diagonal elements are given in equation (5.4).

Example 5.6. Bivariate Excess of Loss VaR Sensitivities. Let us continue to use the distribution and parameter values from Example 5.3 and \(\alpha = 0.85\). As anticipated, from Figure 5.7 we see that the value at risk increases both as the confidence level \(\alpha\) increases (top left panel) and as the dependence parameter \(\rho\) increases (bottom left panel). The two right-hand panels show that both \(VaR\) sensitivities are positive for all values of \(\alpha\) and \(\rho\).

Figure 5.7: VaR and VaR Sensitivities for Excess of Loss. Based on upper limit parameters \(u_1=5,000\) and \(u_2=1,500\). The solid black curve provides the derivative with respect to \(u_1\), the red dashed curve is for \(u_2\). The upper panels are based on \(\rho =\) 0.5 and the bottom panels use \(\alpha =\) 0.85.

5.3.2 Expected Shortfall Sensitivities

The development of the \(ES\) sensitivity employs an auxiliary function that will also be useful for us in other contexts. To define this function, consider a generic random variable \(Y\) having distribution function \(F_Y\) with a unique \(\alpha\) quantile, \(F^{-1}(\alpha)\). Now, define the function \[ ES1_F(z) = z + \frac{1}{1-\alpha} \left\{\mathrm{E}[Y] -\mathrm{E}[Y \wedge z]\right\} . \] From equation (2.3), one sees that the auxiliary function evaluated at the quantile is the expected shortfall, that is, \(ES(\alpha)=ES1_F(VaR_{\alpha})\).

After some calculations using this auxiliary function, one can show \[\begin{equation} \begin{array}{ll} \partial_{\boldsymbol \theta} ~ES_{\alpha}[g(\mathbf{X};\boldsymbol \theta)] &=\frac{1}{1-\alpha} \left\{\partial_{\boldsymbol \theta}\mathrm{E}[g(\mathbf{X};\boldsymbol \theta)] -\left.\partial_{\boldsymbol \theta}\mathrm{E}[g(\mathbf{X};\boldsymbol \theta) \wedge y]\right|_{y=VaR(\boldsymbol \theta)}\right\} \\ &\ \ \ \ \ \ \ \ + \left( 1 - \frac{1}{1-\alpha} \{1-F[\boldsymbol \theta;VaR(\boldsymbol \theta)]\} \right) \times \partial_{\boldsymbol \theta} VaR(\boldsymbol \theta). \end{array} \tag{5.7} \end{equation}\] Further, \[\begin{equation} \begin{array}{ll} & \partial_{\boldsymbol \theta} \partial_{\boldsymbol \theta^{\prime}} ~ES_{\alpha}[g(\mathbf{X};\boldsymbol \theta)] \\ &= \frac{1}{1-\alpha} \left\{ \partial_{\boldsymbol \theta} \partial_{\boldsymbol \theta^{\prime}}\mathrm{E}[g(\mathbf{X};\boldsymbol \theta)] -\left. \partial_{\boldsymbol \theta} \partial_{\boldsymbol \theta^{\prime}}\mathrm{E}[g(\mathbf{X};\boldsymbol \theta) \wedge y]\right|_{y=VaR(\boldsymbol \theta)}\right\}\\ & \ \ \ \ \ \ \ + \left(1 - \frac{1}{1-\alpha} \{1-F[\boldsymbol \theta;VaR(\boldsymbol \theta)]\} \right) \times \partial_{\boldsymbol \theta} \partial_{\boldsymbol \theta^{\prime}} VaR(\boldsymbol \theta) \\ & \ \ \ \ \ \ \ - \frac{1}{1-\alpha} F_y[\boldsymbol \theta;VaR(\boldsymbol \theta)] \times \partial_{\boldsymbol \theta}VaR(\boldsymbol \theta) \times \partial_{\boldsymbol \theta^{\prime}} VaR(\boldsymbol \theta) .\\ \end{array} \tag{5.8} \end{equation}\]

Note that if \(VaR(\boldsymbol \theta)\) is a point of continuity (no discrete jump in the distribution function), then \(F[\boldsymbol \theta;VaR(\boldsymbol \theta)]= \alpha\). In this case, equations (5.7) and (5.8) become simpler. Specifically, the second term on the right-hand side of each equation becomes zero.
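
In the continuous case, equation (5.7) says that the \(ES\) sensitivity needs only the derivatives of two expectations, with \(y\) held fixed at the value at risk while differentiating the limited expectation. The Python sketch below illustrates this on a hypothetical toy model, \(g(X;\theta) = \theta X\) with \(X\) a standard exponential, for which \(\mathrm{E}[g] = \theta\), \(\mathrm{E}[g \wedge y] = \theta(1-e^{-y/\theta})\), and \(ES_{\alpha} = \theta\{1-\log(1-\alpha)\}\). The toy model and function names are assumptions for illustration, not the excess of loss case.

```python
import math


def mean_g(theta):
    """E[theta * X] for X standard exponential."""
    return theta


def limited_mean(theta, y):
    """E[(theta * X) ^ y], the limited expected value at y."""
    return theta * (1.0 - math.exp(-y / theta))


def var_g(theta, alpha):
    """Closed-form alpha quantile of theta * X."""
    return -theta * math.log(1.0 - alpha)


def es_g(theta, alpha):
    """ES via the auxiliary function ES1_F evaluated at the quantile."""
    q = var_g(theta, alpha)
    return q + (mean_g(theta) - limited_mean(theta, q)) / (1.0 - alpha)


def es_sensitivity(theta, alpha, h=1e-5):
    """Equation (5.7) when F[theta; VaR] = alpha, so the VaR term drops out.

    The limited mean is differenced in theta with y frozen at VaR(theta).
    """
    q = var_g(theta, alpha)
    d_mean = (mean_g(theta + h) - mean_g(theta - h)) / (2.0 * h)
    d_limited = (limited_mean(theta + h, q) - limited_mean(theta - h, q)) / (2.0 * h)
    return (d_mean - d_limited) / (1.0 - alpha)


if __name__ == "__main__":
    alpha, theta, h = 0.85, 2.0, 1e-5
    print(es_sensitivity(theta, alpha))
    print((es_g(theta + h, alpha) - es_g(theta - h, alpha)) / (2.0 * h))
    print(1.0 - math.log(1.0 - alpha))   # closed-form derivative
```

The second printed value differentiates \(ES\) directly, letting the quantile move with \(\theta\); it agrees with the first because the coefficient on \(\partial_{\boldsymbol \theta} VaR(\boldsymbol \theta)\) in equation (5.7) vanishes at a point of continuity.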

\(Under~the~Hood.\) Show the Verification of the \(ES\) Derivative Expressions

Special Case. Two Risk Excess of Loss. As before, the vector of parameters is \(\boldsymbol \theta = (u_1,u_2)^{\prime}\) and the retained risk function is \(S(u_1,u_2)= X_1 \wedge u_1 + X_2 \wedge u_2\). In this case, we have \[ \partial_{\boldsymbol \theta}\mathrm{E}[g(X;\boldsymbol \theta)] = \left(\begin{array}{ll} \partial_{u_1}\mathrm{E}[S(u_1,u_2)] \\ \partial_{u_2}\mathrm{E}[S(u_1,u_2)] \end{array}\right) = \left(\begin{array}{ll} 1-F_1(u_1) \\ 1-F_2(u_2) \end{array}\right) \] and \[\begin{equation} \partial_{\boldsymbol \theta}\mathrm{E}[g(X;\boldsymbol \theta)\wedge y] = \left(\begin{array}{ll} F_2(y-u_1) - C[F_1(u_1),F_2(y-u_1)] \\ F_1(y-u_2) - C[F_2(u_2),F_1(y-u_2)] \end{array}\right) . \tag{5.9} \end{equation}\] To see equation (5.9), we have the following.

\(Under~the~Hood.\) Show the Verification of the \(ES\) Excess of Loss Derivative Expressions
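
The first gradient in this special case, \(\partial_{u_j}\mathrm{E}[S(u_1,u_2)] = 1-F_j(u_j)\), does not involve the copula because the expectation of a sum is the sum of the expectations. It can be checked against finite differences of closed-form limited expected values. The Python sketch below uses the Example 5.3 marginals and assumes the Pareto form \(F_2(x) = 1-\{\theta/(x+\theta)\}^{\alpha}\); the function names are hypothetical.

```python
import math

SC1 = 2000.0                # gamma scale (shape fixed at 2)
SH2, SC2 = 3.0, 2000.0      # Pareto shape and scale (assumed Lomax form)


def limited_mean_gamma(u):
    """E[X1 ^ u] for the gamma (shape 2) risk, in closed form."""
    t = u / SC1
    return SC1 * (2.0 - math.exp(-t) * (2.0 + t))


def limited_mean_pareto(u):
    """E[X2 ^ u] for the Pareto risk, in closed form."""
    return SC2 / (SH2 - 1.0) * (1.0 - (SC2 / (u + SC2)) ** (SH2 - 1.0))


def mean_S(u1, u2):
    """E[S(u1, u2)]; the dependence structure plays no role here."""
    return limited_mean_gamma(u1) + limited_mean_pareto(u2)


def grad_mean_S(u1, u2):
    """Gradient (1 - F1(u1), 1 - F2(u2)) of E[S(u1, u2)]."""
    surv1 = math.exp(-u1 / SC1) * (1.0 + u1 / SC1)
    surv2 = (SC2 / (u2 + SC2)) ** SH2
    return surv1, surv2


if __name__ == "__main__":
    u1, u2, h = 5000.0, 1500.0, 1e-3
    numeric = ((mean_S(u1 + h, u2) - mean_S(u1 - h, u2)) / (2.0 * h),
               (mean_S(u1, u2 + h) - mean_S(u1, u2 - h)) / (2.0 * h))
    print(grad_mean_S(u1, u2), numeric)
```

Each component is a survival probability, so raising an upper limit by one unit raises the expected retained loss by the chance that the limit is exceeded.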

Taking derivatives again yields \[ \partial_{\boldsymbol \theta} \partial_{\boldsymbol \theta ^{\prime}} ~\mathrm{E}[g(X;\boldsymbol \theta)] = - \left(\begin{array}{cc} f_1(u_1) &0\\ 0 & f_2(u_2) \end{array}\right) . \] Similarly, differentiating each component of equation (5.9) with respect to its own upper limit (the cross derivatives are zero) gives \[ {\small \begin{array}{ll} \partial_{\boldsymbol \theta} \partial_{\boldsymbol \theta ^{\prime}} \mathrm{E}[g(X;\boldsymbol \theta)\wedge y] \\ \ \ \ = - \left(\begin{array}{cc} f_2(y-u_1) \{1 - C_2[F_1(u_1),F_2(y-u_1)]\} + f_1(u_1) C_1[F_1(u_1),F_2(y-u_1)] & 0 \\ 0& f_1(y-u_2)\{1 - C_2[F_2(u_2),F_1(y-u_2)] \} + f_2(u_2) C_1[F_2(u_2),F_1(y-u_2)] \end{array}\right) . \end{array} } \]

Example 5.7. Bivariate Excess of Loss - \(ES\) Sensitivities. We use the distribution and parameter values from Example 5.3 and \(\alpha = 0.85\). These yield interesting results in Figure 5.8.

Figure 5.8: ES and ES Sensitivities for Excess of Loss. Based on upper limit parameters \(u_1=5,000\) and \(u_2=1,500\). The solid black curve provides the derivative with respect to \(u_1\), the red dashed curve is for \(u_2\). The two right-hand panels show that both \(ES\) sensitivities are positive for all values of \(\alpha\) and \(\rho\).


Looking Forward. Sensitivity analysis will be the main theme of Chapters 9 and 10, so the foundations laid in this section will bear fruit for us later on. From this section’s detailed calculation of sensitivities, observe that the key aspect is the smoothness of the distribution function in both the potential outcomes \(y\) of the retained risk and the retention parameters \(\boldsymbol \theta\). This smoothness is important and will motivate the introduction of kernel methods when we introduce simulation methods in Chapter 7.

5.4 Visualizing Constrained Optimization

Using constrained optimization, the goal is to find those parameter values \(u_1,u_2\) that minimize an uncertainty measure of retained risk subject to the budget constraint that the risk transfer costs are limited by a given amount, \(RTC_{max}\). Following the structure of equation (3.3), one can write this as \[\begin{equation} \boxed{ \begin{array}{cc} {\small \text{minimize}_{u_1,u_2}} & ~~~~~RM[S(u_1, u_2)] \\ {\small \text{subject to}} & ~~~~~RTC(u_1, u_2) \le RTC_{max} \\ &u_1 \ge 0, \ \ \ \ u_2 \ge 0 . \end{array} } \tag{5.10} \end{equation}\]

This section introduces visualization techniques to help us understand the nuances of constrained optimization risk retention problems. As seen in Figure 5.4, it can be difficult to visualize a risk measure as a function of two upper limits. In addition, we want to get some insights by adding the budget constraint function that also could be represented as a three-dimensional figure (the constraint as a function of two upper limits). Instead, it is common in constrained optimization to utilize contours. A contour of a function of two variables is a curve in the plane of those variables along which the function takes a constant value.

5.4.1 Visualizing the Excess of Loss Problem

To explore visualization techniques, we start with \(ES\) minimization and then move on to \(VaR\) minimization. As will be demonstrated in Section 5.5, \(VaR\) minimization can be problematic.

Example 5.8. Visualizing Excess of Loss Constrained Optimization - ES. Consider the distributions specified in Example 5.3, using a positive correlation \(\rho = 0.5\) and a confidence level \(\alpha = 0.85\). Figure 5.9 shows contours of the expected shortfall risk measure \(ES\) and the risk transfer cost \(RTC\) for several choices of upper limits \((u_1, u_2)\).

Specifically, the left-hand panel of Figure 5.9 shows the contours of the risk transfer cost \(RTC\) as solid blue curves. For this plot, darker shaded values indicate a larger value of \(RTC\). The contours are constant \(RTC\) slices as one might take in a three-dimensional graph such as Figure 5.4. For example, if \(RTC_{max} = 500\), then the feasibility region \(\{(u_1, u_2):RTC(u_1, u_2) \le 500\}\) corresponds to the area to the right and above the blue curve corresponding to \(RTC_{max} =500\).

The right-hand panel retains these contours with constant values of the expected shortfall \(ES\) superimposed as dashed red curves. For the optimization problem in Display (5.10), the objective is to find the \(ES\) contour with the smallest value that still touches the feasible region. At the point where this contour just meets the budget curve, the two curves are tangent, that is, they share a common slope.
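The statement about the two curves meeting at the optimum can be written compactly. As a sketch, assume that \(ES\) and \(RTC\) are differentiable at the optimum \((u_1^*, u_2^*)\), that the budget constraint is binding, and that both limits are strictly positive. With \(\lambda \ge 0\) a Lagrange multiplier (a symbol introduced only for this illustration), the first-order condition is \[ \nabla ES[S(u_1^*, u_2^*)] = - \lambda \, \nabla RTC(u_1^*, u_2^*) , \] so that, whenever the denominators are non-zero, \[ \frac{\partial_{u_1} ES[S(u_1^*, u_2^*)]}{\partial_{u_2} ES[S(u_1^*, u_2^*)]} = \frac{\partial_{u_1} RTC(u_1^*, u_2^*)}{\partial_{u_2} RTC(u_1^*, u_2^*)} . \] Equal ratios of partial derivatives mean that the \(ES\) contour and the budget curve have the same slope at the point where they meet.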


Figure 5.9: Expected Shortfall Contour Plot. The left-hand panel shows the contours of the risk transfer cost \(RTC\) as solid blue curves. The right-hand panel retains these contours with values of the expected shortfall \(ES\) superimposed as dashed red curves.

To appreciate this visualization of the optimization problem, it is helpful to know where the optimal values lie. With \(RTC_{max} = 1,500\), minimizing \(ES\) yields the best retention limits \(u_1^* =3,379\) and \(u_2^* = 3,387\), with expected shortfall \(ES^* =5,997\). This allows one to identify this point in Figure 5.9.

Example 5.9. Visualizing Excess of Loss Constrained Optimization - VaR. Continuing from Example 5.8, we now consider the value at risk \(VaR\) risk measure. Figure 5.10 shows the \(RTC\) as solid blue curves and the \(VaR\) as dashed red curves. When the constraint is binding, the objective is to find the smallest contour subject to being at a given level of the risk transfer cost.

As before, it is helpful to know where the optimal values lie. It turns out that the best risk retention values are \(u_1^* =3,116\) and \(u_2^* =43,375\) (essentially infinity), resulting in a value at risk \(VaR^* =4,761\).

Similar calculations can be performed for other values of \(RTC_{max}\). For example, if \(RTC_{max} = 2,500\), it turns out that the best risk retention values are \(u_1^* =1,628\) and \(u_2^* =23,117\), resulting in a value at risk \(VaR^* =3,376\). As suggested by the graph, large changes in \(u_2\) mean only small changes in the optimal values of \(u_1\) and the \(VaR\), suggesting that they are not sensitive to \(u_2\).

From Figure 5.10, one sees that for this example we are able to determine the value of \(u_1\) well but determining a value of \(u_2\) is more problematic. Comparing this figure to the \(ES\) contour plot in Figure 5.9, the distinction between \(ES\) and \(RTC\) contours appears more pronounced than the distinction between \(VaR\) and \(RTC\) contours, suggesting that it is easier to determine the optimal values for the \(ES\) problem.



Figure 5.10: Value at Risk Contour Plot. This figure shows the contours of the risk transfer cost \(RTC\) as solid blue curves and value at risk \(VaR\) superimposed as dashed red curves.

5.4.2 Using Distribution Functions

When the risk measure is the value at risk (\(VaR\)), an equivalent problem formulation is \[\begin{equation} \boxed{ \begin{array}{lc} {\small \text{minimize}_{u_1,u_2,x}} & ~~~~~x \\ {\small \text{subject to}} & ~~~~~\Pr[S(u_1, u_2) \le x] \ge \alpha \\ & ~~~~~RTC(u_1, u_2) \le RTC_{max} \\ &u_1 \ge 0, \ \ \ \ u_2 \ge 0 . \end{array} } \tag{5.11} \end{equation}\] Intuitively, for fixed limits, the smallest \(x\) value that satisfies the probability condition is the quantile described in Section 2.1. In this section, we simply provide code that solves each problem and show the equivalence between the two solutions. Section 7.2 will develop this general approach in a broader context, explaining its strengths and limitations compared to the basic approach, such as in Display (5.10).
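For fixed limits, the link between the probability condition and the quantile can be checked numerically. The following minimal Python sketch assumes, purely for illustration, two independent uniform(0,1) risks with hypothetical limits \(u_1 = 0.8\) and \(u_2 = 0.4\), and it replaces the distribution of \(S(u_1,u_2)\) by the empirical distribution of 1,000 simulated draws. The function names are invented for this example. The search is restricted to sample values because the empirical distribution function only jumps at those points.

```python
import math
import random


def retained_risk(x1, x2, u1, u2):
    """Retained risk S(u1, u2) = min(x1, u1) + min(x2, u2)."""
    return min(x1, u1) + min(x2, u2)


def quantile_direct(sample, alpha):
    """Empirical alpha-quantile: the k-th smallest value, k = ceil(alpha * n)."""
    ordered = sorted(sample)
    k = max(1, math.ceil(alpha * len(ordered)))
    return ordered[k - 1]


def quantile_by_minimization(sample, alpha):
    """Smallest candidate x whose empirical Pr[S <= x] is at least alpha."""
    n = len(sample)
    # Pr[S <= x] >= alpha is the same as (number of draws <= x) >= ceil(alpha * n).
    k = max(1, math.ceil(alpha * n))
    feasible = [x for x in sample if sum(s <= x for s in sample) >= k]
    return min(feasible)


random.seed(2024)  # arbitrary seed, only for reproducibility
u1, u2, alpha = 0.8, 0.4, 0.85
sample = [retained_risk(random.random(), random.random(), u1, u2)
          for _ in range(1000)]

var_direct = quantile_direct(sample, alpha)
var_min = quantile_by_minimization(sample, alpha)
print(var_direct, var_min)  # the two values coincide
```

The full problem in Display (5.11) also optimizes over \(u_1\) and \(u_2\); this sketch only illustrates the inner step in \(x\).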

The minimization problem in Display (5.11) is complicated to visualize because now there are three decision variables, \(x\), \(u_1\), and \(u_2\), over which one seeks to evaluate the probability of the retained risk \(\Pr[S(u_1,u_2) \le x]\) and the risk transfer cost \(RTC\). Figure 5.11 shows different aspects of the problem; the top two panels hold the value of \(u_2\) as constant and the bottom two hold \(u_1\) constant. The upper left-hand panel is a contour plot of the probability of retained risk with the solid green curves representing the contours (the background shading is for values of \(x\)). As before, the dashed blue curves represent contours of the \(RTC\). For this panel, because we hold \(u_2=35,000\) fixed, the contour of \(RTC\) is simply a constant vertical line. The value of \(u_1 = 6,000\) corresponds to a risk transfer cost of \(RTC=500\). When \(x=7,000\), this vertical line is close to the contour corresponding to a probability value of 0.85. As we see below, these values turn out to give near optimal values.

There is little difference when comparing the top two panels of Figure 5.11. This is because the surface is relatively flat over different values of \(u_2\), meaning that changes in this variable have relatively little effect on the problem solution. The bottom two panels of Figure 5.11 each hold \(u_1\) constant. Here, in addition to vertical lines for the contours of \(RTC\), we see near horizontal lines for the contours of \(\Pr[S(u_1,u_2) \le x]\). This is additional evidence that changes in \(u_2\) have relatively little effect on the problem solution.


Figure 5.11: Distribution Function Contour Plot. All panels show the contours of the probability of retained risk as solid green curves and the contours of the risk transfer cost \(RTC\) as dashed blue curves. The top two panels hold \(u_2\) fixed, the bottom two hold \(u_1\) fixed. In part, these plots demonstrate that the optimal solution is robust to the choice of \(u_2\).

To help interpret the figures, note that, when the risk measure is the \(VaR\), the optimal values of the problem in Display (5.11) are equivalent to those in Display (5.10). Interestingly, the code for this alternative formulation (as in Display (5.11)) runs much faster than code that directly optimizes the \(VaR\) risk measure. Intuitively, one can see in Figure 5.11 the sharp distinction among the contour curves. This is in contrast to the earlier Figure 5.10, where the nearly parallel contours suggested difficulties for algorithms in arriving at optimal solutions.


5.5 Restricting the Feasible Region

As in Display (5.10), we seek to solve the problem \[ \boxed{ \begin{array}{lc} {\small \text{minimize}_{u_1,u_2}} & ~~~~~RM[S(u_1, u_2)] \\ {\small \text{subject to}} & ~~~~~RTC(u_1, u_2) \le RTC_{max} \\ &u_1 \ge 0, \ \ \ \ u_2 \ge 0 . \end{array} } \] The feasible region is within the quadrant where values of \(u_1\) and \(u_2\) are non-negative. Moreover, there is an additional restriction in the budget constraint, \(RTC(u_1, u_2) \le RTC_{max}\). As will be seen, for small values of \(RTC_{max}\) this restriction can be considerable. By recognizing in advance the restrictions on the feasible region, optimization algorithms will have a smaller region over which to search, potentially improving their convergence.

As a tactic for reducing the feasible region, one might look to solutions where one selects parameter values so that transfer costs are at the maximum, that is, values of \(u_1\) and \(u_2\) so that \(RTC(u_1, u_2) = RTC_{max}\). In addition, analysts are sometimes simply given an equality constraint, “this is what you spend.” You will recall that this choice results in an active, or binding, constraint.

By restricting ourselves to feasible sets where the constraint is binding, we have reduced the number of “free” decision variables by one. This can be especially useful in the case of two risks with upper limits where there are only two decision variables; with a binding constraint, we have reduced the dimension of the problem to a single variable. That is, if a value of \(u_1\) is known, then subject to mild regularity conditions, we can solve for the value of \(u_2\) through the equation \(RTC(u_1, u_2) = RTC_{max}\). However, this tactic comes with caveats; it is also well known in the optimization literature (see for example, Nocedal and Wright (2006), Section 15.3) that this approach can introduce some subtle errors into the problem formulation. As summarized by Nocedal and Wright (2006), “Elimination techniques must be used with care, however, as they may alter the problem or introduce ill conditioning.”

5.5.1 Budget Restriction

As introduced in Section 3.2, the risk transfer cost \(RTC\) is the expense associated with offloading risks. You can think of limiting the \(RTC\) by a fixed amount, \(RTC_{max}\), as our budget restriction. As with all constrained optimization problems, imposing a constraint can restrict the feasible set of candidate parameter values.

To develop intuition, we make two simplifying assumptions. First, let us focus on risk transfer costs that are additive so that we may write \(RTC(u_1,u_2)\) \(= RC_1(u_1)\) \(+ RC_2(u_2)\), where \(RC_j(u)\) is the risk transfer cost for the \(j\)th risk, \(j=1,2\). As will be discussed in Section 7.5.1, this is a very natural assumption to make with separate risk transfer agreements.

Second, consider “fair” risk costs of the form \(RC_j(u) = \int^{\infty}_u [1-F_j(x)]dx\) for \(j=1,2\), as in equation (3.4). It is easy to extend this fair risk cost using multiplicative constants to allow for administrative expenses. Alternatively, risk managers may prefer to use measures such as quantiles for risk costs instead of expectations.

Guidelines for Setting the Maximal Risk Transfer Cost

We first look to bounds on the \(RTC\) function to guide setting the maximal cost \(RTC_{max}\). For example, if \(RTC_{max}\) is greater than the upper bound of the function \(RTC\) then the constraint does not impose any limitations on the minimization problem and one can ignore the constraint. In the same way, if \(RTC_{max}\) is less than the lower bound of the \(RTC\) function, then there are no feasible parameter values for the optimization problem.

With additive fair risk transfer costs, we have \[\begin{equation} \begin{array}{ll} RTC[u_1,u_2] &= RC_1(u_1) +RC_2(u_2) \\ &= \int^{\infty}_{u_1} [1-F_1(x)]~dx + \int^{\infty}_{u_2} [1-F_2(x)]~dx . \end{array} \tag{5.12} \end{equation}\] This \(RTC\) is bounded by \[ \begin{array}{ll} 0 = RTC[\infty,\infty] &\le RTC[u_1,u_2] \\ & \le RTC[0,0] = \mathrm{E}(X_1)+\mathrm{E}(X_2). \end{array} \] Thus, we only consider values of maximal cost \(RTC_{max}\) such that \(0 < RTC_{max}\) \(< \mathrm{E}(X_1)+\mathrm{E}(X_2)\), so that the constraint has some meaning.

In addition, sometimes one wants to know about bounds on the \(RTC\) function when there are restrictions on one of the parameters. For example, suppose that \(u_2=0\), meaning that all of the risk \(X_2\) is transferred (nothing is retained). Then, the bounds on the \(RTC\) function over different choices of \(u_1\) are \[ \mathrm{E}(X_2) = RTC[\infty,0] \le RTC[u_1,0] \le RTC[0,0] = \mathrm{E}(X_1)+\mathrm{E}(X_2). \] Thus, when \(u_2=0\), a selection of \(RTC_{max} < \mathrm{E}(X_2)\) means that the feasible set is null so that the problem is not well defined. Table 5.1 summarizes this and additional bounds.

Table 5.1. Bounds on the Fair Risk Transfer Cost Function

\[ \begin{array}{c|cc}\hline \text{Known} & \text{Lower Bound} & \text{Upper Bound} \\ \text{Parameter Values} & & \\\hline u_1=0 & \mathrm{E}(X_1)& \mathrm{E}(X_1)+\mathrm{E}(X_2)\\ u_1=\infty & 0 & \mathrm{E}(X_2) \\ u_2=0 & \mathrm{E}(X_2)& \mathrm{E}(X_1)+\mathrm{E}(X_2)\\ u_2=\infty & 0 &\mathrm{E}(X_1) \\ \hline \end{array} \]

As a consequence, suppose that one sets the maximal transfer cost so that \(RTC_{max} < \min \{\mathrm{E}(X_1), \mathrm{E}(X_2)\}\). Then, on the feasible set where \(RTC(u_1,u_2) \le RTC_{max}\), both \(u_1\) and \(u_2\) are positive.
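The guidelines above amount to a small lookup against Table 5.1. The following Python sketch encodes the table for additive fair costs and classifies a proposed \(RTC_{max}\). The function names and the three status labels are hypothetical, chosen only for this illustration, and the expected losses 4,000 and 1,000 are arbitrary inputs.

```python
def rtc_bounds(ex1, ex2, fixed=None):
    """Lower and upper bounds on the fair additive RTC, following Table 5.1.

    ex1, ex2: expected losses E(X1), E(X2).
    fixed: None (no restriction) or one of "u1=0", "u1=inf", "u2=0", "u2=inf".
    """
    table = {
        None: (0.0, ex1 + ex2),
        "u1=0": (ex1, ex1 + ex2),
        "u1=inf": (0.0, ex2),
        "u2=0": (ex2, ex1 + ex2),
        "u2=inf": (0.0, ex1),
    }
    return table[fixed]


def budget_status(rtc_max, ex1, ex2, fixed=None):
    """Classify a budget: no feasible values, no real restriction, or in between."""
    lower, upper = rtc_bounds(ex1, ex2, fixed)
    if rtc_max < lower:
        return "infeasible"        # budget below the smallest attainable cost
    if rtc_max >= upper:
        return "non-binding"       # every admissible limit satisfies the budget
    return "binding possible"      # the budget can restrict the choice of limits


for fixed in (None, "u1=0", "u1=inf", "u2=0", "u2=inf"):
    print(fixed, rtc_bounds(4000.0, 1000.0, fixed),
          budget_status(1500.0, 4000.0, 1000.0, fixed))
```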

Limitations on Risk Retention Parameters Imposed by Constraints

For the feasible set, we require \(u_1 \ge 0,u_2 \ge 0\) and \(0 \le RC_1(u_1) + RC_2(u_2) \le RTC_{max}\). Because each cost is non-negative, this means that \(RC_j(u_j) \le RTC_{max}\), which we can also write as \(u_j \ge RC_j^{-1}[RTC_{max}]\), for \(j=1,2\). Here, \(RC_j^{-1}\) is the inverse of the cost function \(RC_j\) that may be defined as \(RC_j^{-1}(t) = \inf\{u: RC_j(u) \le t \}\). Thus, if the maximal risk transfer cost is sufficiently small, then we can get some information about the lower bounds for the risk retention parameters \(u_1, u_2\). This is illustrated in the following example.

Example 5.10. Parameter Limits for Gamma and Pareto Distributions. Let \(X_1\) have a gamma distribution with shape parameter 2 and scale parameter 2,000 and let \(X_2\) have a Pareto distribution with shape parameter 3 and scale parameter 2,000. Their relationship is governed by a normal copula with parameter \(\rho = 0.5\). For demonstration purposes, we assume a maximal risk transfer cost is 0.30 times the sum of expected losses. With \(\mathrm{E}(X_1) =\) 4,000, \(\mathrm{E}(X_2) =\) 1,000, this means that \(RTC_{max} =\) 1,500.


For this example, it turns out that the lower bound for \(u_1\) is \(RC^{-1}_1(1500) =\) 3,113 and the lower bound for \(u_2\) is 0. Because the maximal risk transfer cost exceeds the expected risk, \(RTC_{max} > \mathrm{E}(X_2)\), it seems reasonable that there is no additional information about the lower bound for \(u_2\).
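These lower bounds can be reproduced with a few lines of code. The Python sketch below uses closed-form fair costs that follow from equation (5.12) under two assumptions: the gamma distribution with shape 2 has survival function \((1+x/\theta)e^{-x/\theta}\), and the Pareto distribution is of the type with survival function \(\{\theta/(x+\theta)\}^3\), which matches the stated mean of 1,000. The inverse is computed by bisection; all function names are hypothetical.

```python
import math

THETA1 = 2000.0  # gamma scale (shape fixed at 2)
THETA2 = 2000.0  # Pareto scale (shape fixed at 3)


def rc1(u):
    """Fair cost for the gamma risk: integral of the survival function beyond u."""
    return THETA1 * math.exp(-u / THETA1) * (2.0 + u / THETA1)


def rc2(u):
    """Fair cost for the Pareto risk: integral of the survival function beyond u."""
    return THETA2 ** 3 / (2.0 * (u + THETA2) ** 2)


def rc_inverse(rc, target, hi=1e9, tol=1e-6):
    """inf{u >= 0 : rc(u) <= target}, by bisection on a decreasing function."""
    if rc(0.0) <= target:
        return 0.0  # the budget already covers a full transfer of this risk
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rc(mid) <= target:
            hi = mid
        else:
            lo = mid
    return hi


rtc_max = 0.30 * (rc1(0.0) + rc2(0.0))   # 0.30 times the sum of expected losses
u1_lower = rc_inverse(rc1, rtc_max)
u2_lower = rc_inverse(rc2, rtc_max)
print(round(rtc_max), round(u1_lower), round(u2_lower))
```

Under these assumptions, the computed lower bound for \(u_1\) rounds to the value 3,113 reported above, and the bound for \(u_2\) is zero.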

5.5.2 Active Constraint Feasible Region

It is natural to test limiting values so that one approaches the point where a constraint becomes active in the sense that \(RTC(u_1,u_2) = RTC_{max}\). How does the choice of \(RTC_{max}\) affect the set of risk retention parameters that achieve an active constraint? These parameter bounds could also provide useful starting values for the broader constrained optimization problem (where we do not impose an active constraint in advance).

Let us now consider limitations on the feasible set imposed by the active budget constraint \(RTC(u_1,u_2) = RTC_{max}\). With \(RC_1(u_1) + RC_2(u_2)=RTC_{max}\), we define \[\begin{equation} AT_{1,RC}(u_1) = RC_2^{-1}[RTC_{max} - RC_1(u_1)] = u_2, \tag{5.13} \end{equation}\] a trade-off function among parameter values at the active constraint. The requirement that \(0 \le u_2 \le \infty\) implies constraints on \(u_1\) in that \[\begin{equation} \max\left[0,RTC_{max}- RC_2(0) \right] \le RC_1(u_1) \le RTC_{max} . \tag{5.14} \end{equation}\]
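The bounds in equation (5.14) follow from a short argument. As a sketch, assume that each cost function \(RC_j\) is non-negative and non-increasing with \(RC_j(\infty) = 0\), as holds for the fair costs in equation (5.12). At an active constraint, \(RC_2(u_2) = RTC_{max} - RC_1(u_1)\), and letting \(u_2\) range over \([0, \infty]\) gives \[ 0 = RC_2(\infty) \le RTC_{max} - RC_1(u_1) \le RC_2(0) . \] Rearranging, \[ RTC_{max} - RC_2(0) \le RC_1(u_1) \le RTC_{max} , \] and combining the left-hand inequality with \(RC_1(u_1) \ge 0\) yields the maximum that appears in equation (5.14).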

\(Under~the~Hood.\) Proof of the Bounds for Active Constraints

To see this result, consider the following minimal scenario.

Example 5.11. Fair Transfer Costs, Uniform Distributions. Assume that \(X_1, X_2\) are both uniformly distributed on [0,1]. From equation (5.12), we have \(RC_j(u) = (1-u)^2/2\) and so \(RC_j^{-1}(y) = 1- \sqrt{2y}\). With this and equation (5.14), we have \[ \max \left[0 , RTC_{max}- \frac{1}{2} \right] \le \frac{(1-u_1)^2}{2} \le RTC_{max} . \] This is equivalent to \[ \max[0, 1-\sqrt{2 RTC_{max}} \ ] \le u_1 \le 1 - \sqrt{\max[0 , (2RTC_{max}-1)]} \ \ \ . \] Values of \(u_1\) outside of the boundaries may represent legitimate risk parameters but cannot achieve an active constraint.

For valid values of \(u_1\), from equation (5.13), we have \[ AT_{1,RC}(u_1) = 1 - \sqrt{2 RTC_{max} - (1-u_1)^2} ~, \] for the trade-off function. Figure 5.12 shows this trade-off function with these bounds. Note that values of \(u_1\) or \(u_2\) at zero signify total risk transfer and values at one signify total risk retention.
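The formulas of this example are easy to evaluate directly. The following Python sketch codes the bounds on \(u_1\) and the trade-off function for uniform risks, then checks that the resulting pair sits on the active constraint. The function names are hypothetical, and the small clamp inside the square root is an added safeguard against floating-point round-off at the boundary.

```python
import math


def rc_uniform(u):
    """Fair cost RC(u) = (1 - u)^2 / 2 for a uniform(0,1) risk, 0 <= u <= 1."""
    return (1.0 - u) ** 2 / 2.0


def u1_bounds(rtc_max):
    """Range of u1 values that can achieve an active constraint."""
    lower = max(0.0, 1.0 - math.sqrt(2.0 * rtc_max))
    upper = 1.0 - math.sqrt(max(0.0, 2.0 * rtc_max - 1.0))
    return lower, upper


def trade_off(u1, rtc_max):
    """u2 = 1 - sqrt(2 * RTC_max - (1 - u1)^2), the active trade-off function."""
    inside = max(0.0, 2.0 * rtc_max - (1.0 - u1) ** 2)  # clamp round-off
    return 1.0 - math.sqrt(inside)


for rtc_max in (0.8, 0.5, 0.4, 0.2):
    lower, upper = u1_bounds(rtc_max)
    u1 = 0.5 * (lower + upper)          # a point inside the valid range
    u2 = trade_off(u1, rtc_max)
    total = rc_uniform(u1) + rc_uniform(u2)
    print(rtc_max, round(lower, 4), round(upper, 4), round(u2, 4), round(total, 4))
```

In each row the last column reproduces \(RTC_{max}\), confirming that the pair \((u_1, u_2)\) makes the constraint active.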


Figure 5.12: Plot of Trade-off Function Between Two Upper Limits for an Active Constraint. The black solid is for maximum risk transfer cost \(RTC_{max}\) 0.8, the red dashed is for 0.5, the blue dotted for 0.4, and the green dot-dash for 0.2.

Limitations on Risk Retention Parameters Imposed by an Active Constraint

For an active constraint, \((u_1,u_2)\) are such that \(RC_1(u_1) + RC_2(u_2) = RTC_{max}\) and so there is only one free parameter, bringing us back to the discussion in Section 5.5.1 with a known parameter. Formally, given \(u_1\), one can determine \(u_2\) through the active trade-off function in equation (5.13). In the same way, we can define a second active trade-off function \[ AT_{2,RC}(u_2) = RC_1^{-1}[RTC_{max} - RC_2(u_2)] = u_1 , \] that provides an expression to determine \(u_1\) based on knowledge of \(u_2\).

From equation (5.14), we can define two feasible sets: \[ \begin{array}{ll} FS_1 &= \{u_1 \ge 0:\max\{0,RTC_{max}- RC_2(0) \} \le RC_1(u_1) \le RTC_{max} \} \\ FS_2 &= \{u_2 \ge 0:\max\{0,RTC_{max}- RC_1(0) \} \le RC_2(u_2) \le RTC_{max} \} . \end{array} \]

  • Assuming that the constraint is binding or active, we can also get an upper bound on each parameter (that may be infinity).
    • The upper bound for \(u_1\) comes from the inequality \(\max\{0,RTC_{max}- RC_2(0) \} \le RC_1(u_1)\) and similarly for \(u_2\).
  • Note that these feasible sets depend on the maximal risk transfer cost, \(RTC_{max}\).

Example 5.12. Active Constraint Parameter Limitations. Table 5.2 provides these bounds using the assumptions of Example 5.10. Using equation (5.14) and \(FS_1\), we can determine a lower bound imposed by the active constraint limitation on \(u_1\) as \(u_{1,lower} = RC_1^{-1}(RTC_{max}) = 3,113\). With the value of \(u_{1,lower}\), we can compute the corresponding value for \(u_2=AT_{1,RC}(u_{1,lower}) =2,717,751\) (that is essentially infinity), as well as the \(VaR =4,757\) and \(ES =6,674\).

Because \(RTC_{max} > \mathrm{E}(X_2) = RC_2(0)\), we also have a finite upper bound on \(u_1\). This is \(u_{1,upper} = RC_1^{-1}[RTC_{max}- RC_2(0)] =5,989\). To summarize, from the table we see that the set of bounds on \(u_1\) for the first feasible set \(FS_1\) is [3113, 5989]. In the same way, the set of bounds on \(u_2\) for the second feasible set \(FS_2\) is [0.00, 100000000], meaning that essentially no bounds are available for this parameter.

Table 5.2: Bounds and Risk Measures

\[ \begin{array}{l|rr}\hline & \text{Feasible Set 1} & \text{Feasible Set 2} \\\hline u_1 \text{ Based on Lower } u \text{ in the FS} & 3113 & 5989 \\ u_2 \text{ Based on Lower } u \text{ in the FS} & 2717751 & 0 \\ VaR \text{ Based on Lower } u \text{ in the FS} & 4231 & 5989 \\ ES \text{ Based on Lower } u \text{ in the FS} & 5809 & 5989 \\ u_1 \text{ Based on Upper } u \text{ in the FS} & 5989 & 3113 \\ u_2 \text{ Based on Upper } u \text{ in the FS} & 0 & 100000000 \\ VaR \text{ Based on Upper } u \text{ in the FS} & 5989 & 4231 \\ ES \text{ Based on Upper } u \text{ in the FS} & 5989 & 5809 \\ \hline \end{array} \]
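The bounds on \(u_1\) in this example, and the trade-off function that links \(u_1\) to \(u_2\), can be sketched in Python as follows. The closed-form costs rest on the same assumptions as before (gamma survival function \((1+x/\theta)e^{-x/\theta}\) and Pareto survival function \(\{\theta/(x+\theta)\}^3\)), and all function names are hypothetical. The risk measures in the table require the copula and are not computed here.

```python
import math

THETA = 2000.0     # common scale of the gamma (shape 2) and Pareto (shape 3) risks
RTC_MAX = 1500.0


def rc1(u):
    """Fair cost for the gamma risk."""
    return THETA * math.exp(-u / THETA) * (2.0 + u / THETA)


def rc2(u):
    """Fair cost for the Pareto risk."""
    return THETA ** 3 / (2.0 * (u + THETA) ** 2)


def rc1_inverse(target, hi=1e6, tol=1e-6):
    """inf{u >= 0 : rc1(u) <= target}, by bisection."""
    if rc1(0.0) <= target:
        return 0.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rc1(mid) <= target:
            hi = mid
        else:
            lo = mid
    return hi


def trade_off_1(u1):
    """u2 such that rc1(u1) + rc2(u2) = RTC_MAX, as in equation (5.13)."""
    t = RTC_MAX - rc1(u1)
    if t <= 0.0:
        return math.inf      # no budget left for the second risk: retain it all
    if t >= rc2(0.0):
        return 0.0           # enough budget to transfer the second risk fully
    return THETA * math.sqrt(THETA / (2.0 * t)) - THETA   # closed-form inverse


u1_lower = rc1_inverse(RTC_MAX)
u1_upper = rc1_inverse(RTC_MAX - rc2(0.0))
u1_mid = 0.5 * (u1_lower + u1_upper)
u2_mid = trade_off_1(u1_mid)
print(round(u1_lower), round(u1_upper), round(u2_mid),
      round(rc1(u1_mid) + rc2(u2_mid)))
```

At the lower end of the interval the remaining budget for the second risk is essentially zero, so the implied \(u_2\) is extremely large; its exact numerical value depends on the tolerance of the root finder.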

5.5.3 Active Constraint Objective Function

By working with feasible sets limited by the active constraint, we can replace the problem summarized in Display (5.10) with \[\begin{equation} \boxed{ \begin{array}{lc} {\small \text{minimize}_{u_1}} & ~~~~~RM[S(u_1, AT_{1,RC}(u_1))] ,\\ {\small \text{subject to}} &\max\{0,RTC_{max}- RC_2(0) \} \le RC_1(u_1) \le RTC_{max} \\ & u_1 \ge 0. \end{array} } \tag{5.15} \end{equation}\] This uses the active trade-off function \(AT_{1,RC}(u)\) defined in equation (5.13). Although certainly less intuitive, it is only a one-parameter problem. We could also express Display (5.15) in terms of the second trade-off function \(AT_{2,RC}\) and feasible set \(FS_2\). Although in principle they have the same solutions, the choice can matter in terms of numerical stability.

This section demonstrates, as suggested by Nocedal and Wright (2006), that the objective function varying over this restricted region can exhibit unanticipated behavior.

To start, we now return to Example 5.11 where the risks have identical distributions. In this case, the two trade-off functions \(AT_{1,RC}\) and \(AT_{2,RC}\) are the same. The point of this illustration is to emphasize the importance of the dependence between risks.

Example 5.13. Visualizing the Objective Function, Uniform Distributions. Extending Example 5.11, we now assume the maximal risk transfer cost is \(RTC_{max} = 0.2\). With this, the parameter bounds become \[ \begin{array}{ll} 0.3675 & \approx 1-\sqrt{0.4} \\ & = \max[0, 1-\sqrt{2 RTC_{max}} \ ] \le u_1 \le 1 - \sqrt{\max[0 , (2RTC_{max}-1)]} = 1 . \end{array} \] From \(RC(u) = (1-u)^2/2\), one can easily verify that the inverse function is \(RC^{-1}(y) = 1 - \sqrt{2y}\). Thus, with equation (5.13), we have the active trade-off function to be \[ \begin{array}{ll} AT_{RC}(u) = RC^{-1}[RTC_{max} - RC(u)] =1 - \sqrt {0.4 - (1-u)^2} .\\ \end{array} \] Figure 5.13 summarizes results; it is a remarkable figure. This figure suggests that the optimization problem depends in a critical way on the dependence, reinforcing observations made in Example 5.2.

  • The left-hand panel suggests that, in the case of independence (the solid black line), the \(VaR\) does not depend on \(u_1\). The dotted red curve is for a negative Spearman correlation -0.5 and the dashed blue curve is for a positive correlation 0.5 (the confidence level is \(\alpha = 0.7\) for this graph). For negatively dependent risks, marked by the dotted red curve, an analyst could use a standard optimization technique, such as seeking a point where the gradient is zero, to determine the best value of \(u\) to minimize the \(VaR\). However, if one uses the same technique for positively dependent risks, marked by the dashed blue curve, the “best” value of \(u\) results in the maximum \(VaR\). The solid black line is flat, suggesting that the analyst would find this optimization problem to be challenging.
  • The right-hand panel shows the relation between the upper limit and the \(ES\). This figure is more comforting in that, regardless of the dependence, one could use a standard optimization technique to determine its minimal value. Less comforting is the fact that one could get a very different optimal point, depending on whether one uses the \(VaR\) or \(ES\) as the risk measure.
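The independence case of this example can be checked without simulation. The Python sketch below is limited to independent uniform(0,1) risks, for which the distribution function of the retained risk has a simple piecewise form obtained by integrating over the first risk; it does not cover the two dependent curves in the figure. The \(VaR\) is found by bisection and the \(ES\) as \(VaR + \mathrm{E}[(S - VaR)_+]/(1-\alpha)\), with the expectation approximated by a midpoint rule. All function names are hypothetical.

```python
import math

RTC_MAX, ALPHA = 0.2, 0.7


def trade_off(u1):
    """Active trade-off function for uniform risks (clamped against round-off)."""
    return 1.0 - math.sqrt(max(0.0, 2.0 * RTC_MAX - (1.0 - u1) ** 2))


def cdf_retained(y, u1, u2):
    """Pr[min(X1,u1) + min(X2,u2) <= y] for independent uniform(0,1) risks."""
    if y >= u1 + u2:
        return 1.0
    if y < 0.0:
        return 0.0

    def g2(t):  # Pr[min(X2, u2) <= t]
        return 0.0 if t < 0.0 else (t if t < u2 else 1.0)

    def h2(t):  # integral of g2 from 0 to t
        if t < 0.0:
            return 0.0
        return 0.5 * t * t if t < u2 else 0.5 * u2 * u2 + (t - u2)

    # X1 below its limit (integrate over x1) plus X1 at or above its limit.
    return h2(y) - h2(y - u1) + (1.0 - u1) * g2(y - u1)


def var_retained(u1, u2, alpha=ALPHA):
    """Smallest y with distribution function at least alpha, by bisection."""
    lo, hi = 0.0, u1 + u2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cdf_retained(mid, u1, u2) >= alpha:
            hi = mid
        else:
            lo = mid
    return hi


def es_retained(u1, u2, alpha=ALPHA, steps=2000):
    """Expected shortfall via VaR plus the scaled expected excess over VaR."""
    v = var_retained(u1, u2, alpha)
    width = (u1 + u2 - v) / steps
    tail = sum(1.0 - cdf_retained(v + (i + 0.5) * width, u1, u2)
               for i in range(steps)) * width
    return v + tail / (1.0 - alpha)


u1_lower = 1.0 - math.sqrt(2.0 * RTC_MAX)
grid = [min(1.0, u1_lower + i * (1.0 - u1_lower) / 10) for i in range(11)]
var_values = [var_retained(u1, trade_off(u1)) for u1 in grid]
es_values = [es_retained(u1, trade_off(u1)) for u1 in grid]
for u1, v, e in zip(grid, var_values, es_values):
    print(round(u1, 4), round(v, 6), round(e, 4))
```

In this sketch the \(VaR\) column is constant across the grid, in line with the flat solid line in the left-hand panel, whereas the \(ES\) column is smaller in the interior of the interval than at either end.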

Figure 5.13: Plot of Risk Measures at the Active Constraint for Uniform Distributions. The left-hand panel gives the value at risk at 70 percent confidence level, and the right-hand one gives the corresponding \(ES\). For these panels, the solid black line corresponds to the case of a zero Spearman correlation 0.0. The dotted red curve is for a negative Spearman correlation -0.5 and the dashed blue curve is for a strong positive Spearman correlation 0.5.


This example underscores the importance of dependence between risks, a topic that we take up in Chapter 12.


Example 5.14. Visualizing the Objective Function with an Active Constraint. The observations from Example 5.13 are not unique to the uniform distribution. To verify this, we return to the set-up in Example 5.12. Figure 5.14 shows the value at risk and the expected shortfall for \(u_1\) in the interval \([u_{1,lower}, u_{1,upper}]\).


Figure 5.14: Plot of VaR and ES with an Active Constraint. The left-hand panel reports the \(VaR\), the right-hand panel reports the \(ES\).


From the left-hand panel of Figure 5.14, we see that \(u_1=u_{1,lower}\) minimizes the value at risk. A numerical optimization routine that searches for a stationary point (with a zero derivative) will have difficulty reaching this conclusion. In contrast, from the right-hand panel, minimization of the expected shortfall can readily be accomplished with numerical optimization routines. The associated code demonstrates how to use the R function optimize. It verifies that, minimizing \(VaR\), the best value of \(u_1\) is 3,113 with value at risk \(VaR^* =\) 4,757 and expected shortfall \(ES^* =\) 6,673. Further, minimizing \(ES\), the best value of \(u_1\) is 4,257 with value at risk \(VaR^* =\) 5,038 and expected shortfall \(ES^* =\) 5,038.

R Code for Single Variable Optimization

This work assumes a fixed value of \(RTC_{max}\). It is reasonable to ask whether the results vary qualitatively if one changes the maximal risk transfer cost. Figure 5.15 shows the \(VaR\) and the \(ES\) over different levels of \(RTC_{max}\). The left-hand panel tells us that we are likely to encounter problems with the \(VaR\) regardless of the value of \(RTC_{max}\). The right-hand panel tells us that we are likely to have little difficulty with the \(ES\) regardless of the value of \(RTC_{max}\).


Figure 5.15: Plot of VaR and ES at the Active Constraint with Varying \(RTC_{max}\). The left-hand panel reports the \(VaR\), the right-hand panel reports the \(ES\). Each curve corresponds to a different value of \(RTC_{max}\).



5.6 Supplemental Materials

5.6.1 Further Resources and Readings

The statistics literature has a long history on sensitivity of quantiles to unusual or aberrant data, cf. Serfling (1980). In contrast, the management science literature uses quantile sensitivities, as in Hong (2009), to understand how a quantile changes when a parameter changes. To interpret the quantile sensitivity, Hong (2009) showed how to write it as a conditional expectation. Using notation from G. Jiang and Fu (2015), who provide less stringent conditions for the relationship to hold, we have \[ \mathrm{E}\left[ \partial_{\theta} ~g(\mathbf{X};\theta) ~|~ g(\mathbf{X};\theta)=x\right] =- \frac{\partial_{\theta} F_{g(\mathbf{X};\theta)}(x)} {\partial_{x} F_{g(\mathbf{X};\theta)}(x)}. \] In management science, these results have been utilized as part of the general simulation literature on evaluating derivatives, cf. Fu (2008).
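A small numerical check can make this relationship concrete. The Python sketch below uses an illustrative choice that is not taken from the text: a single exponential risk \(X\) with mean one and \(g(X;\theta) = \theta X\), so that \(F_{g(X;\theta)}(x) = 1 - e^{-x/\theta}\) and the conditional expectation on the left-hand side is simply \(x/\theta\). The right-hand side and the quantile sensitivity are approximated by central finite differences; the function names are hypothetical.

```python
import math


def cdf_g(x, theta):
    """Distribution function of g(X; theta) = theta * X, X exponential with mean 1."""
    return 1.0 - math.exp(-x / theta)


def quantile_g(alpha, theta):
    """Quantile of g(X; theta) at confidence level alpha."""
    return -theta * math.log(1.0 - alpha)


def central_diff(f, z, h=1e-5):
    """Central finite-difference approximation to the derivative of f at z."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


theta, alpha = 2.0, 0.95
x = quantile_g(alpha, theta)

d_theta_F = central_diff(lambda t: cdf_g(x, t), theta)
d_x_F = central_diff(lambda y: cdf_g(y, theta), x)

lhs = x / theta                       # E[X | theta * X = x]
rhs = -d_theta_F / d_x_F              # ratio of partial derivatives of F
d_quantile = central_diff(lambda t: quantile_g(alpha, t), theta)

print(lhs, rhs, d_quantile)           # the three values agree closely
```

Evaluating at \(x\) equal to the quantile, both sides also match the derivative of the quantile with respect to \(\theta\), which is the conditional expectation interpretation described above.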

The Section 5.3 risk measure sensitivities were portrayed in terms of risk retention parameters \(\boldsymbol \theta\) as these are the key variables of interest for portfolio risk retention. As a follow up, Chapter 9 will consider so-called “auxiliary” variables such as the mean of each risk and will provide motivation to calculate sensitivities with respect to these variables. No additional work will be needed for these auxiliary variables; the calculation techniques introduced in this chapter will hold for this new application.

The \(ES1\) function was first put forth by Rockafellar and Uryasev (2002).

Difficulties with value at risk optimization are well known in the finance literature. For example, Alexander (2013) argues that the use of \(VaR\) minimization in the asset portfolio problem can lead to perverse outcomes and argues for the use of expected shortfall.

See Lee (2023) for an extensive investigation of risk retention policies utilizing active constraints.

5.6.2 Exercises

Section 5.1 Exercises

Exercise 5.1. Lack of Convexity for Value at Risk. The sample code in Example 5.1 provided an example of the lack of convexity for the excess of loss contract based on the value at risk measure using simulation techniques. Replicate this example using the deterministic methods of evaluating the value at risk summarized in Section 5.2.


Exercise 5.2. Lack of Convexity for Expected Shortfall. The sample code in Example 5.1 provided an example of the lack of convexity for the excess of loss contract based on the value at risk measure using simulation techniques. Continue using simulation techniques but now replicate this example using the expected shortfall instead of value at risk and summarize results as in Figure 5.16.


Figure 5.16: Exercise 5.2. Excess of Loss Expected Shortfall


Section 5.2 Exercises

Exercise 5.3. Special Case. Excess of Loss Distribution Function for Independent Uniformly Distributed Losses. Let us get some familiarity with the results in Section 5.2 by assuming that both risks have uniform (on (0,1)) distributions and are independent. Assume \(u_2 \le u_1\).

a. Show that \[ \begin{array}{ll} \Pr[S(u_1,u_2) \le y] \\ ~~~~~~~~~= \left\{\begin{array}{ll} y^2/2 \times I(y\le 1) + \{1- (2-y)^2/2\}\times I(1\le y \le 2) & \text{if } y <u_2 < u_1 \\ u_2^2/2 + y - u_2 & \text{if } u_2 < y <u_1 \\ u_1u_2 - \frac{1}{2}(u_1 + u_2 - y)^2 & \text{if } u_2< u_1 < y \\ ~~~~ +(y-u_1)(1-u_1)+(y-u_2)(1-u_2) .\\ \end{array}\right. \end{array} \]

b. For \(u_1 = 0.8\), \(u_2 = 0.4\), provide code to display the function as in Figure 5.17.


Figure 5.17: Exercise 5.3. Excess of Loss Distribution Function


Exercise 5.4. Special Case. Excess of Loss Value at Risk for Independent Uniformly Distributed Losses. Using the assumptions of Exercise 5.3, provide code to replicate the graph of the value at risk versus the level of confidence \(\alpha\) in Figure 5.18.


Figure 5.18: Exercise 5.4. Excess of Loss Value at Risk versus Level of Confidence (\(\alpha\))


Section 5.3 Exercises

Exercise 5.5. Risk Sensitivity for the Sum of Two Risks. This is a continuation of Exercises 2.3, 2.4, and 2.6. The risk sensitivities presented in equations (5.5) and (5.7) appear to be mathematically challenging so it might be helpful to provide some numerical context. To this end, consider the sum of two risks and the baseline retained risk \(g(S) =g(S;d =100,c =0.8, u=30000)\) with confidence level \(\alpha = 0.95\).

a. Determine the quantile sensitivity that appears in equation (5.5). Because of the complexity of the quantile, use the R function grad() from the package numDeriv to compute numerical approximations of the derivatives.
b. Determine the \(ES\) sensitivity that appears in equation (5.7). Hint: For the partial derivative of the limited expectations, compute \(\mathrm{E}[g(\mathbf{X}; \boldsymbol \theta) \wedge z]\) as a function of argument \(z\) and parameters \(\boldsymbol \theta\) and then calculate a numerical derivative.


Exercise 5.6. Single Variable Upper Limit - \(VaR\). Consider a single random variable \(X\) with an upper limit so that \(\theta = u\) and \(g(X;\theta) = X \wedge u\). Show that the results from Table 2.2 are not consistent with those from equation (5.5). Infer from this that smoothness conditions on the retention parameter \(\theta=u\) are not satisfied.


Exercise 5.7. Single Variable Upper Limit - \(ES\). Consider a single random variable \(X\) with an upper limit so that \(\theta = u\) and \(g(X;\theta) = X \wedge u\). Although this set-up is similar to Exercise 5.6, show that one can use equation (5.7) to determine

\[ \partial_{u} ES_{\alpha}(X \wedge u) = \left\{ \begin{array}{cl} 1 & \text{if } u \le F_{\alpha}^{-1} \\ \frac{1-F(u)}{1-\alpha} & \text{if } u > F_{\alpha}^{-1} .\\ \end{array} \right. \]

This corroborates the results from Table 2.2.
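A numerical check can make the displayed derivative concrete. The sketch below is an illustration rather than a proof, and it assumes a unit exponential distribution for \(X\) and \(\alpha = 0.95\), neither of which comes from the exercise. It computes \(ES_{\alpha}(X \wedge u)\) as the average of the quantiles of \(X \wedge u\) above level \(\alpha\), differentiates by central differences, and compares the result with the two-case formula. The function names are hypothetical.

```r
alpha <- 0.95
q_alpha <- qexp(alpha)     # F_alpha^{-1} for the assumed unit exponential

# ES of X ^ u: average of the quantiles min(F^{-1}(a), u) over a in (alpha, 1)
es_limited <- function(u) {
  if (u <= q_alpha) return(u)        # every quantile above alpha equals u
  p_u <- pexp(u)
  body <- integrate(qexp, lower = alpha, upper = p_u, rel.tol = 1e-10)$value
  (body + u * (1 - p_u)) / (1 - alpha)
}

# Central-difference approximation to the derivative with respect to u
num_deriv <- function(u, h = 1e-5) {
  (es_limited(u + h) - es_limited(u - h)) / (2 * h)
}

# Two-case formula displayed in the exercise
formula_deriv <- function(u) {
  if (u <= q_alpha) 1 else (1 - pexp(u)) / (1 - alpha)
}

u_vals <- c(1, 2, 4, 5)              # two limits below and two above q_alpha
check <- data.frame(u = u_vals,
                    numerical = sapply(u_vals, num_deriv),
                    formula   = sapply(u_vals, formula_deriv))
print(check)
```

The two columns should agree closely away from \(u = F_{\alpha}^{-1}\), where the derivative has a kink and a central difference is not reliable.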


Exercise 5.8. Special Case. Excess of Loss Value at Risk for Independent Uniformly Distributed Losses. Using the assumptions of Exercise 5.3, determine the first and second derivatives of the value at risk with respect to the first retention parameter, \(u_1\). Describe what these derivatives imply about the convexity of the value at risk function.


Section 5.5 Exercises

Exercise 5.9. Constrained Optimization with an Equality Constraint. Consider the trade-off function, defined in equation (5.13).

a. Show that the derivative of the trade-off function can be expressed as \(- RC_1^{\prime}(u) / RC_2^{\prime}[AT_{RC}(u)]\).
b. In the case of fair transfer costs, show that the derivative is \(-\{1-F_1(u)\} / \{1-F_2[AT_{RC}(u)]\}\).
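The result in part (b) can be checked numerically. Equation (5.13) is not reproduced in this exercise, so the sketch below assumes that the trade-off function is defined by the budget constraint \(RC_1(u) + RC_2[AT_{RC}(u)] = RTC_{max}\) and that fair transfer costs are \(RC_j(u) = \mathrm{E}[(X_j - u)_+]\). The exponential risks with means 1 and 2, the value \(RTC_{max} = 0.5\), and all function names are hypothetical choices made only for illustration.

```r
# Hypothetical set-up: exponential risks with means m1 and m2
m1 <- 1
m2 <- 2
rtc_max <- 0.5

# Fair transfer cost E[(X - u)_+] for an exponential risk with mean m
rc <- function(u, m) m * exp(-u / m)

# Inverse of the fair transfer cost (closed form for the exponential case)
rc_inv <- function(r, m) -m * log(r / m)

# Trade-off function under the assumed constraint RC_1(u) + RC_2(u2) = rtc_max
at_rc <- function(u) rc_inv(rtc_max - rc(u, m1), m2)

# Central-difference derivative of the trade-off function
u <- 1.5
h <- 1e-6
numerical <- (at_rc(u + h) - at_rc(u - h)) / (2 * h)

# Expression from part (b): -{1 - F_1(u)} / {1 - F_2[AT_RC(u)]}
formula <- -(1 - pexp(u, rate = 1 / m1)) /
  (1 - pexp(at_rc(u), rate = 1 / m2))

c(numerical = numerical, formula = formula)
```

Under this set-up the first limit must satisfy \(RC_1(u) < RTC_{max}\), that is \(u > \log 2\), so that some budget remains for the second risk.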


Exercise 5.10. Derivatives of the Risk Measure at the Equality Constraint. To ease the notation a bit, define \(u_2^* =AT_{RC}(u)\) for the second upper limit (that is a function of the first upper limit \(u\)). Show that \[\begin{equation} \begin{array}{ll} &\partial_u VaR[u, u_2^*] \\ &= \frac{1}{F_y[u, u_2^*,y]} \left\{ f_2(y-u)\left\{1-C_2[F_1(u),F_2(y-u)] \right\} \right.\\ & \ \ \ \ \left. + f_1(y-u_2^*) \left\{1-C_2[F_2(u_2^*),F_1(y-u_2^*)] \right\} AT_{1RC}(u) \right\}, \end{array} \tag{5.16} \end{equation}\] at \(y=VaR[u, u_2^*]\).


Exercise 5.11. Special Case. Identical Distributions. Consider the assumptions of Exercise 5.10 and further suppose that the two risks have identical distributions so that \(F_1=F_2=F\), and similarly for the densities. Give a simpler expression for equation (5.16) and argue that \(u = RC^{-1}(RTC_{max}/2)\) is a stationary point for the \(VaR\). Note that this result does not depend on the dependence between the risks, and relate it to Figure 5.13.


Exercise 5.12. Special Case. Independent Distributions. Consider the assumptions of Exercise 5.10 and now assume that the risks are independent. Provide mild sufficient conditions under which a stationary point of the \(VaR\) can be found at \(u\) such that \(f_2(y-u)=f_1(y-u_2^*)\), where \(y=VaR[u, u_2^*]\).
