Chapter 11 Risk Retention Conditions

11.1 KKT Conditions

We now return to the general constrained optimization problem in Display (3.9) and the associated Lagrangian in equation (3.10). Suppose that ${\bf z}^*$ is a local minimizer on the feasible set. Then, there exist multipliers $\mathbf{LMI}^* = \left(LMI_j^*; ~j \in CON_{in} \right)^{\prime}$ and $\mathbf{LME}^* = \left(LME_j^*; ~j \in CON_{eq} \right)^{\prime}$ that satisfy the

Karush-Kuhn-Tucker ($KKT$) Conditions. \[\begin{equation} \begin{array}{ll} \partial_{z_i} ~ \left. LA\left({\bf z},\mathbf{LMI}^*,\mathbf{LME}^* \right) \right|_{{\bf z}={\bf z}^*} = 0 & i=1, \ldots, p_z \\ LMI_j^* f_{con,j}({\bf z}^*) = 0 & j \in CON_{in} \\ LMI_j^* \ge 0 & j \in CON_{in} \\ f_{con,j}({\bf z}^*) \le 0 & j \in CON_{in} \\ f_{con,j}({\bf z}^*) = 0 & j \in CON_{eq} . \\ \end{array} \tag{11.1} \end{equation}\]

The conditions in Display (11.1) are known as the Kuhn-Tucker or the Karush-Kuhn-Tucker (KKT) conditions, named after their originators. Mathematically, they are necessary conditions; that is, if a point is an optimum, then the conditions must hold. Many constrained optimization algorithms can be seen as methods for solving these conditions; they are extensively used in analyses.

The KKT conditions are also referred to as first-order necessary conditions as they are based on first derivatives. Being derivative-based, these conditions only ensure a point is a local optimum and do not indicate global behavior. Additional constraints, such as convexity or conditions on second derivatives, are necessary to guarantee a point is a global minimizer. It is important to note that these are (partial) derivatives with respect to the decision variables. This differs from the sensitivity analysis in Chapter 9, which focused on derivatives with respect to auxiliary parameters.

Example 11.1. Univariate Risk Retention Conditions. Consider a single (univariate) risk $X$ and the risk retention function given in Section 2.2 that has parameters $\boldsymbol \theta = (d,c,u)$. For this example, let us take the coinsurance parameter $c=1$. For the risk measure, consider the expected shortfall $ES_{\alpha}[g(X);\boldsymbol \theta]$ given in equation (2.11). For simplicity, use the fair risk transfer cost $RTC(\boldsymbol \theta)$ $= \mathrm{E}(X) - \mathrm{E}[g(X;\boldsymbol \theta)]$, with an expression for $\mathrm{E}[g(X;\boldsymbol \theta)]$ in equation (2.7). Explicitly, consider the risk retention problem \[ \boxed{ \begin{array}{lll} {\small \text{minimize}}_{d,u} & f_0(d,u) =ES_{\alpha}[g(X);d,u] \\ {\small \text{subject to}} & f_{con,1}(d,u) = RTC(d,u) - RTC_{max} \le 0 & \\ & d \ge 0, ~~~~~~~~ u \ge d .\\ \end{array} } \] As in Section 3.5, we assume that the $RTC_{max}$ is sufficiently small so that the largest feasible deductible is smaller than the $\alpha$ quantile of the risk $X$, that is, $d_{max} < F^{-1}_{\alpha}$. Under this constraint, one can use the KKT conditions to verify that the optimal deductible is $d^*=0$.

$Under~the~Hood.$ Verify that the Optimal $d=0$

As with equation (3.10), the associated Lagrangian is

\[ \begin{array}{ll} LA\left(\boldsymbol \theta, {\bf LMI}\right) =ES_{\alpha}[g(X);\boldsymbol \theta] &+ LMI_1 [RTC(\boldsymbol \theta) - RTC_{max}] \\ &- LMI_2 ~d + LMI_3 (d-u) . \\ \end{array} \]

Taking partial derivatives, we have

\[ \begin{array}{ll} \partial_d LA\left(\boldsymbol \theta,{\bf LMI}\right) & =\partial_d ES_{\alpha}[g(X);\boldsymbol \theta] + LMI_1 ~\partial_d RTC(\boldsymbol \theta) +LMI_3 - LMI_2\\ \partial_u LA\left(\boldsymbol \theta,{\bf LMI}\right) & =\partial_u ES_{\alpha}[g(X);\boldsymbol \theta] + LMI_1 ~\partial_u RTC(\boldsymbol \theta) -LMI_3 , \\ \end{array} \]

where explicit expressions for $\partial_{\theta} ES_{\alpha}(g(X)$ and $\partial_{\theta} RTC(\boldsymbol \theta)$ are in Table 2.2. Henceforth, we evaluate expressions at the optimum and, for simplicity, drop the asterisk ($~^*$) superscript on $\boldsymbol \theta$ and ${\bf LMI}$.

From equation (2.7), we can express the risk transfer cost as

\[ {\small \begin{array}{rl} RTC(d, c=1, u) & = \mathrm{E}(X) - \mathrm{E}[g(X;d, c=1, u])] \\ & = \mathrm{E}(X) - \displaystyle\int_d^u [1 - F(x)]dx \\ & = \displaystyle\int_0^d [1 - F(x)]dx + \displaystyle\int_u^{\infty} [1 - F(x)]dx . \end{array} } \]

Now, suppose that $LMI_1 = 0$ so that, from the second partial derivative at the optimum, we have

\[ \partial_u LA\left(\boldsymbol \theta,{\bf LMI}\right) =\partial_u ES_{\alpha}[g(X);\boldsymbol \theta] -LMI_3 <0, \]

using Table 2.2. This violates the KKT conditions, so $LMI_1 > 0$.

Because $LMI_1 > 0$, the requirement that $RTC(\boldsymbol \theta) = RTC_{max}$ is binding. Further, because $RTC(d,c=1, u) = RTC_{max} < E(X)$, this means that $u>d$ (because $RTC(d,c, u=d)=\mathrm{E}(X)$). Thus, from the second KKT condition, we have $LMI_3 = 0$.

Because $u>d$ and with the KKT conditions, we have $\partial_u LA\left(\boldsymbol \theta,LMI_1\right) =0$. So, we can solve for $LMI_1$ as

\[ \begin{array}{ll} 0 &=\partial_u LA\left(\boldsymbol \theta,{\bf LMI}\right) = \partial_u ES_{\alpha}[g(X);\boldsymbol \theta] - LMI_1 [1-F(u)] \\ \Rightarrow & LMI_1 = \frac{\partial_u ES_{\alpha}[g(X);\boldsymbol \theta]}{1-F(u)} . \end{array} \]

From Table 2.2,

\[ \begin{array}{ll} &LMI_1 = \left\{ \begin{array}{cc} \frac{1}{1-\alpha} & F_{\alpha}^{-1} < d \\ \frac{1}{1-\alpha} & d \le F_{\alpha}^{-1} < u \\ \frac{1}{1-F(u)}& u \le F_{\alpha}^{-1} .\\ \end{array} \right. \\ \\ \end{array} \]

Thus,

\[ \begin{array}{ll} \partial_d LA\left(\boldsymbol \theta,{\bf LMI}\right) & = -\frac{1-F(d)}{1-\alpha} + LMI_1 [1-F(d)] - LMI_2\\ &= - LMI_2 +\left\{ \begin{array}{cc} -\frac{1-F(d)}{1-\alpha}+\frac{1}{1-\alpha} [1-F(d)] & F_{\alpha}^{-1} < d \\ -1+\frac{1}{1-\alpha} [1-F(d)] & d \le F_{\alpha}^{-1} < u \\ -1+\frac{1}{1-F(u)} [1-F(d)] & u \le F_{\alpha}^{-1} .\\ \end{array} \right. \\ &= - LMI_2 +\left\{ \begin{array}{cc} 0 & F_{\alpha}^{-1} < d \\ \frac{\alpha-F(d)}{1-\alpha} & d \le F_{\alpha}^{-1} < u \\ ~~~~~~~~~~~~~\frac{F(u)-F(d)}{1-F(u)}~~~~~~~~~~~~~& u \le F_{\alpha}^{-1} .\\ \end{array} \right. \\ \end{array} \]

From the KKT conditions, at the optimum we have $\partial_d LA\left(\boldsymbol \theta,LMI_1\right) =0$. Thus, $LMI_2>0$ except when $d >F_{\alpha}^{-1}$, a case ruled out by the problem set-up where $d \le d_{max} < F^{-1}_{\alpha}$, so $\partial_d LA\left(\boldsymbol \theta,LMI_1\right) >0$. Because $LMI_2>0$, from the second KKT condition we have a binding constraint and so the optimal $d=0$, as claimed.

11.2 Risk Retention Conditions for a Single Risk

I now extend Example 11.1 to incorporate coinsurance along with other risk measures. This section revisits the work that was done in Section 3.5, but presenting it within the KKT framework offers a more concise and elegant approach. The focus remains on a single (univariate) risk $X$ and the risk retention function detailed in Section 2.2 that has parameters $\boldsymbol \theta = (d,c,u)$. Initially, let us consider a generic risk measure $RM$ and risk transfer cost $RTC$. It is worth noting that when $d=u$, no losses are retained. To introduce some risk retention, assume that the maximal risk transfer cost $RTC_{max} < RTC(d,c, u=d)$.

Explicitly, consider the risk retention problem \[ \boxed{ \begin{array}{lll} {\small \text{minimize}} & f_0(\boldsymbol \theta) =RM[g(X);\boldsymbol \theta] \\ {\small \text{subject to}} & f_{con,1}(\boldsymbol \theta) = RTC(\boldsymbol \theta) - RTC_{max} \le 0 & \\ & f_{con,2}(\boldsymbol \theta) = c -1 \le 0 \\ & d \ge 0, c \ge 0, u \ge d .\\ \end{array} } \] Here, the constraint on $f_{con,2}$ ensures $c \le 1$.

Consistent with Table 2.2, I also assume: \[\begin{equation} \boxed{ {\small \begin{array}{llll} \partial_d RM[g(X);\boldsymbol \theta] \le0 & & \partial_d RTC(\boldsymbol \theta) > 0 \\ \partial_c RM[g(X);\boldsymbol \theta] > 0 & & \partial_c RTC(\boldsymbol \theta) < 0 \\ \partial_u RM[g(X);\boldsymbol \theta] \ge 0 & & \partial_u RTC(\boldsymbol \theta) < 0 . \\ \end{array} } } \tag{11.2} \end{equation}\]

For each parameter $\theta$ in the vector $\boldsymbol \theta$, define the risk measure relative marginal change \[ RM^2_{\theta} = - \frac{\partial_{\theta} RM[g(X);\boldsymbol \theta]}{\partial_{\theta} RTC(\boldsymbol \theta)}, \ \ \ \text{for } \theta = d, c, u, \] and, using Display (11.2), note that $RM^2_{\theta} \ge 0$ for each $\theta$.

At the optimum values of $\boldsymbol \theta^* = (d^*,c^*,u^*)$, one can use the KKT conditions to provide sufficient conditions so that the optimal values of parameters are on boundaries, as follows \[\begin{equation} \begin{array}{ccc} &&\text{if } \ \ \ RM^2_{u^*} > RM^2_{d^*}, \ \ \ \text{then } d^*=0 \\ \end{array} \tag{11.3} \end{equation}\] and \[\begin{equation} \begin{array}{ccc} && \text{if } \ \ \ RM^2_{u^*} > RM^2_{c^*}, \ \ \ \text{then } c^*=1. \\ \end{array} \tag{11.4} \end{equation}\]

$Under~the~Hood.$ Proof of the Single Risk Retention Results

From equation (3.10), the associated Lagrangian is

\[ \begin{array}{ll} LA\left(\boldsymbol \theta,\mathbf{LMI}\right) & =RM[g(X);\boldsymbol \theta] + LMI_1 (RTC(\boldsymbol \theta) - RTC_{max}) + LMI_2 (c-1) \\ & ~~~~~~~~~~~~~~~~~~ - LMI_3 d - LMI_4 c + LMI_5 (d-u) . \end{array} \]

Taking partial derivatives, we have

\[ {\small \begin{array}{llr} \partial_d LA\left(\boldsymbol \theta,\mathbf{LMI}\right) & =\partial_d RM[g(X);\boldsymbol \theta] + LMI_1 ~\partial_d RTC(\boldsymbol \theta) - LMI_3 + LMI_5 \\ \partial_c LA\left(\boldsymbol \theta,\mathbf{LMI}\right) & =\partial_c RM[g(X);\boldsymbol \theta] + LMI_1 ~\partial_c RTC(\boldsymbol \theta) - LMI_4 & (11.4a) \\ \partial_u LA\left(\boldsymbol \theta,\mathbf{LMI}\right) & =\partial_u RM[g(X);\boldsymbol \theta] + LMI_1 ~\partial_u RTC(\boldsymbol \theta) - LMI_5 . \\ \end{array} } \]

Henceforth, we evaluate expressions at the optimum and, for this proof, drop the asterisk ($~^*$) suffix on $\boldsymbol \theta$ and $LMI$.

The assumption $RTC_{max} < RTC(d,c, u=d)$ means that $u>d$. Thus, from the KKT conditions, we have $LMI_5 = 0$. With this and the third expression in Display (11.4a), we have

\[ 0=\partial_u LA\left(\boldsymbol \theta,\mathbf{LMI}\right) = \partial_u RTC(\boldsymbol \theta) (LMI_1 - RM^2_{u}) \Rightarrow LMI_1 = RM^2_{u}, \]

with our assumption that $\partial_{u} RTC(\boldsymbol \theta) < 0$.

From the first two expressions in Display (11.4a), we have

\[ {\small \begin{array}{ll} \partial_d LA\left(\boldsymbol \theta,\mathbf{LMI}\right) & =(-RM^2_{d} + RM^2_{u}) ~\partial_d RTC(\boldsymbol \theta) - LMI_3 \\ \partial_c LA\left(\boldsymbol \theta,\mathbf{LMI}\right) & =(-RM^2_{c} + RM^2_{u}) ~\partial_c RTC(\boldsymbol \theta) +LMI_2 - LMI_4. \\ \end{array} } \]

As in equation (11.3), assume that $RM^2_{u} > RM^2_{d}$. Then, with $\partial_d LA\left(\boldsymbol \theta,\mathbf{LMI}\right)=0$, we have

\[ {\small \begin{array}{ll} LMI_3 & =( RM^2_{u}-RM^2_{d}) ~\partial_d RTC(\boldsymbol \theta) >0 \\ \end{array} } \]

By the KKT conditions, this means that $d=0$, the desired result.

Next, as in equation (11.4), assume that $RM^2_{u} > RM^2_{c}$. Then, with $\partial_c LA\left(\boldsymbol \theta,\mathbf{LMI}\right)=0$, we have

\[ {\small \begin{array}{ll} LMI_2 & =( RM^2_{c}-RM^2_{u}) ~\partial_c RTC(\boldsymbol \theta) + LMI_4 > 0 \\ \end{array} } \]

By the KKT conditions, this means that $c=1$, the desired result.

To show how to apply these results, we explore the $VaR$ and $ES$ measures. The following Table 11.1 summarizes the $RM^2$ measures, providing an extension of Table 2.2.

Table 11.1. $RM^2$ for Value at Risk and Expected Shortfall \[\begin{equation*} {\small \begin{matrix} \begin{array}{l | ccc | c} \hline \text{Summary} & & \text{Parameter} (\theta) \\ \text{Measure} & d & c & u & \text{Range of } \alpha\\ \hline \begin{array}{c} RM^2_{\theta} \text{ for} \\ VaR\end{array} & \begin{array}{cc} 0 \\ \frac{1}{1-F(d)} \\ \frac{1}{1-F(d)} \\ \end{array} & \begin{array}{cc} 0 \\ \frac{F^{-1}_{\alpha}-d}{\mathrm{E} (X \wedge u)- \mathrm{E} (X \wedge d) } \\ \frac{u-d}{\mathrm{E} (X \wedge u)- \mathrm{E} (X \wedge d) } \end{array} & \begin{array}{cc} 0 \\ 0 \\ \frac{1}{1-F(u)} \\ \end{array} &\begin{array}{c} \alpha < F(d) \\ F(d) \le \alpha < F(u) \\ F(u) \le \alpha \\ \end{array} \\ \hline \begin{array}{c} RM^2_{\theta} \text{ for} \\ ES\end{array} & \begin{array}{c} \frac{1}{1-\alpha} \\ \frac{1}{1-F(d)} \\ \\ \\ \frac{1}{1-F(d)} \\ \end{array} & \begin{array}{c} \frac{1}{1-\alpha} \\ \frac{1}{1-\alpha} \left\{ \mathrm{E} (X \wedge u)- \mathrm{E} (X \wedge F_{\alpha}^{-1}) \right. \\ \left. +(1-\alpha)(F_{\alpha}^{-1} - d) \right\} / \\ \left\{ \mathrm{E} (X \wedge u)- \mathrm{E} (X \wedge d) \right\} \\ \frac{u-d}{\mathrm{E} (X \wedge u)- \mathrm{E} (X \wedge d) } \\ \end{array} & \begin{array}{c} \frac{1}{1-\alpha} \\ \frac{1}{1-\alpha} \\ \\ \\ \frac{1}{1-F(u)} \\ \end{array} & \begin{array}{c} \alpha < F(d) \\ F(d) \le \alpha < F(u) \\ \\ \\ F(u) \le \alpha \\ \end{array} \\ \hline \end{array} \end{matrix} } \end{equation*}\]

For both the $VaR$ and $ES$ risk measures, from Table 11.1 we see that if the optimal $u^*<F^{-1}_{\alpha}$, then the condition in Display (11.3) holds and the optimal deductible is $d^*=0$.

For the expected shortfall measure, suppose that the optimal deductible $d^*<F^{-1}_{\alpha}$. Then, from Table 11.1 we see that the condition in Display (11.3) holds and the optimal deductible is $d^*=0$. As in Section 3.5, we can assume that the $RTC_{max}$ is sufficiently small so that $d_{max} < F^{-1}_{\alpha}$, thus ensuring that $d^*<F^{-1}_{\alpha}$.

For the coinsurance parameter, let us again consider the case where the optimal $u^*<F^{-1}_{\alpha}$. Then, for both the value at risk and the expected shortfall, the condition in Display (11.4) can be expressed as \[ \begin{array}{ll} &\frac{1}{1-F(u^*)}=RM^2_{u^*} > RM^2_{c^*} = \frac{u^*-d^*}{\mathrm{E} (X \wedge u^*)- \mathrm{E} (X \wedge d^*) } \\ \iff &\mathrm{E} (X \wedge u^*)- \mathrm{E} (X \wedge d^*) > (u^*-d^*)[1-F(u^*)] . \end{array} \] Using \[ \begin{array}{ll} \mathrm{E} (X \wedge u^*)- \mathrm{E} (X \wedge d^*) &= \int^{u^*}_{d^*} [1-F(z)]dz \\ &> \int^{u^*}_{d^*}[1-F(u)]dz = (u^*-d^*)[1-F(u^*)] ,\\ \end{array} \] we see that this condition is satisfied at the optimum so the optimal parameter is $c^*=1$.

Readers are invited to explore extensions to the range value at risk, $RVaR$.

11.3 Risk Retention Conditions for Multiple Risks

We now use the KKT conditions introduced in Section 11.1 to investigate general conditions required for achieving an optimal risk portfolio.

11.3.1 Risk Retention Conditions

We start by looking at a risk retention problem in a fairly abstract way and then specialize to problems of interest to us. To be explicit, consider a slight modification of risk retention problem in Display (9.1) as follows. For simplicity, I dropped the auxiliary decision variable $z_0$ used in expected shortfall and use $RM$ for a risk measure instead of the generic $f_0$.

Risk Retention Problem \[\begin{equation} \begin{array}{lc} {\small \text{minimize}_{\boldsymbol \theta}} & RM( \boldsymbol \theta) \\ {\small \text{subject to}} & ~~~~~RTC(\boldsymbol \theta) \le RTC_{max} \\ & {\bf P} \boldsymbol \theta \le {\bf p}_0 ~~. \end{array} \tag{11.5} \end{equation}\]

To permit detailed analysis, define ${\bf P}_i$ to be the $i$th row of $\bf P$ and $p_{0,i}$ to be the $i$th element of ${\bf p}_0$. With this notation, the $(i+1)$st constraint is ${\bf P}_i ~ \boldsymbol\theta \le p_{0,i}$. Thus, we can write the Lagrangian is \[\begin{equation} \begin{array}{lc} LA( \boldsymbol \theta, {\bf LMI}) &= RM[g(\mathbf{X};\boldsymbol \theta)] + LMI_1[RTC(\boldsymbol \theta) - RTC_{max} ] \\ & ~~~~~~ +\sum_{i=1}^{m-1} LMI_{i+1} \left({\bf P}_i ~ \boldsymbol\theta - p_{0,i}\right) . \end{array} \tag{11.6} \end{equation}\] Using Display (11.1), the risk retention problem KKT conditions are

$KKT$ Conditions for the Risk Retention Problem \[\begin{equation} \begin{array}{ll} \partial_{\theta_j} ~ \left. LA\left(\boldsymbol \theta,\mathbf{LMI}^* \right) \right|_{\boldsymbol \theta=\boldsymbol \theta^*} = 0 & j=1, \ldots, p_z \\ LMI_1^* [RTC(\boldsymbol \theta^*) - RTC_{max} ] = 0 \\ LMI_{i+1}^* \left[{\bf P}_i ~ \boldsymbol\theta - p_{0,i}\right] = 0 & i=1, \ldots, m-1\\ LMI_i^* \ge 0 & i=1, \ldots, m \\ RTC(\boldsymbol \theta^*) \le RTC_{max} \\ {\bf P}_i ~ \boldsymbol\theta^* \le p_{0,i} & i=1, \ldots, m-1 .\\ \end{array} \tag{11.7} \end{equation}\]

Our interest is in summarizing behavior when the optimal value of the risk retention parameters are not on one of the edges defined by the vector constraint ${\bf P} \boldsymbol \theta \le {\bf p}_0$. So, for the $j$th decision variable, let us consider a typical row, say, the $i$th one, where $\theta_j$ is present. Mathematically, we can express this as $\partial_{\theta_j} {\bf P}_i \boldsymbol \theta \ne 0$ for some $\boldsymbol \theta$. For such a row, to ensure that the $j$th decision variable is not on an edge, we require ${\bf P}_i ~ \boldsymbol\theta^* < p_{0,i}$. To summarize, we require:

Edge Condition for the $j$th Decision Variable. For $i = 1, \ldots, m-1$, if $\partial_{\theta_j} {\bf P}_i \boldsymbol \theta \ne 0$ for some $\boldsymbol \theta$, then ${\bf P}_i ~ \boldsymbol\theta^* < p_{0,i}$.

As before, it is helpful to use the risk measure relative marginal change, \[ RM^2_j(\boldsymbol \theta) = - \frac{\partial_{\theta_j}RM[g(\mathbf{X};\boldsymbol \theta)]}{\partial_{\theta_j}RTC(\boldsymbol \theta)} . \] Problems of interest to us largely adhere to the following:

Condition RR1. $\partial_{\theta_j} ~ \left. RTC\left(\boldsymbol \theta \right) \right|_{\boldsymbol \theta=\boldsymbol \theta^*} \ne 0$.
Condition RR2. $\partial_{\theta_j} ~ \left. RM[g(\mathbf{X};\boldsymbol \theta)] \right|_{\boldsymbol \theta=\boldsymbol \theta^*} \ne 0$.

When the edge condition for the $j$th decision variables holds, there are opportunities to consider small changes in the parameter. When this happens, we want these small changes to affect both the risk transfer cost (Condition RR1) and the risk measure (Condition RR2). It is worth noting that the requirement $0<RM^2_j(\boldsymbol \theta^*) < \infty$ is sufficient for Conditions RR1 and RR2. We typically use this simpler condition in applications.

Binding Budget Constraint

Assume that the edge condition for the $j$th decision variable and the corresponding Conditions RR1 and RR2 hold. Then, $RTC(\boldsymbol \theta^*) = RTC_{max}$.

$Under~the~Hood.$ Confirm the Budget Constraint to be Binding

Because the edge condition for the $j$th decision variables holds, we can let $C_j$ be the set of rows where $\theta_j$ is present, that is,

\[ C_j = \{i: \partial_{\theta_j} {\bf P}_i \boldsymbol \theta \ne 0, \text{ for some } \boldsymbol \theta \}. \]

Then, with Conditions RR1 and RR2 and the third KKT condition, the associated Lagrange multiplier is zero, that is, $LMI_{i+1}^* = 0$ for $i \in C_j$. Thus, the Lagrangian is

\[ \begin{array}{lc} LA( \boldsymbol \theta, {\bf LMI}^*) &= RM[g(\mathbf{X};\boldsymbol \theta)] + LMI_1^*[RTC(\boldsymbol \theta) - RTC_{max} ] \\ & ~~~~~~ +\sum_{i \notin C_j} LMI_{i+1}^* \left({\bf P}_i ~ \boldsymbol\theta - p_{0,i}\right) . \end{array} \]

By definition, the third term on the right-hand, those $i \notin C_j$, do not contain $\theta_j$. Thus, taking a partial derivative of the Lagrangian and evaluating it at the optimum yields

\[ \begin{array}{ll} &0 = \partial_{\theta_j} ~ \left. LA\left(\boldsymbol \theta,{\bf LMI}^*\right) \right|_{\boldsymbol \theta=\boldsymbol \theta^*} \\ & = \partial_{\theta_j} ~ \left. RM[g(\mathbf{X};\boldsymbol \theta)] \right|_{\boldsymbol \theta=\boldsymbol \theta^*} + LMI_1^* ~\partial_{\theta_j} ~ \left. RTC(\boldsymbol \theta) \right|_{\boldsymbol \theta=\boldsymbol \theta^*} \\ \Rightarrow &LMI_1^* = -\left.\frac{\partial_{\theta_j}RM[g(\mathbf{X};\boldsymbol \theta)]}{\partial_{\theta_j}RTC(\boldsymbol \theta)}\right|_{\boldsymbol \theta=\boldsymbol \theta^*} = RM^2_j(\boldsymbol \theta^*) ~. \end{array} \]

By the fourth KKT assumption, the risk measure is non-negative at the optimum. Further, with Conditions RR1 and RR2, $LMI_1^* = RM^2_j(\boldsymbol \theta^*) > 0$. Thus, by the second KKT condition, the constraint $RTC(\boldsymbol \theta^*)=RTC_{max}$ is binding which is sufficient for the result.

Note that $LMI_1^* = RM^2_j(\boldsymbol \theta^*)$ for all $j$ for which the conditions hold. This then yields immediately the following result on a balance at the optimum.

If the budget constraint is binding, this reduces the feasible region where one searches for the optimal parameter values. If we have good reason to suspect that an algorithm will converge where one of the parameters has a positive $RM^2$ (so that both Conditions RR1 and RR2 hold), then we might get faster convergence by assuming the risk transfer constraint is binding.

Moreover, from the proof, one can observe that the optimal value of the Lagrange multiplier is $RM^2_j(\boldsymbol \theta^*)$. It is interesting to note that this holds for all retention parameters where the edge condition on the decision variables hold. This leads to the following.

Balance Among Retention Parameters at the Optimum

Assume that the edge conditions for the $i$th and $j$th decision variables and the corresponding Conditions RR1 and RR2 hold. Then, \[\begin{equation} RM^2_i(\boldsymbol \theta^*) = RM^2_j(\boldsymbol \theta^*). \tag{11.8} \end{equation}\]

This result was previously hinted at in Section 2.3 where, in the simpler context of a single risk with only a few parameters, we described the natural interpretation of the risk measure relative marginal change. Recall that $RM^2$ can be interpreted as measuring the marginal (negative) change in the risk measure per unit marginal change in the risk transfer cost. In other words, if a change in the $j$th parameter causes a unit change in the risk transfer cost, the $RM_j^2$ quantifies the amount of change in the measure of uncertainty. It seems reasonable that, at the optimum, we would strive for a balance among the parameter settings. Without this balance, a change in one parameter would result in a greater (or lesser) change in the objective function, given the same change in the risk transfer cost.

Example 11.2. Multivariate Excess of Loss with Variance as a Risk Measure. For the variance as a risk measure, we have \[ \begin{array}{ll} \partial_{\theta_j} RM[g(\mathbf{X};\boldsymbol \theta)] & = \partial_{\theta_j} \frac{1}{2} \mathrm{Var}[g(\mathbf{X};\boldsymbol \theta)] \\ & = \frac{1}{2} \partial_{\theta_j} \mathrm{E}[g(\mathbf{X};\boldsymbol \theta)]^2 - \mathrm{E}[g(\mathbf{X};\boldsymbol \theta)] \{\partial_{\theta_j} \mathrm{E}[g(\mathbf{X};\boldsymbol \theta)]\}\\ & = \mathrm{E}[g(\mathbf{X};\boldsymbol \theta)\partial_{\theta_j}g({\bf x};\boldsymbol \theta)] - \mathrm{E}[g(\mathbf{X};\boldsymbol \theta)] \{ \mathrm{E}[\partial_{\theta_j}g({\bf x};\boldsymbol \theta)]\} .\\ \end{array} \] For excess of loss, we require that upper limit parameters $u$ be nonnegative. So, the constraint $u_j \ge 0$ is equivalent to ${\bf P}_j ~ \boldsymbol\theta \le p_{0,j}$ by taking ${\bf u} = \boldsymbol\theta$, ${\bf P}_j = -{\bf 1}_j'$, and $p_{0,j}=0$.

For the multivariate excess of loss, we take $g({\bf X};{\bf u}) = X_{1} \wedge u_1 + \cdots + X_{p} \wedge u_p = S({\bf u})$. With this, we have $\partial_{u_j}g({\bf x};{\bf u}) = I(X_{j} > u_j)$. Thus, \[\begin{equation} \begin{array}{ll} RM_j^2({\bf u}) &= -\frac{\partial_{u_j} RM[S({\bf u})]}{\partial_{u_j} RTC({\bf u})} \\ &= \mathrm{E}[S({\bf u}) | X_{j} > u_j] - \mathrm{E}[S({\bf u})] \\ & = u_j - \mathrm{E}[X_j \wedge u_j ] + \sum_{i \ne j}^p \left\{ \mathrm{E}[X_i \wedge u_i | X_{j} > u_j] - \mathrm{E}[X_i \wedge u_i ]\right\} . \\ \end{array} \tag{11.9} \end{equation}\] In the case of independence, this reduces to $RM_j^2=u_j - \mathrm{E}[X_j \wedge u_j ]$, the result for the classical problem that we developed in Section 4.1.2.

$Under~the~Hood.$ Confirm the $RM^2$ for Example 11.2

At the optimum, if for some risk $u^*_j >0$ and $0<RM_j^2({\bf u}^*) < \infty$, then the budget constraint is binding. In addition, if for another risk $u^*_i >0$ and the risk measure relative marginal change is positive and finite, then we have a balance $RM_i^2({\bf u}^*)=RM_j^2({\bf u}^*)$.

Example 11.3. Bivariate Excess of Loss with $VaR$ as a Risk Measure. For excess of loss with two risks, the retained losses are the limited sum $g(\mathbf{X};\boldsymbol \theta) = S(u_1,u_2)$ $=X_1 \wedge u_1 + X_2 \wedge u_2$ where $\boldsymbol \theta = (u_1,u_2)^{\prime}$. For simplicity, we assume a fair risk transfer cost.

$Under~the~Hood.$ Check the $VaR$ Balance

Using equation (11.8), the balance among retention parameters can be expressed as \[ \begin{array}{ll} I(u_1^*<z_0^*)f_2(z_0^*-u_1^*)\frac{1-C_1[F_2(z_0^*-u_1^*),F_1(u_1^*)]}{1-F_1(u_1^*)}\\ ~~~~~~~~~~~~~= I(u_2^*<z_0^*)f_1(z_0^*-u_2^*)\frac{1-C_1[F_1(z_0^*-u_2^*),F_2(u_2^*)]}{1-F_2(u_2^*)} . \end{array} \] at the optimal value of $z_0^*=F_{S(u_1^*,u_2^*)}^{-1}(\alpha)$. Here, $C_1$ is a copula derivative defined in Appendix Section 14.1. From this expression, we see that the balance depends on the copula and hence on the dependence between risks. Further, both optimal retention levels $u_1^*$ and $u_2^*$ must be less than the optimal value $F_{S(u_1^*,u_2^*)}^{-1}(\alpha)$. Yet we need $u_1^*+u_2^* > F_{S(u_1^*,u_2^*)}^{-1}(\alpha)$. In addition:

Special Case of Independence. To get further insights, it is also of interest to consider the special case where the risks are independent. Here, $C_1(u,v)=v$ and we have \[ \begin{array}{ll} I(u_1^*<z_0^*)f_2(z_0^*-u_1^*) = I(u_2^*<z_0^*)f_1(z_0^*-u_2^*). \end{array} \]
Special Case of Identical Distributions. As another interesting case, if the distributions are equal, then we have \[ \begin{array}{ll} I(u_1^*<z_0^*)f(z_0^*-u_1^*)\frac{1-C_1[F(z_0^*-u_1^*),F(u_1^*)]}{1-F(u_1^*)} \\ ~~~~~~~~~~~~~= I(u_2^*<z_0^*)f(z_0^*-u_2^*)\frac{1-C_1[F(z_0^*-u_2^*),F(u_2^*)]}{1-F(u_2^*)} . \end{array} \] This naturally holds true if $u_1^*=u_2^*$ regardless of the dependence.

Example 11.4. Separable Contracts and Risk Measures. Now consider the case where the portfolio risk transfer costs can be subdivided into separate contracts; see equation (7.10) of Section 7.5.1 where we wrote $g(\mathbf{X}; \boldsymbol \theta) = \sum_{j=1}^p g_j(X_j; \boldsymbol \theta_j)$. From this, let us assume that one can write the risk transfer cost as $RTC(\boldsymbol \theta) = \sum_{j=1}^p RTC_j( \boldsymbol \theta_j)$. In the same way, we also assume that the risk measure can be subdivided additively as $RM(\boldsymbol \theta) = \sum_{j=1}^p RM_j( \boldsymbol \theta_j)$. With these assumptions, we may write $RM_j^2 = -\partial_{\theta_j}RM_j(\theta_j) / \partial_{\theta_j}RTC_j(\theta_j)$.

Suppose that the $j$th contract has an upper limit form (excess of loss). Then, from Table 11.1, \[ RM_j^2 = \left\{ \begin{array}{ll} \frac{1}{1-F_j(u_j)}I[F_j(u_j-) \le \alpha] & \text{for } VaR \\ \min\left(\frac{1}{1-\alpha},\frac{1}{1-F_j(u_j)} \right) & \text{for } ES . \\ \end{array} \right. \] Under the mild conditions required for equation (11.8), this suggests that $F_j(u_j^*)$ is the same. That is, at the optimum, all positive upper limits have the same quantile.

The assumption of separable risk measures in Example 11.4 is unlikely to hold in general. However, it is true when using, for example, $VaR$ and $ES$, as risk measures and when risks are comonotonic (see Section 9.1 or, for a broader introduction, Denuit et al. (2006), Chapter 2). Comonotonicity represents extreme positive dependence and so is unlikely to hold in practice. However, it may be useful to make this assumption to quickly generate upper limit values and then use these as starting values for general situations.

11.3.2 Risk Retention Boundary Conditions

To simplify the presentation, this subsection only considers edge conditions of the form $\theta_j \ge 0$. In this context, one might wonder under what conditions is one of the retention parameters equal to 0? To illustrate, suppose that it is known that $\theta_1^*>0$ and we would like conditions that lead to $\theta_2^* = 0$. To this end, from equation (11.6) \[ \begin{array}{ll} \partial_{\theta_2} LA(\boldsymbol \theta, {\bf LMI}) &= ~\partial_{\theta_2}RM(\boldsymbol \theta) +LMI_1 ~\partial_{\theta_2}RTC(\boldsymbol \theta) - LMI_3\\ &= \partial_{\theta_2}RTC(\boldsymbol \theta)\left(\frac{\partial_{\theta_2}RM(\boldsymbol \theta)}{\partial_{\theta_2}RTC(\boldsymbol \theta)} ~ +RM^2_1(\boldsymbol \theta^*) \right) - LMI_3\\ &= \partial_{\theta_2}RTC(\boldsymbol \theta)\left(RM_1^2-RM_2^2 \right) - LMI_3.\\ \end{array} \] So, at the optimum, if $\partial_{\theta_2}RTC(\boldsymbol \theta)\left(RM_1^2-RM_2^2\right)$ is positive, by the first KKT condition, this means that $LMI_3^*$ is positive. By the third condition, this means that $\theta_2^* = 0$. We summarize this as follows.

Risk Retention Boundary Conditions

Assume $\theta_i^* > 0$ for the $i$th decision variable and the corresponding Conditions RR1 and RR2 hold. In addition, suppose that either

$\partial_{\theta_j}RTC(\boldsymbol \theta) >0$ and $RM_i^2 > RM_j^2$ or
$\partial_{\theta_j}RTC(\boldsymbol \theta) <0$ and $RM_i^2 < RM_j^2$,

at $\boldsymbol \theta=\boldsymbol \theta^*$. Then $\theta_j^* = 0$.

In summary, from equation (11.8), we have a collection of $RM^2$ ratios that are the same. If a ratio is not equal to the collective number, then there are mild conditions so that the corresponding retention parameter is zero.

In many of our problems, we focus on cases where an increase in a parameter $\theta$ means that more risk is retained, such as with upper limits and coinsurances. More risk retained means that as $\theta$ increases, we expect risk transfer costs to decrease and our (risk) measure of retained risk to increase. However, for other problems, such as deductibles, an increase in a parameter means that less risk is retained.

Example 11.5. Multivariate Deductible with Variance as a Risk Measure. As in Example 11.2, the partial derivative of the risk measure is \[ \begin{array}{ll} \partial_{\theta_j} RM[g(\mathbf{X};\boldsymbol \theta)] & = \mathrm{E}[g(\mathbf{X};\boldsymbol \theta)\partial_{\theta_j} g_{j}(\mathbf{X};\boldsymbol \theta)] - \mathrm{E}[g(\mathbf{X};\boldsymbol \theta)] \{ \mathrm{E}[\partial_{\theta_j} g_{j}(\mathbf{X};\boldsymbol \theta)]\} .\\ \end{array} \] For the multivariate deductible, we take $d_j = \theta_j$ and \[ \begin{array}{ll} g({\bf X};{\bf d}) &= (X_{1}-d_1)_+ + \cdots + (X_{p} - d_p)_+ \\ &= [X_1 - X_{1} \wedge d_1] + \cdots + [X_p - X_{p} \wedge d_p ] =S(\boldsymbol \infty) -S({\bf d}). \end{array} \] With this, we have $\partial_{d_j}g({\bf x};{\bf d}) = -I(X_{j} > d_j)$. Further calculations show \[\begin{equation} \begin{array}{ll} RM_j^2({\bf d}) &= -\frac{\partial_{d_j} RM[S({\bf d})]}{\partial_{d_j} RTC({\bf d})} \\ & = d_j - \mathrm{E}[X_i \wedge d_j ] + \sum_{i \ne j}^p \left\{ \mathrm{E}[X_i \wedge d_i | X_{j} > d_j] - \mathrm{E}[X_i \wedge d_i ]\right\} . \\ \end{array} \tag{11.10} \end{equation}\] This is the same $RM^2$ measure as in Example 11.2. The only difference between the two problems is the sign of the partial derivative of the risk transfer cost, $\partial_{\theta_j} RTC(\boldsymbol \theta)$. So, when we take a look at the edge condition for the deductible problem, a small value of $RM_j^2({\bf d}^*)$ at the optimum means that $d_j = 0$, signifying full retention.

$Under~the~Hood.$ Confirm the $RM^2$ for Example 11.5

Suppose that all of the parameters adhere to the assumption that retained risk increases with a parameter increase. Additionally, if all of the parameters equal 0, that is, if $\theta_1 = \cdots = \theta_p = 0$, then there is no risk retention, corresponding to full transfer (which could be full insurance). This is illustrated in Examples 11.6 - 11.9 of the subsequent Section 11.4.1. In this scenario, we can set the maximal risk transfer cost below the case of no risk retention, $RTC_{max} < RTC(\mathbf{0})$, ensuring that we have some positive retention parameters. Under this assumption, at least one $\theta_j^*>0$, satisfying one of the basic ingredients for risk retention conditions to hold.

Re-parameterization

The result of a parameter at a boundary has a natural interpretation, such as full transfer for upper limit and coinsurance parameters, and full retention for deductible parameters. In some cases, analysts can re-parameterize ( or redefine) parameters to achieve alternative desirable interpretations.

For instance, if the jth risk has upper limit $u_j < \infty$, then we may re-parameterize the deductible as $d_{1j} = (u_j - d_j)_+$, representing the amount that the deductible $d_j$ falls below the upper limit. The new parameter $d_{1j}$ has zero risk retention when $d_{1j}=0$ and aligns with our objective of taking on more risk as the parameter increases. In such cases, we can interpret the boundary condition result $\theta_j^* = 0$ to indicate no retention, or full transfer, of the jth risk.

The re-parameterization tactic can be applied to identify conditions for full risk retention in other scenarios. For example, in a multivariate excess of loss policy, if an upper limit $u_j$ is associated with a risk, then $u_j=0$ implies no risk retention, whereas $u_j=\infty$ signifies full retention. Therefore, we might define a new parameter, the reciprocal of the upper limit $u_j^R = 1/u_j$. We could then identify conditions so that the boundary conditions hold so that $u_j^R = 0$, signifying full risk retention.

11.4 Risk Measure Relative Marginal Changes

As seen in Section 11.3, the risk measure relative marginal change, $RM_j^2$, plays a key role in the investigation of risk transfer conditions. Because practical applications involves simulation of multivariate risks, this section develops $RM_j^2$ metrics in this context.

Separate Contract

Because the focus of $RM^2_j$ is on marginal changes, we can extend slightly the idea of separable contracts introduced in Section 7.5.1 to specify that only the jth contract is separate. Mathematically, define $\boldsymbol \theta_{(j)}= (\theta_1, \ldots, \theta_{j-1}, \theta_{j+1}, \ldots, \theta_p)'$ to be the vector of parameters excluding the jth one and similarly for ${\bf X}_{(j)}$. Then, we could write the retention function for the jth risk as \[ \tilde{g}_j(X_j; \theta_j) = g({\bf X};\boldsymbol \theta) - g_{(j)}({\bf X}_{(j)};\boldsymbol \theta_{(j)}) , \] where $g_{(j)}(\cdot)$ is the retention function for all risks except the jth one. Unlike in Section 7.5.1, we do not require that $g_{(j)}(\cdot)$ be additive.

If the jth risk is separate, then determination of partial derivatives of risk retention become easier as they rely on only one parameter. Table 11.2 summarizes this calculation for different types of retentions.

Table 11.2. Separate Risk Retention Functions and Partial Derivatives

\[ {\small \begin{array}{l|l|l} \hline \text{Retention Type} & \text{Retention Function} & \text{Partial Derivative} \\ & ~~~~~~~~~ g({\bf X}; \theta) & ~~~~~~~~~ \partial_{\theta} ~g({\bf X}; \theta) \\ \hline \text{Deductible} & g(x; d) ~~~~= (x-d)_+ & \partial_d ~g(x; d) ~~~~= - I(x>d)\\ \text{Coinsurance} & g(x; c) ~~~~= c ~x & \partial_c ~g(x; c) ~~~~=x\\ \text{Upper Limit} & g(x; u) ~~~~= x \wedge u & \partial_u ~g(x; u)~~~~= I(x>u)\\ \text{Upper Limit with } & g(x; u_R)~= x \wedge \frac{1}{u_R} & \partial_{u_R} ~g(x; u_R)= \frac{-1}{u_R^2} I(x>\frac{1}{u_R})\\ ~~~~\text{Reciprocal Parameter} \\ \hline \end{array} } \]

11.4.1 Variance Risk Retention

Using the variance as a risk measure, we are able to utilize the usual empirical simulation estimators of the distribution because the variance is differentiable in the risk retention parameters. Readers will not be surprised that we can get some intuitively pleasing results using the classic assumption of the variance as a risk measure. To recap, here is a summary of the simulation version of this risk retention problem. \[\begin{equation} \boxed{ \begin{array}{lc} {\small \text{minimize}_{\boldsymbol \theta}} & RM_{var}(\boldsymbol \theta) =\frac{1}{2R} \sum_{r=1}^R \left\{g({\bf X}_{r};\boldsymbol \theta)^2 -\overline{g({\bf X}_{R};\boldsymbol \theta)}\right\}^2\\ {\small \text{subject to}} & ~~~~~RTC_R(\boldsymbol \theta) \le RTC_{max} . \end{array} } \tag{11.11} \end{equation}\] For notation, let $g_{rj} =\partial_{\theta_j}g({\bf X}_{r};\boldsymbol \theta)$ and $\bar{g}_{j}=\frac{1}{R} \sum_{r=1}^R g_{rj}$. Also, let the average risk retained be denoted as $\overline{g({\bf X}_{R};\boldsymbol \theta)}$. With this, we can express the risk measure relative marginal statistic as \[\begin{equation} \begin{array}{ll} RM_{var,j}^2(\boldsymbol \theta) &= -\frac{\partial_{\theta_j} ~RM_{var}(\boldsymbol \theta) }{\partial_{\theta_j}RTC_R(\boldsymbol \theta)} = \overline{g({\bf X}_{R};\boldsymbol \theta)} - {\large \frac{ \mathrm{E}_R[ g({\bf X}_{R};\boldsymbol \theta) g_{Rj}] }{\bar{g}_{j}} } .\\ \end{array} \tag{11.12} \end{equation}\] Here, the second term on the right-hand side of equation (11.12) is a weighted average of retained risks where the weights are given by the partial derivatives $g_{rj}$. It is worth noting that if we are comparing $RM_i^2$ to $RM_j^2$ to determine the balance in the system, only the second term is important as the first term, the average retained risk, to the same for the $i$th and $j$th measures.

$Under~the~Hood.$ Confirm the $RM^2$ for the Simulation Variance Risk Measure

Interpretations of this result are best seen in the context of some special cases.

Example 11.6. Coinsurance. For the $j$th risk, take $c_j = \theta_j$. Thus, from Table 11.2 we have $g_{rj} = X_{rj}$. With equation (11.12), in this case we can express the risk measure relative marginal change as \[ \begin{array}{ll} RM_{var,j}^2 = \overline{g({\bf X}_{R};\boldsymbol \theta)} - {\large \frac{ \mathrm{E}_R[ g({\bf X}_{R};\boldsymbol \theta) X_{rj}] }{\overline{X}_{Rj}} } .\\ \end{array} \] Thus, the $j$th risk measure relative marginal change is the average retained risk minus a weighted average retained risk where the weights are given by the size of the $j$th risk.

Example 11.7. Excess of Loss. Now let $u_j = \theta_j$. Using Table 11.2, we have $g_{rj} = I(X_{rj} > u_j)$. Then, equation (11.12) becomes \[ \begin{array}{ll} RM_{var,j}^2 & =\overline{g({\bf X}_{R};\boldsymbol \theta)} - {\large \frac{ \mathrm{E}_R[ g({\bf X}_{R};\boldsymbol \theta) g_{Rj}] }{\bar{g}_{j}} } \\ &= \overline{g({\bf X}_{R};\boldsymbol \theta)} - \frac{ \sum_{r=1}^R g({\bf X}_{r};\boldsymbol \theta) I(X_{rj} > u_j) ] }{ \sum_{r=1}^R I(X_{rj} > u_j)} .\\ \end{array} \] The $j$th risk measure relative marginal change can be expressed as the overall average minus average portfolio risk of those losses that exceeds the upper limit.

11.4.2 Quantile-Based Risk Retention

Using the KKT conditions yields desirable properties that, when applied to risk retention problems, can be described in terms of the risk measure relative marginal change, $RM^2$. However, both the KKT conditions and $RM^2$ metric are based on marginal derivatives that rely on smoothness in terms of risk retention parameters. However, basic simulation techniques assign a weight of $1/R$ to each simulated outcome, and when used with typical check or indicator functions, the result is that simulated approximations are no longer smooth.

Unlike the variance, the usual empirical simulation approximations of quantile-based risk measures such as the $VaR$ and the $ES$ are not differentiable in the risk retention parameters. Therefore, technically, we need to use smooth estimators of the distribution introduced in Section 7.3, with properties developed in Section 10.5.3. Recall the density, distribution function, and general expectations from those sections as given in Table 11.3.

Table 11.3. Kernel Smoothed Expressions

\[ {\small \begin{array}{l|c|l} \hline\hline \textbf{Term} & \textbf{Symbol} & ~~~~~~~~\textbf{Expression} \\ \hline \text{Density} & f_{Rk}(y) &= \frac{1}{R~b} \sum_{r=1}^R k\left(\frac{y - g({\bf X}_r; \boldsymbol \theta)}{b}\right) \\\hline \text{Distribution function} & F_{Rk}(y) &= \frac{1}{R} \sum_{r=1}^R K\left(\frac{y - g({\bf X}_r; \boldsymbol \theta)}{b}\right) \\ &&~~~~= \mathrm{E}_R \left\{ K\left(\frac{y - g({\bf X}_R; \boldsymbol \theta)}{b}\right)\right\} \\\hline \text{General expectation} &\mathrm{E}_{Rk}\{h[g({\bf X}_R; \boldsymbol \theta)]\} & ={\LARGE \int} \mathrm{E}_{R} \left\{ h[g({\bf X}; \boldsymbol \theta) + bz] \right\} ~k(z) dz \\ ~~~\text{of retained risk}&\\ \hline \text{Partial derivative of the}& F_{Rk,\theta_j}(y) &= \frac{1}{R} \sum_{r=1}^R \partial_{\theta_j}K\left(\frac{y - g({\bf X}_r; \boldsymbol \theta)}{b}\right) \\ ~~~\text{distribution function}& \\ ~~~\textit{wrt} \text{ a retention parameter} &&~~~~= \frac{-1}{b} \mathrm{E}_{R} \left\{ k\left(\frac{y - g({\bf X}_R; \boldsymbol \theta)}{b}\right) g_{Rj} \right\} \\\hline\hline \end{array} } \] From these expressions, one can determine the quantile, or value at risk, in the usual way and is denoted as $VaR_{Rk}$. In addition, the above display provides a partial derivative of the distribution function with respect to a retention parameter which is needed in the following.

Table 11.1: Silly. Create a table just to update the counter…
x
2

For the $VaR$, recall that in Section 5.3.1 we developed an expression for the quantile sensitivity but that development assumed smoothness in the argument of the distribution function as well as the risk retention parameters. This result in equation (5.5), with a smoothed empirical estimator of the distribution, can be expressed as \[ \begin{array}{ll} \partial_{\theta_j} VaR_{Rk}(\boldsymbol \theta) &= \frac{-1}{f_{Rk}[VaR_{Rk}]} F_{Rk,\theta_j}[VaR_{Rk}] . \end{array} \] We also need partial derivatives with respect to the risk transfer cost. From the general expectation for retained risks, one can see that $\partial_{\theta_j}~RTC_R(\boldsymbol \theta) = - \bar{g}_j$.

$Under~the~Hood.$ Confirm the Partial Derivative of $RTC$

Thus, risk measure relative marginal change for the value at risk is \[ \begin{array}{ll} RM_{VaR,j}^2(\boldsymbol \theta) &= -\frac{\partial_{\theta_j} ~VaR_R(\boldsymbol \theta) }{\partial_{\theta_j}RTC_R(\boldsymbol \theta)} \\ &= \frac{1}{\bar{g}_j ~f_{Rk}[VaR_{Rk}]} F_{Rk,\theta_j}[VaR_{Rk}] . \end{array} \]

For the expected shortfall, we use the $ES$ sensitivity in equation (5.7). With this, we have \[ \begin{array}{ll} \partial_{\theta_j} ~ES_{Rk}[g(\mathbf{X};\theta)] \\ ~~~~=\frac{1}{1-\alpha} \left\{\partial_{\theta_j}\mathrm{E}_{Rk}[g(X;\boldsymbol \theta)] -\left.\partial_{\theta_j}\mathrm{E}_{Rk}[g(X;\boldsymbol \theta) \wedge z_0]\right|_{z_0=VaR_{Rk}}\right\} \\ ~~~~~~~~~~~~~ + [1 - \frac{1}{1-\alpha} \{1-F_{Rk}(VaR_{Rk})\}] \times \partial_{\theta_j} VaR_R(\boldsymbol \theta) . \end{array} \] Because the distribution function is smooth, in many cases there is no discreteness at the quantile, so $F_{Rk}(VaR_{Rk})=\alpha$, and the second term on the right-hand side is zero. Nonetheless, sometimes discreteness is induced by the risk retention parameters, as seen in Example 2.6. With no discreteness, the risk measure relative marginal change for the expected shortfall can be written as \[ \begin{array}{llr} RM_{ES,j}^2(\boldsymbol \theta) &=\frac{1}{(1-\alpha)} {\Large \frac{\mathrm{E}_R \left\{\tilde{k}_{R\theta} ~g_{Rj}\right\} }{\bar{g}_j} }~~,&(11.12ES)\\ \end{array} \] where $\tilde{k}_{R\theta}= 1-K\left([VaR_{Rk}-g({\bf X}_R; \boldsymbol \theta)]/b\right)$. For the $r$th simulated value, $\tilde{k}_{r\theta}$ represents the probability that the reference (such as the normal) distribution exceeds the argument $[VaR_{Rk}-g({\bf X}_r; \boldsymbol \theta)]/b$. Thus, large values of $g({\bf X}_r; \boldsymbol \theta)$ mean that $\tilde{k}_{r\theta}$ is large, so it can be interpreted as a measure of the size of the retained risk.

$Under~the~Hood.$ Confirm the Expected Shortfall $RM^2$

Example 11.8. Coinsurance. As in Example 11.6, take $c_j = \theta_j$ and $g_{rj} = X_{rj}$ for the $j$th risk. In this case, equation $(11.12ES)$ becomes \[ \begin{array}{ll} RM_{ES,j}^2(\boldsymbol \theta) &=\frac{1}{(1-\alpha)} \frac{1}{\bar{g}_j} \mathrm{E}_R \left\{\tilde{k}_{R\theta} ~g_{Rj}\right\} ~~,\\ &=\frac{1}{(1-\alpha)} {\large \frac{\sum_{r=1}^R \tilde{k}_{r\theta} ~X_{rj} }{\sum_{r=1}^R X_{rj}} }~~.\\ \end{array} \] This is proportional to the weighted average of the measure of size of retained risk, where the weight is given by the $j$th loss.

Example 11.9. Upper Limit. As in Example 11.7, take $u_j = \theta_j$ and $g_{rj} = I(X_{rj} > u_j)$ for the $j$th risk. Now, equation $(11.12ES)$ becomes \[ \begin{array}{ll} RM_{ES,j}^2(\boldsymbol \theta) &=\frac{1}{(1-\alpha)} {\large \frac{\sum_{r=1}^R \tilde{k}_{r\theta} ~I(X_{rj} > u_j) }{\sum_{r=1}^R I(X_{rj} > u_j) } }~~.\\ \end{array} \] This is proportional to the average of the measure of size of retained risk, where the average is taken over large values of the $j$th risk, those exceeding the upper limit $u_j$.

Table 11.2: Silly. Create a table just to update the counter…
x
2

Table 11.3: Silly. Create a table just to update the counter…
x
2

Example 11.10. Varying the Cyber Risk Premium. This example continues from Exercise 8.7, which analyzed the ANU Excess of Loss problem with a market loading $Cyber~Load$ given in Table 11.4. In Exercise 8.7, readers were tasked with minimizing the expected shortfall and determining optimal upper limits from which risk measures were computed.

Table 11.4 displays selected optimal retention levels. As in Exercise 8.7, the results are pleasing. A lower value of $Cyber~Load$ (at half the fair cost) suggests full insurance, indicated by the optimal upper limit $u_3^*=0$. Conversely, a high value of $Cyber~Load$ (at double the fair cost) extends the optimal upper limit$u_3^*$ well beyond the 99th risk percentile, suggesting full retention. In this example, we supplement this analysis by computing the $ES$ risk measure relative marginal change ($RM^2$).

To assess the simulation accuracy of the $RM^2$, note that it can be expressed as the ratio of two averages. The theory underpinning the standard errors are sketched out in Exercise 11.3. For this example, the standard errors that appear in Table 11.4 (with $R=100000$) enable the assessment of accuracy.

Code for Example 11.10

Table 11.4: **ANU Cyber Risk Retention**
Cyber Load	$VaR$	$u_3$: Cyber	$RM_3^2$	$se(RM_3^2)$	$u_2$	$u_4$
0.5	6093	0	0.395	0.005	175	288
1.0	6226	432	0.327	0.005	186	252
1.5	6357	1506	0.295	0.008	198	144
2.0	6361	5633	0.338	0.055	204	145
2.5	6364	5848	0.386	0.014	204	147
3.0	6364	5848	0.321	0.011	204	147

In addition, Table 11.5 shows risk measure relative marginal changes for other risks $RM^2_j$. Corresponding standard errors were also computed although not displayed, they showed approximately the same level as for the cyber risk displayed in Table 11.4. One can see a reasonable balance among $RM^2$ metrics for most risks (although not the 13th and 15th) for the $Cyber~Load$ 1.0 and 1.5. For other values of the $Cyber~Load$ as well as risks 13 and 15 (corresponding to Motor Vehicle and Marine Hull), there is a lack of balance suggesting that the corresponding upper limits may be in doubt.

Table 11.5: **ANU Cyber Risk Measure Relative Marginal Change**
Cyber Load	$RM^2_{1}$	$RM^2_{2}$	$RM^2_3$: Cyber	$RM^2_{4}$	$RM^2_{5}$	$RM^2_{6}$	$RM^2_{9}$	$RM^2_{10}$	$RM^2_{13}$	$RM^2_{15}$
0.5	0.233	0.303	0.395	0.325	0.317	0.310	0.328	0.329	0.267	0.336
1.0	0.234	0.303	0.327	0.319	0.317	0.310	0.324	0.330	0.245	0.386
1.5	0.235	0.290	0.295	0.309	0.306	0.304	0.309	0.319	0.290	0.353
2.0	0.233	0.291	0.338	0.308	0.307	0.304	0.308	0.300	0.290	0.361
2.5	0.233	0.291	0.386	0.309	0.307	0.303	0.308	0.300	0.254	0.353
3.0	0.233	0.291	0.321	0.308	0.307	0.303	0.309	0.299	0.258	0.353

11.5 Supplemental Materials

11.5.1 Further Resources and Reading

See, for example, Boyd and Vandenberghe (2004) and Simon and Blume (1994), for fascinating introductions to constrained optimization and more details about the Lagrange multiplier method.

11.5.2 Exercises

Exercise 11.1. de Finetti Optimal Retention Proportions. Consider the quota share agreement described in Section 4.3.2. Here, the insurer’s portion of the portfolio risk is $Y_{insurer} = \sum_{i=1}^n c_i X_i$ $= \mathbf{c}^{\prime} \mathbf{X}$. We seek to find those values of $c_i$ that minimize $\mathrm{Var}(Y_{insurer})$ subject to the constraint that $\mathrm{E}(Y_{reinsurer}) = RTC_{max}$. Subject to this budget constraint, the insurer wishes to minimize the uncertainty of the retained risks as measured by the variance. We now further impose constraints that the sharing coefficients are bounded between 0 and 1, as considered by De Finetti (1940) (although he assumed independence among risks).

See also Glineur and Walhin (2006) and Pressacco, Serafini, and Ziani (2011) for additional background. \[ \boxed{ \begin{array}{lc} {\small \text{minimize}}_{c_1, \ldots, c_n} & \frac{1}{2} \mathrm{Var} (Y_{insurer}) = \frac{1}{2}\mathbf{c}^{\prime} \boldsymbol \Sigma \mathbf{c} \\ {\small \text{subject to}} & \sum_i (1-c_i)\mathrm{E} (X_i) \le RTC_{max} \\ & 0 \le c_i \le 1, \ \ \ \ \ \ i=1, \ldots, n, \end{array} } \]

a. Put the problem into standard form and develop the Lagrangian. Henceforth, assume zero correlations among risks and write the variance of the $i$th risk as $\sigma_i^2$. Verify that the KKT conditions can be expressed as: \[\begin{equation} \boxed{ \begin{array}{cl} \sigma_i^2 c_i^* - LMI_1^*~ [\mathrm{E} ~X_i] + LMI_{2i}^* -LMI_{3i}^* = 0 & i=1, \ldots, n \\ LMI_{2i}^*\ge 0, LMI_{3i}^* \ge 0 & i=1, \ldots, n \\ LMI_{2i}^* (c_i^*-1) = 0, LMI_{3i}^* c_i^* = 0 & i=1, \ldots, n \\ 0 \le c_i^* \le 1 & i=1, \ldots, n \\ LMI_1^* \ge 0, \sum_i (1-c_i^*)\mathrm{E} (X_i) \le RTC_{max} & \\ LMI_1^*\left( \sum_i (1-c_i^*)\mathrm{E} (X_i) - RTC_{max} \right) = 0 & \\ \end{array} } \tag{11.13} \end{equation}\]

b. To get some practice, let us make some assumptions about the optimal coefficients and use the KKT conditions to verify how we expect the Lagrange multipliers to behave. Verify: \[\begin{equation} \boxed{ \begin{array}{c|cc} \text{Suppose} & \textit{KKT }\text{Results} \\ \hline 0 < c_i^* < 1 &LMI_{2i}^* = LMI_{3i}^*= 0 & \sigma_i^2 c_i^* - LMI_1^*~ [\mathrm{E} ~X_i] = 0 \\ c_i^* =0 & LMI_{2i}^* = LMI_{3i}^*= 0 & LMI_1^*= 0\\ c_i^* =1 &LMI_{3i}^* = 0& LMI_{2i}^* = LMI_1^*~ [\mathrm{E} ~X_i] - \sigma_i^2 . \\ \end{array} } \tag{11.14} \end{equation}\]

c. For another approach, let us make some assumptions about the Lagrange multipliers and use the KKT conditions to verify how we expect the optimal coefficients to behave. Verify: \[\begin{equation} {\scriptsize \boxed{ \begin{array}{cl|cl} \text{Suppose} & &\textit{KKT }\text{Results} \\ \hline LMI_{3i}^* > 0 & &c_i^* =0 &LMI_{2i}^* = 0\\ LMI_{2i}^* > 0 & &c_i^* =1 &LMI_{3i}^* = 0\\ LMI_{2i}^* = LMI_{3i}^* = 0 && \sigma_i^2 c_i^* - LMI_1^*~ [\mathrm{E} ~X_i] = 0 \\ \hline LMI_1^* = 0 & &LMI_{2i}^* = LMI_{3i}^* = 0 &c_i^* =0\\ \hline LMI_1^* > 0 & LMI_{3i}^* > 0 & \text{violates conditions} \\ LMI_1^* > 0 & LMI_{2i}^* > 0 & c_i^* =1 & \sigma_i^2- LMI_1^*~ [\mathrm{E} ~X_i] \\ & & & ~~~+ LMI_{2i}^* = 0\\ LMI_1^* > 0 & LMI_{2i}^* = &\sigma_i^2 c_i^* - LMI_1^*~ [\mathrm{E} ~X_i] = 0 & \\ & ~~LMI_{3i}^* = 0&& \\ \end{array} } \tag{11.15} } \end{equation}\]

d. Check the following. To determine the optimal coefficients, we first find the value of Lagrange multiplier $LMI_1^*$ as the solution of this equation: \[\begin{equation} \sum_i \mathrm{E} (X_i) - \sum_i \min\left\{1,\frac{LMI_1^*~ [\mathrm{E} ~X_i]}{\sigma_i^2 } \right\}\mathrm{E} (X_i)= RTC_{max} \tag{11.16} \end{equation}\] and then determine the optimal coefficients as \[\begin{equation} c_i^* = \min\left\{1,\frac{LMI_1^*~ [\mathrm{E} ~X_i]}{\sigma_i^2 } \right\}. \tag{11.17} \end{equation}\]

Show the Exercise 11.1 Solution

To put this in standard form, we use the inequalities

\[ \begin{array}{lll} f_{1,in}(\mathbf{c}) &= (\mathrm{E}~\mathbf{X} )^{\prime}(\mathbf{1}-\mathbf{c}) - RTC_{max} \le 0 \\ f_{2,in, i}(\mathbf{c}) &= c_i -1 \le 0, & i=1, \ldots, n \\ f_{3,in, i}(\mathbf{c}) &= -c_i \le 0, & i=1, \ldots, n \end{array} \]

With this, the Lagrangian is

\[ \begin{array}{ll} LA &= \frac{1}{2}\mathbf{c}^{\prime} \boldsymbol \Sigma \mathbf{c} + LMI_1 [(\mathrm{E}~\mathbf{X} )^{\prime}(\mathbf{1}-\mathbf{c}) - RTC_{max}] + \sum_{i=1}^n LMI_{2,i} ~(c_i-1) - \sum_{i=1}^n LMI_{3,i}~ c_i . \end{array} \]

Taking derivatives, we have

\[ \begin{array}{ll} \frac{\partial}{\partial \mathbf{c}}LA &= \boldsymbol \Sigma \mathbf{c} - LMI_1~ [\mathrm{E} ~\mathbf{X}] + {\bf LMI}_2 - {\bf LMI}_3 . \\ \Rightarrow & \frac{\partial}{\partial c_i}LA = {\bf 1}_i '\boldsymbol \Sigma \mathbf{c} - LMI_1~ [\mathrm{E} ~X_i] + LMI_{2i} -LMI_{3i}. \\ \end{array} \]

Note that, with zero correlations, we have ${\bf 1}_i '\boldsymbol \Sigma \mathbf{c} = \sigma_i^2 c_i$.

Assuming zero correlations, the KKT conditions are as in Display (11.13).

Now suppose $0 < c_i^* < 1$. Then, from KKT, we have $LMI_{2i}^* = LMI_{3i}^*= 0$ and

\[ \begin{array}{cl} \sigma_i^2 c_i^* - LMI_1^*~ [\mathrm{E} ~X_i] = 0 \\ \Rightarrow LMI_1^* = \frac{\sigma_i^2 c_i^*}{\mathrm{E} ~X_i}. \end{array} \]

Suppose $ c_i^* =0$. Then, from KKT, we have $LMI_{2i}^* = 0$ and

\[ \begin{array}{cl} - LMI_1^*~ [\mathrm{E} ~X_i] -LMI_{3i}^* = 0 \\ \Rightarrow LMI_{3i}^* = - LMI_1^*~ [\mathrm{E} ~X_i] \le 0 \\ \Rightarrow LMI_{3i}^* = 0 . \end{array} \]

Suppose $c_i^* =1$. Then, from KKT, we have $LMI_{3i}^* = 0$ and

\[ \begin{array}{cl} \sigma_i^2- LMI_1^*~ [\mathrm{E} ~X_i] + LMI_{2i}^* = 0 \\ \Rightarrow LMI_{2i}^* = LMI_1^*~ [\mathrm{E} ~X_i] - \sigma_i^2 . \\ \end{array} \]

We summarize these results in Display (11.14).

Again, start with the KKT conditions in Display (11.13).

Suppose $LMI_{3i}^* > 0$. Then, from KKT, we have $c_i^* = 0$. In the same way, if $LMI_{2i}^* > 0$. Then, we have $c_i^* = 1$. So, apparently, we cannot have both $LMI_{3i}^* > 0$ and $LMI_{2i}^* > 0$.

How about the case $LMI_{2i}^* = LMI_{3i}^* = 0$? For this case, we have

\[ \begin{array}{cl} \sigma_i^2 c_i^* - LMI_1^*~ [\mathrm{E} ~X_i] = 0 \\ \Rightarrow LMI_1^* = \frac{\sigma_i^2 c_i^*}{\mathrm{E} ~X_i}. \end{array} \]

Now suppose $LMI_{1}^* = 0$. Then,

\[ \begin{array}{cl} \sigma_i^2 c_i^* + LMI_{2i}^* -LMI_{3i}^* = 0 & i=1, \ldots, n \\ \end{array} \]

If in addition $LMI_{3i}^* > 0$, then $c_i^* = 0$ meaning that $LMI_{2i}^* = LMI_{3i}^* >0$ which cannot happen.
If in addition $LMI_{2i}^* > 0$, then $c_i^* = 1$ and $LMI_{3i}^* = 0$. Thus, $LMI_{2i}^* = -\sigma_i^2$ which cannot happen.
Thus, if $LMI_{1}^* = 0$, then $LMI_{2i}^* = LMI_{3i}^* = 0$ meaning that $c_i^* = 0$.

Now suppose $LMI_{1}^* > 0$. Then,

If in addition $LMI_{3i}^* > 0$, then $c_i^* = 0$, $LMI_{2i}^* = 0$, and $- LMI_1^*~ [\mathrm{E} ~X_i] -LMI_{3i}^* = 0$. This violates the KKT conditions.
If in addition $LMI_{2i}^* > 0$, then $c_i^* = 1$, $LMI_{3i}^* = 0$, and $\sigma_i^2- LMI_1^*~ [\mathrm{E} ~X_i] + LMI_{2i}^* = 0$.
If in addition $LMI_{2i}^* = LMI_{3i}^* = 0$, then $\sigma_i^2c_i^* - LMI_1^*~ [\mathrm{E} ~X_i] = 0$.

These results are summarized in Display (11.15).

From Display (11.15), no feasible solution is likely to exist when $LMI_{1}^* = 0$. When $LMI_{1}^* > 0$, the risk transfer condition is binding so that $\sum_i (1-c_i^*)\mathrm{E} (X_i)$ $= RTC_{max}$. This suggests finding $LMI_1^*$ as the solution of the equation (11.16).

\[ \begin{array}{cl} \sigma_i^2 c_i^* - LMI_1^*~ [\mathrm{E} ~X_i] = 0 \\ \Rightarrow c_i^* = \frac{LMI_1^*~ [\mathrm{E} ~X_i]}{\sigma_i^2 } \\ \sum_i \mathrm{E} (X_i) - \sum_i c_i^*\mathrm{E} (X_i)= RTC_{max} \\ \sum_i \mathrm{E} (X_i) - \sum_i \min\left\{1,\frac{LMI_1^*~ [\mathrm{E} ~X_i]}{\sigma_i^2 } \right\}\mathrm{E} (X_i)= RTC_{max} \end{array} \]

To check this, let us examine what happens when the ratio is large.

\[ \begin{array}{ccccc} \text{if } \frac{LMI_1^*~ [\mathrm{E} ~X_i]}{\sigma_i^2 } \ge 1 & \text{then } LMI_1^*~ [\mathrm{E} ~X_i] \ge \sigma_i^2 > \sigma_i^2 c_i^*\\ LMI_1^*~ [\mathrm{E} ~X_i] - \sigma_i^2 c_i^* >0 & \Rightarrow LMI_{2,i}^*~ >0 \text{ and } c_i^* = 1\\ \end{array} \]

In the same way, let us examine the case when the ratio is small.

\[ \begin{array}{ccccc} \text{if } \frac{LMI_1^*~ [\mathrm{E} ~X_i]}{\sigma_i^2 } < 1 & \text{then } LMI_1^*~ [\mathrm{E} ~X_i] < \sigma_i^2 \\ \text{we can find } c_i^* \text{ such that } LMI_1^*~ [\mathrm{E} ~X_i] = \sigma_i^2 c_i^* & \Rightarrow LMI_{2,i}^*~ =0\\ \end{array} \]

\[ \boxed{ \begin{array}{cl|cl} \text{Suppose} & &\textit{KKT }\text{Results} \\ \hline LMI_1^* >0 & LMI_1^*~ [\mathrm{E} ~X_i] - \sigma_i^2 > 0 & LMI_{2i}^* > 0 & c_i^* =1 \\ LMI_1^* >0 &LMI_1^*~ [\mathrm{E} ~X_i] - \sigma_i^2 \le 0 &LMI_{2i}^* = 0 &0 < c_i^* <1\\ \end{array} } \]

Exercise 11.2. Lasso Regression Conditions. To get additional practice with the KKT conditions, we now apply them to the lasso regression problem introduced in Exercise 3.1. Similar to Exercise 3.1, we consider the problem \[ {\small \boxed{ \begin{array}{lc} {\small \text{minimize}}_{\boldsymbol \beta} & f_{0,ls}(\boldsymbol \beta) = \frac{1}{2} \sum_{i=1}^n (y_i - {\bf x}_i^{\prime} \boldsymbol \beta)^2 \\ {\small \text{subject to}} & \sum_{j=1}^p |\beta_j | \le c_{lasso} , \end{array} } } \] with observations $({\bf x}_i, y_i)$, for $i=1,\ldots, n$. If $c_{lasso}$ is large enough, then the constraint has no effect and the resulting estimator is $\hat{\boldsymbol \beta}^{ols}$ $= \left[ {\bf X}^{\prime}{\bf X}\right]^{-1} {\bf X}'{\bf y}$. where ${\bf y}$ is a vector of responses and ${\bf X} = ({\bf x}_1', \ldots, {\bf x}_n')'$ is a matrix of explanatory variables. So, we choose $c_{lasso}$ to be smaller than $\sum_{j=1}^p |\hat{\beta}_j^{ols} |$ so the constraint has some effect.

As is common for problems involving absolute values, we can linearize the constraints by taking positive and negative parts. Specifically, define the positive part $\beta_j^+ = \max(0,\beta_j)$ and the negative part $\beta_j^- = \max(0,-\beta_j)$. Thus, the problem becomes \[ {\small \boxed{ \begin{array}{lc} {\small \text{minimize}}_{\boldsymbol \beta^+,\boldsymbol \beta^-} & f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) = \frac{1}{2} \sum_{i=1}^n [y_i - {\bf x}_i^{\prime} (\boldsymbol \beta^+ - \boldsymbol \beta^-)]^2 \\ {\small \text{subject to}} & \sum_{j=1}^p (\beta_j^+ +\beta_j^- ) \le c_{lasso} . \end{array} } } \] a. Confirm that the Lagrangian can be expressed as \[ \begin{array}{ll} LA(\boldsymbol \beta^+,\boldsymbol \beta^-) = & f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) +LMI_1[ \sum_{j=1}^p (\beta_j^+ +\beta_j^- ) -c_{lasso}] \\ &- \sum_{j=1}^p LMI_{1+j}^+ ~\beta_j^+ - \sum_{j=1}^p LMI_{1+j}^- ~\beta_j^- . \end{array} \]

b. Show that \[ \partial_{\boldsymbol \beta^+}~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) = -{\bf X}^{\prime}{\bf e} = -\partial_{\boldsymbol \beta^-}~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) , \] where ${\bf e}={\bf y} - {\bf X}\boldsymbol \beta$ is a vector of residuals.
c. At the optimum, show that $\beta_j^+ >0$ implies $\beta_j^- =0$ and vice-versa, that $\beta_j^- >0$ implies $\beta_j^+ =0$. Thus, there is no contradiction, as one would hope.
d. Let ${\bf x}^{(j)}$ be the $j$th column of $\bf X$. At the optimum, show that ${\bf x}^{(j)'}{\bf e}= 0$ implies $\beta_j = 0$.

Show the Exercise 11.2 Solution

Using equation (3.10), the Lagrangian can be expressed as

\[ \begin{array}{ll} LA(\boldsymbol \beta^+,\boldsymbol \beta^-) = & f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) +LMI_1[ \sum_{j=1}^p (\beta_j^+ +\beta_j^- ) -c_{lasso}] \\ &- \sum_{j=1}^p LMI_{1+j}^+ ~\beta_j^+ - \sum_{j=1}^p LMI_{1+j}^- ~\beta_j^- . \end{array} \]

For the KKT conditions, we can use

\[ \begin{array}{ll} \partial_{\boldsymbol \beta^+}~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) &= \frac{1}{2} \sum_{i=1}^n \partial_{\boldsymbol \beta^+}~[y_i^* - {\bf x}_i^{\prime} \boldsymbol \beta^+ ]^2 \\ &= - \sum_{i=1}^n ~[y_i^* - {\bf x}_i^{\prime} \boldsymbol \beta^+ ]{\bf x}_i \\ &= - \sum_{i=1}^n ~[y_i - {\bf x}_i^{\prime} (\boldsymbol \beta^+ - \boldsymbol\beta^- )]{\bf x}_i \\ &= -{\bf X}^{\prime}{\bf y} + {\bf X}^{\prime}{\bf X}(\boldsymbol \beta^+ - \boldsymbol\beta^- ) \\ &= -{\bf X}^{\prime}\left[{\bf y} - {\bf X}(\boldsymbol \beta^+ - \boldsymbol\beta^- ) \right] = -{\bf X}^{\prime}\left[{\bf y} - {\bf X}\boldsymbol \beta \right] \\ &= -{\bf X}^{\prime} {\bf e} ,\\ \end{array} \]

where $y_i^* =y_i + {\bf x}_i^{\prime} \boldsymbol \beta^-$. To ease notation, let the $j$th row of this be denoted as $\nabla^+_j f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) = \nabla^+_j f_{0,ls}$.

In the same way, with $y_i^{**} =y_i - {\bf x}_i^{\prime} \boldsymbol \beta^+$

\[ \begin{array}{ll} \partial_{\boldsymbol \beta^-}~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) &= \frac{1}{2} \sum_{i=1}^n \partial_{\boldsymbol \beta^-}~[y_i^{**} + {\bf x}_i^{\prime} \boldsymbol \beta^- ]^2 \\ &= \sum_{i=1}^n ~[y_i^{**} + {\bf x}_i^{\prime} \boldsymbol \beta^- ]{\bf x}_i \\ &= \sum_{i=1}^n ~[y_i - {\bf x}_i^{\prime} (\boldsymbol \beta^+ - \boldsymbol\beta^- )]{\bf x}_i \\ &= - \partial_{\boldsymbol \beta^+}~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) . \end{array} \]

c. Taking derivatives of the Lagrangian yields

\[ \begin{array}{ll} \partial_{\beta_j^+}~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) &=\nabla^+_j f_{0,ls} +LMI_1 - LMI_{1+j}^+ \\ \partial_{\beta_j^-}~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) &=\nabla^-_j f_{0,ls} +LMI_1 - LMI_{1+j}^- . \end{array} \]

By making $c_{lasso}$ sufficiently small, the constraint on the sum of absolute values is active, meaning that $LMI_1^*>0$ at the optimum.

Further, if $\beta_j^+ >0$ at the optimum, $LMI_{1+j}^+ = 0$. In this case, $LMI_1=-\nabla^+_j f_{0,ls}$. This means that $-\nabla^+_j f_{0,ls} = \nabla^-_j f_{0,ls} >0$ so that when we look at

\[ \begin{array}{ll} \partial_{\beta_j^-}~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) =\nabla^-_j f_{0,ls} +LMI_1 - LMI_{1+j}^-, \end{array} \]

we have $LMI_{1+j}^->0$, meaning that $\beta_j^- =0$. Thus, there is no contradiction; a value of $\beta_j^+ >0$ implies $\beta_j^- =0$, as one would hope.

d. If ${\bf x}^{(j)'}{\bf e}=\nabla^+_j f_{0,ls}= 0$, then $\partial_{\beta_j^+}~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) =LMI_1 - LMI_{1+j}^+ =0$ by the KKT conditions. Because $LMI_1>0$, we have $LMI_{1+j}^+>0$ and so, by the KKT conditions, $\beta_j^+=0$. In the same way, $\beta_j^-=0$ which is sufficient for the result.

Exercise 11.3. Linearity of Lasso Regression Conditions. This exercise is motivated by Exercise 3.27 of Hastie, Tibshirani, and Friedman (2009). Similar to Exercises 3.1 and 11.2, we consider an equivalent way of writing the problem is using the penalized version \[ \begin{array}{cc} {\small \text{minimize}}_{\boldsymbol \beta^+,\boldsymbol \beta^-} & f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) + \lambda \sum_{j=1}^p (\beta_j^+ +\beta_j^- ) . \end{array} \] The Lagrangian has essentially the same form as in Exercise 11.2 \[ \begin{array}{ll} LA(\boldsymbol \beta^+,\boldsymbol \beta^-) = & f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) +\lambda \sum_{j=1}^p (\beta_j^+ +\beta_j^- ) \\ &- \sum_{j=1}^p LMI_{1+j}^+ ~\beta_j^+ - \sum_{j=1}^p LMI_{1+j}^- ~\beta_j^- , \end{array} \] and so the solution set is the same.

a. At the optimum, show that $\beta_j \ne 0$ implies $\lambda =- sgn(\beta_j) ~{\bf x}^{(j)'}\left[{\bf y} - {\bf X}\boldsymbol \beta \right]$.
b. At the optimum, show that the Hessian is \[ \begin{array}{ll} \nabla^2 LA &= \left(\begin{array}{cc} {\bf X}^{\prime}{\bf X} & -{\bf X}^{\prime}{\bf X}\\ -{\bf X}^{\prime}{\bf X} & {\bf X}^{\prime}{\bf X} \end{array} \right) .\\ \end{array} \]

c. Suppose that we perturb the penalty parameter $\lambda$ slightly and solve the problem for $\lambda_{\delta}=\lambda + \delta$. Call the perturbed solution $\boldsymbol \beta_{\delta}$. Then, if $sgn(\beta_j)>0$ show that \[ \begin{array}{ll} \delta & = {\bf x}^{(j)'}{\bf X} \left[\boldsymbol \beta_{\delta} -\boldsymbol \beta \right] \\ \end{array} \]

d. At the optimum, use the Perturbation Sensitivity Proposition from Section 10.1 to show that \[ - {\bf 1}_{A}= [\nabla^2 LA]_A ~\partial_{\lambda} (\boldsymbol \beta^+,\boldsymbol \beta^-)_A , \] where the subscript $A$ means that we restrict the rows and columns to be those in the active set (where the corresponding parameters are not zero). Thus, the vector of derivatives $\partial_{\lambda} (\boldsymbol \beta^+,\boldsymbol \beta^-)_A$ is constant in the penalty parameter $\lambda$, meaning that the solution is locally linear. (See Rosset and Zhu (2007) for extensions of this idea to more general settings.)

Show the Exercise 11.3 Solution

a. Taking a derivative of the Lagrangian, we have

\[ \begin{array}{ll} \partial_{\beta_j^+}~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) &=\nabla^+_j f_{0,ls} +\lambda - LMI_{1+j}^+ \\ &= {\bf x}^{(j)'}{\bf e} +\lambda - LMI_{1+j}^+ . \end{array} \]

If $\beta_j^+>0$, then $LMI_{1+j}^+ =0$ by the KKT conditions and so $\lambda =-{\bf x}^{(j)'}{\bf e}$ which is sufficient for the result. The case of $\beta_j^->0$ can be established in the same way.

\[ \begin{array}{ll} \partial_{\boldsymbol \beta^+}~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) & = -{\bf X}^{\prime}\left[{\bf y} - {\bf X}\boldsymbol \beta \right] \\ \end{array} \]

b. Starting with the relationship

\[ \partial_{\boldsymbol \beta^+}~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) = -{\bf X}^{\prime}\left[{\bf y} - {\bf X}\boldsymbol \beta \right] =- ~\partial_{\boldsymbol \beta^-}~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) , \]

the Hessian is

\[ \begin{array}{ll} \nabla^2 LA &= \left(\begin{array}{cc} \partial_{\boldsymbol \beta^+}\partial_{\boldsymbol \beta^+}'~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) & \partial_{\boldsymbol \beta^+}\partial_{\boldsymbol \beta^-}'~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) \\ \partial_{\boldsymbol \beta^-}\partial_{\boldsymbol \beta^+}'~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) & \partial_{\boldsymbol \beta^-} \partial_{\boldsymbol \beta^-}'~LA(\boldsymbol \beta^+,\boldsymbol \beta^-) \end{array} \right)\\ &= \left(\begin{array}{cc} \partial_{\boldsymbol \beta^+}\partial_{\boldsymbol \beta^+}'~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) & \partial_{\boldsymbol \beta^+}\partial_{\boldsymbol \beta^-}'~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) \\ \partial_{\boldsymbol \beta^-}\partial_{\boldsymbol \beta^+}'~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) & \partial_{\boldsymbol \beta^-} \partial_{\boldsymbol \beta^-}'~f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) \end{array} \right)\\ &= \left(\begin{array}{cc} {\bf X}^{\prime}{\bf X} & -{\bf X}^{\prime}{\bf X}\\ -{\bf X}^{\prime}{\bf X} & {\bf X}^{\prime}{\bf X} \end{array} \right) .\\ \end{array} \]

c.

\[ \begin{array}{ll} \delta & = \lambda_{\delta}-\lambda \\ & = -{\bf x}^{(j)'}\left[{\bf y} - {\bf X}\boldsymbol \beta_{\delta} \right] +{\bf x}^{(j)'}\left[{\bf y} - {\bf X}\boldsymbol \beta \right] \\ & = {\bf x}^{(j)'}{\bf X} \left[\boldsymbol \beta_{\delta} -\boldsymbol \beta \right] \\ \end{array} \]

d. Perturbation Sensitivity Proposition from Section 10.1, we have

\[ -f_{0,\theta a}(\boldsymbol \theta,a) = SLA_{\theta\theta}\left[\boldsymbol \theta,a\right] \partial_{a} \boldsymbol \theta(a) \] where $a=\lambda$, $\boldsymbol \theta=(\boldsymbol \beta^+,\boldsymbol \beta^-)$,

\[ \begin{array}{ll} f_{0,\theta a}(\boldsymbol \theta,a) &= \partial_{\lambda}\partial_{\boldsymbol \theta}\left\{ f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) +\lambda \sum_{j=1}^p (\beta_j^+ +\beta_j^- )\right\} \\ &= \partial_{(\boldsymbol \beta^+,\boldsymbol \beta^-)} \sum_{j=1}^p (\beta_j^+ +\beta_j^- ) , \\ \end{array} \]

and

\[ \begin{array}{ll} SLA_{\theta\theta}\left[\boldsymbol \theta,a\right] &= \partial_{\boldsymbol \theta} \partial_{\boldsymbol \theta'}\left\{ f_{0,ls}(\boldsymbol \beta^+,\boldsymbol \beta^-) +\lambda \sum_{j=1}^p (\beta_j^+ +\beta_j^- )\right\} \\ &= \nabla^2 LA . \\ \end{array} \]

Restricting the rows and columns to be those in the active set (where the corresponding parameters are not zero) is sufficient for part d.

Exercise 11.4. Standard Errors of Ratios. To determine the accuracy of the $RM^2$ in Example 11.10, note that it can be expressed as the ratio of two averages. Especially with simulation applications, it can be useful to understand the reliability of this type of statistic.

To this end, let $\{ (x_{11},x_{21}), \ldots, (x_{1R},x_{2R})$ be a random sample of size $R$. Summarize this sample with means $\bar{x}_1$ and $\bar{x}_2$, sample variances $s_1^2$ and $s_2^2$, coefficient of variations $CV_1 = s_1/\bar{x}_1$ and $CV_2 = s_2/\bar{x}_2$ and correlation coefficient $r_{12} = [(R-1) s_1 ~ s_2]^{-1} \left[\sum_{r=1}^R (x_{1r}-\bar{x}_1)(x_{2r}-\bar{x}_2)\right]$. It is known from the mathematical statistic literature (see, for example, Levy and Lemeshow (2013)), that a desirable approximate standard error for the ratio of the means is \[ se\left(\frac{\bar{x}_1}{\bar{x}_2}\right) = \frac{(\bar{x}_1/\bar{x}_2)}{\sqrt{R}} \sqrt{CV_1^2 + CV_2^2 - 2r_{12}CV_1 CV_2 } . \]

a. For the risk measure relative marginal change in equation $(11.12ES)$, identify the two variables $x_1$ and $x_2$ that you would use to compute the standard error.
b. Suppose that both $x_1$ and $x_2$ are binary variables (as in the upper limit special case). Show that $s_j^2 = \bar{x}_j(1-\bar{x}_j)/R$, for $j=1,2$, and the correlation coefficient reduces to \[ r_{12} \approx \frac{\bar{x}_{12} - \bar{x}_1\bar{x}_2}{\sqrt{ \bar{x}_1(1-\bar{x}_1) ~ \bar{x}_2(1-\bar{x}_2)}} , \] where $\bar{x}_{12} = (\sum_{r=1}^R x_{1r}x_{2r})/R$.
c. Suppose further that $x_{1r}=0 \implies x_{2r}=0$. Then show that the correlation coefficient can be expressed in an odds ratio form \[ r_{12} = \sqrt{\frac{(1- \bar{x}_1)/\bar{x}_1}{ (1-\bar{x}_2)/\bar{x}_{2}}} . \]

Show the Exercise 11.4 Solution

Exercise 11.5. Quantile Regression Approach. This is a follow-up to Exercise 7.1 where we explored the relationship between quantile regression and the auxiliary version of expected shorted fall. In this exercise, we defined the objective functions \[ \begin{array}{ll} f_{QR,0}(z_0, \boldsymbol \theta) &= \mathrm{E}_R~ \phi_{\alpha}[g({\bf X};\boldsymbol \theta)-z_0] \\ f_{ES,0}(z_0, \boldsymbol \theta) &=z_0 + \frac{1}{(1-\alpha)} \mathrm{E}_R [g({\bf X};\boldsymbol \theta) - z_0]_+ \end{array} \] and showed their relationship through the expression \[ f_{ES,0}(z_0, \boldsymbol \theta) = \mathrm{E}_R ~g({\bf X};\boldsymbol \theta) + \frac{1}{1-\alpha} f_{QR,0}(z_0, \boldsymbol \theta) . ~~~~(7.12) \] Now, assume that we only have a budget constraint of the form $RTC(\boldsymbol \theta) \le RTC_{max}$.

a. Show that the Lagrangians derivatives of the two problems can be expressed as the following. \[ \begin{array}{ll} \partial_{\theta_j} LA_{QR} &= ~~~~~~~~~\partial_{\theta_j} f_{QR,0}(z_0, \boldsymbol \theta) - \bar{g}_j ~~~LMI_{1,QR}\\ \partial_{\theta_j} LA_{ES} &= \frac{1}{1-\alpha} \left\{\partial_{\theta_j} f_{QR,0}(z_0, \boldsymbol \theta) - \bar{g}_j (1-\alpha)(LMI_{1,ES}-1) \right\} .\\ \end{array} \]

b. Interpret this to mean that the two algorithms will converge to the same solutions, so analysts need only implement one of them. Moreover, it will be the case that generally the two approaches will have the same level of computational difficulty.

Show the Exercise 11.5 Solution

11.5.3 Appendix. Establishing the KKT Conditions

A proof to establish the KKT conditions for a general setting is cumbersome and not very insightful. Fortunately, there are several good resources where readers can access the details, including Boyd and Vandenberghe (2004), Simon and Blume (1994), and Nocedal and Wright (2006).

In the following, we provide sketches of proofs in two important special cases, of a single equality and inequality constraints. These sketches provide some useful intuition into the underpinnings of the KKT conditions.

11.5.3.1 Single Equality Constraint

We now restrict considerations to a single equality constraint so that the problem in Display (3.9) reduces to \[ \boxed{ \begin{array}{lcc} {\small \text{minimize}} & f_0({\bf z}) & \\ {\small \text{subject to}} & f_{con,1}({\bf z}) = 0 & \\ \end{array} } \] and corresponding Lagrangian \[ LA\left({\bf z},LME_1\right) = f_0({\bf z}) + LME_1 ~f_{con,1}({\bf z}) . \] In this setting, the Display (11.1) reduces to \[ \boxed{ \begin{array}{ll} \partial_{\theta_i} ~ \left. LA\left({\bf z},LME_1^* \right) \right|_{{\bf z}={\bf z}^*} = 0 & i=1, \ldots, p \\ f_{con,1}({\bf z}^*) = 0 & . \\ \end{array} } \] These are the classical conditions that appear in many beginning economics and calculus courses.

$Under~the~Hood.$ Show the Justification of KKT Conditions for a Single Equality Constraint

11.5.3.2 Single Inequality Constraint

We now restrict considerations to a single inequality constraint so that the problem in Display (3.9) reduces to \[ \boxed{ \begin{array}{lcc} {\small \text{minimize}} & f_0({\bf z}) & \\ {\small \text{subject to}} & f_{con,1}({\bf z}) \le 0 & \\ \end{array} } \] and corresponding Lagrangian \[\begin{equation} LA\left({\bf z},LMI\right) = f_0({\bf z}) + LMI ~f_{con,1}({\bf z}) . \tag{11.18} \end{equation}\] In this setting, the Display (11.1) reduces to \[ \boxed{ \begin{array}{ll} \partial_{\theta_i} ~ \left. LA\left({\bf z},LMI^* \right) \right|_{{\bf z}={\bf z}^*} = 0 & i=1, \ldots, p \\ LMI^* \times f_{con,1}({\bf z}^*) = 0 & \\ LMI^* \ge 0 \\ f_{con,1}({\bf z}^*) \le 0 . & \\ \end{array} } \]

$Under~the~Hood.$ Show the Justification of KKT Conditions for a Single Inequality Constraint

Sketch of Proof:

As in the Section 11.5.3.1 sketch of the proof, we first seek conditions so that a small step $s$ decreases the objective function while retaining feasibility. As before, for decreasing the objective function, we have $f_0({\bf z}+s) < f_0({\bf z})$. With a Taylor series expansion, this means

\[ f_{0}({\bf z} + s) \approx f_{0}({\bf z}) + \nabla f_{0}({\bf z})^{\prime} s \\ \Rightarrow \nabla f_{0}({\bf z})^{\prime} s <0. \]

For feasibility, we now require that $f_{con,1}({\bf z}) \le 0$ and $f_{con,1}({\bf z} + s) \le 0$. Using a Taylor series expansion, we have

\[ f_{con,1}({\bf z} + s) \approx f_{con,1}({\bf z}) + \nabla f_{con,1}({\bf z})^{\prime} s . \]

Taken together, we seek

\[\begin{equation} \nabla f_{0}({\bf z})^{\prime} s <0, \ \ \ f_{con,1}({\bf z}) + \nabla f_{con,1}({\bf z})^{\prime} s \le 0 . \tag{11.19} \end{equation}\]

Case 1. Begin by assuming that the constraint is inactive so that $f_{con,1}({\bf z})<0$. From this, any direction $s$ that is sufficiently small satisfies the feasibility constraint

\[ f_{con,1}({\bf z}) + \nabla f_{con,1}({\bf z})^{\prime} s \le 0 . \] Thus, if ${\bf z}$ is an optimal point then $\nabla f_{0}({\bf z})^{\prime} = 0$. Otherwise, the point could be improved upon and would not be optimal. A proportionality constant is not required and so we may define $LMI = 0$ without imposing any restrictions.

Case 2. Now assume that the constraint is active at ${\bf z}$ so that $f_{con,1}({\bf z})=0$. Then, equation (11.19) becomes

\[\begin{equation} \nabla f_{0}({\bf z})^{\prime} ~s <0, \ \ \ \nabla f_{con,1}({\bf z})^{\prime} ~ s \le 0 . \tag{11.20} \end{equation}\]

Suppose that we cannot find a $s$ such that equation (11.20) holds (assuming an active constraint). Then, as in the Section 11.5.3.1 proof sketch, ${\bf z}$ is an optimal point, ${\bf z} = {\bf z}^*$, and this can only occur when $\nabla f_{0}({\bf z})$ is proportional to $\nabla f_{con,1}({\bf z})$. Writing this proportionality constraint as

\[ \nabla f_{0}({\bf z}) + LMI~ \nabla f_{con,1}({\bf z})=0 , \] for some constant $LMI$, we see that we now also require that $LMI > 0$. If $LMI < 0$, then we could choose a direction $s$ so that both the objective function and the constraint function would decrease without bound (both $\nabla f_{0}({\bf z})^{\prime} s <0$ and $\nabla f_{con,1}({\bf z})^{\prime} ~ s < 0$).

Using these constructions for $LMI$, we can define the Lagrangian as in equation (11.18). With this,

if the constraint is inactive, then $LMI=0$ and $\nabla LA({\bf z}) =\nabla f_{0}({\bf z}) =0$ at the optimum. Thus, the KKT conditions are satisfied.
if the constraint is active, then $f_{con,1}({\bf z})=0$, $\nabla LA({\bf z}) =0$, and $LMI>0$. Thus, the KKT conditions are satisfied.

This is sufficient for the result.

Constructing Insurable Risk Portfolios

11.1 KKT Conditions

\(Under~the~Hood.\) Verify that the Optimal \(d=0\)

11.2 Risk Retention Conditions for a Single Risk

\(Under~the~Hood.\) Proof of the Single Risk Retention Results

11.3 Risk Retention Conditions for Multiple Risks

11.3.1 Risk Retention Conditions

\(Under~the~Hood.\) Confirm the Budget Constraint to be Binding

\(Under~the~Hood.\) Confirm the \(RM^2\) for Example 11.2

\(Under~the~Hood.\) Check the \(VaR\) Balance