Brief Answers to Selected Exercises

Chapter 1

1.1 a(i). Mean = 12,840, Median = 5,695
a(ii). Standard deviation = 48,836.7 = 3.8 times the mean. The data appear to be skewed.
b. The plots are not presented here. When viewing them, the distribution appears to be skewed to the right.
c(i). The plots are not presented here. When viewing them, although the distribution has moved towards symmetry, it is still quite lop-sided.
c(ii). The plots are not presented here. When viewing them, yes, the distribution appears to be much more symmetric.
d. Mean = 1,854.0, Median = 625.7, Standard deviation = 3,864.3. A similar pattern holds for the outpatient data as for the inpatient data.
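For readers reproducing these summaries in software, here is a minimal Python sketch; the `claims` array is a hypothetical stand-in, not the exercise data.

```python
import numpy as np

# Hypothetical stand-in for the right-skewed expenditure data of Exercise 1.1.
claims = np.array([800.0, 1200.0, 2500.0, 3400.0, 5695.0,
                   8100.0, 45000.0, 120000.0])

print("mean   =", claims.mean())
print("median =", np.median(claims))
print("sd     =", claims.std(ddof=1))  # sample standard deviation
# A mean well above the median, with a standard deviation that is a large
# multiple of the mean, signals a right-skewed distribution.
```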

1.2. Part 1. a. Descriptive Statistics for the 2000 data \[ \small{ \begin{array}{lrrrrrrr} \hline & & 1st & & & 3rd & & \text{Standard} \\ & \text{Min} & \text{Quartile} & \text{Median} & \text{Mean} & \text{Quartile} & \text{Max} & \text{Deviation} \\\hline \text{TPY} & 11.57 & 56.72 & 80.54 & 88.79 & 108.60 & 314.70 & 46.10 \\ \text{NUMBED} & 18.00 & 60.25 & 90.00 & 97.08 & 118.8 & 320.00 & 48.99 \\ \text{SQRFOOT} & 5.64 & 28.64 & 39.22 & 50.14 & 65.49 & 262.00 & 34.50 \\ \hline \end{array} } \] b. The plots are not presented here. When viewing them, the histogram appears to be skewed to the right but only mildly.
c. The plots are not presented here. When viewing them, both the histogram and the \(qq\) plot suggest that the transformed distribution is close to a normal distribution.
Part 2. a. Descriptive Statistics for the 2001 data \[ \small{ \begin{array}{lrrrrrrr} \hline & & 1st & & & 3rd & & \text{Standard} \\ & \text{Min} & \text{Quartile} & \text{Median} & \text{Mean} & \text{Quartile} & \text{Max} & \text{Deviation} \\\hline \text{TPY} & 12.31 & 56.89 & 81.13 & 89.71 & 109.90 & 440.70 & 49.05 \\ \text{NUMBED} & 18.00 & 60.00 & 90.00 & 97.33 & 119.00 & 457.00 & 51.97 \\ \text{SQRFOOT} & 5.64 & 28.68 & 40.26 & 50.37 & 63.49 & 262.00 & 35.56 \\ \hline \end{array} } \] c. Both the histogram and the \(qq\) plot (not presented here) suggest that the transformed distribution is close to the normal distribution.

1.5 a. Mean = 5.953, Median = 2.331
b. The plots are not presented here. When viewing them, the histogram appears to be skewed to the right. The \(qq\) plot indicates a serious departure from normality.
c(i). For ATTORNEY=1, we have Mean = 9.863 and Median = 3.417. For ATTORNEY=2, we have Mean = 1.865 and Median = 0.986. This suggests that the losses associated with attorney involvement (ATTORNEY=1) are higher than when an attorney is not involved (ATTORNEY=2).

1.7 a. The plots are not presented here. When viewing them, the histogram appears to be skewed to the left. The \(qq\) plot indicates a serious departure from normality.
b. The plots are not presented here. When viewing them, the transformation does little to symmetrize the distribution.

Chapter 2

2.1 \(r=0.5491, b_0=4.2054, b_1=0.1279\)

2.3 a. \[\begin{eqnarray*} 0 & \leq & \frac{1}{n-1}\sum_{i=1}^n\left( a\frac{x_i-\overline{x}}{s_{x}}-c \frac{y_i-\overline{y}}{s_{y}}\right) ^{2} \\ &=&{\frac{1}{n-1}}\sum_{i=1}^n\left[a^2\frac{(x_i-\overline{x})^2}{s_{x}^2} -2ac\frac{(x_i-\overline{x})(y_i-\overline{y})}{s_x s_{y}}+c^2\frac{(y_i-\overline{y})^2}{s_{y}^2} \right]\\ &=& a^2\frac{1}{s_x^2}\frac{1}{n-1}\sum_{i=1}^n\left(x_i-\overline{x}\right)^2-2ac\frac{1}{s_x s_y}\frac{1}{n-1}\sum_{i=1}^n\left(x_i-\overline{x}\right)\left(y_i-\overline{y}\right)\\ &&+~c^2\frac{1}{s_y^2}\frac{1}{n-1}\sum_{i=1}^n\left(y_i-\overline{y}\right)^2\\ &=&a^2\frac{1}{s_x^2}s_x^2-2acr+c^2\frac{1}{s_y^2}s_y^2\\ &=& a^{2}+c^{2}-2acr. \end{eqnarray*}\] b. From part (a), we have \(a^{2}+c^{2}-2acr \geq 0\). So,\[\begin{eqnarray*} a^{2}+c^{2}-2ac+2ac &\geq& 2acr\\ (a-c)^2&\geq& 2acr-2ac\\ (a-c)^2&\geq& 2ac(r-1) \end{eqnarray*}\] c. Using the result in part b) and taking \(a = c\), we can get \(2a^{2}(r-1)\leq0\). Also \(a^2\geq 0\), so \(r-1\leq 0\). Thus, \(r \leq 1\).
d. Using the result in part (b) and taking \(a = -c\), we can get \(-2a^{2}(r-1)\leq4a^2\). Also \(-2a^2\leq 0\), so \(r-1\geq -2\). Thus, \(r \geq -1\).
e. If all of the data lie on a straight line that goes through the upper left and lower right hand quadrants, then \(r=-1\). If all of the data lie on a straight line that goes through the lower left and upper right hand quadrants, then \(r=1\).

2.5 a. \[\begin{eqnarray*} b_1 = r\frac{s_{y}}{s_{x}} &=& \frac{1}{(n-1)s_{x}^{2}}\sum_{i=1}^n\left( x_{i}-\overline{x}\right) \left( y_{i}-\overline{y}\right) \\ &=& \frac{1}{\sum_{i=1}^n\left( x_{i}-\overline{x}\right)^{2}}\sum_{i=1}^n\left[\frac{y_{i}-\overline{y}}{x_{i}-\overline{x}}\left( x_{i}-\overline{x}\right)^2\right]\\ &=& \frac{\sum_{i=1}^nweight_i~slope_i}{\sum_{i=1}^nweight_{i}}, \end{eqnarray*}\] where \[ slope_i=\frac{y_{i}-\overline{y}}{x_{i}-\overline{x}}~~~~~\mathrm{and}~~~~~weight_i=\left( x_{i}-\overline{x}\right)^2 . \] b. \(slope_1=-1.5, weight_1=4\)
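A quick Python check of this weighted-average interpretation, on hypothetical data:

```python
import numpy as np

# Hypothetical data; any (x, y) pairs with x_i != x-bar will do.
x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.0, 3.0, 5.0, 10.0])
xbar, ybar = x.mean(), y.mean()

# Ordinary least squares slope b1.
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)

# Weighted average of the slopes from (xbar, ybar) to each point.
slopes = (y - ybar) / (x - xbar)
weights = (x - xbar) ** 2
b1_weighted = np.sum(weights * slopes) / np.sum(weights)

print(b1, b1_weighted)  # identical, confirming the identity in part (a)
```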

2.7 a. For the model in this exercise, the least squares estimate of \(\beta_1\) is the \(b_1\) that minimizes the sum of squares \(\mathrm{SS}(b_1^{\ast} )=\sum_{i=1}^n\left( y_i - b_1^{\ast }x_i\right) ^{2}.\) Taking the derivative with respect to \(b_1^{\ast}\), we have \[ \frac{\partial }{\partial b_1^{\ast }}SS(b_1^{\ast })=\sum_{i=1}^n(-2x_{i})\left( y_{i}-b_1^{\ast }x_{i}\right) \] Setting this quantity equal to zero and canceling constant terms yields \[ \sum_{i=1}^n\left(x_{i}y_{i}-b_1^{\ast }x_{i}^2\right) =0 \] Solving gives \[ b_1 = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^nx_i^{2}}. \] b. From the problem, we have \(x_i = z_i^2\). Using the result of part (a), we conclude that \[ b_1 = \frac{\sum_{i=1}^n z_i^2 y_i}{\sum_{i=1}^nz_i^{4}}. \]
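A sketch of the part (a) estimator on hypothetical data, checked against a no-intercept least squares solve:

```python
import numpy as np

# Hypothetical data for a regression through the origin, y approx b1 * x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed form from part (a): b1 = sum(x*y) / sum(x^2).
b1 = np.sum(x * y) / np.sum(x ** 2)

# The same answer from a least squares solver with no intercept column.
b1_lstsq = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)[0][0]

print(b1, b1_lstsq)  # agree
```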

2.10 a(i). Correlation\(= 0.9372\)
a(ii). Table of correlations \[ \small{ \begin{array}{llll} \hline &\text{TPY} & \text{NUMBED} & \text{SQRFOOT} \\\hline \text{TPY} & 1.0000 & 0.9791 & 0.8244\\ \text{NUMBED} & 0.9791 & 1.0000 & 0.8192\\ \text{SQRFOOT} & 0.8244 & 0.8192 & 1.0000\\ \hline \end{array} } \] a(iii). Correlation\(= 0.9791.\) Correlations are unaffected by scale changes.
b. The plots are not presented here. When viewing them, there is a strong linear relationship between NUMBED and TPY. The linear relationship of SQRFOOT and TPY is not as strong as that of NUMBED and TPY.
c(i). \(b_1=0.92142, t-\mathrm{ratio}=91.346, R^2=0.9586\)
c(ii). \(R^2 =0.6797.\) The model using NUMBED is preferred.
c(iii). \(b_1 =1.01231, t-\mathrm{ratio}=81.235, R^2=0.9483\)
c(iv). \(b_1 =0.68737, t-\mathrm{ratio}=27.25, R^2=0.6765\)
Part 2: \(b_1=0.932384, t-\mathrm{ratio}=120.393, R^2=0.9762.\) The pattern is similar to the cost report for year 2000.

2.11 \(\hat{e}_1 = -23.\)

2.13 a. \[ \hat{y}_i - \overline{y} = (b_0 + b_1 x_i) - \overline{y} = (\overline{y}-b_1 \overline{x} + b_1 x_i) - \overline{y} = b_1(x_i - \overline{x}). \] b. Using part (a), \[ \sum^n_{i=1}(\hat{y}_i - \overline{y})^2 = \sum^n_{i=1}(b_1(x_i - \overline{x}))^2 = b_1^2 \sum_{i=1}^n(x_i - \overline{x})^2 = b_1^2 s_x^2(n-1). \] c. \[ R^2 = \frac{Regression ~SS}{Total ~SS} = \frac{b_1^2 s_x^2(n-1)}{\sum_{i=1}^n(y_i - \overline{y})^2} = \frac{b_1^2 s_x^2(n-1)}{s_y^2(n-1)} = \frac{b_1^2 s_x^2}{s_y^2}. \]
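A numeric check of the part (c) identity on simulated data; the data-generating values here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=50)  # arbitrary simulation

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x

r2_anova = np.sum((yhat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
r2_identity = b1 ** 2 * x.var(ddof=1) / y.var(ddof=1)

print(r2_anova, r2_identity)  # equal, as shown in part (c)
```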

2.15 a. From the definition of the correlation coefficient and Exercise 2.8(b), we have \[ r(y,x)(n-1)s_y s_x = \sum_{i=1}^n \left( y_i-\overline{y}\right) \left( x_i-\overline{x}\right) = \sum_{i=1}^n y_i x_i - n \overline{x} \overline{y}. \] If \(\overline{y}=0\), \(\overline{x}=0\), or both, then \(r(y,x)(n-1)s_y s_x = \sum_{i=1}^n y_i x_i\). Therefore, \(r(y,x)=0\) implies \(\sum_{i=1}^ny_i x_i=0\) and vice versa.
b. \[\begin{eqnarray*} \sum_{i=1}^n x_i e_i &=& \sum_{i=1}^n x_i (y_i - (\overline{y} + b_1(x_i-\overline{x}) )) \\ &=& \sum_{i=1}^n x_i (y_i - \overline{y}) - b_1 \sum_{i=1}^n x_i (x_i-\overline{x}) \\ &=& \sum_{i=1}^n x_i b_1(x_i - \overline{x}) - b_1 \sum_{i=1}^n x_i (x_i-\overline{x}) = 0, \end{eqnarray*}\] where the last line uses \(\sum_{i=1}^n x_i (y_i - \overline{y}) = b_1 \sum_{i=1}^n x_i (x_i-\overline{x})\), a rearrangement of the definition of \(b_1\). c. \[\begin{eqnarray*} \sum_{i=1}^n \widehat{y}_i e_i &=& \sum_{i=1}^n ( \overline{y}+b_1(x_i-\overline{x}) ) e_i \\ &=& \overline{y} \sum_{i=1}^n e_i + b_1\sum_{i=1}^n (x_i-\overline{x}) e_i = 0, \end{eqnarray*}\] using \(\sum_{i=1}^n e_i = 0\) and part (b).

2.17 When \(n = 100\), \(k = 1\), \(Error~SS = [n-(k+1)]s^2 = 98s^2\)
a. \(e_{10}^2/(Error~SS) = (8s)^2/(98s^2) = 65.31\%\)
b. \(e_{10}^2/(Error~SS) = (4s)^2/(98s^2) = 16.33\%\)
When \(n = 20\), \(k = 1\), \(Error~SS = [n-(k+1)]s^2 = 18s^2\)
c. \(e_{10}^2/(Error~SS) = (4s)^2/(18s^2) = 88.89\%\)

2.20 a. Correlation=0.9830
Descriptive Statistics \[ \small{ \begin{array} {lrrrrrrr} \hline & & 1st & & & 3rd & & \text{Standard} \\ & \text{Min} & \text{Quartile} & \text{Median} & \text{Mean} & \text{Quartile} & \text{Max} & \text{Deviation} \\\hline \text{LOGTPY} & 2.51 & 4.04 & 4.40 & 4.37 & 4.70 & 6.09 & 0.51 \\ \text{LOGNUMBED} & 2.89 & 4.09 & 4.50 & 4.46 & 4.78 & 6.13 & 0.49 \\ \hline \end{array} } \] b. \(R^2 = 0.9664\), \(b_1 = 1.01923\), \(t(b_1) = 100.73\).
c(i). The degrees of freedom are \(df = 355 - (1+1) = 353\), so the corresponding \(t\)-value is 1.9667. Because the \(t\)-statistic \(t(b_1) = 100.73 > 1.9667\), we reject \(H_0\) in favor of the alternative.
c(ii). The \(t\)-statistic is \(t - \mathrm{ratio} = (b_1 - 1)/se(b_1) = (1.01923 - 1)/0.01012 = 1.9002.\) Because \(t-\mathrm{ratio} < 1.9667\), we do not reject \(H_0\) in favor of the alternative.
c(iii). The corresponding \(t\)-value is 1.645. The \(t\)-statistic is \(t - \mathrm{ratio} = 1.9002.\) We reject \(H_0\) in favor of the alternative.
c(iv). The corresponding \(t\)-value is -1.645. The \(t\)-statistic is \(t - \mathrm{ratio} = 1.9002.\) We do not reject \(H_0\) in favor of the alternative.
d(i). A point estimate is 2.0384
d(ii). 95% C.I. for slope \(b_1\) is \(1.0192 \pm 1.9667 \times 0.0101 = (0.9993, 1.0391)\). A 95% C.I. for expected change of LOGTPY is \((0.9993 \times 2, 1.0391 \times 2) = (1.9987, 2.0781)\)
d(iii). The 99% C.I. is \((2(1.0192 - 2.5898 \times 0.0101), 2(1.0192 + 2.5898 \times 0.0101)) = (1.9861, 2.0907)\), where 2.5898 is the 99.5th percentile of the \(t\)-distribution with 353 degrees of freedom.
e(i). \(\widehat{y} = -0.1747 + 1.0192 \times \ln 100 = 4.519037.\)
e(ii). The standard error of the prediction \[ se(pred) = s \sqrt{1+\frac{1}{n}+\frac{\left( x^{\ast }-\overline{x}\right) ^{2} }{(n-1)s_{x}^{2}}} = 0.09373 \sqrt{1+\frac{1}{355}+\frac{\left( \ln(100)-4.4573\right) ^{2} }{(355-1)0.4924^{2}}}=0.0938. \] e(iii). The 95% prediction interval at \(x^*\) is \[ \widehat{y}^{\ast } \pm t_{n-2,1-\alpha /2} ~se(pred) = 4.519037 \pm 1.9667(0.0938) = (4.3344, 4.7034). \] e(iv). The point prediction is \(e^{4.519037}= 91.747\).
The prediction interval is \((e^{4.334405}=76.280, e^{4.703668}=110.351 ).\)
e(v). The prediction interval is \((e^{4.364214}=78.588, e^{4.673859}=107.110 ).\)
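A Python sketch reproducing the part (e) calculations from the quantities reported above:

```python
import numpy as np

# Quantities from the answer to Exercise 2.20.
n, s = 355, 0.09373
xbar, sx = 4.4573, 0.4924
b0, b1 = -0.1747, 1.01923
t_value = 1.9667                        # t-quantile with n - 2 = 353 df

xstar = np.log(100)
yhat = b0 + b1 * xstar                  # point prediction on the log scale

se_pred = s * np.sqrt(1 + 1/n + (xstar - xbar)**2 / ((n - 1) * sx**2))
lower, upper = yhat - t_value * se_pred, yhat + t_value * se_pred

print(yhat, (lower, upper))                          # 4.519, (4.334, 4.704)
print(np.exp(yhat), (np.exp(lower), np.exp(upper)))  # back on the TPY scale
```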

2.22 a. Fitted US \(LIFEEXP = 83.7381 - 5.2735 \times 2.0 = 73.1911\)
b. A 95% prediction interval for the life expectancy in Dominica is \[ \widehat{y}_{\ast} \pm t_{n-2,1-\alpha /2} ~se(pred)=73.1911\pm(1.973)(6.642)=(60.086, 86.296) \] c.  \[ e_{i}=y_{i}-\widehat{y}_{i}=y_{i}-\left( b_0+b_1x_{i}\right)= 72.5-(83.7381 - 5.2735 \times 1.7)=-2.273 \] This residual is 2.273/6.615 = 0.3436 multiples of \(s\) below zero.
d. Test \(H_0: \beta_1 = -6.0\) versus \(H_a: \beta_1 > -6.0\) at the 5% level of significance using a \(t\)-value = 1.645. The calculated \(t\)-statistic is \(\frac{-5.2735-(-6)}{0.2887}=2.5165\), which is \(\geq1.645\). Hence, we reject \(H_0\) in favor of the alternative. The corresponding \(p\)-value is 0.00637.

Chapter 3

3.1. a. \(R^2_a = 1 - s^2/{s^2_y} = 1 - {(50)^2}/{(100)^2} = 1 - 1/4 = 0.75.\)
b. \(Total ~SS =(n - 1)s^2_y = 99(100)^2 = 990000\) and
\(Error ~SS = (n - (k + 1))s^2 = (100 - (3 + 1))(50)^2 = 240000.\)
\[ \small{ \begin{array}{lcrcc} \hline Source & SS & df & MS & F \\ \hline \text{Regression} & 750000 & 3 & 250000 & 100 \\ \text{Error} & 240000 & 96 & 2500 & \\ \text{Total} & 990000 &99 && \\ \hline \end{array} } \] c. \(R^2 = (Regression~SS)/(Total~SS) =750000/990000 = 75.76\%\).
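A Python sketch reconstructing this ANOVA table from the given \(n\), \(k\), \(s\), and \(s_y\):

```python
# ANOVA quantities of Exercise 3.1 from n, k, s and s_y.
n, k = 100, 3
s, s_y = 50.0, 100.0

total_ss = (n - 1) * s_y**2             # 990,000
error_ss = (n - (k + 1)) * s**2         # 240,000
regression_ss = total_ss - error_ss     # 750,000

regression_ms = regression_ss / k
error_ms = error_ss / (n - (k + 1))
f_ratio = regression_ms / error_ms      # 100

r2 = regression_ss / total_ss           # 0.7576
print(total_ss, error_ss, regression_ss, f_ratio, r2)
```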

3.3 a. \({\bf y}=(0~1~5~8)^{\prime}\), \(\mathbf{X}=\left( \begin{array}{ccc} 1 & -1 & 0 \\ 1 & 2 & 0 \\ 1 & 4 & 1 \\ 1 & 6 & 1 \\ \end{array} \right)\).
b. \(\hat{y}_{3}=x^{\prime}_{3}\mathbf{b}=(1~4~1)\left( \begin{array}{c} 0.15 \\ 0.692 \\ 2.88 \\ \end{array} \right)=5.798\)
c. \(se(b_{2})=s\sqrt{3rd~diagonal~element~of~(\mathbf{X^{\prime }X)}^{-1}} = 1.373\sqrt{4.11538}=2.785\)
d. \(t(b_1)=b_1/se(b_1)=0.692/(1.373\times\sqrt{0.15385})=1.286\)
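These matrix calculations can be reproduced directly; a Python sketch using the data from part (a):

```python
import numpy as np

# Data from Exercise 3.3.
y = np.array([0.0, 1.0, 5.0, 8.0])
X = np.array([[1, -1, 0],
              [1,  2, 0],
              [1,  4, 1],
              [1,  6, 1]], dtype=float)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                       # approx (0.154, 0.692, 2.885)

n, k = X.shape[0], X.shape[1] - 1
resid = y - X @ b
s = np.sqrt(resid @ resid / (n - (k + 1)))  # approx 1.373

se = s * np.sqrt(np.diag(XtX_inv))          # standard errors of b0, b1, b2
print("se(b2) =", se[2], " t(b1) =", b[1] / se[1])  # 2.785 and 1.286
```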

3.6 a. The regression coefficient is -0.1846, meaning that when public education expenditures increase by 1% of GDP, life expectancy is expected to decrease by 0.1846 years, holding other variables fixed.
b. The regression coefficient is -0.2358, meaning that when health expenditures increase by 1% of GDP, life expectancy is expected to decrease by 0.2358 years, holding other variables fixed.
c. \(H_{0}:\beta_2=0, H_{1}:\beta_2\neq0\). We cannot reject the null hypothesis because the \(p\)-value is greater than the significance level, say 0.05. Therefore, PUBLICEDUCATION is not a statistically significant variable.
d(i). The purpose of the added variable plot is to explore the correlation between PUBLICEDUCATION and LIFEEXP after removing the effects of the other variables.
d(ii). The partial correlation is \[ r=\frac{t(b_2)}{\sqrt{t(b_2)^2+n-(k+1)}}=\frac{-0.6888}{\sqrt{(-0.6888)^2+152-(3+1)}}=-0.0565 . \]
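A small Python helper for this conversion from a \(t\)-ratio to a partial correlation:

```python
import numpy as np

def partial_corr(t_stat, n, k):
    """Partial correlation implied by a t-ratio in a regression
    with n observations and k explanatory variables."""
    return t_stat / np.sqrt(t_stat**2 + n - (k + 1))

# Values from Exercise 3.6 d(ii).
print(partial_corr(-0.6888, 152, 3))  # approx -0.0565
```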

Chapter 4

4.1 a. \(R^2 = (Regression~SS)/(Total~SS)\).
b. \(F\)-ratio=\((Regression~MS)/(Error~MS)\).
c.  \[ 1-R^2 = \frac{Total~SS}{Total~SS}-\frac{Regression~SS}{Total~SS}= \frac{Error~SS}{Total~SS}. \] Now, from the right hand side, we have \[\begin{align*} \frac{R^2}{1-R^2} \frac{(n-(k+1))}{k} &=\frac{(Regression~SS)/(Total~SS)}{(Error~SS)/(Total~SS)}\frac{(n-(k+1))}{k}\\ &= \frac{Regression~SS}{Error~SS}\frac{(n-(k+1))}{k}\\ &=\frac{(Regression~SS)/k}{(Error~SS)/(n-(k+1))}\\ &=\frac{Regression~MS}{Error~MS} = F-\mathrm{ratio}. \end{align*}\]
d. \(F-\mathrm{ratio}=1.7.\)
e. \(F-\mathrm{ratio}=19.7.\)
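A small Python helper for the part (c) identity; the example inputs are hypothetical, not the values behind parts (d) and (e):

```python
def f_ratio(r2, n, k):
    """F-ratio implied by R^2 in a regression with n observations
    and k explanatory variables (the part (c) identity)."""
    return (r2 / (1 - r2)) * (n - (k + 1)) / k

# Hypothetical illustration: R^2 = 0.75, n = 100, k = 3 gives F = 96.
print(f_ratio(0.75, 100, 3))
```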

4.3 a. The third level of organizational structure will be captured by the intercept term of the regression.
b. \(H_0\): TAXEXEMPT is not important; \(H_1\): TAXEXEMPT is important. Because \(p = 0.7694 > 0.05\), we do not reject the null hypothesis.
c. Because the \(p\)-value \(= 1.74\times 10^{-6}\) is less than the significance level \(\alpha=0.05\), MCERT is an important factor in determining LOGTPY.
c(i). The point estimate of LOGTPY is 3.988.
c(ii). The 95% confidence interval is \(3.988 \pm t\text{-value} \times se = (3.826, 4.150)\).
d. \(R^2=0.1448\). All the variables are statistically significant.
e. \(R^2= 0.9673\). Only LOGNUMBED is statistically significant at \(\alpha=0.05\).
e(i). The partial correlation is 0.0744. The correlation between LOGTPY and LOGSQRFOOT is 0.8151. The partial correlation removes the effect of other variables on LOGTPY.
e(ii). The \(t\)-ratio tests whether an individual explanatory variable is statistically significant. The \(F\)-ratio tests whether the explanatory variables taken together have a significant impact on the response variable. In this case, only LOGNUMBED is significant and the \(R^2\) is high; this explains why the \(F\)-ratio is large while most of the \(t\)-ratios are small.

4.7 a. \(H_0\): PUBLICEDUCATION and lnHEALTH are not jointly statistically significant; that is, the coefficients of the two variables are both equal to zero. \(H_1\): PUBLICEDUCATION and lnHEALTH are jointly statistically significant; at least one of the coefficients of the two variables is not equal to zero. To make a decision, we compare the \(F\)-statistic with the critical value: if the \(F\)-statistic is greater than the critical value, we reject the null hypothesis; otherwise, we do not.
\(F\)-ratio \(= (6602.7 - 6535.7)/(2 \times 44.2) = 0.76\). The 95th percentile of the \(F\)-distribution with \(df_1=2\) and \(df_2=148\) is approximately 3.00. Because the \(F\)-ratio is less than the critical value, we cannot reject the null hypothesis. That is, PUBLICEDUCATION and lnHEALTH are not jointly significant.
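A sketch of this partial \(F\) computation in Python, using the sums of squares reported above; scipy supplies the exact critical value:

```python
from scipy import stats

# Partial F-test quantities from Exercise 4.7(a).
error_ss_reduced = 6602.7   # model without PUBLICEDUCATION and lnHEALTH
error_ss_full = 6535.7      # full model
p = 2                       # number of coefficients being tested
error_ms_full = 44.2        # s^2 from the full model, with df2 = 148

f_ratio = (error_ss_reduced - error_ss_full) / (p * error_ms_full)
critical = stats.f.ppf(0.95, dfn=p, dfd=148)

# 0.76 is well below the critical value near 3, so we do not reject H0.
print(f_ratio, critical)
```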
b. We can see that the life expectancy varies across different regions.
c. \(H_0\): \(\beta_{REGION}=0\), \(H_1\): \(\beta_{REGION}\neq0\). To make the decision, we compare the \(p\)-value with the significance level \(\alpha=0.05\). If \(p<\alpha\), we reject the null hypothesis; otherwise, we do not. In this case, \(p=0.000 < 0.05\), so we reject the null hypothesis. REGION is a statistically significant determinant of LIFEEXP.
d(i). If REGION = Arab States, \(\widehat{LIFEEXP} = 83.3971-2.7559 \times 2-0.4333\times 5-0.7939\times 1=74.9249\). If REGION = Sub-Saharan Africa, \(\widehat{LIFEEXP} = 83.3971-2.7559\times 2-0.4333 \times 5-0.7939 \times 1 -14.3567 = 60.5682\).
d(ii). The 95% confidence interval is \(-14.3567\pm 1.976\times 1.8663=(-18.044,-10.669).\)
d(iii). The point estimate for the difference is 18.1886.

Chapter 5

5.1 a. From equation (2.9), we have \[\begin{eqnarray*} h_{ii} & = & \mathbf{x_i}^{\prime}\left(\mathbf{X}^{\prime }\mathbf{X}\right)^{-1}\mathbf{x_i}\\ &=&\left(\begin{array}{cc}1 & x_i\end{array}\right) \frac{1}{ \sum_{i=1}^{n}x_i^2-n\overline{x}^2} \left(\begin{array}{cc}n^{-1}\sum_{i=1}^{n}x_i^2 & -\overline{x} \\-\overline{x} & 1\end{array}\right) \left(\begin{array}{c}1 \\x_i\end{array}\right)\\ &=& \frac{1}{\sum_{i=1}^{n}x_i^2-n\overline{x}^2} \left( n^{-1}(\sum_{i=1}^{n}x_i^2-n\overline{x}^2)+\overline{x}^2-2\overline{x}x_i+x_i^2 \right)\\ &=&\frac{1}{n}+\frac{(x_i-\overline{x})^2}{(n-1)s_x^2} . \end{eqnarray*}\]

b. The average leverage is \[ \bar{h}=\frac{1}{n} \sum_{i=1}^n h_{ii} = \frac{1}{n}+\frac{1}{n}\sum_{i=1}^n\frac{(x_i-\bar{x})^2}{(n-1)s_{x}^2}=\frac{1}{n}+\frac{1}{n}=\frac{2}{n} \]

c. Let \(c = (x_i-\bar{x})/s_x\). Then, \[ \frac{6}{n}=h_{ii}=\frac{1}{n}+\frac{(x_i-\bar{x})^2}{(n-1)s_{x}^2} = \frac{1}{n}+\frac{(cs_x)^2}{(n-1)s_{x}^2}=\frac{1}{n}+\frac{c^2}{n-1} . \] For large \(n\), \(x_i\) is approximately \(c=\sqrt{5}=2.236\) standard deviations away from the mean.
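A Python check of the leverage formula in part (a) and the average leverage in part (b), on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=25)   # arbitrary simulated explanatory variable
n = len(x)

# Leverages from the part (a) formula.
h = 1/n + (x - x.mean())**2 / ((n - 1) * x.var(ddof=1))

# Direct computation from the hat matrix for comparison.
X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(h, np.diag(H)))  # True
print(h.mean(), 2/n)               # average leverage is 2/n, as in part (b)
```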

5.3 a. The plots are not presented here. When viewing them, it is difficult to detect linear patterns from the plot of GDP versus LIFEEXP. The logarithmic transform of GDP spreads out values of GDP, allowing us to see linear patterns. Similar arguments hold for HEALTH, where the pattern in lnHEALTH is more linear.
c(ii). It is both an outlier and a high leverage point. The standardized residual is -2.66, which exceeds the cutoff of 2 in absolute value. The leverage is 0.1529, which is greater than the cutoff \(3 \times\overline{h} =3\times (k+1)/n = 0.08\).
c(iii). The variable PUBLICEDUCATION is no longer statistically significant.

Chapter 6

6.1 a. The variable involact is somewhat right-skewed but not drastically so. The variable involact has several zeros, which may signal a limited dependent variable problem. The variable age appears to be bimodal, with six observations that are 28 or less and the others greater than or equal to 40. \[ \small{ \begin{array}{lrrrrr} \hline & & & \text{Standard} & & \\ & \text{Mean} & \text{Median} & \text{Deviation} & \text{Minimum} & \text{Maximum}\\ \hline \text{race} & 34.9 & 24.5 & 32.6 & 1.0 & 99.7\\ \text{fire} & 12.3 & 10.4 & 9.3 & 2.0 & 39.7\\ \text{theft} & 32.4 & 29.0 & 22.3 & 3.0 & 147.0\\ \text{age} & 60.3 & 65.0 & 22.6 & 2.0 & 90.1\\ \text{income} & 10,696 & 10,694.0 & 2,754 & 5,583 & 21,480\\ \text{volact} & 6.5 & 5.9 & 3.9 & 0.5 & 14.3\\ \text{involact} & 0.6 & 0.4 & 0.6 & 0.0 & 2.2\\ \hline \end{array} } \] b. The scatterplot matrix (not presented here) shows a negative relation between volact and involact, a negative relation between race and volact, and a positive relation between race and involact. If there exists racial discrimination, we would expect zip codes with more minorities to have less access to the voluntary (less expensive) market, meaning that they have to go to the involuntary market for insurance.

c. Table of correlations \[ \small{ \begin{array}{lrrrrrrr} \hline & \text{race}& \text{fire} &\text{theft} & \text{age} & \text{income} &\text{volact} &\text{involact}\\ \hline \text{race} & 1.000 & & & & & &\\ \text{fire} & 0.593 & 1.000 & & & & &\\ \text{theft} & 0.255 & 0.556 &1.000 & & & &\\ \text{age} & 0.251 & 0.412 &0.318 & 1.000 & & &\\ \text{income} & -0.704 &-0.610 &-0.173 &-0.529 & 1.000 & &\\ \text{volact} & -0.759 &-0.686 &-0.312 &-0.606 & 0.751 &1.000 &\\ \text{involact} & 0.714 & 0.703 & 0.150 & 0.476 & -0.665 &-0.746 & 1.000\\ \hline \end{array} } \]

d(i). The coefficient associated with race is negative and statistically significant.
d(ii). The high leverage zip codes are numbers 7 and 24. The variable race remains negative and statistically significant. The variable fire is no longer significant, although income becomes significant.
e. The variable race remains positive and statistically significant. Similarly, the roles of the other variables do not change depending on the presence of the two high leverage points.
f. The variable race remains positive and statistically significant. Again, the roles of the other variables do not change depending on the presence of the two high leverage points.
g. Leverage depends on the explanatory variables, not on the dependent variables. Because the explanatory variables remained unchanged in the three analyses, the leverages remained unchanged.
h. The demand for insurance depends on the size of the loss to be insured, the ability of the applicant to pay for it and knowledge of insurance contracts. For homeowners insurance, the size of the loss relates to house price, type of dwelling structure, available safety precautions taken, susceptibility to catastrophes such as tornado, flood and so on. Ability to pay is based on income, wealth, number of dependents and other factors. Knowledge of insurance contracts depends on, for example, education. All of these omitted factors may be related to race.
i. One would expect zip codes that are adjacent to one another ("contiguous") to share similar economic experiences. We could subdivide the city into homogeneous groups, such as inner city and suburbs. We could also do a weighted least squares where the weights are given by the distance from the city center.

Chapter 7

7.1 a. \[\begin{align*} \textrm{E}~y_t &= \textrm{E}~(y_0+c_1+\cdots+c_t)=\textrm{E}~y_0+\textrm{E}~c_1+\cdots+\textrm{E}~c_t\\ &= y_0+\mu_c+\cdots+\mu_c = y_0 + t \mu_c. \end{align*}\]

b. \[\begin{align*} \textrm{Var}~y_t &= \textrm{Var}~(y_0+c_1+\cdots+c_t)=\textrm{Var}~c_1+\cdots+\textrm{Var}~c_t\\ &=\sigma_c^2+\cdots+\sigma_c^2 = t\sigma_c^2. \end{align*}\]

7.3 a(ii). No. There is a clear downward trend in the series, indicating that the mean changes over time.
b(i). The \(t-\)ratios associated with the linear and quadratic trend portions are highly statistically significant. The \(R^2 = 0.8733\) indicates that the model fits well.
b(ii). The sign of a residual is highly likely to be the same as that of the preceding and subsequent residuals. This suggests a strong degree of autocorrelation in the residuals.
b(iii). \(\widehat{EURO_{702}} = 0.808 + 0.0001295(702) - 4.639 \times 10^{-7}(702)^2 = 0.6703.\)
c(i). This is a random walk model.
c(ii). \(\widehat{EURO_{702}} = 0.6795 + 3(-0.0001374) = 0.679088.\)
c(iii). An approximate 95% prediction interval for \(EURO_{702}\) is \[ 0.679088\pm2(0.003621979)\sqrt{3} \approx (0.66654, 0.691635). \]
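A Python sketch of the random walk forecast and approximate interval in c(ii)-c(iii), from the quantities reported above:

```python
import numpy as np

y_last = 0.6795          # last observed value of EURO
mean_diff = -0.0001374   # average of the differenced series
sd_diff = 0.003621979    # standard deviation of the differences
k = 3                    # forecast horizon, in steps ahead

forecast = y_last + k * mean_diff
half_width = 2 * sd_diff * np.sqrt(k)   # approximate 95% margin

print(forecast, (forecast - half_width, forecast + half_width))
# approx 0.679088 and (0.66654, 0.69163)
```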

Chapter 8

8.1 \[ \begin{array}{ll} r_1 &=\left(\sum_{t=2}^{5}(y_{t-1}-\bar y)(y_{t}-\bar y)\right) /\left(\sum_{t=1}^{5}(y_{t}-\bar y)^{2}\right) = -0.0036/0.0134 = -0.2687 \\ r_2 &= \left(\sum_{t=3}^{5}(y_{t-2}-\bar y)(y_{t}-\bar y)\right) /\left(\sum_{t=1}^{5}(y_{t}-\bar y)^{2}\right) = 0.0821 \end{array} \]
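A Python sketch of this lag-\(k\) autocorrelation convention, applied to a hypothetical five-point series (the exercise's own data are not reproduced here):

```python
import numpy as np

def lag_autocorr(y, k):
    """Lag-k sample autocorrelation with the textbook convention:
    lag-k cross products divided by the full sum of squares."""
    ybar = y.mean()
    num = np.sum((y[:-k] - ybar) * (y[k:] - ybar))
    den = np.sum((y - ybar) ** 2)
    return num / den

y = np.array([0.52, 0.48, 0.55, 0.47, 0.50])  # hypothetical series
print(lag_autocorr(y, 1), lag_autocorr(y, 2))
```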

8.3 a. \(b_1= \left(\sum_{t=2}^{T}(y_{t-1}-\bar y_{-})(y_t-\bar y_{+})\right) /\left(\sum_{t=2}^{T}(y_{t-1}-\bar y_{-})^2\right)\), where \(\bar y_{+}=\left(\sum_{t=2}^{T}y_{t}\right)/(T-1)\) and \(\bar y_{-}=\left(\sum_{t=1}^{T-1}y_{t} \right)/(T-1)\).
b. \(b_0=\bar y_{+}-b_1 \bar y_{-}\).
c. \(b_0\approx \bar y \left[ 1- \left(\sum_{t=2}^{T}(y_{t-1}-\bar y_{-})(y_t-\bar y_{+})\right) /\left(\sum_{t=2}^{T}(y_{t-1}-\bar y_{-})^2\right) \right] \approx \bar y \left[1-r_1\right]\).

8.6 a. Because the mean and variance of the sequence do not vary over time, the sequence may be regarded as weakly stationary.
b. The summary statistics of the sequence are as follows: \[ \small{ \begin{array}{rrrrr} \hline \text{Mean}& \text{Median}& \text{Std}& \text{Minimum}& \text{Max}\\ 0.0004& 0.0008& 0.0064& -0.0182 & 0.0213\\ \hline \end{array} } \] Under the assumption of white noise, the forecast of an observation in the future is its sample mean, that is 0.0004. This forecast does not depend on the number of steps ahead.
c. The autocorrelations for lags 1 through 10 are shown below: \[ \small{ \begin{array}{lccccccccccc} \hline \text{Lag} & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10\\ r_k & 1.000 & -0.046 & -0.096 & 0.019 & -0.002 & -0.004 & -0.054 & -0.035 & -0.034 & -0.051 & 0.026\\ \hline \end{array} } \] Because \(|r_k/se(r_k)|<2\) (with \(se(r_k)=1/\sqrt{503} = 0.0446\)) for \(k=1,\ldots,10\), none of the autocorrelations is statistically significantly different from zero except for lag 2. For lag 2, the autocorrelation is \(0.096/0.0446 = 2.15\) standard errors below zero.

Chapter 11

11.1 a. The probability density function is \[ \mathrm{f}(y)=\frac{\partial}{\partial y}\mathrm{F}(y) =(-1)(1+e^{-y})^{-2}e^{-y}(-1)= \frac{e^y}{(1+e^y)^2}. \] b. \[ \mu_y = \int_{-\infty}^{\infty}\ y \mathrm{f}(y)dy = \int_{-\infty}^{\infty}\ y \frac{e^{y}}{(1+e^{y})^{2}}dy= 0. \] c. \[ \mathrm{E~}y^2 = \int_{-\infty}^{\infty}\ y^2 \mathrm{f}(y)dy = \pi^2 /3 . \] Since \(\mu_y = 0\), the standard deviation is \(\sigma_y = \pi / \sqrt{3}=1.813798.\)
d. The probability density function for \(y^{\ast \ast}\) is
\[\begin{eqnarray*} \mathrm{f}^{\ast}(y)&=& \frac{\partial}{\partial y}\Pr(y^{\ast \ast} \leq y) = \frac{\partial}{\partial y}\Pr(y^{\ast } \leq y \sigma_y + \mu_y) =\sigma_y \mathrm{f}(y \sigma_y) = \sigma_y \frac{e^{y\sigma_y}}{(1+e^{y\sigma_y})^2}. \end{eqnarray*}\]
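The moments in parts (b) and (c) can be verified by numeric integration of the density from part (a); a Python sketch:

```python
import numpy as np
from scipy import integrate

# Logistic density from part (a); the tails are negligible beyond |y| = 50.
f = lambda y: np.exp(y) / (1 + np.exp(y)) ** 2

total, _ = integrate.quad(f, -50, 50)                    # integrates to 1
mean, _ = integrate.quad(lambda y: y * f(y), -50, 50)    # 0, as in part (b)
ey2, _ = integrate.quad(lambda y: y**2 * f(y), -50, 50)  # pi^2 / 3

print(total, mean, np.sqrt(ey2), np.pi / np.sqrt(3))     # sd approx 1.8138
```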

11.3 Let \(\Pr(\varepsilon_{i1} \leq a) = F(a)=\exp(-e^{-a})\) and \(f(a)=\frac{dF(a)}{da}=\exp(-e^{-a})e^{-a}.\) Then

\[\begin{align*} \Pr(\varepsilon_{i2}-\varepsilon_{i1} \leq a) &=\int_{-\infty}^{\infty}F(a+y)f(y)\,dy=\int_{-\infty}^{\infty}\exp\left[-e^{-y}(e^{-a}+1)\right]e^{-y}\,dy\\ &=\int_{\infty}^0\exp(-zA)z \,d(-\ln z)=-\int_{\infty}^0\exp(-zA)\,dz\\ &= \frac{\exp(-zA)}{A}|_\infty^0=\frac{1}{A}=\frac{1}{1+e^{-a}}, \end{align*}\] with \(A=e^{-a}+1\) and \(z=e^{-y}\). Thus,

\[ \pi_i=\Pr (\epsilon _{i2}-\epsilon _{i1}<V_{i1}-V_{i2})=\Pr (\epsilon _{i2}-\epsilon _{i1}<\mathbf{x}_i^{\mathbf{\prime }}\boldsymbol \beta) =\frac{1}{1+\exp (-\mathbf{x}_i^{\mathbf{\prime }}\boldsymbol \beta)}. \]

11.5 From equation (11.5) we know that

\[ \sum\limits_{i=1}^{n}\mathbf{x}_i\left( y_i-\mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime}}\mathbf{b}_{MLE})\right) = \sum\limits_{i=1}^{n}(1~~x_{i1}~~\cdots~~x_{ik})^\prime \left( y_i-\mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime}}\mathbf{b}_{MLE})\right) =(0~~0~~\cdots~~0)^{\prime}. \]

From the first row, we get \(\sum\limits_{i=1}^{n} \left( y_i-\mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime}}\mathbf{b}_{MLE})\right) =0\). Dividing by \(n\) yields \(\overline{y} = n^{-1} \sum_{i=1}^n \widehat{y}_i .\)

11.7 a. The derivative of the logit function is \[ \frac{\partial}{\partial y}\mathrm{\pi}(y) = \mathrm{\pi}(y)\frac{1}{(1+e^y)}=\mathrm{\pi}(y)(1-\mathrm{\pi}(y)). \] Thus, using the chain rule and equation (11.5), we have \[ \begin{array}{ll} \mathbf{I}(\boldsymbol \beta) &= - \mathrm{E~}\frac{\partial ^{2}}{\partial \boldsymbol \beta\partial \boldsymbol \beta ^{\prime }}L(\boldsymbol \beta) = -\mathrm{E~} \frac{\partial }{\partial \boldsymbol \beta^{\prime}} \left( \sum\limits_{i=1}^{n}\mathbf{x}_i\left( y_i-\mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime }}\boldsymbol \beta)\right) \right) \\ &= \sum\limits_{i=1}^{n}\mathbf{x}_i \frac{\partial }{\partial \boldsymbol \beta^{\prime}} \mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime }}\boldsymbol \beta) =\sum\limits_{i=1}^{n}\mathbf{x}_i \mathbf{x}_i^{\prime} \mathrm{\pi}(\mathbf{x} _i^{\mathbf{\prime}}\boldsymbol \beta)(1-\mathrm{\pi}(\mathbf{x} _i^{\mathbf{\prime}}\boldsymbol \beta)). \end{array} \] This provides the result with \(\sigma_i^2 = \mathrm{\pi}(\mathbf{x}_i^{\prime} \boldsymbol \beta)(1-\mathrm{\pi}(\mathbf{x}_i^{\prime}\boldsymbol \beta))\).

b. Define \(\mathbf{a}_i=\mathbf{x}_i\left( y_i-\mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime}}\boldsymbol \beta)\right)\) and \(\mathbf{H}_i = \frac{\partial }{\partial \boldsymbol \beta^{\prime}}\mathbf{a}_i =-\mathbf{x}_i \mathbf{x}_i^{\mathbf{\prime}} \mathrm{\pi }^{\prime}( \mathbf{x}_i^{\mathbf{\prime}}\boldsymbol \beta).\)
    Note that \(\mathrm{E}(\mathbf{a}_i)=\mathbf{x}_i \mathrm{E}\left( y_i-\mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime}}\boldsymbol \beta)\right)= \mathbf{0}\). Further define \(b_i=\frac{\mathrm{\pi }^{\prime}( \mathbf{x}_i^{\mathbf{\prime}}\boldsymbol \beta)}{\mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime}}\boldsymbol \beta)(1-\mathrm{\pi }(\mathbf{x}_i^{ \mathbf{\prime}}\boldsymbol \beta))}\). With this notation, the score function is \(\frac{\partial }{\partial \boldsymbol \beta}L(\boldsymbol \beta) = \sum\limits_{i=1}^{n} \mathbf{a}_i b_i\). Thus, \[ \begin{array}{ll} \mathbf{I}(\boldsymbol \beta) & = - \mathrm{E} \left( \frac{\partial^2}{\partial \boldsymbol \beta ~ \partial \boldsymbol \beta ^{\prime}}L(\boldsymbol \beta) \right) = - \mathrm{E} \left( \frac{\partial}{\partial \boldsymbol \beta^{\prime}}\sum\limits_{i=1}^{n} \mathbf{a}_i b_i\right) \\ &= - \mathrm{E} \left( \sum\limits_{i=1}^{n} \left( \left(\frac{\partial}{\partial \boldsymbol \beta^{\prime}}\mathbf{a}_i \right) b_i + \mathbf{a}_i \frac{\partial}{\partial \boldsymbol \beta^{\prime}}b_i \right) \right) = - \sum\limits_{i=1}^{n} \left[\mathrm{E}(\mathbf{H}_i)b_i + \mathrm{E}(\mathbf{a}_i) \frac{\partial}{\partial \boldsymbol \beta^{\prime}}b_i \right] \\ &= - \sum\limits_{i=1}^{n} \mathbf{H}_i b_i = \sum\limits_{i=1}^{n} \mathbf{x}_i \mathbf{x}_i^{\mathbf{\prime}} \frac {\left( \mathrm{\pi }^{\prime}( \mathbf{x}_i^{\mathbf{\prime}}\boldsymbol \beta)\right)^2} {\mathrm{\pi }(\mathbf{x} _i^{\mathbf{\prime}}\boldsymbol \beta)(1-\mathrm{\pi }(\mathbf{x}_i^{ \mathbf{\prime}}\boldsymbol \beta))} . \end{array} \]

11.8 a(i). The plots are not presented here. \[ \small{ \begin{array}{lrrrrrrr} \hline & mean & median & std & minimum & maximum\\ \hline \text{CLMAGE} & 32.531 & 31.000 & 17.089 & 0.000 & 95.000\\ \text{LOSS} & 5.954 & 2.331 & 33.136 & 0.005 & 1067.700\\ \hline \end{array} } \] a(ii). Not for CLMAGE, but both versions of LOSS appear to differ by ATTORNEY. \[ \small{ \begin{array}{crrr} \hline \text{ ATTORNEY} & \text{CLMAGE} & \text{LOSS} & \text{lnLOSS} \\ \hline 1 & 32.270 & 9.863 & 1.251\\ 2 & 32.822 & 1.865 & -0.169\\ \hline \end{array} } \] a(iii). SEATBELT and CLMINSUR appear to be different, CLMSEX and MARITAL are less so.

\[ \small{ \begin{array}{ccccccccccc} \hline \text{ATTORNEY} & \text{CLMSEX} & & \text{MARITAL} & & & & \text{CLMINSUR} & & \text{SEATBELT} & \\ & 1 & 2 & 1 & 2 & 3 & 4 & 1 & 2 & 1 & 2 \\ \hline 1 & 325 & 352 & 320 & 329 & 6 & 20 & 76 & 585 & 643 & 16 \\ 2 & 261 & 390 & 304 & 321 & 9 & 15 & 44 & 594 & 627 & 6 \\ \hline \end{array} } \] a(iv). The numbers of missing values are shown below:

\[ \small{ \begin{array}{cccccc} \hline \text{CLMAGE} & \text{LOSS} & \text{CLMSEX} & \text{MARITAL}& \text{CLMINSUR} & \text{SEATBELT}\\ 189 & 12 & 16 & 41 & 48 & NA\\ \hline \end{array} } \] b(i). The variable CLMSEX is statistically significant. The odds ratio is \(\exp(-0.3218) = 0.7248\), indicating that women are about 72% as likely to use an attorney as men (equivalently, men are 1/0.72 = 1.379 times as likely to use an attorney as women).
b(ii). CLMSEX and CLMINSUR are statistically significant and SEATBELT is somewhat significant, as given by the \(p\)-values. CLMAGE is not significant. MARITAL does not appear to be statistically significant.
b(iii). Men use attorneys more often: the odds ratio is \(\exp(-0.37691) = 0.686\), indicating that women are 68.6% as likely to use an attorney as men.
b(iv). The logarithmic version, lnLOSS, is more important. In the final model without LOSS, the \(p\)-value associated with lnLOSS was very small (\(<2\times 10^{-16}\)), indicating strong statistical significance.
b(v). All variables remain the same except one of the MARITAL binary variables becomes marginally statistically significant. The main difference is that we are using an additional \(168\) observations by not requiring that CLMAGE be in the model.
b(vi). For the systematic component, we have \[ \begin{array}{lll} \mathbf{x}^{\prime}\mathbf{b}_{MLE} &= 0.75424 \\ &~~~ -0.51210* (\text{CLMSEX}=2)& + 0.04613*(\text{MARITAL}=2) \\ &~~~+ 0.37762*(\text{MARITAL}=3) &+ 0.12099*(\text{MARITAL}=4) \\ &~~~+ 0.13692*(\text{SEATBELT}=2) &-0.52960*(\text{CLMINSUR}=2) \\ &~~~-0.01628* \text{CLMAGE} &+ 0.98260*\text{lnLOSS} \\ &= 1.3312. \end{array} \] The estimated probability of using an attorney is \[ \widehat{\pi} = \frac {\exp (1.3312)}{1+\exp (1.3312)} = 0.791. \] c. Females are less likely to use attorneys. Claimants not wearing a seat belt (SEATBELT=2) are more likely to use an attorney (although the effect is not significant). Single claimants (MARITAL=2) are more likely to use an attorney. Insured claimants (CLMINSUR=2) are less likely to use an attorney. The higher the loss, the more likely it is that an attorney will be involved.
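A Python sketch of the b(vi) calculation; the covariate values used for the second, hypothetical claimant are illustrative and are not the case behind the 1.3312 above:

```python
import numpy as np

# Coefficients from the fitted logit in part b(vi): intercept, CLMSEX=2,
# MARITAL=2, MARITAL=3, MARITAL=4, SEATBELT=2, CLMINSUR=2, CLMAGE, lnLOSS.
b = np.array([0.75424, -0.51210, 0.04613, 0.37762, 0.12099,
              0.13692, -0.52960, -0.01628, 0.98260])

def inverse_logit(xb):
    """Estimated probability pi-hat = exp(xb) / (1 + exp(xb))."""
    return np.exp(xb) / (1 + np.exp(xb))

# The systematic component computed in the text.
print(inverse_logit(1.3312))           # approx 0.791

# Hypothetical claimant: single, insured male wearing a seat belt,
# age 30, with lnLOSS = ln(5).
x = np.array([1, 0, 1, 0, 0, 0, 1, 30.0, np.log(5.0)])
print(inverse_logit(x @ b))
```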

11.11 a. The intercept and the variables PLACE%, MSAT, and RANK are significant at the 5% level.
b(i). The success probability for this case is 0.482.
b(ii). The success probability for this case is 0.281.
b(iii). The success probability for this case is 0.366.
b(iv). The success probability for this case is 0.497.
b(v). The success probability for this case is 0.277.

Chapter 12

12.1 Taking the derivative of equation (12.2) with respect to \(\mu\) and setting it equal to zero gives the first-order condition \(\partial L(\mu)/\partial \mu=\sum_{i=1}^{n}(-1+y_i/\mu)=0\), whose solution is \(\hat\mu=\bar y\).
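A quick numeric check that the likelihood is maximized at the sample mean, for a hypothetical sample of counts:

```python
import numpy as np
from scipy.optimize import minimize_scalar

y = np.array([2, 0, 3, 1, 4, 2, 5])   # hypothetical count data

def neg_loglik(mu):
    # Poisson log-likelihood up to constants: sum(-mu + y * log(mu)).
    return -np.sum(-mu + y * np.log(mu))

result = minimize_scalar(neg_loglik, bounds=(0.01, 20), method="bounded")
print(result.x, y.mean())             # both approx 2.4286
```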

12.3 a. From the expression of the score equation (12.5), \[ \left. \frac{\partial }{\partial \boldsymbol \beta} L(\boldsymbol \beta )\right\vert _{\mathbf{\beta =b}}= \sum_{i=1}^{n}\left( y_i-\widehat{\mu }_i\right) \left( \begin{array}{c} 1 \\ x_{i,1} \\ \vdots \\ x_{i,k} \\ \end{array} \right) = \mathbf{0}. \] From the first row, we have that the average of residuals \(e_i = y_i - \widehat{\mu}_i\) is equal to zero.
b. From the \((j+1)^{st}\) row of the score equation (12.5), we have \[ \sum_{i=1}^{n} e_i x_{i,j}= 0. \] Because residuals have a zero average, the sample covariance between residuals and \(x_j\) is zero, and hence the sample correlation is zero.

12.5 a. The distribution of COUNTOP has a long tail and is skewed to the right. The variance (\(12.5^2 = 156.25\)) is much bigger than the mean, 5.67.
\[ \small{ \begin{array}{cccccc} \hline &1st & & &3rd & \\ \text{Minimum} & \text{Quartile} & \text{Median} & \text{Mean} &\text{Quartile} & \text{Maximum}\\\hline 0.00 & 0.00 & 2.00 & 5.67 & 6.00 & 167.00\\ \hline \end{array} } \] b. Yes, the tables suggest that most variables have a significant impact on COUNTOP.
c. The Pearson chi-square statistic is 55,044.
d(i). All of the variables appear to be very statistically significant.
d(ii). The coefficient of GENDER is 0.4197. Roughly, we would expect females to have 42% more outpatient expenditures than males.
d(iii). The chi-square statistic is 33,214, lower than the one in part (c) (55,044). This indicates that the covariates help with the fitting process. The statistical significance also indicates that the covariates are statistically significant, but the overdispersion is suspect; see d(iv).
d(iv). Now most of the variables remain statistically significant but the strength of statistical significance has decreased dramatically. It is not clear if the income variable is statistically significant.
e(i). All of the variables appear to be statistically significant. The income variable is perhaps the least important.
e(ii). The chi-square statistic is 33,660, higher than that of the Poisson model (33,214) but lower than the one in part (c) (55,044). This suggests that the two models fit about the same, with the Poisson having a slight edge. The AIC for the basic Poisson is 22,725, which is much higher than the AIC for the negative binomial (10,002). Thus, the negative binomial is preferred to the basic Poisson. However, the quasi-Poisson is probably as good as the negative binomial.
e(iii). From the output, the likelihood ratio test statistic is 18.7 - based on 4 degrees of freedom, the \(p\)-value is 0.000915. This indicates that income is a statistically significant factor in the model.
f. For GENDER, education, personal health status, anylimit, income and insurance, the models report the same sign and statistically significant effects. RACE does not appear to be statistically significant in the logistic regression model. For REGION, the signs appear to be the same although the statistical significance has changed.
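The Pearson chi-square statistic used throughout this answer is straightforward to compute once fitted means are available; a Python sketch with hypothetical counts and fitted values:

```python
import numpy as np

def pearson_chi_square(y, mu):
    """Pearson chi-square statistic for a fitted count model:
    sum over observations of (y_i - mu_i)^2 / mu_i."""
    return np.sum((y - mu) ** 2 / mu)

# Hypothetical counts and fitted Poisson means, for illustration only.
y = np.array([0, 2, 5, 1, 8, 0, 3])
mu = np.array([1.2, 2.5, 4.0, 1.5, 6.0, 0.8, 2.9])
print(pearson_chi_square(y, mu))
```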