<h1>Partitioning the Variability</h1>
<p> The squared deviations, \(\left( y_i-\overline{y}\right) ^2\), provide a basis for measuring the spread of the data. If we wish to estimate the \(i\)th dependent variable <em>without</em> knowledge of <em>x</em>, then \(\overline{y}\) is an appropriate estimate and \(y_i-\overline{y}\) represents the deviation of this estimate. We use \(Total~SS=\sum_{i=1}^{n}\left( y_i-\overline{y}\right) ^2\), the total sum of squares, to represent the variation in all of the responses. </p>
<p> Suppose now that we also have knowledge of an explanatory variable <em>x</em>. Using the fitted regression line, for each observation we can compute the corresponding <em>fitted value</em>, \(\widehat{y}_i = b_0 + b_1 x_i\). The fitted value is our estimate <em>with</em> knowledge of the explanatory variable. As before, the difference between the response and the fitted value, \(y_i-\widehat{y}_i\), represents the deviation of this estimate. We now have two &#8220;estimates&#8221; of \(y_i\): \(\widehat{y}_i\) and \(\overline{y}\). Presumably, if the regression line is useful, then \(\widehat{y}_i\) is a more accurate estimate than \(\overline{y}\).
To judge this usefulness, we algebraically decompose the total deviation as
\begin{equation}
\begin{array}{ccccc}
\underbrace{y_i-\overline{y}} & = & \underbrace{y_i-\widehat{y}_i} & + & \underbrace{\widehat{y}_i-\overline{y}} \\
\text{total} & = & \text{unexplained} & + & \text{explained} \\
\text{deviation} & & \text{deviation} & & \text{deviation}
\end{array}
\tag{2.1}
\end{equation}
Interpret this equation as &#8220;the deviation without knowledge of <em>x</em> equals the deviation with knowledge of <em>x</em> plus the deviation explained by <em>x</em>.&#8221; Figure 2.4 is a geometric display of this decomposition. To make the graph easier to read, the figure uses an observation that lies above the fitted regression line, so that each deviation is positive. A good exercise is to draw a rough sketch corresponding to Figure 2.4 with an observation below the fitted regression line.
</p>
<figure class="wp-caption aligncenter" style="max-width: 300px;" aria-label="Figure 2.4 Geometric display of the deviation decomposition."><img src="http://www.ssc.wisc.edu/~jfrees/wp-content/uploads/2015/04/F2ANOVADecomp.png" alt="Geometric display of the deviation decomposition" width="576" height="288" class="aligncenter size-full wp-image-3254" /><figcaption class="wp-caption-text">Figure 2.4 Geometric display of the deviation decomposition.</figcaption></figure>
<h2 style="text-align: center;"><i><strong>R Code for Figure 2.4</strong></i></h2>
<pre>
# Draw the fitted line yhat = b0 + b1 * x (here, the line y = x)
par(mar = c(2.2, 2.1, .2, .2), cex = 1.2)
x &lt;- seq(-4, 4, len = 101)
y &lt;- x
plot(x, y, type = "l", xlim = c(-3, 4), xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(1, at = c(-1, 1), lab = expression(bar(x), x))
axis(2, at = c(-1, 1, 3), lab = expression(bar(y), hat(y), y), las = 1)
abline(-1, 0, lty = 2)                 # horizontal line at height ybar
segments(-4, 1, 1, 1, lty = 2)
segments(-4, 3, 1, 3, lty = 2)
segments(1, -4, 1, 3, lty = 2)
segments(-1, -4, -1, -1, lty = 2)
points(1, 3, cex = 1.5, pch = 19)      # the observation (x, y)
# "unexplained" deviation, y - yhat
arrows(1.0, 1, 1.0, 3, code = 3, lty = 1, angle = 15, length = 0.12, lwd = 2)
text(1.3, 2.2, expression(y - hat(y)), cex = 0.8)
# "explained" deviation, yhat - ybar
arrows(1.0, -1, 1.0, 1, code = 3, lty = 1, angle = 15, length = 0.12, lwd = 2)
text(1.7, 0, expression(hat(y) - bar(y) == b[1] * (x - bar(x))), cex = 0.8)
arrows(-1, -1.0, 1, -1.0, code = 3, lty = 1, angle = 15, length = 0.12, lwd = 2)
text(0, -1.3, expression(x - bar(x)), cex = 0.8)
text(3.5, 2.7, expression(hat(y) == b[0] + b[1] * x), cex = 0.8)
</pre>
<p> Now, from the algebraic decomposition in equation (2.1), square each side of the equation and sum over all observations. After a little algebraic manipulation, this yields
\begin{equation}
\sum_{i=1}^{n}\left( y_i-\overline{y}\right) ^2=\sum_{i=1}^{n}\left( y_i-\widehat{y}_i\right) ^2+\sum_{i=1}^{n}\left( \widehat{y}_i-\overline{y}\right) ^2.
\tag{2.2}
\end{equation}
We rewrite this as \(Total~SS=Error~SS+Regression~SS\), where \(SS\) stands for sum of squares. We interpret: </p>
<ul>
<li> \(Total~SS\) as the total variation without knowledge of <em>x</em>,</li>
<li> \(Error~SS\) as the total variation remaining after the introduction of <em>x</em>, and</li>
<li> \(Regression~SS\) as the difference between the \(Total~SS\) and the \(Error~SS\), or the total variation &#8220;explained&#8221; through knowledge of <em>x</em>.</li>
</ul>
<p> When squaring the right-hand side of equation (2.1), we also obtain the cross-product term \(2\left( y_i-\widehat{y}_i\right) \left( \widehat{y}_i-\overline{y}\right)\). With the &#8220;algebraic manipulation,&#8221; one can check that the sum of the cross-products over all observations is zero. This result is not true for all fitted lines but is a special property of the least squares fitted line. </p>
<p> In many instances, the variability decomposition is reported through only a single statistic. </p>
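<p> The decomposition in equation (2.2), and the vanishing cross-product term, can be checked numerically in R. The small data set below is hypothetical, chosen only for illustration; the fitted values come from R's built-in <code>lm</code> least squares routine. </p>

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 11.9)

fit  <- lm(y ~ x)        # least squares fit
yhat <- fitted(fit)      # fitted values b0 + b1 * x

total_ss      <- sum((y - mean(y))^2)
error_ss      <- sum((y - yhat)^2)
regression_ss <- sum((yhat - mean(y))^2)

# Total SS = Error SS + Regression SS, equation (2.2)
all.equal(total_ss, error_ss + regression_ss)

# The sum of cross-product terms is zero for the least squares line
sum(2 * (y - yhat) * (yhat - mean(y)))
```

<p> Replacing <code>lm</code> by any other fitted line (say, a line drawn by eye) would generally leave a nonzero cross-product sum, so equation (2.2) would fail. </p>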
<div class="scbb-content-box scbb-content-box-gray">
<em>Definition.</em> The <em>coefficient of determination</em>, denoted by the symbol \(R^2\) and called &#8220;\(R\)-square,&#8221; is defined as \begin{equation*} R^2=\frac{Regression~SS}{Total~SS}. \end{equation*}
</div>
<p> We interpret \(R^2\) as the proportion of variability explained by the regression line. In one extreme case, where the regression line fits the data perfectly, we have \(Error~SS=0\) and \(R^2=1\). In the other extreme case, where the regression line provides no information about the response, we have \(Regression~SS=0\) and \(R^2=0\). The coefficient of determination is constrained by the inequalities \(0 \leq R^2 \leq 1\), with larger values implying a better fit. </p>
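<p> Continuing in R with the same style of hypothetical data, \(R^2\) can be computed directly from the sums of squares and compared against the value that <code>summary</code> reports for a fitted <code>lm</code> object: </p>

```r
# Hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 11.9)

fit <- lm(y ~ x)
total_ss      <- sum((y - mean(y))^2)
regression_ss <- sum((fitted(fit) - mean(y))^2)

# R-square as the proportion of variability explained
r2 <- regression_ss / total_ss

# Agrees with the value computed by summary.lm
all.equal(r2, summary(fit)$r.squared)
```
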