sysuse lifeexp, clear
* Using a log scale in a graph and using logged values of your
* data are two separate things.
generate lngnppc = log(gnppc)
scatter lexp gnppc, name(g1) title("Original Units")
scatter lexp lngnppc, name(g2) title("ln(gnppc)")
graph combine g1 g2, ycommon col(1)
* Drop both graphs so the names g1 and g2 can be reused below.
graph drop g1 g2
* If we compare a scatter plot of the original data with a
* scatter plot of the logged data, we see the data values
* are different, but we also see that the relative positions
* of the data points have changed: points closer to zero
* are farther apart from each other (more spaced out along
* the axis) while points farther from zero are closer to
* each other (less spaced out).
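* A quick numeric sketch of this compression (the values 100, 1000,
* and 10000 are illustrative, not taken from the data): equal ratios
* in the original units become equal differences after logging.
display log(1000) - log(100)
display log(10000) - log(1000)
* Both differences equal ln(10), about 2.30, even though the gaps in
* original units are 900 and 9000.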
*
* Now consider the original data on a logged scale. While the data
* are untransformed, the visual display has the same distortion
* as the logged data. You can imagine that the
* plotting canvas was rubberized, and has been stretched
* on the left and compressed on the right.
* (Note that it helps to pick the right graphing limits to see the
* equivalence between the two graphs.)
scatter lexp gnppc, name(g1) title("Original Units")
scatter lexp lngnppc, xlabel(6(0.5)10.5) name(g2) title("ln(gnppc)")
scatter lexp gnppc, xscale(log) name(g3) title("xscale(log)")
graph combine g1 g2 g3, ycommon col(1)
graph drop g1 g2 g3
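* A minimal sketch of one way to make the log scale easier to read:
* keep the data in original units and choose the tick marks yourself.
* The tick values below are illustrative assumptions, picked to be
* roughly evenly spaced on a log axis, and g3ticks is a hypothetical
* graph name, not part of the original example.
scatter lexp gnppc, xscale(log) ///
    xlabel(500 1000 2000 5000 10000 20000 40000) ///
    name(g3ticks) title("xscale(log) with labels in original units")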
* Combining both would be a mistake.
scatter lexp lngnppc, xscale(log) name(g4) title("ln(gnppc) and xscale(log)")
* To see why, consider these two regressions.
quietly regress lexp gnppc
predict lexphat
quietly regress lexp lngnppc
predict lexphat2
twoway (scatter lexp gnppc)(lfit lexp gnppc), name(g5) title("Original Units")
twoway (scatter lexp lngnppc)(lfit lexp lngnppc), name(g6) title("ln(gnppc)")
graph combine g5 g6, ycommon col(1)
* Each predicts a straight line when viewed on its own scale; however,
* when viewed on the same scale we see quite different models. The second
* model predicts that increasing the GNP by an order of magnitude has a steady,
* linear effect on life expectancy. The first model predicts that increasing
* GNP by an order of magnitude has a steadily increasing effect on life
* expectancy!
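* A small numeric sketch of this contrast, assuming the two models above.
* The GNP values 400, 4000, and 40000 are illustrative choices, not
* values taken from the original example. The regressions are rerun
* only so that _b[] refers to the intended model.
quietly regress lexp lngnppc
display "log model, any tenfold increase:  " _b[lngnppc]*log(10)
quietly regress lexp gnppc
display "linear model, 400 to 4000:        " _b[gnppc]*(4000 - 400)
display "linear model, 4000 to 40000:      " _b[gnppc]*(40000 - 4000)
* In the linear model the second tenfold step is ten times the first;
* in the log model every tenfold step has the same predicted effect.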
* As long as we attach the appropriate meaning to each scale, the difference
* is clear in either set of units.
twoway (scatter lexp gnppc)(lfit lexp gnppc)(line lexphat2 gnppc, sort), name(g7) title("Original Units")
twoway (scatter lexp lngnppc)(lfit lexp lngnppc)(line lexphat lngnppc, sort), name(g8) title("ln(gnppc)")
graph combine g7 g8, ycommon
* However, combining the use of logged data with a stretched plot area has
* no real use for us - it makes both effects appear curved.
twoway (scatter lexp lngnppc)(lfit lexp lngnppc, n(200))(line lexphat lngnppc, sort), xscale(log) name(g9) title("ln(gnppc) and xscale(log)")
twoway (scatter lexp lngnppc)(lfit lexp lngnppc, n(200))(line lexphat lngnppc, sort), yscale(log) name(g10) title("ln(gnppc) and yscale(log)")
graph combine g9 g10, xcommon
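* One way to see what the combined x-axis version does: a log scale
* positions points by the log of the plotted variable, so xscale(log)
* on lngnppc should place them as if we had plotted log(log(gnppc)).
* This is a sketch; lnlngnppc and g11 are hypothetical names that are
* not part of the original example.
generate lnlngnppc = log(lngnppc)
scatter lexp lnlngnppc, name(g11) title("ln(ln(gnppc))")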