### August 24, 2014 :: J. Bradford DeLong ###
Import the data as the data.frame Shiller, with headings, and perform elementary data manipulations...
```{r, echo=FALSE}
DATE = Shiller$DATE
REAL_PRICE = Shiller$REAL_PRICE
REAL_DIVIDENDS = Shiller$REAL_DIVIDENDS
REAL_EARNINGS=Shiller$REAL_EARNINGS
MA10_EARNINGS = Shiller$MA.10._OF_EARNINGS
CUMULATIVE_RETURN = Shiller$CUMULATIVE_RETURN
shift<-function(x,shift_by){
stopifnot(is.numeric(shift_by))
stopifnot(is.numeric(x))
if (length(shift_by)>1)
return(sapply(shift_by,shift, x=x))
out<-NULL
abs_shift_by=abs(shift_by)
if (shift_by > 0 )
out<-c(tail(x,-abs_shift_by),rep(NA,abs_shift_by))
else if (shift_by < 0 )
out<-c(rep(NA,abs_shift_by), head(x,-abs_shift_by))
else
out<-x
out
} #First: a lead-lag function...
LEAD10RETURN = (shift(CUMULATIVE_RETURN,120)/CUMULATIVE_RETURN)^(1/10)-1
LEAD20RETURN = (shift(CUMULATIVE_RETURN,240)/CUMULATIVE_RETURN)^(1/20)-1
CAPE = REAL_PRICE/MA10_EARNINGS
```
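To see what the lead-lag helper and the annualization do, here is a minimal sketch on a made-up index that grows exactly 0.5% per month. `lead_lag`, `toy_index`, and `toy_lead10` are hypothetical names introduced only for this illustration; `lead_lag` is a compact stand-in that follows the same convention as `shift` above (positive values lead, negative values lag).

```r
# Compact stand-in for shift(): positive k leads the series, negative k lags it.
# (Hypothetical helper name, kept separate so it does not overwrite shift().)
lead_lag <- function(x, k) {
  stopifnot(is.numeric(x), is.numeric(k), length(k) == 1)
  if (k > 0) {
    c(tail(x, -k), rep(NA, k))   # drop the first k values, pad the end with NA
  } else if (k < 0) {
    c(rep(NA, -k), head(x, k))   # pad the start with NA, drop the last |k| values
  } else {
    x
  }
}

lead_lag(c(1, 2, 3, 4, 5), 2)    # 3 4 5 NA NA
lead_lag(c(1, 2, 3, 4, 5), -2)   # NA NA 1 2 3

# Made-up cumulative-return index: 240 months growing 0.5% per month.
toy_index <- 1.005^(0:239)

# Same formula as LEAD10RETURN: index value 120 months ahead over the value
# today, converted to an annual rate.
toy_lead10 <- (lead_lag(toy_index, 120) / toy_index)^(1/10) - 1

toy_lead10[1]            # equals 1.005^12 - 1, about 0.0617
sum(is.na(toy_lead10))   # 120: the last ten years have no ten-year-ahead value
```

The trailing NA values are why the final ten (or twenty) years of the sample drop out of every regression on the forward returns.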
Make sure we have all the data...
```{r, echo=FALSE}
plot(DATE,REAL_PRICE, main="Real Stock Index Price", xlab="Date", ylab="Real Stock Index Price", pch=16, cex=0.5)
plot(DATE,REAL_DIVIDENDS, main="Real Stock Index Dividends", xlab="Date", ylab="Real Stock Index Dividends", pch=16, cex=0.5)
plot(DATE,REAL_EARNINGS, main="Real Stock Index Earnings", xlab="Date", ylab="Real Stock Index Earnings", pch=16, cex=0.5)
plot(DATE,MA10_EARNINGS, main="Cyclically-Adjusted Real Earnings", xlab="Date", ylab="10-Yr MA of Trailing Real Earnings ", pch=16, cex=0.5)
plot(DATE,LEAD10RETURN, main="Realized Ten-Year Returns", xlab="Date", ylab="10-Yr Forward Realized Annual Rate of Return", pch=16, cex=0.5)
plot(DATE,LEAD20RETURN, main="Realized Twenty-Year Returns", xlab="Date", ylab="20-Year Forward Realized Annual Rate of Return", pch=16, cex=0.5)
plot(DATE,CAPE, xlab="Date", ylab="Campbell-Shiller CAPE", pch=16, cex=0.5)
```
Everything as it should be so far?...
Now on to the analysis proper...
Let's start with the simplest possible forward-return regression: regressing the ten-year future realized return in the Campbell-Shiller stock index database on the Campbell-Shiller cyclically-adjusted earnings yield INVERSECAPE--the inverse of the CAPE:
```{r, echo=FALSE}
INVERSECAPE = 1/CAPE
return_regression_2.lm = lm(formula = LEAD10RETURN ~ INVERSECAPE)
summary(return_regression_2.lm)
```
The significance levels that R reports are wrong: its naive regression package assumes that each of the 1482 observed 10-year returns is independent of each of the others. They are not. Each monthly return shows up as a component in 120 10-year returns. The right t-value for the cyclically-adjusted earnings yield INVERSECAPE is not 27.3 but rather something between 4 and 5--still highly, highly significant.
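One crude way to see the overlap problem is to refit a regression on windows that do not overlap, keeping only every 120th month. The sketch below does this on simulated data, not the Shiller series. The random-walk setup and the names `sim_all`, `sim_sub`, `fit_all`, and `fit_sub` are assumptions made for illustration. Autocorrelation-consistent (Newey-West) standard errors are the more usual remedy, and nothing here is meant to reproduce the corrected t-value quoted above.

```r
set.seed(1)
n_months <- 1482
window   <- 120

# Simulated monthly log returns, with extra months to fill the last window.
monthly <- rnorm(n_months + window, mean = 0.005, sd = 0.04)
cum_log <- cumsum(monthly)

# Ten-year-ahead log return for each month. Adjacent values share 119 of
# their 120 monthly components, so they are far from independent.
idx <- seq_len(n_months)
sim_all <- data.frame(
  y = cum_log[idx + window] - cum_log[idx],
  x = cumsum(rnorm(n_months, sd = 0.01))  # persistent predictor, unrelated to returns
)

# Naive fit: treats all 1482 overlapping windows as independent observations.
fit_all <- lm(y ~ x, data = sim_all)

# Non-overlapping fit: keeps one ten-year window per decade.
keep    <- seq(1, n_months, by = window)
sim_sub <- sim_all[keep, ]
fit_sub <- lm(y ~ x, data = sim_sub)

nobs(fit_all)   # 1482
nobs(fit_sub)   # 13
summary(fit_all)$coefficients["x", "t value"]
summary(fit_sub)$coefficients["x", "t value"]
```

With 1482 months and 120-month windows there are only about a dozen non-overlapping ten-year returns, which is the intuition for why the naive t-value overstates the evidence.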
More important, a third of the variance in future 10-year returns is accounted for by knowing the value of INVERSECAPE. More important, the intercept is zero and the coefficient is 1. More important, the ability of the earnings yield to forecast future 10-year returns remains highly, highly significant. More important, you get these not just by knowing what INVERSECAPE is and then performing some linear transformation on it, but by just the INVERSECAPE itself. What this equation tells us is that, since 1881, 0 + 1 x INVERSECAPE is a remarkably good linear forecast of ten-year future returns.
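Read literally, the 0 + 1 x INVERSECAPE rule says the forecast ten-year real return is simply the reciprocal of CAPE. A small sketch of that arithmetic follows; `forecast_10yr` is a hypothetical helper and the CAPE values are chosen arbitrarily.

```r
# Linear forecast of the annual ten-year real return from the CAPE earnings yield.
# The defaults (intercept 0, slope 1) encode the rule of thumb described above.
forecast_10yr <- function(cape, intercept = 0, slope = 1) {
  stopifnot(is.numeric(cape), all(cape > 0))
  intercept + slope * (1 / cape)
}

example_cape <- c(10, 20, 25)   # arbitrary illustrative CAPE levels
forecast_10yr(example_cape)     # 0.10 0.05 0.04
```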
```{r, echo=FALSE}
plot(INVERSECAPE,LEAD10RETURN, main="Realized Ten-Year Forward Returns vs. CAPE Earnings Yield", xlab="CAPE Earnings Yield", ylab="10-Year Realized Forward Returns", pch=16, cex=0.5)
abline(lm(LEAD10RETURN ~ INVERSECAPE))
```
Note that this particular functional form for understanding how knowing CAPE should shape your forecast of future returns is not important. INVERSECAPE is convenient because it comes in the same units as returns. But regressing future long-run returns on CAPE itself does about as well. Submit:
```{r, echo=FALSE}
INVERSECAPE = 1/CAPE
return_regression_2.lm = lm(formula = LEAD10RETURN ~ CAPE)
summary(return_regression_2.lm)
```
The same 1/3 of variance accounted for. The same parameter at the median of the distribution. This formulation suggests that expected ten-year real returns turn negative at a CAPE value above 30--that offsetting the 3.3% real earnings yield at that point is an anticipated equal decline in valuation metrics over the next ten years. But the plot reveals that this prediction relies heavily on the linearity. Submitting:
```{r, echo=FALSE}
plot(CAPE,LEAD10RETURN, main="Realized Ten-Year Forward Returns vs. CAPE", xlab="CAPE", ylab="10-Year Realized Forward Returns", pch=16, cex=0.5)
abline(lm(LEAD10RETURN ~ CAPE))
```
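The CAPE level at which a fitted line predicts zero return is minus the intercept divided by the slope. The sketch below shows that calculation on made-up points that lie exactly on a line crossing zero at CAPE = 30. The intercept (0.15) and slope (-0.005) are illustrative numbers, not the estimates from the regression above, and `breakeven_cape` is a hypothetical helper.

```r
# Made-up points lying exactly on a line that crosses zero at CAPE = 30.
# None of the points is above 30, so the crossing is pure extrapolation.
toy <- data.frame(cape = c(5, 10, 15, 20, 25))
toy$ret <- 0.15 - 0.005 * toy$cape

toy_fit <- lm(ret ~ cape, data = toy)

# Solve intercept + slope * CAPE = 0 for CAPE.
breakeven_cape <- function(fit) {
  b <- coef(fit)
  unname(-b[1] / b[2])
}

breakeven_cape(toy_fit)       # 30, up to floating-point error
1 / breakeven_cape(toy_fit)   # about 0.033, the earnings yield at CAPE = 30
```

Because the break-even point sits outside the range of the toy data, it depends entirely on the straight line continuing to hold, which is the same fragility the plot shows.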
Basically what we know about expected returns is that on the one occasion when CAPE rose above 30, the dot-com crash of 2000 was in the near future and the housing crash of 2008 came into the ten-year return window. That is not much information on which to base a long-run "sell" decision.
Let's think about not what economists call risk--variation about expected returns--but what people call risk: the chance that your money won't be there in real terms. The lowest realized 10-year returns did indeed come when the CAPE was at its highest and thus the earnings yield INVERSECAPE was at its lowest. But the second-lowest returns happened when CAPE was not high but normal. And the other periods of negative realized returns happened when CAPE was high, but not that high. Plus there is a lot of mass of the distribution with both high CAPE and very healthy positive returns. What's going on?
```{r, echo=FALSE}
plot(DATE,LEAD10RETURN, main="Realized Ten-Year Forward Returns", xlab="DATE", ylab="10-Year Realized Forward Returns", pch=16, cex=0.5)
plot(CAPE,LEAD10RETURN, main="Realized Ten-Year Forward Returns", xlab="CAPE", ylab="10-Year Realized Forward Returns", pch=16, cex=0.5)
```
There are only four historical periods during which a ten-year investment in the S&P has not at least held its real value: ten years before the post-World War I deflation and the post-WWI depression of the start of the 1920s; (barely) in the Great Depression and the WWII inflation; 10 years before the stagflation of the 1970s and the subsequent Volcker depression; and 10 years before the recent financial unpleasantness for those dates where the ten-year return window includes both the dot-com and the housing-bubble crashes.
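Spells like these can be read off a return series mechanically by finding runs of consecutive negative values. The following is a minimal sketch on a short made-up series; `negative_spells`, `toy_date`, and `toy_ret` are hypothetical names, and the dates are assumed to be decimal years.

```r
# Find runs of consecutive negative returns and report when each starts and ends.
# Missing values are dropped first; this is safe when they sit only at the end
# of the series, as they do for a forward-looking return.
negative_spells <- function(date, ret) {
  ok   <- !is.na(ret)
  date <- date[ok]
  ret  <- ret[ok]
  runs   <- rle(ret < 0)               # run-length encoding of the sign
  ends   <- cumsum(runs$lengths)       # last index of each run
  starts <- ends - runs$lengths + 1    # first index of each run
  neg    <- runs$values                # TRUE for the negative runs
  data.frame(start  = date[starts[neg]],
             end    = date[ends[neg]],
             months = runs$lengths[neg])
}

# Made-up monthly data: two negative spells, then missing forward returns.
toy_date <- 1900 + (0:9) / 12
toy_ret  <- c(0.05, -0.01, -0.02, 0.03, 0.04, -0.01, 0.02, NA, NA, NA)

negative_spells(toy_date, toy_ret)   # two rows: spells of 2 months and 1 month
```

Applied to the series in this document, the call would be `negative_spells(DATE, LEAD10RETURN)`, assuming both vectors are defined as above.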
There is little more to be squeezed out of this particular data set.
```{r, echo=FALSE}
CAPESQ = CAPE^2 # squared CAPE term; it is not defined earlier in the document
return_regression_3.lm = lm(formula = LEAD10RETURN ~ INVERSECAPE + CAPE + CAPESQ)
summary(return_regression_3.lm)
```
All the data will say is that once the CAPE earnings yield is known there is absolutely no point in adding either CAPE or CAPE squared in the hopes of picking up some predictive ability via curvature, while the computer does have a (weak) preference for placing predictive weight on the yield if it is added to the regression.
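Whether extra curvature terms add anything beyond the yield can be framed as a comparison of nested models. The sketch below runs that comparison with `anova` on simulated data in which returns depend on the earnings yield alone. The data and the `sim_` names are assumptions for illustration, and the F-test treats observations as independent, so with overlapping ten-year returns it would overstate significance in the same way the naive t-values do.

```r
set.seed(2)

# Simulated data: returns are driven by the earnings yield plus noise only.
sim_cape <- runif(300, min = 5, max = 40)
sim <- data.frame(
  ret    = 1 / sim_cape + rnorm(300, sd = 0.03),
  inv    = 1 / sim_cape,
  cape   = sim_cape,
  capesq = sim_cape^2
)

fit_small <- lm(ret ~ inv, data = sim)                   # yield only
fit_big   <- lm(ret ~ inv + cape + capesq, data = sim)   # yield plus curvature terms

# F-test of the two added terms; the second row reports Df = 2.
cmp <- anova(fit_small, fit_big)
cmp
```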