Average Stock Market Returns Aren’t Average

The average investor in the stock market will earn less than the average stock market return, and this is true even without taking into account any behavioral biases. A reasonably diversified portfolio of stocks can expect to earn about 7% per year on average, so the expected payoff from investing $100 and holding for 30 years is $100*(1.07)^30 = $761.23. That expected payoff, however, is subject to a lot of uncertainty: even on a diversified portfolio the standard deviation is about 20% annually. Many people think that the uncertainty washes out when you buy and hold for a long period of time. Not so; that is the fallacy of time diversification. Although the average return becomes more certain with more periods, you don't get the average return, you get the total payoff, and that becomes more uncertain with more periods.

To illustrate, I ran 100,000 simulations of a 30-year stock market investment with a 7% mean return and a 20% standard deviation. The mean payoff across all 100,000 runs was $759.58 (recall the theoretical mean is $761.23, so we are spot on). But now consider the following: what percentage of runs would you guess lost money, i.e. had a total payoff after 30 years of less than $100?

After 30 years, 8.9% of all runs lost money! In terms of recent debates, (average) r>g does not mean that wealth accumulates automatically. Fortunes can be lost even when the averages are in your favor.

Perhaps even more surprisingly, what percentage of investors would you guess earned less than the average payoff of $761.23? An amazing 69.2% of investors earned less than the average. The median payoff in my simulation was only $446.85, so the median return was not 7% but 5.1%. The average investor earned less than the average return.
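For readers who want to replicate the experiment, here is a minimal sketch (not my actual code; it assumes i.i.d. normal yearly returns with the parameters above):

```python
import numpy as np

# Monte Carlo sketch of the experiment described in the post: i.i.d. normal
# yearly returns, 7% mean, 20% standard deviation, 30 years, $100 invested.
rng = np.random.default_rng(0)
n_runs, years, mu, sigma = 100_000, 30, 0.07, 0.20

returns = rng.normal(mu, sigma, size=(n_runs, years))
payoffs = 100 * np.prod(1 + returns, axis=1)        # terminal value of the $100

mean_payoff = payoffs.mean()                        # theory: 100 * 1.07**30 = 761.23
median_payoff = np.median(payoffs)
frac_lost = np.mean(payoffs < 100)                  # fraction ending below $100
frac_below_mean = np.mean(payoffs < 100 * 1.07**30)
print(mean_payoff, median_payoff, frac_lost, frac_below_mean)
```

With a different random seed the exact figures will wander a bit, but the qualitative picture (median well below mean, most runs below the mean) is stable.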

The point is subtle and widely misunderstood. Here’s a simple example. Suppose that the average return is 10%. If $100 is invested for two periods the average payoff is $100*(1.1)^2=$121. But on average that is not what happens. More typically, you get say 0% in the first period and 20% in the second period, i.e. $100*(1.0)*(1.2)=$120. Notice that the average return is exactly the same, 10%, but the total payoff is smaller in the second and more realistic case–an application of Jensen’s inequality–so the average investor earns less than the average payoff. The difference here is only $1 but over 30 years that seemingly small difference accumulates.
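The two-period arithmetic can be checked directly; the s^2 term below is the "volatility drag" that Jensen's inequality produces (m and s are just illustrative numbers):

```python
# Same 10% average return, different payoffs: volatility drags down the total.
steady   = round(100 * 1.10 * 1.10, 2)   # 10% then 10% -> 121.0
volatile = round(100 * 1.00 * 1.20, 2)   # 0% then 20%  -> 120.0
print(steady, volatile)

# In general (1 + m - s)*(1 + m + s) = (1 + m)**2 - s**2, so any symmetric
# spread s around the same mean return m lowers the two-period payoff by s**2.
m, s = 0.10, 0.10
drag = (1 + m)**2 - (1 + m - s) * (1 + m + s)
print(drag)                              # ~ s**2 = 0.01
```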

If most investors earn less than the average it follows immediately that a few must earn more than the average. Lady luck is a bitch, she takes from the many and gives to the few. Here is the histogram of payoffs. The right-hand tail is long. Indeed, I am only showing part of the tail as there were payoffs as high as $25,000. Most investors earn less than the mean payoff.


And here is a line plot showing the portfolio accumulation over time for a sample of 10,000 runs. Note two things. First, the variance of the total payoff is increasing over time and second, the total payoff is highly right (upper)-tailed.


Addendum: There is some evidence that stock market returns are mean reverting, as makes sense if discount factors are mean reverting. Taking mean-reversion into account would moderate the numbers somewhat but would not change the qualitative results. Moreover, we don’t have many independent 30-year data points so in my view we shouldn’t put too much weight on mean-reversion.


So you're saying the median is not the average in an open-ended distribution...

Okay. I viewed this post as banal and wouldn't have bothered to read it if the title had been "Median Stock Market Returns Aren't Average", but based on others' responses, the post obviously has utility to them even if I wasn't the target audience.

Perhaps what bothered you was not the topic but the length of the post. The investment return distribution is not a normal one; it's highly skewed. Also, you can have negative values, unlike IQ.

Returning to the Piketty issue, the model results show that having lots of capital today does not assure that you'll be richer, or still rich, in 30 years. Actually, the model results reconcile opposing views in the debate.

IQ as usually defined (normal distribution, with mean 100 and standard deviation 15) can go negative. You just have to be about 7 standard deviations below the mean. There are as many people with IQ -1 as there are with IQ 201.

Is this true? Is the brain even functional at a level of IQ that low?

Have you ever heard someone say that when you are young you should invest in stocks because even though they are risky you will have time to cancel out the ups and downs? Sure you have; even someone as smart as Burton Malkiel has made this argument. Alex's post shows that this common argument is wrong.

It's still good advice, because young people have a lot of their lifetime earnings ahead of them, which are effectively allocated to "cash" today. You want to take more risk on whatever you can invest, in order to maximize return. This applies to savings for "retirement", not savings to provide liquidity in the event of unexpected hard times.
Older people have less tolerance for risk because all of their earnings are behind them, and so volatility in returns has a greater impact on the pool of money they have to consume out of in retirement.

Excellent point about risk tolerance being related to the PV of future earning potential. Young people need to understand this.

Retirement is another whole ball of wax for the reasons you state, plus company matches on 401ks.

Samuelson argued that one should hold a constant portion of one's capital in equities over the entire life cycle. If 60% equities is right for you, you should hold that percentage both when young and when old. However, a person's total "capital" consists not only of financial capital but also human capital -- future earnings. Human capital is very "bond like". Therefore the traditional advice of being heavily in equities at younger ages still holds. Indeed, since people rarely have much financial capital at all at early stages of their career, they don't have the capacity to hold enough equities to truly follow Samuelson's advice.

Ironically, young people need to hold far more equities than they are capable of in order to balance their high level of bond-like human capital. Old people, who are more likely to have the financial capacity to hold equities actually need to transition to bonds in order to balance their dwindling human capital. Defined benefit pensions, despite flaws, actually help overlapping generations of workers achieve such a balance.

Alex's post shows no such thing.

The chance of a 20-something who regularly invests in cash and bonds outperforming one who does the same into diversified equities is so close to zero you couldn't measure the difference.

Best MR post I've ever seen, thank you.

Gives you a warm fuzzy feeling about all those government pensions that have assumed an X% return, doesn't it? X was always way too high, and now X doesn't even mean X.

I worry about all the people who plan for retirement based on a fixed investment return...

The question isn't whether an investment meets your (unreasonable, ignorant, misinformed, biased) expectations. The question is whether investing in the stock market is the best use of your funds. Yes, this tips the balance away from stocks, but the results here apply to all investments.

Because of this, you might allocate more to present consumption or more to risk free securities. But if everyone does that, prices rise and treasury yields fall.

He's equivocating on the meaning of the word "average." In the initial sense, he means expected average annual returns. Later, he means expected compound annual growth rate.

The correct way to view this is to look at expected compound annual growth rate at the beginning, and then make your optimal allocation.

Naive question: Isn't the 7% figure you used something like a CAGR? For running a simulation like you did (assuming you picked random yearly returns from the distribution) shouldn't you be using an average yearly return which would be quite a bit higher?

You're not naive. You're thinking about it correctly.

Can someone address this? Seems like a good point.

Using online historical datasets I see an 8.3% "average" rate of return (not CAGR) & an SD of 19% based on inflation-adjusted returns from 1883 to 2013. http://www.moneychimp.com/features/market_cagr.htm

If I simulate using these I get a mean payoff of $1096, a median payoff of $695 and only 3% lost money.

Now which of these simulations (mine or Alex's) is fundamentally the right way of simulating the real situation?

PS. I could replicate Alex's mean / median & dud percentage; so hopefully it isn't buggy code. Or at least we made the same blunders. :)

Wouldn't simulating historical data involve actual simulated buying and selling of stocks at the price at a given time? The historical data gives you the price-time data. A $100 investment means that you actually buy certain stocks, and as the stock prices rise and fall, or stocks even disappear, you make adjustments in your portfolio by buying and selling to match whatever strategy you choose to mirror the market. You can include the fees and costs involved with the transactions, whatever the costs were at the time.

With $100 how could you structure a portfolio that represents the values of the market? I know there are funds that you can buy into, but that also implies management fees.

Tried randomly sampling from actual historic yearly returns to preclude having to assume a normal distribution on returns. That gets me a mean payoff of $1172 & a median payoff of $740 with only 2.5% losing money.

Another variation I tried was to randomly sample historical data for 30-year eras while maintaining internal sequential returns, in the hope of retaining the year-to-year correlation & features of the time series, if any. That gave me a very non-intuitive & surprising result of a mean / median of $760 & $633 but nobody losing any money.
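Here's a rough sketch of the two sampling schemes, with a synthetic series standing in for the historical data (the actual 1883-2013 dataset isn't reproduced here, so the numbers are illustrative only):

```python
import numpy as np

# Stand-in for ~130 years of real yearly returns (8.3% mean, 19% SD assumed).
rng = np.random.default_rng(1)
history = rng.normal(0.083, 0.19, size=130)

# Scheme 1: i.i.d. bootstrap -- 30 years drawn independently, with replacement.
iid_payoff = 100 * np.prod(1 + rng.choice(history, size=30, replace=True))

# Scheme 2: a contiguous 30-year window, preserving any year-to-year
# correlation in the series. Note only len(history) - 29 distinct windows
# exist and they overlap heavily, which narrows the spread of outcomes.
start = rng.integers(0, len(history) - 29)
block_payoff = 100 * np.prod(1 + history[start:start + 30])
print(iid_payoff, block_payoff)
```

The heavy overlap among contiguous windows in Scheme 2 may be part of why the "era" simulation looks so different: the draws are far from independent.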

Not sure; likely there is a coding bug. FWIW, I've posted my R code at:


Why are you simulating the returns? 30 years of normal log-returns is still a normal distribution of log-returns. You just add the variances. So the standard deviation of log-returns over 30 years is 0.2*sqrt(30), about 110%.
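A quick numeric check of that claim (assuming, as this comment does, that the 20% figure is the SD of yearly log-returns; the 5% mean is just illustrative):

```python
import numpy as np

# Sum of 30 i.i.d. normal log-returns: variances add, so the 30-year
# log-return has standard deviation 0.2 * sqrt(30), about 1.10.
rng = np.random.default_rng(0)
log_r = rng.normal(0.05, 0.2, size=(100_000, 30))
total_sd = log_r.sum(axis=1).std()
print(total_sd, 0.2 * np.sqrt(30))
```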

Did you read the byline, George? Karsten even says "Alex,"

Art, I don't know you, but I do know Cowen, and I would bet $100 that anything you can do, he can do better.

I'd take that bet, based only on my personal experience of GMU faculty, and Prof. Cowen being a member of said faculty. Remember, based on 'that anything you can do' - no one GMU faculty member is really all that extraordinary in even three or four fields of human endeavor.

I took that into account. Tyler is literally the smartest person I've ever met, and then some.

(Okay, Tyler did once praise Metallica's black album, so I concede that he is not perfect. But apart from that...)

The bet would cover Tyler specifically, not GMU faculty generally (who are, on the whole, nevertheless much smarter than me).

7% is the average real return, I believe, unless you are ignoring reinvested dividends.

The slight adjustment to the model is the cost of reinvestment, but remember that the price of a stock goes down by its dividend payment. But for many funds that need to pay out cash, the dividends provide that opportunity without the need to sell (incurring trading costs), so this might also be beneficial. Overall, it's a perturbation associated with the investor, easy to ignore for the sake of making the point.


Very interesting.

4 questions:
a) Does this bias on the median vs. average increase with volatility and time horizon?
b) Does this mean the benchmark equity premium of 6% is wrong? That would imply that most DCF-based valuation is wrong.
c) If the equity premium is indeed lower, the discount rate should be lower, and that would imply that asset prices should be higher? I might be wrong on that, as it is counterintuitive: prospective lower (median) returns should imply lower asset prices, not higher (I can see how this relates to FX dynamics in the uncovered interest rate parity of the traditional Dornbusch model).
d) What would this line of reasoning imply for bond returns? (If the bias is increasing in volatility you should hold more bonds...)

This is a little misleading. This measures the potential different outcomes of an investment made at one point in time after 30 years of compound returns. It does not measure inequality of returns, unless you can show that there is a 20% standard deviation to the expected return for all investors in a given year, as opposed to the same investor over different years.

This is a good point, but your reframing isn't quite correct either.

There is some variance in returns for individual investors in any given year. I suspect it's higher than the variance of the total market over different years, but there's probably a measurement out there in the literature. I'd be interested to know. A large fraction of people carry significant idiosyncratic risk in their portfolios (say, because they have company stock or pick actively managed funds or whatever), and sometimes that helps and sometimes it hurts. We may all average out to the market, but it's not like everyone buys the Vanguard Total Stock Market Index and then forgets about it. So that probably introduces more variance.

But then, extending your original point, even people in different years share many years of market in common. Alex mentions that in his post. If you invested in year 1 and I invested in year 2, and we both have investment lifetimes of 30 years, we share 29 years of market exposure.

Then again, there are other things left out in the model, such as that people save over time, they don't just fire one saving bullet in year 0 and withdraw it in year 30. Also, the composition of portfolios changes with time as people grow more conservative (and seek to minimize regret) while approaching the withdrawal phase. Large inheritors are probably closer to the "save once into a market portfolio" model than the normal retirement saving model, insofar as they are more vulnerable to which year they draw to begin their saving life. But, and this is just a guess, they may have less idiosyncratic risk in their portfolios. Maybe that's wrong.

> they may have less idiosyncratic risk in their portfolios

The intuition behind this is that as you have more to invest, you are compelled to put more of it in the market. So rich people probably have a beta closer to 1.

Middle class folks are less directly tied to the capital markets, as a larger fraction of their wealth is in things like land, employment skills, future government benefits, and physical capital. Bill Gates doesn't have a $50b house. Bob Gates, who lives down the street, may have $250k in home equity, a $100k IRA, an expectation of Social Security, decent employment prospects as an accountant, and two cars.

Your point about overlapping time horizons in real world returns is a good one, as is your point about most investing happening in stages over a certain period of accumulation.
I'd also add that the existence of a second asset class with less than 100% covariance can help the overall distribution of returns. Invest in stocks and bonds, and rebalance the weights annually, and the volatility drag should be reduced. This is especially true to extent returns are mean-reverting.
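A sketch of that rebalancing effect (the bond parameters here, 3% mean and 5% SD with zero correlation, are assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_runs, years = 100_000, 30
stocks = rng.normal(0.07, 0.20, size=(n_runs, years))
bonds  = rng.normal(0.03, 0.05, size=(n_runs, years))

# Rebalancing to fixed 60/40 weights each year means the portfolio earns
# 0.6*R_stock + 0.4*R_bond every year.
all_stock  = 100 * np.prod(1 + stocks, axis=1)
rebalanced = 100 * np.prod(1 + 0.6 * stocks + 0.4 * bonds, axis=1)

for name, w in [("all stock", all_stock), ("60/40", rebalanced)]:
    print(name, w.mean(), np.median(w), w.mean() / np.median(w))
```

The mean/median ratio (one gauge of the volatility drag) should come out noticeably smaller for the rebalanced portfolio, at the cost of a lower mean.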

The ratio of the mean to the median in your simulation was 1.7, giving a log ratio of 0.53. The theoretical log ratio is t*sigma^2/2, or 54%, so it matches well.

If instead one simulated an annuity model, i.e. an investor investing say $100 per year for 30 years, what's the theoretical shape of the return distribution you'd expect?

Reason I ask is for that scenario I get a very symmetrical return distribution in my simulation with mean & median almost coinciding. Is that to be expected?

No, it doesn't match. You are thinking in terms of lognormal prices: W = exp( sum r_i ), while Alex is using simple products, W = product( 1 + R_i ). These result in different distributions for W. They have the same mean, but a different median.

In Alex's formulation, E(R_i) = .07 and var(R_i) = sig^2 = .04. For Alex, E(W) = 1.07^30 = 761. He gets a median of 447 by simulation. I don't see off hand how to calculate his median analytically.

In your formulation, r_i is normal with variance sig^2 = .04. Requiring that average wealth be the same as in Alex's formulation requires E(exp(r_i)) = exp( E(r_i) + var(r_i)/2 ) = 1.07. Therefore E(r_i) = log(1.07) - sig^2/2.
So for you, E(W) is also equal to 1.07^30 = 761 by construction. But in your case the median can be easily calculated: the exponential is monotonic, so to find the median of W we just need to find the median of sum r_i and exponentiate it. Since the r_i are gaussian, their sum is also gaussian; the median of a gaussian equals the mean, which in this case equals 30*( log(1.07) - sig^2/2 ). So the median of W for you is 1.07^30 * exp(-30*sig^2/2) = 761 * .549 = 418, which is smaller than Alex's number.

Here is the R code to simulate the two cases (I am sure the formatting will be unreadable):

n <- 100000
w <- numeric(n)
wk <- w
for (i in 1:n) {
  tmp <- rnorm(30, 0, .2)
  Tret <- tmp + 1.07             # alex: gross returns, mean 1.07
  retk <- tmp + log(1.07) - .02  # karsten: log-returns, .02 = sig^2/2
  w[i]  <- cumprod(Tret)[30]     # alex: product of gross returns
  wk[i] <- exp(sum(retk))        # karsten: exp of summed log-returns
}


So some of the format was lost - all of the brackets. Quick changes to the above:

In Alex's formulation, E(R_i) = .07

For Alex, E(W) = 1.07^30 = 761

...in Alex's formulation requires E(exp(r_i)) = exp( E(r_i) + var(r_i)/2) = 1.07

...Therefore E(r_i) = log(1.07) – sig^2/2

If you look at historical returns of the S&P, there has been no period of 25 years or longer with real negative returns http://visualizingeconomics.com/blog?tag=Stocks.
What we do see is that the market cap of the S&P over long periods of time increases with GDP per capita, so it is no more random than the GDP per capita growth rate averaged over business cycles.

That's true. But it's also the kind of thing that's true until it isn't.

See also: S&P 500 has never gone down two years in a row (until it did), and the US has never had housing prices collapse across the country.

It may be some day that there will be a 30 year period when the real return on stocks will be negative, but Alex is claiming that it will happen 8% of the time, and that has not been true in the US over the last 100 years, so it is reasonable to conclude that the model is wrong, not the data. We know the returns are small or even negative during economic contractions, and, even more important, p/e ratios fall with inflation. If you believe that returns are random, as Alex assumes, then you must believe that the GDP growth rate and the inflation rate are random as well.

You can find long periods of negative real U.S. equity returns in the 19th century. If anything should keep a buy-and-hold U.S. equity investor up at night, it's survivorship bias.

Losers are removed from the S&P500 before they are total losers, while winners are added in many cases right at the beginning of their win-win-win.

If you invest in the entire market, a fraction of all shares, and everyone else does as well, then the returns will be the same for everyone, but probably around GDP growth.

If you invest along with everyone else in the same index fund, then your returns will be the same as everyone else, but no new firms will be created and none will disappear.

Index funds like Vanguard's are for the naive, who invest in the mediocre and better, while millions of other investors invest in total losers, sometimes getting a shooting star, sometimes not and thus losing their shirts, plus others who buy stock and hold and sometimes get to sell at the top, but other times can never sell because it's worthless.

Here is your point in a simpler way (all figures log). If returns are normal with mean r and standard deviation sigma then the expected return at time t is rt but the median return is rt - sigma^2*t/2. So in your case the annual median return is 0.2^2/2 = 2% less than the mean.
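A numeric check of this formula, treating sigma = 0.2 as the SD of log-returns and calibrating so that mean wealth grows at r per year:

```python
import numpy as np

# With log-returns calibrated so mean wealth grows at r per year, the
# median grows at r - sigma**2/2, i.e. about 2% per year slower here.
r, sigma, t = 0.07, 0.20, 30
rng = np.random.default_rng(0)
log_r = rng.normal(r - sigma**2 / 2, sigma, size=(200_000, t))
wealth = np.exp(log_r.sum(axis=1))

mean_rate = np.log(wealth.mean()) / t        # ~ r = 0.07
median_rate = np.log(np.median(wealth)) / t  # ~ r - sigma**2/2 = 0.05
print(mean_rate, median_rate)
```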

Some related data and tools:

Remapping S&P 500 Performance Since January 1871 - we calculated the best, worst and average rate of return for investments in the index for investment periods of anywhere from 1 month to 130 years in duration.

The S&P 500 at Your Fingertips - our tool that lets you find the rate of return in the S&P 500 between any two calendar months since January 1871 - both with and without reinvesting dividends and with and without the effect of inflation.

Investing Through Time - our tool that determines how your investment in the S&P 500 (or its predecessor indices and components) would have fared over its history. (You'll get the worst case scenario investment outcomes if you set June 1932 to be the end of your hypothetical investment period.)

What "Reverting to the Mean" Means - stock prices aren't as random as you think.

'Fortunes can be lost even when the averages are in your favor.'

Well, in the stock market. As if any multi-millionaire/billionaire invests exclusively in the stock market - if only out of tax minimization strategies.

But hey, why talk about actual investing practices when we can create stories that elide such strategies completely, right?

I'm tempted to place a link to a certain free market center's planned giving page to illustrate how that works, but knowing the result of such linking, I guess the enterprising donor will need to figure it out on their own.

I had a similar reaction.

The very rich diversify not just across equities but across asset classes: real estate, businesses, commodities, currencies, stocks, bonds, artwork, and more.

I should qualify that my reaction was based on someone else's mention of Piketty, which I (perhaps wrongfully) assumed was the larger point Alex was driving at.

I understand that this is a simulation but I must be missing something because I don't see any returns less than zero. Is the assumption that all returns in the long run are positive and only when factoring in inflation and loads do investors lose money in the market? That doesn't make a lot of sense.

That said, using standard measures of central tendency such as the mean or median for such a hugely skewed distribution of returns is misleading, bad statistics. A better assumption is that the distribution is lognormal and therefore needs a lognormal estimate of the mean.

It is well known, however, that stock returns do not fit lognormal assumptions, particularly with respect to the tails. A Cauchy would be a better assumption for tail behavior, but the Cauchy is undefined or infinite for measures of central tendency.

Without a representative distribution of returns reflecting losses as well as gains, not much is to be gained by going to other, extreme valued distributional assumptions.

There are returns less than zero - in the line plot the lines often go down. The total payoff is floored at zero. In Alex's simulation you can only end up broke, not in debt. It looks like that is the outcome for many as the bottom right corner of the line plot is filled in.

You have to squint pretty hard to see the large number of lines that go from the initial $100, downward. Thanks to the few outsized positives, the distance is a tiny fraction of the chart.

And you can't go to absolutely $0.00 by losing a percentage of your money each year; returns can't be much worse than -99.9%. And then the next year, your $0.10 can only go to $0.0001.


"Suppose that the average return is 10%. If $100 is invested for two periods the average payoff is $100*(1.1)^2=$121. But on average that is not what happens. More typically, you get say 0% in the first period and 20% in the second period, i.e. $100*(1.0)*(1.2)=$120. Notice that the average return is exactly the same, 10%,"

The arithmetic average return is exactly the same. The geometric average return is not the same. If there is any variance in returns across periods the geometric mean will be less than the arithmetic mean. The higher the variance in returns from one period to the next, the larger that difference will be. Any time you are dealing with data where you are interested in ending values over a long horizon (i.e., GDP growth rates, growth in income, investment returns) the arithmetic mean is not a useful tool. For example, a 100% return in period 1 followed by a negative 50% return (loss) in period 2 ( 100*(2)*(0.5) = $100 ) produces an arithmetic average return of 25% even though the value of the investment at the end of two periods is the same as it was at the start. In this case the true return, i.e. the geometric average return, is 0%.

Alex's point is important because many academics, including economists, draw misleading conclusions from using arithmetic means in these situations. Financial practitioners understand this well but often use it to suit their purposes. When a mutual fund reports its ten year average returns in marketing materials, do you think they report the higher arithmetic average or the lower geometric average return?
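The +100% / -50% example in one small check (the arithmetic mean of +100% and -50% is 25%, while the geometric mean is 0%):

```python
import math

returns = [1.00, -0.50]   # +100%, then -50%: $100 -> $200 -> $100

arith = sum(returns) / len(returns)                                  # 0.25
geom = math.prod(1 + r for r in returns) ** (1 / len(returns)) - 1   # 0.0

print(arith, geom)
```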

Crap, should have read your post first. I assumed people were aware that arithmetic means made no sense in that context, not to mention economists. Guess I was mistaken.

Trust your intuition more than Bill, would be my advice.

Well said. Many financial planners and other investment professionals with credentials understand well the relationship between arithmetic return and geometric return under different volatility conditions.

Future projections using arithmetic returns are harder and harder to find unless you look at small shops that are primarily marketers. Most use geometric returns projections based on historical geometric returns and then tweak a bit to suit their views.

Even most mutual fund reporting that I see uses geometric return, however, so maybe a bit more credit is due there. The big fund rating houses probably forced this to some degree.

Do you think using geometric returns instead of arithmetic averages completely erases the result? The benchmark equity risk premium of 6% was calculated with geometric averages, right?

The standard reporting for mutual funds is 'compound annual growth rate' (CAGR), which is a geometric return.

Interestingly, public sector pension plans, which discount liabilities based on expected asset returns (dubious itself, but...) are often unclear as to whether the assumed return is based on a geometric or arithmetic average. Given that they generally are assuming annual returns of 7-8% on mixed portfolios, I can take a guess. I even heard one actuary recently defending the use of arithmetic averages here (facepalm.)

Of course, a higher assumption is better for everyone...except tomorrow's taxpayer.

But the reason for privatizing Social Security is that the returns from investing in Wall Street are higher than SS, so government pensions replace SS by not paying the retirement FICA and pouring that into the pension fund... unless tax cuts and a recession cut revenue, so the pension fund contribution gets deferred. If public employees were on SS, the State would be required to pay FICA by cutting roads, hiking taxes, or releasing people from prison.

If you argue for private retirement savings instead of SS, then the employer pays you the FICA until the economy is crap and cuts your pay, so you cut your contribution to retirement savings.

If you force 13% of wages to be put in a retirement fund, invested in expert-approved funds with no withdrawal until age 67 or 70, then individuals end up with less money at age 70 than they put in, because they likely react the wrong way in a year like 2008 at age 65.

Unless you simply decide to execute those who run out of money, you end up with a lot of people who don't have the means to live. Even with SS, 20% hit retirement with too few assets, can't live on the SS benefit, and end up on welfare, and they may have worked 45-50 years at 50-60 hours a week, just doing things that pay little, like caring for the disabled, sick, and elderly. But also people who were very successful who got taken in by Madoffs.

Dude, I'm extremely pro-Social Security.

The implication is that the small number realizing outsized returns are playing by different rules than the majority. It's almost as if the game is rigged!

It's not rigged. It's a researcher confusing the mathematics of how average return and geometric return are different if there is non-zero volatility. There is nothing revealed in the original article.

Exactly. The same intuition can apply to any risky investment.

Tempest in a teapot.

Are you certain you haven't identified a feature of your model, rather than a feature of the stock market/investors? Your conclusion seems highly dependent on the distribution of the returns, as you allude to in your mean-reversion addendum. If we know anything about the stock market it is that its distribution does not fit the clean models we have used to fit it; just ask options traders from 1987.

Isn't that exactly the point? If you could structure a portfolio that mirrored the market in proportion, and maintained it over thirty years, theoretically it would match the returns. In reality the buying and selling required to maintain the mirror image could make a large difference, especially over a long time frame. As you sell some stocks and buy others, the timing would have to match the market precisely to capture the gains and avoid the losses that a delay may incur. Other considerations, such as the need to draw profits or increase the size of the investment, would be affected by timing.

There's a problem with the simulation. A 20% standard deviation in each year, completely unrelated to past years, and without any other constraints, doesn't really model the market.

For instance, if the market experiences a steep decline in year N I'd wager it's likely to see below-average gains in year N+1. Vice versa if the market sees extraordinary gains in year N (i.e. recovering from a recession, present recovery notwithstanding).

If we think the past is any guide (standard disclaimers apply) then a more interesting simulation would be to do 100,000 runs where an investor pursues a buy-and-hold-for-30-years strategy and each run chooses the start date randomly between 1954 and 1984. In other words, if you picked a random 30-year investment period using real, historical market data, what are the odds you end up in the red? What are the odds you under-perform the "expected" 7% annual return? Etc.

Obviously your return is highly influenced by the state of the market when you start investing, and the above simulation models a completely naive investor who starts his buy-and-hold without any consideration of market conditions. If he buys at the nadir of a recession then he's statistically more likely to come out ahead. Or the opposite if he buys at the height of a boom. It's impossible to exactly time the market, but if one's investment window is sufficiently flexible then I'd imagine it's possible to at least approximately gauge when the market is "down" or "up". Especially since you can expect recessions to occur at least once every 10 years or so. An investor who only initiates his long-term investments during recessions, or at least who avoids situations like 1999/2000, might stand to do better than the totally naive guy.

"If we think the past is any guide (standard disclaimers apply) then a more interesting simulation would be to do 100,000 runs where an investor pursues a buy-and-hold-for-30-years strategy and each run chooses the start date randomly between 1954 and 1984. In other words, if you picked a random 30-year investment period using real, historical market data, what are the odds you end up in the red? What are the odds you under-perform the 'expected' 7% annual return? Etc."

Yep. But note that the typical buy-and-hold investor saving for retirement is not exposed to a single 30 year window, but rather a large set of overlapping windows. One way to think of it is that funds invested at age 30 are withdrawn at 60, the money invested at age 31 is withdrawn at 61, and so on for the next 20-30 years. So there's really no risk associated with picking a single bad 30 year span.


You'd win your wager.
I did a quick look at the correlation of S&P500 yearly returns with the preceding year's return:

-0.13 (if preceding year's return is negative)
+0.01 (in general, 1970-2013)
+0.12 (if preceding year's return is positive)

"Suppose that the average return is 10%. If $100 is invested for two periods the average payoff is $100*(1.1)^2=$121. But on average that is not what happens. More typically, you get say 0% in the first period and 20% in the second period, i.e. $100*(1.0)*(1.2)=$120. Notice that the average return is exactly the same, 10%, but the total payoff is smaller in the second and more realistic case"

Is it? The average return is not exactly the same, is it? The average return of 1.0*1.2 is sqrt(1.2) - 1, which is 9.5%. So of course the payoff is going to be different. What am I missing?
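Both readings can be checked directly. The post's claim is about the arithmetic average of the per-period returns, which is 10% in both paths; what compounding actually delivers is the geometric average, which is what the 9.5% figure is (a quick check):

```python
# The post's example: same arithmetic average return (10%), different payoffs.
steady = 100 * 1.10 * 1.10        # two 10% years
volatile = 100 * 1.00 * 1.20      # a 0% year, then a 20% year

arith_avg = (0.00 + 0.20) / 2         # 10%, matching the steady path
geo_avg = (1.00 * 1.20) ** 0.5 - 1    # ~9.54%: what compounding delivers
```

So neither side is wrong: the arithmetic average is the same 10% in both paths, while the geometric average of the volatile path is about 9.5%, which is exactly why its payoff is lower.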

Sure, but take it a step further. What is the Gini coefficient of the terminal distribution of wealth? That is, under the null hypothesis that asset* returns are random iid normal, what should we expect inequality to be?

* by asset returns I am including returns to human capital.
** I get .487

The correct distinction is between ensemble average return and time average return. These are arithmetic and geometric respectively for a GBM-type process. It is not possible for an individual investor to "diversify" in ensemble (states of the world), only in time. The time asymptote of expected wealth is either zero or unbounded (except in the unstable zero point case.) For GBM, the time expectation of return is given by the familiar Ito's Lemma formula: (mu-0.5sigma^2)t. When the first term is greater than the second, we have the unbounded case and when it is smaller we have the zero case. (Note that the convergence of these asymptotes is only in probability, not almost sure.)

In real-life, we can never realize either ensemble or time asymptotes even if independent assets with identical processes are available in the market, because there are only finitely many assets and the investor has only finite time. The actual expectation of realized wealth is thus determined by a "partial ensemble average"; when there are many assets but little time this resembles the ensemble expectation but as time is increased for any finite number of assets, time always wins and the expectation becomes more time-like. It is easy to construct plausible portfolios where, say 99% of returns are less than the partial ensemble average return after, say 50 years or so and I have done so myself.

If a (near) risk-free asset is assumed, then the best return to use is the excess return rather than the real or nominal return. But when such an asset is available, it is possible to use leverage by borrowing the risk-free asset or investing some fraction of wealth in it. The ratio of wealth invested in risky assets to total wealth is the leverage ratio. J.L. Kelly famously showed how to calculate the time-optimum investment ratio for various iterated discrete betting scenarios. For the continuous GBM, it is found by solving the Ito's Lemma quadratic; it is mu/sigma^2.

Thus, Sharpe and Markowitz were wrong to think that preferences between different leverage ratios of the market portfolio can only be distinguished by risk-preference, because there is an objectively maximal point value, and expected realized wealth is not unbounded in leverage.

Ole Peters has published a series of papers explaining all this very well; they make interesting reading. He also applies the same distinction to resolve the St. Petersburg Paradox.
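The leverage claim above can be checked numerically. For GBM with excess drift mu and volatility sigma, the time-average growth rate at leverage f is f*mu - 0.5*(f*sigma)^2, peaking at the Kelly ratio mu/sigma^2 (a sketch using the post's 7%/20% parameters):

```python
def growth_rate(f, mu=0.07, sigma=0.20):
    """Time-average (log) growth rate of wealth at leverage f, via Ito's Lemma."""
    return f * mu - 0.5 * (f * sigma) ** 2

kelly_f = 0.07 / 0.20 ** 2   # mu / sigma^2 = 1.75, the optimum

# The quadratic peaks at kelly_f: more leverage past that point *lowers* growth.
assert growth_rate(kelly_f) > growth_rate(1.0)
assert growth_rate(kelly_f) > growth_rate(2 * kelly_f)
```

Note that at exactly twice the Kelly leverage the growth rate returns to zero, which is the "unstable zero point" case mentioned above.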

mu-0.5sigma^2='expected return' in most models, which is why they ignore the Kelly criterion, and why risk is all you need in those models.

It's not a realistic scenario though. Nobody sticks $x in the market for 30 years then sits back and waits. Most people invest $x per month, and they only want to be confident that their pension will be worth more this way than it would have been if they'd invested in treasuries.

By investing small amounts regularly, you eliminate the risk that your initial investment was made at the peak of the market. You'd have to be unlucky to find that you'd made decades of investments in an ever-falling market. (Japan's stockmarket 1990-2012 is a rare counter-example; but a diversified portfolio wouldn't invest in just one country.)

To say that with 7% return and a 20% standard deviation, after 30 years 8.9% of all simulations will come up with the investor having lost money is clearly mathematical nonsense! The likelihood that with those parameters you would lose money is infinitesimally small. Can somebody please explain to me what is going on here???

Ignoring compounding, ER is 210% (30 x 7%) and vol a bit over 100%.

So it would take a -2 std shock.

But compounding extends the right tail and raises the frequency but not severity of the rest, so doesn't seem so unreasonable.

All done in one pass on vacation so if I'm way off sorry...

-A 20% standard deviation is very large. Even over a 30-year time horizon, that comes down to 3-4%. So the likelihood of a 0% arithmetic return is not "infinitesimally small" -- it's about 2 standard deviations from the mean, so the chance is about 2-3%.

-The main point of this post is that geometric returns are smaller than arithmetic returns. If you go up +20% and then down -20%, you aren't back at even, you are down -4%. This is what bumps the likelihood of losing money over a 30-year horizon up to around 9%.
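A back-of-envelope version of the same point, treating each year's log growth as approximately normal (a second-order approximation; it lands near 7%, a bit under the simulated 8.9% because ln(1+R) is left-skewed beyond this approximation):

```python
from math import erf, log, sqrt

mu, sigma, years = 0.07, 0.20, 30

# Second-order approximation to the mean and sd of ln(1 + R):
g = log(1 + mu) - sigma ** 2 / (2 * (1 + mu) ** 2)   # ~5% "geometric" drift
s = sigma / (1 + mu)

# You lose money iff the 30-year sum of log-returns is below 0.
z = -years * g / (s * sqrt(years))
p_loss = 0.5 * (1 + erf(z / sqrt(2)))   # standard normal CDF at z
```

The ~5% drift here is the same geometric-mean shortfall discussed throughout the thread: 7% arithmetic minus roughly half the variance.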

±20% is high for a diversified portfolio, but is low for somebody who puts all their money into a single stock, e.g., somebody who uses an ESOP for his entire portfolio. Alas, those poor souls could well experience ±30%, with a much larger fraction holding less money at retirement than they put in.

It also makes more sensible the notion of “what fraction of investors” will have fewer assets than the average of all investors. If I were to restate this blog's case, I'd talk about somebody putting in $100, and then looking at the outcome in each of the future parallel universes; there is no sensible “average” of those since we're not blessed with foresight of which future we end up in and we only fall into one—without being able to determine which.

There is still a role for risk-preference in selecting a leverage ratio, though, because there is normally great uncertainty in the correct choice of process to model asset prices and in the process parameters themselves. In the GBM case, that is mu and sigma. The point is that the risk of mistaken values is not symmetric; a too-low leverage ratio will reduce expected wealth but reduce risk. A too-high leverage ratio will reduce expected wealth but *increase* risk. Thus a more risk-averse investor should be biased to smaller leverage ratio estimates.

"By investing small amounts regularly, you eliminate the risk"

Not so; a regular investment program changes the quantitative results but not the qualitative ones (so it does do some good.) This I have also demonstrated myself by Monte Carlo.

Nice post and important.

A big part of the real-world distribution through time, and the only way "time diversification" can be saved at all, is that volatility of long-horizon returns has been lower than that of short-horizon returns, thanks to mean reversion. We have point estimates but not a lot of data on this, as we don't have many 20-30 year periods to test, so anyone's guess about the future is as good as mine...

Right now it's probably all worse, as with richer valuations we are starting with a lower long-term ER.

Good points—we have indeed been very fortunate in retrospect, but we are no clearer about how strong the future will be.

And let's also not fail to mention that if inflation is a “modest” 2% p.a., it'll make the paltry $447 median payout closer to $247 (about 44% lower) in today's dollars. The “average”—an amount that's relevant as long as you're indifferent between a 1% chance of a 101% payout and a 99% chance at a 1% payout, i.e., not relevant—is similarly reduced by inflation, which I'm sure Mr. Asness would agree needs to be watched for.

Share your code!

+1 Yes! Please share your code!

Done in Mathematica. Email me if you would like the file.

Or a large Excel sheet: 30 columns with returns generated by the same normal draws.
(A5=100; paste the function into B5, copy out to AE5, then copy the row down as many sims as you want.)

Excel went through gawdawful versions of the RAND functions over time but it's pretty solid in recent versions. The NORMSINV function appears to match results from other packages, out to much more precision than this exercise needs.

If you only run 2000 rows (as was apparently instantaneous on this 4-year-old MacBook) instead of 100,000, you'll be a bit farther away from the formulaic mean result but then again, 100K sims isn't perfect, either. The FREQUENCY function shows that the results are a bit lumpier in the right-hand tail (as you'd expect) but the chart matches very well in the outcomes that matter.

Parameterize the 7% and 20% values to see how unlikely you are to break even if you put all your cash into an undiversified portfolio, e.g., an ESOP with ±30% volatility, or 5% expected return after 2% inflation.

This reminds me of the Social Security debate back during the Bush years. How viable would '401Kification' of Social Security be if you told people nearly 10% of them would lose money *even* if they followed all the rules financial experts tell them (namely, put your money in a well diversified index fund and leave it alone for a long time)?

But before we jump on that bandwagon, I'm seeing a problem with the 'simulation': this is not how most people invest. They don't put a lump sum in the stock market when they are, say, 37 and then cash it out at 67, good or bad. The more common method is to put some in every year for 30 years or so, and usually that amount will grow, since raises and such will make your contributions at the end larger than at the beginning.

How would the simulation look if you had a simple $1000 per year contribution that grew, say, 3% every year for 30 years and then cashed out? Would there still be losers? Would they be nearly 10%?
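Under the same iid assumptions, the contribution version is easy to sketch (the $1,000 base and 3% raise are the figures from the question; run count and seed are mine, and whether the loser fraction stays near 10% is exactly what the run answers, so no figure is asserted here):

```python
import random

def dca_final_wealth(years=30, mu=0.07, sigma=0.20, base=1000.0, raise_pct=0.03):
    """Contribute at the start of each year (growing 3%/yr), then compound."""
    wealth = contributed = 0.0
    for t in range(years):
        c = base * (1 + raise_pct) ** t
        wealth += c
        contributed += c
        wealth *= 1 + random.gauss(mu, sigma)
    return wealth, contributed

random.seed(7)
runs = [dca_final_wealth() for _ in range(10000)]
frac_losers = sum(w < c for w, c in runs) / len(runs)  # ended below contributions
```

Note that each dollar is now in the market for less than 30 years on average, so contributions change the shape of the outcome distribution, not just its spread.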

Right. Do you get closer to average returns if you contribute on a level basis, so that over the course of the portfolio you wind up making some contributions in the favorable parts of the variance and some in the unfavorable parts?

I don't think it's correct to describe this as 'nearly 10% of them would lose money'. There is a theoretical chance of losses, but over 30-year horizons no one who has followed the rules has lost money, at least since WWII in the US.

A pattern of contributions, rather than a single contribution, makes some difference. It creates path dependence in outcomes, with later years figuring more prominently.

Here are some empirical worst case results from the S&P 500 since 1950.

$1 lump sum investment: worst 30-year nominal return: 9.2% (1955-1984).
$1 lump sum investment: worst 30-year real return: 4.3% (1965-1994).

$1 annual investment, increasing 3% per year: worst 30-year nominal return: 8.2% (1952-1981).
$1 annual investment, increasing 3% per year: worst 30-year real return: 2.3% (1952-1981).

> at least since WWII in the US.

You are specifying the exception to the rule. I would be surprised if a baby born in Germany today doesn't experience a major war in his lifetime.

Also, "since WWII" is only two thirty year periods, which isn't much of a basis for expectations.

Yeah, I distrust models and prefer to understand what actually has happened.

And I'm open to the idea that this is a somewhat exceptional period. The S&P 500 has produced average 30-year returns during this period of 11.2%, and average real returns of 6.3%. Most people expect these averages to be a few percent lower going forward.

Yes, the data are overlapping, but the 'cohort experience', even for neighboring cohorts, can be quite different. 1978-2007 produced a 12.1% return, while 1979-2008 (29 years of overlapping data) produced an 8.9% return. Huge difference there.

At least any mean reversion effects are internalized to these numbers, rather than being another potential source of model error.

This is not my area of expertise, but I think there are econometric methods for dealing with this sort of induced autocorrelation arising through overlapping observations in time series data. So a sophisticated analysis might be able to get a better estimate of peacetime returns in the most successful economy ever by looking at the last 70 years in the US. But I think straightforward analysis would tend to underestimate variance.

I think you're right, but it seems to me that positing some kind of autocorrelation assumption just shifts the problem to (a) estimating this correlation based on historical data, and (b) continuing to assume the future will be like the past. Depending on parameters, you could tell whatever story you desired.

I meant that when you have a sliding 30-year window, you get autocorrelation between the windows because successive windows share 29 years of returns. It would astound me if there aren't seven different methods of correcting for that.

I don't think this exercise was meant to be a forecast, and I think it's a mistake to treat it as such.

It's an exercise to show that *IF* thirty subsequent years' returns are randomly drawn from a distribution that's 7%±20%, the outcomes (a) range wildly, utterly out of control of the person who opted into the game of “investing” $100 in this lottery, and (b) the mathematical average is wildly unrepresentative of what most players will receive.

If you think the numbers are unrealistic, take my formula above and re-jigger it. Also, since few of us lock up our wealth on year 1 and cash out in year 30, go ahead and add another $100 or some other amount each year, or any other logic, such as increasing your investment any year when you're behind some target. If returns mean-revert, skew any year's results part-way to bring the current year's returns closer towards some mean expectation.

Et cetera. Pretty sure that unless you garbage up the formulas (which many of us are likely to do if we try to be too clever), points (a) and (b) above will shine through.

One thing you *can't* do: increase the returns any year to match what you'd like to get, or target getting. You don't get to vote about next year's returns in real life; don't build some elaborate spreadsheet to do it, either.

Under Alex's model, the probability that a 65-year run would produce no 30-year periods with negative returns is about 65%.

The probability that the worst 30-year period in that stretch would have a cumulative return of at least 40% is over 50%.

So the supposition that there's nearly a 10% chance of losing money over a 30-year period is not exactly contradicted by a 65-year-run where the stock market always wins.
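Those figures can be approximately reproduced under the same iid model: each trial draws 65 years of returns and checks every overlapping 30-year window (trial count and seed are my choices):

```python
import random

def no_window_lost(total=65, window=30, mu=0.07, sigma=0.20):
    """True if every overlapping `window`-year span in one path made money."""
    factors = [1 + random.gauss(mu, sigma) for _ in range(total)]
    cum = [1.0]
    for f in factors:
        cum.append(cum[-1] * f)   # cumulative wealth path
    return all(cum[i + window] >= cum[i] for i in range(total - window + 1))

random.seed(3)
trials = 4000
p_all_windows_positive = sum(no_window_lost() for _ in range(trials)) / trials
```

Because consecutive windows share 29 of their 30 years, the 36 windows are far from independent, which is why a 9% per-window loss chance does not compound into a near-certain loss somewhere in the 65-year run.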

How about no 30-year periods with returns no lower than 4.3%? How likely is that under Alex's assumptions?

Correction: how about no 30-year periods with a return of less than 9.2% in a 65-year run?

I think we can reject Alex's model.

Under Alex's model the probability that the lowest 30-year average annual return is 4.3% is over 10%.

Bump that up to 9.2%, and the probability drops to under 0.5%.

But again, this is all with a model where you have nearly a 9% chance of losing money over a 30-year period. You can tweak the model a bit -- bump up the standard deviation to 30%, say -- and now you have a 22% chance of losing money over a 30-year period, and a 7% chance that the worst 30-year return since 1950 would annualize out to 9.2%.

Obviously the bigger issue is that there's no guarantee that we're drawing from a stationary distribution (in fact it's almost certainly true that we're not) but that only makes my point stronger -- a 65-year stretch of great results does not strongly indicate that the risk is negligible over 30-year periods.

I'm kind of with Brian here. Just because it's convenient for theoretical finance people to model stock returns as iid normally-distributed random variables doesn't mean that they actually behave that way in reality. A world in which the total return for equities is negative over a 30-year period is a strange world indeed. It's a world in which for a whole generation businesses were unable to make any profit by producing and selling goods and services. To get that result you need more than just a bunch of bad draws from a random number generator; you need a sea change in how our economy works. That's not to say that it can't happen, but the probability of it happening is an independent parameter, unrelated to the standard deviation of historical stock market returns. The same thing is true on the high end, by the way. There is no way you get a 25,000x return in 30 years in an economy that looks anything like the one we know.

The practical upshot is that real systems are almost always subject to constraints that a random walk model doesn't capture. Therefore, you have to be a little skeptical of the model's results, particularly when you look at regimes very far outside of what has actually happened historically.

Ugh, sorry, please ignore my numbers concerning the 30% standard deviation assumption. You still have less than a 1% chance of seeing the post-WWII pattern.

Increasing the average arithmetic return to 10% (while leaving the standard deviation at 20%), on the other hand, does the following:

-Probability of losing money over a 30-year period: 5.4%
-Probability of averaging at least 9.2% over every 30-year period since 1950: 1.5%

Of course who knows. If the stock market was lower now than it was in 1985, would people be comfortable with arguments that there was a 25% chance that annual returns would have been 10% per year (making up numbers for the sake of argument)? You only run the tape once.

rpl, I agree that no one should mistake the map for the territory. However, a negative stock market return over 30 years is not necessarily a market in which companies were unable to make profits -- it's just a market in which the prices of these companies failed to appreciate over time.

If you want to tie stock prices to economic fundamentals, you should explain why stock market returns would outpace economic growth for decades -- if company profits increase by 5% per year, why should stock prices go up 10%? If the answer is "risk," then why should it be the case that stocks are very safe over multi-decade time periods?

I don't have answers to these questions, I just don't think they're very straightforward to answer.

Popeye, failure to appreciate isn't enough on its own, right? If we assume that profits stay constant, then it seems reasonable to assume that dividends do too. So, prices have to depreciate at a rate that offsets the dividend yield. That in turn means the yield will keep rising, which means the depreciation necessary to keep the return at zero gets even larger, and so on. Sooner or later that's going to become untenable. Or, to put it another way, one way or another, sooner or later free cash flow finds its way back to the investors.

I agree that stock market returns are higher than the overall growth rate of the economy because of risk, and the basic point that a 30-year time horizon doesn't entirely eliminate that risk is sound. However, the quantitative claims about the percentage of the scenarios that lost money (or underperformed other asset classes) are suspect, and if you're trying to decide how to allocate your own investments, the quantitative part matters.

Along those lines, whenever I see this argument, I always wonder how the person making it has allocated their investments. Do they give any serious thought to the possibility of a 30-year bear market?

rpl, my understanding is that the Nikkei 225 dropped around 80 percent between 1990 and 2003, and Japan had a functioning economy during that time period; it wasn't invaded or devastated by natural disasters.

One incentive people have for downplaying the risk of a long-term bear market is that such a market would hit everyone, including your peers (and fellow investment managers if you are an investment manager). That would take out some of the sting of a huge market drop, because everyone goes down together.

@Popeye, yeah Japan does give one pause. I'm skeptical about bubble theory, but Japan in the 1980s and NASDAQ in 1999 are hard to argue against.

I thought numbers would be available, but I couldn't find any, so I did my own calculations, reflecting the index, dividends, and exchange rates.

The guy who invested $1,000 in December 1989 had about $400 at the trough in February 2009, a -4.7% NOMINAL(!) CAGR over a 19.2-year period. Sobering.

Of course, the Nikkei index was rocking three-digit p/e ratios in 1989.

The guy who invested $1,000 at the beginning of 1983 (pre-bubble) saw a lousy but not horrific 4.8% CAGR over the 26.1-year period to the February 2009 trough, and has since bounced back a bit more to a 6% CAGR over 30 years.

The December 1989 investor has also bounced back but is still underwater at about $800 after 25 years. Just brutal. Even this guy, though, will get back to $1,000 with 3-4% returns over the next 5 years. Godawful to be sure, but if this is the worst outcome from any developed market over the past 70 years and it still doesn't produce a negative return, and it's a 'single point' (rather than a 'dollar-cost averaged) investment, I still think the left tail ain't what these models make it out to be.

@rpl, I agree. With a long enough holding period, the actual performance of companies dominates returns, and entry and exit prices become less important.

Brian, thanks for the data. My first guess when I read Popeye's post was that the index must have had some kind of truly absurd valuation at the start of the bear market.

Popeye, the Nikkei example is interesting, but is it relevant to asset allocation in the US market in 2014? The P/E for the S&P500 right now is just under 20. If we imagine an 80% decline with no corresponding collapse in profits, then the P/E would end up around 4. How does that happen? Think about midway through the process, when prices have collapsed by, say, 50%, so P/E is around 10 and the dividend yield is about 3.5%. Are there really no buyers at that price? It doesn't seem plausible to me.

Maybe a more succinct way of saying all of this is that stock prices have some purely random volatility, sure, but over the long term they are governed by fundamentals. The future behavior of fundamentals is uncertain too, but not anything like as uncertain as what is implied by price volatility (q.v. Shiller). The practical upshot is that it's a mistake to blindly throw the standard deviations of past prices into a model like Alex's, and it's a huge mistake to assume iid. All else being equal, a market trading at a P/E of 10 is going to have a very different probability distribution for its returns than one trading at a P/E of 100. Likewise for the distribution of returns in an economy that's just coming out of recession vs. one that has been having a multi-year boom.

A model that assumes away the influence of fundamentals might be useful for some purposes (e.g. valuing derivatives over relatively short periods of time), but one has to expect the results to be a little broken if you try to extrapolate them over decades.

Good point.

But let's not overlook the more important one: extrapolating from historical returns and variances implies that we are handed some outcome as if God reaches into an infinitely large jar labeled "Contents: returns, 7%±20%. Use with caution."

Probably nobody here thinks that the economy is anywhere close to repeating the experience of (a) still the world's greatest depression, followed by (b) a tsunami of government borrowing/spending for WWII, and then (c) a huge explosion of suppressed consumer spending fueled by personal credit. For starters, starting the clock after you've thrown out a decade of awful results is hardly an unbiased history, even if history were to repeat itself perfectly.

Uh, what if you inform the most productive people in the population that social security is a guaranteed negative real yield for them?

How about we tell the public that social security can, and likely will, go bust in the future, and that every proposed "fix" to save the system necessarily involves reducing the returns.

I think there are major problems with Alex's conclusions, but it is interesting because if Alex's conclusions were correct it's a great argument for SS. I support SS in its current form simply because there's no reason not to tie the minimal consumption of the elderly to the production of the working population of an economy as large as the US's, but that seems to be a tough concept for people to grasp.

Many markets have experienced 30-year, even 50-year, negative real returns after inflation -- for example, German or Austrian stocks in 1930. Bye bye money. Or Japan. Luckily the massive financial market of the USA will keep the market going, but it will be supplanted by the Chinese stock market very soon. Bye bye 7% per year then; you will find those only in China.

Yeah, people forget that the normal fate of modern developed nations is to get invaded or have a revolution or civil war and have much of their wealth destroyed. There are only a handful of examples of major nations for which this hasn't happened in living memory. I think they're all islands or nearly so.

This is assuming each investor invests independently. If all investors invest in the market portfolio, then they will all get identical returns, for better or for worse.

I think you underappreciate mean reversion. 2008 and 2009 are right next to each other. Not a coincidence.

I also don't think the science is settled on time diversification. A long horizon is an asset in investing. You can ignore liquidity and short-term volatility. I anticipate push back here, and I have read Bodie and Taleb and other 'tail of the distribution' types, but I think these tail events are associated with a societal meltdown that would make stock market losses the least of your concerns.

Anyway, here are S&P 500 CAGR's for various horizons since 1950 (mean, SD, 5th percentile, actual worst case). Maybe it's just been a really good 65 years:

Horizon               1 yr      3 yr      5 yr     10 yr     20 yr     25 yr     30 yr
Mean CAGR            12.1%     10.9%     10.7%     10.5%     10.8%     10.8%     11.0%
Standard deviation   16.3%      9.1%      7.4%      5.2%      3.0%      2.2%      1.3%
5th pctile (2 SD)   (20.5%)    (7.4%)    (4.1%)     0.0%      4.7%      6.5%      8.5%
Worst since 1950    (36.6%)   (14.5%)    (2.3%)    (1.4%)     6.5%      7.8%      9.2%


Isn't the decreasing volatility due to overlapping intervals?

Yeah, and, as others have noted, the distribution is skewed too. That's why I included worst case outcomes as well. Pretty crude, but nothing in the record shows the left tail outcomes the models predict. If it's 370 AD and I'm in Rome, I'm skittish.

In an effort to address that issue I tried sampling random sequential datapoints from historic data. Does that yield a more realistic result?

Another distinct reason why "the average stock market returns aren't average": companies issue new stock and also buy it back. Issuing new stock tends to be followed by declining stock prices, and buybacks by increases. (Possibly causation; in addition, companies might have better knowledge than the market of when to issue or buy back.)

However, since the number of shares outstanding changes, it's simply not true that the average investor gets the average return (if you don't weight by number of shares outstanding), even though any individual investor could get (ignoring fees) the index result by buying and holding. If a stock goes from $100 to $200, has an additional offering, then declines to $50, then has a buyback to return the shares to the original number, and later goes back to $100, it's not true that "the average investor had a 0% return" even though anyone who bought and held did. More investors/shares saw the $200 to $50 decline than saw the increases from $100 to $200 and from $50 to $100.

See the paper by Ilia D. Dichev mentioned here and by Hal Varian.
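The toy example above can be made concrete. With those (hypothetical) numbers, the buy-and-hold return is zero, yet investors in aggregate lose, because more dollars were exposed to the decline than to the gains:

```python
# One original share rides $100 -> $200 -> $50 -> $100: a 0% round trip.
buy_and_hold_return = (100 - 100) / 100      # 0.0 for the patient holder

# A second share is issued at the $200 top and bought back at the $50 bottom.
issued_share_pnl = 50 - 200                   # -150 for whoever bought the offering

# Aggregate investor P&L: the holder's zero plus the issued share's loss.
aggregate_pnl = 0 + issued_share_pnl          # negative, despite a flat index
```

This is the dollar-weighted versus time-weighted return distinction: the index's time-weighted return is 0%, but the dollar-weighted return across all investors is negative.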

This sort of scenario-based forecasting is what you get if you use the retirement planners at a large mutual fund firm, e.g. T Rowe Price. They generate a bunch of scenarios and tell you you have an x% chance of not running out of money by age y given your investment mix and initial bag of money.

(a) This is much more realistic than the old practice of just plugging in averages, and also provides a better cover for the investment advisor than the standard "past returns don't guarantee future returns".


(b) I never realized that because of the mean/median difference (of which I was well aware!) this also creates an argument for saving more, which is in the best interest of the investment firms (and probably my own best interest).

It really depends on how you feel your utility versus consumption curves look. If it's linear, it doesn't matter much what you do. If they aren't linear, you need to decide what the curves look like and how much you should spend to avoid certain low probability negative outcomes. This is what most people should be doing, but it's hard to say how much you are willing to give up in expected consumption to avoid a 5%,1% or 0.1% outcome.

Quick question for the peanut gallery. Don't regular fixed dollar purchases help increase returns beyond simple time diversification, as this strategy automatically increases quantity purchased when prices are low and reduces quantity purchased when prices are high? This strategy would buy less stock in 2000 and more stock in 2008 which would be helpful and the benefit would increase with volatility. I think this effect relies on mean-reversion but I am not positive.

If you get fired when the market crashes this kills your strategy though. Kind of a big deal for the percentage who experience this.

This is right- so perhaps we should put a larger focus on the changes in discount rates in the decline in stock prices '08-'09 as many people decreased their asset purchases due to risks of layoff. But if you are in a relatively secure job, this strategy of constant dollar purchases would help increase returns somewhat and would also mitigate volatility somewhat.

Fat tails! I believe the reaction of Brazilian soccer fans to the loss to Germany was milder than the reaction of some readers of this blog to fat tails.

AlexT is simply illustrating that the average is tricky when you have a Gaussian process that may be mean reverting but open-ended.

A simple analogous point: flip a fair coin until the number of heads (H) is balanced by the number of tails (T). What is the average number of flips until they balance? HHTT would be four flips; HHTHHTTT would be eight flips. It turns out the average is infinite, because of outliers. That's what happens when you have an open-ended normal distribution.
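The heavy tail shows up readily in simulation: the chance of still being unbalanced after n flips shrinks only like 1/sqrt(n), which is slow enough that the expectation diverges (the cap below is my choice, just to keep the run finite):

```python
import random

def flips_to_balance(cap=100_000):
    """Flip a fair coin until heads and tails are equal; cap the search."""
    diff = 0
    for n in range(1, cap + 1):
        diff += 1 if random.random() < 0.5 else -1
        if diff == 0:
            return n
    return cap   # still unbalanced; the true count is at least this

random.seed(5)
trials = [flips_to_balance() for _ in range(2000)]
frac_quick = sum(t == 2 for t in trials) / len(trials)   # half balance at flip 2
n_long = sum(t > 1000 for t in trials)                   # a thick slow tail remains
```

Half of all trials balance immediately (HT or TH), yet the sample mean is dominated by the rare trials that wander for thousands of flips, so it never settles down as you add trials.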

Kinda obvious in hindsight, but somehow I never learned this; never even heard of it.

Say I've a lump sum to invest: using the same 7%/20%, would love to get a feel for how much (if at all) splitting up my investment over 5 years would bring the median closer to the mean (while obviously bringing the mean down). . . .

Almost none - you are still 100% in the market at the end of the 5 years. Running the simulation for 25 years wouldn't change it too much from running it for 30 years.

Your argument is fleshed out in my 2006 Financial Analysts Journal article.


The median outcome in your simulation is determined by the geometric mean, not the arithmetic mean. Since the geometric mean approximately equals the arithmetic mean minus half the variance: .07 - .5(.2^2) = .05.

In this paper, we show that as the horizon lengthens, the mean outcome goes to the 100th percentile. That is, average outcomes become literally unattainable.

The takeaway is that taking the arithmetic mean and compounding it out over many years provides a very misleading number, if one wants to get a sense of what might be typical. Fun stuff.
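That drag of roughly half the variance is easy to check by rerunning the post's experiment. A minimal sketch (my own code, not Alex's; normal annual returns as in the post, and exact figures vary with the seed):

```python
import random

random.seed(42)
N, YEARS, MU, SIGMA = 100_000, 30, 0.07, 0.20
MEAN_PAYOFF = 100 * (1 + MU) ** YEARS  # theoretical mean, about $761.23

payoffs = []
for _ in range(N):
    wealth = 100.0
    for _ in range(YEARS):
        # Treat a (vanishingly rare) draw below -100% as a total loss.
        wealth = max(wealth * (1 + random.gauss(MU, SIGMA)), 0.0)
    payoffs.append(wealth)

payoffs.sort()
median = payoffs[N // 2]
median_rate = (median / 100) ** (1 / YEARS) - 1  # about MU - SIGMA**2 / 2
below_mean = sum(p < MEAN_PAYOFF for p in payoffs) / N
lost_money = sum(p < 100 for p in payoffs) / N
print(round(median_rate, 3), round(below_mean, 3), round(lost_money, 3))
```

With these parameters the median compound rate comes out near 5%, roughly 69% of paths finish below the mean payoff, and roughly 9% lose money, matching the post.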

Alex models 100,000 outcomes but keeps the portfolio size the same at $100. But what this demonstrates, in part, is the importance of that starting portfolio size. Someone with $10,000 to invest can diversify further, so that it's as if she's investing $100 a hundred times over, thereby covering a wider range of returns. Sure, she'll underperform with most investments, but the investments that land in the right tail of the distribution will compensate, and she'll end up with a return that's closer to the mean rather than the median.
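Worth flagging the hidden assumption here: the hundred $100 lots must earn *independent* returns for the split to pull the outcome toward the mean; a hundred slices of the same index all move together and buy no convergence at all. A sketch under that independence assumption (unrealistic for a single market, but it shows the effect the comment has in mind):

```python
import random

random.seed(7)
YEARS, MU, SIGMA, N_SIM = 30, 0.07, 0.20, 2_000

def grow(capital: float) -> float:
    """Compound one lot through 30 independent normal annual returns."""
    for _ in range(YEARS):
        capital = max(capital * (1 + random.gauss(MU, SIGMA)), 0.0)
    return capital

# One $10,000 lot vs. a hundred $100 lots with independent return paths.
single = sorted(grow(10_000) for _ in range(N_SIM))
split = sorted(sum(grow(100) for _ in range(100)) for _ in range(N_SIM))

# The split portfolio's median sits far closer to the ~$76,123 mean;
# the single lot's median stays near the much lower typical outcome.
print(round(single[N_SIM // 2]), round(split[N_SIM // 2]))
```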

Piketty describes just such an effect when he compares the endowments of major universities over the last several decades. The larger the original endowment, the higher the average returns--leading to an ever-widening gap (i.e., greater inequality) between the total endowments.

Piketty's data on endowments are less than convincing when you also include the 1.7% returns for Harvard from 2009 to 2013. See http://gregmankiw.blogspot.com/2014/06/the-saddest-chart-ive-seen-today.html

The period Mankiw uses (2009-13) skews those numbers. Harvard was hit particularly hard by the crash, and didn't start recovering until 2010. Overall, Harvard does exceptionally well, http://www.hmc.harvard.edu/investment-management/performance-history.html, and highly endowed schools on the whole perform much better than their impecunious peers, http://piketty.pse.ens.fr/files/capital21c/en/pdf/T12.2.pdf.

Warren Buffett explained.....

Cochrane covered this idea a while back:


Your methodology for calculating stock returns essentially assumes that all (100%) of stock returns stem from capital gains.

But that does not match reality, because it ignores dividends.

The actual studies of long-run stock returns find that the return from reinvesting dividends is as important as, if not more important than, capital gains. The famous study of returns from 1929 to 1954 that showed stocks outperformed bonds found that virtually all of the returns came from reinvesting dividends, as the stock market did not surpass its 1929 level until 1952 (I think that is the year; I'm doing this from memory).

Moreover, because of dividend reinvestment the actual volatility will be less than in your calculations and the odds of negative years will be much lower.

If you set the S&P 500 with dividends reinvested and the plain S&P 500 index both equal to 100 in January 1970 (the first year in my database), then at the end of June 2014 the dividends-reinvested index stood at 9021 and the plain S&P 500 index at 2156.

Over the 45 years since 1970, the average annual increase of the S&P 500 with dividends reinvested was 200 points, versus 47.6 for the plain S&P 500.

In June 2014 the dividends-reinvested index was 4.18 times the S&P 500 index.

I think this generally gets taught in the intro finance course in most business schools. (I know it did where I taught.) So this isn't really "new" news.


Let me explain what you have done here, as you have made some bizarre statements. You have demonstrated that, for any random sequence of returns (it doesn't have to be Gaussian), the arithmetic mean is always at least the geometric mean. This is a basic lesson in finance, but not one we interpret as "the average investor gets less than average returns." It does mean, though, that unrewarded risk eats returns.
arithmetic mean = average annual return
geometric mean = [product of (1 + each return)]^(1/years) - 1

When practitioners say that the long-run market return is 7%, they mean the geometric mean is 7%. In your simulation you used 7% as an arithmetic mean and then showed that the geometric mean was less, which is expected per the rule above.
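The gap between the two means is concrete with any varying return series. A small sketch, using a hypothetical three-year sequence chosen so the arithmetic mean is exactly 7%:

```python
from math import prod

returns = [0.30, -0.10, 0.01]  # hypothetical annual returns

arithmetic = sum(returns) / len(returns)
geometric = prod(1 + r for r in returns) ** (1 / len(returns)) - 1

print(round(arithmetic, 4))  # 0.07
print(round(geometric, 4))   # 0.0572, the rate wealth actually compounds at
assert geometric < arithmetic  # strict whenever the returns vary at all
```

By the AM-GM inequality the geometric mean can only equal the arithmetic mean when every year's return is identical.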

Now it is still true that, regardless of the return assumed, more than half of the investors would receive less than the average payoff. But you are not really comparing investors here, are you? You are comparing universes, or epochs, let's say. Investors' returns, on the other hand, are highly correlated with one another: in a given year everyone experiences the same broad market return. Their difference from one another is their individual tracking risk, which is much lower. But here is an important point, related to what you are trying to say but not the same: tracking risk will not be rewarded. To the extent that people have personal annual returns evenly distributed around the market return, the average personal return will be less than the long-run market return. The average investor should not fear market volatility, as you seem to imply; they should fear their own judgement.

As for time not being a diversifier: it is certainly true that the variability of returns measured in dollars increases, but this is not the same as saying time is not a diversifier; the expected return increases too. You point out yourself that only 9% of investors lose money over the entire period. What percentage do you think lose money after the first year? As an exercise, you might want to calculate the average Sharpe ratio experienced by each investor as time goes on. It increases rapidly but tops out pretty quickly. This is related to the claim that portfolio diversifiers really don't need to choose more than five funds before the benefits of additional diversification dry up.

Excellent analysis. I wasn't aware of these details, but they became evident once I simulated the data myself.

Yes, thanks Andrew, I was looking for something like this. It's my impression that investors' portfolios are highly correlated (I'm an index tracker myself), and thus the variation is, as you say, the "tracking error" around the market return. And as others have said, if the past is any guide (!), the market return over 30 years is going to be positive about 91% of the time.

I may have found the bug in Alex's code.

For a single 30-year simulation, Alex somehow drew 30 "random" returns with a 7% average gain and a 20% standard deviation. But I suspect that each set of 30 returns was drawn from a population conforming to the (7%, 20%) parameters. If so, then not every individual simulation would conform to (7%, 20%); some could come out significantly higher, some significantly lower.

I performed a similar simulation, somewhat crudely. Rather than worry about variance and fat tails, I obtained the most recent 522 monthly S&P 500 returns. A 30-year simulation becomes a 360-month simulation. I didn't bother with the (7%, 20%) parameters; I was just looking at means, medians, and distributional shape.

If 360 monthly returns were drawn with replacement from the 522, results were similar to (and lower in gain than) Alex's: about half the simulations lost money. But if the drawing was *without* replacement, then every simulation made money (although the worst of the 100,000 gained only 14.8% in total, which is less than half a percent per year annualized).

In retrospect, my simulations are uninformative: consider, for example, drawing at random 30 annual returns from the past 30 years. With replacement, some simulations will collect more than one copy of 2008, and will likely suffer losses; other simulations will fortuitously avoid 2008 and likely make big gains.

I am curious: how does one draw 30 random numbers subject to the constraint that the drawn numbers have a given mean and a given standard deviation? Note that this is not the same as drawing 30 random numbers from a given normal distribution. It seems to me that you would have to start with some set of 30 random numbers, scale them to get the desired mean, then iteratively tweak them, maintaining the sum, to get the desired standard deviation. Am I missing something, or is Alex? Or both?
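On the mechanics: no iterative tweaking is needed. Draw any 30 numbers, subtract their sample mean, divide by their sample standard deviation, then rescale and shift to the targets. A sketch (the function name is mine; note the result is no longer 30 independent normal draws, since conditioning on exact sample moments changes the distribution, which is one reason to think Alex simply drew from N(7%, 20%) without the constraint):

```python
import random
from statistics import mean, pstdev

def sample_with_exact_moments(n, target_mean, target_sd, rng):
    """Return n numbers whose sample mean and (population-style)
    standard deviation exactly match the targets."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    m, s = mean(xs), pstdev(xs)
    return [target_mean + target_sd * (x - m) / s for x in xs]

rng = random.Random(3)
returns = sample_with_exact_moments(30, 0.07, 0.20, rng)
print(round(mean(returns), 12), round(pstdev(returns), 12))  # 0.07 0.2
```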

Feature, not bug. No fundamental law guarantees that every thirty-year sample will have an exact (7%, 20%) distribution.

Agreed, of course. Alex chose 7%-20% arbitrarily for his simulations. And I misunderstood what he was up to.

Have a look at quantopian.com if you want to backtest ideas against actual historical market data.

It's actually interesting that Alex seems to think that his observation about geometric/arithmetic returns is somehow related to inequality. Investors clearly aren't drawing independent returns: if the market goes down one year, most investors go down with it. Yet Alex says things like "Lady luck is a bitch, she takes from the many and gives to the few" and "Most investors earn less than the mean payoff."

It's almost as if he has some kind of deep belief that inequality is a fundamentally impersonal result arising out of some mathematical facts (about probability or what-not).

Where does one get the data for this simulation? I'm interested in playing with it myself.

Google for "random normal deviates"; or google for "random normal deviates xxx" where xxx is your choice of Excel, MatLab, R, Mathematica, etc.

Why are the results surprising? E.g., in the example of a 10% average return over 2 years, 1.10 x 1.10 will ALWAYS be greater than 1.11 x 1.09 or 1.12 x 1.08 or 1.13 x 1.07, etc.

A deviation from the mean (avg) return will always give a lower cumulative value than if the mean (avg) was achieved each year. Did you even imagine getting a different result?

High-frequency traders routinely earn annualized returns in the tens of millions of percent. Of course, they do it only for brief periods, so the total return is more modest. And they have real-resource costs that eat up much of that return. Moreover, because there are no barriers to entry, we would expect their (long-run, equilibrium) risk-adjusted net returns to capital to be normal. They argue that their high apparent returns are justified because they are providing a service (liquidity), rather than capital, to financial markets; I argue elsewhere that HFTs are dissipating rents by expending real resources in unproductive racing.

Setting that argument aside, however, what are the implications for the average investor? Even one pursuing a buy-and-hold strategy, with almost no trading? I believe that the fat HFT tail lowers the return of the average investor -- who, by the way, cares not at all about liquidity. Too much of the HFT debate has focused on the fairness of trading; it would benefit from a closer look at the long-run effect on total market returns, and their distribution. Even buy-and-hold investors, I suspect, are paying the cost of HFT strategies.

It is a well-known fact that when compounded wealth is (approximately) log-normally distributed, with arithmetic mean return "mu" and standard deviation "sigma", the median compound growth rate is approximately mu - 0.5 sigma^2. So, plugging in our numbers, 7% - 0.5*(0.2^2) = 5%, pretty close to the 5.1% measured. It is no big deal.

What mean and sigma did Alex use? I read his post as using mean = 7 and sigma = 1.4 (20% of 7), but my results (after careful checking) are much more positive than his. (If sigma is instead 20 percentage points, i.e. 0.20, that would explain the gap.)
