What every trader needs to know about regression to the mean

This is a classic trading post from my investment blog GoodeValue.com.

Perhaps one of the most widely disseminated and most widely misunderstood statistical concepts is that of regression to the mean. It is also one of the most important concepts for investors to understand.

The simple definition of regression to the mean is that with two related measurements, an extreme score on one will tend to be followed by a less extreme score on the other measurement. This definition will not suffice for us as it is incomplete. Regression to the mean only happens to the extent that there is a less than perfect correlation between two measures. Thus, as a technical definition, let us use that of Jacob Cohen: whenever two variables correlate less than perfectly, cases that are extreme on one of the variables will tend to be less extreme (closer to the average or mean) on the other variable.

For those of you who have been away from math for too long, a correlation is simply a measure of how well one thing can predict another. A correlation of 0 indicates that two things are unrelated, while a correlation of 1 or -1 indicates that they are perfectly related. See this website for a nice graphical presentation of what different correlation coefficients mean. For example, the price of a restaurant is correlated with its quality at about .60 (this is just my rough guess)—more expensive restaurants tend to be higher quality than less expensive restaurants, but there are plenty of exceptions.

On the other hand, I would estimate that price correlates more strongly with the quality of chocolate, probably around .80. With a few exceptions such as Candinas and See's, most really good chocolates are horribly expensive, while cheap chocolates (such as Russell Stover) are invariably bad. An example of a near-perfect correlation would be the correlation between altitude and temperature at any given time in any given place: as the altitude increases, the temperature drops.

Some people refer to regression to the mean as a statistical artifact. It is not. It is a mathematical necessity. Let us start with a very simple example. Suppose that people who have more money tend to be happier than those with less. This is actually true, but the correlation is weak—money really matters to happiness only to the extent that people can afford the basic necessities. If we were to predict the happiness of both 100 billionaires and 100 people who live on welfare, we might expect that the billionaires would be significantly happier. In fact, billionaires are only slightly happier than those on welfare. Because the correlation is so weak, we would be better off ignoring the correlation of wealth and happiness and just guessing that everyone was of average happiness.

Let’s try another example. Suppose that you work as an admissions officer for Harvard. You have two main sources of information in order to decide whether or not to admit prospective students. You have the candidates’ SAT scores and you have the results of their admissions interview. Suppose that one student has an SAT score of 1550 (out of 1600 possible points) and a very bad interview—the interviewer considered the student to be uninteresting and not very bright. Another student had an SAT score of 1500 and an outstanding interview. Assuming there is only one spot left, which student should you admit and which should you reject?

Take a moment to think and make your decision. You most likely chose the student with the lower SAT score and better interview, because the SAT score was only slightly lower, while the interview was much better than that of the first student. However, this is the wrong decision. Repeated studies have shown that admissions interviews have no correlation whatsoever with college student performance (as measured by graduation rate or college grades). SAT scores, on the other hand, do correlate (albeit less strongly than most believe) with college grades. Thus, you should completely ignore the interview and make a decision purely based upon SAT scores.

I admit that this example is unfair—truth be told, SAT scores are only correlated moderately well with college grades: about .60. That means that there is little difference between a score of 1550 and a score of 1500. However, a small, meaningful difference is still more informative than a large, meaningless difference.

To make this a little more clear, we can do this without an interview, since the interview is useless. Rather, we throw a die (as it is equally useless). For the student with the 1500 SAT, we roll a 6. For the student with the 1550 SAT, we roll a 3. Would you decide to admit the student with the 6 because of his higher die roll? Obviously not, because the die roll is pure chance and does not predict anything. The same reasoning applies to the interview, since its relation to school performance is just chance.

Suppose we selected students based on a roll of the die—how would they fare? The students with the best scores would tend to do average, while those with the worst scores would also do average. This is perfect regression to the mean. Simply put, the die roll adds nothing.

Regression to the mean only happens to the extent that the correlation of two things is less than perfect (less than 1). If the correlation is 0, then there will be perfect regression to the mean (as with the die). If the correlation is between 0 and 1, then there will be partial regression to the mean. Let us now look at an example of this.
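The three cases above (no correlation, partial correlation, perfect correlation) can be checked with a small simulation. This is only a sketch under assumptions of my own: both measures are standardized and normally distributed, and the function name and sample size are made up for illustration.

```python
import math
import random

def mean_outcome_of_top_decile(r, n=100_000, seed=0):
    """Simulate n (x, y) pairs with correlation r and return the average
    x and the average y for the cases in the top 10% on x."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # y takes a share r of x plus independent noise, scaled so that
        # y has unit variance and corr(x, y) is r.
        y = r * x + math.sqrt(1 - r * r) * noise
        pairs.append((x, y))
    pairs.sort(key=lambda p: p[0], reverse=True)
    top = pairs[: n // 10]
    mean_x = sum(p[0] for p in top) / len(top)
    mean_y = sum(p[1] for p in top) / len(top)
    return mean_x, mean_y

for r in (0.0, 0.6, 1.0):
    mean_x, mean_y = mean_outcome_of_top_decile(r)
    print(f"r={r:.1f}: top decile averages x={mean_x:.2f}, y={mean_y:.2f}")
```

With r = 0 the top scorers on x should come out about average on y (the die roll case), with r = 1 they should be exactly as extreme on y as on x, and with r = .6 they should land somewhere in between.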

There is a correlation between income and education level. I cannot find the actual data, so I will make it up: I will say that it is around .60. Therefore, level of education (as measured numerically by highest grade level or degree completed) is a fairly good predictor of a person’s income. More educated people tend to make more money. Let’s look at a sample of the top 10% of money-earners. If education perfectly predicted income, then those top money earners would be the top 10% most educated. Because education only imperfectly predicts income, we will find regression to the mean. Those earning the highest incomes will tend to be well educated, but they will be closer to the average education level than they are to the average income level.

One of the beautiful things about regression to the mean is that if we know the correlation between two things, we can exactly predict how much regression to the mean will occur. This will come in handy later.
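As a rough illustration of that prediction: when both measures are expressed as standard scores (z-scores) and the relationship is linear, the expected score on the second measure is the correlation times the score on the first. The sketch below applies this to the made-up .60 education/income correlation. The normal-distribution assumption and the helper name are mine, not part of the original argument.

```python
from statistics import NormalDist

def expected_z_on_other(z_on_selected, r):
    """Expected standard score on the other variable, given correlation r
    (assumes a linear relationship between the two standardized measures)."""
    return r * z_on_selected

std_normal = NormalDist()
cutoff = std_normal.inv_cdf(0.90)                 # z-score at the 90th percentile
mean_top_decile = std_normal.pdf(cutoff) / 0.10   # average z of the top 10%

r = 0.60  # the made-up education/income correlation from the text
print(f"Top 10% of earners average z = {mean_top_decile:.2f} on income")
print(f"Expected education z = {expected_z_on_other(mean_top_decile, r):.2f}")
```

Under these assumptions the top earners sit about 1.75 standard deviations above average on income but only about 1.05 above average on education, which is the partial regression described above.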

If all we had to worry about when two things are not perfectly correlated was regression to the mean, we would be fine. It is fairly simple to calculate a correlation coefficient and then figure out how much of some effect is caused by regression. Unfortunately, there is one more complicating factor: measurement error.

Imagine you have a bathroom scale that has 100% error. In other words, the weight it shows is completely random. One morning you weigh yourself at 12 pounds, while the next morning you weigh 382 pounds. Whereas height is normally correlated strongly with weight, your weight as measured by your scale will not correlate with your height, since your measured weight will be random. If we make the bathroom scale just a little more realistic and say that its measurement has 2% error (quite normal for bathroom scales), the same problem applies—the measurement error reduces the apparent correlation between height and weight and increases regression to the mean.
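A quick simulation shows how measurement error waters down a correlation. This is a sketch with invented numbers: the true height/weight correlation of .70 and the error sizes (given in standard deviations of true weight) are assumptions for illustration, not real data about bathroom scales.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def simulate_scale(true_r, error_sd, n=50_000, seed=1):
    """Return (correlation with true weight, correlation with measured weight)."""
    rng = random.Random(seed)
    heights, weights, measured = [], [], []
    for _ in range(n):
        h = rng.gauss(0, 1)
        w = true_r * h + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        heights.append(h)
        weights.append(w)
        # The scale reports the true weight plus random error.
        measured.append(w + rng.gauss(0, error_sd))
    return pearson(heights, weights), pearson(heights, measured)

for error_sd in (0.0, 0.5, 1.0, 3.0):
    true_corr, observed_corr = simulate_scale(0.70, error_sd)
    print(f"error sd={error_sd}: true r={true_corr:.2f}, observed r={observed_corr:.2f}")
```

The larger the random error, the closer the observed correlation falls toward zero, even though the underlying relationship between height and true weight never changes.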

This is exactly the problem that we see in the stock market, although the errors are much larger than with your bathroom scale. The value of a company is a function of only one thing: the net present value of its future cash flows. That, in turn, is determined by two things: the company’s current price (as measured most typically by P/E or P/CF) and its future earnings growth. The measurement of P/E has very little error. The estimation of future growth has much error, though.

For the moment let’s assume that P/E and future growth each account for half of the current value of a company. (This is actually wildly inaccurate—as the growth of a company increases the growth will become much more important than the current P/E in determining the net present value of the company. Conversely, if growth is zero, then P/E will completely determine the net present value of a company.)

Since P/E accounts for half of present value, it is correlated with present value at r=.71. (R² is the proportion of variance explained, which is .50 in this case, so the correlation coefficient r is its square root.) This is a fairly strong correlation. Nevertheless, it is far from perfect. Regression to the mean will ensure that companies with the most extreme P/E ratios will be less extreme values than their P/E ratios alone indicate. When you think about it, this makes perfect sense: some companies deserve low P/E ratios because their prospects are poor.
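The arithmetic behind the .71 figure is just a square root. A minimal sketch, using the 50% share of variance assumed above (the function name is made up for illustration):

```python
import math

def correlation_from_variance_explained(r_squared):
    """Correlation coefficient r implied by a proportion of variance explained."""
    return math.sqrt(r_squared)

r_pe = correlation_from_variance_explained(0.50)
print(f"r = {r_pe:.2f}")
```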

Now for the other half of the equation: growth. Growth is correlated at r=.71 with the net present value of the company. However, that is assuming that we can accurately predict future growth. This is simply not true. Analyst predictions of company earnings less than one year ahead are on average off by 17% of reported earnings (meaning that near-term estimates have a .83 correlation with actual earnings*). Their estimates of growth years in the future are of course much worse. So while the correlation between future growth and present value of a company is fairly strong, .71, the correlation between predicted growth and present value is very much less than that (about .28).

Due to this reduced correlation, there will be much greater regression to the mean for growth as a predictor of value than there is for P/E. The one problem is that investors do not take this into account. Investors and analysts put faith in projections of high growth for years in the future. However, the chances are only 1 in 1,250 that a company will go for 5 consecutive years without at least one quarter of earnings over 10% less than analysts’ estimates. This even understates the problem, because in the above calculation, the estimates can be updated until just before a company actually announces earnings. Estimating earnings five years in the future is impossible.
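The source of the 1 in 1,250 figure is not spelled out here, but it is easy to back out what it would imply per quarter. The sketch below assumes 20 independent quarters with the same chance of a clean result each time. Both assumptions are mine and are probably too simple.

```python
quarters = 5 * 4                 # five years of quarterly reports
p_clean_run = 1 / 1250           # chance of no big miss in any quarter

# If quarters are independent with equal odds, the per-quarter chance of
# avoiding a big miss is the 20th root of the five-year figure.
p_quarter_ok = p_clean_run ** (1 / quarters)
p_quarter_miss = 1 - p_quarter_ok

print(f"Implied chance of avoiding a big miss in one quarter: {p_quarter_ok:.2f}")
print(f"Implied chance of a big miss in one quarter: {p_quarter_miss:.2f}")
```

Under those assumptions, the figure corresponds to roughly a 30% chance in any given quarter that earnings come in more than 10% below estimates.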

Remember how I earlier mentioned that as a company’s growth rate increases, its current P/E has less and less relation to its true value? The true value of these companies (such as Google, GOOG) is determined primarily by their growth rate. So in effect, when the growth investors say that P/E does not matter if the growth is fast enough, they are correct.

There is one problem with this: because of regression to the mean, those companies that grow the fastest are also most likely to under-perform analyst and investor expectations. So the predictions of growth will be least accurate for those companies whose value most depends on their growth rate!

Investors do not realize this and they thus bid up the prices of growth stocks in proportion to the anticipated future growth of a company. Because of regression to the mean caused primarily by the lack of reliability of analyst estimates of earnings, earnings for the best growth companies (as measured by anticipated future growth rates) will tend to disappoint more often than other stocks. The converse will actually happen with the most out of favor stocks: analysts and investors are too pessimistic and thus they will underestimate future earnings and cash flow growth. See “Investor Expectations and the Performance of Value Stocks vs. Growth Stocks” (pdf) by Bauman & Miller (1997) for the data.

Some converging evidence for my regression to the mean hypothesis would be useful. According to my hypothesis, earnings growth for the lowest P/E or P/BV (Price/Book Value) stocks should increase over time relative to the market, while earnings growth for the highest P/E or P/BV stocks should decrease relative to the market. The value stocks in the following data are those with the lowest 20% of P/BV ratios, while the growth stocks are those with the highest P/BV ratios. Ideally, I would look not at P/BV, but at projected earnings growth, but these data will do.

The value stocks have earnings growth of 6.4% at the point in time when they are selected for their low P/BV ratio. After 5 years, their earnings growth increases to 11.6%. Their increase in earnings growth rate was thus 5.2 percentage points. The growth stocks, on the other hand, see their earnings growth rate fall from 24.6% to 12.1% (decrease of 12.5 percentage points), while the market’s rate decreases from 14.2% to 10.6% (decrease of 3.6 percentage points). The figures for cash flow growth are similar: value stocks increase their growth rate by 2.3 percentage points, while the market decreases its growth rate by 3.3 percentage points and the growth stocks see a decrease in growth rate of 10.3 percentage points. Changes in sales growth rates are not as convincing, but do not contradict my hypothesis: value stocks do as well as the market (seeing a 3.6 percentage point decrease in sales growth), while growth stocks see a whopping 6.5 percentage point decrease in sales growth rate.
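For anyone who wants to check the earnings growth arithmetic, here is a short sketch using only the figures quoted above (the variable and function names are made up for illustration):

```python
# Earnings growth rate (%) at selection and five years later, from the text.
earnings_growth = {
    "value": (6.4, 11.6),
    "growth": (24.6, 12.1),
    "market": (14.2, 10.6),
}

def change_in_points(before, after):
    """Change in growth rate, in percentage points."""
    return round(after - before, 1)

changes = {name: change_in_points(*rates) for name, rates in earnings_growth.items()}

for name, change in changes.items():
    relative = round(change - changes["market"], 1)
    print(f"{name}: {change:+.1f} points ({relative:+.1f} relative to the market)")
```

Measured against the market, value stocks gain and growth stocks lose by nearly the same amount, which is what regression to the mean would predict.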

The icing on the cake is in return on equity (ROE) and profit margin. In both cases there is no such benefit for value stocks over growth stocks. Why? Both ROE and profit margin are primarily determined by the industry a company is in: commodity industries will see lower ROE and lower profit margins, while industries with a possibility of long-lasting competitive advantage will see higher ROE and profit margins. ROE and profit margins tend to remain relatively stable (but generally decreasing over time for every company), meaning that they are reliable measurements. More reliable measurements mean less regression to the mean.

So what does this all mean? Investors do not overreact to good or bad news. Or at the very least, it is not some sort of emotional overreaction. Rather, they predict that current trends (either negative or positive) will continue. They do not take the unreliability of their estimates into account. Thus, they neither anticipate nor understand regression to the mean.

While this article is geared towards investors, traders need to know how regression to the mean works. I will address specific regression issues in trading in a future article.

*This is not true. I am not sure how to calculate the correct number, though, so I will use this as an approximation.

How the NYSE and SEC abet stock fraud by limiting short selling of penny stocks

This is a classic trading post from my investing blog, GoodeValue.com.

I’ve been going over Regulation T (Reg T; you can see it in its full glory here), which is the Federal Reserve Board rule, enforced by the SEC, that governs margin loans, as well as the NYSE margin rules for margin accounts. And if I were designing regulations to increase stock fraud, I could think of no better way to do it.

Why is this? The margin requirements for short selling stocks are greater than for buying stocks, at least for cheap stocks (below $2.50 per share). Here is how it works for stocks above $5. You will note the nice symmetry between short and long margin requirements. While the margin requirement for buying stocks is 50%, the requirement for short-selling stocks is 150%. Here’s an example: if I buy a stock for $10 per share (let’s say 100 shares), I only need to put up $500, or half the total value of the stock. If I want to sell the same stock short, I need to put up $500 (plus the $1,000 in proceeds from the sale of the borrowed stock). So there is symmetry between short and long margin requirements. (Investopedia has an in-depth explanation of this.) If the price of a stock is below $5, there is no margin allowed on either long or short sales. So if I want to buy 100 shares of a stock at $3, I must have $300 in cash (or margin from a higher-priced stock). If I want to short sell the same stock I would likewise need the same amount of cash or margin available.

The symmetry between long and short breaks down, however, with stocks under $2.50 per share. The NYSE has a rule (rule 431 (c) 2) that requires $2.50 in cash or margin for every share sold short of a stock priced below $2.50. A comparable rule does not exist for long positions. So if I want to buy 1000 shares of a penny stock trading at $0.40, I need $400 in cash or margin ability from marginable stocks. But if I want to short 1000 shares of a $0.40 stock I need $2,500 in cash or margin. So any time someone shorts a stock under $2.50, they have negative leverage: the position value ($400) is but a fraction of the money needed to hold the position ($2,500). For this reason, very few short sellers sell short cheap stocks. Fraudulent companies or worthless shell companies trade at absurd valuations because their share prices are too low to attract short sellers.
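The requirements as described in this post can be put into a few lines of code. This is a simplified sketch of the description above, not of the actual rule text: the function names are invented, the handling of prices at exactly $5.00 and $2.50 is my assumption, and the short figure is the cash or margin needed on top of the sale proceeds.

```python
def long_requirement(price, shares):
    """Cash or margin needed to buy, per the rules as described in the post."""
    value = price * shares
    if price >= 5:
        return 0.5 * value      # 50% initial margin
    return value                # no margin allowed below $5

def short_requirement(price, shares):
    """Cash or margin needed to sell short, beyond the sale proceeds."""
    if price >= 5:
        return 0.5 * price * shares   # 150% total, 100% of it from proceeds
    if price >= 2.5:
        return price * shares         # no margin allowed below $5
    return 2.5 * shares               # $2.50 per share under $2.50

for price, shares in ((10.0, 100), (3.0, 100), (0.40, 1000)):
    long_req = long_requirement(price, shares)
    short_req = short_requirement(price, shares)
    print(f"${price:.2f} x {shares}: long needs ${long_req:,.0f}, short needs ${short_req:,.0f}")
```

For the $0.40 stock the short seller ties up $2,500 to hold a $400 position, so the position is only 16% of the capital committed.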

Most of the financial fraud in public companies nowadays is with penny stocks. The reason is that short sellers cannot afford to sell short cheap stocks. If the NYSE $2.50 rule were eliminated, more short sellers would be willing to take short positions in overvalued penny stocks. Pump and dump scams would not be as effective because short sellers like myself would easily be able to short sell the pumped-up stocks earlier, at cheaper prices, reducing the harm to the poor rubes who fall for such scams.

Removing the $2.50 rule would increase the amount of information available about penny stocks as short sellers like myself would write critically about the overvalued stocks they sold short. This would give the poor rubes a chance to learn the truth about the worthless stock they were considering buying and this would further reduce the success of pump and dump scams.

Please, contact the NYSE and urge them to stop supporting scammers and fraudsters. Urge them to remove the $2.50 requirement.

Disclosure: I love short-selling penny stocks.

Use the Kelly Criterion to determine position size

This is a classic trading post from my non-trading blog, GoodeValue.com.

The Kelly Criterion is a formula for choosing how large a bet to make on each trade/investment/gamble. It works for the stock market, though it was originally developed for gambling. The formula is simple: bet the proportion of your portfolio given by the expected return divided by the maximum return. Expected return is what you expect to earn on average in the long run.

So, the formula is: P_invest = E(r) / M(r)
where
P_invest = proportion of portfolio to invest
E(r) = expected return
M(r) = maximum return

Now, a couple of examples:

1. If you flip a fair coin and win $1 if heads and lose $1 if tails, the expected return is $0 (.5 x $1 + .5 x -$1). The maximum return is $1 (if heads). Therefore, the Kelly criterion suggests you bet no money ($0/$1). This makes sense, because you should not invest money where you expect to only break even.

2. You want to short Apple (AAPL) because you think there is an 80% chance the stock will go down in the next month. You think if that happens, the stock will go down 10%. You figure that there is a 20% chance that the stock will go up 5%. The expected return is 7% (.8 x 10% + .2 x -5%). The maximum gain is 10%. The Kelly formula suggests that you invest at most 70% (7/10) of your portfolio.

3. Same thing, shorting AAPL. You like the odds, so you increase your leverage by buying put options. You buy just out of the money options. Now, there is a 70% chance that your options expire worthless (-100% return) and a 30% chance that you make 300%. The expected return is +20% (.7 x -100% + .3 x 300%). The maximum gain is 300%. The Kelly formula says that you should bet at most 1/15 (about 6.7%) of your portfolio (20/300).
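The three examples above can be reproduced with a few lines of code. This sketch implements only the simplified ratio used in this post (expected return divided by maximum return), not a general Kelly calculation that maximizes long-run log growth. The function name and the list-of-outcomes input format are my own choices for illustration.

```python
def kelly_fraction(outcomes):
    """Fraction of the portfolio to bet, using the simplified formula
    P_invest = E(r) / M(r).

    outcomes: list of (probability, return) pairs, with returns as
    fractions of the amount bet (0.10 means +10%, -1.0 means total loss).
    """
    expected = sum(p * r for p, r in outcomes)
    maximum = max(r for _, r in outcomes)
    if expected <= 0 or maximum <= 0:
        return 0.0   # no edge, so no bet
    return expected / maximum

examples = {
    "fair coin": [(0.5, 1.0), (0.5, -1.0)],
    "short AAPL": [(0.8, 0.10), (0.2, -0.05)],
    "AAPL puts": [(0.7, -1.0), (0.3, 3.0)],
}

for name, outcomes in examples.items():
    print(f"{name}: bet at most {kelly_fraction(outcomes):.1%} of the portfolio")
```

Because the result is only as good as the probabilities fed into it, small changes to the guessed odds can move the suggested bet size a lot. Try changing the 80% in the AAPL example to 60% to see the effect.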

One thing to consider is that the Kelly formula seeks only to maximize gains. If you wish to minimize portfolio variability as well, you should invest significantly less than the maximum allowed by the Kelly formula. Also, keep in mind that the formula is only as good as your guesses of probability. In order to minimize portfolio volatility and because it is very difficult to accurately estimate the expected return on a trade a priori, many traders stick to using a very small fixed percentage of their portfolio on each trade.

I recommend a Legg Mason article on the Kelly Criterion, or this paper by Edward Thorp (who used it to great effect).

Visit Cisiova’s website for their advanced online Kelly Criterion calculator, which allows you to enter a large number of possible outcomes.

If you liked this post you may want to check out William Poundstone’s book Fortune’s Formula.

Disclosure: I own no Apple stock, long or short. Unfortunately, I did once lose money shorting AAPL. My disclosure policy never loses me money.

What to expect on this blog

What can you expect on my new blog? I will blog about trading strategy, trading psychology, and my own trades. For the time being I do not foresee offering any pay services even though some readers have encouraged me to start a Reaper Alerts service much like Tim Sykes’ TimAlerts (of which I am a lifetime member). I do not rule out offering such a service in the future, but for the time being protecting my best trading strategy is more important to me than earning a few thousand dollars from selling my knowledge.

In addition to this blog I have a new Twitter account. See my @ReaperTrades twitter account for all future tweets.

First up: reposts of my best trading articles from my other blog. Expect new posts here by the end of the week.

Disclosure: I am an affiliate of Tim Sykes and am a happy customer of both. See my disclosures & disclaimers page for more details on what products I have purchased from Sykes.