In Part 1 of this two-part article, I showed that normality is key to getting optimization to do its work over the long haul. In this segment, I shall discuss how normality helps structure a simple way to rebalance a portfolio that may include non-normal stocks.
Berkshire's (BRK.B) holdings as of Dec 30, 2011:
American Express Co (AXP); Bank of New York Mellon Corp (BK); Comdisco Holdings (OTCQB:CDCO); Coca Cola Co (KO); Costco Wholesale (COST); ConocoPhillips (COP); Gannett Co (GCI); Dollar General Corporation (DG); General Electric Corp (GE); GlaxoSmithKline (GSK); Ingersoll-Rand (IR); M&T Bank Corp (MTB); Mastercard (MA); Moody's Corp (MCO); Procter & Gamble (PG); Sanofi-Aventis (SNY); Torchmark Corp (TMK); US Bancorp (USB); USG Corp (USG); United Parcel Service (UPS); Wal-Mart Stores (WMT); Washington Post (WPO); CVS Caremark Corp (CVS); DirecTV (DTV); General Dynamics Corp (GD); Intel Corp (INTC); Visa (V); Wells Fargo & Co (WFC); Da Vita (DVA); International Business Machines Corp (IBM); Liberty Media Corp (LMCA); Johnson & Johnson (JNJ); Kraft Foods (KFT); Verisk Analytics (VRSK).
As usual, we will leave it to readers to include the risk-free asset in their own analysis, should they wish to do so. Recall that the inclusion of a risk-free asset in the mean-variance calculations is mathematically equivalent to the Capital Allocation Line in CAPM theory. Also, at no time do we suggest that this is how Berkshire does its rebalancing.
In Part 1 of this article, I showed the following chart of 5 stocks in Berkshire's portfolio that tested non-normal. Here, volatility was illustrated by taking a moving average of absolute log returns.
Optimization calculations are dependent on 3 inputs, namely, mean returns, volatility, and the correlations that exist between assets.
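To make the three inputs concrete, here is a minimal sketch of how each is typically computed from a price history. The price series below is synthetic and for illustration only; it is not Berkshire data.

```python
import numpy as np

# Hedged sketch: derive the three optimizer inputs (mean returns,
# volatility, correlations) from daily closing prices.
# The price matrix is synthetic, generated purely for illustration.
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, size=(252, 3)), axis=0))

log_returns = np.diff(np.log(prices), axis=0)                # daily log returns
mean_returns = log_returns.mean(axis=0) * 252                # annualized mean return
volatility = log_returns.std(axis=0, ddof=1) * np.sqrt(252)  # annualized volatility
correlations = np.corrcoef(log_returns, rowvar=False)        # pairwise correlations

print(mean_returns.round(3))
print(volatility.round(3))
print(correlations.round(2))
```

The annualization factors (252 trading days) are a common convention, not a requirement of the theory.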
In mean-variance optimization, volatility is simply the standard deviation of the annualized returns. This assumes a normal distribution, under which the variance by definition exists, is finite, and as such is easily calculated.
Although complicated models that attempt to estimate non-normal behavior can outperform simpler models on the data from which they are fitted, their performance will suffer when subjected to future data.
Investors should therefore be skeptical of claims that a complicated model that attempts to estimate non-normal behavior will result in superior future performance.
To serve as an illustration, if I tried to fit the volatility for American Express Co shown in the top left graph of the figure above with just 1 parameter, the estimate (the purple line) would look like this:
The one-parameter estimate looks like, and is in fact, a simple average. Now, if I got ambitious and plugged in 10 parameters, I would get the following estimate:
If I aimed even higher and plugged in 25 parameters, I would get what seems like an even better estimate:
And indeed it is a better estimate --- but of the past and not of the future. When too many parameters are fitted to a sample, the model begins to fit the noise instead of the signal. This is an extreme example but one that I feel amply illustrates what I am trying to say.
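The effect described above can be reproduced with a short simulation. The volatility series below is synthetic (a sine wave plus noise, standing in for the American Express chart), and the parameter counts mirror the 1-, 10- and 25-parameter fits discussed: in-sample error keeps falling as parameters are added, while out-of-sample error deteriorates.

```python
import numpy as np

# Hedged sketch of the overfitting effect: polynomial "volatility
# estimates" with 1, 10 and 25 parameters fitted to a noisy synthetic
# series. Only the qualitative point matters, not the numbers.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.2, 300)                  # time axis; last stretch held out
vol = 0.20 + 0.05 * np.sin(2 * np.pi * t) + rng.normal(0, 0.02, t.size)

train = t <= 1.0                                # fit only on the earlier data

def rmse(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

results = {}
for degree in (0, 9, 24):                       # 1, 10 and 25 parameters
    coefs = np.polyfit(t[train], vol[train], degree)
    fit = np.polyval(coefs, t)
    results[degree + 1] = (rmse(fit[train], vol[train]),    # in-sample error
                           rmse(fit[~train], vol[~train]))  # out-of-sample error
    print(degree + 1, results[degree + 1])
```

The 25-parameter fit hugs the past most closely yet extrapolates worst, which is exactly the fitted-noise problem described above.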
Skewness & Kurtosis
A normal distribution places about 68% of its data within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three.
When distributions are deemed non-normal, it means they do not follow these proportions. For example, it may be that only 40% of the data are within one standard deviation rather than 68%. Or the data may not be equally distributed on both sides of the mean, i.e., there might be 70% on the left and only 30% on the right of the mean.
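These proportions are easy to verify by simulation; a minimal sketch with a large normal sample:

```python
import numpy as np

# Quick empirical check of the 68/95/99.7 rule for a normal sample.
rng = np.random.default_rng(2)
z = rng.standard_normal(1_000_000)

fracs = [np.mean(np.abs(z) < k) for k in (1, 2, 3)]
print([round(f, 4) for f in fracs])   # close to 0.6827, 0.9545, 0.9973
```

A non-normal return series will show materially different fractions at these same cutoffs, which is one quick diagnostic.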
The graph below shows how American Express Co return-difference data (blue bars) seem to be in greater quantity on the left of the zero point on the axis. In this graph the zero point represents the average (mean) of all the annualized returns. Data on the right represent occurrences of above-average returns while data on the left are below-average.
In statistical parlance, this is called positive skewness. In such a distribution, the bulk of the data sits to the left of where a normal distribution (red bars) would place it. In order for the mean to be "restored", there must be more extreme positive returns than "normal" to balance out the rest of the return data.
While a distribution offering a chance of an extreme above-average return may seem preferable to one offering a chance of an extreme below-average return, random draws from such a distribution will generally be below average, and the shortfall is made up only when an extreme above-average return actually occurs.
Some researchers seek to estimate non-normality by estimating the skewness and kurtosis of the underlying probability distributions.
Investors should realize that taking such skewness into the calculations means opting for a strategy that underperforms their peers during times of market stability, because positioning for rare extreme outcomes tilts the portfolio away from the mix that does best under ordinary conditions.
The same logic applies to those trying to estimate kurtosis, a measure of the heaviness of a distribution's tails. Heavier tails at both extremes mean that extreme outcomes, in either direction, are more likely than a normal distribution would suggest.
When kurtosis is taken into the calculations, it would mean that investors have once again opted for a strategy that underperforms their peers during times of market stability.
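For readers who want to see what these two moments look like numerically, here is a minimal sketch computing sample skewness and excess kurtosis from their definitions. The exponential draw is synthetic, chosen because its true skewness (2) and excess kurtosis (6) are known, so the estimates can be sanity-checked.

```python
import numpy as np

# Hedged sketch: sample skewness and excess kurtosis computed directly
# from standardized moments. The data is a synthetic exponential sample
# (true skewness 2, true excess kurtosis 6), not stock returns.
rng = np.random.default_rng(3)
x = rng.exponential(size=1_000_000)

z = (x - x.mean()) / x.std()
skewness = np.mean(z ** 3)               # asymmetry about the mean
excess_kurtosis = np.mean(z ** 4) - 3.0  # tail weight relative to a normal

print(round(skewness, 2), round(excess_kurtosis, 2))
```

Note how slowly these higher moments converge even with a million points; estimating them from a few years of monthly returns is far noisier still, which is part of the overfitting concern raised earlier.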
Rebalancing Suggestion #1: Let's adopt a strategy that will not underperform in times of market stability.
Correlation Between Assets
It might surprise you to know that all 4 relationships in the figure below produce the same correlation. If two variables are jointly normally distributed, however, their dependence is linear, and correlation correctly measures the strength of that linear relationship.
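The figure appears to be the classic Anscombe quartet (an assumption on my part); in any case, Anscombe's four data sets make the same point and can be checked directly:

```python
import numpy as np

# Anscombe's quartet: four very different relationships, one correlation.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = [
    (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
     [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
]

corrs = [np.corrcoef(x, y)[0, 1] for x, y in quartet]
print([round(c, 3) for c in corrs])   # all four are ~0.816
```

One linear trend, one curve, one outlier-dominated line, one vertical cluster: the correlation coefficient cannot tell them apart.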
Let's discuss another example.
If you have a risk that is defined by a normally distributed random variable X, then it is obvious that X would be perfectly positively dependent on a risk defined by exp(100*X).
However, it can be easily shown that despite the "perfect dependence" that exists between both risks, the correlation is close to zero. The reason for this is that the relation between both risks is not linear but exponential. In other words, you can calculate a number (i.e. the correlation coefficient) but this number would be meaningless.
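This is easy to verify by simulation. One caveat: exp(100*X) overflows double-precision floats, so the sketch below uses exp(10*X); the qualitative result is the same. A rank-based (Spearman) correlation is included for contrast, since it sees the perfect monotone dependence that Pearson correlation misses.

```python
import numpy as np

# Hedged sketch of the example above. exp(100*X) overflows double
# precision, so exp(10*X) is used instead. Y is a deterministic,
# strictly increasing function of X, yet the Pearson correlation is
# tiny because the relationship is exponential, not linear.
rng = np.random.default_rng(4)
x = rng.standard_normal(100_000)
y = np.exp(10 * x)

pearson = np.corrcoef(x, y)[0, 1]

def ranks(a):
    # Rank of each element (0 = smallest).
    r = np.empty(a.size)
    r[np.argsort(a)] = np.arange(a.size)
    return r

# Spearman correlation = Pearson correlation of the ranks.
spearman = np.corrcoef(ranks(x), ranks(y))[0, 1]
print(round(pearson, 3), round(spearman, 3))   # pearson tiny, spearman = 1
```

The ranks of x and y are identical because the mapping is strictly increasing, so the rank correlation is exactly 1 while the "meaningless number" from Pearson correlation sits near zero.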
So correlation measures only the linear component of the relationship between 2 variables. It fully captures the dependence between them only in special cases, such as when the variables are jointly normally distributed.
But what if distributions are non-normal and such a linear relationship does not exist? Copulas extend the notion of correlations, and can be used to determine non-linear dependencies.
Calculating a copula is, however, complex and finding the right copula (there are several different ones) adds to this complexity.
Rebalancing Suggestion #2: Copulas are technically sophisticated but the added precision may not be large enough to warrant their use. Why not transform non-normal data to simulate normality so simple correlations can be used?
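One simple way to implement this suggestion is a rank-based inverse-normal transform: replace each observation by the normal quantile of its rank. The article does not specify an exact method, so treat this as one plausible sketch, not the author's procedure.

```python
import numpy as np
from statistics import NormalDist

# Hedged sketch of Suggestion #2: map any return series onto normal
# quantiles via its ranks, so ordinary Pearson correlations can be used
# on the transformed data. One technique of many, chosen for simplicity.
def to_normal_scores(x):
    x = np.asarray(x, dtype=float)
    r = np.empty(x.size)
    r[np.argsort(x)] = np.arange(1, x.size + 1)   # ranks 1..n
    p = (r - 0.5) / x.size                        # mid-rank probabilities in (0, 1)
    nd = NormalDist()
    return np.array([nd.inv_cdf(pi) for pi in p]) # normal quantiles

rng = np.random.default_rng(5)
fat_tailed = rng.standard_t(df=3, size=5000)      # synthetic non-normal "returns"
z = to_normal_scores(fat_tailed)

print(round(np.mean(np.abs(z) < 1), 3))           # ~0.683, as a normal should be
```

Because the transform is strictly rank-preserving, the ordering of good and bad return days is untouched; only the shape of the distribution is normalized.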
Correlation in times of crises
Just as an aside, there is an oft-quoted comment that assets become highly correlated (or go to 1) during extreme events when all assets seem to move up and down together. My response to this comment against the backdrop of long-term investing is "so what if they do".
Returns can be thought of as the resultant effect of many component forces. Sometimes certain system-wide effects may dominate all the returns of a set of assets. These systemic shocks to the returns, because of their large magnitudes, may produce a large measured short-term correlation close to 1 when the assets' returns are considered together.
The word to note here is short-term.
In the figure above, the correlation between Coca Cola and GlaxoSmithKline starts near 1 in the short-term BUT stabilizes in the long term as more data points go into the calculation. This is fairly typical.
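The same stabilization shows up in a simple simulation: two synthetic return series with a true correlation of 0.5, measured over windows of increasing length. (The data here is synthetic; it stands in for the KO/GSK pair, which I cannot reproduce.)

```python
import numpy as np

# Hedged sketch of the stabilization effect: expanding-window
# correlation between two synthetic return series whose true
# correlation is 0.5. Short windows swing widely; long windows
# settle near the truth as more data points enter the calculation.
rng = np.random.default_rng(6)
n = 2000
a = rng.standard_normal(n)
b = 0.5 * a + np.sqrt(1 - 0.25) * rng.standard_normal(n)  # corr(a, b) = 0.5

windows = [20, 100, 2000]
estimates = {w: np.corrcoef(a[:w], b[:w])[0, 1] for w in windows}
print({w: round(c, 3) for w, c in estimates.items()})
```

A crisis-driven reading taken over a short window is exactly the 20-point estimate: real, but not representative of the long-term relationship a rebalancer cares about.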
A Rebalancing Technique
Every adviser or investor has his favorite approach to rebalancing. Some advisers rebalance regularly on a calendar basis, for example, once a year. Others rebalance when the portfolio moves away from its intended mix by a certain percent. And some don't even rebalance. The rebalancing approach below takes into consideration the 2 rebalancing suggestions mentioned in the preceding paragraphs.
As shown in Part 1 of this article, the performance of Berkshire's combined portfolio consisting of both normal and non-normal stocks when using a Buy & Hold strategy is shown in the following graph. The blue bars denote the performance of an optimized-mix using the Markowitz mean-variance algorithm while the red bars are an equal-weights mix. To simplify, taxes and other expenses are ignored.
While empirically historical returns generally do not conform to normal distribution assumptions, it has been argued that non-normality may be attributed to the likely existence of mixed normal return distributions. Readers who are interested to do further research can check out Rosenberg, B. (1974). "Extra-Market Components of Covariance in Security Returns." Journal of Financial and Quantitative Analysis 9(2), 263-273.
The notion is that the "mean and variance" do approximate investors' expectations at any point of time but that the "returns and variances" vary over time. In this context, a normal return distribution assumption is often appropriate for risk-return estimation at a given point of time.
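The mixed-normal idea is easy to demonstrate: let returns be perfectly normal at every point in time, but let the variance switch between a calm and a stressed regime. The unconditional distribution then shows the fat tails observed in real data. The regime probabilities and volatilities below are arbitrary illustrations, not estimates.

```python
import numpy as np

# Hedged sketch of the mixed-normal idea: returns are normal within
# each regime, but the variance differs between regimes. The pooled
# (unconditional) distribution is fat-tailed even though every regime
# on its own is perfectly normal. Regime parameters are arbitrary.
rng = np.random.default_rng(7)
n = 1_000_000
stressed = rng.random(n) < 0.1                 # 10% of days are "stressed"
sigma = np.where(stressed, 0.04, 0.01)         # per-regime daily volatility
returns = rng.standard_normal(n) * sigma

z = (returns - returns.mean()) / returns.std()
excess_kurtosis = np.mean(z ** 4) - 3.0
print(round(excess_kurtosis, 2))               # well above 0: fat tails
```

This is why a normal assumption can remain appropriate "at a given point of time" even though the pooled historical sample fails normality tests.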
Consider now the performance of a modified approach that adjusts the data to simulate normality, takes into account the fact that "returns and variances" vary over time, and capitalizes on the correlations that exist between stocks in a portfolio, whether they are normal or non-normal.
Rebalancing can be both an art and a science.
Rebalancing in the context of optimization is a science because it starts with a Nobel-prize winning theory. The Markowitz mean-variance algorithm stems from this theory.
Employing the mean and variance in the context of mean-variance optimization is a strategy that will not underperform in times of market stability.
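For concreteness, here is the smallest piece of that machinery: the closed-form minimum-variance portfolio, w = C⁻¹1 / (1ᵀC⁻¹1), where C is the covariance matrix. The covariance numbers below are made up for illustration; Berkshire's actual inputs are of course not shown here.

```python
import numpy as np

# Hedged sketch of the mean-variance machinery: the closed-form
# minimum-variance portfolio for a fully invested mix of 3 assets.
# The covariance matrix is synthetic, for illustration only.
cov = np.array([
    [0.040, 0.010, 0.008],
    [0.010, 0.090, 0.012],
    [0.008, 0.012, 0.060],
])

ones = np.ones(cov.shape[0])
w = np.linalg.solve(cov, ones)   # C^{-1} 1
w /= w.sum()                     # normalize so the weights sum to 1

port_var = w @ cov @ w           # portfolio variance at these weights
eq = ones / ones.size            # equal-weights benchmark
print(w.round(3), round(port_var, 5), round(eq @ cov @ eq, 5))
```

By construction this mix has the lowest variance of any fully invested portfolio under the given covariance matrix, including the equal-weights mix used as the red-bar benchmark earlier.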
Rebalancing is also an art form in the sense that it does not come packaged in a closed-end mathematical formula. Transforming non-normal data to simulate normality is one technique of many, but it is simple to use. Choosing the length of data that goes into the calculations is a tradeoff: too much data gives stability but loses information on recent trends, while too little data makes the estimates unstable.
The art is in the balance.