Correlation Analysis: The First Step Towards Portfolio Diversification

by: Marius Bausys

Investors regularly hear clichéd suggestions to diversify their portfolio by investing in assets with low correlations. However, it is not always clear what exact steps need to be taken to reap the benefits of diversification. In this article, I would like to address the question of how market participants can incorporate simple correlation analysis into their investment decision-making process.

Simply put, diversification is an investment approach that aims to combine different investments in order to reduce the overall risk in a portfolio. The underlying driver of the diversification effect is the correlation between returns of portfolio components. Most readers are familiar with the notion of correlation, but for those who need a refresher, a good starting place is here (pdf). The key point is that the volatility of a long-only portfolio is always lower than the weighted average of the individual asset volatilities as long as the assets are not perfectly correlated. The less correlated the assets are, the more significant the diversification benefit.
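To make the key point concrete, here is a minimal Python sketch of the standard two-asset portfolio volatility formula. The function names and the input figures (a 50/50 split between assets with 20% and 30% volatility) are made up for illustration and do not come from any real security.

```python
import math


def portfolio_volatility(w1, vol1, vol2, rho):
    """Volatility of a two-asset portfolio with weights w1 and 1 - w1."""
    w2 = 1.0 - w1
    variance = (
        (w1 * vol1) ** 2
        + (w2 * vol2) ** 2
        + 2.0 * w1 * w2 * vol1 * vol2 * rho
    )
    return math.sqrt(variance)


def weighted_average_volatility(w1, vol1, vol2):
    """Weighted average of the two stand-alone volatilities."""
    return w1 * vol1 + (1.0 - w1) * vol2


# Hypothetical inputs: 50/50 weights, 20% and 30% volatility.
average = weighted_average_volatility(0.5, 0.20, 0.30)
for rho in (1.0, 0.5, 0.0, -0.5):
    vol = portfolio_volatility(0.5, 0.20, 0.30, rho)
    print(f"rho={rho:+.1f}  portfolio vol={vol:.4f}  weighted average={average:.4f}")
```

Only at a correlation of exactly 1 does the portfolio volatility match the weighted average. For every lower value it falls below it, and the gap widens as the correlation drops.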

Tools to calculate correlations

First of all, correlation analysis requires estimation of the coefficients. This is where a lot of investors struggle as historical data gathering, model selection and calculation procedure all require effort and resources.

Fortunately, there are freely available online tools that can quickly calculate correlation coefficients even for large portfolios. Such tools are straightforward to use, and an investor need not worry about data gathering or cleaning. On the downside, online resources are rarely flexible and users have to rely on models designed by developers.

If an investor feels comfortable running calculations on their own, they can utilize built-in functions of common spreadsheet applications, such as Excel. This "manual" way of data crunching gives a good understanding of the methodology applied. The drawback of Excel, as I found out the hard way when writing my thesis on minimum variance portfolios, is that historical data cleaning and management are time-consuming when the analysis needs to be performed on a regular basis.

Finally, there are stand-alone statistical analysis packages (R, MATLAB, etc.) that offer by far the best functionality. However, they demand a significantly higher level of knowledge of statistics and in some cases may simply be too costly for an average retail investor.
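Whichever tool is used, the underlying calculation is usually the same: convert prices to periodic returns, then compute the Pearson correlation of those returns. As a rough illustration of what a scripted approach could look like, here is a sketch in Python with the pandas library (my choice for the example, not one of the tools named above). The tickers AAA and BBB and their prices are invented, and the helper name return_correlation is hypothetical.

```python
import pandas as pd


def return_correlation(prices: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation matrix of simple period-over-period returns."""
    returns = prices.pct_change().dropna()  # prices -> returns, drop first NaN row
    return returns.corr()


# Invented closing prices for two hypothetical tickers (illustration only).
prices = pd.DataFrame(
    {
        "AAA": [100.0, 102.0, 101.0, 104.0, 103.0, 106.0],
        "BBB": [50.0, 50.5, 50.8, 50.6, 51.5, 51.2],
    }
)

print(return_correlation(prices).round(2))
```

Note that the correlation is taken on returns rather than on price levels, in line with the definition of the diversification effect given earlier.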

Ways to utilize correlation analysis

Correlation analysis in the portfolio management context can be performed at different levels of granularity. However, my experience shows that there are only two main methods, both of which are straightforward and intuitive:

1. Assessment of the full correlation matrix. In this case, an investor investigates inter-correlations between securities of interest and chooses the ones that have the lowest correlation with other components or the overall portfolio.

2. Assessment of correlation coefficients between securities of interest and a single market factor that an existing portfolio is largely exposed to, e.g. S&P 500. Using this method, an investor chooses securities that have the lowest correlation with the chosen factor.
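The two methods can be sketched in a few lines of Python with pandas. Everything here is an assumption made for illustration: the return series are randomly generated, the column names (MARKET, HOLD_1, HOLD_2, CAND_A, CAND_B) are hypothetical, and summarizing Method 1 by a simple average across holdings is just one possible way to read the full matrix.

```python
import numpy as np
import pandas as pd


def average_correlation_with_holdings(returns, candidates, holdings):
    """Method 1: mean correlation of each candidate with the existing holdings."""
    corr = returns.corr()  # full correlation matrix
    return corr.loc[candidates, holdings].mean(axis=1).sort_values()


def correlation_with_factor(returns, candidates, factor):
    """Method 2: correlation of each candidate with one market factor."""
    return returns[candidates].corrwith(returns[factor]).sort_values()


# Synthetic daily returns (not real market data).
rng = np.random.default_rng(0)
n = 500
market = rng.normal(0.0, 0.01, n)
returns = pd.DataFrame(
    {
        "MARKET": market,
        "HOLD_1": market + rng.normal(0.0, 0.005, n),
        "HOLD_2": market + rng.normal(0.0, 0.008, n),
        "CAND_A": 0.8 * market + rng.normal(0.0, 0.01, n),  # strongly market-driven
        "CAND_B": 0.2 * market + rng.normal(0.0, 0.01, n),  # weakly market-driven
    }
)

candidates = ["CAND_A", "CAND_B"]
print(average_correlation_with_holdings(returns, candidates, ["HOLD_1", "HOLD_2"]))
print(correlation_with_factor(returns, candidates, "MARKET"))
```

Both functions return the candidates sorted from lowest to highest correlation, so the first entry is the preferred diversifier under the respective method.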

Robustness checks

It is also of huge importance to perform robustness checks for persistence of coefficients. At the very minimum, correlations should be calculated over different time frames (e.g. compare 1, 2 and 5 year coefficients). Better still, plotting a chart of historical coefficient values will give an investor a good understanding of the co-movement during different market phases. For instance, the chart below shows a rolling 1 year correlation between S&P 500 (NYSEARCA:SPY) and gold (NYSEARCA:GLD) over the last 5 years:

We can see that the coefficient ranged from approximately -0.25 to +0.44 during the selected period. Therefore, the current reading of 0.35 appears to be near the top of the range and should be interpreted with care, as historically the correlation has mostly been lower. As markets keep changing all the time, asset correlations should be tested periodically and the portfolio may need to be rebalanced as a result.
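A rolling correlation of this kind can be reproduced with a short Python sketch using pandas. The 252-observation window is my assumption for "1 year" of daily data, the return series are randomly generated rather than actual SPY or GLD returns, and the function name is hypothetical.

```python
import numpy as np
import pandas as pd


def rolling_correlation(returns_a, returns_b, window=252):
    """Correlation of two return series over a moving window of observations."""
    return returns_a.rolling(window).corr(returns_b).dropna()


# Synthetic daily returns, roughly 5 years of trading days (not real data).
rng = np.random.default_rng(1)
n = 1260
common = rng.normal(0.0, 0.01, n)
asset_a = pd.Series(common + rng.normal(0.0, 0.005, n))
asset_b = pd.Series(0.3 * common + rng.normal(0.0, 0.01, n))

rolling = rolling_correlation(asset_a, asset_b)
print(f"min={rolling.min():.2f}  max={rolling.max():.2f}  latest={rolling.iloc[-1]:.2f}")
```

Comparing the latest value with the historical minimum and maximum gives a quick sense of whether the current reading is typical or sits at an extreme of its range.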

A working example

To demonstrate how the correlation analysis could be applied in practice, let's assume an investor holds a simple portfolio that is 75% invested in US stocks (NYSEARCA:VTI) and 25% in emerging market stocks (NYSEARCA:EEM). Now suppose our investor wishes to add exposure to gold miners and is considering the two most popular alternatives: Market Vectors Gold Miners ETF (NYSEARCA:GDX) and Market Vectors Junior Gold Miners Fund (NYSEARCA:GDXJ). The investor wants to know which ETF would fit the portfolio better, i.e. provide a more substantial diversification benefit.

Method 1. For the purpose of the analysis, I have computed a correlation matrix for the last year on InvestSpy:

As both GDX and GDXJ are fairly similar products, their performance is also expected to be consistent. Nonetheless, we observe that GDX is less correlated than GDXJ to both VTI (0.34 vs. 0.38) and EEM (0.46 vs. 0.51).

To check whether the same finding would hold true over a longer time frame, I have re-run the computations for the full period since GDXJ's inception on November 10, 2009:

The correlations during this period appear to be consistent with initial findings as GDX remains less correlated than GDXJ to both VTI and EEM.

In general, more emphasis should be placed on correlations with portfolio holdings that account for a larger proportion (i.e. VTI with 75% of total portfolio vs. EEM with 25%). However, this is less relevant in our case as GDX is a better diversifier for both VTI and EEM.
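One way to account for the size of each holding is to correlate every candidate with the returns of the weighted portfolio itself rather than with each holding separately. The Python sketch below is only an illustration of that idea and not the calculation behind the figures above: the return series are randomly generated, the names CORE, SATELLITE, CAND_A and CAND_B are hypothetical, and the portfolio is assumed to be rebalanced to its target weights every period.

```python
import numpy as np
import pandas as pd


def correlation_with_portfolio(returns, weights, candidates):
    """Correlation of each candidate with a fixed-weight portfolio's returns."""
    weight_series = pd.Series(weights)
    # Weighted sum of holding returns (assumes rebalancing every period).
    portfolio_returns = (
        returns[weight_series.index].mul(weight_series, axis=1).sum(axis=1)
    )
    return returns[candidates].corrwith(portfolio_returns).sort_values()


# Synthetic daily returns (not real market data).
rng = np.random.default_rng(2)
n = 500
core = rng.normal(0.0, 0.01, n)
satellite = rng.normal(0.0, 0.01, n)
returns = pd.DataFrame(
    {
        "CORE": core,
        "SATELLITE": satellite,
        "CAND_A": core + rng.normal(0.0, 0.01, n),       # tracks the 75% holding
        "CAND_B": satellite + rng.normal(0.0, 0.01, n),  # tracks the 25% holding
    }
)

weights = {"CORE": 0.75, "SATELLITE": 0.25}
print(correlation_with_portfolio(returns, weights, ["CAND_A", "CAND_B"]))
```

In this made-up setup, the candidate that follows the larger holding shows the higher correlation with the portfolio, which is the effect of placing more emphasis on the bigger position.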

All in all, the conclusion is that GDX would be a better diversifier than its junior counterpart. Of course, an investor may want to take into consideration the composition, expense ratio, liquidity and other parameters of the ETFs in question, but from the diversification perspective GDX would get the nod.

Method 2. Referring to the second method described in the previous section of this article, the analysis can be simplified even further. A portfolio composed of VTI and EEM is a clear bet on long equities. Therefore, an investor could opt to evaluate correlations of GDX and GDXJ with a broad equity market index, e.g. SPY, instead of analyzing a full correlation matrix. Below are correlation coefficients for the portfolio components against the SPY ETF:

The correlation for GDX again appears to be lower than for GDXJ, thus the former would be preferred from a diversification perspective. Despite its slightly lower accuracy, Method 2 is generally simpler and more convenient to run for a larger set of securities.


Summing up, correlation analysis is a simple tool that investors can rely on when making portfolio allocation decisions. There are two main approaches, both of which are straightforward and easy to apply. Of course, correlation analysis can be extended in a number of ways, one of which I have briefly touched on in another article here. On a more sophisticated level, by combining correlations with volatilities, one could perform full portfolio optimization to maximize the return/risk ratio or to minimize risk, and I look forward to discussing this topic in future articles.

Disclosure: I have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it. I have no business relationship with any company whose stock is mentioned in this article.