Fidelity supplies ETF ratings from Ned Davis, S&P, Morningstar, Marco Polo and Sabrient Systems. In our view the Ned Davis ratings are of minimal value, because they provide too little differentiation between ETFs. S&P ratings provide more differentiation, but not much.
Morningstar, Marco Polo and Sabrient provide substantially more ratings differentiation, but their methods are proprietary and opaque.
In an article yesterday, we offered a simple, non-proprietary, and transparent rating method with highly granular differentiation and ranking of domestic equity ETFs.
Here is a table that shows the ratings on 21 key US domestic ETFs for each of the services supplied by Fidelity plus the one we published yesterday.
You can see that Ned Davis provides a 4 of 5 rating on all 21 funds. While they may all be good choices relative to the universe Ned Davis rates, within the domestic universe as represented by these 21 funds there is not much differentiation. That makes the service minimally useful.
You can also see that S&P, while it provides more differentiation than Ned Davis, doesn't provide nearly as much as the other three services.
Marco Polo appears to provide the most differentiation, and the most potential for differentiation, because it uses a one-decimal-place rating on a 10-point scale (100 levels, versus 3 from S&P and 5 each from Ned Davis, Sabrient and Morningstar).
Beyond differentiation, there are at least two important questions: one about reliability, and the other about method transparency.
We don't have data on reliability, so for us it would be a bit of a shot in the dark to rely on any of these rating systems available at Fidelity.
Our own method may not be reliable on an absolute basis, but is fairly likely to be reliable on a relative ranking basis, because it incorporates hundreds of analysts' views on thousands of stocks, taking into consideration whatever issues, data and indicators they feel are appropriate.
As for transparency, all but our method are essentially opaque.
Our system is simple, transparent and can be replicated by anyone. For the price return, it uses the year-ahead price appreciation implied by the difference between the current market price and the average analyst 1-year target price of each constituent company in a fund, weighted according to the weight of the stock within the fund's holdings. For the total return, it sums the calculated price return and the trailing dividend yield.
An important weakness of our method is the absence of any consideration of volatility.
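For readers who want to replicate the calculation, here is a minimal Python sketch of the two steps described above. The `Holding` class, the function names and the sample figures are hypothetical illustrations, and dividing by the total weight (so that funds with incomplete analyst coverage stay comparable) is our assumption for the sketch rather than a stated part of the method.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    weight: float        # fraction of fund assets, e.g. 0.25 for 25%
    price: float         # current market price of the stock
    target_price: float  # average analyst 1-year target price


def implied_price_return(holdings):
    """Holdings-weighted gap between analyst target price and market price."""
    total_weight = sum(h.weight for h in holdings)
    if total_weight <= 0:
        raise ValueError("holdings must carry a positive total weight")
    weighted_gap = sum(
        h.weight * (h.target_price / h.price - 1.0) for h in holdings
    )
    # Assumption: renormalize by the covered weight.
    return weighted_gap / total_weight


def implied_total_return(holdings, trailing_dividend_yield):
    """Implied price return plus the fund's trailing dividend yield."""
    return implied_price_return(holdings) + trailing_dividend_yield


# Hypothetical two-stock fund, for illustration only.
fund = [
    Holding(weight=0.6, price=100.0, target_price=110.0),
    Holding(weight=0.4, price=50.0, target_price=51.0),
]
print(round(implied_price_return(fund), 4))
print(round(implied_total_return(fund, 0.02), 4))
```

Ranking a set of funds then amounts to sorting them by the implied total return.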
The method is likely more useful in terms of relative ranking of future performance than in predicting absolute levels of future performance.
The descriptions for the other methods shown below are from the Fidelity website. We think those methods are just too complicated, and in the end produce too little ETF differentiation for all the effort they require.
Ned Davis Method:
NDR's ETF research was developed based on one of Ned Davis' main tenets of investment research, which is: "to rely on a tree of indicators rather than hanging on one branch." Because there are several factors that influence security pricing, we developed ETF-related indicators and models to provide an overall assessment of an ETF on both a relative and an absolute basis. Therefore, NDR's ETF research can be used to help identify those ETFs that best suit an investor's needs.
The ETF research is not designed to generate absolute buy and sell recommendations for a particular ETF. The ETF research, which includes relative strength rankings, trend models, and ETF-related indicators, can be used by investors to narrow their ETF selection choices based on criteria that supports their investment philosophy. Whether used in conjunction with other research or as a stand-alone selection tool, a disciplined indicator/model approach in addition to sound risk control and money management can help reduce investor anxiety and enhance selection choices.
Rating: NDR provides a variety of proprietary indicators developed to narrow ETF choices and selection. All indicators are converted to a simple average rating from one to five for easy interpretation, with one being the lowest score (bearish) and five being the highest score (bullish). Factors are designed to evaluate an ETF's relative strength versus other ETFs, its short- to intermediate-term price and breadth trend, its mean reversion potential, and its historic seasonality tendencies. In addition to factor analysis, capitalization and style attributes, as well as asset allocation within the ETF, are determined by the ETF holdings. (here)
Standard & Poor's Method:
The S&P ETF Ranking Methodology, launched in October 2008, provides analytical rigor and independent opinion on the relative ranking of each ETF. S&P utilizes a consistent and transparent methodology, which looks at the securities owned by an ETF, as well as certain features (e.g., expense ratio, volatility, etc.) of the ETF itself. The resulting relative ranking is an objective and independent snapshot comparison of an ETF against the entire equity ETF asset class.
S&P is focusing on the idea that understanding the fundamentals and risks of an ETF's individual holdings is as important as - if not more important than - the relative past performance of the ETF itself. Employing a holdings-based view provides valuable insight beyond solely risk-adjusted historical returns.
To assess the relative attractiveness of an ETF's holdings, S&P utilizes its well-established intellectual property, including S&P STARS, S&P Fair Value and S&P Quality Rankings for stocks, along with S&P Credit Ratings and Risk Assessments.
The S&P ETF Ranking Methodology is applied across three components, encompassing ten inputs (qualitative, quantitative and technical):
S&P STARS: Since January 1, 1987, Standard & Poor's Equity Research Services has ranked a universe of common stocks based on a given stock's potential for future performance. Under proprietary STARS (Stock Appreciation Ranking System), S&P equity analysts rank stocks according to their individual forecast of a stock's future total return potential versus the expected total return of a relevant benchmark, based on a 12-month time horizon.
S&P Fair Value Rank: Quantitative model that calculates a stock's weekly fair value, the price at which S&P believes an issue should trade at current market levels, based on fundamental data such as earnings growth potential, price-to-book value, return on equity and dividend yield relative to that of the S&P 500 index.
S&P Technical Evaluation: In researching the past market history of prices and trading volume for each company, S&P's computer models apply special technical methods and formulas to identify and project price trends for the stock.
S&P Quality Rank: Growth and stability of earnings and dividends are deemed key elements in establishing S&P's Quality Rankings for common stocks.
S&P Qualitative Risk Assessment: This reflects the S&P equity analyst's view of a given company's operational risk, or the risk of a firm's ability to continue as an ongoing concern.
S&P Issuer Credit Rating: An Issuer Credit Rating is a current opinion of an obligor's overall financial capacity (its creditworthiness) to pay its financial obligations. Credit Ratings are issued by S&P Ratings Services, a nationally recognized securities rating organization, that is separate from S&P Equity Research.
Standard Deviation: A historical measure of the variability of an ETF's returns. If an ETF has a high standard deviation, its returns have been relatively volatile; a low standard deviation indicates returns have been less volatile.
Expense Ratio (Gross): The ETF's operating expenses as a percentage of average assets, before management fees, disbursements, or other expenses.
Price to NAV: The relationship between the share price of the ETF and the net asset value per share of the underlying holdings.
Bid/Ask Spread: A measurement of the relative gap between the offer price to buy shares of an ETF, and the price at which another party is willing to sell.
Marco Polo Method:
The goal of the XTF ratings methodology is to present an objective and easy-to-understand framework for investors to evaluate Exchange Traded Funds. We have evaluated every ETF listed on US exchanges and rate all ETFs with a minimum of six-month trading history. The XTF Rating Service uses its own proprietary database as well as several market data vendors to rate ETFs. The proprietary database contains the composition history of each ETF, along with intraday trading and quote data for all US-traded ETFs and their components, and other related information.
Researching and rating ETFs is all we do and reflects our belief in the benefits of index based investing across markets, sectors and asset classes. We are driven by a desire to provide relevant, independent, timely and actionable research about ETFs using our proprietary methodology. We rate ETFs independently of asset class, geography and currency, following a disciplined, rules and fact based research and ratings process which we are seeking to establish as a global industry standard.
Our ETF rating service is comprised of a Structural Integrity rating and an Investment Metric rating, which together make up the overall XTF Rating.
1. Structural Integrity Analysis consists of the following data points (factors):
a) Tracking error is based on at least six months of daily performance data. Each ETF has a stated benchmark that it uses as the basis for its investment strategy. For example, the iShares Russell 2000 Exchange Traded Fund uses the Russell 2000 Index as its benchmark. The goal of an ETF is to seek investment results that correspond generally to the price and yield performance, before fees and expenses, of its stated benchmark. The TE measure is important because it quantifies how well the ETF manager is tracking the benchmark. Since asset classes are often represented by particular benchmarks, TE can play an important role in the asset allocation decision; TE helps you choose which ETF fits most appropriately to your asset allocation strategy. The lower the TE, the better the ETF manager is replicating the stated benchmark; an ETF provider that consistently tracks its benchmark will have a very low TE. TE is computed as the standard deviation of the daily total return difference between the ETF and the corresponding benchmark. TE refers to these daily relative performance measures as the consistency of the daily error. Our TE calculation is based on the last six months of daily errors.
b) Efficiency (daily alpha before expenses) measures how well an ETF outperforms its stated benchmark before expenses. Since many ETFs do not hold every security in their benchmark, the ETF manager may use some security selection criteria to determine what to include in the ETF portfolio and how to handle dividends. Efficiency measures the ETF manager's ability to generate alpha. The main sources of outperformance are: security selection, securities lending, and use of swaps or derivatives to track the index. In the case of swaps or derivatives we refer to this as basket optimization. The higher the efficiency, the better; a high efficiency ranking illustrates the ETF manager's aptitude for implementing the techniques used to generate alpha. Daily alpha is the average value of the daily error for each ETF. To measure performance before expenses the ETF expense ratio is added to daily alpha.
Information Ratio (Efficiency/TE): Efficiency and Tracking Error work hand in hand to evaluate how well an ETF manager is able to consistently outperform the stated benchmark. The Information Ratio is similar in nature to the Sharpe Ratio: whereas the Sharpe Ratio measures risk-adjusted performance, the Information Ratio measures risk-adjusted outperformance. This ratio is relevant to investors because it indicates consistent outperformance of a stated benchmark and low tracking inefficiencies. This is an ideal combination for many long-term investors.
c) Market Impact quantifies the liquidity of each ETF. The MI measures the price impact of executing a hypothetical trade of 50,000 ETF shares. We estimate MI by multiplying daily ETF price volatility by the square root of the ratio of 50,000 shares to the average daily volume. The lower the MI, the better for the investor. Low MI means that price sensitivity to trade size is smaller for the ETF, and therefore its liquidity is higher. It serves as a proxy for trading efficiency: the ability to trade in and out of an ETF without negative performance impact.
d) Concentration Risk measures the level of diversification of the underlying portfolio that comprises an ETF. CR uses the weights of each constituent security within the ETF as the basis for the measure. To compute the CR we use the average constituent weight and add the square root of the variance of constituent weights adjusted for the total number of constituent securities. The lower the weight of each constituent security, and the more securities, the better. The lower the CR, the better for the investor since a low CR means that the ETF is not overly sensitive to the performance of any single security.
e) Tax Efficiency: Capital Gains are simply the cumulative capital gains distribution over the preceding twelve-month period divided by the average ETF price over the same period. Capital gains distributions should be minimized in order to enhance performance as much as possible. Our CG measure penalizes ETF providers that do not manage capital gains efficiently. Fortunately, most ETF providers are proficient at handling capital gains. The lower the CG, the better. A low CG measure means that the ETF provider has minimized the tax inefficiencies that accompany the distribution of capital gains.
f) Expense Ratio is a straightforward annualized measure of an ETF's expenses paid by shareholders. The more expensive an ETF, the less likely the ETF will be able to add value over its stated benchmark. ER allows the investor to judge how effectively an ETF manager handles the operational issues of the underlying securities. Our tracking of historical expenses reveals how well an ETF provider reduces costs over time. If the ER does not come down over time it suggests that the ETF provider is simply increasing margins and failing to pass on the savings to the investor. A high ER directly reduces investor returns; the lower the ER, the better.
g) Bid-Ask Ratio measures the hidden or implicit transaction cost of an ETF. At any given time, the investor will buy at the asking price and sell at the bid price, incurring a loss equal to the difference between the two prices. The Bid-Ask ratio is the asking price less the bid price divided by the mid-price of the ETF. Dividing by the mid-price puts the dollar amount in percentage terms so investors can easily relate the measure to returns. We compute the BA as a simple average of all intraday quotes over a one-month trailing period. Our tracking of historical BA spread reveals the marginal change in popularity. As an ETF attracts assets we expect the BA spread to reduce over time reflecting increased trading volume. The BA component complements our MI measure. The BA is directly related to transaction costs and inversely related to liquidity. The lower the BA, the better.
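Several of the factors above are described precisely enough to sketch in code. The following Python is an illustration of those descriptions only, not XTF's implementation: the function names are hypothetical, the use of the sample standard deviation is an assumption, and so is the way the annual expense ratio is spread over 252 trading days before being added to daily alpha (the description does not say how the scaling is done).

```python
import math
import statistics


def daily_errors(etf_returns, benchmark_returns):
    """Daily total-return difference between the ETF and its benchmark."""
    if len(etf_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    return [e - b for e, b in zip(etf_returns, benchmark_returns)]


def tracking_error(errors):
    """Standard deviation of the daily errors (sample version assumed)."""
    return statistics.stdev(errors)


def efficiency(errors, annual_expense_ratio, trading_days=252):
    """Mean daily error with expenses added back (per-day scaling assumed)."""
    return statistics.mean(errors) + annual_expense_ratio / trading_days


def information_ratio(errors, annual_expense_ratio):
    """Efficiency divided by tracking error."""
    return efficiency(errors, annual_expense_ratio) / tracking_error(errors)


def market_impact(daily_volatility, avg_daily_volume, trade_size=50_000):
    """Volatility times the square root of trade size over average volume."""
    return daily_volatility * math.sqrt(trade_size / avg_daily_volume)


def bid_ask_ratio(quotes):
    """Average of (ask - bid) / mid-price over (bid, ask) quote pairs."""
    return statistics.mean(
        (ask - bid) / ((ask + bid) / 2.0) for bid, ask in quotes
    )


# Hypothetical inputs, for illustration only.
errors = daily_errors([0.010, 0.020, -0.010], [0.009, 0.021, -0.012])
print(tracking_error(errors))
print(market_impact(daily_volatility=0.01, avg_daily_volume=200_000))
print(bid_ask_ratio([(99.0, 101.0)]))
```

Concentration Risk is left out because the description of its adjustment for the number of constituents is too loose to reproduce with confidence.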
Overall Structural Integrity Measure Each ETF receives a raw measure for each of the metrics described above. We then produce a percentile rank for each ETF factor relative to other ETFs in the same asset class. Within each asset class, the investor can easily evaluate how each ETF compares with other ETFs for each metric. Finally we combine the ratings together using a proprietary weighting scheme to derive an aggregate Structural Integrity score. The highest scoring ETF in each asset class has the greatest Structural Integrity relative to all other ETFs in that asset class; the lowest rating has the least Structural Integrity. Our ratings methodology (for unleveraged ETFs) is based on the view of an average long-term investor. That is why our weighting scheme addresses the need to control the trade-off between turnover, performance, and tracking error. We assumed a five-year investment horizon and a 20 percent turnover rate. Then we determined how our structural components affect the risk-reward trade-off for the average investor. This comparative analysis was then used to compute our weighting scheme. Note also that an ETF does not have to produce the highest ranking in every measure to receive the highest overall score.
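The rank-then-combine step can be sketched as follows. This is only an illustration: the actual XTF weighting scheme is proprietary and not disclosed, so the weights, tickers and raw values below are made up, and ties are not handled.

```python
def percentile_ranks(raw, higher_is_better=True):
    """Map {ticker: raw value} to {ticker: 0-100 rank}; the best gets 100."""
    if len(raw) < 2:
        return {ticker: 100.0 for ticker in raw}
    # Order from worst to best so the list position becomes the rank.
    ordered = sorted(raw, key=raw.get, reverse=not higher_is_better)
    step = 100.0 / (len(ordered) - 1)
    return {ticker: position * step for position, ticker in enumerate(ordered)}


def composite_score(ranks_by_factor, weights):
    """Weighted average of factor ranks for every ticker."""
    total_weight = sum(weights.values())
    tickers = next(iter(ranks_by_factor.values())).keys()
    return {
        ticker: sum(weights[f] * ranks_by_factor[f][ticker] for f in weights)
        / total_weight
        for ticker in tickers
    }


# Hypothetical raw factor values for three ETFs in one asset class.
tracking_error = {"AAA": 0.001, "BBB": 0.003, "CCC": 0.002}
expense_ratio = {"AAA": 0.0020, "BBB": 0.0005, "CCC": 0.0010}

ranks = {
    "te": percentile_ranks(tracking_error, higher_is_better=False),
    "er": percentile_ranks(expense_ratio, higher_is_better=False),
}
# Hypothetical weights; the real scheme is not published.
scores = composite_score(ranks, {"te": 3.0, "er": 1.0})
print(scores)
```

Note that in this made-up example AAA scores highest overall even though it ranks last on expense ratio, which matches the point that an ETF need not lead on every measure.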
2. Investment Analysis complements the Structural Integrity analysis to offer investors the most thorough evaluation of ETFs available. Where the Structural Integrity analysis concentrates on the operational capabilities of each ETF, the Investment Metrics analysis rating focuses on performance and the investment fundamentals of each ETF. The Structural Integrity analysis applies the same metrics to evaluate all ETFs independent of asset class. However, the Investment Metrics analysis uses measurements which are tailored specifically for the ETF asset class. For example, the performance and momentum measures apply to all asset classes and are used on a percentile basis to rate all ETFs with respect to other ETFs within the same asset class. However, dividend yield is used only to rate ETFs in the equities and real estate classes. What is most important is that we use a consistent approach in evaluating ETFs within the same asset class, allowing an investor to make accurate assessments of ETFs. The Investment Metrics analysis consists of the following data points:
a) Risk Adjusted Performance (Sharpe Ratio) is computed for four different periods. For each ETF that has been in existence for at least five years, the total annualized return is computed for 6-month, 1-year, 3-year, and 5-year periods. For those ETFs that don't have a five year history, we default to computing total returns over the longest historical period available. We then adjust each total return measure using risk computed from daily price volatilities over the corresponding period. The resulting risk-adjusted measures are then ranked by asset class for each time period using a 0-100 percentile. We present several time periods so investors can determine how well an ETF performed over different business cycle and investment environments. As the ETF market matures, we will be able to add more historical periods to increase our offering of information to investors.
b) Momentum is used as our technical indicator based measure. In this case the higher the momentum the better; rapid price growth momentum is more attractive than slower growth over the same period. To measure the rate of change of momentum we compute the ratio of the one month price momentum (moving average) to the six month price momentum. A momentum based investor will prefer a ratio greater than one.
c) Earnings Yield: for each equity based ETF we compute the weighted earnings yield (EY). In order to include as many ETFs within our rating service as possible we use the underlying benchmark as a proxy whenever sufficient information is not available for the ETF. It is our view that the higher the EY, the better, so our rankings are assigned accordingly. The ETF with the highest EY is given a rank of 100 and the lowest a rank of 0.
d) Dividend Yields (Equities and Real Estate only). As with all of our investment metrics, we use the underlying benchmark as a proxy whenever sufficient information is not available for the ETF. We prefer higher DY to low DY as a relative value indicator. DY is also used by some investors to determine possible income generation capability. The ETF in each asset class with the highest DY is given a rank of 100 and the lowest a rank of 0.
e) Diversification score (Commodity and Currencies only). We consider commodity and currency ETFs to be outside the asset classes traditionally used in a well diversified portfolio. As a result we try to measure the diversification benefits of including these asset classes within such a portfolio. To do this we compute daily correlation of the ETF to an asset allocation benchmark (60% equity S&P 500 and 40% fixed income Lehman US aggregate). The Diversification Score (DS) benefits investors who consider adding commodities or currencies to their portfolio. The lower the DS the better - the ETF within an asset class that has the lowest DS is given a rank of 100 and the ETF with the highest DS score is assigned a rank of 0.
f) Interest Rate Risk (Fixed Income only) measures the sensitivity of the ETF to future changes in interest rates. To do this we compute Interest Rate Risk as the product of the security's effective duration and its yield volatility. When sufficient information for the ETF is unavailable we use the underlying benchmark. Also, it is sometimes necessary to use the price volatility as a substitute for the duration approach. The lower the IR, the better - the fixed income ETF with the lowest IR is given a rank of 100 and the fixed income ETF with the highest IR is assigned a rank of 0.
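The first two investment metrics lend themselves to a short sketch. The Python below is a reading of the descriptions above under stated assumptions, not XTF's code: the function names are hypothetical, the risk-adjusted measure simply divides annualized return by annualized daily volatility (no risk-free rate is mentioned, so none is subtracted), and "one month" and "six month" are taken to mean 21 and 126 trading days.

```python
import math
import statistics


def risk_adjusted_return(prices, trading_days=252):
    """Annualized return divided by annualized daily volatility."""
    returns = [later / earlier - 1.0 for earlier, later in zip(prices, prices[1:])]
    years = len(returns) / trading_days
    annualized_return = (prices[-1] / prices[0]) ** (1.0 / years) - 1.0
    annualized_vol = statistics.stdev(returns) * math.sqrt(trading_days)
    return annualized_return / annualized_vol


def momentum_ratio(prices, short_window=21, long_window=126):
    """Ratio of the short moving average of price to the long one."""
    if len(prices) < long_window:
        raise ValueError("not enough price history for the long window")
    short_ma = statistics.mean(prices[-short_window:])
    long_ma = statistics.mean(prices[-long_window:])
    return short_ma / long_ma


# Hypothetical, steadily rising price series, for illustration only.
rising = [100.0 + day for day in range(126)]
print(momentum_ratio(rising) > 1.0)
print(risk_adjusted_return([100.0, 101.0, 100.5, 102.0]) > 0.0)
```

A ratio above one means the recent average price sits above the six-month average, which is the condition a momentum-based investor is said to prefer.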
Investment Metric Rating The last measure computed within the Investment Metrics framework is an overall Investment Metrics rating. Employing a proprietary weighting scheme within each asset class we produce an aggregate rating for each ETF using the corresponding ranks for each metric. Each time horizon will have a unique investment metric rating due to the changing risk-adjusted performance. While we believe the ranking for each investment metric component to be useful and applicable to any investment decision, investors might find this overall rating most useful in performing preliminary analysis.
Overall Rating The final step in our rating service is to combine the structural and investment rating for each ETF within each asset class. Since we developed all of the rankings by asset class we now aggregate them together using a proprietary weighting scheme that accounts for the relative importance of each metric. Again, we believe the constituent inputs to this overall score to be potentially more useful for making informed investment decisions. However, investors will find the overall rating informative when performing preliminary analyses. In order to properly combine the structural and investment ratings we use only the six month period. View Rating Service Methodology (PDF) to learn more. (here)
Morningstar Method:
Morningstar's ETF Research Reports cover approximately 350 individual funds across all asset types. On an asset basis, the Morningstar ETF Research coverage universe represents over 97% of all assets invested in ETFs. Morningstar analysts review ETFs on an individual basis assessing funds along five major criteria.
Suitability - the appropriate use for the fund and any risks inherent to the structure or holdings.
Fundamental View - A qualitative assessment of the key drivers of an ETF's future performance and what is currently priced into the fund. For example, the view on a broad-index ETF is high level because of the diversification of its underlying index, the view on a sector ETF will discuss some of the largest holdings, and the view on a fixed-income ETF will deal with relative yields and credit spreads.
Portfolio Construction - A plain language description of the fund's structure including the index tracked, the weighting scheme, and the rebalancing mechanism.
Fees - The ETF's expense ratio and how it compares to other offerings.
Alternatives - Other ETFs that an investor might research that may provide lower costs, higher liquidity or differing exposures to match an investor's thesis.
Morningstar provides two ratings on ETFs.
The Morningstar Star Rating assesses the fund's risk-adjusted return performance against its category over a trailing 3-year, 5-year and 10-year period. ETF returns are mapped against the mutual fund space to give investors a clear guide to how the fund has historically measured up against other long-term investment options. Adjustments are made to make sure that funds are properly graded based on the risk-return performance that they deliver. To not do so would underestimate the amount of risk that many high-returning funds take on. Taking on market risks (size, distress, liquidity) can look like alpha is being delivered when really, the fund was just risky and things worked out in its favor.
Morningstar also provides an analyst-driven Price to Fair Value rating for ETFs that hold primarily equities such as broad index funds, sector funds, niche-themed funds (e.g., Clean Energy). These ratings are derived from the aggregation of our equity-analysts' ratings for the underlying securities. Our individual equity ratings are driven from a fundamental review process that emphasizes the quality of a firm's "economic moat" or the business' sustainability of cash flows and the discount rate used in determining the fair value estimate, together with our discounted cash flow projections and an assessment of the risk to the firm's earnings and balance sheet. For an ETF to be rated in this fashion Morningstar must cover at least 66% of the fund's cap-weighted assets.
For the sake of clarification, it should be noted that ETF Star Ratings are quantitatively derived and ETF Price to Fair Value ratings are an aggregation of work done by Morningstar's Equity Research team. As a result, there are instances where ETFs will have ratings without ETF Analyst Research reports and other instances where Morningstar's ETF analysts will provide coverage of a fund without ratings. (here)
Sabrient Systems Method:
Sabrient employs an algorithm for relative scoring of ETFs and to assign one of five ratings categories. The score used for this categorization is called the "Sabrient Outlook Score."
The Outlook Score is the final output of a forecasting model developed by Sabrient in 2007-2008. The model employs a mix of factors including SEC-filing data including earnings, revenues, and cash flows, along with accounting auditing information and other relevant sources. Sabrient maintains a database of the constituent stocks of all covered ETFs. Using a bottom-up approach based on scoring of the constituent stocks, a composite profile and score for each ETF is constructed. The model and resulting scores have been validated with rigorous back-testing and a two-year forward test of the algorithm.
The model stratifies returns over time. So a security with a score of 60 will be more likely to outperform a security with a score of 50 over a 1-3 year period. A security with a score of 70 will tend to outperform a 60, and so on. The forecast ratings are roughly normally distributed with high "kurtosis," or infrequent extreme deviations, as opposed to frequent modestly sized deviations so there are fewer scores at the "tails" of the distribution, i.e., under 20 and over 80. This further makes the top-scoring or bottom-scoring choices stand out. The model employs factors related to current return ratios, projected valuation, growth prospects, analyst consensus sentiment, and earnings quality. It can be useful for sector rotation, enhanced ETF, or sector-specific long/short portfolio strategies.
Once the ETF's composite Outlook Score is computed, a rating is assigned. Scores of 0-19, 20-39, 40-59, 60-79, 80-100 are assigned ratings of Least Attractive, Less Attractive, Neutral, Attractive, and Most Attractive, respectively.
Lastly, the ratings algorithm employs a "hysteresis" or "stickiness" approach whereby an ETF's Outlook Score is allowed to fluctuate somewhat at the ratings thresholds without necessarily receiving a ratings downgrade. This helps to avoid overly-frequent (or "flip-flopping") ratings changes. Therefore, for example, an ETF with a score of 60 would qualify for a rating of Attractive, but if the score later dropped to 57, the algorithm will allow it to stay rated as Attractive rather than immediately downgrading it to Neutral. If the score stays above the hysteresis threshold but below the rating threshold for five consecutive weeks, it would be downgraded on the fifth week.
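The score-to-rating mapping and the "stickiness" rule can be illustrated with a short Python sketch. It is not Sabrient's algorithm: the width of the hysteresis band is not given in the description, so the 5-point band below is a hypothetical value, and treating upgrades as immediate is likewise an assumption.

```python
RATINGS = [
    "Least Attractive",
    "Less Attractive",
    "Neutral",
    "Attractive",
    "Most Attractive",
]
LOWER_BOUNDS = [0, 20, 40, 60, 80]  # lowest score of each rating band


def raw_level(score):
    """Rating index (0-4) implied by the score alone."""
    return min(int(score) // 20, 4)


def rate_weekly(scores, band=5, max_weeks=5):
    """Assign a rating per weekly score, delaying downgrades near a threshold."""
    level = None
    weeks_in_band = 0
    ratings = []
    for score in scores:
        raw = raw_level(score)
        if level is None or raw >= level:
            # First observation, an upgrade, or a score back inside the band.
            level, weeks_in_band = raw, 0
        elif raw == level - 1 and score >= LOWER_BOUNDS[level] - band:
            # Just below the threshold: hold the rating for a limited time.
            weeks_in_band += 1
            if weeks_in_band >= max_weeks:
                level, weeks_in_band = raw, 0
        else:
            # Below the hysteresis threshold: downgrade immediately.
            level, weeks_in_band = raw, 0
        ratings.append(RATINGS[level])
    return ratings


# Hypothetical weekly scores: one week at 60, then five weeks in the high 50s.
print(rate_weekly([60, 57, 58, 57, 59, 57]))
```

In this made-up series the ETF keeps its Attractive rating for four weeks below 60 and is downgraded to Neutral on the fifth consecutive week.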
Disclosure: QVM has positions in SPY, MDY, IWM, VYM, VIG, and XLU as of the creation date of this article (December 21, 2012). We certify that except as cited herein, this is our work product. We received no compensation or other inducement from any party to produce this article, but are compensated retroactively by Seeking Alpha based on readership of this specific article.
General Disclaimer: This article provides opinions and information, but does not contain recommendations or personal investment advice to any specific person for any particular purpose. Do your own research or obtain suitable personal advice. You are responsible for your own investment decisions. This article is presented subject to our full disclaimer found on the QVM site available here.