What Happens When You Hold Leveraged ETFs for More than One Day?

About: ProShares UltraShort (SDS)

I had not previously considered a math textbook to be a dangerous instrument, but it's amazing how much confusion can result from a superficial reading of a chapter on compounding. We've seen scary stories about how investors bought leveraged short ETFs just before the market collapsed, expected to rejoice in their profits, but found instead that the short ETFs actually went down in a bear market! Similarly, leveraged bull ETFs can fall when stock prices rise.

I’ve addressed this topic several times in the past. Today, I want to offer something new: specific data that presents a better-rounded picture of what can happen if you hold a leveraged ETF for more than a day.

The multi-day compounding issues are not a flaw with the ETFs, which are actually quite effective in doing what they are supposed to do (see, for example, my prior article studying tracking error). Instead, the controversy is caused by investors and/or commentators who haven’t really taken the trouble to understand how these things work. The good news, however, is that the leveraged ETFs can deliver quite nicely for those who use them correctly.

The not-so-fine print

Each day, a ProShares Ultra or Rydex 2X ETF aims to deliver twice the performance of the benchmark it tracks. A Direxion 3X Bull ETF aims to deliver triple the daily performance of its benchmark. ProShares Ultra Short and Rydex Inverse 2X ETFs aim to deliver double the inverse of the daily benchmark performance. And the Direxion 3X Bear ETFs seek to achieve triple the inverse of the daily performance of their respective benchmarks.

Most observers have a good understanding of double, triple and, where applicable, inverse. The problem has been the failure of many to grasp the ramifications of "daily."

Table 1 offers a quick example of the issue. This hypothetical scenario traces a 3X bull ETF that is perfect at meeting its daily targets (zero tracking error).

Table 1

The example is set up to have the ETF deliver exactly what it is supposed to deliver each day, three times the movement of the benchmark.

But look what happened over the full five-day period. The benchmark is up marginally, 0.07%. If we triple that, we get 0.21%. So, the triple-bull ETF should be up 0.21%. Right?


These leveraged ETFs are designed to deliver the doubling or tripling on a one-day basis. Each day is a separate situation completely distinct from any other day. Whether or not the target doubling or tripling will occur over the course of a multi-day holding period is "path dependent." As the benchmark moves along zigging and zagging day to day, the daily gains and losses are computed with reference to different starting points and the overall multi-day return will depend on the nature of the path from start to finish.
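To make path dependence concrete, here is a minimal sketch. The five daily returns are hypothetical numbers of my own choosing (they are not the figures from Table 1), and the fund is assumed to hit exactly three times the benchmark's move every day, with zero tracking error and no fees.

```python
def compound(daily_returns):
    """Compound a list of daily returns (decimals) into one total return."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0


def leveraged_days(daily_returns, leverage):
    """Daily returns of an idealized fund that delivers leverage x benchmark each day."""
    return [leverage * r for r in daily_returns]


# Hypothetical benchmark path: five choppy days that end almost flat.
benchmark_days = [0.05, -0.05, 0.04, -0.04, 0.005]

benchmark_total = compound(benchmark_days)
etf_total = compound(leveraged_days(benchmark_days, 3))
naive_target = 3 * benchmark_total  # what a "triple the five-day move" reader expects

print(f"benchmark over 5 days:      {benchmark_total:+.2%}")
print(f"naive 3X target:            {naive_target:+.2%}")
print(f"idealized 3X ETF, 5 days:   {etf_total:+.2%}")
```

On this made-up path the benchmark finishes slightly higher while the idealized 3X fund finishes lower, even though it met its daily target every single day.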

Implication for traders and investors

The foregoing doesn't mean that everybody who holds leveraged ETFs for more than a day is going to experience the sort of oddities that have been getting lots of press. Remember the journalist's adage: "If dog bites man, it's not news. If man bites dog, that's news!" A short ETF that goes down when stocks fall is like man biting dog. It's news. It attracts lots of readers. So don't be surprised that this sort of thing gets plenty of coverage.

In reality, the extent of deviation from the desired (but never promised) multi-day leverage depends on several factors:

  • The length of the holding period
    • The longer one holds, the greater the number of opportunities for counter-trend zigs and zags to cause deviations.
  • The persistence of the move
    • The greater the number of on-trend moves, the less opportunity there is for deviation.
  • The strength of the overall trend
    • The stronger the trend, the more probable it is that it will be able to overwhelm the cumulative impact of the counter-trend days.
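The factors above can be illustrated with a small sketch. It assumes an idealized inverse 2X fund (leverage of -2 applied perfectly each day) and uses two hypothetical benchmark paths of my own invention: one with ten on-trend days in a row, and one that zigzags.

```python
def compound(daily_returns):
    """Compound a list of daily returns (decimals) into one total return."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0


def deviation_from_target(daily_returns, leverage):
    """Fund's holding-period return minus leverage x the benchmark's holding-period return."""
    fund_total = compound([leverage * r for r in daily_returns])
    target = leverage * compound(daily_returns)
    return fund_total - target


LEVERAGE = -2.0  # idealized inverse 2X fund (an assumption, no fees or tracking error)

persistent = [0.01] * 10        # ten on-trend days, no reversals
choppy_short = [0.03, -0.02]    # one zig and one zag
choppy_long = [0.03, -0.02] * 5 # the same zigzag repeated over ten days

dev_persistent = deviation_from_target(persistent, LEVERAGE)
dev_choppy_short = deviation_from_target(choppy_short, LEVERAGE)
dev_choppy_long = deviation_from_target(choppy_long, LEVERAGE)

print(f"persistent trend, 10 days: {dev_persistent:+.2%}")
print(f"zigzag, 2 days:            {dev_choppy_short:+.2%}")
print(f"zigzag, 10 days:           {dev_choppy_long:+.2%}")
```

In this toy setup the zigzag paths leave the fund short of the naive target, and the shortfall grows as the zigzag runs longer, while the uninterrupted trend leaves the fund ahead of it.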

It would seem that this sort of thing might lend itself to quantitative modeling. To my knowledge, nobody has done it yet. The challenge would lie in how to express the notion of persistency, which seems to have a random-like quality (a Markov process?). I'm not sure if we can cast the problem in a way that makes it suitable for Brownian motion concepts. Maybe we can use Monte Carlo simulation. Perhaps we'd have to do something else. But even without going that far, I was able to cull some observations that can start providing a sense of the possibilities that might result from different holding periods.
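For readers curious about what a Monte Carlo approach might look like, here is one rough sketch. It assumes independent, normally distributed daily benchmark returns with zero drift and a hypothetical 1.5% daily volatility, so it deliberately ignores the persistence question raised above. The function name and parameters are mine, chosen for illustration only.

```python
import random


def simulate_mean_abs_deviation(days, leverage=-2.0, daily_vol=0.015,
                                trials=2000, seed=0):
    """Average absolute gap between an idealized leveraged fund's holding-period
    return and leverage x the benchmark's holding-period return."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    total_gap = 0.0
    for _ in range(trials):
        bench = 1.0
        fund = 1.0
        for _ in range(days):
            r = rng.gauss(0.0, daily_vol)  # assumed i.i.d. normal daily return
            bench *= 1.0 + r
            fund *= 1.0 + leverage * r
        target = leverage * (bench - 1.0)
        total_gap += abs((fund - 1.0) - target)
    return total_gap / trials


results = {d: simulate_mean_abs_deviation(d) for d in (5, 10, 20, 40, 60)}
for d, gap in results.items():
    print(f"{d:>2} days: mean |deviation| = {gap:.2%}")
```

A simulation like this says nothing about real markets unless the return assumptions are realistic, which is exactly the hard part noted above.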

I examined the ProShares Ultra Short S&P 500 ETF (NYSEARCA:SDS) from its 7/13/06 launch through 2/23/09. During that interval, I identified 3,145 holding periods (overlapping) consisting of 5 days, 10 days, 20 days, 40 days and 60 days. For each holding period, I computed the absolute overall return of SDS, the absolute value of the desired target (twice the absolute value of the S&P 500 return), and, of course, the deviation (the difference between the "desired target" and the holding period return).
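The calculation just described can be sketched roughly as follows. The price series here are hypothetical (built from made-up daily returns, not actual SDS or S&P 500 data), and measuring the deviation as the absolute gap between the two absolute values is my reading of the description rather than a confirmed formula.

```python
from statistics import mean


def prices_from_returns(daily_returns, start=100.0):
    """Turn a list of daily returns into a closing-price series."""
    prices = [start]
    for r in daily_returns:
        prices.append(prices[-1] * (1.0 + r))
    return prices


def holding_period_stats(fund_prices, index_prices, days, leverage=2.0):
    """For every overlapping window of `days` trading days, return a tuple of
    (absolute fund return, absolute desired target, deviation)."""
    rows = []
    for start in range(len(fund_prices) - days):
        end = start + days
        fund_ret = fund_prices[end] / fund_prices[start] - 1.0
        index_ret = index_prices[end] / index_prices[start] - 1.0
        abs_return = abs(fund_ret)
        abs_target = leverage * abs(index_ret)
        rows.append((abs_return, abs_target, abs(abs_target - abs_return)))
    return rows


# Hypothetical data: seven daily index returns and an idealized inverse 2X fund.
index_returns = [0.01, -0.02, 0.015, -0.005, 0.02, -0.01, 0.005]
index_prices = prices_from_returns(index_returns)
fund_prices = prices_from_returns([-2.0 * r for r in index_returns])

rows = holding_period_stats(fund_prices, index_prices, days=5)
avg_deviation = mean(row[2] for row in rows)
print(f"{len(rows)} overlapping 5-day periods, average deviation {avg_deviation:.2%}")
```

Running the same function with days set to 10, 20, 40 and 60 on real price histories would produce the kind of averages summarized in Table 2.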

Table 2 shows the averages. Standard deviations are in parentheses.

Table 2

Holding period    Overall holding-period return    Deviation from desired 2X target
5 days            4.22% (4.96%)                    0.68% (1.09%)
10 days           5.45% (6.30%)                    0.99% (1.70%)
20 days           8.10% (8.24%)                    1.75% (2.84%)
40 days           11.87% (12.38%)                  3.20% (4.94%)
60 days           15.44% (15.00%)                  4.35% (6.59%)

We see here that in all cases, the average error is way more than zero and much greater than anything we've seen when we looked at daily tracking error. But how many targets regularly used by equity investors are better? Consider the sorts of things routinely done by those who like to project equity returns: discounted cash flow valuations, dividend discount models, Wall Street target prices, the capital asset pricing model, arbitrage pricing theory, and so forth. How many of these work better in the real world than the sort of targeting we see in Table 2? Do any of them work better? Do any of them even come close? Can we even get to the point of comparing? (If you're using the capital asset pricing model, what will you plug in for the expected return of the market?)

While leveraged-ETF targeting does tend to miss when holding periods stretch beyond a day, it's not clear that the deviations are any worse than what we encounter in the stock market all the time. Indeed, the deviations, even out one standard deviation from the mean, seem manageable for holding periods up to 20 days (i.e., four-week rebalancing periods), which tend to be among the longer periods used by me and many others who create and backtest strategies on Portfolio123.com, which just recently added ETFs.

Ultimately, though, when it comes to evaluating reasonableness, the proof is in the backtesting results, and I'll address that below. But first, what about all those headline-grabbing man-bites-dog stories of leveraged ETFs that went in the wrong direction? In the case of this study involving an ultra short ETF, "wrong direction" means the ETF rose when the market rose, or fell when the market fell. Table 3 shows how many wrong-direction instances there were among the SDS holding periods I examined.

Table 3

Holding period    Number of observations    How many wrong direction moves?
5 days                                      25 (3.8% of total)
10 days                                     32 (5.0% of total)
20 days                                     36 (5.6% of total)
40 days                                     49 (7.9% of total)
60 days                                     40 (6.7% of total)
Total             3,145                     182 (5.8% of total)

So yes, wrong-direction moves really do happen, even with holding periods as brief as five days. It's hardly business as usual. But how often is too often? Again, that's a question that can best be answered with the Portfolio123 backtester.
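The wrong-direction test is simple to express in code. The sketch below uses a hypothetical four-day zigzag and an idealized inverse 2X fund (not real SDS prices), and counts a window as "wrong direction" when the fund and the index both rose or both fell over the holding period.

```python
def prices_from_returns(daily_returns, start=100.0):
    """Turn a list of daily returns into a closing-price series."""
    prices = [start]
    for r in daily_returns:
        prices.append(prices[-1] * (1.0 + r))
    return prices


def count_wrong_direction(fund_prices, index_prices, days):
    """Return (wrong_direction_count, total_windows) for an inverse fund,
    using every overlapping holding period of `days` trading days."""
    wrong = 0
    total = 0
    for start in range(len(index_prices) - days):
        end = start + days
        fund_ret = fund_prices[end] / fund_prices[start] - 1.0
        index_ret = index_prices[end] / index_prices[start] - 1.0
        total += 1
        # Same sign means the inverse fund moved with the market, not against it.
        if fund_ret * index_ret > 0:
            wrong += 1
    return wrong, total


# Hypothetical zigzag: the index ends slightly down after four volatile days.
index_returns = [0.05, -0.05, 0.04, -0.04]
index_prices = prices_from_returns(index_returns)
fund_prices = prices_from_returns([-2.0 * r for r in index_returns])

one_day = count_wrong_direction(fund_prices, index_prices, days=1)
four_day = count_wrong_direction(fund_prices, index_prices, days=4)
print("1-day windows (wrong, total):", one_day)
print("4-day window  (wrong, total):", four_day)
```

Every one-day window in this example moves the right way, yet the single four-day window does not: both the index and the inverse fund finish lower.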

Again, however, I must stress that these deviations do not mean the ETFs are doing anything wrong. They are designed to double or triple the benchmark only over the course of a single day, and they explain this so prominently and clearly that the only ones who can fail to see it are those who choose to avoid looking. The deviations examined here relate to longer-hold-period targets imposed on ETFs by their critics.

A sample long-short leveraged ETF strategy

I used a simple market-timing model to go into non-sector-specific ProShares U.S. equity Ultra ETFs when conditions are bullish, and Ultra Short ETFs when conditions are bearish. For conditions to be bullish, the risk premium on the S&P 500 must be at least 1%, and the 5-day simple moving average of the current-year consensus S&P 500 EPS estimate must be above the 21-day average. Then, I select the top three ETFs based on 5-day exponential moving average price minus the 20-day average. I backtested the strategy assuming 0.5% price slippage and four-week rebalancing intervals. I ran the test from 7/15/06, just after we had a full suite of leveraged ETFs available for trading, through the present.
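For those who want to see the mechanics, the rules above might be sketched along these lines. This is not the Portfolio123 implementation: the function names, the tickers, the sample data, and the choice to treat the "20-day average" as an exponential average are all assumptions of mine, and the risk premium is simply passed in as a number because its calculation isn't spelled out here.

```python
def sma(values, window):
    """Simple moving average of the most recent `window` values."""
    return sum(values[-window:]) / window


def ema(values, window):
    """Exponential moving average, seeded with the first value in the series."""
    alpha = 2.0 / (window + 1)
    avg = values[0]
    for v in values[1:]:
        avg = alpha * v + (1.0 - alpha) * avg
    return avg


def is_bullish(risk_premium, eps_estimates):
    """Bullish if the risk premium is at least 1% and the 5-day average of the
    EPS estimate is above its 21-day average."""
    return risk_premium >= 0.01 and sma(eps_estimates, 5) > sma(eps_estimates, 21)


def top_by_momentum(price_history, n=3):
    """Rank tickers by 5-day EMA minus 20-day EMA of price and keep the top n."""
    scores = {t: ema(p, 5) - ema(p, 20) for t, p in price_history.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical inputs for illustration only.
rising_eps = [60.0 + 0.1 * i for i in range(21)]
falling_eps = [62.0 - 0.1 * i for i in range(21)]
price_history = {
    "AAA": [100.0 + 1.0 * i for i in range(30)],
    "BBB": [100.0 + 0.5 * i for i in range(30)],
    "CCC": [100.0 - 0.5 * i for i in range(30)],
    "DDD": [100.0 + 0.2 * i for i in range(30)],
}

print("bullish?", is_bullish(0.02, rising_eps))
print("top three:", top_by_momentum(price_history))
```

In a full strategy, a bullish reading would point the ranking at the Ultra ETFs and a bearish reading at the Ultra Short ETFs, with the picks refreshed at each rebalancing date.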

Figure 1 shows the result.

Figure 1

Obviously, we had some incredible choppiness late in 2008, along with just about everything else related to the financial markets. But on the whole, the strategy performed impressively. And the late-2008 drama had nothing at all to do with the leveraged ETF holding periods. It was caused entirely by market-timing calls that turned out to have been incorrect. While this backtest does not quantify the relationship between daily target leverage and leverage achieved over longer holding periods, it does demonstrate that we can come close enough to work with 4-week leveraged ETF holding periods.

Notwithstanding the concern I expressed above about holding periods that stretch beyond 20 days, and notwithstanding my fear of allowing any market-timing call to stand pat for too long, I decided to do a quick backtest to see how the screen would have performed with a three-month rebalancing. The result, shown in Figure 2, surprised me.

Figure 2

Apparently, that particular period was not so plagued by counter-trend days as to throw off the leveraged performance. For better or worse, the results depended mainly on the accuracy of the timing.

Figure 3 shows what happens when I advance the dates by three weeks.

Figure 3

Figure 4 shows what happens when we keep the start and rebalance dates as they were for Figure 3, but switch to a four-week holding period.

Figure 4

Based on Figures 3 and 4, it seems likely that the strong three-month-rebalancing backtest result depicted in Figure 2 owed much to luck, rather than any inherent quality of the strategy.

It's way too early to make any final pronouncements on the topic of leveraged-ETF holding periods, but for now, it does seem quite reasonable to develop leveraged-ETF protocols based on assumed 4-week holding periods. One could not expect the sort of on-target doubling or tripling that more or less characterizes daily moves, but the moves we get from these ETFs are of the sort that would allow us to create thoughtful strategies (assuming, of course, one's risk-reward tolerances are compatible with the larger rewards and punishments that go with successful or unsuccessful timing decisions).


I’m really not attempting to address the critics of leveraged ETFs, who have already shown themselves to be impervious to three things: the fact that the ETFs were never designed to do what the critics fault them for failing to do; the evidence that reasonable strategies can still be created with these ETFs even for periods longer than a single day; and the huge success these ETFs have garnered in the marketplace (based on trading volume), something that would not likely be occurring if the products weren’t accomplishing something for their users.

Nor am I attempting to preach to the choir. As noted, these products are already quite successful, meaning many investors don’t need me to tell them what they have already found out for themselves.

I am attempting to address the large number of investors who are still trying to make sense of these relatively new offerings. It is not my place to suggest to anyone how much volatility they should be willing to assume, but I do hope to present objective evidence as to how these ETFs actually work, in order to help them cut through the clutter and decide what’s right for them based on demonstrable facts.