
## Abstract

In this article, the authors consider two forms of volatility weighting—own volatility and underlying volatility—applied to cross-sectional and time-series momentum strategies. They present some simple theoretical results for the Sharpe ratios of weighted strategies and show empirical results for momentum strategies applied to U.S. industry portfolios. The authors find that both the timing effect and the stabilizing effect of volatility weighting are relevant. They also introduce a dispersion weighting scheme that treats cross-sectional dispersion as (partially) forecastable volatility. Although dispersion weighting improves the Sharpe ratio, it seems to be less effective than volatility weighting.

Volatility weighting is a form of risk management for investment strategies that is commonly applied in practice: When volatility is high (low), the positions are scaled down (up). In this article, we consider two main forms of volatility weighting. The first is weighting an investment strategy by its own volatility. The second is weighting each of the strategy’s underlying assets with their volatilities, which is equivalent to using *normalized returns*. We provide some theory for the efficacy of both forms of volatility weighting, focusing on the effect on the Sharpe ratio of a strategy. In addition, we consider both versions empirically for time-series and cross-sectional momentum strategies based on U.S. industry portfolios. We extend the concept of signed or directional momentum strategies from the time-series domain to the cross-sectional domain, and we introduce a novel form of volatility weighting, *dispersion weighting*, which treats cross-sectional dispersion as a volatility.

Some theory and empirical work on volatility weighting was done by Hallerbach [2012, 2014] for weighting a strategy by its own volatility as well as for using normalized returns, and we build on this work. Barroso and Santa-Clara [2015] found that weighting cross-sectional equity momentum with its own volatility is very effective for improving its risk-adjusted performance: They found that implementing volatility weighting doubles the Sharpe ratio. Moskowitz, Ooi, and Pedersen [2012] studied time-series momentum in futures markets and used normalized returns, but they did not study the effect of doing so. Clare et al. [2016] also considered the use of normalized returns for equally weighted portfolios and momentum strategies in the context of asset allocation. They found volatility weighting to be useful across (but not within) asset classes for the equal-weighted portfolio and that using normalized returns is beneficial for trend following.

We distinguish between two effects that seem to contribute to the efficacy of volatility weighting: volatility stabilizing (smoothing) and volatility timing. The former exploits a convexity effect when variances are either time-varying or random: The less variation in variances, the lower the aggregate volatility will be. The latter is important when the relationship between returns and volatility is negative. These effects are hard to disentangle, but we find that both are important. Our empirical results confirm that weighting a strategy with its own volatility as well as using normalized returns adds value: The Sharpe ratio increases, and the return kurtosis and downside risk decrease. Weighting a strategy with its own volatility seems to work, at least when the relationship with volatility is negative, and using normalized returns is almost always effective. Dispersion weighting, however, seems to be less effective, although it still improves the Sharpe ratio.

This article is organized as follows. We first present a theoretical framework in the next section and discuss the data and strategies considered in the third section. The fourth section details the empirical results, and the final section concludes. Technical details are contained in the Appendix.

**THEORETICAL FRAMEWORK**

In this section, we first define two types of momentum strategies, and then we develop some theory for volatility weighting.

**Signed Strategies**

A *signed* momentum strategy is a directional strategy: Depending on the sign of past returns or deviations from the cross-sectional average return, a unit long or short position is taken. Consider a market of N assets with period-t returns r_{i,t} (i = 1 … N) and a strategy that invests +1 at t if r_{i,t-1} is positive and −1 if it is negative (and 0 otherwise). This is a simple time-series strategy, much like that defined by Moskowitz, Ooi, and Pedersen [2012]. It is not hard to define a cross-sectional version of this by considering instead deviations d_{i,t} = r_{i,t} − \bar{r}_{t} from the assets’ cross-sectional average return \bar{r}_{t} = (1/N) Σ_{i} r_{i,t}.^{1} We then go long the asset if it has a higher than average return and simultaneously short the (equal-weighted) market; we do the opposite if it has a lower than average return.^{2} The bet is that an asset with a higher than average return will continue to have a higher return in the next period. The only difference between the time-series and cross-sectional strategy lies in whether prediction is possible from an asset’s own past returns or from the deviation of the return from the cross-sectional average return.
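The two signed rules can be sketched in a few lines. This is a minimal illustration, assuming unit positions (+1, −1, or 0) and an equal-weighted cross-sectional average; the function names and the sample returns are hypothetical and not taken from the article.

```python
def sign(x):
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def time_series_positions(prev_returns):
    """Signed time-series rule: position follows the sign of the asset's own past return."""
    return [sign(r) for r in prev_returns]


def cross_sectional_positions(prev_returns):
    """Signed cross-sectional rule: position follows the sign of the deviation
    from the equal-weighted cross-sectional average return."""
    average = sum(prev_returns) / len(prev_returns)
    return [sign(r - average) for r in prev_returns]


prev = [0.04, 0.01, 0.02]                  # hypothetical period t-1 returns
ts_pos = time_series_positions(prev)       # [1, 1, 1]
cs_pos = cross_sectional_positions(prev)   # [1, -1, -1]
```

All three assets rose, so the time-series rule is long everywhere, while the cross-sectional rule is long only the asset that beat the average.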

For simplicity, we focused on the case of investment in a single asset. The directional momentum setup is a specific case of the market timing strategies as examined by Hallerbach [2014], so we can use his results. Specifically, it can be shown that for a time-series strategy, the returns and their expectation and variance can be written as (see the Appendix for a derivation):

$$
\operatorname{sign}(r_{t-1})\,r_{t} = S_{t}\,|r_{t}|, \qquad
\mathbb{E}\big[S_{t}|r_{t}|\big] = (p - q)\,\mathbb{E}|r_{t}|, \qquad
\operatorname{Var}\big(S_{t}|r_{t}|\big) = \mathbb{E}[r_{t}^{2}] - (p - q)^{2}\,\big(\mathbb{E}|r_{t}|\big)^{2} \tag{1}
$$

where S_{t} = sign(r_{t-1}r_{t}) is assumed to be independent of |r_{t}|, and where p and q are the success and failure ratios (viz. the probabilities of a correct and incorrect prediction, respectively). For the cross-sectional strategy, one can substitute the deviation from the cross-sectional average return, d_{t}, for r_{t} in the preceding display.
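As a quick check of the moments of a signed strategy, the sketch below enumerates a small discrete example exactly. It assumes the strategy return is S_{t}|r_{t}| with S_{t} independent of |r_{t}|, in which case the mean is (p − q)E|r_{t}| and the variance is E[r_{t}^{2}] − (p − q)^{2}(E|r_{t}|)^{2}. The probabilities and return magnitudes are hypothetical.

```python
from itertools import product

p, q = 0.6, 0.4                         # hypothetical success and failure ratios
abs_returns = {0.01: 0.5, 0.03: 0.5}    # hypothetical distribution of |r_t|
signs = {+1: p, -1: q}                  # S_t = +1 (correct call) or -1 (incorrect call)

# Enumerate the joint distribution of S_t and |r_t| under independence.
outcomes = [(s * m, ps * pm)
            for (s, ps), (m, pm) in product(signs.items(), abs_returns.items())]
mean = sum(x * w for x, w in outcomes)
var = sum(x * x * w for x, w in outcomes) - mean ** 2

# Closed-form moments for comparison.
e_abs = sum(m * pm for m, pm in abs_returns.items())
e_sq = sum(m * m * pm for m, pm in abs_returns.items())
mean_formula = (p - q) * e_abs
var_formula = e_sq - (p - q) ** 2 * e_abs ** 2
```

The enumerated and closed-form values agree to floating-point precision, which illustrates that only the edge p − q and the moments of |r_{t}| matter in this setup.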

**Weighting with Own Volatility**

One form of volatility weighting is scaling an entire strategy by an estimate of its future volatility. This corresponds to the empirical work done for stock momentum by Barroso and Santa-Clara [2015] (and, in a more general context, by Zakamulin [2014]). Here we considered theoretically the effect of volatility weighting on the Sharpe ratio. We assumed throughout that processes are adapted to some filtration (*F*_{t})_{t}. Consider a strategy with the following return decomposition^{3}:

$$
r_{t} = a + g\,s_{t} + e_{t}\,s_{t} \tag{2}
$$

where a and g are constants, s_{t} is a positive predictable process representing the conditional volatility of the strategy returns at time t, and where the variate e_{t} is assumed to have zero mean and unit variance conditioned on *F*_{t–1}. The return process has a predictable portion of which g s_{t} is related to the predictable volatility and where a is the constant part of the (mean) return, which does not depend on volatility. The unpredictable portion of the process is e_{t}s_{t}, of which the volatility is dictated by s_{t}.

The volatility-weighted strategy is then of the form:

$$
\frac{r_{t}}{s_{t}} = \frac{a}{s_{t}} + g + e_{t} \tag{3}
$$

and the Sharpe ratios of the unweighted and weighted strategies are easily seen to be (with CV denoting the coefficient of variation and assuming all the expectations exist):

$$
\mathrm{SR}_{\text{unweighted}} = \frac{a + g\,\mathbb{E}[s_{t}]}{\mathbb{E}[s_{t}]\sqrt{1 + (1 + g^{2})\,\mathrm{CV}(s_{t})^{2}}}, \qquad
\mathrm{SR}_{\text{weighted}} = \frac{a + g/\mathbb{E}[1/s_{t}]}{\sqrt{\mathbb{E}[1/s_{t}]^{-2} + a^{2}\,\mathrm{CV}(1/s_{t})^{2}}} \tag{4}
$$

See the Appendix for a derivation of the preceding ratios. The numerators have been scaled to be comparable, and so we can consider the effect of volatility weighting on the numerator and the denominator. By Jensen’s inequality, we have that

$$
\frac{1}{\mathbb{E}[1/s_{t}]} \le \mathbb{E}[s_{t}]
$$

where equality holds when s_{t} is deterministic, for example, s_{t} = s, for all t. Hence, the numerator increases for g < 0 and decreases for g > 0. For the denominator, we find a range of a over which the denominator falls and the expected return of the strategy is positive. See the Appendix for this range and its derivation.
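One way to see where the upper end of such a range comes from is to compare the two Sharpe-ratio denominators directly. The following is a sketch under the decomposition r_{t} = a + g s_{t} + e_{t}s_{t} (with e_{t} of zero conditional mean and unit conditional variance, and s_{t} not deterministic); it is not necessarily the Appendix derivation, and U is a label introduced here for illustration only.

```latex
\begin{aligned}
\operatorname{Var}(r_{t}) &= \mathbb{E}[s_{t}^{2}] + g^{2}\operatorname{Var}(s_{t}), \\
\frac{\operatorname{Var}(r_{t}/s_{t})}{\mathbb{E}[1/s_{t}]^{2}}
  &= \frac{1 + a^{2}\operatorname{Var}(1/s_{t})}{\mathbb{E}[1/s_{t}]^{2}}
   = \mathbb{E}[1/s_{t}]^{-2} + a^{2}\,\mathrm{CV}(1/s_{t})^{2}, \\
\text{weighted} < \text{unweighted}
  \iff a^{2} &< U^{2} :=
  \frac{\mathbb{E}[s_{t}^{2}] + g^{2}\operatorname{Var}(s_{t}) - \mathbb{E}[1/s_{t}]^{-2}}
       {\mathrm{CV}(1/s_{t})^{2}}.
\end{aligned}
```

Because E[s_{t}^{2}] ≥ (E[s_{t}])^{2} ≥ E[1/s_{t}]^{-2} by Jensen’s inequality, U^{2} is non-negative. Replacing s_{t} by c·s_{t} multiplies U by c, and U is increasing in |g|.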

We have then for g < 0 a sufficient condition for the Sharpe ratio to improve (i.e., a must be in the relevant range), and for g > 0 we only have a necessary condition. This makes intuitive sense: If volatility is negatively related to returns, then we can profit from a volatility timing effect by investing less when volatility is high and more when volatility is low. Note, however, that this intuition only goes so far: If the part of returns not related to volatility is too large (viz. when the average return is too large), then volatility weighting is not beneficial. The volatility of a/s_{t} is amplified, introducing a new source of volatility in place of the old. It is also the case that volatility weighting may even be beneficial if returns are positively related to volatility. The upper part of these ranges is the most interesting (the bottom part guarantees a positive expected return). We see that the range increases when multiplying volatility by some c > 1, so a larger volatility in this sense is good for volatility weighting (i.e., volatility stabilizing is at work). The range is also increasing in |g|, so a larger dependence on volatility is good for volatility weighting.
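The interplay between the timing and stabilizing effects can be illustrated with an exact two-state example. The sketch below assumes the decomposition r_{t} = a + g s_{t} + e_{t}s_{t} with e_{t} = ±1 equally likely; all parameter values are hypothetical, and the closed forms are those implied by that decomposition rather than quoted from the Appendix.

```python
import math

a, g = 0.004, -0.05                 # hypothetical constants in r_t = a + g*s_t + e_t*s_t
vols = {0.02: 0.5, 0.06: 0.5}       # hypothetical two-state distribution of s_t
shocks = {+1.0: 0.5, -1.0: 0.5}     # e_t with zero mean and unit variance


def sharpe(weighted):
    """Exact Sharpe ratio obtained by enumerating every (s_t, e_t) state."""
    mean = second = 0.0
    for s, ps in vols.items():
        for e, pe in shocks.items():
            r = a + g * s + e * s
            if weighted:
                r /= s              # volatility-weighted return r_t / s_t
            mean += ps * pe * r
            second += ps * pe * r * r
    return mean / math.sqrt(second - mean ** 2)


def expect(f):
    """Expectation of f(s_t) over the volatility states."""
    return sum(p * f(s) for s, p in vols.items())


m1, m2 = expect(lambda s: s), expect(lambda s: s * s)
i1, i2 = expect(lambda s: 1 / s), expect(lambda s: 1 / s ** 2)
var_s = m2 - m1 ** 2
cv_inv_sq = (i2 - i1 ** 2) / i1 ** 2          # squared CV of 1/s_t

# Closed forms, with numerators scaled to be comparable.
sr_plain = (a + g * m1) / math.sqrt(m2 + g ** 2 * var_s)
sr_weighted = (a + g / i1) / math.sqrt(1 / i1 ** 2 + a ** 2 * cv_inv_sq)

# Largest |a| for which the weighted denominator is below the unweighted one.
upper = math.sqrt((m2 + g ** 2 * var_s - 1 / i1 ** 2) / cv_inv_sq)
```

With these inputs a lies between −g·E[s_{t}] and the upper bound, g is negative, and the weighted Sharpe ratio exceeds the unweighted one, as the sufficient condition suggests.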

The preceding results are in a context similar to that of Hallerbach [2012]. They are less general in that they do not attempt to show that volatility weighting is optimal among a class of strategies. However, the result found by Hallerbach [2012] holds under the condition that volatility is independent of the normalized returns, which will not be the case if a is non-zero.^{4} Some claims can still be made that the result should hold approximately. We have made more precise the intuition that the mean (specifically the part not depending on volatility) should have only a small effect by explicitly giving ranges of a over which volatility weighting may work; we have also explicitly included a dependence of returns on volatility. If a is zero, then the result of Hallerbach [2012] applies, and volatility weighting is in fact optimal for the Sharpe ratio.^{5} In this case, the effectiveness is related to the variability of volatility by minimizing CV(s_{t}). Thus, the smaller the effect of the part of returns not depending on volatility, the more effective volatility weighting is likely to be.

**Weighting with Underlying Volatility (Normalized Returns)**

Instead of weighting the entire strategy by its volatility, we can weight each of the underlying assets, creating a new (notional) set of normalized assets. We refer to this as using *normalized returns*. We assumed again that all processes are adapted to some filtration (*F*_{t})_{t}. First, consider a signed time-series strategy on a single asset with returns of the form r_{t} = c_{t}s_{t}. Here we suppose c_{t} has a conditional variance of 1, s_{t} is a positive predictable process, and for every t, |c_{t}| and s_{t} are independent.^{6} Now consider weighting the asset by v_{t}/s_{t} to get a new asset (v_{t}/s_{t})r_{t} = c_{t}v_{t}, on which the time-series strategy can now be run. Picking v_{t} deterministic corresponds to volatility targeting. The multiplicative form is assumed, as this allows us to make statements about optimality (otherwise we get complications, as in the previous section).

If we suppose that the weighting does not influence the probability of making correct or incorrect predictions (e.g., restricting v_{t} to be positive), then the Sharpe ratio of the time-series strategy on the normalized asset can be derived in closed form (see the Appendix for details).

This is maximized by choosing v_{t} deterministic (i.e., volatility weighting). Note that the improvement relies on reducing the coefficient of variation of volatility—the variability of volatility, much as noted by Hallerbach [2012]. It is not hard to construct some simple processes that display momentum and to which this result (or a minor variation of it) can be applied. Note that this result applies to a random volatility for a single defined period of time. It does not apply to a deterministic volatility, even if this deterministic volatility is time-varying. Hallerbach [2014] provided results for the latter case.
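The stabilizing effect can be illustrated with a deliberately simple exact example. The sketch assumes a timing return of the form S_{t}|c_{t}|v_{t} with |c_{t}| = 1, S_{t} = ±1 independent of v_{t}, and a hypothetical success ratio; it compares a random target v_{t} with a deterministic one of the same mean.

```python
import math

p = 0.55   # hypothetical success ratio; the failure ratio is 1 - p


def timing_sharpe(v_states):
    """Sharpe ratio of S_t * |c_t| * v_t with |c_t| = 1 and S_t independent of v_t."""
    mean = second = 0.0
    for v, pv in v_states.items():
        for s, ps in ((+1, p), (-1, 1 - p)):
            r = s * v
            mean += pv * ps * r
            second += pv * ps * r * r
    return mean / math.sqrt(second - mean ** 2)


sr_random = timing_sharpe({0.5: 0.5, 1.5: 0.5})   # volatility varies across states
sr_target = timing_sharpe({1.0: 1.0})             # deterministic v_t (volatility targeting)
```

Both cases have the same expected return, but the deterministic target has the smaller second moment and therefore the higher Sharpe ratio.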

The preceding can be applied in a cross-sectional context by weighting with the volatility of the deviations from the cross-sectional average return. A specific case of this would be what we term *dispersion weighting*, which we will examine empirically. Following Solnik and Roulet [2000], we can model cross-sectional dispersion as a predictable process s_{t} such that r_{i,t} = f_{t} + s_{t}e_{i,t}, where we take for f_{t} the (equal-weighted) market. That is, we consider dispersion as a cross-sectional volatility. Now weighting by cross-sectional dispersion would improve a time-series strategy run on deviations (i.e., a cross-sectional strategy), which can be seen by replacing returns with deviations, now of the form s_{t}e_{i,t}, in the previous argument.

One thing worth noting about the preceding results is that the relationship of returns with volatility (or dispersion) is not negative, but positive. The conditional expectation of returns under the timing strategy is (p – q)s_{t}E|c_{t}|, which is increasing in volatility. Here, it is not a volatility timing effect that results in an improved Sharpe ratio, but rather a volatility stabilizing effect. In practice, the volatility timing may also be important. Of course, it is also necessary to be able to forecast volatility (we have assumed volatility is perfectly predictable), and thus the efficacy of volatility weighting in practice will also depend on how effectively this can be done.

**DATA AND IMPLEMENTED STRATEGIES**

We considered the set of 49 U.S. industry portfolios, as compiled by French,^{7} with two basic strategies: a signed time-series strategy and a quantile cross-sectional strategy. Each strategy starts with a notional capital of one. These strategies invest each month based on the past J months of returns (formation period) and hold their positions for one month (holding period). J was chosen to be either 12 months (long formation) or one month (short formation).^{8} The signed time-series strategy invests a proportion 1/N of the (notional) capital available in each asset in each period, positively (negatively) if the formation period return was positive (negative). The quantile strategy invests positively in the top quarter (12 assets) of assets in the formation period and negatively in the bottom quarter (each leg equal to the capital available). We considered two time periods, July 1969 to June 1994 and July 1994 to December 2012. The latter period gives an indication of whether markets behave differently more recently, whereas the former provides a look at behavior over a relatively long time period.
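A minimal sketch of the quantile cross-sectional weights follows. It assumes equal weights within each leg, which the text does not state explicitly, and the function name and sample data are hypothetical.

```python
def quantile_weights(formation_returns, fraction=0.25):
    """Long the top quantile and short the bottom quantile, each leg summing to one.

    Equal weighting inside each leg is an assumption made for illustration.
    """
    n = len(formation_returns)
    k = int(n * fraction)                  # 49 assets -> 12 per leg
    order = sorted(range(n), key=lambda i: formation_returns[i])
    weights = [0.0] * n
    for i in order[:k]:                    # worst performers: short leg
        weights[i] = -1.0 / k
    for i in order[-k:]:                   # best performers: long leg
        weights[i] = 1.0 / k
    return weights


returns = [0.01 * (i - 24) for i in range(49)]   # hypothetical formation-period returns
w = quantile_weights(returns)
```

The resulting portfolio is dollar neutral, with 12 long and 12 short positions and 25 assets left untouched.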

**EMPIRICAL RESULTS**

We examined empirically the effect of volatility weighting. We considered weighting signed time-series and quantile cross-sectional strategies by their own volatility as well as by using normalized industry returns. We also considered a strategy that simply invests equally in each asset available each month (the equal-weighted market) for comparison. For the cross-sectional strategies only, we considered the effect of dispersion weighting. Similar to Moskowitz, Ooi, and Pedersen [2012], we used an exponentially weighted moving average (EWMA) with a persistence parameter of 0.9836 based on daily data, rescaled from the daily frequency, to estimate both strategy and individual asset (ex ante) volatilities.^{9}
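An EWMA volatility estimate of this kind might look as follows. This is a sketch, not the authors’ exact estimator: it uses squared daily returns without demeaning, seeds the recursion with the first squared return, and assumes a 21-day scaling to a monthly horizon; only the persistence parameter of 0.9836 comes from the text.

```python
import math


def ewma_volatility(daily_returns, persistence=0.9836, horizon_days=21):
    """Ex ante volatility from an EWMA of squared daily returns.

    horizon_days=21 (roughly one trading month) and the zero-mean
    simplification are assumptions made for this illustration.
    """
    variance = daily_returns[0] ** 2          # seed value (assumption)
    for r in daily_returns[1:]:
        variance = persistence * variance + (1.0 - persistence) * r * r
    return math.sqrt(horizon_days * variance)


# With returns of constant magnitude the estimate equals that magnitude times sqrt(21).
vol = ewma_volatility([0.01, -0.01] * 250)
```

A persistence of 0.9836 puts a weight of about 1.6% on the most recent squared return, so the estimate reacts slowly to single observations.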

In this section we used robust regressions with a bisquare weighting function. The reported P-values are based on normality and can only be seen as approximate (even should the underlying data be normally distributed). We sometimes report percentage R^{2} values as well—it should be noted that these do not have the same interpretation as for an ordinary least squares (OLS) regression and can sometimes be negative.

**Own Volatility**

First we consider whether strategy volatility is predictable. Following Barroso and Santa-Clara [2015], we ran AR(1) regressions of the square root of the 21-day realized variances for our strategies in order to analyze the predictability of the volatility.^{10}

Exhibit 1 reports the slope coefficients (along with *t*-statistics in parentheses and P-values in square brackets) and R^{2} values for robust AR(1) regressions. The intercepts are all positive and highly significant and thus not reported. We find that volatility is quite predictable, with significant AR(1) coefficients of over 0.4 (with one exception) and R^{2} of close to 20% or even higher. Volatility appears to be even more predictable in the more recent period, with higher *t*-values and R^{2}.

We now try to assess whether, in the theoretical framework discussed earlier, a strategy’s return is dependent on its own (ex ante) volatility and to predict whether volatility weighting may be beneficial. First, we regressed normalized strategy returns on the inverse of volatility as follows^{11}:

$$
\frac{r_{t}}{s_{t}} = g + a\,\frac{1}{s_{t}} + e_{t}
$$

This regression is based on Equation (2) and allows us to estimate the dependence on volatility (g) and the portion of returns not depending on volatility (a). We also calculated the upper value of the range from Equation (3) and the associated ranges for a over which volatility weighting may work. These figures are reported in Exhibit 2 for robust regressions.
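The mechanics of this regression can be sketched with ordinary least squares. Note the swap: the article uses robust regressions with a bisquare weighting function, whereas the sketch below uses plain OLS on noise-free hypothetical data so that the estimates can be checked exactly. The intercept estimates g and the slope on inverse volatility estimates a.

```python
def fit_line(x, y):
    """OLS fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((u - mx) * (v - my) for u, v in zip(x, y))
             / sum((u - mx) ** 2 for u in x))
    return my - slope * mx, slope


true_a, true_g = 0.004, -0.05                    # hypothetical parameters
vols = [0.02, 0.03, 0.04, 0.05, 0.06]            # hypothetical ex ante volatilities
returns = [true_a + true_g * s for s in vols]    # noise-free for a deterministic check

normalized = [r / s for r, s in zip(returns, vols)]
inverse_vol = [1.0 / s for s in vols]
g_hat, a_hat = fit_line(inverse_vol, normalized)
```

With real data the residual e_{t} is present and outliers matter, which is one reason a robust estimator may be preferred.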

The a estimates fall comfortably into the relevant ranges, and the upper value of the range is large compared to a, which provides some empirical justification for Hallerbach’s [2012] approximation. One would expect volatility weighting to work, at least where g is negative (recall that here the condition was sufficient and not just necessary). The relationship found with volatility is, however, weak and not always negative, as one may have expected. In particular, the short formation time-series strategy has a positive relationship with volatility (as does the equal-weighted market). The other three momentum strategies do appear to have a negative relationship, with the caveat that the long formation cross-sectional strategy has a very weakly positive g estimate in the more recent period (this is negative under an [unreported] OLS regression).

Exhibits 3 and 4 report descriptive statistics for both unweighted strategies and weighted strategies over the two sub-periods.^{12} The descriptive statistics reported are the mean, standard deviation, mean less median (to reflect skewness), kurtosis, Sharpe ratio, and the average of the largest five (normalized) drawdowns. The volatility-weighted strategies use strategy returns (s^{*}/\hat{s}_{t}) r_{t}, where \hat{s}_{t} is the ex ante volatility estimate and s^{*} is a (conditional) volatility target of 10%/√12, corresponding to an annual volatility of 10%.^{13}
Exhibits 5 and 6 plot the log cumulative returns for the weighted and unweighted strategies, where the returns (for the sake of comparability) are normalized with their ex post volatility. In the earlier period, all one-month industry momentum strategies have a higher Sharpe ratio than the corresponding long formation strategies; this confirms the findings of Moskowitz and Grinblatt [1999]. However, for the second period, this is only true for the time-series strategies.
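Scaling a strategy by its own ex ante volatility can be sketched as follows, assuming a monthly target of 10%/√12 (a 10% annual volatility) and hypothetical monthly returns and volatility estimates.

```python
import math

TARGET = 0.10 / math.sqrt(12)   # monthly target implied by a 10% annual volatility


def volatility_weighted(returns, ex_ante_vols, target=TARGET):
    """Scale each period's strategy return by target / ex ante volatility."""
    return [target / s * r for r, s in zip(returns, ex_ante_vols)]


# Hypothetical monthly strategy returns and their ex ante volatility estimates.
scaled = volatility_weighted([0.02, -0.03], [0.04, 0.06])
```

Both sample returns are half a standard deviation in size, so after scaling they have the same magnitude, TARGET/2, regardless of the volatility regime.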

It appears that weighting with own volatility is effective (i.e., it increases the Sharpe ratio), at least when the relationship with volatility is negative. For the earlier period, this is so for the cross-sectional strategies and the long formation time-series strategy (for the latter strategy, the weighting even flips the Sharpe ratio from below to above the equal-weighted market’s). For the market and the short formation time-series strategy where the relationship was positive, we see a deterioration: This suggests that (negative) volatility timing is at work and subsumes volatility smoothing, which is reflected in the cumulative return graphs as well. For the more recent period, the weighting improves the Sharpe ratio for all five strategies.

We would expect the standard deviations of weighted strategies to be close to the 10% target (or perhaps a little higher due to the effect of a non-zero conditional mean). This is the case in the more recent period, but in the earlier period the standard deviation is much higher, indicating that the volatility estimate does not capture all of the future volatility. The weighting also lowers the kurtosis of the strategies, which suggests further beneficial aspects of volatility weighting. A lower kurtosis would naturally result from stabilizing volatility.^{14} The effect on skewness is less clear, but the normalized drawdowns tend to decrease, which points to a reduction of downside risk.

Exhibit 7 reports the intercepts for regressions of weighted versus unweighted strategies, which offers a statistical test for outperformance (i.e., alpha).^{15} Notable is that the intercepts are positive (indicating outperformance), albeit only weakly so, in all cases, except for the short formation time-series strategy and the equal-weighted market in the earlier period (in which the Sharpe ratios also deteriorated).

To further investigate the ability of weighting to reduce downside risk (negative skewness or drawdowns), we investigated the possible asymmetry (nonlinearity) of the relation between weighted strategy returns and the corresponding unweighted strategy returns (and between the weighted market and the unweighted market). For this purpose, we performed a multivariate regression in the spirit of the asymmetric response market timing model of Henriksson and Merton [1981]. However, because we are interested in the downside, we regressed the weighted strategy returns on (1) the corresponding unweighted strategy returns, and (2) the *negative* part of the unweighted strategy returns (defined as min[return, 0]). When this second slope coefficient is negative, it implies that the relation between the weighted and unweighted strategy returns is convex: The weighted strategy’s sensitivity to the unweighted strategy is smaller for negative returns than for positive returns. Consequently, the weighting reduces the left tail of the strategy’s return distribution (and hence reduces downside risk).^{16}
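The asymmetric response regression can be sketched with a small least-squares solver. The sketch uses plain OLS via the normal equations rather than the robust regression used in the article, and the data are hypothetical and constructed so that the true coefficients are known.

```python
def least_squares(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(k):
            if i != col:
                f = m[i][col] / m[col][col]
                m[i] = [u - f * v for u, v in zip(m[i], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


def asymmetric_response(unweighted, weighted):
    """Regress weighted returns on [1, r, min(r, 0)] of the unweighted strategy."""
    rows = [[1.0, r, min(r, 0.0)] for r in unweighted]
    return least_squares(rows, weighted)   # [intercept, slope, downside slope]


unweighted = [-0.06, -0.03, -0.01, 0.02, 0.04, 0.05]        # hypothetical returns
weighted = [0.001 + 0.8 * r - 0.3 * min(r, 0.0) for r in unweighted]
alpha, slope, downside = asymmetric_response(unweighted, weighted)
```

Here the downside coefficient is −0.3, so the sensitivity to the unweighted strategy is 0.8 for positive returns but only 0.5 for negative returns, which is the convex pattern described above.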

Panel A of Exhibit 8 reports the robust regression coefficients and the corresponding statistics. For most strategies, we observe a negative slope coefficient for the negative part of the unweighted strategy return. Notable exceptions are the short formation time-series strategy and the market, both of which also showed a deterioration of the Sharpe ratio from weighting in the first period (but not in the second). The negative slopes point to a convex relation between the weighted and unweighted strategies’ returns, thus confirming that weighting with own volatility tends to reduce the downside risk of the weighted strategies.^{17} Although this particular regression setup does reveal nonlinearities, it can obscure out- or underperformance. In particular, to judge the outperformance of the weighted versus the unweighted strategies, the sign of the intercept should be evaluated together with the sign of the slope coefficient of the negative part of the unweighted strategy. After all, a positive alpha can outweigh a less negative (or even positive) slope and vice versa. From this perspective, only the combinations of (1) negative slope and positive alpha (i.e., outperformance) and (2) negative alpha and positive slope (i.e., underperformance) are unambiguous.

A final question is whether the performance difference between weighted and unweighted strategies is due to different exposures to established priced risk factors.^{18} We considered the Carhart [1997] 4-factor model, comprising the Fama and French [1992] U.S. market factor (MKT), size factor (small minus big [SMB]), and value factor (high minus low [HML]), supplemented with the 12-1 momentum factor (MOM).^{19} We included the momentum factor to adjust industry momentum for individual stock momentum (cf. Moskowitz and Grinblatt [1999]).

Exhibit 9 shows the results for the unweighted and weighted strategies separately. Starting with the significant exposures of the unweighted strategies, we see that the equal-weighted market has a positive loading on the size factor^{20} in both periods, a negative stock momentum exposure in the first period, and a positive value exposure in the second. In both periods, the equal-weighted market has no significant alpha. The cross-sectional strategies have no average market exposures (except for the one-month strategy in the more recent period), have positive loadings on the momentum factor, and tend to have negative exposures to the size factor. The 12-month unweighted time-series strategy shows significant positive loadings on the momentum, value, and market factors in both periods; the one-month strategy has a negative market exposure in the first period and a positive market loading in the second. The only significant alphas are in the first period: negative for the 12-month time-series strategy and positive for both one-month strategies.

For the strategies weighted with their own volatility, Panel B of Exhibit 9 shows the same pattern of significant exposures as for the unweighted strategies, with again significant positive alphas for the one-month strategies and a negative alpha for the 12-month time-series strategy. We conclude that in this multifactor context, the short formation weighted momentum strategies maintain their risk-adjusted outperformance (notwithstanding any improvements in the downside risk profile of the 12-month strategies as shown in Exhibit 8, which we cannot capture in the linear Carhart model).

The question arises of whether weighting with own volatility improves the risk-adjusted performance. If the volatilities of the weighted and unweighted strategies are very different (this is especially true for the normalized returns strategies; see Exhibits 3 and 4), then their return difference will be dominated by the strategy with the largest volatility, and this confounds the results. Thus, we cannot simply regress the return difference on the four factors. Instead, we augment the 4-factor model with the unweighted strategy’s return as an additional regressor. Next, we regress the weighted strategy return on this augmented 4-factor model. The resulting alphas are excess alphas in the sense that they reflect the difference between the unweighted and weighted strategies’ alphas, while taking into account any differences in volatility (scaling) between the unweighted and weighted strategies.

Exhibit 10 shows the augmented regression alphas. For both periods, the signs of the strategy alphas agree with the signs of the univariate regression alphas as reported in Exhibit 7, although their statistical significance is nowhere near accepted levels. We observe the biggest change for the 12-month time-series strategy in the earlier period: Its largest, positive, and highly significant univariate alpha is now the smallest and least significant.^{21} In the earlier period, the one-month time-series strategy has a negative alpha (significant at the 10% level). For this strategy, Exhibit 3 already showed that weighting reduced the Sharpe ratio. We conclude that weighting with own volatility provides a positive tilt to linear multifactor alphas (except for the one-month time-series strategy), but the improvement appears not to be significant.

**Underlying Volatility**

We then ran momentum strategies on normalized asset returns. For each underlying asset *i*, a normalized return series (s^{*}/\hat{s}_{i,t}) r_{i,t} was used, where \hat{s}_{i,t} is the asset’s ex ante volatility estimate and the conditional volatility target s^{*} was again chosen to be 10%/√12.^{22} The strategies were then run on this new set of normalized assets.
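Running a signed time-series strategy on normalized assets can be sketched as follows. The sketch assumes a monthly target of 10%/√12 and equal weights of 1/N across assets; the returns and volatility estimates are hypothetical.

```python
import math

TARGET = 0.10 / math.sqrt(12)   # monthly target implied by a 10% annual volatility


def normalized_returns(asset_returns, asset_vols, target=TARGET):
    """Scale each asset's return by target / its own ex ante volatility."""
    return [target / s * r for r, s in zip(asset_returns, asset_vols)]


def signed_time_series_return(prev_norm, curr_norm):
    """Equal-weighted signed strategy run on the normalized assets."""
    n = len(curr_norm)
    return sum(((p > 0) - (p < 0)) * c for p, c in zip(prev_norm, curr_norm)) / n


# Two hypothetical assets over two consecutive months.
prev_norm = normalized_returns([0.02, -0.01], [0.04, 0.02])
curr_norm = normalized_returns([0.01, 0.03], [0.05, 0.03])
strategy_return = signed_time_series_return(prev_norm, curr_norm)
```

Unlike weighting the whole strategy, each asset here receives its own scaling, so a volatile industry no longer dominates the portfolio’s risk.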

Descriptive statistics are also reported in Exhibits 3 and 4 and the cumulative returns plotted in Exhibits 5 and 6. Here the Sharpe ratio increases in almost all cases (the only exception is the equal-weighted market in the earlier period). This is also reflected in Exhibits 5 and 6. The kurtosis also mostly decreases, but the effect on drawdowns and skewness is less clear.

Panel B of Exhibit 7 reports intercepts for the normalized return strategies regressed on the unweighted strategies. Here we see intercepts that are positive for the more recent period, albeit not significant. For the earlier period, however, only the cross-sectional strategies show positive intercepts; for these strategies we also observe the largest increase of Sharpe ratios (see Exhibit 3).

Turning to the asymmetric response regressions reported in Panel B of Exhibit 8, we first see that the slopes for the full unweighted strategy returns are well below unity; this reflects the diversification across normalized industries. As before, we observe negative slope coefficients for the negative part of the unweighted strategy return, except for the two time-series strategies in the earlier period (which also showed only modest increases in their Sharpe ratios) and the equal-weighted market again in both periods. This confirms that weighting with underlying volatility also tends to reduce the downside risk of the momentum strategies.

To better understand these results, we considered some additional tests. First, we regressed the weighted and unweighted strategies on market volatility and also calculated correlations. The slopes of these regressions and the correlations are in Exhibit 11. Three of the four momentum strategies show a negative relationship with market volatility. Given similar findings by Wang and Xu [2015], the negative relationship is not unexpected. It is thus interesting that the short formation time-series strategy shows a positive relationship. In the more recent period, the regression estimate is weakly negative, but the correlation is still positive. The equal-weighted market shows a positive relationship with volatility.^{23}

A negative relationship with market volatility suggests that some of the improvement from volatility weighting should be attributed to volatility timing. The total exposure of the strategy would fall when average market volatility is high and increase when it is low (though changes in correlations would not be adjusted for). The positive relationship with market volatility would at least explain the small improvement in the Sharpe ratio (from 0.49 to 0.52) and the negative intercept for the short formation time-series strategy.

The other element of volatility weighting that would contribute to its effectiveness is the stabilizing of volatility—both within and across assets. This means that volatility weighting may be effective despite a positive relationship with volatility. We can surmise that this stabilizing is the more important effect for the short formation time-series strategy, but it is not clear for the other strategies. A simple means of considering stabilization is to compute volatility estimates for both the unweighted market and the market of normalized returns. The coefficient of variation of the volatility estimate falls from 0.37 to 0.27 in the earlier period and from 0.52 to 0.24 in the more recent period. Similarly, the coefficient of variation of dispersion (which we noted is also a kind of volatility), calculated as a cross-sectional standard deviation, drops from 0.30 to 0.25 in the earlier period and from 0.39 to 0.28 in the more recent period. This at least suggests that stabilization is in fact an important element of volatility weighting.
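The stabilization check described here reduces to comparing coefficients of variation. A minimal sketch, assuming population (rather than sample) standard deviations and using hypothetical volatility estimates:

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


def dispersion(cross_section):
    """Cross-sectional standard deviation of one period's asset returns."""
    return statistics.pstdev(cross_section)


raw_vols = [0.02, 0.06, 0.02, 0.06]          # hypothetical estimates, unweighted market
normalized_vols = [0.03, 0.05, 0.03, 0.05]   # hypothetical estimates, normalized market
cv_raw = coefficient_of_variation(raw_vols)
cv_normalized = coefficient_of_variation(normalized_vols)
```

A lower coefficient of variation for the normalized market indicates that volatility itself has become more stable, which is the effect the comparison is meant to capture.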

We finally turn to the risk-adjusted performance statistics from the (augmented) Fama–French–Carhart regressions as reported in Panel C of Exhibit 9 and Panel B of Exhibit 10. Regarding significant factor loadings in the Carhart model in Panel C of Exhibit 9, the normalized returns strategies all show the same patterns as the strategies weighted with own volatility (and the unweighted strategies); compare with Panels A and B. The stock momentum exposures of the cross-sectional strategies and the 12-month time-series strategy are positive in both periods. The two cross-sectional strategies now both have a significant positive alpha in the first period and the one-month time-series strategy in both periods. The 12-month time-series strategy again has a negative alpha in the first period.

Regarding the improvement in risk-adjusted performance of using normalized returns versus the unweighted strategies, Panel B of Exhibit 10 shows that the normalized cross-sectional strategies have positive excess alphas in the first period (where the one-month strategy’s alpha becomes highly significant) but effectively zero alphas in the second period. The normalized time-series strategies show a reversed pattern: effectively zero in the first period but significantly (at the 10% level) positive in the second. Thus, there is no consistent improvement in risk-adjusted excess performance over both periods.

We note that this linear multifactor context may ignore any improvements in downside risk by weighting. Panel B of Exhibit 8 shows that the normalized time-series strategies in the first period exhibit significant concavity with respect to the unweighted time-series strategies and convexity in the second (positive and negative slopes for the negative part of the unweighted strategy, respectively). Hence, the undesirable concavity is paired with zero excess alphas and the desirable convexity is paired with positive excess alphas (Exhibit 10, Panel B). This suggests that the linear factor model does subsume nonlinearities to some extent via factor exposures. However, these are selected examples, and our results do not warrant any generalization.

**Dispersion as Volatility (Dispersion Weighting)**

We then tested the effect of dispersion weighting for cross-sectional strategies. For this to work, we need dispersion to be forecastable. Exhibit 12 gives the slope and R^{2} of robust AR(1) regressions of dispersion (measured as a cross-sectional standard deviation). We do see a significant AR(1) coefficient and a moderate R^{2} value, suggesting some ability to forecast dispersion, particularly in the more recent period.

For our dispersion forecast, we ran a moving-window OLS AR(1) regression with a window of 36 months and then made a one-step-ahead forecast. The forecast and the actual dispersion in each period are plotted in Exhibit 13. The dispersion forecast is far less variable than the actual dispersion and seems to lag behind it. There is a lot of unanticipated dispersion that the forecast cannot capture. This is, of course, a naïve forecast and could potentially be much improved.
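
A minimal sketch of such a moving-window forecast is given below. It assumes the AR(1) regression includes an intercept and that each forecast uses only the 36 observations preceding the forecast month; the function names `ols_ar1` and `rolling_ar1_forecast` and the example series are hypothetical.

```python
import math


def ols_ar1(series):
    """OLS fit of x_t = a + b * x_{t-1} + e_t on one window; returns (a, b)."""
    x = series[:-1]          # lagged values
    y = series[1:]           # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("series is constant within the window")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def rolling_ar1_forecast(series, window=36):
    """One-step-ahead forecasts; forecasts[t] uses only series[t - window : t].

    Entries before the first full window are None. A window of 36
    observations yields 35 (lagged, current) pairs for the regression.
    """
    forecasts = [None] * len(series)
    for t in range(window, len(series)):
        a, b = ols_ar1(series[t - window:t])
        forecasts[t] = a + b * series[t - 1]
    return forecasts


# Hypothetical monthly dispersion series (cross-sectional standard deviations).
dispersion = [0.04 + 0.01 * math.sin(t / 6) for t in range(60)]
forecasts = rolling_ar1_forecast(dispersion)
```

Because the fitted coefficients are re-estimated each month from past data only, the forecast can be used as a scaling variable without look-ahead bias.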

We ran quantile cross-sectional momentum strategies scaled with the dispersion forecast and reported Sharpe ratios for these strategies, and regressed them on the unweighted strategies. To disentangle the effect of dispersion weighting from the (in)accuracy of the forecast, we also considered strategies scaled by the actual dispersion in the holding period. This utopian case assumes perfect forecast ability. Although these latter strategies cannot actually be implemented, they do provide a useful benchmark for comparison: They represent the upper bound on the potential value-added of dispersion weighting. The Sharpe ratios are in Exhibit 14 and the intercepts in Exhibit 15.
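
The comparison can be sketched as follows. This is an illustration under assumptions: the return and dispersion numbers are made up, the helper names `sharpe_ratio` and `scale_by_dispersion` are hypothetical, and the Sharpe ratio is annualized from monthly data by multiplying by the square root of 12.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=12):
    """Annualized Sharpe ratio of per-period excess returns."""
    return (statistics.mean(returns) / statistics.stdev(returns)
            * math.sqrt(periods_per_year))


def scale_by_dispersion(strategy_returns, dispersion, target=1.0):
    """Scale each return by target / dispersion; skip periods with no dispersion value."""
    return [target * r / d
            for r, d in zip(strategy_returns, dispersion)
            if d is not None]


# Hypothetical inputs: monthly strategy returns, a dispersion forecast that is
# unavailable in the first month, and the dispersion actually realized.
strategy = [0.020, -0.010, 0.030, 0.015, -0.025, 0.010]
forecast = [None, 0.050, 0.055, 0.060, 0.050, 0.045]
actual = [0.045, 0.060, 0.050, 0.055, 0.080, 0.040]

weighted_forecast = scale_by_dispersion(strategy, forecast, target=0.05)
weighted_actual = scale_by_dispersion(strategy, actual, target=0.05)

print("unweighted:        ", sharpe_ratio(strategy))
print("forecast-weighted: ", sharpe_ratio(weighted_forecast))
print("actual-weighted:   ", sharpe_ratio(weighted_actual))
```

The choice of `target` rescales the mean and the standard deviation by the same factor, so it does not affect the Sharpe ratio; only the time variation in the scaling variable matters.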

We do see an increase in the Sharpe ratio for the dispersion weighted strategies, but the increase is mild, particularly in the early period. For the long formation, we see that weighting with actual dispersion gives the greatest improvement, as expected, but this is not the case for the short formation strategy, for which weighting with actual dispersion actually did worse than the forecast in the more recent period. This latter result is unexpected. We do, however, see positive intercepts for the regressions (indicating outperformance) except in one case.

Any positive effect from dispersion weighting would also flow from a stabilization of dispersion and possibly a timing effect from a negative relationship with dispersion. Stivers and Sun [2010], for instance, found a negative relationship between dispersion and cross-sectional stock momentum. Because volatility and dispersion are positively related (this is seen for stock momentum, reported by Wang and Xu [2015]), there may also be an aspect of volatility timing involved.

**SUMMARY AND CONCLUSIONS**

We defined two types of signed momentum strategies, a time-series and a cross-sectional strategy, the former already well known in the literature. We considered two forms of volatility weighting: weighting a strategy by its own (predicted) volatility and weighting each of the underlying assets (normalized returns). In the former case, we see that the intuition that volatility weighting is beneficial whenever returns are negatively related to ex ante volatility is only partially accurate. We considered using normalized returns in both a time-series and cross-sectional setting, deriving some simple results.

We distinguish between a timing effect and a stabilizing effect in volatility weighting. The latter is important when the relationship between returns and volatility is negative. These effects are hard to disentangle, but we find that both are important. Our empirical results confirm that weighting a strategy with its own volatility as well as using normalized returns adds value: the Sharpe ratio increases, the kurtosis and downside risk decrease, and risk-adjusted performance as measured against the 4-factor Fama–French–Carhart model tends to be positive. Regarding the latter, however, there is substantial variation over the two time periods studied, and the interplay between improved downside risk and positive alpha prevents conclusive statements. We can conclude, though, that weighting a strategy with its own volatility seems to work, at least when the relationship with volatility is negative, and using normalized returns is almost always effective. Dispersion weighting, however, seems to be less effective, though it still improves the Sharpe ratio.

The results on the industry portfolios have limitations in that it is hard to verify whether strategies could actually have been traded efficiently. Market frictions may erode many of the benefits of volatility weighting, and this certainly needs further investigation. Further disentangling the effects of volatility stabilizing and timing is also a fruitful avenue for future research.

**APPENDIX**

For interested readers, we derive some of the results in the theoretical framework discussion.

**SIGNED STRATEGIES**

First consider a single asset with returns $r_t$ and a strategy that invests 1 at $t$ if $r_{t-1}$ is positive and $-1$ if it is negative (0 otherwise). The investment makes a positive return (i.e., the prediction is correct) if $S_t := \mathrm{sign}(r_{t-1} r_t) = 1$. Suppose this signed variable is independent of $|r_t|$. Denote $P(S_t = 1) = p$ and $P(S_t = -1) = q$. For continuous variables these will add up to 1.

The return on the strategy can be written as

$$\mathrm{sign}(r_{t-1})\, r_t = S_t |r_t|.$$

The expected return of the strategy is

$$E[S_t |r_t|] = E[S_t]\, E|r_t| = (p - q)\, E|r_t|,$$

noting that $E[S_t] = 1 \times p - 1 \times q$.

The variance of the strategy is

$$\mathrm{Var}(S_t |r_t|) = E[r_t^2] - (p - q)^2 \left(E|r_t|\right)^2,$$

because $S_t^2 = 1$ almost surely.
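
A small Monte Carlo check of these moments is sketched below. Under the independence assumption, the strategy return equals $S_t |r_t|$, so its mean should be close to $(p - q)E|r_t|$ and its variance close to $E[r_t^2] - (p - q)^2 (E|r_t|)^2$. The data-generating process (Gaussian magnitudes, a hit probability of 0.55) and the function name `simulate_signed_strategy` are assumptions chosen for illustration.

```python
import random
import statistics


def simulate_signed_strategy(n, p, sigma, seed=0):
    """Simulate a signed strategy whose hit indicator S_t is independent of |r_t|.

    Returns (strategy_returns, hit_indicators, magnitudes).
    """
    rng = random.Random(seed)
    prev_sign = 1                                  # sign of r_{t-1}; arbitrary start
    strategy, hits, magnitudes = [], [], []
    for _ in range(n):
        magnitude = abs(rng.gauss(0.0, sigma))     # |r_t|
        s = 1 if rng.random() < p else -1          # S_t = sign(r_{t-1} * r_t)
        r = prev_sign * s * magnitude              # r_t
        strategy.append(prev_sign * r)             # position sign(r_{t-1}) times r_t
        hits.append(s)
        magnitudes.append(magnitude)
        prev_sign = prev_sign * s                  # sign of r_t
    return strategy, hits, magnitudes


strategy, hits, magnitudes = simulate_signed_strategy(n=200_000, p=0.55, sigma=0.04)

mean_abs = statistics.fmean(magnitudes)            # estimate of E|r_t|
p_minus_q = statistics.fmean(hits)                 # estimate of p - q
predicted_mean = p_minus_q * mean_abs
predicted_var = statistics.fmean(m * m for m in magnitudes) - predicted_mean ** 2

sample_mean = statistics.fmean(strategy)
sample_var = statistics.fmean(x * x for x in strategy) - sample_mean ** 2

print(f"mean:     sample {sample_mean:.6f}  predicted {predicted_mean:.6f}")
print(f"variance: sample {sample_var:.6f}  predicted {predicted_var:.6f}")
```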

**WEIGHTING WITH OWN VOLATILITY**

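Consider an asset (or strategy) with the following return decomposition:

$$r_t = \alpha + \gamma \sigma_t + \sigma_t \varepsilon_t,$$

where $\alpha$ and $\gamma$ are constants, $\sigma_t$ is a positive predictable process representing the conditional volatility of the strategy returns at time $t$, and where the variate $\varepsilon_t$ is assumed to have zero mean and unit variance conditioned on $\mathcal{F}_{t-1}$. The expected return is $\alpha + \gamma E\sigma_t$ and the variance is:

$$\mathrm{Var}(r_t) = \gamma^2\, \mathrm{Var}(\sigma_t) + E\sigma_t^2.$$

Thus the Sharpe ratio is

$$\frac{\alpha + \gamma E\sigma_t}{\sqrt{\gamma^2\, \mathrm{Var}(\sigma_t) + E\sigma_t^2}}.$$

Now consider the volatility-weighted strategy

$$\frac{r_t}{\sigma_t} = \frac{\alpha}{\sigma_t} + \gamma + \varepsilon_t.$$

The expected return is $\alpha E\sigma_t^{-1} + \gamma$ and the variance is

$$\alpha^2\, \mathrm{Var}(\sigma_t^{-1}) + 1.$$

Thus the Sharpe ratio is

$$\frac{\alpha E\sigma_t^{-1} + \gamma}{\sqrt{\alpha^2\, \mathrm{Var}(\sigma_t^{-1}) + 1}}.$$

We now derive a range for $\alpha$ over which the preceding denominator falls when performing volatility weighting and the expected return of the strategy is positive. We need $\alpha + \gamma E\sigma_t \geq 0$, so $\alpha \geq -\gamma E\sigma_t$. We also need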

From this one easily obtains the ranges in Equation (5).
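
The effect of weighting by own volatility can be illustrated with a short simulation. The sketch below assumes a return of the form constant plus a term proportional to volatility plus a volatility-scaled shock, that is, $\alpha + \gamma\sigma_t + \sigma_t\varepsilon_t$, consistent with the expected return stated above. The lognormal volatility draws, the parameter values, and the function names are hypothetical, and the Sharpe ratios are per period (not annualized).

```python
import math
import random
import statistics


def per_period_sharpe(returns):
    """Mean divided by (population) standard deviation, not annualized."""
    mean = statistics.fmean(returns)
    var = statistics.fmean((r - mean) ** 2 for r in returns)
    return mean / math.sqrt(var)


def simulate(alpha, gamma, n=200_000, seed=0):
    """Return (Sharpe of unweighted returns, Sharpe of volatility-weighted returns)."""
    rng = random.Random(seed)
    unweighted, weighted = [], []
    for _ in range(n):
        # Assumption: volatility is drawn first (known before the shock), lognormal.
        sigma = 0.04 * math.exp(0.8 * rng.gauss(0.0, 1.0))
        eps = rng.gauss(0.0, 1.0)                 # zero-mean, unit-variance shock
        r = alpha + gamma * sigma + sigma * eps
        unweighted.append(r)
        weighted.append(r / sigma)                # weight by own volatility
    return per_period_sharpe(unweighted), per_period_sharpe(weighted)


sharpe_unweighted, sharpe_weighted = simulate(alpha=0.0, gamma=0.2)
print(f"unweighted Sharpe ratio: {sharpe_unweighted:.3f}")
print(f"weighted Sharpe ratio:   {sharpe_weighted:.3f}")
```

With the constant term set to zero, the weighted return reduces to the proportionality coefficient plus the shock, so its Sharpe ratio is approximately that coefficient (0.2 here). Other values of the constant can be passed to `simulate` to explore cases in which weighting helps less or not at all.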

**WEIGHTING WITH UNDERLYING VOLATILITY (NORMALIZED RETURNS)**

Consider a signed time-series strategy on a single asset with returns of the form r_{t} = c_{t}s_{t} with c_{t} having a conditional variance of 1 and s_{t} a positive predictable process. Further assume that for every t,|c_{t}| and s_{t} are independent. Now consider weighting the asset by
to get a new asset
on which the time-series strategy can now be run. We suppose the success and failure rates p and q are the same for both strategies, and |c_{t}| and v_{t} are also independent. The expected return and variance of the timing strategy on
(with S_{t} defined as in the text) are:

Thus the Sharpe ratio is:

## ENDNOTES

We thank an anonymous referee for critical comments on an earlier draft and helpful suggestions. Of course, we are responsible for any remaining errors.

^{1}Another possibility would be to consider the median return instead of the average.

^{2}The signed cross-sectional strategy will be long the top half of assets and short the bottom half of the assets if the mean and median are equal. Thus, we may expect this strategy to be closely related to the typical cross-sectional strategies studied in the literature (starting with Jegadeesh and Titman [1993]), which buy the top quantile of assets and sell the bottom quantile.

^{3}This is similar to the return generating process discussed by Ederington and Guan [2010] and Hallerbach [2012]; see footnote 4 of Hallerbach.

^{4}See also Hallerbach [2012], footnote 4.

^{5}We should assume γ > 0 for positive expected returns.

^{6}There are slightly weaker assumptions that can be made in the last instance, but this is the most convenient.

^{7}This can be downloaded from http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.

^{8}Individual stocks are characterized by short-term reversal; for this reason, their momentum was measured over the past 12 months, excluding the most recent month (*12-1 momentum*). Industries, however, exhibit both short-term and long-term momentum. See Moskowitz and Grinblatt [1999] and our Exhibit 3.

^{9}The EWMA technically gives a daily volatility estimate, thus the scaling and also the use of a relatively persistent EWMA. The persistence coincides with N = 120 days, with a weighted average time lag of about 61 days and a *half-life* (the time for the weight for a specific return to halve) of about 42 days.

^{10}Barroso and Santa-Clara [2015] ran their regressions on the variance estimates, not the volatility estimates. We also ran such regressions. However, these results (unreported) are weaker and are, in any case, of less interest because it is volatility, not variance, that we wish to forecast.

^{11}This allows the use of standard regression techniques (otherwise we need to compensate for conditional heteroskedasticity).

^{12}Figures are annualized as follows: mean: (1 + r)^{12} - 1; standard deviation and Sharpe ratio: multiplied by √12; mean less median: multiplied by 12.

^{13}The target is arbitrary, but if it is too large the strategy may end up with negative capital when there are large negative returns. The unconditional volatility will, however, be higher (even under a perfect volatility weighting scheme) because of the effect of a non-zero conditional mean µ_{t} (conditional on time t – 1).

^{14}Consider, for instance, a generalized autoregressive conditional heteroskedasticity model, in which it is the random nature of conditional volatility that induces excess kurtosis in the unconditional distribution.

^{15}The slope coefficient of the regression (not reported) would automatically indicate the (ex post) relative weighting needed of one strategy versus the other to ensure comparability in cases in which the strategies have very different volatility.

^{16}When the slope of the negative return part is significant, it implies that the slope of the negative part is significantly different from the slope of the positive return part (i.e., significant nonlinearity).

^{17}Note that the Sharpe ratio does not reflect any changes in the form of the distribution as long as the mean and standard deviation do not change.

^{18}We thank the referee for suggesting this extension.

^{19}These data were also taken from French’s website; see footnote 7.

^{20}This can be explained by the fact that the Fama–French market factor is value-weighted. The equal-weighted market puts more weight on smaller industries.

^{21}The augmented regression factor exposures (not shown here for the sake of brevity) reveal that this is also the only strategy that has (highly) significant positive exposures to the market, value, and stock momentum factors. These exposures have apparently reduced the strategy’s alpha.

^{22}Results from Hallerbach [2014] suggest it is best to use the same target for all the assets. These targets are for the underlying assets, not the strategy. Unlike with weighting strategies by their own volatility, it is not clear what to expect for the standard deviation of the strategy (except that it will be lower because of diversification effects).

^{23}Note, however, that this regression is ill-specified: It regresses on the market’s volatility without considering that this volatility changes. The results in Exhibit 2 for the market strategy are more appropriate in this respect.

**Disclaimer**

Much of the research presented was initially done while Johan du Plessis was an intern in the quantitative investment research department of Robeco. Views expressed in the article are the authors’ own and do not necessarily reflect those of Old Mutual or Robeco.

- © 2017 Pageant Media Ltd