
## Abstract

Granger causality (GC) tests are widely used to empirically address the dynamic relationship between speculative activities and pricing on commodity markets. However, the sheer number of studies and their heterogeneity makes it extremely difficult—if not impossible—to compare their results and to derive meaningful conclusions. This is the main objective of this article, which analyzes a consistent dataset with a homogeneous estimation approach. The authors analyze futures returns and volatilities of 28 commodities for three maturities, from January 2006 to March 2015, in relation to three speculation proxies. Overall, they find a larger number of significant GC effects for volatilities than for returns. The volatility effect is mostly negative (i.e., more speculation is followed by lower volatilities). This is particularly true if the Working index is used as a speculation proxy. The majority of destabilizing effects (positive relations), if any, is found in livestock. However, no such effects seem to be present in typical agricultural commodities. Mixed evidence is found for the soft commodities. Apart from statistical significance, the explained variance of returns and volatilities is below 8% and therefore economically small or moderate at best.

A considerable body of empirical tests has been performed over the past decade to explore the temporal relationship between measures of financial speculation and the prices, returns, and volatilities of a wide range of commodity futures. These tests were mainly motivated by public concern about the adverse impact of financial investors, notably the growth of index-related investing since the mid-2000s, on commodity prices. This financialization of commodity markets has been the subject of numerous theoretical and empirical studies. The (empirical) results are far from homogeneous and not easy to summarize, although the number of studies that report economically or statistically significant effects represents a minority. Reviews of recent empirical research can be found in essentially all published articles and survey articles.^{1}

A major obstacle in comparing and interpreting the results is their heterogeneity across the selection of commodities, time period analyzed, test methodology, speculation proxies, and the nature of price data (price levels, returns, volatilities). In part, these issues are common to all empirical testing in the field of finance, but two issues are specific to commodities: First and most important, the computation of feasible returns on futures positions is not as clear-cut as for traditional financial assets. It requires clear specification (e.g., type of contracts, maturities, and roll strategies), which is often missing. Worse, in several (even published) studies, the nature of price data is not even described: spot or futures, price levels or returns, discrete observations or time averages?^{2} Unfortunately, the documentation of data is fairly poor in many studies, which precludes a comparison of results on a priori grounds. Second, speculation proxies rely on statistical information (e.g., position data), which is sometimes too ambiguous to relate to commonsense or consensus views about speculation. Therefore, the robustness of empirical results with respect to various proxies seems essential.^{3}

To control for the various effects that might affect the empirical results, this article aims to provide a set of tests applied to a large set of commodities by using

• A common sample period;

• A single test methodology (Granger causality [GC] tests and vector autoregression [VAR] variance decomposition);

• A unified sample of commodity futures returns (i.e., the same construction method of single commodity returns, not indexes, and three maturities); and

• A set of homogeneous, direct speculation proxies from a single data source.

Therefore, our contribution is not in applying new tests or using new data, but rather in making the empirical results from a set of standard tests comparable.

GC tests have become a fairly standard methodology in the literature for addressing the dynamic relationship between speculation and price formation. Of course, without imposing structural restrictions from a theoretical model, such an analysis is never a test of economic causation, but rather a test of the predictive power of the variables of interest. Among the most frequently cited are the following: Sanders and Irwin [2011a] analyzed the relationships among the net long positions held by swap dealers as indicative for index-related investments and examined nearby futures returns for 14 commodities from June 2006 to December 2009. Their system of Granger-style causality tests was unable to reject the null hypothesis that positions do not lead returns. Moreover, they found a negative relationship running from position size to market volatility. The same authors (Sanders and Irwin [2011b]) used new index position data for 2004 to 2005 to test their impact on grain futures markets before the 2007–2008 price boom. GC tests were unable to uncover whether the increased index participation affects grain futures prices. Tse and Williams [2013] used intraday futures data from 2006 to 2010 to test the relationship between commodity index-linked and nonindex-linked futures prices. They were able to reject the null hypothesis (no GC) for data frequencies up to one hour, but not for daily data. Similarly, the economic significance of the results virtually vanishes within one hour. Gilbert and Pfuderer [2014b] analyzed the impact of index fund positions on nearby futures returns for the main U.S. grains and oilseed products from January 2006 to December 2011. They compared GC and instrumental variable tests and found that GC effects cannot be observed in the data, whereas the alternative tests—which account for contemporaneous effects—reveal some price impact. Aulerich, Irwin, and Garcia [2013] used nonpublic data from the U.S. 
Commodity Futures Trading Commission’s (CFTC’s) Large Trader Reporting System from 2004 to 2009 to assess the impact that aggregate daily net flows into index investments have on futures returns. GC tests rejected the null hypothesis (no impact) in 3 of the 12 markets and showed negative and very small coefficients. Büyükşahin and Harris [2011] addressed the question of whether noncommercial trading positions Granger cause crude oil futures price changes in the 2000–2008 period; the tests do not reveal such a relationship. It is apparent from this short overview that observation periods, frequencies, research objectives, and speculation measures are fairly different across the studies.

Lehecka [2015] detailed a study that is closely related to ours. The author analyzed a comparable set of commodities over a similar time horizon as that used in our study, but analyzed the GC of speculation with respect to futures price *levels*, not returns and their volatility.^{4} The author analyzed a battery of disaggregate hedging and speculation variables and covered a range of commodities comparable to (although slightly smaller than) what we do here. He concluded that hedging and speculative positions may not be helpful in explaining prices and strongly emphasized the observation that prices have predictive power for position changes. Hence, the focus of the two articles is fairly different.

This article is structured as follows: In the next section, we describe the data used in this study, the specification of the speculation measures, and the test methodology. Descriptive statistics of the speculation measures can be found in the third section, and the empirical findings are discussed in the fourth section. The main findings are summarized in the final section.

## DATA AND METHODOLOGY

The most common test in addressing the question of investors’ speculative effects on commodity futures returns is the bivariate GC test. We would like to emphasize here that these tests do not test causality in any epistemological sense, but rather assess the predictive power of one time series with respect to another. Thus, the causality test is a test of the temporal leadership of two series based on correlations at various lags. Standard GC tests require stationary data, a requirement that our speculation proxies do not always satisfy. However, a simple extension of the standard test, which we apply where the speculation proxy contains a unit root, is available from Toda and Yamamoto [1995].

### Speculation Measures (Proxies)

There is no consensus among academics, practitioners, or the public regarding what (excessive) speculation means, what types of investment strategies it includes, and how it should be measured. A detailed discussion is beyond the scope of this study, but the interested reader can find a useful survey on the various definitions of speculation, issues related to commercials versus noncommercials, and their imperfect translation into the categories of hedgers versus speculators in a paper by Szado [2011]. Given these ambiguities, empirical studies should not rely on a single measure and calibration of (excess) speculation. We use three measures (*proxies*). The first is the standard Working T (WT) index, originally suggested by Working [1960] and since used in numerous empirical studies; the second measure is simply the percentage of total, long and short, speculation in relation to total open interest (SOI); and the third measure is a measure called *speculation pressure* (SP).

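The WT index relates unnecessary long or short speculation to the total amount of hedging. It can therefore be interpreted as a measure of *excess* speculation. The formula is given by

$$
WT = \begin{cases} 1 + \dfrac{SS}{HL + HS} & \text{if } HS \geq HL, \\[6pt] 1 + \dfrac{SL}{HL + HS} & \text{if } HS < HL, \end{cases}
$$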

where *SS* (*SL*) denotes speculators short (long) and *HL* (*HS*) denotes hedgers long (short).

Intuitively, if there is short-hedging pressure in a commodity (short positions exceed long positions, as is mostly the case for agricultural futures), there is an economic need for long speculation to balance out the positions. Short speculation is therefore regarded as unnecessary or excessive and put in relation to total hedging in the WT index. In the case of long-hedging pressure, the WT index puts (unnecessary or excess) long speculation in relation to total hedging.
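The case distinction described above can be illustrated with a minimal sketch. It assumes the standard form of Working's index (one plus excess speculation over total hedging); the function name and the position counts are hypothetical and serve only as an illustration.

```python
def working_t(sl: float, ss: float, hl: float, hs: float) -> float:
    """Working's T index: 1 + excess speculation / total hedging.

    sl, ss: speculators long and short; hl, hs: hedgers long and short.
    """
    total_hedging = hl + hs
    if total_hedging <= 0:
        raise ValueError("total hedging must be positive")
    # Short-hedging pressure (hs >= hl): long speculation is needed to
    # balance the market, so short speculation counts as excess.
    # Long-hedging pressure (hs < hl): long speculation counts as excess.
    excess = ss if hs >= hl else sl
    return 1.0 + excess / total_hedging


# Hypothetical contract counts, for illustration only.
wt_short_pressure = working_t(sl=30_000, ss=10_000, hl=40_000, hs=60_000)
wt_long_pressure = working_t(sl=30_000, ss=10_000, hl=60_000, hs=40_000)
print(wt_short_pressure, wt_long_pressure)
```

With the same speculative positions, the index is higher under long-hedging pressure in this example because the larger (long) side of speculation is then the one treated as excess.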

It should be noted that the WT index must be interpreted as an upper bound on excessive speculation and as a purely technical measure without much economic content: The index could be erroneously interpreted in a static way, namely, that commercials (hedgers) trade their positions among themselves and, at the end, transfer their net position to speculators. However, this is not how markets actually work: Speculators are counterparties during the entire process of commercial hedging and form an essential component in the matching of counterparties, without which the process of risk intermediation (from the mismatch of position sizes, maturities, and market timing) would not work and, hence, the market would not exist. It is worth noting that this interpretation is reflected in Working’s own wording: “Indeed, the speculative index itself is a direct measure of the amount of that ‘excess’ [speculation]. But at least a large part of what may be called technically an ‘excess’ of speculation is economically necessary” (Working [1960], p. 197).

The WT index is economically meaningful in the sense that it relates speculation to hedging. Sometimes, in the public discussion, it is the amount of speculation, per se, that is criticized. For this purpose, we use an alternative and much simpler measure: the percentage of total, long and short, speculation augmented by the noncommercial spread positions (SSP), in relation to total open interest (TOI), which for consistency must be doubled^{5}:

$$
SOI = \frac{SL + SS + 2\,SSP}{2\,TOI}
$$

The measure is called *speculative open interest* (SOI). It does not address the imbalance between long and short positions. To account for this, we use a third proxy, called net SP, which represents the *net* long position of speculators divided by total speculation^{6}:

$$
SP = \frac{(SL + SSP) - (SS + SSP)}{(SL + SSP) + (SS + SSP)} = \frac{SL - SS}{SL + SS + 2\,SSP}
$$

where each side of speculation is augmented by the noncommercial spread positions (they cancel out in the numerator).^{7} In contrast to the WT index, SP is not related to hedging; it is a pure measure of speculators’ net position in futures contracts.

Each of the three proxies measures a different aspect of speculation; hence, they should be considered complementary.
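The two simpler proxies can be sketched as follows. The sketch assumes that the spread positions enter both the long and the short side of speculation, so that they appear twice in the SOI numerator and in the SP denominator and cancel in the SP numerator. The function names and the position counts are hypothetical.

```python
def speculative_open_interest(sl: float, ss: float, ssp: float, toi: float) -> float:
    """Share of long and short speculation (incl. spreads) in doubled open interest."""
    # Assumption: spreads are added to each side, hence the factor 2.
    return (sl + ss + 2.0 * ssp) / (2.0 * toi)


def speculation_pressure(sl: float, ss: float, ssp: float) -> float:
    """Net long position of speculators divided by total speculation."""
    long_side = sl + ssp
    short_side = ss + ssp
    # The spread positions cancel in the numerator but not in the denominator.
    return (long_side - short_side) / (long_side + short_side)


# Hypothetical contract counts, for illustration only.
soi = speculative_open_interest(sl=30_000, ss=10_000, ssp=5_000, toi=100_000)
sp = speculation_pressure(sl=30_000, ss=10_000, ssp=5_000)
print(soi, sp)
```

By construction, SP lies between -1 (speculators hold only short positions) and +1 (only long positions), whereas SOI carries no sign information.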

### Speculation: Commitment of Traders and Supplemental COT Open Interest Data

All three proxies rely on an adequate measurement of speculation and hedging. As has become standard in the empirical literature, the measures are calculated using the weekly Commitment of Traders (COT) reports and, since 2007, the Supplemental Index Traders reports, released by the CFTC.

**COT report.** This report contains each Tuesday’s open interest (number of outstanding contracts) for U.S. exchanges on which 20 or more traders hold positions equal to or above the reporting levels established by the CFTC. Since March 14, 1995, the Futures-and-Options-Combined Report has been released; this provides an aggregation of futures market open interest and delta-weighted option market open interest. The published open interest for each market is aggregated across all contract maturities in both reports. The weekly reports are released on Friday at 3:30 p.m. eastern time.

The combined COT report classifies the positions into commercials, noncommercials, and nonreporting. For each group, the respective number of long and short contracts is reported separately, and the aggregate of long and short positions adds up to the market’s total open interest. Following common practice in the empirical literature, *commercials* are considered hedgers, whereas *noncommercials* are classified as speculators. However, the group of *nonreporting* traders cannot be easily classified as hedgers or speculators without strong assumptions. Sanders, Irwin, and Merrin [2010] pointed out that the speculation index is not particularly sensitive to the assignment of the nonreporting traders. For that reason, this group is omitted in computing our speculation measures.

**Supplemental COT report.** Since January 5, 2007, the CFTC has published a supplemental COT (SCOT^{8}) report, which releases the positions of *index traders* separately from the noncommercial and commercial positions; the nonreportable positions are not affected. The data are calculated back to January 3, 2006.

The SCOT index traders category includes positions from COT’s commercial *and* noncommercial traders:

• COT commercials include swap dealers, which are classified as either index or nonindex traders. The index-related swap dealers are reclassified into the new category in the SCOT report. It has been argued (CFTC [2006]) that their hedging activity in the futures markets mostly originates from index-related over-the-counter (OTC) products (offered by banks and brokers to financial investors) and as such does not represent classical hedging from positions in the physical commodity market.

• COT noncommercials include money managers (MMs), which are classified as either index or nonindex managers. The index-related MMs are reclassified into the new category in the SCOT report. It has been argued that MMs with index positions in commodities have different investment objectives than traditional speculative MMs in commodity futures: They take long-only positions without directional bets and without leverage (Stoll and Whaley [2011] put forward strong arguments in this direction).

Thus, there are arguments in favor of and against including the two categories of index traders in measuring speculation. Because our primary focus is on the impact of noncommercial activity on futures prices, less so on the distinction between traditional speculation and index trading, we prefer a proxy that is rather broad and possibly biased toward “too much” speculation. Therefore, our speculation measure includes *both* index trader categories of the SCOT reports (index-related swap dealers and MMs) in addition to the noncommercial category. However, for two of our speculation measures (SOI and SP),^{9} we reverse the bias and perform robustness tests by excluding index traders.

Summing up, long and short speculation (SS, SL) in our WT and SP measures includes the sum of noncommercial and *all* index trader positions when using the SCOT data. Compared to the COT classification, it is the group of *index-related swap dealers* that makes the difference: They are eliminated from the hedgers and added to the speculators.^{10}

Thus, we use three proxies of speculation (WT, SOI, and SP) applied to two hedger/speculator classifications (based on COT and SCOT reports).

### Commodity Futures Contracts

We have selected all 28 commodities on which futures were traded in the time period from January 2006 to March 2015 and for which COT position data are reported. The SCOT data are available only for a subset of 12 commodities that are included in standard commodity indexes. Our sample length is determined by the availability of the SCOT data. Although the combined COT-position data are available since 1995, we use a *common* sample period for the COT- and SCOT-based speculation measures for the purpose of comparison.

We use three contract maturities: 3, 6, and 12 months (specifically, the maximum numbers of days to maturity are 52, 183, and 365).^{11} However, from our preselected 28 commodities, we exclude three energy commodities from our analysis because of liquidity constraints and data limitations (DL, EN, and XB). Hence, our final sample includes 25 commodities for the COT data and a subset of 12 commodities for the SCOT data, with a total of 71 and 35 contract maturities, respectively.

### Commodity Futures Prices and Returns

Futures price series are constructed by applying the rollover procedure, which is familiar in the empirical commodity literature. The contracts are rolled into the next available maturity in the month in which the shortest contract expires; a fixed business day is selected for the rollover.^{12} Using wheat as an example, the contract is rolled on the 12th business day of February, when the expiration month switches from March to May. On the 12th business day of April, the expiration month switches from May to July. This expiration applies until the 12th business day in June, when the expiration month switches to September, and so on.
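The wheat example can be turned into a small sketch of the roll schedule. It assumes delivery months of March, May, July, September, and December, a roll on the 12th business day of the month preceding delivery (extrapolated from the three rolls listed above), and business days counted as Monday to Friday without exchange holidays. All names are hypothetical.

```python
from datetime import date, timedelta

# Assumed wheat delivery months: March, May, July, September, December.
DELIVERY_MONTHS = [3, 5, 7, 9, 12]
ROLL_DAY = 12  # 12th business day, as in the wheat example


def nth_business_day(year: int, month: int, n: int) -> date:
    """Return the n-th weekday (Mon-Fri) of a month; holidays are ignored."""
    day = date(year, month, 1)
    count = 0
    while True:
        if day.weekday() < 5:
            count += 1
            if count == n:
                return day
        day += timedelta(days=1)


def active_contract(today: date) -> tuple:
    """Return (year, delivery month) of the contract held on `today`."""
    for month in DELIVERY_MONTHS:
        # Assumption: the roll takes place in the month before delivery.
        roll_date = nth_business_day(today.year, month - 1, ROLL_DAY)
        if today < roll_date:
            return (today.year, month)
    # After the last roll of the year, hold next year's first contract.
    return (today.year + 1, DELIVERY_MONTHS[0])


print(active_contract(date(2010, 2, 15)))  # day before the February roll
print(active_contract(date(2010, 2, 16)))  # 12th business day of February 2010
```

A full price series would splice the closing prices of the contracts returned by such a schedule; how returns are treated on the roll day itself is not specified here and is left open.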

All prices are denoted in U.S. dollars and were downloaded from Thomson Reuters Datastream. To match returns with the weekly position data available from the CFTC COT reports used for our speculation proxies, we compute weekly Tuesday-to-Tuesday log returns.

### GC Tests

We apply standard GC tests to two pairs of weekly series: returns and position data, and return variances and position data. Returns are measured as log changes of Tuesday closing prices, and weekly volatilities are proxied by squared log returns (not their square root).^{13}
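The two dependent variables can be computed from a series of Tuesday closing prices as in the following sketch. The function names and the prices are hypothetical; the variance proxy is the squared weekly log return, as described above.

```python
import math


def weekly_log_returns(tuesday_closes):
    """Tuesday-to-Tuesday log returns from consecutive closing prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(tuesday_closes, tuesday_closes[1:])]


def variance_proxy(returns):
    """Weekly variance proxy: the squared log return (no square root taken)."""
    return [r * r for r in returns]


# Hypothetical Tuesday closes, for illustration only.
closes = [100.0, 110.0, 99.0]
returns = weekly_log_returns(closes)
variances = variance_proxy(returns)
print(returns, variances)
```

Log returns add up over time, so the sum of the weekly returns equals the log change between the first and the last Tuesday close.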

The timing of the variables needs some explanation; it is surprising that this crucial topic is mostly not addressed in empirical studies.^{14} The weekly published COT (and SCOT) reports contain the position data for Tuesday, but they are not released until Friday. Depending on how well and quickly information is processed in commodity markets, the release may have an impact on prices. If there is an information (aggregation) effect, the Tuesday positions in *t* (released a few days later) would have predictive power for the weekly futures return from *t* to *t* + 1, but this effect is unrelated to the economic causation of speculative positions on subsequent returns and volatility. In this case, to stay conservative in finding causal effects running from positions to returns and volatilities, it would be preferable to consider the positions in *t* (released a few days later) and the *subsequent* returns from *t* + 1 to *t* + 2 as contemporaneous. However, if markets are efficient and information is processed in the market without publication lag, this procedure would wash out a possible causal effect. In this case, we should consider the positions in *t* and the returns from *t* to *t* + 1 as contemporaneous. Because there is no direct evidence of a publication effect in the literature, we chose the second “efficient market” view.^{15}

The optimal lag lengths from the VAR used in the GC tests are determined from the Schwarz criterion.
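A minimal sketch of the procedure is given below: the lag length of a bivariate VAR is chosen by the Schwarz criterion, and the GC hypothesis is then tested with an F statistic comparing the restricted and unrestricted return equations. The sketch uses NumPy and simulated data in which the proxy leads the return by construction; the function names, the maximum lag of eight weeks, and the simulation parameters are assumptions for illustration, not the specification used in this article.

```python
import numpy as np


def select_lag_bic(data, max_p):
    """VAR order minimising the Schwarz criterion on a common sample."""
    T = data.shape[0]
    Y = data[max_p:]
    best_p, best_bic = None, np.inf
    for p in range(1, max_p + 1):
        lags = [data[max_p - k:T - k] for k in range(1, p + 1)]
        X = np.column_stack([np.ones(T - max_p)] + lags)
        B, *_ = np.linalg.lstsq(X, Y, rcond=None)
        U = Y - X @ B
        sigma = U.T @ U / len(Y)
        bic = np.log(np.linalg.det(sigma)) + np.log(len(Y)) / len(Y) * B.size
        if bic < best_bic:
            best_p, best_bic = p, bic
    return best_p


def rss(X, target):
    """Residual sum of squares of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return float(resid @ resid)


def granger_f(y, x, p):
    """F statistic for H0: the p lags of x do not help to predict y."""
    T = len(y)
    cols = [np.ones(T - p)]
    cols += [y[p - k:T - k] for k in range(1, p + 1)]
    cols += [x[p - k:T - k] for k in range(1, p + 1)]
    X, target = np.column_stack(cols), y[p:]
    rss_u = rss(X, target)
    rss_r = rss(X[:, :p + 1], target)  # constant and own lags only
    dof = len(target) - X.shape[1]
    # Under H0 the statistic is approximately F(p, dof) distributed.
    return ((rss_r - rss_u) / p) / (rss_u / dof)


# Simulated weekly data: the proxy x leads the return y by one week.
rng = np.random.default_rng(0)
T = 480
e = rng.standard_normal((T, 2))
x, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + e[t, 0]
    y[t] = 0.5 * x[t - 1] + e[t, 1]

p = select_lag_bic(np.column_stack([x, y]), max_p=8)
f_xy = granger_f(y, x, p)  # proxy -> return
f_yx = granger_f(x, y, p)  # return -> proxy
print(p, f_xy, f_yx)
```

In this simulated setting the statistic for the direction from proxy to return should be far larger than the one for the reverse direction. A P-value would follow from the F distribution with the degrees of freedom noted in the code.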

The VAR estimation results are used to perform a variance decomposition for those cases in which the null of no-causality from speculation to returns or volatilities can be rejected. A Cholesky decomposition is applied to the error matrix. With respect to the ordering of variables, the speculation proxy is selected as the first variable, the returns as second. In our exhibits, we only display the maximum-variance share in the returns explained by the speculation proxy across the periods.
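The variance decomposition can be sketched as follows for given VAR coefficient matrices and a residual covariance matrix. The speculation proxy is ordered first and the return second, so that the Cholesky factor lets the proxy shock enter the return equation contemporaneously. The coefficient values, the covariance matrix, and the 12-week horizon are hypothetical.

```python
import numpy as np


def fevd(coefs, sigma, horizons):
    """Forecast error variance shares, shape (horizons, variable, shock)."""
    n = sigma.shape[0]
    # Lower-triangular factor: the first variable's shock affects all equations.
    P = np.linalg.cholesky(sigma)
    # Moving-average matrices: Phi_0 = I, Phi_h = sum_k Phi_{h-k} A_k.
    phis = [np.eye(n)]
    for h in range(1, horizons):
        phis.append(sum(phis[h - k] @ coefs[k - 1]
                        for k in range(1, min(h, len(coefs)) + 1)))
    theta_sq = np.array([(phi @ P) ** 2 for phi in phis])
    cum = theta_sq.cumsum(axis=0)
    return cum / cum.sum(axis=2, keepdims=True)


# Hypothetical VAR(1): row 0 is the proxy equation, row 1 the return equation.
A1 = np.array([[0.9, 0.0],
               [0.1, 0.0]])
sigma = np.array([[1.0, 0.2],
                  [0.2, 1.0]])

decomp = fevd([A1], sigma, horizons=12)
# Maximum share of return variance attributed to the proxy across horizons.
max_share = decomp[:, 1, 0].max()
print(max_share)
```

Because the proxy is ordered first, any contemporaneous correlation between the two residuals is attributed to the proxy shock; reversing the order would in general lower the share reported for the proxy.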

## DESCRIPTIVE STATISTICS

Exhibit 1 (Panels A to C) provides descriptive statistics of the three speculation proxies. We omit the statistics of the futures returns because they are widely documented in the empirical literature and are not substantially different for our sample.

The autocorrelation coefficients are reported in the last three columns; they are significantly different from zero (indicated by bold figures) across all proxies and commodities. Most of the AC(1) coefficients are close to one and decrease slowly, which indicates a degree of persistence in most series. Because the application of standard GC tests requires stationary data, we have to test for a unit root. Of course, one could argue that the three measures represent *relative* shares of speculation and exhibit upper and lower bounds by definition, and thus the series are stationary by construction. However, in finite samples, the series may well fluctuate in a range of values such that tests are unable to reject the null of a unit root.

The results of augmented Dickey–Fuller unit root tests for nonstationarity of the speculation measures are not displayed in this article^{16}; they confirm that nonstationarity cannot be rejected in approximately one third of the cases (at a 99% confidence level, using the Schwarz criterion)—that is, they behave *as if* they are nonstationary. At a 99% (in parentheses: 90%) significance level, the null of a unit root cannot be rejected for 10 (4), 7 (3), and 11 (4) of the 28 COT series, and for 7 (3), 5 (1), and 4 (0) of the 16 SCOT series. Thus, there *is* evidence for nonstationarity, but in most cases only at a relatively low significance level. Because the construction of the speculation proxies is fairly different, it is not surprising that the time-series characteristics are different across proxies and commodities. Only KC and CL are nonstationary across all three proxies.

In the cases in which we are unable to reject a unit root with 99% confidence, an augmented test of GC must be applied, as suggested by Toda and Yamamoto [1995]. In this test, the maximum order of integration of the nonstationary variable (one in our case, because the series are *I*(1)) is added to the optimal lag length of the original VAR model. However, the GC null hypothesis is tested on only the original number of lags. Our empirical results rely on the Toda–Yamamoto (T-Y) test for cases in which it is appropriate.
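The lag augmentation can be sketched with a Wald statistic computed from the return equation. The regression includes p + d lags of both variables, where d is the maximum order of integration, but only the first p lags of the proxy are restricted under the null hypothesis. The sketch uses NumPy and simulated data with a random-walk proxy; the function name and all parameter values are assumptions for illustration.

```python
import numpy as np


def toda_yamamoto_wald(y, x, p, d_max=1):
    """Wald statistic for H0: the first p lags of x do not help to predict y.

    The equation is estimated with p + d_max lags; the extra lags stay
    unrestricted. Under H0 the statistic is asymptotically chi-square(p).
    """
    m = p + d_max
    T = len(y)
    cols = [np.ones(T - m)]
    cols += [y[m - k:T - k] for k in range(1, m + 1)]
    cols += [x[m - k:T - k] for k in range(1, m + 1)]
    X, target = np.column_stack(cols), y[m:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    s2 = resid @ resid / (len(target) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    # Columns: constant, m lags of y, then m lags of x; restrict x lags 1..p.
    idx = np.arange(1 + m, 1 + m + p)
    b = beta[idx]
    return float(b @ np.linalg.solve(cov[np.ix_(idx, idx)], b))


# Simulated data: x is a random walk (I(1)); y responds to the lagged change in x.
rng = np.random.default_rng(1)
T = 480
x = np.cumsum(rng.standard_normal(T))
y = np.zeros(T)
y[2:] = 0.5 * (x[1:-1] - x[:-2]) + rng.standard_normal(T - 2)

wald = toda_yamamoto_wald(y, x, p=2, d_max=1)
print(wald)  # compare with the chi-square(2) 10% critical value of about 4.61
```

The unrestricted extra lag is what allows the usual chi-square distribution to be used although the proxy is estimated in levels.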

## EMPIRICAL FINDINGS

### Speculative Effects on Returns

How does speculation affect realized returns (log price change) in subsequent periods? The results are displayed in Exhibit 2; Panel A contains the COT-based speculation measures, and Panel B the SCOT-based measures. The first two columns of each table display the name of the commodity and the three analyzed maturities, followed by three major columns containing the results of the GC tests for the three speculation measures. Each of these major columns is subdivided into three columns: The first shows the probability level (P-value) for the null hypothesis that speculation does not Granger cause returns; if the null hypothesis can be rejected at the 10% significance level, the second column displays the sign of the relationship (sum of VAR parameters), and the third column contains the percentage return variance explained by speculation.

The general observation from the tables is that the number of significant effects running from speculation to returns is small. Specifically,

• If speculation is measured by the WT index, 3 (2) commodities^{17} and 6 (2) maturities exhibit a significant relationship that is negative in all cases. Thus, more unnecessary speculation Granger causes lower returns (i.e., futures prices decrease). The explained variance is below 2.2%. There is an overlap of significant effects for the COT and SCOT speculation proxies for a single commodity only: live cattle (LC; for the longest maturity);

• In the case of SOI as a speculation measure, 6 (6) commodities and 14 (10) maturities exhibit a significant relationship, which is negative without exception; that is, more speculation is associated with lower returns. However, the explained variance is extremely small (below 2.1%), with the exception of wheat using the SCOT series, for which the explanatory power is in the range of 6% to 6.5%. There is an overlap of significant effects for the COT and SCOT speculation proxies for cotton (for all three maturities). Overall, the SOI proxy seems to have the most pronounced effects for agricultural futures (cotton, wheat, corn, rice) among all the measures;

• If SP is used as speculation measure, that is, if the sign of *net* speculation is taken into account, 4 (4) commodities and 8 (8) maturities exhibit a significant relationship, which is positive except in a single case. That is, positive SP (an overhang of long positions) is associated with higher returns (futures prices increase), and negative pressure is associated with lower returns (futures prices decrease). The maximum explained variance is 2.8%. There is an overlap of significant effects for the COT and SCOT speculation proxies for two commodities: LC (for the longest maturity) and coffee (for the second and third maturity).

Overall, the results can be interpreted as follows: Positive GC effects of speculation on subsequent returns (i.e., positive price effects) can only be found for the SP measure. Thus, it appears that the sign of net speculation has some explanatory power for the sign of subsequent returns. For the SCOT-based proxies, with a stronger bias toward speculation, the effect can be observed across all three maturities for BO (soybean oil) and KC (coffee) and for a single maturity for LC and lean hogs (LH). The effect for KC and LC can also be observed for the more conservative COT-based proxy. However, the explained variance is not more than 2.8%.

In all other tests, more speculation leads to lower subsequent returns. This is particularly true for SOI as a proxy variable, so the general claim that more speculation leads to higher prices does not seem to be justified; however, the proxy is only of limited economic relevance. The WT index does not reveal positive effects either: Significant negative effects are observed in the SCOT data for a single maturity in two commodities (LH and LC, again), and all the other significant effects are observed in the COT data in precious metals, with the exception of a single maturity for LC (again). The explained variance of WT for LH and LC is again approximately 1.6%.

The SCOT-based speculation measures, which are more biased toward speculation, do not appear to exhibit stronger effects; quite the contrary is true. This means that OTC index investing—measured indirectly through the activity of index swap dealers—does not seem to have an impact on our findings. It is interesting to observe that significant effects are observed across all three measures for LC and LH, although not for all maturities and with mixed signs. No other study reports this finding. Where significant GC effects are found, the explained return variance is small, with a typical value between 1% and 3%. Thus, speculation does not seem to be a major individual driver of commodity futures returns. The public perception that more speculation leads to higher prices cannot be confirmed in general; there is some evidence for a single proxy (net SP), but the other two proxies lead to opposite conclusions.

As a test for robustness, we exclude index traders from two of our SCOT-based speculation measures (which in the previous tests all include index trading to get a possibly upward-biased measure). As expected, the number of significant results decreases; the only effect remains for cotton (SOI measure, for all maturities) and LC (SP measure, now significant for all maturities).^{18}

### Speculative Effects on Variance

How does speculation affect the variance of realized returns in subsequent periods? Here, an inspection of the results displayed in Exhibit 3 (having the same structure as the previous exhibits) reveals that the number of statistically significant effects is much larger than in the results for returns reported previously. We find statistically significant effects for 40% (34%) of the analyzed contract maturities if COT (SCOT) data are used.^{19} Furthermore, the results are more mixed across the individual commodities. The findings can be summarized as follows:

• If speculation is measured by the WT index, 15 (7) commodities and 38 (18) maturities exhibit a significant relationship, which is negative for 11 (5) and positive for 4 (2) commodities out of 25 (12).^{20} Thus, for the majority of commodities, more unnecessary speculation Granger causes lower volatility in the subsequent weeks. The explained variance does not exceed 8.1%. There is an overlap of significant effects for the COT and SCOT speculation proxies for seven commodities: LH and cocoa (positive), and Chicago and Kansas wheat, soybean oil and meal, and sugar (negative).

• For SOI as a speculation measure, 12 (6) commodities and 28 (11) maturities exhibit a significant relationship, which is negative for 9 (2) and positive for 3 (4) commodities. Thus, the use of SCOT-based measures leads to a larger number of volatility-increasing effects. The explained variance does not exceed 6.5%. There is an overlap of significant negative effects for the COT and SCOT speculation proxies for two commodities: Kansas wheat and soybean oil (for all maturities).

• If SP is used as proxy (i.e., the sign of *net* speculation is considered to be relevant), 11 (5) commodities and 24 (8) maturities exhibit a significant relationship, which is negative in 5 (3) and positive in 5 (1) cases.^{21} Of course, it is a priori not clear whether an overhang of long^{22} or short positions should be associated with a higher volatility; it could well be that a large overhang with either sign is associated with a large volatility. Such an effect would imply a nonlinear relationship and would require a different test procedure. The mixed results (signs) that are in apparent contrast to those for the other two proxies could be a consequence of such an effect. There is an overlap of significant effects for the COT and SCOT speculation proxies for two commodities, feeder cattle and cocoa (both negative); however, this holds for a single maturity only.

Overall, the results can be summarized as follows: If SOI is regarded as a valid proxy and SCOT data are analyzed, one would be tempted to conclude that more speculation leads to higher futures price volatility. However, the picture changes completely if the speculation is related to commercial positions: The WT index indicates a negative volatility effect for most commodities, except for LH and cocoa. Overall, the results are extremely robust across the maturities of the contracts (for positive and negative effects) and the COT and SCOT data. The only heterogeneous results are reported for the SP proxy, which would not be surprising if a nonlinear relationship between SP and volatility should exist.

In general, the explanatory power of the speculative proxies is considerably higher for the volatilities than for the returns. However, the fraction of variance explained is in a typical range of 2%–6% and does not exceed 8.1%, which indicates a rather limited role of speculative effects in explaining futures volatility.

Again, as a test for robustness, we exclude index traders from two of our SCOT-based speculation measures. For the SOI measure, only the negative volatility effects are preserved, and negative effects are observed for three additional commodities. The results are mixed for the SP measure, in which sugar exhibits positive effects across all maturities.^{23} Given the importance of index trading, it is not surprising that the empirical results are affected by excluding this category. Given the debate regarding whether index trading is speculation or not (which is not within the scope of the article), it seems to be more appropriate to use an upward-biased measure.

### Special Results for Index Commodities?

Are the empirical results stronger for commodities that are included in popular commodity indexes? If index investing has special pricing effects on commodity futures, then the number of significant effects or the explained variance should be larger for those commodities for which SCOT data are available. Moreover, the SCOT-based speculation proxies should also provide more precise estimates of speculation. Recall that SCOT statistics were introduced for those commodities that are subject to significant index trading.

The number of significant results is measured by the total number of contracts (a total of 71 maturities for the 25 COT commodities and 35 maturities for the 12 SCOT commodities) for which statistically significant GC is observed at the 10% significance level.

For the return analysis (Exhibit 2, Panels A and B), the number of significant results is indeed larger for the SCOT data than for the COT data (21% versus 13%).^{24} However, this is only true if SP and SOI are used as speculation proxies, not for the WT index. Therefore, speculation proxies unrelated to commercial positions might indeed indicate more significant return effects in index-related commodities. Notice that this conclusion is entirely consistent with the findings in our robustness tests, in which index traders are excluded from the SP and SOI measures (not reported, but briefly summarized at the end of the last section).
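The aggregation behind such percentages (the share of significant contracts per proxy, then averaged across proxies, as described in endnote 24) can be sketched as follows. The p-values are made-up placeholders, not results from the study, and the function name is hypothetical.

```python
ALPHA = 0.10  # significance level used in the text

# Hypothetical GC-test p-values, one per contract (commodity/maturity pair),
# grouped by speculation proxy. Illustrative numbers only.
p_values = {
    "WT":  [0.03, 0.40, 0.08, 0.55, 0.72],
    "SOI": [0.01, 0.09, 0.30, 0.04, 0.61],
    "SP":  [0.20, 0.07, 0.95, 0.48, 0.33],
}

def share_significant(pvals, alpha=ALPHA):
    """Fraction of contracts whose GC test rejects at the given level."""
    return sum(p < alpha for p in pvals) / len(pvals)

# Step 1: simple average across contracts, separately for each proxy.
per_proxy = {proxy: share_significant(p) for proxy, p in p_values.items()}

# Step 2: average the per-proxy shares across proxies.
overall = sum(per_proxy.values()) / len(per_proxy)

print(per_proxy)
print(f"overall share: {overall:.0%}")
```

Averaging in two steps gives each proxy the same weight in the headline figure, so a single proxy can drive a difference between datasets, which is what the text reports for SP and SOI.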

In the case of the volatility analysis (Exhibit 3, Panels A and B), the picture is different: The number of significant results is smaller for the SCOT data across all three speculation proxies (35% versus 42%).

This conclusion differs from our robustness tests with respect to the SP and SOI measures (not reported, but briefly summarized earlier); there, we find that the number of significant relationships remains largely the same after elimination of index traders, whereas the observed effects (commodities, maturities, sign) partially change.

With respect to the explained variance, we observe the following differences between the COT and SCOT results: The mean (median) explained variance of *returns* is 2% (1.7%) for the SCOT data, slightly larger than the 1.6% (1.5%) for the COT data. The overall figure is small for both datasets, however. The mean explained variance of *volatilities* is virtually identical for the SCOT and COT data (i.e., 3.1%), and the median is slightly larger for the SCOT data (3.4% versus 3.2%).^{25} Thus, in terms of the explained variance, the SCOT dataset reveals slightly more explanatory power for the speculation proxies, but the absolute size of the figures does not indicate substantial differences.

We therefore conclude that our empirical findings, in terms of the strength of the observed causal effects, do not differ substantially between the COT and SCOT datasets. Thus, we do not find stronger effects for index-related commodities in our results.

## SUMMARY AND CONCLUSIONS

GC tests are very popular in the current discussion of the role of financial speculation in commodity futures markets. Unfortunately, the heterogeneity of tests in terms of speculation proxies, price variables, futures contracts, and analyzed time period makes it extremely hard to draw meaningful conclusions from the published results. Furthermore, the focus of many articles is not on individual commodities. The main contribution of our article is therefore to apply T-Y–augmented GC tests to a consistent set of futures returns and volatilities, for three maturities, using three speculation proxies applied to two sources of position data.
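For readers unfamiliar with the procedure, the general idea of a lag-augmented (Toda–Yamamoto) GC test can be sketched in a few lines. This is a simplified, standard-library illustration on simulated data, not the specification used in the article: the lag order `p`, the augmentation `d_max`, the homoskedastic Wald statistic, and the simulated series are all assumptions of the sketch.

```python
import random

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def ols_rss(rows, y):
    """Residual sum of squares of an OLS fit of y on the regressors in rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def ty_wald(y, x, p, d_max=1):
    """Wald statistic for the null 'x does not Granger-cause y'.

    The regression contains p + d_max lags of y and x, but only the first
    p lags of x are restricted to zero; the extra d_max lags stay in the
    model unrestricted. The statistic is compared with chi-square(p).
    """
    m = p + d_max
    full, restricted, target = [], [], []
    for t in range(m, len(y)):
        y_lags = [y[t - j] for j in range(1, m + 1)]
        x_lags = [x[t - j] for j in range(1, m + 1)]
        full.append([1.0] + y_lags + x_lags)
        restricted.append([1.0] + y_lags + x_lags[p:])  # drop x lags 1..p
        target.append(y[t])
    rss_u = ols_rss(full, target)
    rss_r = ols_rss(restricted, target)
    dof = len(target) - len(full[0])
    return (rss_r - rss_u) / (rss_u / dof)

def simulate(n=400, seed=0):
    """Simulated series in which lagged speculation lowers volatility."""
    rng = random.Random(seed)
    spec, vol = [0.0], [0.0]
    for _ in range(n - 1):
        spec.append(0.5 * spec[-1] + rng.gauss(0, 1))
        # spec[-2] is the previous period's speculation level
        vol.append(0.3 * vol[-1] - 0.6 * spec[-2] + rng.gauss(0, 1))
    return spec, vol

spec, vol = simulate()
wald_spec_to_vol = ty_wald(vol, spec, p=2)
wald_vol_to_spec = ty_wald(spec, vol, p=2)
# The 10% critical value of chi-square(2) is about 4.61.
print(f"speculation -> volatility: {wald_spec_to_vol:.1f}")
print(f"volatility -> speculation: {wald_vol_to_spec:.1f}")
```

Leaving the `d_max` augmentation lags unrestricted is what allows the Wald statistic to keep its usual chi-square limit when the series may be integrated, so no pretesting for unit roots or cointegration is needed before testing for GC.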

Our findings can be summarized as follows: There is a substantially higher degree of spillover effects from speculation to *volatilities* than from speculation to *returns* on futures markets. In the case of volatilities, we found statistically significant effects in approximately 40% of the contracts (COT data), compared to 20% for returns (SCOT data). There is essentially no return effect if the WT measure is used, some positive effects for the SP measure, and negative effects for the SOI measure. The volatility effects are mostly negative—that is, more speculation is followed by lower return volatility. This is particularly true if the WT index, which is widely regarded as the most meaningful measure, is used as the speculation proxy. Even where statistically significant effects are found, the explained variance is economically small or at best moderate: The typical values are in the range of 1%–3% for returns and 2%–6% for volatilities.

With respect to the individual commodities, two observations are striking: First, there are essentially no destabilizing effects of speculation with respect to agricultural commodity futures prices. Where significant effects are reported, they point rather in the opposite direction: More speculation is followed by lower returns (SOI measure) and lower volatilities, with very few exceptions. Second, destabilizing effects, if any, are more frequently observed in livestock, LC and LH in particular: The signs in the return effects are mixed, but are positive for several volatility effects (the WT and SOI proxies in the COT and SCOT data). This might be the first study to find effects with some persistence in this commodity group.

There is some, but less conclusive, evidence for destabilizing effects in the soft commodities: Coffee futures returns react positively to SP, but no volatility effects are observed. For sugar, negative and positive volatility effects are observed (negative for WT, positive for SOI). Cocoa volatility reacts positively to speculation measured by WT but not to the other two proxies. Thus, the overall picture is mixed here.

Compared to the findings in the empirical literature, as briefly reviewed in the introductory section, our results are consistent with the essentially nonexistent price effects with respect to commodity futures returns, which may be due to the weekly price data used (the consequence of using COT-based speculation measures). The volatility effects reported in this study are only partially in line with those in the literature; although most of our effects point toward stabilization, notable exceptions are found in livestock. This result has not been documented in the literature. None of the published articles analyze the sensitivity of the results for different speculation proxies and across futures maturities. Although it is not surprising to find that the specification of the proxy variable has a strong impact, the stability of the effects across maturities highlights the robustness of our empirical findings. Finally, our results also reveal that using CFTC’s SCOT database for calculating speculation measures, which allows for a reclassification of index swap dealers from commercials to speculators, leads to moderately stronger volatility effects of speculation. However, whether index investing should be regarded as classical speculation is a widely debated topic.

## ENDNOTES

We gratefully acknowledge the helpful comments of an anonymous referee. The study was financially supported by the Federal Commission for Technology and Innovation (CTI) under Project 168641 PFES_ES. We are grateful to the members of the expert panel for valuable comments—in particular, Alfred Bühler, Marc Engelhard, Martin Hess, and Peter Sigg.

^{1}A recent overview is provided by Lehecka [2015] and by many of the papers published in the “Understanding International Commodity Price Fluctuations” special issue of the *Journal of International Money and Finance* (Volume 42, April 2014). With respect to food commodities, the survey by Gilbert and Pfuderer [2014a] is informative.

^{2}It is worth mentioning that it is not uncommon in agricultural economics to work with price averages in empirical studies.

^{3}Haase, Seiler, and Zimmermann [2016] summarized the main quantitative and qualitative insights of approximately 100 (largely published) research papers on speculation and commodity prices (levels, returns, volatilities, and spillovers) covering the past decade. We find that the reported empirical results are fairly different in relation to whether indirect proxy variables or direct measures (U.S. Commodity Futures Trading Commission (CFTC) futures position data) are used for constructing speculation measures.

^{4}We have analyzed speculation shocks to price-level data using cointegration analysis and Gonzalo–Ng shock decomposition in a separate paper; see Haase, Seiler, and Zimmermann [2015].

^{5}The arithmetic of position accounting can be found in, for example, the work of Sanders, Boris, and Manfredo [2004].

^{6}Similar measures have been used in studies by Sanders, Boris, and Manfredo [2004] and Lehecka [2015]. The term is borrowed from *net hedging pressure* (introduced by Cootner [1960]), which is used to explain the Keynes–Hicks normal backwardation model of commodity term structure.

^{7}Spreading includes the simultaneous long and short futures and options positions in the same underlying commodity taken by noncommercial traders (in our classification: speculators).

^{8}The report is sometimes also referred to as the Commodity Index Trader (CIT) report in the literature.

^{9}For the Working speculative index (WT), excluding a single category (index traders) without adding it to the remaining category (i.e., commercials) does not make sense, by construction of the index. The index could no longer be interpreted (at least in Working’s way) if a single category of traders is skipped. Although one could argue that index trading is not necessarily “classical” speculation, it should definitely not be added as a whole to the commercial positions.

^{10}The index-related MMs are speculators in the COT, as well as in our SCOT-based classification.

^{11}Due to data constraints, the 12-month maturity must be dropped for 4 commodities (FC, palladium, platinum, and lumber).

^{12}The roll schedule applied to each commodity is displayed in an Appendix that is available upon request.

^{13}This is justified if the measurement interval is small and the expected return is close to zero. Because in general the variability of commodity returns seems to dominate expectations (risk premiums), this procedure seems adequate to us even if a week is not strictly a “small” time interval.

^{14}A notable exception is the paper by Sanders, Boris, and Manfredo [2004]; the study reveals that the timing of the COT data and returns has some impact on empirical findings.

^{15}Of course, this problem prevails whenever a stock variable (positions) must be matched with flow variables (returns or volatilities), even without publication lags. Taking first differences of the position data over the weekly return measurement interval does not solve the problem because our hypotheses to be tested are explicitly related to the *level* of speculation.

^{16}The results are available upon request.

^{17}The first figure refers to the COT results; the second (in parentheses) refers to the SCOT results here and in all subsequent interpretations.

^{18}Tables with the detailed results are available upon request.

^{19}Again, a significance level of 10% is selected.

^{20}The sign of the relationship is the same across all maturities for essentially all commodities, except for LC if the SP is used as speculation proxy.

^{21}As stated before, one commodity (LC) is indeterminate across maturities.

^{22}Net long (short) speculation increases (decreases) volatility for some commodities (e.g., soybean oil and meal, rice, sugar, and occasionally wheat), whereas it decreases (increases) volatility for others (e.g., cocoa, LH, feeder cattle, WTI oil, natural gas, and copper).

^{23}Tables with the detailed results are available upon request.

^{24}The percentages are computed as simple averages across commodities and contracts (for each proxy) and then averaged across proxies.

^{25}A breakdown of the results for the individual speculation proxies, however, reveals that the explanatory power of the WT index (which we regard as the economically superior proxy) is consistently smaller for the SCOT data. The results are largely driven by the SOI proxy.

**Disclaimer** The third author was a minority shareholder of CYD Research GmbH, a provider of quantitative investment solutions in commodity derivatives, until 2013.

- © 2017 Institutional Investor, Inc.