## Abstract

The authors discuss a network-based methodology that models hedge fund strategies across the superordinate–subordinate dimension to gain new insights into their interrelation. This methodology uncovers considerable misbehavior of various hedge fund strategies from the network perspective. Simply speaking, a misbehaving hedge fund strategy has undesired network proximity (similarity) with strategies from other classifications and/or undesired network-based risk properties. The authors conduct extensive static and dynamic (bootstrapping) analyses demonstrating misbehaviors for the full-sample data set. In addition, they demonstrate that numerous network-based behavioral properties of hedge fund strategies can explain future hedge fund returns. This aspect is of significant relevance, as it shows that network-based information has the potential to act as a value-adding warning indicator for funds of hedge funds. Summing up, the authors think that this article provides novel and valuable tools for hedge fund investors, managers, and analysts.

**TOPICS:** Real assets/alternative investments/private equity, performance measurement

Financial markets can be considered venues where numerous heterogeneous participants interact with each other. As such, financial markets easily qualify as complex systems that tend to organize in a hierarchical manner (Simon 1962). Loosely speaking, a hierarchically organized structure is made up of a few dominant (central) assets and many subordinate (peripheral) assets. Assuming that the object of interest is organized in such a manner, network-based analysis tools that aim to unveil patterns of hierarchical organization are the appropriate choice (Anderberg 1973). One of those tools is the concept of financial networks^{1} that performs a so-called hierarchical clustering analysis.

Given a set of assets or asset classes, the most common methodology for building financial networks starts with a dependence matrix, which is often a correlation matrix.^{2} In the next step, this dependence matrix is transformed into a distance matrix, measuring the proximity between assets in metric space. Finally, a hierarchical filtering method is applied to the distance matrix, which extracts the hierarchical “backbone” of the distance matrix. After this step, we are left with a financial network in a matrix format, a so-called adjacency matrix. This matrix entails all relevant information of the financial network.^{3} Of course, the adjacency matrix can be visualized, leading to a financial network graph in the narrow sense of the word.

The resulting financial network conveys three basic properties. First, the hierarchical “backbone” of the system under investigation is exposed, unveiling connections between assets across the superordinate–subordinate dimension (hierarchical clustering). Second, financial networks provide the centrality properties of the underlying assets, which closely relate to the hierarchical “backbone” of the system. Compared with the “classical” risk measures, such as correlation and CAPM beta, centrality scores assume that financial markets are governed by a specific hierarchical order, which is estimated by the financial network.^{4} Superordinate (dominant) assets are usually strongly central, whereas subordinate assets are less central (peripheral). Investors should care about this information because the centrality score of a considered asset quantifies its embeddedness intensity in the financial network, which is a proxy for the interconnectedness or contagion risk carried by this asset. Central or strongly interconnected assets can be more strongly affected by shocks, as shocks usually propagate faster to central assets than to peripheral ones. Baitinger and Papenbrock (2017a, 2017b) demonstrate that interconnectedness is a novel risk concept, since it is only loosely related to conventional risk definitions, and is value-adding within quantitative investment strategies. Third, a financial network shows the clustering properties of underlying assets, that is, their proximity in the metric space.^{5} Intuitively, holding assets that constitute one cluster is riskier than investing in assets across multiple clusters. Thus, information on the clustering behavior of assets can be useful for analyzing and improving portfolio diversification.

Since any structure that is organized by interacting elements is always a network, network-based analysis has a broad scope of possible applications. The academic literature on banks has used network-based analysis tools for many years to model contagion risks and contagion dynamics in financial institutions (Diebold and Yılmaz 2014, Elliott et al. 2014, Acemoglu et al. 2015, and Affinito and Pozzolo 2017). These networks are based on balance sheet linkages and not on an estimated time series dependence structure. A further important research area is the analysis of social interactions by means of network analysis (e.g., social networks). Rowley (1997) applies social network analysis to study the characteristics of stakeholder structures and their effect on company behavior. In a similar manner, Sciarelli and Tani (2013) focus on the reciprocal influences linking other social actors to the enterprise activities and seek to understand the relative effects. Last, Jiang and Zhou (2010) apply network theory to investigate the (social) networks of traders; for that purpose, they use the order flow data of the Shenzhen Development Bank stock.

In this article, we focus on financial networks of assets, whereby the dependence structure is retrieved from the time series dimension (e.g., correlation matrix). This kind of network analysis was first applied to financial markets by Mantegna (1999). He constructed correlation-based networks for US stocks and observed hierarchical structures and economically interpretable clusters. His work triggered a tremendous amount of research on financial markets through the lens of network-based analysis; examples include Bonanno et al. (2001), Vandewalle et al. (2001), Bonanno et al. (2003), Onnela et al. (2003b), Jung et al. (2006), Garas and Argyrakis (2007), Huang et al. (2009), Tumminello et al. (2010), and Majapa and Gossel (2016). The articles cited above focus exclusively on single stock data. Complementing this network-based research endeavor, Mizuno et al. (2006) and Keskin et al. (2011) study currency markets, Roy and Sarkar (2011) and Nobi et al. (2014) analyze global stock indexes, while Schwendner et al. (2015) investigate network structures in bond markets. Lastly, Baitinger and Papenbrock (2017a, 2017b) perform network-based studies on a multi-asset data set containing stocks, bonds, currencies, commodities, and real estate. To the best of the authors’ knowledge, a hierarchical structure analysis through financial networks has not yet been applied to hedge fund strategies. We aim to close this gap by providing extensive network-based studies for various hedge fund strategies.

Before going any further, it is important to clarify what we mean by misbehavior (or undesired behavior) from the network perspective. Network-based misbehavior can be measured mainly across two dimensions. First, since the financial networks discussed here are a product of a clustering algorithm, misbehavior in this regard implies unintended similarities to hedge fund strategies or clusters belonging to a different classification. Second, as mentioned above, the clustering algorithm performs a hierarchical (superordinate–subordinate) clustering, yielding centrality properties for each hedge fund strategy. In that respect, misbehavior describes undesired dynamic centrality properties of hedge fund strategies. Specifically, as will be shown below, some hedge fund strategies can exhibit very erratic centrality characteristics; that is, such strategies are neither stably peripheral nor stably central. A possible consequence of such misbehavior is that an investor who believes she holds a peripheral strategy could end up being invested in a central, and hence significantly riskier, strategy most of the time.

The detailed contribution of this article is as follows. After equipping the reader with all relevant network-based analysis tools, we start with static network-based studies (see Exhibit 1, Study Type 1). Using two data sets, we show how hedge fund strategies cluster empirically in financial networks. The first data set is made up solely of hedge fund strategies, while the second is a mixture of conventional asset classes and hedge fund time series. This analysis reveals that various hedge fund strategies exhibit significant misbehavior concerning their original strategy classification. The observed misbehavior could be detrimental for hedge fund investors since they de facto could end up being invested in an undesired hedge fund strategy. In the second study, we provide a more detailed analysis on the (mis)behavior of hedge fund strategies by using bootstrapping techniques (see Exhibit 1, Study Type 2). By bootstrapping the above static network, we first measure the reliability of network linkages. This additional information yields a better understanding of the clustering behavior of hedge fund strategies and puts the previously observed misbehavior in perspective. Further, we use the bootstrapping methodology to expose the detailed (mis)behavior of selected hedge fund strategies. The resulting bootstrap-based network interaction profile is of great merit for every hedge fund investor, as it neatly demonstrates how many strategies a certain hedge fund strategy interacts with and how strong the interaction is. Moreover, a further result of our bootstrapping studies is the centrality (mis)behavior of hedge fund strategies. As mentioned above, the centrality metric is a network-based risk measure. In that respect, we demonstrate that some hedge fund strategies exhibit undesirable, that is, very erratic and hence misleading network-based risk behavior. 
The danger of erratic centrality characteristics can be motivated if we think of a “classical” mean-variance investor. The only relevant risk measure for this investor is variance. She not only prefers, ceteris paribus, low-variance strategies to high-variance strategies; she also cares about the variance of the variance. High variance of realized variance means that realized variance is a poor predictor of future variance. In this case, the mean-variance investor can end up being invested in a high-volatility strategy even though she desired and optimized for a minimum-volatility strategy. In the same vein, the network-based analysis investor cares about the variance (erraticness) of realized centrality.

The third main study deals with the question of whether information on the network-based behavior of hedge fund strategies is able to forecast hedge fund returns (see Exhibit 1, Study Type 3). Indeed, we demonstrate that various network-based metrics are statistically significant in-sample predictors of hedge fund returns. This aspect implies that information on the network-based behavior of hedge fund strategies has the potential to act as a leading or warning indicator for hedge fund investors.

The rest of this article is structured as follows. In the second section, we elaborate on the data set and explain the most important methodological aspects regarding financial networks and network-based analysis. The third section deals with the static network-based (mis)behavior of hedge fund strategies, while the respective dynamic, bootstrapping studies are performed in the fourth section. In the fifth section, we conduct time series analysis, whereby we aim to predict hedge fund returns by network-based metrics. Concluding remarks are provided in the final section.

## DATA SET AND METHODOLOGY

For our studies, we use monthly return time series on hedge fund strategies retrieved from Hedge Fund Research (HFR).^{6} Precisely, we focus on investable HFRX indexes described in Exhibit 2. This data set covers indexes tracking 34 different hedge fund strategies, which are classified into four main strategies: equity hedge, relative value, macro, and event-driven. We adopt the strategy classifications from HFR. Further, Exhibit 2 features information on seven conventional asset classes. These time series become relevant when we analyze how hedge funds relate to conventional asset classes from a network-based perspective. The data set spans from January 2005 to January 2017, comprising 145 monthly returns or 12 years and one month of data.

Given a data set, a network representation of it can be achieved by implementing the following steps:

1. Calculate a dependence or a statistical relationship matrix. Usually one chooses the correlation matrix.

2. Transform this dependence matrix into a distance matrix, that is, a matrix containing metrics that qualify as adequate distance measures.

3. Apply a filtering method on the matrix from Step 2, rendering a connected network structure in matrix form, the so-called adjacency matrix.

4. For visual analysis purposes, this matrix can be graphically represented, leading to the visual financial network.

Following the common approach used in the financial network literature, we chose the linear correlation as our dependence measure.^{7} Since correlation coefficients are not qualified distance metrics, we use the following distance measure (Gower 1966 and Mantegna 1999) for the construction of correlation networks:^{8}

$$d_{i,j,t} = \sqrt{2\,(1 - \rho_{i,j,t})} \tag{1}$$

where *d*_{i,j,t} is the (correlation–)distance index between the *i*th and *j*th asset at time *t*, and ρ_{i,j,t} is the respective correlation coefficient at time *t*. The distance metric decreases when the respective correlation increases and vice versa, which is necessary for the correct application of the minimum spanning tree filtering.^{9} Furthermore, this feature has an intuitive appeal when financial networks are visually analyzed. A strong linkage between two assets (high correlation coefficient) leads to a small distance index, implying that these assets are close to each other in the (visual) metric space and vice versa. Having constructed the distance matrix, we apply the minimum spanning tree (MST) technique to retrieve the network structure.
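The transformation from correlations to distances can be illustrated with a minimal Python sketch. It assumes the mapping *d* = sqrt(2(1 − ρ)) commonly associated with Mantegna (1999); the function names and the toy return series are hypothetical and not part of the original study.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation of two equally long return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_distance(rho):
    """Assumed mapping d = sqrt(2 * (1 - rho)): 0 for rho = 1, 2 for rho = -1."""
    # max(...) guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))

def distance_matrix(returns):
    """returns: dict mapping a name to a list of returns.
    Gives the list of names and a symmetric distance matrix."""
    names = list(returns)
    n = len(names)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            rho = pearson(returns[names[i]], returns[names[j]])
            dist[i][j] = dist[j][i] = correlation_distance(rho)
    return names, dist

# Hypothetical toy data: B moves exactly with A, C moves exactly against A.
toy = {
    "A": [0.01, 0.02, -0.01, 0.03],
    "B": [0.02, 0.04, -0.02, 0.06],
    "C": [-0.01, -0.02, 0.01, -0.03],
}
names, dist = distance_matrix(toy)
```

In this toy example, the perfectly correlated pair ends up at a distance of (numerically) zero, while the perfectly anti-correlated pair sits at the maximum distance of two.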

An MST aims to connect all nodes or vertexes (i.e., assets) with the smallest possible (minimum) sum of distances.^{10} Therefore, the MST compiles the smallest or a subset of the smallest distances.^{11} Note that the MST is the most radical filtering method, as it leads to a network structure with exactly *N* − 1 edges, where *N* is the number of assets (nodes/vertexes in network terms). The advantage of this radical filtering technique is that it yields a clear network picture that is not blurred by numerous edges.^{12} After this filtering, we are left with a network in matrix form, the so-called adjacency matrix. Note that the adjacency matrix contains all relevant network information. All network-based information pertaining to this particular network is retrieved from the adjacency matrix and can be represented via a (visual) financial network.
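To make the filtering step concrete, the following sketch extracts an MST from a small distance matrix and stores it as an adjacency matrix. Prim's algorithm is used here purely for illustration; the text does not state which MST algorithm was used, and the distance values are hypothetical.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense symmetric distance matrix.
    Returns the tree as a list of edges (i, j) with i < j."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known link from each node into the tree
    parent = [-1] * n
    best[0] = 0.0               # start growing the tree from node 0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((min(u, parent[u]), max(u, parent[u])))
        # Update the attachment costs of the remaining nodes.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges

def adjacency_matrix(n, edges):
    """Unweighted, symmetric adjacency matrix of the filtered network."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

# Hypothetical distances between four assets.
dist = [
    [0.0, 0.4, 1.2, 1.3],
    [0.4, 0.0, 0.9, 1.1],
    [1.2, 0.9, 0.0, 0.5],
    [1.3, 1.1, 0.5, 0.0],
]
edges = minimum_spanning_tree(dist)
adj = adjacency_matrix(len(dist), edges)
```

With four assets, the resulting tree has exactly three edges, in line with the *N* − 1 property noted above.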

Given a financial network, we are able to derive a substantial amount of network-based information from it. Generally speaking, network-based information can be grouped into two subclasses. The first class of network-based information measures network-based properties of the individual network components, that is, the assets making up the network. This asset-specific network-based information quantifies the embeddedness intensity of assets in the network. A more convenient term for “embeddedness intensity” is “(network) centrality.” Therefore, this type of network-based information is called a centrality measure. A selected subset of different centrality measures is shown in Exhibit 3. For our analyses, we calculate a PCA-based composite centrality indicator from the five genuine centrality scores: adjusted *BET*, *CLS*, adjusted *ECC*, adjusted *DEG*, and *EVC*. In some cases, a marginal adjustment of the original centrality measure is necessary to yield a centrality score with a distinctive ordering.^{13}
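As a rough illustration of asset-specific centrality, the sketch below computes three textbook measures (degree, closeness, and eccentricity) on an unweighted tree. These are the standard, unadjusted definitions; the adjusted variants and the PCA-based composite indicator used in the article are not reproduced here, and the star-shaped example network is hypothetical.

```python
from collections import deque

def hop_distances(adj, source):
    """Breadth-first search: number of edges from source to every node."""
    n = len(adj)
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if adj[u][v] and dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def basic_centralities(adj):
    """Degree, closeness, and eccentricity per node of a connected network."""
    n = len(adj)
    scores = []
    for i in range(n):
        d = hop_distances(adj, i)
        scores.append({
            "degree": sum(adj[i]),          # number of direct neighbors
            "closeness": (n - 1) / sum(d),  # inverse of the average hop distance
            "eccentricity": max(d),         # hop distance to the farthest node
        })
    return scores

# Hypothetical star network: node 0 is the hub, nodes 1 to 3 are peripheral.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
scores = basic_centralities(star)
```

The hub attains the highest degree and closeness and the lowest eccentricity, which is the pattern one would expect from a strongly central asset.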

By measuring the centrality of assets, we objectively quantify their interconnectedness risk. Assets that are strongly embedded in the network are central and, thus, carry significant interconnectedness risk. On the other side, assets that are only loosely embedded in the network are called peripheral assets and consequently carry only moderate interconnectedness risk. Baitinger and Papenbrock (2017a, 2017b) demonstrate that this kind of network-based information can be exploited to construct profitable investment strategies.

A second class of network-based information consists of holistic network measures that are shown in Exhibit 4. In contrast to centrality scores, these measures do not quantify asset-specific properties. Rather, they aim to quantify certain aspects of the overall network. The similarity ratio at time *t* quantifies the percentage of common edges between two different networks and is analytically defined as follows (see Onnela et al. [2003a, 2003b]):

$$S_t = \frac{\left|\mathbf{E}_t \cap \mathbf{E}_{t-h}\right|}{\left|\mathbf{E}_t\right|} \tag{2}$$

where *S*_{t} denotes the similarity ratio, **E**_{t} is the set of edges of the financial network at time *t*, and **E**_{t−h} is the counterpart at time *t* − *h*. ∩ is the intersection operator and |…| counts the number of elements in the set. The similarity ratio counts how many edges of the previous (*t* − *h* period) network survived and are present in the current network. Therefore, the similarity ratio is often labeled as the survival ratio. In this article, we set *h* = 1, meaning that we compare two neighboring networks within the time dimension. The similarity ratio usually collapses at significant financial market disruptions, implying a major reconfiguration of the network (Onnela et al. 2003a and Baitinger 2017). The diameter, the radius, and their weighted versions aim to capture the density of the financial network. Thus, they answer the question of whether the overall network structure is stretched or rather contracted. The remaining holistic network-based measures presented in Exhibit 4 quantify average network properties inferred from individual assets or edges.
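The similarity (survival) ratio can be sketched in a few lines of Python. The snippet assumes that the ratio is normalized by the number of edges in the current network, which for two MSTs on the same assets equals *N* − 1 in both periods; the node labels and edge lists are hypothetical.

```python
def similarity_ratio(edges_now, edges_prev):
    """Share of the current network's edges that were already present before."""
    # frozenset makes the comparison independent of edge orientation.
    now = {frozenset(e) for e in edges_now}
    prev = {frozenset(e) for e in edges_prev}
    return len(now & prev) / len(now)

# Two hypothetical trees on the same five nodes.
edges_prev = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]
edges_now = [("B", "A"), ("A", "C"), ("C", "D"), ("D", "E")]
ratio = similarity_ratio(edges_now, edges_prev)
```

Here three of the four edges survive, so the ratio equals 0.75; a value close to zero would indicate a major reconfiguration of the network.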

## EMPIRICAL EVIDENCE: STATIC ANALYSIS

Using the methodology described above, we construct an MST-based financial network for hedge fund strategies in Exhibit 5, whereby we use the complete time period (i.e., 145 monthly returns). The financial network shows the correlation-based proximity of hedge fund strategies for this period. Furthermore, it conveys centrality information of the various strategies and, thus, demonstrates which hedge fund strategies bear significant interconnectedness risk. With regard to centrality, we observe that **EH**, **ED**, **SS**, and **RVA** are very deeply embedded in the network (Exhibit 5) and, hence, are strongly central. To put it differently, these strategies are very “promiscuous” or are directly connected to promiscuous strategies. The high centrality or the promiscuous nature of these strategies makes them vulnerable to shock events because such events quickly propagate to central assets. Moreover, we frequently observe empirically^{14} that high-centrality or promiscuous strategies exhibit unclear behavior; that is, they are not loyal to their original classification and show close similarities with strategies from other classifications. This implies that a sponsor deliberately investing in a certain hedge fund strategy could eventually become invested in a de facto different and likely undesired hedge fund strategy.

With regard to the clustering behavior or the correlation-based proximity of hedge fund strategies in Exhibit 5, we make the following observations. First, by and large, the clusters adhere to the HFR strategy classifications described in Exhibit 2. In that respect, we can detect an equity hedge strategies cluster that is governed by the central component **EH**. Further, a relative value and a macro cluster are discernible. The former is led by **RVA**, while the latter is headed by **M**. Finally, we can also recognize a fractional part of the event-driven cluster made up of **CRED**, **ED**, and **SS**. The other event-driven components are spread throughout the network, which leads us to the second basic observation. While many strategies are loyal to their respective strategy classification, some hedge fund strategies exhibit gross misbehavior in that regard. For example, **SB**, which (ideally) is supposed to be in the equity hedge cluster, is located far away in the macro cluster. On the other hand, the macro strategy **MMS** behaves empirically more like an equity hedge strategy. The same observation applies to **ACT**.

Summarizing, the empirical behavior of hedge fund strategies is often in line with the respective theoretical classification. However, there are some gross misbehaviors. In these cases, investors end up being invested in an unwanted strategy if they blindly believe in the theoretical classification. The network-based analysis is a simple way to expose such empirical misbehaviors and helps the investor to avoid possibly drastic investment errors.

In Exhibit 6, we add conventional asset classes to the above network. This exercise enables us to analyze how hedge fund strategies relate to conventional asset classes from a network-based perspective. Exhibit 6 shows that **UHY** is characterized by a strong centrality reading, but it is not part of a hedge fund cluster. Instead, **UHY** forms a fixed income cluster with **UIG** and **WGBI**. In contrast, **STCK** is strongly embedded in the equity hedge cluster. This cluster was previously (Exhibit 5) dominated by **EH**. Now, **EH** has to share its centrality power with **STCK**. This observation demonstrates that equity hedge strategies are still closely related to stocks, which we suspect is mainly due to their average net long exposure. Further, we observe that **BBG** and **COMG**, which represent commodities, exhibit proximity to macro strategies. Hence, a sponsor can infer from this fact that specific macro-based hedge fund strategies can display significant commodity exposure. The final finding is of special interest. It shows that **UMM** is tightly connected to the hedge fund strategy **COM**, meaning that an investor could partially replicate this strategy by investing in a plain-vanilla money market fund.

## EMPIRICAL EVIDENCE: DYNAMIC ANALYSIS

The above exercise is static by construction since only one correlation network is built using the complete time period. By doing this, one obtains a “one-shot” picture of the average behavior of the components for the period under consideration. Any dynamic interactions between network components are completely neglected. This drawback can be addressed by performing dynamic network analyses that are not limited to a one-shot average behavior picture. The basic approach of dynamic network analysis is to construct many financial networks, each describing a different time period. In a second step, the information from the various networks is analyzed or “intelligently” summarized. The creation of multiple financial networks can be achieved by defining an initial estimation period, which is then moved forward (i.e., rolling window approach). Since we use monthly data consisting of just 145 observations for each strategy, this methodology would result in only a modest number of networks, thus leading to a relatively crude analysis. To enable a more granular analysis, we choose the bootstrapping technique for the creation of different pseudo–time periods. Precisely, we create 10,000 different 145-month time periods by bootstrapping^{15} the time index.^{16} For each random data composition, we construct the respective financial network, resulting in 10,000 networks.
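The resampling step can be sketched as follows. The snippet assumes a plain i.i.d. bootstrap of the time index, that is, months are drawn with replacement and the same drawn months are applied to every strategy; the exact bootstrap scheme is described in the footnotes and may differ. All names and the simulated returns are hypothetical.

```python
import random

def bootstrap_time_index(n_periods, rng):
    """Draw n_periods month positions with replacement."""
    return [rng.randrange(n_periods) for _ in range(n_periods)]

def resample(returns, index):
    """Apply the same drawn months to every series, so that the
    cross-sectional dependence within each month is preserved."""
    return {name: [series[t] for t in index] for name, series in returns.items()}

rng = random.Random(0)        # fixed seed so the sketch is reproducible
n_months, n_boot = 145, 100   # the article uses 10,000 draws; 100 keeps the toy run fast

# Hypothetical simulated monthly returns for three strategies.
returns = {
    name: [rng.gauss(0.0, 0.02) for _ in range(n_months)]
    for name in ("A", "B", "C")
}

indices = [bootstrap_time_index(n_months, rng) for _ in range(n_boot)]
samples = [resample(returns, idx) for idx in indices]
# Each element of samples would then be passed through the network
# construction steps (correlation, distance, MST) to obtain one network.
```

Resampling the time index rather than each series separately is what keeps the correlation structure of a given month intact across strategies.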

A possible way to summarize the information of these 10,000 bootstrapped networks is presented in Exhibit 7. The network shown in this figure is the same as the full-sample network (Exhibit 5), except that the reliability of each edge is indicated by its thickness. This reliability information is retrieved from the bootstrapped networks by analyzing how often the connected nodes in the original network (Exhibit 5) interact with each other in the bootstrapped samples. The merits of such an analysis are twofold. First, it draws a more precise picture of the clustering behavior of hedge fund strategies. Indeed, Exhibit 7 demonstrates that there exists a clear break between **DT** and **FSV**. In addition, there is also a break between **RVA** and **SS**. This implies a relatively clear clustering behavior of macro and relative value strategies. Further, the size of the equity hedge cluster is actually different from the size initially suggested by the full-sample network (Exhibit 5). **VOL**, **REAL**, and **MA** are only loosely involved in this cluster, relativizing to some degree their misbehavior demonstrated by Exhibit 5. Interestingly, **TH**, **QD**, and **EMN**, which are all classified as equity hedge strategies, are only weakly embedded in the statistical equity hedge cluster. A second merit of this bootstrap-based analysis is that it enables the creation of completely novel network-based information. This can be achieved by substituting the distance measures in the adjacency matrix with distance measures derived from the bootstrapped interaction intensity values. Then, we can proceed with the calculation of weighted centrality and holistic network measures. Network-based metrics that rely on edge weights will be affected by this adjustment, leading to bootstrap-based network information. Since that kind of analysis would exceed the scope of this article, it is left to future research.
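One simple way to quantify edge reliability is the share of bootstrapped networks in which an edge of the full-sample network reappears. The sketch below assumes this definition; the node labels and edge lists are hypothetical.

```python
def edge_reliability(original_edges, bootstrapped_edge_lists):
    """For each edge of the original network, the fraction of
    bootstrapped networks that contain the same edge."""
    boots = [{frozenset(e) for e in edges} for edges in bootstrapped_edge_lists]
    return {
        tuple(sorted(e)): sum(frozenset(e) in b for b in boots) / len(boots)
        for e in original_edges
    }

# Hypothetical full-sample tree and four bootstrapped trees on nodes A, B, C.
original = [("A", "B"), ("B", "C")]
bootstrapped = [
    [("A", "B"), ("B", "C")],
    [("B", "A"), ("A", "C")],
    [("A", "B"), ("A", "C")],
    [("A", "C"), ("C", "B")],
]
reliability = edge_reliability(original, bootstrapped)
```

In a plot such as Exhibit 7, these fractions could be mapped to edge thickness, so that frequently recurring links stand out visually.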

A second way to describe the information of many bootstrapped networks is presented in Exhibit 8. Focusing on **EMN** and **CA**, this exhibit shows their interaction profile with regard to the 10,000 bootstrapped networks.^{17} To be more specific, for each bootstrapped network we record the hedge fund strategies that are directly linked to **EMN**/**CA**. In network terms, these strategies directly interact with **EMN**/**CA**. Finally, the recorded relative interaction frequencies of **EMN**/**CA** with other hedge fund strategies are summarized by a triangular plot. This plot reveals the network-based behavioral profile or “footprint” of the strategies in question. This simple exercise conveys two important pieces of information on the hedge fund strategy in focus. The first piece of information is the specific network-based (mis)behavior profile, that is, the network-based clustering property of a strategy. For example, Exhibit 8 demonstrates that the equity hedge strategy **EMN** directly interacts with many strategies from the same strategy classification, which is the desired outcome. But at the same time, **EMN** shows a high degree of misbehavior since it also interacts with many strategies from foreign classifications. Second, the network-based interaction profile shows the general interaction intensity of the strategy in focus. **EMN** exhibits a significant propensity to directly interact with many strategies, while **CA** is less interaction friendly and therefore more reliable or pure.
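The interaction profile of a single strategy can be computed by counting its direct neighbors across the bootstrapped networks. The following sketch illustrates the idea; the focal node `X`, the other labels, and the edge lists are hypothetical.

```python
from collections import Counter

def interaction_profile(focal, bootstrapped_edge_lists):
    """Relative frequency with which each other node is directly
    linked to the focal node across the bootstrapped networks."""
    counts = Counter()
    for edges in bootstrapped_edge_lists:
        for a, b in edges:
            if a == focal:
                counts[b] += 1
            elif b == focal:
                counts[a] += 1
    n = len(bootstrapped_edge_lists)
    return {partner: c / n for partner, c in counts.items()}

# Four hypothetical bootstrapped trees on nodes X, A, B.
bootstrapped = [
    [("X", "A"), ("A", "B")],
    [("X", "A"), ("X", "B")],
    [("A", "B"), ("B", "X")],
    [("X", "A"), ("A", "B")],
]
profile = interaction_profile("X", bootstrapped)
n_partners = len(profile)  # how many distinct strategies X ever interacts with
```

The number of distinct partners gives a crude reading of the interaction intensity, while the individual frequencies show with whom the focal strategy interacts most often.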

Summing up, the methodology of Exhibit 8 can help investors or asset managers to better understand the analyzed strategy by revealing its characteristics in a dynamic setting. In this regard, **EMN** is an unreliable strategy, since its behavior is erratic and often not classification compliant. On the other side, **CA** is more stable and interacts mostly with strategies from the same classification.

An additional network-based metric of interest can be inferred from bootstrapped networks as outlined in Exhibit 9. It shows the normalized centrality distribution for four selected hedge fund strategies.^{18} This centrality distribution is retrieved from the 10,000 random (bootstrapped) financial networks. Hedge fund strategies can exhibit misbehavior or undesired behavior not only regarding their interaction profile but also regarding their dynamic centrality characteristics. A hedge fund strategy lying in the center of a network has significantly different risk characteristics than a peripheral strategy. In that respect, strategies **EH** and **SB** in Exhibit 9 show very reliable risk behavior. In the great majority of bootstrapped networks, **EH** is strongly central while **SB** is strongly peripheral. In contrast, the hedge fund strategy **SS** has a very erratic centrality behavior. In the (absolute) majority of bootstrapped networks, **SS** is highly central, but the centrality density function also exhibits significant frequencies for many other centrality degrees. Hence, **SS** has a very dangerous dynamic centrality profile, since its centrality behavior cannot be reliably assessed from the static, full-sample network. Lastly, the **EMN** strategy is somewhere in between. It is not as reliable as **SB**/**EH**, but it has more centrality stability than **SS**. Unfortunately, many hedge fund strategies have moderately unreliable (“**EMN**-like”) or strongly unreliable (“**SS**-like”) dynamic centrality properties, implying unstable network-based risk metrics. In those cases, investors or asset managers assuming that the dynamic centrality behavior will comply with the full-sample centrality will, in many instances, face completely different centrality characteristics.
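The notion of erratic centrality can be illustrated by comparing the dispersion of bootstrapped centrality scores. The sketch below uses the variance of the scores as a simple erraticness proxy, in the spirit of the “variance of variance” argument above; this proxy and the two score series are illustrative assumptions, not the measure shown in Exhibit 9.

```python
from collections import Counter
from statistics import pvariance

def centrality_distribution(scores):
    """Normalized frequency of each centrality score across bootstraps."""
    counts = Counter(scores)
    n = len(scores)
    return {value: counts[value] / n for value in sorted(counts)}

# Hypothetical centrality scores of two strategies over eight bootstrapped networks.
stable = [5, 5, 5, 5, 4, 5, 5, 5]    # almost always strongly central
erratic = [5, 1, 5, 3, 2, 5, 4, 1]   # central at times, peripheral at others

stable_dist = centrality_distribution(stable)
erratic_dist = centrality_distribution(erratic)
stable_var = pvariance(stable)
erratic_var = pvariance(erratic)
```

Both series have the same most frequent score, yet the second one is far more dispersed, which is exactly the situation in which the full-sample centrality is a poor guide.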

## EMPIRICAL EVIDENCE: TIME SERIES ANALYSIS

In this section, we aim to investigate how network-based information explains hedge fund returns across the time dimension^{19} in predictive regressions. Speaking in our terms, we examine the relationship between the network-based hedge fund behavior (and in some cases misbehavior) and hedge fund returns. To keep the exercise tractable, we focus on holistic network-based measures, thereby drastically reducing the dimension of explanatory variables. The first part of the study is based on the following OLS regression:

$$R_{t+h} = \alpha + \beta_i N_{i,t} + \varepsilon_{t+h} \tag{3}$$

where *R*_{t} is the composite hedge fund return at time *t*. This composite return is given by the simple mean of all hedge fund strategy returns (see Exhibit 2) at time *t*. *N*_{i,t} is the *i*th holistic network-based measure (see Exhibit 4) at time *t*. This network-based information refers to the network as a whole and thus results in a one-dimensional variable for each holistic network-based measure. *h* is the explanation horizon. Setting *h* = 0 results in an explanatory regression. Because of endogeneity issues, the *h* = 0 setup is not considered here. Instead, we focus on *h* = 1, yielding a so-called forecasting regression. This setup answers the question of whether network-based information has the potential to successfully forecast hedge fund returns.

In the second part of the study, we adjust Equation 3 by controlling for an AR term (*R*_{t}) and lagged stock market returns (*STCK*_{t}):

$$R_{t+h} = \alpha + \beta_i N_{i,t} + \gamma R_t + \delta\, STCK_t + \varepsilon_{t+h} \tag{4}$$

Controlling for possible predictive components in lagged hedge fund returns and taking into account the stock market environment is a tough reality check for holistic network-based measures.
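A univariate forecasting regression of this kind can be sketched in plain Python. The snippet regresses the return at time *t* + *h* on a single network-based measure at time *t* and reports the slope together with its conventional OLS *t*-statistic. It omits the control variables, and the function name and toy series are hypothetical.

```python
import math

def predictive_ols(returns, signal, h=1):
    """Regress returns[t + h] on signal[t].
    Returns (alpha, beta, t-statistic of beta)."""
    y = returns[h:]                  # future returns
    x = signal[:len(signal) - h]     # information available h periods earlier
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    alpha = my - beta * mx
    resid = [b - alpha - beta * a for a, b in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)   # residual variance
    se_beta = math.sqrt(s2 / sxx)              # standard error of the slope
    return alpha, beta, beta / se_beta

# Hypothetical data: next period's return is 0.5 times today's signal plus noise.
signal = [1.0, -1.0, 2.0, 0.0, -2.0, 1.0, 3.0, -1.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0]
returns = [0.0] + [0.5 * s + e for s, e in zip(signal, noise)]

alpha, beta, t_stat = predictive_ols(returns, signal, h=1)
```

Shifting the dependent variable by *h* periods is the only difference between the explanatory (*h* = 0) and the forecasting (*h* = 1) setup.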

We start the analysis by calculating the required network-based input variables. First, we define a five-year (60 months) initial estimation period of monthly hedge fund returns spanning from January 2005 to December 2009. For this period, we construct a financial network by following the four steps discussed above. On the basis of this network, we calculate all holistic network-based metrics described in Exhibit 4. Thus, the network-based measure for time *t* = December 2009 is derived from the financial network constructed for the period January 2005 to December 2009, and so on. We use an expanding window approach, meaning that the estimation window grows over time, which reduces possible estimation error issues. Following this road map, we ultimately have 86 observations (December 2009 to January 2017) for each holistic network-based metric. The latest data point of the network-based information (at *t* = January 2017) is derived from the financial network for the period January 2005 to January 2017 (145 months). All input variables are standardized to achieve comparability of estimated beta coefficients. Further, the composite hedge fund return for December 2009 is retrieved by calculating the simple mean of all hedge fund returns for December 2009, and so on. After these steps, we are left with all variables required by Equations 3 and 4.
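The expanding window scheme and the standardization step can be sketched as follows. With 145 months of data and an initial window of 60 months, the scheme yields 86 estimation windows, matching the number of observations stated above. The function names are hypothetical, and the use of the sample standard deviation is an assumption.

```python
def expanding_windows(n_months, initial=60):
    """Yield (start, end) slice bounds: every window starts at month 0
    and grows by one month at a time."""
    for end in range(initial, n_months + 1):
        yield 0, end

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]

windows = list(expanding_windows(145, initial=60))
# windows[0] covers months 0..59 (January 2005 to December 2009);
# windows[-1] covers all 145 months (January 2005 to January 2017).

z = standardize([1.0, 2.0, 3.0])
```

For each window, one network would be built from the returns in that slice, and each holistic measure computed from it would contribute one observation to the regression input.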

The left panel (A) of Exhibit 10 investigates whether network-based information is capable of forecasting composite hedge fund returns without controlling for further predictors. We use first-order differences for all network-based metrics in the forecasting setup because this improves the forecasting results overall.^{20} Since our data set is based on monthly returns, we assess the capability of current network-based information to forecast next-month returns. The detailed results reveal that the first-order differenced diameter, radius, and their respective weighted variants all significantly explain next month’s returns. The implication is that networks that become more spread out (increasing diameter/radius) are often followed by positive returns, and vice versa. These results seem plausible, as contracting networks usually signal market stress, whereas expanding networks represent a stable market structure (see Peralta 2015). Furthermore, in our study, the mean distance measure (**MDist**) seems to be a powerful metric, since it explains next month’s returns with the highest significance (*t*-statistic) level. Interestingly, the coefficient of **TDegr** is statistically significant, while its weighted counterpart has no forecasting power for hedge fund returns. The same can be said about the similarity measure (**Simil**), which lacks any explanatory power for future returns. Last, for reasons of academic curiosity, we search (via a brute force approach) for the forecasting regression with the highest adjusted *R*^{2} among all regressions containing a maximum of three independent variables. The best forecasting model is described at the bottom of Panel A. It is dominated by **MDist**, highlighting the importance of this network-based metric.
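The mechanics of a forecasting regression on first-differenced metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a univariate specification in which next month's return is regressed on the standardized change in a single network metric, the function names are hypothetical, and the data are synthetic.

```python
import math

def first_difference(series):
    """Return N_t - N_{t-1} for t = 1, ..., T-1."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

def standardize(series):
    """Rescale to zero mean and unit sample standard deviation."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / (n - 1))
    return [(v - mean) / sd for v in series]

def ols_univariate(x, y):
    """Fit y = alpha + beta * x by OLS; return (alpha, beta, t-stat of beta)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    rss = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    se_beta = math.sqrt(rss / (n - 2) / sxx)
    return alpha, beta, beta / se_beta

def forecast_regression(metric, returns, h=1):
    """Regress R_{t+h} on the standardized change in the metric at time t."""
    delta = first_difference(metric)          # delta[k] is dated t = k + 1
    x = standardize(delta[: len(delta) - h])  # drop changes with no future return
    y = returns[1 + h:]                       # returns dated t + h
    return ols_univariate(x, y)

# Synthetic illustration only (not the article's data): returns follow the
# lagged change in the metric plus a small disturbance.
metric = [0, 1, 3, 2, 5, 4, 7, 6, 9, 8]
noise = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01, 0.02, -0.02]
returns = [0.0, 0.0] + [0.5 * d + e for d, e in zip(first_difference(metric), noise)]
alpha, beta, t_beta = forecast_regression(metric, returns, h=1)
print(round(beta, 3), round(t_beta, 1))
```

Only the regressor is standardized here, which is one reading of "all input variables are standardized"; standardizing the returns as well would rescale the coefficients but leave the *t*-statistic of the slope unchanged.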

In the right panel (B) of Exhibit 10, we add further control variables to the plain forecasting equation, Equation 3. By and large, the results are only marginally affected when controlling for an AR term and lagged stock market returns. The significance level of diameter, radius, and their weighted counterparts even increases after the inclusion of further control variables, emphasizing their predictive power. As in Panel A, we once again search (via a brute force approach) for the best forecasting regression. This time, the pool of permitted variables also contains the mentioned control variables. Since these variables exhibit only a limited degree of predictive power, they do not enter the best model, which leads to the same forecasting regression as in Panel A.
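A brute force search of this kind can be sketched in a few lines. The code below is a hedged illustration rather than the authors' procedure: it assumes plain OLS with an intercept, ranks every subset of up to three candidate regressors by adjusted *R*^{2}, and uses synthetic series whose labels (`MDist`, `Diam`, `STCK`) are borrowed from the text purely for readability. The helpers `solve`, `adjusted_r2`, and `best_subset` are hypothetical names.

```python
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def adjusted_r2(columns, y):
    """Fit OLS with an intercept on the given columns; return adjusted R^2."""
    n, k = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k + 1)] for p in range(k + 1)]
    xty = [sum(r[p] * y[i] for i, r in enumerate(rows)) for p in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    rss = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
    tss = sum((obs - mean_y) ** 2 for obs in y)
    return 1 - (rss / (n - k - 1)) / (tss / (n - 1))

def best_subset(candidates, y, max_vars=3):
    """Return (adjusted R^2, names) of the best subset with at most max_vars regressors."""
    best = (float("-inf"), ())
    for size in range(1, max_vars + 1):
        for names in combinations(sorted(candidates), size):
            score = adjusted_r2([candidates[name] for name in names], y)
            if score > best[0]:
                best = (score, names)
    return best

# Synthetic illustration only: the target is driven by "MDist" by construction.
candidates = {
    "MDist": [1, -1, 2, -2, 3, -3, 1, -1, 2, -2, 3, -3],
    "Diam": [1, 2, 3, 4, 5, 6, 6, 5, 4, 3, 2, 1],
    "STCK": [1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1],
}
noise = [0.1, 0.0, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.0, 0.05, -0.05, 0.0]
target = [2 * m + e for m, e in zip(candidates["MDist"], noise)]
score, names = best_subset(candidates, target, max_vars=3)
print(names, round(score, 4))
```

Because adjusted *R*^{2} penalizes additional regressors, weak controls can drop out of the winning subset, which is the pattern the text reports for the AR term and lagged stock market returns. Exhaustive search is feasible here only because the candidate pool is small; the number of subsets grows combinatorially with the pool size.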

Before concluding this article, some words of caution regarding our time series study are appropriate. We would like to stress that the forecasting study is based on a simple in-sample perspective. This fact is underscored by the adjusted *R*^{2} measure, which is often too high for a true out-of-sample forecasting model. Rather, our forecasting studies aim to provide a first hint of the possible predictive power of network-based metrics without claiming to have found the holy grail of hedge fund return forecasting. To assess the real-time potential and profitability of a network-based forecasting model, strict out-of-sample setups are necessary. These kinds of empirical studies are more complicated and data hungry than simple in-sample studies and are therefore left to future research.^{21} Furthermore, even though our studies indicate that, in general, network-based information often has some predictive power with respect to hedge fund returns, the alpha coefficients are statistically significant in many forecasting regressions. This tells us that one network-based metric is usually not enough to sufficiently predict hedge fund returns and, therefore, a multiple regression approach should be preferred (similar to the best model approach).

## CONCLUDING REMARKS

Since the seminal paper of Mantegna (1999), network-based analysis has increasingly been applied to financial markets, with a focus on conventional asset classes. We complement this academic endeavor by investigating the (mis)behavior of hedge fund strategies, representing an alternative asset class, through the lens of network methodology. This article presents valuable network-based tools for studying hedge fund strategies and shows that several strategies can exhibit undesired or misleading characteristics; that is, misbehavior concerning style adherence and centrality properties. Furthermore, we demonstrate that some network-based metrics even have the potential to act as a warning indicator for hedge funds.

The application of network-based methods to hedge fund analysis is a relatively novel “playground.” Hence, numerous interesting research questions arise from here. For us, as the authors of this article, two general research areas appear to be of particular interest, namely, network-based forecasting and network-based asset allocation. Future research should investigate more deeply the potential of network-based information to forecast hedge fund returns in a strict out-of-sample setup. At this time, there is very little research on the topic of network-based forecasting. Network-based asset allocation is likewise a relatively novel research area. In this regard, future research can investigate the potential of network-based (risk) information in the context of the management of multiple hedge fund portfolios.

## ENDNOTES

^{1}Financial academia often refers to the term “financial networks” when speaking about financial institution networks based on balance sheet linkages (see Spelta and Arajo 2012, Elliott et al. 2014, and Acemoglu et al. 2015). In contrast, when using the term “financial networks,” we think of networks that are based on a statistical dependency structure estimated from financial time series.

^{2}Therefore, financial networks are often labeled “correlation networks.”

^{3}Hence all quantitative properties of this financial network can be derived from the respective adjacency matrix.

^{4}Financial networks are usually built from hierarchically filtered correlation matrices. Therefore, when explaining network analysis to a “classical” mean-variance investor, one can describe network analysis as a kind of hierarchical correlation analysis.

^{5}Hence, network-based analysis neatly complements research on hedge fund clustering/classification (Fung and Hsieh 1997, Amenc et al. 2003, Brown and Goetzmann 2003, and Deetz 2013).

^{7}The correlation coefficient is not the only option for dependence measurement. Fiedor (2014), Kaya (2015), and Baitinger and Papenbrock (2017b) estimate the dependence structure of asset returns by information-theoretic concepts.

^{8}Financial networks that are based on correlation distances can be called correlation networks. Throughout this article we mainly use the broader term “financial networks.”

^{9}Note that by construction, the distance metric in Equation 1 is not defined for correlation coefficients of one. This concern is of a theoretical nature only, since nondiagonal elements of valid correlation matrices never equal one.

^{10}A nonmathematical description of the MST approach is provided by Mantegna (1999). An integer programming representation of the MST problem is provided by Kaya (2015).

^{11}The opposite applies if one thinks in terms of a correlation coefficient: Considering Equation 1, the MST comprises the highest or a subset of the highest correlation coefficients.

^{12}A less drastic filtering method is the planar maximally filtered graph (PMFG); see Tumminello et al. (2005). However, Baitinger and Papenbrock (2017a) compare network-based information provided by MSTs and PMFGs. They conclude that both types of networks provide qualitatively similar information.

^{13}This property is relevant if we aim to construct interconnectedness risk optimized portfolios or portfolios of assets based on centrality orderings.

^{14}See the network-based interaction profiles in the online appendix.

^{15}Random sampling with replacement.

^{16}Bootstrapping techniques were first applied to financial networks by Tumminello et al. (2007).

^{17}For the sake of completeness, bootstrap-based network interaction profiles for all hedge fund strategies are presented in the appendices.

^{18}Centrality distributions for all hedge fund strategies are shown in the online appendix.

^{19}Since we consider only a limited number of hedge fund strategies, cross-sectional studies are not considered in this article and are left to future research.

^{20}Precisely, the delta of network-based information (*N*_{i,t} − *N*_{i,t−1}) is used to explain the respective next month returns.

^{21}Out-of-sample predictability studies for stock returns are conducted by Pesaran and Timmermann (1995) and Welch and Goyal (2008).

**Disclaimer** The stated email address is used by the corresponding author for no purpose other than to indicate his professional affiliation, as is customary in publications. Furthermore, the contents of this paper are not intended as investment, legal, tax, or any other such advice and do not necessarily represent the views of FERI Trust GmbH, the website www.feri.de, or any of its other affiliates.

- © 2019 Pageant Media Ltd