Factor investing offers a big promise. By identifying the persistent drivers of long-term returns in their portfolios, investors can understand which risks they are exposed to and make explicit choices about these exposures. An oft-cited analogy is to see factors as the nutrients of investing. Just like information on the nutrients in food products is relevant to consumers, information on the factor exposures of investment products is relevant to investors.
This analogy suggests factors cannot be arbitrary constructs. What would you think if Nestlé used its own definition of saturated fat for the information on its chocolate packets and McDonald’s also had its own, but different, definition for the content of its burgers? Further, would it not be curious if both definitions had nothing to do with the one nutritionists and medical researchers used?
This is exactly the situation that we find ourselves in when it comes to information about factors. Investment products that aim to capture factor premia have gained popularity, and investors rely heavily on analytic toolkits to identify the factor exposures of their portfolios. However, neither investment products nor analytic tools necessarily follow the standard factor definitions that peer-reviewed research in financial economics has established.
Investors will benefit from understanding and controlling their exposure to factors only if they are reliable drivers of long-term returns. Factor definitions that have survived the scrutiny of hundreds of empirical studies and have been independently replicated in a large number of data sets are, of course, more reliable than ad hoc constructs used for the purposes of a product provider. Unfortunately, such best practices for data are not observed in the investment industry.
In a recent study, we discuss factor definitions used in investment products and analytical tools offered to investors and contrast them with the standard academic factors. We also outline why the methodologies used in practice pose a high risk of ending up with irrelevant factors. References to the academic and practitioner research, and exhibits illustrating our arguments, can be found in the study.
Are factors grounded in academic research?
Factor models link returns of any investment strategy to a set of common factors. In addition to the market factor, commonly used factors include size, value, momentum, profitability and investment. They capture the variation among returns across firms with different characteristics. In financial economic research, a small number of models have become workhorses for analysing asset returns and fund manager performance, given the consensual understanding that they contain the factors that matter for asset returns. Providers of factor-based investment tools and strategies unequivocally claim that their factors are “grounded in academic research”. However, we show that the factors they use are inconsistent with the factors that are supported by a broad academic consensus.
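In generic form, and with notation introduced here purely for illustration (it is not taken from the study), a linear factor model of this kind can be written as:

```latex
R_{i,t} - R_{f,t} = \alpha_i + \sum_{k=1}^{K} \beta_{i,k} F_{k,t} + \varepsilon_{i,t}
```

Here R_{i,t} is the return of strategy i in period t, R_{f,t} is the risk-free rate, F_{k,t} is the return of factor k, the \beta_{i,k} are the strategy's factor exposures, \alpha_i is the part of the average return that the factors do not explain, and \varepsilon_{i,t} is a residual. In the standard academic models, K is limited to a handful of factors.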
In academia, different models use identical factor definitions, the number of factors is limited to a handful and factors are defined by a single variable. These three properties mean that the different factor models draw on few variables – ones that have been identified as persistent drivers of long-term returns.
In contrast, the factor tools from commercial providers typically include a proliferation of variables. MSCI’s “Factor Box” draws on 41 different variables to capture the factor exposures of a given portfolio. S&P markets a “Factor Library” which, despite including more than 500 variables “encompassing millions of backtests”, wants to help you “simplify your factor investing process”. BlackRock proudly announces “thousands of factors” for its Aladdin Risk tool.
Why do the standard models avoid such a proliferation of variables? First, the need for more factors is often rejected on empirical grounds. One study shows that using 71 factors does not add value over a model with two simple factors (book-to-market and momentum) and another study shows that a model with four simple factors does a good job at capturing the returns across a set of nearly 80 factors. Second, academic research limits the number and complexity of factors because a parsimonious description of return patterns is likely to be more robust.
Non-rewarded versus rewarded factors
Several definitions of the term ‘factor’ exist. Some of them focus on the variability in returns (i.e., short-term fluctuations) and others on the expected returns of assets (i.e., long-term average returns). One type of factor can be used to describe common sources of risk across assets. In this setting, volatilities and correlations among the assets are driven by exposures to a certain set of factors. While this information can provide some understanding of the fluctuations in a portfolio, it does not explain what drives long-term returns. Such factors are referred to as non-rewarded factors. Naturally, there are a number of such non-rewarded factors that can help capture short-term fluctuations. For example, short-term fluctuations of an equity portfolio may be explained by its sector exposures or by its exposure to countries, currencies or commodity risks – among many other possibilities. However, since such factors are not rewarded, an investor does not gain additional returns from such exposures.
Rewarded factors are those that explain differences in the long-term expected return of the assets. Knowledge about these enables an investor to tilt a portfolio towards stocks with high exposure to a factor that is positively rewarded. Investors need to be cautious to avoid misinterpreting a factor offered in commercial factor tools as rewarded when it is actually not. Dividend yield, for example, is included in the factor model of MSCI because it is a source of “time-varying return and risk”. However, it does not explain cross-sectional differences in the long-term expected return.
A severe problem with commercially used factors is the process by which they are defined. It increases the risk of falsely identifying factors, due to weaknesses in the statistical analysis. In fact, providers will analyse a large set of candidate variables to define their factors. Given today’s computing power and the large number of variables representing different firm characteristics, such an exercise makes it easy to find so-called factors that work in the given dataset. However, these factors will probably have no actual relevance outside the original dataset. This problem is well known to financial economists.
Simply seeking out factors in the data without a concern for robustness will lead to the discovery of spurious factors because of the selection bias of choosing among a multitude of possible variables. The practice of identifying merely empirical factors is known as ‘factor fishing’. A key requirement for investors to accept a factor as relevant is therefore that there be a clear economic rationale as to why exposure to it constitutes a systematic risk that requires a reward, and as to why it is likely to continue producing a positive risk premium. In short, factors selected on the sole basis of past performance, without any theoretical justification, are not robust and should not be expected to deliver similar premia in the future.
In addition, there are statistical tools to adjust results for the biases arising from testing a large number of variables. A recent study shows that it is easy to find great new factors in backtests but they add no real value to standard factors. Moreover, these factors do not survive more careful vetting. These results emphasise that it is easy to discover new factors in the data if enough fishing is done, but such factors are neither economically meaningful nor statistically robust. Of course, exposure to non-robust factors with an unreliable backtest performance will not prove useful to an investor going forward. The past will give an inflated picture of the factor-based performance.
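The selection bias described above can be illustrated with a small simulation. The sketch below is a minimal illustration under stated assumptions, not the procedure of any study cited here: every candidate "factor" is pure noise with a true premium of zero, and the number of candidates, sample length, volatility and the Bonferroni correction are all choices made for this example. It counts how many candidates nevertheless pass a conventional single-test threshold, and how many survive once the threshold is adjusted for the number of tests.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def t_stat(returns):
    """t-statistic for the null hypothesis that the mean return is zero."""
    return mean(returns) / (stdev(returns) / math.sqrt(len(returns)))


def simulate_fishing(n_candidates=500, n_months=240, seed=0):
    """Test many pure-noise candidates and count apparent 'discoveries'."""
    rng = random.Random(seed)
    # Assumption: each candidate is i.i.d. noise, so its true premium is zero.
    t_stats = [
        t_stat([rng.gauss(0.0, 0.03) for _ in range(n_months)])
        for _ in range(n_candidates)
    ]
    naive_cutoff = 1.96  # two-sided 5% level for a single test
    # Bonferroni: spread the 5% error budget over all candidates tested.
    bonferroni_cutoff = NormalDist().inv_cdf(1 - 0.05 / (2 * n_candidates))
    return {
        "best_abs_t": max(abs(t) for t in t_stats),
        "naive_hits": sum(abs(t) > naive_cutoff for t in t_stats),
        "bonferroni_cutoff": bonferroni_cutoff,
        "bonferroni_hits": sum(abs(t) > bonferroni_cutoff for t in t_stats),
    }


result = simulate_fishing()
print(result)
```

With a 5% single-test threshold, roughly one noise candidate in twenty is expected to look significant by chance alone, which is why the threshold needs to rise with the number of variables examined. Bonferroni is only one of several possible adjustments and is used here for simplicity.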
We have emphasised that a stark problem arises when providers of factor tools select flexibly from among many variables. It turns out that the actual problem is even worse in practice. Providers of factor products and tools do not stop their data-mining practices at selecting single variables. Instead, they create complex composite factor definitions drawing on combinations of variables. Research shows that the use of composite variables yields an overfitting bias. This arises because combining variables that give good backtest results provides even more flexibility to seek out spurious patterns in data.
For a given combination of variables, changing the weight each variable receives in the factor definition may have a dramatic impact on factor returns.
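The extra flexibility that composite weights provide can be sketched in the same spirit. The example below is a deliberately simplified illustration, not a description of how any provider builds its factors: it assumes that a composite factor return is a weighted sum of component returns, that all components are pure noise, and that weights are picked by maximising an in-sample Sharpe-style ratio over randomly drawn weight vectors. All names and parameters are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def sharpe(returns):
    """Annualised mean-to-volatility ratio of monthly returns (zero risk-free rate assumed)."""
    return mean(returns) / stdev(returns) * math.sqrt(12)


def composite(components, weights, start, end):
    """Weighted sum of component returns over months [start, end)."""
    return [
        sum(w * series[t] for w, series in zip(weights, components))
        for t in range(start, end)
    ]


def weight_search(n_vars=5, n_months=240, n_trials=1000, seed=1):
    """Pick composite weights on the first half of the sample, evaluate on the second."""
    rng = random.Random(seed)
    # Assumption: every component is noise with a true premium of zero.
    components = [
        [rng.gauss(0.0, 0.02) for _ in range(n_months)] for _ in range(n_vars)
    ]
    split = n_months // 2
    equal = [1.0 / n_vars] * n_vars
    candidates = [equal] + [
        [rng.uniform(-1.0, 1.0) for _ in range(n_vars)] for _ in range(n_trials)
    ]
    best = max(candidates, key=lambda w: sharpe(composite(components, w, 0, split)))
    return {
        "equal_in_sample": sharpe(composite(components, equal, 0, split)),
        "best_in_sample": sharpe(composite(components, best, 0, split)),
        "best_out_of_sample": sharpe(composite(components, best, split, n_months)),
    }


result = weight_search()
print(result)
```

Because the selected weights are tuned to noise, the in-sample figure is inflated by construction, while nothing in the data-generating process supports a premium out of sample.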
What do providers do?
Given the well-documented risk of biases leading to useless factors, providers of factor products should use the academically validated definitions. Indeed, many providers claim that their factors are grounded in academic research. MSCI, for example, recently issued a report that clearly emphasises this. It states that the firm’s “factor research is firmly grounded in academic theory and empirical practice”. FTSE also mentions the broad academic consensus that exists for the factors used in its global factor index series.
It is important to highlight, however, what having a strong academic foundation should mean. To claim that a specific factor is “firmly grounded” in academic research means that it should fulfil two criteria. First, its existence should be replicated and documented across different independent studies. This gives investors the assurance that the methodologies are externally validated and that the factors exist outside of the original data set. Second, a risk-based explanation should support the existence of the factor. Without this, there is no reason to expect the persistence of the performance. Post-publication evidence is needed to confirm that the factor does not disappear after it is published. To support a claim for academically grounded factors, providers should be able to list the independent studies showing that these two requirements are fulfilled.
This does not mean that using new or proprietary factors will necessarily fail out of sample. However, it is not possible to obtain the same assurances for the effectiveness of the factor without academic grounding. A prudent approach is to select only factors that have been replicated independently. With this in mind, why would one rely on provider-specific research concerning a new factor when free due diligence from the academic community is available for a standard set?
It is clear that the use of proprietary factors exposes an investor to risks that can easily be avoided.
Whereas product providers use factor names that are usually based on those presented in the literature, the actual implementation is very different. Our study gives some examples of variable definitions that different index providers use as a proxy for factors. These can be compared with the definitions academics use. It is clear provider definitions are more complex than academic ones and differ substantially from the externally validated factors, despite using the same labels, such as “value” and “momentum”.
A relevant question for investors is whether the ‘upgraded’ definitions of standard factors, such as “enhanced value” and “fresh momentum”, add value only in the backtest or whether the benefits hold after publication (i.e., in a live setting). Moreover, in the absence of external replication of such factors, investors are fully reliant on provider-specific results.
Many providers use composite scores in their factor definitions. As discussed above, this opens the door for an overfitting bias, even if composites are equal-weighted across constituent variables.
Overall, product providers explicitly acknowledge that the guiding principle behind factor definitions is to analyse a large number of possible combinations in short data sets and then retain the factors that deliver the highest backtest performance. In fact, providers’ product descriptions often read like a classical description of a data-snooping exercise, which is expected to lead to spurious results. For example, one provider states that, when choosing among factor definitions, “adjustments could stem from examining factor volatilities, t-stats, information ratios”, with an “emphasis on factor returns and information ratios”. Another provider states that “factors are selected on the basis of the most significant t-stat values”, which corresponds to the technical definition of a procedure that maximises selection bias.
Factor definitions providers use may appear to be advantageous in practice. Notably, this is the case when index providers offer both analytical tools and indices. If an analytical tool and a set of indices are based on the same factor definitions, the indices will show an exposure to the factors by construction. However, if the factors are flawed to start with, such correspondence does not add any real value to investors.
Many factors used in investment practice are well known to fail to deliver a significant premium. For example, different analytics packages include the dividend yield, leverage, and sales growth as factors, while all of these have been shown not to deliver a significant premium.
Factors may also be redundant with respect to consensus factors from the academic literature. In other words, many proprietary factors may have return effects that can be explained away by the fact that they have exposures to standard factors.
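One common way to check for this kind of redundancy is a spanning regression: the candidate factor's returns are regressed on the standard factor returns, and the intercept (alpha) measures what is left once the standard exposures are accounted for. The sketch below is a minimal single-regressor version on simulated data. The function name, the simulated premium and volatility, and the loading of 0.8 are assumptions made for illustration, and a real test would use several standard factors at once.

```python
import math
import random
from statistics import mean


def spanning_regression(candidate, standard):
    """OLS of candidate factor returns on one standard factor.

    Returns (alpha, beta, t-statistic of alpha).
    """
    n = len(candidate)
    x_bar, y_bar = mean(standard), mean(candidate)
    sxx = sum((x - x_bar) ** 2 for x in standard)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(standard, candidate))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    residuals = [y - alpha - beta * x for x, y in zip(standard, candidate)]
    resid_var = sum(e * e for e in residuals) / (n - 2)
    se_alpha = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    return alpha, beta, alpha / se_alpha


# Simulated example: the "proprietary" factor is a noisy repackaging of the
# standard factor, so its apparent premium is inherited rather than independent.
rng = random.Random(2)
standard = [rng.gauss(0.004, 0.03) for _ in range(600)]
candidate = [0.8 * x + rng.gauss(0.0, 0.01) for x in standard]

alpha, beta, t_alpha = spanning_regression(candidate, standard)
print(f"alpha={alpha:.5f}, beta={beta:.3f}, t(alpha)={t_alpha:.2f}")
```

A candidate with a positive average return but an alpha that is statistically indistinguishable from zero adds nothing beyond the standard factor it loads on.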
Popular factor products and tools contain a large number of factors that do not deliver an independent long-term premium. This is bad news for investors who are using such tools to understand the long-term return drivers of their portfolios.
Factors used in investment practice show a stark mismatch with factors that financial economists have documented. Commercial factors are based on complex composite definitions that offer maximum flexibility. Providers use this flexibility to seek out the factors with the highest performance in a given dataset. Such practice allows spurious factors to be found. Spurious factors work well in a small dataset but will be useless in reality. Therefore, many factors that appear in popular investment products and analytic tools are likely false.
Even though many providers claim their factors are grounded in academic research, we have emphasised that two important conditions to support this claim are often not fulfilled. The factor definitions should have been used and validated across different independent studies and a risk-based explanation should support the existence of the factor. Without these assurances, there is no reason to assume the persistence of the factor.
We also show in our study that relying on proprietary factor definitions can lead to unintended exposures. For example, investors who tilt towards a composite quality factor will end up with a strategy where, depending on the index we consider, only about a third or half of the excess returns are driven by exposure to the two well-documented quality factors (profitability and investment). This means the part of the excess returns that is unrelated to quality factors can be as high as two-thirds, an obvious misalignment with the explicit choice to be exposed to quality factors. Even if the quality factors perform as the investor expects, this performance will not necessarily be reflected in portfolio returns, which are in large part driven by other factors and idiosyncratic risks.
Understanding the factor drivers of returns increases transparency and allows investors to formulate more explicit investment choices. However, investors must also be wary of exposures to useless factors, which have no reliable link with long-term returns. A good idea can easily be distorted when implemented with poor tools. For a meaningful contribution to the ability of investors to make explicit investment choices, factor investing should focus on persistent and externally validated factors.
Felix Goltz is head of applied research, EDHEC-Risk Institute and research director at Scientific Beta. Ben Luyten is a quantitative research analyst at Scientific Beta.