Avoiding the pitfalls of ESG scores

Traditional scores or ratings of a company’s environmental, social and governance (ESG) credentials attempt to translate assessments across a host of criteria into one convenient number representative of overall corporate performance. They have been advertised to investors as relevant metrics that support comparisons across companies as well as computation of aggregate portfolio performance indicators for purposes of reporting and fund selection.

Whereas exclusionary screening, which remains the most widely applied ESG strategy globally, does not rely on ESG scores, other popular strategies often do, and they are frequently marketed on promises of superior financial, and especially non-financial, performance.

Extensive academic research has underlined that these scores are not capable of guiding investors concerned with social welfare and environmental sustainability. International organisations as diverse as the OECD and the WWF have warned against viewing ESG scores as a meaningful indicator of an investment strategy’s contribution to the achievement of ESG goals, in particular the fight against climate change.

These dire warnings are buttressed by the amply documented lack of convergence of ESG scores across different providers. This divergence, which we analyse in the first section of this article, is due not only to differing objectives, definitions, methodologies and data, but also to the inherent subjectivity of assessment.

In the second part, we highlight the additional concerns linked to averaging ESG scores across a portfolio and using such an average as a goal or constraint in portfolio construction. Portfolio optimisations based on average ESG scores magnify the estimation errors of individual ESG scores. Moreover, average ESG scores can only be viewed as relevant if one makes questionable assumptions about investors’ utility functions with respect to ESG performance and/or unrealistic assumptions about the link between ESG scores and the ESG risks of companies.

The lack of convergence of ESG scores is nevertheless not an indictment of all ESG data. Indeed, the ESG screens incorporated in our off-the-shelf ESG and climate options do not seek to manage an average score at the level of the index but impose the same minimum ESG standards on all constituents. Thus, concerns about portfolio-level financial or ESG performances are not permitted to distract from the removal of securities of issuers whose activities or behaviours violate global ESG norms – violations that can be documented with reasonable objectivity. We prefer involvement indicators that shed light on inconvenient truths to the convenience of portfolio averages, which may obscure issues, and we prefer to rely on data grounded in physical realities, such as carbon emissions, when building portfolios that contribute to tackling climate change.

The divergence of ESG scores across providers questions their reliability

The overall informational potential of ESG scores is low, as ESG scores are derived from heterogeneous data and idiosyncratic methodologies, and may therefore diverge significantly from one data provider to another. This divergence originates from differences in objectives (what), methodologies (how) and assessments. Scores could (and do) diverge because they relate to fundamentally different concepts, such as measurement of the ESG impact or performance of a company versus measurement of the financial materiality of ESG issues for a company. They also diverge in the choice and weighting of criteria and/or because of differences in data sources and treatment, including differences arising from subjectivity. Academic studies have shown that more than half the divergence observed is explained by the latter, i.e. differences in assessment.

Several academic studies have documented the low correlation between the ESG scores of different rating providers, with average correlations ranging from 0.40 to 0.61 across studies. Chatterji et al. (2016), for example, conclude from this that these metrics cannot guide issuers and that, in the worst-case scenario, well-intended managerial attention to social metrics could reduce social welfare. Likewise, they note that investment based on these invalid metrics will fail to direct capital toward the most responsible firms. Finally, they observe that the lack of validity or the inconsistency of ESG scores should cast doubt on the validity of score-based academic research on the effects of ESG on performance. An additional problem for the validity of studies trying to link ESG scores to financial performance is that the scoring history is sometimes rewritten, creating a risk of hindsight bias.
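As a rough illustration of how an average cross-provider correlation of this kind can be computed, the Python sketch below takes the scores that several providers assign to the same set of companies and averages the Pearson correlation over every pair of providers. The provider names and score values are invented for the example and do not reproduce the data or the exact method of any study cited here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_pairwise_correlation(scores_by_provider):
    """Mean of the Pearson correlations over all pairs of providers."""
    pairs = list(combinations(sorted(scores_by_provider), 2))
    total = sum(
        pearson(scores_by_provider[a], scores_by_provider[b]) for a, b in pairs
    )
    return total / len(pairs)


# Hypothetical scores (0-100) given by three providers to the same five companies.
scores = {
    "provider_a": [72, 55, 40, 88, 61],
    "provider_b": [65, 70, 35, 60, 80],
    "provider_c": [50, 45, 62, 90, 58],
}

avg_corr = average_pairwise_correlation(scores)
print(f"Average pairwise correlation: {avg_corr:.2f}")
```

A value well below 1 means that a company's relative standing depends heavily on which provider is consulted.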

In conclusion, observing the lack of convergence of ESG scores, the OECD warned that “if high ESG scores are simply a judgment that varies significantly across firms, the extent to which investors can be assured that this approach either provides enhanced returns or aligns with particular societal values merits further scrutiny by policy makers and the investment community.” (See OECD, 2020, chapter 1, Robert Patalano and Riccardo Boffo.)

The difference between reliable ESG data and ESG scores: The example of environmental performance

We must nevertheless underline that the above observations on the lack of convergence of ESG scores are not an indictment of all ESG data. They do not apply to issuer-reported data that are accepted as valid, or to objective data that may be directly measured or modelled with reasonable precision.

Taking the environmental dimension as an example, ESG scores may be viewed as contributing to greenwashing.

At the issuer level, ESG scores are typically averages of indicators of corporate strengths and weaknesses over multiple criteria. Averaging allows certain issuers to achieve strong scores despite association with material ESG concerns and provides rich opportunities for astute and well-endowed companies to take a “strategic” approach to ESG scores by orientating ESG investments and reporting towards “low-hanging fruit”.

This leads to some questioning the very relevance of ESG scores. As an illustration, the World Wide Fund for Nature European Policy Office (WWF, 2019) noted in its feedback on the update of the EU Benchmark Regulation that it was not convinced that ESG scores were very robust. Their consideration of secondary ESG issues (what the WWF called “nice to have”) could lead to overlooking critical ESG issues (what it called “strategic core business issues”) and a focus on process indicators (“box ticking”) could lead to overlooking impact indicators. The non-governmental organisation also objected to the relative nature of most ESG ratings, which leads to sustainable companies being distinguished within non-sustainable sectors.

Focusing on climate change, we observe that academic studies looking at the correlation across greenhouse gas emissions data distributed by different providers find it to be strong for direct emissions (scope 1) and indirect emissions linked to consumption of purchased electricity, heating or cooling (scope 2). Busch et al. (2018) conclude their comparison of emissions provided by Bloomberg, CDP, ISS ESG, MSCI, Sustainalytics, Thomson Reuters and Trucost with these words:

“When outliers are removed from the data samples, data concerning scope 1 and 2 emissions provides a rather homogeneous picture. Notably, a high level of consistency can be achieved when data gathering and reporting practices follow the GHG protocol. At the same time, the aggregated consideration of estimated data for scope 1 and 2 emissions provides a surprisingly homogeneous result. While the consistency of estimated data – as can be expected – is lower as compared to reported data, the different estimation methods being applied seem to close data gaps in an adequate manner.”

This suggests that climate change metrics based on scope 1+2 carbon emissions data may be much more relevant than environmental scores when analysing a portfolio’s exposure to climate transition risks or an investor’s contribution to climate change.

Indeed, the quantitative analyses performed by the OECD (OECD, 2020, from chapter 2, Robert Patalano and Catriona Marshall) highlighted the risks of relying on environmental scores when defining investment strategies aimed at addressing climate change: “the E score in its current form is not an effective tool to differentiate between companies’ activities related to outputs that affect the environment, climate risk mitigation to improve risk-adjusted returns, and medium-term strategies to align portfolios with lower-carbon activities.” For some of the scoring providers analysed, they even, worryingly, found that good environmental scores correlated positively with high emissions.

The problems with using portfolio-average ESG scores

Average ESG scores at the portfolio level are only meaningful from a socially responsible investment standpoint if one assumes that the utility of ESG performance is linear, for example that holding a company facing a critical ESG controversy could be neutralised by investing in a company that has earned a corporate sustainability award.

From an ESG risk management angle, average ESG scores provide useful insights only if one assumes that scores very accurately proxy for ESG risks and furthermore that these risks are linear. None of these assumptions are substantiated by academic studies, or even by intuition or casual observation.

With respect to the question of the linearity of investors’ ESG preferences, let us note that high ESG performance at the corporate level rarely attracts as much attention or elicits as much passion as poor ESG performance – companies that are known to be failing basic standards of corporate responsibility receive disproportionately more coverage than those companies that greatly exceed standards.

Likewise, consumers have traditionally been found to consider the ESG performance of companies as a hygiene factor rather than a motivator (Meijer and Schuyt, 2005). Hence, for the average investor with progressive ESG motivations it is unlikely that the non-financial impact of holding a company facing a critical controversy could be neutralised by an investment of the same amount in a company that has earned a corporate sustainability award.
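The contrast between a portfolio average and a minimum standard applied to every constituent can be made concrete with a small Python sketch. All company names, weights, scores and thresholds below are hypothetical, and the two functions are only one simple way of expressing the two approaches, not a description of any actual index methodology.

```python
def weighted_average_score(holdings):
    """Portfolio-level score: weight-averaged constituent scores."""
    total_weight = sum(weight for _, weight, _ in holdings)
    return sum(weight * score for _, weight, score in holdings) / total_weight


def failing_minimum_standard(holdings, floor):
    """Names of constituents whose own score falls below the floor."""
    return [name for name, _, score in holdings if score < floor]


# Hypothetical portfolio: (name, weight, score on a 0-100 scale).
holdings = [
    ("AwardWinner", 0.25, 95),    # has earned a sustainability award
    ("SolidCo", 0.25, 75),
    ("AverageCo", 0.25, 70),
    ("ControversyCo", 0.25, 20),  # faces a critical controversy
]

average = weighted_average_score(holdings)
flagged = failing_minimum_standard(holdings, floor=30)  # hypothetical floor

print(f"Portfolio average score: {average:.1f}")
print(f"Constituents below the minimum standard: {flagged}")
```

With these made-up numbers, the average would satisfy an assumed portfolio-level target of 60 even though one holding fails the constituent-level floor: the award winner arithmetically offsets the controversial holding, which is exactly the linearity assumption questioned above.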

From an ESG risk management angle, the use of a portfolio’s average ESG score as a proxy for ESG risks with potential financial materiality assumes that there is an exact linear relationship between the ESG performance at the constituent level and the expected value of the financial impact of the risk realisation. While such assumptions may be convenient, we do not regard them as conservative, especially for downside risk management. By contrast, exclusions of companies that fail certain demanding standards or thresholds (e.g. controversial weapons producers or coal-related companies) focus on companies that can be viewed as high risk. Supportive of this orientation are studies such as that by Oikonomou, Brooks and Pavelin (2012), which finds that ESG strengths are negatively but insignificantly associated with systematic firm risk – including downside risk measures – while ESG weaknesses are significantly positively related to these measures; the association is particularly strong for socially irresponsible actions.

Academic research has underlined that ESG scores are not able to guide issuers or investors who are concerned with social welfare and environmental sustainability. In addition, international organisations as diverse as the OECD and the WWF have warned against viewing ESG scores as a meaningful indicator of an investment strategy’s contribution to the achievement of ESG goals, in particular the fight against climate change.

We favour ESG screening, which does not seek to manage an average score at the level of the index or portfolio but imposes the same minimum ESG standards on all constituents. Thus, concerns about portfolio-level financial or ESG performances are not permitted to distract from the removal of securities of issuers whose activities or behaviours violate global ESG norms.

Moreover, attention is focused not on irrelevant constructs built on unreliable data but instead on objective and robust exposure data pertaining to key ESG issues. As such, in the area of climate risk mitigation, we prefer involvement indicators that shed light on inconvenient truths, and carbon data grounded in physical realities, to the convenience of portfolio averages that may obscure issues. Indeed, filtering or weighting stocks based on reported emissions and modelled emissions with reasonable convergence is a more sensible way to tackle climate change than relying on divergent environmental scores, which an OECD study recently described as ineffective tools to assess the environmental impact of companies and counter-productive indicators from the point of view of climate change mitigation.

Erik Christiansen is an ESG investment specialist at Scientific Beta.

Sponsored Content
