Goldman Sachs Research
Global Portfolio Analysis
Sustainable ESG Investing: Turning Promises Into Performance
6 July 2020 | 3:37PM EDT | Research | By Steven Strongin and others
Environmental, social and governance (“ESG”) investing is at a deep level all about sustainability, yet we rarely ask the question, “How sustainable is ESG investing itself?” Its goals seem ever variable. Its link to performance appears uncertain. Its actual impact on corporations is hard to define and even harder to measure. It would seem to miss on many of its own metrics.
Yet, ESG investing clearly meets a deeper need within the system. There is a clear and almost unquestioned assessment that something is missing within the regular investment process that represents a critical failure of modern finance and perhaps of capitalism itself.
Here, we seek to provide a clearer sense of where ESG investing fits in the broader scope of active asset management, examine the gaps it fills and use that assessment to better structure the ESG investing processes, assess its place in asset allocation and rethink the metrics we apply to it.
In this endeavor, we focus on ESG as an investment style, not as a force for good. This is not to deny that many investors in ESG funds have a desire to use their investing to do good, but rather, we seek to treat ESG investing as a rational investment discipline capable of generating long term outperformance that will cover its fees and provide investors with appropriate levels of realized outperformance sufficient to sustain itself as an investment product long term. Valuing ESG as a force for good would require a common agreement on the definition of good, stable metrics for measuring it, and links between investing and shifts in that metric that would allow a reasonable valuation of the net cumulative social impact, none of which we have today. Further, as an investment process, we will argue that much of the value of ESG as an investment tool arises from anticipating shifts in political and social priorities that then create shifts in economic opportunities for investors. Such shifts, while clearly important from an ESG investing standpoint, nevertheless pose real difficulties for developing ex ante metrics that anticipate changes in social norms that have not yet occurred and the social impact of the related changes in corporate behavior.
We argue that the core of ESG investing as an investment discipline is all about time frame, specifically the long term. ESG can be thought of as an investment process which posits that the markets and many companies are too short term in their assessments and that if a company effectively invests in its own structure, its people and its community, that company’s long run performance will improve. The idea that companies that embrace better business ethics, respect for human dignity and environmental responsibility as core principles will over time attract better employees, develop tighter relationships with customers and avoid government censure, and thus be able to create more economic value long term, has clear and obvious business logic, even if the timing and magnitude of the value created are less clear. Equally, it is easy to understand how companies that fail to embrace these values are taking serious and perhaps ultimately fatal business risk. This, however, does not imply a simple metric-driven test of long run success. High level principles of behavior and their applications are often context-dependent, with significant subtleties in execution.
It has been argued that all asset management should embed an ESG based long term view of corporate behavior. Such an embed, as attractive as it might appear, is not in practice likely to be the best way to achieve either the social or investing goals of an ESG investor. As we will argue in detail, ESG investing is likely to perform better, both as an investment strategy and as a force for good, if it has its own allocation and is evaluated in ways compatible with its underlying long term focus.
Investment strategies do not, in general, combine well. The reason that style boxes keep getting narrower over time is that diversifying across strategies tends to work far better than combining them into hybrid strategies. The general reason for this is that risk management, portfolio construction rules and other portfolio optimization methods should differ considerably by strategy. This is examined in detail in our publications, A Stockpicker’s Reality Part III: Sector Strategies for Maximizing Returns to Stockpicking and A Stockpicker’s Reality Part IV: Why Shorts Aren’t Longs. Further, ESG’s social intent requires improvement over time in reflecting changes in scientific understanding and social norms rather than maintaining a static definition of good. Equally, it requires a culture of cross-checking between claims of good behavior and the reality of corporate behavior.

Data driven processes: matching data use to purpose

In the context of ESG investing, the difference between best practices for evaluating its strategies and performance and those for other equity investment strategies is particularly acute. Quite simply, as equity asset management has become more data driven, it has also become more short term. This is not a behavioral or psychological issue, but rather, the result of the intrinsic biases in any data driven process. In management circles, it is often noted that what gets measured is what gets managed. But the bias toward short term investing actually runs much deeper and reflects the fundamental mathematics of statistics. In any statistical analysis, whatever is easily and frequently measured and has the more stable relationship to outcomes will dominate the analysis.
Longer term factors, like those embodied in ESG, are often poorly measured as it is harder to define and refine the deeper aspects of a firm’s management. Furthermore, the impact of these structural factors is not mechanically related to performance in the way that cost-cutting or refinancing would be, but rather is felt over time as opportunities allow. As a result, the explanatory power of such long term factors is often absorbed in such statistical analysis by more easily measured short term factors which are correlated with (and sometimes driven by) those longer term factors, but which are more closely tied to the mechanical processes which generate the more immediate returns.
Inevitably, this growing bias toward shorter term performance in most equity strategies has created longer term opportunities and thus helped foster the desire for and the development of long term investment processes—private equity and ESG being the two largest categories whose growth has mirrored the rise of quantitative methods in asset management. In both cases, there are explicit notions that patience in process can reward investors. In private equity, this long term aspect is structural in its mandates and well understood both by the investors and the managers. For ESG investing, the struggle to be and stay focused on the long term has been harder.
The idea underlying ESG based investing is simple yet profound—if companies do things that improve today’s performance at the expense of more fundamental things that matter over time, those companies will become poor investments, even if the “when” is uncertain. The problem in executing this idea is that ESG investing lends itself to the application of the same types of metrics that have driven equity active management to short term strategies (sometimes explicitly, such as in true quant funds, and sometimes implicitly due to performance optimization assessments of normal active equity mandates). In the kind of data driven world that we now live in, that uncertainty about the “when” is a statistical headwind of enormous size. As noted above, quantitative methods inherently reward frequency and consistency of impact and timing. ESG investing is based on things that have inherently weak timing links to performance.
The response of ESG investors to this conflict between “wanting metrics” and wanting to look long term has in some cases been to shift the emphasis away from performance and emphasize the good that is being done, or to demand new metrics that will somehow reconcile short term measurement with long term performance. Both of these strategies are likely to fail in the long term. Metric driven investing that tries to be long term and short term at once is almost certainly just as wrong-headed as it sounds. Persistently bad performance is not sustainable either: it will eventually cause money to leave ESG investing, because the good done by ESG investing is even harder to measure than ESG factors themselves. What is needed, if ESG is to remain a sustainable framework for investing, is to fully embrace ESG’s long term focus and work to understand what that means in practice for investment performance (both good and bad) and then to learn how to manage those realities to acceptable levels of performance.
We would note that these arguments in no way diminish the value and need for ESG disclosures and metrics. Like all financial disclosures, ESG disclosures and all of the data work around them provide a baseline for deeper discussions of strategy and commitment, and information for consumers and employees to be better informed about the companies they deal with. The context and dialogue generated by ESG disclosures are just as essential for ESG assessments as financial data is for financial assessments.
As discussed above, as ESG focuses on longer term factors, ESG can be treated as a separate equity allocation, like private equity, so that performance reviews can be done in ways consistent with its intent and its long term nature. Just including ESG requirements in a normal equity mandate creates a serious conflict between short term performance goals and the long term nature of ESG. Over time, that conflict will lead active managers to treat ESG as a constraint rather than as a goal, which, combined with company incentives, will likely lead to companies “checking the box” rather than embracing the underlying long term perspective. In contrast, a dedicated ESG manager would view being able to tell the difference between ESG appearance and reality as a key source of alpha. Bad behaviors and potential reputational vulnerabilities are often unique to a company and are often carefully concealed from any would-be metric. Finding such weaknesses should be a key focus of ESG investing, not rejected for lack of metrics. Similarly, those unique unquantifiable long term strengths of culture and positioning that some companies possess should also be embraced.
Thus, the ESG portfolio manager’s metrics and performance review structure need to embrace that long term reality in ways similar to how other long term equity allocations, such as private equity, are reviewed, rather than treat ESG as just another public equity mandate. ESG performance reviews need to be heavily weighted to process and longer term cumulative returns assessments while seeking to minimize short term performance checks. Furthermore, we suggest that the process assessments look carefully at both the coherence of the ESG assessments and the track records of understanding the implications of those assessments at the company level. As events tend to crystallize ESG performance, both at the portfolio and company level, it is perfectly appropriate in a process review to ask if the fund was prepared for those moments and assess its track record at being prepared for such events.
The key to correctly reviewing a long term fund is not to turn that review into a short term performance review that will move the fund out of long term assessments and into short term performance optimization. ESG is particularly vulnerable to this mistake as the data and methods to push it “shorter term” are so well-developed and useful in other contexts. Why and how asset allocators and ESG portfolio managers should and can avoid this short term trap is the focus of the rest of this paper. To do so, we review the major cornerstones of ESG investing and discuss how, for each of those cornerstones, ESG metrics and performance metrics are likely to relate to each other so that portfolio managers and asset allocators can better understand good versus bad ESG process and good versus bad ESG execution. In the end, the sustainability of ESG investing will depend on a common understanding between portfolio managers and asset allocators about methods, goals and long term performance that earns its fees.

ESG Flash-points: hard on metrics, but central to performance and social mission

Much of the apparent instability in ESG methods and the seeming unpredictability in its goals arises from Flash-points. Flash-points are the activities or corporate practices that investors believe, while “acceptable” today, will create real problems for those companies at some point in the future. Those problems could take many forms, from legal and regulatory issues, to recruiting problems, to consumer boycotts. These Flash-points often give ESG a distinctly political flavor as they track ongoing partisan debates about what constitutes socially or politically acceptable behavior. For effective investing, the ESG portfolio manager needs to evaluate potential Flash-points both in terms of their likely consequences and timing. From a pure performance standpoint, the best investment outcomes arise from acting shortly before the general market takes the risk of such Flash-points seriously. Thus, such investing can and should run ahead of the current political consensus, but not too far ahead. We would note that the importance of a particular Flash-point is not necessarily about the political outcome, but about the economic impact, which typically has more to do with intensity and location of interest and less to do with the broader political consensus.
From a benchmarking standpoint, Flash-points pose significant challenges. The value of risk management, which is what Flash-points are about from an investing standpoint, is about avoiding left tail events. These events are by their nature rare but extreme and as a result, tend to provide very low power statistical assessments.
From an investment standpoint, avoiding Flash-points (usually structured as a refusal to buy from or own companies of a certain type) tends to mean giving up some current return to avoid later losses. This obviously causes pain in terms of short term performance, but also poses real issues even for long term assessments. Not all potential Flash-points create economic consequences. The return profiles around Flash-points are very much like buying insurance. You lose money most of the time, but occasionally see large payoffs (in this context the payoff arises from avoiding large losses). Such payoff paths must be evaluated on a cumulative basis over extended periods for a portfolio of Flash-points. Shorter time spans, period by period analysis, more narrow assessments across narrow areas of concern, or very broad portfolios of all possible Flash-points, all typically find little if any value to such strategies not because the value wasn’t created, but because the implied implementations are awful.
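The insurance-like return profile described above can be made concrete with a small sketch. Every number here is a hypothetical assumption chosen for illustration (a 1% relative return given up in each quiet year, a 15% relative gain in the one year a Flash-point hits), and compounding relative returns directly is a simplification.

```python
# Hypothetical illustration: all numbers below are assumptions, not data.
# Relative return (strategy minus benchmark) per year for a manager who
# avoids a sector exposed to a potential Flash-point.
annual_cost = -0.01      # assumed return given up in each quiet year
avoided_loss = 0.15      # assumed relative gain in the year the Flash-point hits
years = 10
event_year = 9           # zero-based index: the event arrives in the final year

relative_returns = [avoided_loss if y == event_year else annual_cost
                    for y in range(years)]

# Period-by-period view: number of years in which the strategy lagged.
losing_years = sum(1 for r in relative_returns if r < 0)

# Cumulative view: compound the relative returns over the full horizon
# (a simplification that treats relative returns as directly compoundable).
cumulative = 1.0
for r in relative_returns:
    cumulative *= 1.0 + r
cumulative_relative_return = cumulative - 1.0

print(f"Years lagging the benchmark: {losing_years} of {years}")
print(f"Cumulative relative return: {cumulative_relative_return:.1%}")
```

Under these assumptions the strategy lags in nine years out of ten yet ends about 5% ahead on a cumulative basis, which is why a period-by-period review would misjudge it.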
A good implementation of a Flash-point strategy would arise from a dynamic portfolio of possible Flash-points weighted by the current likelihood of those Flash-points becoming active areas of public concern, focused on areas where that concern is likely to be an active business problem. Performance reviews should focus on a portfolio manager’s ability to distinguish between real and imagined Flash-points. Additionally, a higher penalty should be put in place for failures to identify emerging Flash-points than for identifying too many, reflecting the relative potential performance impact of these two types of errors.
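One way to picture the asymmetric penalty is a simple review scorecard. The sketch below is hypothetical: the function name `review_score`, the two penalty weights and the sample managers are assumptions made only to show the asymmetry, not calibrated values or a recommended scoring system.

```python
# Hypothetical review scorecard; the penalty weights are assumptions chosen
# only to show the asymmetry, not calibrated values.
MISS_PENALTY = 5.0         # Flash-point emerged but the manager had not flagged it
FALSE_ALARM_PENALTY = 1.0  # manager flagged a Flash-point that never became active


def review_score(calls):
    """Return the total penalty for a list of (flagged, materialized) pairs."""
    penalty = 0.0
    for flagged, materialized in calls:
        if materialized and not flagged:
            penalty += MISS_PENALTY
        elif flagged and not materialized:
            penalty += FALSE_ALARM_PENALTY
    return penalty


# Two hypothetical managers reviewed on the same five candidate issues,
# of which the first and fourth actually materialized.
cautious = [(True, True), (True, False), (True, False), (True, True), (False, False)]
complacent = [(False, True), (False, False), (False, False), (True, True), (False, False)]

print(review_score(cautious))    # two false alarms
print(review_score(complacent))  # one missed Flash-point
```

With these weights, two false alarms cost the cautious manager less than a single miss costs the complacent one, mirroring the relative performance impact of the two error types.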

ESG fundamentals

It might be assumed that while Flash-points are tricky to assess using standard quantitative methods within a static framework, the more fundamental persistent aspects of ESG would be more amenable to such methods. But the reality is quite different. There are a number of deep problems with static assessments of underlying ESG corporate behaviors that need to be understood and managed for in an investment context. First, the timing of ESG related actions is often largely independent of related company outperformance. Second, static metrics often fail to take into account that it is possible in some aspects of ESG to reach counter-productive levels of ESG expenditures and thus, these static metrics can value excess spending as though it were effective. Third, static metrics can invite false positives when companies see advantages in pretending to have principles and behaviors they do not have. Fourth, and perhaps most importantly from an investment standpoint, static metrics fail to recognize that the potential for improvement may have more value for the investor than actual good behavior, if a company can swing from being penalized for bad behavior to being rewarded for good behavior.
To illustrate these problems, we take a tour through various ESG categories to show how each of these issues arises in practice and how portfolio managers might better approach them to improve investment results.

Diversity and talent

Assessing the economic value of diversity at the corporate level provides a simple and straightforward example of the timing problems that permeate the statistical analysis of ESG factors. Short term optimization (expediency) can lead companies to uniformity in people, products and methods. The justification (weak as it may be) is that it is simpler and quicker to recruit and manage similar individuals, in the same way that it is to produce similar products. Effective diversity programs take work and investment both in recruiting and management. However, as many studies have shown, diverse staffs tend to be higher quality and, even more importantly, to have diversity of experience that makes it easier for a business to understand its failures and address them.
The research on diversity is quite clear that among other advantages diverse teams consistently produce more original solutions to new problems.[1] Thus, we would expect a diverse company to outperform more uniform ones in periods of company or industry stress. However, the timing of that value creation is dependent on the timing of the stress that allows those abilities to have meaningful impact rather than on the timing of the company’s investment in diversity. Thus, any statistical analysis looking to find a relationship between increases in diversity and performance is unlikely to produce convincing statistical evidence. Only analysis that examined the difference in performance between similar companies under similar stress would show the expected value from investments in diversity. Such analysis, if possible at all, would require substantial time to build up the needed data.
Further, diversity, like almost all ESG factors, requires real commitment, not just good statistics. Diverse teams will only produce more original solutions if the team members are all given real voices. Simply being there is not enough. An over reliance on metrics without cross-checks creates incentives for companies to create appearances without reality. The portfolio manager seeking investment results needs to be focused on the reality.
It is worth noting that the timing problems we have been discussing are among the simplest. The broader ability of companies to recruit good, diverse staff and engage effectively with the communities they serve is also often dependent on having diverse staff, but the resulting corporate performance will almost certainly be better correlated to the actual recruiting and the actual engagement strategies that were enabled by diversity, rather than the investment in diversity. This is a general problem with long term enabling factors. The performance is typically linked far more closely to the behaviors enabled rather than the investments which made those behaviors possible.

Climate and ESG

Climate investing within ESG illustrates another important conflict between good investing and simple metrics. Climate may be one of the defining investing and social issues of our time. But from an investing standpoint, the implications of climate change are often not that clear. The answer will depend critically on the technological and political paths by which we eventually address climate change. Many “green” answers may turn out to be economic dead ends. Even if we assume, as many economists argue, that the best way to address climate change is a price on carbon and focus just on carbon prices, the level and scope of that carbon price will have significant implications for investors.
The simplest, but perhaps least likely, path to address climate change would be with known technologies and strict reductions in emissions (though this is the path often embedded in green metrics). Often analyzed by means of a “carbon budget,” such a path requires stranding large amounts of carbon based energy assets, huge investments in clean energy, rapid electrification of transport, and large scale changes in agriculture. Such a path has clear investment implications which are often cited in discussions of ESG green investing.
There are, however, a number of obvious problems with this path that suggest that other alternative paths may be more likely and be more appropriate baselines for ESG investing. First, the political will to travel such a path is questionable. The implied welfare impact on voters is high and voter resistance in many cases appears equally high. This is particularly true in less urban areas where mass transit and electrification are significantly more expensive to implement.
Further, and perhaps more salient, is that it may not be the path that would arise from an actual carbon price, which most economists would argue is both the most efficient and the fastest way to address climate change. A carbon price would naturally induce companies to find the cheapest way to reduce carbon emissions. This would foster investment in new technologies, some of which would seek to absorb carbon from the atmosphere. For any emissions that were more expensive to avoid than to absorb post emission, the economically efficient answer would be to emit and absorb. In such a world, the price of absorbing (or sequestering) carbon would be the central economic question about investing in climate. Conservation that was more expensive than sequestration would not be a good use of capital. This is highly relevant as many current conservation efforts, when viewed through this lens, look suspiciously expensive relative to the forecasts of where carbon prices might equilibrate (see Carbonomics: The Future of Energy in the Age of Climate Change).
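The emit-and-absorb logic reduces to a cost comparison per tonne. The following is a minimal sketch in which the sequestration price, the option names and all costs are hypothetical assumptions; none are taken from the exhibit or from any study.

```python
# Hypothetical abatement options with assumed costs in US$ per tonne of CO2.
# None of these figures come from the exhibit or from any study.
SEQUESTRATION_PRICE = 100.0  # assumed cost of absorbing one tonne after emission

abatement_costs = {
    "efficiency retrofit": 20.0,
    "fuel switching": 60.0,
    "targeted subsidy programme": 250.0,
}


def cheapest_response(abatement_cost, sequestration_price):
    """Pick the lower-cost way to deal with one tonne of CO2."""
    if abatement_cost <= sequestration_price:
        return "avoid", abatement_cost
    return "emit and absorb", sequestration_price


decisions = {name: cheapest_response(cost, SEQUESTRATION_PRICE)
             for name, cost in abatement_costs.items()}

for name, (action, cost) in decisions.items():
    print(f"{name}: {action} at US${cost:.0f}/t")
```

In this toy setup the two cheaper options are worth doing directly, while the expensive programme is dominated by sequestration, which is the sense in which the sequestration price caps the value of conservation spending.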

Exhibit: Implied carbon prices for regulatory mandated carbon projects

Range of static carbon abatement cost of different past policies (US$/tCO2eq). Data available on request.
With current emissions on a continuing upwards trajectory, a wide range of energy efficiency and low-carbon policies have been put in place in different countries over the past decade aiming to tackle the challenge of climate change. Some of them have been very targeted (e.g. ethanol/wind/solar subsidies), while others were broader (fuel standards). In aggregate, they have been successful at incentivizing clean tech developments, yet they have not necessarily been a cost-efficient way for reducing carbon emissions, and they have only fostered technological innovation in narrow areas of the low-carbon economy. The costs associated with these policy measures encompass a very wide range, from zero to US$1,000/tCO2, with several of the policies implying a cost/ton CO2 that is higher than the implied cost of alternative technologies such as sequestration. The economic studies involved in shaping the estimates presented in the above chart are primarily concerned with policy measures that were in force during the period 2010-14.
Source: Kenneth Gillingham and James H. Stock, “The Cost of Reducing Greenhouse Gas Emissions,” Journal of Economic Perspectives, vol. 32. Copyright American Economic Association; reproduced with permission of the Journal of Economic Perspectives.
Further, this is a very pure case. The political reality is that the implications of a high carbon tax for income inequality and social justice suggest that the eventual carbon tax is not likely to be applied evenly across activities and geographies. This would in turn suggest that “green” investors would need to tailor investments to take account of the likely differentials in taxes across geographies and groups. It would, for example, be easy to imagine a political compromise where urban taxes on carbon were notably higher than rural taxes.
More broadly, it is clear from the preceding argument that taking the long view on climate and successfully investing in “green assets” will require a nuanced view of the full equilibrium and path by which we get there and not just checking a “green” box. On most issues, it is possible to spend too much to do good to get a reasonable return. Efficiency matters, both for policy and social impact. Even a cursory examination of some current “green” projects suggests some projects may already be good examples of money that could have been better spent elsewhere both in terms of return on investment and in terms of carbon reduction. Wasting money in the pursuit of carbon reduction is neither good climate policy nor good investing, but it is not easy to construct ex ante metrics which account for such problems of excess, particularly when the key technologies are still in development.

Governance

The issues around using governance metrics for investing mirror much of what we have already discussed. But a few things are worth noting as being especially relevant with respect to governance metrics. The first is reporting bias. Governance is perhaps the area most subject to a gap between disclosure and reality. Thus, this is a place where the portfolio manager’s ability to assess reality is critical relative to just using the metrics. The second is that from an investment standpoint, current governance may not matter as much as forward governance, which poses a special problem for metrics.
If you believe that good governance creates better returns than bad governance, then changes in governance may be even more important than current quality. Take a very simple example of two regimes: one where managements are secure, but carefully monitored, and another with poor governance, but less security of position. The likelihood of poor practice is lower in the first regime due to monitoring, but the potential for improvement is higher in the second. Thus, in the long run, the secondary arrangements around governance (changes in control) could in some cases matter more than the current governance structure. While issues around changes in control are often subtler and less amenable to metrics, that does not make them less important.
This also relates to the general problem that investment returns are often most related to changes rather than levels. Thus, in a particular period a poorly run company that corrects its problems often generates far higher returns than one that is consistently well run. Thus, bad ESG companies may be good ESG investments, if the portfolio manager sees significant potential for ESG improvement. Scoring is inherently backward looking and while appropriate for many questions, it can miss important dynamics. Portfolio management needs to be forward-looking to succeed in the long term.

Good metrics, bad metrics — good reviews, bad reviews

We have spent much of this paper discussing the potential failures of metrics, both to capture the subtlety of ESG investing and of pushing investors to be too short term. This should in no way suggest that we are against measurement or review. It is more that the metrics and the reviews need to be tailored to the desired outcomes rather than allowed to drive the process.
As we have discussed, there are strong biases in quantitative methods toward methods which focus on short term repeated phenomena that allow repeated applications. This bias is not a mistake; it results from the simple truth that it is easier to develop reliable methods of assessment using such methods. An asset manager who makes hundreds of investments a year on a disciplined basis is easier to assess than one who makes ten. The statistics on this are very clear. To put this conclusion slightly differently, if an investment committee wishes to have statistical confidence that a portfolio manager is good, a short term manager can provide that proof in a smaller number of years than a portfolio manager who focuses on longer term issues. If the investment committee wants current evidence that performance has not faltered, a short term manager can provide such evidence, while a longer term manager cannot. Thus, it is quite natural and appropriate for investment committees to also have a bias toward managers that can provide more compelling evidence.
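A rough sense of how many years of track record each kind of manager needs can be had from a simple hit-rate model. This is a sketch under stated assumptions, not a method from the paper: bets are taken to be independent and equally sized, skill is an assumed 55% hit rate against a 50% no-skill baseline, and "confidence" is taken to mean two standard errors. The function name `bets_needed` is hypothetical.

```python
import math


def bets_needed(edge, z=2.0):
    """Bets required for a hit-rate edge over 50% to reach z standard errors.

    Assumes independent, equally sized bets, so the standard error of the
    observed hit rate in the no-skill case is 0.5 / sqrt(n).
    """
    return math.ceil((z * 0.5 / edge) ** 2)


EDGE = 0.05  # assumed skill: a 55% hit rate versus a 50% coin flip
n = bets_needed(EDGE)

for label, bets_per_year in [("200 decisions a year", 200), ("10 decisions a year", 10)]:
    print(f"{label}: about {n / bets_per_year:.0f} years of track record")
```

Under these assumptions the same skill takes about 2 years to show up for the manager making 200 decisions a year and about 40 years for the one making ten.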
But this same logic suggests opportunity in finding ways to assess and allocate to longer term managers. Those methods of assessment need to embrace the long term nature of these strategies, rather than seek to somehow alter or ignore that reality. This means that the review needs to have a firmer philosophic base that a particular method of managing investments makes sense and the reviews must be more about adherence to that philosophy and demonstrations of the skills necessary to do so.
It is the skills aspect that can lend itself to more rigor. Can the portfolio manager distinguish between diversity and its appearance? Do they have a good track record in anticipating Flash-points at the company or industry level? Can they assess governance? While we have discussed that metrics can be misleading on a forward basis, they can still be used ex post. In a review, it is appropriate to ask for companies that were identified relative to a specific longer term issue and then assess how often the manager was right about the company and the relative importance of the issue.
The data bias that drives portfolio managers to short term investing lies in the poor link between the timing of return and the timing of the investment decisions, as well as the information used to make those decisions. To avoid that bias, it is critical that the review of returns reflect the investment horizon, but it is perfectly appropriate to look at the process of identifying companies and issues on a more data driven and short term basis with rigorous statistical assessments. Such assessments will naturally have a higher qualitative aspect than the equivalent performance reviews, but that does not have to imply less rigor. It also requires a careful delineation between issues that have more or less resolved and those which have not. The manager should be fully accountable for those issues that have been resolved, but reviewers should not use their own judgment about the future to ex ante review a manager who has a different view.

A deeper dive into the short term statistical bias in quantitative asset management methods

It would be natural to ask if the short term bias is not just a methods problem that can be solved. Unfortunately, it isn’t that simple. Repeated constant events produce more data. More data creates more precise estimates. Even when there is a deep understanding of this bias, and the techniques are optimized for evaluating longer term processes, it is still impossible to create more data. Precision roughly increases with the square root of the number of observations (in simple cases, it rises exactly with the square root). Thus, in a 5-year period, for otherwise equivalent strategies, the precision of an estimate for a daily strategy will be roughly 4.7 times that of a monthly strategy and roughly 16 times that of an annual strategy (assuming 260 business days in a year).
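The square-root arithmetic can be checked directly. The sketch below assumes the simple case in which precision scales exactly with the square root of the number of observations, and uses the 260 business days per year stated in the text; the function name `relative_precision` is hypothetical.

```python
import math

BUSINESS_DAYS_PER_YEAR = 260  # the assumption stated in the text
YEARS = 5

# Number of return observations each strategy generates over the period.
observations = {
    "daily": BUSINESS_DAYS_PER_YEAR * YEARS,
    "monthly": 12 * YEARS,
    "annual": 1 * YEARS,
}


def relative_precision(n_fast, n_slow):
    """Ratio of precisions when precision scales with sqrt(observations)."""
    return math.sqrt(n_fast / n_slow)


daily_vs_monthly = relative_precision(observations["daily"], observations["monthly"])
daily_vs_annual = relative_precision(observations["daily"], observations["annual"])

print(f"daily vs monthly: {daily_vs_monthly:.2f}x")
print(f"daily vs annual:  {daily_vs_annual:.2f}x")
```

Because the 5-year horizon cancels out of each ratio, the result depends only on observations per year: about 4.65 for daily versus monthly and about 16.12 for daily versus annual.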
And this is in a perfectly identified example with no data contamination or identification issues of any type. In practice, the longer the horizon, the more unstable the timing and the more likely it is that other factors (correlated with the strategy) will create noticeable volatility, further eroding the precision of the estimates. Thus, any attempt to optimize the predictability of returns will generate a heavy bias toward the data heavy strategies.
Technically, you could equalize the playing field by throwing away data. This is in some sense what is being done when you examine only cumulative returns. But even then, the reality still favors the short term strategy as the greater number of trades and data allows for greater optimization of process and the greater number of embedded trades allows the short term manager to take advantage of larger portfolio effects. Thus, shorter term strategies can typically be structured to provide smoother returns even over a longer horizon. This is why the bias is not a mistake, but a simple reflection of the underlying statistical reality. But as in all investment questions, the next question is risk relative to return. The smoother returns profile and easier assessments of skill should attract more money and thus lower returns in shorter term strategies and increase the returns for longer term strategies. The key is understanding the split, and designing asset allocation and review processes that exploit those longer run opportunities without turning them into another short term strategy with a different marketing envelope.

Integrating ESG into other processes

Previously, we argued that simply adding ESG into non-ESG mandates is counterproductive. While we believe this statement is broadly true, it is not true in every case. Some investment strategies, such as private equity, have a natural link to ESG that allows easy combination as the correct performance review structure is already in place. Also, the ability of private equity investors to directly influence portfolio company behavior increases the ability to take a firm from poor practice to good, further raising the potential returns from an ESG perspective.
In more standard equity mandates, the ability to combine would hinge crucially on the true time horizon of investing. A firm that has managed a truly long term investing style can and may have already incorporated many ESG practices into its process. In such a case, there is little reason to make it more formal, but also not a high cost. One caution is that most portfolio managers’ self-perceived investment horizon is far longer than the actual alpha profiles would suggest. Holding periods which exceed the period of average position outperformance are quite common as a way of improving turnover statistics, but do not change the actual investment horizon. Consequently, it probably makes more sense to examine philosophic compatibility with the current investment process than with the turnover statistics.
[1] Marquis, Jefferson P., Nelson Lim, Lynn M. Scott, Margaret C. Harrell, and Jennifer Kavanaugh, Managing Diversity in Corporate America: An Exploratory Analysis, Santa Monica, Calif.: RAND Corporation, OP-206-RC, 2007. As of April 18, 2008: http://www.rand.org/pubs/occasional_papers/OP206/

For important disclosures, see the Disclosure Appendix.