Thank you, Dr. Serageldin, for your opening remarks; and my thanks to you and your colleagues for organizing what promises to be a stimulating conference. I would like to start by emphasizing that the word, indicators, can mean more than a number or ratio. It can also mean a signal to attract attention or a device for showing the operating condition of some system. Today’s best practices focus on numbers or ratios, so I use the term sustainability metrics to make clear that our goal should be a device that would use today’s indicators, and something more, to show the operating condition of the system meeting human needs.
I think sustainability metrics must be broader than today’s SDI to recognize all information that is relevant or material to a decision, with relevance determined by some verbal model of the decision that may or may not be quantified. We trust verbal models for many decisions in life in large part because that avoids the costs of quantitative proof. So while a model may be more compelling if backed by quantification, the benefit isn’t always worth the cost. One way sustainability metrics can improve on SDI is by weighing costs and benefits of quantification to a specific decision. Done properly, this means one can create an objective process even for what is today called case-by-case decision making. The other improvement I expect from sustainability metrics is to filter out what I call data noise, or quantitative data that may prove some verbal model—just not one stakeholders consider material to the decision under discussion. I mention this at the start because I think we will come back to these fundamentals over the next two days. Logically, all tabulated data depend on some verbal model but quantification is not always necessary and is rarely sufficient for decision making. Failure to consider verbal models unless quantified, and allowing data noise into decision making workspaces, are technical weaknesses in our decision support systems.
I am proud of the decades I spent designing and managing some of the most rigorously quantitative decision support systems: monetary indicators for IMF credit ceilings and then per capita income operational guidelines for access to World Bank funds. However, I see such systems today as frozen accidents, shaped by the convergence between the interests of centralizing authorities and early computerization. Not only has technology become diffuse; at least in principle there has been a global mood swing toward decentralized, market-oriented decision making. I see future systems providing just-in-time information to flickering clusters of decision makers. Unfortunately, today we seem to be using the last generation’s systems to imagine the next generation’s priorities. So I hope over the next two days we can look beyond conceptual issues and consider technical aspects of decision support systems.
Still, today’s best practices in monitoring development are vastly superior to those used in 1983, when the Brundtland Commission popularized the term sustainable development. Most of the improvement has been in quantitative SDI, but I think it is important to keep in mind that not everything that matters can be quantified, and that people differ in their willingness to trust quantitative approaches to problem solving. I think the next generation will rely on sustainability metrics that go beyond today’s best practices not only conceptually but also, and perhaps more importantly, in technical terms. On the conceptual side I would like to discuss improvements that have been made, and those still necessary, in terms of what things are recognized and how much weight is given to each. On the technical side I will suggest we focus on both our capacity to compile information, including capacity to collect; and to communicate that information, including capacity to build trust in what is communicated. Since it is easy to get lost in the details of today’s best practices, however, it might help if I give you a quick sense of where I think we are heading, and how the next generation will differ from today’s sustainable development indicators.
Most experts pledge allegiance to the definition, or verbal model, of sustainable development popularized by the Brundtland Commission in 1983. Most of the work since then has been elaborating a framework that connects this verbal model to decision-making processes and shows how our actions as individuals, businesses, and governments promote or slow transition to a more sustainable future.
There are two obvious connectors in Brundtland’s verbal model. First, it expects sustainability metrics to consider inter-generational equity, or how actions of the present generation affect future generations. Second, it requires an assessment of “needs,” which may or may not consider intra-generational equity, or trade-offs between different groups in each generation. I will return to the issue of intra-generational equity near the end of these remarks but for now let us simplify as current best practices do by assuming the issue is beyond the scope of sustainable development indicators.
As a practical matter, when I think about assessing human needs I start with a framework familiar to social scientists and useful for statistical purposes, called Maslow’s hierarchy. The base of the hierarchy comprises physiological needs, like food and water, or what has long been called basic needs in development economics. As we become better off or wealthier, we strive to meet higher needs, including those not considered in traditional development economics, like morality and creativity. Much of Maslow’s work as a psychologist was about how our actions depend not only on where we are in this hierarchy but also on our expectations about where we will be in the future. In this sense it seems a good framework for considering equity, within or between generations. This leaves the non-trivial task of explaining processes or systems that satisfy these needs, and how our actions affect those systems.
Almost all sustainability experts agree there are three, interacting, systems that satisfy needs suggested by Maslow’s pyramid: Economy, Environment, & Society. Ultimately, only the environment can satisfy our physiological needs, at the base of Maslow’s hierarchy. In practice most of us now depend on complex interactions among all three systems even for basics like food and water. Early work on sustainable development indicators, or SDI, focused on interactions among the systems, in large part because decision makers seemed to give too much weight to the economy, as if it did not depend on environmental and social systems.
Based on over 30 years working on decision support systems at the IMF and World Bank, I would say the bias in decision-making is real but arises at least in part from the basic difference in how information about each system is organized. Economic information has long had a coherent, quantitative framework called the UN System of National Accounts, or SNA. Parts of environmental and social systems have a long quantitative history but there was no equivalent systemic framework so anyone concerned about interactions among systems had to be prepared to delve into qualitative, or word-oriented, evidence. Not only does it take more time to read a report than scan a chart or table of numbers, but it is also harder to compare reports than charts. So other things being equal, decision makers will favor quantitative, number-oriented evidence, over qualitative, word-oriented evidence. As the World Bank’s first Vice President for Environmentally Sustainable Development, Dr. Serageldin began what I would call a two-pronged solution. He first began organizing the qualitative evidence around semi-structured frameworks called assessments, and he then began efforts to improve quantitative support for them. I joined his shop as part of that effort and pioneered what is now called the wealth or capital approach to sustainable development indicators, which John Dixon will describe more fully in the next session. Environmental indicators have taken two paths since then. The first produced agreement among central authorities on a “satellite” for SNA called the UN System of Environmental and Economic Accounts or SEEA and, at least among statistically advanced OECD countries, a unifying verbal model called DPSIR. The second path has been highly decentralized experiments in what are usually called composite indicators, with the Ecological Footprint and the Environmental Performance Index among the better known and more enduring examples. I will return to them shortly. 
Social systems continue to be considered mainly in word-oriented documents, or expert assessments. The point here is that today’s decision-makers are left to piece things together from evidence that differs not only by topic but also by form of presentation and accessibility.
DPSIR is named for the first letters of its five interacting phases: Driving forces, Pressures, State, Impacts, and Responses. For example, population growth is a Driving force affecting many aspects of sustainability; it puts Pressure on agriculture to farm marginal lands, which may explain a deteriorating State of both land and water, which Impacts the next generation’s ability to satisfy its basic needs. There are two limitations on this approach. First, physical sciences may tell us about cause-effect relationships among the first four DPSIR terms but Responses involve social sciences and a problem modelers call self-referencing. Basically, humans both receive and provide the services to be monitored, and we can consciously change each system and how they interact. This means nothing in the physical sciences would help us decide how to complete the loop in this diagram, with the uncertainties of social sciences deciding whether Impacts lead to Responses. This leads to the second limitation of DPSIR: it considers the environment in isolation, much as SNA considers the economy in isolation. More specifically, what it sees as a Driving force may be a Response in the social or economic system. If we are to recognize the super system we need to look beyond the implicit value judgments experts in each system use to describe interactions and focus on getting agreement about which interactions matter in which contexts, or what the arrows in this framework are supposed to link. Where either end of an arrow entails humans we need to accept that our Responses depend on how aware we are of the specific interaction, and our sense of capacity and interest in affecting it. Any model that admits this can’t be closed, or computed. This is a logical problem and no amount of data work will solve the problem of self-referencing, noncomputable models.
So we need a different kind of model; one that uses conventional quantitative modeling up to a point and then enables “human connectors” to complete the loop, and explain how they did so to other humans. There are some highly mathematical ways to represent human connectors in quantitative models, which go by the intimidating name of Impredicativity Analysis Loops. Since I am not a good mathematical economist, I favor a more direct solution. I think models should be as complex as necessary for a problem at hand.
Up to the point where human connectors are necessary, today’s best practices actually do a decent job of detailing, and quantifying, interactions of the type suggested by the DPSIR model across all three systems that satisfy human needs. This slide, for example, gives a fairly generic view of how elaborate quantitative models have gotten. The next slide gives an idea of how quickly that complicates things, taking the supply side of water interactions in such a model as an example. I don’t expect anyone to be able to read, let alone think about, all these connections; I just want to make the point that we can detail these interactions well beyond the capacity of most of us to comprehend.
In effect, T21 asks model users to first agree which human responses are material, meaning both important and at least possibly subject to change, and then use the model to consider unintended consequences of changes. This, for example, is the set of human responses stakeholders can consider with respect to land in the model for Malawi. Each user can act as a human connector and see consequences of different responses on the mechanical part of the model. In principle, such human connectors close quantitative models and, if you get the right stakeholders involved, produce some consensus about the aspects of human behavior that most need to change and the policies to facilitate such change. In practice the models are so complex the human connectors have to spend a great deal of time delving into the machinery before they understand which interaction was material to each policy intervention, and few decision makers can spend the time.
However, the real limit of quantification is that it is not always necessary and is rarely sufficient for making a decision. I say this based on over 30 years designing and then managing some of the most number-oriented decision support systems in international development; notably the World Bank’s financial preference guidelines and before that the international financial statistics designed to rationalize IMF credit ceilings. To paraphrase Porter, the Bretton Woods institutions developed quantification because the elites or experts were weak, private negotiation was suspect, and trust was in short supply.
So after decades as a number cruncher I conclude quantification is important in decision support systems but only if it respects the cultural contexts in which it is used. I use the term, cultural context, much as Porter does. But where he focuses on two technologies of trust, expert opinion and quantification, I would add a third, which I call Wisdom of Crowds, based on the book by James Surowiecki that suggests groups may make better decisions than could be made by any single member of the group, including the expert alone. Professions differ in their inclination toward one or the other of these technologies and it is this professional preference that I mean when I speak of cultural context.
I realize this may sound very abstract so I would like to bring it down to a real issue like a factory owner trying to decide whether his business is a serious polluter. One very popular solution is to hire an expert to evaluate the factory relative to environmental management standards, notably those of the International Organization for Standardization. The expert, or auditor, provides a written report that the factory owner is under no obligation to publish; the most others usually see is a certificate issued by the auditor. Those who trust the auditor will trust the certificate; others won’t. Quantification requires comparing the factory to some external standard that is applied to others, as well. It is practically impossible to conceal the standards so the issue of trust becomes one of whether the standard was in fact applied. The issue is then trust in government and only indirectly about quantification and pollution, but it is nonetheless crucial to use of numbers. The third technology of trust is some form of stakeholder dialogue, which may include experts and those in governance but may also include workers, neighbors, bankers, and so on. It may also use quantitative evidence, or not, depending on what works best for a particular group of stakeholders. At the moment, the factory owner is told all three are best practices. I want to emphasize that I am not for or against any one of these since I think they all have a role to play and the best decisions use them all, in varying proportions. The trick is to set guidelines for switching among technologies of trust that the factory owner, among others, can understand and use to make decisions.
In order to monitor progress in accordance with the Brundtland definition of sustainability, we need some accounting framework. The one we pioneered at the World Bank is called the wealth or capital approach because it defines capital broadly, to include all services satisfying human needs, in all three systems. This has two important operational advantages. First, replacing the amorphous term, needs, with “all forms of capital” led SNA experts to look beyond the range of capital assets traditionally recognized in that system. Second, it recognized that the importance of the past, and historical data, was only to support a forward looking analysis, comparing what we inherited from the last generation with what we leave the next. This verbal model fits nicely with what economists call the Hicksian definition of income, as the amount one could consume in a time period and end it as well off as one was at the beginning of the period. Measurement challenges were and remain great but at least we seemed to have a verbal model that might be connected to quantitative data and hence to returns on investments across a very broadly defined range of assets.
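To make the Hicksian definition concrete, here is a minimal sketch. The function name and the figures are hypothetical, and it assumes all forms of wealth have already been valued in a single unit, which is exactly the hard part.

```python
def hicksian_income(opening_wealth: float, closing_wealth: float,
                    consumption: float) -> float:
    """Largest consumption that would have left wealth unchanged over the period.

    Income = actual consumption + change in wealth during the period.
    """
    return consumption + (closing_wealth - opening_wealth)


# Hypothetical figures: wealth falls from 1000 to 980 while 100 is consumed.
income = hicksian_income(opening_wealth=1000.0, closing_wealth=980.0,
                         consumption=100.0)

# Consuming more than income means the period ended with less wealth.
sustainable = 100.0 <= income
print(income, sustainable)  # 80.0 False
```

In this reading, the generation in the example consumed 100 but earned only 80, so it leaves less than it inherited.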
This conceptual approach has now been endorsed by statistically advanced OECD countries for SDI; it is now used by the World Bank as a Millennium Capital Assessment and also appears in the multi-agency Millennium Ecosystem Assessment. As shown in this diagram, based on one in the report of OECD experts, it considers flows or transactions as well as stocks or assets. What I mean by recognition is compiling evidence for columns labeled Q, for Quantities or physical indicators in general. Valuation has to do with compiling consistent evidence for columns labeled V, for Value or indicators of weight or importance in general. The green areas represent where OECD experts reached some general agreement about sources and methods; yellow areas indicate where they could not. The coloring basically says we aren’t sure how to value water and ecosystems or how to measure most other forms of wealth in physical terms, and we aren’t really sure how to measure either aspect of social capital or infrastructures. Conceptually the missing link everywhere is a price, P, that meaningfully expresses the value of a single unit of each asset in each category. And the problem is not that we have no candidates for P but rather that there are so many.
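A small sketch may help show why having many candidates for P is a problem. It assumes the simple relation V = P × Q for each asset; the asset names, quantities, and both price sets are invented for illustration only.

```python
# Hypothetical quantities (Q) for two assets.
quantities = {"farmland_ha": 500.0, "timber_m3": 2000.0}

# Two hypothetical candidate price sets (P) for the same assets.
candidate_prices = {
    "market": {"farmland_ha": 1000.0, "timber_m3": 50.0},
    "social": {"farmland_ha": 1500.0, "timber_m3": 40.0},
}


def total_value(q: dict, p: dict) -> float:
    """Sum V = P * Q over every asset in q."""
    return sum(p[asset] * q[asset] for asset in q)


values = {name: total_value(quantities, prices)
          for name, prices in candidate_prices.items()}
print(values)  # {'market': 600000.0, 'social': 830000.0}
```

The same physical stock is worth 600,000 under one set of prices and 830,000 under the other, so agreement on Q alone does not settle V.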
The traditional solution is to “drill down” the cells of a table like the one in the last slide to details where we find meaningful Ps. This slide suggests the main branches for a concept hierarchy doing just that. Ever finer grain distinctions can be made under each of these terms until we have thousands of “buckets” that can hold information needed to compile indicators of the Stocks and Flows, and Quantities and Values. To reach meaningful Ps we may need separate buckets for different places and times, and to recognize that the services of an asset may depend on how it is used with others, like tires on or off a car. There are practical solutions but in my experience this reductionist approach is like the Sorcerer’s Apprentice: once we use it to carry buckets of details to the well, we will find it hard to stop. The advantage of this framework is that it is fairly simple to recognize the objects, or assets, in each of the three systems that satisfy human needs. Such assets can not only be recognized but arranged in a concept hierarchy and quantified, or not. Any interaction that matters must have some effect on some asset in one of these systems. This is important in two ways. First, it provides what financial analysts call a materiality filter on data: any data that can’t be shown to affect some specified asset is not material to sustainable development. Second, it suggests priorities for compiling and communicating information should be set by effects on assets, not ease or precision of compilation. In both respects we need to recall the advice of the great economist, Keynes, who said “It is better to be imprecisely right than precisely wrong.”
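The materiality filter can be sketched in a few lines. This is only an illustration under stated assumptions: the `Series` class, the list of recognized assets, and the example data series are all hypothetical, and in practice deciding which assets a series affects is a judgment for stakeholders, not a lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    """A data series and the assets it has been shown to affect (hypothetical)."""
    name: str
    affected_assets: frozenset


# Hypothetical set of assets recognized in the concept hierarchy.
RECOGNIZED_ASSETS = frozenset({"farmland", "water", "education"})


def materiality_filter(series_list, assets=RECOGNIZED_ASSETS):
    """Split series into material data and data noise.

    A series is material if it affects at least one recognized asset.
    """
    material = [s for s in series_list if s.affected_assets & assets]
    noise = [s for s in series_list if not (s.affected_assets & assets)]
    return material, noise


data = [
    Series("soil erosion rate", frozenset({"farmland"})),
    Series("school enrollment", frozenset({"education"})),
    Series("reports published", frozenset()),
]
material, noise = materiality_filter(data)
print([s.name for s in material])  # ['soil erosion rate', 'school enrollment']
print([s.name for s in noise])     # ['reports published']
```

Note that the filter says nothing about how precisely a series is measured; it only asks whether the series is tied to an asset that matters.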
I see most cutting edge research on each system (economic, environmental, and social) as designed to broaden rather than deepen the wealth approach, and recognize how assets work together to satisfy human needs. For example, I think there are ways that cutting edge economics could be applied to environmental and social systems, ways that don’t yet seem to be recognized by experts on those systems. On the other hand, economics has only recently become willing to grapple with some challenges that have long troubled environmentalists. All this good work on harmonizing economic and environmental indicators may be overshadowed by the shifting focus of decision makers, to what are broadly called intangible assets. Indicators of multi-factor productivity, innovation, intellectual property, or research & development are today’s infant industry in sustainability metrics. They are really indicators of adaptation, which is obviously as important in environmental as economic and social systems but environmentalists have been slow to explore the potential. As the saying goes, the only constant is change. I think this is recognized intellectually by experts in all three systems but progress has been painfully slow on defining a common approach.
Most of us understand the money price of an asset, or P, needn’t reflect its total value in satisfying human needs, or TEV. Markets set money prices, or P, but societies define TEV, and social infrastructure decides whether, and if so how, P and TEV will converge. Economic theory expects Ps for a given asset to converge over time and place so economists use a complex averaging of Ps called Purchasing Power Parities to estimate where Ps would settle if markets had time to equilibrate present imperfections. In different ways all three valuation schemes represent today’s values. PPPs and TEV are useful indicators of market and social forces, respectively, likely to cause Ps to move in future. Relating either to Ps implies an expectation that humans will adapt by changing production possibilities, shifting consumer preferences, or both. There is no a priori way to know how this will change relative Ps of assets in a portfolio and hence whether the stock of capital we leave for future generations will be worth more or less than what we inherited from our predecessors.
For those who aren’t familiar with it, this diagram gives the basic idea of Total Economic Value or TEV. Money prices or Ps usually consider only the left branch so quantification of nonuse value tends to be expressions of expert opinions or the wisdom of crowds (notably willingness to pay surveys).
I mentioned earlier that environmental indicators in particular took two paths. I then described the central path, where the wealth approach is trying to do for longer term or sustainability studies what GDP does for short term economic analyses: create a “headline” indicator within a framework where prices, quantities, and values must be consistent. The second path developed mainly to correct a perceived flaw in such frameworks; namely a trust in market prices as meaningful signals. The second path prefers to trust experts both to recognize assets or transactions and to value them. A number of these composite indexes have not only prospered for a number of years but also come to be recognized as innovators in environmental indicators. On the downside, there are now so many such indexes that even experts have a hard time comparing their signals.
This slide gives a case study of this problem, using a presentation I made last year in a conference at Environment Abu Dhabi or EAD. Unlike most organizations, which get involved in one or two composite index exercises, EAD has delved deep into all the major ones. I collated the several hundred details from all composite indexes to define the size and distribution of the data warehouse they need to monitor them all. I then sorted these according to 8 EAD themes, like air quality, and then related EAD’s themes to the four main categories of the wealth approach. This chart shows how each composite index assigns different weights to EAD themes, with the last row showing the collective weight if all details are given equal weight. Some details are used in more than one composite but each composite gives a zero value or weight to many details important to others. The challenge for a decision support system is to decide what weight or value each detail should have in monitoring EAD’s performance. And that requires fitting the details to a verbal model like DPSIR, a quantitative model like T21, or both. In this sense composite indicators may help identify issues but they need to be considered in a broader, more inclusive framework if they are to help decision makers.
As Erik Thomsen will explain more fully tomorrow, the technology exists for such a system but it requires recognition that governments and international organizations in particular are relying on information systems that were not designed to support decentralized, market-oriented decisions.
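The comparison in that chart can be illustrated with a toy calculation. The index names, themes, and counts below are invented, not EAD’s data, and the sketch assumes each detail counts equally within an index and that a detail used by two indexes is counted twice in the pooled row.

```python
from collections import Counter

# Hypothetical composite indexes: each lists the theme of every detail it uses.
indexes = {
    "index_a": ["air", "air", "water", "biodiversity"],
    "index_b": ["water", "energy"],
}


def theme_weights(details):
    """Share of details falling under each theme, with every detail weighted equally."""
    counts = Counter(details)
    total = sum(counts.values())
    return {theme: n / total for theme, n in counts.items()}


# Implicit theme weights of each index taken separately.
per_index = {name: theme_weights(details) for name, details in indexes.items()}

# Collective weights when the details of all indexes are pooled.
pooled = theme_weights([t for details in indexes.values() for t in details])

print(per_index["index_a"])  # {'air': 0.5, 'water': 0.25, 'biodiversity': 0.25}
print(per_index["index_b"])  # {'water': 0.5, 'energy': 0.5}
```

Here index_b gives air quality an implicit weight of zero while index_a gives it half of its total, which is the kind of disagreement a decision support system has to resolve.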
I hope to have a chance tomorrow to elaborate but I want to end by saying I think there are concrete steps we can begin taking almost immediately to move toward what I call sustainability metrics. First, we can lobby for official recognition of sustainable development by the UN Statistical Commission. It is unrealistic to expect sustainability metrics to have much effect on official decisions, or receive sustainable funding from official sources, without such recognition. Second, we not only need to upgrade information systems to use today’s best technologies but use the complexity of sustainability metrics to push the frontiers, much as I saw happen in the IMF and World Bank in the 1960s, 1970s, and 1980s. Third, we need to make sure sustainability metrics continues to benefit from the interplay between top-down and bottom-up approaches to problem solving. This will require protocols to build trust among “stakeholders” and to do so systematically.
If we do these things, I believe we can in fact come up with decision support systems that help us think globally yet act locally.
Opening remarks by John O’Connor, Bibliotheca Alexandrina, December 7, 2009
Slide: materiality versus data availability

                      Data availability: High    Data availability: Low
Materiality: High     Modeled data               Verbal models
Materiality: Low      Data noise                 -
Slide: concept hierarchy of assets

Human Resources (HR): Education; Health; Time use
Social Infrastructure (SI): Frameworks (Laws, etc.); Networks (Clubs, Internet, etc.)
Tangible Produced Assets: Machines; Structures; Infrastructure; Software [?]
Financial Assets: Bank accounts; Stocks & bonds
Marketable resources: Subsoil assets (oil, etc.); Farmland, pastures; Timber in forests; Fish & marine products
Water: Blue water; Green or virtual water; Gray & black (waste) water; Transevapoprecipitation
Ecosystems: Habitats; Species
Other labels on the slide: Tangible; Intangible; (PA); (NC); (HR, SI)