Quantifying Governance:
An indicator-based approach
DFID Capstone Team
London School of Economics
March 2015
Kimathi Muriithi, Margarita Jimenez, Nicolas Jannin,
Noor Sajid, Sahibjeet Singh, Sudhanshu Sharma
Foreword
The following report has been written as a part of a Capstone Project commissioned by the
Department for International Development (DFID) to a group of students of the Master of
Public Administration (MPA) at the London School of Economics and Political Science (LSE).
The terms of reference (TORs) guiding the project are provided in the Appendix. All remaining
errors are our own.
The views and findings expressed in this report are the authors' own and do not necessarily
reflect the views of DFID, LSE or their staff.
Acknowledgements
This Capstone project, like many other endeavours, is the culmination of the efforts of many
individuals. We take this opportunity to express our gratitude to all those who have helped us
in various capacities.
Firstly, we would like to thank our client, the Department for International Development (DFID),
and Dr. Alexander Hamilton, Statistics Advisor, DFID, for giving us the opportunity to work
on an exciting project. Alexander was instrumental in shaping the trajectory of our research,
and the current form of the report owes a great deal to his feedback and comments. We are
also thankful to Mr. Conor Doyle, Economic Advisor, DFID, for his useful insights, feedback and
comments.
Secondly, we are grateful to Dr. Konstantinos Matakos, our Capstone Advisor at LSE, for his
guidance, constructive criticism and encouragement throughout the course of this project. We
are indebted to Dr. Jouni Kuha, Department of Methodology, LSE, for his advice on the
statistical methodology used in this report. We would also like to express our gratitude to Prof.
Patrick Dunleavy and the participants of MPA Capstone Workshops for their useful comments,
queries and insights.
Lastly, we would like to thank the MPA Office at the London School of Economics for logistical
support and assistance.
DFID Capstone Team,
London School of Economics & Political Science
March, 2015
Executive Summary
1. The forthcoming Sustainable Development Goals have put governance back on the agenda.
Within this context, DFID is interested in understanding and developing better measures
to quantify and assess governance. This report contributes to that aim by providing a
clear assessment of the validity and reliability of indicators in two particular dimensions
of governance: Public Financial Management (PFM) and Corruption.
2. The literature review highlighted the following:
• Governance is a multidimensional phenomenon and there is no convergence regarding
its conceptual understanding.
• The most widely used governance measurement approach has been composite indices.
Therefore, this report reviews indicator-based approaches to quantifying governance. It
discusses how to construct composite/aggregate indicators and assess their quality.
3. Analytical and practical considerations informed the selection of the two dimensions. PFM
and Corruption allowed us to conduct a holistic analysis by allowing for the comparison of
two types of indicators: objective and subjective. Additionally, both are salient governance
dimensions and have strong policy relevance. Practical considerations highlighted the
importance of using good data to carry out the assessment. Thus, for PFM, the relevant
datasets were chosen because of their applicability within the development context. For
Corruption, the suitable datasets were chosen because they allowed us to explore different
levels of indicator aggregation.
4. Having identified the relevant datasets, we employed multivariate analysis to assess the
validity and reliability of our relevant indicators. Exploratory and Confirmatory Factor
Analysis allowed us to investigate whether indicators designed to measure particular con-
cepts were indeed consistent with the assumed structure.
5. Results:
• PFM: The relevant datasets for this section were PEFA & OBS. The OBS analysis
points towards the reliability and validity of our constructed aggregate indicators,
thereby suggesting the overall good quality of OBS data. On the other hand, PEFA
analysis provides less convincing results: the indicators seem to be measuring close
constructs with weak convergent and discriminant validity. However, the PEFA re-
sults should be interpreted with caution in view of the potential limitations concerning
the data methodology employed.
• Corruption: The analysis is split into two sections. The first section looks at
aggregate measures of corruption drawn from three sources (WB, ICRG and GIB)
to investigate whether they measure the same underlying concept. The results
show that they do. Although we cannot firmly conclude from this that the
measured concept is a true measure of corruption, it is evidence that these indicators
are valid. The second section looks at whether the GCB survey indicators measure
the underlying concepts proposed by the survey. Our analysis shows that this is
not the case, and we therefore conclude that there is potential for future research
to construct better measurements.
6. Our analysis allows us to suggest a meaningful way to further explore the appropriateness
of governance indicators. For PFM measurement, we suggest the use of the OBS: it is valid,
reliable and particularly suited to monitoring underdeveloped PFM systems. For
corruption, we point out that using the GCB at the early stages of project formulation
would be beneficial. Additionally, the aggregate measures of corruption can be used to provide
contextual background when evaluating project impact. Finally, within each indicator set,
we further recommend dropping problematic or redundant indicators, merging indicators
that line up with a congruent pattern of dimensionality, and classifying indicators that
appear to be tautological.
Contents
1 Introduction 14
2 Governance: definition and dimensions 16
2.1 Understanding governance and its dimensions . . . . . . . . . . . . . . . . . . . . 16
2.2 Dimensions of focus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.1 Public Financial Management Policy Background . . . . . . . . . . . . . . 17
2.2.2 Corruption Policy Background . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Measuring Governance: composite indicators 21
3.1 Composite governance indicators: an introduction . . . . . . . . . . . . . . . . . 22
3.1.1 Advantages of composite (aggregate) indicators . . . . . . . . . . . . . . . 23
3.1.2 Weaknesses of composite (aggregate) indicators . . . . . . . . . . . . . . . 24
3.2 Designing indicators: methodological considerations . . . . . . . . . . . . . . . . 25
3.2.1 Selection of indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2.2 Factor retention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2.3 Selection of aggregation function . . . . . . . . . . . . . . . . . . . . . . . 26
3.2.4 Selection of weights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2.5 Uncertainty and Sensitivity analysis . . . . . . . . . . . . . . . . . . . . . 28
4 Assessing governance indicators 30
4.1 Validity and reliability: definition . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.2 Introducing Exploratory and Confirmatory Factor Analysis . . . . . . . . . . . . 33
4.2.1 Exploratory Factor Analysis model . . . . . . . . . . . . . . . . . . . . . . 33
4.2.2 Confirmatory Factor Analysis model . . . . . . . . . . . . . . . . . . . . . 35
4.3 Assessing validity and reliability of indicators with factor analysis: methodology 36
4.3.1 Quantifying validity and reliability . . . . . . . . . . . . . . . . . . . . . . 36
4.3.2 Exploratory Factor Analysis: getting a first idea of the indicators’ validity
and reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.3.3 Confirmatory Factor Analysis: confirming the factors structure . . . . . . 38
5 Data and analysis 40
5.1 Public financial management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.1.1 Open Budget Survey (OBS) . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.1.2 Public Expenditure and Financial Accountability (PEFA) . . . . . . . . . 53
5.2 Corruption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.2.1 Aggregate Measures of Corruption . . . . . . . . . . . . . . . . . . . . . . 65
5.2.2 Public Opinions: Global Corruption Barometer . . . . . . . . . . . . . . . 68
5.3 External consistency of the results . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6 Recommendations and conclusion 79
6.1 Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.2 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
References 82
Appendix 90
Terms of Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Technical annex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Acronyms and Abbreviations
BEES Business Enterprise Economic Survey
CD Coefficient of determination
CFA Confirmatory Factor Analysis
CFI Comparative fit index
CPI Corruption Perception Index
DeMPA Debt Management Performance Framework
DFID Department for International Development
EFA Exploratory Factor Analysis
GCB Global Corruption Barometer
GIB Global Insights Business
ICRG International Country Risk Guide
INTOSAI International Organization of Supreme Auditing Institutions
MDGs Millennium Development Goals
NGOs Non-governmental Organizations
NPFM New Public Financial Management
OBI Open Budget Index
PCA Principal Component Analysis
PEFA Public Expenditure and Financial Accountability
PFM Public Financial Management
PFM-PR Public Financial Management Performance Report
PMF Public Expenditure and Financial Accountability Performance Measurement Framework
RMSEA Root mean squared error of approximation
SRMR Standardized root mean squared residual
TI Transparency International
TLI Tucker-Lewis index
WBES World Bank Enterprise Surveys
List of Figures
3.1 Indicators’ classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2 Composite index structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3 Constructing composite indicators . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.1 Validity and reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2 One-factor EFA model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 EFA vs CFA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.1 OBS’s 11 intermediate indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.2 EFA visualisation, Budget Process and Budget Proposal . . . . . . . . . . . . . . 44
5.3 CFA diagram, Budget Process and Budget Proposal . . . . . . . . . . . . . . . . 45
5.4 EFA visualisation, Budget Proposal and Citizens’ Budget . . . . . . . . . . . . . 47
5.5 CFA diagram, Budget Proposal and Citizens’ Budget . . . . . . . . . . . . . . . . 48
5.6 EFA visualisation, Budget Process and Citizens’ Budget . . . . . . . . . . . . . . 49
5.7 CFA diagram, Budget Process and Citizens’ Budget . . . . . . . . . . . . . . . . 50
5.8 EFA visualisation, all dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.9 CFA diagram, all dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.10 Structure and coverage of the PFM system . . . . . . . . . . . . . . . . . . . . . 54
5.11 PEFA indicators list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.12 EFA visualisation, Credibility and Comprehensiveness & Transparency . . . . . . 57
5.13 CFA diagram, Credibility of the Budget and Comprehensiveness & Transparency 59
5.14 EFA visualisation, Credibility of Budget and Accounting, Recording & Reporting 60
5.15 CFA diagram, Credibility of Budget and Accounting, Recording & Reporting . . 61
5.16 EFA visualisation, Credibility of the Budget and Budget Cycle . . . . . . . . . . 62
5.17 Methods for measuring corruption. . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.18 Main “overall corruption” indicators . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.19 GCB’s 23 indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.20 EFA visualisation, Experience of Corruption & Effectiveness . . . . . . . . . . . . 71
5.21 EFA visualisation, Perception & Experience of Corruption . . . . . . . . . . . . . 72
5.22 EFA visualisation, Perception of Corruption & Effectiveness . . . . . . . . . . . . 73
5.23 EFA visualisation, all dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.24 CFA diagram, all dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
List of Tables
4.1 Validity and reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2 CFA fit indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.1 EFA, Budget Process and Budget Proposal . . . . . . . . . . . . . . . . . . . . . 44
5.2 CFA, Budget Process and Budget Proposal . . . . . . . . . . . . . . . . . . . . . 46
5.3 CFA fit indices, Budget Process and Budget Proposal . . . . . . . . . . . . . . . 46
5.4 CFA fit indices, Budget Proposal and Citizens’ Budget . . . . . . . . . . . . . . . 48
5.5 CFA fit indices, Budget Process and Citizens’ Budget . . . . . . . . . . . . . . . 50
5.6 CFA fit indices, all dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.7 CFA fit indices, Credibility and Comprehensiveness & Transparency 1/2 . . . . . 58
5.8 CFA fit indices, Credibility and Comprehensiveness & Transparency 2/2 . . . . . 59
5.9 CFA fit indices, Credibility of Budget and Accounting, Recording & Reporting . 61
5.10 CFA fit indices, Credibility of Budget and Budget Cycle . . . . . . . . . . . . . . 63
5.11 Measures of corruption. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.12 CFA fit indices, all dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.13 CFA fit indices, all dimensions modified . . . . . . . . . . . . . . . . . . . . . . . 77
6.1 EFA, Budget process and budget proposal . . . . . . . . . . . . . . . . . . . . . . 93
6.2 EFA, Budget process and budget proposal . . . . . . . . . . . . . . . . . . . . . . 93
6.3 CFA, Budget process and budget proposal . . . . . . . . . . . . . . . . . . . . . . 95
6.4 EFA, Budget proposal and citizens’ budget . . . . . . . . . . . . . . . . . . . . . 97
6.5 CFA, Budget proposal and citizens’ budget . . . . . . . . . . . . . . . . . . . . . 98
6.6 EFA, Budget process and citizens’s budget . . . . . . . . . . . . . . . . . . . . . . 99
6.7 CFA, Budget process and citizens’s budget . . . . . . . . . . . . . . . . . . . . . 99
6.8 CFA, Budget process, budget proposal and citizens’ budget . . . . . . . . . . . . 101
6.9 CFA, Budget process, budget proposal and citizens’ budget . . . . . . . . . . . . 102
6.10 EFA, Credibility [A.] and Comprehensiveness & Transparency [B.] . . . . . . . . 103
6.11 EFA, Credibility [A.] and Budget Cycle [C(iii)] . . . . . . . . . . . . . . . . . . . 103
6.12 EFA, Credibility [A.] and Budget Cycle [C.] . . . . . . . . . . . . . . . . . . . . . 104
6.13 CFA, Credibility [A.] and Comprehensiveness & Transparency [B.] . . . . . . . . 104
6.14 CFA, Credibility [A.] and Budget Cycle [C(iii)] . . . . . . . . . . . . . . . . . . . 105
6.15 CFA, Credibility [A.] and Budget Cycle [C.] . . . . . . . . . . . . . . . . . . . . . 106
6.16 EFA, Aggregate Measures of Corruption . . . . . . . . . . . . . . . . . . . . . . . 107
6.17 EFA, Experience of Corruption & Effectiveness . . . . . . . . . . . . . . . . . . . 107
6.18 EFA, Perception of Corruption & Effectiveness . . . . . . . . . . . . . . . . . . . 108
6.19 EFA, Perception & Experience of Corruption . . . . . . . . . . . . . . . . . . . . 108
6.20 EFA, Perception of Corruption, Experience of Corruption & Effectiveness . . . . 109
6.21 CFA, Perception of Corruption & Effectiveness . . . . . . . . . . . . . . . . . . . 110
6.22 CFA, Perception & Experience of Corruption . . . . . . . . . . . . . . . . . . . . 111
6.23 CFA, Perception of Corruption, Experience of Corruption & Effectiveness . . . . 112
Chapter 1
Introduction
The Department for International Development (DFID) is the British government’s department
responsible for the UK’s efforts towards ending extreme poverty (DFID, 2014) and achievement
of the Millennium Development Goals (MDGs). In line with this, DFID has decided to invest
efforts in the study of governance, due to its implied association with economic growth¹ and
its inclusion in the new Sustainable Development Goals (SDGs) (UN, 2014). Consequently, the UK
has included good governance among its development objectives and has committed to using
its expertise to achieve this goal².
However, governance is not easily quantifiable. Some of the shortcomings confronted by aid
agencies and policymakers in tracking progress in this area relate to the difficulties of defining
governance, establishing accurate indicators, collecting dispersed information, and comparing
measurement outcomes across countries (UNDP, 2012). This report aims to explore the validity
and reliability of specific governance indicators using a multivariate analysis approach. In
particular, we employ exploratory and confirmatory factor analysis to assess both criteria through the
study of two relevant dimensions of governance: Corruption and Public Financial Management.
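The factor-analytic machinery behind this assessment is developed in Chapter 4. As a minimal illustration of the reliability side of the problem, the sketch below computes Cronbach's alpha, a standard internal-consistency statistic, for a small set of indicator scores. Both the choice of statistic and the numbers are ours for illustration only; they are not drawn from the report's datasets, and the report's own analysis relies on factor analysis rather than alpha alone.

```python
# Hedged sketch: Cronbach's alpha as a simple reliability check for a
# set of indicators assumed to measure one underlying construct.
# Pure-Python, illustrative data only.

def cronbach_alpha(items):
    """items: list of k indicator columns, each a list of n scores."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = sum(var(col) for col in items)
    totals = [sum(col[i] for col in items) for i in range(n)]
    return (k / (k - 1)) * (1 - item_vars / var(totals))

# Three hypothetical indicators scored for five countries
a = [3, 4, 2, 5, 4]
b = [3, 5, 2, 4, 4]
c = [2, 4, 3, 5, 4]
print(round(cronbach_alpha([a, b, c]), 2))
```

An alpha near 1 suggests the indicators move together and could plausibly be aggregated; an alpha near 0 suggests they do not share a common construct.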
Our PFM findings imply that the OBS data is of good quality overall, while the PEFA indicators
appear to measure very close constructs with weak convergent and discriminant validity. For
Corruption, we find that the aggregate indicators developed by the World Bank, GIB and ICRG
measure the same underlying notion of corruption. However, the GCB indicators fail to measure
the conceptual understanding proposed by the survey.
¹ The extensive literature on governance and economic performance highlights their close association and the need to expand efforts to achieve good governance in order to enhance economic growth (see Khan, 2006; Acemoglu & Robinson, 2006; Arndt & Oman, 2010; Morrisey et al., 2011; Gani, 2011; Cerquety, 2011).
² "This, I believe, is a totally new addition to the Millennium Development Goals: the importance of good governments, lack of corruption – what I call the golden thread of development." David Cameron's speech at the UN (15 May 2013), available at: https://www.gov.uk/government/speeches/david-camerons-speech-to-un
The structure of this report proceeds as follows: Chapter 2 defines governance and its dimensions,
and gives a policy background of the two dimensions that are the focus of our analysis. Chapter
3 develops a general approach to governance indicators and elaborates on the construction
and use of composite indicators. Chapter 4 defines validity and reliability and presents the
main econometric tools used in our assessment. Chapter 5 presents the data and results of the
technical analysis and discusses the practical significance of the results. From there, Chapter 6
concludes and presents our policy recommendations.
Chapter 2
Governance: definition and
dimensions
In this chapter, we define governance and identify some of its dimensions as proposed by various
organizations. We then identify two dimensions, Public Financial Management and Corruption,
which will be the focus of our analysis and give the policy context for our study of these two
dimensions.
2.1 Understanding governance and its dimensions
Academics, international organizations and development agencies have defined governance in
multiple ways and there appears to be no agreement on a single definition of the term. In
this section, we attempt to define governance and give examples of dimensions of governance
as proposed by different organisations.
Definitions of governance range from the broad to the narrow. The latter reflect
specific interpretations of the term by different organisations, in line with their particular
mandate (World Bank, 1994). Furthermore, a number of donor agencies and academics have given
a normative element to the term 'governance', often referring to it as 'good governance'. The
underlying reason for adopting a normative approach is the close association between aid
effectiveness and good quality institutions (WGI, 2009).
The World Bank Research Institute proposes one of the most comprehensive definitions of
governance, defining it as the traditions and institutions by which authority in a country is
exercised, including:
• The process by which governments are selected, monitored and replaced
• The capacity of the government to effectively formulate and implement sound policies
• The respect of citizens and the state for the institutions that govern economic and social
interactions among them (Kauffman et al., 2004)
Defining governance is important, since its operationalisation helps identify the various elements
that constitute it, thereby facilitating its measurement. Although there is currently no consen-
sus on what ‘governance’ means, institutions have chosen to identify principles or dimensions of
governance to facilitate its measurement and the monitoring of progress (WGI, 2009). UNDP,
for example, identifies nine characteristics of good governance: participation, rule of law,
transparency, responsiveness, consensus orientation, equity, effectiveness and efficiency,
accountability and strategic vision (UNDP, 1997). The Worldwide Governance Indicators Project proposes
six dimensions of governance: ‘Voice and accountability’, ‘Political Stability and Absence of
Violence’, ‘Government Effectiveness’, ‘Regulatory Quality’, ‘Rule of Law’ and ‘Control of Cor-
ruption’ (Kauffman et al., 2004). Other dimensions of governance include public sector delivery,
public financial management, state capacity, security, empowerment and elections.
2.2 Dimensions of focus
From the many possible dimensions of governance, this report focuses on two: Public Financial
Management (PFM) and Corruption. With these we aim to compare objective and subjective
governance indicators. While indicators measuring PFM are mostly based on objectively ver-
ifiable data, corruption indicators often rely on perception based information. Therefore, by
deciding to analyse these dimensions, we aim to encompass the two ends of the spectrum of
existing governance indicators. More importantly, from a policy perspective these dimensions
are central in the governance debate, as illustrated in the rest of this section.
2.2.1 Public Financial Management Policy Background
PFM refers to how a government manages the budget in its various phases of formulation,
approval and execution including oversight, control and intergovernmental fiscal relations (Can-
giano et al., 2012). It relates to the laws, systems and procedures used by governments to
employ resources efficiently and transparently and focuses on the management of government
expenditure (Allan et al., 2004).
PFM has three main policy objectives (Schick, 1998):
• To promote a sustainable fiscal position by establishing a balance between government
revenues and expenditure.
• To facilitate effective allocation of public resources in line with government priorities.
• To enhance efficient delivery of public goods and services by promoting value for money.
Since the late 1980s, donor countries have recognised the importance of improving public sector
management in order to improve economic performance (Wescott, 2008). This is because high
levels of corruption in developing countries may lead to aid being used for purposes other than
those for which it was intended. This concern, as well as the new framework for aid¹ under which
donor countries committed to relying on partner countries' own financial management institutions,
has led to PFM becoming central in development policy. As a result, more than 50 donor
agencies, including the World Bank, are now providing either General Budget or Sector Budget
Support to developing countries (Bietenhader, 2010).
The last two decades have seen a gradual evolution of PFM and an adoption of reforms that
have introduced new information requirements, process adjustment, and imposition of restrictive
rules (Cangiano et al., 2012). These reforms have taken different theoretical approaches. New
Public Financial Management (NPFM) introduced reforms that included a shift from cash- to
accruals-based systems, devolution of budgets, performance measurement and performance-based
auditing (DFID, 2009). The Public Expenditure Management approach aimed to understand the
broader context of good budgeting practices, taking into account different actors and institutions
as well as linking expenditure with results (World Bank, 2001).
More recently, a study by the Public Expenditure and Financial Accountability (PEFA) Initiative
developed the Strengthened Approach. The study identified a lack of country ownership, an
inability to objectively measure the progress of PFM reforms, and uncoordinated PFM projects
as hindrances to PFM reforms in developing countries. This approach emphasised increased
government ownership, coordinated programme support by various donor countries, and a
measurement framework to measure PFM results and performance over time.
In light of the measurement framework objective of the Strengthened Approach, various measures
and diagnostic tools have been developed and applied to measure PFM results, including the IMF
Code of Good Practices for Fiscal Transparency, the Public Expenditure and Financial
Accountability (PEFA) Assessment, the Open Budget Survey (OBS), the Debt Management
Performance Framework (DeMPA), DFID's Fiduciary Risk Assessment and the OECD Aid
Effectiveness Indicators. Two of the most commonly used measures, PEFA and OBS, are the
subject of our analysis in the following sections.
PFM is crucial for good governance since it ensures a more transparent budgeting process that
reduces misallocation of funds and public debt. It is recognised that budgeting, besides being a
technical process, is also a political process affected by informal interests that may at times
override the imperatives of efficiency and effectiveness. PFM aims to counter this by introducing
¹ Encapsulated in the 2005 Paris Declaration.
measures that change the motivations and actions of politicians and public servants or restrict
their actions (Cangiano et al., 2012).
2.2.2 Corruption Policy Background
Corruption has been defined as the misuse of entrusted power for private gain and can be
classified into either grand or petty corruption (TI, 2015)². Grand corruption is committed at
high levels of government, resulting in the distortion of policies or of the state's functions, while
petty corruption refers to the everyday misuse occurring between low-level officials and citizens (TI,
2015; Mashali, 2012).
Corruption is a topic of great concern for policy makers, academics and aid organisations due
to its apparent negative effects on economic performance. Morrissey et al. (2011) argue that
corruption reduces direct investment incentives, while Begovic (2005) states that it increases
transaction costs of economic operations. However, studies have shown that although corruption
has an adverse effect on growth, it is hard to establish its magnitude (DFID, 2015). Moreover,
Méon & Weill (2010) state that in countries with low-quality institutions, corruption can act
as the 'grease in the wheels', leading to increased productivity. Despite this, by providing
incentives for the illegal misappropriation of benefits, corruption perpetuates institutional weaknesses
and reduces the chances of economic improvement in the long run. Empirical evidence shows
that the positive effects of corruption on aggregate efficiency only hold in countries with weak
institutions and frail democracies (ibid.).
There are various explanations of the causes of corruption, some of which rely on the study of
individual incentives. Begovic (2005) builds on the rational choice approach to explain corruption,
viewing people as utility-maximising entities aiming to increase personal wealth. Corruption here
is seen as an innate behaviour that serves this end by reducing transaction costs and allowing for
the misappropriation of rents. Furthermore, such behaviour is exacerbated by public officials'
greed and discretion. As explained by Rose-Ackerman (1978), corruption is the consequence of
excessive discretionary power, which allows public officials to reward or punish citizens in
pursuit of their own preferences with a low risk of detection.
Another salient explanation of corruption is the principal-agent model (DFID, 2015). Corruption
is seen as the result of asymmetric information between the principal (citizens) and the agent
(public officials). The fact that agents hold more information than principals, with the
latter having little oversight capacity, increases the likelihood of corruption. This approach
² Since 1993, Transparency International (TI) has played an important role in the disclosure of the pervasive practice of corruption around the world. Through the yearly release of the Corruption Perception Index, this non-governmental organisation draws attention to the achievements and drawbacks of the fight against corruption worldwide. Similarly, the World Bank conducts a yearly measurement of governance performance: the Worldwide Governance Indicators (WGI) include the control of corruption as one of their six governance dimensions. Interestingly, 31 out of 32 sources used by this renowned measurement include indicators regarding corruption.
incorporates the individual rent-seeking rationale previously considered while highlighting the
asymmetric relationship between the actors. According to the literature, this is a valid explanation
that has been widely evidenced in developing countries.
Conversely, Khan (2006) points to structural causes of corruption in the developing world. Two
of them are the use of corruption to guarantee political stability by giving selective incentives
to certain constituencies, and the weakness of property rights, which increases the chances of
developing non-market transactions. In both cases, corruption is the answer to an institutional
or structural failure. Therefore, policy reforms must concentrate on strengthening
institutions.
Finally, it is necessary to highlight that corruption appears in different forms and has different
drivers depending on the sector. For example, Bertram (2005) studies corruption in nine sectors
and concludes that types and drivers of corruption change from one sector to another. Therefore,
anti-corruption policy design should take into account the key features of each sector in order
to be effective. Most importantly, any attempt to accurately measure and assess policy results
in this area requires a clear definition and understanding of the type and level of corruption to
be evaluated.
Chapter 3
Measuring Governance: composite indicators¹
As a multidimensional phenomenon, governance has recurrently been measured by composite
indicators. This chapter presents this approach and discusses its advantages and disadvantages. It
focuses in detail on the process of designing composite indicators and addresses methodological
issues from a technical perspective.
Quantifying governance is an arduous endeavour, primarily due to the practical and methodological
issues involved in the design of good metrics. Governments, development organisations
and the private sector are increasingly adopting a composite-indicator-based approach to measure
the multidimensional phenomenon of governance (Williams, 2011). The key advantage of
composite indicators is their capacity to compile and summarise large amounts of information
about many different governance dimensions in a single yearly score for each country. This allows
donor agencies such as DFID and USAID to track countries' progress and better define resource
allocation. Therefore, an index approach has become widely used for decision-making purposes
(Arndt & Oman, 2010; Foa & Tanner, 2012; UNDP, 2007).
Governance indicators can be classified in various ways, and the following diagram is an illus-
tration of one possible classification.
1 Also referred to as 'aggregate indicators'.
Figure 3.1: Indicators’ classification
Source: Authors’ own representation.
The design of composite governance indicators is subject to several challenges. Firstly, the
multidimensional nature of governance makes it difficult to quantify. The composite index design
requires an arbitrary selection of which dimensions and proxies to use in order to capture changes
in the chosen parameters, and of how to aggregate the individual indicators. Secondly, more often
than not, governance indices are based on subjective, perception-based information such as expert
surveys. The use of subjective information increases the scope for bias and measurement error
(Arndt & Oman, 2006). Thirdly, the framework for measuring governance is often influenced by
international standards which might not reflect the realities of other contexts. For example, the
OBS compares both developing and developed countries on Public Financial Management using
the same framework, even though it is well known that there are significant differences between
the two groups of countries in terms of budget procedures and capacities. The following sections
discuss how to overcome some of these problems from a methodological perspective.
3.1 Composite governance indicators: an introduction
Composite indexes are a particular type of measures designed to quantify multidimensional
phenomena which cannot be adequately represented by an individual indicator e.g. Multidi-
mensional Poverty Index (MPI) (Anand & Sen 1994, Alkire and Housseini, 2014). Under this
approach, a set of relevant variables and dimensions are defined, measured and aggregated under
a single country score. Simply, composite indicators resemble a pyramid structure (Figure 3.2).
At the base, it compiles all the relevant variables to measure the specific phenomenon. Each of
those are aggregated under a set of components. Later, the resulting score is aggregated un-
der a single country-year score which represents the country indicator (Arndt & Oman, 2010).
Thus, composite indexes produce synthetic measures of complex realities, which serve to monitor
changes in system status or to track trends in system performance (OECD, 2008; USAID, 2014).
Figure 3.2: Composite index structure
Source: Authors’ own representation.
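To make the pyramid concrete, its three levels can be sketched in a few lines of code. This is a hypothetical illustration rather than any specific index: the variable names, scores and weights below are invented, and simple averaging stands in for whatever aggregation function a real index would use.

```python
# Hypothetical sketch of the composite-index pyramid: variables at the base,
# components in the middle, a single country-year score at the top.
# All names, scores and weights are invented for illustration.
import numpy as np

# Base of the pyramid: normalised variables (0-1) for one country-year.
variables = {
    "bribery_survey": 0.40, "audit_score": 0.55,         # corruption proxies
    "budget_transparency": 0.70, "arrears_ratio": 0.60,  # PFM proxies
}
components = {
    "corruption": ["bribery_survey", "audit_score"],
    "pfm": ["budget_transparency", "arrears_ratio"],
}

# Middle layer: aggregate variables into component scores (simple means here).
component_scores = {c: float(np.mean([variables[v] for v in vs]))
                    for c, vs in components.items()}

# Top of the pyramid: weighted average of components -> single country score.
weights = {"corruption": 0.5, "pfm": 0.5}
country_score = sum(weights[c] * component_scores[c] for c in weights)
print(round(country_score, 4))  # → 0.5625
```

Real indices differ in every choice made here (normalisation, aggregation function, weights), which is precisely why those choices are discussed in detail in Section 3.2.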
3.1.1 Advantages of composite (aggregate) indicators
The key strength of aggregate indicators is their ability to convey information on many param-
eters succinctly (Booysen, 2002; Hahn, 2008; Zhou & Ang, 2009; Balica, 2012b). Composite
indices are therefore powerful communicative tools, because they present clear and concise
results, such as scores or rankings, to non-technical audiences (Kenney et al., 2012). This helps
to promote multi-stakeholder dialogue, establishing a common understanding of supranational
concerns and overcoming socio-political barriers to decision making (Preston et al., 2011: 183).
The two main advantages of aggregate measures are:
• Variables that cannot be directly observed may be inferred by integrating multiple indica-
tors into a composite indicator.
• The use of composite indices helps to overcome problems of precision, reliability and ac-
curacy by reducing the influence of measurement error as the number of observations from
multiple sources increases (Kaufmann & Kraay, 2007; Maggino & Zumbo, 2012).
They can also inform policy action and guide further research. Booysen (2002) postulates that
composite indexes are flexible enough to be modified and updated to meet decision makers'
requirements. Additionally, as new data become available, composite index methodologies
evolve and are refined over time to incorporate new challenges (e.g. the 2010 HDI modifications;
see Klugman et al., 2011). Composite indicators can therefore help policymakers to identify
priorities, establish and refine standards, develop policy guidelines, determine appropriate
adaptations, set targets, and allocate resources (OECD, 2008), as well as guide future research,
data collection, and data improvement efforts by revealing weaknesses and gaps in data systems
(USAID, 2014).
3.1.2 Weaknesses of composite (aggregate) indicators
A common critique of aggregate indices is that they rest on a framework of implicit and ex-
plicit assumptions which do not always hold (Arndt & Oman, 2006; Ravallion, 2012), alongside
the loss of the individual indicators' richness during aggregation (Molle & Mollinga,
2003; Abson et al., 2012; Kenney et al., 2012). Both shortcomings can lead to mistaken conclu-
sions (Lindsey, Wittman et al., 1997). Moreover, composite indices may also fail to capture the
interconnectedness of indicators, ignore important dimensions that are difficult to measure, and
disguise weaknesses in some components (Molle & Mollinga, 2003; Zhou & Ang, 2009; Abson et
al., 2012).
Contrary to what some authors argue (e.g. Kaufmann et al., 2007), indicator aggregation can
have a domino effect, in that it tends to amplify the effect of measurement errors. Problems
of precision, reliability, accuracy, and validity associated with individual indicators can thus
be propagated during the process of aggregation. However, the biggest limitation of aggregate
indicators is the mechanism for determining their constituent variables (Lohani & Todino, 1984).
Generally, the parameters chosen reflect the priorities or focus areas of the agencies which
construct such indicators. There is no standard scientific method for selecting parameters;
rather, they tend to be based on expert opinion. Thus, aggregate indicators cannot rule out the
possibility of omitting important variables and are also exposed to expert bias (Arndt & Oman,
2006).
Aggregate indicators therefore tread a tightrope: simplifying intricate information without
being simplistic. The above concerns imply that composite indices can also misguide policy
and practice if used in an undiscriminating manner or if results are misinterpreted,
misrepresented, or overstated (Arndt & Oman, 2006; USAID, 2014).
3.2 Designing indicators: methodological considerations2
3.2.1 Selection of indicators
While indicators used in aggregation are often obtained from existing data sources, they can
also be sourced by planning and implementing new data collection efforts. The selection of
variables for inclusion in aggregation is a contentious issue and must be approached with caution
(Lohani & Todino, 1984). Barnett et al. (2008) argue that indicators are sometimes "selected
not because the data reflect important elements of a model of vulnerability, but because of
the existence of data that are relatively easy to access and manipulate." Pragmatic criteria for
deciding whether to include or exclude an indicator are as follows (USAID, 2014):
• Data availability from public or private sources, including the cost, frequency, timeliness,
consistency, and accessibility of available data and the indicators' temporal and spatial
coverage.
• If periodic updates are planned, it is important to ascertain institutional commitments
to update and maintain constituent data sets, and to choose data accordingly.
• Where new data are to be collected, time, effort, and budget constraints must be taken
into account.
• Data quality (e.g. data accuracy; whether or not the data are adequately referenced and
dated).
• The degree of salience: how relevant is the indicator to the intended users of the index?
• The degree of audience resonance: how meaningful is the indicator to the intended audi-
ence?
Beyond these pragmatic considerations, the selection of variables is contingent on other factors.
Firstly, the selected sub-indices must represent the principal factors of interest.
Secondly, collinearity amongst sub-indices is a problem: highly correlated sub-indices can
effectively be considered substitutes. Correlation coefficients are a standard way of testing for
collinearity, and a practical way of addressing it is to include only one sub-index from a highly
correlated set.
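A minimal sketch of this screening step, on simulated data (the sub-index names and the 0.9 cut-off are arbitrary choices for illustration):

```python
# Hypothetical sketch: flag highly correlated sub-indices so that only one of
# each correlated pair enters the composite. Data are simulated.
import numpy as np

rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = a + 0.1 * rng.normal(size=n)   # near-duplicate of sub_a
c = rng.normal(size=n)             # independent sub-index
data = {"sub_a": a, "sub_b": b, "sub_c": c}

keep, threshold = [], 0.9
for name in data:
    # keep `name` only if it is not highly correlated with an already-kept one
    if all(abs(np.corrcoef(data[name], data[k])[0, 1]) < threshold for k in keep):
        keep.append(name)
print(keep)  # → ['sub_a', 'sub_c']  (sub_b dropped as a near-substitute)
```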
2 A flowchart representation of the indicator design process is provided in Figure 3.3.
3.2.2 Factor retention
If the number of indicators selected at the previous stage is large, it is desirable to reduce the
effective number by retaining only the most significant indicators, removing indicators of low
relevance, and thereby minimising the redundancy of highly correlated variables. Many statistical
techniques and stakeholder processes are available to narrow down the pool of indicators; for
example, exploratory factor analysis, principal component analysis (PCA), the derivative method,
the correlation method, expert surveys, and stakeholder discussion (Adger & Vincent, 2005; Balica
& Wright, 2010; Balica et al., 2012; Babcicky, 2013).
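As an illustration of the statistical route, the sketch below applies the eigenvalue-greater-than-one rule (one common PCA-based retention heuristic among several) to simulated indicators driven by two latent factors; the data and threshold are hypothetical.

```python
# Sketch: PCA-based factor retention on simulated data. Five indicators are
# driven by two latent factors; the Kaiser rule (retain components whose
# correlation-matrix eigenvalue exceeds 1) should retain two.
import numpy as np

rng = np.random.default_rng(1)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)      # two latent drivers
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
])

eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]  # descending
retained = int((eigvals > 1.0).sum())
print(retained)  # → 2
```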
3.2.3 Selection of aggregation function
The aggregation function is central to the creation of an aggregate index: it defines how the sub-
indices combine to form the aggregate. Given its importance, it is not surprising that there is
considerable debate over the most appropriate aggregation function (Ravallion, 2012). Some of
the most commonly used aggregation functions are:
• Summation (additive aggregation): the summation of normalised, weighted or un-
weighted indicators to compute the arithmetic mean (Booysen, 2002; Tate, 2012).
• Multiplication (geometric aggregation): the product of normalised, weighted indica-
tors (Tate, 2013).
• Power means or adjusted means: these stress the importance of specific areas of the
indicators, where countries suffer more deprivation or are at risk of under-performing
with respect to a set threshold.
• Max or Min: the maximum or minimum sub-index, respectively, is reported.
Firstly, one needs to consider the strengths and weaknesses of the aggregation functions. Ott
(1978) highlighted two potential problems with aggregation functions:
• Underestimation problem: the index does not exceed a critical value despite one
or more of its sub-indices exceeding it.
• Overestimation problem: the index exceeds the critical level without any sub-index
exceeding it.
These problems become increasingly significant when the sub-indices are dichotomous or
categorical. A good aggregation function is therefore one which minimises one or both of the
over- and underestimation problems.
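The four functions, and Ott's underestimation problem for the additive case, can be sketched as follows (the sub-index values and the critical value of 0.9 are invented for illustration):

```python
# Sketch: common aggregation functions applied to three hypothetical
# normalised sub-indices, and Ott's underestimation problem in action.
import numpy as np

sub_indices = np.array([0.95, 0.20, 0.30])          # normalised to [0, 1]
weights = np.array([1 / 3, 1 / 3, 1 / 3])

additive = float(np.sum(weights * sub_indices))     # weighted arithmetic mean
geometric = float(np.prod(sub_indices ** weights))  # weighted geometric mean
max_agg = float(sub_indices.max())                  # max aggregation
min_agg = float(sub_indices.min())                  # min aggregation

# Underestimation problem: one sub-index breaches the critical value (0.9),
# yet the additive index stays well below it, masking the breach.
critical = 0.9
print(max_agg > critical, additive > critical)  # → True False
```

Note that max aggregation avoids underestimation by construction, at the cost of discarding information from all other sub-indices.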
Secondly, it is necessary to take into account the functional form (either increasing or decreasing)
of the sub-indices. Those with increasing functional forms regard higher values as a 'worse' state
than lower values, and vice versa.
Lastly, simple mathematical aggregation functions are often preferred over complex ones. When
competing aggregation functions produce similar results with respect to overestimation and
underestimation, the most appropriate function is the mathematically 'simplest' one
(Jollands et al., 2003).
3.2.4 Selection of weights
The weights on sub-indices indicate their relative importance and greatly influence the final
aggregate indicator. However, weighting is one of the most contentious topics in indicator
design, partly because there is no standard method. We present some approaches below.
Normative approaches: these use expert consultation, stakeholder discussion, and public opinion
surveys to inform weighting schemes on the basis of the expertise, local knowledge, value judg-
ments and insights of particularly relevant individuals and groups (Booysen, 2002; Chowdhury
& Squire, 2006; Cherchye et al., 2007; Barnett et al., 2008; OECD, 2008; Kienberger, 2012;
Decancq & Lugo, 2013).
Numerical approaches:
• Differential weighting: employed when there is sufficient understanding of
the relative importance of index components or of the trade-offs between index dimensions
(Belhadj, 2012; Decancq & Lugo, 2013; Tate, 2013).
• Equal weighting: applied when trade-offs between dimensions are not well under-
stood and the assignment of differential weights therefore cannot be reliably justified (Tate,
2012, 2013; Decancq & Lugo, 2013; Tofallis, 2013).
Data-driven approaches: these apply statistical methods to generate indicators' weights.
Blancas et al. (2013) argue that using statistical methods to determine weights may counteract
the influence of subjective decisions made at other stages of the indicator design process. Factor
analysis (FA) and principal component analysis (PCA) can be used to test indicators for
correlation, thus allowing for adjustments to the weighting scheme by reducing the weights of
correlated indicators. PCA and FA generate weighting schemes that account for as much of
the variation in the data as possible with the smallest possible number of indicators (Deressa
et al., 2008; OECD, 2008; Nguefack-Tsague et al., 2011; Abson et al., 2012; Tofallis, 2013).
However, as De Muro, Mazziotta and Pareto (2011) argue, this weighting approach is rigid: the
indicator risks becoming inoperable when confronted with changes in the data collection process.
Additionally, FA results may sometimes diverge from reality; a good underlying theory is
consequently essential to avoid such errors (De Muro, Mazziotta & Pareto, 2011).
This analysis uses FA as the technique to assess the validity and reliability of governance indicators.
A complete explanation of this methodology is developed in Chapter 4.
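One simple variant of the data-driven route can be sketched as follows (simulated data, invented noise levels): each indicator is weighted by its loading on the first principal component, so that noisier, less correlated indicators receive less weight. This is an illustration of the general idea, not the only way PCA-based weights are constructed.

```python
# Sketch (simulated data): derive indicator weights from the first principal
# component of the correlation matrix. Noisier indicators end up with
# smaller weights.
import numpy as np

rng = np.random.default_rng(2)
n = 2000
g = rng.normal(size=n)                                  # common latent driver
X = np.column_stack([g + s * rng.normal(size=n) for s in (0.2, 0.6, 1.2)])

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)                    # ascending eigenvalues
first = eigvecs[:, -1]                                  # loadings on first PC
first = np.sign(first.sum()) * first                    # resolve sign ambiguity
weights = first / first.sum()                           # normalise to sum to 1
print(np.round(weights, 2))
```

The rigidity criticised above is visible here: the weights are a function of the particular sample, so a change in the data collection process changes the index itself.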
3.2.5 Uncertainty and Sensitivity analysis
The final step of composite indicator construction is to run uncertainty and sensitivity tests.
These help to determine whether the adopted theoretical model is a good fit for the selected
constituent indicators, and the extent to which a different choice of inputs changes the output
ranking. They can also test whether the weighting scheme is actually reflected in the output and
whether the index is capable of reliably detecting change over time and space (USAID, 2014).
These analyses inform modifications and refinements of the index composition and structure,
improving the accuracy, credibility, reliability, and interpretability of index results (OECD,
2008; Permanyer, 2011; Tate, 2012, 2013).
• Uncertainty analysis: focuses on how uncertainty in the input factors propagates
through the structure of the composite indicator and affects the composite indicator val-
ues (Nardo et al., 2005). It identifies and evaluates all possible sources of uncertainty in
the index design and input factors, including theoretical assumptions, selection of con-
stituent indicators, choice of analysis scale, data quality, data editing, data transformation,
methods applied to overcome missing data, the weighting scheme, the aggregation method,
and the composite indicator formula (USAID, 2014).
• Sensitivity analysis: analyses the degree of influence of each input on the index output,
thereby revealing which methodological stages and choices are most or least influential
(Gall, 2007; Tate, 2012). Permanyer (2012) proposes that modellers should compare index
results calculated using alternative weighting schemes to ascertain whether overall index
rankings change substantially.
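Permanyer's suggestion can be sketched on simulated scores (the country count, sub-index values and both weighting schemes below are all hypothetical):

```python
# Sketch (simulated data): compare country rankings under two alternative
# weighting schemes, a simple form of sensitivity analysis.
import numpy as np

rng = np.random.default_rng(3)
scores = rng.random((10, 3))                 # 10 countries x 3 sub-indices

def rank(weights):
    idx = scores @ np.asarray(weights)       # additive aggregation
    return np.argsort(-idx)                  # country ids, best first

equal = rank([1 / 3, 1 / 3, 1 / 3])          # baseline: equal weights
skewed = rank([0.6, 0.2, 0.2])               # alternative: first sub-index dominant
changed = int((equal != skewed).sum())
print(f"{changed} of 10 rank positions change under the alternative weights")
```

If many positions change, the ranking is driven more by the weighting choice than by the underlying data, which is exactly what the analysis is designed to reveal.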
This section answered a critical question, "How to design a good composite indicator?", which is
of great value to organisations and aid agencies engaged in efforts to measure governance.
However, more often than not, policymakers have to rely on the set of already available indicators
to make their decisions. The following chapter aims to aid policymakers by answering the
question, "How to assess the quality of an indicator?".
Figure 3.3: Constructing composite indicators
Source: Authors’ own representation.
Chapter 4
Assessing governance indicators
This chapter provides a brief technical introduction to the criteria and methods used in ap-
praising the quality of governance indicators. First, it defines two of an indicator’s must-have
qualities: validity and reliability. Then, it gives a brief introduction of factor analysis. Finally,
it explains how the latter is helpful in discussing indicators’ validity and reliability.
4.1 Validity and reliability: definition
Any governance measure should fulfil two important criteria in order to be considered as an
accurate measure: validity and reliability. Validity refers to the extent to which a specific
indicator measures the concept it attempts to measure (Gisselquist, 2013). In statistical terms,
it is defined as the lack of systematic error. Similarly, reliability refers to the extent to which
an indicator can be extrapolated in time and place. Within the statistical framework it relates
to the degree to which a measure lacks random errors (Maruyama & Ryan, 2014). Systematic
and random errors are a recurrent problem in perception-based data and a potential source of
inaccuracy for composite indicators.
Indicators can be valid but not reliable, or vice versa. As shown in Figure 4.1, panel 2, a
country-level indicator might measure the true value on average while being very volatile:
in such a case it is a valid but unreliable indicator. For instance, consider a corruption
measurement carried out monthly in the same country. Corruption levels in a country change
slowly, so these measurements should be very close, if not identical: the opposite would hence
be evidence of the measurement's lack of reliability. Conversely, panel 1 shows the opposite
case: an invalid but very reliable indicator which consistently measures the wrong concept.
Figure 4.1: Validity and reliability
Source: MindSonar, understanding statistics.
Formally, an indicator at the country level is:

X_{i,t} = Truth_{i,t} + Bias_{i,t} + ε_{i,t}

where X_{i,t} is the indicator we measure for country i in year t (for simplicity of exposition,
assume it is a simple average of respondents' answers), Truth_{i,t} is the true level of what one
tries to measure in country i and year t, Bias_{i,t} is the country average of respondents' systematic
error, and ε_{i,t} is the country average of respondents' random measurement error.
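The roles of the two error terms can be checked by simulation. In the sketch below (all numbers invented), the bias shifts the measurement away from the truth on average, a validity problem, while the random error determines its volatility, a reliability problem.

```python
# Illustrative simulation of X = Truth + Bias + error: systematic error shifts
# the average, random error drives the volatility. All values hypothetical.
import numpy as np

rng = np.random.default_rng(4)
truth, bias, sigma = 0.50, 0.10, 0.05
T = 1000                                       # repeated measurements
X = truth + bias + sigma * rng.normal(size=T)

assert abs(X.mean() - (truth + bias)) < 0.01   # centred on truth + bias, not truth
assert abs(X.std() - sigma) < 0.01             # spread set by the random error
```

No amount of averaging removes the bias term; averaging only tames the random component, which is why validity and reliability must be assessed separately.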
Systematic errors: these are associated with sampling errors and perception biases. They are
pertinent to governance indicators because the latter mostly rely on expert opinion. First,
experts' and the general public's views on governance achievements diverge widely. In policy
terms, the problems flagged by experts will be very different from the broad population's views
and needs, particularly because experts tend to over-represent men, the wealthy, and business
people. This would have a meaningful impact on how policy advocacy or aid allocation is carried
out, directed at some topics rather than others (for example, issues related to taxation and
regulation). Second, there is a risk of circular reasoning: given the context shared by experts,
business people and organisations, it is likely that each of them ends up informing their
colleagues' views. Both shortcomings directly affect the validity of the indicators (Arndt &
Oman, 2006). Another source of systematic error arises from the likelihood that respondents
change their responses in order to influence a country's score to benefit their own interests
(Arndt & Oman, 2010; Donchev & Ujhelyi, 2008).
Random errors: these relate to respondents' unintentional biases, such as misunderstanding
the questions, failure to remember exact facts, or simply mood or fatigue (Maruyama & Ryan,
2014). In this respect, the use of large data sets and the aggregation of different data sources
help to increase reliability, insofar as they measure the same thing (UNDP, 2007). Indeed,
individuals' uncorrelated random errors average out. However, in practice there are reasons to
believe that individuals' random errors are correlated in a given year and do not fully average
out when aggregating answers. This is attributable to several reasons. Firstly, data produced
by the indicator are also used to inform respondents' opinions. Secondly, as presented above,
circular reasoning is also an issue. Thirdly, respondents might be affected in a similar way by
the same type of political or economic factors, and hence their views would differ across
different points in time. Fourthly, respondents from the same country have similar views based
on a shared culture or background. When compiling different sources to calculate the final
indicator, any of these issues reduces the amount of additional information provided by each
source. The effect of this shortcoming is that the estimated confidence intervals shrink by more
than they should, and consequently any variation in a country's score may be misleading and
impossible to compare with other countries' performance (Arndt & Oman, 2006). As presented
by Kaufmann et al. (2011), a rule of thumb to overcome this issue is to identify and compare
only countries whose confidence intervals do not overlap with each other or across periods.
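This rule of thumb can be sketched as a small function (the scores and standard errors below are invented; real indicators publish their own):

```python
# Sketch of the Kaufmann et al. rule of thumb (hypothetical scores): treat two
# countries as distinguishable only if their confidence intervals are disjoint.
def distinguishable(score_a, se_a, score_b, se_b, z=1.96):
    lo_a, hi_a = score_a - z * se_a, score_a + z * se_a
    lo_b, hi_b = score_b - z * se_b, score_b + z * se_b
    return hi_a < lo_b or hi_b < lo_a        # True when intervals do not overlap

print(distinguishable(-0.5, 0.15, 0.4, 0.15))  # → True  (intervals disjoint)
print(distinguishable(-0.5, 0.30, 0.1, 0.30))  # → False (intervals overlap)
```

The same test applies across periods: a country's year-on-year score change is only meaningful when the two years' intervals do not overlap.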
Table 4.1 summarises the different sources of potential biases.
Table 4.1: Validity and reliability

                        What is Truth?
                        Concept (unquantifiable)               Amount             Appropriateness of
                        Proxy                No proxy          (objective)        amount (subjective)
Validity (biases)       Proxy bias (is the   Sampling,         Sampling,          Sampling,
                        proxy a good         information and   information and    information and
                        proxy?)              perception        perception         perception
                                             biases            biases             biases
Reliability             Average of respondents' idiosyncratic shocks (mood, fatigue,
(random errors)         misrecording, setting, ...)
We aim to explore the validity and reliability of specific governance indicators using a multi-
variate analysis approach. In particular, we employ exploratory and confirmatory factor analysis
to assess both criteria through the study of two relevant dimensions of governance: Corruption
and Public Financial Management.
4.2 Introducing Exploratory and Confirmatory Factor Analysis
The idea of latent variables and manifest variables (Bartholomew et al., 2008):
An integral aim of our analysis is to assess whether our shortlisted indicators from the selected
dimensions (Corruption and PFM) measure what they intend to measure, that is, whether they
have construct validity. This broader notion of validity is evaluated through two sub-categories:
i) discriminant validity, whether concepts or measures that are supposed to be unrelated are in
fact unrelated; and ii) convergent validity, the degree to which a measure is correlated with
other measures with which it is theoretically predicted to correlate. Latent variable models such
as exploratory and confirmatory factor analysis (henceforth, EFA and CFA) allow us to evaluate
both types of validity statistically. In the context of governance indicators, FA is preferable to
Principal Component Analysis (PCA), a variable reduction technique (Fabrigar et al., 1999),
since FA allows us to make underlying assumptions about the model that PCA cannot.
Furthermore, Brown (2009) postulates that PCA does not account for the random error that is
inherent in measurement, whereas FA does. This makes FA the better technique for the purposes
of our analysis.
For many concepts (e.g. sex, income, size), the correspondence between the concept and its
measurement is sufficiently close that the distinction between the two need not be emphasised.
Many other concepts, however, such as the ideas underlying good governance (corruption, public
service delivery, public financial management, etc.), are more abstract and complex, and capturing
them with empirical data is difficult. In these cases, it is often useful to operationalise them with
more than one observed indicator. Latent variable models, like EFA and CFA, explain the
values of a set of observed variables, and the associations between them, in terms of their presumed
dependence on underlying latent variables. The distinction between the two models is the
following (Bartholomew et al., 2008):
• EFA: Aim is simply to ‘identify the latent variables underlying a set of items’ (or indica-
tors).
• CFA: Aim is ‘to test whether a set of items designed to measure particular concepts are
indeed consistent with the assumed structure’.
4.2.1 Exploratory Factor Analysis model
To express EFA as a regression model, consider:
a set of indicators or manifest variables χ1, χ2, . . . , χp (where there are p indicators; in the
analysis that follows, Public Expenditure and Financial Accountability (PEFA), for example,
has 28 indicators), and a set of latent variables ξ1, ξ2, . . . , ξq (where there are q latent variables,
or 'factors' in statistical parlance). Thus for PEFA, we might suspect the 28 indicators of
measuring the same underlying concept(s). Whether this is the case, and how confidently we can
claim it to be, is the substance of our analysis. Indeed, the 28 indicators have been
grouped by PEFA into three broad sub-dimensions, and our analysis assesses validity and
reliability based on these three categories. The manifest variables can be thought of as the
response variables and the latent variables as the explanatory variables (Bartholomew et al.,
2008). A one-factor model would then have the following model equations for a hypothetical
example of 4 items (indicators):
χ1 = τ1 + λ1ξ + δ1
χ2 = τ2 + λ2ξ + δ2
χ3 = τ3 + λ3ξ + δ3
χ4 = τ4 + λ4ξ + δ4
Where:
• χ1 . . . χ4 are the observed variables as mentioned above
• ξ is the common factor.
• λ1. . . λ4 are the factor loadings
• δ1. . . δ4 are the specific or unique factors
Although the common factors are unobserved, we can think of the above regressions as hypothet-
ical constructs that explain each of the dependent variables (or items) by means of a
common set of regressors. Therefore, even though these regressions are not explicitly estimated,
they aid in conceptualising the mechanism by which factor analysis derives the latent variables
from the correlations between many observed variables. A path diagram helps in visualising the
above regression models: the circle represents the latent variable and each square one of
the observed variables. The arrows (paths) illustrate the relationship between the factor and each
item.
Figure 4.2: One-factor EFA model
The key assumptions of a general multi-factor model are:
• Indicators load on all factors: χ1 = τ1 + Σ_{k=1}^{q} λ_k ξ_k + δ1 (and similarly for χ2, . . . , χp)
• Both the ξ's and the δ's have mean 0, and the ξ's have variance 1
• δ1, δ2, . . . , δp are uncorrelated with each other
• The ξ's are uncorrelated with the δ's
What these assumptions imply is analogous to the conditional mean independence assump-
tion in regression models: the χ's are conditionally independent, given the ξ's. In other words,
the correlations among the observed items are entirely explained by the factors.
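This conditional-independence structure is what makes the loadings recoverable from data. The sketch below (simulated items, invented loadings) uses the implied identity cov(χ_i, χ_j) = λ_i λ_j for i ≠ j, which follows from the assumptions above, to back out the loadings from pairwise covariances alone; full EFA software estimates the same quantities by maximum likelihood.

```python
# Sketch: for a one-factor model x_j = tau_j + lambda_j * xi + delta_j (xi
# standardised, deltas mutually uncorrelated and uncorrelated with xi),
# cov(x_i, x_j) = lambda_i * lambda_j for i != j. We verify this on simulated
# items with true (invented) loadings (0.9, 0.8, 0.7, 0.6).
import numpy as np

rng = np.random.default_rng(5)
n = 5000
xi = rng.normal(size=n)                          # latent factor
lam = np.array([0.9, 0.8, 0.7, 0.6])             # true loadings
X = xi[:, None] * lam + 0.4 * rng.normal(size=(n, 4))

C = np.cov(X, rowvar=False)
# lambda_1 = sqrt(c12 * c13 / c23); the remaining loadings follow from c1j / lambda_1.
lam1 = np.sqrt(C[0, 1] * C[0, 2] / C[1, 2])
est = np.array([lam1, C[0, 1] / lam1, C[0, 2] / lam1, C[0, 3] / lam1])
print(np.round(est, 2))
```

The recovered values sit close to the true loadings, illustrating how the factor structure is identified purely from the correlations among items.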
4.2.2 Confirmatory Factor Analysis model
CFA offers two basic advantages: i) It allows us to test theories about the relationships of
indicators to factors, by setting certain relationships to zero or even by setting certain error
variances equal to each other. One key issue this overcomes is that EFA relies on arbitrary
guidelines and personal judgement for deciding how many factors should be kept, i.e., how many
dimensions the data represent; EFA likewise relies on arbitrary guidelines for interpreting
factors, or deciding which relationships are 'large' and which are 'small'. ii) It allows us to test
theories about the relationships between factors, by estimating values for covariances between
factors or constraining them according to theory. Graphically, the key difference between EFA
and CFA is that in EFA all indicators load onto all factors, with paths (arrows) connecting each
latent variable or factor to all items, as seen in Figure 4.3a. To aid interpretation, one can
exploit an inherent ambiguity (rotational invariance) in the definition of an EFA model; this
implies that we can use an oblique rotation to allow the factors to be correlated. In contrast, a
CFA, as depicted in Figure 4.3b, constrains certain loadings to zero, and the indicators group
together to load onto their corresponding latent variables. For most practical purposes, a two-
or higher-factor path diagram is not useful, given that it becomes increasingly complicated to
identify the respective paths for each latent variable. Thus, for the subsequent analysis, we
restrict our path diagrams to illustrating the CFA results.
Figure 4.3: EFA vs CFA
(a) EFA (b) CFA
4.3 Assessing validity and reliability of indicators with factor
analysis: methodology
4.3.1 Quantifying validity and reliability
A direct way of measuring an indicator's validity would be to compare it with the true value
of what it is measuring, or with another measure of the same construct known to be valid.
Obviously, the possibility of doing so would obviate the very need for such an indicator in the
first place.
One indirect way to test indicators' validity is hence to assess whether indicators that should
(by definition or construction) be measuring similar concepts in fact are. Formally, we will assess
the validity of indicators by testing for "convergent" and "discriminant" validity.
Convergent validity: indicators whose values are expected to be jointly dictated by a
common underlying concept, or which measure concepts that we expect to be highly correlated in
theory, should be highly correlated in practice. For instance, one would theoretically expect IQ
and GMAT scores to be highly correlated among individuals, since both are designed to measure
a certain understanding of individuals' "ability".
Discriminant validity: an indicator should not be influenced by underlying concepts other
than the one it is supposed to be measuring. If it is, then one can conclude that the indicator
is also picking up information about other underlying constructs, ones that drive other sets of
indicators. This is another way of saying that the indicator is not measuring the same construct
its co-indicators are measuring.
There is an important caveat: convergent and discriminant validity evidence never gives definitive
answers. It sheds light on the validity question in an indirect way, by checking whether indicators
supposedly picking up information about the same underlying construct actually do. That is, these
validity tests compare indicators with one another. Convergent- and discriminant-valid indicators
can still suffer from systematic bias: even when indicators measure similar constructs, they may
all be wrong together. Such indicators can hence miss their target in a very specific way: they
all hit the same (wrong) point.
Hence, the intuition behind the relevance of discriminant and convergent validity tests is the
following: if a large number of indicators show strong convergent and discriminant validity, it
is harder to construct a story explaining why they are all wrong in the same way than to accept
this as evidence that they measure the right concept (unless, for instance, they all suffer
from the same methodological bias with the same intensity).
Testing for the reliability of indicators revolves around one idea: the variance of an indicator
should not be driven too much by that of its idiosyncratic error term εi,t. Concretely, we look
at the share of the indicator’s variance that is explained by that of its error term.
The factor analysis approaches presented above are very helpful tools for assessing both the
validity and the reliability of indicators. We outline below the methods, tests and metrics
that we use.
4.3.2 Exploratory Factor Analysis: getting a first idea of the indicators’ validity and reliability
For each dataset, we first perform EFA on the whole set of indicators it contains and/or on
subsets of it (when the dataset has many indicators, we adopt a step-by-step approach to flag
where problems arise). EFA allows us to identify groups of indicators measuring the same underlying
construct, that is, indicators whose variation can be well explained by the variation of a common
(but unobserved) “underlying factor” derived from the observed indicators. This first stage,
as its name suggests, mostly serves to give us a first idea of the indicators’ validity.
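As an illustration of what EFA extracts, the following sketch uses simulated data with two underlying factors; the loadings and uniquenesses play the roles described above. Scikit-learn’s FactorAnalysis is one possible implementation, not the one used in this report, and the data are purely synthetic:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 300
f1, f2 = rng.normal(size=(2, n))  # two uncorrelated underlying factors
X = np.column_stack(
    [f1 + 0.4 * rng.normal(size=n) for _ in range(3)]    # construct-1 indicators
    + [f2 + 0.4 * rng.normal(size=n) for _ in range(3)]  # construct-2 indicators
)
X = StandardScaler().fit_transform(X)

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
loadings = fa.components_.T      # (indicator x factor) loading matrix
uniqueness = fa.noise_variance_  # variance share not explained by the factors

# Convergent/discriminant pattern: each indicator loads highly on one factor
# and weakly on the other, with low uniqueness throughout
print(np.round(loadings, 2))
print(np.round(uniqueness, 2))
```

With this clean structure, each row of the loading matrix has one large entry and one near zero, which is exactly the pattern the report reads as evidence of convergent and discriminant validity.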
Formally, there is evidence for convergent and discriminant validity if all indicators supposedly
measuring close or theoretically correlated concepts load highly and simultaneously on the same
derived factor, without loading too much on other factors. These loadings are the coefficients of
a regression of the type presented in section 4.2. Note that the caveat mentioned in
section 4.3.1 is a direct consequence of the factors being derived from the observed indicators:
any common systematic bias in the indicators will be transmitted to the factors.
EFA can also give us some insight into the reliability of the indicators. The “uniqueness” measure
associated with each indicator gives the share of its variation that is not due to that of the
derived underlying construct. A high uniqueness suggests a lack of reliability: if the variation
of an indicator is largely due to volatile measurement error, then replicating the measurement
in the exact same setting (same country and year) would yield different results even though the
underlying construct stays the same. To formally assess reliability we also use a measure known
as “Cronbach’s alpha”, which summarises the share of the indicators’ variance that is driven by
their common underlying factor (be it valid or not). A good rule of thumb is to consider any
value greater than 0.7 as signalling satisfactory reliability.
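Cronbach’s alpha can be computed directly from an observations-by-indicators score matrix; a minimal sketch with simulated (not OBS) data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n observations x k indicators) score matrix:
    k/(k-1) * (1 - sum of item variances / variance of the summed scale)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each indicator
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated example: three indicators driven by one latent trait
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
scores = np.column_stack([latent + 0.3 * rng.normal(size=200) for _ in range(3)])
alpha = cronbach_alpha(scores)
assert alpha > 0.7  # clears the 0.7 rule of thumb used in the report
```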
4.3.3 Confirmatory Factor Analysis: confirming the factor structure
When necessary and applicable, we then perform CFA, presented in section 4.2, to confirm the
intuition derived from EFA. Imposing a model on the data makes sense here: the indicators are
supposed to measure given underlying constructs. A model forcing theoretically correlated
indicators to be driven by one underlying factor only should fit the data well, with high
correlations between indicators of the same group and, simultaneously, strong shared variance
between indicators and their attached underlying factor.
We hence still look at factor loadings, but obtain additional metrics about the fit of the
imposed model:
Table 4.2: CFA fit indices
Index                                              Optimal value   Type
Comparative fit index (CFI)                        1               Fit index
Tucker-Lewis index (TLI)                           1               Fit index
Coefficient of determination (CD)                  1               Fit index
Root mean squared error of approximation (RMSEA)   0               Residual index
Standardised root mean squared residual (SRMR)     0               Residual index
Good fit indices together with high factor loadings are further evidence for convergent validity
(the relationships between the imposed factor and the indicators are strong). Discriminant
validity is not well examined under CFA, since CFA does not allow one to look at cross-loadings.
However, a strong correlation between factors is commonly taken as a signal of a lack of
discriminant validity, a good rule of thumb again being the 0.7 threshold. A small SRMR residual
index provides some evidence of indicators’ reliability (it captures how much residual variation
is not driven by the common factor). More details about the estimation strategies, notably on
how we deal with error terms, are found in the appendix.
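The SRMR mentioned above can be computed as the root mean square of the residual correlations, i.e. the differences between the observed and the model-implied correlation matrices; a minimal sketch (illustrative numbers, not the report’s estimates):

```python
import numpy as np

def srmr(observed_corr, implied_corr):
    """Standardised root mean squared residual: root mean square of the
    differences between observed and model-implied correlations
    (lower triangle plus diagonal, as is conventional)."""
    idx = np.tril_indices_from(observed_corr)
    resid = observed_corr[idx] - implied_corr[idx]
    return float(np.sqrt(np.mean(resid ** 2)))

# A one-factor model with standardised loadings L implies corr(i, j) = L_i * L_j
loadings = np.array([0.9, 0.8, 0.85])
implied = np.outer(loadings, loadings)
np.fill_diagonal(implied, 1.0)
observed = implied + 0.01        # small residuals off the true structure
np.fill_diagonal(observed, 1.0)
print(round(srmr(observed, implied), 3))
```

A perfectly fitting model would give an SRMR of exactly zero, which is why the optimal value in Table 4.2 is 0.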
Chapter 5
Data and analysis
In this chapter, we present the data and highlight the key results of our analysis on public
financial management and corruption indicators. A more technical and extensive analysis, with
the relevant output tables, is provided in the appendix.
5.1 Public financial management
5.1.1 Open Budget Survey (OBS)
Overall, amongst the indicators that we have aggregated, convergent and discriminant validity
as well as reliability appear to be achieved, suggesting the good quality of OBS data as a whole.
We also find that the dimensions 1) quality of the Executive’s Budget Proposal and 2) quality of
the Budget Process are highly correlated, though this should not be regarded as a lack of
discriminant validity.
The Open Budget Survey (OBS) is a biennial country-level survey covering about 100 countries.
It assesses the public availability of budget information and other budgeting practices that
contribute to an accountable and responsive public finance system. It aims to measure how
governments around the world manage public finances along the following three dimensions:
• Budget transparency: measured in terms of the amount, level of detail and timeliness
of the budget information made publicly available to citizens by the government. This
culminates in a score calculated for each country, the “Open Budget Index” (OBI).
• Budget participation: this pertains to the opportunities provided by the governments
to the civil society to engage and participate in decisions about how public resources are
raised and spent.
• Budget oversight: measures the capacity and authority of formal institutions like leg-
islatures and supreme audit institutions to have an oversight of the government’s budget
process.
The OBS assesses the contents and timely release of eight key budget documents that all coun-
tries should issue at different points in the budget process, according to generally accepted best
practice criteria for PFM. These criteria are drawn from internationally accepted public financial
management practices such as IMF’s Code of Good Practices on Fiscal Transparency, OECD’s
Best Practices for Fiscal Transparency, and International Organisation of Supreme Auditing
Institutions’ (INTOSAI) Lima Declaration of Guidelines on Auditing Precepts. The universal
applicability of the above listed PFM guidelines to different budget systems around the world
ensures comparability amongst countries of different income levels. The majority of the questions
in the survey seek to measure what actually happens in practice (de facto), rather than what is
required by law (de jure).
Data
The OBS questionnaire consists of 125 factual questions and is completed by a researcher or
a group of researchers within an organisation in the country. The researchers responding to
the questionnaire are a part of either academic institutions or civil society organisations with
significant focus on budget issues. The researchers can respond to questions by choosing from
five standard responses (a, b, c, d or not applicable), and are required to provide adequate
evidence for their responses by augmenting their answers with comments, clarifications, and
links to relevant documents. Once the responses from the researchers are received, they are
coded using a conversion scale (a=100, b=67, c=33, d=0). The 125 questions seek to measure:
• The quality of the executive’s budget proposal, that is 1) the quality of budget
estimates for the current year and beyond as well as 2) for preceding years, 3) the quality
of necessary complementary information (e.g. extra-budgetary funds, transfers, public
financial activity), and 4) the quality of the budget’s narrative relating expenditures to
policy goals and programmes, and of the monitoring of such expenditure programmes.
• The quality of the budget process, that is 1) the quality of its formulation phase
(consultations, timeliness, pre-budget reports), 2) the quality of its execution phase with
the publication of frequent implementation and progress reports and 3) the quality of the
post-fiscal year auditing phase.
• The strength of the legislature that is, its de jure and de facto involvement and power
in 1) the budget design and debate, 2) the monitoring of non-enacted contingent revenues,
expenditures and transfers, and 3) the monitoring of the audit phase.
• The intensity of the executive’s budget’s public accountability that is 1) the quality
of the “citizens’ budget”, i.e. if the “citizens’ budget” is an accessible, interactive and
transparent communication of the executive’s budget, 2) the level of de jure and de facto
engagement of the executive with the public in formulating and executing the budget, 3)
the level of engagement of the legislative body with the public when voting the budget,
and 4) the level of engagement of the audit institution with the public during the audit
phase.
Out of the 125 questions, 95 are used to compute the OBI, which is a simple average of the scores
on those 95 questions. It is important to note that although OBS treats them as continuous,
these variables are ordinal in nature. However, since we are analysing OBS’s own usage of these
variables, assessing their quality on continuous grounds is a sensible approach.
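The coding scale and the OBI aggregation just described can be sketched as follows. This is an illustration, not OBS’s code; in particular, dropping “not applicable” answers from the average is our assumption, not a documented OBS rule:

```python
import numpy as np

# OBS conversion scale for the four substantive responses
SCALE = {"a": 100, "b": 67, "c": 33, "d": 0}

def obi_score(responses):
    """OBI for one country: the simple average of the coded scores on the
    95 OBI questions. Excluding 'not applicable' answers is an assumption."""
    coded = [SCALE[r] for r in responses if r in SCALE]
    return float(np.mean(coded))

print(obi_score(["a", "b", "b", "c", "d"]))  # (100 + 67 + 67 + 33 + 0) / 5 = 53.4
```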
Assessing the validity and reliability of the OBI
We do not compare the OBS data with external data, as there is no other dataset that could serve
as a valid comparison. Rather, we perform within-dataset statistical analysis to establish
whether indicators grouped in the same category, and hence supposed to measure different facets
of the same underlying trait, actually do so. We use EFA and CFA to assess the validity and
reliability of the OBI. We consider the 95 variables (questions) that are used in the 2012 OBI.
They try to capture information on three of the four above-mentioned dimensions: 1) quality
of the executive’s budget proposal 2) quality of the budget process and 3) quality of
the citizens’ budget.
To make the interpretation of the statistical analysis easier, these 95 variables are collapsed
into 11 intermediate aggregate indicators along the aforementioned subcategories (4 for quality
of Executive’s Budget Proposal, 3 for quality of Budget Process, and 4 for intensity of Public
Engagement) as shown in Figure 5.1. We proceed methodically: we first perform three pairwise
analyses each with only two sets or categories of indicators (1-2, 1-3 and 2-3) to identify where
potential problems arise, and then perform an analysis with all indicators together.
Figure 5.1: OBS’s 11 intermediate indicators
[Diagram of the 11 intermediate indicators, grouped by dimension.
Quality of the budget proposal: estimates for the current year and beyond; estimates for prior years; additional information; budget narrative / link to policy.
Quality of the budget process: budget formulation phase; budget execution; end-of-year budget audit.
Quality of the citizens’ budget: details of the citizens’ budget; dissemination to the public; consultation with the public; frequency of publication.]
Source: Authors’ own representation.
Analysis 1: Budget Proposal and Budget Process
Figure 5.2 visualises the first EFA, pooling the Budget Proposal and Budget Process indicators.
A two-factor model is what OBS’s methodology would suggest, and it is supported by standard
statistical factor selection criteria1. The transparent coloured polygons reflect the intensity
of the linear relationships (from 0 to 1) between indicators and underlying factors. For
instance, a doubling of the blue factor labelled “quality of the budget proposal” would inflate
the indicator “quality of the budget estimates for previous years” by a factor of 1.95, and the
indicator “quality of the budget estimates for the current year and beyond” by a factor of 1.68.
It is useful to remind the reader that the underlying factors do not have a life of their own:
they are extracted directly from the common variation found in the observed indicators. We also
report the associated EFA statistics in Table 5.1. For all subsequent EFAs we only report the
visualisation, and invite the reader to refer to the appendix for the tables.
1. Keeping factors with eigenvalue > 1, or whose eigenvalue is situated before the first major drop (elbow criterion).
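The factor-retention rule in the footnote above (the Kaiser criterion) amounts to counting the eigenvalues of the indicators’ correlation matrix that exceed 1; a minimal sketch with simulated data:

```python
import numpy as np

def n_factors_kaiser(X):
    """Number of factors to retain under the Kaiser criterion: count the
    eigenvalues of the indicators' correlation matrix that exceed 1."""
    eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
    return int((eigvals > 1).sum())

# Simulated data with two blocks of correlated indicators
rng = np.random.default_rng(2)
f1, f2 = rng.normal(size=(2, 500))
X = np.column_stack([f1, f1, f1, f2, f2, f2]) + 0.5 * rng.normal(size=(500, 6))
print(n_factors_kaiser(X))  # -> 2
```

The elbow criterion would instead plot the sorted eigenvalues and cut where the first large drop occurs, which here falls in the same place.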
Figure 5.2: EFA visualisation, Budget Process and Budget Proposal
[Bar chart of loadings (scale -0.2 to 1) on the factors “Budget proposal” and “Budget process” for the indicators: Proposal: estimates, current year and beyond; Proposal: estimates, previous years; Proposal: additional information; Proposal: budget narrative; Process: budget formulation; Process: budget execution; Process: budget audit.]
Source: OBS and authors’ own representation.
Table 5.1: EFA, Budget Process and Budget Proposal
                                  Factor1     Factor2     Uniqueness
Proposal: estimates, beyond       .83847                  .1297474
Proposal: estimates, previous     .9774664                .1872742
Proposal: additional information  .4942621    .4692496    .203666
Proposal: budget narrative        .5644614    .3030345    .3447982
Process: budget formulation                   .6420719    .5842888
Process: budget execution                     .8865702    .2663301
Process: budget audit                         .6716625    .3802079
Observations: 338
Blanks: loadings smaller than 0.3 in absolute terms
Source: OBS 2006-2008-2010-2012
Figure 5.2 should be interpreted as follows. There exist two underlying (derived) variables, or
underlying constructs, that drive the values taken by the considered indicators in a very
specific way: indicators measuring concepts relating to the “quality of the budget proposal” are
simultaneously and mostly driven by one underlying factor, and a similar conclusion can be drawn
for the “quality of the budget process” indicators. This is rather strong evidence for both
discriminant and convergent validity as defined in section 4.3: indicators grouped together a
priori do load on the same factor, and cross-loadings are satisfactorily small. The reader
should bear in mind the caveat mentioned previously: this is merely evidence of some indicators
being influenced by the same underlying concept, which is not guaranteed to represent the true
concept of interest. However, the more discriminant and convergent validity evidence one
gathers, the more credible it becomes that the indicators measure what they ought to.
Although the identified underlying constructs do seem to drive their attached indicators, both
EFA and CFA show that the constructs “quality of the budget proposal” and “quality of the budget
process” are highly correlated. However, it is reasonable to argue that the quality of the
budget proposal and that of its execution and audit phases are highly correlated in practice.
Moreover, the EFA visualisation offers additional comfort with regard to discriminant validity:
the small cross-loadings notably indicate that the two underlying factors are two different
constructs.
Regarding reliability, the low uniquenesses we observe in Table 5.1 are a first piece of evidence
for satisfactory reliability of the indicators. More formally, we compute Cronbach’s alpha
coefficients: we obtain 0.9154 for the quality of executive budget indicators and 0.8014 for the
quality of the budget process indicators, suggesting high reliability.
We then impose a CFA model on the data. We notably let the random error components of
the “quality of the estimates for the current year and beyond” and “quality of the estimates for
previous years” indicators be correlated2. Figure 5.3 represents the CFA diagram with
standardised coefficients, and Table 5.2 reports the associated statistics. For all subsequent
CFAs we only report the diagram, and invite the reader to refer to the appendix for the tables.
Figure 5.3: CFA diagram, Budget Process and Budget Proposal
Source: OBS and authors’ own representation.
2. We do so after looking at the modification indices of a first model, and in line with OBS’s methodology/questionnaire. More details in the appendix.
Table 5.2: CFA, Budget Process and Budget Proposal
Proposal: estimates, beyond        PROPOSAL  27.17***  (1.583)
Proposal: estimates, previous      PROPOSAL  27.89***  (1.839)
Proposal: additional information   PROPOSAL  22.29***  (1.809)
Proposal: budget narrative         PROPOSAL  23.32***  (2.070)
Process: budget formulation        PROCESS   19.77***  (1.944)
Process: budget execution          PROCESS   21.06***  (1.950)
Process: budget audit              PROCESS   25.27***  (1.965)
Observations: 338
Standard errors in parentheses
Source: OBS 2006-2008-2010-2012
* p<0.1, ** p<0.05, *** p<0.01
The model shows strong adequacy with the data, with simultaneously strong (standardised)
relationships between factors and indicators. This, together with the levels of the
goodness-of-fit statistics (desirable levels, in parentheses, follow the Hu and Bentler (1999)
guidelines), constitutes evidence for both convergent and discriminant validity.
Table 5.3: CFA fit indices, Budget Process and Budget Proposal
Comparative fit index (CFI): 0.978 (0.95)
Tucker-Lewis index (TLI): 0.962 (0.95)
Coefficient of determination (CD): 0.967 (0.95)
Standardised root mean squared residual (SRMR): 0.025 (0.08)
Root mean squared error of approximation (RMSEA): 0.096 (0.06)
Finally, the low SRMR residual index of the CFA model is further evidence in favour of the
indicators’ reliability.
Analysis 2: Budget Proposal and Citizens’ Budget
Similarly, the second pairwise analysis shows strong signs of validity, with both the EFA and
CFA models yielding a clear picture. Figure 5.4 shows very clear convergent and discriminant
validity; its interpretation is similar to that of Figure 5.2.
Figure 5.4: EFA visualisation, Budget Proposal and Citizens’ Budget
[Bar chart of loadings (scale -0.2 to 1) on the factors “Citizens’ budget” and “Budget proposal” for the indicators: Proposal: estimates, current year and beyond; Proposal: estimates, previous years; Proposal: additional information; Proposal: budget narrative; Cit. budget: detail; Cit. budget: dissemination; Cit. budget: consultation; Cit. budget: frequency.]
Source: OBS and authors’ own representation.
The low uniquenesses we observe after running the EFA3 and the Cronbach’s alphas we compute
(0.8563 for the variables pertaining to the “citizens’ budget” category, and -as previously- 0.9154
for “quality of executive budget” indicators) suggest the high reliability of the indicators.
CFA (Figure 5.5) confirms the fit of a two-factor model to the data, given the strong relationships
between indicators and their underlying factors, as well as the good absolute fit indices.
3. See Table 6.4 in the appendix.
Figure 5.5: CFA diagram, Budget Proposal and Citizens’ Budget
Source: OBS and authors’ own representation.
Table 5.4: CFA fit indices, Budget Proposal and Citizens’ Budget
Comparative fit index (CFI): 0.977
Tucker-Lewis index (TLI) 0.965
Coefficient of determination (CD) 0.995
Standardised root mean squared residual (SRMR) 0.031
Root mean squared error of approximation (RMSEA) 0.097
As before, the low SRMR index (0.031) suggests that indicators are highly reliable.
Analysis 3: Budget Process and Citizens’ Budget
As above, we can be confident about the discriminant and convergent validity of the two sets of
indicators. The EFA (Figure 5.6) shows clear relationships between the underlying constructs
and their attached indicators.
Figure 5.6: EFA visualisation, Budget Process and Citizens’ Budget
[Bar chart of loadings (scale -0.2 to 1) on the factors “Citizens’ budget” and “Budget process” for the indicators: Process: budget formulation; Process: budget execution; Process: budget audit; Cit. budget: detail; Cit. budget: dissemination; Cit. budget: consultation; Cit. budget: frequency.]
Source: OBS and authors’ own representation.
We again observe satisfactorily small EFA uniquenesses4, and the Cronbach’s alphas do not
change for the two sets of indicators: this is further evidence of the considered indicators’
reliability.
In turn, CFA again shows strong loadings of indicators on their attached factors, and good fit
indices, suggesting indicators’ validity.
4. See Table 6.6 in the appendix.
Figure 5.7: CFA diagram, Budget Process and Citizens’ Budget
Source: OBS and authors’ own representation.
Table 5.5: CFA fit indices, Budget Process and Citizens’ Budget
Comparative fit index (CFI): 0.990
Tucker-Lewis index (TLI) 0.984
Coefficient of determination (CD) 0.990
Standardised root mean squared residual (SRMR) 0.022
Root mean squared error of approximation (RMSEA) 0.061
Again, the small SRMR residual index (0.022) we observe points towards indicators’ reliability.
Analysis 4: All dimensions
We now turn to an all-factor analysis. The evidence so far points towards strong validity and
reliability of the OBS indicators, and we want to give a final picture by pooling all the
indicators. The EFA (Figure 5.8) gives an unclear picture, so we turn to the CFA.
Figure 5.8: EFA visualisation, all dimensions
[Bar chart of loadings (scale -0.2 to 1) on the three factors “Budget process”, “Citizens’ budget” and “Budget proposal” for all eleven intermediate indicators listed in Figure 5.1.]
Source: OBS and authors’ own representation.
As expected, the relationships between the identified factors and their attached indicators are
strong, with good fit indices, small residuals and simultaneously high factor loadings (Figure
5.9), suggesting both validity and reliability5. As before, we see a very high correlation
between the factors driving the quality of the budget proposal and the quality of the budget
process. This is not surprising, however, as these two dimensions are quite likely to be highly
correlated in practice, and we do not see it as evidence of a lack of discriminant validity.
5. See also Table 6.9 in the appendix.
Figure 5.9: CFA diagram, all dimensions
Source: OBS and authors’ own representation.
Table 5.6: CFA fit indices, all dimensions
Comparative fit index (CFI): 0.967
Tucker-Lewis index (TLI) 0.954
Coefficient of determination (CD) 0.997
Standardised root mean squared residual (SRMR) 0.036
Root mean squared error of approximation (RMSEA) 0.088
We hence conclude that the 11 OBS sub-indicators that we retain measure three distinct concepts.
OBS data in general show strong signs of quality: we have gathered convincing evidence for the
validity and reliability of OBS’s indicators.
5.1.2 Public Expenditure and Financial Accountability (PEFA)
Overall, indicators of the PEFA dataset seem to be measuring the same underlying factor, with a
complete lack of convergent and discriminant validity. These conclusions are in a similar vein
to Langbein & Knack (2010), who use EFA and CFA and find that the World Bank governance
indicators measure the same concept rather than distinct concepts of governance. PEFA indicators
seem to be measuring one underlying concept of public financial management rather than the
distinct concepts of credibility of the budget, comprehensiveness & transparency, predictability
& control in budget execution, accounting/reporting, and external scrutiny.
The PEFA programme was founded in 2001 as a multi-donor partnership between several donor
agencies and international financial institutions to assess the condition of countries’ public
expenditure, procurement and financial accountability systems and to develop a practical
sequence of reform and capacity-building actions. One of PEFA’s key activities is thus the
development and maintenance of the PFM Performance Measurement Framework (PEFA Framework), a
contribution to the collective efforts of many stakeholders to assess whether a country has the
tools to deliver three main budgetary outcomes: i) aggregate fiscal discipline, ii) strategic
resource allocation, and iii) efficient use of resources for service delivery.
The Performance Measurement Framework includes a set of 28 high-level indicators, which measure
and monitor the performance of PFM systems, processes and institutions, and a PFM Performance
Report (PFM-PR) that provides a framework for reporting on PFM performance as measured by the
indicators. Importantly, the focus of the PFM performance indicator set is public financial
management at the central government level, including the related institutions of oversight, and
in particular revenues and expenditures undertaken through the central government budget. The
set of high-level indicators captures core components of PFM that are widely acknowledged as
essential for all countries to achieve sound public financial management. By comparing them over
time, the scored indicators can act as a basis for monitoring the results of public financial
management reform efforts (either through repeated assessments and/or by building PEFA
indicators into a country’s own monitoring & evaluation mechanism). One caveat, however, is that
the indicators only measure operational performance, rather than the inputs that enable the PFM
system to reach a certain level of performance.
The Performance Measurement Framework identifies the critical dimensions of performance of
an open and orderly PFM system as follows (PEFA PMF, 2011):
1. Credibility of the budget - The budget is realistic and is implemented as intended.
2. Comprehensiveness and transparency - The budget and the fiscal risk oversight are comprehensive, and fiscal and budget information is accessible to the public.
3. Policy-based budgeting - The budget is prepared with due regard to government policy.
4. Predictability and control in budget execution - The budget is implemented in an orderly and predictable manner and there are arrangements for the exercise of control and stewardship in the use of public funds.
5. Accounting, recording and reporting - Adequate records and information are produced,
maintained and disseminated to meet decision-making control, management and reporting
purposes.
6. External scrutiny and audit - Arrangements for the scrutiny of public finances, and follow-up by the executive, are operating.
These six dimensions and 28 indicators are further collapsed by PEFA into 3 broad categories
(the complete list of the 28 indicators is given in Figure 5.11):
1. PFM system out-turns: these capture the immediate results of the PFM system in terms of actual expenditures and revenues, by comparing them to the original approved budget, as well as the level of and changes in expenditure arrears.
2. Crosscutting features of the PFM system: these capture the comprehensiveness and
transparency of the PFM system across the whole of the budget cycle.
3. Budget cycle: these capture the performance of the key systems, processes and institutions within the budget cycle of the central government.
Figure 5.10 illustrates the structure and coverage of the PFM system measured by the set of
high-level indicators, and the links with the six core dimensions of a PFM system:
Figure 5.10: Structure and coverage of the PFM system
Source: PEFA PMF, 2011
Data
The PEFA data contain the most recent status of national or sub-national PEFA assessments in
any given country as of 23 December 2014. The data are updated every six months: PEFA partners
and other agencies that lead PEFA assessments are contacted about the status of their
assessments, and the PEFA Secretariat collects and verifies this information before updating the
assessment portal. PFM assessment reports included in the portal have scored at least 2/3 of the
PEFA indicator set. The data are publicly available and were downloaded from
https://www.pefa.org/en/dashboard.
Scoring methodology and aggregation
‘Each indicator seeks to measure performance of a key PFM element against a four point ordinal
scale from A to D. Guidance has been developed on what performance would meet each score,
for each of the indicators. The highest score is warranted for an individual indicator if the
core PFM element meets the relevant objective in a complete, orderly, accurate, timely and
coordinated way. The set of high-level indicators is therefore focusing on the basic qualities of
a PFM system, based on existing good international practices, rather than setting a standard
based on the latest innovation in PFM.’ (PEFA PMF, 2011)
Since many of the indicators have sub-dimensions, the aggregate indicators have been constructed
by PEFA by taking the lowest sub-dimension score and adding a ‘+’ if one of the sub-dimensions
has a higher score. Thus, if an indicator has, for each of say 3 sub-dimensions, scores C, C and
B respectively, the aggregate score would be C+. In order to run factor analysis on the
aggregated dataset (as should be apparent from the methodology section, factor analysis would
not be useful on the sub-dimensions, because they would clearly be measuring similar concepts),
we re-coded scores for the aggregated indicators from 1 to 7 for the seven possible scores (A,
B+, B, C+, C, D+, D), with 1 corresponding to A, 2 corresponding to B+, and so on6. Although the
data are ordinal rather than interval, this is a reasonable continuous structure to impose, and
it allows us to carry out an analysis in the spirit of that of the OBS. Given the nature of the
results, we believe they are robust to different kinds of analysis, e.g. latent class modelling.
6. This is notably in line with a 2009 ODI study on cross-country PFM performance: Taking Stock: What do PEFA Assessments tell us about PFM systems across countries?
Figure 5.11: PEFA indicators list
Source: PEFA Performance Measurement Framework, Revised 2011.
Analysis 1: A. Credibility of the Budget and B. Comprehensiveness & Transparency
We begin by analysing the two sets of aggregate indicators7 that supposedly measure the
“credibility of the budget” and its “comprehensiveness & transparency”. A two-factor model is
what PEFA’s methodology would suggest, and it is supported by standard statistical factor
selection criteria.
7. Throughout the analysis, we refer to the indicators by their variable names rather than their full descriptions, which are provided in Figure 5.11.
Using a two-factor EFA8, we find that the uniqueness of most indicators in the set analysed (PI 1 – PI 10) is quite high. Equivalently, the communalities, i.e. the sums of squared factor loadings (1 − unique variance), are rather low, indicating that the model does not fit the data well; this is a first piece of evidence against the indicators’ reliability.
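This diagnostic can be reproduced along the following lines, here sketched with scikit-learn's FactorAnalysis on simulated data, since the PEFA dataset itself is not reproduced in this report; the data-generating step is purely illustrative.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulated stand-in for the recoded PI 1 - PI 10 scores: two weakly
# loading latent factors plus substantial noise (high uniqueness).
n = 100
factors = rng.normal(size=(n, 2))
true_loadings = rng.uniform(0.2, 0.5, size=(2, 10))
X = factors @ true_loadings + rng.normal(scale=1.0, size=(n, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardise

fa = FactorAnalysis(n_components=2).fit(X)
loadings = fa.components_                   # shape (2 factors, 10 indicators)

# Communality = sum of squared loadings; uniqueness = 1 - communality.
communalities = (loadings ** 2).sum(axis=0)
uniqueness = 1 - communalities
for i, (c, u) in enumerate(zip(communalities, uniqueness), start=1):
    print(f"PI_{i}: communality={c:.2f}  uniqueness={u:.2f}")
```

With noisy data such as this, the printed uniquenesses are large and the communalities small, which is the pattern the text reports for the PEFA indicators.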
A simple EFA, as reported in Table 6.10, shows that a two-factor model is a poor fit. Given that indicators PI 7 – PI 10 load onto Factor 1, PEFA’s analytical categorisation would lead us to expect indicators PI 1 – PI 4 to load onto Factor 2. As is evident, there is a clear lack of convergent validity: indicators PI 2 and PI 4 from ‘Credibility of the Budget’ load onto Factor 1, while indicators PI 5 and PI 6 from ‘Comprehensiveness & Transparency’ load onto Factor 2. Thus, indicators from different categories load onto the same factor. Figure 5.12 illustrates these cross-loadings, with PI 5 and PI 6 grouping together with PI 1 and PI 3 along Factor 2.
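A convergent-validity check of this kind can be automated: assign each indicator to the factor on which it loads most heavily, then flag any a priori category whose indicators split across factors. The loading matrix below is made up to mimic the pattern described in the text; it is not the actual PEFA estimate.

```python
import numpy as np

indicators = [f"PI_{i}" for i in range(1, 11)]
# A priori categories: PI 1-4 'Credibility', PI 5-10 'Comprehensiveness'.
category = {name: ("Credibility" if i <= 4 else "Comprehensiveness")
            for i, name in zip(range(1, 11), indicators)}

# Hypothetical 10x2 loadings (rows: indicators, columns: factors) with
# PI_2, PI_4, PI_7-PI_10 on Factor 1 and PI_1, PI_3, PI_5, PI_6 on Factor 2.
loadings = np.array([
    [0.10, 0.55],   # PI_1
    [0.50, 0.05],   # PI_2
    [0.15, 0.60],   # PI_3
    [0.45, 0.10],   # PI_4
    [0.05, 0.50],   # PI_5
    [0.10, 0.45],   # PI_6
    [0.55, 0.10],   # PI_7
    [0.60, 0.05],   # PI_8
    [0.50, 0.15],   # PI_9
    [0.65, 0.10],   # PI_10
])

assigned = np.abs(loadings).argmax(axis=1)   # factor with largest |loading|
for cat in ("Credibility", "Comprehensiveness"):
    factors = {assigned[i] + 1 for i, name in enumerate(indicators)
               if category[name] == cat}
    status = "OK" if len(factors) == 1 else "cross-loads: weak convergent validity"
    print(f"{cat}: loads on factor(s) {sorted(factors)} -> {status}")
```

Under this loading pattern both categories split across the two factors, so both are flagged, mirroring the finding above.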
Figure 5.12: EFA visualisation, Credibility and Comprehensiveness & Transparency
[Bar chart: two-factor EFA loadings (approximately −0.1 to 0.8) for PI_1–PI_10 on Factor1 and Factor2.]
Source: PEFA and authors’ own representation.
8 The Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy is 0.7124, suggesting an EFA is appropriate, as the indicators within the dataset have enough in common (Kaiser, 1974).
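The overall KMO statistic can be computed directly from the correlation matrix and the partial correlations derived from its inverse. The sketch below uses randomly generated illustrative data, so it will not reproduce the 0.7124 reported for the PEFA indicators.

```python
import numpy as np

def kmo(X):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy:
    sum of squared correlations over the sum of squared correlations
    plus squared partial correlations (off-diagonal elements only)."""
    R = np.corrcoef(X, rowvar=False)
    Rinv = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(Rinv), np.diag(Rinv)))
    P = -Rinv / d                            # partial correlation matrix
    mask = ~np.eye(R.shape[0], dtype=bool)   # off-diagonal entries
    r2 = (R[mask] ** 2).sum()
    p2 = (P[mask] ** 2).sum()
    return r2 / (r2 + p2)

# Illustrative data: a strong common factor pushes the KMO towards 1.
rng = np.random.default_rng(1)
common = rng.normal(size=(200, 1))
X = common + 0.7 * rng.normal(size=(200, 5))
print(f"KMO = {kmo(X):.3f}")
```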
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4
61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4

More Related Content

Viewers also liked

Pefa presentation by dr malik khalid mehmood ph_d
Pefa presentation   by dr malik khalid mehmood ph_dPefa presentation   by dr malik khalid mehmood ph_d
Pefa presentation by dr malik khalid mehmood ph_dMalik Khalid Mehmood
 
2011 04-25-afghanistan-pfm-case-study
2011 04-25-afghanistan-pfm-case-study2011 04-25-afghanistan-pfm-case-study
2011 04-25-afghanistan-pfm-case-study
FreeBalance
 
The need for improvement in Public Expenditure and Financial Accountability (...
The need for improvement in Public Expenditure and Financial Accountability (...The need for improvement in Public Expenditure and Financial Accountability (...
The need for improvement in Public Expenditure and Financial Accountability (...
John Leonardo
 
Follow up actions by donors and countries, the case of pefa
Follow up actions by donors and countries, the case of pefaFollow up actions by donors and countries, the case of pefa
Follow up actions by donors and countries, the case of pefa
icgfmconference
 
Simply First Aid Brochure 2010 V01
Simply First Aid Brochure 2010 V01Simply First Aid Brochure 2010 V01
Simply First Aid Brochure 2010 V01Simply First Aid
 
Hadden public financial management in government of kosovo
Hadden public financial management in government of kosovoHadden public financial management in government of kosovo
Hadden public financial management in government of kosovo
icgfmconference
 
Day4 sp2 icgfm-annual_confpresentation_ibi_2014_davidcolvin_en
Day4 sp2 icgfm-annual_confpresentation_ibi_2014_davidcolvin_enDay4 sp2 icgfm-annual_confpresentation_ibi_2014_davidcolvin_en
Day4 sp2 icgfm-annual_confpresentation_ibi_2014_davidcolvin_en
icgfmconference
 
Public Financial Management Good Practice Government Resource Planning Resour...
Public Financial Management Good Practice Government Resource Planning Resour...Public Financial Management Good Practice Government Resource Planning Resour...
Public Financial Management Good Practice Government Resource Planning Resour...
FreeBalance
 
Recent public financial management publications and other resources
Recent public financial management publications and other resourcesRecent public financial management publications and other resources
Recent public financial management publications and other resources
icgfmconference
 
Budget reporting and performance standards (brps) projects
Budget reporting and performance standards (brps)   projectsBudget reporting and performance standards (brps)   projects
Budget reporting and performance standards (brps) projects
rickiti9
 
Tnm on control_of_expenditure_arrears_by_mario_pessoa_suzanneflynn_en
Tnm on control_of_expenditure_arrears_by_mario_pessoa_suzanneflynn_enTnm on control_of_expenditure_arrears_by_mario_pessoa_suzanneflynn_en
Tnm on control_of_expenditure_arrears_by_mario_pessoa_suzanneflynn_en
icgfmconference
 
OECD, 10th Meeting of CESEE Senior Budget Officials - Bojan Paunovic, Montenegro
OECD, 10th Meeting of CESEE Senior Budget Officials - Bojan Paunovic, MontenegroOECD, 10th Meeting of CESEE Senior Budget Officials - Bojan Paunovic, Montenegro
OECD, 10th Meeting of CESEE Senior Budget Officials - Bojan Paunovic, Montenegro
OECD Governance
 
SIF theme ii budget and resource tracking
SIF theme ii budget and resource trackingSIF theme ii budget and resource tracking
SIF theme ii budget and resource tracking
Mike McQuestion
 
Combating corruption and public financial management (PFM)
Combating corruption and public financial management (PFM)Combating corruption and public financial management (PFM)
Combating corruption and public financial management (PFM)
John Leonardo
 
Country Responses to the Financial Crisis Kosovo
Country Responses to the Financial Crisis KosovoCountry Responses to the Financial Crisis Kosovo
Country Responses to the Financial Crisis Kosovo
icgfmconference
 
3.30 4.45pm Improving Sector Performance (Jerome Dendura) English
3.30 4.45pm Improving Sector Performance (Jerome Dendura) English3.30 4.45pm Improving Sector Performance (Jerome Dendura) English
3.30 4.45pm Improving Sector Performance (Jerome Dendura) English
icgfmconference
 
Treasury reform in_nepal_a_case_for_government_credibility-baburam_subedi_en
Treasury reform in_nepal_a_case_for_government_credibility-baburam_subedi_enTreasury reform in_nepal_a_case_for_government_credibility-baburam_subedi_en
Treasury reform in_nepal_a_case_for_government_credibility-baburam_subedi_en
icgfmconference
 

Viewers also liked (20)

Pefa presentation by dr malik khalid mehmood ph_d
Pefa presentation   by dr malik khalid mehmood ph_dPefa presentation   by dr malik khalid mehmood ph_d
Pefa presentation by dr malik khalid mehmood ph_d
 
2011 04-25-afghanistan-pfm-case-study
2011 04-25-afghanistan-pfm-case-study2011 04-25-afghanistan-pfm-case-study
2011 04-25-afghanistan-pfm-case-study
 
The need for improvement in Public Expenditure and Financial Accountability (...
The need for improvement in Public Expenditure and Financial Accountability (...The need for improvement in Public Expenditure and Financial Accountability (...
The need for improvement in Public Expenditure and Financial Accountability (...
 
Follow up actions by donors and countries, the case of pefa
Follow up actions by donors and countries, the case of pefaFollow up actions by donors and countries, the case of pefa
Follow up actions by donors and countries, the case of pefa
 
Simply First Aid Brochure 2010 V01
Simply First Aid Brochure 2010 V01Simply First Aid Brochure 2010 V01
Simply First Aid Brochure 2010 V01
 
Hadden public financial management in government of kosovo
Hadden public financial management in government of kosovoHadden public financial management in government of kosovo
Hadden public financial management in government of kosovo
 
Day4 sp2 icgfm-annual_confpresentation_ibi_2014_davidcolvin_en
Day4 sp2 icgfm-annual_confpresentation_ibi_2014_davidcolvin_enDay4 sp2 icgfm-annual_confpresentation_ibi_2014_davidcolvin_en
Day4 sp2 icgfm-annual_confpresentation_ibi_2014_davidcolvin_en
 
Public Financial Management Good Practice Government Resource Planning Resour...
Public Financial Management Good Practice Government Resource Planning Resour...Public Financial Management Good Practice Government Resource Planning Resour...
Public Financial Management Good Practice Government Resource Planning Resour...
 
Recent public financial management publications and other resources
Recent public financial management publications and other resourcesRecent public financial management publications and other resources
Recent public financial management publications and other resources
 
Budget reporting and performance standards (brps) projects
Budget reporting and performance standards (brps)   projectsBudget reporting and performance standards (brps)   projects
Budget reporting and performance standards (brps) projects
 
Tnm on control_of_expenditure_arrears_by_mario_pessoa_suzanneflynn_en
Tnm on control_of_expenditure_arrears_by_mario_pessoa_suzanneflynn_enTnm on control_of_expenditure_arrears_by_mario_pessoa_suzanneflynn_en
Tnm on control_of_expenditure_arrears_by_mario_pessoa_suzanneflynn_en
 
3 Pradhan Framework Learning Dec122007
3 Pradhan Framework Learning Dec1220073 Pradhan Framework Learning Dec122007
3 Pradhan Framework Learning Dec122007
 
OECD, 10th Meeting of CESEE Senior Budget Officials - Bojan Paunovic, Montenegro
OECD, 10th Meeting of CESEE Senior Budget Officials - Bojan Paunovic, MontenegroOECD, 10th Meeting of CESEE Senior Budget Officials - Bojan Paunovic, Montenegro
OECD, 10th Meeting of CESEE Senior Budget Officials - Bojan Paunovic, Montenegro
 
SIF theme ii budget and resource tracking
SIF theme ii budget and resource trackingSIF theme ii budget and resource tracking
SIF theme ii budget and resource tracking
 
Combating corruption and public financial management (PFM)
Combating corruption and public financial management (PFM)Combating corruption and public financial management (PFM)
Combating corruption and public financial management (PFM)
 
Country Responses to the Financial Crisis Kosovo
Country Responses to the Financial Crisis KosovoCountry Responses to the Financial Crisis Kosovo
Country Responses to the Financial Crisis Kosovo
 
Towards Indicators of Strength of Public Management Systems
Towards Indicators of Strength of Public Management SystemsTowards Indicators of Strength of Public Management Systems
Towards Indicators of Strength of Public Management Systems
 
Facts on Public Finance Management
Facts on Public Finance ManagementFacts on Public Finance Management
Facts on Public Finance Management
 
3.30 4.45pm Improving Sector Performance (Jerome Dendura) English
3.30 4.45pm Improving Sector Performance (Jerome Dendura) English3.30 4.45pm Improving Sector Performance (Jerome Dendura) English
3.30 4.45pm Improving Sector Performance (Jerome Dendura) English
 
Treasury reform in_nepal_a_case_for_government_credibility-baburam_subedi_en
Treasury reform in_nepal_a_case_for_government_credibility-baburam_subedi_enTreasury reform in_nepal_a_case_for_government_credibility-baburam_subedi_en
Treasury reform in_nepal_a_case_for_government_credibility-baburam_subedi_en
 

Similar to 61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4

Optimization of Post-Scoring Classification and Impact on Regulatory Capital ...
Optimization of Post-Scoring Classification and Impact on Regulatory Capital ...Optimization of Post-Scoring Classification and Impact on Regulatory Capital ...
Optimization of Post-Scoring Classification and Impact on Regulatory Capital ...
GRATeam
 
Principles financialmarketinfrastructuresframeworkandmethodology
Principles financialmarketinfrastructuresframeworkandmethodologyPrinciples financialmarketinfrastructuresframeworkandmethodology
Principles financialmarketinfrastructuresframeworkandmethodologyDr Lendy Spires
 
Bridging the audit expectation gap
Bridging the audit expectation gapBridging the audit expectation gap
Bridging the audit expectation gap
Gabriel Ken
 
Dynamic Stress Test diffusion model and scoring performance
Dynamic Stress Test diffusion model and scoring performanceDynamic Stress Test diffusion model and scoring performance
Dynamic Stress Test diffusion model and scoring performance
Ziad Fares
 
Bma
BmaBma
EAD Parameter : A stochastic way to model the Credit Conversion Factor
EAD Parameter : A stochastic way to model the Credit Conversion FactorEAD Parameter : A stochastic way to model the Credit Conversion Factor
EAD Parameter : A stochastic way to model the Credit Conversion Factor
Genest Benoit
 
General principlesforcreditreporting
General principlesforcreditreportingGeneral principlesforcreditreporting
General principlesforcreditreportingDr Lendy Spires
 
General principlesforcreditreporting
General principlesforcreditreportingGeneral principlesforcreditreporting
General principlesforcreditreportingDr Lendy Spires
 
Thesis Final Report
Thesis Final ReportThesis Final Report
Thesis Final Report
Sadia Sharmin
 
Vendor Performance Management
Vendor Performance ManagementVendor Performance Management
Vendor Performance Management
Gerald Ford
 
Software testing services growth report oct 11
Software testing services growth report oct 11Software testing services growth report oct 11
Software testing services growth report oct 11
Transition Consulting Limited, India
 
Impact assessment-study-dit
Impact assessment-study-ditImpact assessment-study-dit
Impact assessment-study-dit
Girma Biresaw
 
Evaluation and Monitoring of Transboundary Aspects of Maritime Spatial Planni...
Evaluation and Monitoring of Transboundary Aspects of Maritime Spatial Planni...Evaluation and Monitoring of Transboundary Aspects of Maritime Spatial Planni...
Evaluation and Monitoring of Transboundary Aspects of Maritime Spatial Planni...
Pan Baltic Scope / Baltic SCOPE
 
Concordia_Index_Report_2016
Concordia_Index_Report_2016Concordia_Index_Report_2016
Concordia_Index_Report_2016Cheryl He
 
Dynamic Stress Test Diffusion Model Considering The Credit Score Performance
Dynamic Stress Test Diffusion Model Considering The Credit Score PerformanceDynamic Stress Test Diffusion Model Considering The Credit Score Performance
Dynamic Stress Test Diffusion Model Considering The Credit Score Performance
GRATeam
 
Booklet - GRA White Papers - Second Edition
Booklet - GRA White Papers - Second EditionBooklet - GRA White Papers - Second Edition
Booklet - GRA White Papers - Second Edition
Ziad Fares
 

Similar to 61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4 (20)

Optimization of Post-Scoring Classification and Impact on Regulatory Capital ...
Optimization of Post-Scoring Classification and Impact on Regulatory Capital ...Optimization of Post-Scoring Classification and Impact on Regulatory Capital ...
Optimization of Post-Scoring Classification and Impact on Regulatory Capital ...
 
Principles financialmarketinfrastructuresframeworkandmethodology
Principles financialmarketinfrastructuresframeworkandmethodologyPrinciples financialmarketinfrastructuresframeworkandmethodology
Principles financialmarketinfrastructuresframeworkandmethodology
 
PhD_Thesis_Dimos_Andronoudis
PhD_Thesis_Dimos_AndronoudisPhD_Thesis_Dimos_Andronoudis
PhD_Thesis_Dimos_Andronoudis
 
D03 guide-to-economic-appraisal-cba-16-july
D03 guide-to-economic-appraisal-cba-16-julyD03 guide-to-economic-appraisal-cba-16-july
D03 guide-to-economic-appraisal-cba-16-july
 
Bridging the audit expectation gap
Bridging the audit expectation gapBridging the audit expectation gap
Bridging the audit expectation gap
 
Dynamic Stress Test diffusion model and scoring performance
Dynamic Stress Test diffusion model and scoring performanceDynamic Stress Test diffusion model and scoring performance
Dynamic Stress Test diffusion model and scoring performance
 
Bma
BmaBma
Bma
 
EAD Parameter : A stochastic way to model the Credit Conversion Factor
EAD Parameter : A stochastic way to model the Credit Conversion FactorEAD Parameter : A stochastic way to model the Credit Conversion Factor
EAD Parameter : A stochastic way to model the Credit Conversion Factor
 
General principlesforcreditreporting
General principlesforcreditreportingGeneral principlesforcreditreporting
General principlesforcreditreporting
 
General principlesforcreditreporting
General principlesforcreditreportingGeneral principlesforcreditreporting
General principlesforcreditreporting
 
Thesis Final Report
Thesis Final ReportThesis Final Report
Thesis Final Report
 
Vendor Performance Management
Vendor Performance ManagementVendor Performance Management
Vendor Performance Management
 
Software testing services growth report oct 11
Software testing services growth report oct 11Software testing services growth report oct 11
Software testing services growth report oct 11
 
Impact assessment-study-dit
Impact assessment-study-ditImpact assessment-study-dit
Impact assessment-study-dit
 
Evaluation and Monitoring of Transboundary Aspects of Maritime Spatial Planni...
Evaluation and Monitoring of Transboundary Aspects of Maritime Spatial Planni...Evaluation and Monitoring of Transboundary Aspects of Maritime Spatial Planni...
Evaluation and Monitoring of Transboundary Aspects of Maritime Spatial Planni...
 
Concordia_Index_Report_2016
Concordia_Index_Report_2016Concordia_Index_Report_2016
Concordia_Index_Report_2016
 
Dynamic Stress Test Diffusion Model Considering The Credit Score Performance
Dynamic Stress Test Diffusion Model Considering The Credit Score PerformanceDynamic Stress Test Diffusion Model Considering The Credit Score Performance
Dynamic Stress Test Diffusion Model Considering The Credit Score Performance
 
EvalInvStrats_web
EvalInvStrats_webEvalInvStrats_web
EvalInvStrats_web
 
Evaluation
EvaluationEvaluation
Evaluation
 
Booklet - GRA White Papers - Second Edition
Booklet - GRA White Papers - Second EditionBooklet - GRA White Papers - Second Edition
Booklet - GRA White Papers - Second Edition
 

More from Alexander Hamilton, PhD

The perceived impact of agricultural advice in Ethiopia
The perceived impact of agricultural advice in EthiopiaThe perceived impact of agricultural advice in Ethiopia
The perceived impact of agricultural advice in EthiopiaAlexander Hamilton, PhD
 

More from Alexander Hamilton, PhD (6)

Evidence Base Programme
Evidence Base ProgrammeEvidence Base Programme
Evidence Base Programme
 
The perceived impact of agricultural advice in Ethiopia
The perceived impact of agricultural advice in EthiopiaThe perceived impact of agricultural advice in Ethiopia
The perceived impact of agricultural advice in Ethiopia
 
improving-tax-systems
improving-tax-systemsimproving-tax-systems
improving-tax-systems
 
1-s2.0-S1877584515000519-main
1-s2.0-S1877584515000519-main1-s2.0-S1877584515000519-main
1-s2.0-S1877584515000519-main
 
comparing-corruption-ethiopia-sudan-4
comparing-corruption-ethiopia-sudan-4comparing-corruption-ethiopia-sudan-4
comparing-corruption-ethiopia-sudan-4
 
31-14
31-1431-14
31-14
 

61506_Capstone_Report_DFID_FINAL_Quantifying_Governance__Indicators-4

  • 1. Quantifying Governance: An indicator-based approach DFID Capstone Team London School of Economics March 2015 Kimathi Muriithi, Margarita Jimenez, Nicolas Jannin, Noor Sajid, Sahibjeet Singh, Sudhanshu Sharma
  • 3. 2 Foreword The following report has been written as a part of a Capstone Project commissioned by the Department for International Development (DFID) to a group of students of the Master of Public Administration (MPA) at the London School of Economics and Political Science (LSE). The terms of reference (TORs) guiding the project are provided in the Appendix. All remaining errors are our own. The views and findings expressed in this report are the authors’ own and do not necessarily reflect the views of the DFID, LSE or their staff.
  • 4. 3 Acknowledgements This Capstone project, like many other endeavours, is a culmination of efforts of many indi- viduals. We take this opportunity to express our gratitude to all those who have helped us in various capacities. Firstly, we would like to thank our client, Department for International Development (DFID) and Dr. Alexander Hamilton, Statistics Advisor, DFID; for giving us an opportunity to work on an exciting project. Alexander was instrumental in shaping the trajectory of our research, and the current form of the report owes a great deal to his feedbacks and comments. We are also thankful to Mr. Conor Doyle, Economic Advisor, DFID, for useful insights, feedback and comments. Secondly, we are grateful to Dr. Konstantinos Matakos-our Capstone Advisor at LSE- for his guidance, constructive criticism and encouragement throughout the course of this project. We are indebted to Dr. Jouni Kuha, Department of Methodology, LSE, for his advice on the statistical methodology used in this report. We would also like to express our gratitude to Prof. Patrick Dunleavy and the participants of MPA Capstone Workshops for their useful comments, queries and insights. Lastly, we would like to thank the MPA Office at the London School of Economics for logistical support and assistance. DFID Capstone Team, London School of Economics & Political Science March, 2015
  • 5. 4 Executive Summary 1. The forthcoming Sustainable Development Goals have put governance back on the agenda. Within this context, DFID is interested in understanding and developing better measures to quantify and assess governance. This report contributes, to that aim, by providing a clear assessment of the validity and reliability of indicators in two particular dimensions of governance: Public Financial Management (PFM) and Corruption. 2. The literature review highlighted the following: • Governance is a multidimensional phenomenon and there is no convergence regarding its conceptual understanding. • The most widely used governance measurement approach has been composite indices. Therefore, this report reviews indicator-based approaches to quantifying governance. It discusses how to construct composite/aggregate indicators and assess their quality. 3. Analytical and practical considerations informed the selection of the two dimensions. PFM and Corruption allowed us to conduct a holistic analysis by allowing for the comparison of two types of indicators: objective and subjective. Additionally, both are salient governance dimensions and have strong policy relevance. Practical considerations highlighted the importance of using good data to carry the assessment. Thus, for PFM, the relevant datasets were chosen because of their applicability within the development context. For Corruption, the suitable datasets were chosen because they allowed us to explore different levels of indicator aggregation. 4. Having identified the relevant datasets, we employed multivariate analysis to assess the validity and reliability of our relevant indicators. Exploratory and Confirmatory Factor Analysis allowed us to investigate whether indicators designed to measure particular con- cepts were indeed consistent with the assumed structure. 5. Results: • PFM: The relevant datasets for this section were PEFA & OBS. 
The OBS analysis points towards the reliability and validity of our constructed aggregate indicators, thereby suggesting the overall good quality of OBS data. On the other hand, PEFA analysis provides less convincing results: the indicators seem to be measuring close constructs with weak convergent and discriminant validity. However, the PEFA re- sults should be interpreted with caution in view of the potential limitations concerning the data methodology employed.
  • 6. 5 • Corruption: The analysis is split into two sections. The first section, looks at aggregated measure of corruption using datasets from three sources - WB, ICRG and GIB - to investigate whether they are measuring the same underlying concept. The results show this to be true. Although we cannot firmly conclude from it that the measured concept is the true measure of corruption, it is evidence for these indicators being valid. The second section looks at whether the GCB survey indicators measure the right underlying concepts proposed by the survey. Our analysis shows that this is not the case and thus, it can be concluded that there is potential for future research to construct better measurements. 6. Our analysis allows us to suggest a meaningful way to further explore the appropriateness of governance indicators. For PFM measurement, we suggest the use of the OBS: valid, reliable and particularly suited for the monitoring of underdeveloped PFM systems. For corruption, we point out that the use of GCB at the early stages of project formulation will be beneficial. Additionally, the aggregate measures of corruption, can be used to provide contextual background when evaluating project impact. Finally, within each indicator set, we further recommend dropping problematic or redundant indicators, merging indicators that line up with a congruent pattern of dimensionality, and classifying indicators that appear to be tautological.
  • 7. Contents 1 Introduction 14 2 Governance: definition and dimensions 16 2.1 Understanding governance and its dimensions . . . . . . . . . . . . . . . . . . . . 16 2.2 Dimensions of focus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.2.1 Public Financial Management Policy Background . . . . . . . . . . . . . . 17 2.2.2 Corruption Policy Background . . . . . . . . . . . . . . . . . . . . . . . . 19 3 Measuring Governance: composite indicators 21 3.1 Composite governance indicators: an introduction . . . . . . . . . . . . . . . . . 22 3.1.1 Advantages of composite (aggregate) indicators . . . . . . . . . . . . . . . 23 3.1.2 Weaknesses of composite (aggregate) indicators . . . . . . . . . . . . . . . 24 3.2 Designing indicators: methodological considerations . . . . . . . . . . . . . . . . 25 3.2.1 Selection of indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 3.2.2 Factor retention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.2.3 Selection of aggregation function . . . . . . . . . . . . . . . . . . . . . . . 26 3.2.4 Selection of weights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 3.2.5 Uncertainty and Sensitivity analysis . . . . . . . . . . . . . . . . . . . . . 28 6
  • 8. CONTENTS 7 4 Assessing governance indicators 30 4.1 Validity and reliability: definition . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 4.2 Introducing Exploratory and Confirmatory Factor Analysis . . . . . . . . . . . . 33 4.2.1 Exploratory Factor Analysis model . . . . . . . . . . . . . . . . . . . . . . 33 4.2.2 Confirmatory Factor Analysis model . . . . . . . . . . . . . . . . . . . . . 35 4.3 Assessing validity and reliability of indicators with factor analysis: methodology 36 4.3.1 Quantifying validity and reliability . . . . . . . . . . . . . . . . . . . . . . 36 4.3.2 Exploratory Factor Analysis: getting a first idea of the indicators’ validity and reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 4.3.3 Confirmatory Factor Analysis: confirming the factors structure . . . . . . 38 5 Data and analysis 40 5.1 Public financial management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 5.1.1 Open Budget Survey (OBS) . . . . . . . . . . . . . . . . . . . . . . . . . . 40 5.1.2 Public Expenditure and Financial Accountability (PEFA) . . . . . . . . . 53 5.2 Corruption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 5.2.1 Aggregate Measures of Corruption . . . . . . . . . . . . . . . . . . . . . . 65 5.2.2 Public Opinions: Global Corruption Barometer . . . . . . . . . . . . . . . 68 5.3 External consistency of the results . . . . . . . . . . . . . . . . . . . . . . . . . . 77 6 Recommendations and conclusion 79 6.1 Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 6.2 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 References 82 Appendix 90 Terms of Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 Technical annex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
  • 9. Acronyms and Abbreviations BEES Business Enterprise Economic Survey CD Coefficient of determination CFA Confirmatory Factor Analysis CFI Comparative fit index CPI Corruption Perception Index DeMPA Debt Management Performance Framework DFID Department of International Development EFA Exploratory Factor Analysis GCB Global Corruption Barometer GIB Global Insights Business ICRG International Country Risk Guide INTOSAI International Organization of Supreme Auditing Institutions MDGs Millennium Development Goals NGOs Non-governmental Organizations NPFM New Public Financial Management OBI Open Budget Index PCA Principal Component Analysis PEFA Public Expenditure and Financial Accountability PFM Public Financial Management PFM-PR Public Financial Management Performance Report 8
PMF Public Expenditure and Financial Accountability Performance Measurement Framework
RMSEA Root mean squared error of approximation
SRMR Standardized root mean squared residual
TI Transparency International
TLI Tucker-Lewis index
WBES World Bank Enterprise Surveys
List of Figures

3.1 Indicators’ classification 22
3.2 Composite index structure 23
3.3 Constructing composite indicators 29
4.1 Validity and reliability 31
4.2 One-factor EFA model 35
4.3 EFA vs CFA 36
5.1 OBS’s 11 intermediate indicators 43
5.2 EFA visualisation, Budget Process and Budget Proposal 44
5.3 CFA diagram, Budget Process and Budget Proposal 45
5.4 EFA visualisation, Budget Proposal and Citizens’ Budget 47
5.5 CFA diagram, Budget Proposal and Citizens’ Budget 48
5.6 EFA visualisation, Budget Process and Citizens’ Budget 49
5.7 CFA diagram, Budget Process and Citizens’ Budget 50
5.8 EFA visualisation, all dimensions 51
5.9 CFA diagram, all dimensions 52
5.10 Structure and coverage of the PFM system 54
5.11 PEFA indicators list 56
5.12 EFA visualisation, Credibility and Comprehensiveness & Transparency 57
5.13 CFA diagram, Credibility of the Budget and Comprehensiveness & Transparency 59
5.14 EFA visualisation, Credibility of Budget and Accounting, Recording & Reporting 60
5.15 CFA diagram, Credibility of Budget and Accounting, Recording & Reporting 61
5.16 EFA visualisation, Credibility of the Budget and Budget Cycle 62
5.17 Methods for measuring corruption 64
5.18 Main “overall corruption” indicators 67
5.19 GCB’s 23 indicators 70
5.20 EFA visualisation, Experience of Corruption & Effectiveness 71
5.21 EFA visualisation, Perception & Experience of Corruption 72
5.22 EFA visualisation, Perception of Corruption & Effectiveness 73
5.23 EFA visualisation, all dimensions 74
5.24 CFA diagram, all dimensions 75
List of Tables

4.1 Validity and reliability 32
4.2 CFA fit indices 39
5.1 EFA, Budget Process and Budget Proposal 44
5.2 CFA, Budget Process and Budget Proposal 46
5.3 CFA fit indices, Budget Process and Budget Proposal 46
5.4 CFA fit indices, Budget Proposal and Citizens’ Budget 48
5.5 CFA fit indices, Budget Process and Citizens’ Budget 50
5.6 CFA fit indices, all dimensions 52
5.7 CFA fit indices, Credibility and Comprehensiveness & Transparency 1/2 58
5.8 CFA fit indices, Credibility and Comprehensiveness & Transparency 2/2 59
5.9 CFA fit indices, Credibility of Budget and Accounting, Recording & Reporting 61
5.10 CFA fit indices, Credibility of Budget and Budget Cycle 63
5.11 Measures of corruption 65
5.12 CFA fit indices, all dimensions 76
5.13 CFA fit indices, all dimensions modified 77
6.1 EFA, Budget process and budget proposal 93
6.2 EFA, Budget process and budget proposal 93
6.3 CFA, Budget process and budget proposal 95
6.4 EFA, Budget proposal and citizens’ budget 97
6.5 CFA, Budget proposal and citizens’ budget 98
6.6 EFA, Budget process and citizens’ budget 99
6.7 CFA, Budget process and citizens’ budget 99
6.8 CFA, Budget process, budget proposal and citizens’ budget 101
6.9 CFA, Budget process, budget proposal and citizens’ budget 102
6.10 EFA, Credibility [A.] and Comprehensiveness & Transparency [B.] 103
6.11 EFA, Credibility [A.] and Budget Cycle [C(iii)] 103
6.12 EFA, Credibility [A.] and Budget Cycle [C.] 104
6.13 CFA, Credibility [A.] and Comprehensiveness & Transparency [B.] 104
6.14 CFA, Credibility [A.] and Budget Cycle [C(iii)] 105
6.15 CFA, Credibility [A.] and Budget Cycle [C.] 106
6.16 EFA, Aggregate Measures of Corruption 107
6.17 EFA, Experience of Corruption & Effectiveness 107
6.18 EFA, Perception of Corruption & Effectiveness 108
6.19 EFA, Perception & Experience of Corruption 108
6.20 EFA, Perception of Corruption, Experience of Corruption & Effectiveness 109
6.21 CFA, Perception of Corruption & Effectiveness 110
6.22 CFA, Perception & Experience of Corruption 111
6.23 CFA, Perception of Corruption, Experience of Corruption & Effectiveness 112
Chapter 1

Introduction

The Department for International Development (DFID) is the British government’s department responsible for the UK’s efforts towards ending extreme poverty (DFID, 2014) and achieving the Millennium Development Goals (MDGs). In line with this, DFID has decided to invest efforts in the study of governance, due to its implied association with economic growth1 and its inclusion in the new Sustainable Development Goals (SDGs) (UN, 2014). Consequently, the UK has included good governance among its development objectives and has committed to using its expertise to achieve this goal2.

However, governance is not easily quantifiable. Some of the shortcomings confronted by aid agencies and policymakers in tracking progress in this area relate to the difficulties of defining governance, establishing accurate indicators, collecting dispersed information, and comparing measurement outcomes across countries (UNDP, 2012). This report explores the validity and reliability of specific governance indicators using a multivariate analysis approach. In particular, we employ exploratory and confirmatory factor analysis to assess both criteria through the study of two relevant dimensions of governance: Corruption and Public Financial Management.

The findings from our analysis of PFM imply that OBS data is of good quality overall, while PEFA appears to measure very close constructs with weak convergent and discriminant validity. For Corruption, we find that the aggregate indicators developed by the World Bank, GIB and ICRG measure the same underlying notion of corruption. However, the GCB indicators fail to measure the same conceptual understanding proposed by the survey.
1 The extensive literature on governance and economic performance highlights their close association and the need to expand efforts to achieve good governance in order to enhance economic growth (see Khan, 2006; Acemoglu & Robinson, 2006; Arndt & Oman, 2010; Morrissey et al., 2011; Gani, 2011; Cerqueti, 2011).
2 “This, I believe, is a totally new addition to the Millennium Development Goals: the importance of good governments, lack of corruption – what I call the golden thread of development.” David Cameron’s speech at the UN (15 May 2013), available at: https://www.gov.uk/government/speeches/david-camerons-speech-to-un
The structure of this report proceeds as follows: Chapter 2 defines governance and its dimensions, and gives a policy background of the two dimensions that are the focus of our analysis. Chapter 3 develops a general approach to governance indicators and elaborates on the construction and use of composite indicators. Chapter 4 defines validity and reliability and presents the main econometric tools used in our assessment. Chapter 5 presents the data and results of the technical analysis and discusses the practical significance of the results. Finally, Chapter 6 concludes and presents our policy recommendations.
Chapter 2

Governance: definition and dimensions

In this chapter, we define governance and identify some of its dimensions as proposed by various organizations. We then identify two dimensions, Public Financial Management and Corruption, which will be the focus of our analysis, and give the policy context for our study of these two dimensions.

2.1 Understanding governance and its dimensions

Academics, international organizations and development agencies have defined governance in multiple ways, and there appears to be no agreement on a single definition of the term. In this section, we attempt to define governance and give examples of dimensions of governance as proposed by different organisations.

Definitions of governance range from broad to narrow. The latter reflect specific interpretations of the term by different organisations, in line with their particular mandate (World Bank, 1994). Furthermore, a number of donor agencies and academics have given a normative element to the term ‘governance’, often referring to it as ‘good governance’. The underlying reason for adopting a normative approach is the close association between aid effectiveness and good-quality institutions (WGI, 2009).

The World Bank Research Institute proposes one of the most comprehensive definitions of governance. They define it as the traditions and institutions by which authority in a country is exercised, including:

• The process by which governments are selected, monitored and replaced
• The capacity of the government to effectively formulate and implement sound policies
• The respect of citizens and the state for the institutions that govern economic and social interactions among them (Kaufmann et al., 2004)

Defining governance is important, since its operationalisation helps identify the various elements that constitute it, thereby facilitating its measurement. Although there is currently no consensus on what ‘governance’ means, institutions have chosen to identify principles or dimensions of governance to facilitate its measurement and the monitoring of progress (WGI, 2009). The UNDP, for example, identifies nine characteristics of good governance: participation, rule of law, transparency, responsiveness, consensus orientation, equity, effectiveness and efficiency, accountability, and strategic vision (UNDP, 1997). The Worldwide Governance Indicators Project proposes six dimensions of governance: ‘Voice and Accountability’, ‘Political Stability and Absence of Violence’, ‘Government Effectiveness’, ‘Regulatory Quality’, ‘Rule of Law’ and ‘Control of Corruption’ (Kaufmann et al., 2004). Other dimensions of governance include public sector delivery, public financial management, state capacity, security, empowerment and elections.

2.2 Dimensions of focus

Of the many possible dimensions of governance, this report focuses on two: Public Financial Management (PFM) and Corruption. With these we aim to compare objective and subjective governance indicators. While indicators measuring PFM are mostly based on objectively verifiable data, corruption indicators often rely on perception-based information. Therefore, by analysing these dimensions, we aim to encompass the two ends of the spectrum of existing governance indicators. More importantly, from a policy perspective these dimensions are central to the governance debate, as illustrated in the rest of this section.
2.2.1 Public Financial Management Policy Background

PFM refers to how a government manages the budget in its various phases of formulation, approval and execution, including oversight, control and intergovernmental fiscal relations (Cangiano et al., 2012). It relates to the laws, systems and procedures used by governments to employ resources efficiently and transparently, and focuses on the management of government expenditure (Allan et al., 2004). PFM has three main policy objectives (Schick, 1998):

• To promote a sustainable fiscal position by establishing a balance between government revenues and expenditure.
• To facilitate effective allocation of public resources in line with government priorities.
• To enhance efficient delivery of public goods and services by promoting value for money.

Since the late 1980s, donor countries have recognised the importance of improving public sector management in order to improve economic performance (Wescott, 2008). This is because high levels of corruption in developing countries may lead to aid being used for purposes other than those for which it was intended. This concern, as well as the new framework for aid1 under which donor countries committed to relying on partner countries’ own financial management institutions, has led to PFM becoming central in development policy. As a result, more than 50 donor agencies, including the World Bank, are now providing either General Budget or Sector Budget Support to developing countries (Bietenhader, 2010).

The last two decades have seen a gradual evolution of PFM and the adoption of reforms that have introduced new information requirements, process adjustments, and the imposition of restrictive rules (Cangiano et al., 2012). These reforms have taken different theoretical approaches. New Public Financial Management (NPFM) introduced reforms that included a shift from cash- to accruals-based systems, devolution of budgets, performance measurement and performance-based auditing (DFID, 2009). The Public Expenditure Management approach aimed to understand the broader context of good budgeting practices, taking into account different actors and institutions as well as linking expenditure with results (World Bank, 2001).

More recently, a study by the Public Expenditure and Financial Accountability (PEFA) Initiative developed the Strengthened Approach. The study identified a lack of country ownership, an inability to objectively measure the progress of PFM reforms, and uncoordinated PFM projects as hindrances to PFM reform in developing countries.
This approach emphasised increased government ownership, coordinated program support by various donor countries, and a measurement framework to track PFM results and performance over time.

In light of the measurement framework objective of the Strengthened Approach, various measures and diagnostic tools have been developed and applied to measure PFM results, including the IMF Code of Good Practices for Fiscal Transparency, the Public Expenditure and Financial Accountability (PEFA) Assessment, the Open Budget Survey (OBS), the Debt Management Performance Framework (DeMPA), DFID’s Fiduciary Risk Assessment and the OECD Aid Effectiveness Indicators. Two of the most commonly used measures, PEFA and OBS, are the subject of our analysis in the following sections.

PFM is crucial for good governance since it ensures a more transparent budgeting process that reduces misallocation of funds and public debt. It is recognised that budgeting, besides being a technical process, is also a political process affected by informal interests that may at times override the efficiency and effectiveness imperative. PFM aims to counter this by introducing

1 Encapsulated in the 2005 Paris Declaration.
measures that change the motivations and actions of politicians and public servants or restrict their actions (Cangiano et al., 2012).

2.2.2 Corruption Policy Background

Corruption has been defined as the misuse of entrusted power for private gain and can be classified as either grand or petty corruption (TI, 2015)2. Grand corruption is committed at high levels of government, resulting in the distortion of policies or of the state’s functions, while petty corruption refers to everyday misuse occurring between low-level officials and citizens (TI, 2015; Mashali, 2012).

Corruption is a topic of great concern for policymakers, academics and aid organisations due to its apparent negative effects on economic performance. Morrissey et al. (2011) argue that corruption reduces direct investment incentives, while Begovic (2005) states that it increases the transaction costs of economic operations. However, studies have shown that although corruption has an adverse effect on growth, it is hard to establish its magnitude (DFID, 2015). Moreover, Méon & Weill (2010) state that in countries with low-quality institutions, corruption can act as ‘grease in the wheels’, leading to increased productivity. Despite this, by providing incentives for illegal misappropriation of benefits, corruption perpetuates institutional weaknesses and reduces the chances of economic improvement in the long run. Empirical evidence shows that the positive effects of corruption on aggregate efficiency only hold in countries with weak institutions and frail democracies (ibid.).

There are various explanations of the causes of corruption, some relying on the study of individual incentives for corruption. Begovic (2005) builds on the rational choice approach to explain corruption and views people as utility-maximising entities aiming to increase personal wealth.
Corruption here is seen as an innate behaviour that serves to achieve this end by reducing transaction costs and allowing for the misappropriation of rents. Furthermore, such behaviour is exacerbated by public officials’ greed and discretion. As explained by Rose-Ackerman (1978), corruption is the consequence of excessive discretionary power, which allows public officials to reward or punish citizens in order to achieve their own preferences with a low risk of detection.

Another salient explanation of corruption is the principal-agent model (DFID, 2015). Corruption is seen as the result of asymmetric information between the principal (citizens) and the agent (public officials). The fact that agents hold more information than principals, with the latter having little oversight capacity, increases the likelihood of corruption. This approach

2 Since 1993, Transparency International (TI) has played an important role in the disclosure of the pervasive practice of corruption around the world. Through the yearly release of the Corruption Perception Index, this non-governmental organisation draws attention to the achievements and drawbacks of the fight against corruption around the world. Similarly, the World Bank conducts a yearly measurement of governance performance. The Worldwide Governance Indicators (WGI) include control of corruption as one of their six governance dimensions. Interestingly, 31 out of 32 sources used by this renowned measurement include indicators regarding corruption.
incorporates the individual rent-seeking rationale previously considered while highlighting the asymmetric relationship between actors. According to the literature, this is a valid explanation that has been widely evidenced in developing countries.

Conversely, Khan (2006) points to structural causes of corruption in the developing world. Two of these are the use of corruption to guarantee political stability by giving selective incentives to certain constituencies, and the weakness of property rights, which increases the chances of developing non-market transactions. In both cases, corruption is the answer to either an institutional or a structural failure. Therefore, policy reforms must concentrate on strengthening institutions.

Finally, it is necessary to highlight that corruption appears in different forms and has different drivers depending on the sector. For example, Bertram (2005) studies corruption in nine sectors and concludes that the types and drivers of corruption change from one sector to another. Therefore, anti-corruption policy design should take into account the key features of each sector in order to be effective. Most importantly, any attempt to accurately measure and assess policy results in this area requires a clear definition and understanding of the type and level of corruption to be evaluated.
Chapter 3

Measuring Governance: composite indicators1

As a multidimensional phenomenon, governance has recurrently been measured by composite indicators. This chapter presents this approach and discusses its advantages and disadvantages. It focuses in detail on the process of designing composite indicators and on addressing methodological issues from a technical perspective.

Quantifying governance is an arduous endeavour, primarily due to the practical and methodological issues involved in the design of good metrics. Governments, development organisations and the private sector are increasingly adopting a composite-indicator-based approach to measure the multidimensional phenomenon of governance (Williams, 2011). The key advantage of composite indicators is their capacity to compile and summarise large amounts of information about many different governance dimensions in a single yearly score for each country. This allows donor agencies such as DFID and USAID to track countries’ progress and better define resource allocation. Therefore, an index approach has become widely used for decision-making purposes (Arndt & Oman, 2010; Foa & Tanner, 2012; UNDP, 2007).

Governance indicators can be classified in various ways, and the following diagram is an illustration of one possible classification.

1 Also referred to as ‘aggregate indicators’.
Figure 3.1: Indicators’ classification. Source: Authors’ own representation.

The design of composite governance indicators is subject to several challenges. Firstly, the multidimensional nature of governance makes it difficult to quantify. The composite index design requires an arbitrary selection of which dimensions and proxies to use in order to capture changes on the chosen parameters, and of how to aggregate individual indicators. Secondly, more often than not, governance indices are based on subjective, perception-based information such as expert surveys. The use of subjective information increases the scope for bias and measurement errors (Arndt & Oman, 2006). Thirdly, the framework for measuring governance is often influenced by international standards which might not reflect the realities of other contexts. For example, the OBS compares both developing and developed countries on Public Financial Management using the same framework, even though it is well known that there are significant differences between the two groups of countries in terms of budget procedures and capacities. The following sections discuss how to overcome some of these problems from a methodological perspective.

3.1 Composite governance indicators: an introduction

Composite indexes are a particular type of measure designed to quantify multidimensional phenomena which cannot be adequately represented by an individual indicator, e.g. the Multidimensional Poverty Index (MPI) (Anand & Sen, 1994; Alkire & Housseini, 2014). Under this approach, a set of relevant variables and dimensions are defined, measured and aggregated into a single country score. Simply put, composite indicators resemble a pyramid structure (Figure 3.2). At the base, all the relevant variables for measuring the specific phenomenon are compiled. Each of these is aggregated under a set of components.
The resulting scores are then aggregated into a single country-year score which constitutes the country indicator (Arndt & Oman, 2010).
Thus, composite indexes produce synthetic measures of complex realities, which serve to monitor changes in system status or to track trends in system performance (OECD, 2008; USAID, 2014).

Figure 3.2: Composite index structure. Source: Authors’ own representation.

3.1.1 Advantages of composite (aggregate) indicators

The key strength of aggregate indicators is their ability to convey information on many parameters succinctly (Booysen, 2002; Hahn, 2008; Zhou & Ang, 2009; Balica, 2012b). Composite indices are therefore powerful communicative tools because they present clear and concise results, such as scores or rankings, to non-technical audiences (Kenney et al., 2012). This helps to promote multi-stakeholder dialogue, establishing a common understanding of supranational concerns and overcoming socio-political barriers to decision making (Preston et al., 2011: 183). The two main advantages of aggregate measures are:

• Variables that cannot be directly observed may be inferred by integrating multiple indicators as part of a composite indicator.
• The use of composite indices helps to overcome problems of precision, reliability and accuracy by reducing the influence of measurement error as the number of observations from multiple sources increases (Kaufmann & Kraay, 2007; Maggino & Zumbo, 2012).
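The error-reduction claim in the second point can be checked with a toy simulation. The sketch below is our own illustration in Python, not part of the report’s methodology: the latent “true” scores, the noise level, and the number of hypothetical countries are all invented for the example.

```python
import random
import statistics

random.seed(0)

# 200 hypothetical countries with a latent "true" governance score,
# each observed only through noisy indicators.
n_countries, noise_sd = 200, 1.0
true_scores = [random.gauss(0, 1) for _ in range(n_countries)]

def mean_abs_error(k):
    """Average absolute gap between the truth and the mean of k noisy sources."""
    errors = []
    for t in true_scores:
        indicators = [t + random.gauss(0, noise_sd) for _ in range(k)]
        errors.append(abs(statistics.mean(indicators) - t))
    return statistics.mean(errors)

err_1, err_9 = mean_abs_error(1), mean_abs_error(9)
print(f"mean abs. error with 1 source: {err_1:.2f}; with 9 sources: {err_9:.2f}")
```

Averaging nine independent noisy sources shrinks the typical measurement error by roughly a factor of three relative to a single source, which is the intuition behind aggregating multiple indicators.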
They can also inform policy action and guide further research. Booysen (2002) postulates that composite indexes are flexible enough to be modified and updated to meet decision makers’ requirements. Additionally, as new data becomes available, composite index methodologies evolve and are refined over time to incorporate new challenges (e.g. the 2010 HDI modifications; see Klugman et al., 2011). Composite indicators can therefore help policymakers to identify priorities, establish and refine standards, develop policy guidelines, determine appropriate adaptations, set targets, and allocate resources (OECD, 2008), as well as guide future research, data collection, and data improvement efforts by revealing weaknesses and gaps in data systems (USAID, 2014).

3.1.2 Weaknesses of composite (aggregate) indicators

The common critique of aggregate indices concerns the framework of implicit and explicit assumptions on which they rest, which do not always hold (Arndt & Oman, 2006; Ravallion, 2012), alongside the loss of the individual indicators’ richness during aggregation (Molle & Mollinga, 2003; Abson et al., 2012; Kenney et al., 2012). Both shortcomings can lead to mistaken conclusions (Lindsey, Wittman et al., 1997). Moreover, composite indices may also fail to capture the interconnectedness of indicators, ignore important dimensions that are difficult to measure, and disguise weaknesses in some components (Molle & Mollinga, 2003; Zhou & Ang, 2009; Abson et al., 2012).

Contrary to what some authors argue (e.g. Kaufmann et al., 2007), indicator aggregation can have a domino effect whereby it tends to amplify the effect of measurement errors. Thus, problems of precision, reliability, accuracy, and validity associated with individual indicators can be propagated during the process of aggregation.
However, the biggest limitation of aggregate indicators is the mechanism for determining their constituent variables (Lohani & Todino, 1984). Generally, the parameters chosen reflect the priorities or focus areas of the agencies which construct such indicators. There is no standard scientific method of selecting parameters; rather, they tend to be based on expert opinion. Thus, aggregate indicators cannot rule out the possibility of omitting important variables and are also exposed to experts’ bias (Arndt & Oman, 2006). Aggregate indicators therefore tread a tightrope: simplifying intricate information without being simplistic. These concerns imply that composite indices can also misguide policy and practice if used in an undiscriminating manner or if results are misinterpreted, misrepresented, or overstated (Arndt & Oman, 2006; USAID, 2014).
3.2 Designing indicators: methodological considerations2

3.2.1 Selection of indicators

While indicators used in aggregation are often obtained from existing data sources, they can also be sourced by planning and implementing new data collection efforts. The selection of variables for inclusion in aggregation is a contentious issue and must be approached with caution (Lohani & Todino, 1984). Barnett et al. (2008) argue that indicators are sometimes “selected not because the data reflect important elements of a model of vulnerability, but because of the existence of data that are relatively easy to access and manipulate.” Pragmatic criteria for deciding whether to include or exclude an indicator are as follows (USAID, 2014):

• Data availability from public or private sources, including the cost, frequency, timeliness, consistency, and accessibility of available data and the indicators’ temporal and spatial coverage.
• If periodic updates are planned, then it is important to ascertain institutional commitments to update and maintain constituent data sets, and to choose data accordingly.
• Where new data is to be collected, time, effort, and budget constraints must be taken into account.
• Data quality (e.g. data accuracy; whether or not the data are adequately referenced and dated).
• The degree of salience: how relevant is the indicator to the intended users of the index?
• The degree of audience resonance: how meaningful is the indicator to the intended audience?

Other than pragmatic considerations, the selection of variables is contingent on other factors. Firstly, the selected sub-indices must provide a representation of the principal factors of interest. Secondly, collinearity amongst sub-indices is a problem: highly correlated sub-indices can effectively be considered substitutes. Correlation coefficients are a standard way of testing for collinearity.
A practical way of addressing the problem of collinearity is to include only one sub-index from a highly correlated set.

2 A flowchart representation of the indicator design process is provided in Figure 3.3.
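The screening rule just described — compute pairwise correlations and keep only one sub-index from each highly correlated set — can be sketched as follows. This is an illustrative Python snippet with hypothetical data and an assumed 0.9 threshold, not a procedure taken from the report.

```python
import math

# Hypothetical sub-indices scored for six countries; A and B move
# almost in lockstep, so one of them is effectively redundant.
data = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "B": [1.1, 2.0, 2.9, 4.2, 5.1, 5.8],   # near-duplicate of A
    "C": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Greedily retain a sub-index only if it is not highly correlated
# (|r| >= 0.9) with any sub-index already kept.
threshold, kept = 0.9, []
for name in data:
    if all(abs(pearson(data[name], data[k])) < threshold for k in kept):
        kept.append(name)

print("retained sub-indices:", kept)  # B is dropped as a near-substitute of A
```

The greedy pass keeps A and C and drops B, matching the text’s advice to include only one representative from each highly correlated set. The threshold is a design choice, not a standard.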
3.2.2 Factor retention

If the number of indicators selected at the previous stage is large, it is desirable to reduce the effective number by selecting only the most significant indicators, removing indicators of low relevance, and thereby minimising the redundancy of highly correlated variables. Many statistical techniques and stakeholder processes are available to narrow down the pool of indicators; for example, exploratory factor analysis, principal component analysis (PCA), the derivative method, the correlation method, expert surveys, and stakeholder discussion (Adger & Vincent, 2005; Balica & Wright, 2010; Balica et al., 2012; Babcicky, 2013).

3.2.3 Selection of aggregation function

The aggregation function is central to the creation of an aggregate index: it defines how the sub-indices combine to form the aggregate. Given its importance, it is not surprising that there is considerable debate over the most appropriate aggregation function (Ravallion, 2012). Some of the most commonly used aggregation functions are:

• Summation (additive aggregation): summation of normalised and weighted or unweighted indicators to compute the arithmetic mean (Booysen, 2002; Tate, 2012).
• Multiplication (geometric aggregation): the product of normalised weighted indicators (Tate, 2013).
• Power means or adjusted means: stress the importance of specific areas of the indicators, where countries suffer more deprivation or where they are at risk of under-performing with respect to a set threshold.
• Max or Min: the maximum or minimum sub-index, respectively, is reported.

Firstly, one needs to consider the strengths and weaknesses of the aggregation functions. Ott (1978) has highlighted two potential problems with aggregation functions:

• Underestimation problem: when the index does not exceed a critical value despite one or more of its sub-indices exceeding the critical value.
• Overestimation problem: when the index exceeds the critical level without any sub-index exceeding the critical level. The above problems become increasingly significant when the sub-indices are dichotomous or categorical. Therefore, a good aggregation function is one which minimises one or both of the overestimation and underestimation problems.
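To make the aggregation choices and Ott's over/underestimation problems concrete, here is a minimal sketch in Python. The function names, the assumption that sub-indices are normalised to [0, 1], and the 0.5 critical threshold are our own illustrative choices, not part of any cited methodology:

```python
import numpy as np

def aggregate(subindices, weights=None, method="additive"):
    """Combine normalised sub-indices (each in [0, 1]) into one index score."""
    x = np.asarray(subindices, dtype=float)
    w = np.full(len(x), 1.0 / len(x)) if weights is None else np.asarray(weights, dtype=float)
    if method == "additive":      # weighted arithmetic mean
        return float(np.sum(w * x))
    if method == "geometric":     # weighted geometric mean
        return float(np.prod(x ** w))
    if method == "min":
        return float(np.min(x))
    if method == "max":
        return float(np.max(x))
    raise ValueError(method)

def estimation_problems(subindices, index_value, critical=0.5):
    """Flag Ott's underestimation / overestimation problems at a threshold."""
    exceeds = [s > critical for s in subindices]
    under = any(exceeds) and index_value <= critical   # a sub-index exceeds, index does not
    over = (not any(exceeds)) and index_value > critical
    return under, over
```

For sub-indices [0.9, 0.2, 0.2], the additive mean is about 0.43 and stays below a 0.5 critical value even though one sub-index clearly exceeds it (the underestimation problem); the max function avoids this at the cost of discarding information about the other sub-indices.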
• 28. CHAPTER 3. MEASURING GOVERNANCE: COMPOSITE INDICATORS 27 Secondly, it is necessary to take into account the functional form (either increasing or decreasing) of the sub-indices. Those with increasing functional forms regard higher values as a ‘worse’ state than lower values, and vice-versa. Lastly, simple mathematical aggregation functions are often preferred over complex functions. Thus, when competing aggregation functions produce similar results with respect to overestimation and underestimation, the most appropriate function will be the mathematically ‘simplest’ one (Jollands et al., 2003). 3.2.4 Selection of weights The weights on sub-indices indicate their relative importance and greatly influence the final aggregate indicator. However, weighting is one of the most contentious topics in indicator design, partially because there is no standard method. We present some approaches below. Normative approaches: use expert consultation, stakeholder discussion, and public opinion surveys to inform weighting schemes on the basis of the expertise, local knowledge, value judgments and insights of particularly relevant individuals and groups (Booysen, 2002; Chowdhury & Squire, 2006; Cherchye et al., 2007; Barnett et al., 2008; OECD, 2008; Kienberger, 2012; Decancq & Lugo, 2013). Numerical approaches: • Differential weighting: employed when there is sufficient knowledge and understanding of the relative importance of index components or of the trade-offs between index dimensions (Belhadj, 2012; Decancq & Lugo, 2013; Tate, 2013). • Equal weighting: applied when trade-offs between dimensions are not well understood and therefore assignment of differential weights cannot be reliably justified (Tate, 2012, 2013; Decancq & Lugo, 2013; Tofallis, 2013). Data-driven approaches: apply statistical methods to generate indicators’ weights. Blancas et al.
(2013) argue that using statistical methods to determine weights may counteract the influence of subjective decisions made at other stages of the indicator design process. Factor analysis (FA) and Principal Component Analysis (PCA) can be used to test indicators for correlation, thus allowing for adjustments to the weighting scheme by reducing the weights of correlated indicators. PCA and FA generate weighting schemes that account for as much of the variation in the data as possible with the smallest possible number of indicators (Deressa et al., 2008; OECD, 2008; Nguefack-Tsague et al., 2011; Abson et al., 2012; Tofallis, 2013). However, as De Muro, Mazziotta and Pareto (2011) argue, this weighting approach is rigid. Thus,
• 29. CHAPTER 3. MEASURING GOVERNANCE: COMPOSITE INDICATORS 28 the indicator runs the risk of becoming inoperable when confronted with changes in the data collection process. Additionally, FA results may sometimes diverge from reality. Consequently, a good underlying theory is essential to avoid such errors (De Muro, Mazziotta & Pareto, 2011). This analysis uses FA as the technique to assess the validity and reliability of governance indicators. A complete explanation of this methodology is developed in Chapter 4. 3.2.5 Uncertainty and Sensitivity analysis The final step of composite indicator construction is to run uncertainty and sensitivity tests. These help to determine whether the adopted theoretical model is a good fit for the selected constituent indicators, and the extent to which a different choice of inputs changes the output ranking. They can also test whether the weighting scheme is actually reflected in the output and whether the index is capable of reliably detecting change over time and space (USAID, 2014). These analyses inform modifications and refinements of index composition and structure to improve the accuracy, credibility, reliability, and interpretability of index results (OECD, 2008; Permanyer, 2011; Tate, 2012, 2013). • Uncertainty analysis: focuses on how uncertainty in the input factors propagates through the structure of the composite indicator and affects the composite indicator values (Nardo et al., 2005). It identifies and evaluates all possible sources of uncertainty in the index design and input factors. These include theoretical assumptions, selection of constituent indicators, choice of analysis scale, data quality, data editing, data transformation, methods applied to overcome missing data, the weighting scheme, the aggregation method, and the composite indicator formula (USAID, 2014).
• Sensitivity analysis: analyses the degree of influence of each input on the index output, thereby revealing which methodological stages and choices are most or least influential (Gall, 2007; Tate, 2012). Permanyer (2012) proposes that modellers should compare index results calculated using alternate weighting schemes to ascertain whether overall index rankings change substantially. This section answered a critical question, “How to design a good composite indicator?”, which is of great value to organisations and aid agencies engaged in endeavours to measure governance. However, more often than not policymakers have to rely on the set of already available indicators to make their decisions. The following chapter aims to aid policymakers by answering the question, “How to assess the quality of an indicator?”.
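The weighting-scheme comparison suggested under sensitivity analysis can be sketched in a few lines of Python. The data matrix, scheme names and the max-rank-shift summary below are our own hypothetical illustration:

```python
import numpy as np

def rankings(scores):
    """Rank countries by score, best (highest) first; returns the rank per country."""
    order = np.argsort(-np.asarray(scores))
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

def weight_sensitivity(data, schemes):
    """Maximum rank shift per country across alternative weighting schemes.

    data: (countries x sub-indices) matrix of normalised scores.
    schemes: dict of name -> weight vector (each summing to 1); the first
    scheme is taken as the baseline.
    """
    names = list(schemes)
    base = rankings(data @ np.asarray(schemes[names[0]]))
    shifts = np.zeros(data.shape[0], dtype=int)
    for name in names[1:]:
        r = rankings(data @ np.asarray(schemes[name]))
        shifts = np.maximum(shifts, np.abs(r - base))
    return shifts
```

With three hypothetical countries and two sub-indices, moving from equal weights to weights of (0.2, 0.8) swaps the top- and bottom-ranked countries, a warning sign that the ranking is weight-driven rather than data-driven.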
  • 30. CHAPTER 3. MEASURING GOVERNANCE: COMPOSITE INDICATORS 29 Figure 3.3: Constructing composite indicators Source: Authors’ own representation.
• 31. Chapter 4 Assessing governance indicators This chapter provides a brief technical introduction to the criteria and methods used in appraising the quality of governance indicators. First, it defines two of an indicator’s must-have qualities: validity and reliability. Then, it gives a brief introduction to factor analysis. Finally, it explains how the latter is helpful in discussing indicators’ validity and reliability. 4.1 Validity and reliability: definition Any governance measure should fulfil two important criteria in order to be considered an accurate measure: validity and reliability. Validity refers to the extent to which a specific indicator measures the concept it attempts to measure (Gisselquist, 2013). In statistical terms, it is defined as the lack of systematic error. Similarly, reliability refers to the extent to which an indicator can be extrapolated in time and place. Within the statistical framework it relates to the degree to which a measure lacks random errors (Maruyama & Ryan, 2014). Systematic and random errors are a recurrent problem in perception-based data and a potential source of inaccuracy for composite indicators. Indicators can be valid but not reliable, or vice-versa. As shown in Figure 4.1, panel 2, a country-level indicator might measure the true value on average while being very volatile: in such a case it is a valid but unreliable indicator. For instance, think of a corruption measurement being carried out monthly in the same country. Corruption levels in a country change slowly, so these measurements should be very close, if not the same: the opposite would hence be evidence of the measurement’s lack of reliability. Conversely, panel 1 shows the opposite case of an invalid but very reliable indicator which consistently measures the wrong concept. 30
• 32. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 31 Figure 4.1: Validity and reliability Source: MindSonar, understanding statistics. Formally, an indicator at the country level is: Xi,t = Truthi,t + Biasi,t + εi,t where Xi,t is the indicator we measure for country i in year t (for simplicity of exposition, let’s assume that it is a simple average of respondents’ answers), Truthi,t is the true level of what we try to measure in country i and year t, Biasi,t is the country average of respondents’ systematic error, and εi,t is the country average of respondents’ random measurement error. Systematic errors: these are associated with sampling errors and perception biases. This is pertinent to governance indicators as they mostly rely on expert opinion. First, experts’ and the general public’s views on governance achievements widely diverge. In terms of policy, problems flagged by experts will be very different from the broad population’s views and needs, particularly because experts tend to over-represent men, wealthy populations, and business people. This would have a meaningful impact on how policy advocacy or aid allocation is directed towards some topics rather than others (for example, issues related to taxation and regulation). Second, there is a risk of circular reasoning. Given the context shared by experts, business people and organisations, it is likely that each of them would end up informing their colleagues’ views. Both shortcomings directly affect the validity of the indicators (Arndt & Oman, 2006). Another source of systematic error arises from the likelihood that respondents change their responses in order to influence the country’s score to benefit their own interests (Arndt & Oman, 2010; Donchev & Ujhelyi, 2008). Random errors: these relate to respondents’ unintentional biases like misunderstanding of the questions, failure to remember exact facts, or simply mood or fatigue (Maruyama & Ryan, 2014).
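The decomposition above can be illustrated with a small simulation mimicking panels 1 and 2 of Figure 4.1. All numbers here (true level, bias, error variances) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)
truth = 0.6                  # hypothetical true level of corruption, held fixed
n_rounds = 1000              # repeated measurements in the same country and year

# Panel 2: valid but unreliable -- no systematic bias, large random error
valid_unreliable = truth + 0.0 + rng.normal(0.0, 0.3, n_rounds)

# Panel 1: invalid but reliable -- constant bias, negligible random error
invalid_reliable = truth + 0.2 + rng.normal(0.0, 0.01, n_rounds)
```

The first series averages close to the truth but swings widely between rounds, so any single measurement is untrustworthy; the second gives almost identical readings every round, all of them consistently wrong by the amount of the bias.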
In this respect, the use of large data sets and the aggregation of different data sources contribute to increasing reliability, as long as they measure the same thing (UNDP, 2007). Indeed, individuals’ uncorrelated random errors average out. However, in practice there are reasons to believe that individuals’ random errors are correlated in a given year, and do not fully average out when aggregating answers. This is attributable to many reasons. Firstly, data
• 33. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 32 produced by the indicator is also used to inform respondents’ opinions. Secondly, as presented above, circular reasoning is also an issue. Third, respondents might be affected in a similar way by the same type of political or economic factors, and hence their views would move together over time. Fourth, respondents from the same country have similar views based on a shared culture or background. When compiling different sources to calculate the final indicator, any of these issues reduces the amount of additional information provided by each source. The effect of such shortcomings is that the resulting confidence intervals are understated, and consequently any country variation in the score may be misleading and impossible to compare with other countries’ performance (Arndt & Oman, 2006). As presented by Kaufmann et al. (2011), a rule of thumb to overcome such an issue is to identify and compare countries whose confidence intervals do not overlap with each other or across periods. Table 4.1 summarises the different sources of potential biases. Table 4.1: Validity and reliability. What is Truth? It can be a concept (unquantifiable, measured with or without a proxy), an amount (objective), or the appropriateness of an amount (subjective). Validity (biases): sampling, information and perception biases arise in all three cases; when a proxy is used, there is an additional proxy bias (is the proxy a good proxy?). Reliability (random errors): the average of respondents’ idiosyncratic shocks (mood, fatigue, misrecording, setting. . . ). We aim to explore the validity and reliability of specific governance indicators with a multivariate analysis approach. Particularly, we employ exploratory and confirmatory factor analysis to assess both criteria based on the study of two relevant dimensions of governance: Corruption and Public Financial Management.
• 34. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 33 4.2 Introducing Exploratory and Confirmatory Factor Analysis The idea of latent variables and manifest variables (Bartholomew et al., 2008): An integral aim of our analysis is to assess whether our shortlisted indicators from the selected dimensions (Corruption and PFM) measure what they intend to measure, that is, whether they have construct validity. This broader notion of validity is further evaluated by two sub-categories: i) Discriminant validity – whether concepts or measures that are supposed to be unrelated are in fact unrelated, and ii) Convergent validity – the degree to which a measure is correlated with other measures that it is theoretically predicted to correlate with. Latent variable models such as exploratory and confirmatory factor analysis (henceforth, EFA and CFA) allow us to statistically evaluate both types of validity, and the reason they are useful in the context of governance indicators is the following. FA is preferable to Principal Component Analysis (PCA), a variable reduction technique (Fabrigar et al., 1999), since FA allows us to make underlying assumptions about the model that PCA cannot. Furthermore, Brown (2009) postulates that PCA does not account for random error that is inherent in measurement, whereas FA does. This makes it a better technique to employ for the purpose of our analysis. For many concepts (e.g. sex, income, size), the correspondence between the concept and its measurement is sufficiently close that the distinction between the two need not be emphasised. Many other concepts, however, such as the idea of good governance (corruption, public service delivery, public financial management etc.), are more abstract/complex, and capturing them with empirical data is difficult. In these cases, it is often useful to operationalise them with more than one observed indicator.
Latent variable models, like EFA and CFA, explain the values of a set of observed variables, and associations between them, in terms of their presumed dependence on underlying latent variables. The distinction between the two models is the following (Bartholomew et al., 2008): • EFA: Aim is simply to ‘identify the latent variables underlying a set of items’ (or indica- tors). • CFA: Aim is ‘to test whether a set of items designed to measure particular concepts are indeed consistent with the assumed structure’. 4.2.1 Exploratory Factor Analysis model In modelling EFA in terms of a regression model, consider the following: A set of indicators or manifest variables χ1, χ2, . . . , χp (where there are p indicators. In the analysis that follows, Public Expenditure and Financial Accountability (PEFA) for example has 28 indicators). A set of latent variables ξ1, ξ2, . . . , ξq (where there are q latent variables
  • 35. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 34 or ‘factors’ in statistical parlance). Thus for PEFA, we might suspect the 28 indicators to be measuring the same underlying concept(s). Whether this is the case and how confidently we can claim our case to be, will be the substance of our analysis. Indeed, the 28 indicators have been clubbed together by PEFA into three broad sub-dimensions and the analysis that we conduct assesses validity and reliability based on these three categories. The manifest variables can be thought of as the response variables and the latent variables as the explanatory variables (Bartholomew et al., 2008). A one-factor model would then have the following model equations for a hypothetical example of 4 items (indicators): χ1 = τ1 + λ1ξ + δ1 χ2 = τ2 + λ2ξ + δ2 χ3 = τ3 + λ3ξ + δ3 χ4 = τ4 + λ4ξ + δ4 Where: • χ1 . . . χ4 are the observed variables as mentioned above • ξ is the common factor. • λ1. . . λ4 are the factor loadings • δ1. . . δ4 are the specific or unique factors Although the common factors are unobserved, we can think of the above regressions as hypothet- ical constructs that are able to explain each of the dependent variables (or items) by means of a common set of regressors. Therefore even though these regressions are not explicitly estimated, they aid in conceptualising the mechanism by which factor analysis derives the latent variables from the correlations between many observed variables. A path diagram helps in visualising the above regression models where the circle represents the latent variable and each square one of the observed variables. The arrows (paths) illustrate the relationship between factor and each item.
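The one-factor structure above can be simulated and recovered numerically. Here is a sketch using scikit-learn's FactorAnalysis; the loadings, noise level and sample size are hypothetical, and this is an illustration of the model rather than the estimation procedure used in the report:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulate four indicators driven by one common factor (hypothetical loadings)
rng = np.random.default_rng(0)
n = 2000
xi = rng.normal(size=n)                   # the unobserved common factor
lam = np.array([0.9, 0.8, 0.7, 0.6])      # lambda_1 .. lambda_4
delta = 0.4 * rng.normal(size=(n, 4))     # unique factors delta_1 .. delta_4
X = xi[:, None] * lam + delta             # chi_j = lambda_j * xi + delta_j

fa = FactorAnalysis(n_components=1).fit(X)
loadings = np.abs(fa.components_.ravel())  # estimated loadings (sign is arbitrary)
uniqueness = fa.noise_variance_            # estimated unique variances
```

With enough observations the estimated loadings land close to the hypothetical lambdas, and the estimated unique variances approximate the variance of the delta terms; this is exactly the "uniqueness" quantity discussed later in section 4.3.2.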
• 36. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 35 Figure 4.2: One-factor EFA model The key assumptions of a general multi-factor model are: • Indicators load on all factors: χ1 = τ1 + λ11ξ1 + λ12ξ2 + . . . + λ1qξq + δ1 • Both the ξ’s and δ’s have mean 0 and variance 1 • δ1, δ2, . . . , δp are uncorrelated with each other • The ξ’s are uncorrelated with the δ’s Thus, what these assumptions imply is analogous to the conditional mean independence assumption in regression models: the χ’s are conditionally independent, given the ξ’s. In other words, the correlations among the observed items are entirely explained by the factors. 4.2.2 Confirmatory Factor Analysis model There are two basic advantages of CFA: i) It allows us to test theories about relationships of indicators to factors by setting certain relationships to zero, or even by setting certain error variances equal to each other. One key issue that it overcomes is that EFA relies on arbitrary guidelines/personal judgement for deciding how many factors should be kept, i.e., for deciding how many dimensions the data represents. Another issue with EFA is that it relies on arbitrary guidelines for interpreting factors, or deciding which relationships are ‘large’ and which are ‘small’. ii) It allows us to test theories about relationships between factors (by estimating values for covariances between factors, or constraining them according to theory). Graphically, the key difference between EFA and CFA is that in EFA, all indicators load onto all factors, with the paths/arrows connecting each latent variable or factor to all items, as seen in Figure 4.3a. To aid
• 37. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 36 interpretation, one can exploit an inherent ambiguity (rotational invariance) in the definition of an EFA model. This implies that we can use an oblique rotation to allow the factors to be correlated. In contrast, a CFA as depicted in Figure 4.3b constrains certain loadings to zero, and the indicators group together to load onto their corresponding latent variables. For most practical purposes, a path diagram with two or more factors is not useful, given that it becomes increasingly complicated to identify the respective paths for each latent variable. Thus, for the subsequent analysis, we restrict our path diagrams to illustrating the CFA results. Figure 4.3: EFA vs CFA (a) EFA (b) CFA 4.3 Assessing validity and reliability of indicators with factor analysis: methodology 4.3.1 Quantifying validity and reliability A direct way of measuring an indicator’s validity would be to compare it with the true value of what it is measuring, or with another measure of the same construct known to be valid. Obviously, if this were possible, there would be no need for such an indicator in the first place. One indirect way to test for indicators’ validity is hence to assess whether indicators that should (by definition or construction) be measuring similar concepts in fact are. Formally, we will assess the validity of indicators by testing for “convergent” and “discriminant” validity. Convergent validity: indicators whose values are expected to be jointly dictated by that of a common underlying concept - or measuring concepts that we expect to be highly correlated in theory - should be highly correlated in practice. For instance, one would theoretically expect IQ
• 38. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 37 and GMAT scores to be highly correlated among individuals, since they are designed to measure a certain understanding of individuals’ “ability”. Discriminant validity: an indicator should not be influenced by underlying concepts other than the one it is supposed to be measuring. If it is, then one can conclude that such an indicator is also picking up information about other underlying constructs, commanding other sets of indicators. This is another way of saying that such an indicator is not measuring the same construct as its co-indicators. There is an important caveat. Convergent and discriminant validity evidence never gives definitive answers. What it does is shed light on the validity question in an indirect way, by looking at whether indicators supposedly picking up information about the same underlying construct actually do. That is, these validity tests compare indicators with one another. Convergent- and discriminant-valid indicators can still suffer from systematic bias: even when indicators measure similar constructs, there is still the possibility that they are all wrong together. Convergent- and discriminant-valid indicators can hence be missing their target in a very specific way: they still all hit the same point. Hence, the intuition behind the relevance of discriminant and convergent validity tests is the following: if a large number of indicators shows strong convergent and discriminant validity, it is harder to think of a story explaining why they are all wrong in the same way than to accept it as evidence that they measure the right concept (unless, for instance, they all suffer from the same methodological bias with the same intensity). Testing for the reliability of indicators revolves around one idea: the variance of an indicator should not be driven too much by that of its idiosyncratic error term εi,t.
Concretely, we will look at the share of the indicator’s variance that is explained by that of its error term. The factor analysis approaches presented above are very helpful tools in assessing both the validity and reliability of indicators. We outline below the methods, tests and metrics that we use. 4.3.2 Exploratory Factor Analysis: getting a first idea of the indicators’ validity and reliability For each dataset, we first perform EFA on the whole set of indicators it contains and/or on subsets of it (when the dataset has many indicators, we adopt a step-by-step approach to flag where problems arise). EFA allows us to identify groups of indicators measuring the same underlying construct, that is, indicators whose variation can be well explained by the variation of a common (but unobserved) “underlying factor” that is derived from the observed indicators. This first stage, as its name suggests, is mostly there to help us get a first idea of the indicators’ validity.
• 39. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 38 Formally, there is evidence for convergent and discriminant validity if all indicators supposedly measuring close or theoretically correlated concepts load highly and simultaneously on the same derived factor, without loading too much on other factors. These loadings are the coefficients of a regression of the type presented in section 4.2. Note that the caveat mentioned in section 4.3.1 is a direct consequence of the factors being derived from the observed indicators: any common systematic bias in the indicators will be transposed to the factors. EFA can also give us some insight into the reliability of the indicators. The “uniqueness” measure associated with each indicator indicates the share of its variation that is not due to that of the derived underlying construct. A high uniqueness suggests a lack of reliability: if the variation of the indicator is largely due to volatile measurement error, then should one replicate the measurement in the exact same setting (same country and year), one would get different results although the underlying construct would stay the same. To formally assess reliability we also use a measure known as “Cronbach’s alpha”, which simply computes the share of the indicators’ variance that is commanded by that of their underlying factor (be it valid or not). A good rule of thumb is to consider any value greater than 0.7 as signalling satisfactory reliability. 4.3.3 Confirmatory Factor Analysis: confirming the factor structure When necessary and applicable, we then perform CFA, presented in section 4.2, in order to confirm the intuition we derive from EFA. Imposing a model on the data makes a lot of sense: indicators are supposedly measuring given underlying constructs.
A model forcing theoretically correlated indicators to be commanded by one underlying factor only should fit the data well, with high correlation between indicators of a same group, and simultaneously strong shared variance between indicators and their attached underlying factor. We hence still look at factor loadings, but get additional metrics about the fit of the imposed model:
• 40. CHAPTER 4. ASSESSING GOVERNANCE INDICATORS 39 Table 4.2: CFA fit indices Index | Optimal value | Type Comparative fit index (CFI) | 1 | Fit index Tucker-Lewis index (TLI) | 1 | Fit index Coefficient of determination (CD) | 1 | Fit index Root mean squared error of approximation (RMSEA) | 0 | Residual index Standardised root mean squared residual (SRMR) | 0 | Residual index Good fit indices and simultaneously high factor loadings are further evidence for convergent validity (the relationships between the imposed factor and the indicators are strong). Discriminant validity is not well examined under CFA, since it does not allow us to look at cross-relationships. However, strong correlations between factors are taken as a signal of a lack of discriminant validity, a good rule of thumb again being the 0.7 threshold. A small SRMR residual index provides some evidence for indicators’ reliability (how much residual variation is not commanded by the common factor). More details about the estimation strategies are found in the appendix, notably on how we deal with error terms.
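The Cronbach's alpha reliability measure introduced in section 4.3.2 can be computed directly from an item-score matrix. A minimal sketch with simulated data; the sample sizes and noise levels are illustrative assumptions:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (observations x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Four hypothetical indicators driven by one underlying factor...
rng = np.random.default_rng(1)
factor = rng.normal(size=300)
related = np.column_stack([factor + 0.5 * rng.normal(size=300) for _ in range(4)])
# ...versus four indicators with no common factor at all
unrelated = rng.normal(size=(300, 4))
```

Applying the 0.7 rule of thumb from section 4.3.2: the `related` items score well above the threshold, while the `unrelated` items score near zero, signalling that averaging them into one index would be unreliable.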
  • 41. Chapter 5 Data and analysis In this chapter, we present the data and highlight the key results of our analysis on public financial management and corruption indicators. A more technical and extensive analysis, with the relevant output tables, is provided in the appendix. 5.1 Public financial management 5.1.1 Open Budget Survey (OBS) Overall, amongst the indicators that we have aggregated, convergent and discriminant validity as well as reliability seem achieved, suggesting the good quality of OBS data as a whole. We also find that dimensions 1) quality of Executive’s Budget Proposal and 2) quality of the Budget Process are highly correlated, though that should not be regarded as a lack of discriminant validity. Open Budget Survey (OBS) is a country-level biennial survey which covers about 100 countries in the world. It assesses the public availability of budget information and other budgeting practices that contribute to an accountable and responsive public finance system in countries around the world. It aims to measure how governments around the world are managing public finances on the following three aspects: • Budget transparency: measured in terms of amount, level of detail and timeliness of budget information made publicly available for the citizens by the government. This culminates in a score calculated for each country called “Open Budget Index” (OBI). • Budget participation: this pertains to the opportunities provided by the governments to the civil society to engage and participate in decisions about how public resources are raised and spent. 40
• 42. CHAPTER 5. DATA AND ANALYSIS 41 • Budget oversight: measures the capacity and authority of formal institutions like legislatures and supreme audit institutions to oversee the government’s budget process. The OBS assesses the contents and timely release of eight key budget documents that all countries should issue at different points in the budget process, according to generally accepted best practice criteria for PFM. These criteria are drawn from internationally accepted public financial management practices such as the IMF’s Code of Good Practices on Fiscal Transparency, the OECD’s Best Practices for Fiscal Transparency, and the International Organization of Supreme Audit Institutions’ (INTOSAI) Lima Declaration of Guidelines on Auditing Precepts. The universal applicability of the above listed PFM guidelines to different budget systems around the world ensures comparability amongst countries of different income levels. The majority of the questions in the survey seek to measure what actually happens in practice (de facto), rather than what is required by law (de jure). Data The OBS questionnaire consists of 125 factual questions and is completed by a researcher or a group of researchers within an organisation in the country. The researchers responding to the questionnaire are part of either academic institutions or civil society organisations with a significant focus on budget issues. The researchers can respond to questions by choosing from five standard responses (a, b, c, d or not applicable), and are required to provide adequate evidence for their responses by augmenting their answers with comments, clarifications, and links to relevant documents. Once the responses from the researchers are received, they are coded using a conversion scale (a=100, b=67, c=33, d=0).
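The conversion scale and simple averaging can be expressed in a few lines. One assumption of ours in this sketch: "not applicable" answers are simply excluded from the denominator, which is not spelled out in the text above:

```python
# OBS conversion scale: a=100, b=67, c=33, d=0. We assume (our reading, not
# stated explicitly in the report) that "not applicable" answers are dropped.
SCALE = {"a": 100, "b": 67, "c": 33, "d": 0}

def obi_score(responses):
    """Simple average of coded scores over the applicable questions."""
    coded = [SCALE[r] for r in responses if r in SCALE]
    return sum(coded) / len(coded)
```

For example, `obi_score(["a", "b", "d", "not applicable"])` averages 100, 67 and 0 over three applicable questions.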
The 125 questions seek to measure: • The quality of the executive’s budget proposal, that is 1) the quality of budget estimates for the current year and beyond as well as 2) for preceding years, 3) the quality of necessary complementary information (e.g. extra-budgetary funds, transfers, public financial activity), and 4) the quality of the budget’s narrative relating expenditures to policy goals and programmes, and of the monitoring of such expenditure programmes. • The quality of the budget process, that is 1) the quality of its formulation phase (consultations, timeliness, pre-budget reports), 2) the quality of its execution phase with the publication of frequent implementation and progress reports and 3) the quality of the post-fiscal year auditing phase. • The strength of the legislature that is, its de jure and de facto involvement and power in 1) the budget design and debate, 2) the monitoring of non-enacted contingent revenues, expenditures and transfers, and 3) the monitoring of the audit phase.
  • 43. CHAPTER 5. DATA AND ANALYSIS 42 • The intensity of the executive’s budget’s public accountability that is 1) the quality of the “citizens’ budget”, i.e. if the “citizens’ budget” is an accessible, interactive and transparent communication of the executive’s budget, 2) the level of de jure and de facto engagement of the executive with the public in formulating and executing the budget, 3) the level of engagement of the legislative body with the public when voting the budget, and 4) the level of engagement of the audit institution with the public during the audit phase. Out of 125 questions, 95 are used to compute the OBI which is a simple average of the scores on each of the 95 questions. It is important to note that though they are treated by OBS as continuous, these variables are ordinal in nature. However, since we are analysing OBS’s usage of these variables, it is a sensible approach to assess their quality on continuous grounds. Assessing the validity and reliability of the OBI We do not compare the OBS data with external data, as there is no other dataset that could be used as a valid comparison. Rather, we perform within-dataset statistical analysis to know whether indicators that are grouped in a same category, hence supposed to measure different facets of a same underlying trait, actually do so. We use EFA and CFA to assess the validity and reliability of the OBI. We consider the 95 variables (questions) that are used in the 2012 OBI. They try to capture information on three of the four above-mentioned dimensions: 1) quality of the executive’s budget proposal 2) quality of the budget process and 3) quality of the citizens’ budget. To make the interpretation of the statistical analysis easier, these 95 variables are collapsed into 11 intermediate aggregate indicators along the aforementioned subcategories (4 for quality of Executive’s Budget Proposal, 3 for quality of Budget Process, and 4 for intensity of Public Engagement) as shown in Figure 5.1. 
We proceed methodically: we first perform three pairwise analyses, each with only two of the three categories of indicators (1-2, 1-3 and 2-3), to identify where potential problems arise, and then perform an analysis with all indicators together.
Figure 5.1: OBS's 11 intermediate indicators

Quality of the budget proposal: budget narrative / link to policy; additional information; estimates for prior years; estimates for current year and beyond.
Quality of the budget process: budget formulation phase; budget execution; end-of-year budget audit.
Quality of the citizens' budget: dissemination to the public; consultation with the public; frequency of publication; details of the citizens' budget.

Source: Authors' own representation.

Analysis 1: Budget Proposal and Budget Process

Figure 5.2 shows a visualisation of the first EFA, pooling the Budget Proposal and Budget Process indicators. A two-factor model is what the OBS's methodology would suggest, and it is supported by standard statistical factor selection criteria1. The transparent coloured polygons reflect the intensity of the linear relationships (from 0 to 1) between indicators and underlying factors. For instance, a doubling of the blue factor labelled "quality of the budget proposal" would inflate the indicator "quality of the budget estimates for previous years" by a factor of 1.95, and the indicator "quality of the budget estimates for the current year and beyond" by a factor of 1.68. It is useful to remind the reader that the underlying factors do not have a life of their own: they are directly extracted from the common variation found in the observed indicators. We also report the associated EFA statistics in Table 5.1. For all subsequent EFAs we only report the visualisation, and invite the reader to refer to the appendix for the tables.

1 Keeping factors with eigenvalue > 1, or whose eigenvalue is situated before the first major drop (elbow criterion).
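The two retention criteria in the footnote can be sketched directly on a list of eigenvalues. The eigenvalues below are illustrative, not those of the OBS data, and the `elbow_rule` helper is one simple operationalisation of the elbow criterion (retain the factors appearing before the single largest drop).

```python
# Sketches of the two factor-retention rules cited in the footnote.

def kaiser_rule(eigenvalues):
    """Kaiser criterion: retain factors with eigenvalue > 1."""
    return sum(1 for ev in eigenvalues if ev > 1)

def elbow_rule(eigenvalues):
    """Elbow criterion: retain the factors before the largest drop."""
    drops = [eigenvalues[i] - eigenvalues[i + 1]
             for i in range(len(eigenvalues) - 1)]
    return drops.index(max(drops)) + 1

eigs = [3.0, 2.6, 0.7, 0.5, 0.2]   # hypothetical scree, sorted descending
print(kaiser_rule(eigs))            # → 2
print(elbow_rule(eigs))             # → 2 (largest drop after the 2nd value)
```

When the two rules disagree, common practice is to inspect the scree plot and the interpretability of the rotated solutions before settling on a factor count.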
Figure 5.2: EFA visualisation, Budget Process and Budget Proposal

[Chart of the loadings of the seven indicators on the "budget proposal" and "budget process" factors.]

Source: OBS and authors' own representation.

Table 5.1: EFA, Budget Process and Budget Proposal

                                    Factor1    Factor2    Uniqueness
Proposal: estimates, beyond         .83847                .1297474
Proposal: estimates, previous       .9774664              .1872742
Proposal: additional information    .4942621   .4692496   .203666
Proposal: budget narrative          .5644614   .3030345   .3447982
Process: budget formulation                    .6420719   .5842888
Process: budget execution                      .8865702   .2663301
Process: budget audit                          .6716625   .3802079

Observations: 338
Blanks: loadings smaller than 0.3 in absolute terms
Source: OBS 2006-2008-2010-2012
The way to interpret Figure 5.2 is the following. There exist two underlying (derived) variables, or constructs, that command the values taken by the considered indicators in a very specific way: indicators measuring concepts relating to the "quality of the budget proposal" are simultaneously and mostly commanded by one underlying factor, and a similar conclusion can be drawn from observing the "quality of the budget process" indicators. This points towards rather strong evidence for both discriminant and convergent validity as defined in section 4.3: indicators grouped together a priori do load on the same factor, and cross-loadings are satisfactorily small. The reader should bear in mind the caveat previously mentioned: this is merely evidence of some indicators being influenced by the same underlying concept, which is not guaranteed to represent the true concept of interest. However, the more discriminant and convergent validity evidence one gathers, the more credible it becomes that the indicators are measuring what they ought to.

Although the identified underlying constructs seem to command their attached indicators, both EFA and CFA show that the constructs "quality of the budget proposal" and "quality of the budget process" are highly correlated. However, it is reasonable to argue that the quality of the budget proposal and that of its execution and audit phases are in practice highly correlated. Moreover, the EFA visualisation offers additional comfort with regard to discriminant validity: small cross-loadings notably indicate that the two underlying factors are two different constructs.

Regarding reliability, the low uniquenesses we observe in Table 5.1 are a first piece of evidence for satisfactory reliability of the indicators. More formally, we compute Cronbach's alpha coefficients.
We obtain 0.9154 for the quality of executive budget indicators and 0.8014 for the quality of the budget process indicators, suggesting high reliability.

We then impose a CFA model on the data. We notably let the random error components of the "quality of the estimates for the current year and beyond" and "quality of the estimates for previous years" indicators be correlated2. Figure 5.3 represents the CFA diagram with standardised coefficients, and Table 5.2 reports the associated statistics. For all subsequent CFAs we only report the diagram and invite the reader to refer to the appendix for the tables.

Figure 5.3: CFA diagram, Budget Process and Budget Proposal

Source: OBS and authors' own representation.

2 We do so after looking at the modification indices of a first model, and in line with the OBS's methodology/questionnaire. More details in the appendix.
Table 5.2: CFA, Budget Process and Budget Proposal

Proposal: estimates, beyond         PROPOSAL   27.17***  (1.583)
Proposal: estimates, previous       PROPOSAL   27.89***  (1.839)
Proposal: additional information    PROPOSAL   22.29***  (1.809)
Proposal: budget narrative          PROPOSAL   23.32***  (2.070)
Process: budget formulation         PROCESS    19.77***  (1.944)
Process: budget execution           PROCESS    21.06***  (1.950)
Process: budget audit               PROCESS    25.27***  (1.965)

Observations: 338
Standard errors in parentheses
Source: OBS 2006-2008-2010-2012
* p<0.1, ** p<0.05, *** p<0.01

The model shows strong adequacy with the data, with simultaneously strong (standardised) relationships between factors and indicators. This, together with the levels of the goodness-of-fit statistics (desirable levels in parentheses follow Hu and Bentler (1999) guidelines), constitutes evidence for both convergent and discriminant validity, although the RMSEA sits somewhat above its desirable level.

Table 5.3: CFA fit indices, Budget Process and Budget Proposal

Comparative fit index (CFI)                       0.978  (> 0.95)
Tucker-Lewis index (TLI)                          0.962  (> 0.95)
Coefficient of determination (CD)                 0.967  (> 0.95)
Standardised root mean squared residual (SRMR)    0.025  (< 0.08)
Root mean squared error of approximation (RMSEA)  0.096  (< 0.06)

Finally, the low SRMR residual index of the CFA model is further evidence in favour of the indicators' reliability.
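The cutoff comparison against the Hu and Bentler (1999) guidelines can be sketched mechanically. The `check_fit` helper and its dictionary of cutoffs are purely illustrative; the cutoff values are the conventional ones (CFI and TLI at least 0.95, SRMR at most 0.08, RMSEA at most 0.06), applied here to the Table 5.3 values.

```python
# Hypothetical helper comparing fit statistics to Hu & Bentler (1999)
# style cutoffs. Each entry: (cutoff, direction of acceptable values).
CUTOFFS = {
    "CFI": (0.95, ">="),
    "TLI": (0.95, ">="),
    "SRMR": (0.08, "<="),
    "RMSEA": (0.06, "<="),
}

def check_fit(stats):
    """Return {index: True/False} for whether each index meets its cutoff."""
    out = {}
    for name, value in stats.items():
        cutoff, direction = CUTOFFS[name]
        out[name] = value >= cutoff if direction == ">=" else value <= cutoff
    return out

# Values from Table 5.3: RMSEA (0.096) misses its cutoff, the rest pass.
print(check_fit({"CFI": 0.978, "TLI": 0.962, "SRMR": 0.025, "RMSEA": 0.096}))
```

This is why fit is usually judged on several indices jointly rather than on any single one.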
Analysis 2: Budget Proposal and Citizens' Budget

Similarly, the second pairwise analysis shows strong signs of validity, with both EFA and CFA models yielding a clear picture. Figure 5.4 shows very clear convergent and discriminant validity; its interpretation is similar to that of Figure 5.2.

Figure 5.4: EFA visualisation, Budget Proposal and Citizens' Budget

[Chart of the loadings of the budget proposal and citizens' budget indicators on their two factors.]

Source: OBS and authors' own representation.

The low uniquenesses we observe after running the EFA3 and the Cronbach's alphas we compute (0.8563 for the variables pertaining to the "citizens' budget" category and, as previously, 0.9154 for the "quality of executive budget" indicators) suggest the high reliability of the indicators. CFA (Figure 5.5) confirms the fit of a two-factor model to the data, given the strong relationships between indicators and their underlying factors, as well as the good absolute fit indices.

3 See Table 6.4 in the appendix.
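The Cronbach's alpha coefficients reported in this section can be computed from raw indicator scores with a few lines of standard-library Python. The formula is the usual one (alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)); the respondent data below are invented for illustration.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k per-indicator score lists, all covering the same
    respondents in the same order.
    """
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]      # scale score per respondent
    item_var = sum(pvariance(col) for col in items)   # sum of item variances
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Toy data: three indicators observed for four respondents.
item1 = [2, 4, 3, 5]
item2 = [1, 4, 3, 5]
item3 = [2, 5, 4, 5]
print(round(cronbach_alpha([item1, item2, item3]), 3))   # → 0.978
```

High alpha values like the 0.9154 and 0.8014 above indicate that the items in each category vary together, which is what reliability (internal consistency) requires.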
Figure 5.5: CFA diagram, Budget Proposal and Citizens' Budget

Source: OBS and authors' own representation.

Table 5.4: CFA fit indices, Budget Proposal and Citizens' Budget

Comparative fit index (CFI)                       0.977
Tucker-Lewis index (TLI)                          0.965
Coefficient of determination (CD)                 0.995
Standardised root mean squared residual (SRMR)    0.031
Root mean squared error of approximation (RMSEA)  0.097

As before, the low SRMR index (0.031) suggests that the indicators are highly reliable.
Analysis 3: Budget Process and Citizens' Budget

As above, we can be confident about the discriminant and convergent validity of the two sets of indicators. EFA (Figure 5.6) shows clear relationships between the underlying constructs and their attached indicators.

Figure 5.6: EFA visualisation, Budget Process and Citizens' Budget

[Chart of the loadings of the budget process and citizens' budget indicators on their two factors.]

Source: OBS and authors' own representation.

We again observe satisfactorily small EFA uniquenesses4, and the Cronbach's alphas do not change for the two sets of indicators: this is further evidence for the reliability of the considered indicators. In turn, CFA again shows strong loadings of indicators on their attached factors, and good fit indices, suggesting the indicators' validity.

4 See Table 6.6 in the appendix.
Figure 5.7: CFA diagram, Budget Process and Citizens' Budget

Source: OBS and authors' own representation.

Table 5.5: CFA fit indices, Budget Process and Citizens' Budget

Comparative fit index (CFI)                       0.990
Tucker-Lewis index (TLI)                          0.984
Coefficient of determination (CD)                 0.990
Standardised root mean squared residual (SRMR)    0.022
Root mean squared error of approximation (RMSEA)  0.061

Again, the small SRMR residual index (0.022) points towards the indicators' reliability.
Analysis 4: All dimensions

We now turn to an all-factor analysis. The evidence so far points towards strong validity and reliability of the OBS indicators, and we want to give a final picture by pooling all indicators. EFA (Figure 5.8) gives an unclear picture, so we turn to the CFA.

Figure 5.8: EFA visualisation, all dimensions

[Chart of the loadings of all 11 indicators on the "budget proposal", "budget process" and "citizens' budget" factors.]

Source: OBS and authors' own representation.

As expected, the relationships between the identified factors and their attached indicators are strong, with good fit indices, small residuals and simultaneously high factor loadings (Figure 5.9), suggesting both validity and reliability5. As before, we see a very high correlation between the factors commanding the quality of the budget proposal and the quality of the budget process. However, this is not surprising, as it is quite likely that in practice these two dimensions are highly correlated, and we do not see it as evidence of a lack of discriminant validity.

5 See also Table 6.9 in the appendix.
Figure 5.9: CFA diagram, all dimensions

Source: OBS and authors' own representation.

Table 5.6: CFA fit indices, all dimensions

Comparative fit index (CFI)                       0.967
Tucker-Lewis index (TLI)                          0.954
Coefficient of determination (CD)                 0.997
Standardised root mean squared residual (SRMR)    0.036
Root mean squared error of approximation (RMSEA)  0.088

We hence conclude that the 11 OBS sub-indicators we retain measure three distinct concepts. The OBS data in general show strong signs of quality: we have gathered convincing evidence for the validity and reliability of the OBS's indicators.
5.1.2 Public Expenditure and Financial Accountability (PEFA)

Overall, the indicators of the PEFA dataset seem to be measuring the same underlying factor, with a complete lack of convergent and discriminant validity. These conclusions are in a similar vein to Langbein & Knack (2010), who use EFA and CFA and find that the World Bank governance indicators measure the same concept rather than distinct concepts of governance. PEFA indicators seem to be measuring the same underlying concept of public financial management rather than the distinct concepts of credibility of the budget, comprehensiveness & transparency, predictability & control in budget execution, accounting/reporting, and external scrutiny.

The PEFA programme was founded in 2001 as a multi-donor partnership between several donor agencies and international financial institutions to assess the condition of country public expenditure, procurement and financial accountability systems, and to develop a practical sequence for reform and capacity-building actions. One of the key activities of PEFA is thus the development and maintenance of the PFM Performance Measurement Framework (PEFA Framework), a contribution to the collective efforts of many stakeholders to assess whether a country has the tools to deliver three main budgetary outcomes: i) aggregate fiscal discipline, ii) strategic resource allocation, and iii) efficient use of resources for service delivery. The Performance Measurement Framework includes a set of 28 high-level indicators, which measure and monitor the performance of PFM systems, processes and institutions, and a PFM Performance Report (PFM-PR) that provides a framework to report on PFM performance as measured by the indicators.
Importantly, the focus of the PFM performance indicator set is public financial management at the central government level, including the related institutions of oversight, and in particular the revenues and expenditures undertaken through the central government budget. The set of high-level indicators captures core components of PFM that are widely acknowledged as being essential for all countries to achieve sound public financial management. By comparing scores over time, the PMF and its scored indicators can act as a basis for monitoring the results of public financial management reform efforts (either through repeated assessments and/or by building PEFA indicators into a country's own Monitoring & Evaluation mechanism). One caveat, however, is that the indicators only measure operational performance, rather than the inputs that enable the PFM system to reach a certain level of performance.

The Performance Measurement Framework identifies the critical dimensions of performance of an open and orderly PFM system as follows (PEFA PMF, 2011):

1. Credibility of the budget - The budget is realistic and is implemented as intended.

2. Comprehensiveness and transparency - The budget and the fiscal risk oversight are comprehensive, and fiscal and budget information is accessible to the public.

3. Policy-based budgeting - The budget is prepared with due regard to government policy.
4. Predictability and control in budget execution - The budget is implemented in an orderly and predictable manner, and there are arrangements for the exercise of control and stewardship in the use of public funds.

5. Accounting, recording and reporting - Adequate records and information are produced, maintained and disseminated to meet decision-making, control, management and reporting purposes.

6. External scrutiny and audit - Arrangements for the scrutiny of public finances and follow-up by the executive are operating.

These six dimensions and 28 indicators are further collapsed by PEFA into three broad categories (the complete list of the 28 indicators is given in Figure 5.11):

1. PFM system out-turns: these capture the immediate results of the PFM system in terms of actual expenditures and revenues, by comparing them to the original approved budget, as well as the level of, and changes in, expenditure arrears.

2. Cross-cutting features of the PFM system: these capture the comprehensiveness and transparency of the PFM system across the whole of the budget cycle.

3. Budget cycle: these capture the performance of the key systems, processes and institutions within the budget cycle of the central government.

Figure 5.10 illustrates the structure and coverage of the PFM system measured by the set of high-level indicators, and the links with the six core dimensions of a PFM system:

Figure 5.10: Structure and coverage of the PFM system

Source: PEFA PMF, 2011
Data

The PEFA data contain the most recent status of a national or sub-national PEFA assessment in any given country as of December 23, 2014. The data are updated on a six-monthly basis, whereby PEFA partners and other agencies that lead PEFA assessments are contacted about the status of their assessments. The PEFA Secretariat collects and verifies this information before updating the assessment portal. PFM assessment reports included in the assessment portal have scored at least two-thirds of the PEFA indicator set. The data are available to the public and were downloaded from https://www.pefa.org/en/dashboard.

Scoring methodology and aggregation

'Each indicator seeks to measure performance of a key PFM element against a four point ordinal scale from A to D. Guidance has been developed on what performance would meet each score, for each of the indicators. The highest score is warranted for an individual indicator if the core PFM element meets the relevant objective in a complete, orderly, accurate, timely and coordinated way. The set of high-level indicators is therefore focusing on the basic qualities of a PFM system, based on existing good international practices, rather than setting a standard based on the latest innovation in PFM.' (PEFA PMF, 2011)

Since many of the indicators have sub-dimensions, the aggregate indicators have been constructed by PEFA by taking the lowest score and adding a '+' to it if one of the sub-dimensions has a higher score. Thus, if an indicator has, for each of say three sub-dimensions, scores of C, C and B respectively, the aggregate score would be C+.
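The "weakest link" aggregation rule just described can be sketched as follows; the `aggregate_pefa` helper is an illustrative implementation of the rule as stated in the text, not PEFA's own code.

```python
# Sketch of PEFA's aggregation rule: an indicator's overall score is its
# weakest sub-dimension score, with a '+' appended when at least one
# sub-dimension scores higher.

ORDER = ["D", "C", "B", "A"]  # worst to best

def aggregate_pefa(sub_scores):
    worst = min(sub_scores, key=ORDER.index)
    plus = "+" if any(s != worst for s in sub_scores) else ""
    return worst + plus

print(aggregate_pefa(["C", "C", "B"]))   # → C+ (the example given in the text)
print(aggregate_pefa(["B", "B", "B"]))   # → B
```

Note the design choice this rule encodes: an indicator cannot score well overall unless every sub-dimension performs, so a single weak sub-dimension dominates the aggregate.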
In order to run factor analysis on the aggregated dataset (as is apparent from the methodology section, factor analysis would not be useful on the sub-dimensions, because they would clearly be measuring similar concepts), we re-coded the scores for the aggregated indicators from 1 to 7 for the seven possible scores (A, B+, B, C+, C, D+, D), with 1 corresponding to A, 2 corresponding to B+, and so on6. Although the data are ordinal and not interval, this is a reasonable continuous structure to impose, and it allows us to carry out an analysis in the spirit of that of the OBS. Given the nature of the results, we believe that they are robust to different kinds of analysis, e.g. latent class modelling.

6 This is notably in line with a 2009 ODI study on cross-country PFM performance: Taking Stock: What do PEFA Assessments tell us about PFM systems across countries?
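The re-coding step described above amounts to a fixed mapping from the seven possible aggregate scores to the integers 1-7. A minimal sketch (the `RECODE` dictionary name and the sample scores are ours):

```python
# Re-coding of PEFA aggregate scores onto a 1-7 scale (1 = best), so that
# the ordinal letter grades can be treated as approximately continuous.
RECODE = {s: i for i, s in enumerate(["A", "B+", "B", "C+", "C", "D+", "D"],
                                     start=1)}

raw = ["A", "C+", "D", "B+"]          # illustrative indicator scores
print([RECODE[s] for s in raw])        # → [1, 4, 7, 2]
```

Treating these integers as interval-scaled assumes equal distances between adjacent grades (e.g. between A and B+ and between D+ and D), which is exactly the caveat the text acknowledges.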
Figure 5.11: PEFA indicators list

Source: PEFA Performance Measurement Framework, Revised 2011.

Analysis 1: A. Credibility of the Budget and B. Comprehensiveness & Transparency

We begin by analysing two sets of aggregate indicators7 that supposedly measure the "credibility of the budget" and its "comprehensiveness & transparency". A two-factor model is what PEFA's methodology would suggest, and it is supported by standard statistical factor selection criteria.

7 Throughout the analysis, we refer to the indicators by their variable names rather than the full indicator descriptions, which are provided in Figure 5.11.
Using a two-factor EFA8, we find that the uniqueness of most indicators in the set analysed (PI 1 - PI 10) is quite high. Equivalently, the communalities, i.e. the sums of squared factor loadings (1 - unique variance), are rather low, indicating that the model does not fit the data well; this is a first piece of evidence against the indicators' reliability. We can tell from a simple EFA, as indicated in Table 6.10, that a two-factor model is going to be a bad fit. Given that indicators PI 7 - PI 10 load onto Factor 1, we would expect indicators PI 1 - PI 4 to load onto Factor 2 from PEFA's analytical categorisation. As is evident, there is instead a clear lack of convergent validity: indicators PI 2 and PI 4 from 'Credibility of the Budget' load onto Factor 1, while indicators PI 5 and PI 6 from 'Comprehensiveness & Transparency' load onto Factor 2. Thus, cross-category indicators seem to be loading onto the same factor. Figure 5.12 illustrates the cross-loadings, with PI 5 and PI 6 grouping together with PI 1 and PI 3 along Factor 2.

Figure 5.12: EFA visualisation, Credibility and Comprehensiveness & Transparency

[Chart of the loadings of indicators PI 1 - PI 10 on Factor 1 and Factor 2.]

Source: PEFA and authors' own representation.

8 The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is 0.7124, suggesting an EFA is appropriate, as the indicators within the dataset have enough in common (Kaiser, 1974).
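The communality/uniqueness relationship used in this paragraph can be made concrete with a couple of lines; the loadings below are illustrative, not the PEFA estimates.

```python
# Communality = sum of squared factor loadings for one indicator;
# uniqueness = 1 - communality (under an orthogonal factor solution).

def communality(loadings):
    return sum(l * l for l in loadings)

loads = [0.72, 0.15]       # hypothetical loadings of one indicator on two factors
h2 = communality(loads)
print(round(h2, 4), round(1 - h2, 4))   # communality, then uniqueness
```

A high uniqueness, as observed for most PEFA indicators here, means the factors explain little of that indicator's variance, which is why it counts as evidence against reliability.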