Rick On the Road

Saturday, September 01, 2007

Checklists as mini theories-of-change

During a recent evaluation of a UNICEF assisted health program in Indonesia I was given a copy of a checklist that had been designed for use in assessing the functioning of sub-district health centres in South Sulawesi. You can get a general idea of its structure from the image below. There is a list of attributes of “high performing” health centres down the left, grouped into categories, sub-categories and sub-sub-categories. Down the right side are columns, one for each health centre. Ticks are placed in each row of a column to indicate if the attribute in that row was found in that health centre. I think it is intended that if all the attributes are ticked then the health centre will be deemed to have “graduated” and no longer need to be given development assistance.

While this format has the important virtue of simplicity, it makes two assumptions that may be useful to question. The first is that all the listed attributes are essential. This assumes that there is a consensus on what constitutes a “good” health centre. In practice, however, developing that consensus may be an important part of the process of developing a “good” health centre. That consensus is needed not only internally, amongst the staff of the health centre, but also externally, amongst the other organisations that the health centre has to work with (e.g. the district hospital, village health posts, and the district health office).

The second assumption is that all the attributes are of equal importance. This seems unlikely. For example, it would be widely agreed that having a supply of oxytocin (attribute 2.4.a) is much more important than “Mother's day celebration implemented each year at sub-district level” (attribute 4.2.b). Attempts to develop the capacity of the health centre will need to be guided by a clear sense of priorities, about what attributes are more important than others. The choice between organising a mothers’ day event and ensuring a supply of oxytocin could be a matter of life or death.

These two “problems” could be seen as opportunities. Attributes on the list could be weighted by asking selected stakeholders to rank the attributes in terms of their relative importance (by allocating points adding to 100 points). If there are a large number of attributes, the ranking could start with the major categories, then sub-categories, then attributes within them. Importance could be defined as how much they are likely to contribute to improved usage of quality services that will affect people’s health outcomes. The first set of stakeholders could be internal to the health service, and later on external stakeholders could be consulted. Attributes that were given widely different rankings would then be the focus of discussion as to why views varied so much. The assumption here is that this may lead to some convergence of views on priorities. It could also be relevant to staff training agendas. During the evaluation referred to above, we found that comparing different stakeholders’ rankings of the effectiveness of a number of (other) project activities generated a constructive discussion that increased both stakeholder groups’ understanding of each other, and of the issues involved.
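
The weighting step could be sketched in a few lines of Python. This is only an illustration: the stakeholder names, attribute labels and point allocations below are invented, and using the standard deviation as the measure of "widely different" views is my assumption, not part of the checklist.

```python
from statistics import mean, pstdev

# Hypothetical allocations of 100 points by three stakeholders.
# All names and numbers are invented for illustration only.
allocations = {
    "midwife": {"oxytocin supply": 60, "staff trained": 30, "mothers' day event": 10},
    "district office": {"oxytocin supply": 50, "staff trained": 20, "mothers' day event": 30},
    "village health post": {"oxytocin supply": 65, "staff trained": 30, "mothers' day event": 5},
}


def summarise(allocations):
    """Return {attribute: (mean points, spread of points)} across stakeholders."""
    for name, points in allocations.items():
        if sum(points.values()) != 100:
            raise ValueError(f"{name}'s points do not add to 100")
    attributes = list(next(iter(allocations.values())))
    summary = {}
    for attr in attributes:
        values = [points[attr] for points in allocations.values()]
        # Mean = a candidate weight; population std. dev. = level of disagreement.
        summary[attr] = (mean(values), pstdev(values))
    return summary


summary = summarise(allocations)
# The attribute with the widest spread of views is a candidate for discussion.
most_contested = max(summary, key=lambda attr: summary[attr][1])
print(most_contested)
```

With a long checklist the same function could be applied first to the major categories, and then to the attributes within each category.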

Even when agreement is reached about appropriate weightings, a question might be raised about whether this will necessarily lead to expected outcomes, such as how women are using the health service or their behaviour after visiting the health centre. It would therefore be useful to compare the scores of different health centres and how they relate to the outcomes observed by those health centres. How well do these scores predict these outcomes? If they do not, the scores could be re-calculated on the basis of a different set of weightings, to see if emphasising other attributes produced a better fit between health centre scores and observed outcomes of concern. If so, that would suggest the need for a re-orientation of priorities within the health centre. A given set of weightings is in effect the theory-of-change, and the score it generates can be treated as a prediction of an expected outcome. A series of predictions (scores from different health centres) would be needed to see how well the theory fits reality (outcomes observed by those health centres).
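
As a rough sketch of that test, the Python below scores four health centres under two alternative weightings and checks how well each set of scores lines up with an observed outcome. Everything here is hypothetical: the centres, ticks, outcome figures and weightings are invented, and the Pearson correlation is just one possible measure of "fit".

```python
from math import sqrt

# Hypothetical checklist results: 1 = attribute ticked, 0 = not ticked.
checklists = {
    "centre A": {"oxytocin supply": 1, "staff trained": 1, "mothers' day event": 0},
    "centre B": {"oxytocin supply": 0, "staff trained": 1, "mothers' day event": 1},
    "centre C": {"oxytocin supply": 1, "staff trained": 0, "mothers' day event": 1},
    "centre D": {"oxytocin supply": 0, "staff trained": 0, "mothers' day event": 1},
}
# Hypothetical outcome of concern for each centre (e.g. a service usage rate, in %).
outcomes = {"centre A": 80, "centre B": 50, "centre C": 75, "centre D": 40}

# Two competing "theories-of-change", each expressed as a set of weights.
theory_1 = {"oxytocin supply": 60, "staff trained": 30, "mothers' day event": 10}
theory_2 = {"oxytocin supply": 10, "staff trained": 30, "mothers' day event": 60}


def score(ticks, weights):
    """Weighted checklist score: the sum of the weights of all ticked attributes."""
    return sum(weights[attr] * ticks[attr] for attr in weights)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / sqrt(var_x * var_y)


def fit(weights):
    """How well do the scores produced by this weighting predict the outcomes?"""
    centres = sorted(checklists)
    scores = [score(checklists[c], weights) for c in centres]
    return pearson(scores, [outcomes[c] for c in centres])


print(round(fit(theory_1), 2), round(fit(theory_2), 2))
```

In this invented data the first weighting fits the outcomes far better than the second. With only a handful of health centres any such comparison would be indicative at best.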

Incidentally: A target score on a checklist could be inserted as a single indicator in a Logical Framework, allowing a simple reference to be made to the measurement of a complex outcome. The wider use of checklist scores might help limit the use of overly simplistic indicators of progress, as seen in many Logical Frameworks.

PS: This discussion is not a criticism of the checklist as currently in use. It is an outline of what I think is some of its untapped potential.

Sunday, May 27, 2007

Evolving storylines: A participatory design process?

Some years ago...

More than a decade ago, while beginning my PhD, I experimented with the design of a process for evolving stories, through a structured participatory process. The thought was that this could lead to the development of better project designs. A project design should include a theory-of-change, and a theory-of-change when spelled out in detail can be seen as a story. But there could be many different versions of that story, some better than others. If so, then how to discover them?

One possibility was to make use of a Darwinian evolutionary process to search for solutions that have the best fit with their environment. The core of the evolutionary process is the evolutionary algorithm: the re-iteration of processes of variation, selection and retention. The intention was to design a social process that embodied these features. A similar process was later built in as a core feature of the Most Significant Change (MSC) technique.

I tested the idea out, in a simple and light hearted way, by involving a classroom of secondary students taught by a friend of mine. The environment in which stories would have to develop and survive was that classroom, with its own culture and history. More serious applications could involve the staff of an organisation, and the environment within and around that organisation.

The process:

I gave ten of the students some small filing cards, and asked them each to write the beginning of a story on their card, about a student who left school at the end of the year. When completed, these ten cards were then posted, as a column of cards, on the left side of the blackboard, in front of the class. This provided some initial variation.
I then asked the same students to read all ten cards on the board, and for each of them to identify the story beginning they most liked. This involved selection.
The students were then asked to each use a second card to write a continuation of the one story beginning they most liked. These story segments were then posted next to the story beginning they continued. As a result, some story beginnings gained multiple new segments, others none. This step involved retention of the selected story beginnings, and introduction of further variation.
The students were then asked to look at all the stories again, now they had been extended. I then asked them to write a third generation story segment, which they were to add to the emerging storyline they most liked so far. This process was re-iterated for four generations, until we ran out of class time. A graphic view of the results is shown below (the other result being the text of the stories themselves).
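
The steps above can be mimicked in a short Python simulation. This is only a sketch under stated assumptions: `random.choice` stands in for each participant's real preference, each new segment is attached to a segment from the immediately preceding generation, and the `Segment` class and `evolve` function are names I have made up for the illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    id: str                # label such as "1.11"
    author: int            # index of the participant who wrote it
    generation: int        # 1 = story beginning
    parent: Optional[str]  # id of the segment this one continues


def evolve(n_participants=10, n_generations=4, seed=0):
    rng = random.Random(seed)
    # Variation: every participant writes one story beginning.
    segments = [Segment(f"1.{a + 1}", a, 1, None) for a in range(n_participants)]
    previous = list(segments)
    for generation in range(2, n_generations + 1):
        current = []
        for author in range(n_participants):
            # Selection: choose one segment from the previous generation.
            # A random pick is a placeholder for the participant's judgement.
            chosen = rng.choice(previous)
            # Retention of the chosen storyline, plus new variation added to it.
            new_id = f"1.{len(segments) + len(current) + 1}"
            current.append(Segment(new_id, author, generation, chosen.id))
        segments.extend(current)
        previous = current
    return segments


segments = evolve()
continued = {s.parent for s in segments if s.generation == 2}
print(f"{len(continued)} of 10 story beginnings were continued")
```

Segments that nobody chooses simply stop growing, which is how storylines die out.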


Each story segment is represented by one node (circle). Lines connecting the nodes show which story segment was added to which, forming storylines. In the diagram above the storylines start from the centre and grow outwards. The color of each node represents the identity of the student who wrote that story segment. The size of each node varies according to how many "descendants" it had: how many other story segments were added to it later on. The four concentric circles in the background represent the four generations of the process. PS: Each story segment was only one to three sentences long.

The results:

In evolutionary theory success is defined in minimalist terms, as survival and proliferation. In this exercise three of the initial stories did not survive beyond the first generation (1.7, 1.8, and 1.9). Five others did survive until the fourth generation. Of these, two were the most prolific (1.6 and 1.10), each of which had three descendants by the fourth generation.

Amongst the surviving storylines some were more collective constructions than others. Storylines 1.3 to 1.34, 1.10 to 1.39 and 1.10 to 1.38 had four different contributors (the maximum possible), whereas storylines 1.6 to 1.37 and 1.10 to 1.40 only had two.
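
These measures (descendants, survival, number of contributors) are easy to compute once the storylines are recorded as a tree. The sketch below uses a small invented tree of nine segments over three generations, not the classroom data, and the segment ids and author names are hypothetical.

```python
# Hypothetical miniature storyline tree: segment id -> (parent id, author).
segments = {
    "s1": (None, "Ann"), "s2": (None, "Ben"), "s3": (None, "Cat"),  # generation 1
    "s4": ("s1", "Ben"), "s5": ("s1", "Cat"), "s6": ("s2", "Ann"),  # generation 2
    "s7": ("s4", "Cat"), "s8": ("s4", "Ann"), "s9": ("s6", "Ann"),  # generation 3
}


def storyline(seg_id):
    """Return the list of segment ids from the story beginning up to seg_id."""
    path = []
    while seg_id is not None:
        path.append(seg_id)
        seg_id = segments[seg_id][0]
    return path[::-1]


def descendants(seg_id):
    """Count all later segments whose storyline passes through seg_id."""
    return sum(1 for other in segments
               if other != seg_id and seg_id in storyline(other))


# A storyline survives if it reaches the last generation.
last_generation = max(len(storyline(s)) for s in segments)
survivors = [s for s in segments if len(storyline(s)) == last_generation]

# Number of different authors who contributed to each surviving storyline.
contributors = {s: len({segments[x][1] for x in storyline(s)}) for s in survivors}

print(descendants("s1"), survivors, contributors)
```

Here the beginning `s3` has no descendants, while the storyline ending in `s7` is the most collective construction, with three different contributors.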

As well as analysing the success of different storylines, we can also analyse success at the level of individual participants, using the responses of others as a measure. Individuals varied in the extent to which their story segments were selected by others, and continued by them. One participant's story segments had five continuations by others (see pale brown nodes). At the other extreme, none of the story segments of another participant (see dark green node) were continued by others. Before the exercise I had expected students to favor their own storylines. But as can be seen from the colored nodes in the diagram, this did not happen on a large scale. Some favored their own stories, but most changed storylines at one stage or another.

PS: The results of the process are also amenable to social network analysis. Participants can be seen as linked to each other through their choices of whose stories to select and add on to. It may be useful to test whether there are any coalitions at work, either those expected prior to the exercise, or ones which were unexpected but important to know about. Within the school students' exercise, a social network analysis highlighted the presence of one clique of three students, where each added to each other's stories. But two of the students in this clique also added to others' stories, and others added to theirs. See network diagram here.
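
A minimal version of this kind of check needs nothing more than the list of who added to whose segments. The records below are invented, and the definition of a clique used here (three people where every pair added to each other's stories) is my simplifying assumption.

```python
from itertools import combinations

# Hypothetical records: (writer of the new segment, author of the segment continued).
added_to = [
    ("Ann", "Ben"), ("Ben", "Ann"), ("Ben", "Cat"), ("Cat", "Ben"),
    ("Ann", "Cat"), ("Cat", "Ann"), ("Dan", "Ann"), ("Cat", "Dan"),
    ("Eve", "Eve"),
]

# Directed links between different people (continuing your own story is ignored).
edges = {(a, b) for a, b in added_to if a != b}
people = sorted({person for pair in added_to for person in pair})


def mutual(a, b):
    """True if a added to b's stories and b added to a's stories."""
    return (a, b) in edges and (b, a) in edges


# Groups of three in which every pair is mutually linked.
cliques = [trio for trio in combinations(people, 3)
           if all(mutual(a, b) for a, b in combinations(trio, 2))]
print(cliques)
```

Note that a member of a clique can still have links outside it, as Cat does here with Dan.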

Variations on the process

There are a number of ways in which this process could be varied:

Vary the extent to which the process facilitator tries to influence the process of evolution. The facilitator could ask all participants to start from one common story beginning in the centre. During the process the facilitator could also introduce events that all storylines must make reference to in one way or another. The facilitator could also choose to specify some desired characteristics of the end of the story. PS: We could see the facilitator as a representative of the wider / external environment.
Run the process for a longer period. If there were ten generations, or more, it might be possible to find storylines that were built by the contributions of all ten participants. In the wider context it might be of value to find stories that have more collective ownership.
Allow participants to add two new story segments each, rather than only one. This would increase the amount of variation within the process. But it would also make the process more time consuming. It could be a useful temporary measure to create more variation amongst the stories.
Limit participation in the process to those whose (initial) storylines had survived so far. This would increase the selection pressure within the process. It could bring the process of evolution to an end (i.e. one story remaining).
Magnify parts of the process. Take two consecutive segments in a story, and re-run the process to start from the first segment, with the aim of reaching the other segment by the n’th generation.
Introduce a final summary process. At the desired end time ask each participant to priority rank all the surviving storylines. These judgments could then be aggregated to provide a final score for each storyline. (Normally evolutionary processes go on and on, with different “species” emerging and dying out along the way).
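
For the final summary step in the list above, one simple way of aggregating the priority rankings is a Borda count, where each storyline earns points according to its position in each participant's ranking. This is my choice of aggregation rule for the sake of illustration, and the participants and storyline labels are invented.

```python
from collections import defaultdict

# Hypothetical rankings of three surviving storylines, most preferred first.
rankings = {
    "Ann": ["S1", "S2", "S3"],
    "Ben": ["S2", "S1", "S3"],
    "Cat": ["S1", "S3", "S2"],
}


def borda(rankings):
    """Borda count: with n storylines, first place earns n-1 points, last place 0."""
    scores = defaultdict(int)
    for ranking in rankings.values():
        n = len(ranking)
        for position, storyline in enumerate(ranking):
            scores[storyline] += n - 1 - position
    return dict(scores)


final_scores = borda(rankings)
print(sorted(final_scores.items(), key=lambda item: -item[1]))
```

Other aggregation rules could give a different winner, so the choice of rule is itself something the facilitator would need to explain to participants.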

How could this process be used for project development purposes?

It could be used at different stages of a project, during planning, implementation or evaluation. At the planning stage it would help think through different scenarios that the project might have to deal with. At the evaluation stage it might provide different versions of the project history, for external evaluators to look at. During implementation it could provide a mix of both scenario analysis and interpretation of history.

The mix of stakeholders involved in the process could be varied, in different ways:

The participants could be relatively homogenous (e.g. all from same organisation) or more heterogeneous (e.g. from a range of organisations), according to the amount of diversity of storylines that was desired.
The results of the process generated by one set of stakeholders (e.g. an NGO) could be subject to selection by another (e.g. the NGO's stakeholders). Using the example above, the class teacher could have indicated their preferred storyline from amongst the 10 surviving stories generated by his students.
It would also be possible to have separate roles for different stakeholders, with one group making the retention decisions (which storylines will be continued) and another making variation decisions (what new story segments are to be added on to which of the storylines already selected for continuation). The former could be a wide group of stakeholders, and the latter a much smaller group of project planners.

Participants could take on different roles. They could act as themselves or as representatives of specific stakeholders. Responding as individuals may allow participants to think in wider terms than when they are representing their specific stakeholder group. Stakeholder groups could participate via representatives, or as teams (each team making one collective choice about what storylines to continue, and how to do so). A team approach might promote more thought about each step in the evolving storyline, and how the stakeholder group's collective longer term interests could be best served.

At the other extreme, participants’ contributions could be anonymous (but labeled with a pseudonym). This would allow more divergent and risky contributions that might not otherwise appear.

How is this different from scenario planning?

(from Wikipedia) "Scenario development is used in policy planning, organisational development and, generally, when organisations wish to test strategies against uncertain future developments." There are many different ways in which scenario planning is done, but it appears that there are two stages, at least: (a) identification of different scenarios that are of possible concern, (b) identification of means of responding to those scenarios.

Evolving storylines is different in that both processes are interwoven and continuous. Each new story segment is a response to the previous segment, and in turn elaborates the existing scenario (story) in a particular way. It is more adaptive.

In scenario analysis it appears that scenarios are different combinations of circumstances, each of which is seen as potentially important, such as high inflation and high unemployment. These factors are identified first, prioritised, then used to generate various combinations. Some of these combinations may not be able to occur together, but those that can become the scenarios. With evolving storylines there is no limit on the number or kinds of elements that can be introduced into a story, but there are limits on the number of storylines that can survive.

Scenario analysis seems to be limited to a smaller number of possible outcomes than the storyline process. This may be necessary because the response process is separated from the scenario generation process.

There is also a connection to war games, as applied to the development of corporate strategy (see Economist, May 31st 2007). These involve competing teams and the taking of turns, "allowing competitors not just to draw up their own strategies but to respond to the choices of others". Evolving storylines could take this process a step further, allowing teams to experiment with multiple parallel strategies. Sometimes a portfolio of approaches may be more useful than a single strategy, not only as a way of managing risk, but also as a way of matching the diversity of contexts where an organisation is working. This is especially so for organisations working in multiple countries around the world.

Requests:

If you have any plans for testing out this process please let me know. I would be happy to provide comments and suggestions: before, during or afterwards.
I would like to develop ways of making this process work with large numbers of participants via the internet, rather than only in face to face meetings, especially using "open source" processes that could be made freely available via Creative Commons or GNU licenses. If you have any ideas and/or capacity to help with these types of developments please let me know.

regards, rick

Sunday, December 31, 2006

Prediction markets as a source of independent and continuous evaluation for development projects?

Imagine that at the beginning of a development project, even during the planning stage, the designers identified a number of observable events, which were expected to be achieved at different points in time throughout the lifetime of the project. Some might be very immediate, such as spending 90% of the annual budget by the end of each year, while others might be much longer term in nature, such as primary school completion rates in district x exceeding 90% by 2010. Well something like this happens already in most development projects, you might say. These types of events are described in the planning documents, and associated work plans. And later on, in progress reports.

But these statements of intentions are often not very publicly accessible, and they rarely have much consequence. Cutting off funding if specific performance targets are not reached rarely happens (in my experience), and probably for good reason: it is a very crude “sledgehammer” type of response. Often the donors themselves are not independent judges. They need their projects to be seen to be succeeding, and cutting off funding is effectively a criticism of their own previous judgements, as well as of the recent performance of their grantee.

There is an alternative which might be able to provide more independent and continuous assessments of project progress. These are called "prediction markets" (also sometimes called information markets, decision markets, idea futures, and virtual markets).

Prediction markets allow a group of people to express an opinion over a period of time about the probability of an event occurring. A question is posed and people buy and sell shares in stocks representing possible answers to that question. The highest priced stock at the end of a period of time is the group's prediction. (A definition provided by inklingmarkets.com)
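
To make the mechanics more concrete, here is a toy market in Python. It uses a logarithmic market scoring rule (LMSR), one well known way of setting prices automatically as people buy and sell. This is my assumption for the sketch, not a description of how Inkling or any other provider works, and the class, outcome labels and liquidity figure are all hypothetical.

```python
from math import exp, log


class Market:
    """Toy prediction market priced with a logarithmic market scoring rule."""

    def __init__(self, outcomes, liquidity=100.0):
        self.b = liquidity  # a higher value means prices move more slowly
        self.shares = {outcome: 0.0 for outcome in outcomes}

    def _cost(self):
        # LMSR cost function: C(q) = b * ln(sum_i exp(q_i / b))
        return self.b * log(sum(exp(q / self.b) for q in self.shares.values()))

    def prices(self):
        """Current price of each stock; prices sum to 1 and read as probabilities."""
        total = sum(exp(q / self.b) for q in self.shares.values())
        return {o: exp(q / self.b) / total for o, q in self.shares.items()}

    def buy(self, outcome, quantity):
        """Buy shares (or sell, with a negative quantity). Returns the amount charged."""
        before = self._cost()
        self.shares[outcome] += quantity
        return self._cost() - before

    def prediction(self):
        """The highest priced stock is treated as the group's prediction."""
        prices = self.prices()
        return max(prices, key=prices.get)


market = Market(["target met", "target missed"])
cost = market.buy("target missed", 50)  # a trader who expects the target to be missed
prices = market.prices()
print(round(cost, 2), market.prediction())
```

Both stocks start at a price of 0.5. Each purchase pushes the price of the stock bought upwards, so the prices track the balance of opinion among those willing to back their views.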

Prediction markets have been championed in James Surowiecki’s 2004 book, “The Wisdom of Crowds”, and widely discussed and used since then (see list of links below). They have been used to successfully predict political events (election outcomes), sports events (winners), market success of commercial products, and many other types of events. Recent well known users have included Google, Yahoo, Microsoft, and HP. The most important claim that has been made is that prediction markets can generate more accurate predictions of events than individual experts or highly structured planning / design processes involving multiple specialists.

It is possible that prediction markets could also be usefully applied to development projects. Two types of benefits might arise.

Firstly, the existence of the markets might generate incentives for a wider variety and larger number of people to become engaged in a discussion about aid and development. Even prediction markets that do not involve real money bets do manage to attract large numbers of participants, who get rewarded by social recognition and self-esteem, if they win.
Secondly, those responsible for aid budgets might get more accurate information about the expected performance of their portfolio of projects than they do from those who are directly responsible for the implementation of these projects.

The challenge would be how to create incentives for project managers to publicly disclose information about their project and its performance. For example, via project websites. Rewards could be given to project managers where:

the number of participants in the prediction market was large, relative to previous comparable markets
the most favored bet was successful (predicted the observed outcome)

But this then raises the question of who would provide the rewards. Donors to the project might have the same reservations as project managers about disclosure of information, and the same reluctance to see negative predictions proved correct. The alternative would be to find independent third parties, possibly specialist NGOs, who might have an interest in promoting greater transparency by aid agencies.

Project prediction markets could have different uses at different stages. During the implementation of the project the prediction market would provide real time feedback on expectations about whether a project was likely to succeed, which might encourage corrective behaviours by project managers. At the end of the project, when success/failure has been defined and winning bets paid off, it would be useful to compare the project manager’s own bets against the market as a whole, and to analyse any discrepancies between them and any lessons to be learnt from these.

Prediction markets can be open to the public, or internal (as used in Google, Yahoo, Microsoft, and HP). The proposal outlined above is for the use of public prediction markets on development project outcomes, but with project “insiders” allowed and encouraged to participate, on condition that their bets are disclosed. This is similar to the way directors of companies can buy and sell shares in their own company, but are normally required to disclose these dealings.

Incidentally, the operation of prediction markets might also generate a modest income for development purposes, by using open source, proprietary or web-hosted software to host the market (see the Wikipedia listing).

Happy New Year!
Rick Davies

An experiment

Please take part in this very experimental prediction market, where the prediction concerns the achievement of one of the Millennium Development Goals (MDG). Go to
http://home.inklingmarkets.com/market/show/3166
You can leave your comments and questions in the Discussion section. Note that you have $5,000 token dollars available to spend. These are called "inkles". Bear in mind that this particular MDG prediction market is very much in the beta stage, where I expect there will be quite a few problems that will need to be sorted out.

Links:

Description & Analysis of Information Markets © 2005 by Bernd H. Ankenbrand and Caroline Rudzinski. http://www.pmcluster.com/Papers/Description%20of%20Information%20Markets%202005-12-17.pdf
Information Markets: A New Way of Making Decisions (Paperback) by Robert Hahn http://www.aei-brookings.org/publications/abstract.php?pid=1058
Putting crowd wisdom to work Posted by Bo Cowgill, Project Manager http://googleblog.blogspot.com/2005/09/putting-crowd-wisdom-to-work.html
openDemocracy markets http://www.opendemocracy.net/globalization-vision_reflections/inkling_markets_4202.jsp
Prediction markets http://en.wikipedia.org/wiki/Prediction_market
Does wisdom require markets?

Sunday, December 17, 2006

Assumptions, evidence and multiple stakeholders

Over the last few months I have been on the sidelines of a review of an NGO funding mechanism. The review report has been drafted, then re-written. But as yet, as far as I can see, three major issues have not been addressed. These are likely to be relevant to many other multi-donor NGO funding mechanisms.

Issue No. 1: The treatment of key assumptions

The first issue is core funding of NGOs (national and local). At the centre of the original project design was the belief that the provision of core funding will make an important difference to how NGOs work. The review team recognised this idea. But they did not then question or explore in any detail how the provision of core funding will lead to better development outcomes. Yet this was undoubtedly the potential killer assumption in the centre of the project design. In fact there are two linked assumptions here, both of which needed examination, even if only at a desk level.

The first assumption is that core funding will increase the freedom and autonomy of NGOs. This assumption could have been explored by looking at the different NGOs that had been funded by the project, and then making some comparisons.

Firstly, by comparing NGOs where project provided core funding was a big versus small proportion of the NGO's overall budget. In the review there was no table showing such figures, though they were readily available, and though there were significant differences between NGOs in this respect. Is there any evidence of autonomy being greater where core funding was a bigger proportion of an NGO's income? Or are other factors more important in determining autonomy? An even bet, I suspect.

Secondly, by comparing the extent of the constraints imposed on NGOs by the core funding mechanism, versus the constraints the NGOs experienced when using other sources of funding. Did the review team ask NGOs to compare the project (as core funder) to their other donors in terms of the constraints they imposed, and what did they find out about the differences? Complaints about funding procedures need a comparator.

The second assumption is that increased freedom and autonomy of NGOs will lead to better development outcomes.

Here it would be useful to compare the core funded NGOs’ performance against that of other NGOs who are more constrained by their donors (e.g. as a result of their project specific funding), but working on the same type of development outcomes. For example, where both NGOs work on education sector issues. At the very least it would have been possible to identify some of these cases through interviews with NGOs, and maybe even interview some of them, to at least get to the stage of developing some indicative hypotheses.

The reason for making such a comparison is that there are some good counter arguments in favour of constraint as a necessary component of creativity, versus privileging freedom and autonomy. Biological evolution is the most creative process we know of, and that process works through the imposition of a very severe constraint: the need to be able to adapt to the current environment, or die. Architecture is another field where it is recognised that the presence of constraints can drive creative solutions. There has also been extensive research on the role of constraints in the fields of art and literature.

Issue No. 2: The use of evidence

The second issue was about the use of evidence. Although there had been an annual review earlier in the year, important lessons have not yet been learned from the experience. In that review there was extensive and selective use of unattributed comments by NGOs, with no information presented on how representative each of these views was. Understandably that caused major problems for the acceptance of that review. The first and second drafts of the mid-term review seemed to continue that questionable tradition, albeit with a little more balance.

The alternative approach, which had been proposed before the most recent review, was that by default, all comments made by NGOs should be from identifiable sources. Exceptions could then be made where there were explainable reasons why identities had to be withheld. The assumptions behind this proposal were that:

* NGOs are mature organisations led by mature people who have a working relationship with the funder, which can withstand open expression of criticisms. Not the reverse.

* NGOs need to be confident and assertive, if they are to be effective advocates. If they cannot openly express their critical views to their own donor, how can they ever be effective advocates of critical views to less sympathetic audiences?

In contrast, the review made a brief and sympathetic reference to the earlier review’s use of evidence from NGOs, and then focused on the issue of whether results of the review interviews should be quantified or not. This was not the primary issue, and does not even need to be seen as an either/or choice. The primary issue with both review methodologies was how transparent and trustworthy the process of data collection and analysis was. The continued selective use of unattributed comments weakened the value of both reviews.

Given there are only a small number of NGOs that had been funded it would have been quite easy to tabulate, using text not numbers, all the answers given to a number of key questions, using one table per question, and to use these in the respective relevant sections of the report.

Issue No. 3: Multi-stakeholder involvement

The third issue was multi-stakeholder involvement. The problematic nature of multi-stakeholder involvement in strategy design and evaluation was barely recognised. The project design required a common goal and convergent activities working towards that goal. Yet it involved working with NGOs who are autonomous, diverse and sometimes conflicting in their views of the world. The project's ability to find a common goal, or to mobilise people around a common goal, was in practice very limited. Similar challenges face the whole issue of appropriate NGO representation on the project’s governing body.

Nevertheless, in this context the review proposed “Widening the dialogue on problem definition and strategy development, bringing together NGOs, government, donors and others, and using competitive funding to NGO consortia to channel demand and support the identified priorities”. And at the same time “Limit the role of the [management team] to administering grants and allied activities, avoiding other activities of a more interventionist type that might undermine the central aim”.

The review proposed that the project’s strategy be defined via the proposed steering committee, and supported via “strategic issues” meetings involving a wider group of stakeholders. Although proposals were made re the inclusion of different categories of people, based largely on expertise categories, how they are chosen, and subsequently re-appointed and replaced, was the more challenging question, and it was not answered.

There is a counter argument that an independent review team could have given some attention to. That is, the project’s strategy should be developed by a limited group of identifiable stakeholders with visible interests, and NGOs using project funding should also be encouraged to seek funding from alternative sources. Overall, the project (or its donors) should be encouraging the development of a plurality of funding sources, representing a diversity of strategies, rather than trying to merge many conflicting interests within the strategy of one funding mechanism, in a non-transparent fudge that satisfies no one.

Wednesday, April 26, 2006

Evidence that the (development) world is getting better

...a new approach to monitoring and evaluation ;-)

I have recently been reviewing the language we use in the world of development aid, and have come to the conclusion that there is an accumulating body of evidence that the world is getting better.

Here are some examples of changes I have noticed. If you have noticed other similar changes, please post them as Comments below.

In the past we only had projects, now we have initiatives (updated 20 May 2010)

In the past we only had projects, but now we have interventions (June 2011, see DFID Business Plan How to Do guidance)

In the past we had plans, but now we have strategies
In the past we did research, but now we do analytic work
In the past we interpreted data, but now we are involved in sensemaking (updated 29 May 2010)
In the past we just had stories, but now we have narratives (updated 29 May 2010)
In the past we just did monitoring and evaluation, but now we do management for development results (MDR)
In the past we were concerned about coordination, but now we are concerned about harmonisation
In the past we only wanted things to work but now we expect them to be fit-for-purpose
In the past we only had interests, but now we have passions
In the past we had problems, but now we only have issues
In the past we only had news, but now we have breaking news
In the past we were just donors, but now we are development partners
In the past we were just NGOs, but now we are Civil Society Organisations
In the past we had some concepts that were not very practically useful but now we have "sensitising concepts"
In the past we took a particular perspective... now we have an analytical lens (13 April 2011)
In the past we used to stimulate discussion, but now we open a space for a dialogue (or versions thereof) (14 April 2011)
In the past we used to ask a question or make a point in a conference, but now we make an intervention (or is this now also passe?) (14 April 2011)
In the past we used to search the internet, but now we do horizon scanning work (July 17, 2012)
In the past we just used things, but now we leverage
In the past we just had trial and error, but now we do problem driven iterative adaptation.
In the past our activities only had an effect but now they impact things (2018), and now we even have "impactful development"
In the past we just tried to change things, but now we are aiming for transformational change (2018)
In the past we only had evaluators, but now we have impact researchers! (2021, thanks @EvaluationMaven)
In the past we just wanted more detail, but now we are asking for more granularity (2021)
In the past we just tried to do participatory development, but now we are into co-production (2021)

Integrating funding applications and baseline surveys

This idea falls into the category of "things I should have learned years ago!"

Over the last year or so I have been working on a number of research funding mechanisms in Ghana and Vietnam. Both involve something like the traditional two stage process of inviting simple / short Concept Notes for research, then from amongst the best of these, inviting fully developed Proposals for research. Quite a lot of information is provided through this process by the grantees-to-be, as well as by those who don't end up qualifying as grantees.

But up to now it had never occurred to me that we should design this process to simultaneously gather information about the baseline status of these organisations and their activities, for subsequent monitoring and evaluation purposes. Especially information about their relationships with other actors at this early stage, which is of increasing interest to me. Instead, in one instance, we have organised a separate baseline survey some months later, involving the approved grantees only. Needless to say, this did not impress the new grantees, who had thought they had finished with form filling for the time being!

Another advantage of this approach is that by including the unsuccessful applicants, we gather some wider contextual data that will put the characteristics of the approved grantees in a broader perspective. Some of this information may reflect on the capability of the unsuccessful applicant, but other data may be more independent.

I have also been pushing a number of grant making bodies to use the application process to generate predictions of subsequent success, on a numerical scale. These predictions can later be compared to actual / perceived success, some years down the road. The correlation between prediction and outcome will be of interest, and so will the positive and negative outliers (the unexpected successes and the unexpected failures). This is where case study investigations could help us learn a lot about what makes the difference between success and failure.
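The comparison of predictions with outcomes could be sketched roughly as follows. This is only an illustration under my own assumptions: the grantee names and scores are made up, both predictions and outcomes are assumed to be on the same 1 to 10 scale, and the `gap` threshold that defines a "surprise" is an arbitrary choice that each grant making body would need to set for itself.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def find_outliers(grants, gap=3):
    """Split out grantees whose outcome differs from the prediction by `gap` or more.

    `grants` is a list of (name, predicted_score, actual_score) tuples.
    `gap` is a hypothetical threshold, not something prescribed anywhere.
    """
    surprises = {"unexpected_success": [], "unexpected_failure": []}
    for name, predicted, actual in grants:
        if actual - predicted >= gap:
            surprises["unexpected_success"].append(name)
        elif predicted - actual >= gap:
            surprises["unexpected_failure"].append(name)
    return surprises


# Made-up scores: prediction at application time, outcome some years later.
grants = [
    ("Grantee A", 8, 7),
    ("Grantee B", 3, 8),
    ("Grantee C", 9, 4),
    ("Grantee D", 5, 5),
    ("Grantee E", 6, 7),
]

predicted = [p for _, p, _ in grants]
actual = [a for _, _, a in grants]

r = pearson(predicted, actual)
outliers = find_outliers(grants)

print(f"Correlation between prediction and outcome: {r:.2f}")
print("Candidates for case studies:", outliers)
```

The grantees listed as unexpected successes or failures are the ones where a case study is most likely to be informative; the others largely confirm what was already expected.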

Monday, March 20, 2006

The risks of big increases in aid flows to poor countries

The latest IDS Policy Briefing (Issue 25) is titled "Increased Aid: Minimising Problems, Maximising Gains", and contains a summary of papers in a recent IDS Bulletin on the same theme. Many of the concerns raised in this Briefing relate to my own experience and echo my pre-existing concerns. In particular:

- Donors will be pre-occupied with issues of quantity, and their attention will be diverted away from the more crucial question of the quality and effectiveness of aid.

- Absorptive capacity - the ability to put aid to effective use - is already in doubt as a result of poor governance, rigid and unresponsive administrative systems, and above all, the shortage of human resources.

- Governance reforms that might help with absorption problems take time. Speeding them up to enable bigger aid flows is unlikely to work.

- Increased aid flows will weaken incentives for recipient governments to reprioritise their existing expenditures and to improve governance.

- Aid coordination efforts will not keep up with increased aid flows from multiple directions, and the burden on the host government of managing its aid will increase.

- If much of the increased aid comes in the form of loans rather than grants (from WB and others) then this will impede recipient governments' efforts to break out of indebtedness and dependence on aid.

- In smaller countries the increased volume of aid might drive up the value of the national currency, making exports more expensive and undermining efforts to base growth on rising exports.

In one of the (summarised) IDS papers, Ros Eyben argues that a surge in aid could magnify certain donor practices that disempower recipient governments and civil society. She is especially concerned about the potentially damaging impact of "results-based management", endorsed by donors at Monterrey in 2002 as the optimal approach, apparently because it enables donors to define what is happening and undermines the ability of recipient governments to do so.

I am a little skeptical about the ability of any M&E method or approach to have much of an impact, but I do share her concern about the impact of large increases of foreign aid on national sovereignty and the accountability of governments to their people. Some time ago I sat in on a meeting held to discuss the findings of a report on the cost of achieving the MDGs in country x. The focus of the discussion was almost wholly on the accuracy of the calculations. What was ignored was the massive scale of the required increases and how (if delivered) they would effectively dwarf the country's own revenue sources. And even more importantly, the political implications of such a situation. What sensible government would pay much attention to its own population's expressed needs, when the bulk of its revenue was coming from external aid? No amount of "governance reform" would seem to be able to redress the effects of these perverse incentives.

I was reminded of the slogan of the American revolution, "No taxation without representation", and wondered whether the truth of the reverse also needs some publicity: "No (effective) representation without taxation".

I suspect that the use of results-based management may be more a symptom than a cause; a too-simple response to the perverse accountability effects of large aid flows: governments becoming less accountable to their people. RBM might be more useful if it more directly addressed the causes of this quandary: the imbalance of aid revenue versus tax revenue. It could do this by including as a key performance measure the ability of the recipient government to increase its tax revenues. Achievement could be associated with increased aid revenues, but not otherwise. The target ratio between tax and aid revenues would of course have to be identified country by country, because tax raising capacity will vary widely. Ironically this is one performance measure that the people in the recipient countries would probably not be very happy with, but which could in the longer term empower them more in relation to their governments than any large increase in aid.
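To make the arithmetic of such a measure concrete, here is a minimal sketch. It is my own illustration, not an existing donor formula: the function name, the revenue figures and the target ratio of 2.0 are all hypothetical, and a real target would have to be negotiated country by country.

```python
def aid_ceiling(tax_revenue, target_ratio):
    """Maximum aid consistent with a country-specific tax-to-aid target ratio.

    `target_ratio` is tax revenue divided by aid revenue. Both the ratio and
    the figures used below are hypothetical, for illustration only.
    """
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    return tax_revenue / target_ratio


# Hypothetical country: tax revenue of 100 units, target tax-to-aid ratio of 2.0.
ceiling_before = aid_ceiling(tax_revenue=100, target_ratio=2.0)

# The government raises its tax revenue to 120 units, so more aid becomes available.
ceiling_after = aid_ceiling(tax_revenue=120, target_ratio=2.0)

print(f"Aid ceiling before: {ceiling_before}, after: {ceiling_after}")
```

Under this kind of rule the aid ceiling can only rise when tax revenue rises, which is the link between achievement and increased aid described above.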

I am sure there are already many cases where increased taxation is an objective of concern to donors and host governments. But how often, if ever, have donors been willing to limit their aid flows to any percentage or multiple of recipient countries' tax revenues? Here there is another set of perverse incentives at work, to do with the nature of the "aid business". To do so would be to undermine the resource base on which aid bureaucrats earn their living and where they can find future promotion opportunities. Some will be able to create career opportunities out of a well argued need to cut aid or to limit aid growth, but not all - by definition.

Further disincentives would probably lie ahead, should maximum levels of aid per country ever be agreed. Donor countries would have to agree who would contribute how much to a limited pot, when in fact many were seeking to maximise their influence. This would be the ultimate "harmonisation" challenge!

Late Note (23/03/06) From "Business in Africa"

"...Taking a look at the country’s national budget, virtually all spend is draft estimates and not accounted for cent by cent. An example of this lack of budget control is highlighted in the SAIIA report, where the director of monitoring and evaluation in the ministry of economic development and planning is quoted as saying, “What we are saying is that out of the 60 billion kwacha in the 2002/2003 budget, we only know of 37 billion kwacha that was used. We don’t know what happened to the rest because of lack of data.”

Foreign support makes up close to half of Malawi’s budget, which means that it is difficult to implement policies when the budget is dependent largely on sources from outside the country. In 2000, International Monetary Fund (IMF) aid and other donor funding was suspended following high-level corruption, archaic fiscal discipline and poor governance under the Bakili Muluzi administration. The IMF and World Bank resumed aid in 2004 when the economic minister in Muluzi’s cabinet, Bingu wa Mutharika, was named president."

Okay, so running a close second is the need for a performance indicator on the availability of adequate government accounts. This would be in the interests of citizens and supporting donors. No conflict of interest here, unless donors have privileged access to government accounts that citizens do not. Unfortunately this is not an unknown occurrence.

Another part of this latest story is the stop-go nature of aid flows. In highly aid dependent countries that cannot be a good way to proceed. What sort of performance monitoring system would generate such clumsy responses? Either one that was only barely working, or one that was being ignored by its own proponents (or both).