Saturday, September 01, 2007

Checklists as mini theories-of-change

During a recent evaluation of a UNICEF-assisted health program in Indonesia I was given a copy of a checklist designed for assessing the functioning of sub-district health centres in South Sulawesi. You can get a general idea of its structure from the image below. Down the left is a list of attributes of “high performing” health centres, grouped into categories, sub-categories and sub-sub-categories. Down the right side are columns, one for each health centre. Ticks are placed in each row of a column to indicate whether the attribute in that row was found in that health centre. The intention, I think, is that if all the attributes are ticked then the health centre will be deemed to have “graduated” and no longer need to be given development assistance.



While this format has the important virtue of simplicity, it makes two assumptions that are worth questioning. The first is that all the listed attributes are essential. This presupposes a consensus on what constitutes a “good” health centre. In practice, developing that consensus may be an important part of the process of developing a “good” health centre: not only internally, amongst the staff, but also externally, amongst the other organisations the health centre has to work with (e.g. the district hospital, village health posts, and the district health office).

The second assumption is that all the attributes are of equal importance. This seems unlikely. For example, it would be widely agreed that having a supply of oxytocin (attribute 2.4.a) is much more important than “Mother's day celebration implemented each year at sub-district level” (attribute 4.2.b). Attempts to develop the capacity of a health centre need to be guided by a clear sense of which attributes matter most. The choice between organising a mothers’ day event and ensuring a supply of oxytocin could be a matter of life or death.

These two “problems” could be seen as opportunities. Attributes on the list could be weighted by asking selected stakeholders to rank them in terms of their relative importance (by allocating points that sum to 100). If there are a large number of attributes the ranking could start with the major categories, then the sub-categories, then the attributes within them. Importance could be defined as how much an attribute is likely to contribute to improved usage of quality services that will affect people’s health outcomes. The first set of stakeholders could be internal to the health service; external stakeholders could be consulted later on. Attributes that were given widely different rankings would then become the focus of discussion about why views varied so much. The assumption here is that this may lead to some convergence of views on priorities. It could also be relevant to staff training agendas. During the evaluation referred to above, we found that comparing different stakeholders’ rankings of the effectiveness of a number of (other) project activities generated a constructive discussion that increased both stakeholder groups’ understanding of each other, and of the issues involved.

Even when agreement is reached about appropriate weightings, a question remains about whether a high score will necessarily lead to the expected outcomes, such as changes in how women use the health service, or in their behaviour after visiting the health centre. It would therefore be useful to compare the scores of different health centres with the outcomes observed at those health centres. How well do the scores predict the outcomes? If they do not, the scores could be re-calculated on the basis of a different set of weightings, to see if emphasising other attributes produced a better fit between health centre scores and the observed outcomes of concern. If so, that would suggest the need for a re-orientation of priorities within the health centre. A given set of weightings is in effect a theory-of-change, and the score it generates can be treated as a prediction of an expected outcome. A series of predictions (scores from different health centres) would be needed to see how well the theory fits reality (the outcomes observed at those health centres).
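
To make the arithmetic concrete, here is a minimal sketch in Python; the attribute names, weightings, tick data and outcome figures are all invented for illustration:

```python
# A minimal sketch (invented data) of scoring ticked checklists with
# stakeholder weightings, then testing how well each set of weightings
# predicts an observed outcome of concern.
import numpy as np

attributes = ["oxytocin_supply", "mothers_day_event", "trained_midwife"]

# Hypothetical weightings (points summing to 100) from two stakeholder groups
weightings = {
    "equal":    np.array([34.0, 33.0, 33.0]),
    "clinical": np.array([60.0, 10.0, 30.0]),
}

# One row of ticks per health centre (1 = attribute present)
ticks = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
])

# Observed outcome for each centre, e.g. % of births attended by skilled staff
outcomes = np.array([55, 80, 30, 90])

for name, w in weightings.items():
    scores = ticks @ w                       # weighted score per centre
    r = np.corrcoef(scores, outcomes)[0, 1]  # fit between scores and outcomes
    print(f"{name:>8}: scores={scores.round(0)}, r={r:.2f}")
```

The set of weightings producing the higher correlation would be, on this view, the better theory-of-change.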

Incidentally: A target score on a checklist could be inserted as a single indicator in a Logical Framework, allowing a simple reference to be made to the measurement of a complex outcome. The wider use of checklist scores might help limit the use of overly simplistic indicators of progress, as seen in many Logical Frameworks.

PS: This discussion is not a criticism of the checklist as currently in use. It is an outline of what I think is some of its untapped potential.

Sunday, May 27, 2007

Evolving storylines: A participatory design process?

Some years ago...

More than a decade ago, while beginning my PhD, I experimented with a structured participatory process for evolving stories. The thought was that this could lead to the development of better project designs. A project design should include a theory-of-change, and a theory-of-change, when spelled out in detail, can be seen as a story. But there could be many different versions of that story, some better than others. If so, how do we discover them?

One possibility was to make use of a Darwinian evolutionary process to search for solutions that have the best fit with their environment. The core of the evolutionary process is the evolutionary algorithm: the re-iteration of processes of variation, selection and retention. The intention was to design a social process that embodied these features. A similar process was later built in as a core feature of the Most Significant Change (MSC) technique.

I tested the idea out, in a simple and light-hearted way, with a classroom of secondary students taught by a friend of mine. The environment in which the stories would have to develop and survive was that classroom, with its own culture and history. More serious applications could involve the staff of an organisation, and the environment within and around that organisation.

The process:

  1. I gave ten of the students some small filing cards, and asked each of them to write the beginning of a story on their card, about a student who left school at the end of the year. When completed, these ten cards were posted, as a column of cards, on the left side of the blackboard, in front of the class. This provided some initial variation.
  2. I then asked the same students to read all ten cards on the board, and each to identify the story beginning they most liked. This involved selection.
  3. The students were then asked each to use a second card to write a continuation of the story beginning they most liked. These story segments were then posted next to the beginnings they continued. As a result, some story beginnings gained multiple new segments, others none. This step involved retention of the selected story beginnings, and the introduction of further variation.
  4. The students were then asked to look at all the stories again, now that they had been extended, and to write a third-generation story segment, to be added to the emerging storyline they most liked so far. This process was re-iterated for four generations, until we ran out of class time. A graphic view of the results is shown below (the other record being the text of the stories themselves).

[Figure: network diagram of the evolving storylines]

Each story segment is represented by one node (circle). Lines connecting the nodes show which story segment was added to which, forming storylines. In the diagram above the storylines start from the centre and grow outwards. The color of each node represents the identity of the student who wrote that story segment. The size of each node varies according to how many "descendants" it had: how many other story segments were later added to it. The four concentric circles in the background represent the four generations of the process. PS: Each story segment was only one to three sentences long.
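
The process described above is essentially an evolutionary algorithm, and it can be simulated. Below is a minimal sketch in Python, with random choices standing in for the students' judgements about which storyline to continue; the participant numbers and generation count follow the exercise, everything else is invented:

```python
# A minimal simulation of the evolving-storylines process. Each segment
# records its author and the segment it continues (its parent).
import random

random.seed(1)
participants = range(10)

# Generation 1: each participant writes a story beginning (variation)
segments = [{"id": i, "author": p, "parent": None}
            for i, p in enumerate(participants)]

for generation in range(2, 5):  # generations 2 to 4
    existing = [s["id"] for s in segments]
    new = []
    for p in participants:
        chosen = random.choice(existing)  # selection of a favoured segment
        new.append({"id": len(segments) + len(new),
                    "author": p,
                    "parent": chosen})    # retention, plus new variation
    segments.extend(new)

# Success, in minimalist evolutionary terms: how many descendants did
# each story beginning accumulate?
def descendants(seg_id):
    children = [s["id"] for s in segments if s["parent"] == seg_id]
    return len(children) + sum(descendants(c) for c in children)

for i in range(10):
    print(f"beginning {i}: {descendants(i)} descendants")
```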

The results:

In evolutionary theory success is defined in minimalist terms, as survival and proliferation. In this exercise three of the initial stories did not survive beyond the first generation (1.7, 1.8, and 1.9). Five others survived until the fourth generation. Of these, two were most prolific (1.6 and 1.10), each having three descendants by the fourth generation.

Amongst the surviving storylines, some were more collective constructions than others. Storylines 1.3 to 1.34, 1.10 to 1.39 and 1.10 to 1.38 had four different contributors (the maximum possible), whereas storylines 1.6 to 1.37 and 1.10 to 1.40 had only two.

As well as analysing the success of different storylines, we can also analyse success at the level of individual participants, using the responses of others as a measure. Individuals varied in the extent to which their story segments were selected by others, and continued by them. One participant's story segments had five continuations by others (see pale brown nodes). At the other extreme, none of the story segments of another participant (see dark green node) were continued by others. Before the exercise I had expected students to favor their own storylines. But as can be seen from the colored nodes in the diagram, this did not happen on a large scale. Some favored their own stories, but most changed storylines at one stage or another.

PS: The results of the process are also amenable to social network analysis. Participants can be seen as linked to each other through their choices of whose stories to select and add on to. It may be useful to test whether there are any coalitions at work: either those expected prior to the exercise, or ones which were unexpected but important to know about. Within the school students' exercise, a social network analysis highlighted the presence of one clique of three students, each of whom added to the others' stories. But two of the students in this clique also added to others' stories, and others added to theirs. See network diagram here.
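
For readers wanting to try the analysis, here is a minimal sketch using the networkx library (a freely available stand-in for UCINET); the participant names and edge list are invented, not the data from the school exercise:

```python
# Sketch of a social network analysis of the exercise, with invented data.
# A directed edge A -> B means participant A added a segment to one of
# participant B's story segments.
import networkx as nx

continuations = [
    ("ana", "ben"), ("ben", "ana"),   # reciprocated ties...
    ("ben", "cal"), ("cal", "ben"),
    ("cal", "ana"), ("ana", "cal"),   # ...forming a clique of three
    ("dee", "ana"),                   # a tie into the clique from outside
]

G = nx.DiGraph()
G.add_edges_from(continuations)

# How often each participant's segments were continued by others
print(dict(G.in_degree()))

# Cliques appear in the undirected graph of reciprocated ties
mutual = nx.Graph([(u, v) for u, v in G.edges() if G.has_edge(v, u)])
print(list(nx.find_cliques(mutual)))  # e.g. [['ana', 'ben', 'cal']]
```
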
Variations on the process

There are a number of ways in which this process could be varied:

  • Vary the extent to which the facilitator tries to influence the process of evolution. The facilitator could ask all participants to start from one common story beginning in the centre. During the process the facilitator could also introduce events that all storylines must make reference to in one way or another, or specify some desired characteristics of the end of the story. PS: We could see the facilitator as a representative of the wider / external environment.
  • Run the process for a longer period. If there were ten generations, or more, it might be possible to find storylines that were built by the contributions of all ten participants. In the wider context it might be of value to find stories that have more collective ownership.
  • Allow participants to add two new story segments each, rather than only one. This would increase the amount of variation within the process. But it would also make the process more time consuming. It could be a useful temporary measure to create more variation amongst the stories.
  • Limit participation in the process to those whose (initial) storylines had survived so far. This would increase the selection pressure within the process. It could bring the process of evolution to an end (i.e. one story remaining).
  • Magnify parts of the process. Take two consecutive segments in a story, and re-run the process to start from the first segment, with the aim of reaching the other segment by the n’th generation.
  • Introduce a final summary process. At the desired end time ask each participant to priority rank all the surviving storylines. These judgments could then be aggregated to provide a final score for each storyline. (Normally evolutionary processes go on and on, with different “species” emerging and dying out along the way).
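
The final summary step in the last bullet could use a simple rank-aggregation rule. Here is a minimal sketch, with invented participant names and storyline labels, of a Borda-style scoring of priority rankings:

```python
# Aggregating participants' priority rankings of surviving storylines.
# Each participant ranks the storylines (first = most preferred); a
# Borda-style score sums the points each storyline earns per ranking.
rankings = {  # hypothetical rankings from three participants
    "ana": ["story_A", "story_B", "story_C"],
    "ben": ["story_B", "story_A", "story_C"],
    "cal": ["story_A", "story_C", "story_B"],
}

scores = {}
for ranking in rankings.values():
    n = len(ranking)
    for position, story in enumerate(ranking):
        scores[story] = scores.get(story, 0) + (n - position)

for story, score in sorted(scores.items(), key=lambda x: -x[1]):
    print(story, score)
```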

How could this process be used for project development purposes?

It could be used at different stages of a project, during planning, implementation or evaluation. At the planning stage it would help think through different scenarios that the project might have to deal with. At the evaluation stage it might provide different versions of the project history, for external evaluators to look at. During implementation it could provide a mix of both scenario analysis and interpretation of history.

The mix of stakeholders involved in the process could be varied, in different ways:

  • The participants could be relatively homogenous (e.g. all from same organisation) or more heterogeneous (e.g. from a range of organisations), according to the amount of diversity of storylines that was desired.
  • The results of the process generated by one set of stakeholders (e.g. an NGO) could be subject to selection by another (e.g. the NGO's stakeholders). Using the example above, the class teacher could have indicated his preferred storyline from amongst the 10 surviving storylines generated by the students.
  • It would also be possible to have separate roles for different stakeholders: with one group making the retention decisions (which storylines will be continued) and another making the variation decisions (what new story segments to add to the storylines already selected for continuation). The former could be a wide group of stakeholders, and the latter a much smaller group of project planners.
Participants could take on different roles. They could act as themselves or as representatives of specific stakeholders. Responding as individuals may allow participants to think in wider terms than when they are representing their specific stakeholder group. Stakeholder groups could participate via representatives, or as teams (each team making one collective choice about what storylines to continue, and how to do so). A team approach might promote more thought about each step in the evolving storyline, and how the stakeholder group's collective longer term interests could be best served.

At the other extreme, participants’ contributions could be anonymous (but labeled with a pseudonym). This would allow more divergent and risky contributions that might not otherwise appear.

How is this different from scenario planning?
(from Wikipedia) "Scenario development is used in policy planning, organisational development and, generally, when organisations wish to test strategies against uncertain future developments." There are many different ways in which scenario planning is done, but it appears that there are at least two stages: (a) identification of different scenarios that are of possible concern, and (b) identification of means of responding to those scenarios.

Evolving storylines is different in that both processes are interwoven and continuous. Each new story segment is a response to the previous segment, and in turn elaborates the existing scenario (story) in a particular way. It is more adaptive.

In scenario analysis it appears that scenarios are different combinations of circumstances, each of which is seen as potentially important, such as high inflation combined with high unemployment. These factors are identified first, prioritised, then used to generate various combinations. Some combinations may not be able to occur together; those that can become the scenarios. With evolving storylines there is no limit on the number or kinds of elements that can be introduced into a story, but there are limits on the number of storylines that can survive.

Scenario analysis seems to be limited to a smaller number of possible outcomes than the storyline process. This may be necessary because the response process is separated from the scenario generation process.

There is also a connection to war games, as applied to the development of corporate strategy (see the Economist, May 31st 2007). These involve competing teams and the taking of turns, "allowing competitors not just to draw up their own strategies but to respond to the choices of others". Evolving storylines could take this process a step further, allowing teams to experiment with multiple parallel strategies. Sometimes a portfolio of approaches may be more useful than a single strategy, not only as a way of managing risk, but also as a way of matching the diversity of contexts in which an organisation is working. This is especially so for organisations working in multiple countries around the world.

Requests:
  • If you have any plans for testing out this process please let me know. I would be happy to provide comments and suggestions: before, during or afterwards.
  • I would like to develop ways of making this process work with large numbers of participants via the internet, rather than only in face-to-face meetings, especially using "open source" processes that could be made freely available via Creative Commons or GNU licenses. If you have any ideas and/or capacity to help with this type of development please let me know.
regards, rick

Sunday, December 31, 2006

Prediction markets as a source of independent and continuous evaluation for development projects?

Imagine that at the beginning of a development project, even during the planning stage, the designers identified a number of observable events which were expected to be achieved at different points throughout the lifetime of the project. Some might be very immediate, such as spending 90% of the annual budget by the end of each year, while others might be much longer term in nature, such as primary school completion rates in district x exceeding 90% by 2010. Well, something like this happens already in most development projects, you might say. These types of events are described in the planning documents and associated work plans, and later on in progress reports.

But these statements of intentions are often not very publicly accessible, and they rarely have much consequence. Cutting off funding if specific performance targets are not reached rarely happens (in my experience), and probably for good reason: it is a very crude “sledgehammer” type of response. And often the donors themselves are not independent judges; they need their projects to be seen to be succeeding, and cutting off funding is effectively a criticism of their own previous judgements, as well as of the recent performance of their grantee.

There is an alternative which might be able to provide more independent and continuous assessments of project progress. These are called "prediction markets" (also sometimes called information markets, decision markets, idea futures, and virtual markets).

Prediction markets allow a group of people to express an opinion over a period of time about the probability of an event occurring. A question is posed and people buy and sell shares in stocks representing possible answers to that question. The highest priced stock at the end of a period of time is the group's prediction. (A definition provided by inklingmarkets.com)
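
For the technically curious: play-money markets like Inkling's typically rely on an automated market maker rather than matched buyers and sellers. The sketch below implements one common design, Hanson's logarithmic market scoring rule (LMSR); this is an assumption for illustration, not a description of Inkling's actual engine, and the market question is invented:

```python
# A minimal automated market maker using Hanson's logarithmic market
# scoring rule (LMSR). Prices can be read as the crowd's probabilities.
import math

class LMSRMarket:
    def __init__(self, outcomes, b=100.0):
        self.b = b                            # liquidity parameter
        self.q = {o: 0.0 for o in outcomes}   # shares sold per outcome

    def _cost(self, q):
        return self.b * math.log(sum(math.exp(v / self.b) for v in q.values()))

    def price(self, outcome):
        """Current price of an outcome = its implied probability."""
        total = sum(math.exp(v / self.b) for v in self.q.values())
        return math.exp(self.q[outcome] / self.b) / total

    def buy(self, outcome, shares):
        """Buy shares in an outcome; returns the cost charged by the maker."""
        before = self._cost(self.q)
        self.q[outcome] += shares
        return self._cost(self.q) - before

market = LMSRMarket(["MDG target met by 2015", "MDG target not met"])
cost = market.buy("MDG target met by 2015", 50)
print(f"cost of 50 shares: {cost:.2f}")
print(f"new price (probability): {market.price('MDG target met by 2015'):.2f}")
```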

Prediction markets have been championed in James Surowiecki’s 2004 book, “The Wisdom of Crowds”, and widely discussed and used since then (see list of links below). They have been used to successfully predict political events (election outcomes), sports events (winners), market success of commercial products, and many other types of events. Recent well known users have included Google, Yahoo, Microsoft, and HP. The most important claim that has been made is that prediction markets can generate more accurate predictions of events than individual experts or highly structured planning / design processes involving multiple specialists.

It is possible that prediction markets could also be usefully applied to development projects. Two types of benefits might arise.

  • Firstly, the existence of the markets might generate incentives for a wider variety and larger number of people to become engaged in a discussion about aid and development. Even prediction markets that do not involve real money bets do manage to attract large numbers of participants, who get rewarded by social recognition and self-esteem, if they win.
  • Secondly, those responsible for aid budgets might get more accurate information about the expected performance of their portfolio of projects than they do from those who are directly responsible for the implementation of these projects.

The challenge would be how to create incentives for project managers to publicly disclose information about their project and its performance. For example, via project websites. Rewards could be given to project managers where:

  • the number of participants in the prediction market was large, relative to previous comparable markets
  • the most favored bet was successful (predicted the observed outcome)

But this then raises the question of who would provide the rewards. Donors to the project might have the same reservations as project managers about disclosure of information, and the same reluctance to see negative predictions proved correct. The alternative would be to find independent third parties, possibly specialist NGOs, who might have an interest in promoting greater transparency by aid agencies.

Project prediction markets could have different uses at different stages. During implementation the prediction market would provide real-time feedback on expectations about whether a project was likely to succeed, which might encourage corrective behaviours by project managers. At the end of the project, when success or failure has been defined and winning bets paid off, it would be useful to compare the project manager’s own bets against the market as a whole, to analyse any discrepancies between them, and to identify any lessons to be learnt from these.

Prediction markets can be open to the public, or internal (as used in Google, Yahoo, Microsoft, and HP). The proposal outlined above is for the use of public prediction markets in development project outcomes, while allowing and encouraging project "insiders" to participate, on condition that their bets are disclosed. In the same way, directors of companies can buy and sell shares in their own company, but they are normally required to disclose these dealings.

Incidentally, the operation of prediction markets might also generate a modest income for development purposes, by using open source, proprietary or web-hosted software to host the market (see the Wikipedia listing).

Happy New Year!
Rick Davies

An experiment

Please take part in this very experimental prediction market, where the prediction concerns the achievement of one of the Millennium Development Goals (MDG). Go to
http://home.inklingmarkets.com/market/show/3166
You can leave your comments and questions in the Discussion section. Note that you have $5,000 token dollars available to spend. These are called "inkles". Bear in mind that this particular MDG prediction market is very much in the beta stage, where I expect there will be quite a few problems that will need to be sorted out.

Links:


Sunday, December 17, 2006

Assumptions, evidence and multiple stakeholders

Over the last few months I have been on the sidelines of a review of an NGO funding mechanism. The review report has been drafted, then re-written. But as yet, as far as I can see, three major issues have not been addressed. These are likely to be relevant to many other multi-donor NGO funding mechanisms.

Issue No. 1: The treatment of key assumptions

The first issue is core funding of NGOs (national and local). At the centre of the original project design was the belief that the provision of core funding would make an important difference to how NGOs work. The review team recognised this idea. But they did not then question or explore in any detail how the provision of core funding would lead to better development outcomes. Yet this was undoubtedly the potential killer assumption at the centre of the project design. In fact there are two linked assumptions here, both of which needed examination, even if only at a desk level.

The first assumption is that core funding will increase the freedom and autonomy of NGOs. This assumption could have been explored by looking at the different NGOs that had been funded by the project, and then making some comparisons.

Firstly, by comparing NGOs where project-provided core funding was a large versus a small proportion of the NGO's overall budget. The review contained no table showing such figures, though they were readily available, and though there were significant differences between NGOs in this respect. Is there any evidence of autonomy being greater where core funding was a bigger proportion of an NGO's income? Or are other factors more important in determining autonomy? An even bet, I suspect.

Secondly, by comparing the extent of the constraints imposed on NGOs by the core funding mechanism with the constraints the NGOs experienced when using other sources of funding. Did the review team ask NGOs to compare the project (as core funder) to their other donors in terms of the constraints they imposed, and what did they find out about the differences? Complaints about funding procedures need a comparator.

The second assumption is that increased freedom and autonomy of NGOs will lead to better development outcomes.

Here it would be useful to compare the core-funded NGOs’ performance against that of other NGOs who are more constrained by their donors (e.g. as a result of their project-specific funding), but working on the same type of development outcomes; for example, where both NGOs work on education sector issues. At the very least it would have been possible to identify some of these comparison cases through interviews with NGOs, and maybe even to interview some of them, to at least get to the stage of developing some indicative hypotheses.

The reason for making such a comparison is that there are some good counter-arguments in favour of constraint as a necessary component of creativity, as against privileging freedom and autonomy. Biological evolution is the most creative process we know of, and that process works through the imposition of a very severe constraint: the need to be able to adapt to the current environment, or die. Architecture is another field where it is recognised that the presence of constraints can drive creative solutions. There has also been extensive research on the role of constraints in the fields of art and literature.

Issue No. 2: The use of evidence

The second issue was about the use of evidence. Although there had been an annual review earlier in the year, important lessons had not yet been learned from the experience. In that review there was extensive and selective use of unattributed comments by NGOs, with no information presented on how representative each of these views was. Understandably, that caused major problems for the acceptance of that review. The first and second drafts of the mid-term review seemed to continue that questionable tradition, albeit with a little more balance.

The alternative approach, which had been proposed before the most recent review, was that by default, all comments made by NGOs should be from identifiable sources. Exceptions could then be made where there were explainable reasons why identities had to be withheld. The assumptions behind this proposal were that:

* NGOs are mature organisations led by mature people who have a working relationship with the funder, which can withstand open expression of criticisms. Not the reverse.

* NGOs need to be confident and assertive, if they are to be effective advocates. If they cannot openly express their critical views to their own donor, how can they ever be effective advocates of critical views to less sympathetic audiences?

In contrast, the review made only a brief and sympathetic reference to the earlier review's use of evidence from NGOs, and then focused on the issue of whether the results of the review interviews should be quantified or not. This was not the primary issue, and it does not even need to be seen as an either/or choice. The primary issue with both review methodologies was how transparent and trustworthy the processes of data collection and analysis were. The continued selective use of unattributed comments weakened the value of both reviews.

Given that only a small number of NGOs had been funded, it would have been quite easy to tabulate, using text not numbers, all the answers given to a number of key questions, using one table per question, and to use these tables in the relevant sections of the report.

Issue No. 3: Multi-stakeholder involvement

The third issue was multi-stakeholder involvement. The problematic nature of multi-stakeholder involvement in strategy design and evaluation was barely recognised. The project design required a common goal and convergent activities working towards that goal. Yet it involved working with NGOs who are autonomous, diverse and sometimes conflicting in their views of the world. The project's ability to find a common goal, or to mobilise people around a common goal, was in practice very limited. Similar challenges face the whole issue of appropriate NGO representation on the project’s governing body.

Nevertheless, in this context the review proposed “Widening the dialogue on problem definition and strategy development, bringing together NGOs, government, donors and others, and using competitive funding to NGO consortia to channel demand and support the identified priorities”. And at the same time “Limit the role of the [management team] to administering grants and allied activities, avoiding other activities of a more interventionist type that might undermine the central aim”.

The review proposed that the project’s strategy be defined via the proposed steering committee, and supported via “strategic issues” meetings involving a wider group of stakeholders. Although proposals were made regarding the inclusion of different categories of people, based largely on expertise, how these people are to be chosen, and subsequently re-appointed and replaced, was the more challenging question that was left unanswered.

There is a counter-argument that an independent review team could have given some attention to. That is, the project’s strategy should be developed by a limited group of identifiable stakeholders with visible interests, and NGOs using project funding should also be encouraged to seek funding from alternative sources. Overall, the project (or its donors) should be encouraging the development of a plurality of funding sources, representing a diversity of strategies, rather than trying to merge many conflicting interests within the strategy of one funding mechanism, in a non-transparent fudge that satisfies no one.

Wednesday, April 26, 2006

Evidence that the (development) world is getting better

...a new approach to monitoring and evaluation ;-)

I have recently been reviewing the language we use in the world of development aid, and have come to the conclusion that there is an accumulating body of evidence that the world is getting better.

Here are some examples of changes I have noticed. If you have noticed other similar changes, please post them as Comments below.

  • In the past we only had projects, now we have initiatives (updated 20 May 2010) 
    • In the past we only had projects, but now we have interventions (June 2011, see DFID Business Plan How to Do guidance)
  • In the past we had plans, but now we have strategies 
  • In the past we did research, but now we do analytic work 
  • In the past we interpreted data, but now we are involved in sensemaking  (updated 29 May 2010) 
  • In the past we just had stories, but now we have narratives  (updated 29 May 2010) 
  • In the past we just did monitoring and evaluation, but now we do management for development results (MDR)
  •  In the past we were concerned about coordination, but now we are concerned about harmonisation 
  • In the past we only wanted things to work but now we expect them to be fit-for-purpose 
  • In the past we only had interests, but now we have passions 
  • In the past we had problems, but now we only have issues 
  • In the past we only had news, but now we have breaking news 
  • In the past we were just donors, but now we are development partners 
  • In the past we were just NGOs, but now we are Civil Society Organisations 
  • In the past we had some concepts that were not very practically useful but now we have "sensitising concepts" 
  • In the past we took a particular perspective... now we have an analytical lens (13 April 2011) 
  • In the past we used to stimulate discussion, but now we open a space for a dialogue (or versions thereof) (14 April 2011) 
  • In the past we used to ask a question or make a point in a conference, but now we make an intervention (or is this now also passe?) (14 April 2011)
  • In the past we used to search the internet, but now we do horizon scanning work (July 17, 2012)
  • In the past we just used things, but now we leverage
  • In the past we just had trial and error, but now we do problem driven iterative adaptation.  
  • In the past our activities only had an effect but now they impact things (2018), and now we even have "impactful development"
  • In the past we just tried to change things, but now we are aiming for transformational change (2018)
  • In the past we only had evaluators, but now we have impact researchers! (2021, thanks @EvaluationMaven)
  • In the past we just wanted more detail, but now are asking for more granularity (2021)
  • In the past we just tried to do participatory development, but now we are into co-production (2021)

Integrating funding applications and baseline surveys

This idea falls into the category of "things I should have learned years ago!"

Over the last year or so I have been working on a number of research funding mechanisms in Ghana and Vietnam. Both involve something like the traditional two-stage process of inviting simple / short Concept Notes for research, then, from amongst the best of these, inviting fully developed Proposals. Quite a lot of information is provided through this process by the grantees-to-be, as well as by those who don't end up qualifying as grantees.

But up to now it had never occurred to me that we should design this process to simultaneously gather information about the baseline status of these organisations and their activities, for subsequent monitoring and evaluation purposes. Especially information about their relationships with other actors at this early stage, which is of increasing interest to me. Instead, in one instance, we organised a separate baseline survey some months later, involving the approved grantees only. Needless to say, this did not impress the new grantees, who had thought they had finished with form filling for the time being!

Another advantage of this approach is that by including the unsuccessful applicants, we gather some wider contextual data that will put the characteristics of the approved grantees in a broader perspective. Some of this information may reflect on the capability of the unsuccessful applicant, but other data may be more independent.

I have also been pushing a number of grant-making bodies to use the application process to generate predictions of subsequent success, on a numerical scale. These predictions can later be compared to actual / perceived success, some years down the road. Not only will the correlation between prediction and outcome be of interest; so will the positive and negative outliers (the unexpected successes and the unexpected failures). This is where case study investigations could help us learn a lot about what makes the difference between success and failure.
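
A minimal sketch of the arithmetic involved, with invented prediction and outcome scores and an arbitrary outlier threshold, is given below:

```python
# Comparing at-application predictions of grantee success with observed
# success years later, and flagging outliers worth case-study attention.
import numpy as np

predicted = np.array([7, 4, 8, 6, 3, 9, 5])  # panel scores at application
observed  = np.array([6, 5, 4, 7, 3, 9, 8])  # assessed success later on

r = np.corrcoef(predicted, observed)[0, 1]
print(f"prediction-outcome correlation: {r:.2f}")

# Residuals from a simple linear fit pick out the unexpected cases
slope, intercept = np.polyfit(predicted, observed, 1)
residuals = observed - (slope * predicted + intercept)
for i in np.argsort(residuals):
    if residuals[i] > 1:
        tag = "unexpected success"
    elif residuals[i] < -1:
        tag = "unexpected failure"
    else:
        tag = ""
    print(f"grantee {i}: residual {residuals[i]:+.1f} {tag}")
```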

Monday, March 20, 2006

The risks of big increases in aid flows to poor countries

The latest IDS Policy Briefing (Issue 25) is titled "Increased Aid: Minimising Problems, Maximising Gains", and contains a summary of papers in a recent IDS Bulletin on the same theme. Many of the concerns raised in this Briefing relate to my own experience and echo my pre-existing concerns. In particular:
- Donors will be pre-occupied with issues of quantity, and their attention will be diverted away from the more crucial question of the quality and effectiveness of aid.

- Absorptive capacity - the ability to put aid to effective use - is already in doubt as a result of poor governance, rigid and unresponsive administrative systems, and above all, the shortage of human resources.

- Governance reforms, that might help with absorption problems, take time. Speeding them up to enable bigger aid flows is unlikely to work.

- Increased aid flows will weaken incentives for recipient governments to reprioritise their existing expenditures and to improve governance.

- Aid coordination efforts will not keep up with increased aid flows from multiple directions, and the burden on the host government of managing its aid will increase.

- If much of the increased aid comes in the form of loans rather than grants (from WB and others) then this will impede recipient governments' efforts to break out of indebtedness and dependence on aid.

- In smaller countries the increased volume of aid might drive up the value of the national currency, making exports more expensive and undermining efforts to base growth on rising exports.

In one of the (summarised) IDS papers, Ros Eyben argues that a surge in aid could magnify certain donor practices that disempower recipient governments and civil society. She is especially concerned about the potentially damaging impact of the "results-based management" endorsed by donors at Monterrey in 2002 as the optimal approach, apparently because it enables donors to define what is happening, and undermines the ability of recipient governments to do so.

I am a little skeptical about the ability of any M&E method or approach to have much of an impact, but I do share her concern about the impact of large increases of foreign aid on national sovereignty and the accountability of governments to their people. Some time ago I sat in on a meeting held to discuss the findings of a report on the cost of achieving the MDGs in country x. The focus of the discussion was almost wholly on the accuracy of the calculations. What was ignored was the massive scale of the required increases and how (if delivered) they would effectively dwarf the country's own revenue sources. And even more importantly, the political implications of such a situation. What sensible government would pay much attention to its own population's expressed needs, when the bulk of its revenue was coming from external aid? No amount of "governance reform" would seem to be able to redress the effects of these perverse incentives.

I was reminded of the slogan of the American revolution "No taxation without representation", and wondered whether the truth of the reverse also needs some publicity: "No (effective) representation without taxation"

I suspect that the use of results-based management may be more a symptom than a cause; a too-simple response to the perverse accountability effects of large aid flows: governments becoming less accountable to their people. RBM might be more useful if it more directly addressed the cause of this quandary: the imbalance of aid revenue versus tax revenue. It could do this by including, as a key performance measure, the ability of the recipient government to increase its tax revenues. Achievement could be associated with increased aid revenues, but not otherwise. The target ratio between tax and aid revenues would of course have to be identified country by country, because tax-raising capacity will vary widely. Ironically, this is one performance measure that the people in the recipient countries would probably not be very happy with, but which could in the longer term empower them more in relation to their governments than any large increase in aid.

I am sure there are already many cases where increased taxation is an objective of concern to donors and host governments. But how often, if ever, have donors been willing to limit their aid flows to some percentage or multiple of recipient countries' tax revenues? Here there is another set of perverse incentives at work, to do with the nature of the "aid business": to do so would be to undermine the resource base on which aid bureaucrats earn their living and through which they find future promotion opportunities. Some will be able to create career opportunities out of a well-argued need to cut aid or to limit aid growth, but not all - by definition.

Further dis-incentives would probably lie ahead, should maximum levels of aid per country ever be agreed. Donor countries would have to agree on who would contribute how much to a limited pot, when in fact many were seeking to maximise their influence. This would be the ultimate "harmonisation" challenge!

Late Note (23/03/06) From "Business in Africa"

"...Taking a look at the country’s national budget, virtually all spend is draft estimates and not accounted for cent by cent. An example of this lack of budget control is highlighted in the SAIIA report, where the director of monitoring and evaluation in the ministry of economic development and planning is quoted as saying, “What we are saying is that out of the 60 billion kwacha in the 2002/2003 budget, we only know of 37 billion kwacha that was used. We don’t know what happened to the rest because of lack of data.”

Foreign support makes up close to half of Malawi’s budget, which means that it is difficult to implement policies when the budget is dependent largely on sources from outside the country. In 2000, International Monetary Fund (IMF) aid and other donor funding was suspended following high-level corruption, archaic fiscal discipline and poor governance under the Bakili Muluzi administration. The IMF and World Bank resumed aid in 2004 when the economic minister in Muluzi’s cabinet, Bingu wa Mutharika, was named president."

Okay, so running a close second is the need for a performance indicator on the availability of adequate government accounts. This would be in the interests of both citizens and supporting donors. No conflict of interest here, unless donors have privileged access to government accounts that citizens do not. Unfortunately this is not an unknown occurrence.

Another part of this latest story is the stop-go nature of aid flows. In highly aid dependent countries that cannot be a good way to proceed. What sort of performance monitoring system would generate such clumsy responses? Either one that was only barely working, or being ignored by its own proponents (or both).

Friday, December 16, 2005

The "attribution problem" problem

I have lost count of the number of times I have seen people make reference to "the attribution problem" as though doing so was a magic spell that dispelled all responsibility to do anything, or to know anything, about the wider and longer term impacts of a project. Ritualistic references to the "attribution problem" are becoming a bit of a problem.

In the worst case I have seen an internationally recognised consultancy company say that "our responsibilities stop at the Output level". And while other agencies might be less explicit, this is not an uncommon position.

This notion of responsibility is very narrow, and misconceived. It sees responsibilities in very concrete terms, delivering results in the form of goods or services provided.

A wider conception of responsibility would pay attention to something that can have wider and longer term impact. That is the generation of knowledge about what works and does not work in a given context. Not only about how to better deliver specific goods or services, but about their impact on their users, and beyond. Automatically, that means identifying and analysing the significance of other sources of influence in addition to the project intervention.

Contrary to some people's impressions, this does not mean having to "prove" that the project had an impact, or working out what percentage of the outcome was attributable to the project (as one project manager recently expressed concern about). Something much more modest in scale would still be of real value. Some small and positive steps forward would include: (a) identifying differences in outcomes within the project location [NB: not doing a with-without trial], (b) identifying different influences on outcomes across those locations, (c) prioritising those influences according to the best available evidence at the time, (d) doing all the above in consultation with actors who have identifiable responsibilities for outcomes in these areas, and (e) making these judgements open to wider scrutiny.

This may not seem very rigorous, but we need to remember our Marx (G.), who when told by a friend that "Life is difficult", replied "Compared to what?" Even if project managers choose to ignore the whole question of how their interventions are affecting longer term outcomes, other people in the locations and institutions they are working with will continue to make their own assessments (formally and informally, tacitly and explicitly). And those judgements will go on to have their own influences, for good and bad, on the sustainability and replicability of any changes. But in the process their influences may not be very transparent or contestable. A more deliberate, systematic and open approach to the analysis of influence might therefore be an improvement.

PS: On the analysis of internal variations in outcomes within development projects, you may be interested in the Positive Deviance initiative at http://www.positivedeviance.org/

Sunday, October 23, 2005

Impact pathways and genealogies

I have been working with three different organisations where the issue of impact pathways has come up. Note the use of the plural: pathways. Network models of development projects allow the representation of multiple pathways of influence (whereby project activities can have an impact), whereas linear / temporal logic models are less conducive to this view. They tend to encourage a more singular vision, of an impact pathway.

In one research funding organisation there was a relatively simple conception of how research would have an impact on people's lives. It would happen by ensuring that research projects included both researchers and practitioners. Simple as it was, this was an improvement on the past, when research projects included researchers and did not think too much about practitioners at all. But there was also room for improvement in this new model. For example, it might be that some research would have most of its impact through "research popularisers", who would collate and re-package research findings in user-friendly forms, then communicate them on to practitioners. And there may be other forms of research where the results are mainly of interest to other researchers. This might be the case with more "foundational" or "basic" research. So there might be multiple impact pathways, including others not yet identified or imagined.

Impact pathways can not only extend out into the future, but also back into the past. All development projects have histories. Where their designs can be linked back to previous projects these histories can be seen as genealogies. The challenge, as with all genealogical research, is to find some useful historical sources.

Fortunately, the research funding organisation had an excellent database of all the research proposals it had considered, including those it had ended up funding. In each proposal the staff had added short lists of other previous research projects they had funded, which they thought were related and relevant to this project proposal. What the organisation has now is not just a list of projects, but also information about the web of expected influences between these projects, a provisional genealogy which stretches back more than ten years.

I have suggested to the organisation that this data should be analysed in two ways. Firstly, to identify those pieces of research which have been most influential over the last 10 to 15 years, simply in terms of influencing many other subsequent pieces of research. They could start by identifying which prior research projects were most frequently referred to in the lists attached to (funded) research proposals. This is very similar to the citation analysis used in bibliometrics. These results would then need to be subject to some independent verification. Researchers' reports of their research findings could be re-read for evidence of the expected influence (including, but not only, their listed citations). The researchers could also be contacted and interviewed.

The second purpose of a network analysis of past research would be to identify a sample of research projects that could be the focus of an ex-post evaluation. With the organisation concerned, I have argued the case for cluster evaluations, as a means of establishing how a large number of projects have contributed to their corporate objectives. But what is a cluster? A cluster could be identified through network analysis, as a group of projects having more linkages of expected influence amongst themselves than with other research projects around them. Network analysis software, such as UCINET, provides some simple means of identifying such clusters in large and complex networks, based on established social network analysis methods. Within those clusters it may also be of interest to examine four types of research projects, having different combinations of outward influence (high versus low numbers of links to others) and inward influence (high versus low numbers of links from others).
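
A sketch of both analyses, using the networkx library (a freely available alternative to UCINET) and hypothetical project identifiers, is given below; the edge list stands in for the "related and relevant" lists in the proposals database:

```python
# Analysing a genealogy of research projects. An edge A -> B means that
# proposal B listed earlier project A as related and relevant.
import networkx as nx
from networkx.algorithms import community

G = nx.DiGraph()
G.add_edges_from([
    ("P01", "P07"), ("P01", "P09"), ("P01", "P12"),  # P01 is often cited
    ("P02", "P07"), ("P07", "P12"),
    ("P03", "P08"), ("P03", "P11"), ("P08", "P11"),
])

# 1. Most influential prior projects: those most often listed by later ones
# (the equivalent of citation counting in bibliometrics)
print(sorted(G.out_degree(), key=lambda x: -x[1])[:3])

# 2. Candidate clusters for ex-post evaluation: groups of projects more
# densely linked among themselves than with the rest of the network
clusters = community.greedy_modularity_communities(G.to_undirected())
for i, c in enumerate(clusters):
    print(f"cluster {i}: {sorted(c)}")
```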

Looking further afield, it may be of value for other grant-making organisations to be more systematic about identifying linkages between the projects they have funded in the past and those they are considering funding now, and then to encourage prospective grantees to explore those linkages, as a way of promoting inter-generational learning between development projects funded over the years.

Saturday, October 22, 2005

Networks of Indicators

A few months ago I was working with a large-scale health project covering multiple regions within a large country. The project design was summarised in a version of the Logical Framework. Ideally a good Logical Framework can help by providing a simplified view of the project's intentions, through a narrative that tells how the Activities will lead to the Outputs, via some Assumptions and Risks, and how the Outputs will lead to the Purpose level changes, via some Assumptions and Risks, and so on. Running parallel to this story will be some useful indicators, telling us when the various events at each stage of the story have taken place.

That is of course in an ideal world. Often the storyline (aka the vertical logic) gets forgotten and the focus switches to the horizontal logic: ensuring there are indicators for each of the events in the narrative, and more!

Unfortunately, in this project, as in many others, they had gone overboard with indicators. There were around 70 in all. Just trying to collect data on this set of indicators would be a major challenge for the project, let alone analysing and making sense of all the data.

As readers of this blog may know, I am interested in network models as alternatives to linear logic models (e.g. the Logical Framework) for representing development project plans and their results. I am also interested in the use of network models as a means of complementing the Logical Framework. Within this health project, with its 70 indicators, there was an interesting opportunity to use a network model to complement, and manage some of the weaknesses of, the project's Logical Framework.

Sitting down with someone who knew more about the project than I did, we developed a simple network model of how the indicators might be expected to link up with each other. An indicator was deemed to be linked to another indicator if we thought the change that it represented could help cause the change represented by the other indicator. We drew the network using some simple network analysis software that I had at hand, called Visualyzer, but it could just as easily have been done with the Draw function in Excel. An "anonymised" version of the network diagram is shown below.

When discussing the network model with the project managers we emphasised that the point behind this network analysis of project indicators was that it is the relationships between indicators that are important. To what extent did the various internal Activities lead to changes in the various public services provided (the Outputs)? To what extent did the provision of various Outputs affect the level of public use of those services, and public attitudes towards them (Purpose level changes)? And to what extent did these Purpose level changes then relate to changes in public health status (Goal level changes)?

The network model that was developed did not fall out of the sky. It was the result of some reflection on the project's "theory of change": its ideas about how things would work, how various Activities would lead to various Outputs and on to various Purpose level changes. As such it remained a theory, to be tested with data obtained through monitoring and evaluation activities. Within that network model there were some conspicuous indicators that would deserve more monitoring and evaluation attention than others. These were indicators that (a) had an expected influence on many other indicators (e.g. govt. budget allocation), or (b) were expected to be influenced by many other indicators (e.g. usage rates of a given health service).
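
As an illustration of how such a model can be interrogated, the sketch below builds a small indicator network with invented indicator names (networkx here standing in for Visualyzer) and picks out the two conspicuous types of indicator just described:

```python
# A sketch of an indicator network. An edge A -> B means a change in
# indicator A is expected to help cause a change in indicator B.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("budget_allocation", "staff_trained"),
    ("budget_allocation", "drugs_in_stock"),
    ("budget_allocation", "clinics_open"),
    ("staff_trained", "service_usage"),
    ("drugs_in_stock", "service_usage"),
    ("clinics_open", "service_usage"),
    ("service_usage", "health_status"),
])

# (a) indicators expected to influence many others...
print("high out-degree:", sorted(G.out_degree(), key=lambda x: -x[1])[:1])
# (b) ...and indicators expected to be influenced by many others
print("high in-degree:", sorted(G.in_degree(), key=lambda x: -x[1])[:1])
```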

The next step, on my next visit, will be to take this first rough-draft network model back to the project staff and refine it, so that it is a closer reflection of how they think the project will work. Then we will see if the same staff can identify the relationships between indicators that they think will be most critical to the project's success, and therefore most in need of close monitoring and analysis. The analysis of these critical relationships may itself be no more sophisticated than a cross-tabulation, or graphing, of one set of indicator measures against another, with the data points representing different project locations.

Incidentally, the network model not only represented the complex relationships between each level of the Logical Framework, but also the complex relationships within each level of the Logical Framework. Activities happen at different times, so some can influence others, and even more so when Activities are repeated in cycles, such as annual training events. Similarly, some Outputs can affect other Outputs, and some Purpose level changes can affect other Purpose level changes. The network model captured these, but the Logical Framework did not.

Wednesday, July 06, 2005

Fight institutional Alzheimer's

I have taken this headline and the following text from POLEX, CIFOR's Forest Policy Expert Listserver run by David Kaimowitz. I am reproducing it in full because I strongly agree with David's conclusions. How many other bilateral or multilateral aid agencies have done something like this recently? If you know of others, let me know.

"They say the good thing about having Alzheimer’s disease is that you are always visiting new places and meeting new people. Many development agencies have apparently taken that to heart. Rapid staff turnover, weak efforts to save and share documents, and strong incentives to repackage old wine in new bottles keep many institutions from learning from the past.

That is why it is good to see the US Agency for International Development (USAID) invest in reviewing everything they have funded related to natural forests and communities during the last twenty-five years. The result is a three-volume report called USAID’s Enduring Legacy in Natural Forests: Livelihoods, Landscapes, and Governance by a Chemonics International team led by Robert Clausen. It provides an overview and ten country studies.

Back in the 1970s, USAID’s forestry activities focused mostly on fuelwood and promoting tree planting as part of watershed management projects. Later, growing concern about deforestation made them shift towards biodiversity conservation and protected areas. After that came a move towards market-based instruments such as forest certification, ecotourism, and tapping consumer demands for non-timber forest products. Over time, they have funded more NGOs and local governments and fewer national bureaucracies. And if the report’s authors have their way, the links between natural resources, democratization, and conflict prevention will soon be high on the agenda.

Through all that time and all those changes, some things remained the same. For example, it is still important to invest in forests for the long-term and get the technical aspects right. You need to work with specific farms, forests, and parks, but keep your eyes on larger landscapes. If no one invests in studying and monitoring forests and their products and services, when it comes time to justify investments or make decisions the data simply won’t be there. Projects need to focus more on ethnic and cultural issues. You ignore conflicts at your own risk.

People with advanced Alzheimer’s can be nice and well-intentioned, but they should not be running the show. If we don’t build up our institutional memory we will keep making the same mistakes, although we may give them another name. Let’s hope other agencies follow USAID’s lead and invest in learning from their own experience."


[If you would like to receive CIFOR-POLEX in English, Spanish, French, Bahasa Indonesia, or Nihon-go (Japanese), send a message to Ketty Kustiyawati at k.kustiyawati@cgiar.org]

regards from Rick, in Cambridge

Monday, May 23, 2005

Using "modular matrices" to describe programme intentions and achievements

There has been an interesting discussion about the pros and cons of Logical Frameworks on the MandE NEWS mailing list. One participant expressed concerns about the unrealistic expectations many people have about the use of the Logical Framework. We should not expect it to do everything: it is supposed to be a summary, to be read alongside narrative accounts, which can be as detailed as needed.

My response was to point out that there is some usable middle ground between long narrative accounts and tables that attempt to summarise a whole programme in a four by four set of cells. That middle ground is what I now call a "modular matrix approach" (MMA). Google defines "modular" as follows: "Equipment is said to be modular when it is made of 'plug-in units' which can be added together to make the system larger, improve the capabilities, or expand its size".

So a Gantt chart can be seen as a modular unit, because it can build onto and extend the LogFrame. It can do this because it has one common dimension: a set of Activities. Another module that I have seen used in association with the LogFrame is a matrix of Outputs x Actors (using the outputs). Here the Outputs are the common dimension that links this matrix and a LogFrame.
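As a rough sketch of that plug-in idea (the Outputs and actor names below are invented), the same list of Outputs can serve as the shared dimension that joins a LogFrame extract to an Outputs x Actors module:

import pandas as pd

# Invented shared dimension: the LogFrame's Outputs
outputs = ["training manual", "health worker database", "annual survey report"]

# Module 1: a LogFrame extract, one row per Output
logframe = pd.DataFrame(
    {"indicator": ["copies distributed", "records entered", "report published"]},
    index=outputs)

# Module 2: an Outputs x Actors matrix with the same rows
users = pd.DataFrame("", index=outputs,
                     columns=["district office", "village posts", "NGO partners"])
users.loc["training manual", "village posts"] = "used in refresher courses"

# Because the two modules share the Outputs dimension they plug together
print(logframe.join(users))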

In the last year or so I have experimented with a range of modules, some of which have proved more useful than others. Ideally, this development process would be a collective enterprise, such that what emerged was a public library of usable planning modules. Some, like the Logframe, would offer a very macro perspective. Others, such as an Activity x Activity module, can provide a more micro perspective on work processes within single organisations.

When developing new matrix modules I use the social network analysis convention that cell contents should describe the relationship from the row actor to the column actor. The actors involved are listed down the left column and across the top row. In practice I also use documents (produced by actors) and events (involving actors). Such matrices allow the representation of networks of communications and influence, not just one-directional chains of cause and effect.
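A small sketch of that convention (the actors and relationships are invented): each cell is read as a statement running from the row actor to the column actor.

import pandas as pd

# Invented actors; directed relationships read FROM row TO column
actors = ["health centre", "district office", "village posts"]
links = pd.DataFrame("", index=actors, columns=actors)

links.loc["health centre", "district office"] = "sends monthly report to"
links.loc["district office", "health centre"] = "allocates budget to"
links.loc["health centre", "village posts"] = "supervises"
print(links)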

A second important convention that I try to follow, implicit in the above description, is that the entities listed on the two axes of such matrices should be verifiable, either by interviewing them (if they are actors), reading them (if they are documents), or reading about them (if they are events). This then allows us to establish whether the links between them were planned, and eventuated, as described. There are probably other conventions that could be developed to ensure that matrix modules developed by different people are compatible, and can add value to the whole.

For some recent practical experiments along these lines see this paper. In the near future I hope to provide a comprehensive summary of this approach in a paper provisionally titled "From Logical to Network Frameworks: A Modular Approach to Representing Theories of Change". That paper will be publicised via the Network Evaluation and MandE NEWS mailing lists.

Saturday, April 02, 2005

Constructing "an auditable trail of intentions...."

A useful report on the current state of PRS monitoring systems has recently been produced by ODI (Lucas, Evans, Pasteur and Lloyd, 2004). PRSs are national level Poverty Reduction Strategies, promoted by multilateral and bilateral aid agencies. In that report the authors argue for more attention to the severe capacity constraints facing governments who are trying to monitor their PRSs. Donors need to take "a less ambitious attitude as to what can be achieved and a willingness to make hard choices when prioritising activities". Later on, in discussing the range of indicator data that might be relevant, they note that "Given scarce resources, a focus on budget allocations and expenditures may well be an appropriate response, particularly if it involves effective tracking exercises with mechanisms to ensure transparency and accountability…Linking these data to a small set of basic service provision indicators that can reasonably reflect annual changes could provide a reasonable starting point in assessing if the PRS is on track."

Meanwhile I have been working on the fringes of a PRS update process taking place in a west African country. While I agree with the line taken above, I now wonder if even this approach is too ambitious! This will be the second PRS for the country I am working in. This time around the government has made it clear to ministries that their budgets will be linked to the contents of the PRS. This seems to have had some positive effect on their levels of participation in the planning groups that are drafting sections of the PRS. By now some draft versions of the updated PRS policies have been produced and circulated for comment within a privileged circle (including donors). Some attempts have been made at explicitly prioritising policy objectives, but only in one of five policy areas. Meanwhile a deadline is approaching at high speed for identifying and costing the programmes that will achieve these policy objectives. This is all due by the end of this month, April 2005. The results are then expected to feed into a public consultation and then into the national budget process starting in June. However, as yet there is no agreed methodology for the costing process. As the deadline looms, the prospects increase for a costing process that is neither systematic nor transparent (aka business as usual).

If the process of constructing the costings is not visible, then it becomes very difficult to identify the specific linkages between specific PRS policy objectives and specific items in the national budget. So while we can, on ODI's good advice, monitor budget allocations and expenditures, what they mean in terms of the PRS policy objectives will remain an act of literary interpretation, something that could easily be questioned.

The IMF and UNDP have, I think, both had some involvement in the costing of broad policy objectives, including the achievement of the MDGs. However, from what I can see these costings have been undertaken by consultant economists, primarily as technical exercises. I am not sure this is the right approach. The budgets of ministries are political resources. The alternative approach is to ask ministries to say how they will use their budgets to achieve the various PRS policy objectives, and while doing so make it clear that their performance in achieving those selected objectives with their budgets will be assessed. To do this we (or an independent agent) will need what can be described as "an auditable trail of intentions", from identifiable policy objectives to identifiable programmes, with identifiable budgets, to identifiable outputs and maybe even identifiable outcomes.

There is an apparent complication. This auditable trail will not be a simple linear trail, because a single policy objective can be addressed by multiple programmes, and a single programme can address more than one policy objective. The same is true of the relationship between a ministry's programmes and outcomes in poor people's lives. However, an audit trail can be mapped using a series of linked matrices (each of which can capture a network of relationships). These could include a PRS Policy Objectives x Ministry's Programmes matrix, a Ministry's Budget Lines x Ministry's Programmes matrix, a Ministry's Programmes x Outputs matrix, and an Outputs x Outcomes matrix, as sketched below. This seems complex, but so is the underlying reality. As Groucho Marx said when his friend complained that life is difficult: "Compared to what?"
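As a rough sketch of how such a chain of matrices can be put to work (all entries invented), boolean incidence matrices can be multiplied together to trace which policy objectives are, on paper, connected through programmes to outputs; a non-zero cell means at least one documented path exists:

import pandas as pd

objectives = ["reduce child mortality", "raise school enrolment"]
programmes = ["immunisation programme", "school feeding programme"]
outputs = ["children vaccinated", "school meals served"]

# 1 = a documented link, 0 = none; note the many-to-many relationships
obj_x_prog = pd.DataFrame([[1, 1],   # child mortality: both programmes
                           [0, 1]],  # enrolment: school feeding only
                          index=objectives, columns=programmes)
prog_x_out = pd.DataFrame([[1, 0],
                           [0, 1]], index=programmes, columns=outputs)

# Composing the matrices gives the objective-to-output trail
print(obj_x_prog.dot(prog_x_out))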

Postscript: Parallel universes do exist. Proof: a five year national plan with lists of policy objectives in the front and lists of programmes (with their budgets) in the back, but no visible connections between the policy objectives and the programmes and budgets.


Identifying the impact of evaluations: Follow the money?

Some years ago I was involved in helping the staff of a large south Asian NGO to plan a three-yearly impact assessment study. It was almost wholly survey based. This time around a colleague and I managed to persuade the unit responsible for the study to take a hypothesis-led approach, rather than simply trawl for evidence of impact by asking as many questions as possible about everything that might be relevant. The latter is often the default approach to impact assessment, and it usually results in very large reports being produced well after their deadlines.

With some encouragement the unit managed to generate a number of hypotheses of the form: if X Input is provided by our NGO and Y Conditions prevail, then Z Outcomes will occur (aka Independent variable + Mediating variable = Dependent variable). Ostensibly, they were constructed after consultations with line management staff, to build their interest in and ownership of what was being researched. The quality of the hypotheses that emerged was not great, but things went ahead. Questions were designed to gather data about X, Y and Z, and cross-tabulation tables were constructed to enable analysis of the results, showing with/without comparisons. The survey went ahead, the data was collected and analysed, and the report was written up. The analytic content of the report was pretty slim, and not well embedded in past research done by the NGO. But it was completed and submitted to management, and to donors. My inputs had ended during the drafting stage. The study then seemed to sink without trace, as so often happens.
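A sketch of the kind of cross-tabulation involved, with invented survey records: Input X against Outcome Z, stratified by Condition Y, giving the with/without comparison.

import pandas as pd

# Invented survey records, one row per respondent group
survey = pd.DataFrame({
    "received_training": ["yes", "yes", "no", "no", "yes", "no"],  # Input X
    "group_has_leader":  ["yes", "no", "yes", "no", "yes", "yes"], # Condition Y
    "group_performing":  ["yes", "no", "yes", "no", "yes", "no"],  # Outcome Z
})

# With/without comparison of X against Z, split by the mediating condition Y
table = pd.crosstab([survey["group_has_leader"], survey["received_training"]],
                    survey["group_performing"])
print(table)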

A year or so later a report was produced on the M&E capacity building consultancy that I had been part of when all this had happened. In that report was a reference, amongst other things, to the impact assessment study. It said “The study also produced some controversial findings in relation to training, as it suggested that training was a less important variable in determining the performance of groups than had previously been thought. This finding was disputed at the time, but when [the NGO] had to make severe budget cuts in 2002-3 following the blocking of donor funds by the [government], training was severely cut. There is though still an urgent need for [the NGO] to undertake a specific study to review the relative effectiveness of different types of training.”