Sunday, April 25, 2010

Evaluating a composite Theory of Change (ToC)

One NGO I have been working with has a project that is being implemented via partner organisations in four different countries. There is one over-arching LogFrame capturing the generic ToC, but the situation in each country is quite different. So country specific LogFrames were developed to recognise that fact. However, for convenience of reporting to the back donor the progress report format has been based on the contents of the generic LogFrame. When it comes to the Mid-Term Review more attention will need to be paid to the country specific LogFrames. But then how will the four MTR results be systematically aggregated into one synthesis report?

Other colleagues of mine have had to review a funding mechanism involving 30 or more project partners. The diversity of activities on the ground there is even greater. Rather than focus on the original LogFrame that describes the purpose of the funding mechanism they got all the project staff and local partners together to construct a retrospective ToC that fitted all the funded activities. That is how they have dealt with the micro-macro compatibility issue.

How then have they evaluated the 30 projects in terms of this composite ToC? Well, because the ToC was reconstructed retrospectively there was no set of readily available monitoring data, for example as based on indicators in a LogFrame. Instead, the main source of evidence has been the qualitative data gathered from interviews with a large and diverse number of stakeholders. This approach has not met with approval in some quarters, which then prompted me to think how you could solve this problem. I should start by saying I do like the idea of a retrospectively constructed ToC, so long as there is also some accountability for its transition from any prior form such as a LogFrame.

My first thought was to treat the 30 projects as “units of evidence” and to try to classify them as having achieved, or not achieved, each stage in the ToC. The ToC was in a graphic form, with multiple events happening at different stages, and with lines representing causal links connecting the various events. The problem would then be how to classify each project. One possible means would be to get stakeholders to “success rank” the projects in relation to a given outcome event in the ToC, and then identify a cut-off point in the success ranking representing an acceptable level of achievement. This cut-off point would need to be explained and justified. To start with, the focus of these success ranking exercises should be on those events most central to the ToC, i.e. those with many incoming and outgoing causal links. If there was sufficient time we would end up with a percentage (of projects) measure for each outcome, which could if necessary be weighted by scale of expenditure.
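To make the classification step more concrete, here is a minimal sketch in Python of how success ranks and a cut-off might be turned into an achieved/not-achieved classification and then into a percentage measure. All of the project names, ranks, expenditures and the cut-off value are hypothetical, and in practice the cut-off would need to be negotiated and justified with stakeholders.

```python
# A minimal sketch (hypothetical data): turning stakeholder "success ranks" for
# one ToC outcome event into an achieved / not achieved classification, then
# into a percentage-of-projects measure, optionally weighted by expenditure.

# Hypothetical ranks (1 = most successful) and expenditures for six projects.
success_rank = {"P1": 1, "P2": 2, "P3": 3, "P4": 4, "P5": 5, "P6": 6}
expenditure = {"P1": 120, "P2": 80, "P3": 60, "P4": 200, "P5": 40, "P6": 100}

# The cut-off is an assumption that would need to be explained and justified.
cutoff_rank = 3
achieved = {p: rank <= cutoff_rank for p, rank in success_rank.items()}

pct_projects = 100 * sum(achieved.values()) / len(achieved)

total_spend = sum(expenditure.values())
pct_weighted = 100 * sum(expenditure[p] for p, ok in achieved.items() if ok) / total_spend

print(f"{pct_projects:.0f}% of projects achieved this outcome")
print(f"{pct_weighted:.0f}% of expenditure was in projects that achieved it")
```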

We could then turn our attention to examining what happened to the expected causal relationships between the events in the ToC. At this stage it might be possible to do simple 2 x 2 cross tabulations of the relationships between related pairs of events in the ToC, by counting numbers of projects that had achieved both of the events, achieved one but not the other, and achieved none at all. A Chi Square test could tell us how significant the observed relationship was.
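As a minimal sketch of this step, with entirely hypothetical counts: the table below classifies 30 projects by whether they achieved two linked events, A and B. With cell counts this small, Fisher's exact test is often a safer companion to the Chi-Square test, so both are shown (scipy is assumed to be available).

```python
# A minimal sketch (hypothetical counts) of a 2 x 2 cross-tabulation of two
# linked ToC events across 30 projects, with a Chi-square test of association.
# Fisher's exact test is also shown, as it copes better with small cell counts.

import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

#                      Event B achieved   Event B not achieved
table = np.array([[12, 3],    # Event A achieved
                  [4, 11]])   # Event A not achieved

chi2, p_chi2, dof, expected = chi2_contingency(table)
odds_ratio, p_fisher = fisher_exact(table)

print(f"Chi-square = {chi2:.2f}, p = {p_chi2:.3f} (dof = {dof})")
print(f"Fisher's exact p = {p_fisher:.3f}, odds ratio = {odds_ratio:.1f}")
# A small p-value suggests that projects achieving A were disproportionately
# likely to also achieve B - evidence of association, not proof of the causal link.
```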

There is however a potential difficulty with the next step in the process, which is to look at the larger picture of how all of the events are causally linked together to make the whole ToC work as expected. Event A may be linked to Event B by project X (and others) achieving both, and Event B may be linked to Event C by project Y (and others) achieving both. But if projects X and Y are in different locations then the connection between Events A and C would seem very questionable, because the two Event Bs probably also happened in two different locations. In the worst case we could end up with a ToC where many of the individual causal links were working as expected, but where there was no evidence of the whole set of causal links working together. At the least we would have to investigate the relationships between the projects that created the larger causal pathway. In the imagined example above, we would need to look at projects X and Y and how their achievements of Event B inter-related.

This sounds complex, and it is. One means of simplification would be to identify and focus on the most important causal pathway in the ToC. In my colleagues’ ToC there were at least two major pathways and a number of minor variations. Identifying points of failure in the network of events would be one way forward. This could be done in two stages (a rough sketch follows below):
  • Success ranking might show that some outcome events in the ToC were not satisfactorily achieved by any projects. In that case the pathway each such event belonged to would be broken.
  • Chi-square tests might show that although some pairs of events both took place, there was no significant association between them (in the form of the number of projects achieving both). Again, the pathway they were part of would be broken.
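Here is a minimal sketch of these two stages, using a hypothetical ToC held as a directed graph in networkx. Events that no project achieved satisfactorily, and links with no significant association, are removed, and what remains is checked for an intact pathway from the first event to the final outcome. The event names, the failed events and the broken links are all invented for illustration.

```python
# A minimal sketch (hypothetical ToC) of the two-stage "points of failure" check.

import networkx as nx

# Events as nodes, expected causal links as directed edges (all invented).
toc = nx.DiGraph([("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("C", "E")])

# Stage 1: events no project achieved satisfactorily (from the success ranking).
failed_events = {"D"}
# Stage 2: links where the cross-tabulation showed no significant association.
broken_links = {("B", "C")}

surviving = toc.copy()
surviving.remove_nodes_from(failed_events)
surviving.remove_edges_from(broken_links)

# Does any unbroken pathway remain from the first event to the final outcome?
if nx.has_path(surviving, "A", "E"):
    print("Intact pathway(s):", list(nx.all_simple_paths(surviving, "A", "E")))
else:
    print("No intact causal pathway from A to E remains")
```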

This reflection on causal pathways makes me think that the challenge of having to sort out “the attribution problem” for the final event in the ToC might be what could be called “a problem we would like to have”. It would presume that we have already established a plausible pathway of influence of our own activities. The discussion above suggests it may not be so easy in some cases.

Wednesday, December 02, 2009

Reflections on Dave Snowden’s presentations on sense-making and complexity

... at the Wageningen Innovation Dialogue, 30 November -1st December 2009

From my point of view, one of the most interesting and important challenges is how to create useful representations of large, complex, dynamic structures, especially as seen by participants in those structures - for example, multi-stakeholder processes in operation at national and international levels. Behind this view is an assumption that if we have better representations then we will have more informed choices about how to respond to that complexity. Note that the key phrase here is “respond to”, not “manage”. The scale of ambition is more modest. Management of complexity only seems feasible when it is on a small scale, such as the children’s play group example cited by Dave Snowden (DS).

I have had a long standing interest in one particular set of tools that can be used for producing representations of complex structures. These are social network analysis (SNA) methods and associated software. During the workshop Steve Waddell provided a good introduction to SNA and related tools.

DS’s presentations on the sense-making approach provided a useful complementary perspective. This was all about making use of large sets of qualitative data, of a kind that cannot be easily used by SNA tools. Much of this data was about people’s voices, values and concerns, all in the form of fairly unstructured and impromptu responses to questions asked by their peers (who were trained to do so). These are called “micro-narratives” (MNs).

DS’s sense-making process (and associated software) is innovative in at least three respects. Firstly, the sheer scale: up to 30,000 items of text collected and analysed in one application. In many cases this would be more like a census than a sample survey. I have never heard of qualitative data being collected on this scale before, nor as promptly (including the time spent on analysis), as in the case of the Pakistan example. Secondly, and related to this, is the sophistication and apparent user-friendliness of the bespoke software and hardware that was used.

More interesting, and more important, was the decision to ask respondents to “self-signify” the qualitative information they had provided. This was done by asking the respondents to describe their own MNs by using two different kinds of scales, to rate the presence of different attributes already identified by the researchers as being of concern. The consequence of respondents providing this meta-data was that all the MNs could be given a location in a three dimensional space. In fact a number of different kinds of three dimensional spaces, if many self-signifiers were used. Within that space it was then possible for the researcher to look for clusters of MNs. Of special interest were clusters of MNs that were outliers, i.e. those that were not part of the centre of the overall distribution of MNs.
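What follows is not the software DS described, just a minimal sketch of the general idea: hypothetical self-signifier scores position each micro-narrative in a 3-D space, and a density-based clustering step (DBSCAN here, an assumption on my part) separates the central mass of responses from smaller outlying clusters.

```python
# A minimal sketch of the general idea only (not of DS's software): hypothetical
# self-signifier scores place each micro-narrative (MN) in a 3-D space, and
# density-based clustering separates the main body from outlying clusters.

import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
central = rng.normal(loc=0.5, scale=0.08, size=(200, 3))   # main body of MNs
outlying = rng.normal(loc=0.9, scale=0.03, size=(12, 3))   # a small outlying cluster
points = np.clip(np.vstack([central, outlying]), 0, 1)

labels = DBSCAN(eps=0.08, min_samples=5).fit_predict(points)
for label in sorted(set(labels)):
    members = points[labels == label]
    kind = "unclustered outliers" if label == -1 else f"cluster {label}"
    print(f"{kind}: {len(members)} MNs, centroid {members.mean(axis=0).round(2)}")
```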

There are echoes here of the expectation that the collection and analysis of Most Significant Change (MSC) stories will help organisations identify “the edges of experience” - the kinds of changes they would want to see more examples of in future (if positive), or fewer (if negative). The difference is DS's use of quantitative data to make these outliers more identifiable, in a transparent manner.

As far as I understand it, an additional purpose of using self-signifiers to identify clusters of MNs is to prevent premature completion of the process of interpretation by the researcher, and thus to strengthen the trustworthiness of the analysis that is made.

On the first day of the workshop I had two reservations about the approach that had been described. The first was about the “fitness landscape” that was drawn within the three dimensional space. How it was constructed, and why it was needed, were unclear to me. My understanding now is that this surface is a mathematical projection from the 30,000 data points in that 3-D space (in the Pakistan example), a bit like a regression line in a 2D graph. One advantage of this constructed landscape is that it enables observers to have a clearer understanding of how these numerous MNs relate to each other on the three dimensions. When they are simply dots hanging in space this is much more difficult to do.

I also wondered why “peak” locations were designated as peaks, and not troughs, and vice versa. This seems to be a matter of researcher choice. This seems okay, if the landscape has no more significance than a visual aid, as suggested above. But in some complexity studies peaks in landscapes are presented as unstable locations, and troughs as stable points, acting as “attractors”. Is it likely that any pole of any of the self-signifying scales will show this type of behaviour? If not, might it be better not to talk about fitness landscapes, or at least be very careful about not giving them more apparent significance than they merit? A related claim seems to have been made when DS said “Fitness landscapes show people where change is possible”. But is this really the case? I can’t see how it can be, unless desirable/undesirable attributes are built into the self-signifying scales chosen to create the 3D space. There is a risk that the technical language that is being used imputes more independent analytic capacity than the software has in reality.

The other concern I had was about who chooses the scales used to self-signify. I should say that I do think it is okay to derive these from a relevant academic field, or from the concerns of the client for the research. But might it provide an even more independent structuring of the MN data if these scales were somehow also derived from the respondents themselves? On reflection, there seems to be no way of doing this when the sense-maker approach is applied on a large scale.

But on a much smaller scale I think there may be ways of doing this, by using a reiterated process of inquiry, rather than a one-off process. I can provide an example by using data borrowed from a stakeholder consultation process held in rural Australia a few years ago. In the first stage respondents generated the equivalent of MNs. In this case they were short statements about how they expected a new fire prevention programme to help them and their community. These statements were in effect informal “objectives”, written in ordinary day-to-day language, on small filing cards. In the next stage the same individual stakeholders were each asked to sort these statements into a number of groups (of their own choosing), each group describing a different kind of expectation. Each of these groups was then labelled by the respondent who created it. The data from these card-sorting exercises was then aggregated into a single cards x cards matrix, where each cell value described how often the row card had been placed in the same group as the column card.
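As a minimal sketch of this aggregation step, with invented card-sort data: each respondent's groupings are converted into pair-wise co-occurrence counts, which together form the cards x cards matrix (in sparse form here).

```python
# A minimal sketch (invented card-sort data) of aggregating individual sorts
# into pair-wise co-occurrence counts - the cards x cards matrix in sparse form.

from collections import defaultdict
from itertools import combinations

# Each respondent's sort: a list of groups, each group a set of card IDs.
sorts = [
    [{1, 2, 3}, {4, 5}, {6}],      # respondent 1
    [{1, 2}, {3, 4, 5, 6}],        # respondent 2
    [{2, 3}, {1, 4}, {5, 6}],      # respondent 3
]

co_occurrence = defaultdict(int)
for groups in sorts:
    for group in groups:
        for a, b in combinations(sorted(group), 2):
            co_occurrence[(a, b)] += 1  # times cards a and b shared a group

for (a, b), count in sorted(co_occurrence.items()):
    print(f"cards {a} and {b}: placed together {count} time(s)")
```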

Here the card sorting exercise was in effect another means of self-signifying. It was generating meta-data: statements (group labels) about statements (individual expectations). Unlike the tripolar and bipolar scales used in DS’s sense-making approach, it did not enable a 3D space to be generated in which all the 30 statements could be given a specific location. However, the cards x cards matrix was a data set that many SNA software tools can easily use to construct a network diagram, which is a 2D presentation of complex structures. The structure that was generated is shown below. Each node is a card; each link between two cards represents the fact that those two cards were placed in the same group one or more times (with line thickness showing how often). Clusters of cards all linked to each other were all placed in the same group one or more times. When using one software package (Visualyzer), a “mouseover” on any node can be used to show not only the original card contents (the expectation), but also the labels of the one or more groups that the card was later placed in. In this adapted use of self-signifiers the process of grouping cards adds additional qualitative information and meaning to that already present in the card contents.

As well as being able to identify respondent-defined clusters of statements, we can also sometimes see links between these clusters. The links are like a more skeletal version of the landscape surface discussed above. The “peaks” of that landscape are the nodes connected by strong links (i.e. the two cards were placed in the same group multiple times). These can be made easier to identify by applying a filter to screen out the weaker links. This is the metaphorical equivalent of raising the sea level, covering the lower levels of the landscape.
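Here is a minimal sketch of that "raising the sea level" filter, using a small invented co-occurrence matrix of the kind produced by the card-sort aggregation above: links below the average weight are screened out, and the cliques among the remaining strongly linked cards stand out.

```python
# A minimal sketch of the "raising the sea level" filter: build a network from
# co-occurrence counts (a small invented matrix here) and keep only links at or
# above the average weight, so the strongest cliques of cards stand out.

import networkx as nx

co_occurrence = {(1, 2): 3, (1, 3): 1, (2, 3): 2, (3, 4): 1,
                 (4, 5): 3, (5, 6): 2, (4, 6): 2}

g = nx.Graph()
for (a, b), count in co_occurrence.items():
    g.add_edge(a, b, weight=count)

mean_weight = sum(d["weight"] for _, _, d in g.edges(data=True)) / g.number_of_edges()

strong = nx.Graph()
for a, b, d in g.edges(data=True):
    if d["weight"] >= mean_weight:  # the "sea level" cut-off
        strong.add_edge(a, b, weight=d["weight"])

print("Links kept (at or above average weight):", list(strong.edges(data="weight")))
print("Cliques among strongly linked cards:", list(nx.find_cliques(strong)))
```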

The virtue of this network approach to analysing MNs is its very participative nature. Its limitation is its modest scalability. The literature on sorting methods suggests an upper limit of around 50 cards (I will investigate this further). While this is much less than 30,000, many structured stakeholder consultation processes involve fewer participants than this.

Key: Numbers represent the IDs of each card. Links indicate that the two cards were placed in the same group one or more times. Thicker links = placed in the same group more often. Yellow nodes = the most conspicuous cliques of cards (all often co-occurring). This image shows the strongest links only (i.e. those above the average link value). The mouseover function is not available for this image copy.

My final set of comments is about some of the risks and possible limitations of DS’s sense-making approach. The first concern is about transparency of method. To newcomers, the complexity terminology used when introducing the method is challenging, to say the least. At worst I wonder whether it is an unnecessary obstruction, and whether a shorter route to understanding the method would exist if less complexity science terminology were used. The proprietary nature of the associated software is a related concern for me, though I have been told that there is an intention to make an open source version available. Open source means open to critique and open to improvement through collective effort, which is what the progress of science is ideally all about. The extensive use of complexity science terms also seems to make the approach vulnerable to corruption and possible ridicule, as people decide to “pick and mix” the bits and pieces of complexity ideas they are interested in, without understanding the basics of the whole idea of complexity.

Another issue is commensurate benefits. After seeing the scale of the data gathering involved, and the sophistication of the software used, both of which are impressive, I did wonder whether the benefits obtained from the analysis were commensurate with the costs and efforts that had been invested, at least in the examples we were told about. Other concerns are not exclusive to the sense-making approach. What about the stories not told? Perhaps with almost census-like coverage of some groups of concern this is less of a worry than with other large-scale ethnographic inquiries. What about unexpected stories? Is the search for outliers leading to the discovery of views which are a surprise to the clients of the research, and of possible consequence to their plans on how to relate to the respondents in the future? And are these surprises numerous enough, or dramatic enough, to counterbalance the resources invested in finding them?
------------------

"At the heart of all major discoveries in the physical sciences is the discovery of novel methods of representation" - Steven Toulmin

Friday, October 30, 2009

On the poverty of baselines and targets...

I have been surprised to see how demanding DFID has become on the subject of baseline data. On page 13 of the new DFID Guidance (on using the new formatted Logical Framework) it is stated that "All projects should have baseline data at all levels before they are approved. In exceptional circumstances, projects may be approved without baseline data at Output level..." Closer to the ground I have witnessed a UK NGO being pressed by DFID-appointed managers of a funding mechanism to deliver the required baseline data. This is despite the fact that the NGO's project will be implemented in a number of countries over a period of years, not all at once.

Meanwhile, in Uganda and Indonesia, I am watching two projects coming to an end. Both had baseline data collected shortly after they started. Neither is showing any signs of intending to do a re-survey at the end of the project period. Is anyone bothered? Not that I can see. Including DFID, which is a donor supporting one of the projects. And in both cases the baseline surveys were expensive investments. To make matters worse, in one country the project performance targets were set before the baseline study, and in the other they have never really been agreed on.

I have just completed the final review of one project. We have diligently compared progress made on a set of indicators, against all the original targets. There are of course the usual problems of weak and missing data, and questionable causal links with project interventions. But what bothers me more is how outdated and ill-fitting some of these initial performance measures are. And how little justice this mode of assessment seems to be doing to what the project has been able to do since it started, especially the flexibility of its response in the face of the changing needs of the main partner organisation. Of even greater concern is the fact that this project is being implemented in a large number of districts, in a country that has been going through a significant process of decentralisation. Each district's capacities and needs are different, and not surprisingly the project's activities and results have varied from district to district. There is in fact no single project. Yet our review process, like many others, has in effect treated these district variations as "noise", obscuring what were expected to be region-wide trends over time.

I am now working on some ideas of how to do things differently in my next project review, in the same country. This time the focus will be more on internal comparisons: (a) between locations, (b) between time periods during the project period.

Tuesday, October 27, 2009

Why we should make economists work harder

"Why we should make life harder for aid agencies" is the title of an article by Tim Harford ("The Undercover Economist") in last weekend's Financial Times magazine section.

I agree with the sentiment, but not with the analysis. I was expecting better, given what I have read of Tim in the past.

Tim's article starts with the problem of how we, as individual donors, can be sure that our aid goes in the right direction and has the expected impact. The next problem, as seen by Tim, is that aid agencies are bureaucracies. The solution is competition via a more open market. From within this perspective recent efforts at aid "harmonisation" are viewed by Tim with suspicion, and seen as almost equivalent to establishing a cartel.

He then asks whether agencies could be made to compete, not only with each other but even with private companies, to get funding from donor organisations. And whether money (or rather vouchers) could be given directly to aid recipients to spend, redeemable for services provided by a range of charities and aid agencies. These ideas he sees as "radical" and possibly "far fetched". More immediately, he suggests we could "start by asking simple questions about where aid comes from, where it goes, how effective it is and how much is lost to administration – or worse."

I hope Tim will be pleasantly surprised to find his ideas are not seen as radical or far fetched, and in fact have been in play now for quite some time. What Tim really needs to do (apart from more homework before writing articles like this) is to start questioning the assumptions behind his analysis of the nature and benefits of competition amongst aid agencies.

1. In most ordinary markets the purchaser and user are one and the same person. The purchaser and user of aid agency services are different parties, separated by continents and cultures.

2. In between them is not a single supplier, but a large and complex international aid supply network. See my map of one of the simpler aid supply networks, in a Guardian-funded development project in Uganda (the map is at the end of the article).

3. The quality of the product/service being provided is much more difficult to assess than that of many goods and services markets in the UK. Measurement of poverty reduction is a field of its own; improved governance is another order of magnitude more difficult to assess, but is nevertheless a common development objective. There are some more measurable outcomes, such as those captured via the Millennium Development Goals, e.g. reduced maternal mortality. But these usually require changes in the performance of institutions, e.g. national health services. These sorts of change are not simple to measure, let alone achieve. Aid agencies can avoid this challenge by directly supplying health services to poor communities, but they will then fail on another performance metric: sustainability.

Tim's idea of vouchers (above) could best be described as quaint. It is now commonplace for aid agencies in humanitarian emergencies to give cash handouts to families in need, not just vouchers, so they can buy what they need from anyone, not simply "a range of charities and aid agencies". Cash transfers are also being tested for their usefulness in development programmes, where there is no emergency present.

Competition between aid agencies is happening all over the place. DFID has, for years, invited tenders from a wide range of organisations to implement its aid programmes (see their Current Contract Opportunities page). But what difference is this making? That is the question. By contracting out work to others DFID moves its own "overhead costs" off its own books, onto others. But the overheads are still there. In fact they are multiplied, because in order to win contracts multiple organisations invest substantial amounts of time and effort into producing complex documents, but only one wins. The losing bids are not products that can easily be sold to other possible buyers, like unused factory stock. Instead the costs of their non-use are figured into subsequent bids, including those that win.

So, costs will have gone up, but what about effectiveness? If that has improved, then the increased costs would be justifiable. The problem is, as touched upon above, that it is very difficult to measure the effectiveness of many contracted-out projects, because of the scale and complexity of the changes they are trying to achieve.

Wednesday, September 23, 2009

Constructing longer term perspectives

A few weeks ago a friend asked me for help with ideas for a presentation that needed to be made on "challenges for the international development sector..."

Not an easy task, where do you start? But I knuckled down and did some reflection. Work I am doing on DFID and AusAID funded projects in Indonesia ended up as the source of some ideas that may be useful. I have been working on these since late 2005 and the work continues until early 2010.

My short reply to my friend was as follows:

How to ensure that development interventions are designed and implemented within a long term perspective, that extends way beyond the typical 3-5 year planning cycles

There is a massive contradiction between the short term nature of project designs and what most people know about how long development can take (both technological and social).

If project planning cycles cannot be lengthened (e.g. because of government budget cycles and election cycles) then how can we make sure that these planning cycles are better linked up, into a more coherent longer term intervention? This is no easy task when there is constant staff turnover both within government and aid agencies. Strategy papers by themselves are not much use, because they have their own continuity problems: no new boss wants to simply say, yes, we will do more of the same. Everyone wants to re-write the strategy in their own image.

In Jakarta in October we will be holding an end-of-project review workshop for the DFID-funded, GTZ-implemented, AusAID-monitored, government-owned SISKES project (maternal and neonatal health). One of the two workshop objectives is

To engage participants in a longer term perspective on MNH development, that exceeds the typical 3-5 year project lifespan

  • By looking back, on developments since the beginning of the decade
  • By looking forward to up to four years in the future.
We will be including some people associated with a new AusAID MNH project in one of the two districts that GTZ are pulling out of. Plus a cast of thousands (well, 52 other participants so far).

One of the workshop exercises will be to engage participants in predicting trends in key service provision indicators over the next four years, based on their knowledge acquired through the SISKES project and other sources, and then analysing the implications of these expected trends for the incoming projects, including the new AusAID MNH project.

There is also a need for more connecting events at the design stage as well, where various stakeholders from prior and parallel related developments are brought in to inform planning decisions, or at least the choices to be considered. Often the consultants on design missions are about the only bridges to the past.

For other people's efforts to promote really long term thinking, see the Long Now Foundation

Monday, August 10, 2009

Bibliographic Timelines

It is a simple idea, but one that looks useful

During a recent mid-term review of AMREF's Katine Community Partnerships Project, I started to create a bibliography of project related documents, with a difference. Normally documents listed in a bibliography are structured in alphabetical order, by the author's name. This helps you find the document if you know the author's name, but not much more.

This time I listed all the project documents in time order, by the year and the month when they were produced, starting with the oldest. In the text of most reports referenced documents are usually referred to by their author and date, so it is still easy to find cited documents in this chronologically ordered list. The added advantage of this "bibliographic timeline" is that it also gives you (the reader and/or writer) a quick sense of the history of the project. Most document titles make some reference to the event they are describing (e.g. baseline studies, needs assessments, workplans, annual reports, etc), so by scanning down the bibliography you can quickly get a rough sense of the sequence of activities that have taken place. Even though there may be a time lag between an event and when it is documented (say in the next month).
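The mechanics of this are trivial; here is a minimal sketch, with invented document entries, of sorting a reference list by year and month of production rather than by author.

```python
# A minimal sketch (invented entries) of a "bibliographic timeline": the same
# document list, sorted by year and month of production rather than by author.

docs = [
    {"author": "Author A", "year": 2008, "month": 3, "title": "Baseline survey report"},
    {"author": "Author B", "year": 2007, "month": 10, "title": "Project proposal"},
    {"author": "Author C", "year": 2008, "month": 1, "title": "Community needs assessment"},
]

for d in sorted(docs, key=lambda d: (d["year"], d["month"])):
    print(f'{d["year"]}-{d["month"]:02d}  {d["author"]}, "{d["title"]}"')
```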

I have attached below a graphic image of the "bibliographic timeline" that was produced this way. Click on the image to get more detail.

Tuesday, December 23, 2008

Comments on the draft DFID evaluation policy


DFID and the Independent Advisory Committee on Development Impact have sought public comments on two documents: the Draft Evaluation Policy and the Evaluation Topic List. More information on the public consultation process can be found at MandE NEWS.

Comments can be emailed to evaluationfeedback@dfid.gov.uk. Below are two sets of comments that I have sent in:

1. The need for a meta-evaluation of the results of the decentralised evaluation policy

In the List of Potential Evaluation Topics, readers are invited to comment on “any topics you consider very important that we have not listed here”.

One gap which I noted was the lack of any reference to meta-evaluation of the many evaluation activities carried out within the country programmes.

However, the draft Evaluation Policy mentioned above makes eleven references to the role of “decentralised evaluation”. DFID’s decentralised evaluations “are those commissioned by our staff responsible for managing DFID’s programmes, policies and partnerships, normally in collaboration with their development partners”.

The references to decentralised evaluations covered the following areas:
- increased use of decentralised evaluation as one of the 4 major priorities for developing the evaluation function in DFID. p.11
- sustaining a strong culture of decentralised evaluation across the Department. p.16
- strengthening its advisory and quality support role for decentralised evaluations p.17
- quality assurance of decentralised evaluations. p.4, p.16
- helping to set standards, providing support and advice, and reporting on quality. p4

But there are no references to a systematic or periodic meta-evaluation of decentralised evaluations. This seems like a major omission. Authority for evaluation has been decentralised, and advisory support and guidance will be provided, but there is no evident complementary mechanism for assessing the results.

PS: meta-evaluations are not the same as synthesis studies. A synthesis study looks at the findings across a number of evaluations; a meta-evaluation looks at the evaluation methods used by a number of evaluations. Most organisations, including DFID, already do quite a few synthesis studies.

2. The need for consultation on evaluation criteria, not just what should be evaluated

There needs to be some debate not just about what is to be evaluated, but also about what criteria will be used.

So far, during the present consultation, the question of what to evaluate has been the subject of a separate DFID paper (the Evaluation Topic List), but the question of which criteria to use has only warranted a short section in an annex to the draft policy paper. In that annex DFID lists “the internationally-agreed evaluation criteria …[that] will be applied to DFID evaluations”. They appropriately note that “It will not be appropriate to investigate every criterion in depth in every evaluation. DFID evaluators will be requested to provide an explanation of the criteria they have chosen (or not) to cover”. The listed criteria are: 1. Relevance, 2. Effectiveness, 3. Efficiency, 4. Impact, 5. Sustainability, 6. Coverage, and 7. Coherence.

Elsewhere on this blog I have argued for the inclusion of two additional criteria to the traditional DAC five (1-5 above). These are equity and transparency.

It could be argued that criterion 6 (coverage) already covers equity. However the choice of words can be important. Coverage is an apparently technical term, but equity is explicitly about a value: fairness, of process and outcome. DFID’s desire to eliminate poverty is a statement about values. Values should be clearly stated, not hidden or assumed.

Transparency is not covered at all. Yet transparency is basic to the whole process of evaluation, especially when viewed in a wider context. Without access to information the ability of stakeholders in development programmes to evaluate performance on any of these criteria will be extremely limited. The importance of access to information was emphasised by the United Nations General Assembly in its first session in 1946, which states: “Freedom of information is a fundamental human right and … the touchstone of all the freedoms to which the UN is consecrated.” (Resolution 59)

More recently DFID was one of the founding signatories to the International Aid Transparency Initiative, publicised at the September 2008 High Level Forum on Aid Effectiveness in Accra, Ghana.

Given this recent statement of position by DFID, transparency should clearly be included as an evaluation criterion on the DFID list. If this proposal raises concerns about the list becoming too lengthy, one could argue that it should certainly have higher priority than the newly proposed criterion 7 (coherence). In fact, perhaps it should be criterion number 1, ahead of relevance and all other criteria.

Thursday, July 24, 2008

An aid bubble? - Interpreting aid trends

(unattributed source)

This graph (from a DFID presentation) shows what many people have already heard, that the volume of aid given by DFID will continue to increase, but that the amount of money being spent on administering that aid will plateau. Does this divergence mean:
  • DFID has discovered a new means of effectively giving development aid that requires less and less administrative overhead each year?
  • There is a huge amount of slack capacity within DFID that can safely be pared away for years without hindering its effectiveness?
  • This graph is prima facie evidence of an impending aid bubble that is highly likely to burst in the next few years, as one or more mal-administered or corrupted aid programs are publicly exposed, to the discredit of both the good and the bad?
  • Yes, it does mean there will be more mal-administered and corrupted aid programs in the future, but not many more people will be worried about this than have been in the past?
  • This is a good example of where there is a pressing need for an ex-ante impact assessment of a budgeting strategy ( if ever there was one)?
  • The category “Admin budget” is meaningless, and in fact the real costs of administering aid have not been adequately disaggregated in this graph?
  • Or...?

You can record your opinion, by posting a Comment below, or registering your vote on this anonymous opinion poll.




My initial preference would be for the fifth option, even though it is probably unlikely that the results would have much impact on decisions that have already been made.

But on reflection option one may not be so impossible as it seems. DFID may well give more and more of its aid through third parties (multilaterals, and international programmes of different kinds). When it does this those organisations' administration costs will not appear on the DFID books as administration costs, but as aid given. And those organisations can in turn use the same device to manage the apparent levels of their own administration costs, by funding other parties, such as national NGOs.

The cumulative outcome of this reiterated strategy may well be very perverse, adding up to a bigger proportion of aid being spent on administration than would be the case if the original donor had been more directly engaged and been willing to show higher admin costs in its own budget. All this is speculative though. What it does suggest, however, is the possible relevance of a "whole supply chain" approach to the evaluation of the costs of different forms of aid. Unlike private sector supply chains, the total cost of delivered aid is not evident in what the beneficiary pays for the final product.
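To illustrate the arithmetic behind this speculation, here is a minimal sketch with entirely hypothetical admin-cost rates at each link in the chain: each intermediary reports only its own overhead and passes the remainder on as "aid given", yet the cumulative administration share of the original budget ends up much larger than any single reported rate.

```python
# A minimal sketch with entirely hypothetical admin rates: each intermediary
# reports only its own overhead and passes the remainder on as "aid given",
# yet the cumulative admin share of the original budget keeps growing.

chain = [
    ("Original donor", 0.05),
    ("Multilateral programme", 0.08),
    ("International NGO", 0.10),
    ("National NGO", 0.12),
]

remaining = 100.0  # hypothetical budget
for name, admin_rate in chain:
    admin = remaining * admin_rate
    remaining -= admin
    print(f"{name}: admin {admin:.1f}, passed on {remaining:.1f}")

print(f"Delivered {remaining:.1f} of 100.0 -> cumulative admin share {100 - remaining:.1f}%")
```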

Perhaps these issues could be pursued by the new Independent Advisory Committee on Development Impact?

Friday, March 21, 2008

Aid organisations as self-interested businesses?

This posting has been prompted by a letter I received recently. A client I am working with (evaluating their project and that of another donor) wanted me to sign a confidentiality agreement. While it did not seem excessively restrictive, in terms of general intent it was the very opposite of what I have been trying to encourage this and other donors to do with information about their projects. Increasingly over the past few years I have been pushing for more transparency, not less. The rationale being that the whole aid process would benefit by being more accountable to the public at large, not just to donors or the project manager’s immediate partners and intended beneficiaries. Some of my clients have taken this approach seriously and used their websites to make a whole range of project documents publicly available (See G-rap and PETRRA). Others have agreed in principle but seem to have made little progress in practice.

Parallel to this effort I have been trying to persuade donors and project managers that achieving specific development objectives is not enough. For example, increased levels of health service usage, or increased farmers’ incomes. It is also essential that knowledge be accumulated, and made available, about how these objectives were achieved, and what factors made the difference between higher and lower levels of achievement. Without this knowledge the existing achievements are less likely to be sustainable, and they are certainly unlikely to be replicable. Given the scale of most development problems, sustainability and replicability of achievements are absolutely essential. Measuring sustainability and replicability is not easy. But identifying the availability of relevant knowledge should be possible.

If information was made publicly available on how specific developments were achieved then a project can be considered to have created a public good that others can use. The more usable that knowledge is, the more valuable that public good is. Businesses do not often do this, though putting usable knowledge in the public domain is becoming more common in the world of software and internet services[1]. Businesses usually have a commercial self-interest in keeping secret the key parts of their business processes that would enable others to compete with them in providing the same goods or services. The production of public goods could therefore be seen as a way of differentiating the degree to which aid organisations (of varying kinds) are operating as self-interested businesses versus more public-interested organisations. Whether they make and distribute a profit could be considered a secondary matter.

If the production of public goods is accepted as an important defining feature of good aid organisations then more attention to the quality of those goods, and how it could be improved, would be justified. Some might argue that a lot of the information products produced by aid organisations are more like advertising and public relations materials, better described as “vapourware”[2]. One means of improving the quality of potential public goods would be increased transparency. So we can see not only the final information product (e.g. a book, web page, video, etc), but the drafts and the debate that surrounded their development, and the background data. Not simply as a final package, but during the process. The public could then become engaged, through comment and feedback, in the process of producing the public good(s). This type of semi-open production process is increasingly common in some areas of business (see "Democratizing Innovation", 2005 and the Wikipedia entry). In aid organisations this approach could be realised in fairly simple forms through the use of websites to host draft documents, and the use of online open forums and email lists to promote awareness and discussion of those documents. This is not rocket science. But nor is it yet common practice on the scale it should be.

In my argument above transparency has two rationales. One is pragmatic: transparency could help improve the knowledge that is available about how best to have an impact. The other is that, when visibly put into practice, transparency may also function as an important signal of intentions, helping us differentiate organisations that are more public-interested from those that are more self-interested.




[1] For example, in the form of open source products, free internet services, and services that inter-operate with those provided by others.

[2] Software products that have a name, and promotional materials, but not much in the way of contents that will actually make them work and deliver what they promise

Friday, February 01, 2008

Social Frameworks: An improvement on the Logical Framework?

Over the last week there have been quite a few email exchanges on the MandE NEWS email list about how to distinguish results from outcomes, results from impact, inputs from outputs, outcomes from impacts, etc. These are the various terms used to describe different levels of a Logical Framework description of a development intervention (in some of the variations of the Logical Framework used by some development agencies). This debate is not new; it comes and goes, and has appeared within most development organisations at some stage or another.

There are two causes of this confusion of nomenclature, in my view. One is that the Logical Framework describes a sequence of causally linked events happening over time. Time flows, it has no natural punctuation marks that can be used to distinguish and categorise stages of a process. It is not possible to “carve nature at the joints” when dealing with time. So any introduction of stage categories like inputs, activities, outputs, outcomes, impacts etc., is artificial, and requires a consensus amongst the users of these terms, if they are to be useful. Within organisations that can be achieved, across organisations it is usually much more difficult.

The second cause is a widespread confusion between two types of hierarchy. The Logical Framework is supposed to represent a temporal hierarchy, of events taking place through time. Here A is supposed to lead to B, which is supposed to lead to C, etc. However some organisations mix in a different kind of hierarchy, when they introduce terms like “components”, and “sub-objectives”. This is a hierarchy of inclusion, where A, B, and C are part of X, and X, Y, and Z are in turn part of some larger entity. So the upper levels of this hierarchy are not the outcomes of lower level activities, but simply wider generalisations or descriptions of types of things described at the lower levels. I have seen this sort of terminology in some UNICEF Logical Frameworks in Indonesia, mixed in with Purpose and Goal statements that are part of a temporal hierarchy.

A Social Framework?

I have been experimenting with an alternative, which does not “throw the baby out with the bathwater”. It could be called a Social Framework, rather than a Logical Framework, because it emphasises people and their relationships, rather than more abstract events and processes.

Let us start with the same tabular structure as the Logical Framework, but then introduce some significant changes. Each row of the narrative column (found on the left side of the Logical Framework) can be used to describe different types of actors (usually organisations or groups, rather than individuals). Actors in adjacent rows will be linked to each other by relationships that already exist, or which will be developed. The overall result is that the table describes a pathway of expected influence, from the actor in the bottom row up to the actor in the top row. The causal mechanisms are the relationships that link the actors. However, as in real life, this process of influence is unlikely to be one-directional. Both parties linked by a relationship may affect each other. For example a UK donor NGO may learn lessons from its southern partner, as well as being an important conduit of funding for that southern partner.

In the Katine project in Uganda this pathway consists of UK donors who fund AMREF who help develop the capacity of local organisations, who provide services to local households. You can see this "pathway to the poor" in the table below. However, you will see I have introduced an extra row, so I can differentiate between the internal workings of AMREF Katine as an organisational actor, and AMREF’s relationships with local organisations. As shown in the second table that follows the first, I could do the same with the other actors in the pathway. But one may not always want this degree of comprehensive detail. Nevertheless, note this basic point: actors and their relationships with each other are the basis of the Social Framework.

Simple version of the pathway (reading from the top row down to the bottom row):
  • Actor: Local households
  • Actor: Local organisations
  • Relationship: AMREF's relationship with local organisations
  • Actor: AMREF's internal functioning
  • Actor: Donors

More detailed version of the pathway (reading from the top row down to the bottom row):
  • Actor: Local households
  • Relationship: Local organisations' relationship with local households
  • Actor: Local organisations' internal functioning
  • Relationship: AMREF's relationship with local organisations
  • Actor: AMREF's internal functioning
  • Relationship: Donors' relationships with AMREF
  • Actor: Donors' internal functioning

It is not difficult to see some correspondence between these levels (especially in the first table) and the Logical Framework categories of Inputs, Activities, Outputs, Purpose and Goal. But talking in terms of specific categories of actors is much more tangible and communicable, especially across cultures. So, let's say goodbye to inputs, activities, outputs, etc., for the time being.

Moving on to the next column in the traditional Logical Framework, the Objectively Verifiable Indicators (OVIs): there is no reason why they cannot also be used in this Social Framework. Indicators could be identified for expected internal changes in each actor, and for expected changes in their relationships with other adjacent actors in the Social Framework.

Moving on to the next column, the Means of Verification (MoV): this column's function can also be retained, describing where and how information will be available about the expected changes. In addition, I suggest taking a more social view of this question. The MoV could describe who is expected to know about the changes described by the OVIs in the same row, because of their interests or responsibilities in this area. For example, the household row may have an indicator about households' increased access to clean drinking water. In the MoV column in the same row, reference could be made to the Village Water Committee as a body that should know about changes of this type. Their knowledge, and then their responses, have implications for the sustainability of any improvements in water supply. This actor-oriented view implies the need for a participatory approach, built around what people can and should be able to do in the way of monitoring. What is not needed is lists of disembodied items of information that might be found in a report or database somewhere.

Moving on to the next column, in the traditional Logical Framework we normally find Assumptions that refer to other factors that can influence the causal connection between events happening in adjacent rows. In the Social Framework I would suggest that this column describe assumptions about other actors, and the kind of influence that they are expected to have on the actor(s) described in the narrative row of this column (and vice versa).

The work of other NGOs in the same location may involve relationships with some of the actors in the pathways described in the Katine Social Framework. For example, the same government body, or the same community group. This could be flagged by a commentary in the Assumptions column. This design flexibility contrasts with the rigidity of nested Logical Frameworks, where it is only possible to represent convergence of plans (leading to pyramid-like structures, with lots of things happening at the base, all converging on a few things at the top).


The Social Framework

This table below is a rough draft of what a Social Framework might look like for part of the Katine project in Uganda. This project is described in detail on the Guardian website.

The table's columns are: Narrative description (of expected changes in a pathway of influence); Objectively Verifiable Indicators (OVIs) (evidence of expected change); Means of Verification (MoV) (who should know about the OVIs); and Assumptions (about these and other actors). Its rows, from top to bottom, are as follows.

Expected changes in local households
  • OVIs: e.g. indicators of access to safe drinking water, children's primary school participation, food sufficiency
  • MoV: e.g. village water committee, school management committee, village administration
  • Assumptions: e.g. that the insurgents will not return, force relocations and destroy property

Expected changes in local organisations' relationship with local households
  • OVIs: e.g. speed of repair of broken standpipes
  • MoV: e.g. village water committee, village administration
  • Assumptions: e.g. that local organisations will provide services equitably

Expected changes in local organisations' internal functioning
  • OVIs: e.g. scores of weighted checklists re health clinic functioning
  • MoV: e.g. Health Unit Management Committee (HUMC)
  • Assumptions: e.g. that the District Health Service will support implementation of HUMC recommendations

Expected changes in AMREF's relationship with local organisations
  • Assumptions: e.g. AMREF will identify other NGOs who are also working with local organisations, coordinate plans with them and learn lessons re those groups

Expected changes in AMREF's internal functioning
  • Assumptions: e.g. AMREF HQ will devolve the right to make public statements re the project

Expected changes in donors' relationships with AMREF
  • Assumptions: e.g. existing donors will not prevent AMREF from seeking additional funding from other donors

Expected changes in donors' internal functioning
  • Assumptions: e.g. donors will be able to agree on the desired outcomes of their relationship with AMREF


In complex development programmes people have tried to develop hierarchically nested Logical Frameworks, to show how different parts of a complex program connect to each other. But examples of these are not easy to find, despite the fact that there are many complex programmes in existence. In my experience, creating nested Logical Frameworks is not easy, and this may be the explanation for their scarcity.

Connecting up Social Frameworks to describe a more complex picture should be easier, because they have a modular structure. Each row describing an actor is in effect like a building block. These building blocks can be combined in different sequences. So, in addition to the pathway in the table above, a parallel Katine project pathway of influence could be Donors <-> AMREF <-> Ministry of Health <-> District Health Services <-> Local organisations <-> Local Households (actors in italics being part of other influence pathways already documented). This pathway could address the need for a parallel process of policy influencing at the national level, based on AMREF’s experience with local organisations in Katine. This pathway branches off then re-converges with the original pathway (See simple diagram version below)
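As a minimal sketch of this modular composition (hypothetical, using networkx): each actor is a node, each relationship an edge, and the two Katine pathways can be held in one graph, with the policy-influencing pathway branching off at AMREF and re-converging at the local organisations.

```python
# A minimal sketch (hypothetical, using networkx) of the modular point: actors
# are building blocks, and a second pathway can branch off from and re-converge
# with the first - something hard to show in strictly nested Logical Frameworks.

import networkx as nx

sf = nx.DiGraph()  # edge direction = main line of expected influence (in practice two-way)

# Original Katine pathway
nx.add_path(sf, ["Donors", "AMREF", "Local organisations", "Local households"])
# Parallel policy-influencing pathway, branching at AMREF and re-converging
nx.add_path(sf, ["AMREF", "Ministry of Health", "District Health Services",
                 "Local organisations"])

for path in nx.all_simple_paths(sf, "Donors", "Local households"):
    print(" <-> ".join(path))
```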



As noted briefly above, each of the relationships connecting the actors in any part of the pathway is likely to involve two-way communication and influence, unlike the one-way causality in the Logical Framework. So messages can come back from households, via local organisations, to AMREF, and then go off to the Ministry of Health. Useful indicators in the AMREF row could therefore include such developments as improved knowledge about the impact of central government policies on Katine households.

PS: I have now updated the ideas described above in a posting on MandE NEWS called The Social Framework as an alternative to the Logical Framework. This is where all future developments of the idea can be found. So, please visit.

Monday, January 07, 2008

Assessing achievements in Katine, Uganda

This weekend I will head off to Uganda for two weeks, to meet the AMREF staff working on the Katine project (see the links on the left side of this blog for more info on this project), and to see Katine sub-county itself, the place and the people. This will be the first of a series of twice-yearly visits that I will be making over the next few years. As part of the preparation for this visit, on Monday this week I attended a meeting in London with staff from AMREF and the Guardian, to go over my Terms of Reference (ToRs) for the visit.

One of the things we discussed was my request last year that AMREF develop a disclosure policy, which will spell out what sorts of information they will make publicly available, and under what circumstances. Much to my surprise, that policy had already been developed and approved by the Board in November, but nobody had told me, nor had its existence been made public via the AMREF website. This does seem to almost defeat the purpose of the policy, which is unfortunate, since the intentions expressed in the policy do seem positive.

PS: Since that discussion a copy has now been made available on the AMREF website. My questions to you, the reader, are: What do you think of it? How could it be improved? For comparison, here is a similar sort of policy developed by ActionAid.

In the same meeting we also discussed my visit schedule in Uganda. My draft ToRs are here. As you can probably see, the list of things to do is quite long, probably too long to complete in this visit. So my first meeting with AMREF in Uganda will have to focus on prioritising these tasks. Top of my own to-do list is to meet all the AMREF staff in Katine, find out about their various roles, and talk about their expectations of my role as the external evaluator - what they would and would not like me to be doing. I will bring along all the comments made so far by participants in an online survey of people’s views on this subject, which you can find here online. So far this online survey has focused on a limited number of stakeholders: the staff of AMREF, the Guardian and Barclays. But I hope to open it up to wider public participation on return from Uganda. Please feel free to add your views on this subject right now, by commenting on this blog below.

As well as the tasks listed in my Terms of Reference there are many other questions I would like to explore during my visit. Most of these have been prompted by my reading of AMREF’s project documents over the last month, and by reading the Guardian Katine blogs. Here are some of them:

People’s participation: What did the community needs assessments find out about the existence of different community views on development needs in Katine? It is highly unlikely that, in a population of 25,000, they all had the same set of priorities. People’s views are likely to vary by gender, age and location, at least. How have these views affected the project design?

And in AMREF’s Monitoring and Evaluation Plan for the project, what role will community groups have in monitoring and evaluation of the project? How often will their views be sought? How will those views then feed into decision making about how the project develops? [These questions relate to the equity and relevance dimension of my evaluation work]

Project strategy: Will the project be aiming to assist the whole population evenly, or will it be targeting some groups more than others? Do AMREF have enough staff and financial resources to reach the whole population? Will the various developments in water supply, health and livelihoods be focused on different target groups, or is it essential that a given group of people experience the combined impact of all these developments? How much information is available at this stage about the distribution of the population through the sub-county, and of the various government services? Could a map of these be made available on the Guardian Katine website, which could be continually updated and filled in with information as the project progresses?

Project impact: Where will the impact of the project be most visible in three years’ time? Will it be in changes in school attendance and completion, changes in people’s health, or changes in their livelihoods? Will the proposed baseline survey enable AMREF to track the changes that are taking place, and separate out the effects of AMREF’s inputs from the effects of other changes taking place in the society and economy? What about unexpected changes that may not have been planned for? How will they be given adequate attention? Is the monitoring and evaluation plan realistic? Is it too ambitious in terms of the information that will be collected?

Sustainability: How will the impact of the assistance provided by AMREF be sustained in the future? Will government be better able, or more willing, to take responsibility for delivering good quality health and education services?

Transparency: What mechanisms does AMREF have for transparency at the local (Katine) level, as distinct from via its website and that of the Guardian? Which of the various project documents produced so far has AMREF made publicly available? What else could be made available right now? What problems, if any, are arising because of this transparency?

If there are other issues you think I should be looking at, please add your comments below.


Sunday, November 25, 2007

A network approach to the selection of "Most Significant Change" stories

I spent yesterday in a day-long meeting with the staff of an NGO grant-making body, in Ghana. A year ago I had run a two day training workshop for their grantees on the use of the "Most Significant Change" (MSC) method of impact monitoring, a method of monitoring-without-indicators. Since then they had started to collect "Most Significant Change" stories, and they had asked for some feedback on those stories.

In yesterday's meeting, and in my meetings with other organisations in the past, concerns have been expressed about the appropriateness of a hierarchical selection process for MSC stories when the grantees and their local partners were all very autonomous organisations, and the last thing the grant-making body wanted to do was to create, or reinforce, any view that they were all part of an organisational hierarchy with the grant-making body and its back donors at the top.

I explained an alternative way of structuring the selection process, one that involves the parallel participation of different stakeholder groups, with a reiterated process of story selection followed by feedback to a plenary meeting of all participants. After yesterday's meeting I thought it might be useful to document this alternative and make it more widely available. So it is now available below, in the form of a graphic image of an Excel file. If you click on the image it will be enlarged. Or, click on the link below the image to download the actual Excel file.

Your comments and suggestions are invited, please use the Comment facility on this blog.

If you have not heard about MSC before it would be worth looking at the MSC Guide first, especially section 5 on selection.


Click on the image to make it bigger, or download the Excel file
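
For readers who prefer pseudocode to a spreadsheet, here is a minimal sketch of the same logic in Python. The group names, the number of rounds and the random "selection" are all my own invented placeholders - the Excel file above remains the actual description of the process:

```python
# Minimal sketch of a parallel, iterated MSC story selection process.
# Group names, stories and the selection rule are invented for illustration.

import random

# Stories collected by grantees (placeholders)
stories = [f"Story {i}" for i in range(1, 13)]
groups = ["Grantee staff", "Local partners", "Donor staff", "Community representatives"]

def group_selects(group, candidates):
    """Stand-in for a group's facilitated discussion: pick one story and
    record the reasons. Random here; in reality the documented reasons
    are the most important output of each round."""
    choice = random.choice(candidates)
    return choice, f"{group}'s documented reasons for choosing {choice}"

candidates = stories
for round_no in range(1, 4):                      # a small, fixed number of rounds
    # All stakeholder groups select in parallel, not up a hierarchy
    selections = {g: group_selects(g, candidates) for g in groups}

    # Feedback to the plenary: every group hears every other group's
    # choice and reasons before the next round begins
    print(f"Round {round_no} plenary feedback:")
    for group, (choice, reasons) in selections.items():
        print(f"  {group} chose {choice}: {reasons}")

    # The next round considers only the stories selected in this round
    candidates = sorted({choice for choice, _ in selections.values()})
    if len(candidates) == 1:
        break

print(f"Story (or stories) remaining after {round_no} rounds: {candidates}")
```

The point of the structure is that no group's judgement overrides another's; convergence, if it happens, comes from groups hearing each other's reasons in plenary.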

Postscript: The Washington Post (31 Dec 07, online) has an interesting article about how being able to see other people's judgements affects one's own judgements. One of the authors of the study is a well-known writer/researcher on networks (Duncan Watts). See also Valdis Krebs' paper "It's the [local] Conversations, Stupid: The link between social interaction and political choice"


Saturday, October 20, 2007

Managing expectations about monitoring and evaluation in Katine

Yesterday I went to an event in London, hosted by Barclays, which functioned as the official opening of the Katine project. The Guardian's Katine website went online immediately afterwards, and today's Guardian newspaper features a front page article about the Guardian's involvement in Katine, and a magazine insert giving a detailed description of Katine: the place, the people and the project.

Already some differences in expectations are evident and will need to be managed. Visits to Katine by Guardian and Barclays staff have clearly had a psychological impact on those staff who visited, and on those they have talked to since. Others are interested in going there as well. But at the same time, AMREF staff have an understandable concern about the manageability of a stream of such visitors. How much of their staff time will be taken up with the planning and hosting of these visits, and what effect will that diversion of resources have on the implementation of the project?

My Terms of Reference (ToRs) already include a responsibility to "Assess whether the Guardian is impacting project delivery or negatively impacting the lives of the community". Already I am thinking that this responsibility needs to be amended to refer to the involvement of the Guardian and Barclays in more general terms, not just media activities.

There are some practical (M&E) steps that could be taken right now. AMREF could start to log the time spent by their staff in planning and hosting each visit by outsiders. On the Guardian and Barclays side, as I suggested to one staff member yesterday, it would be useful if those thinking about a visit could be as clear as possible about the objectives of their proposed visit. What would count as a reasonable level of visits is also under negotiation, as part of ongoing contract discussions between AMREF, Barclays and the Guardian.

Another issue that may need attention is the possible impact of the Guardian choosing to focus its media attention on Katine village, which has a population of 1,500 people, although AMREF will be working with a much larger group, the 25,000 people living in the wider Katine sub-county (of which Katine village is part). It is possible, through accident and/or intention, that a disproportionate amount of project resources may end up being invested in Katine village. For this and other reasons I will need to examine AMREF's plans to see how they intend to address issues of equity: who is being assisted by what project activities, and why so. This leads us into wider issues of what are the most appropriate criteria for assessing AMREF's performance, in addition to equity and effectiveness. This will be the subject of another blog posting, yet to come.

Postscript (31/10/07): I have now set up a Frequently Asked Questions (FAQs) webpage on the topic of Monitoring and Evaluating Success in Katine.

Friday, October 19, 2007

Katine: an experiment in more publicly transparent aid processes

Katine is a sub-district of Uganda (map). It is the location of an AMREF development project, funded by the Guardian and Barclays Bank, starting this year and scheduled to run for three years. Information about the project will be provided, and regularly updated, on a dedicated Guardian webpage.

I will be making a number of postings here (on Rick on the Road) and on the Guardian website about the monitoring and evaluation (M&E) of this project.

At this early stage, there are some identifiable challenges. Some old, some new.
Old ones, which I am already familiar with, will need to be addressed by AMREF in the first instance:

Are the project objectives clear enough to be "evaluable"? Or are they just too fuzzy for anyone to judge? Right now the project staff are engaged in a process of participatory planning with people in Katine. Hopefully this will lead to some more clearly defined objectives, with identifiable and maybe even measurable outcomes, that all agree should be achieved. For example, that 95% of school age girls in the sub-district complete primary school.

Amongst the many project activities (relating to education, employment, health and local governance) is there a clear sense of priorities? For example, that improvements in education are the most important of all. Without this clarity, it will be hard to weigh up the different achievements and to reach a conclusion about overall success. Ideally the biggest achievements will be in the highest priority areas (a small worked example of such weighting follows below).

In reality there will be differing views on priorities, and even on the most important expected changes within each area (education, health, etc). Women will probably have different views to men, children will have different views to adults, poorer households will have different views to richer households, and so on - especially within a population of x,000 people. So the third challenge will be to identify who the different stakeholders in the project are, and how their interests differ. And whose interests should the project prioritise?
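
To make the second and third challenges concrete, here is a small hypothetical sketch of how priority weights - possibly differing between stakeholder groups - could be combined with achievement scores to reach an overall judgement. All the areas, weights and scores below are invented; they are not project data:

```python
# Hypothetical illustration of weighting achievements by priorities.
# Everything here is invented for illustration only.

# Achievement scores at the end of the project (0 = none, 10 = fully achieved)
achievements = {"education": 8, "health": 5, "livelihoods": 3, "governance": 6}

# Priority weights expressed by two (invented) stakeholder groups; each set sums to 1.0
priorities = {
    "women":         {"education": 0.40, "health": 0.35, "livelihoods": 0.15, "governance": 0.10},
    "local leaders": {"education": 0.20, "health": 0.20, "livelihoods": 0.30, "governance": 0.30},
}

for group, weights in priorities.items():
    # Weighted sum of achievements, using this group's priorities
    overall = sum(achievements[area] * weight for area, weight in weights.items())
    print(f"Overall success as judged by {group}: {overall:.1f} / 10")

# The same set of achievements can look quite different depending on whose
# priorities are used as the weights - which is why the project needs to be
# explicit about whose interests it is prioritising.
```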

There are also some new issues that I will have to address.

AMREF already has staff who are responsible for the monitoring and evaluation of the performance of its projects. But the Guardian and Barclays felt the need for an external M&E person, at least in the earliest stages of this project. The challenge for me is to make my role useful to both parties (AMREF and its two donors) but also to progressively phase out my role, as the Guardian and Barclays gain confidence in AMREF's own capacities to monitor and evaluate its own performance.

Unlike most development aid projects, this project will be in the public eye, via the Guardian, from the beginning. A Ugandan journalist will be based in the community, on a part-time basis. The Guardian will be running a blog on the project for three years. There may even be a community-run blog, whereby they tell the world, especially the UK, their view of things. Where possible, project documentation will be made publicly available. All this has risks, as well as great potential for increasing public understanding about how development and aid work (and sometimes doesn't work). The second and much bigger challenge for me here is how to monitor and evaluate the impact of this public exposure.

Another challenge, less threatening, will be how best to make use of this major opportunity to communicate with a large number of people. How can we get people to think about development as it happens in real life? Without drowning them in development jargon. And without reinforcing uncritical views about how easy it is to "help people". Perhaps we should start by remembering a quote from Henry Thoreau:
"If I knew for a certainty that a man was coming to my house with the conscious design of doing me good, I should run for my life...for fear that I should get some of his good done to me"