## Friday, April 08, 2011

### Models and reality: Dialogue through simulation

I have been finalising preparations for a training workshop on network visualisation as an evaluation tool. In the process I came across this "Causality Map for the Enhanced Evaluation Framework", for Budget Support activities.
On the surface this diagram seems realistic; budget support is a complex process. However, I will probably use this diagram to highlight what is often missing in models of development interventions. Like many others, it lacks any feedback loops, and as such it is a long way from the reality it is trying to represent in summary form. Using a distinction being used more widely these days (and despite my reservations about it), I think this model qualifies as complicated but not complex. If you were to assign a numerical value to each node and to each connecting relationship, the value generated at the end of the process (on the right) would always be the same.
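
To make that last point concrete, here is a minimal sketch of a chain model with no feedback loops. The node names, link values and the weighted-sum rule are all my own illustrative assumptions; none of them come from the Causality Map itself.

```python
def propagate(inputs, links, order):
    """Push values left to right through a chain model with no feedback loops.

    inputs: starting values for the left-hand nodes (hypothetical names)
    links:  {(source, target): weight}, every link pointing "forward"
    order:  node names listed from left to right
    """
    values = dict(inputs)
    for node in order:
        if node in values:
            continue  # an input node, already has its value
        # Assumed rule: a node's value is the weighted sum of the nodes feeding it
        values[node] = sum(
            values[src] * weight
            for (src, dst), weight in links.items()
            if dst == node
        )
    return values


# Illustrative numbers only
links = {("A", "C"): 0.5, ("B", "C"): 2.0, ("C", "D"): 3.0}
inputs = {"A": 2.0, "B": 1.0}
order = ["A", "B", "C", "D"]

result = propagate(inputs, links, order)
print(result["D"])
```

However many times this is run, the same inputs and link values give the same value at the right-hand end, because nothing downstream ever feeds back into an earlier node.
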

The picture changes radically as soon as you include feedback loops, which is much easier to do when you use network rather than chain models (and where you give up using one dimension in the above type of diagram to represent the passage of time). Here below is my very simple example. This model represents five actors. They all start with a self-esteem rating of 1, but their subsequent self-esteem depends on the influence of the others they are connected to (represented by positive or negative link values, [randomly allocated]) and the self-esteem of those others.
You can see what happens when each actor's self-esteem is recalculated to take into account the actors they are connected to, in this Excel file (best viewed with the NodeXL plugin). After 10 iterations, Actor 0 has the highest self-esteem and Actor 2 has the lowest. After 20 iterations, Actor 2 has the highest self-esteem and Actor 1 has the lowest. After 30 iterations, Actor 1 has the highest self-esteem and Actor 0 has the lowest. With more and more iterations the self-esteem of the actors involved might stabilise at a particular set of values, or it might repeat patterns already seen, or it might do neither.
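
For readers without Excel, the same kind of model can be sketched in a few lines of Python. This is not a copy of the spreadsheet: the update rule (each actor's new self-esteem is the sum of the others' self-esteem multiplied by the link values), the range of the link values and the random seed are all assumptions of mine, so the rankings it prints will not match the ones reported above.

```python
import random


def make_network(n_actors=5, seed=1):
    """Randomly allocate a positive or negative link value between each pair of actors."""
    rng = random.Random(seed)
    # weights[i][j] is the influence of actor j on actor i; no self-links
    return [
        [0.0 if i == j else rng.uniform(-1.0, 1.0) for j in range(n_actors)]
        for i in range(n_actors)
    ]


def step(values, weights):
    """Assumed rule: new self-esteem = sum of (link value x other actor's self-esteem)."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def simulate(weights, iterations):
    """Start every actor at 1 and record the values after each iteration."""
    values = [1.0] * len(weights)
    history = [values]
    for _ in range(iterations):
        values = step(values, weights)
        history.append(values)
    return history


def ranking(values):
    """Actor indices ordered from highest to lowest self-esteem."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


weights = make_network()
history = simulate(weights, 30)
for t in (10, 20, 30):
    order = ranking(history[t])
    print(f"iteration {t}: highest = Actor {order[0]}, lowest = Actor {order[-1]}")
```

Changing the `seed` argument gives a different set of randomly allocated links, which is a quick way of seeing how much the pattern of winners and losers depends on the particular network.
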

There are two important points to be made here. The first is the dramatic effect of introducing feedback loops, in even the simplest of models. The aggregate results are not easily predictable, but they can be simulated. The second is that the nature of the impact seen, even in this very small complex system, depends on the time period under examination. The impact seen at iteration 10 is different from that at iteration 20, and different again at iteration 30. In the words of the proponents of systems perspectives on evaluation, what is seen depends on the "perspective" that is chosen (Williams and Hummelbrunner, 2009).
PS 1: Michael Woolcock has written about the need to pay more attention to a related issue, captured by the term "impact trajectory". He argues that: "...in virtually all sectors, the development community has a weak (or at best implicit or assumed) understanding of the shape of the impact trajectories associated with its projects, and even less understanding of how these trajectories vary for different kinds of project operating in different contexts, at different scales and with varying degrees of implementation effectiveness; more forcefully, I argue that the weakness of this knowledge greatly compromises our capacity to make accurate statements about project impacts, irrespective of whether they are inspired by ‘demand’ or ‘supply’ side imperatives, and even if they have been subject to the most deftly implemented randomised trial"
PS 1.1: Some examples: I recall that it has been argued that there is a big impact on households when they first join savings and credit groups, but the continuing impact drops down to a much more modest level thereafter. On the other hand, the impact of girls completing primary school may be the greatest when it reaches through to the next generation, to the number of their children and their survival rates.
There is one downside to my actors' self-esteem model, which is its almost excessive sensitivity. Small changes to any of the node or link values can significantly change the longer term impacts. This is because this simple model of a social system has no buffers or "slack". Buffers could take the form of accumulated attributes of the actors (like an individual's self-confidence arising from their lifetime experience, or a firm's accumulated inventory) or be provided by the wider context (like individuals having access to a wider network of friends, or firms having alternative suppliers). This model could clearly be improved upon.
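
One way such a buffer might be sketched is as "inertia": each actor keeps part of their previous self-esteem and only partly responds to the others. The two-actor network, the link values and the inertia setting below are hand-picked assumptions for illustration, not taken from the Excel model, and other networks could behave differently.

```python
import math


def step(values, weights, inertia=0.0):
    """Blend each actor's previous value with the influence of the others.

    inertia = 0.0 is a model with no buffer; values closer to 1.0 mean
    the actor's accumulated self-esteem changes more slowly.
    """
    influenced = [sum(w * v for w, v in zip(row, values)) for row in weights]
    return [
        inertia * old + (1.0 - inertia) * new
        for old, new in zip(values, influenced)
    ]


def final_gap(weights, inertia, nudge=0.01, iterations=10):
    """Run the model twice, once with a tiny change to Actor 0, and measure the gap."""
    base = [1.0, 1.0]
    nudged = [1.0 + nudge, 1.0]
    for _ in range(iterations):
        base = step(base, weights, inertia)
        nudged = step(nudged, weights, inertia)
    return math.dist(base, nudged)


# Hypothetical two-actor network: one positive link, one negative link
weights = [[0.0, 1.5], [-1.5, 0.0]]

gap_plain = final_gap(weights, inertia=0.0)
gap_buffered = final_gap(weights, inertia=0.8)
print(f"no buffer: {gap_plain:.4f}, with buffer: {gap_buffered:.4f}")
```

In this particular example the initial difference of 0.01 grows over ten iterations when there is no buffer, and shrinks when each actor retains 80% of their previous self-esteem.
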
PS 2: I came across this quote by Duncan Watts, in a commentary on his latest book "Everything is obvious": "when people base their decisions in part on what other people are deciding, collective outcomes become highly unpredictable". That is exactly what is happening in the self-esteem model above. Duncan Watts has written extensively on networks.
Here below is another unidirectional causal model, available on the Donor Committee for Enterprise Development website.

What I like about this example is that visitors to the website can click on the links (but not in the copy I have made above) and be taken to other pages where they will be given a detailed account of the nature of the causal processes represented by those links. This is exactly what the web was designed for. Visitors can also click on any of the boxes at the bottom and find out more about the activities that input into the whole process.

The inclusion of a feedback loop in this diagram would not be too difficult to imagine: for example, a link from the top box back to one of the earlier boxes, e.g. "New firms start / register". This positive feedback loop would quickly produce escalating results further up the diagram. Ideally, we would recognise that this type of outcome (simple continuous escalation) does not fit very well with our perception of what happens in reality. That awareness would then lead to further improvements to the model, so that it generated more realistic behaviour.
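
A rough sketch of that line of thinking follows. The starting value, the link values and the idea of a "capacity" limit are my own assumptions for illustration; they do not come from the DCED diagram.

```python
def run_loop(steps, forward=0.5, feedback=0.4, capacity=None, start=10.0):
    """Simulate one positive feedback loop between two boxes.

    forward:  link value from "new firms" up to the top-level outcome
    feedback: link value from the top-level outcome back to "new firms"
    capacity: optional ceiling (an assumed constraint) that weakens the
              feedback as the number of firms approaches it
    """
    firms = start
    series = [firms]
    for _ in range(steps):
        outcome = forward * firms
        boost = feedback * outcome
        if capacity is not None:
            # The closer firms get to capacity, the less the loop adds
            boost *= max(0.0, 1.0 - firms / capacity)
        firms = firms + boost
        series.append(firms)
    return series


unbounded = run_loop(20)
capped = run_loop(20, capacity=100.0)
print(f"after 20 steps: unbounded = {unbounded[-1]:.1f}, capped = {capped[-1]:.1f}")
```

The first run escalates without limit, which is the behaviour that should make us suspicious. The second run adds one possible refinement, a constraint that weakens the loop as it grows, and produces growth that levels off instead.
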
PS 24 May 2011:  In their feminist perspective on monitoring and evaluation Batliwala and Pittman have suggested that we need "to develop a “theory of constraints” to accompany our “theory of change” in any given context..." They noted that "… most tools do not allow for tracking negative change, reversals, backlash, unexpected change, and other processes that push back or shift the direction of a positive change trajectory. How do we create tools that can capture this “two steps forward, one step back” phenomenon that many activists and organizations acknowledge as a reality and in which large amounts of learning lay hidden? In women’s rights work, this is vital because as soon as advances seriously challenge patriarchal or other social power structures, there are often significant reactions and setbacks. These are not, ironically, always indicative of failure or lack of effectiveness, but exactly the opposite— this is evidence that the process was working and was creating resistance from the status quo as a result .”
But it is early days. Many development programs do not yet even have a decent unidirectional causal model of what they are trying to do. In this context, the inclusion of any sort of feedback loop would be a major improvement. As shown above, the next step that can be taken is to run simulations of those improved models by inserting numerical values in the links and functions/equations in nodes of those models. In the examples above we can see how simulations can help improve models by showing how their results do not fit with our own observations of reality. Perhaps in the future they will be seen as a useful form of pre-test, worth carrying out at the earliest stages of an evaluation.
PS 3: This blog was prompted by comments to me by Elliot Stern on the relevance of modeling and simulation to evaluation, on which I hope he has more to say.

PS 4: I am struggling through Manuel DeLanda's Philosophy and Simulation: The Emergence of Synthetic Reason (2011), which, being about simulations, relates to the contents of this post.
PS 5: I have just scanned Funnell and Rogers' very useful new book, "Purposeful Program Theory", and found 67 unidirectional models but only 15 models that have one or more feedback loops (that is, 23%). This is quite disappointing. So is the advice on the use of feedback loops: "We advise against using so many feedback loops that the logic becomes meaningless. When feedback loops are incorporated, a balance needs to be struck between including all of them and (because everything is related to everything else) and capturing some important ones. Showing that everything leads to everything else can make an outcome chain very difficult to understand - what we call a spaghetti junction model. Nevertheless some feedback loops can be critical to the success of a program and should be included ..." (p. 187)
Given the scarcity of models with feedback links even in this book, the risk of having too many feedback loops sounds like "a problem we would like to have". And I am not sure why an excess of feedback links should be any more probable than an excess of forward links. The concern about the understandability of models with feedback loops is, however, more reasonable, for reasons I have outlined above. When you introduce feedback loops, what were either simple or complicated models start to exhibit complex behaviour. Welcome to something that is a bit closer to the real world.
PS 6: "As change makers, we should not try to design a better world. We should make better feedback loops", the text of the last slide in Owen Barder's presentation "Development Complexity and Evolution".
PS 7: In Funnell and Rogers' book (page 486) they describe how the Intergovernmental Panel on Climate Change (IPCC) "recognised that controlled experimentation with the climate system in which the hypothesised agents of change are systematically varied in order to determine the climate's sensitivity to these agents...[is] clearly not possible". Different models of climate change were developed with different assumptions about possible contributing factors. "The observed patterns of warming, including greater warming over land than over the ocean, and their changes over time, are simulated only by the models that include anthropogenic forcing. No coupled global climate change model that has used natural forcing only has reproduced the continental warming trends in individual continents (except Antarctica) over the second half of the 20th century" (IPCC, 2001, p39).

#### 1 comment:

1. Yesterday I attended a feedback session on Budget Support (BS) evaluation framework trainings provided by Juergen Lovasz (see https://www.youtube.com/watch?v=fP1RN9MiiFs) in Brussels in 2014 to two members of the NedWorc.org learning group on evaluation. Being a “systems man” myself, I too noted the total lack of feedback in the framework. There also seems to be a lack of feedback in the Budget Support program as a whole (except that EU civil servants who run the program are much more highly rewarded, which means they have a personal interest in making a show of BS effectiveness; not a very good mechanism if one thinks a more cautious or skeptical perspective would be preferable to make sure the billions are well spent). The design of an acceptable BS evaluation framework is obviously a very difficult task, first of all because of the countrywide scope of BS, but also because outcome and impact are to be taken into account, and because the effectiveness of technical assistance and policy dialogue are to be considered, thus making for totally incomparable sets of measures or judgments (judgment calls?) of success. It is obvious that no matter how “objectively” the evaluations will be supervised or how well the frameworks are followed, the scope for manipulation of the evaluation results, not even mentioning the scope for ignoring, manipulating or downplaying the recommendations, is tremendous. How could it be improved? Here are three points that spring to my mind: 1. Clarify the phenomenal weaknesses in the evaluation framework, how it can be manipulated etc., for all to know, and manage them in a more open fashion; 2. Introduce feedback mechanisms in the evaluation framework; this is impossible at framework level, but could be done at “arrow” level (see fig. 3 on http://www.oecd.org/dac/evaluation/dcdndep/38339122.pdf#page=34 ); 3. Create a feedback mechanism between the level of Budget Support and the apparent (i.e. evaluated) capacity to absorb it by the country in question. For a feedback mechanism such as 3 to work it is necessary that the lag time does not exceed 1 or 2 years. This clearly means that Results-Oriented Monitoring and Budget Support Evaluation will need to be more closely integrated.