Monday, March 06, 2023

How can evaluators practically think about multiple Theories of Change in a particular context?


This blog posting has been prompted by participation in two recent events. One was some work I was doing with the ICRC, reviewing Terms of Reference for an evaluation. The other was listening in as a participant to this week's European Investment Bank conference titled "Picking up the pace: Evaluation in a rapidly changing world".

When I was reviewing some Terms of Reference for an evaluation, I noticed a gap which I have seen many times before. While there was a reasonable discussion of the types of information that would need to be gathered, there was a conspicuous absence of any discussion of how that data would be analysed. My feedback included the suggestion that the Terms of Reference needed to ask the evaluation team for a description of the analytical framework they would use to analyse the data they were collecting.

The first two sessions of this week's EIB conference were on the subject of foresight and evaluation. In other words, how evaluators can think more creatively and usefully about possible futures, a subject of considerable interest to me. You might notice that I've referred to futures rather than the future, intentionally emphasising the fact that there may be many different kinds of futures and that, with some exceptions (e.g. climate change), it is not easy to identify which of these will actually eventuate.

To be honest, I wasn't too impressed with the ideas that came up in this morning's discussion about how evaluators could pay more attention to the plurality of possible futures. On the other hand, I did feel some sympathy for the panel members who were put on the spot to answer some quite difficult questions on this topic.

Benefiting from the luxury of more time to think about this topic, I would like to make a suggestion that might be practically usable by evaluators, and worth considering by commissioners of evaluations. The suggestion is how an evaluation team could realistically give attention not just to a single "official" Theory of Change about an intervention, but to multiple relevant Theories of Change about an intervention and its expected outcomes. In doing so I hope to address both issues I have raised above: (a) the need for an evaluation team to have a conceptual framework structuring how it will analyse the data it collects, and (b) the need to think about more than one possible future and how that might be realised, i.e. more than one Theory of Change.

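The core idea is to make use of something which I have discussed many times previously in this blog: the Confusion Matrix, as it is known to those involved in machine learning, or more generally a truth table that describes four types of possibilities. It crosses whether the intervention is present or absent with whether the expected outcome is present or absent, giving four cells: (1) intervention present and outcome present (True Positives), (2) intervention present and outcome absent (False Positives), (3) intervention absent and outcome present (False Negatives), and (4) intervention absent and outcome absent (True Negatives).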

In the field of machine learning the main interest in the Confusion Matrix is the associated performance measures that can be generated, and used to analyse and assess the performance of different predictive models.  While these are of interest, what I want to talk about here is how we can use the same framework to think about different types of theories, as distinct from different types of observed results.
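For readers who have not met those performance measures, here is a minimal sketch of how cases can be sorted into the four cells and a few common measures calculated. The cases are invented purely for illustration, and the cell numbering follows the one used in this post (1 = True Positive, 2 = False Positive, 3 = False Negative, 4 = True Negative).

```python
from collections import Counter


def cell(intervention_present: bool, outcome_present: bool) -> int:
    """Return the Confusion Matrix cell (1-4) a case falls into."""
    if intervention_present:
        return 1 if outcome_present else 2  # True Positive / False Positive
    return 3 if outcome_present else 4      # False Negative / True Negative


# Hypothetical cases: (intervention present?, outcome present?)
cases = [
    (True, True), (True, True), (True, True),
    (True, False),
    (False, True), (False, True),
    (False, False), (False, False), (False, False), (False, False),
]

counts = Counter(cell(i, o) for i, o in cases)
tp, fp, fn, tn = (counts[k] for k in (1, 2, 3, 4))

# Common measures; this toy data has no empty denominators,
# real data would need a guard against division by zero.
precision = tp / (tp + fp)        # how often intervention cases show the outcome
recall = tp / (tp + fn)           # how many outcome cases had the intervention
accuracy = (tp + tn) / len(cases)

print(dict(counts), precision, recall, accuracy)
```

The same four counts are all that is needed for most other measures, which is why the matrix is a convenient starting point.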

There are four different types of Theories of Change that can be seen in the Confusion Matrix. The first (1) describes what is happening when the intervention is present and the expected outcome of that intervention is present. This is the familiar territory of the kind of Theories of Change that an evaluator will be asked to examine.

The second (2) describes what is happening when the intervention is present and the expected outcome of that intervention is absent. This theory would describe what additional conditions are present, or what expected conditions are absent, which will make a difference, leading to the expected outcome being absent. When it comes to analysing data on what actually happened, identifying these conditions can lead to modification of the first (1) Theory of Change such that it becomes a better predictor of the outcome and there are fewer False Positives (found in cell 2). Ideally, the fewer False Positives the better. But from a theory development point of view there should always be some situations described in cell 2, because there will never be an all-encompassing theory that works everywhere. There will always be boundary conditions beyond which the theory is not expected to work. So an important part of an evaluation is not just to refine the theory about what works (1) but also to refine the theory of the circumstances in which it will not be expected to work (2), sometimes known as its boundary conditions.
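As a rough illustration of that refinement step, the sketch below compares a simple theory (intervention present, so outcome expected) with a refined one that adds a single boundary condition. Everything here is hypothetical: the cases are invented, and "local_partner" is just a made-up condition standing in for whatever an evaluation actually finds.

```python
# Hypothetical cases; "local_partner" is an invented condition for illustration.
cases = [
    {"intervention": True, "local_partner": True, "outcome": True},
    {"intervention": True, "local_partner": True, "outcome": True},
    {"intervention": True, "local_partner": False, "outcome": False},
    {"intervention": True, "local_partner": False, "outcome": False},
    {"intervention": True, "local_partner": True, "outcome": False},
    {"intervention": False, "local_partner": True, "outcome": False},
]


def theory_v1(case: dict) -> bool:
    """Original theory: the outcome is expected wherever the intervention is present."""
    return case["intervention"]


def theory_v2(case: dict) -> bool:
    """Refined theory: the outcome is expected only when a local partner is also present."""
    return case["intervention"] and case["local_partner"]


def false_positives(theory, cases) -> int:
    """Count cases where the theory expects the outcome but it is absent (cell 2)."""
    return sum(1 for c in cases if theory(c) and not c["outcome"])


def true_positives(theory, cases) -> int:
    """Count cases where the theory expects the outcome and it is present (cell 1)."""
    return sum(1 for c in cases if theory(c) and c["outcome"])


fp_v1 = false_positives(theory_v1, cases)
fp_v2 = false_positives(theory_v2, cases)
print(fp_v1, fp_v2, true_positives(theory_v1, cases), true_positives(theory_v2, cases))
```

In this toy data the refined theory keeps the same True Positives while reducing the False Positives, but one False Positive remains, which is a reminder that some cases will still sit outside whatever conditions have been identified so far.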

The third theory (3) describes what is happening when the intervention is absent but nevertheless the outcome is present. Consideration of this possibility involves recognition of what is known as "equifinality", i.e. that some events can arise from multiple alternative causal conditions (or combinations of causal conditions). It's not uncommon to find advice to evaluators that they should consider alternative theories to those they are currently focused on, for example in the literature on contribution analysis. But it strikes me that this is often close to a ritualistic requirement, or at least treated that way in practice. In this perspective, alternative theories are a potential threat to the theory being focused on (1). But a much more useful perspective would be to treat these alternative theories as potentially useful other courses of action that an agent could take, which warrant serious attention in their own right. And if they are shown to have some validity, this does not by definition mean that the main Theory of Change (1) is wrong. It simply means that there are alternative ways of achieving the outcome, which can only be a bonus finding.

The fourth theory (4) describes what is happening when the intervention is absent and the outcome is also absent. In its simplest interpretation, it may be that the actual absence of the attributes of the intervention is the reason why the outcome is not present. But this can't be assumed. There may be other factors which have been more important causes, for example the occurrence of an earthquake, or the holding of a very contested election. This possibility is captured by the term "asymmetric causality", i.e. that the causes of something not happening may not simply be the absence of the causes of something happening. Knowing about these other possible causes of the desired outcome not happening is surely important, in addition to and alongside knowing about how an intervention does cause the outcome. Knowing more about these causes might help other parties, with other interventions in mind, move cases with this experience from being True Negatives (4) to being False Negatives (3).

In summary, I think there is an argument for evaluators not being too myopic when they are thinking about the Theories of Change they need to pay attention to. It should not be all about testing the first (1) type of Theory of Change and considering all the other possibilities simply as challengers, which may or may not then be dismissed. Each of those other types of theory (2, 3 and 4) is important and useful in its own right and deserves attention.