Saturday, October 19, 2019

On finding the weakest link...

Last week I read and responded to a flurry of email exchanges prompted by Jonathan Morell circulating a think piece titled 'Can Knowledge of Evolutionary Biology and Ecology Inform Evaluation?'. Putting aside the details of the subsequent discussions, many of the participants agreed that evaluation theory and practice could benefit from more actively seeking out relevant ideas from other disciplines.

So when I was reading Tim Harford's column in this weekend's Financial Times, titled 'The weakest link in the strong Nobel winner', I was very interested in this section:
Then there’s Prof Kremer’s O-ring Theory of Development, which demonstrates just how far one can see from that comfortable armchair. The failure of vulnerable rubber “O-rings” destroyed the Challenger space shuttle in 1986; Kremer borrowed that image for his theory, which — simply summarised — is that for many production processes, the weakest link matters.
Consider a meal at a fancy restaurant. If the ingredients are stale, or the sous-chef has the norovirus, or the chef is drunk and burns the food, or the waiter drops the meal in the diner’s lap, or the lavatories are backing up and the entire restaurant smells of sewage, it doesn’t matter what else goes right. The meal is only satisfactory if none of these things go wrong.
As you will find if you do a Google search for the O-ring Theory of Development, there is a lot more to the theory than this, much of it very relevant to evaluators. Prof Kremer is an economist, by the way.

This quote was of interest to me because in the last week I have been having discussions with a big agency in London about how to go ahead with an evaluation of one of their complex programs. By complex, in this instance, I mean a program that is not easily decomposable into multiple parts – where it might otherwise be possible to do some form of cross-case analysis, using either observational data or experimental data. We have been talking about strategies for identifying multiple alternative causal pathways that might be at work, connecting the program's interventions with the outcomes it is interested in. I'll be reporting more on this in the near future, I hope.

But let's go right now to a position a bit further along, where an evaluation team has identified which causal pathway(s) are most valuable/plausible/relevant. In those circumstances, particularly in a large complex program, the causal pathway itself could be quite long, with many elements or segments. This in itself is not a bad thing, because the more segments of a causal pathway that can be examined, the more vulnerable to disproof the theory about that pathway is. In principle that is a good thing: if the theory is not disproved, it is a pretty good theory. But on the other hand, a causal pathway with many segments or steps poses a problem for an evaluation team, in terms of where they are going to allocate their resource-limited attention.

What I like about the paragraph from Tim Harford's column is the sensible advice that it provides to an evaluation team in this type of context. That is, look first for the weakest link in the causal pathway. Of course, that does raise a question of what we mean by the weakest link. A link may be weak in terms of its verifiability or its plausibility, or in other ways. My inclination at this point would be to focus on the weakest link in terms of plausibility. Your thoughts on this would be appreciated. How one would go about identifying such weak links would also need attention. Two obvious choices would be to use either expert judgement or different stakeholders' perspectives on the question. Or probably better, a combination of both.
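As a minimal sketch of that combined approach, assume each link in a causal pathway has been given plausibility ratings between 0 and 1 by both experts and stakeholders. The link names, the ratings and the equal-weight averaging rule below are all hypothetical illustrations, not an established method. The weakest link is then simply the one with the lowest combined rating:

```python
from statistics import mean

# Hypothetical plausibility ratings (0 = implausible, 1 = highly plausible)
# for each link in an imaginary causal pathway.
ratings = {
    "training -> new skills": {
        "experts": [0.9, 0.8], "stakeholders": [0.7, 0.9]},
    "new skills -> changed practice": {
        "experts": [0.4, 0.5], "stakeholders": [0.6, 0.3]},
    "changed practice -> higher income": {
        "experts": [0.7, 0.6], "stakeholders": [0.8, 0.7]},
}

def combined_plausibility(link_ratings):
    # Assumption: experts and stakeholders are given equal weight.
    return mean([mean(link_ratings["experts"]),
                 mean(link_ratings["stakeholders"])])

scores = {link: combined_plausibility(r) for link, r in ratings.items()}

# The weakest link is the one with the lowest combined plausibility.
weakest_link = min(scores, key=scores.get)
print(weakest_link, round(scores[weakest_link], 2))
```

A different weighting, or a focus on verifiability rather than plausibility, would only change the scoring function, not the overall logic.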

Postscript: I subsequently discovered some other related musings:


Wednesday, October 02, 2019

Participatory design of network models: Some implications for analysis

I recently had the opportunity to view a presentation by Luke Craven. You can see it here on YouTube:

Luke has developed an impressive software application as a means of doing what he calls a 'Systems Affects' analysis. I would describe it as a particular form of participatory network modelling. The video is well worth watching. There is some nice technology at work within this tool. For example, see how text search algorithms can facilitate the process of coding a diversity of responses by participants into a smaller subset of usable categories, in this case descriptions of different types of causes and effects at work.

In this blog, I want to draw your attention to one part of the presentation, a matrix, which I have copied below. (Sorry for the poor quality, it's a copy of a YouTube screen.)

In social network analysis jargon this is called an "adjacency matrix". Down the left-hand side is a list of different causal factors identified by survey respondents. This list is duplicated across the top row. The cell values refer to the number of times respondents have mentioned the row factor being a cause of the column factor.

This kind of data can easily be imported into one of many different social network analysis visualisation software packages, as is pointed out by Luke in his video (I use Ucinet/NetDraw). When this is done it is possible to identify important structural features, such as some causal factors having much higher 'betweenness centrality'. Such factors will be at the intersection of multiple causal paths, so in an evaluation context they are likely to be well worth investigating. Luke explores the significance of some of these structural features in his video.
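For readers without a network analysis package to hand, here is a rough sketch of how betweenness centrality could be calculated directly from an adjacency matrix of this kind. It is not how Ucinet/NetDraw or Luke's tool does it. It is a plain Python version of the standard shortest-path (Brandes) approach, the factor names and cell values are made up, and I have assumed that any non-zero cell counts as a single unweighted link, which ignores how often the link was mentioned.

```python
from collections import deque

# Hypothetical adjacency matrix: rows are causes, columns are effects,
# cell values are the number of respondents mentioning that link.
factors = ["drought", "pests", "crop failure", "less income", "less food"]
matrix = [
    [0, 0, 5, 0, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 4, 6],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

# Assumption: any non-zero cell is treated as one directed, unweighted link.
graph = {
    cause: [factors[j] for j, n in enumerate(row) if n > 0]
    for cause, row in zip(factors, matrix)
}

def betweenness(graph):
    """Number of shortest causal paths passing through each factor."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search, counting shortest paths from the source.
        order = []
        preds = {v: [] for v in graph}
        paths = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Work backwards, crediting factors that lie on those paths.
        credit = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                credit[v] += paths[v] / paths[w] * (1 + credit[w])
            if w != source:
                score[w] += credit[w]
    return score

centrality = betweenness(graph)
for factor in factors:
    print(factor, centrality[factor])
```

In this made-up example only 'crop failure' sits between other factors, so it is the only one with a non-zero score.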

In this blog, I want to look at the significance of the values in the cells of this matrix, and how they might be interpreted. At first glance, one could see them as measures of the strength of a causal connection between 2 factors mentioned by a respondent. But there are no grounds for making that interpretation. It is much better to interpret those values as a description of the prevalence of that causal connection. A particular cause might be found in many locations/in the lives of many respondents, but in each setting, it might still only be a relatively minor influence compared to others that are also present there.

Nevertheless, I think a lot can still be done with this prevalence information. As I explained in a recent blog about the analysis of QuIP data, we can add additional data to the adjacency matrix in a way that will make it much more useful. This involves 2 steps. Firstly, we can generate column and row summary figures, so that we can identify: (a) the total number of times a column factor has been mentioned, (b) the total number of times a row factor has been mentioned. Secondly, we can use those new values to identify how often a row cause factor has been present but a column effect factor has not been, and vice versa. I will explain this in detail with the help of this imaginary example of the use of a type of table known as a confusion matrix. (For more information about the Confusion Matrix see this Wikipedia entry).
In this example 'increased price of livestock' is one of the causal factors listed amongst others on the left side of an adjacency matrix of the kind shown above. And 'increased income' is one of the effect factors of the kind listed amongst others across the top row in the kind of matrix shown above. In the green cell the 11 refers to the number of causal connections respondents have identified between the two factors. This number would be found in a cell of an adjacency matrix, which links the row factor with a column factor.

The values in the blue cells of the confusion matrix are the respective row total and column total. Knowing the green and blue values we can then calculate the yellow values. The number 62 refers to the incidence of all the other possible causal factors listed down the left side of a matrix. And the number 2 refers to the incidence of all the other possible effects listed across the top of the matrix.
PS: In Confusion Matrix jargon the green cell is referred to as a True Positive, the yellow cell with the 2 as a False Positive, and the yellow cell with the 62 as a False Negative. The blank cell is known as a True Negative.
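The arithmetic behind the yellow cells can be sketched in a few lines. The row total of 13 and the column total of 73 are not quoted above. They are simply the values implied by 11 + 2 and 11 + 62, and the function name is my own.

```python
def confusion_cells(true_positive, row_total, column_total):
    """Derive the yellow cells from the green cell and the blue totals."""
    return {
        "true_positive": true_positive,
        # Cause mentioned, but linked to some other effect.
        "false_positive": row_total - true_positive,
        # Effect mentioned, but linked to some other cause.
        "false_negative": column_total - true_positive,
    }

# Green cell = 11; the totals are implied by the values in the example.
cells = confusion_cells(true_positive=11, row_total=13, column_total=73)
print(cells)
```

The True Negative cell is left out here, as it is left blank in the example.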

Once we have this more complete information we can then do simple analyses that can tell us just how important, or not so important, the 11 mentions of the relationship between this cause and effect are. (I will duplicate some of what I said in the previous post here.) For example, if the value of 2 was in fact 0, this would be telling us that the presence of an increased price of livestock was sufficient for the outcome of increased income to be present. However, the value of 62 would be telling us that while the increased price of livestock is sufficient, it is not necessary for increased income. In fact, in most of the cases increased income arises from other causal factors.

Alternatively, we can imagine the value of 62 is now zero while the value of 2 is still present. This would be telling us that an increased price of livestock is necessary for increased income: there are no cases where increased income has arisen in the absence of an increased price of livestock. But it may not always be sufficient. Because the value 2 is still there, it is telling us that in some cases, although the increased price of livestock is necessary, it is not sufficient. Some other factor is missing or obstructing things and causing the outcome of increased income not to occur.

Alternatively, we can imagine that the value 2 is now much higher, say 30. In this context, the increased price of livestock is neither necessary nor sufficient for the outcome. And in fact, more often than not it is an incorrect predictor, and is only present in a small proportion of all the cases where there is increased income. The point being made here is that the value in the True Positive cell (11) has no significance unless it is seen in the context of the other values in the Confusion Matrix. Looking back at the big matrix at the top of this blog, we can't interpret the significance of the cell values on their own.
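The scenarios above can be checked with a small sketch like the one below. The function name is hypothetical, and 'precision' and 'recall' are the usual confusion matrix terms for the proportion of predictions that are correct and the proportion of outcomes that are covered. They are my addition here, not terms used in the example itself.

```python
def describe_cause(tp, fp, fn):
    """Classify a cause using True Positive, False Positive, False Negative."""
    return {
        # Sufficient: the cause never appears without the outcome.
        "sufficient": tp > 0 and fp == 0,
        # Necessary: the outcome never appears without the cause.
        "necessary": tp > 0 and fn == 0,
        # Share of the cause's mentions that do lead to the outcome.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of the outcome's mentions that involve this cause.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

scenarios = {
    "as in the example": describe_cause(11, 2, 62),
    "2 becomes 0": describe_cause(11, 0, 62),
    "62 becomes 0": describe_cause(11, 2, 0),
    "2 becomes 30": describe_cause(11, 30, 62),
}

for name, result in scenarios.items():
    print(name, result)
```

With the original values the cause is usually a correct predictor (11 of 13 mentions) but accounts for only 11 of the 73 mentions of increased income.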

So far this discussion has not taken us much further than the discussion in the previous blog. In that blog, I ended with the concern that while we could identify the relative importance of individual causal factors in this sort of one-to-one analysis, we couldn't do the more interesting type of configurational analyses, where we might identify the relative importance of different combinations of causal factors.

I now think it may be a possibility. If we look back at the matrix at the top of this blog, we can imagine that there is in fact a stack of such matrices, one sitting above the other. Each of those matrices represents one respondent's responses. And the matrix at the bottom is a kind of summary matrix, where the individual cells are totals of the values of all the cells sitting immediately above them in the other matrices.
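To make the idea of a stack concrete, here is a small sketch with three made-up respondent matrices over three unnamed factors. I have assumed each respondent's cells hold a 1 where they reported a causal link and a 0 otherwise.

```python
# Hypothetical stack: one 3 x 3 matrix per respondent (rows = causes,
# columns = effects, 1 = this respondent reported the link).
respondent_matrices = [
    [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]],
    [[0, 1, 1],
     [0, 0, 0],
     [0, 0, 0]],
    [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]],
]

def summary_matrix(stack):
    """Total each cell across all the respondents' matrices."""
    size = len(stack[0])
    return [
        [sum(matrix[i][j] for matrix in stack) for j in range(size)]
        for i in range(size)
    ]

summary = summary_matrix(respondent_matrices)
for row in summary:
    print(row)
```

The summary matrix is the only layer visible in the matrix shown earlier. The individual layers are what make a case-by-case analysis possible.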

From each individual's matrix we could extract a string of data telling us which of the causal factors have been reported as present (1) or absent (0) and whether a particular outcome/effect of interest was reported as present (1) or absent (0). Each of those strings can be listed as a 'case' in the kind of data set used in predictive modelling. In those datasets, each row represents a case, and each column represents an attribute of those cases, plus the outcome of interest.

Using EvalC3, an Excel predictive modelling app, it would then be possible to identify one or more configurations i.e. combinations of reported attribute/causes which are good predictors of the reported effect/outcome.

Caveat: There are in fact 2 options in the kinds of strings of data that could be extracted from the individuals' matrices. One would list whether the 'cause' attributes were mentioned as present, or not, at all. The other would only list whether the cause attribute is present or not specifically in relation to the effect/outcome of interest.
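Both options can be sketched as follows. The factor names and the two respondents' matrices are made up, and the function is my own illustration of how the strings could be extracted, not a description of what EvalC3 expects or does. I have also assumed that the outcome counts as present whenever the respondent linked any cause to it.

```python
factors = ["livestock price up", "better rainfall", "new road",
           "increased income"]
outcome = "increased income"

# Hypothetical respondent matrices (rows = causes, columns = effects).
respondents = [
    # Respondent 1: price -> income, and rainfall -> price.
    [[0, 0, 0, 1],
     [1, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]],
    # Respondent 2: new road -> price, no income effect reported.
    [[0, 0, 0, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 0]],
]

def to_case(matrix, factors, outcome, linked_to_outcome_only=False):
    """Turn one respondent's matrix into a row of 0/1 values."""
    out = factors.index(outcome)
    causes = [i for i in range(len(factors)) if i != out]
    if linked_to_outcome_only:
        # Option 2: cause counted only if linked to this outcome.
        attributes = [int(matrix[i][out] > 0) for i in causes]
    else:
        # Option 1: cause counted if mentioned as a cause of anything.
        attributes = [int(any(matrix[i])) for i in causes]
    # Assumption: outcome is present if any cause was linked to it.
    outcome_present = int(any(row[out] for row in matrix))
    return attributes + [outcome_present]

option_1 = [to_case(m, factors, outcome) for m in respondents]
option_2 = [to_case(m, factors, outcome, linked_to_outcome_only=True)
            for m in respondents]
print(option_1)
print(option_2)
```

Each inner list is one case: three cause attributes followed by the outcome. Under the first option respondent 1 has two causes present, under the second only the one directly linked to increased income.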