Friday, February 28, 2020

Temporal networks: Useful static representations of dynamic events

I have just found out about the existence of a field of study called "temporal networks". Here are two papers I came across:

Linhares, C. D. G., Ponciano, J. R., Paiva, J. G. S., Travençolo, B. A. N., & Rocha, L. E. C. (2019). Visualisation of Structure and Processes on Temporal Networks. In P. Holme & J. Saramäki (Eds.), Temporal Network Theory (pp. 83–105). Springer International Publishing. https://doi.org/10.1007/978-3-030-23495-9_5
Li, A., Cornelius, S. P., Liu, Y.-Y., Wang, L., & Barabási, A.-L. (2017). The fundamental advantages of temporal networks. Science, 358(6366), 1042–1046. https://doi.org/10.1126/science.aai7488

Here is an example of a temporal network:
Figure 1


The x-axis represents intervals of time. The y-axis represents six different actors. The curved lines represent particular connections between particular actors at particular moments in time, for example, email messages or phone calls.

In Figure 2 below, we can see a more familiar type of network structure. This is the same network as that shown in Figure 1, except that it aggregates all the interactions over the 24 time periods shown in Figure 1. The numbers in red refer to the number of times that each communication link was active over this whole period.
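To make that aggregation step concrete, here is a minimal sketch in Python (not from either paper) that collapses a hypothetical list of time-stamped contacts, of the kind plotted in Figure 1, into the weighted static network of Figure 2:

```python
from collections import Counter

# Hypothetical contact events: (time_period, actor_a, actor_b), as in Figure 1,
# where each tuple records one interaction (e.g. an email message or phone call).
contacts = [
    (1, 2, 3), (2, 3, 4), (4, 2, 4), (7, 1, 2),
    (9, 3, 4), (12, 5, 6), (15, 2, 3), (21, 4, 5),
]

# Aggregate over all time periods to get the Figure 2 style static network:
# each undirected link is weighted by how often it was active.
link_weights = Counter(tuple(sorted((a, b))) for _, a, b in contacts)

for (a, b), weight in sorted(link_weights.items()):
    print(f"actor {a} -- actor {b}: active {weight} time(s)")
```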

The Figure 2 diagram has both strengths and weaknesses. Unlike Figure 1, it shows us the overall structure of interactions. On the other hand, it obscures the possible significance of variations in the sequence in which these interactions take place over time. In a social setting involving people talking to each other, the sequencing of when different people talk to each other could make a big difference to the final state of the relationships between the people in the network.

Figure 2
How might the Figure 1 way of representing temporal networks be useful?

The first would be as a means of translating narrative accounts of events into network models of those events. Imagine that the 24 time periods cover the duration of the events described in a novel, and that the events in periods 1 to 5 are described in one particular chapter. In that chapter, the story is all about the interactions between actors 2, 3 and 4. In subsequent chapters, their interactions with other actors are described.
Figure 3
Now, instead of a novel, imagine a narrative describing the expected implementation and effects of a particular development programme. Different stakeholders will be involved at different stages. Their relationships could be "transcribed" into a temporal network, and also then into a static network diagram (as in Figure 2) which would describe the overall set of relationships for the whole programme period.

The second possible use would be to adapt the structure of a temporal network model to convert it into a temporal causal network model, such as shown in Figure 4 below. The basic structure would remain the same, with actors listed row by row and time periods listed column by column. The differences, sketched in code after the list below, would be that:

  1. The nodes in the network could be heterogeneous, reflecting different kinds of activities or events undertaken by, or involving, each actor, rather than homogeneous as in the Figure 1 example above.
  2. The connections between activities/events would be causal, in one direction or in both directions, the latter signifying a two-way exchange of some kind. In Figure 1 causality may be possible and even implied, but it can't simply be assumed.
  3. There could also be causal links between activities within the same row, meaning that an actor's particular activity at T1 influenced another of their activities at T3, for example. This possibility is not available in a Figure 1 type model.
  4. Some "spacer" rows and columns are useful to give the node descriptions more room and to make the connections between them more visible.
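Here is a rough sketch, in Python, of the kind of data structure such a temporal causal network might use. The actor names and activities are hypothetical; the point is simply that each node carries an actor (row), a time period (column) and a heterogeneous activity description, and that directed causal links can also run between activities in the same row:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    actor: str      # row
    period: int     # column (time period)
    activity: str   # heterogeneous activity/event description

@dataclass
class TemporalCausalNetwork:
    nodes: list = field(default_factory=list)
    links: list = field(default_factory=list)   # (from_node_index, to_node_index)

    def add_node(self, actor, period, activity):
        self.nodes.append(Node(actor, period, activity))
        return len(self.nodes) - 1

    def add_causal_link(self, source, target):
        self.links.append((source, target))

# Hypothetical example: Actor 1's activity in T1 influences Actor 2's activity in T2,
# and also Actor 1's own later activity in T3 (a within-row link).
net = TemporalCausalNetwork()
a1 = net.add_node("Actor 1", 1, "plans training")
a2 = net.add_node("Actor 2", 2, "attends training")
a3 = net.add_node("Actor 1", 3, "revises training materials")
net.add_causal_link(a1, a2)
net.add_causal_link(a1, a3)   # within-row causal link, not representable in Figure 1
print(net.links)
```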

Figure 4 is a stylised example. By this I mean I have not detailed the specifics of each event or characterised the nature of the connections between them. In a real-life example this would be necessary. Space limitations on the chart would necessitate very brief titles + reference numbers or hypertext links.
Figure 4: Stylised example
While this temporal causal network looks something like a Gantt chart, it is different and better:

  1. Each row is about a specific actor, whereas in a Gantt chart each row is about a specific activity 
  2. Links between activities signal a form of causal influence, whereas in a Gantt chart they signal precedence, which may or may not have causal implications
  3. Time periods can be more flexibly and abstractly defined, so long as they follow a temporal sequence, whereas in a Gantt chart they are more likely to be defined in specific units like days, weeks or months, or specific calendar dates


How does a temporal causal network compare to more conventional representations of Theories of Change? Results-chain versions of a Theory of Change do make use of a y-axis to represent time, but are often much less clear about the actors involved in the various events that happen over time. Too often these describe what might be called a sequence of disembodied events, i.e. abstract descriptions of key events. On the other hand, more network-like Theories of Change can be better at identifying the actors involved and the relationships between them. But it is very difficult to also capture the time dimension in a static network diagram. Associated with this problem is the difficulty of then constructing any form of text narrative about the events described in the network.

One possible problem is whether measurable indicators could be developed for each activity that is shown. Another is how longer-term outcomes, happening over a period of time, might be captured. Perhaps the activities associated with their measurement would be what is shown in a Figure 4 type model.

Postscript: The temporal dimension of network structures is addressed in dynamic network models, such as those captured in Fuzzy Cognitive Networks. With each iteration of a dynamic network model, the states of the nodes/events/actors in the network are updated according to the nature of the links they have with others in the network. This can lead to quite complex patterns of change in the network over time. But one of the assumptions built into such models is that all relationships are re-enacted in each iteration. This is clearly not the case in our social life. Some relationships are updated daily, others much less frequently. The kind of structure shown in Figure 1 above seems a more appropriate view. But could these structures be used for simulation purposes, where all nodes would have values that are influenced by their relationships with each other?



Tuesday, November 05, 2019

Combining the use of the Confusion Matrix as a visualisation tool with a Bayesian view of probability


Caveat: This blog posting is a total re-write of an earlier version on the same subject. Hopefully, this one will be more coherent and more useful!


Quick Summary
In this revised blog I:
1. Explain what a Confusion Matrix is and what Bayes Theorem says
2. Explain three possible uses for Bayes Theorem when combined with a Confusion Matrix

What is a Confusion Matrix?


A Confusion Matrix is a tabular structure that displays four possible combinations of two types of events, each of which may have happened, or not happened. Wikipedia provides a good description.

Here is an example, with real data, taken from an EvalC3 analysis.


    TP = True Positive, FP = False Positive, FN = False Negative, TN = True Negative

In this example, the top row of the table tells us that when the attributes of a particular predictive model (as identified by EvalC3) are present, there are 8 cases where the expected outcome is also present (True Positives). But there are also 4 cases where the expected outcome is not present (False Positives). Of the remaining cases (all in the bottom row), which do not have the attributes of the predictive model, there is one case where the outcome is nevertheless present (False Negative) and 13 cases where the outcome is not present (True Negative). As can be seen in the Wikipedia article, and also in EvalC3, there is a range of performance measures which can be used to tell us how well this particular predictive model is performing – and all of these measures are based on particular combinations of the values in this Confusion Matrix.
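As a minimal illustration, using the four cell values quoted above (and the conventional names for the measures, rather than any EvalC3-specific labels), such performance measures can be computed directly from the matrix:

```python
# Cell values from the example Confusion Matrix described above.
tp, fp, fn, tn = 8, 4, 1, 13

total = tp + fp + fn + tn            # 26 cases in all
accuracy = (tp + tn) / total         # share of all cases correctly classified
precision = tp / (tp + fp)           # how often the outcome is present when the model attributes are present
recall = tp / (tp + fn)              # share of outcome-present cases that the model picks up

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
# accuracy=0.81, precision=0.67, recall=0.89
```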

Bayes theorem


According to Wikipedia, 'Bayes' theorem (alternatively Bayes's theorem, Bayes's law or Bayes's rule) describes the probability of an event, based on prior knowledge of conditions that might be related to the event '

The Bayes formula reads as follows:

P(A|B) = (P(B|A) × P(A)) / P(B)

where:

P(A|B) = The probability of A, given the presence of B
P(B|A) = The probability of B, given the presence of A
P(A) = The probability of A
P(B) = The probability of B
This formula can be calculated using data represented within a Confusion Matrix. Using the example above, the outcome being present = A in the formula, and the model attributes being present = B in the formula. So this formula can tell us the probability of the outcome being present when the model attributes are present, i.e. of finding True Positives. Here is how the various parts of the formula are calculated, along with some alternative names for their parts:
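The table showing those calculations is an image in the original post. As a rough stand-in, here is the same calculation sketched in Python, using the cell values from the example above; it also shows that the Bayes result is the same as simply reading TP/(TP+FP) off the matrix:

```python
tp, fp, fn, tn = 8, 4, 1, 13
total = tp + fp + fn + tn

# A = outcome present, B = model attributes present
p_a = (tp + fn) / total            # P(A): outcome present in 9 of 26 cases (the "prior" or prevalence)
p_b = (tp + fp) / total            # P(B): model attributes present in 12 of 26 cases
p_b_given_a = tp / (tp + fn)       # P(B|A): attributes present, given the outcome is present

p_a_given_b = p_b_given_a * p_a / p_b    # Bayes formula: P(A|B)
print(round(p_a_given_b, 3))             # 0.667
print(round(tp / (tp + fp), 3))          # 0.667 - the same value read directly from the matrix
```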


So far, in this blog posting, the Bayes formula simply provides one additional means of evaluating the usefulness of prediction models found through the use of machine learning algorithms, using EvalC3 or other means.

Process Tracing application


But I'm more interested here in the use of the Bayes formula for process tracing purposes, something that Barbara Befani has written about. Process tracing is all about articulating and evaluating conjectured causal processes in detail. A process tracing analysis is usually focused on one case or instance, not multiple cases. It is a within-case rather than cross-case method of analysis. 

In this context, the rows and columns of the Confusion Matrix have slightly different names. The columns describe whether a theory is true or not, and the rows describe whether evidence of a particular kind is present or not. More importantly, the values in the cells are not numbers of actual observed cases. Rather, they are the analyst's interpretation of what are described as the "conditional probabilities" of what is happening in one case or instance. In the two cells of the first column, the analyst puts their own probability estimates, between zero and one, reflecting the likelihood: (a) that the evidence would be present if the theory is true, and (b) that the evidence would be absent if the theory is true. In the two cells of the second column, the analyst puts their own probability estimates, again between zero and one, reflecting the likelihood: (a) that the evidence would be present if the theory is not true, and (b) that the evidence would be absent if the theory is not true.

Here is a notional example. The theory is that a man was the murderer. The available evidence suggests that the murderer would have needed exceptional strength.
The analyst also needs to enter their "priors", that is, their belief about the overall prevalence of the theory being true, i.e. how often men are the murderers. Wikipedia suggests that 80% of murders are committed by men. These prior probabilities are entered in the third row of the Confusion Matrix, as shown below. The main cell values are then updated in the light of those new values, as also shown below.

Using the Bayes formula provided above, we can now calculate P(A|B), i.e. the probability of a man being the murderer given that the evidence was found: P(A|B) = TP/(TP+FP) = 0.97
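The worked table itself is an image in the original post, so the two conditional probabilities in the sketch below are illustrative assumptions only, chosen because, together with the 0.8 prior, they roughly reproduce the 0.97 figure:

```python
# Illustrative values only: the analyst's conditional probability estimates are assumed here.
prior_theory_true = 0.8           # P(man was the murderer), the prior cited above
p_evidence_if_true = 0.9          # assumed: P(evidence present | theory true)
p_evidence_if_false = 0.1         # assumed: P(evidence present | theory not true)

# Joint ("updated") cell values, after weighting by the prior:
tp = p_evidence_if_true * prior_theory_true           # evidence present & theory true
fp = p_evidence_if_false * (1 - prior_theory_true)    # evidence present & theory not true

posterior = tp / (tp + fp)        # P(theory true | evidence present)
print(round(posterior, 2))        # 0.97
```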

"Naive Bayes" 


This is another useful application, based on an algorithm of this name, described here. 
On that web page, an example is given of a data set where each row describes three attributes of a car (color, type and origin) and whether the car was stolen or not. Predictive models (Bayesian or otherwise) could be developed to identify how well each of these attributes predicts whether a car is stolen or not. In addition, we may want to know how good a predictor the combination of all three of these individual predictors is. But the dataset does not include any examples of these types of cases.

The article then explains how the probability of a combination of all three of these attributes can be used to predict whether a car is stolen or not, via the three steps below (also sketched in code after the list):

1. Calculate (TP/(TP+FP)) for color * (TP/(TP+FP)) for type * (TP/(TP+FP)) for origin 
2. Calculate (FP/(TP+FP)) for color * (FP/(TP+FP)) for type * (FP/(TP+FP)) for origin
3. Compare the two calculated values. If the first is higher classify a car as most likely stolen. If it is lower, classify a car as most likely not stolen.
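Here is a minimal sketch of those three steps in Python. The attribute counts are hypothetical, not the dataset from the linked page:

```python
# Hypothetical per-attribute confusion-matrix counts (not the linked page's data):
# for each attribute value of the car being classified, tp = cases where that value
# is present and the car was stolen, fp = cases where it is present but not stolen.
attribute_counts = {
    "color=red":       {"tp": 3, "fp": 2},
    "type=sports":     {"tp": 4, "fp": 1},
    "origin=domestic": {"tp": 2, "fp": 3},
}

p_stolen = 1.0
p_not_stolen = 1.0
for counts in attribute_counts.values():
    present = counts["tp"] + counts["fp"]
    p_stolen *= counts["tp"] / present        # step 1: TP/(TP+FP) for each attribute
    p_not_stolen *= counts["fp"] / present    # step 2: FP/(TP+FP) for each attribute

# Step 3: compare the two products.
print("most likely stolen" if p_stolen > p_not_stolen else "most likely not stolen")
```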

A caution: Naive Bayes calculations assume (as its name suggests) that each of the attributes of the predictive model is causally independent. This may not always be the case.

In summary


Bayes formula seems to have three uses:

1. As an additional performance measure when evaluating predictive models generated by any algorithm, or other means. Here the cell values do represent numbers of individual cases.

2.  As a way of measuring the probability of a particular causal mechanism working as proposed, within the context of a process-tracing exercise. Here the cell values are conjectures about relative probabilities relating to a specific case, not numbers of individual cases.

3.  As a way of measuring the probability of a combination of predictive models being a good predictor of an outcome of concern. Here the cell values could represent either multiple real cases or conjectured probabilities (part of a Bayesian analysis of a causal mechanism) regarding events within one case only.






Saturday, October 19, 2019

On finding the weakest link...



Last week I read and responded to a flurry of email exchanges that were prompted by Jonathan Morell circulating a think piece titled 'Can Knowledge of Evolutionary Biology and Ecology Inform Evaluation?'. Putting aside the details of the subsequent discussions, many of the participants were in agreement that evaluation theory and practice could benefit from more actively seeking out relevant ideas from other disciplines.

So when I was reading Tim Harford's column in this weekend's Financial Times, titled 'The weakest link in the strong Nobel winner', I was very interested in this section:
Then there’s Prof Kremer’s O-ring Theory of Development, which demonstrates just how far one can see from that comfortable armchair. The failure of vulnerable rubber “O-rings” destroyed the Challenger space shuttle in 1986; Kremer borrowed that image for his theory, which — simply summarised — is that for many production processes, the weakest link matters.
Consider a meal at a fancy restaurant. If the ingredients are stale, or the sous-chef has the norovirus, or the chef is drunk and burns the food, or the waiter drops the meal in the diner’s lap, or the lavatories are backing up and the entire restaurant smells of sewage, it doesn’t matter what else goes right. The meal is only satisfactory if none of these things go wrong.
If you do a Google search for more information about the O-ring Theory of Development, you will find there is a lot more to the theory than this, much of it very relevant to evaluators. Prof Kremer is an economist, by the way.

This quote was of interest to me because in the last week I have been having discussions with a big agency in London about how to go ahead with an evaluation of one of their complex programs. By complex, in this instance, I mean a program that is not easily decomposable into multiple parts – where it might otherwise be possible to do some form of cross-case analysis, using either observational data or experimental data. We have been talking about strategies for identifying multiple alternative causal pathways that might be at work, connecting the program's interventions with the outcomes it is interested in. I'll be reporting more on this in the near future, I hope.

But let's go right now to a position a bit further along, where an evaluation team has identified which causal pathway(s) are most valuable/plausible/relevant. In those circumstances, particularly in a large complex program, the causal pathway itself could be quite long, with many elements or segments. This in itself is not a bad thing, because the more segments there are in a causal pathway that can be examined, the more vulnerable to disproof the theory about that causal pathway is – which in principle is a good thing, especially if the theory is not disproved, since that means it's a pretty good theory. But on the other hand, a causal pathway with many segments or steps poses a problem for an evaluation team, in terms of where they are going to allocate their resource-limited attention.

What I like about the paragraph from Tim Harford's column is the sensible advice it provides to an evaluation team in this type of context. That is, look first for the weakest link in the causal pathway. Of course, that does raise the question of what we mean by the weakest link. A link may be weak in terms of its verifiability or its plausibility, or in other ways. My inclination at this point would be to focus on the weakest link in terms of plausibility. Your thoughts on this would be appreciated. How one would go about identifying such weak links would also need attention. Two obvious choices would be either to use expert judgement or to canvass different stakeholders' perspectives on the question. Or, probably better, a combination of both.

Postscript: I subsequently discovered some other related musings:



Wednesday, October 02, 2019

Participatory design of network models: Some implications for analysis



I recently had the opportunity to view a presentation by Luke Craven. You can see it here on YouTube: https://www.youtube.com/watch?v=TxmYWwGKvro

Luke has developed an impressive software application as a means of doing what he calls a 'Systems Affects' analysis. I would describe it as a particular form of participatory network modelling. The video is well worth watching. There is some nice technology at work within this tool. For example, see how a text search algorithm can facilitate the process of coding a diversity of responses by participants into a smaller subset of usable categories – in this case, descriptions of different types of causes and effects at work.

In this blog, I want to draw your attention to one part of the presentation, a matrix which I have copied below. (Sorry for the poor quality, it's a copy of a YouTube screen.)


In social network analysis jargon this is called an "adjacency matrix". Down the left-hand side is a list of different causal factors identified by survey respondents. This list is duplicated across the top row. The cell values refer to the number of times respondents have mentioned the row factor being a cause of the column factor.

This kind of data can easily be imported into one of many different social network analysis visualisation software packages, as Luke points out in his video (I use Ucinet/NetDraw). When this is done it is possible to identify important structural features, such as some causal factors having much higher 'betweenness centrality' than others. Such factors will be at the intersection of multiple causal paths, so, in an evaluation context, they are likely to be well worth investigating. Luke explores the significance of some of these structural features in his video.
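For readers who do not have Ucinet/NetDraw to hand, here is a minimal sketch of the same idea using Python and the networkx package, with hypothetical adjacency data:

```python
import networkx as nx

# Hypothetical adjacency data: cell values = number of respondents mentioning
# the row factor as a cause of the column factor.
factors = ["drought", "crop failure", "lower income", "migration", "ill health"]
mentions = {
    ("drought", "crop failure"): 12,
    ("crop failure", "lower income"): 9,
    ("lower income", "migration"): 4,
    ("lower income", "ill health"): 6,
    ("ill health", "lower income"): 3,
}

G = nx.DiGraph()
G.add_nodes_from(factors)
for (cause, effect), count in mentions.items():
    G.add_edge(cause, effect, weight=count)

# Factors with high betweenness centrality sit on many causal paths,
# so they may be worth closer investigation in an evaluation.
for factor, score in sorted(nx.betweenness_centrality(G).items(), key=lambda x: -x[1]):
    print(f"{factor}: {score:.2f}")
```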

In this blog, I want to look at the significance of the values in the cells of this matrix, and how they might be interpreted. At first glance, one could see them as measures of the strength of a causal connection between two factors mentioned by respondents. But there are no grounds for making that interpretation. It is much better to interpret those values as a description of the prevalence of that causal connection. A particular cause might be found in many locations, in the lives of many respondents, but in each setting it might still be only a relatively minor influence compared to others that are also present there.

Nevertheless, I think a lot can still be done with this prevalence information. As I explained in a recent blog about the analysis of QuIP data, we can add additional data to the adjacency matrix in a way that will make it much more useful. This involves two steps. Firstly, we can generate column and row summary figures, so that we can identify: (a) the total number of times a column factor has been mentioned, and (b) the total number of times a row factor has been mentioned. Secondly, we can use those new values to identify how often a row (cause) factor has been present but a column (effect) factor has not been, and vice versa. I will explain this in detail with the help of this imaginary example of the use of a type of table known as a Confusion Matrix. (For more information about the Confusion Matrix see this Wikipedia entry.)
In this example 'increased price of livestock' is one of the causal factors listed, amongst others, on the left side of an adjacency matrix of the kind shown above. And 'increased income' is one of the effect factors listed, amongst others, across the top row of that kind of matrix. In the green cell, the 11 refers to the number of causal connections respondents have identified between the two factors. This number would be found in a cell of an adjacency matrix which links the row factor with the column factor.

The values in the blue cells of the Confusion Matrix are the respective row total and column total. Knowing the green and blue values we can then calculate the yellow values. The number 62 refers to the incidence of all the other possible causal factors listed down the left side of the matrix. And the number 2 refers to the incidence of all the other possible effects listed across the top of the matrix.
PS: In Confusion Matrix jargon the green cell is referred to as a True Positive, the yellow cell with the 2 as a False Positive, and the yellow cell with the 62 as a False Negative. The blank cell is known as a True Negative.
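The arithmetic is simple enough to sketch in a few lines of Python. The row and column totals below are inferred from the cell values quoted above (11 + 2 and 11 + 62):

```python
# Values from the worked example above.
tp = 11          # respondents linking "increased price of livestock" to "increased income"
row_total = 13   # all mentions of the driver "increased price of livestock" (11 + 2)
col_total = 73   # all mentions of the outcome "increased income" (11 + 62)

fp = row_total - tp   # driver mentioned, but linked to other outcomes: 2 (False Positive)
fn = col_total - tp   # outcome mentioned, but attributed to other drivers: 62 (False Negative)
print(fp, fn)
```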

Once we have this more complete information we can then do simple analyses that can tell us just how important, or not so important, the 11 mentions of the relationship between this cause and effect are. (I will duplicate some of what I've said in the previous post here.) For example, if the value of 2 was in fact a value of 0, this would be telling us that the presence of an increased price of livestock was sufficient for the outcome of increased income to be present. However, the value of 62 would be telling us that while the increased price of livestock is sufficient, it is not necessary for increased income. In fact, in most of the cases increased income arises from other causal factors.

Alternatively, we can imagine the value of 62 is now zero while the value of 2 is still present. In this situation, this would be telling us that an increased price of livestock is necessary for increased income: there are no cases where increased income has arisen in the absence of an increased price of livestock. But it may not always be sufficient. If the value of 2 is still there, it is telling us that in some cases, although the increased price of livestock is necessary, it is not sufficient. Some other factor is missing, or obstructing things, and causing the outcome of increased income not to occur.

Alternatively, we can imagine that the value of 2 is now much higher, say 30. In this context, the increased price of livestock is neither necessary nor sufficient for the outcome. In fact, more often than not it is an incorrect predictor, and it is only present in a small proportion of all the cases where there is increased income. The point being made here is that the value in the True Positive cell (11) has no significance unless it is seen in the context of the other values in the Confusion Matrix. Looking back at the big matrix at the top of this blog, we can't interpret the significance of the cell values on their own.

So far this discussion has not taken us much further than the discussion in the previous blog. In that blog, I ended with the concern that while we could identify the relative importance of individual causal factors in this sort of one-to-one analysis, we couldn't do the more interesting type of configurational analyses, where we might identify the relative importance of different combinations of causal factors.

I now think it may be possible. If we look back at the matrix at the top of this blog, we can imagine that there is in fact a stack of such matrices, one sitting above the other, and that each of those matrices represents one respondent's responses. The matrix at the bottom is then a kind of summary matrix, where the individual cells are totals of the values of all the cells sitting immediately above them in the other matrices.

From each individual's matrix we could extract a string of data telling us which of the causal factors were reported as present (1) or absent (0), and whether a particular outcome/effect of interest was reported as present (1) or absent (0). Each of those strings can be listed as a 'case' in the kind of data set used in predictive modelling. In those datasets, each row represents a case, and each column represents an attribute of those cases, plus the outcome of interest.

Using EvalC3, an Excel-based predictive modelling app, it would then be possible to identify one or more configurations, i.e. combinations of reported attributes/causes, which are good predictors of the reported effect/outcome.
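A rough sketch of that data preparation step, and of a crude configuration check, is below. This is not EvalC3 itself, and the respondent data is hypothetical:

```python
# Hypothetical per-respondent matrices, reduced to 1/0 strings as described above:
# 1 = the respondent mentioned that driver (or the outcome), 0 = they did not.
cases = [
    {"driver_A": 1, "driver_B": 1, "driver_C": 0, "outcome": 1},
    {"driver_A": 1, "driver_B": 0, "driver_C": 0, "outcome": 0},
    {"driver_A": 0, "driver_B": 1, "driver_C": 1, "outcome": 0},
    {"driver_A": 1, "driver_B": 1, "driver_C": 1, "outcome": 1},
]

# A crude stand-in for a predictive-modelling search: check how well one candidate
# configuration, "driver_A AND driver_B", predicts the reported outcome.
tp = sum(1 for c in cases if c["driver_A"] and c["driver_B"] and c["outcome"])
fp = sum(1 for c in cases if c["driver_A"] and c["driver_B"] and not c["outcome"])
fn = sum(1 for c in cases if not (c["driver_A"] and c["driver_B"]) and c["outcome"])
print(tp, fp, fn)   # 2 0 0 -> in this toy data the configuration looks both sufficient and necessary
```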

Caveat: There are in fact two options in the kinds of strings of data that could be extracted from the individuals' matrices. One would list whether the 'cause' attributes were mentioned as present, or not, at all. The other would only list whether a cause attribute was present or not specifically in relation to the effect/outcome of interest.

Sunday, June 09, 2019

Extracting additional value from the analysis of QuIP data



James Copestake, Marlies Morsink and Fiona Remnant are the authors of "Attributing Development Impact: The Qualitative Impact Protocol Casebook", published this year by Practical Action.
As the title suggests, the book is all about how to attribute development impact, using qualitative data - an important topic that will be of interest to many. The book contents include:
  • Two introductory chapters  
    • 1. Introducing the causal attribution challenge and the QuIP
    • 2. Comparing the QuIP with other approaches to development impact evaluation 
  • Seven chapters describing case studies of its use in Ethiopia, Mexico, India, Uganda, Tanzania, Eastern Africa, and England
  • A final chapter synthesising issues arising in these case studies
  • An appendix detailing guidelines for QuIP use

QuIP is innovative in many respects. Perhaps most notably, at first introduction, in the way data is gathered and coded. Neither the field researchers nor the communities of interest are told which organisation's interventions are of interest, an approach known as "double blindfolding". The aim here is to mitigate the risk of "confirmation bias", as much as is practical in a given context.

The process of coding the qualitative data that is collected is also interesting. The focus is on identifying causal pathways in the qualitative descriptions of change obtained through one to one interviews and focus group discussions. In Chapter One the authors explain how the QuIP process uses a triple coding approach, which divides each reported causal pathway into  three elements:

  • Drivers of change (causes). What led to change, positive or negative?
  • Outcomes (effects) What change/s occurred, positive or negative?
  • Attribution: What is the strength of association between the causal claim and the activity or project being evaluated?
"Once all change data is coded then it is possible to use frequency counts to tabulate and visualise the data in many ways, as the chapters that follow illustrate". An important point to note here is that although text is being converted to numbers, because of the software which has been developed, it is always possible to identify the source text for any count that is used. And numbers are not the only basis which conclusions are reached about what has happened, the text of respondents' narratives are also very important sources.

That said, what interests me most at present are the emerging options for the analysis of the coded data. Data collation and analysis was initially based on a custom-designed Excel file, used because 99% of evaluators and program managers are already familiar with the use of Excel. More recently, however, investment has been made in the development of a customised version of MicroStrategy, a free desktop data analysis and visualisation dashboard. This enables field researchers and evaluation clients to "slice and dice" the collated data in many different ways, without risk of damaging the underlying data, and its use involves a minimal learning curve. One of the options within MicroStrategy is to visualise the relationships between all the identified drivers and outcomes as a network structure. This is of particular interest to me, and is something that I have been exploring with the QuIP team and with Steve Powell, who has been working with them.

The network structure of driver and outcome relationships 


One way QuIP coded data has been tabulated is in the form of a matrix, where rows = drivers and columns = outcomes and cell values = incidence of reports of the connection between a given row and a given column (See tables on pages 67 and 131). In these matrices, we can see that some drivers affect multiple outcomes and some outcomes are affected by multiple drivers. By themselves, the contents of these matrices are not easy to interpret, especially as they get bigger. One matrix provided to me had 95 drivers, 80 outcomes and 254 linkages between these. Some form of network visualisation is an absolute necessity.

Figure 1 is a network visualisation of the 254 connections between drivers and outcomes. The red nodes are reported drivers and the blue nodes are the outcomes that they have reportedly led to. Green nodes are outcomes that have in turn been drivers for other outcomes (I have deliberately left off the node labels in this example). While this was generated using Ucinet/NetDraw, I noticed that the same structure can also be generated by MicroStrategy.
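The red/blue/green classification can be reproduced from the coded links themselves. Here is a minimal sketch using Python and networkx, with hypothetical driver–outcome links:

```python
import networkx as nx

# Hypothetical QuIP-style coded links: (driver, outcome, citation count).
links = [
    ("more rain", "better harvest", 14),
    ("better harvest", "increased income", 9),
    ("alternative income source", "increased income", 5),
    ("increased income", "children in school", 6),
]

G = nx.DiGraph()
for driver, outcome, count in links:
    G.add_edge(driver, outcome, weight=count)

# Reproduce the three node colours described above:
# red = pure drivers, blue = pure outcomes, green = outcomes that also act as drivers.
for node in G.nodes:
    is_driver = G.out_degree(node) > 0
    is_outcome = G.in_degree(node) > 0
    role = "green (both)" if is_driver and is_outcome else ("red (driver)" if is_driver else "blue (outcome)")
    print(f"{node}: {role}")
```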


Figure 1

It is clear from a quick glance that there is still more complexity here than can be easily made sense of. Most notably in the "hairball" on the right.

One way of partitioning this complexity is to focus in on a specific "ego network" of interest. An ego network is a combination of (a) an outcome, plus (b) all the other drivers and outcomes it is immediately linked to, plus (c) the links between those. MicroStrategy already provides (a) and (b) but could probably be tweaked to also provide (c). In Ucinet/Netdraw it is also possible to define the width of the ego network, i.e. how many links out from the ego to collect connections to and between "alters". Here in Figure 2 is one ego network that can be seen in the dense cluster in Figure 1.
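In networkx the equivalent operation is close to a one-liner. A minimal sketch, again with hypothetical links:

```python
import networkx as nx

# Hypothetical driver -> outcome links, as before.
G = nx.DiGraph([
    ("more rain", "better harvest"),
    ("better harvest", "increased income"),
    ("alternative income source", "increased income"),
    ("increased income", "children in school"),
])

# Ego network around one outcome of interest: the ego node, everything within
# `radius` links of it (undirected=True so that incoming drivers are also kept),
# and the links among those neighbours.
ego = nx.ego_graph(G, "increased income", radius=1, undirected=True)
print(sorted(ego.nodes()))
print(sorted(ego.edges()))
```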

Figure 2


Within this selective view, we can see more clearly the different causal pathways to an outcome. There are also a number of feedback loops here, both between pairs of outcomes (3) and between larger groups of outcomes (2).

PS: Ego networks can be defined in relation to a node which represents a driver or an outcome. If one selected a driver as the ego in the ego network then the resulting network view would provide an "effects of a cause" perspective. Whereas if one selected an outcome as the ego, then the resulting network view would provide a "causes of an outcome" perspective.

Understanding specific connections


Each of the links in the above diagrams, and in the source matrices, has a specific "strength" based on a choice of how that relationship was coded. In the above example, these values were "citation counts", meaning one count per domain per respondent. Associated with each of these counts are the text sources, which can shed more light on what those connections meant.

What is missing from the above diagrams is numerical information about the nodes, i.e. the frequency of mentions of drivers and outcomes. The same is the case for the tabulated data examples in the book (pages 67, 131). But that data is within reach.

Here in Figures 7 and 8 are a simplified matrix and associated network diagram, taken from this publication: "QuIP and the Yin/Yang of Quant and Qual: How to navigate QuIP visualisations".

In Figure 7 I  have added row and column summary values in red, and copied these in white on to the respective nodes in Figure 8. These provide values for the nodes, as distinct from the connections between them.


Why bother? Link strength by itself is not all that meaningful. Link strengths need to be seen in context, specifically: (a) how often the associated driver was reported at all, and (b) how often the associated outcome was reported at all.  These are the numbers in white that I have added to the nodes in the network diagram above.

Once this extra information is provided we can insert it into a Confusion Matrix and use it to generate two items of missing information: (a) the number of False Positives, in the top right cell, and (b) the number of False Negatives (in the bottom left cell). In Figure 3, I have used Confusion Matrices to describe two of the links in the Figure 8 diagram.


Figure 3
It now becomes clear that there is an argument for saying that the link with a value of 5, between "Alternative Income" and "Increased income", is more important than the link with the value of 14 between "Rain/recovering from drought" and "Healthier livestock".

The reason? Despite the rain/drought link looking stronger (14 versus 5), there is more chance that the expected outcome will occur when "Alternative Income" is present. With "Rain/recovering from drought" the expected outcome only happens 14 of the 33 times that the driver is reported, whereas with "Alternative Income" the expected outcome happens 5 of the 7 times.

When this analysis is repeated for all the links where there is data (6 in Figure 8 above), it turns out that only two links are of this kind, where the outcome is more likely to be present when the driver is present. The second one is the link between "Increased price of livestock" and "Increased income", as shown in the Confusion Matrix in Figure 4 below.

Figure 4
There are some other aspects of this kind of analysis worth noting. When "Increased price of livestock" is compared to the link above (Alternative Income → Increased income), it accounts for a bigger proportion of the cases where the outcome is reported, i.e. 11/73 versus 5/73.

One can also imagine situations where the top right cell (False Positive)  is zero. In this case, the driver appears to be sufficient for the outcome i.e. where it is present the outcome is present. And one can imagine situations where the bottom left cell (False Negative) is zero. In this case, the driver appears to be necessary for the outcome i.e. where it is not present the outcome is also not present.


Filtered visualisations using Confusion Matrix data



When data from a Confusion Matrix is available, this provides analysts with additional options for generating filtered views of the network of reported causes (a rough code sketch follows the list below). These are:

  1. Show only those connected drivers which seem to account for most instances of a reported outcome, i.e. where the number of True Positives (in the top left cell) exceeds the number of False Negatives (in the bottom left cell)
  2. Show only those connected drivers which are more often associated with instances of a reported outcome (the True Positive, in the top left cell) than with its absence (the False Positive, in the top right cell).
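A minimal sketch of these two filters applied to a handful of links follows. The True Positive values and most of the derived counts come from the figures discussed above; the False Negative count for the "healthier livestock" link is an assumed value, for illustration only:

```python
# Each link carries its True Positive count (the link value) plus the False Negative
# and False Positive counts derived from the node totals, as described above.
links = [
    {"driver": "rain/recovering from drought", "outcome": "healthier livestock",
     "tp": 14, "fn": 6,  "fp": 19},   # fn here is an assumed value; fp = 33 - 14
    {"driver": "alternative income",           "outcome": "increased income",
     "tp": 5,  "fn": 68, "fp": 2},    # from 5/7 and 5/73 above
    {"driver": "increased price of livestock", "outcome": "increased income",
     "tp": 11, "fn": 62, "fp": 2},    # from Figure 4 above
]

# Filter 1 (Figure 5): keep links whose driver accounts for most instances of the outcome.
filter_1 = [l for l in links if l["tp"] > l["fn"]]

# Filter 2 (Figure 6): keep links where the outcome is more often present than absent
# when the driver is present.
filter_2 = [l for l in links if l["tp"] > l["fp"]]

print([l["driver"] for l in filter_1])
print([l["driver"] for l in filter_2])
```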

Drivers accounting for the majority of instances of the reported outcome


Figure 5 is a filtered version of the blue, green and red network diagram shown in Figure 1 above. A filter has retained links where the True Positive value in the top left cell of the Confusion Matrix (i.e. the link value) is greater than the associated False Negative value in the bottom left cell. This presents a very different picture to the one in Figure 1.

Figure 5


Key: Red nodes = drivers, Blue nodes = outcomes, Green nodes = outcomes that were also in the role of drivers.

Drivers more often associated with the presence of a reported outcome than its absence


A filter has retained links where the True Positive value in the top left cell of the Confusion Matrix (i.e. the link value) is greater than the associated False Positive value in the top right cell. 

Figure 6

Other possibilities


While there are a lot of interesting possibilities for how to analyse QuIP data, one option does not yet seem available. That is the possibility of identifying instances of "configurational causality". By this, I mean packages of causes that must be jointly present for an outcome to occur. When we look at the rows in Figure 7 it seems we have lists of single causes, each of which can account for some of the instances of the outcome of interest. And when we look at Figures 2 and 8 we can see that there is more than one way of achieving an outcome. But we can't easily identify any "causal packages" that might be at work.

I am wondering to what extent this might be a limitation built into the coding process, or whether better use of the existing coded information might work. Perhaps the way the row and column summary values are generated in Figure 7 needs rethinking.

Looking at the existing network diagrams, these provide no information about which connections were reported by whom. In Figure 2, take the links into "Increased income" from "Part of organisation x project" and "Social Cash Transfer (Gov)". Each of these links could have been reported by a different set of people, or they could have been reported by the same set of people. If the latter, then this could be an instance of "configurational causality". To be more confident we would need to establish that people who reported only one of the two drivers did not also report the outcome.

Because all QuIP coded values can be linked back to specific sources and their texts, it seems that this sort of analysis should be possible. But it will take some programming work to make this kind of analysis quick and easy.

PS 1: Actually maybe not so difficult. All we need is a matrix where:

  • Rows = respondents
  • Columns = drivers & outcomes mentioned by respondents
  • Cell values = 1/0, meaning that the row respondent did or did not mention that column driver or outcome
Then use QCA, or EvalC3, or other machine learning software, to find predictable associations between one or more drivers and any outcome of interest. Then check these associations against the text details of each mention to see if a causal role is referred to and plausible.
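A minimal sketch of the kind of check described earlier, using a hypothetical respondent-level matrix and the two drivers named above (column names shortened for code):

```python
# Hypothetical respondent-level matrix: 1/0 = did / did not mention that driver or outcome.
# "org_x_project" and "social_cash_transfer" stand in for the two drivers named above.
respondents = [
    {"org_x_project": 1, "social_cash_transfer": 1, "increased_income": 1},
    {"org_x_project": 1, "social_cash_transfer": 0, "increased_income": 0},
    {"org_x_project": 0, "social_cash_transfer": 1, "increased_income": 0},
    {"org_x_project": 1, "social_cash_transfer": 1, "increased_income": 1},
]

# The check suggested above: is the outcome reported only when BOTH drivers were
# mentioned, and not when only one of them was?
both = [r for r in respondents if r["org_x_project"] and r["social_cash_transfer"]]
only_one = [r for r in respondents if r["org_x_project"] != r["social_cash_transfer"]]

outcome_with_both = sum(r["increased_income"] for r in both)
outcome_with_one = sum(r["increased_income"] for r in only_one)
print(outcome_with_both, outcome_with_one)   # 2 0 -> consistent with a "causal package" of the two drivers
```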

That said, the text evidence would not necessarily provide the last word (pun unintended). It is possible that a respondent may mention various driver-outcome relationships e.g. A>B, C>D, and A>E but not C>E. Yet, when analysing data from multiple respondents we might find a consistent co-presence of references to C and E (though no report of actual causal relations between them). The explanation may simply be that in the confines of a specific interview there was not time or inclination to mention this additional specific relationship.

In response...

James Copestake has elaborated on this final section as follows: "We have discussed this a lot, and I agree it is an area in need of further research. You suggest that there may be causal configurations in the source text which our coding system is not yet geared up to tease out. That may be true and is something we are working on. But there are two other possibilities. First, that interviewers and interviewing guidelines are not primed as much as they could be to identify these. Second, respondents' narrative habits (linked to how they think and the language at their disposal) may constrain people from telling configurational stories. This means the research agenda for exploring this issue goes beyond looking at coding."

PS: Also of interest: Attributing development impact: lessons from road testing the QuIP. James Copestake, January 2019