Monday, June 14, 2021

Paired case comparisons as an alternative to a configurational analysis (QCA or otherwise)

[Take care, this is still a working draft!]

The challenge

The other day I was asked for some advice on how to implement a QCA type of analysis within an evaluation that was already fairly circumscribed in its design, both by the commissioner and by the team proposing to carry out the evaluation. The commissioner had already indicated that they wanted a case study oriented approach and had even identified the maximum number of case studies that they wanted to see (ten). While the evaluation team could see the potential use of a QCA type of analysis, they were already committed to undertaking a process type of evaluation, and did not want a QCA type of analysis to dominate their approach. In addition, it appeared that there was already a quite developed conceptual framework that included many different factors which might be contributory causes of the outcomes of interest.

As is often the case, there seemed to be a shortage of cases and an excess of potentially explanatory variables. In addition, there were doubts within the evaluation team as to whether a thorough QCA analysis would be possible or justifiable given the available resources and priorities.

Paired case comparisons as the alternative

My first suggestion to the evaluation team was to recognise that there is some middle ground between cross-case analysis involving medium to large numbers of cases, and within-case analysis. Typically, a QCA analysis will use both, going back and forth, using one to inform the other, over a number of iterations. The middle ground between these two options is case comparisons, particularly comparisons of pairs of cases. Although in the situation described above there will be a maximum of 10 cases that can be explored, the number of pairs of these cases that can be compared is still quite big (45). With these sorts of numbers, some sort of strategy is necessary for making choices about the types of pairs of cases that will be compared. Fortunately there is already a large literature on case selection. My favourite summary is the one by Gerring, J., & Cojocaru, L. (2015). Case-Selection: A Diversity of Methods and Criteria.
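The figure of 45 is simply the number of unordered pairs that can be drawn from ten cases. A minimal Python sketch of that count, where the case labels are hypothetical placeholders rather than anything from the evaluation:

```python
from itertools import combinations
from math import comb

# Hypothetical labels for the ten case studies.
cases = [f"case_{i}" for i in range(1, 11)]

# Every unordered pair of distinct cases.
pairs = list(combinations(cases, 2))

print(len(pairs))   # 45 possible paired comparisons
print(comb(10, 2))  # 45, the same count from the binomial coefficient
```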

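My suggested approach was to use what is known as the Confusion Matrix as the basis for structuring the choice of cases to be compared. A Confusion Matrix is a simple truth table, showing a combination of two sets of possibilities, for example as follows:

                                  Outcome present     Outcome absent
  Attributes fit the theory       True Positive       False Positive
  Attributes do not fit theory    False Negative      True Negative
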
Inside the Confusion Matrix are four types of cases:
  1. True Positives, where there are cases with attributes that fit my theory and where the expected outcome is present
  2. False Positives, where there are cases with attributes that fit my theory but where the expected outcome is absent
  3. False Negatives, where there are cases which do not have attributes that fit my theory but where nevertheless the outcome is present
  4. True Negatives, where there are cases which do not have attributes that fit my theory and where the outcome is absent as expected

Both QCA and supervised machine learning approaches are good at identifying individual attributes (or packages of attributes) which are good predictors of when outcomes are present or absent, in other words where there are large numbers of True Positive and True Negative cases. They also identify the exceptions, the False Positives and False Negatives. But this type of cross-case led analysis did not seem to be available as an option to the evaluation team I have mentioned above.
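The four cells described above can be written down as a small classification rule. The following is only an illustrative Python sketch: the function name and the four example cases are made up for the purpose, and each case is reduced to two yes/no judgements (does it fit the theory, and is the outcome present).

```python
from collections import Counter

def confusion_cell(fits_theory: bool, outcome_present: bool) -> str:
    """Return the Confusion Matrix cell for one case."""
    if fits_theory and outcome_present:
        return "True Positive"
    if fits_theory and not outcome_present:
        return "False Positive"
    if not fits_theory and outcome_present:
        return "False Negative"
    return "True Negative"

# Hypothetical cases: (fits the theory?, outcome present?)
example_cases = {
    "case_A": (True, True),
    "case_B": (True, False),
    "case_C": (False, True),
    "case_D": (False, False),
}

# Place each case on the board, then count how many sit in each cell.
cells = {name: confusion_cell(*flags) for name, flags in example_cases.items()}
counts = Counter(cells.values())

for name, cell in cells.items():
    print(name, "->", cell)
```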

1. Starting with True Positives

So my suggestion has been to look at the 10 cases that they have at hand, and start by focusing on those cases where the outcome is present. Imagine there are 5. And then to start by looking at one of these. When examining that case they should identify one or more attributes which they think are the most likely explanation for the outcome being present. Please note that this initial theory is coming from a single within-case analysis, not a cross-case analysis. The evaluation team will now have a single case in the category of True Positive.

2. Comparing False Positives and True Positives

The next step in the analysis is to identify at least one most relevant case which can be provisionally described as a False Positive. This False Positive case should be one that is as similar as possible in all its attributes to the True Positive case, with the obvious exception of the outcome not being present. This type of analysis choice is called MSDO, standing for most similar design, different outcome (see the de Meur references below). Also see below on how to measure similarity.

The aim here is to find how the causal mechanisms at work differ. One way to explore this question is to look for an attribute that is present in the True Positive case but absent in the False Positive case, despite those cases otherwise being most similar. Or, an attribute that is absent in the True Positive case but present in the False Positive case. In the former case the missing attribute could be seen as a kind of enabling factor, whereas in the latter case it could be seen as more like a blocking factor. If neither can be found by comparison of the coded attributes of the cases, then a more intensive examination of raw data on the case might still identify them, and lead to an update or elaboration of the theory behind the True Positive case. Alternatively, that examination might suggest that measurement error is the problem and that the False Positive case needs to be reclassified as a True Positive.
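A minimal Python sketch of this MSDO step, assuming each case has been coded as a list of 0/1 attributes (as discussed under "Measuring similarity" below). The attribute names, case names and codings are all invented for illustration, and similarity is taken to be the count of differing attributes.

```python
def hamming(a, b):
    """Count the positions at which two equal-length attribute lists differ."""
    if len(a) != len(b):
        raise ValueError("cases must be coded on the same attributes")
    return sum(x != y for x, y in zip(a, b))

# Hypothetical attribute names and codings.
attributes = ["a1", "a2", "a3", "a4", "a5"]
true_positive = [1, 1, 0, 1, 0]

# Hypothetical cases where the outcome is absent.
outcome_absent = {
    "case_F": [1, 1, 1, 0, 0],
    "case_G": [0, 0, 1, 1, 1],
    "case_H": [0, 0, 1, 1, 0],
}

# MSDO: pick the outcome-absent case most similar to the True Positive.
most_similar = min(
    outcome_absent,
    key=lambda name: hamming(true_positive, outcome_absent[name]),
)
partner = outcome_absent[most_similar]

# Present in the True Positive but absent in its partner: candidate enabling factors.
enabling = [n for n, tp, fp in zip(attributes, true_positive, partner) if tp == 1 and fp == 0]
# Absent in the True Positive but present in its partner: candidate blocking factors.
blocking = [n for n, tp, fp in zip(attributes, true_positive, partner) if tp == 0 and fp == 1]

print(most_similar, enabling, blocking)
```

The two lists are only prompts for further within-case investigation; they do not by themselves establish which attribute is doing the causal work.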

3. Comparing False Negatives and True Positives

The third step in the analysis is to identify at least one most relevant case which can be described as a False Negative. This False Negative case should be one that is as different as possible in all its attributes to the True Positive case. This type of analysis choice is called MDSO, standing for most different design, same outcome.

The aim here should be to try to identify whether the same or a different causal mechanism is at work, when compared to that seen in the True Positive case. One way to explore this question is to look for one or more attributes that both the True Positive and False Negative cases have in common, despite otherwise being "most different". If found, and if associated with the causal theory in the True Positive case, then the False Negative case can now be reclassed as a True Positive. The theory describing the now two True Positive cases can now be seen as provisionally "necessary" for the outcome, until another False Negative case is found and examined in a similar fashion. If the causal mechanism seems to be different then the case remains as a False Negative.
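The MDSO step can be sketched in the same way. Again this is only an illustration in Python under assumed data: the codings, the case names and the `theory_attributes` set (the attributes named in the step one theory) are hypothetical.

```python
def hamming(a, b):
    """Count the positions at which two equal-length attribute lists differ."""
    if len(a) != len(b):
        raise ValueError("cases must be coded on the same attributes")
    return sum(x != y for x, y in zip(a, b))

# Hypothetical attribute names and codings.
attributes = ["a1", "a2", "a3", "a4", "a5"]
true_positive = [1, 1, 0, 1, 0]
theory_attributes = {"a4"}  # assumed content of the step one theory

# Hypothetical other cases where the outcome is also present.
outcome_present = {
    "case_P": [1, 1, 0, 1, 1],
    "case_Q": [0, 0, 1, 1, 1],
    "case_R": [1, 0, 0, 1, 0],
}

# MDSO: pick the outcome-present case most different from the True Positive.
most_different = max(
    outcome_present,
    key=lambda name: hamming(true_positive, outcome_present[name]),
)
partner = outcome_present[most_different]

# Attributes present in both cases, despite the cases being "most different".
shared_present = [n for n, tp, fn in zip(attributes, true_positive, partner) if tp == 1 and fn == 1]

# If the shared attributes cover the theory, the case is a candidate for
# reclassification from False Negative to True Positive.
candidate_for_reclassification = theory_attributes <= set(shared_present)

print(most_different, shared_present, candidate_for_reclassification)
```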

Both the second and third step comparisons described above will help: (a) elaborate the details, and (b) establish the limits of the scope of the theory identified in step one. This suggested process makes use of the Confusion Matrix as a kind of very simple chess board, where pieces (aka cases) are introduced on to the board, one at a time, and then sometimes moved to other adjacent positions (depending on their relation to other pieces on the board). Or, the theory behind their chosen location is updated.

If there are only ten cases available to study, and these have an even distribution of outcomes present and absent, then this three-step process of analysis could be reiterated five times, i.e. once for each case where the outcome was present. This would involve up to 10 case comparisons (two per iteration), out of the 45 possible.

Measuring similarity

The above process depends on the ability to make systematic and transparent judgements about similarity. One way of doing this, which I have previously built into an Excel app called EvalC3, is to start by describing each case with a string of binary coded attributes of the same kind as used in QCA, and in some forms of supervised machine learning. An example set of workings can be seen in this Excel sheet, showing an imagined data set of 10 cases, with 10 different attributes, and then the calculation and use of Hamming Distance as the similarity measure to choose cases for the kinds of comparisons described above. That list of attributes, and the Hamming Distance measure, is likely to need to be updated as the investigation of False Positives and False Negatives proceeds.
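For readers who want to see the calculation outside of Excel, here is a minimal Python sketch of Hamming Distance applied to every pair of cases. It is not the EvalC3 implementation; the four cases and their ten-attribute binary strings are invented for illustration.

```python
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("cases must be coded on the same number of attributes")
    return sum(x != y for x, y in zip(a, b))

# Hypothetical cases, each coded on ten binary attributes.
coded_cases = {
    "c1": "1101000110",
    "c2": "1101010110",
    "c3": "0010111001",
    "c4": "1001000100",
}

# Distance for every unordered pair of cases.
distances = {
    (a, b): hamming(coded_cases[a], coded_cases[b])
    for a, b in combinations(sorted(coded_cases), 2)
}

# List pairs from most similar to most different.
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, d)
```

A distance of 0 means two cases are coded identically, and a distance equal to the number of attributes (10 here) means they differ on every attribute.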

Incidentally, the more attributes that have been coded per case, the more discriminating this kind of approach can become. This is in contrast to cross-case analysis, where an increase in the number of attributes per case is usually problematic.

Related sources

For some of my earlier thoughts on case comparative analysis see here. These were developed for use within the context of a cross-case analysis process. But the argument above is about how to proceed when the starting point is a within-case analysis.

See also:
  • Nielsen, R. A. (2014). Case Selection via Matching
  • de Meur, G., Bursens, P., & Gottcheiner, A. (2006). MSDO/MDSO Revisited for Public Policy Analysis. In B. Rihoux & H. Grimm (Eds.), Innovative Comparative Methods for Policy Analysis (pp. 67–94). Springer US. 
  • de Meur, G., & Gottcheiner, A. (2012). The Logic and Assumptions of MDSO–MSDO Designs. In The SAGE Handbook of Case-Based Methods (pp. 208–221). 
  • Rihoux, B., & Ragin, C. C. (Eds.). (2009). Configurational Comparative Methods: Qualitative Comparative Analysis (QCA) and Related Techniques. Sage. See pages 28-32 for a description of "MSDO/MDSO: A systematic procedure for matching cases and conditions".
  • Goertz, G. (2017). Multimethod research, causal mechanisms, and case studies: An integrated approach. Princeton University Press.