Saturday, April 18, 2015

A mistaken criticism of the value of binary data

When reviewing a recent evaluation report I came across the following comment:
"Crisp set QCA where binary codings are used to establish the presence or absence of certain conditions does not facilitate a nuanced or granular analysis."
Wrong. Simply wrong.

A DFID strategy for promoting "improved governance" could be coded as present or absent. This does seem crude, given the variety of ways in which a governance strategy could actually be implemented. But the answer is not to ditch binary coding, but to extend it.

This can be done by breaking down the concept of "a strategy for improved governance" into a number of component parts or attributes, say 10, and then coding for their presence/absence. The initial conception of the governance strategy is then deemed present if all 10 attributes are present. But it only takes a change in one attribute at a time to produce 10 new versions of almost the same strategy. If you change two attributes at a time, there are (10 x 9)/2 = 45 new versions. If any number of attributes can be changed, there are 2 to the power of 10 (1,024) possible configurations of the strategy, some of which may be very different from the present strategy. Basically, it does not take much tweaking of the initial configuration before you will have nuances by the bucketful!
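The counts can be checked by brute-force enumeration. This is only a minimal sketch: the figure of 10 attributes comes from the example, while the function name `variants` and the all-present baseline are my own illustrative assumptions.

```python
from itertools import combinations

N_ATTRIBUTES = 10  # number of binary attributes, as in the example above

def variants(baseline, k):
    """Return every configuration that differs from baseline in exactly k attributes."""
    result = []
    for positions in combinations(range(len(baseline)), k):
        changed = list(baseline)
        for p in positions:
            changed[p] = 1 - changed[p]  # flip present <-> absent
        result.append(tuple(changed))
    return result

# The initial strategy: all attributes coded as present (1).
baseline = (1,) * N_ATTRIBUTES

one_change = variants(baseline, 1)      # versions differing in one attribute
two_changes = variants(baseline, 2)     # versions differing in two attributes
total_configurations = 2 ** N_ATTRIBUTES  # every possible present/absent pattern

print(len(one_change), len(two_changes), total_configurations)
```

Each tuple in `one_change` is an "adjacent" version of the strategy, which is the kind of near neighbour that incremental exploration would look at first.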

The limitations of the disaggregation-into-components approach have nothing to do with the nature of binary coding. They depend on whether there are enough cases available to allow identification of the kinds of outcomes associated with the different varieties of configurations arising from the more micro-level coding of attributes.

If there are enough cases available, then it becomes possible to learn about what works through the emergence (or planned development) of variations in the initial configuration. Some of these new versions of a governance strategy may work more effectively than the initial model, and others less so. Incremental exploration becomes possible.

For more on the idea of exploring adjacent variations in causal configurations see Andreas Wagner's very interesting (2014) book titled "Arrival of the Fittest", which explores a theory of how innovation is possible in biological systems. Here is a review of the book, on the Times Higher Education website.

There is also a connection here, I think, with Stuart Kauffman's concept of "the adjacent possible", an idea also taken up by Steven Johnson in his book "Where Good Ideas Come From: The Natural History of Innovation". Here is a review of the book in the Guardian.

Postscript 2015 05 14: I heard the same "binary is crude" criticism again today from a person attending a QCA presentation at the UK Evaluation Society Conference in London.

This time I will present another response. Binary judgments can be, and often are, derived from a dichotomised scale that captures gradations of the phenomena of interest. As Carroll Patterson pointed out today, with current QCA software it is now possible to experiment with varying the location of the cut-off point on such scales, and observe the consequences for the quality of the configurations that are then identified as the best-fitting solutions. The same approach is also possible with searches for best-fitting configurations using an evolutionary algorithm, which is another approach I have been experimenting with recently. It is also possible to go much further into the specific details of the underlying concept being measured by a scale, by basing it on the aggregated output of a weighted checklist, like the kind I have described elsewhere. Basically, the limit to what is possible is defined by the imagination of the researcher/evaluator, not any inherent limitation of binary measures.
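To illustrate the idea of moving the cut-off point, here is a rough sketch. It is not how any particular QCA package does it. The case data are invented, and the function name `sufficiency_fit` is hypothetical. Consistency and coverage are computed in the usual crisp-set way for a single condition treated as sufficient for the outcome.

```python
# Hypothetical cases: (score on a 0-10 scale, outcome present = 1 / absent = 0)
cases = [(2, 0), (3, 0), (4, 0), (5, 1), (6, 0), (7, 1), (8, 1), (9, 1)]

def sufficiency_fit(cases, cutoff):
    """Dichotomise the scale at `cutoff` and score 'condition present -> outcome'.

    consistency = share of condition-present cases that show the outcome
    coverage    = share of outcome cases that have the condition present
    """
    outcomes_where_present = [outcome for score, outcome in cases if score >= cutoff]
    hits = sum(outcomes_where_present)
    all_outcomes = sum(outcome for _, outcome in cases)
    consistency = hits / len(outcomes_where_present) if outcomes_where_present else None
    coverage = hits / all_outcomes if all_outcomes else None
    return consistency, coverage

# Try each candidate cut-off and see how the fit changes.
results = {cutoff: sufficiency_fit(cases, cutoff) for cutoff in range(3, 10)}
for cutoff, (consistency, coverage) in results.items():
    print(cutoff, consistency, coverage)
```

With these made-up numbers a lower cut-off covers more of the outcome cases but lets in an inconsistent one, while a higher cut-off is fully consistent but covers fewer cases. That trade-off is what one would be inspecting when experimenting with the cut-off.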

Postscript 2015 05 17: I tried to post a Comment in reply to Anon's comment below, but the comment form won't accept any HTML formatting, so I will place my reply here instead.

RE "If, to combat the reductiveness of binary coding, you introduce a scale of 4-6 points, you still face the same problem in coding something more complex – a remote non expert is reducing a complex context and process to a number in an arbitrary way. "
Coding for QCA (and other purposes, such as when using NVivo) should always be done in a way that is transparent and replicable, with attention to inter-rater reliability. It should certainly not be done in an “arbitrary way”.
RE "Grading a large, diverse and complicated country on a scale of 0-1 or 1-5 on 'improved governance' is just ridiculous. Anyone who has studied the way people actually behave, governance, how decisions are really made or projects succeed or fail, will tell you that this reductiveness does not helpfully or accurately reflect reality."
QCA has been used in the field of political science since the 1980s, and many of these applications have been cross-country analyses of political systems.
RE "QCA is not qualitative – as it seeks to reduce a complex qualitative issue to a quantitative score - a number."
In a crisp-set QCA data set the “number” 0 or 1 is actually a category, not a numerical value. QCA could be done just as well by replacing the 0’s and 1’s with the words “absent” and “present”.
RE "QCA is not comparative – the serious comparative part comes afterwards in some form of qualitative analysis, which researchers can choose. Looking at the truth table for patterns is the only form of comparison that QCA offers."
There are two levels of analysis involved in QCA: within-case analysis and between-case analysis. At the beginning, within-case analysis informs the selection of conditions to be included in a data set. When inconsistencies are found in an examination of configurations in a data set, good practice advises a return to within-case analysis to identify missing conditions that can resolve these inconsistencies. When these have been resolved and a set of configurations has been identified that accounts for all cases in the most parsimonious way possible, these then need to be interpreted by reference to the details of specific cases, with particular attention to the more detailed processes that connect the conditions making up the configurations.
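As a small illustration of the between-case step, the sketch below groups cases into truth table rows and flags the rows where cases with identical configurations have different outcomes. These are the inconsistencies that would send you back to within-case analysis. The cases, the three conditions and the function names are all invented for the example.

```python
from collections import defaultdict

# Hypothetical cases: name -> ((condition A, B, C as 0/1), outcome as 0/1)
cases = {
    "case1": ((1, 1, 0), 1),
    "case2": ((1, 1, 0), 0),
    "case3": ((1, 0, 1), 1),
    "case4": ((0, 0, 1), 0),
    "case5": ((1, 0, 1), 1),
}

def truth_table(cases):
    """Group cases by their configuration of conditions (one row per configuration)."""
    rows = defaultdict(list)
    for name, (config, outcome) in cases.items():
        rows[config].append((name, outcome))
    return dict(rows)

def contradictory_rows(table):
    """Return the rows whose member cases do not all share the same outcome."""
    return {
        config: members
        for config, members in table.items()
        if len({outcome for _, outcome in members}) > 1
    }

table = truth_table(cases)
contradictions = contradictory_rows(table)

for config, members in contradictions.items():
    print(config, members)  # rows needing a return to within-case analysis
```

Here case1 and case2 share the same configuration but differ in outcome, so the question for within-case analysis is what distinguishes them that the three coded conditions have missed.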
RE "In my view QCA is a quantitative form of data management and pattern identification."
It does depend on what you mean by quantitative. It is based on a form of mathematics known as set theory, but that is about logical relationships, not quantities. In case there is any reservation about its significance, pattern identification is very important. In a data set with 10 different conditions there are 2 to the power of 10 (1,024) different possible combinations of these that might be consistently associated with an outcome of interest. Finding these is like looking for a needle in a haystack. QCA and other methods, like decision tree algorithms, help us find which part of the haystack the needle is most likely to be found in. But as I said at the end of my section of the UKES presentation, finding a plausible configuration is not enough. It is necessary but not sufficient for a strong causal claim. There also needs to be a plausible account of the likely causal mechanisms at work that connect the conditions in the configurations. These will only be found and confirmed through detailed within-case investigations, using methods like (but not only) process tracing. And the pattern finding has to be systematic, and transparent in the way it has been done. This is the case with QCA and Decision Tree modeling, where specific algorithms are used, each with its known limitations.

There is a useful reference that may be of interest: Wagemann, C., Schneider, C.Q., 2007. Standards of Good Practice in Qualitative Comparative Analysis (QCA) and Fuzzy-Sets.