Saturday, December 26, 2015

False Positives - why we should pay more attention to them


In the last year I have been involved in two pieces of work that sought to find patterns in data that are good predictors of project outcomes of interest: in one case as the researcher, in the other in a quality assurance role, looking over someone else's analysis.

In both situations two types of prediction rules were found: (a) some confirming stakeholders' existing understandings, (b) others contradicting that understanding and/or proposing a novel perspective. The value of further investigating the latter was evident, but the value of investigating findings that seemed to confirm existing views was less evident to the clients in both cases. "We know that... let's move on... show us something new" seemed to be the attitude. After some time, it occurred to me that a different next step was needed for each of these kinds of findings:

  • Where findings are novel, it is the True Positive cases that need further investigation. These are the cases where the outcome was predicted by a rule, and confirmed as being present by the data.
  • Where findings are familiar, it is the False Positives that need further investigation. These are the cases where the rule predicted the outcome but the data indicated the outcome was not present. In my experience so far, most of the confirmatory prediction rules had at least some False Positives. These are important to investigate because doing so could help identify important boundaries to our confidence about where and when a given rule works.
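The two groups of cases described above can be pulled out of a data set with very little code. What follows is a minimal sketch, assuming each project is recorded with whether the rule predicted the outcome and whether the outcome was observed. The `Case` structure, the `split_positives` helper and the project data are all invented for illustration.

```python
from typing import NamedTuple


class Case(NamedTuple):
    name: str        # hypothetical project identifier
    predicted: bool  # did the rule predict the outcome?
    observed: bool   # was the outcome actually present in the data?


def split_positives(cases):
    """Return (true_positives, false_positives) among cases the rule predicted."""
    true_pos = [c.name for c in cases if c.predicted and c.observed]
    false_pos = [c.name for c in cases if c.predicted and not c.observed]
    return true_pos, false_pos


# Invented example data, not taken from either piece of work described above.
cases = [
    Case("Project A", predicted=True, observed=True),
    Case("Project B", predicted=True, observed=False),
    Case("Project C", predicted=False, observed=True),
    Case("Project D", predicted=True, observed=True),
    Case("Project E", predicted=False, observed=False),
]

true_pos, false_pos = split_positives(cases)
print("Investigate if the rule is novel:", true_pos)
print("Investigate if the rule is familiar:", false_pos)
```

In this sketch Project C is a False Negative and Project E a True Negative; neither is selected, because the rule did not predict the outcome for them.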
Thinking more widely, it occurred to me how much more attention we should pay to False Positives in the way that public policy supposedly works. In wartime, civilian casualties are often the False Positives in calculations about the efficacy of airstrikes, for example. We hear about the number of enemy combatants killed, but much less often about the civilians killed by the same "successful" strikes. There are many areas of public policy, especially in law I suspect, where there are equivalents of these civilian deaths, metaphorically if not literally. The "War on Drugs" and the current "War on Terrorism" are two that come to mind. Those implementing these policies are preoccupied with the number of True Positives they have achieved and with the False Negatives, i.e. the cases known but not yet detected and hit. Counting False Positives is much less in their immediate interest, which raises the question: if not by them, then by whom?

Some Christmas/New Year thoughts from a dry, warm, safe and secure house in the northern hemisphere...

PS: see http://arstechnica.co.uk/security/2016/02/the-nsas-skynet-program-may-be-killing-thousands-of-innocent-people/

Meta versus macro theories of change


A macro-ToC is a single ToC that seeks to aggregate into one view the contents of many micro-ToCs. For example, the aggregation of many project-specific ToCs into a single country-level ToC. There are two risks with this approach:

  1. The loss of detail involved in this aggregation will lead to a loss of measurability, which presents problems for the evaluability of a macro-ToC.
  2. Even where the macro-ToC can be tested, the relevance of the results to a specific project could be contested, because individual projects could challenge the macro-ToC as not being an adequate representation of their project intentions.
The alternative to a macro-ToC is something that could be called a meta-ToC. A meta-theory is a theory about theories. A meta-ToC would be a structured set of ideas about the significant differences between various ToCs. These differences might be of various kinds, e.g. about the context, the intervention, the intended beneficiaries, or any mediating causal mechanisms. Consider the following (imagined) structure. This is in effect a nested classification of projects. Each branch represents what might be seen by a respondent as significant differences between projects, ideally as apparent in the contents of their ToCs and associated documents. This kind of structure can be developed by participatory or expert judgement methods (see the PS 2 link below for how). The former is preferable because it could increase buy-in to the final representation by the constituent projects and their associated ToCs.
The virtue of this approach is that, if well done, each difference in the tree structure represents the seed of a hypothesis that could be the focus of attention in a macro evaluation. That is, the "IF..." part of an "IF...THEN..." statement. If each difference represents the most significant difference, the respondents could then be asked a follow-up question: "What difference has this difference made, or will it make?" Combined with the original difference, the answers to this second question generate what are essentially hypotheses (IF...THEN... statements), ones that should be testable by comparing the projects fitting into the two categories described.

Some of these differences will be more worth testing than others because they cover more projects. For example, in the tree structure above, the difference in "Number of funders" applies to all five projects, whereas the difference in "Geographic scale of project" only applies to two projects. The more important differences, those that apply to more projects, will also by definition have more cases that can be compared to each other.
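As a rough illustration of this ranking, the sketch below counts how many projects sit beneath each difference in a nested classification. The tree is hypothetical: only "Number of funders" (five projects) and "Geographic scale of project" (two projects) come from the example above, while the third difference, the branch labels and the project names are invented.

```python
# A node is either a list of project names (a leaf) or a dict holding the
# name of a difference and the branches on either side of it.
# Hypothetical tree: labels and project names are invented for illustration.
tree = {
    "difference": "Number of funders",
    "branches": {
        "One funder": {
            "difference": "Type of implementing partner",  # invented difference
            "branches": {
                "Government": ["P1", "P2"],
                "NGO": ["P3"],
            },
        },
        "Many funders": {
            "difference": "Geographic scale of project",
            "branches": {
                "National": ["P4"],
                "Local": ["P5"],
            },
        },
    },
}


def projects_under(node):
    """Collect all project names at or below a node."""
    if isinstance(node, list):
        return list(node)
    names = []
    for child in node["branches"].values():
        names.extend(projects_under(child))
    return names


def coverage(node, counts=None):
    """Map each difference to the number of projects it applies to."""
    if counts is None:
        counts = {}
    if isinstance(node, dict):
        counts[node["difference"]] = len(projects_under(node))
        for child in node["branches"].values():
            coverage(child, counts)
    return counts


counts = coverage(tree)
# List the differences with the widest coverage first.
for difference, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{difference}: {n} projects")
```

Differences near the top of the list offer more cases to compare, so they may be the more promising candidates for testing.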

It is also possible to identify compound hypotheses worth testing. That is, "IF...AND...THEN..." type statements. Participants could be asked to walk down each branch in turn and indicate at each branch point "Which of these types of projects do you think has been, or will be, the most successful?" The combination of project attributes described by a given branch is the configuration of conditions hypothesised to lead to the result predicted. Knowledge about which of these are more effective could be practically useful.
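Walking down each branch can be sketched in the same way: every path from the top of the tree to a leaf gives one configuration of conditions. The example below assumes the same kind of hypothetical tree as before (labels and project names invented) and leaves the THEN part open, since that prediction would come from the participants rather than from the structure itself.

```python
# Hypothetical nested classification; labels and project names are invented.
tree = {
    "difference": "Number of funders",
    "branches": {
        "One funder": {
            "difference": "Type of implementing partner",  # invented difference
            "branches": {"Government": ["P1", "P2"], "NGO": ["P3"]},
        },
        "Many funders": {
            "difference": "Geographic scale of project",
            "branches": {"National": ["P4"], "Local": ["P5"]},
        },
    },
}


def configurations(node, path=()):
    """Yield (conditions, projects) for every branch from the top to a leaf."""
    if isinstance(node, list):
        yield list(path), node
        return
    for label, child in node["branches"].items():
        step = (node["difference"], label)
        yield from configurations(child, path + (step,))


configs = list(configurations(tree))
for conditions, projects in configs:
    clause = " AND ".join(f"{diff} = {value}" for diff, value in conditions)
    print(f"IF {clause} THEN ... (projects: {', '.join(projects)})")
```

Each printed line is one candidate "IF...AND...THEN..." statement, together with the projects that could be used to test it.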

In summary: This meta-theory approach maximises the use of the diversity that can be present in a large portfolio of activities, rather than aggregating it out of existence. Or more accurately, out of visibility.

PS 1: These thoughts have been prompted by my experience of being involved in a number of macro-evaluations of projects in recent years.

PS 2: For more on creating such nested classifications see https://mande.co.uk/special-issues/hierarchical-card-sorting-hcs/