Saturday, December 26, 2015

Meta versus macro theories of change


A macro-ToC is a single ToC that seeks to aggregate into one view the contents of many micro-ToCs: for example, the aggregation of many project-specific ToCs into a single country-level ToC. There are two risks with this approach:

  1. The loss of detail involved in this aggregation will lead to a loss of measurability, which presents problems for the evaluability of a macro-ToC.
  2. Even where the macro-ToC can be tested, the relevance of the results to a specific project could be contested, because individual projects could challenge the macro-ToC as not being an adequate representation of their project intentions.
The alternative to a macro-ToC is something that could be called a meta-ToC. A meta-theory is a theory about theories. A meta-ToC would be a structured set of ideas about the significant differences between various ToCs. These differences might be of various kinds, e.g. about the context, the intervention, the intended beneficiaries, or any mediating causal mechanisms.

Consider the following (imagined) structure. This is, in effect, a nested classification of projects. Each branch represents what might be seen by a respondent as a significant difference between projects, ideally as apparent in the contents of their ToCs and associated documents. This kind of structure can be developed by participatory or expert judgement methods (see PS 2 link below for how). The former is preferable because it could increase buy-in to the final representation by the constituent projects and their associated ToCs.
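As a purely illustrative sketch (the actual tree from the original post is not reproduced here), a nested classification of this kind can be represented as a simple tree in which each branch point names a difference and each leaf lists the projects sharing that combination of attributes. All project names and differences below are invented:

```python
# A hypothetical nested classification of five (invented) projects.
# Each branch point names a difference judged significant by respondents;
# leaves list the projects sharing that combination of attributes.
classification = {
    "Number of funders": {
        "Single funder": {
            "Geographic scale of project": {
                "National": ["Project A"],
                "Sub-national": ["Project B"],
            }
        },
        "Multiple funders": {
            "Type of intended beneficiaries": {
                "Households": ["Project C", "Project D"],
                "Local organisations": ["Project E"],
            }
        },
    }
}
```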
The virtue of this approach is that, if well done, each difference in the tree structure represents the seed of a hypothesis that could be the focus of attention in a macro evaluation. That is, the "IF.." part of an "IF..THEN.." statement. If each difference represents the most significant difference, the respondents could then be asked a follow-up question: "What difference has, or will, this difference make?" Combined with the original difference, the answers to this second question generate what are essentially hypotheses (IF...THEN... statements), ones that should be testable by comparing the projects falling into the two categories described.
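A minimal sketch of that pairing, assuming invented differences and invented answers to the follow-up question:

```python
# Pair each significant difference (the IF) with respondents' answers to
# "What difference has, or will, this difference make?" (the THEN).
# Both lists below are invented examples, for illustration only.
differences = [
    ("projects have a single funder", "projects have multiple funders"),
    ("projects work at national scale", "projects work at sub-national scale"),
]
consequences = [
    "reporting burdens are lower and more outputs are delivered on time",
    "policy influence is greater but household-level reach is smaller",
]

for (option_a, option_b), consequence in zip(differences, consequences):
    # Each statement can then be tested by comparing the projects
    # falling on either side of the difference.
    print(f"IF {option_a} rather than {option_b}, THEN {consequence}.")
```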

Some of these differences will be more worth testing than others, particularly if they cover more projects. For example, in the tree structure above, the difference in "Number of funders" applies to all five projects, whereas the difference in "Geographic scale of project" only applies to two projects. More important differences, those that apply to more projects, will also, by definition, have more cases that can be compared to each other.

It is also possible to identify compound hypotheses worth testing, that is, "IF...AND...THEN..." type statements. Participants could be asked to walk down each branch in turn and indicate at each branch point: "Which of these types of projects do you think has been/will be the most successful?" The combination of project attributes described by a given branch is the configuration of conditions hypothesised to lead to the predicted result. Knowledge about which of these configurations are more effective could be practically useful.
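One way to enumerate the candidate configurations is to walk every root-to-leaf path in the classification tree; each path supplies the "IF...AND..." part of a compound hypothesis. A sketch, using a flattened version of the invented tree above:

```python
# Enumerate every root-to-leaf path in a nested classification.
# Each path is a configuration of conditions: the "IF...AND..." part
# of a compound hypothesis. The tree below is an invented example.
classification = {
    "Single funder": {
        "National scale": ["Project A"],
        "Sub-national scale": ["Project B"],
    },
    "Multiple funders": {
        "Households targeted": ["Project C", "Project D"],
        "Local organisations targeted": ["Project E"],
    },
}

def configurations(tree, conditions=()):
    """Yield (conditions, projects) for every root-to-leaf path."""
    for label, subtree in tree.items():
        path = conditions + (label,)
        if isinstance(subtree, dict):
            yield from configurations(subtree, path)
        else:  # a leaf: the projects sharing this configuration
            yield path, subtree

for conds, projects in configurations(classification):
    print("IF " + " AND ".join(conds) + f" THEN ... (cases: {', '.join(projects)})")
```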

In summary: this meta-theory approach maximises the use of the diversity that can be present in a large portfolio of activities, rather than aggregating it out of existence. Or, more accurately, out of visibility.

PS 1: These thoughts have been prompted by my experience of being involved in a number of macro-evaluations of projects in recent years.

PS 2: For more on creating such nested classifications see https://mande.co.uk/special-issues/hierarchical-card-sorting-hcs/

Friday, August 21, 2015

Clustering projects according to similarities in outcomes they achieve

Among some users of LogFrames it is verboten to have more than one Purpose level (i.e. outcome) statement. They are likely to argue that where there are multiple intended outcomes a project's efforts will be dissipated and it will ultimately be ineffective. However, a reasonable counter-argument would be that in some cases multiple outcome measures may simply be a more nuanced description of an outcome that others might want to insist is expressed in a singular form.

The "problem" of multiple outcome measures becomes more common when we look at portfolios of projects where there may be one or two over-arching objectives but it is recognised that there are multiple pathways to their achievement. Or, that it is recognized that individual projects may want to trying different mixes of strategies , rather than just one alone.

How can an evaluator deal with multiple outcomes, and data on these? Some years ago one strategy that I used was to gather the project staff together to identify, for each output, its expected relative causal contribution to each of the project outcomes. These judgements were expressed as individual values that added up to 100 percentage points per outcome, plotted in an (Excel) Outputs x Outcomes matrix, projected onto a screen for all to see, argue over and edit. The results enabled us to prioritise which Output-to-Outcome linkages to give further attention to, and to identify, in aggregate, which Outputs would need more attention than others.
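A rough sketch of that kind of matrix exercise, assuming invented outputs, outcomes and judgement values (and using pandas here rather than Excel):

```python
import pandas as pd

# Invented Outputs x Outcomes matrix. Each cell is the staff's judgement of an
# output's relative causal contribution to an outcome; each outcome column
# adds up to 100 percentage points.
matrix = pd.DataFrame(
    {
        "Outcome 1": [50, 30, 20],
        "Outcome 2": [10, 60, 30],
    },
    index=["Output A", "Output B", "Output C"],
)

print(matrix)
print("\nColumn totals (each should be 100):")
print(matrix.sum(axis=0))
print("\nRow totals (outputs needing relatively more attention overall):")
print(matrix.sum(axis=1).sort_values(ascending=False))
```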

There is also another possible approach. More recently I have been exploring the potential uses of clustering modules within the RapidMiner data mining package. I have a data set of 34 projects with data on their achievements on 11 different outcome measures. A month ago I was contracted to develop some predictive models for each of these outcomes, which I did. But it now occurs to me that doing so may be somewhat redundant, in that there may not really be 11 different types of project performance. Rather, it is possible that there are a smaller number of clusters of projects, and within each of these there are projects having similar patterns of achievement across the various possible outcomes.

With this in mind I have been exploring the use of two different clustering algorithms: k-Means and DBSCAN. Both are described in practically useful detail in Kotu and Deshpande's book "Predictive Analytics and Data Mining".

With k-Means you have to specify the number of clusters you are looking for (k), which may be useful in some circumstances, but I would prefer to find an "ideal" number. This could be the number of clusters at which cases within a cluster are most similar to each other, compared to alternative clusterings of the same cases. The performance metrics available for k-Means clustering allow this kind of assessment to be made. The best performing clustering result I found identified four clusters. With DBSCAN you don't nominate any preferred number of clusters, but it turns out there are other parameters you do need to set, which also affect the result, including the number of clusters found. But again, you can compare and assess these using a performance measure, which I did. However, in this case the best performing result was two clusters rather than four!
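The original analysis used RapidMiner's clustering modules; as a stand-in, the sketch below shows the same kind of comparison in scikit-learn, with synthetic data in place of the real 34 x 11 dataset and average silhouette width as the performance measure. Parameter values are arbitrary and the results on random data are meaningless; only the workflow is the point.

```python
import numpy as np
from sklearn.cluster import KMeans, DBSCAN
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 34 projects x 11 outcome measures.
rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(34, 11)))

# k-Means: try several values of k and compare a performance metric
# (here, average silhouette width) to pick the best-performing clustering.
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(f"k-Means, k={k}: silhouette={silhouette_score(X, labels):.3f}")

# DBSCAN: no k to choose, but eps and min_samples also shape the result,
# including how many clusters (and noise points) are found.
labels = DBSCAN(eps=2.5, min_samples=3).fit_predict(X)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(f"DBSCAN: {n_clusters} clusters, {list(labels).count(-1)} noise points")
```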

What to do? Talk to the owners of the data, who know the details of the cases involved, and show them the alternative clusterings, including information on which projects belong to which clusters. Then ask them which clustering makes the most sense, i.e. is most interpretable, given their knowledge of these projects.

And then what? Having identified the preferred clustering model, it would then make sense to go back to the full data set and develop predictive models for these clusters, i.e. to find which package of project attributes best predicts membership of the particular cluster of outcome achievements that is of interest.
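A minimal sketch of that last step, assuming invented project attributes and cluster labels, and using a simple decision tree classifier as one possible form of predictive model:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical project attributes and the cluster each project was assigned to
# by the preferred clustering model. All column names and values are invented.
data = pd.DataFrame(
    {
        "budget_band": [1, 2, 2, 3, 1, 3, 2, 1],
        "n_funders": [1, 3, 2, 1, 1, 2, 3, 1],
        "national_scale": [0, 1, 1, 0, 0, 1, 1, 0],
        "cluster": ["A", "B", "B", "A", "A", "B", "B", "A"],
    }
)

X = data.drop(columns="cluster")
y = data["cluster"]

# A shallow decision tree: which package of attributes best predicts
# membership of each cluster of outcome achievements?
model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(model, feature_names=list(X.columns)))
```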