A counterfactual is a description of what would have happened if an intervention had not taken place. The use of randomised control groups is one way to construct a counterfactual. People are randomly assigned to either a control group or an intervention group, and the outcomes of the two groups are then compared. If the difference is statistically significant, a plausible causal claim can be made that the difference in outcomes is due to the intervention.
As might be expected, there are plenty of circumstances in the design and
implementation of social programmes where it is simply not practical
to organise a randomised control group. In addition, the comparisons that
are made will be between averages of the two groups. However, in many social
programmes such averages are of limited practical use, because the
implementation contexts are so varied and no single “solution” is likely to be
applicable. Average effects can still be informative at a high level, but they
need to be complemented by methods that take contextual diversity seriously.
I'm currently working with an evaluation team that is examining a large-scale
public health programme in the United Kingdom, covering many different
locations and involving many different types of local partnerships, but
with one common outcome of concern: increasing people's physical
activity levels in their daily lives. In its work the evaluation team is
already making use of a causal configurational approach to the understanding of
what works for whom in what circumstances. It is finding different
configurations of causal conditions across these locations that are associated
with changes in activity levels. This approach is consistent with the high
level of diversity in locations, partnerships and interventions.
But what it does not yet have is a counterfactual, a defensible description of
what might have happened in these locations in the absence of this
intervention. This is where the idea of a rank‑order
counterfactual becomes relevant. By a rank‑order
counterfactual I mean a very specific kind of “what would have happened
otherwise.” Instead of trying to predict the exact outcome that would have been
achieved in each location without an intervention, we can start by asking a
simpler, comparative question: which location would probably
have changed more, and which less, if the intervention had
never existed? The answer will be in the form of a rank ordering of
locations, from those with more to less expected change. That
ranking would be constructed based on all
available baseline information, trends, and contextual knowledge. This
proposed approach falls into a category of counterfactuals known as
"logically constructed counterfactuals", and it aligns well with
configurational evaluation because it focuses on patterns of relative change
across diverse contexts.
A subsequent evaluation of those same locations should also
be able to generate a new rank ordering, which is based on observed outcomes.
These counterfactual and actual rankings can then be compared, using a
scatterplot and correlation measures. The scatterplot is also visually powerful
for communication: it lets people see at a glance which locations behave as
expected and which ones stand out as surprises. If the intervention had no
effect we should see a roughly linear relationship: the observed and
counterfactual rankings should be closely similar. If the intervention had
positive, or perhaps even negative, effects, this should not happen, and we
might see various locations which are outliers from that expected trend.
When locations we expected to be
“natural leaders” did not improve much, and those we expected to be “natural
laggards” moved to the top of the league table, that pattern is a signal that
the intervention may have been influential, and it gives the evaluation team
clear cases to investigate. The task of the evaluation is then to probe
alternative explanations for those cases, not to assume the intervention is
the only possible cause. The rankings are not a substitute for theory‑based
evaluation; they are a way to make its claims sharper and more testable. The focus
on ranking differences can convert a vague theory (“we think these factors
matter”) into a concrete, specific prediction about which locations should do
better.
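As a minimal sketch of the comparison described above, the following Python computes Spearman's rank correlation between a counterfactual and an observed ranking, and flags the locations that moved furthest from their expected position. The location names, the ranks and the cut-off of two places are all invented for illustration, and the formula used assumes complete rankings with no ties.

```python
# Hypothetical data: rank 1 = most change. Location names and ranks are invented.
expected = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # counterfactual ranking
observed = {"A": 2, "B": 3, "C": 4, "D": 5, "E": 6, "F": 1}  # ranking from observed outcomes


def spearman_rho(expected, observed):
    """Spearman's rank correlation for two complete rankings with no ties."""
    n = len(expected)
    d_squared = sum((expected[loc] - observed[loc]) ** 2 for loc in expected)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def rank_shifts(expected, observed):
    """Positive shift: the location finished higher up the table than expected."""
    return {loc: expected[loc] - observed[loc] for loc in expected}


rho = spearman_rho(expected, observed)
shifts = rank_shifts(expected, observed)

# Flag locations that moved by at least `threshold` places (an arbitrary cut-off).
threshold = 2
surprises = sorted(loc for loc, s in shifts.items() if abs(s) >= threshold)

print(f"Spearman's rho: {rho:.2f}")
print("Surprises to investigate:", surprises)
```

A rho close to 1 would mean the observed ranking closely follows the counterfactual one. In this invented example a single expected "laggard" (F) moving to the top is enough to pull the correlation well below that, and F is the case where alternative explanations would need to be probed first.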
The sensitivity of the rank comparison process will depend
on the number of ranked items. The more rank positions there are, the more
sensitivity there will be to differences in performance, which is
good. But, as research on sorting algorithms shows, the time
required to generate a complete ranking can be significant. For methods based
on pairwise comparisons, the number of comparisons grows faster than
proportionally to the number of items (on the order of n log n at best),
though far slower than exponentially. In addition to the extra
time required, a rank‑order counterfactual will require a stronger
evidential base where the number of rank positions is greater.
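To give a rough sense of that growth, here is a small sketch, assuming the ranking is built from pairwise "which would have changed more?" judgements. It prints, for a few illustrative numbers of locations, the theoretical minimum number of such comparisons needed in the worst case to fix a complete ranking (the ceiling of log2 of n factorial) alongside the number needed if every pair were compared once. The function names are my own.

```python
import math


def min_comparisons(n):
    """Lower bound on worst-case pairwise comparisons needed to fully rank n items."""
    return math.ceil(math.log2(math.factorial(n)))


def all_pairs(n):
    """Number of comparisons if every pair of items is compared exactly once."""
    return n * (n - 1) // 2


# Illustrative numbers of locations only.
for n in (5, 10, 20, 40):
    print(f"{n:>3} locations: at least {min_comparisons(n)} comparisons, "
          f"{all_pairs(n)} if all pairs are compared")
```

Each comparison here stands for a judgement that has to be argued and evidenced, which is why the cost of ranking matters more than it would for a computer sorting numbers.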
When a large number of locations are involved in an
intervention, one practical way of addressing this tension is to use a
stratified random sample, and to generate the rankings for that
sample only. Another approach to managing large numbers of locations
is to think of ranked bands of locations rather than individual rankings for
each location. What should be of interest, then, are systematic shifts in
band membership between the counterfactual and actual observations – for
example, locations expected to be in the “low‑change” band
turning up in the “high‑change” band in
practice.
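The banded comparison can be sketched in the same way. The example below assumes three roughly equal-sized bands and reuses invented location names and ranks (rank 1 = most change); the band labels and the `to_band` helper are hypothetical. It counts how many locations moved between each pair of expected and observed bands and lists the locations whose band changed.

```python
from collections import Counter

BANDS = ["high-change", "medium-change", "low-change"]  # hypothetical band labels

# Hypothetical data: rank 1 = most change. Location names and ranks are invented.
expected = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}  # counterfactual ranking
observed = {"A": 2, "B": 3, "C": 4, "D": 5, "E": 6, "F": 1}  # ranking from observed outcomes


def to_band(rank, n_items, bands=BANDS):
    """Map a rank (1 = most change) to one of several roughly equal-sized bands."""
    return bands[(rank - 1) * len(bands) // n_items]


n = len(expected)
expected_band = {loc: to_band(r, n) for loc, r in expected.items()}
observed_band = {loc: to_band(r, n) for loc, r in observed.items()}

# Count locations for each (expected band, observed band) combination.
transitions = Counter((expected_band[loc], observed_band[loc]) for loc in expected)

# Locations whose observed band differs from their counterfactual band.
movers = sorted(loc for loc in expected if expected_band[loc] != observed_band[loc])

for (exp_band, obs_band), count in sorted(transitions.items()):
    print(f"expected {exp_band:<13} -> observed {obs_band:<13}: {count}")
print("Locations that changed band:", movers)
```

With many locations, the counts off the diagonal of this table (expected and observed bands differing) are where systematic shifts would show up.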
In this way, rank‑order counterfactuals do not replace theory‑based evaluation, but sharpen it: they turn general expectations about context into explicit, testable predictions about which locations should have changed most in the absence of the programme. In work which I hope to present at the European Evaluation Society conference later this year, I will explain how a hierarchical card sorting process was used to generate argument- and evidence-based counterfactual rank orderings, and how an LLM was used to support this process.
