Thursday, December 25, 2025

Extracting additional knowledge and performance from a configurational model that already has wide coverage

 
A decision tree algorithm, as available within EvalC3, can generate a classification tree  (a set of predictive models) of the kind shown here.
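
As a rough illustration of what such an algorithm does, here is a minimal sketch in Python using scikit-learn's DecisionTreeClassifier. EvalC3 itself is an Excel-based tool, so this is an analogous procedure rather than its actual implementation, and the tiny dataset and attribute values are invented:

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier, export_text

    # One row per case: binary attributes plus a binary outcome (values invented).
    data = pd.DataFrame({
        "HasQuotas":      [1, 1, 1, 1, 0, 0, 1, 0],
        "IsPostConflict": [1, 1, 1, 0, 0, 1, 0, 0],
        "HighHDI":        [0, 1, 0, 1, 1, 0, 1, 0],
        "HighWomensRep":  [1, 1, 1, 0, 0, 0, 0, 0],  # the outcome
    })

    X = data[["HasQuotas", "IsPostConflict", "HighHDI"]]
    y = data["HighWomensRep"]
    tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

    # Each root-to-leaf branch of the printed tree is one predictive model,
    # i.e. one configuration of attributes associated with the outcome.
    print(export_text(tree, feature_names=list(X.columns)))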


Some of the models (each branch is a model) are very detailed (i.e. have lots of attributes) and have narrow coverage. For example: HasQuotas + Not Post-Conflict Situation + High Level of Human Development + Low Women's Status = Low levels of women's representation in Parliament, which covers two cases (Senegal, Tanzania).

Others are quite simple, with only two or three attributes, and can have much wider coverage. For example: HasQuotas + IsPostConflict = High levels of women's representation in Parliament, which covers six cases (Burundi, Ethiopia, Mozambique, Namibia, South Africa, Uganda).

These wide-coverage models may have unexplored potential, in the form of unexploited information content within the cases they cover. The raw (i.e. numerical) outcome data for only the cases they cover can be examined and recalibrated, i.e. re-dichotomised into two new sub-groups representing relatively higher versus lower outcome values within that set alone.
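
A minimal sketch of this re-dichotomisation step, assuming the six covered cases from the example above and using invented raw outcome values:

    import statistics

    # Hypothetical raw outcome values for the six covered cases, e.g. the
    # proportion of parliamentary seats held by women (numbers invented).
    raw_outcomes = {
        "Burundi": 0.36, "Ethiopia": 0.39, "Mozambique": 0.43,
        "Namibia": 0.44, "South Africa": 0.46, "Uganda": 0.34,
    }

    # Re-dichotomise at the median *within this sub-set only*, giving two new
    # sub-groups: relatively higher versus relatively lower outcomes.
    cutoff = statistics.median(raw_outcomes.values())
    higher = {case for case, value in raw_outcomes.items() if value > cutoff}
    lower = set(raw_outcomes) - higher

    print(f"cutoff={cutoff:.2f}")
    print("higher:", sorted(higher))  # Mozambique, Namibia, South Africa
    print("lower: ", sorted(lower))   # Burundi, Ethiopia, Uganda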

A new configurational analysis can then focus on that sub-set of cases to see if (a) any of the pre-existing attributes could predict membership of the two sub-groups, or (b) any additional attributes, based on other knowledge of these cases, could do so. The ability to predict such finer-grained performance differences would be a significant improvement.
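
Continuing the sketch, step (a) can be checked directly: does any pre-existing attribute line up with the new higher/lower sub-groups? (The attribute membership here is invented purely for illustration.)

    # The two new sub-groups from the re-dichotomisation sketch above.
    higher = {"Mozambique", "Namibia", "South Africa"}
    lower = {"Burundi", "Ethiopia", "Uganda"}

    # Cases holding some pre-existing attribute, e.g. HighHDI (membership invented).
    high_hdi = {"Mozambique", "Namibia", "South Africa"}

    # A perfect overlap would mean the attribute predicts the finer-grained
    # distinction; anything less suggests seeking additional attributes (step b).
    print("HighHDI predicts the 'higher' sub-group:", high_hdi == higher)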

This analytic step is a complementary move to that known as "pruning", where the removal of a model attribute improves coverage at the cost of precision. Here an extra attribute is sought that will improve precision, but at the cost of coverage. Perhaps it could be called "grafting"...

Postscript: But how significant will this addition to the model be? If, as above, there are six cases involved, there are 2^6 = 64 ways of assigning them to two labelled sub-groups, i.e. 32 distinct binary groupings once a grouping and its mirror image are counted as the same split. So any one grouping of the cases into two sets has a 1/32 or 3.125% chance of occurring randomly (if the cases are causally independent).
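
A quick enumeration confirms the arithmetic (case names as above):

    from itertools import product

    cases = ["Burundi", "Ethiopia", "Mozambique", "Namibia", "South Africa", "Uganda"]

    # 2**6 = 64 ways of assigning each case to sub-group 0 or 1...
    assignments = list(product([0, 1], repeat=len(cases)))
    assert len(assignments) == 64

    # ...but an assignment and its mirror image describe the same two-way split,
    # so there are 64 / 2 = 32 distinct groupings.
    splits = {frozenset([frozenset(c for c, g in zip(cases, bits) if g == 0),
                         frozenset(c for c, g in zip(cases, bits) if g == 1)])
              for bits in assignments}
    print(len(splits), "distinct splits; chance of any one:", f"{1 / len(splits):.3%}")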

Tuesday, December 02, 2025

Objectives as data: The potential uses of updatable outcome targets

The context

A specialist agency is funding more than 40 different partner organisations, each working in a different part of the country but with the same overall objective of increasing people's levels of physical activity (because of the positive health consequences). These partners are often working with quite different communities, and all have a substantial degree of independence in how they work towards the overall objective.

Some agency representatives have asked about the nature of the target that the program as a whole is working towards, and have emphasised how essential it is that there be clarity in this area. By target they mean an actual number: specifically, the percentage of people self-reporting that they achieve a certain level of physical activity each week, as identified by an annual survey that is already underway and will be repeated in future.

Possible responses

In principle it would be possible to set a target for the proportion of the population reporting being physically active, such as 75%. But it would be very hard to identify an optimal target percentage, given the diversity of partner localities and of the communities within them.

Relative targets may be more appropriate, such as a 25% increase in reported activity levels. Especially if partners were each asked to identify what they think are achievable percentage increases in their own localities within the next survey period. This estimation would take place in a context where these partners already have experience working in those locations, and have identified some of the things that do and don't work. My hypothesis, yet to be tested, is that these partners will make quite conservative estimates. If so, this might come as some surprise to the donor, and perhaps lead to some revision of their own expectations.

Taking this idea further, partners could periodically be asked if they wanted to adjust their expectations of the change that could be achieved upwards or downwards, in the time remaining in the intervention's lifespan, subject to being able to explain the rationale for doing so. My second hypothesis is that this number, and the accompanying commentary, could be a valid and useful form of progress reporting in its own right.
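
If such updates were collected routinely, each would only need to be a small structured record. One possible shape, with all field names and values invented for illustration:

    revision = {
        "partner": "Partner C",                 # hypothetical partner
        "period": "2027-H1",
        "previous_target_pct_increase": 25,
        "revised_target_pct_increase": 18,      # adjusted downwards
        "rationale": "Two community venues closed; uptake below expectations.",
    }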

Making sense of the responses

An assessment of overall progress over a longer time scale would need to consider both the scale of ambitions and the extent of their achievement. These can't be combined into one number based on a simple formula, because any such number could be achieved by adjusting expectations and/or performance. However, the two can be usefully represented by a scatterplot, with data points for each of the partners, of the kind shown below.

The location of partners in different quadrants suggests different implications for how the different partners should be managed:

  • High ambition/low achievement: May need additional support, capacity building, or problem-solving
  • Low ambition/low achievement: May need fundamental partnership restructuring or exit considerations
  • High ambition/high achievement: Candidates for scaling, sharing learning, reduced support intensity
  • Modest ambition/high achievement: Opportunities to stretch ambitions
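
A minimal sketch of such a quadrant scatterplot, using matplotlib and invented partner names and values (the quadrant boundaries here are set at the median of each axis, though any meaningful cut-points would do):

    import statistics
    import matplotlib.pyplot as plt

    # Invented data: ambition = expected % increase, achievement = actual % increase.
    # These four partners fall one in each quadrant described above.
    partners = {
        "Partner A": (30, 8),  "Partner B": (10, 4),
        "Partner C": (28, 26), "Partner D": (12, 18),
    }

    ambition, achievement = zip(*partners.values())
    fig, ax = plt.subplots()
    ax.scatter(ambition, achievement)
    for name, (x, y) in partners.items():
        ax.annotate(name, (x, y), textcoords="offset points", xytext=(5, 5))

    # Dashed lines divide the plot into the four quadrants.
    ax.axvline(statistics.median(ambition), linestyle="--")
    ax.axhline(statistics.median(achievement), linestyle="--")
    ax.set_xlabel("Ambition: expected % increase in reported activity")
    ax.set_ylabel("Achievement: actual % increase in reported activity")
    plt.show()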

This framework also provides plenty of potentially useful analytic questions:

  • Are ambitions increasing or decreasing?
  • Is the gap between expected and actual narrowing or widening?
  • For a given level of actual achievement, did differences in expectations have any role or consequences?
  • For a given level of expected change, what might explain the differences in the partners' actual achievements?
  • How do individual partners' positions within this matrix change over time? Are there distinct types of trajectories, and how can these differences be explained?

In summary

A single numerical value based on the data in this matrix would be a meaningless simplification.

In contrast, a scatterplot visualisation can generate multiple potentially useful perspectives. 

It is more useful to see targets as necessarily malleable responses to changing conditions than as unarguable reference points.

Postscript

There is a type of reinforcement learning algorithm known as Temporal Difference (TD) learning that embodies a very similar process. It is described as "a model-free reinforcement learning method that learns by updating its predictions based on the difference between future predictions and the current estimate". Model-free means it has no built-in model of the world it is working in.
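
The core of the method is a one-line update of exactly that kind: the current estimate is moved a small step toward the reward received plus the discounted next prediction. A minimal TD(0) sketch in Python, with the states and values invented for illustration:

    # TD(0): nudge the current estimate V(s) toward the reward plus the
    # discounted next prediction - "future prediction minus current estimate".
    def td0_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
        td_error = reward + gamma * values[next_state] - values[state]
        values[state] += alpha * td_error
        return values

    values = {"start": 0.0, "mid": 0.0, "goal": 0.0}  # initial value estimates
    values = td0_update(values, "start", "mid", reward=1.0)
    print(values)  # {'start': 0.1, 'mid': 0.0, 'goal': 0.0}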

When implemented as a human process it is vulnerable to gaming, because the agents (humans) are aware of the system's mechanics, unlike the neural networks or simplified agents typically used in computational TD learning. But one adaptation, suggested by Gemini AI, is to "reward partners not just for the +/- gap, but for the accuracy of their final predictions over multiple cycles". Relatively higher accuracy, over multiple time periods, might be indicative of potentially generalisable/replicable delivery capacity, usable beyond the current context.