Saturday, September 01, 2007

Checklists as mini theories-of-change

During a recent evaluation of a UNICEF-assisted health programme in Indonesia I was given a copy of a checklist designed for assessing the functioning of sub-district health centres in South Sulawesi. You can get a general idea of its structure from the image below. Attributes of “high performing” health centres are listed down the left, grouped into categories, sub-categories and sub-sub-categories. Down the right side are columns, one for each health centre. A tick is placed in a row of a column to indicate that the attribute in that row was found in that health centre. The intention, I think, is that once all the attributes are ticked the health centre will be deemed to have “graduated” and no longer need development assistance.



While this format has the important virtue of simplicity, it does make two assumptions that may be useful to question. The first is that all the listed attributes are essential. This assumes there is a consensus on what constitutes a “good” health centre. In practice, however, developing that consensus may be an important part of the process of developing a “good” health centre: not only internally, amongst the staff of the health centre, but also externally, amongst the other organisations the health centre has to work with (e.g. the district hospital, village health posts, and the district health office).

The second assumption is that all the attributes are of equal importance. This seems unlikely. For example, it would be widely agreed that having a supply of oxytocin (attribute 2.4.a) is much more important than “Mother's day celebration implemented each year at sub-district level” (attribute 4.2.b). Attempts to develop the capacity of the health centre will need to be guided by a clear sense of priorities: which attributes are more important than others. The choice between organising a mothers’ day event and ensuring a supply of oxytocin could be a matter of life or death.

These two “problems” could be seen as opportunities. The attributes on the list could be weighted by asking selected stakeholders to rank them in terms of their relative importance (by allocating points that sum to 100). If there are a large number of attributes the ranking could start with the major categories, then the sub-categories, then the attributes within them. Importance could be defined as how much an attribute is likely to contribute to improved usage of quality services that will affect people’s health outcomes. The first set of stakeholders could be internal to the health service, and later on external stakeholders could be consulted. Attributes that were given widely different rankings would then become the focus of discussion about why views varied so much. The assumption here is that this may lead to some convergence of views on priorities. It could also be relevant to staff training agendas. During the evaluation referred to above, we found that comparing different stakeholders’ rankings of the effectiveness of a number of (other) project activities generated a constructive discussion that increased both stakeholder groups’ understanding of each other, and of the issues involved.
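To make the weighting idea more concrete, here is a minimal sketch (in Python, not any tool actually used by the project) of how stakeholder point allocations could be turned into weights, how disagreements between stakeholder groups could be surfaced, and how a weighted score for a single health centre could then be calculated. The attribute names and numbers are purely illustrative assumptions.

```python
# Minimal sketch: point allocations -> weights -> weighted checklist score.
# All attribute names and point values below are invented for illustration.

def normalise(points):
    """Convert raw point allocations (e.g. summing to 100) into weights summing to 1."""
    total = sum(points.values())
    return {attr: pts / total for attr, pts in points.items()}

# Hypothetical point allocations (out of 100) from two stakeholder groups
internal_points = {"2.4.a oxytocin supply": 45, "4.2.b mothers' day event": 5, "3.1.a trained midwife": 50}
external_points = {"2.4.a oxytocin supply": 30, "4.2.b mothers' day event": 20, "3.1.a trained midwife": 50}

internal_w = normalise(internal_points)
external_w = normalise(external_points)

# Attributes whose weights differ most between groups become the focus of discussion
for attr in internal_w:
    gap = abs(internal_w[attr] - external_w[attr])
    print(f"{attr}: internal {internal_w[attr]:.2f}, external {external_w[attr]:.2f}, gap {gap:.2f}")

# Weighted score for one health centre: each ticked attribute contributes its weight
ticks = {"2.4.a oxytocin supply": True, "4.2.b mothers' day event": True, "3.1.a trained midwife": False}
score = sum(w for attr, w in internal_w.items() if ticks.get(attr))
print(f"Weighted score (internal weights): {score:.2f}")
```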

Even when agreement is reached about appropriate weightings, a question might be raised about whether these will necessarily lead to the expected outcomes, such as how women use the health service, or their behaviour after visiting the health centre. It would therefore be useful to compare the scores of different health centres and see how they relate to the outcomes observed at those health centres. How well do the scores predict the outcomes? If they do not, the scores could be re-calculated on the basis of a different set of weightings, to see whether emphasising other attributes produces a better fit between health centre scores and the observed outcomes of concern. If so, that would suggest the need for a re-orientation of priorities within the health centre. A given set of weightings is in effect a theory-of-change, and the score it generates can be treated as a prediction of an expected outcome. A series of predictions (scores from different health centres) would be needed to see how well the theory fits reality (the outcomes observed at those health centres).
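As a rough illustration of this “theory fits reality” test, the sketch below (again with invented attribute labels, ticks and outcome figures) compares weighted checklist scores with an observed outcome across several health centres, using a simple correlation as the measure of fit, and then repeats the calculation with an alternative set of weightings to see which set of priorities fits better.

```python
# Minimal sketch: do weighted checklist scores "predict" observed outcomes,
# and does an alternative weighting fit better? All data here are invented.

def weighted_score(ticks, weights):
    """Sum of the weights of all ticked attributes."""
    return sum(w for attr, w in weights.items() if ticks.get(attr))

def correlation(xs, ys):
    """Pearson correlation, as a simple measure of fit between scores and outcomes."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical ticks per health centre and an observed outcome
# (e.g. proportion of deliveries attended by skilled staff)
centres = {
    "HC-1": ({"a": True, "b": True, "c": False}, 0.62),
    "HC-2": ({"a": True, "b": False, "c": True}, 0.78),
    "HC-3": ({"a": False, "b": True, "c": True}, 0.55),
    "HC-4": ({"a": True, "b": True, "c": True}, 0.81),
}

agreed_weights = {"a": 0.5, "b": 0.1, "c": 0.4}       # the agreed "theory of change"
alternative_weights = {"a": 0.3, "b": 0.3, "c": 0.4}  # a rival set of priorities

for label, weights in [("agreed", agreed_weights), ("alternative", alternative_weights)]:
    scores = [weighted_score(ticks, weights) for ticks, _ in centres.values()]
    outcomes = [outcome for _, outcome in centres.values()]
    print(f"{label} weighting: correlation with outcomes = {correlation(scores, outcomes):.2f}")
```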

Incidentally: a target score on a checklist could be inserted as a single indicator in a Logical Framework, providing a simple way of referring to the measurement of a complex outcome. Wider use of checklist scores might help limit the overly simplistic indicators of progress seen in many Logical Frameworks.

PS: This discussion is not a criticism of the checklist as currently in use. It is an outline of what I think is some of its untapped potential.

3 comments:

  1. Hi Rick. I see a third 'problem' with the checklist and its use. If, as you say, "if all the attributes are ticked then the health centre will be deemed to have “graduated” and no longer need to be given development assistance", then the checklist does not say anything about the medium/long-term sustainability of the checked situation. A positive situation you have witnessed today under support by the programme could disappear tomorrow for a lot of reasons. The checklist seems to provide a still picture at a particular moment in time without checking for longer-term sustainability factors, which could alter substantially any judgment. I think.
    Best,
    Mario

  2. Interesting point. Two possible responses:
    1. Make sure there are follow-up visits six or twelve months after graduation, to see if the checklist score has been sustained.
    2. Build in some items on the checklist menu that might be indicative of likely sustainability. For example, "There have been no gaps in the essential drug supply over the last twelve months".

  3. Hi Rick, I have seen similar checklists and I think they are a good complement to actual surveying/interviewing of health workers and patients/populations. What I find odd about this checklist, apart from the point you make about lack of prioritization, is the way weighting is done - sometimes they speak of "90%", "80%", "50%" or "30%" needed to achieve an indicator - how did they choose these percentages, and why the difference? I prefer checklists where there is a "neutral" phrase such as "current state of water system" and a scale "low/medium/high".
    Glenn O'Neil
