
Public Health (2005) 119, 239–245

MINI-SYMPOSIUM — PUBLIC HEALTH OBSERVATORIES


Public health indicators
Julian Flowers*, Pamela Hall, David Pencheon

Eastern Region Public Health Observatory, Institute of Public Health,


Robinson Way, Cambridge, CB2 2SR, UK

KEYWORDS
Indicator;
Performance management

Summary An indicator is a measure used to express the behaviour of a system or part of a system. Indicators are widely used in the public sector, and there is widespread use of indicators for performance management of public health.
In this paper, we define some of the terms used in relation to indicators. We outline some of the most important issues around selection and construction of indicators, and we include criteria for developing or assessing indicators. Use of inappropriate indicators can be misleading and can result in negative consequences for public health, and we point out the potential pitfalls. Some misinterpretation of indicators could be avoided by use of better methods of presentation than the familiar league table. We use the example of a funnel plot to show a method of summarising indicator data which avoids ranking, and allows rapid identification of areas functioning outside normal limits.
© 2005 The Royal Institute of Public Health. Published by Elsevier Ltd. All rights reserved.

Introduction

All health services and systems are increasingly subject to scrutiny and public health is no exception. Such scrutiny often relies on summary measures of the system to indicate 'performance'. As a result, there is increasingly a culture of management by measurement, and there is burgeoning interest in the development and use of indicators to guide and support public health practice. As Deming said, 'what gets measured gets done'.
In this article, we examine three issues:

† What makes a public health indicator, and what makes it 'good enough'?
† Some pitfalls in public health indicators; and
† Some of the ways of presenting indicators.

* Corresponding author. Tel.: +44 1223 330348; fax: +44 1223 330345. E-mail address: julian.flowers@rdd-phru.cam.ac.uk (J. Flowers).
0033-3506/$ - see front matter © 2005 The Royal Institute of Public Health. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.puhe.2005.01.003

Background

For the purposes of this paper, we define an indicator as a summary and synthesised measure that indicates how well a system might be performing. A useful definition of a public health indicator is 'a summary statistic which is directly related to and which facilitates concise, comprehensive, and balanced judgments about the condition of a major

aspect of health, or progress towards a healthier society'1,2.
There is no shortage of indicators—they come in baskets, scorecards, portfolios and libraries. Virtually every policy initiative is accompanied by indicators. Given the breadth of public health, indicators are rarely considered in isolation. In England, for example, the Chief Medical Officer has developed, with the Public Health Observatories, a set of regional indicators covering:

† Population health status;
† Determinants of health and risk factors;
† Public health interventions;
† Services;
† Partnerships; and
† Public health capacity3.

Indicator language

Just as there is a profusion of indicators, there is a profusion (and confusion) of terms used. We offer some thoughts, and hopefully clarification, on the use of some of these terms (see Box 1).
An indicator merely indicates—it is a measure of interest which is used to indicate some concept, construct or process that we cannot measure directly. Its value often derives from the context in which it is used.

Box 1

Data: The fundamental components which are processed in many ways, including collation, contextualisation and interpretation, to reveal or create information/knowledge.
Information/knowledge/intelligence: Processed and accurate data—collated, linked, contextualised, interpreted and presented/disseminated in sufficient time to enable a decision-maker to take whatever action is required.
Research evidence: The results of systematic studies that can give important generalisable knowledge on what causes certain health related outcomes, and knowledge on what works in diagnosing and managing important clinical and public health problems.
Practice: What organisations, teams and individuals actually do.
Guidance: Knowledge on best practice that has been published or issued by an authoritative person or body. It should be assembled by a systematic and explicit process. It is simple to understand and possible to implement and helps practitioners to make the best possible decisions on behalf of their patients and the public.
Criterion (a): An important area of health and quality that can be measured in order to understand the system better.
Standard (a): The level below which a criterion should not fall, or which it should exceed or aspire to.
Indicator: A summary and synthesised measure that indicates how well a system might be performing. An indicator merely indicates—it is a measure of interest which is used to indicate some concept, construct or process that we cannot measure directly. Levels that exceed or fall short of the expected or desirable level are worthy of further investigation.
Target: A desired change in the value of an indicator over a given period of time, or an aspirational level (e.g. of an indicator) at which to aim. The direction and rate of progress (sometimes called the trajectory) are both as important as actually meeting the target. Targets should focus on direction, not just destination.
Surveillance: Systematic and continuous collection, analysis and dissemination that enables important changes in incidence and prevalence of determinants and diseases to be detected sensitively and addressed rapidly and efficiently. It should have the capacity to detect the unexpected through alerts and alarms.
Monitoring: Periodic performance and analysis of routine measurements, aimed at longer term trends in determinants and health status. Monitoring tends to be reactive and post hoc, and tracks issues that are already known.
Index: A summary measure which often has a built-in standard or baseline. It is a relative measure, usually measured on a ratio scale.
Trajectory: The periodic change required in the value of an indicator in order to achieve a target.

(a) Alternatively, a standard is something set by authority, to be attained or aspired to; and a criterion is a method or measure by which adherence to (or achievement of) a standard will be measured. Each standard can have one or several criteria that will be used to assess to what extent the standard has been achieved, or exceeded.
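The Target and Trajectory entries in Box 1 amount to a simple calculation. As a minimal sketch in Python (the baseline, target and time period below are invented for illustration, and a straight-line trajectory is assumed):

```python
def trajectory(baseline, target, periods):
    """Periodic change required in an indicator's value to meet a target.

    Assumes a straight-line (linear) trajectory; real trajectories may
    need to be non-linear. All figures used here are hypothetical.
    """
    return (target - baseline) / periods

# Hypothetical example: reduce a mortality rate from 120 to 100
# per 100,000 over 5 years.
step = trajectory(120.0, 100.0, 5)
print(step)  # -4.0, i.e. a fall of 4 per 100,000 each year
```

Comparing the observed year-on-year change against this required step is one way of judging whether an area is 'on trajectory' for its target.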

What makes a good public health indicator?

Most policy areas have indicators. Ideally, the development and construction of indicators would follow a rigorous and scientific process, but often there are political and other constraints4. Surprisingly little work has been done on what makes a good indicator in public health. For example, a recent review of the literature identified 18 population-based indexes which combined mortality and morbidity data and were largely derived from routinely available or easily accessible data5. The indexes were developed to facilitate comparisons between the health of populations (defined geographically or using other dimensions such as ethnicity), but few had been validated.
Drawing on this and the efforts of many other commentators in this area, we describe in Table 1 a framework which could be used to help develop indicators or to assess fitness for purpose of existing indicators2,4,6–10.

Assessing validity

Relevance
There should be a clear rationale for developing an indicator, which includes a link to current policy. Good indicators should be timely, and there should be evidence that the indicator is a plausible proxy

Table 1 Twenty questions to ask of a proposed indicator.

Title, rationale, validity
Declarative title?
  Record of glycosylated haemoglobin (HbA1C) level
Description (with definitions where appropriate)?
  % of patients with diabetes who have had a record of HbA1C in the previous 15 months
From which organisation/unit does this indicator originate? (a)
  GMS new contract dataset28
To which broad policy area would you allocate this indicator?
  Clinical and cost effectiveness
Rationale with evidence?
  (i) HbA1C levels are related to outcome; (ii) practices which keep good records are associated with good outcomes
What is this indicator purporting to indicate (e.g. HbA1C, control; retinopathy, control/quality of service)?
  Organisation and commitment of the practice
Face validity? (b)
  Yes
Construct validity? (c)
  Yes

Data
Numerator (N) and denominator (D) and comparator (each has a source, or means of data collection and quality assurance (provenance))?
  N, number of patients with diabetes who have had a record of HbA1C in the previous 15 months; D, total number of people in the practice with diabetes
Routine or special collection?
  Routine
What is the unit of analysis? What level (place, institution, person.) is being analysed?
  General practice
Time? (frequency)
  Annual
Qualitative/quantitative?
  Quantitative
Disease classification (be specific about type of diabetes)?
  e.g. Type 1 diabetes as defined by.
If calculated, which method?
  Simple division

Miscellaneous
Strengths?
  Good evidence base and high face validity
Weaknesses?
  Risk of denominator under-estimate (see below)
Risk of gaming and perverse incentives?
  To keep the value of this indicator high, there is an incentive to under-record the denominator (i.e. the total number of diabetics)
How is this likely to influence (improve) practice/behaviour?
  Improved records lead to better clinical practice and to better prevalence estimates
Is the indicator mainly associated with structure, process or outcome?
  Process

(a) Don't invent new indicators, unless there is a very compelling reason to do so.
(b) Does the indicator appear to measure what it intends to in practice: is it intuitive?
(c) Is there a sound theoretical basis for constructing the indicator in this way?
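The 'simple division' row of Table 1 is just a percentage of numerator over denominator. A minimal Python sketch (the counts are invented) also shows why under-recording the denominator inflates the indicator, the gaming risk noted in the table:

```python
def indicator_value(numerator, denominator):
    """Percentage indicator computed by simple division: here, patients
    with an HbA1C record divided by all patients with diabetes on the
    register. The counts used below are hypothetical."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return 100.0 * numerator / denominator

# An honest register versus one with the denominator under-recorded:
# the same numerator produces a higher score.
honest = indicator_value(180, 240)
gamed = indicator_value(180, 200)
print(round(honest, 1), round(gamed, 1))  # 75.0 90.0
```

The calculation is transparent and reproducible given the raw counts, which is exactly the 'clear specification' property discussed below.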
for the underlying measure of interest. For example, life expectancy is a plausible indicator of population health.

Validity
An indicator should have face validity in that it should be likely to measure what it purports to measure. For example, the Index of Multiple Deprivation (IMD) 2000 is widely used as an indicator of the level of deprivation in small areas in the UK11. It correlates well with previous measures such as Townsend or Carstairs and identifies as deprived areas that people accept as being deprived.
Indicators should also have construct validity. Many indicators are complex composite measures combining several elements into a single figure. The elements should be plausible and the composition of the indicator should make sense. For example, the IMD 2000 includes several domains, each of which contains measures which people would accept as representing a form of deprivation.

Technical criteria for indicators

Behaviour
Indicators should be 'well-behaved', that is, a change in the value of the indicator should be interpretable, and, for composite indicators, the indicator value should change in an appropriate direction if the underlying elements change.
Murray et al. have recently provided some useful guidance for summary population health measures like life expectancy10. In a nutshell, any measure which is a summary of age-specific rates should change appropriately if any of the underlying rates change. For example, if a summary measure of health is based on age-specific rates (e.g. mortality or morbidity) and a rate is lower in any age-group, providing everything else remains the same the summary measure should improve; if the age-specific rate of disease frequency or severity is higher in any age-group, providing everything else remains the same the summary measure should worsen. Note that this is not always the case with some commonly used summary measures such as Standardised Mortality Rates9.

Clear specification
Clear and comprehensive information should be available about the construction of an indicator, including details of numerator and denominator data and the calculations necessary to derive the indicator value.

Repeatability
Most indicators are tracked over a time period. It is important to consider changes in the components of the indicator, including changes in collection or coding of the data underlying indicators. If a change is significant, it may be necessary to revise the indicator. For example, the shift from use of the ninth revision of the International Classification of Diseases (ICD-9) to the tenth revision (ICD-10) resulted in significant alterations to coding of clinical data, which has affected monitoring of mortality data over time12,13.

Construction and deconstruction
For complex measures, e.g. life expectancy, it is valuable to be able to deconstruct the measure into its components, e.g. cause-specific or age-specific death rates. This allows consideration of the source of any variation, and enables interventions to be targeted at specific causes. There are no population interventions for increasing population life expectancy, but there are many interventions targeted at specific causes of mortality.

Feasibility
Indicators should usually be constructed using routinely collected data. It is important to consider the availability and quality of both numerator and denominator data. For example, it is currently difficult to produce primary care trust (PCT) level indicators because recent PCT population estimates are not available.
Calculations should be transparent; ideally, given the appropriate data it should be possible to reconstruct an indicator and derive the same values.

Consequences of indicators

The presence of indicators may have consequences for the behaviour of a system. These may be welcome, for example improvement in data collection. However, other unintended consequences may be detrimental to the system.

Gaming
This is an issue to do with the pursuit of targets, which can cause an unwanted behaviour change. People may 'fiddle the figures' in order to avoid punishment or embarrassment, e.g. suspending people from waiting lists so that they don't count as 'breaches' of some standard. For example, if general practices are to be rewarded for completeness of, say, measuring blood pressure in diabetics, they may create a register only including those diabetics known to have had their blood pressure measured, and
therefore appear to be doing exceptionally well. The Audit Commission reviewed 41 NHS acute trusts thought to be at greater risk of misreporting waiting list information14. In three trusts, there was evidence of deliberate misreporting, and in 19 trusts reporting errors were identified.

Balance
Ideally, indicators should be balanced; they should not focus attention on one part of a system to the exclusion of the rest. Improving performance in one area may have a negative impact in other areas. An indicator relating to the proportion of people waiting more than two weeks for an outpatient appointment for suspected cancer may result in increased waiting times for treatment of people with diagnosed cancer, as trusts strive to improve their performance against the indicator.

Health warnings for public health indicators

Indicators are not without their pitfalls. We describe here some of the factors to consider when interpreting indicators.

Will Rogers phenomenon
This is the paradox observed when moving an item from one set to another moves the average values of both sets in the same direction.

'When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states.'14

The Will Rogers phenomenon has largely been described in cancer survival, where 'migration' is between cancer stages, but it was originally described in relation to true population migration15–17. We know little about the effects of migration on population health measures. For example, if the healthiest people in an unhealthy area move to a much healthier area where they become the least healthy people, the effect on the average health in both populations might be to reduce it. If, however, people move from small, unhealthy populations to populous, healthy ones, the health gap between the populations might increase because the average health in the former population will fall but there may be little noticeable effect in the latter. There is some evidence for this in the changes in socioeconomic gradients18–20.

Regression to the mean
Regression to the mean is a very common problem, where a measurement yielding an extreme value on one occasion tends to yield a value closer to the average on the next occasion without anything else having changed21–24. This particularly affects indicators where year-on-year change is used to indicate an underlying trend.
In Fig. 1, we use PCT data on circulatory disease mortality to illustrate regression to the mean25. If we compare the change in circulatory disease mortality between 1998 and 1999, and between 1999 and 2000, using the ratios of the directly age-standardised rates (DSRs) for the two pairs of years, we produce the graph on the left of Fig. 1. There is a negative correlation between the two

Figure 1 Regression to the mean. Areas with an increase in directly age-standardised mortality rates (DSRs) for circulatory disease between the first pair of years (rate ratio >1) tend to show a reduction in DSRs between the second pair of years (rate ratio <1), and vice versa. Scatter plots show directly age-standardised mortality rates for circulatory disease in males aged under 75, primary care trusts in England 1998–2001. Source: compendium of clinical and health indicators 200225.
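The negative correlation seen in Fig. 1 requires no real change in mortality. It can be reproduced by simulation: in the self-contained Python sketch below (all figures invented), every area has the same constant true rate with independent year-to-year noise, yet consecutive rate ratios still correlate negatively:

```python
import random

random.seed(1)

def rate_ratios(n_areas=300, true_rate=100.0, sd=10.0):
    """Simulate three years of noisy rates per area and return the two
    consecutive rate ratios (year2/year1, year3/year2) for each area.
    Every area has the same unchanging true rate."""
    first, second = [], []
    for _ in range(n_areas):
        y1, y2, y3 = (random.gauss(true_rate, sd) for _ in range(3))
        first.append(y2 / y1)
        second.append(y3 / y2)
    return first, second

def correlation(xs, ys):
    # Pearson correlation, written out to stay dependency-free.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

first, second = rate_ratios()
r = correlation(first, second)
print(round(r, 2))  # negative, even though no area's true rate changed
```

An area that appears to deteriorate one year will, on average, appear to improve the next, purely because both ratios share the same noisy middle year.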
ratios. If mortality increases between one pair of years, it tends to fall in the next pair of years, and vice versa. This phenomenon can be demonstrated using different data sets; for example, if we plot data from 1999/2000 and 2000/1, we produce a similar graph (graph on the right of Fig. 1).

League tables

The presentation of indicators is crucial to their interpretation, and to what extent (appropriate) action is taken. Indicators are often presented as ranks or league tables, often using traffic light coding (green for satisfactory performance, amber when there is some cause for concern and red for unsatisfactory performance). Such methods have serious limitations (e.g. in ranking, someone has to be worst regardless of how good they are); the principal flaw is the implicit assumption that there is a performance difference between organisations. Analysing and ranking results on the basis of this underlying assumption inevitably leads to organisations being compared with each other.

Measuring uncertainty

Ranking, such as in league tables, fails to allow for the variation associated with measurement that occurs even in the most stable systems. It is good practice to include a measure of uncertainty in league tables, for example presenting confidence intervals for values or ranks, but this doesn't solve the problem:

† There is a natural tendency to focus on the rank of an organisation in a table and ignore the confidence interval.
† Comparison of multiple confidence intervals is a form of multiple significance testing. Remember that on average one in every 20 measurements will fall outside the 95% confidence intervals.
† Confidence intervals are not readily understood by everyone who might wish to use such data.

It is much better to assess indicators against a fixed baseline, or to estimate the underlying trend (if there are sufficient data points), or to create a control chart, which takes into account the uncertainty in estimating annual percentage change.
Better presentation methods use techniques such as 'statistical process control'. This can be used to distinguish between those parts of the system that are operating within normal limits and those parts which show greater than expected variation, for example using control charts and funnel plots26,27. Such methods combine the two most important features

Figure 2 Funnel plot showing 2000–2001 data from Fig. 1 but with rate ratio plotted against the average number of
events over the time period. The dotted lines represent 95 and 99.8% control limits; the solid horizontal line represents
‘no change’, i.e. a rate ratio of 1. Only one area lies above the upper 95% limit indicating probable genuine increase in
rate, and a few lie below the lower 95% limit showing probable genuine decrease in rate. No area gives cause for alarm,
i.e. none lie outside the upper or lower control limits. Source: compendium of clinical and health indicators 200225.
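Control limits of the kind drawn in Fig. 2 can be sketched with a simple approximation: for a ratio of two rates each based on roughly n events, the standard error of the log rate ratio is about sqrt(2/n), so the limits around a rate ratio of 1 form a funnel that narrows as events increase. This Python sketch is an illustrative approximation, not the exact method used to produce Fig. 2:

```python
import math

def funnel_limits(events, z=1.96):
    """Approximate control limits for a rate ratio centred on 1.

    Uses SE(log ratio) ~ sqrt(2/events) for a ratio of two rates each
    based on roughly `events` events; z = 1.96 gives 95% limits and
    z = 3.09 gives 99.8% limits. A simplified sketch only.
    """
    se = math.sqrt(2.0 / events)
    return math.exp(-z * se), math.exp(z * se)

# The limits narrow as the number of events grows, producing the funnel.
for n in (50, 200, 800):
    lo95, hi95 = funnel_limits(n)
    lo998, hi998 = funnel_limits(n, z=3.09)
    print(n, round(lo95, 2), round(hi95, 2), round(lo998, 2), round(hi998, 2))
```

Points outside the 99.8% funnel are the 'alarms'; points between the 95% and 99.8% lines merit watching, without the ranking that a league table imposes.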
of good data presentation: valid construction and intuitive display. In Fig. 2, we use a funnel plot to present PCT data on circulatory disease mortality.

Conclusions

Just as better understanding of research methods in health care and improved understanding in the consumers of these methods has led to both better reporting and better quality research, we believe that a similar approach should be applied to the use and appreciation of data and indicators in public health. Attention to the criteria that we have outlined, and to presentation, may help us to improve our use of indicators, develop new and perhaps more appropriate indicators and find new ways of communicating the meaning of indicators. There are pitfalls in using public health indicators, and although this is by no means an exhaustive review of the subject, it may serve to remind public health practitioners and wider audiences of some of the key points to consider when proposing, preparing and interpreting indicators.

References

1. Deming WE. The new economics for industry, government, education. Cambridge: Massachusetts Institute of Technology Center for Advanced Educational Services; 1994.
2. Mathers C. Framework for the development of national public health indicators. A paper presented to the national public health information working group meeting of May 1998; 1999.
3. Bailey K, Flowers J, Streather M, Wilkinson J. Indications of public health in the English Regions. Stockton-on-Tees: Association of Public Health Observatories; 2003.
4. Bird SM, Cox D, Farewell VT, Harvey G, Holt T, Smith PC. Performance indicators: good, bad, and ugly. J Royal Statistical Society: Series A (Statistics in Society) 2005;168:1–27.
5. Kaltenthaler E, Maheswaran R, Beverley C. Population-based health indexes: a systematic review. Health Policy 2004;68:245–55.
6. Mathers C, Schofield D. Development of national public health indicators. A paper presented to the national public health information working group meeting of December 1997; 1999.
7. Murray CJL, Ezzati M, Lopez AD, Rodgers A, Vander Hoorn S. Comparative quantification of health risks: conceptual framework and methodological issues. Popul Health Metr 2003;1:1.
8. Gakidou E, Murray CJL, Frenk J. A framework for measuring health inequality. In: Murray CJL, Evans DB, editors. Health systems performance assessment. Debates, methods, and empiricism. Geneva: World Health Organization; 2003.
9. Mathers CD, Salomon JA, Murray CJL, Lopez AD. Alternative summary measures of average population health. In: Murray CJL, Evans DB, editors. Health systems performance assessment. Debates, methods, and empiricism. Geneva: World Health Organization; 2003.
10. Murray CJL, Mathers CD, Salomon JA. Towards evidence-based public health. In: Murray CJL, Evans DB, editors. Health systems performance assessment. Debates, methods, and empiricism. Geneva: World Health Organization; 2003.
11. Department of the Environment, Transport and the Regions. Indices of Deprivation 2000. London: DETR; 2000.
12. World Health Organization. International Classification of Diseases, ninth revision. Geneva: WHO; 1977-8.
13. World Health Organization. International Statistical Classification of Diseases and Related Health Problems, tenth revision. Geneva: WHO; 1992-4.
14. Audit Commission. Waiting list accuracy. Assessing the accuracy of waiting list information in NHS hospitals in England. London: Audit Commission; 2003.
15. Wikipedia, the free encyclopedia. Will Rogers phenomenon. Available from URL: http://en.wikipedia.org/wiki/Will_Rogers_phenomenon [accessed 20 September 2004].
16. Feinstein AR, Sosin DM, Wells CK. The Will Rogers phenomenon. Stage migration and new diagnostic techniques as a source of misleading statistics for survival in cancer. N Engl J Med 1985;312:1604–8.
17. Bodner BE. Will Rogers and gastric carcinoma. Arch Surg 1988;123:1023–4.
18. Dent DM. Improving cancer survival results by artefact: Will Rogers and the stage migration phenomenon. S Afr Med J 1996;86:645.
19. Boyle P, Norman P, Rees P. Changing places. Do changes in the relative deprivation of areas influence limiting long-term illness and mortality among non-migrant people living in non-deprived households? Soc Sci Med 2004;58:2459–71.
20. O'Reilly D, Stevenson M. Selective migration from deprived areas in Northern Ireland and the spatial distribution of inequalities: implications for monitoring health and inequalities in health. Soc Sci Med 2003;57:1455–62.
21. Bland JM, Altman DG. Statistics notes: some examples of regression towards the mean. BMJ 1994;309:780.
22. Bland JM, Altman DG. Statistics notes: regression towards the mean. BMJ 1994;308:1499.
23. Avery AJ, Rodgers S, Heron T, Crombie R, Whynes D, Pringle M, et al. A prescription for improvement? An observational study to identify how general practices vary in their growth in prescribing costs. Commentary: beware regression to the mean. BMJ 2000;321:276–81.
24. Morton V, Torgerson DJ. Effect of regression to the mean on decision making in health care. BMJ 2003;326:1083–4.
25. Department of Health. Compendium of clinical and health indicators 2002. London: National Centre for Health Outcomes Development, London School of Hygiene and Tropical Medicine; 2003.
26. Mohammed MA, Cheng KK, Rouse A, Marshall T. Bristol, Shipman, and clinical governance: Shewhart's forgotten lessons. Lancet 2001;357:463–7.
27. Battersby J, Flowers J. Presenting performance indicators: alternative approaches. INphoRM 2004;4.
28. Department of Health. Quality and outcomes framework guidance. London: DH; 2003. [Indicator ref. DM 5].
