
Econometrics of Health Care

Advanced Studies in Theoretical and Applied Econometrics

Volume 20

Managing Editors:
J.P. Ancot, Netherlands Economic Institute, Rotterdam, The Netherlands
A.J. Hughes Hallett, University of Strathclyde, Glasgow, United Kingdom

Editorial Board:
F.G. Adams, University of Pennsylvania, Philadelphia, U.S.A.
P. Balestra, University of Geneva, Switzerland
M.G. Dagenais, University of Montreal, Canada
D. Kendrick, University of Texas, Austin, U.S.A.
J.H.P. Paelinck, Netherlands Economic Institute, Rotterdam, The Netherlands
R.S. Pindyck, Sloan School of Management, M.I.T., U.S.A.
H. Theil, University of Florida, Gainesville, U.S.A.
W. Welfe, University of Lodz, Poland

The titles published in this series are listed at the end of this volume.
Econometrics of Health Care
edited by
G. Duru
J. H. P. Paelinck



Library of Congress Cataloging-in-Publication Data

Econometrics of health care / edited by G. Duru, J.H.P. Paelinck.
p. cm. -- (Advanced studies in theoretical and applied
econometrics ; v. 20)
Selected papers from the Fourth International Conference on System
Science in Health Care, held in Lyons, July 1988, and two previous
meetings held in Lyons, Feb. 1983 and Rotterdam, Dec. 1985.
ISBN-13: 978-94-010-7420-9 e-ISBN-13: 978-94-009-2051-4
DOI: 10.1007/978-94-009-2051-4
1. Medical care, Cost of--Congresses. 2. Medical economics--Congresses.
3. Econometrics--Congresses. I. Duru, Gerard.
II. Paelinck, Jean H. P. III. International Conference on System
Science in Health Care (4th : 1988 : Lyons, France) IV. Series.
[DNLM: 1. Delivery of Health Care--economics--congresses. W 84.1
RA410.5.E23 1990
for Library of Congress 90-4594

Published by Kluwer Academic Publishers,

P.O. Box 17, 3300 AA Dordrecht, The Netherlands.

Kluwer Academic Publishers incorporates

the publishing programmes of
D. Reidel, Martinus Nijhoff, Dr W. Junk and MTP Press.

Sold and distributed in the U.S.A. and Canada

by Kluwer Academic Publishers,
101 Philip Drive, Norwell, MA 02061, U.S.A.

In all other countries, sold and distributed

by Kluwer Academic Publishers Group,
P.O. Box 322, 3300 AH Dordrecht, The Netherlands.

02-10-93-100 ts

All Rights Reserved

© 1991 by Kluwer Academic Publishers
Softcover reprint of the hardcover 1st edition 1991
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
Table of contents

Preface
G. Duru and J. H. P. Paelinck

Introduction: the health system in the general economy

Health expenditure growth and macroeconomic models
Y. Saillard

Part one: supply and demand

The MIMIC health status index
W. P. M. M. van de Ven and E. M. Hooijmans
Estimating demand for medical care: health as a critical factor for
adults and children
J. van der Gaag and B. L. Wolfe
An empirical model of the demand for health care in Belgium
G. Carrin and J. van Dael
Reconciling spatial demand/supply imbalances in acute care
J. R. Roy and M. Anderson
Physicians' specialty choice and specialty income
J. W. Hay

Part two: functioning: cost and financing

Subsidies, quality, and regulation in the U.S. nursing home industry
P. J. Gertler
A Poisson process of which the parameter contains a non-stationary
error: application to the analysis of a series of deaths in a large
B. Larcher
The construction of a model for medical cost and labour (MEDIKA)
R. J. A. M. van den Broek
The microeconomics and econometrics of bonus systems in health
P. Zweifel
Microsimulation of the costs of the health system in the Federal
Republic of Germany
R. Brennecke

Part three: synthesis

Segmentation and classification. An application to patients' risk
J.-P. Auray, G. Duru, M. Terrenoire, D. Tounissoux, and A. Zighed
A general equilibrium model of health care
M. Chatterji and J. H. P. Paelinck


Preface

Econometrics of Health Care, which we have sometimes called
'medicometrics', is a field in full expansion. The reasons are numerous:
our knowledge of quantitative relations in the field of health
econometrics is far from perfect; a large number of analytical
difficulties, combining medical facts (latent factors, e.g.) and economic
facts (spatial behaviour, e.g.), face the research worker; medical and
pharmaceutical techniques change rapidly; medical costs rocket more than
proportionally with available resources; medical budgets are in the
process of being tightened.
So it is not surprising that the practice of 'hygieconometrics' (to coin a
neologism) is more and more included in the programmes of econometricians.
The Applied Econometrics Association has devoted two symposia to the topic
in less than three years (Lyons, February 1983; Rotterdam, December 1985),
without experiencing any difficulty in getting valuable papers: on the
econometrics of risks and medical insurance, on the measurement of health
status and of the efficiency of medical techniques, on general models
allowing simulation. These were the themes of the second meeting, but
other aspects of medical-economic problems had already presented
themselves to the analyst: medical decision making and its consequences,
the behaviour of the actors (patients and physicians), regional
medicometrics and so on; some of them were covered by the first meeting.
Finally, the Fourth International Conference on System Science in Health
Care took place in Lyons in July 1988; it is not surprising that some
econometric papers on the topic were presented there too, hence the
inclusion of some outstanding ones.
It was thought that a selection of papers from the three conferences would
be valuable as permanent reference material; this is why they have been
collected and presented in this volume, which has been financially
supported by the Applied Econometrics Association.

G. DURU, Lyons
J. H. P. PAELINCK, Rotterdam

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, vii.


Introduction: the health system in the general economy

Health expenditure growth and macroeconomic models*

Y. Saillard
CEPREMAP, 142, rue du Chevaleret, F-75013 Paris

The need for a macroeconomic approach arises from an old debate, revived
by the crisis, on the connection between the development of tertiary or
non-market sectors and the slowdown in economic growth. Characteristics of
the health sector (especially the effects of technical progress on
production costs) appear to many authors to be among the causes of the
economic crisis, or at least to be an indicator of contradictions in
contemporary economic
Theoretical foundations of this characterization of the health system's
macroeconomic dynamics can be found in Baumol (1967), Bacon and Eltis
(1976), Lorenzi et al. (1980), among others. We do not intend here to give
answers to this general problem. Instead we propose, following the usual
macroeconomic method, to study the effects of the development of health
activities. The main constraint here is that macroeconomic modelling does
not introduce the variables and relations which would permit one to
estimate health risks or the effects of care activities on productivity in
market sectors.

Some preliminary remarks

Medical care demand

Medical care demand cannot be described with traditional economic
mechanisms. Demand is expressed by consumers in a very hazy manner: as a
desire for complete care in case of need. Health producers make this vague
social demand specific, depending on technological innovations. Medical
care techniques are not neutrally defined but arise from complex relations
between the administration (its mode of codification of health activities
and its health policy objectives), the medical profession (with its own
objectives as a social group¹) and the productive system, which not only
fixes the financing constraint but also participates in innovation and in
the diffusion of new care techniques.
This particular mode of demand for health services and relations² makes
the usual econometric treatment of household consumption of care services
problematic. Attempts to estimate consumption functions on time series
make clear the insufficiency of prices and income as explanatory variables
and the importance of a 'habit effect' or of a trend (often very regular)
that remains to be explained. We are then deprived of useful relations
between health services consumption and practical macroeconomic variables.
Such an observation implies that we must find other explanatory variables.

Health system

'Health system' is a very useful term but its frontiers must be defined.
Here, we use the conventions of the French Health Accounts. Thus the
activities included in the health system are those 'economic activities
which enter directly into the provision of health care'.³ This function is
ambiguously defined according to the criterion of 'bringing specific means
into play, these means being more or less tied to medical techniques'.³
Such a criterion implies a sometimes difficult distinction between medical
and lodging activities. It excludes direct actions on morbidity risk
factors (or primary prevention); only the delivery, as opposed to the
production, of medical and pharmaceutical products is included. In fact,
we restrict our attention to medical activities, excluding other
activities which are included in the French Health Accounts: teaching and
training of medical and nonmedical manpower, medical research, sanitation,
management of the health system. Institutional units 'exercising by some
main or secondary means one health activity'⁴ belong to the health system.
As for medical activities, these units are mainly hospitals and
physicians' practices.
More precisely, using the usual macroeconomic models leads one to study
only the 'care system' as opposed to the 'health system'. This means that
the variable 'population health condition' is excluded.
Previous modelling approaches⁵ have shown that, with the available
statistical information, a more complete approach (introducing care
services consumption, prescribed or directly bought, socio-cultural
factors, care supply, population health condition, environment, etc.)
above all displays the pre-eminence of environment over care services
consumption in explaining variations in health condition.⁶ But the factors
which appear the most relevant are precisely those which are not usually
included in macroeconomic models. Even if they were, we would have to
propose explicit relations between environment, care supply and other
somewhat elusive indicators like 'socio-cultural' indicators. Lastly, we
may wonder whether a general indicator like 'population health indicator'
is even meaningful. We do not pursue this debate, except to insist on the
limitation of the present approach.

Model of the care system

The simplifications we use here in constructing our model and the
characteristics of care demand lead us to focus on the supply side of the
care system. In connection with this general approach, we use projection
models.⁷ The general method of such models consists of collecting
variables which describe the care system, selecting the strategic ones,
and distinguishing endogenous from exogenous variables. Projections are
then made using coherence relations. Past trends play a prominent role in
such models, so the quality of the projections measures more or less the
degree of inertia of the health system (at least in real terms; relative
price estimates remain uncertain).
In order not to give excessive weight to past trends, an alternative
method is to describe the behaviour of care system actors, assuming that
the care system can be viewed as a set of several markets.⁸ One can wonder
whether such a method is relevant, even in the American institutional
framework (for instance, the estimation of usual consumption functions
with net prices as explanatory variables). Moreover, the problem remains
of aggregating microeconomic behaviour (which, according to recent
microeconomic analysis, could be much more complex than in the usual
microeconomic models) to get macroeconomic relations. Lastly,
retrospective simulations, in the approach we quote, lead to corrections
of the parameters of the supply of health services (for instance of the
distribution between different kinds of physicians' services or
physicians' specialities). So medium- and long-term evolutions must be
introduced exogenously, which shows the limits of even sophisticated
microeconomic modelling.
The previous remarks and the French institutional framework (without
markets in the microeconomic sense) explain why our approach to the care
system is very close to that of projection models, yet linked with a
macroeconomic model.
An attempt to integrate an exogenous projection of health expenditure in
a macroeconomic model

Every quantitative model of the care system, even the simplest, must make
assumptions on the links between this system and the whole economy, in
particular concerning the wage rate and prices outside the care system;
often it does so by default, retaining only tenuous relations.
The first benefit one can expect from a linkage between a macroeconomic
model and a health model is to get a better coherence among the economic
variables and the levels of activity and spending in the care system. The
coherence we can really attain depends on the quality of the linkage but also
on the features of the macroeconomic model.

Macroeconomic model and health submodel

Here, we propose a very simple linkage between a model which describes the
formation of health expenditure and a medium-term macroeconomic model.
Three possible cases are shown in Figure 1.
The macroeconomic model, of the Keynesian type, is supposed to work in
successive phases: determination of production and of value added to
satisfy aggregate demand; fixing of the level of employment from
hypotheses on production functions; estimation of wages and prices (in
order to satisfy a stable distribution of National Income between wages
and profits); then building up of sector accounts, which lead to
investment and to consumer expenditure.
The first linkage ('parallel working') consists only in deriving health
expenditure from the description of the activities in the health system
(which may be more or less sophisticated). The health model gets the
general wage rate and price level from the macroeconomic model. There is
only one strategic variable in the linkage between the two models:
consumer expenditure, whose structure is then in part given by the health
model. This kind of connection also permits one to disaggregate such
macroeconomic flows as social benefits, employment and investment.
In this case, variations in health expenditure growth have macroeconomic
effects in so far as they modify aggregate demand. Wage and price
evolutions are not exogenous but given by the macroeconomic model;
alternative hypotheses on the working of the economy which have different
inflationary effects will be reflected in health expenditure (though in a
mechanical way).

Figure 1. Possible linkages between a Keynesian macroeconomic model and a
health submodel: (1.a) 'parallel' working; (1.b) feedback at the
wage-price block level; (1.c) full linkage.

Last, it is possible to find the counterpart of expenditure in terms of
the level of activities (the interest of such a correspondence depends on
the quality of the health model). Yet, in this linkage, the channels of
macroeconomic effects are not specific to health expenditure.
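
The 'parallel working' linkage described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the chapter gives no equations for the linkage, and the names `MacroOutputs`, `health_expenditure` and `consumer_expenditure_structure`, as well as all numbers, are hypothetical. The sketch shows the one-way flow of information: the macroeconomic model supplies the general wage rate and price level, the health model values its exogenous activity with them, and the result fixes part of the structure of consumer expenditure.

```python
from dataclasses import dataclass


@dataclass
class MacroOutputs:
    """Hypothetical container for what the macroeconomic model passes on."""
    wage_index: float   # general wage rate, base year = 1.0
    price_index: float  # general price level, base year = 1.0


def health_expenditure(labour_volume: float, other_volume: float,
                       macro: MacroOutputs) -> float:
    """Value exogenous health activity at the macro model's wages and prices."""
    return labour_volume * macro.wage_index + other_volume * macro.price_index


def consumer_expenditure_structure(total: float, health: float) -> dict:
    """The health part is imposed by the health model; the rest is a residual."""
    return {"health": health, "non_health": total - health}


# Illustrative numbers only (not taken from AGORA).
macro = MacroOutputs(wage_index=1.10, price_index=1.05)
health = health_expenditure(labour_volume=60.0, other_volume=40.0, macro=macro)
structure = consumer_expenditure_structure(total=1000.0, health=health)
print(round(health, 2), round(structure["non_health"], 2))
```

Nothing in this sketch feeds back from health expenditure to wages or prices, which is exactly what distinguishes the first linkage from the second.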
The second kind of linkage (feedback at the wage-price block level) forces
one to take into account a rule regarding the budget deficit and
alternative ways of meeting this rule. The rule may be 'strict equilibrium
of public accounts' or 'no more deficit than x% of GDP'. Alternative ways
of meeting this rule are increasing fiscal pressure (with one tax or a mix
of taxes) or limiting public expenditure (for instance by a fall in the
rate of
The third linkage is a hypothetical one in which we would know all about
the possible connections between the health system and the whole economy:
the morbidity risks (from consumption and from working conditions) and the
effectiveness of health expenditure in terms of labour productivity. Even
if some isolated studies are available on these connections, integrating
them in a macroeconomic framework is still a distant objective.

Illustration, using simulations on a market-non-market model

Here, we only illustrate the first two linkages, using an already existing simple
model (AGORA).9 In order to interpret some quantitative results which are
summarized in the next paragraph, we first proceed to a very brief presenta-
tion of the macroeconomic model and of the health model in AGORA.

(i) The macroeconomic model

It is a very simple one, with a Keynesian structure as in Figure 1. We
present only the hypotheses which matter for the effects of health
expenditure growth on the whole economy.
- Investment in the market sector (except for 'Energy' and for
'Transport', where investment is exogenous) is a function (of the
accelerator type) of the increase in production. No financial variable is
introduced in these investment functions.
- Household final consumption expenditure is a function, for each market
good, of its price, of its consumption in the previous year and of total
household final consumption expenditure, which is consistent with the
evolution of the rate of saving. For one part, the consumption of personal
health services, household final consumption is fixed by the health
submodel. So the private payments for this consumption are a forced use of
household disposable income.
- The annual supply of jobs in each market subsector depends on current
production (technically, required employment) and on previous employment.
The demand for jobs, also by market subsector, depends on the composition
of the labour force and on the kinds of labour already available to fill
the jobs supplied.
- Relative prices are those which make the production structure and the
distribution of value added consistent. They arise from hypotheses on the
stability of the wage share in value added, on the general wage rate and
on endogenous labour productivity for each market subsector.¹⁰
- Other hypotheses are made to simplify the macroeconomic model (at least
in its basic framework):
  - varying capacities of production are not introduced;
  - exports are exogenous and imports vary according to elasticities with
    respect to final consumption expenditure or to domestic production;
  - money creation is endogenous without effect on real flows, and the
    interest rate does not affect firms' choices;
  - non-market sectors (hence health activities) grow independently of
    each other;
  - the rate of growth of all benefits (except health benefits) is a fixed
    multiple of the inflation rate. Yet unemployment benefits vary with
    the basic wage and the level of unemployment;
  - fiscal pressure is constant in all its components.
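
The hypotheses listed above are stated verbally; the chapter gives no functional forms. As a hedged illustration only, one possible way of writing the first, second and fourth hypotheses is sketched below. All symbols are assumptions introduced here: $Q_{j,t}$, $L_{j,t}$ and $p_{j,t}$ are production, employment and price in market subsector $j$; $v_j$ is an accelerator coefficient; $C_{i,t}$ is consumption of market good $i$, $C_t$ total household final consumption, $s_t$ the rate of saving and $Y^{d}_t$ disposable income; $\bar{C}_{h,t}$ is health consumption as fixed by the health submodel; $w_t$ is the general wage rate, $\omega_j$ the (stable) wage share and $\pi_{j,t}$ labour productivity. The log-linear demand equation in particular is just one common choice, not the one used in AGORA.

```latex
% Illustrative functional forms only; none of these equations appears in the chapter.
\begin{align}
I_{j,t} &= v_j \,\bigl(Q_{j,t} - Q_{j,t-1}\bigr)
  && \text{(accelerator investment)} \\
\ln C_{i,t} &= a_i + b_i \ln p_{i,t} + c_i \ln C_{i,t-1} + d_i \ln C_t
  && \text{(demand for market good } i\text{)} \\
C_t &= (1 - s_t)\, Y^{d}_t, \qquad C_{h,t} = \bar{C}_{h,t}
  && \text{(total consumption; health part imposed)} \\
\omega_j &= \frac{w_t L_{j,t}}{p_{j,t} Q_{j,t}}
  \;\Longrightarrow\;
  p_{j,t} = \frac{w_t}{\omega_j \, \pi_{j,t}},
  \qquad \pi_{j,t} = \frac{Q_{j,t}}{L_{j,t}}
  && \text{(stable wage share and prices)}
\end{align}
```

The last line shows why a stable wage share, a common wage rate and subsector-specific productivity are enough to pin down relative prices: subsectors with slower productivity growth see their relative price rise.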

(ii) The health submodel

The health submodel is also very simple. As already noted at the beginning
of this section, it mainly reconstitutes health consumption expenditure
from trends in the activity of this sector.¹¹
The descriptive parameters of health system activity are exogenous, but
the links between costs and prices can vary when we introduce alternative
administrative rules.
Activities are described in some detail according to the French
institutional framework; as these details are not used in this paper, we
mention only the basic structure:
- hospital care, in a three-part administrative classification of
hospitals and a four-part classification of services;
- medical services in private practices, produced by general practitioners
and by specialists;
- pharmaceutical products purchased in private pharmacies.
Input costs are estimated from matrices of labour and non-labour
coefficients which are applied to activity levels. Alternative financing
structures of health expenditure are applied. The basic one assumes
constant proportions of three financing sectors: social security and
public sector; mutual funds and private insurance; direct payments.
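
The cost and financing step just described can be illustrated with a short sketch. It is a simplification under stated assumptions: the coefficient 'matrices' are reduced to one labour and one non-labour coefficient per activity, and every name and number below (`input_costs`, `split_financing`, the activity levels, the coefficients, the financing shares) is hypothetical rather than taken from AGORA.

```python
def input_costs(activity, coefficients, wage_index, price_index):
    """Apply labour and non-labour coefficients to exogenous activity levels."""
    costs = {}
    for name, level in activity.items():
        labour = coefficients[name]["labour"] * level * wage_index
        non_labour = coefficients[name]["non_labour"] * level * price_index
        costs[name] = labour + non_labour
    return costs


def split_financing(total, shares):
    """Distribute total expenditure over financing sectors in fixed proportions."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("financing shares must sum to 1")
    return {sector: total * share for sector, share in shares.items()}


# Hypothetical activity levels and coefficients for the three basic activities.
activity = {"hospital_care": 100.0, "private_practice": 50.0,
            "pharmaceuticals": 20.0}
coefficients = {
    "hospital_care": {"labour": 0.5, "non_labour": 0.25},
    "private_practice": {"labour": 0.75, "non_labour": 0.25},
    "pharmaceuticals": {"labour": 0.25, "non_labour": 0.5},
}
# Hypothetical constant financing structure (the 'basic' assumption).
shares = {"social_security_and_public": 0.75,
          "mutual_funds_and_private_insurance": 0.125,
          "direct_payments": 0.125}

costs = input_costs(activity, coefficients, wage_index=2.0, price_index=1.0)
total = sum(costs.values())
financing = split_financing(total, shares)
print(costs, financing)
```

An alternative financing structure amounts to passing a different `shares` dictionary; the activity side is left untouched, which mirrors the exogenous treatment of activity in the submodel.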
The linkage between the health submodel and the macroeconomic model is
shown in Figure 2.

(iii) Using the model for simulations: main results

A traditional way to study the macroeconomic effects of health expenditure
growth is to use the multiplier method. Table 1 therefore shows the
effects of an increase in health expenditure relative to the basic
evolution when this increase is maintained throughout the years of
projection, with a constant financing structure and three alternative
assumptions regarding the household savings rate. Such results give us
examples of the effects of alternative economic assumptions outside the
health system.

Figure 2. Linkages between the health submodel and the macroeconomic model
in AGORA.
[Diagram. Blocks shown: the wage-price block of the macroeconomic model
(general price level, general wage rate); exogenous variables (population
demographics); health system parameters for hospital care (physician and
nonphysician manpower, hospital medical services per patient, mean stay
duration, admissions in hospitals per head), for ambulatory care
(physician manpower, acts per physician) and for medical and
pharmaceutical products; the financing structure of health expenditure;
and, on the macroeconomic side, labour supply, aggregate demand, public
financing of health expenditure, final consumption (for health services),
primary income (income received from the health sector) and disposable
income.]

Quantitative results depend mainly on the way we choose to deal with
non-health final consumption. A first (rough) assumption is that the
acceleration of health expenditure growth involves no acceleration of
collective financing and no adjustment, even in the short term, of the
household savings rate. In such a case there is substitution between
health and non-health final consumption which more or less eliminates
multiplier effects on the national product. Such countereffects may imply
a growth slowdown, as in our model, with an improvement of the trade
balance. Even in this case, the acceleration of health expenditure has a
positive effect on market sector investment. Variations in the budget
deficit depend on the contemporaneous increase in collective financing and
on the final effect on market production (which has a mechanical positive
effect on public resources).
Table 1. Incidence of more health expenditure on GDP components (effects
of an increase of 1 MdF in the basic year, four years later, in constant
prices).

Main macroeconomic: 1st case (a); 2nd case (b); 3rd case (c)

Varying GDP components
  household final consumption, non-health: -0.6; -0.1; 0.3
  household final consumption, health: 1.4; 1.0; 1.4
  market sector investment: 0.2; 0.2; 0.3
  imports: 0.1; -0.2
  GDP: 1.1; 1.1; 1.8
Budget deficit: -0.8; -0.7; -0.6
Trade balance: -0.1; 0.1

(a) Constant level of collective financing, stable household rate of saving.
(b) Compensating collective financing, stable household rate of saving.
(c) Short-term fall of the rate of saving.

A second case (see results in column 2 of Table 1) assumes that the
acceleration in health expenditure implies no acceleration in private
payments for this consumption. So there are no more countereffects on the
non-health final consumption side, even without fast adjustment of the
savings rate.
The final effect on production is the same as in the previous case, but
with a smaller effect on the consumption structure (that is to say, weak
substitution between health and non-health final consumption). The budget
deficit deterioration is very close to what we get in the first case:
collective financing grows faster but activity, and therefore resources,
also grow faster. The only aggregate indicator that fares worse in the
second case is the balance of trade, which is still almost the same as
before.
A third simulation assumes that an acceleration in private payments due to
faster growth in health expenditure can be absorbed by household saving.
So we establish an ex post macroeconomic link between the savings rate and
the growth of consumption, whose level is not determined by traditional
household behaviour but in accordance with a largely institutional
dynamic. It amounts to the same thing to assume that the savings rate is
defined, for households, net of health consumption and benefits. With our
simple macroeconomic model, such a hypothesis implies that the positive
multiplier effects can spread through all channels: increase in final
health consumption, increase in incomes received in market and non-market
sectors. A faster economic growth rate implies (according to our model) a
better government budget balance but a worse trade balance.
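
The difference between the three cases can be made concrete with a first-round household accounting sketch. It is a deliberately stylized illustration, not the AGORA simulation: it holds household income fixed, ignores every multiplier and price feedback (so it cannot reproduce the figures of Table 1), and treats the three assumptions as polar cases. The function name `first_round_effects` and the dictionary keys are hypothetical.

```python
def first_round_effects(extra_health: float, case: int) -> dict:
    """First-round changes for a rise in health spending, income held fixed.

    Case 1: collective financing constant, savings rate stable.
    Case 2: compensating collective financing, savings rate stable.
    Case 3: households pay, and the savings rate falls to absorb the payment.
    """
    if case == 1:
        # Households pay and cut non-health consumption one for one.
        return {"private_payments": extra_health,
                "collective_financing": 0.0,
                "non_health_consumption": -extra_health,
                "household_saving": 0.0}
    if case == 2:
        # Collective financing covers the increase; households are unaffected.
        return {"private_payments": 0.0,
                "collective_financing": extra_health,
                "non_health_consumption": 0.0,
                "household_saving": 0.0}
    if case == 3:
        # Households pay out of saving; non-health consumption is preserved.
        return {"private_payments": extra_health,
                "collective_financing": 0.0,
                "non_health_consumption": 0.0,
                "household_saving": -extra_health}
    raise ValueError("case must be 1, 2 or 3")


for case in (1, 2, 3):
    print(case, first_round_effects(1.0, case))
```

In each case the household budget identity holds: with income unchanged, private payments, non-health consumption and saving sum to zero. Only in the first case does health spending crowd out other consumption at this first round, which is the substitution effect discussed in the text.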
A more complex use of the model is obtained by adding constraints on the
budget deficit. In such a case a limit is fixed for the level of the
budget deficit. Two extreme cases are distinguished: either fiscal
pressure is endogenous, without feedback on health expenditure growth, or
fiscal pressure is also fixed and a health expenditure slowdown must be
achieved. Regulated-budget-deficit simulations are reproduced in Table 2.
As an upper limit, a 1% budget deficit (in terms of market GDP) is allowed
for the last (10th) year of the simulation. With our model this objective,
when achieved with an increase in public receipts (through income taxes or
through wage-based contributions to social security), implies a slowdown
in household consumption and in national product growth (and hence fewer
imports and a better trade balance), and worsening unemployment. Of these
two ways of increasing public revenue, wage-based contributions to social
security are the more inflationary.
Our alternative assumption for achieving budget-deficit regulation implies
a very rapid deterioration of the labour market. Slowing down public
expenditure with the same effect on activity as in the previous case (an
annual rate of growth of market GDP which diminishes by 0.2 percentage
points) produces about 300,000 more unemployed workers. This result comes
mainly from the assumption that when public expenditure growth must be
reduced, the growth of employment (which is the largest expense) in the
public sectors we study also has to be reduced at the same rate.
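
The deficit rule and its two polar ways of being met can be sketched as a static calculation. This is an assumption-laden illustration: it takes receipts, expenditure and market GDP as given for the target year and ignores the feedback of the adjustment on activity, which is precisely what the macroeconomic model adds. The function names and the numbers are hypothetical; only the 1% ceiling comes from the text.

```python
def required_adjustment(receipts: float, expenditure: float,
                        market_gdp: float,
                        max_deficit_ratio: float = 0.01) -> float:
    """Amount by which the deficit exceeds the ceiling (0.0 if the rule holds)."""
    deficit = expenditure - receipts
    ceiling = max_deficit_ratio * market_gdp
    return max(0.0, deficit - ceiling)


def apply_rule(receipts: float, expenditure: float, market_gdp: float,
               instrument: str, max_deficit_ratio: float = 0.01):
    """Meet the rule with more receipts or with less expenditure."""
    gap = required_adjustment(receipts, expenditure, market_gdp,
                              max_deficit_ratio)
    if instrument == "receipts":
        return receipts + gap, expenditure
    if instrument == "expenditure":
        return receipts, expenditure - gap
    raise ValueError("instrument must be 'receipts' or 'expenditure'")


# Illustrative numbers only: a deficit of 20 against a ceiling of 10.
print(apply_rule(440.0, 460.0, 1000.0, instrument="receipts"))
print(apply_rule(440.0, 460.0, 1000.0, instrument="expenditure"))
```

Both instruments close the same accounting gap, but, as the simulations in Table 2 indicate, their effects on prices and unemployment differ once the feedbacks are taken into account.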
The simulations discussed here support a scheme summarizing the different
kinds of macroeconomic effects of an increased rate of health expenditure
growth (see Figure 3). The main assumptions are shown in boxes. They are:
- structure of health expenditure financing (public or private financing)
- structure of care production (especially the share of wage costs in total
- constraint on budget deficit (upper limit?)
- structure of wage cost in the whole economy (repercussions of public
financing of health expenditure on wage costs)
- household saving behaviour (exogenous or endogenous rate of saving)
- structure of household final consumption expenditure (links between
final consumption in health and in other products)
- profit and wage shares in value added.
Two groups of relations act simultaneously: a first one which corresponds
to the multiplier effects and a second one which corresponds to the links
between wages (level and structure) and prices.
The multiplier effects imply an increase in final consumption due to more
intermediate goods and investments in the health system, but also due to
increases in employment which permit more household final consumption
expenditure. In the very short term these effects involve an increase in
aggregate demand and an increase in inflation because of pressures on
production. Then, in the medium term, these tensions stimulate investment
in the market sector.
Table 2. Simulation of budget-deficit regulation policies: main results
(from Peaucelle et al. (1983)).

Columns:
  (1) Basic scenario
  Regulating budget deficit policies, with more receipts:
  (2) using income taxes
  (3) using wage-based contributions to social security
  Regulating budget deficit policies, with less expenditure:
  (4) for health and education (a)
  (5) for Central Administration, Defence and Infrastructure (b)
  (6) (a) and (b)

                                           (1)     (2)     (3)     (4)     (5)     (6)
Annual mean rate of growth (1981-1986)
  Market GDP (at constant prices)          2.2     2.0     2.0     2.0     2.1     2.0
  Price of market GDP                     10.0    10.1    10.4    10.0    10.1    10.1
  Non-market GDP (at constant prices)      0.8     0.7     0.9    -0.1     0.6    -0.3
  Price of non-market GDP                 11.9    12.0    12.1    11.9    11.9    11.9
Ratio public expenditure on market GDP
  (at current prices)                     45.1    45.5    45.0    44.9    45.0    44.7
Ratio public expenditure on market GDP
  (at constant prices)                    42.2    42.5    42.1    42.0    42.0    41.8
Unemployment (thousands)                  2387    2436    2434    2672    2440    2723
Annual mean rate of growth (1981-1986)
  of household final consumption
  expenditure (at constant prices)         2.1     1.6     1.6     1.8     2.0     1.8

Figure 3. Effects of a variation in the level of health activity when
feedbacks between the health submodel and the macroeconomic model only
work through the wage-price block.
[Diagram. Boxed assumptions: structure of health expenditure financing;
structure of care production; constraint on budget deficit; household
saving behaviour; structure of household final consumption expenditure;
profit and wage shares in value added. Other elements shown: health
benefits, compensation of employees, labour, disposable income, final
consumption, health and non-health consumption expenditure, aggregate
demand, production, profit rate, multiplier effects.]

A second kind of effect depends on the structure of labour costs. It
displays the special character of the French case with respect to the
level of wage-based contributions to social security. It is also necessary
to remember the limitations of our approach, which does not introduce
feedback from activity in the health system to labour productivity. To
simplify, two cases can be distinguished according to how a modification
in public expenditure affects labour cost: through its structure or
through its rate of growth. For us, changing the structure of labour cost
means increasing contributions to social security (from employees and
employers) at the expense of the net wage. Except for the effects of a new
labour cost structure on low wages,¹² we can suppose that, for employers,
the total labour cost is more important than its structure. On the other
hand, if the share of the net wage is reduced, it will have appreciable
effects on final consumption. Well-known consumption functions do not take
into account the structure of disposable income.
On the whole, household disposable income varies in our simulations
according to three combined effects: a growth in wages in the health
system and in the market economy (due to more activity), a growth in
health benefits and a slowdown in net wages (new structure of labour
costs). The final variations depend on the constraint on the budget
deficit (and on how it is satisfied) and obviously on the way the
macroeconomic model works (especially the production functions in each
sector and the division of consumption between domestic products and
imports).
So the development of health activities can generate cumulative effects,
with the distortion in labour cost structure reinforcing the increase of
non-market goods in household consumption.
The second case (acceleration of labour costs) brings in a new group of
interdependencies which are linked to inflationary effects. A medium-term
loop is added to the previous ones: it acts through operating surplus,
level of internal financing, and rate of profit in market sectors.
Everything then depends (still assuming no productivity gains coming from
health activities) on the way employers pass labour cost increases on into
prices (under constraints from domestic and international trade). Too
tight a constraint on prices involves a reduction in investment projects
and recessionary consequences. A weaker constraint may imply more
inflationary pressure, so that the problem reverts to consumers' choices
and the financial sector.


Most methods used to describe health system dynamics, including ours, lead
one to focus on the autonomous development of these activities, as if they
were impervious to economic contingencies. This seems mainly due to statistical
or methodological constraints. To take into account all the links between the
health system and the whole economy would first imply an analysis of an
epidemiological type. Thus, we could show how socioeconomic variables
are responsible for health conditions. But such an analysis must integrate the
study of access to the health system and the evaluation of efficacy according
to several points of view. Then we face problems like relations between risk
factors, levels (individual and social) of analysis, indicators of health,
measures of effects of health conditions on productivity ... Attaining such
objectives is still utopian.
So more targeted approaches are necessary in order to appraise the comparative
efficiency of health projects, even if it can serve as a basic framework.
Macroeconomic models might also evolve. For instance distinguishing
market sectors of activity is perhaps less relevant than opposition between
large and small firms or than some indicators of work conditions (shiftwork,
night or day work, ...). As to the relation between consumption and health
condition, it is less meaningful to know the value and the quantity of
consumed goods than to identify what they imply in terms of specific risks (for
instance: alcohol, tobacco, hypercaloric foods, ...). Problems of morbidity
statistics and of their general lack of etiological considerations are well-known.
In spite of these limits, international studies and the differentiated effects of
economic crisis on health expenditure prove the necessity of including it in
the general economic framework.


* I express my most profound thanks to Professor Mike Jerison (of New York and
CEPREMAP) for correcting my English. However, the responsibility for all other errors
that remain in the text is mine alone.
1. Group which is not homogeneous.
2. This remark is not particular to our area; it could also be made for household
equipment goods or some services. When made clear, it allows us here to account
for the social functions of the medical profession: Levy et al. (1982).
3. Comptes de la Sante - Methodes et series 1950-1977 (1979).
4. Comptes de la Sante - Methodes et series 1950-1977 (1979).
5. For France, see Letourmy (1976).
6. Very often, in terms of variations of rates of mortality. It also shows the cumulative effect
of socio-cultural and supply factors.
7. For France, see Couder et al. (1972).
8. For instance, Yett et al. (1979).
9. Peaucelle et al. (1983). AGORA is a medium-long term model with a macroeconomic
model and two submodels on Education and on Health. At the beginning of the building
of this model (1975), there was no French mini-model. So the team had to build its own
macroeconomic model in addition to the two submodels. The main objective of this study
was to analyze the links between market and non-market sectors.
10. Depending on the level of production in each market subsector.
11. So it is very near to a projection model.
12. Due in France to the (less and less important) ceiling of contributions to social security.

Bacon, R. and Eltis, W. (1978), Britain's Economic Problem: Too Few Producers, The
MacMillan Press Ltd, 2nd edition.
Baumol, W. J. (1967), Macroeconomics of Unbalanced Growth: The Anatomy of Urban Crisis,
American Economic Review, June.
Comptes de la Sante (1979), Methodes et series 1950-1977, CREDOC-INSEE, Ministere
de la Sante et de la Famille, Rapport CREDOC.
Couder, B., Sandier, S. and Tonnellier, F. (1972), Recherche de projections coherentes pour
des variables interdependantes, Consommation no. 3, Juillet-Septembre.
Letourmy, A. (1976), Sante, environnement et consommations medicales, Discussion autour
d'un modele, Revue Economique no. 3.
Levy, E., Bungener, M., Dumenil, G. and Fagnani, F. (1982), La croissance des depenses de
sante, Economica.
Lorenzi, M., Pastre, O. and Toledano, J. (1980), La crise du XXe siecle, Economica.
Peaucelle, I., Petit, P. and Saillard, Y. (1983), Depenses publiques: structure et evolution par
rapport au PIB. Les enseignements d'un modele macroeconomique, Revue d'Economie
Politique no. 1.
Yett, D. E., Drabek, L., Intriligator, M. D. and Kimbell, L. J. (1979), A Forecasting and Policy
Simulation Model of the Health Care Sector. The HRRC Prototype Microeconometric
Model, Lexington Books.

Supply and demand

The MIMIC health status index*
(what it is and what it does)


Department of Health Policy and Management, Erasmus University Rotterdam
(PO Box 1738, 3000 DR Rotterdam, The Netherlands).
Social and Cultural Planning Office (PO Box 37, 2280 AA Rijswijk, The Netherlands).


During the last decade considerable progress has been made in econometric studies
in developing models with so-called latent or unobservable variables.1 In this
paper we apply the latent variable technique in the field of the health care
sector. Health, being one of the most important explanatory variables for
health indicators like the use of health care facilities or sick-leave, is an
outstanding example of a latent variable. In recent years some studies
have appeared specifying health as a latent variable.2 One of the benefits of
these MIMIC-models3 is that they yield an 'index of the health status'. This
MIMIC-Health Status Index (MIMIC-HSI) is fully characterized by its causes
(health determining factors) and by its indicators (health indicators).
In this paper we will give a definition of the MIMIC-HSI and we will
interpret its numerical properties. One of the advantages of the MIMIC-HSI
over the classical health status indexes like e.g. a weighted sum of health care
utilization and sick-leave, is that factors like money-price, time-price and
supply of health care facilities (influencing health care utilization) or the level
of the sick-pay and other insurance conditions (influencing sick-leave) have
no longer any influence on the health status index.

The classical model

Quantification of the health status of an individual or a region is seriously
hampered by the lack of a natural unit of measurement for health. Although
health cannot be measured directly, many indicators of health are measur-
able. For instance, at an aggregated level we can use the mortality rate, life
expectancy, or the sick-leave rate as health indicators. At the individual level
we have blood pressure, cholesterol percentage, disability, self perceived
general state of health, etc. The use of health care facilities is also frequently
used as a health indicator.

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 19-29.
© 1991 Kluwer Academic Publishers.
W. P. M. M. van de Ven and E. M. Hooijmans

Differences in health indicators between individuals or regions are often
explained by differences in explanatory variables. In formula we might
express this as follows:

$$HI_t = \sum_{k} \pi_{kt} X_k + v_t, \qquad t = 1, 2, \ldots, L, \tag{1a}$$

where $HI_t$ is the t-th Health Indicator, $X_k$ is the k-th explanatory variable,
$\pi_{kt}$ is an unknown parameter indicating the influence of $X_k$ on $HI_t$ and $v_t$ is
a disturbance term, representing all explanatory variables not explicitly
accounted for in the equation.4 In order not to complicate the notation, we
will suppress the index n indicating the n-th observation (individual, region,
etc.), n = 1, 2, ..., N.
One of the recommendations of the Conference on health status indexes
(Tucson, Arizona, 1972) was to make a difference between a health
indicator and a health index. The first term is used for rather specific
measurements such as those mentioned above. The latter term should be
used for derived measures that combine several health indicators (Berg,
1973, p. 253). Often the index is taken to be a weighted sum of indicators:

$$HSI = \sum_{t} w_t HI_t, \tag{1b}$$

where HSI stands for Health Status Index and $w_t$ indicates the weight that $HI_t$
carries. We refer to model (1) as the Classical Model.
A major problem that may occur when using a health status index HSI like
(1b) is that factors $X_k$ which do not influence health, but do have an
influence on $HI_t$, influence HSI via $HI_t$. For instance, if in an economic
recession the sick-leave rate ($HI_t$) decreases as a result of fear of losing one's
job or as a result of lower sick-pay, then the value of HSI will increase and
incorrectly be associated with a better health status. Another example is the
number of hospital bed-days per capita, which is largely determined by
hospital capacity. Closing some 25 hospitals, as the Dutch government
intends to do, will lower the total bed capacity by about 12% and thereby
the number of hospital bed-days ($HI_t$) by approximately the same percentage.
This lower use of hospital facilities will incorrectly be associated
with a better health status. Building a new hospital or decreasing the
copayment rate will increase the health indicator 'hospital consumption' and
will in the same way incorrectly lower the value of HSI.
Other problems which arise when using a health status index HSI like (1)
are: Which weights $w_t$ have to be assigned and who determines the weights?
(see e.g. Berg, 1973, pp. 254-255).
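
The hospital-capacity example can be made concrete with a short sketch. The indicator values and the weights below are hypothetical (the text gives neither); the only figure taken from the text is the 12% fall in bed-days. With negative weights on both indicators, the weighted sum (1b) rises although nothing about health has changed.

```python
def classical_hsi(indicators, weights):
    """Weighted sum of health indicators, HSI = sum_t w_t * HI_t (equation 1b)."""
    if len(indicators) != len(weights):
        raise ValueError("need exactly one weight per indicator")
    return sum(w * hi for w, hi in zip(weights, indicators))


# Hypothetical weights: more sick leave or more bed-days lowers the index.
weights = [-0.5, -0.5]

# Hypothetical region: [sick-leave rate in %, hospital bed-days per capita].
before_closure = [5.0, 1.2]
# Bed capacity is cut, so bed-days fall by 12%; health itself is unchanged.
after_closure = [5.0, 1.2 * (1 - 0.12)]

hsi_before = classical_hsi(before_closure, weights)
hsi_after = classical_hsi(after_closure, weights)

print(f"HSI before closures: {hsi_before:.3f}")
print(f"HSI after closures:  {hsi_after:.3f}")
```

The index moves only because a supply-side factor moved one of its components, which is the problem described above.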

The MIMIC-model

With the aid of present-day computer facilities and statistical methods it is
possible to develop a health status index that is not burdened by these
problems. In constructing a new health status index we start from the view
that health is a theoretical concept which is not directly measurable, that
there exists an ordering in different health statuses and that health is fully
characterized by its causes and by its indicators. Following the terminology
used in the statistical literature (Joreskog and Goldberger, 1975) we call the
index based on the above mentioned starting points the MIMIC-Health
Status Index, where MIMIC stands for Multiple Indicator Multiple Causes.
Our approach is in the spirit of Robinson and Ferrara (1977), who stated:
'The approach we take seems new, for we model health as unobservable link
between observable causes and observable effects' (p. 139). In formula the
MIMIC-model is defined as follows:

$$HI_t = \sum_{k=1}^{K_1} \alpha_{1kt} X_{1k} + \sum_{k=1}^{K_2} \alpha_{2kt} X_{2k} + \delta_t \,\text{MIMIC-HSI} + u_t, \qquad t = 1, 2, \ldots, L, \tag{2a}$$

$$\text{MIMIC-HSI} = \sum_{k=1}^{K_2} \beta_{2k} X_{2k} + \sum_{k=1}^{K_3} \beta_{3k} X_{3k} + u, \tag{2b}$$

where
$X_1$ = a $K_1$-vector of variables which only influence $HI_t$;
$X_2$ = a $K_2$-vector of variables which have a direct influence on the MIMIC-HSI
and both a direct and an indirect (via MIMIC-HSI) effect on $HI_t$;
$X_3$ = a $K_3$-vector of variables which have a direct influence on the MIMIC-HSI
and therefore an indirect effect on $HI_t$.
$\alpha_{1kt}$, $\alpha_{2kt}$, $\delta_t$, $\beta_{2k}$ and $\beta_{3k}$ are unknown parameters (to be estimated) and $u_t$
and $u$ are disturbance terms.
All X-variables and all health indicators HI are observable; the disturb-
ance terms and the MIMIC-HSI are unobservable.
The division of X into X1, X2 and X3 might be as follows:
(a) If the health indicator is the use of health care facilities:
X1: supply of health care facilities and money- and time-price variables;
X2: income, education, medical knowledge;
X3: health status proxies (e.g. age-sex), use of health care facilities in the
past, life-style variables, housing, environmental pollution, the kind of
employment and other factors included in the production function of
health (see e.g. Grossman, 1972).
(b) If the health indicator is sick-leave:
X1: sick-pay, travel time to work, whether or not breadwinner;
X2: job-satisfaction, responsibility, and rate of payment;
X3: as above.
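
A minimal simulation sketch of (2a)-(2b) for a single indicator (use of health care facilities), with one variable in each of X1, X2 and X3. All coefficient values are invented for illustration and have no empirical basis. The sketch shows the property the model is built around: changing the X1 variable (supply) shifts the indicator but leaves the latent index untouched.

```python
import random

# Hypothetical structural parameters (illustration only).
ALPHA_1 = 0.8    # direct effect of supply (X1) on utilization
ALPHA_2 = 0.2    # direct effect of income (X2) on utilization
DELTA = -1.0     # effect of the latent health index on utilization
BETA_2 = 0.3     # effect of income (X2) on health
BETA_3 = -0.5    # effect of age (X3) on health


def simulate(supply, income, age, seed):
    """Draw one observation from equations (2b) and (2a)."""
    rng = random.Random(seed)
    health = BETA_2 * income + BETA_3 * age + rng.gauss(0.0, 0.1)    # (2b)
    use = (ALPHA_1 * supply + ALPHA_2 * income
           + DELTA * health + rng.gauss(0.0, 0.1))                   # (2a)
    return health, use


# Same region and same random draws, but with extra supply in the second call.
health_low, use_low = simulate(supply=1.0, income=2.0, age=1.5, seed=42)
health_high, use_high = simulate(supply=2.0, income=2.0, age=1.5, seed=42)

print(f"latent index: {health_low:.3f} -> {health_high:.3f}")
print(f"utilization:  {use_low:.3f} -> {use_high:.3f}")
```

In real data the latent index is of course not observed; the simulation only makes the assumed causal structure explicit.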

Classical model versus MIMIC-model

Schematically the difference between Classical model (1) and the MIMIC-
model (2) may be illustrated as in Figure 1.
What are the main advantages of the MIMIC-HSI as defined in (2) over
the HSI as defined in (1)?
First, $X_1$-variables which do not influence health no longer have any direct
influence on the value of the health status index.
Second, the total effect $\pi_{2t}$ of $X_2$-variables on $HI_t$ has been disentangled
into the direct effect $\alpha_{2t}$ and the indirect effect $\delta_t \beta_2$ (via MIMIC-HSI). As
an illustration we mention the results of Van de Ven and Van der Gaag
(1982) who found evidence that income has both a positive direct
(income-)effect ($\alpha_{2t} > 0$) and a negative indirect effect via MIMIC-HSI
($\delta_t \beta_2 < 0$) on medical consumption. The latter is the result of a positive
effect ($\beta_2 > 0$) of income on health (better life-style, hygiene, attitude
to health etc.) and a negative effect ($\delta_t < 0$) of health on medical
consumption.
Third, the weights $w_t$ no longer need to be predetermined e.g. by intuition
or by some Delphi-method, but all unknown parameters, including those
needed for calculating MIMIC-HSI, may be estimated using an appropriate
data-base and an appropriate estimation method.
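
The decomposition in the second point is simple arithmetic once the parameters are known. The sketch below uses invented values chosen only to reproduce the sign pattern reported for income (positive direct effect, positive effect on health, negative effect of health on consumption); they are not estimates from any of the cited studies.

```python
def decompose_effect(alpha_direct, delta, beta):
    """Split a reduced-form effect into its direct and indirect parts.

    total = alpha_direct + delta * beta, i.e. the direct effect plus the
    effect that runs through the latent health index.
    """
    indirect = delta * beta
    return {"direct": alpha_direct, "indirect": indirect,
            "total": alpha_direct + indirect}


# Hypothetical values for the income example.
income_effect = decompose_effect(alpha_direct=0.30, delta=-1.2, beta=0.10)

for name, value in income_effect.items():
    print(f"{name:>8}: {value:+.2f}")
```

A regression of the indicator on income alone would reveal only the total, hiding the two opposing components.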

Figure 1a. Classical model.

Figure 1b. MIMIC-model.

The MIMIC-health status index

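Substituting the unobservable MIMIC-HSI in the MIMIC-model yields the
Classical Model

$$HI_t = \sum_{k=1}^{K_1} \pi_{1kt} X_{1k} + \sum_{k=1}^{K_2} \pi_{2kt} X_{2k} + \sum_{k=1}^{K_3} \pi_{3kt} X_{3k} + v_t, \qquad t = 1, 2, \ldots, L, \tag{3a}$$

where $\pi$ is decomposed in $\pi_1$, $\pi_2$ and $\pi_3$ according to $X_1$, $X_2$ and $X_3$ and

$$\pi_{1kt} = \alpha_{1kt} \tag{3b}$$
$$\pi_{2kt} = \alpha_{2kt} + \delta_t \beta_{2k} \tag{3c}$$
$$\pi_{3kt} = \delta_t \beta_{3k} \tag{3d}$$
$$v_t = u_t + \delta_t u. \tag{3e}$$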

Because all variables in Equation (3a) are observable, we can make an
estimate of all parameters $\pi$ and of $E v_i v_j$ for all i, j. If we are able to write
the unknown parameters of the MIMIC-model as a function of the consistently
estimable parameters of Equation (3a), we can also make a consistent
estimate of the parameters of the MIMIC-model (i.e. the identification
problem has been solved).
In the case $X_{2,1} \equiv 1$ (i.e. $\beta_{2,1}$ is the so-called 'constant term'), necessary
conditions for this appear to be:
(A) $\delta_{t_0} = c$ for some $t_0$ with c a constant,
(B) $\beta_{2,1} = d$, with d a constant,
(C) for each $k \neq 1$: $\alpha_{2kt_k} = 0$ for some $t_k$.
Under these conditions the Equations (3b)-(3e) can be 'solved' as follows:





$$\beta_{3k} = \frac{\pi_{3kt_0}}{c}. \tag{4e}$$

If further $E u_i u_j = 0$ ($i \neq j$) then

$$E u^2 = \frac{E(HI_1 \cdot HI_2)}{\delta_1 \delta_2} \quad \text{and} \quad E u_t^2 = E(HI_t^2) - \delta_t^2 E u^2, \tag{4f}$$

so all parameters of the MIMIC-model are identified.
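
The logic of the identification argument can be checked numerically. The sketch below is one possible inversion of (3b)-(3d), not necessarily the authors' own formulas. It assumes the conditions are read as: one $\delta_{t_0}$ fixed at c, the constant term of the index fixed at d, and one exclusion restriction ($\alpha_{2kt_k} = 0$) for every non-constant X2-variable. All parameter values are hypothetical, and index 0 in the code plays the role of the constant term and of $t_0$.

```python
def reduced_form(alpha1, alpha2, delta, beta2, beta3):
    """Map structural parameters to reduced-form ones via (3b)-(3d).

    Matrices are indexed [k][t]: variable k, health indicator t.
    """
    n_t = len(delta)
    pi1 = [row[:] for row in alpha1]                                   # (3b)
    pi2 = [[alpha2[k][t] + delta[t] * beta2[k] for t in range(n_t)]
           for k in range(len(beta2))]                                 # (3c)
    pi3 = [[delta[t] * beta3[k] for t in range(n_t)]
           for k in range(len(beta3))]                                 # (3d)
    return pi1, pi2, pi3


def recover(pi2, pi3, c, d, t0, excluded_in):
    """Invert (3c)-(3d) under the normalisations c, d and the exclusions.

    excluded_in[k] is the indicator in which X2-variable k has no direct
    effect; it is not needed for the constant term (k = 0).
    Assumes beta3[0] != 0 so that delta can be read off pi3[0].
    """
    beta3 = [row[t0] / c for row in pi3]                   # delta_t0 = c
    delta = [pi3[0][t] / beta3[0] for t in range(len(pi3[0]))]
    beta2 = [d]                                            # constant term = d
    for k in range(1, len(pi2)):
        t_k = excluded_in[k]
        beta2.append(pi2[k][t_k] / delta[t_k])             # alpha2[k][t_k] = 0
    alpha2 = [[pi2[k][t] - delta[t] * beta2[k] for t in range(len(delta))]
              for k in range(len(pi2))]
    return alpha2, delta, beta2, beta3


# Hypothetical structural values: two indicators, X2 = (constant, income).
true = dict(
    alpha1=[[0.9, -0.3]],
    alpha2=[[1.0, 2.0], [0.7, 0.0]],   # income excluded from indicator 1
    delta=[2.0, -0.5],
    beta2=[0.5, 0.3],
    beta3=[0.4, -0.2],
)
pi1, pi2, pi3 = reduced_form(**true)
alpha2_hat, delta_hat, beta2_hat, beta3_hat = recover(
    pi2, pi3, c=true["delta"][0], d=true["beta2"][0], t0=0,
    excluded_in={1: 1},
)
print("beta3:", beta3_hat)
print("delta:", delta_hat)
print("beta2:", beta2_hat)
```

The round trip returns the structural values that generated the reduced form, which is what identification means here. The parameters $\alpha_1$ need no inversion because they equal $\pi_1$ directly.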

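Assuming $E u = 0$ we may use as an estimate of the MIMIC-HSI, given
some X:

$$\widehat{\text{MIMIC-HSI}}(X) = \sum_{k=1}^{K_2} \hat{\beta}_{2k} X_{2k} + \sum_{k=1}^{K_3} \hat{\beta}_{3k} X_{3k}, \tag{5a}$$

where $\hat{\beta}_{2k}$ and $\hat{\beta}_{3k}$ are estimates of the unknown parameters $\beta_{2k}$ and $\beta_{3k}$.
Substituting $\hat{\beta}_{2k}$ and $\hat{\beta}_{3k}$ from (4d) and (4e) in (5a) yields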


From (5b) we may conclude that MIMIC-HSI is a variable whose scale is
'unique up to a linear transformation'. This implies that the MIMIC-HSI is
a so-called interval variable, i.e. a variable with an interval scale of measurement.
'In this sort of measurement, the ratio of any two intervals is
independent of the unit of measurement and of the zero point. In an interval
scale, the zero point and the unit of measurement are arbitrary' (Siegel, 1956,
In Equation (5b) c serves as the arbitrarily chosen unit of measurement
and d, which equals MIMIC-HSI (0), as the zero point.
To facilitate the interpretation of the numerical properties of the MIMIC-HSI
we make a comparison with temperature measurement. Just like health,
temperature may be considered a concept which is not directly measurable,
for which an ordering exists and which is fully characterized by its causes and
by its indicators. (The theoretical zero-base 'zero degrees Kelvin' corresponds
to the theoretical zero-base in health 'being dead'). If the temperature
in centigrade (Celsius) is known, one may apply a simple linear transformation5
in order to know the temperature in Fahrenheit. The ratio of two differences
between temperature values on one scale ('index') is equal to the ratio
of the equivalent differences on the other scale. In accordance with our
intuition about health it is meaningless to say that the value of the MIMIC-
HSI of region (or person) A is twice as large as that of region (or person) B.
However, from (5b) we conclude that it does have meaning to say that the
difference in the MIMIC-HSI between A and B is twice as large as the
difference in the MIMIC-HSI between C and D.
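
The interval-scale property can be verified with a few lines of arithmetic. In the sketch below the four index values are hypothetical, and the Celsius-to-Fahrenheit constants are used purely as an example of a change of zero point and unit.

```python
import math


def rescale(value, zero_point, unit):
    """Linear change of scale: new = zero_point + unit * old."""
    return zero_point + unit * value


def ratio_of_differences(a, b, c, d):
    """Ratio of the interval (a - b) to the interval (c - d)."""
    return (a - b) / (c - d)


# Hypothetical index values for four regions A, B, C and D.
hsi = {"A": 0.27, "B": 0.23, "C": 0.22, "D": 0.21}

# Re-express the index with another zero point and unit (F = 32 + 1.8 C).
rescaled = {name: rescale(v, zero_point=32.0, unit=1.8)
            for name, v in hsi.items()}

diff_ratio_old = ratio_of_differences(hsi["A"], hsi["B"], hsi["C"], hsi["D"])
diff_ratio_new = ratio_of_differences(*(rescaled[k] for k in "ABCD"))
level_ratio_old = hsi["A"] / hsi["B"]
level_ratio_new = rescaled["A"] / rescaled["B"]

print("ratio of differences unchanged:",
      math.isclose(diff_ratio_old, diff_ratio_new, rel_tol=1e-9))
print("ratio of levels unchanged:     ",
      math.isclose(level_ratio_old, level_ratio_new, rel_tol=1e-9))
```

Statements about ratios of differences survive any choice of c and d, whereas statements of the form 'A is twice as healthy as B' do not.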


The MIMIC-HSI has some fruitful applications. Having estimated the
parameters of the MIMIC-health care model it is possible to calculate for
each individual or group of individuals the value of the MIMIC-HSI. This
index value may be used in comparing the health status of different
individuals or different states (regions), in allocating health care resources
and assigning budgets to regions or in evaluating the quality of care.
For example, Robinson and Ferrara (1977) present a ranking of the 50
U.S. states for 1969 according to 'health'. They warn the reader that for more
than one reason their empirical results should not be taken seriously (p. 141),
so their ranking should primarily be considered as an illustration.
In analyzing a 1975-data base with observations on 675 households with
1589 children, Wolfe and Van der Gaag (1981) specify and estimate a
MIMIC-health care model. They present the value of their MIMIC-health
status index for children by socio-economic groups.
After a slight modification their model may be written as model (4.1)6 and
their index may be written as (4.10). Wolfe and Van der Gaag do not go into
detail about the interpretation of the presented index-values, but now we may
conclude that their index is measured on the interval scale. Hooijmans and
Van de Ven (1982) specify and estimate a MIMIC-health care model using
1974-data for 63 regions covering all of The Netherlands in order to explain
the regional differences in the main forms of health care utilization and
health care supply. As an illustration they calculated the MIMIC-HSI for the
11 provinces (which also cover all of The Netherlands) and presented their
rankings. From the previous section we may conclude that ranking does not
contain all the relevant information that may be derived from the MIMIC-
HSI. Therefore, in Table 1, we present the value of the MIMIC-HSI for each
province.
If health care supply is one of the $HI_t$-variables (health indicators) in the
MIMIC-health care model ('the facilities are allocated according to, among
others, health needs'), another application is to calculate the difference in
expected health care facilities between two regions. This difference can be
divided into a difference based on differences in health determining factors
and a difference based on other factors. For example, if the number of
general practitioners per population (GP-density) $HI_t$ depends on a vector
$X_1$ and on the regional health status (MIMIC-HSI) which is influenced by the
health determining factors represented by the vector $X_3$, we have the
following simple model:

$$HI_t = \alpha_{1t} X_1 + \delta_t \,\text{MIMIC-HSI} + u_t, \tag{6a}$$
$$\text{MIMIC-HSI} = \beta_3 X_3 + u. \tag{6b}$$

The expected value of $HI_t$ equals

$$E(HI_t \mid X) = \alpha_{1t} X_1 + \pi_{3t} X_3, \tag{6c}$$

Table 1. The MIMIC-HSI per province. For the choice of zero-base (d) and unit
of measurement (c), see Hooijmans and Van de Ven (1982).

Province         MIMIC-HSI
Noord-Brabant    0.27
Utrecht          0.23
Limburg          0.22
Gelderland       0.21
Drente           0.19
Overijssel       0.18
Zuid-Holland     0.07
Noord-Holland    0.05
Groningen        0.02
Friesland        0
Zeeland          0

where we used $\delta_t \beta_3 = \pi_{3t}$ (from (3d)).
The difference in expected value between region A and B then reads

$$E(HI_t \mid X^A) - E(HI_t \mid X^B) = \alpha_{1t}(X_1^A - X_1^B) + \pi_{3t}(X_3^A - X_3^B). \tag{6d}$$

From (6d) we conclude that $\pi_{3t}(X_3^A - X_3^B)$ equals the difference in the
expected GP-density between region A and B while only accounting for
differences in health determining factors between A and B; $\alpha_{1t}(X_1^A - X_1^B)$ is
the difference while only accounting for differences in other factors ($X_1$).
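
A small worked example of the decomposition in (6d). The coefficients and regional characteristics are hypothetical, as is the interpretation of the single X1-variable; the sketch only checks that the health-related and the other component add up to the total expected gap.

```python
def dot(coefs, values):
    """Inner product of a coefficient vector and a vector of variables."""
    return sum(c * v for c, v in zip(coefs, values))


def decompose_gap(alpha_1t, pi_3t, x1_a, x1_b, x3_a, x3_b):
    """Split the expected gap between regions A and B as in equation (6d)."""
    other = dot(alpha_1t, [a - b for a, b in zip(x1_a, x1_b)])
    health = dot(pi_3t, [a - b for a, b in zip(x3_a, x3_b)])
    return {"other_factors": other, "health_factors": health,
            "total": other + health}


# Hypothetical coefficients and regional characteristics.
alpha_1t = [0.5]          # effect of one X1-variable on GP-density
pi_3t = [0.2, -0.1]       # reduced-form effects of two health determinants
x1_a, x1_b = [3.0], [2.0]
x3_a, x3_b = [4.0, 1.0], [2.5, 3.0]

gap = decompose_gap(alpha_1t, pi_3t, x1_a, x1_b, x3_a, x3_b)

# Cross-check against the expected values from (6c) computed region by region.
expected_a = dot(alpha_1t, x1_a) + dot(pi_3t, x3_a)
expected_b = dot(alpha_1t, x1_b) + dot(pi_3t, x3_b)

for name, value in gap.items():
    print(f"{name:>15}: {value:+.2f}")
```

With these numbers half of the gap in expected GP-density is attributable to health determining factors and half to other factors.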
If one knows the influence of health determining variables like the use of
health care facilities (in the past), road safety, environmental hygiene,
education, welfare work or sporting facilities and if one knows the costs of
changing these variables, it is possible to apply a cost-effectiveness analysis by
comparing the costs of increasing health in different ways. If one further
knows the benefits of better health in terms of less (utilization of) health
care facilities, one may apply a cost-benefit analysis by comparing these
benefits with the costs of improving health.
An illustration of this may be found in Van de Ven and Van der Gaag
(1982). They estimated an 11-equation MIMIC-health care model using a
1976-data base of 3,636 Dutch male breadwinners. In Table 2a the influence
of some variables on the MIMIC-HSI is presented. A one-year increase in
education yields, other things being equal, a 0.035 increase in the MIMIC-HSI;
the same effect is achieved by a reduction of the rate of unemployment
by 0.128/0.035 = 3.7%. If the costs of increasing education or decreasing
unemployment are known, one may compare the costs of increasing health in
these two ways. In Table 2b the effect of the MIMIC-HSI on health care
utilization is presented. An increase of the MIMIC-HSI by one unit results
on the average in a reduction of 1.2 GP contacts per half year, a reduction of
about 28 guilders (in half a year) on medicine prescribed by the GP etc. (see
Table 2b).
The costs of increasing health can then be compared with the benefits in
terms of reduced health care costs.
The results in Table 2 should only be considered as an illustration of the
possible applications of the MIMIC-HSI because the data base on which
these results are based, was not primarily collected for the performed
MIMIC-analysis. Nevertheless, it offered sufficient possibilities to illustrate
the use of the MIMIC-HSI at the individual level.

Conclusion and discussion

In this paper we defined the MIMIC-Health Status Index (MIMIC-HSI) as a
latent variable representing 'health' in a MIMIC-health care model. For the
case that the MIMIC-HSI is a linear function of health determining factors
we concluded from the identification restrictions imposed on the MIMIC-health
care model, that under general conditions the MIMIC-HSI is measured
on a so-called interval scale, i.e. the ratio of any two intervals is
independent of the arbitrarily chosen unit of measurement and of the zero
point. Although the MIMIC-HSI is still at the development stage, it looks
promising and seems to have some fruitful applications in health care
planning, e.g. cost-effectiveness analysis, cost-benefit analysis and the allocation
of budgets or resources to regions.

Table 2a. Total effects of exogenous variables on the MIMIC-HSI.a

            Family size   Unemployment   Age      Education
Effect      0.174         -0.128         -0.023   0.035

Table 2b. Total (reduced form) effect of the MIMIC-HSI on the use of some health care
facilities.a

Health care facility                              Effect
GP-contacts (half year)                           -1.22
Medicine prescribed by GP (half year) b           -28.54
Specialist contacts (half year)                   -1.31
Medicine prescribed by specialist (half year) b   -28.39
Hospital days (yearly)                            -2.34

a Tables 9 and 10 in Van de Ven and Van der Gaag (1982).
b Expressed in Dutch guilders (one guilder is approx. 40 dollar cents).

Further research is needed on the interpretation and the numerical
properties of the MIMIC-HSI in the more general case e.g. if it is a vector, if
there exists simultaneity, if the model is dynamic or not linear or if identification
is achieved through restrictions on the elements of the covariance
matrix.


* A previous version of this article has been presented at the European Meeting of the
Econometric Society, Dublin, September 1982.
1. E.g. Zellner (1970), Goldberger (1972), Joreskog and Goldberger (1975) and Aigner and
Goldberger (1977).
2. E.g. Robinson and Ferrara (1977), Lee (1979), Van de Ven and Van der Gaag (1982),
Wolfe and Van der Gaag (1981) and Hooijmans and Van de Ven (1982).
3. Multiple Indicator Multiple Causes (Joreskog and Goldberger, 1975).
4. Equation (1a) may be considered as the reduced form of a more complicated, simultaneous
model. Here we will only be concerned with the parameters $\pi_{kt}$ and not with the
underlying structural parameters.
5. F= 1.8 C + 32.
6. $\eta_3 = \gamma_3 \xi_3$ in their model should be substituted in the $\eta_1$- and $\eta_4$-equation; $\eta_1$ and $\eta_4$
should be substituted in the $\eta_6$-equation. The correlation of the reduced form disturbance
terms as a result of the latter substitution does not affect the discussion about the
interpretation of the MIMIC-HSI in the previous section.
7. Note: the MIMIC-health care model on which these results are based, is linear in the
logarithms of the variables. So we are dealing with the logarithm of the MIMIC-HSI.

Aigner, D. J. and Goldberger, A. S. (eds.) (1977), Latent Variables in Socio-Economic
Models, North Holland.
Berg, R. L. (1973), Health Status Indexes: Proceedings of a Conference Conducted by Health
Services Research, Tucson, Arizona, Chicago Hospital Research and Educational Trust.
Goldberger, A. S. (1972), 'Maximum-likelihood Estimation of Regressions Containing Un-
observable Independent Variables', International Economic Review 13, 1-15.
Grossman, M. (1972), The Demand for Health: A Theoretical and Empirical Investigation,
Columbia University Press, New York.
Hooijmans, E. M. and van de Ven, W. P. M. M. (1982), 'Implementing a Health Status Index
in a Structural Health Care Model', In J. van der Gaag, B. Neenan and T. Tsukahara (eds.),
Economics of Health Care, Praeger, pp. 302-330.
Joreskog, K. G. and Goldberger, A. S. (1975), 'Estimation of a Model with Multiple Indicators
and Multiple Causes of a Single Latent Variable', Journal of the American Statistical
Association 70, 631-639.
Lee, Lung-Fei (1982), 'Health and Wage: A Simultaneous Equation Model with Multiple
Discrete Indicators', International Economic Review 23, 199-221.
Robinson, P. M. and Ferrara, M. C. (1977), 'The Estimation of a Model for an Unobservable
Variable with Endogenous Causes', In D. J. Aigner and A. S. Goldberger (eds.), Latent
Variables in Socio-economic Models, North-Holland Publ. Co., Amsterdam, pp. 131-142.
Siegel, S. (1956), Nonparametric Statistics for the Behavioral Sciences, McGraw Hill.
Van de Ven, W. P. M. M. and van der Gaag, J. (1982), 'Health as an Unobservable: A
MIMIC-model for Health-care Demand', Journal of Health Economics 1, 157-183.
Wolfe, B. and van der Gaag, J. (1981), 'A New Health Status Index for Children', In J. van
der Gaag and M. Perlman (eds.), Health, Economics and Health Economics, North-Holland
Publ. Co., pp. 283-304.
Zellner, A. (1970), 'Estimation of Regression Relationships Containing Unobservable In-
dependent Variables', International Economic Review 11, 441-454.
Estimating demand for medical care:
health as a critical factor for adults and children*


The World Bank
University of Wisconsin-Madison

1. Introduction

In an era when resources are increasingly directed toward medical care,
understanding the factors that influence demand takes on greater importance.
Most work in this area (see, for example, Newhouse and Phelps, 1974;
Hyman, 1971; Rosett and Huang, 1973; Newhouse, 1981) is directed to
understanding the role of personal income and health insurance on demand,
with emphasis on the dimensions of insurance. The recent Health Insurance
Study conducted by the Rand Corporation focused on measuring responsive-
ness to various coinsurance rates - the partial payment by the consumer.
Other recent work has addressed the value of time, length of wait, and
demand for medical care. Equity issues are implicitly or explicitly raised by
many of these studies. For example, the Health Insurance Study suggests that
persons with low incomes decrease their medical care usage more than
higher-income individuals when coinsurance is imposed.
Most medical care is therapeutic rather than preventive; 1 that is, it is for
purposes of treating acute and chronic illness. People seek care when they
have a health problem. Thus, in the analysis of the demand for medical care,
health plays a critical role. It must be included in the demand analysis if we
are to get unbiased measures of the role of other factors such as income,
insurance, waiting times, and so forth.2

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 31-58.
© 1991 Kluwer Academic Publishers.
J. van der Gaag and Barbara L. Wolfe

The problem is that health is difficult to measure. Generally, one or a
number of partial measures - days ill, self-assessed rating of excellent, good,
fair, or poor health, functional limitations - is available, but each of these
captures only a part of health status and may be influenced by an individual's
own expectations. This last characteristic is particularly true for the most
commonly used measure, the self-assessment of overall health (see Manning
et al., 1981, p. 45, for a review). For example, a handicapped individual may
be doing well at the time of the survey and respond that he is in 'excellent' or
'good' health, yet, from society's point of view, that person might be
considered to be in only fair or poor health. It is probably fair to say that the
search for an 'ideal' health measure is hopeless unless, perhaps, we specify
the purpose of this measure in advance. In this paper, we search for a
measure of health status suitable to be included in health care demand
analysis. An inaccurate measure of health is likely to lead to bias in demand
The main problem stems from the fact that health itself is influenced by
many of the same factors that influence the demand for medical care.
Examples of these factors include education (see Edwards and Grossman,
1980; Shakotko, 1980) and income (see Grossman, 1972). Thus, if education
- or any other factor exerting these influences - were included in the
demand analysis, but health itself were not, the coefficient on education
would include the influences of education on health, and health on demand,
not simply education on demand. Thus, finding an index (or indexes) to
appropriately measure health may be critical to improving analysis of the
demand for medical care. It is, however, possible that the omission of health
from demand analysis limits us to measuring only the gross effects of
variables that influence both demand and health - education and income,
for example - but does not create omitted variable bias for the other
variables. In this case, demand analysis could proceed without the need for
extensive data on health. Analysts would need to realize they are measuring
gross effects in certain cases. Omission of health may, however, result in
biased estimates if there is correlation between the omitted term and the
additional independent variables.
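
The distinction between gross and direct effects can be illustrated with simulated data. The sketch below is hypothetical throughout: the coefficients are invented, and health is treated as directly observed, which is exactly what is not available in practice. Education raises health, and health lowers demand, so a regression of demand on education alone recovers the gross effect rather than the direct one.

```python
import random


def ols_one(y, x):
    """Slope from regressing y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def ols_two(y, x1, x2):
    """Slopes from regressing y on a constant, x1 and x2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [a - m1 for a in x1]
    d2 = [a - m2 for a in x2]
    dy = [a - my for a in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(0)
n = 20000
# Hypothetical data-generating process (all coefficients are assumptions).
education = [rng.gauss(0.0, 1.0) for _ in range(n)]
health = [0.5 * e + rng.gauss(0.0, 1.0) for e in education]
demand = [0.2 * e - 1.0 * h + rng.gauss(0.0, 1.0)
          for e, h in zip(education, health)]

gross_effect = ols_one(demand, education)               # health omitted
direct_effect, health_effect = ols_two(demand, education, health)

print(f"education, health omitted:  {gross_effect:+.3f}")
print(f"education, health included: {direct_effect:+.3f}")
print(f"health:                     {health_effect:+.3f}")
```

In this design the coefficient on education with health omitted is close to 0.2 + (-1.0)(0.5) = -0.3, the sum of the direct effect and the effect running through health, while including health recovers a value close to the direct effect of 0.2.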
In this paper we investigate the importance of health as a determinant of
the demand for medical care; the influence of 'demand-related factors' on
health, and the importance of including accurate measure(s) of health in the
demand analysis. We do so separately for children and adults - in part
because preventive care may play a larger role in medical care demanded for
children, and in part because for children the demand is parent-initiated
rather than self-initiated.
We begin with a large set of health measures or indicators, some of which
may be relevant in the demand for medical care. In Section 2 we explore
these measures and attempt to combine them through the use of principal
component analyses. In Section 3 we explore the factors that influence
health, as measured both by the health factors from the principal component
analyses and the separate health measures. In Section 4 we present our
demand analysis and explore the differential results as we change the health
variables included and alternatively omit health from the analyses. Through-
out these sections we also explore the role of attitude on demand for health
care - attitude toward quality, convenience, and cost of medical care.
Finally, in Section 5 we bring these separate explorations together and
extend our analysis to a structural model of the demand for medical care. In
this, health is treated as a latent variable, the health measures serve as
indicators, and we have equations explaining health and utilization all as part
of a Multiple Indicators-Multiple Causes (MIMIC) model. We present our
conclusions in Section 6.
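As a rough orientation, a generic MIMIC structure of the kind described above can be sketched as follows. All symbols here are assumptions made for illustration; the specification actually estimated in Section 5 may differ.

```latex
% Assumed notation: H^* = latent health, x = causes of health,
% y_j = observed health indicators, V = utilization, z = demand variables.
\begin{align}
  H^{*} &= x'\alpha + u
        && \text{(causes of latent health)} \\
  y_j   &= \lambda_j H^{*} + e_j, \quad j = 1, \dots, J
        && \text{(health indicators)} \\
  V     &= z'\beta + \gamma H^{*} + \varepsilon
        && \text{(utilization)}
\end{align}
```

In this sketch the indicators $y_j$ would correspond to measures such as HSTAT or DAYS BED, and $\gamma$ captures the effect of latent health on utilization.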
Estimating demand for medical care 33

2. Principal component analyses of health and attitude variables

The data used in this study were collected in 1975 by the Rochester
Community Child Health Survey, which interviewed a 1% sample of families
with children under 18 years of age in Monroe County (where Rochester, New
York, is situated). Observations on 972 adults and 1191
children aged 1-18 within 514 households are used in this work. The data
are rich in health information, both in terms of health status and medical care
utilization, and in information regarding the households' attitudes towards
the seeking of professional care. To the data we have added provider availa-
bility, by matching resident location to physician location and calculating the
travel distance to hospitals and health centers. For more detail on the data
see Wolfe (1980).

Health factors

As stated earlier, the data contain many health measures. For everyone in the
sample, we observe a subjective evaluation of health (HSTAT) given by the
respondent, the presence of a handicap (HCAP), whether the individual's
activity is limited in any way (LIM), and over the past year the number of
days ill (DAYS ILL), the number of days in bed (DAYS BED), and whether
or not the family member has been ill (ILL). Table 1 presents the relative
frequency distribution of these measures for adults.
We see that although almost half the sample is rated as having excellent
health, nearly 60% have been ill during the year. Persons report 4.49 days in
bed on average, a figure slightly below that reported in the Health Interview
Survey (HIS) of the civilian noninstitutionalized population of the United
States for 1980. More days ill than days in bed are reported, as expected.
(No comparable data are available in the HIS.) The percentage who report
handicaps is similar to the percentage reporting 'with limitation in major
activity' (5.2%) in the HIS, while the percentage reporting LIM is similar to
the percentage reporting 'with activity limitations' (8.6%) in the HIS.3
While it is clear from these data that most adults are relatively healthy, it is
not an easy matter to decide which health variables best represent the health
status of the adults. Some variables contain overlapping information (DAYS

Table 1. Proxy measures of health status for adults. (N = 972)

Relative frequency distribution of HSTAT     Percentage of adults with     Average number
    1          2        3        4
Excellent    Fair     Good     Poor          HCAP     LIM      ILL         DAYS ILL   DAYS BED

  48.56     40.33     9.16     1.95          5.04     8.44     59.88         8.63       4.49

34 J. van der Gaag and Barbara L. Wolfe

BED, DAYS ILL) while other variables seem to convey conflicting informa-
tion (HSTAT, ILL).
For children, the picture is even more complicated. First we observe the
same health information as for adults. Table 2 presents the relative frequency
distribution of these health measures for children. Again, over half the
sample is reported to be in excellent health, yet over 75% have been ill
during the year. In terms of comparisons to national statistics, these children
have fewer days ill and days in bed than those reported in HIS. The handicap
percentages are closer: 2.0 for HIS, 2.1 for this sample; 3.8 for activity
limitation for HIS, 5.45 for this sample.
In Table 3 we display the incidence of seventeen specific health distortions
for children. Taken one by one, the data merely provide frequencies; e.g.,
nearly a quarter of the sample have allergies other than asthma or hay fever.
It is likely, however, that there is overlap; we can expect that the information
on breathing problems (24.22%) at least partly contains the same informa-
tion as that on certain allergies.
The main purpose of this paper is to assess which aspects of health are
relevant to health care utilization. This task would be greatly simplified if the
large number of health measures available could be reduced to a smaller set
of independent variables. We construct such a set of health factors by
calculating the principal components of the correlation matrices of the health
measures. Tables 4 and 5 show the rotated factor matrices for the health
measures of adults and of children.4 In Table 4 we see that the five variables
for adults reduce to two independent components: factor 1, with large
loadings on the handicap measures (HANDICAP), and factor 2, with large

Table 2. Proxy measures of health status for children. (N = 1191)

Relative frequency distribution of HSTAT     Percentage of children with   Average number
    1          2        3        4           HCAP     LIM      ILL         DAYS ILL   DAYS BED

  60.03     34.34     4.79     0.84          2.10     5.45     77.29         5.63       2.67

Table 3. The prevalence of seventeen health distortions in the subsample of children
(percentage). (N = 1191)

Asthma            5.30     Seeing             2.10     Diabetes      0.17
Hay fever         8.92     Speaking           7.15     Behavior      7.74
Other allergy    23.97     Arthritis          0.34     Learning      8.07
Kidneys           1.77     Bronchitis         8.75     Breathing    24.22
Heart             4.71     Epilepsy           1.93     Nose         40.96
Hearing           4.96     Cerebral palsy     0.25

Table 4. Rotated factor matrix for adult health measures (Varimax Rotation).a

              Factor 1    Factor 2

HCAP            0.958       0.032
LIM             0.959       0.013
ILL            -0.046       0.512
DAYS ILL        0.066       0.885
DAYS BED        0.074       0.864

a Only factors with an eigenvalue exceeding 1.00 are shown.

Table 5. Rotated factor matrix for children's health measures (Varimax Rotation).a

              Factor 1    Factor 2    Factor 3    Factor 4

HCAP            0.919       0.021       0.000       0.078
LIM             0.910       0.017       0.011       0.112
ILL             0.019      -0.039       0.634      -0.002
DAYS ILL        0.005       0.028       0.788       0.027
DAYS BED       -0.005       0.076       0.813       0.018
ASTHMA         -0.003       0.520      -0.063       0.061
HAY FEVER       0.063       0.700      -0.045      -0.027
OTHALLGY        0.031       0.593       0.151      -0.036
KIDNEYS        -0.014      -0.033       0.035      -0.076
HEART          -0.023      -0.064      -0.036       0.151
HEARING         0.045       0.043      -0.018       0.142
SEEING          0.066      -0.032       0.036       0.061
SPEAKING        0.058      -0.013       0.065       0.500
ARTH            0.098       0.004      -0.034      -0.111
BRONCH         -0.074       0.256       0.186      -0.004
EPILEPSY        0.190      -0.036      -0.016       0.504
CERPALSY        0.585       0.072       0.014       0.134
DIABETES        0.022      -0.045      -0.003      -0.050
BEHAVIOR       -0.026       0.105      -0.045       0.629
LEARNING        0.100      -0.062       0.047       0.688
BREATHING       0.002       0.534       0.070       0.059
NOSE            0.009       0.724      -0.030       0.041

a There were eight factors with an eigenvalue exceeding 1.00. Only the first four are shown.

loadings on those measures usually related to acute illnesses (ACUTE). The
total variance explained by both factors is 73%.
In Table 5, for children, we find approximately the same two factors as for
adults. Factor 1 correlates highly with handicap measures (including cerebral
palsy) and factor 3 correlates highly with the acute illness proxies (ILL,
DAYS ILL, DAYS BED). The other two factors presented in Table 5 are
also easily interpretable. Factor 2 scores high on all measures of respiratory
diseases (RESPIRATORY), while factor 4 relates to diseases with a large
behavioral content (BEHAVIOR). The four factors account for almost 35% of
the total variation of the 22 original health measures.
We will use the two factors obtained for adults and the four factors for
children in the analyses that follow, and interpret them as suggested above.
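The procedure described in this section (principal components of the correlation matrix, retention of components with eigenvalue above 1.00, Varimax rotation) can be illustrated with a minimal sketch. The data below are synthetic stand-ins generated from two hypothetical latent traits; the function names and the simulated variables are assumptions made for illustration and do not reproduce the survey data or the authors' software.

```python
import numpy as np


def principal_component_loadings(data, min_eigenvalue=1.0):
    """Loadings of the principal components of the correlation matrix.

    Only components whose eigenvalue exceeds `min_eigenvalue` are kept.
    """
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # ascending order
    order = np.argsort(eigenvalues)[::-1]             # largest first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    keep = eigenvalues > min_eigenvalue
    loadings = eigenvectors[:, keep] * np.sqrt(eigenvalues[keep])
    explained_share = eigenvalues[keep].sum() / eigenvalues.sum()
    return loadings, explained_share


def varimax(loadings, max_iter=200, tol=1e-8):
    """Orthogonal Varimax rotation of a (variables x factors) loading matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = (rotated ** 2).sum(axis=0)
        target = rotated ** 3 - rotated * column_ss / n_vars
        u, singular_values, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        current = singular_values.sum()
        if previous > 0 and current <= previous * (1 + tol):
            break
        previous = current
    return loadings @ rotation


# Synthetic example: two hypothetical latent traits drive five measures.
rng = np.random.default_rng(0)
n = 1000
chronic = rng.normal(size=n)
acute = rng.normal(size=n)
data = np.column_stack([
    chronic + 0.5 * rng.normal(size=n),  # stands in for HCAP
    chronic + 0.5 * rng.normal(size=n),  # stands in for LIM
    acute + 0.5 * rng.normal(size=n),    # stands in for ILL
    acute + 0.5 * rng.normal(size=n),    # stands in for DAYS ILL
    acute + 0.5 * rng.normal(size=n),    # stands in for DAYS BED
])

loadings, explained_share = principal_component_loadings(data)
rotated = varimax(loadings)
print(np.round(rotated, 3))
print("share of total variance explained:", round(explained_share, 3))
```

Because the rotation is orthogonal, it changes how the retained variance is distributed over the factors but not the share of total variance that the retained factors explain.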

Attitude or taste factors

A large number of measures of attitudes toward seeking medical care are
available in the data. The attitude part of the survey includes questions on
the importance of having guaranteed access to a doctor (Guaranteed Access)
and the importance of having convenient office hours (Convenient Hours).
The replies take on values from 1 = very important to 3 = not important.
Another set of questions relates to the attention received while seeing a
doctor, including: does the doctor spend enough time with you (MD Time)?
The responses range from 1 = not enough time to 3 = enough time. Finally,
we have questions rating the health care received (Quality of Care). The
responses range from 1 = excellent to 4 = poor. Table 6 presents the relative
frequencies of these attitude variables.
Again, there is so much information that it is nearly impossible to
characterize attitudes. Guaranteed access to a doctor seems very important,
but comprehensive services do not. People believe M.D.s do not listen
enough but do give enough time, and so on. Many of these measures no
doubt overlap and represent the same underlying concerns. In order to gain
insight into these concerns, we calculated the principal components of this
set of attitude variables. The results are presented in Table 7. We see that the
variables reduce to three independent components. Factor 1 has high
loadings on variables related to M.D. or medical attention and on other
variables generally related to the quality of care. Factor 2 has heavy loadings
on cost variables, with an emphasis on the cost of time. Factor 3 stresses
convenience, showing heavy loadings on a convenient location, one M.D. for
the family, and comprehensive services. In the analysis to follow we use these three
generally interpretable attitude factors to represent family tastes toward
medical care.
We thus have assembled a unique data set which includes socioeconomic,
individual, and family characteristics, data on health care utilization, matched
availability data, and constructed independent factors to measure health
characteristics and attitudes.

3. What factors affect health and attitude?

As pointed out in the Introduction, it is likely that some or all of the health
Table 6. Average values of attitude variables for families.

Importance of (values: 1 = very important to 3 = not important)
   1. Guaranteed access         1.08
   2. Convenient hours          1.47
   3. Convenient location       1.70
   4. Recommended by friend     2.25
   5. 24 hr emerg. care         1.23
   6. Comprehensive services    1.92
   7. Reasonable fees           1.38
   8. Fast appointments         1.35
   9. Short office wait         1.56
  10. Friendliness of staff     1.47
  11. Type of patients          2.81
  12. All see 1 MD              2.07

Medical attention (values: 1 = not enough to 3 = enough)
  13. MD careful                1.16
  14. MD concerned              1.34
  15. MD listens                1.16
  16. MD time                   2.78
  17. MD info                   2.54

Quality of care (values: 1 = excellent to 4 = poor)
  18. Quality of care           1.43
  19. Satisfied                 1.13
  20. Relative care             1.50
  21. Find MD                   2.38

Table 7. Rotated factor matrix for attitude variables (Varimax Rotation).a

Variable                   Factor 1    Factor 2    Factor 3
                           (quality)   (cost)      (convenience)

Guaranteed access           -0.096       0.071      -0.208
Convenient hours             0.103       0.516       0.079
Convenient location         -0.033       0.329       0.443
Recommended by friend        0.033       0.035       0.118
24 hr emerg. care            0.051       0.039       0.214
Comprehensive services      -0.018       0.210       0.658
Reasonable fees             -0.007       0.579       0.268
Fast appointments           -0.029       0.608      -0.016
Short office wait           -0.018       0.756       0.011
Friendliness of staff        0.150       0.542       0.022
Type of patients            -0.129       0.101      -0.018
All see 1 MD                -0.045      -0.033       0.739
MD careful                   0.698       0.062       0.013
MD concerned                 0.744       0.099       0.077
MD listens                   0.785      -0.031       0.031
MD time                     -0.702       0.089      -0.025
MD info                     -0.652      -0.062       0.074
Quality of care              0.722      -0.019      -0.159
Satisfied                    0.579       0.039      -0.032
Relative care                0.187      -0.011      -0.179
Find MD                      0.068       0.006       0.058

a There are six factors with eigenvalues exceeding 1.00. Only 3, those easily interpretable, are
shown.

and attitude measures are systematically related to a number of socioeconomic
variables that enter the demand equations. Thus, if the demand
equations are estimated without the health and attitude variables, some of the
coefficients obtained are likely to be biased. On the other hand, if the health
measures are included, the coefficients of the socioeconomic variables show
only partial effects on health care utilization, and should be interpreted as
such.
The magnitude of this potential problem is an empirical question that
often is not addressed due to lack of data. In this section, we will assess to
what extent health (H) and attitude (T) variables are systematically related to
various socioeconomic variables. We will estimate equations of the form
H = H (individual characteristics, family characteristics)
T = T (family characteristics)
where H represents a health measure and T represents a taste factor.
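Equations of this form can be estimated by ordinary least squares, with t-statistics reported in parentheses as in the tables that follow. The snippet below is a minimal sketch on simulated data, assuming a linear specification; the function name, the variables, and the coefficient values used to generate the data are hypothetical and are not estimates from the survey.

```python
import numpy as np


def ols_with_t_statistics(y, regressors, names):
    """OLS with an intercept; returns {name: (coefficient, |t|)} and R-squared."""
    X = np.column_stack([np.ones(len(y)), regressors])
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    dof = X.shape[0] - X.shape[1]
    sigma2 = residuals @ residuals / dof
    std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    centered = y - y.mean()
    r_squared = 1.0 - (residuals @ residuals) / (centered @ centered)
    labels = ["CONSTANT"] + list(names)
    table = {
        label: (float(b), float(abs(b / s)))
        for label, b, s in zip(labels, coef, std_err)
    }
    return table, float(r_squared)


# Simulated data: a hypothetical health score that depends on age and education.
rng = np.random.default_rng(42)
n = 1000
age = rng.uniform(20, 60, size=n)
education = rng.uniform(8, 18, size=n)
health_score = 0.5 + 0.02 * age - 0.03 * education + rng.normal(size=n)

table, r_squared = ols_with_t_statistics(
    health_score, np.column_stack([age, education]), ["AGE", "EDUCATION"]
)
for name, (b, t) in table.items():
    print(f"{name:<10} {b:8.3f} ({t:.2f})")
print(f"R2 {r_squared:.3f}")
```

In a setup like this one, a coefficient can be clearly significant even though the regression explains only a small share of the variance, which is the pattern reported for several of the health equations.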
In Table 8 we present the health equations for adults. As health measures
we use both the two health factors and the six separate health variables.
As explanatory variables we include individual characteristics: age, sex
Table 8. Determinants of adults' health.

             Factor 1        Factor 2        HSTAT           HCAP            LIM             DAYS ILL        DAYS BED        ILL
                                             (excellent = 1, ...)

AGE          0.011 (2.79)   -0.003 (0.71)    0.012 (4.01)    0.002 (2.08)    0.005 (3.32)    0.009 (0.08)   -0.018 (0.29)   -0.003 (1.33)
FEMALE      -0.179 (2.79)    0.156 (1.56)    0.004 (0.06)   -0.028 (1.18)   -0.088 (2.19)    2.29 (0.84)     1.51 (0.91)     0.111 (1.81)
EDUCATION   -0.027 (1.83)    0.015 (1.06)   -0.014 (1.24)   -0.005 (1.47)   -0.011 (1.92)    0.223 (0.56)    0.018 (0.08)    0.016 (1.83)
FAMSIZE     -0.030 (1.12)   -0.001 (0.03)    0.007 (0.35)   -0.004 (0.71)   -0.017 (1.63)    0.524 (0.73)   -0.222 (0.51)   -0.011 (0.67)
NONWHITE    -0.077 (0.62)   -0.082 (0.66)    0.209 (2.13)   -0.030 (1.02)   -0.013 (0.26)   -2.47 (0.72)     0.949 (0.46)   -0.113 (1.49)
MARRIED     -0.217 (1.41)    0.112 (0.73)   -0.205 (1.69)   -0.032 (0.88)   -0.105 (1.70)   -1.66 (0.39)    -1.73 (0.68)     0.200 (2.13)
WORKFULL    -0.231 (1.80)    0.079 (0.62)    0.019 (0.19)   -0.041 (1.33)   -0.110 (2.13)   -1.94 (0.55)     2.11 (0.99)     0.080 (1.02)
WORKPART    -0.128 (0.85)    0.037 (0.24)    0.005 (0.04)   -0.031 (0.86)   -0.048 (0.80)   -2.01 (0.48)     2.81 (1.12)    -0.013 (0.14)
OCCUPATION   0.001 (0.37)   -0.001 (0.33)   -0.002 (1.12)    0.000 (0.39)    0.000 (0.33)   -0.007 (0.10)   -0.026 (0.62)    0.000 (0.03)
FAMINC      -0.006 (1.10)   -0.012 (2.05)   -0.010 (2.12)   -0.001 (0.97)   -0.003 (1.34)   -0.373 (2.34)   -0.116 (1.21)   -0.004 (1.15)
MEDINC      -0.025 (1.92)    0.016 (1.26)   -0.021 (2.04)   -0.006 (2.05)   -0.009 (1.66)    0.349 (0.99)    0.343 (1.62)   -0.001 (0.14)
CONSTANT     0.886 (2.84)   -0.366 (1.18)    1.97 (8.03)     0.228 (3.08)    0.464 (3.72)    1.83 (0.21)     2.42 (0.47)     0.216 (1.14)

R2           0.051           0.016           0.103           0.016           0.063           0.017           0.012           0.027

N = 755.
t-statistics in parentheses.

(FEMALE), race (NONWHITE), education, employment status (WORKPART,
WORKFULL), and occupation (as measured by a commonly used
occupation status scale, the Bogue Scale). Family characteristics include
marital status (MARRIED), family size (FAMSIZE), family income (FAMINC)
and the median income of the census tract where the family lives (MEDINC).
The last variable may be viewed as a better proxy for economic status
('permanent income') than annual family income. (The means and standard
deviations for these variables are in Appendix A.)
Though the R2 for the HANDICAP factor is low, 0.036, we find significant
coefficients for AGE, FEMALE, EDUCATION, WORKFULL, and
MEDINC. Note that the effects of WORKFULL and MEDINC may be a
case of reversed causation. We do not interpret the equations presented in
Table 8 as 'health production functions.' We merely assess the extent of
systematic relationships among health measures and socioeconomic variables.
With respect to the HANDICAP factor, these relationships seem to be of
some importance. But with respect to the ACUTE factor, we find only one
significant coefficient: adults in higher income families show a lower score
(are 'healthier') on the ACUTE factor.
The equation explaining HSTAT shows that older individuals judge
themselves to be in relatively poorer health. So do nonwhites. Married adults
and adults living in higher income families, on the other hand, give themselves
high health scores (i.e., low scores on HSTAT). The five final columns
of Table 8 show slight effects of socioeconomic variables on the separate
health measures, but the overall picture is mixed. For instance, MEDINC
shows a negative coefficient for HCAP and LIM and is not significant for
DAYS ILL and ILL, but shows a significant positive coefficient for DAYS
BED. All socioeconomic variables included (except occupation), however,
show an impact on one or more of the health measures for adults.
In our analysis for children, the variables included in the regression
explaining the health measures are similar to those for adults except that
more variables are now family variables. MARRIED refers to the marital
status of the head of the household.
The employment and occupation variables are included for both parents.
A few individual variables are also added: a dummy variable which indicates
if the child was born while the mother was less than 20 years old (LMAGE)
and birth order (BIRTHORD).
Table 9 presents the estimation results for the four health factors and for
HSTAT. We also regressed all individual health measures against the
socioeconomic variables, but the estimates did not yield any additional
information. We therefore do not present these results.
From the estimation results presented in Table 9 we can make some
general observations. In addition to the age, sex, and birth order of the child,
the variables for family income, family size, and mother's education seem to
be related to one or more of the five health measures. But the direction of
their effect depends on the particular measure employed. Mother's education,
for instance, shows a negative effect on the HANDICAP factor, a
positive effect on the RESPIRATORY and ACUTE factors, and a negative
effect on the BEHAVIOR factor. No significant relationship between
HSTAT and MEDUC is found. Thus, general conclusions like 'mother's
education has a positive effect on children's health' cannot be drawn from
this analysis. The important point is that when health measures are used in
the analysis for the demand for health care, one should be aware that these
measures are related to the socioeconomic variables that are themselves
included as explanatory variables in the demand analyses. Moreover, since
some socioeconomic variables usually employed in demand analysis do have
a positive (or negative) effect on some health measures, the estimation results
of the demand analysis may depend on the choice of the health measure.

Table 9. Determinants of children's health.

             Factor 1        Factor 2        Factor 3        Factor 4        HSTAT

AGE          0.001 (0.07)    0.006 (0.81)   -0.010 (1.19)   -0.003 (0.29)   -0.011 (2.22)
FEMALE      -0.076 (1.14)   -0.057 (1.00)   -0.024 (0.38)   -0.287 (4.42)    0.013 (0.35)
FAMINC       0.015 (2.09)   -0.003 (0.44)   -0.004 (0.51)    0.002 (0.34)   -0.005 (1.35)
MEDINC       0.012 (0.86)   -0.003 (0.27)    0.012 (0.89)   -0.002 (0.15)   -0.019 (2.46)
BIRTHORD     0.006 (0.13)   -0.082 (2.19)   -0.007 (0.18)   -0.086 (2.01)   -0.011 (0.48)
LMAGE       -0.100 (0.55)   -0.179 (1.15)   -0.16 (0.09)    -0.067 (0.38)   -0.017 (0.18)
FAMSIZE      0.018 (0.51)    0.025 (0.82)   -0.092 (2.67)    0.035 (1.01)   -0.003 (0.15)
NONWHITE     0.060 (0.48)   -0.029 (0.27)   -0.143 (1.18)   -0.150 (1.22)    0.041 (0.60)
MARRIED     -0.525 (2.50)    0.040 (0.22)   -0.065 (0.32)    0.190 (0.92)   -0.003 (0.03)
FFULL        0.295 (2.07)    0.022 (0.18)    0.001 (0.01)   -0.131 (0.94)    0.082 (1.07)
FPART        0.208 (0.53)    0.633 (1.88)   -0.461 (1.22)   -0.444 (1.15)    0.109 (0.51)
FOCC         0.005 (1.40)    0.000 (0.14)   -0.003 (0.81)   -0.007 (2.12)   -0.002 (1.08)
MFULL        0.113 (0.71)    0.185 (1.34)   -0.038 (0.25)   -0.195 (1.25)    0.228 (2.64)
MPART        0.039 (0.25)    0.062 (0.46)    0.049 (0.32)   -0.073 (0.47)    0.223 (2.61)
MOCC        -0.002 (0.70)   -0.001 (0.58)   -0.000 (0.07)    0.002 (0.85)   -0.003 (2.10)
MEDUC       -0.046 (2.88)    0.027 (1.91)    0.033 (2.15)   -0.026 (1.67)   -0.005 (0.53)
CONSTANT     0.110 (0.40)   -0.405 (1.72)    0.286 (1.08)    0.832 (3.10)    1.93 (13.05)
R2           0.036           0.026           0.038           0.046           0.059

N = 999.
t-statistics in parentheses.

In the equations analyzing the determinants of the attitude or taste factors,
we include a similar set of variables. For these, since the unit of observation
is the family, all variables are family variables. They include both parents'
labor force participation and occupation, age of the head, whether or not
they own the home in which they reside, race, marital status, family size, and
family and tract median income measures. As can be seen in Table 10, for
only one factor, CONVENIENCE, do these variables have much impact.
For this factor, family income, mother's education, and homeownership all
have positive effects, while race (being nonwhite) and age of head both have
negative effects.

Table 10. Equations 'explaining' taste factors (households as unit of observation).

                     Factor 1         Factor 2         Factor 3
                     (Quality)        (Cost)           (Convenience)

FAMINC (10,000's)    0.107 (1.11)     0.115 (1.20)     0.206 (2.28)
MEDINC (10,000's)   -0.013 (0.061)   -0.284 (1.48)     0.197 (1.09)
FAMSIZE              0.010 (0.274)   -0.008 (0.201)   -0.015 (0.420)
NONWHITE            -0.081 (0.448)   -0.312 (1.75)    -0.406 (2.41)
MARRIED             -0.232 (0.809)   -0.277 (0.977)   -0.327 (1.22)
FFULL                0.216 (1.09)     0.017 (0.089)   -0.068 (0.371)
FPART                0.203 (0.375)   -0.644 (1.21)    -0.738 (1.47)
FOCC                -0.002 (0.373)    0.005 (1.06)     0.003 (0.731)
MFULL               -0.179 (0.865)   -0.172 (0.845)    0.024 (0.126)
MPART               -0.107 (0.528)   -0.089 (0.445)    0.101 (0.532)
MOCC                 0.001 (0.182)   -0.001 (0.185)    0.001 (0.395)
MEDUC                0.009 (0.426)    0.026 (1.24)     0.040 (2.04)
AGE HEAD            -0.006 (1.05)    -0.008 (1.51)    -0.021 (4.05)
OWN HOME             0.027 (0.180)   -0.164 (1.09)     0.267 (1.88)
CONSTANT             0.054 (0.136)    0.447 (1.14)    -0.278 (0.751)
R2                   0.01             0.04             0.15

N = 514.

In order not to complicate the analysis too much, we will, in the next
section, always include these taste factors in the demand analysis. Thus, we
should bear in mind that if a significant impact of one of the taste factors
(especially factor 3) on utilization is found, the coefficients for income, race,
and mother's education show only partial effects, 'holding taste constant.'
In general, the analysis of health care utilization is hampered by the fact
that no generally acceptable unidimensional health measure exists. As shown
above, principal component analyses or factor analytical techniques can
successfully be employed to reduce the sometimes large number of corre-
lated measures into a smaller set of independent ones. But this approach is
quite mechanical and still does not yield one unidimensional measure.
It is probably fair to say that one unidimensional measure of health status,
representing all facets of health, and usable for a variety of purposes, simply
does not exist. However, in Section 5 we will show how a single, comprehen-
sive health measure can be obtained, once the purpose of that measure is
specified. But first we will present an analysis of the demand for medical
care, including taste factors and using the health factors derived in the
previous section as proxy measures for health. We will provide comparisons
with results obtained when the longer list of the health measures is used and
when the health measures are completely omitted.

4. Health care utilization

For our demand analysis, in addition to modeling the determinants of the
total number of provider visits, we distinguish four categories: visits
to emergency rooms (HOSPERVS), visits to hospital outpatient clinics
(HOSPOPVS), visits to health centers or clinics (HCORCLVS) and physi-
cian visits at office or home (OFFHMVS). The explanatory variables include
family variables such as income, race, marital status, attitude, insurance
coverage by type and family size. For adults they also include labor force
participation, age, sex, and health variables. For children they include age,
sex, family characteristics, and health variables. Finally, availability is
measured by the distance to the nearest hospital (HOSP), the distance to the
nearest HMO or non-HMO clinic (HMO, XHMO), and the number of
doctors per population (ALL indicates all physicians, for adults; GPPED
indicates general practitioners and pediatricians, for children).
Table 11 presents the health care utilization equations for adults. The health
variables included are the two health factors constructed in Section 2, plus
the subjective measure HSTAT. With respect to the health measures, we find
that a high score on the HANDICAP factor (FACT1) is only significant with
respect to hospital outpatient visits. The ACUTE health factor (FACT2)
shows a significant effect on all but one of the measures of health care
utilization. Visits to a health clinic or health center are the exception. The
subjective health evaluation measure, HSTAT, seems to be a strong predictor for
Table 11. Health care utilization equations for adults.

Constant     0.333 (2.23)b    0.440 (1.66)a    0.965 (2.70)b   -1.082 (1.52)     0.591 (0.72)
GT55        -0.014 (0.13)    -0.319 (1.53)     0.534 (1.92)a    0.270 (0.48)     0.410 (0.64)
FEMALE      -0.020 (0.55)    -0.011 (0.16)     0.033 (0.36)     0.995 (5.53)b    0.992 (4.70)
FAMSIZE     -0.011 (1.02)    -0.002 (0.12)    -0.038 (1.42)    -0.145 (2.70)b   -0.189 (3.09)
NONWHITE    -0.095 (1.86)b    0.199 (2.08)b    0.364 (2.80)b   -0.498 (1.94)b   -0.040 (0.13)
MARRIED      0.003 (0.04)    -0.200 (1.61)a    0.304 (1.82)a    0.187 (0.55)     0.251 (0.65)
FAMINC      -0.000 (0.05)    -0.005 (1.09)    -0.006 (1.12)     0.025 (2.22)b    0.014 (1.08)
MEDINC      -0.000 (0.09)     0.014 (1.43)    -0.006 (0.47)     0.059 (2.30)b    0.079 (2.61)
ATTIT1      -0.006 (0.30)    -0.072 (1.81)a   -0.009 (0.17)    -0.122 (1.12)    -0.204 (1.66)
ATTIT2      -0.021 (1.14)    -0.021 (0.59)    -0.077 (1.64)a   -0.019 (0.21)    -0.129 (1.20)
ATTIT3      -0.003 (0.20)    -0.050 (1.69)a   -0.049 (1.24)     0.060 (0.75)    -0.053 (0.59)
MCAID        0.173 (1.94)b    0.313 (1.88)a    0.907 (4.07)b   -0.152 (0.34)     1.231 (2.40)
PRIVINS     -0.209 (2.68)b   -0.067 (0.46)    -0.658 (3.39)b    0.475 (1.21)    -0.447 (1.01)
HMOINS      -0.026 (0.52)     0.059 (0.64)     0.870 (6.99)b   -0.128 (0.51)     0.785 (2.75)
WORKFULL     0.019 (0.50)    -0.126 (1.77)a   -0.030 (0.31)     0.013 (0.07)    -0.130 (0.60)
WORKPART    -0.046 (1.10)     0.015 (0.20)    -0.202 (1.93)a   -0.166 (0.79)    -0.399 (1.67)
HOSP        -0.000 (0.11)     0.003 (0.69)    -0.015 (1.03)
ALL          0.370 (0.09)    -0.468 (0.23)    -0.008 (0.30)
HMO         -0.003 (0.63)     0.013 (1.31)
XHMO        -0.004 (0.75)    -0.011 (0.84)
FACT1        0.001 (0.05)     0.185 (6.60)b    0.030 (0.80)     0.071 (0.94)     0.200 (3.37)
FACT2        0.069 (4.54)b    0.223 (7.90)b    0.052 (1.38)     0.757 (9.92)b    1.093 (12.63)
HSTAT        0.052 (2.64)b    0.008 (0.21)     0.121 (2.47)b    0.649 (6.57)b    0.845 (7.5)
R2           0.070            0.174            0.220            0.272            0.355

a Significant at 10% level.
b Significant at 5% level.

health care utilization, except for hospital outpatient care. All three measures
are important in explaining the total number of visits. Clearly the HSTAT
measure contains information that is not contained in the two more objective
health factors.
With respect to the other explanatory variables, we find only a few signifi-
cant coefficients for HOSPERVS. Individuals with Medicaid insurance have
more visits to a hospital emergency room than privately insured individuals.
There are also slight racial differences. The overall explanatory power of this
equation is low, R2 = 0.070.
For HOSPOPVS we find significant racial differences: nonwhites seek
care more often in a hospital outpatient clinic than whites. Individuals with
Medicaid coverage also show more visits to an outpatient clinic. Being
married and being employed full-time reduce the number of these visits. We
note, finally, that high scores on the 'Quality' and 'Convenience' (ATTIT1,
ATTIT3) factors show a negative impact on HOSPOPVS. Apparently this
type of health service is not held in high esteem by the quality conscious.
Our regression results explain 22% of the variation in HCORCLVS and
27% of the variation in OFFHMVS. Nonwhites with Medicaid coverage or
HMO insurance show a relatively high number of visits to health centers or
clinics. Whites from high income families and 'rich' neighborhoods, and with
private health insurance show more visits to the physician's private office.
These racial and income-related differences are less pronounced for the total
number of visits. The variables NONWHITE and FAMINC show no significant
effect, but median income in the neighborhood is positively related to
overall utilization. Adults with Medicaid coverage or HMO insurance show a
higher number of visits than do the privately insured. The total number of
visits of adults scoring high on ATTIT1 ('Quality') is slightly below average,
but the other attitude factors show no effect.
We also find two familiar results: women show higher utilization rates than
men, and individuals living in large families show a lower number of visits
than members of small families. We finally note that our availability measures
do not show any significant impact on utilization. The measurement errors
inherent in the way we constructed these variables might have caused this
result. Alternatively, the differences in availability within the relatively
small area from which we obtained the data may simply be too small for any
effect on utilization to be observed.
The above results appear to be somewhat sensitive to the use of alternative
variables 'to control for health.' Table 12 gives some selected regression
results for the case where no health variables are included (column 1), only
the two health factors (column 2), only HSTAT (column 3) and, finally, in
column 4, the two health factors plus HSTAT (as in Table 11). The regression
coefficients of the variables not included in the table appear to be
insensitive to the changes in health variables.
From Table 12 we learn that it does matter whether or not one controls
for differences in health status. For instance, no income effect and no signifi-

Table 12. Selected regression results for adults, using various health measures.

             (1)              (2)              (3)              (4)
             No health        2 health         HSTAT            2 health factors
             measures         factors                           + HSTAT

HOSPERVS
FAMINC      -0.001 (0.55)    -0.000 (0.12)    -0.001 (0.35)    -0.000 (0.05)
NONWHITE    -0.092 (1.76)a   -0.085 (1.66)a   -0.103 (2.00)b   -0.095 (1.86)a
ATTIT1       0.002 (0.10)    -0.001 (0.03)    -0.007 (0.32)    -0.006 (0.30)
ATTIT2      -0.024 (1.27)    -0.025 (1.33)    -0.020 (1.06)    -0.021 (1.14)
ATTIT3      -0.010 (0.64)    -0.009 (0.60)    -0.001 (0.07)    -0.003 (0.20)

HOSPOPVS
FAMINC      -0.009 (0.39)    -0.005 (1.10)    -0.008 (1.73)a   -0.005 (1.09)
NONWHITE     0.157 (1.54)     0.200 (2.10)b    0.138 (1.36)     0.199 (2.08)b
ATTIT1      -0.048 (1.14)    -0.071 (1.80)a   -0.063 (1.50)    -0.072 (1.81)a
ATTIT2      -0.016 (0.43)    -0.021 (0.60)    -0.010 (0.25)    -0.021 (0.59)
ATTIT3      -0.057 (1.82)a   -0.051 (1.75)a   -0.042 (1.34)    -0.050 (1.69)a

HCORCLVS
FAMINC      -0.008 (1.41)    -0.007 (1.19)    -0.007 (1.24)    -0.006 (1.12)
NONWHITE     0.377 (2.90)b    0.339 (2.99)b    0.352 (2.72)b    0.364 (2.80)b
ATTIT1       0.010 (0.19)     0.004 (0.07)    -0.008 (0.15)    -0.009 (0.17)
ATTIT2      -0.084 (1.78)a   -0.085 (1.81)a   -0.075 (1.60)    -0.077 (1.64)a
ATTIT3      -0.064 (1.63)a   -0.063 (1.59)    -0.048 (1.20)    -0.049 (1.24)

OFFHMVS
FAMINC       0.011 (0.92)     0.023 (1.97)b    0.017 (1.44)     0.025 (2.22)b
NONWHITE    -0.453 (1.58)    -0.373 (1.41)    -0.597 (2.19)b   -0.498 (2.30)b
ATTIT1      -0.015 (0.12)    -0.051 (0.46)    -0.126 (1.10)    -0.122 (1.12)
ATTIT2      -0.055 (0.52)    -0.059 (0.61)    -0.003 (0.03)    -0.019 (0.21)
ATTIT3      -0.030 (0.34)    -0.019 (0.23)     0.082 (0.97)     0.060 (0.75)

Total visits
FAMINC      -0.007 (0.44)     0.011 (0.82)     0.001 (0.10)     0.014 (1.08)
NONWHITE     0.012 (0.03)     0.132 (0.43)    -0.203 (0.62)    -0.040 (0.13)
ATTIT1      -0.049 (0.34)    -0.115 (0.91)    -0.201 (1.47)    -0.204 (1.66)a
ATTIT2      -0.174 (1.37)    -0.184 (1.65)a   -0.097 (0.82)    -0.129 (1.20)
ATTIT3      -0.165 (1.55)    -0.147 (1.58)    -0.022 (0.22)    -0.053 (0.59)

a Significant at 10% level.
b Significant at 5% level.

cant racial differences are measured for OFFHMVS if no health variables are
included, but both variables show a significant effect in column 4, when the
two health factors and HSTAT are added to the equation.6
The choice of the health variables is also relevant. If only HSTAT is
included, we find no significant racial differences for HOSPOPVS but a
significant income effect. If only the two health factors are included, we find
just the opposite. A similar type of reversal - although in the opposite
direction - appears for OFFHMVS.
The effect of the attitude variables is also sensitive to whether or not
health variables are included. The results suggest, not surprisingly, that one's
attitude toward seeking professional medical care is not independent of one's
health status.
The estimation results for children are presented in Table 13. The four
health factors are included in the regressions, together with HSTAT.
The HANDICAP factor (FACT1) does not show any significant impact
on utilization, while the BEHAVIOR factor (FACT4) shows a positive effect
on hospital outpatient visits only. The other two health factors show the
expected positive impact on utilization almost everywhere.
As was the case for adults, HSTAT seems to contain information about
the children's health that is not contained in the four health factors included
in the regression. With the exception of HOSPERVS, HSTAT is significantly
positively related to all forms of health care utilization.
A similar result was obtained for adults - i.e., HSTAT contains information
not included in the other health variables. Given the large amount of
other health information contained in the health factors included (especially
for children), these results are surprising. In fact, they cast serious doubt on
the use of HSTAT in health care utilization equations, unless HSTAT is
collected at the beginning of the period under investigation. Otherwise there
is the obvious danger that a score on the HSTAT scale is influenced by
previous health care utilization patterns. This seems to be the case here.
Manning et al. (1981) show the inconsistency of results obtained using a
'postdicted' HSTAT measure. Our results underscore this problem.
With respect to the other explanatory variables, Medicaid coverage is
again one of the variables that shows a significant effect on HOSPERVS. We
also observe slight age and sex differences, while children from intact households
(i.e., the mother is married) show slightly lower utilization rates. The
overall explanatory power is low: R2 = 0.053.
In addition to slight sex differences, we see a significant, and relatively
large, influence of NONWHITE on hospital outpatient visits. But again, we
are not very successful in explaining outpatient utilization differences: R2 =
0.063.
Children living in low-income neighborhoods, who are nonwhite, and who
have either Medicaid or HMO insurance show a relatively high number of
visits to a health center or clinic. High utilization of private offices is
observed for white children with neither HMO nor Medicaid insurance,
living in 'richer' neighborhoods. The private insurance variable is positive, as
expected, but not significant.7 The race and insurance-related differences are
not observed for total utilization, but children living in high-income neighborhoods
show a slightly higher overall utilization rate than their less well-off
counterparts.
Table 13. Health care utilization equations for children.

Dependent variables (columns, in order): HOSPERVS, HOSPOPVS, HCORCLVS, OFFHMVS, TOTAL

CONSTANT  0.111 (0.87)  -0.312 (1.44)  1.081 (3.70)b  -0.795 (1.52)  0.121 (0.20)
LT6  0.049 (1.52)  0.056 (1.00)  0.193 (2.50)b  0.824 (6.16)b  1.130 (7.24)b
12-17  0.063 (2.04)b  0.018 (0.33)  -0.051 (0.72)  0.150 (1.18)  0.184 (1.23)
FEMALE  -0.067 (2.61)b  0.072 (1.61)a  0.078 (1.31)  0.050 (0.47)  0.140 (1.12)
FAMSIZE  0.013 (1.26)  0.017 (0.96)  -0.046 (1.95)b  -0.135 (3.20)b  -0.149 (3.00)b
NONWHITE  0.042 (0.83)  0.370 (4.24)b  0.251 (2.10)b  -0.593 (2.82)b  0.087 (0.35)
MARRIED  -0.113 (2.05)b  0.016 (0.168)  0.154 (1.19)  0.217 (0.94)  0.236 (0.87)
FAMINC  -0.002 (0.68)  0.001 (0.13)  0.005 (0.83)  0.004 (0.36)  0.008 (0.66)
MEDINC  0.001 (0.18)  0.005 (0.57)  -0.028 (2.18)b  0.078 (3.44)b  0.059 (2.16)b
ATTIT1  -0.006 (0.24)  0.013 (0.33)  0.065 (1.22)  -0.210 (2.18)b  -0.128 (1.14)
ATTIT2  -0.021 (1.06)  0.016 (0.47)  -0.085 (1.85)a  0.082 (0.99)  -0.011 (0.12)
ATTIT3  0.005 (0.29)  0.004 (0.13)  -0.084 (2.07)b  0.257 (3.59)b  0.172 (2.04)b
MCAID  0.157 (1.96)b  0.121 (0.88)  0.405 (2.18)b  -0.438 (1.32)  0.265 (0.68)
PRIVINS  0.126 (1.87)a  -0.033 (0.28)  -0.401 (2.57)b  0.132 (0.47)  -0.157 (0.48)
HMOINS  -0.066 (1.34)  -0.054 (0.63)  0.947 (8.30)b  -0.556 (2.72)b  0.257 (1.08)
MFULL  0.013 (0.36)  -0.047 (0.76)  -0.075 (0.91)  -0.088 (0.59)  -0.198 (1.14)
MPART  -0.014 (0.43)  0.088 (1.51)  -0.170 (2.17)b  0.403 (2.89)b  0.301 (1.85)a
HOSP  -0.003 (1.45)  -0.004 (1.02)  -0.022 (1.70)a
GPPED  0.157 (0.91)  -1.81 (0.03)  -0.463 (0.55)
XHMO  0.000 (0.06)  0.001 (0.04)
FACT1  -0.015 (1.19)  0.017 (0.82)  0.036 (1.27)  0.011 (0.21)  0.051 (0.85)
FACT2  0.036 (2.49)b  -0.036 (1.45)  0.117 (3.50)b  0.187 (3.14)b  0.305 (4.40)b
FACT3  0.063 (4.80)b  0.087 (3.88)b  0.106 (3.51)b  0.477 (8.80)b  0.738 (11.67)b
FACT4  -0.006 (0.50)  0.052 (2.39)b  0.038 (1.27)  -0.039 (0.73)  0.045 (0.73)
HSTAT  0.012 (0.50)  0.119 (2.89)b  0.180 (3.23)b  0.558 (5.59)b  0.873 (7.50)b
R2  0.053  0.063  0.221  0.275  0.280

a Significant at 10% level.

b Significant at 5% level.
Children from families who score high on the Convenience scale
(ATTIT3) show fewer visits to a health center, but see the private physician
more often. A surprising result is that emphasis on Quality (ATTIT1) is
negatively related to the number of private physician visits. We note finally
that children whose mothers work part-time have more private visits, fewer
health center visits, and relatively high overall utilization. Children from large
families show, as usual, somewhat lower utilization rates.
In Table 14 we show some selected regression results, explaining health
care utilization using alternative health-control variables.8 The results are
more stable than for adults.9 The observed racial
differences are not sensitive to the health information included. The positive
effect of ATTIT1 on health center visits disappears as soon as some health
information is included, while its effect in the OFFHMVS equation becomes
significantly negative once all health information is added.

Table 14. Selected regression results for children, using alternative health measures.

          (1)             (2)             (3)             (4)
          No health       2 health        HSTAT           2 health factors
          measures        factors                         + HSTAT

HOSPOPVS
FEMALE     0.056 (1.26)    0.078 (1.74)a   0.054 (1.23)    0.072 (1.61)
FAMSIZE    0.008 (0.47)    0.018 (1.00)    0.010 (0.55)    0.017 (0.96)
NONWHITE   0.365 (4.12)b   0.378 (4.31)b   0.358 (4.07)b   0.370 (4.24)b
MPART      0.101 (1.71)a   0.096 (1.66)a   0.088 (1.51)    0.088 (1.51)
ATTIT1     0.039 (1.96)b   0.024 (0.62)    0.019 (0.47)    0.013 (0.33)
ATTIT2     0.014 (0.42)    0.017 (0.50)    0.013 (0.39)    0.016 (0.47)
ATTIT3     0.002 (0.06)    0.002 (0.06)    0.009 (0.29)    0.004 (0.13)

HCORCLVS
FEMALE     0.057 (1.94)a   0.087 (1.46)    0.053 (0.89)    0.078 (1.31)
FAMSIZE   -0.062 (2.58)b  -0.045 (1.89)a  -0.060 (2.51)b  -0.046 (1.95)b
NONWHITE   0.242 (1.97)b   0.259 (2.15)b   0.235 (1.94)b   0.251 (2.10)b
MPART     -0.155 (1.94)b  -0.156 (1.98)b  -0.177 (2.24)b  -0.170 (2.17)b
ATTIT1     0.111 (2.04)b   0.083 (1.54)    0.079 (1.46)    0.065 (1.22)
ATTIT2    -0.086 (1.84)a  -0.083 (1.80)a  -0.088 (1.90)a  -0.085 (1.85)a
ATTIT3    -0.082 (1.98)b  -0.092 (2.27)b  -0.071 (1.75)a  -0.084 (2.07)b

OFFHMVS
FEMALE     0.044 (0.39)    0.079 (0.72)    0.033 (0.30)    0.050 (0.47)
FAMSIZE   -0.194 (4.33)b  -0.132 (3.08)b  -0.187 (4.31)b  -0.135 (3.20)b
NONWHITE  -0.588 (2.62)b  -0.557 (2.62)b  -0.621 (2.85)b  -0.593 (2.82)b
MPART      0.452 (3.02)b   0.444 (3.14)b   0.393 (2.71)b   0.403 (2.89)b
ATTIT1    -0.068 (0.66)   -0.156 (1.61)   -0.162 (1.62)   -0.210 (2.18)b
ATTIT2     0.060 (0.68)    0.086 (1.02)    0.056 (0.64)    0.082 (0.99)
ATTIT3     0.286 (3.75)b   0.231 (3.19)b   0.317 (4.28)b   0.257 (3.59)b

a Significant at 10% level.

b Significant at 5% level.
The effect of MPART on utilization seems to be slightly overestimated
when the subjective health measure HSTAT is not included in the equations.
When HSTAT is included, the positive effect on HOSPOPVS becomes
nonsignificant, the negative effect on HCORCLVS increases in absolute
value, and the positive effect on OFFHMVS decreases. The negative effect of
FAMSIZE on HCORCLVS and OFFHMVS is more sensitive to the objective
health measures, and becomes less pronounced (but remains significantly
negative) when these measures are included.
This section, then, confirms once again that health is an important
determinant of health care utilization. More important, we show that the inclusion
or exclusion of certain health variables affects the coefficients on variables
which themselves affect health. Consequently, the casual choice of one or
more health measures from an ad hoc list 'to control for health,' as is often
the case in the literature, is not without consequences for the measured
impact of other variables. Finally, we note that the use of several health
measures makes it hard to generalize about the role of health in determining
health care utilization.
In the previous sections we reduced a large number of health variables to
a more manageable set of independent health factors. In the next section we
will go one step further, i.e., we will use these factors as health indicators in a
structural model for health care demand in which HEALTH is treated as a
one-dimensional latent variable.
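The sensitivity described in this section can be illustrated with a small simulation. The sketch below is not based on the survey data; it assumes a hypothetical data-generating process in which income raises health, health lowers visits, and income also raises visits directly (all coefficient values and variable names are illustrative assumptions). It then compares the income coefficient with and without a health control, using only the Python standard library.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible
n = 5000

# Hypothetical data-generating process (illustrative numbers, not the paper's):
# income raises health, health lowers visits, income raises visits directly.
income = [random.gauss(0, 1) for _ in range(n)]
health = [0.6 * x + random.gauss(0, 1) for x in income]
visits = [0.5 * x - 1.0 * h + random.gauss(0, 1) for x, h in zip(income, health)]


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Sample covariance (population form) of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


# (1) No health control: simple regression slope of visits on income.
b_income_without = cov(income, visits) / cov(income, income)

# (2) With health control: solve the 2x2 normal equations for (income, health).
s_xx, s_hh, s_xh = cov(income, income), cov(health, health), cov(income, health)
s_xy, s_hy = cov(income, visits), cov(health, visits)
det = s_xx * s_hh - s_xh ** 2
b_income_with = (s_xy * s_hh - s_hy * s_xh) / det
b_health = (s_hy * s_xx - s_xy * s_xh) / det

print(f"income coefficient, no health control:   {b_income_without:.3f}")
print(f"income coefficient, with health control: {b_income_with:.3f}")
print(f"health coefficient:                      {b_health:.3f}")
```

Under these assumed values the income coefficient without the health control absorbs the indirect effect through health (0.5 - 1.0 x 0.6 = -0.1 in expectation), whereas the controlled regression recovers the direct effect of about 0.5. Which health proxy is used as the control would shift the estimate in the same way.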

5. A structural model of demand for health care

In Sections 1-4 we showed the following:

1. By the application of principal component analysis, one can successfully
reduce the dimensions of a set of data on health status. For children we
were able to reduce a set of 26 variables to four independent factors.
These four factors all had a very clear interpretation and explained about
one-third of the total variance.
2. Various socioeconomic variables affect health. But the sign and the
magnitude of the impact depend on the health measure employed.
3. Because of 2, the choice of proxy measure for health status in the analysis
of the demand for medical care does influence the results of the analysis.
These results should therefore be interpreted as conditional on the health
measures included.
Our analysis did not result in unambiguous statements about the positive
or negative effect of family characteristics on health. Nor are we able to say
which health measures to include in the analysis of the demand for medical
care. Both problems stem from the simple fact that no unidimensional
measure of health status exists.
Ideally one would like to estimate a demand equation of the following
form:
Demand = D (individual characteristics, family characteristics,
availability of medical care, health status).
The individual and family characteristics are already specified in the
previous section, as is the availability of medical care. But, instead of using a
number of proxy measures, we would like to represent health status by one
comprehensive measure.
Likewise, instead of estimating many equations of the form
Health = H (individual characteristics, family characteristics),
where Health is represented by a number of proxy measures (Section 3), we
would like to represent Health by the same comprehensive unidimensional
measure as used in the demand equations.
This leads to the following model specification:
$$H^* = \alpha' x + \varepsilon_1 \tag{1}$$
$$D_i = \beta_{1i}' Z + \beta_{2i} H^* + \varepsilon_{2i}, \quad i = 1, \ldots, 4 \tag{2}$$
$$HP_j = c_j H^* + \varepsilon_{3j}, \quad j = 1, \ldots, K \tag{3}$$
The first equation resembles the equations specified in Section 3: health is
assumed to be a function of a number of socioeconomic variables, x. The
dependent variable health, H*, is unobservable.10 The equation can be
interpreted as either a production function or a demand function of health. In
both cases H* is desired health status. The second set of equations resembles
the utilization equations estimated in Section 4. The demand for medical care
is a function of exogenous variables, Z, and health, H*. Thus, the proxy
measures of health employed in Section 4 are replaced by one variable: H*.
The vectors x and Z may partially overlap.
The model includes K additional equations. These equations state that the
proxy measures of health, HPj, j = 1, K, are proportional to the overall
measure H*. Thus the probability of an illness increases as health, H*,
decreases. The number of days in bed will decrease if H* increases, etc. That
is, provided the estimation results show the correct signs for the coefficients
c_j. This model, which has the form of a MIMIC model (see Jöreskog and
Sörbom, 1978, for more detail), is estimated for adults and children
separately. For children we use the four health factors of Section 2 as
indicators (HPj, j = 1, 4). For adults we use the six original health proxies
(HPj,j = 1,6).
The model has been estimated under the assumption that all disturbance
terms are normally distributed.11 Furthermore, we assume
$E(\varepsilon_1, \varepsilon_{2i}) = E(\varepsilon_1, \varepsilon_{3j}) = E(\varepsilon_{2i}, \varepsilon_{3j}) = 0$,
$i = 1, \ldots, 4$, $j = 1, \ldots, K$, and $E(\varepsilon_{3j}, \varepsilon_{3i}) = 0$, $j \neq i$. The
disturbance terms added to the utilization equations, $\varepsilon_{2i}$, are allowed to be
freely correlated with each other.
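To see what this specification implies for the observable variables, one can substitute equation (1) into equations (2) and (3). The reduced form below is a sketch derived only from equations (1)-(3) as stated; the composite disturbance symbols $u_i$ and $v_j$ are introduced here for illustration and are not the authors' notation.

```latex
\begin{aligned}
D_i  &= \beta_{1i}' Z + \beta_{2i}\,\alpha' x + u_i,
      & u_i &= \varepsilon_{2i} + \beta_{2i}\,\varepsilon_1,
      & i &= 1, \ldots, 4, \\
HP_j &= c_j\,\alpha' x + v_j,
      & v_j &= \varepsilon_{3j} + c_j\,\varepsilon_1,
      & j &= 1, \ldots, K.
\end{aligned}
```

Read this way, the variables in x enter every indicator equation only through the common vector $\alpha$, scaled by $c_j$, and enter each demand equation through $\beta_{2i}\alpha$ (in addition to any direct effect of variables that appear in both x and Z). The composite disturbances all share the term $\varepsilon_1$, so they are correlated across equations even under the zero-covariance assumptions above. Multiplying $\alpha$ and $\varepsilon_1$ by a constant $k$ while dividing every $\beta_{2i}$ and $c_j$ by $k$ leaves both lines unchanged, which is why the scale of $H^*$ has to be fixed by a normalization.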
To identify all parameters in the model, we fix the constant term of the
HOSPERVS equation to be equal to its value obtained from the regression
analyses in the previous section. HEALTH is dimensioned by setting its
impact on HCORCLVS equal to -1.0. Thus a one unit increase in health
results in one less visit to a health center.
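The role of this normalization can be checked numerically. The sketch below takes the four HEALTH* coefficients reported for adults in Table 15 and a handful of hypothetical latent health scores (the scores, the factor k, and the helper names `rescale` and `normalize` are illustrative assumptions). It shows that rescaling the latent variable leaves the health contribution to each equation unchanged, and that fixing the HCORCLVS coefficient at -1.0 selects one scale.

```python
# Coefficients of HEALTH* in the four adult utilization equations as reported
# in Table 15 (order: HOSPERVS, HOSPOPVS, HCORCLVS, OFFHMVS).
reported = [-0.339, -0.946, -1.00, -6.16]


def rescale(loadings, health, k):
    """Multiply the latent variable by k and divide every loading by k."""
    return [b / k for b in loadings], [h * k for h in health]


def normalize(loadings, ref_index, ref_value=-1.0):
    """Choose the scale at which the reference loading equals ref_value."""
    k = loadings[ref_index] / ref_value
    return [b / k for b in loadings]


health = [-1.5, 0.0, 0.4, 2.0]  # hypothetical latent health scores
alt_loadings, alt_health = rescale(reported, health, k=2.5)

# The health contribution to each equation is identical under both scalings.
fitted = [[b * h for b in reported] for h in health]
alt_fitted = [[b * h for b in alt_loadings] for h in alt_health]
same = all(
    abs(u - v) < 1e-12
    for row, alt_row in zip(fitted, alt_fitted)
    for u, v in zip(row, alt_row)
)

# Fixing the HCORCLVS loading (index 2) at -1.0 recovers the reported scale.
recovered = normalize(alt_loadings, ref_index=2)

print("observationally equivalent:", same)
print("loadings after normalization:", [round(b, 3) for b in recovered])
```

Because the two parameterizations are observationally equivalent, the data alone cannot choose between them; the unit of HEALTH* is a convention, here "one less health center visit".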
At the bottom of Table 15, we see that a one unit increase of HEALTH*
decreases HSTAT by 2.9, the number of days ill by 1.6, and the number of
days in bed by 8.4. Furthermore, it reduces the probability that HCAP = 1
or LIM = 1 by 0.09 and 0.14, respectively.
Each column in the top part of Table 15 represents an equation in our
model. With respect to HEALTH* we find a significant positive effect of
family income. This, of course, is a 'summary' of the findings presented in
Table 8. HEALTH* in fact can be viewed as a weighted sum of the health
indicators used. The HEALTH* equation indicates a negative effect of
FEMALE and a slight negative effect of NONWHITE on health. The utiliza-
tion equations show that, 'holding health constant,' nonwhites have fewer
visits to private offices and hospital emergency rooms, and more visits to
health centers.
The other variables included in the health equation show no impact on
health. Given the analyses in the previous section, this result is not surprising.
As we have seen, various socioeconomic variables have positive, negative, or
no effect at all on health, depending on which health measure we use.
Consequently, when we obtain a unidimensional health measure, based on
the various health measures previously employed, the effect of the socio-
economic variables can be expected to be small at best. A similar result is
obtained for children (see Table 16): FAMINC is the only significant variable
in the health equation, apart from the familiar age and sex effects.
HEALTH*, in the model for children, correlates highly with HSTAT, but
does not show much relationship with the four health factors. In fact, the
coefficient for the second factor (RESPIRATORY) has the 'wrong' sign
(bottom of Table 16). When interpreting both Tables 15 and 16 it is useful to
remember the scale we used for the children's and adults' health variables.
For both children and adults, we scaled the health measure so that a one unit
increase in health results in one less visit to a health care center. The corresponding
reduction for hospital outpatient visits is also nearly one less visit
for adults, approximately one-half for children, and for hospital emergency
room visits is slightly greater than one-third for adults but only approxi-
mately one-hundredth for children. These results seem plausible and give
some basis for our claim that the unobserved HEALTH* factor can serve as
a comprehensive unidimensional health index. However, the results for
private office visits (OFFHMVS) imply a very high response to a one unit
increase in HEALTH*: 3.07 and 6.16 for children and adults,
respectively.12 Given the average values of OFFHMVS (approximately 1.6 for both
children and adults), these results seem implausible. On the other hand, many
of the results look quite reasonable and are consistent with those based on
the regression analysis in the previous sections.
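The plausibility argument above is simple arithmetic and can be reproduced directly. The sketch below uses only figures quoted in the text and tables: the OFFHMVS responses of 3.07 (children) and 6.16 (adults), the approximate mean of 1.6 visits, and the HSTAT coefficients for children from Table 13 (0.558 for OFFHMVS, 0.180 for HCORCLVS). The variable names are illustrative.

```python
# Figures quoted in the text and in Tables 13, 15 and 16.
mean_offhmvs = 1.6  # approximate mean of OFFHMVS, children and adults
mimic_offhmvs = {"children": 3.07, "adults": 6.16}  # |response| to one unit of HEALTH*
mimic_hcorclvs = 1.00  # fixed by the normalization

# Response to one unit of HEALTH* expressed as a multiple of the mean.
relative_response = {
    group: coef / mean_offhmvs for group, coef in mimic_offhmvs.items()
}

# Ratio of the OFFHMVS to the HCORCLVS coefficient, children:
# OLS on HSTAT (Table 13) versus the latent variable in the MIMIC model.
ols_ratio_children = 0.558 / 0.180
mimic_ratio_children = mimic_offhmvs["children"] / mimic_hcorclvs

for group, value in relative_response.items():
    print(f"{group}: response is {value:.2f} times the mean number of visits")
print(f"OLS ratio (HSTAT):    {ols_ratio_children:.2f}")
print(f"MIMIC ratio (HEALTH*): {mimic_ratio_children:.2f}")
```

The implied responses are roughly 1.9 and 3.9 times the mean number of private office visits, which is what makes them hard to accept at face value. The two ratios for children (about 3.1 and 3.07) are nevertheless close, so the relative pattern across utilization types agrees with the OLS results.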
Table 15. Estimation results of the structural model of demand for medical care (adults).

HEALTH*  -0.339 (2.90)b  -0.946 (2.93)b  -1.00 (-)  -6.16 (3.69)b

HOSP  -0.001 (0.54)  0.003 (0.69)
ALL  -0.001 (0.19)  -0.007 (0.35)
HMO  0.001 (0.29)
XHMO  -0.005 (1.11)
MCAID  0.008 (0.82)  0.486 (2.49)b  1.667 (6.79)b  -0.627 (1.19)
PRIVINS  -0.290 (4.37)b  -0.077 (0.53)  -0.106 (0.58)  0.538 (1.37)
HMOINS  -0.016 (0.32)  -0.068 (0.69)  0.803 (6.5W  -0.104 (0.40)
AT1  -0.003 (0.28)  -0.043 (1.86)a  -0.014 (0.49)  -0.095 (1.53)
AT2  -0.011 (0.89)  -0.002 (0.09)  -0.052 (1.72)a  0.046 (0.72)
AT3  0.010 (0.77)  0.057 (2.33)b  -0.044 (0.14)  0.020 (1.30)
NONWHITE  -0.057 (1.52)  -0.128 (2.54)b  0.090 (0.91)  0.237 (1.88)a  -0.790 (2.80)b
55+  -0.033 (0.47)
FEMALE  -0.066 (2.34)b
EDUC  0.005 (1.31)  0.002 (0.32)  0.004 (0.38)  -0.014 (1.11)  0.004 (0.11)
FULL  -0.012 (0.50)  0.025 (0.96)  -0.117 (2.15)b  0.004 (0.06)  -0.603 (3.87)b
PART  0.046 (1.46)  -0.048 (1.16)  0.030 (0.37)  -0.151 (1.47)  0.132 (0.56)
FAMSIZE  -0.005 (0.62)  -0.018 (1.89)a  -0.015 (0.75)  0.032 (1.25)  -0.187 (3.23)b
MARRIED  0.025 (0.63)
MEDINC  0.003 (0.10)
TFAMINC  0.005 (2.35)b  -0.000 (0.06)  -0.001 (0.21)  0.002 (0.32)  0.053 (4.28)b
Constant  -0.653 (3.47)  0.274 (-)  -0.293 (0.16)  0.053 (2.36)  -0.189 (3.47)

-2.889 (3.80)b  -0.089 (3.35)b  -0.144 (3.32)b  -1.029 (3.79)b  -1.605 (3.62)b  -8.376 (3.56)b

- = value fixed.
a Significant at 10% level.
b Significant at 5% level.
Table 16. Estimation results of the structural model of demand for medical care (children).

HEALTH*  -0.009 (3.90)b  -0.527 (3.50)b  -1.00 (-)  -3.07 (4.82)b

HOSP  -0.003 (1.39)  -0.003 (0.71)
GPPED  -0.018 (1.06)  -0.049 (0.70)
HMO  -0.007 (1.57)
XHMO  -0.002 (0.40)
MCAID  0.195 (2.53)b  0.119 (0.90)  0.397 (0.22)  -0.588 (1.75)a
PRIVINS  0.112 (1.68)a  -0.005 (0.46)  -0.382 (2.46)b  0.043 (0.15)
HMOINS  -0.047 (0.95)  -0.006 (0.75)  1.104 (9.05)b  -0.618 (2.94)b
AT1  0.002 (0.16)  0.115 (0.52)  0.144 (1.49)  -0.109 (1.98)b
AT2  -0.010 (0.73)  0.000 (0.02)  -0.073 (2.38)b  0.119 (2.08)b
AT3  0.001 (0.10)  -0.004 (0.16)  0.021 (0.63)  -0.231 (3.90)b
NWHITE  -0.290 (1.42)  0.029 (0.58)  0.218 (1.64)a  0.035 (0.16)  -1.635 (2.69)b
LT6  -0.210 (4.47)b
12-17  0.001 (0.04)
FEMALE  -0.047 (1.65)a
MEDUC  0.034 (1.36)  -0.004 (0.68)  0.012 (0.74)  0.041 (1.57)  0.171 (2.33)b
MFULL  -0.214 (1.37)  0.020 (0.54)  0.154 (1.55)  -0.269 (1.61)  -0.642 (1.38)
MPART  -0.167 (1.18)  -0.017 (0.50)  0.018 (0.20)  -0.334 (2.20)b  -0.066 (0.16)
FAMSIZE  0.025 (0.66)  -0.002 (0.16)  0.021 (0.84)  -0.031 (0.74)  -0.102 (0.90)
LMAGE  0.080 (1.04)
MARRIED  -0.083 (1.34)
MEDINC  -0.004 (0.63)
FAMINC  0.031 (1.90)a  -0.002 (0.92)  0.016 (1.68)a  0.027 (1.59)  0.091 (1.87)a
Constant  -4.391 (2.08)  0.111 (-)  -2.245 (1.88)  -3.711 (1.75)  -12.28 (1.97)

-3.81 (2.17)  -0.003 (0.31)  0.017 (1.58)  -0.020 (1.58)  -0.011 (1.08)

- = value fixed.
a Significant at 10% level.
b Significant at 5% level.
For adults, racial differences in utilization patterns cannot be attributed
solely to differences in health status. Family size has a significant negative
impact on health care utilization, 'holding health constant.' To the extent that
employment status influences health care utilization, the effect is direct, not
through health status. Finally, total family income shows a positive effect on
health status and on the number of private office visits, again 'holding health
constant.'
For children, we find no significant racial differences in health status but,
'holding health constant,' we find that nonwhites go more often to hospital
outpatient clinics and less often to private physician practices. The effect of
mother's education on children's health is positive, but not significant. The
direct effect of mother's education on private physician visits is positive and
significant. The employment status of the mother shows only direct effects on
utilization (i.e., not through health). Family income shows positive effects
both on health and utilization.

6. Conclusion

It has been common practice to add one or more proxy variables for health
in demand equations for medical care 'to control for variation in health
status.' The choice of these proxy variables is almost always guided by the
availability of the data.
In this paper we show that this habit is not as innocent as it seems. Health
measures should obviously be included in demand equations for medical
care. But the choice of the variables representing health will have an impact
on the estimation results regarding various socioeconomic variables.
As shown in Section 3, health should be treated as endogenous. But doing
this does not solve the problem presented by the fact that not one of the
available health measures is by itself a sufficient proxy for health. A variety
of proxy measures must therefore be used.
In Section 5 we showed how these proxy measures can be used as
indicators for a unidimensional health measure. This measure, which is
unobservable, is introduced in a structural model for health care demand.
Thus HEALTH becomes a latent variable in a Multiple Causes-Multiple
Indicators (MIMIC) model.
As indicators we use the health proxy measures and utilization of health
care. The latter can be used as an indicator for health once we adequately
control for income, insurance, availability, and taste differences. As causal
factors in a health production function, the socioeconomic variables that
were correlated with one or more of the health measures analyzed in Section
3 were included.
The results are encouraging, especially for children. The model yields
reasonable estimates, as compared to the unrestricted OLS regressions on
utilization, and the latent variable HEALTH does have the impact one would
expect if it represents a true measure of health status.
Some caveats, however, should be mentioned. We did not solve the
question of how to choose among various health-proxy measures. We merely
pushed the problem one step back by including one latent variable,
HEALTH, in the demand equations and by stating that the proxy measures
were proportional to this overall measure. Thus the ex-post interpretation of
HEALTH is conditional upon the choice of the health indicators used and
the weight they get in the estimation process.
The estimation assumes normality of the disturbances. For adults we use
various 0-1 dummy variables as health indicators, which makes the nor-
mality assumption less plausible. For children we transformed many discrete
health-proxy variables into a small set of continuous factor scores, which is
an important improvement over our earlier work. But the problem remains
with respect to the health-care utilization data which are bounded from
below by zero, and usually take only a few discrete values. This problem
seems particularly severe with respect to private office visits. This variable
has a large concentration of zeros and (other than HOSPOPVS) correlates
strongly with a number of other variables. This might explain our implausible
results with respect to OFFHMVS, both for adults and children.
Finally, we should mention that part of the model is constructed in an ad
hoc manner, with little a priori knowledge and without a firm theoretical
base. The utilization module can easily be shown to be derived from a
general demand framework. But the 'production function of health' should be
viewed as a first attempt to show the impact of various socioeconomic
variables on a comprehensive measure of health status. The formulation fits
within Grossman's theory of the demand for health. But the analyses lack the
input of other disciplines, e.g., epidemiology. A further understanding of the
causal relationships between, say, income or family size or education and
health is needed to improve the specification of the health production
function in the model.


Notes

* The research reported in this paper was supported in part through funds granted to the
Institute for Research on Poverty by the Department of Health and Human Services
pursuant to the provisions of the Economic Opportunity Act of 1964.
1. There are, of course, major exceptions: well-baby care, immunizations, some gyneco-
logical care, some screening tests, care during pregnancy.
2. Manning et al. (1981) take a similar view.
3. The HIS numbers are for adults aged 17-44 years, both sexes. See U.S. Department of
Health and Human Services (1981), p. 24.
4. In order to assess what information is contained in the variable HSTAT, we performed
the principal component analyses with and without this variable. In both cases the same
health factors were obtained. HSTAT correlates with two of these factors, HANDICAP
and ACUTE. However, as we see in Section 4, HSTAT also contains independent
information relevant to the prediction of health care utilization. This information did not
show up as an independent factor in the principal component analyses with HSTAT
included. In the remainder of this paper we will delete HSTAT from the principal
component analyses. But we will treat it as an additional variable to explain utilization in
Section 4. This also permits the comparison of our results with other work using HSTAT,
such as Colle and Grossman (1978), and Goldman and Grossman (1978).
5. Health Maintenance Organization. In this type of arrangement consumers pay a fixed
amount - a capitation fee - for all services for a specified period of time.
6. We reestimated the equations of columns 2 and 4, replacing the two health factors with
the five original health measures on which they were based. The results were almost the
same, showing that the two constructed health factors adequately represent the variation
in the five original health measures.
7. There are three included insurance variables: Medicaid (MCAID), private insurance
(PRIVINS) and HMO insurance (HMOINS). The omitted category is 'no insurance.' The
insignificance of PRIVINS may be due to the high correlation between Medicaid and
PRIVINS variables (-0.726).
8. The results for HOSPERVS and TOTAL appeared not to be sensitive to alternative
health specifications. They are therefore not included in Table 14.
9. As for adults, we also ran the regressions including all health variables. The results
confirmed our results using the four health factors plus HSTAT and are therefore not
reported.
10. In the past couple of years a number of studies have been published using this approach.
Work based on microdata includes Van de Ven and van der Gaag (1979) and Lee
(1979). The work of Wolfe and van der Gaag (1981) indicates a preliminary version of
the model presented here, using only part of the health information. Hooymans and Van
de Ven (1982) present a useful discussion on the identification of such a model and the
subsequent dimensioning (and interpretation) of the resulting health index.
11. This assumption is likely to be violated, given the limited character of some of the
endogenous variables. Lee (1979) deals with this problem when deriving the likelihood
function of his model. For children, we reduce the problem by replacing the health
indicators (usually binary dummy variables) by the continuous health factor scores. We do
not provide a solution, however, for the limited character of the health care utilization
data. Comparison of our results with the ones obtained in the previous sections does not
suggest that the possible bias due to the violation of the normality assumption is of major
importance. Some oddities in our results, however, do call for caution (see text).
12. Turning to Table 13, the results for HSTAT also suggest a much larger response among
children from a one unit decrease in health on OFFHMVS than on HCORCLVS, HOSPOPVS
or HOSPERVS. The ratio of coefficients (for example, OFFHMVS to HCORCLVS) is
similar for HSTAT in OLS and H* in MIMIC. Among adults (see Table 11), the OLS
results for HSTAT also follow a similar pattern to the adult MIMIC results.

References

Colle, A. and Grossman, M. (1978), 'Determinants of Pediatric Care Utilization', Journal of
Human Resources 13 (Supplement), 115-58.
Davis, K. and Reynolds, R. (1976), 'The Impact of Medicare and Medicaid on Access to
Medical Care', In R. Rosett (ed.), The Role of Health Insurance in the Health Services
Sector, New York, National Bureau of Economic Research.
Edwards, L. and Grossman, M. (1980), 'Children's Health and the Family', In R. Scheffler
(ed.), Annual Series of Research in Health Economics, Vol. 2, Greenwich, Conn., JAI
Press.
Goldman, F. and Grossman, M. (1978), 'The Demand for Pediatric Care: A Hedonic
Approach', Journal of Political Economy 86, 259-80.
Grossman, M. (1972), 'On the Concept of Health Capital and the Demand for Health',
Journal of Political Economy 80(2), 223-55.
Hooymans, E. M. and Van de Ven, W. P. M. M. (1982), 'Implementing a Health Status Index
in a Structural Health Care Model', In van der Gaag, J., Neenan, B. and Tsukurhara, T.
(eds.), Economics of Health Care, New York, Praeger.
Hyman, J. (1971), 'Empirical Research on the Demand for Health Care', Inquiry 8, 61-71.
Jöreskog, K. G. and Sörbom, D. (1978), Estimation of Linear Structural Equation Systems by
Maximum Likelihood Methods, Chicago, International Educator Services.
Lee, L. F. (1979), 'Health and Wages: A Simultaneous Equation Model with Multiple Discrete
Indicators', Working Paper No. 79-127, Department of Economics, University of
Manning, W. G., Newhouse, J. P. and Ware, J. E. Jr. (1981), 'The Status of Health in Demand
Estimation: Beyond Excellent, Good, Fair and Poor', NBER Conference Paper 86,
Newhouse, J. P. (1981), 'The Demand for Medical Care Services: A Retrospect and Prospect',
In J. van der Gaag and M. Perlman (eds.), Health, Economics, and Health Economics,
Amsterdam, North Holland.
Newhouse, J. P. and Phelps, C. E. (1974), 'Price and Income Elasticities for Medical Care
Services', In Mark Perlman (ed.), The Economics of Health and Medical Care, New York,
John Wiley and Sons.
Robinson, P. M. and Ferrara, M. C. (1977), 'The Estimation of a Model for an Unobservable
Variable with Endogenous Causes', In Aigner, D. J. and Goldberger, A. S. (eds.), Latent
Variables in Socioeconomic Models, Amsterdam, North-Holland.
Rosett, R. and Huang, L. P. (1973), 'The Effect of Health Insurance on the Demand for
Health Care', Journal of Political Economy 81, 281-305.
Shakotko, R. A. (1980), 'Dynamic Aspects of Children's Health, Intellectual Development,
and Family Economic Status', New York, NBER Working Paper No. 451.
U.S. Department of Health and Human Services. (1981), Current Estimates from the National
Health Interview Survey: United States, 1980, DHHS Publication No. (PHS) 82-1567,
Van de Ven, W. P. M. M. and van der Gaag, J. (1982), 'Health as an Unobservable: A
MIMIC-Model for Health Care Demand', Journal of Health Economics 1, No.2, August.
Wolfe, B. L. (1980), 'Children's Utilization of Medical Care', Medical Care 18 (December):
pp. 1196-1207 (Institute for Research on Poverty Reprint No. 419).
Wolfe, B. and van der Gaag, J. (1981), 'A New Health Status Index for Children', In van der
Gaag, J. and Perlman, M. (eds.), Health, Economics, and Health Economics, Amsterdam,
North-Holland.
An empirical model of the demand for
health care in Belgium


SESO, Universitaire Faculteiten Sint-[gnatius. Prinsstraat 13, 2000 Antwerpen, Belgium

The paper presents an empirical model of the demand of health care in

Belgium. The analysis pertains to 17 categories of medical care and to two
subgroups of health insurance beneficiaries, namely the 'active' and the
'widows, orphans, pensioners and invalids'. The estimation results show that
income and relative prices matter in the demand for medical care. Supplier
induced demand is also detected for a number of medical care categories.
Other explanatory variables in the model include the size of the child
population, climatic conditions and a time trend, representing technological
advances in health care.

I. Introduction

The purpose of this paper is to search, by means of econometric modelling,
for the important determinants of the demand for medical care in Belgium.
The Belgian health insurance scheme consists of a scheme for blue and white
collar workers and a scheme for the self-employed. In this paper we will
restrict ourselves to the demand by the group of workers. In doing so, we
capture about 94% of the expenditures of the insurance scheme for medical
care. In order to give the reader a better understanding of health care
demand and the institutional set-up, we give a brief description of the health
insurance system for workers in the next section. In the third section we
specify the structural equations of the health care model. This model is of the
macro-economic type and uses data from 1966 to 1980. The estimation
results are presented and commented upon in the fourth section.

II. Description of the health insurance system (HIS)


The econometric analysis pertains to the two subgroups of beneficiaries in
the health insurance scheme for blue and white collar workers. The first
subgroup includes the active persons and the persons in their charge (e.g.
children). The second subgroup comprises widows, orphans, pensioners,
invalid persons and the persons in their charge (henceforth referred to as
WOPI). Note that the total number of beneficiaries increased by
1,368,357 from 1966 to 1980. In 1980, the number of beneficiaries
amounted to 8,491,479.

G. Carrin and J. van Dael, in G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 59-78.
© 1991 Kluwer Academic Publishers.

Financing of the HIS

The sources of revenue for the HIS consist of employers' and workers'
contributions on the one hand and government subsidies and taxes on the
other. More specifically, they are:
(a) The contributions of employers and workers that are calculated on total
    wages; there is no special ceiling for the calculation of these
    contributions. The employers' and workers' contribution rates are 3.75%
    and 1.8%, respectively.
(b) A government subsidy which is equal to 95% of the costs of the
    treatment of the so-called social diseases¹ plus 27% of the normal
    expenditures for medical treatment.²
(c) A special government allowance that finances the health insurance of the
    unemployed.³
(d) An excise tax on tobacco.³
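
The rules in (a) and (b) are simple proportional formulas. The following Python sketch illustrates them under stated assumptions: the function names and the example amounts are hypothetical, and only the four rates (3.75%, 1.8%, 95% and 27%) are taken from the text.

```python
# Rates quoted in the text; everything else below is illustrative.
EMPLOYER_RATE = 0.0375        # employers' contribution, share of total wages
WORKER_RATE = 0.018           # workers' contribution, share of total wages
SOCIAL_DISEASE_RATE = 0.95    # subsidised share of social-disease treatment costs
NORMAL_RATE = 0.27            # subsidised share of normal medical expenditures


def contributions(total_wages):
    """Employers' and workers' contributions on total wages (no ceiling)."""
    return {
        "employer": EMPLOYER_RATE * total_wages,
        "worker": WORKER_RATE * total_wages,
    }


def government_subsidy(social_disease_costs, normal_expenditures):
    """Subsidy described under (b)."""
    return (SOCIAL_DISEASE_RATE * social_disease_costs
            + NORMAL_RATE * normal_expenditures)


# Hypothetical amounts, in BF.
example = contributions(1_000_000.0)
subsidy = government_subsidy(100.0, 1000.0)
```

The allowance for the unemployed (c) and the tobacco excise (d) are lump sums in this description, so they are left out of the sketch.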

Health insurance benefits

Out-patient medical care

The benefits include full or partial refunding of medical care expenses:
(a) For general medical help, i.e. visits by and consultations with physicians
(general practitioners and specialists), assistance by paramedical per-
sonnel, technical assistance by physicians (e.g. laboratory tests) and
dental care, the reimbursement is 75% of the official fees. For WOPI
below a certain income, the reimbursement is 100%.
(b) Technical assistance delivered by specialists (laboratory tests and
radiotherapy) is reimbursed at 100%.
(c) For drugs, a distinction is made between pharmaceutical products and
pharmacists' drug preparations. For the preparations patients pay a fixed
amount that varies according to the group of beneficiaries. Before 1
November 1980, beneficiaries paid a fixed amount for the pharmaceu-
tical products as well. After that date, a new reimbursement system
became applicable: the personal out-of-pocket contribution in the case of
pharmaceutical products would depend henceforth upon the therapeutic
value of the drugs and on the subgroup of beneficiaries to which the
patient belongs. Note that pharmaceutical products are only reimburs-
able if they appear on the official list of accepted products.

In-patient medical care

(a) The government determines the prices of a hospital day in different
hospital wards. These prices also differ according to the type of hospital; in
Belgium one distinguishes mainly university and general hospitals. A
supplement to the normal price⁴ can be granted by the government if the
hospital manager can show that the predetermined prices are not sufficient to
cover the real costs in his hospital. From 1975 onwards, the hospital manager
can, after approval by the Minister of Public Health, set his own hospital day
price; in this case, he does not need to apply for possible supplements to the
normal price.
(b) The financing of the price of a hospital day is as follows: 25% of the
price is financed by the Ministry of Public Health. However, the Ministry's
subsidy is higher for university hospitals. Subsequently, there is a (relatively
small) out-of-pocket contribution by the patient. This personal contribution
varies according to the subgroup of beneficiaries to which one belongs: the
personal share is higher for active persons than for the other group. The
patient also pays a fixed amount per drug (25 BF) that is administered at the
hospital. Finally, the remainder of the price of a hospital day is paid to
hospitals by the HIS.
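
The financing rule in (b) can be read as a residual calculation. The sketch below is only an illustration: the function name, the day price and the patient's contribution are hypothetical, and the 25% Ministry share is the general rate quoted above (the text notes that it is higher for university hospitals, hence the parameter). The fixed amount of 25 BF per drug is a separate charge and is not modelled here.

```python
def split_hospital_day_price(day_price, patient_contribution, ministry_rate=0.25):
    """Split the price of a hospital day among the Ministry, the patient and the HIS.

    The HIS pays whatever remains after the Ministry's subsidy and the
    patient's out-of-pocket contribution.
    """
    ministry = ministry_rate * day_price
    his = day_price - ministry - patient_contribution
    if his < 0:
        raise ValueError("Ministry subsidy and patient share exceed the day price")
    return {"ministry": ministry, "patient": patient_contribution, "his": his}


# Hypothetical figures in BF.
shares = split_hospital_day_price(day_price=2000.0, patient_contribution=100.0)
```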

The share of HIS benefits in national income

In Table 1, we present the shares of health insurance benefits in national
income over the period of 1966 to 1980. The expenditures of the scheme for
the self-employed are also given. From the table, it is clear that the share
of total HIS expenditure for medical care (both schemes) in national income
has increased from 2.87% in 1966 to 4.80% in 1980. This evolution can be
explained by the increase in real benefits, the increase in the number of
beneficiaries and partly by a higher demand for medical care. Note that there
has not been a sufficient increase in the total revenue of the HIS so that
budget deficits were created. The latter are reported in Table 2 for the years
1974 to 1980.
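
As a quick check on the orders of magnitude, the 1966 and 1980 rows of Table 1 can be combined as in the following Python sketch. The figures are copied from Table 1; the function names and the notion of an "implied" national income (total expenditure divided by its share) are illustrative assumptions.

```python
# Rows of Table 1: expenditures in 10^9 BF (current prices), share in %.
table1 = {
    1966: {"workers": 19.7, "self_employed": 0.9, "share_pct": 2.87},
    1980: {"workers": 125.4, "self_employed": 8.8, "share_pct": 4.80},
}


def total_expenditure(row):
    """Expenditures of both schemes together."""
    return row["workers"] + row["self_employed"]


def implied_national_income(row):
    """share = total / national income, hence national income = total / share."""
    return total_expenditure(row) / (row["share_pct"] / 100.0)


total_1980 = total_expenditure(table1[1980])
workers_share_1980 = table1[1980]["workers"] / total_1980
share_change = table1[1980]["share_pct"] - table1[1966]["share_pct"]
income_1980 = implied_national_income(table1[1980])
```

In 1980 the workers' scheme accounts for a little over 93% of the expenditures of both schemes, which is in line with the figure of about 94% mentioned in the introduction.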

Table 1. Expenditures of the HIS for medical care and their share in national income, 1966-1980.

Year    Expenditures, scheme      Expenditures, scheme      Total expenditures
        for workers               for self-employed         as % of national
        (in 10^9 BF, current      (in 10^9 BF, current      income
        prices)                   prices)

1966     19.7                     0.9                       2.87
1967     20.8                     1.0                       2.84
1968     23.8                     1.4                       3.06
1969     28.2                     1.8                       3.27
1970     32.7                     2.2                       3.42
1971     36.3                     2.5                       3.47
1972     41.5                     2.9                       3.50
1973     49.2                     3.4                       3.63
1974     59.6                     4.1                       3.76
1975     74.7                     5.4                       4.27
1976     88.8                     6.5                       4.45
1977     98.5                     7.1                       4.59
1978    110.0                     7.7                       4.76
1979    117.3                     8.1                       4.77
1980    125.4                     8.8                       4.80

Source: Various issues of Algemeen Verslag van het Rijksinstituut voor Ziekte- en
Invaliditeitsverzekering (RIZIV), Voornaamste financiële en statistische uitkomsten van de
verplichte verzekering tegen ziekte en invaliditeit (Brussels, 1 July 1981) and own computations.

Table 2. Yearly deficit or surplus of the HIS for workers.

Year                              Deficit (-) or surplus (+) in
                                  10^6 BF, current prices

1974                                 +113.3
1975                                 +693.9
1976                                -2823.2
1977                                 +227.9
1978                                -4436.7
1979                                -5979.1
1980                                -2625.4
Cumulative deficit end of 1980     -14829.3

Source: Unpublished documents from RIZIV (Brussels).

III. Structural specification of the model

Categories of medical care

For each subgroup of beneficiaries in our model, the following categories of
medical care are considered:
1a. Consultations at the general practitioner's office;
1b. Consultations with the general practitioner at the patient's home;
2a. Consultations with the pediatrician;
2b. Consultations with other specialist physicians;
3a. Preservative dental care (fillings);
3b. Other dental care (orthodontics, dental prostheses);
4. Prostheses;
5a. Hospital bed-days due to surgery;
5b. Hospital bed-days due to observation of patients;
6. Technical medical treatment (includes certain lab tests);
7a. Special care (radiotherapy, X-rays);
7b. Laboratory tests (performed by specialists);
8a. Surgery;
8b. Anaesthesia;
9. Drugs;
10. Physiotherapy;
11. Nurses' care of outpatients;
12. Special care for patients having tuberculosis, cancer, poliomyelitis,
    congenital or mental diseases;
13. Deliveries;
14. Haemodialysis;
15. Midwives' care;
16. Hospital bed-days due to deliveries;
17. Travel and supervision expenditures.
Note that for Categories 1 to 11 behavioural equations are specified. The
last six categories are exogenous in the model. To give the reader an idea of
the relative importance of the different categories, we present in Table 3 the
expenditures of each category and their share in health insurance
expenditures for 1980. One can verify that the endogenous part of the model
captures 88% of the HIS expenditures. It can also be seen that the
expenditures for hospital bed-days, special care and laboratory tests alone
account for 40% of total HIS expenditures. In Table 4 we present the share of
the two subgroups in the expenditures related to each of the above-mentioned
categories of medical care.
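
The two percentages quoted above can be verified from the share column of Table 3, as in the following Python sketch. The shares are copied from Table 3; the variable names are hypothetical, and counting Category 16 (bed-days due to deliveries) among the hospital bed-days is an assumption made here because it brings the total to the 40% mentioned in the text.

```python
# Expenditure shares (% of HIS total outlay, 1980) copied from Table 3.
share_pct = {
    "1a": 3.94, "1b": 4.73, "2a": 0.48, "2b": 3.43, "3a": 1.98, "3b": 1.38,
    "4": 1.18, "5a": 7.19, "5b": 12.44, "6": 2.45, "7a": 11.18, "7b": 8.01,
    "8a": 3.95, "8b": 0.87, "9": 17.80, "10": 4.36, "11": 2.35,
    "12": 6.03, "13": 0.64, "14": 0.11, "15": 0.09, "16": 1.29, "17": 3.90,
}


def category_number(label):
    """Map a label such as '5b' to its category number 5."""
    return int(label.rstrip("ab"))


# Categories 1 to 11 have behavioural equations; 12 to 17 are exogenous.
endogenous_share = sum(v for k, v in share_pct.items() if category_number(k) <= 11)
exogenous_share = sum(v for k, v in share_pct.items() if category_number(k) >= 12)

# Hospital bed-days (5a, 5b and, by assumption, 16), special care (7a)
# and laboratory tests (7b).
hospital_special_lab = sum(share_pct[k] for k in ("5a", "5b", "16", "7a", "7b"))
```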

Behavioural equations of medical care Categories 1 to 3

Concerning these types of demand, we reason that a call on the general
practitioner (GP), specialist doctor or dentist is, in the first place, the result
of a spontaneous decision on the part of the individual patient. Such a
decision is not only caused by a state of illness. Individuals may also demand
medical care because they perceive their health status as deteriorating. They
may also seek medical care in order to prevent future illnesses or to ask the
doctor for general advice. We now postulate that a certain part of income
is allocated to consultations and dental care. Furthermore, we maintain
the hypothesis that, to a certain extent, individuals are sensitive to relative
prices. This means that, if the price of medical care relative to that of a
substitute increases, they may decide to adjust their demand downwards.
However, it is clear that price sensitivity will be almost non-existent if
medical care is both badly needed and hardly substitutable.
Table 3. Expenditures of different categories of medical care and their share in
HIS expenditures for workers, 1980.

Category of      Expenditures in 10^6 BF     Expenditure as % of HIS
medical care     (current prices)            total outlay

1a                4943.6                      3.94
1b                5937.6                      4.73
2a                 601.9                      0.48
2b                4300.3                      3.43
3a                2490.3                      1.98
3b                1727.9                      1.38
4                 1476.9                      1.18
5a                9020.4                      7.19
5b               15610.5                     12.44
6                 3069.5                      2.45
7a               14028.3                     11.18
7b               10043.1                      8.01
8a                4948.6                      3.95
8b                1095.7                      0.87
9                22327.4                     17.80
10                5474.6                      4.36
11                2949.9                      2.35
12                7566.1                      6.03
13                 807.5                      0.64
14                 147.9                      0.11
15                 108.7                      0.09
16                1618.7                      1.29
17                4891.8                      3.90

Source: Statistics of the RIZIV (Brussels).

Note: Due to rounding errors, the sums of the elements of the first and second
columns are not exactly equal to 125.4 billion and 100% respectively.

In view of the considerations above, we adopt a demand equation of the
following type:⁵

\[
\ln q_t^j = \alpha + \beta \ln (Y/P)_t + \sum_{k=1}^{K} \gamma_k \ln (p^j/p^k)_t \tag{1}
\]

where $q_t^j$ is the demand for medical care (per beneficiary) of type $j$; $Y$ is
disposable income per beneficiary (in current prices); $P$ is the consumption
price deflator; $p^j$ is the patient's price of medical care of type $j$; $p^k$ is the
price of a substitute commodity $k$; $t$ indicates the year; and $K$ indicates the
number of substitutes for $j$. Note that $Y$ is put equal to the average earnings
(WAGE) and the average pension (PENSION) in the case of subgroups 1
and 2 respectively. Note that $K$ differs according to the medical care
category treated.
Table 4. Share of the subgroups in medical care expenditure, 1980 (in %).

Category of      Subgroup 1     Subgroup 2
medical care     (active)       (WOPI)

1a               71.6           28.4
1b               38.0           62.0
2a               95.9            4.1
2b               66.5           33.5
3a               89.5           10.5
3b               59.8           40.2
4                54.0           46.0
5a               48.4           51.6
5b               31.1           68.9
6                53.1           46.9
7a               64.9           35.1
7b               61.3           38.7
8a               58.1           41.9
8b               68.3           31.7
9                48.3           51.7
10               38.0           62.0
11               20.4           79.6
12               38.7           61.3
13               92.0            8.0
14               55.1           44.9
15               97.6            2.4
16               96.4            3.6
17               49.7           50.3

Source: Computation from statistics of the RIZIV (Brussels).

It has to be granted that determinants other than income and relative
prices may play a role in patients' demand for medical care. Firstly, medical
care delivered to patients may be influenced by supply factors. In Belgium,
doctors and dentists are paid by means of fees for services. This implies that
they may have a monetary incentive to expand their services. This monetary
incentive is likely to be strong whenever the ratio of doctors and dentists to
population is high. Indeed, the higher these ratios, the lower the average
income per doctor or dentist is likely to be if patients' demand is not
especially induced. This supplier-induced demand effect can be tested by
including these ratios as determinants in Equation (1).
Secondly, demographic variables may play a role in the demand for
medical care. It is indeed safe to postulate that morbidity levels are
correlated with the age structure of the population: the older the population,
the higher society's morbidity level and the higher the demand for medical
care. Two demographic variables will now be retained, viz. the ratios of the
child population (of less than 15 years) and the adult population (between 40
and 60) to the population under 60. The inclusion in the analysis of a
demographic variable representing the population older than 60 is not
necessary in view of the fact that separate equations will be estimated for the
WOPI. Since the latter group is dominated by pensioners, the old age effect
will be captured directly by the coefficients in those equations.
Thirdly, one may expect that the demand for pediatric care will be
influenced especially by the child population. In addition it is likely that the
latter variable encourages consultations at patients' homes. This can be
explained by the fact that parents frequently dislike transporting a sick child
to a physician's office because the transport itself may aggravate the child's
illness. Furthermore, examinations (in the physician's office) of sick children
who have caught an infectious disease may be discouraged by the physician
himself in order to limit the transmission of the disease.
Fourthly, medical care may be affected by climatic conditions; for instance
severe winters may boost the demand for consultations with physicians due
to the widespread occurrence of colds, influenza, etc. We have therefore
introduced average winter temperature (in centigrade) as a determinant in
the equations explaining consultations.
Let us now include the additional determinants in Equation (1) and
present the completely specified equations. For Category 1 we have:
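\[
\begin{aligned}
\ln q_t^{go} = {} & \alpha_1 + \beta_1 \ln (Y/P)_t + \gamma_{11} \ln (p^{go}/p^{gh})_t + \gamma_{12} \ln (p^{go}/p^{os})_t \\
& + \gamma_{13} \ln (p^{go}/p^{pe})_t + \delta_1 \ln R_t^{g} + \varepsilon_1 \ln \mathrm{TEMP}_t \\
& + \xi_1 \ln \mathrm{CHILD}_t + \varphi_1 \ln \mathrm{OLD}_t
\end{aligned}
\tag{2}
\]

\[
\begin{aligned}
\ln q_t^{gh} = {} & \alpha_2 + \beta_2 \ln (Y/P)_t + \gamma_{21} \ln (p^{gh}/p^{go})_t + \gamma_{22} \ln (p^{gh}/p^{os})_t \\
& + \gamma_{23} \ln (p^{gh}/p^{pe})_t + \delta_2 \ln R_t^{g} + \varepsilon_2 \ln \mathrm{TEMP}_t \\
& + \xi_2 \ln \mathrm{CHILD}_t + \varphi_2 \ln \mathrm{OLD}_t.
\end{aligned}
\tag{3}
\]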
The superscripts go, gh, pe, and os refer to medical care categories 1a, 1b, 2a
and 2b respectively. $R^g$ is the ratio of general practitioners to population
covered, TEMP indicates the average winter temperature⁶ in centigrade,
while CHILD and OLD are the ratios of the child population (of less than 15
years) and adult population (between 40 and 60) to the total population
(under 60), respectively. Two kinds of substitution effects are introduced in
the theoretical specifications: first, substitution between GP's and specialist's
care and, secondly, substitution between GP's consultation at the patient's
home and GP's consultation at the doctor's office.
The equations for Category 2 are the following:
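\[
\begin{aligned}
\ln q_t^{pe} = {} & \alpha_3 + \beta_3 \ln (Y/P)_t + \gamma_{31} \ln (p^{pe}/p^{go})_t + \gamma_{32} \ln (p^{pe}/p^{gh})_t \\
& + \delta_3 \ln R_t^{pe} + \varepsilon_3 \ln \mathrm{TEMP}_t + \xi_3 \ln \mathrm{CHILD}_t
\end{aligned}
\tag{4}
\]

\[
\begin{aligned}
\ln q_t^{os} = {} & \alpha_4 + \beta_4 \ln (Y/P)_t + \gamma_{41} \ln (p^{os}/p^{go})_t + \gamma_{42} \ln (p^{os}/p^{gh})_t \\
& + \delta_4 \ln R_t^{s} + \varepsilon_4 \ln \mathrm{TEMP}_t + \xi_4 \ln \mathrm{CHILD}_t + \varphi_4 \ln \mathrm{OLD}_t.
\end{aligned}
\tag{5}
\]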
In these equations, we also introduce substitution between GP's and
specialist's care. $R^{pe}$ and $R^{s}$ denote the ratios of pediatricians and specialists
(excluding pediatricians) to the total number of beneficiaries, respectively.
Given that pediatricians only deliver medical care to children, the variable
OLD has been omitted from Equation (4).

The specification of the equations for Category 3 is as follows:

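\[
\begin{aligned}
\ln q_t^{pd} = {} & \alpha_5 + \beta_5 \ln (Y/P)_t + \gamma_5 \ln (p^{pd}/P)_t + \delta_5 \ln R_t^{d} \\
& + \phi_5 \ln \mathrm{CHILD}_t + \varphi_5 \ln \mathrm{OLD}_t
\end{aligned}
\tag{6}
\]

\[
\begin{aligned}
\ln q_t^{od} = {} & \alpha_6 + \beta_6 \ln (Y/P)_t + \gamma_6 \ln (p^{od}/P)_t + \delta_6 \ln R_t^{d} \\
& + \phi_6 \ln \mathrm{CHILD}_t + \varphi_6 \ln \mathrm{OLD}_t.
\end{aligned}
\tag{7}
\]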
In this case, we have considered all other consumption to be the substitute
for dental care, hence the use of the consumption price deflator in the
relative price terms. The variable $R^d$ represents the ratio of dentists to
covered population.
A comment is in order about the relation between Equations (2) to (7)
and consumer demand theory. Notice that the equations satisfy the
homogeneity property: this property implies that the demand $q^j$ is not
sensitive to an identical percentage change of income and prices. Note that the
coefficients $\beta$ are real income elasticities whereas the coefficients $\gamma$ are the
compensated price elasticities.
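
The homogeneity property follows from the fact that income and prices enter the equations only as ratios. The following Python sketch illustrates this for the generic form of Equation (1); the function name, the parameter values and the price and income levels are all hypothetical and are not estimates from this paper.

```python
import math


def log_demand(income, deflator, own_price, substitute_prices, alpha, beta, gammas):
    """Right-hand side of Equation (1):
    ln q = alpha + beta * ln(Y/P) + sum_k gamma_k * ln(p_j / p_k)."""
    value = alpha + beta * math.log(income / deflator)
    for gamma_k, p_k in zip(gammas, substitute_prices):
        value += gamma_k * math.log(own_price / p_k)
    return value


# Hypothetical parameter values, for illustration only.
params = dict(alpha=-1.0, beta=0.7, gammas=[-0.2, 0.1])

base = log_demand(100.0, 1.0, 2.0, [3.0, 4.0], **params)

# Raise income, the deflator and all prices by the same 10%.
s = 1.1
scaled = log_demand(100.0 * s, 1.0 * s, 2.0 * s, [3.0 * s, 4.0 * s], **params)

# Raise nominal income alone by 1%: ln q changes by beta * ln(1.01).
income_only = log_demand(101.0, 1.0, 2.0, [3.0, 4.0], **params)
income_effect = income_only - base
```

Here `scaled` equals `base` up to floating-point error, while `income_effect` equals `beta * ln(1.01)`, which is the sense in which $\beta$ is a real income elasticity.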

Behavioural equations of medical care Categories 4 to 11

Basic specification
The medical care categories discussed here are different from those treated
before. They are not the immediate result of the patients' own demand.
Rather, they are the result of prescriptions by general practitioners and
specialists. Some types of medical care are also closely linked to hospital
stays. The determinants of these medical categories will therefore include
demand for medical care of Categories 1 and 2 and, wherever appropriate,
hospital stays (expressed in bed-days). In other words, these determinants
reflect that one first needs to consult with doctors or to stay in a hospital in
order to be referred to more specialized forms of medical care.
Another determinant is likely to be the price of medical care relative to
the price of a substitute. The sign of the price coefficient is expected to be
negative. Indeed, there may be a negative price effect on medical care caused
by patients who consider the price as being too high and consequently refuse
or postpone a particular prescription. Note that in the equations for hospital
bed-days, special care, laboratory tests and surgery, no relative price variable
will appear in view of the fact that over the sample period the coinsurance
rate has been zero for these types of medical care.
The basic specification can therefore be written as follows:

\[
\ln q_t^{i} = \alpha + \beta \ln q_t^{g} + \gamma \ln q_t^{s} + \delta \ln q_t^{h} + \varepsilon \ln (p^{i}/P)_t \tag{8}
\]

where $q^i$ is demand for medical care of type $i$ (per beneficiary); $q^g$ is
consultations with general practitioners (per beneficiary); $q^s$ is consultations
with specialists (per beneficiary); $q^h$ is hospital bed-days (per beneficiary); $p^i$
is the patient's price of $q^i$; and $P$ is the consumption price deflator. The
coefficients $\beta$, $\gamma$ and $\delta$ can be referred to as prescription elasticities whereas
$\varepsilon$ is the price elasticity. In the following subsections, we will treat the
different medical categories in somewhat greater detail and adjust the basic
specification (Equation (8)) wherever necessary.

Prostheses ($q^{pr}$)
We reason here that prostheses are prescribed by general practitioners
and/or specialists. Relative prices are also supposed to have an effect upon
$q^{pr}$. The specified relationship is therefore

\[
\ln q_t^{pr} = \alpha_7 + \beta_7 \ln q_t^{g} + \gamma_7 \ln q_t^{s} + \delta_7 \ln (p^{pr}/P)_t. \tag{9}
\]

Hospital in-patient care

We make a distinction here between hospital stays due to surgery ($q^{hs}$) and
those due to observation of the patient ($q^{ho}$). The most important
determinants of $q^{ho}$ are prescriptions by general practitioners and specialists. The
main determinant of $q^{hs}$ is $q^{su}$, reflecting the link between acts of surgery and
the patient's stay in hospital due to surgery.
Note also that, in this case, the more valuable a patient's time is, the fewer
days he is likely to stay in the hospital. We will capture this effect by
introducing real income as an explanatory variable. The expected impact is
negative in the sense that a higher real income induces patients to ask for a
reduction of their hospital stay. The equations are the following:
\[
\ln q_t^{ho} = \alpha_8 + \beta_8 \ln q_t^{g} + \gamma_8 \ln q_t^{s} + \delta_8 \ln (Y/P)_t \tag{10}
\]
\[
\ln q_t^{hs} = \alpha_9 + \beta_9 \ln q_t^{su} + \gamma_9 \ln (Y/P)_t. \tag{11}
\]

Technical medical treatment ($q^{tmt}$)

The equation for $q^{tmt}$ is similar to that for $q^{pr}$. The only difference is that we
introduce $q^h$ as a determinant since technical medical treatment can also be
complementary with hospital stays. The equation is therefore:

\[
\ln q_t^{tmt} = \alpha_{10} + \beta_{10} \ln q_t^{g} + \gamma_{10} \ln q_t^{s} + \delta_{10} \ln q_t^{h} + \varepsilon_{10} \ln (p^{tmt}/P)_t. \tag{12}
\]

Laboratory tests ($q^l$) and special care ($q^r$)

For these types of medical care $q^h$ also figures as a codeterminant in the
equations. In the present case, we also test the likelihood that technological
advance has induced extra prescriptions for laboratory tests and special care
by specialists. In other words, supply of new possibilities may create its own
demand. In order to account for this possible demand creation, a time trend
has been inserted in the behavioural equation. The equations are:

\[
\ln q_t^{l} = \alpha_{11} + \beta_{11} \ln q_t^{g} + \gamma_{11} \ln q_t^{s} + \delta_{11} \ln q_t^{h} + \varepsilon_{11} t \tag{13}
\]

\[
\ln q_t^{r} = \alpha_{12} + \beta_{12} \ln q_t^{g} + \gamma_{12} \ln q_t^{s} + \delta_{12} \ln q_t^{h} + \varepsilon_{12} t, \tag{14}
\]

where $t$ refers to the time trend.

Surgery ($q^{su}$) and anaesthesia ($q^a$)

The variable $q^s$ is a first determinant of $q^{su}$ since there is a clear link between
specialists' care and the likelihood that a surgical act is performed. The ratio
$R^s$ is part of the explanatory variables as well. It is a proxy variable for the
ratio of surgeons to total population, and is included in order to test the
hypothesis that acts of surgery are performed more frequently as $R^s$ becomes
larger. Another explanatory variable is the time trend $t$ that represents a
technological trend in the art of surgery. The latter may account for some
demand creation, as increasing know-how makes possible a greater variety
and frequency of surgery. In view of the above, the specification is

\[
\ln q_t^{su} = \alpha_{13} + \beta_{13} \ln q_t^{s} + \gamma_{13} \ln R_t^{s} + \delta_{13} t. \tag{15}
\]

Anaesthesia is viewed as complementary to acts of surgery, whence the
following specification was selected:

\[
\ln q_t^{a} = \alpha_{14} + \beta_{14} \ln q_t^{su}. \tag{16}
\]

Drugs ($q^{dr}$)
The consumption of $q^{dr}$ is explained first by $q^h$ in order to account for the
fact that hospital inpatients are important consumers of drugs. Secondly,
drugs are prescribed to outpatients by general practitioners and specialists,
whence $q^g$ and $q^s$ are introduced in the specification for $q^{dr}$. Thirdly, a
relative price variable may influence the drug consumption.
The specification is therefore

\[
\ln q_t^{dr} = \alpha_{15} + \beta_{15} \ln q_t^{h} + \gamma_{15} \ln q_t^{g} + \delta_{15} \ln q_t^{s} + \varepsilon_{15} \ln (p^{dr}/P)_t. \tag{17}
\]

Physiotherapy ($q^k$)
Physiotherapy is performed mainly upon prescription by doctors and as a
complementary service to hospital inpatient care. Therefore $q^h$, $q^g$ and $q^s$ are
included as determinants in the $q^k$ equation. Furthermore, we may again
have a relative price effect from $(p^k/P)$. In addition, we will investigate
whether there is supplier-induced demand by introducing the ratio of
physiotherapists to beneficiaries ($R^k$) in the equation. The specification is

\[
\begin{aligned}
\ln q_t^{k} = {} & \alpha_{16} + \beta_{16} \ln q_t^{h} + \gamma_{16} \ln q_t^{g} + \delta_{16} \ln q_t^{s} \\
& + \varepsilon_{16} \ln (p^{k}/P)_t + \phi_{16} \ln R_t^{k}.
\end{aligned}
\tag{18}
\]

Nurses' care of outpatients ($q^n$)

This type of care to an outpatient is often complementary to a previous stay
in a hospital. $q^n$ can also be prescribed by doctors to regular patients, so that
$q^s$ and $q^g$ are included as determinants. We will also test whether the relative
price is an explanatory factor in this case. The specification is

\[
\ln q_t^{n} = \alpha_{17} + \beta_{17} \ln q_t^{h} + \gamma_{17} \ln q_t^{g} + \delta_{17} \ln q_t^{s} + \varepsilon_{17} \ln (p^{n}/P)_t. \tag{19}
\]


For each subgroup of beneficiaries, we have the following identities:

\[
q_t^{g} = q_t^{go} + q_t^{gh} \tag{20}
\]
\[
q_t^{s} = q_t^{pe} + q_t^{os} \tag{21}
\]
\[
q_t^{h} = q_t^{ho} + q_t^{hs} + q_t^{hd} \tag{22}
\]
\[
E_t^{i} = q_t^{i} \cdot p_t^{i} \cdot B_t \tag{23}
\]
\[
E_t = \sum_i E_t^{i}, \tag{24}
\]

where $i$ refers to each of the categories of medical care (including the
exogenous ones), where $E_t^i$ refers to the expenditures of category $i$, and
where $q_t^{hd}$, $E_t$ and $B_t$ are the hospital bed-days per beneficiary (due to
deliveries), the total expenditures and the number of beneficiaries
respectively. Summing $E_t$ of both subgroups will, of course, give us total HIS
expenditures.
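
The identities (20) to (24) translate directly into a small aggregation routine. The Python sketch below is illustrative only: the function name, the category keys and all quantities, prices and the number of beneficiaries are hypothetical, and only a subset of the categories is included.

```python
def aggregate(q, p, beneficiaries):
    """Apply identities (20)-(24) for one subgroup and one year.

    q: demand per beneficiary by category, p: price by category.
    """
    q_g = q["go"] + q["gh"]                  # (20) GP consultations
    q_s = q["pe"] + q["os"]                  # (21) specialist consultations
    q_h = q["ho"] + q["hs"] + q["hd"]        # (22) hospital bed-days
    expenditures = {i: q[i] * p[i] * beneficiaries for i in q}   # (23)
    total = sum(expenditures.values())                            # (24)
    return q_g, q_s, q_h, expenditures, total


# Hypothetical values, for illustration only.
q = {"go": 2.0, "gh": 1.0, "pe": 0.5, "os": 1.5, "ho": 1.0, "hs": 0.5, "hd": 0.25}
p = {"go": 10.0, "gh": 20.0, "pe": 30.0, "os": 40.0, "ho": 50.0, "hs": 50.0, "hd": 50.0}

q_g, q_s, q_h, expenditures, total = aggregate(q, p, beneficiaries=1000)
```

Total HIS expenditure would then be obtained by calling the routine once for the active and once for the WOPI and adding the two totals.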

IV. Estimation results

The data

Data from 1966 to 1980 were used to estimate the equations. The data on
medical care are taken from the RIZIV-statistieken published by the
Ministerie van Sociale Voorzorg. The other data are taken from various issues of
the Statistisch Jaarboek van de Sociale Zekerheid and the Statistisch Jaarboek
van België.
Note that the prices of medical care that are directly available in the
statistics are those that are reimbursed by the HIS to the patient. In view of
the rather stable relationship between the patient's price and the reimbursed
costs for medical care over the sample period, we decided to use the latter as
proxies for the patient's price, without having to fear large approximation
errors. The estimation technique used was basically ordinary least squares.
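
To make the reported statistics concrete, the following Python sketch estimates a log-linear equation by ordinary least squares and computes the standard error of regression, the coefficient of determination corrected for degrees of freedom, and the Durbin-Watson statistic. It is a minimal illustration and not the authors' procedure: the data are synthetic, the coefficient values are hypothetical, and the disturbance is deliberately constructed to be orthogonal to the regressors so that the example is exactly reproducible.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(y, regressors):
    """OLS with a constant; returns coefficients, S.E.E., adjusted R^2 and DW."""
    n = len(y)
    X = [[1.0] + [col[t] for col in regressors] for t in range(n)]
    k = len(X[0])
    XtX = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)]
           for i in range(k)]
    Xty = [sum(X[t][i] * y[t] for t in range(n)) for i in range(k)]
    coef = solve(XtX, Xty)
    resid = [y[t] - sum(c * x for c, x in zip(coef, X[t])) for t in range(n)]
    ssr = sum(e * e for e in resid)
    mean_y = sum(y) / n
    sst = sum((v - mean_y) ** 2 for v in y)
    see = math.sqrt(ssr / (n - k))
    adj_r2 = 1.0 - (ssr / (n - k)) / (sst / (n - 1))
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / ssr
    return coef, see, adj_r2, dw


# Synthetic annual data for 1966-1980 (15 observations), all in logarithms.
t = [year - 1973 for year in range(1966, 1981)]          # -7, ..., 7
log_real_income = [0.03 * s for s in t]
log_relative_price = [0.01 * s * s for s in t]
pattern = {1: 1, 2: 1, 3: -1}                             # orthogonal by design
disturbance = [0.01 * pattern.get(abs(s), 0) * (1 if s > 0 else -1) for s in t]

# Hypothetical true parameters: constant 1.0, income 0.7, price -0.2.
y = [1.0 + 0.7 * a - 0.2 * b + e
     for a, b, e in zip(log_real_income, log_relative_price, disturbance)]

coef, see, adj_r2, dw = ols(y, [log_real_income, log_relative_price])
```

With only 13 to 15 annual observations per equation, as in the tables below, the Durbin-Watson bounds are wide, which is why the test is often inconclusive in samples of this size.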

The results for Categories 1 to 3

Table 5a. Estimation results of medical care Categories 1 to 3 - subgroup of the active.

Explanatory variables: Constant, WAGE/P, $p^j/P$, $p^{gh}/p^{os}$, $p^{gh}/p^{pe}$, $p^{go}/p^{os}$, $R^j$, CHILD, OLD, TEMP.

Category  Dependent   Number of     DW     S.E.E.  $\bar{R}^2$  Coefficients
          variable    observations                              (standard errors below)

1a        $q^{go}$    14            2.21   0.014   0.977        -1.4242   0.7236   -0.1959   0.8063   4.0460
                                                                (1.0178)  (0.0677) (a)       (0.3401) (0.9204)
1b        $q^{gh}$    13            1.83   0.028   0.621        4.1034    0.21     -0.0659   -0.0571  1.5045   3.3374   -0.0176
                                                                (2.2146)  (a)      (a)       (a)      (0.3263) (1.4980) (0.0119)
2a        $q^{pe}$    15            1.97   0.022   0.962        -4.8268   0.7701   0.2315    0.3118   1.6395
                                                                (1.9 I (9) (0.0829) (0.0990) (0.1668) (0.3887)
2b        $q^{os}$    15            2.43   0.017   0.989        -2.4248   0.7157   0.0552    0.2210   1.0858   3.0554
                                                                (1.3269)  (0.2516) (0.0411)  (0.1642) (0.5730) (1.7696)
3a        $q^{pd}$    15            1.67   0.053   0.956        -8.7403   1.3163   -0.6306   0.4059
                                                                (2.6980)  (0.3385) (0.3831)  (0.1094)
3b        $q^{od}$    15            1.61   0.015   0.916        -0.3375   0.1661   -0.4481
                                                                (1.2520)  (0.0386) (0.1776)

Notes: 1. All variables are expressed in natural logarithms. 2. The figures below the coefficients are standard errors. 3. The superscript j refers to the demand category
j estimated. 4. DW, S.E.E. and $\bar{R}^2$ are the Durbin-Watson statistic, the standard error of regression and the coefficient of determination (corrected for degrees of
freedom) respectively. 5. The symbol a indicates that the coefficient value has been assigned.

Table 5b. Estimation results of medical care Categories 1 to 3 - subgroup of the WOPI.

Explanatory variables: Constant, PENSION/P, $p^j/P$, $p^{gh}/p^{os}$, $p^{go}/p^{os}$, TEMP.

Category  Dependent   Number of     DW     $\rho$   S.E.E.  $\bar{R}^2$  Coefficients
          variable    observations                                       (standard errors below)

1a        $q^{go}$    15            0.971  0.6490   0.022   0.220        -0.1560   0.0948   -0.2630
                                                                         (0.8208)  (0.0778) (a)
1b        $q^{gh}$    14            1.15            0.025   0.692        -0.1256   0.1814   -0.0256   -0.0173
                                                                         (0.3503)  (0.0335) (a)       (0.0089)
2b        $q^{os}$    15            1.68            0.021   0.937        -1.2795   0.1872   0.0625    0.2502
                                                                         (1.2135)  (0.1074) (0.0357)  (0.1430)
3a        $q^{pd}$    15            1.50            0.052   0.947        -12.5998  1.2680   -0.4704
                                                                         (0.6765)  (0.1757) (0.2926)

Notes: 1. See notes of Table 5a. 2. $\rho$ is the coefficient of first-order autocorrelation between the residuals. An iterative technique was used to calculate $\rho$; note that the
DW-test may be inconclusive.

The results for the active and the WOPI are presented in Tables 5a and 5b
respectively. The first general remark is about the cross-price elasticities. In
some equations these were rather difficult to estimate due, primarily, to
multi-collinearity problems. It was then decided to impose certain coefficient
values in the relevant equations. A priori values were obtained by making
use of the symmetry condition of consumer demand theory. For instance,
between any two medical care commodities, say m and n, the symmetry
condition dictates that

\[
\left( \frac{\partial q^m}{\partial p^n} \right)_c = \left( \frac{\partial q^n}{\partial p^m} \right)_c \tag{25}
\]

where the subscript c indicates that the cross-price effects are compensated
price effects. We now define the cross-price elasticities

\[
\varepsilon_{mn} = \left( \frac{\partial q^m}{\partial p^n} \right)_c \frac{p^n}{q^m} \tag{26}
\]

\[
\varepsilon_{nm} = \left( \frac{\partial q^n}{\partial p^m} \right)_c \frac{p^m}{q^n}. \tag{27}
\]

Using Equation (25), we can write Equation (26) as

\[
\varepsilon_{mn} = \left( \frac{\partial q^n}{\partial p^m} \right)_c \frac{p^n}{q^m} \tag{28}
\]

or, further using Equation (27), as

\[
\varepsilon_{mn} = \varepsilon_{nm} \, \frac{q^n p^n}{q^m p^m}. \tag{29}
\]

We proceed further as follows. We estimated, in an unconstrained way, the
equations for Categories 2a and 2b. Taking the estimated price elasticities
and subsequently using Equation (29), we were able to calculate the assigned
coefficients. For the ratio $q^n p^n / q^m p^m$, we used an average of the ratios for
1966, 1974 and 1980.
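
The calculation of an assigned coefficient from Equation (29) can be sketched as follows in Python. The function name, the estimated elasticity and the three expenditure ratios are hypothetical; only the rule itself (the estimated elasticity times the average expenditure ratio over three years) follows the text.

```python
def assigned_cross_elasticity(estimated_eps_nm, expenditure_ratios):
    """Equation (29): eps_mn = eps_nm * (q^n p^n) / (q^m p^m),
    with the expenditure ratio averaged over the selected years."""
    average_ratio = sum(expenditure_ratios) / len(expenditure_ratios)
    return estimated_eps_nm * average_ratio


# Hypothetical inputs: an estimated compensated cross-price elasticity and
# the ratios q^n p^n / q^m p^m for 1966, 1974 and 1980.
eps_nm = 0.25
ratios = [0.6, 0.8, 1.0]

eps_mn = assigned_cross_elasticity(eps_nm, ratios)
```

The smaller the expenditure on commodity n relative to commodity m, the smaller the assigned elasticity of m with respect to the price of n.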
Secondly, notice that for the WOPI, no estimation results are presented
for $q^{pe}$ and $q^{od}$. The reason is that the time series for these variables
displayed an almost constant value throughout the sample period. These
variables will therefore be considered as exogenous in our model.
One can see that the income effects are rather important in the demand
equations. Only in the case of $q^{gh}$ for the active did we have to impose a
value for the income elasticity; the latter was based upon the estimation
result obtained for the WOPI. One notices that the income elasticity in the
equation for preservative dental care is higher than for the other types of
medical care. Since it exceeds unity, it would mean that preservative dental
care is a luxury good. Notice also that the income elasticities of the active
exceed those of the WOPI. In our view, the latter indicates that income
is less of a constraining factor for the WOPI due to the fact that medical care
is delivered at a very low average price to this subgroup. The same reason
can be used to explain the rather high unexplained variance of $q^{go}$ of the
WOPI. Considerable stochastic movements in $q^{go}$ are the cause of a low $\bar{R}^2$:
they are likely to arise more frequently here because income and prices only
constitute weak constraints on the WOPI's demand for medical care.
Concerning the price elasticities, we were unable to find significant
substitution effects between GP's home and office consultations. Substitution
effects were obtained between medical care offered by GPs and specialists.
For instance, in the case of $q^{os}$ for both subgroups, the price effects indicate
that patients are more inclined to demand specialist's care if the GP's medical
care becomes relatively more expensive. The results also convey that patients
demand more pediatric care if the difference in prices of GP's and pediatric
care is narrowing. Notice that the price elasticities in the case of dental care
are also very significant.
Inspecting Table 5a, it is interesting to note that the demographic variables
CHILD and OLD have a statistically significant impact on consultations of the
active with GPs and specialists. The population group between 40 and 60
years old definitely has a higher medical care consumption pattern than the
younger generations; the marginal effect of OLD is at least twice as high as
that of CHILD. Notice that CHILD has a stronger influence on the demand
for pediatric care and the demand for home visits than on the demand for
office consultations and specialist's care. Our estimation results also reveal
that the demographic variables exert no special influence on dental care.
The results further show that a supplier-induced demand effect seems to
be present in the case of pediatric and dental care for the active. The effect
of climatic conditions is not to be neglected in the demand of both subgroups
for home visits by GPs.

The results for Categories 4 to 11

The results in Tables 6a and 6b clearly show that there are strong linkages
among the various medical care categories. The variables $q^g$ and $q^s$ are
codeterminants in most equations. The variable $q^{su}$ has a special impact upon
$q^{hs}$ and $q^a$. Hospital stays $q^h$ have significant effects in the equations for $q^l$, $q^{dr}$,
$q^k$ and $q^n$. Comparing the results of the WOPI with those of the active, we
notice especially that the prescription elasticities in the case of $q^{pr}$ and $q^{ho}$ for
the WOPI exceed those for the active. This reflects a higher medical need by
the WOPI for these particular types of medical care. Furthermore, we see
that the elasticity of $q^{su}$ on $q^{hs}$ is also higher for the WOPI, conveying their
need for longer hospital stays due to surgery.
The price elasticities indicate that prices seem to matter in the allocation
Table 6a. Estimation results of medical care Categories 4 to 11 - subgroup of the active.

Explanatory variables: Constant, $q^g$, $q^s$, $q^h$, $q^{su}$, $p^i/P$, WAGE/P, $R^i$.

Category  Endogenous  Number of     DW     $\rho$   S.E.E.  $\bar{R}^2$  Coefficients
          variable    observations                                       (standard errors below)

4         $q^{pr}$    14            2.25            0.024   0.982        4.3861    0.1495   -1.0905
                                                                         (0.2743)  (0.0565) (0.0500)
5a        $q^{ho}$    13            2.35            0.018   0.635        3.0262    0.2498   0.5828    -0.3883
                                                                         (1.3867)  (0.0718) (0.1676)  (0.1396)
5b        $q^{hs}$    15            1.22            0.027   0.436        3.3729    0.5642   -0.3477
                                                                         (1.5392)  (0.1694) (0.1239)
6         $q^{tmt}$   15            1.02            0.103   0.953        1.4905    1.8924   1.2616    0.2103   -0.9768
                                                                         (1.6629)  (0.4116) (0.2744)  (0.0457) (0.2911)
7a        $q^{l}$     15            1.73            0.066   0.990        -1.0850   0.4148   0.7111    1.6593   0.1269
                                                                         (0.1598)  (0.1089) (0.1867)  (0.4356) (0.0072)
7b        $q^{r}$     15            1.54            0.035   0.985        -0.2740   0.0708   0.2832    0.1214   0.0538
                                                                         (0.0852)  (0.0477) (0.1911)  (0.0819) (0.0068)
8a        $q^{su}$    15            2.29            0.013   0.991        2.1510    0.4752   0.4973    0.0083
                                                                         (0.1195)  (0.0975) (0.0052)
8b        $q^{a}$     15            1.36   0.7082   0.029   0.941        -2.2609   0.6040
                                                                         (0.1549)  (0.1623)
9         $q^{dr}$    13            1.88            0.035   0.277        -0.0110   0.5120   0.5632    0.2150   -0.3471
                                                                         (0.0154)  (0.2101) (0.2311)  (0.0882) (0.2449)
10        $q^{k}$     14            1.71            0.043   0.975        12.4655   0.6761   0.2897    0.4507   -1.5888  0.8934
                                                                         (1.6885)  (0.3141) (0.1346)  (0.2094) (0.3011) (0.0819)
11        $q^{n}$     14            1.89   0.5417   0.039   0.780        0.3723    1.1136   1.0567    0.4772   -0.6666
                                                                         (1.6560)  (0.4909) (0.5010)  (0.2104) (0.4165)

Notes: See the notes of Tables 5a and 5b. The equation for $q^{dr}$ has been estimated in first differences.

Table 6b. Estimation results of medical care Categories 4 to 11 - subgroup of the WOPI.

Explanatory variables: Constant, $q^{g}$, $q^{s}$, $q^{h}$, $q^{su}$, $p_i/P$, $R_i$

Category  Endogenous  Coefficients                                          Number of     DW    ρ       S.E.E.  R²
          variable    (standard errors)                                     observations

4         $q^{pr}$    4.3153  0.3140  0.7326  -1.1186                       14            2.23          0.025   0.998
                      (0.2245) (0.0448) (0.1045) (0.0177)
5a        $q^{ho}$    -2.4468  0.9984  2.3295                               15            1.41          0.063   0.927
                      (0.2515) (0.0746) (0.1740)
5b        $q^{hs}$    0.4158  0.8493                                        15            1.83          0.046   0.923
                      (0.0357) (0.0653)
6         $q^{tmt}$   1.1275  0.3879  0.2586  0.5818  -0.2174               15            0.91  0.8542  0.059   0.985
                      (1.6410) (0.2325) (0.1550) (0.3488) (0.2425)
7a        $q^{l}$     -2.3099  0.4122  0.7067  1.6490  0.0611               15            0.88          0.089   0.983
                      (1.3110) (0.2106) (0.3610) (0.8424) (0.0481)
7b        q'          -0.2086  0.1326  0.5302  0.2272  0.0389               15            1.23          0.043   0.974
                      (0.4994) (0.0979) (0.3917) (0.1679) (0.0159)
8a        $q^{su}$    1.0857  0.7041  0.3110  0.0236                        15            1.29          0.012   0.995
                      (0.5286) (0.1644) (0.0779) (0.0039)
8b        $q^{a}$     -2.4917  0.6774                                       15            1.62  0.6798  0.029   0.967
                      (0.0578) (0.1307)
9         $q^{dr}$    -0.0197  0.4506  0.4957  0.1893  -0.0897              13            2.03          0.034   0.106
                      (0.0192) (0.2514) (0.2765) (0.1056) (0.2921)
10        $q^{k}$     10.8749  0.2169  0.0964  0.8678  -1.0749  0.8766      14            1.22          0.065   0.978
                      (4.7996) (0.1756) (0.0780) (0.7024) (0.3662) (0.3860)
11        $q^{n}$     1.5057  1.2037  0.3009  0.8024  -0.8424               14            0.94  0.7025  0.075   0.900
                      (3.0153) (0.4432) (0.1108) (0.2955) (0.7968)

Notes: See the notes of Tables 5a, b and 6a.


of medical care. They show that patients or their doctors, being their patients'
agents, have a demand for medical care that is price sensitive. The elasticities
are especially high in the case of prostheses and physiotherapy and indicate
that strong substitution takes place if these types of medical care get dearer
vis-a-vis other commodities.
The wage effects in the equations for the active explaining $q^{ho}$ and $q^{hs}$
show that there is a tendency to shorten one's hospital stay as the oppor-
tunity cost in terms of lost wages in the market increases. The influence of
supplier-induced demand could be detected in the case of surgery and
physiotherapy. Its impact is especially strong in the equations for $q^{k}$. Note
that the improvement in medical technology, as captured by a time trend, has
an impact on surgery, special care and laboratory tests.
It has to be recognized that the equations describing the demand for drugs
perform less well in terms of $R^2$ than other equations. The estimated price
elasticities are also not significant statistically. These results can be explained,
firstly, by the highly aggregated nature of the variable drugs which makes it
very difficult to find an ideal specification. Secondly, the reimbursement
system that was valid up to November 1980 did not give any incentive to
doctors and their patients to choose a cost-effective treatment. Indeed,
patients paid a fixed amount per drug regardless of the drug's cost. Thus, the
relative absence of hard economic constraints on prescription behaviour
contributes to the explanation of both the rather stochastic nature of drug
consumption, leading to a low $R^2$, and the statistical insignificance of the
price coefficients.

V. Concluding remarks

The model explained above is estimated using macrodata for all beneficiaries
of the insurance scheme for workers. We have been able to show that real
income and prices of medical care matter in the allocation of medical care. In
some cases we also found effects on demand generated by suppliers of
medical care. Furthermore, the progress in medical technology, captured by a
time trend, creates extra demand for certain types of medical care. Climatic
conditions and the population structure also seem to matter in a number of cases.
Earlier, in Table 1, we presented the share of medical care expenditures in
national income. This share increases from 1966 to 1980. Yet, it is stable
from 1978 to 1980. Note now that no major policy changes were introduced
during those particular years. We argue that the reason for this apparent cost
containment can be found in the evolution of some important explanatory
variables revealed by our model. In our view, the variables that are most
responsible for this stable share are real wages and real pensions. Indeed, the
growth in those real incomes during the years 1978 to 1980 has been
remarkably modest compared with the growth in the pre-1978 period. This

lower growth is likely to have slowed down the growth in the demand for
medical care.
In general the explanatory power of the equations is rather high. We do
find high standard errors of estimate, however, in the case of $q^{tmt}$ for the
active and of $q^{tmt}$, $q^{k}$ and $q^{n}$ for the WOPI. A major problem encountered
while building the model was that the sample size was rather small. There is
the risk that the estimates may not always be robust. In a number of cases,
explanatory variables are subject to too low a variance, contributing to the
statistical insignificance of certain estimates. Such insignificant coefficients
can be tolerated, however, when they have a theoretically correct sign. It is
also our hope that as more data become available, more precise
estimates will be found.
It is granted that the present paper represents only a first step in the
construction of a comprehensive model of the health sector in Belgium. In
addition, one could model the markets for various categories of health
personnel and the determination of medical care prices. Studies of the
macroeconomic type like this one can also be complemented by models
using microdata on patients and suppliers of medical care. Finally, it is
evident that the present model can and will be used in forecasting health care
expenditures and in simulating alternative government policies.


Acknowledgements

Thanks are due to the Belgian Fonds voor Kollektief Fundamenteel Onderzoek
for financial support. Comments and suggestions by A. P. Barten, H.
Deleeck, L. Delesie, D. Deliège, J. Kesenne, W. Nonneman, N. Van Belle and
an anonymous referee on previous drafts of this paper are gratefully
acknowledged. We also thank participants of the Public Economics Seminar
of Namur and the 9th International Conference of Applied Econometrics
(Budapest, March 1982) for useful comments. Remaining errors are the
authors' sole responsibility.


Notes

1. These include tuberculosis, poliomyelitis, cancer, congenital and mental diseases.

2. Applicable up to the end of 1981; from 1982 onwards, subsidies equal 80% of the
expenditures of the WOPI.
3. From 1982 onwards, these sources of revenue no longer exist.
4. The price of a hospital day includes the following elements: depreciation, financial
charges, overhead costs, maintenance costs, nurses' salaries, administration costs, the
costs of drug preparations, hotel costs and the costs of laundry and linen.
5. The same type of specification applies to both subgroups. Hence, no special subgroup
index will have to appear in the equations explained in this and the following sections.
6. More precisely, it is the average of the temperatures in the months of January, February,
March, November and December.
Reconciling spatial demand/supply imbalances
in acute care


Commonwealth Scientific and Industrial Research Organisation,
Division of Building, Construction and Engineering,
Highett, Victoria 3190, Australia


In assessing the performance of acute care hospital systems, most of the
emphasis has been on supply factors, such as patient throughput, occupancy
rates, etc., and the associated hospital operating costs. Less attention has
been paid to assessing the need for care, and to reconciling the spatial
discrepancies between this need and the actual satisfied demand as reflected
in observed admissions within a given supply configuration. The determina-
tion of the need for care is especially important in the excess demand
situations which exist in most large cities in countries with highly-subsidised
public hospital systems. This is because (i) the spatial distribution of the
unsatisfied demand has significant equity implications and (ii) the abstraction
of need for care from the observed spatial pattern of admissions becomes
more difficult as excess demand levels increase. In order to forecast the
behaviour of such a system, two models are required. Firstly, it is necessary
to forecast the need for care from within different parts of a region (e.g. the
different municipalities within a metropolitan area). This need is clearly not
unbounded, and must conform with constraints such as income levels and the
corresponding incidence of private hospital insurance. Secondly, a model is
required to forecast how this need for care is transformed into an actual
spatial pattern of hospital admissions under alternative scenarios of hospital
development and changes in the throughput characteristics of the participat-
ing hospitals. This model must make a fundamental distinction between two
cases for each specialty (i) overall excess demand (typical of many specialties
in public hospital systems) and (ii) overall excess supply (typical of most
private hospital systems). However, due to spatial and other externalities,
some facilities should be permitted to run below their nominal capacity in the
excess demand situation, and some facilities should be permitted to run at
capacity in the excess supply situation. Finally, it is assumed that an external
model exists which can subdivide the potential demand for care in each
municipality into two components, one for public hospitals and the other for
private hospitals. Such a model would need to be responsive to changes in
hospital insurance arrangements and the level of queuing in the public
hospital system.

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care. 79-94.
© 1991 Kluwer Academic Publishers.
80 J. R. Roy and M. Anderson

The HOSPIM model, developed by the authors at CSIRO with partial
support from the Commonwealth Department of Health, Australia, repre-
sents an attempt to formulate and implement a microcomputer package
containing the two models as described. In addition to drawing on support
from within CSIRO plus their own resources, the authors were influenced by
several key contributions to this field. These include the pathbreaking paper
by Mayhew and Leonardi (1982), the companion book (Mayhew, 1986) and
further advances in Mayhew et al. (1986).
In the following, the development of the HOSPIM models for the excess
demand case is first described. Then there is an analogous treatment of the
excess supply situation. A further section deals with the implementation of
the model, including the data input requirements and the mode of operation.
In conclusion, the possible outputs from the model are described, including
various performance indicators and summary statistics, illustrated with a
practical example.

Models for overall excess demand conditions

Modelling the need for acute care

In the excess demand case, it is essential to distinguish between the intrinsic
need for care and the actual satisfied demand as expressed in a given supply
environment. Overall excess demand conditions will be identified by chronic
queuing at all or most of the facilities or particular specialties in the given
hospital system. Although the subject of assessing 'need' for acute hospital
care is a highly complex matter, involving issues such as lifestyles, alternative
medicine, levels of primary care, ability to pay, health education, the role of
preventative medicine and possible surgical overservicing tendencies, etc.,
HOSPIM uses a definition consistent with its primary role as a decision aid
in the location and mix of acute hospital capacity. Thus, the need for care in
each specialty is defined as that demand which would notionally emerge in
each zone if every potential patient were closely accessible to a hospital with
no endemic queuing, under existing (or projected) hospital insurance condi-
tions, primary care availability and household incomes. This definition
should ideally encompass all factors influencing demand except for the
location and availability of supply.
Having defined the need for care in relation to the policy instruments
relevant to the interaction model, the next task is to develop a separate
model to determine this need. In most cities, the age of patients and their sex
category are recognised as the main determinants of morbidity. Secondary
influences such as ethnicity and socio-economic group also play a
part. In HOSPIM, the demand model is implemented for age/sex
morbidity, with the option of correcting for the secondary influences by the
user if desired. In order to neutralise the effects of different local availability
of facilities, the morbidity rates are average values for admissions over the
entire study area.
Thus, defining $P_{ia}$ as the base-period population in municipality i of
age/sex category a, $m_{ak}$ as the average admissions (i.e. morbidity) rate per
capita per year of age/sex category a for specialty k, $e_{ik}$ as the (optional) extra
contribution of the secondary influences (other than age and sex) on
morbidity in municipality i for specialty k and $\theta_{ik}$ the proportional leakage
of patients from i in specialty k to hospitals outside the study region, the
expected demand $\bar{W}_{ik}$ from municipality i in specialty k is given as


Because the morbidity data may have only been available for a larger spatial
unit than the study area, it is usually necessary to modify the $\bar{W}_{ik}$ values in the
base period by a small common correction, defined as $\mu_k$, to ensure that, for
each specialty k, the sum of the actual observed admissions $T^0_{ijk}$ from zones i
to hospitals j equals the sum of expected demand from zones i, that is,

$$\sum_{ij} T^0_{ijk} = (1 + \mu_k) \sum_i \bar{W}_{ik} \quad \forall k, \tag{2}$$

where we now define the adjusted relative demand $(1 + \mu_k)\,\bar{W}_{ik}$ as $W_{ik}$.
For the simplest class of demand/supply interaction model, estimates of
the expected or relative demand $W_{ik}$ for care are all that is required for input.
However, such a model implies that a uniform relative increase of capacity in
each hospital would induce a correspondingly uniform relative increase in
admissions from each zone, irrespective of the existing relative levels of
unsatisfied demand in such zones. In order to be able to support a demand/
supply interaction model with elastic demand, it is necessary to determine the
need for care, rather than just the relative demand. The procedure used is to
seek the zone i which has the maximum value $(1 + r)$ of the ratio
$(\sum_j T^0_{ijk})/W_{ik}$ of observed admissions to expected admissions based on the morbidity
of its population. This zone may in a general sense be defined as that most
'accessible' to care for the given specialty. It is then hypothesised that if new
supply were provided such that all zones were made as accessible as this best
zone, each of them would also exhibit this same ratio of observed admissions
to expected admissions. Furthermore, if strong queuing exists in the system
for the specialty, even this best zone will not have produced its full latent
demand for care, and one may perform sensitivity analyses with trial values $\varepsilon$ of
a further uniform increase in demand. Thus, from the above, the need for
care $\tilde{W}_{ik}$ may be defined as:

$$\tilde{W}_{ik} = (1 + \varepsilon)(1 + r)\, W_{ik} \tag{3}$$

representing a uniform increase of the expected demand values $W_{ik}$. Note
that, if the incidence of day clinics varies greatly in different parts of the
study region, the values of $W_{ik}$ would need to be correspondingly corrected
before being used as inputs to the hospital demand/supply model. In
addition, if the availability of primary care varied markedly within the region,
projections from the hospital model would only be expected to be reliable if
the proposed hospital supply policies were accompanied by policies to en-
courage general practitioners to locate in deprived areas.
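
As a rough illustration of Equations (2) and (3), the following sketch computes the common correction, the adjusted relative demand and the need for care for a single specialty. The function name, its arguments and the toy figures are hypothetical and are not taken from HOSPIM; the expected demand per zone is treated as a given input, since it depends on the morbidity model.

```python
import numpy as np


def need_for_care(expected, observed, eps=0.0):
    """Need for care for one specialty, following Equations (2) and (3).

    expected : expected demand per zone from the morbidity rates (assumed given)
    observed : base-period admissions, shape (zones, hospitals)
    eps      : trial value of the further uniform increase in demand
    """
    expected = np.asarray(expected, dtype=float)
    admitted = np.asarray(observed, dtype=float).sum(axis=1)  # admissions per zone

    # Equation (2): common correction so that the totals agree
    mu = admitted.sum() / expected.sum() - 1.0
    relative = (1.0 + mu) * expected

    # Best-served zone: largest ratio of observed to expected admissions
    r = (admitted / relative).max() - 1.0

    # Equation (3): uniform scaling of the relative demand
    need = (1.0 + eps) * (1.0 + r) * relative
    return mu, relative, r, need


# Toy data, invented for illustration: three zones, two hospitals
expected = [50.0, 100.0, 50.0]
observed = [[60.0, 60.0], [100.0, 80.0], [50.0, 50.0]]
mu, relative, r, need = need_for_care(expected, observed, eps=0.1)
```

By construction the need in every zone is at least as large as its observed admissions, so the implied unsatisfied demand is never negative.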

Hospital demand/supply interaction models

Having evaluated the need for care in our excess demand system, the next
task is to determine the expected pattern of hospital admissions under
alternative scenarios of hospital construction, partial or complete closures,
changes in hospital case-mix and changes in the throughput characteristics
for each hospital for each specialty. Several approaches are possible to
develop such a model, and HOSPIM allows the user the choice of the
approach perceived as most relevant for his region.
Firstly, the entropy maximisation procedure, introduced into spatial
modelling by Wilson (1967), is extended to handle elastic demand (Roy et
al., 1987). Such a model is simpler to calibrate than the analogous utility
model presented by Mayhew and Leonardi (1982). Defining the microstates
of the system as either distinct potential patients not being admitted for
treatment or as distinct patients being admitted to any of the particular
hospitals in the study area, the task is to maximise the number of microstates
to determine the most probable macrostate $T_{ij}$, that is the number of patients
admitted from i to j, and the unsatisfied demand $(\tilde{W}_i - \sum_j T_{ij})$ in i, called $V_i$.
Note that, as such an analysis must be performed for each specialty in turn,
the specialty index k is omitted for convenience. The number of microstates
$N_T$ is given as the number of ways (combinations) that each distinct potential
patient of group $\tilde{W}_i$ in the base period may be assigned either to the
unsatisfied group $(\tilde{W}_i - \sum_j T_{ij})$ or to admitted groups $T_{ij}$ in each hospital j,

$$N_T = \prod_i \frac{\tilde{W}_i!}{\big(\tilde{W}_i - \sum_j T_{ij}\big)!\; \prod_j T_{ij}!} \tag{4}$$


The natural log of $N_T$ is taken, the Stirling approximation $\ln X! \approx X(\ln X - 1)$
is applied and hospital capacity utilisation constraints

$$\sum_i T_{ij} = D_j \quad \forall j \tag{5}$$

are enforced, where $D_j$ is the nominal case load capacity of hospital j for the
given specialty (computed from data on available beds, occupancy rate and
average length of stay). If a travel time (distance) constraint for each hospital

$$\sum_i T_{ij}\, t_{ij} = \sum_i T^0_{ij}\, t_{ij} \quad \forall j \tag{6}$$

is also added, with $t_{ij}$ the travel time (distance) between zone i and hospital j,
the entropy maximisation problem is given as

$$S = \max_{T_{ij},\,\lambda_j,\,\beta_j} \; -\sum_{ij} T_{ij}(\ln T_{ij} - 1) - \sum_i \Big(\tilde{W}_i - \sum_j T_{ij}\Big)\Big[\ln\Big(\tilde{W}_i - \sum_j T_{ij}\Big) - 1\Big] + \sum_j \lambda_j \Big(D_j - \sum_i T_{ij}\Big) + \sum_j \beta_j \sum_i \big(T^0_{ij} - T_{ij}\big)\, t_{ij} \tag{7}$$

where $\lambda_j$ and $\beta_j$ are Lagrange multipliers to be determined. The result comes
out as

$$T_{ij} = \frac{\tilde{W}_i B_j f_{ij}}{1 + \sum_l B_l f_{il}} \tag{8}$$

where $f_{ij} = \exp(-\beta_j t_{ij})$ and $B_j = \exp(-\lambda_j)$, which may be obtained
iteratively after substitution of (8) into (5). The objective (7) may then be
modified, converting the original Lagrange multipliers $\beta_j$ to parameters, via
the following Legendre transform (Lesse, 1982)

where $C_j$ is the base period travel time $(\sum_i T^0_{ij} t_{ij})$ to hospital j. The Legendre
transform switches the role (i.e. as knowns or unknowns) between certain
nominated Lagrange multipliers and the right-hand sides of the corresponding
constraints, whilst leaving unchanged the optimal solution of the original
problem. In our case, the original problem, with average trip times $C_j$ to each
hospital given and the corresponding multipliers $\beta_j$ as unknowns, is transformed
to a problem where $\beta_j$ is given and the corresponding average trip
times $C_j$ to each hospital are unknowns, which can then be evaluated anew in
the forecast period as the spatial distribution of demand and supply changes.
Thus, as trip times here represent the internal behavioural variables of the
model, the transformed problem can evaluate the behavioural response of
the system to the expected changes. The transformed forecasting problem is
finally given as

$$\max_{T_{ij},\,\lambda_j} \; -\sum_{ij} T_{ij}(\ln T_{ij} - 1 + \beta_j t_{ij}) - \sum_i \Big(\hat{W}_i - \sum_j T_{ij}\Big)\Big[\ln\Big(\hat{W}_i - \sum_j T_{ij}\Big) - 1\Big] + \sum_j \lambda_j \Big(\hat{D}_j - \sum_i T_{ij}\Big), \tag{10}$$

where $\hat{D}_j$ represent the forecast (or planned) hospital case load capacities
and $\hat{W}_i$ the forecast need values from (3). The solution for (10) is of a similar
form to (8), except that $\tilde{W}_i$ is replaced by $\hat{W}_i$, and $D_j$ by $\hat{D}_j$ in (5) to reflect
new hospital capacities.
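
The iterative evaluation of the attractiveness terms can be sketched as follows. This is a minimal illustration, not the HOSPIM code: it assumes that (8) has the elastic-demand form in which admissions equal need times $B_j f_{ij}$ divided by one plus the sum of $B_l f_{il}$ over hospitals, and it rescales each $B_j$ until the capacity constraints (5) hold. The capacity formula (beds times occupancy times days per year, divided by average length of stay) is a common convention assumed here; the function names and toy data are hypothetical.

```python
import numpy as np


def case_load_capacity(beds, occupancy, mean_stay_days, days_per_year=365.0):
    """Annual case load capacity from beds, occupancy rate and length of stay (assumed formula)."""
    return beds * occupancy * days_per_year / mean_stay_days


def calibrate_elastic_model(need, capacity, f, tol=1e-10, max_iter=10_000):
    """Find B_j such that the elastic-demand flows fill each hospital's capacity."""
    need = np.asarray(need, dtype=float)
    capacity = np.asarray(capacity, dtype=float)
    f = np.asarray(f, dtype=float)
    if capacity.sum() >= need.sum():
        raise ValueError("overall excess demand requires total capacity < total need")

    def flows_for(B):
        denom = 1.0 + f @ B  # 1 + sum_l B_l f_il, one value per zone
        return need[:, None] * B[None, :] * f / denom[:, None], need / denom

    B = np.ones(capacity.size)
    for _ in range(max_iter):
        flows, _ = flows_for(B)
        B_new = B * capacity / flows.sum(axis=0)  # rescale towards column sums D_j
        converged = np.max(np.abs(B_new / B - 1.0)) < tol
        B = B_new
        if converged:
            break
    flows, unsatisfied = flows_for(B)
    return B, flows, unsatisfied


# Toy system, invented for illustration: two zones, two hospitals
travel_time = np.array([[5.0, 15.0], [20.0, 10.0]])
f = np.exp(-0.1 * travel_time)          # impedance with an assumed beta of 0.1
need = np.array([100.0, 150.0])
capacity = np.array([60.0, 80.0])
B, flows, unsatisfied = calibrate_elastic_model(need, capacity, f)
```

For a forecast, the same routine would be called with the forecast need and planned capacities while keeping the impedance parameters fixed, which is the role the text assigns to the Legendre transform.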
An alternative formulation of (8) can be made in terms of the unknown
unsatisfied demand levels $V_i$, yielding

$$T_{ij} = V_i B_j f_{ij} \tag{11}$$

where $V_i$ can be evaluated iteratively to satisfy the potential patient capacities
$\tilde{W}_i$. As unsatisfied demand levels increase, the $V_i$ values approach direct
proportionality with the $\tilde{W}_i$ terms, allowing (11) to be finally given in terms
of the original relative demand values $W_i$ (see (3)), giving

$$T_{ij} = W_i B_j f_{ij} \tag{12}$$

a destination-constrained model as defined by Wilson (1967). Although
HOSPIM provides this model as an option and it was used by Mayhew et al.
(1986), its application should ideally be restricted to cases where large levels
of excess demand exist.
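
The limiting argument can be made explicit. The following is a sketch that writes $\tilde{W}_i$ for the need for care from (3) and assumes that (8) has the usual elastic-demand form, with the 1 in the denominator standing for the unsatisfied state; the symbol $B'_j$ is introduced here only for illustration.

```latex
\begin{align*}
T_{ij} &= \frac{\tilde{W}_i B_j f_{ij}}{1 + \sum_l B_l f_{il}}
  \quad\Longrightarrow\quad
  V_i = \tilde{W}_i - \sum_j T_{ij} = \frac{\tilde{W}_i}{1 + \sum_l B_l f_{il}},
  \qquad T_{ij} = V_i B_j f_{ij}, \\
% heavy excess demand: capacities are small relative to need, so \sum_l B_l f_{il} \ll 1
V_i &\approx \tilde{W}_i = (1 + \varepsilon)(1 + r)\, W_i
  \quad\Longrightarrow\quad
  T_{ij} \approx W_i B'_j f_{ij},
  \qquad B'_j = (1 + \varepsilon)(1 + r)\, B_j .
\end{align*}
```

The uniform factor is absorbed into the hospital terms, which is why only the relative demand values are needed in the destination-constrained limit.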
A further limitation of (12) is that if a new hospital were opened, the
pattern of admissions to all existing hospitals would be unaffected, irrespec-
tive of how close the new hospital is to an existing hospital. Similarly, if an
existing hospital is increased or decreased in capacity, model (12) would not
modify any of the individual admissions flows to the other hospitals. In fact,
model (12) possesses the IIA (Independence from Irrelevant Alternatives)
property associated with multinomial logit models (Hensher and Johnson,
1981), which are conceptually weak for competing facilities arranged non-
uniformly in space (Roy, 1985). However, model (8), with its denominator
going over all hospitals, is capable of handling spatial competition effects,
and thus avoiding the weaknesses described above.
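
The independence property can be checked numerically. The sketch below assumes the destination-constrained form in which admissions equal $W_i B_j f_{ij}$ with $B_j$ set to $D_j$ divided by the sum of $W_i f_{ij}$ over zones; the function name and all figures are invented for illustration. Adding a third hospital leaves every flow to the two existing hospitals unchanged.

```python
import numpy as np


def destination_constrained(demand, capacity, f):
    """Flows T_ij = W_i B_j f_ij with B_j = D_j / sum_i W_i f_ij (assumed form of model (12))."""
    B = capacity / (demand @ f)  # one balancing factor per hospital
    return demand[:, None] * B[None, :] * f


demand = np.array([100.0, 150.0, 80.0])

# Existing system: three zones, two hospitals (toy travel times, assumed beta of 0.1)
f_old = np.exp(-0.1 * np.array([[5.0, 15.0], [20.0, 10.0], [12.0, 8.0]]))
cap_old = np.array([60.0, 80.0])
flows_old = destination_constrained(demand, cap_old, f_old)

# Same system with a new hospital added as a third column
f_new = np.hstack([f_old, np.exp(-0.1 * np.array([[6.0], [18.0], [11.0]]))])
cap_new = np.append(cap_old, 40.0)
flows_new = destination_constrained(demand, cap_new, f_new)

# Flows to the existing hospitals are identical in both runs
unchanged = np.allclose(flows_new[:, :2], flows_old)
```

Each column of the flow matrix depends only on that hospital's own capacity and impedances, so no competition effect between neighbouring hospitals can appear in this form.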
As illustrated by Mayhew et al. (1986), the model in (12) usually provides
a rather poor fit to observed patient flows, with errors in major flows of
about 30% being not uncommon. This is because although travel time
(distance) between home and hospital may be an important surrogate for
influences on hospital choice such as (i) accessibility for patient visits by
family and friends, (ii) knowledge by a patient's G.P. of hospital staff and
conditions and (iii) the chances of neighbours having been admitted to the
same hospital, etc., it cannot represent all influences. For instance, a more
remote hospital may have a top surgeon in the specialty required. Alterna-
tively, large queues may exist at the most accessible facilities to certain
patients. Thus, an alternative method is sought which can directly bias the
predicted admissions pattern in terms of observed admissions at the base
period. This problem was handled using an 'inferred deterrence' approach in
Mayhew et al. (1986). In Roy et al. (1987), information theory was used for
a more general solution of this problem, which yielded a result in the form of
Equation (8), but where $f_{ij}$ is given as the quotient $(T^0_{ij}/V^0_i)$ of the observed
admissions and the 'observed' unsatisfied demand. The reader can confirm
that this model arises from the information theory problem


which is solved at the forecast time period.

In problems where new hospitals are being planned in the forecast period,
the above formulation is not relevant, as it contains no instruments for
relativising the accessibility of the new hospital to that of the rest of the
system. Rather than having to return to the original entropy models of (8) or
(12), Mayhew et al. (1986) proposed using (12) just for the new hospitals
and the more accurate (13) for all existing or modified hospitals. This
approach was generalised and derived from an information theory objective
in Roy (1987). If the result of the entropy calibration of (8) or (12) yielded
admissions $T^*_{ij}$ and unsatisfied demand values $V^*_i$ respectively, the $f_{ij}$ values
for the new model can be shown to be


The revised information theory objective is formed by replacing $\ln T_{ij}$
in (7) by $\ln (T_{ij}/\bar{T}_{ij})$, where $\bar{T}_{ij} = T^0_{ij}/T^*_{ij}$, and $\ln (\tilde{W}_i - \sum_j T_{ij})$ in (7) by
$\ln [(\tilde{W}_i - \sum_j T_{ij})/\bar{V}_i]$, where $\bar{V}_i = V^0_i/V^*_i$. The $\bar{T}_{ij}$ and $\bar{V}_i$ terms can be regarded
as endogenous prior biases, which, if applied to the base period entropy
objective, enable it to perfectly reproduce the observed results $T^0_{ij}$ and $V^0_i$,
with unchanged Lagrange multipliers to those obtained originally via problem
(7). It is seen that this approach has decomposed the interaction effect
into a time (distance) component and a bias component. The user is free to
choose an impedance parameter $\beta_j$ for each of the new hospitals from
knowledge of those for 'comparable' hospitals. Also, average bias terms can
be obtained for the new hospitals consistent with any errors in patient
admissions from origin zones in the existing system (Roy, 1987).
Finally, in some hospital systems experiencing overall excess demand,
certain outmoded or less accessible facilities may well be running below their
nominal capacity, while the rest are, as expected, subject to significant
queuing. For this case, the problem in (7) should be revised by interpreting
Dj as the actual utilised capacity in each hospital j, which the admissions
have to satisfy via (5) at the base period. This will be less than the nominal
capacity for such hospitals. Then, in forecasting, the problem in (10) is
modified to give


where the destination capacity constraints now represent inequalities on the
forecast available capacity in the form

$$\sum_i T_{ij} \le \hat{D}_j \quad \forall j \tag{16}$$


The problem comes from the Legendre transform


which ensures that both $\beta_j$ and $\lambda_j$ are treated as parameters. Inclusion of the
latter ($\lambda_j$) enables the implied hospital attractiveness terms $B_j$ to be projected
for forecasting in cases where the facilities are running below capacity in the
base period and possibly also in the forecast period. The solution of (15)
comes out as

$$T_{ij} = \frac{\hat{W}_i B_j E_j f_{ij}}{1 + \sum_l B_l E_l f_{il}} \tag{18}$$

where $B_j = \exp(-\lambda_j)$ as before, but $E_j$ is the additional destination factor
$\exp(-\lambda'_j)$ evaluated to satisfy the inequality constraint on the forecast
capacity $\hat{D}_j$. From the Kuhn-Tucker conditions, multiplier $\lambda'_j$ will become
zero (i.e. $E_j = 1$) whenever a constraint in (16) is inactive, leaving the original
$B_j$ as the implied attractiveness of such a hospital. Upon substitution of (8)
into (5), it can be seen that $(B_j/D_j)$ is an appropriate measure of relative
attractiveness for any hospital j. Note that, to adapt the simpler model of
(12) to certain facilities running below capacity, $B_j$ is evaluated at the base
period as $D_j/[\sum_i W_i f_{ij}]$, leading to a forecasting model in the form

$$T_{ij} = W_i B_j E_j f_{ij} \tag{19}$$

where, as above, $E_j = 1$ if the corresponding capacity constraint in (16) is
inactive and a positive value otherwise. This model may be regarded as a
hybrid form between the classical unconstrained and destination-constrained
entropy models of Wilson (1967).
A further challenge, not yet met in HOSPIM or any comparable spatial
model, is to somehow take account of different queuing intensities in
different hospitals for different specialties. This is not yet a critical issue in
the strong excess demand conditions existing in many urban public hospitals.
However, as the incidence of day hospitals increases further, and as average
lengths of stay continue to decrease, one will enter a transition stage between
the excess demand and excess supply situations, where the hospitals currently
with the smallest queues are the first ones likely to start operating below their
nominal capacities. At this stage, the planner must indicate in advance to
HOSPIM which hospitals should continue to obey the capacity constraints
(5), and where they can be relaxed to inequalities such as (16).

Models for overall excess supply or inelastic demand conditions

Modelling the demand for care

The excess supply case for a specialty is exemplified by an absence of
queuing, except for stochastic effects. The implication is that all current
demand is being met in the context of the given income of patients and their
hospital insurance arrangements. Thus, an increase in supply is not taken to
induce any extra demand, but merely to cause a redistribution of existing
demand. Note that it is not assumed that excess demand or excess supply is
necessarily a global condition of the hospital system as a whole. Instead,
HOSPIM allows this condition to be defined for each specialty, with some
specialties being permitted to be in excess demand and others in excess
supply or inelastic demand (e.g. obstetrics).
Prior to the calibration of the excess supply model, it is possible to
compute the error matrix $e_{ik}$ (see (1)) such that the admissions computed
from the morbidity data equal the observed admissions $O_{ik} = \sum_j T^0_{ijk}$ from
each zone. By transposition of Equation (1), $e_{ik}$ is evaluated via


Then, in forecasting applications, the new demand $\hat{O}_{ik}$ can be calculated
from (1), using $e_{ik}$ as in (20) and any revised population, morbidity and
leakage values (if applicable). This new demand will then be entirely satisfied
in each zone and for each specialty in excess supply.

Hospital demand/supply interaction model

Returning to the entropy derivation in (7), removing the entropy term for
unsatisfied demand and adding constraints on all demand being satisfied
from each zone as

$$\sum_j T_{ij} = O_i \quad \forall i \tag{21}$$

the following result is obtained

$$T_{ij} = \frac{O_i B_j f_{ij}}{\sum_l B_l f_{il}} \tag{22}$$

where $f_{ij}$ is as before and $B_j$ is $\exp(-\lambda_j)$, with $\lambda_j$ the Lagrange multiplier on
(5), where $D_j$ must here represent the actual utilised capacity, which in the
excess supply case will usually be less than that available. Also, in this excess
supply case, it is advisable to obtain $f_{ij}$ as $\exp(-\beta_i t_{ij})$, with $\beta_i$ defined for
each zone, thus avoiding the potential multicollinearity with the hospital
attractiveness parameters $B_j$ if $\beta_j$ were used. Now if the Legendre transform
of (17) is applied to the excess supply problem, the model structure for
forecasting becomes


where $\hat{O}_i$ is the new satisfied demand obtained from (1) and $E_j$ has the same
interpretation as in the analogous Equation (18). Note that, whereas in the
excess demand case most facilities run at their nominal capacity for a given
specialty with the possible exception of a few less favoured facilities running
below capacity, the converse is true for the excess supply situation. In this
case, the inequality hospital capacity constraints in (16) will usually be
inactive, leading to $E_j = 1$ in most cases. Note that the model (23) may be
regarded as a hybrid form between the classical origin-constrained and
doubly-constrained entropy models of Wilson (1967).
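
A minimal sketch of the excess supply case follows. It assumes the origin-constrained form in which admissions equal $O_i B_j f_{ij}$ divided by the sum of $B_l f_{il}$ over hospitals, calibrates the attractiveness terms so that base-period flows match the utilised capacities, and then forecasts with those terms held fixed, which corresponds to every $E_j$ being equal to 1. The function names, the zone-specific impedance values and the data are hypothetical.

```python
import numpy as np


def origin_constrained(demand, B, f):
    """Flows in which each zone's demand is fully allocated across hospitals."""
    weights = B[None, :] * f
    return demand[:, None] * weights / weights.sum(axis=1, keepdims=True)


def calibrate_attractiveness(demand, utilised, f, tol=1e-12, max_iter=10_000):
    """Find B_j (up to scale) so that base-period flows match utilised capacities."""
    if not np.isclose(demand.sum(), utilised.sum()):
        raise ValueError("with all demand satisfied, totals must agree")
    B = np.ones(utilised.size)
    for _ in range(max_iter):
        flows = origin_constrained(demand, B, f)
        B_new = B * utilised / flows.sum(axis=0)
        B_new = B_new / B_new[0]  # only relative attractiveness is identified
        converged = np.max(np.abs(B_new / B - 1.0)) < tol
        B = B_new
        if converged:
            break
    return B


# Base period (toy figures): two zones, two hospitals, zone-specific impedance
travel_time = np.array([[5.0, 15.0], [20.0, 10.0]])
beta_zone = np.array([0.10, 0.08])
f = np.exp(-beta_zone[:, None] * travel_time)
demand = np.array([100.0, 150.0])
utilised = np.array([120.0, 130.0])
B = calibrate_attractiveness(demand, utilised, f)
base_flows = origin_constrained(demand, B, f)

# Forecast with revised demand and no binding capacity constraint
new_demand = np.array([110.0, 150.0])
forecast_flows = origin_constrained(new_demand, B, f)
```

If a forecast column sum exceeded the available capacity of a hospital, that hospital's term would have to be scaled down by a factor $E_j$ below 1; this step is omitted in the sketch.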
All the above entropy models for excess supply can be converted to
information theory models in a similar way as shown for the excess demand
case in (13) and (14). For the case of no new facilities being constructed, the
interaction term $f_{ij}$ simply equals the observed admissions $T^0_{ij}$. For situations
where new facilities are being considered, $f_{ij}$ in (14) is now given as


Finally, the reader may wonder why the excess demand model of (8) is not
interchangeable with the excess supply model of (22) by merely reversing the
demand indices i and the supply indices j. The reason is that in the original
definition of microstates in (4), the potential patients within each zone i were
treated as distinct, but not the beds within each hospital j. This is because (i)
the 'urgency' of different patients for admission can vary widely within a
region and (ii) even if some particular wards within a hospital are regarded as
'better' than others, the patient cannot usually choose a particular ward, but
must accept what is immediately available within the category covered by his
hospital insurance. In other words, the simple inequality constraints (18) on
hospital capacity in the excess supply case differ fundamentally from the
elastic demand 'constraints' on potential patients in the excess demand case,
introduced via the entropy on unsatisfied demand (zonal demand less Σj Tij) in (8). The
latter terms ensure that, so long as overall excess demand exists in the
system, all zones will have some unsatisfied demand.

Implementation of the model

Sequence of operation

After reading in the data common to all specialties (e.g. travel time, hospital
catchment specifications, etc.), HOSPIM computes the demand and case load
capacities and calibrates the spatial models for each specialty in turn, using
base period data. A redundancy index is computed, representing the explana-
tory power of the model. In addition, the root mean square error between the
modelled admissions from each zone and those observed is evaluated. For
each specialty, the calibration is followed by an update phase, which usually
represents the 'no build' situation in the forecast time period. The update
phase incorporates changes which are not potentially controllable by the
health care planner, such as population and morbidity changes, planned
facilities coming into service by the forecast period, planned case mix
changes, planned closures, expected changes to average lengths of stay and
occupancy rates, etc. The results of the update phase are then regarded as
the new base case, adjusted to the forecast period. Finally, one enters the key
module of HOSPIM, the interactive forecast phase. In this phase, the planner
can interactively try alternative hospital development policies, including
integrated case-mix changes, closing wards, hospital extensions, hospital
closures and new hospital construction. The forecast phase can also be used
as a vehicle for sensitivity analyses about uncertain data, such as population
changes, morbidity changes, reductions in average lengths of stay, etc.
For each run through the forecast phase, HOSPIM evaluates the changes
to the key results and indicators with respect to the update phase. Some of
these results are for each specialty in turn, and others are with respect to the
whole system. Successive changes can be accumulated, after which the system
may be returned to the update phase condition using a single command. This
feature is useful for comparing the independent effects of individual changes
with the coupled effects of a set of integrated changes. In this regard,
HOSPIM warns the user if a set of integrated case-mix changes exceeds the
bed capacity of the hospital, as well as indicating the likely presence of idle
beds. It is the planner's responsibility to estimate if the planned case-mix
changes, etc., can be handled by the expected availability of medical, nursing
and support staff.

Output of model

Although the main results of HOSPIM are the expected admissions Tijk
between each zone i and hospital j for each specialty k, the planner is more
directly interested in the various performance indicators which can be
evaluated from these flows and other available data. At this stage, HOSPIM
does not compute cost-based indicators - these can always be added when
better cost data by specialty becomes available.
Several indicators are population-based, and require the determination of
standardised populations Pik by zone and specialty, which are obtained as


where P is the total study area population, 0ik the relative leakages to outside
hospitals defined in (1), and Wik the expected relative demand for care
computed earlier from age, sex and (optionally) other characteristics of the
population. A typical population-based indicator nik is given as


the number of admissions from zone i to specialty k per unit of standardised
population. Note that, in an excess supply system, a low value of this
indicator would usually be given a positive interpretation, so long as the low
number of admissions was not due to people in such zones not being able to
afford treatment. On the other hand, for excess demand, low values of nik
would be regarded adversely, as they would indicate zones with poor
accessibility to facilities and information.
The efficiency of the hospital location pattern may be defined as the
average time (distance) tk consumed by the admitted patients in specialty k
from their place of residence, given as


An equity index of the location pattern is given as the coefficient of
variation of the average time (distance) tik as


used to access hospitals out of each zone i. This is clearly influenced by the
level of spatial aggregation adopted in aggregating the origins of patients.
A level of service index Sik for each zone i and specialty k may be defined
as the relative level of satisfied demand, given as


Similarly, the equity of the level of service can be given as the coefficient of
variation of the above indices over all zones i. Zones with poor accessibility
to facilities will usually be expected to have relatively low values of the above
satisfied demand index.
Another important criterion is zonal self-sufficiency for health care, which
indicates the proportion Pik of a zone's patients in specialty k admitted to
hospitals within their own zone, given as


where ji is the subset of hospitals (if any) located in zone i. This criterion is
again dependent on the level of zonal aggregation.
In the excess supply situation, hospital capacity indicators are very
important. Consider the capacity utilisation indicator Cjk given as


where Djk is the case load capacity for specialty k in hospital j. In addition,
HOSPIM indicates the proportion of a hospital's case loads coming from its
own zone, as well as (optionally) the proportions coming from up to two
user-defined rings of zones around the hospital's zone. Also, it allows the
user to (optionally) aggregate the zone system used for analysis up to the
health region or health district level, with certain results then being output at
this more policy-relevant level. Finally, the relative attractiveness 1jk of a
hospital j for specialty k can be defined as
where Bjk is the destination factor from (8) etc. This represents the hospital's
'pulling power' above and beyond its size and average accessibility to
In addition to the above specialty-specific indicators, global indicators
representing the average performance of the system as a whole can also be
obtained. Also, if HOSPIM is later combined with operating cost and
maintenance models within a unified data base, cost-based indicators can be
readily obtained for the integrated patient/activity/building system, which
would provide the planner with a more comprehensive decision aid package.
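
The specialty-specific indicators described above can be sketched in a few lines of Python for a single specialty. The formulas in the code follow the verbal definitions given in the text (flow-weighted average travel time, coefficient of variation of the zonal averages, share of admissions served within the home zone, and case load divided by capacity). The exact weighting used by HOSPIM is not stated here, so the unweighted coefficient of variation, the function names and the data are assumptions made for illustration.

```python
import math

def mean_travel_time(T, t):
    """Average time per admitted patient: sum of T[i][j] * t[i][j] over total admissions."""
    total = sum(sum(row) for row in T)
    weighted = sum(T[i][j] * t[i][j] for i in range(len(T)) for j in range(len(T[0])))
    return weighted / total

def zone_mean_times(T, t):
    """Average time used to access hospitals out of each zone i."""
    return [sum(a * b for a, b in zip(Ti, ti)) / sum(Ti) for Ti, ti in zip(T, t)]

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (unweighted, an assumption)."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return sd / m

def self_sufficiency(T, hospital_zone):
    """Share of each zone's admissions going to hospitals located in that zone."""
    return [sum(Tij for j, Tij in enumerate(Ti) if hospital_zone[j] == i) / sum(Ti)
            for i, Ti in enumerate(T)]

def capacity_utilisation(T, D):
    """Case load of each hospital divided by its case load capacity."""
    return [sum(Ti[j] for Ti in T) / D[j] for j in range(len(D))]

# Hypothetical data: two zones, two hospitals, hospital j located in zone j.
T = [[30.0, 10.0], [5.0, 15.0]]      # admissions from zone i to hospital j
t = [[10.0, 30.0], [25.0, 5.0]]      # travel times
D = [50.0, 25.0]                     # case load capacities
hospital_zone = [0, 1]

efficiency = mean_travel_time(T, t)
equity = coefficient_of_variation(zone_mean_times(T, t))
own_zone_share = self_sufficiency(T, hospital_zone)
utilisation = capacity_utilisation(T, D)
print(efficiency, equity, own_zone_share, utilisation)
```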

An illustrative example

This example describes the testing of HOSPIM on data generously provided
by the Victorian Department of Health for part of the Northern and Western
Metropolitan Regions of Melbourne (Figure 1). As the earliest comprehen-
sive data available was for 1983/84 and the latest for 1986/87, the purpose
of the tests was to project the expected 1986/87 pattern of admissions from
a model calibrated on 1983/84 data. This period represents a good test of
the robustness of the model, as in 1984/85 private hospital insurance was
replaced by a new system, where normal public hospital care was freely
available and insurance for private treatment in public hospitals was optional.
The eight municipalities treated in the study can be classified as among the
lower- to middle-income parts of Melbourne. In 1983/84, only one hospital
(PANCH) was located within the study area, and it was necessary to include
ten extra hospitals (mostly central city hospitals) to account for an acceptable
proportion (say 90%) of the admissions of study area residents. In 1984/85,
one new hospital, Essendon, came into operation within the study area.
Because of data limitations, the analysis was performed for all specialties
combined, rather than a separate analysis for each specialty. As not all of the
hospitals possessed all specialties, this aggregation inevitably produced some

Figure 1. Study area zones and hospitals. (Map; hospitals labelled include
Sunshine & District, Fairfield, Western General, Royal Children's, Royal
Melbourne, Royal Women's, St. Vincent's and Queen Victoria; the legend
distinguishes hospitals from centroids of L.G.A.s.)

The calculation of need for care was based on metropolitan-wide morbid-
ity rates by age and sex being applied to the age/sex distribution of the
population in each municipality, corrected for leakages to hospitals outside
the eleven included. These demand values were further enhanced to bring the
ratio of observed admissions to expected admissions up to the 'best' value for
all zones. Then, sensitivity analysis was made with respect to a further
increase of this demand, with uniformly good results being found for
increases in the range of 15 to 35% (see (3)).
In the calibration of the model (8) on 1983/84 data, the R.M.S. error of
modelled vs. observed admissions from each zone was 17.5%, reducing to
15% for the 1986/7 forecasts when model (8) was used with the corrected
interaction terms k given from (14). On the other hand, the simplified model
in (12) produced a 26% R.M.S. error on the 1983/84 calibration data,
reducing to 16% for the 1986/7 forecasts when model (12) was used with
the corrected interaction terms k given from (24). Thus, the elastic demand
feature in (8) greatly increases the ability to calibrate compared with the
conventional approach in (12), whilst having little effect on its forecasting
accuracy. Because of the major change to the hospital insurance system
introduced in 1984, it was decided to re-run the model with calibration on
1984/5 data. As expected, this greatly improved both the calibration ability
and forecasting accuracy of the models, with R.M.S. errors on calibration as
low as 12% and in forecasting as low as 9%. An even lower forecasting error
had been hoped for. However, the introduction of some day clinics in some
of the outer areas has not yet been corrected for in the demand calculations,
and may produce some improvement. Nevertheless, forecasting errors in
zonal admissions of 9-12% would usually be regarded as quite acceptable
for most planning purposes.
Finally, as expected, the more remote zones were having less of their
expected demand satisfied than those more accessible to facilities, with the
outermost area, Bulla, having just 0.8 of demand satisfied and Preston and
Keilor having more than 1.1 satisfied when total expected demand was
normalised to equal total observed admissions. This finding is consistent with
the spatial deterrence hypothesis adopted by the interaction models.


References

Hensher, D. A. and Johnson, L. W. (1981), Applied Discrete-Choice Modelling, Croom Helm,
London.
Lesse, P. F. (1982), 'A Phenomenological Theory of Socio-economic Systems with Spatial
Interactions', Environment and Planning A 14, 869-888.
Mayhew, L. D. (1986), Urban Hospital Location, George Allen and Unwin, London.
Mayhew, L. D. and Leonardi, G. (1982), 'Equity, Efficiency and Accessibility in Urban and
Regional Health-care Systems', Environment and Planning A 14, 1479-1507.
Mayhew, L. D., Gibberd, R. W. and Hall, H. (1986), 'Predicting Patient Flows and Hospital
Case-mix', Environment and Planning A 18, 619-638.
Roy, J. R. (1985), 'On Forecasting Choice among Dependent Spatial Alternatives', Environ-
ment and Planning B 12, 479-492.
Roy, J. R. (1987), 'An Alternative Information Theory Approach for Modelling Spatial Interac-
tion', Environment and Planning A 19, 385-394.
Roy, J. R., Mayhew, L. D. and Leonardi, G. (1987), 'Structures of Planning Models for User-
attracting Facility Systems', Sistemi Urbani 1, 33-55.
Wilson, A. G. (1967), 'A Statistical Theory of Spatial Distribution Models', Transportation
Research 1, 253-269.

Physicians' specialty choice and specialty income*

J. W. Hay

Senior Research Fellow, The Hoover Institution, Stanford University,
Stanford, California 94305, U.S.A.

1. Introduction

Physicians have been labelled at various times as being in short supply,
although the current view of policymakers is that recent shortages have been
confined to particular specialties and certain medically underserved loca-
tions. The primary care specialties - general practice, family practice,
internal medicine, pediatrics and OB/GYN - have been singled out as being
shortage specialties. Recent federal projections of physician surpluses, along
with substantial reductions in federal support for medical education, have
provided added impetus to a reexamination of medical school admissions
criteria and their role in effectively matching medical school graduates with
the medical needs of society.
A large body of literature exists concerning the noneconomic determi-
nants of physician specialty and location choice and the role of personal
background traits and medical education characteristics on physician career
decisions (e.g., Ernst and Yett, 1985). Unfortunately, this literature does not
generally incorporate economic motivation with the factors under analysis
and therefore fails to weigh the relative importance of these two sets of
factors. Moreover, the omission of economic factors may bias reported
results. For example, a finding that certain physician personality types are
associated with nonrural practice locations might actually be a result of those
personality types being more financially motivated and thus less likely to
practice in rural locations than other physicians.
Sloan (1970) provides the first non-descriptive analysis of physician
specialty income differentials, using data from Medical Economics' surveys
of physicians in 1955, 1959, and 1965. Using a human capital framework,
Sloan first computes present values and internal rates of return to each of
nine specialties and general practice. With alternative discount rates of 5 and
10%, he subtracts the general practice present value of earnings from each of
the nine specialty estimates to obtain the relative income advantage to each
type of specialization.
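
The present value comparison described above can be illustrated with a short Python sketch. The earnings streams below are invented for illustration and do not reproduce Sloan's data or results; the function names are likewise hypothetical. The sketch only shows the mechanics: a longer training period on low pay is more heavily penalised as the discount rate rises.

```python
def present_value(earnings, rate):
    """Discount a stream of annual earnings, with year 0 undiscounted."""
    return sum(e / (1.0 + rate) ** year for year, e in enumerate(earnings))

# Hypothetical career streams (thousands of dollars per year), illustration only.
gp = [10.0] * 1 + [40.0] * 39           # one training year, then general practice
specialist = [10.0] * 5 + [50.0] * 35   # five residency years, then specialty practice

def specialist_advantage(rate):
    """Specialty present value minus general practice present value."""
    return present_value(specialist, rate) - present_value(gp, rate)

for rate in (0.0, 0.05, 0.10):
    print(f"discount rate {rate:.0%}: advantage = {specialist_advantage(rate):8.1f}")
```

With these made-up figures the advantage of specialization is positive at a zero discount rate and negative at 10%.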

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 95-113.
© 1991 Kluwer Academic Publishers.

Sloan concludes from this analysis that the financial motivation to incur
the longer residency training programs associated with specialization is
extremely weak, except at discount rates approaching zero. At a 5% discount
rate, the present value of earnings in general surgery are about equal to those
in general practice, while at 10%, general surgeons' lifetime earnings
fall below those of general practitioners. The same general pattern prevails in
comparing the other specialties with general practice. Internal medicine and
pediatrics generate substantially lower present values than general practice.
Sloan proceeds to use his rate of return estimates in a simultaneous model
of specialty choice. The model is of an aggregative nature; the dependent
variables are proportions of residents in a particular specialty in a given year.
Sloan concludes from these results that the physician's choice of specialty is
not influenced by relative specialty income differentials.
There have been some criticisms of Sloan's approach. Due to the aggregative
nature of his model, he is unable to exploit the demonstrated relationships
between individual socio-demographic characteristics and specialty
choice. The limitation of the structural model to three explanatory variables
- specialty "income", number of foreign medical graduates, and the supply
of residency positions in a given year - is overly simplistic. Moreover, there
is no theoretical justification for including the supply of residencies in an
equation modelling the demand for residencies by specialty. While this
variable will certainly improve the statistical goodness-of-fit if the residency
market tends towards equilibrium, the institutional factors that cause
hospitals to offer residencies would not plausibly lead medical graduates to
fill these vacancies. If the appropriate analytic framework involves a truly
simultaneous model of the demand for and supply of residencies by specialty,
Sloan's parameter estimates are clearly subject to simultaneity bias.
Hadley's (1975) dissertation provides the first microanalytic econometric
model of the physicians' specialty choice. His model is formulated in terms of
the individual physician rather than the aggregative time series analysis
employed by Sloan (1970). The major finding of Hadley's investigation is
that economic incentives did not significantly influence specialty choice. The
income and personal finance proxies that Hadley employed entered into the
specialty choice equations with implausible or insignificant signs. On the
other hand, individual preferences and personality traits had strong influ-
ences on specialty choice as did NBME scores and certain demographic
characteristics. Medical school and internship hospital characteristics did not
have important effects on specialty choice, when other medical student
characteristics were accounted for.
Hadley uses the mean net incomes by specialty and state of residence in
1965 as a proxy for physicians' expected earnings. While he is able to
account for regional variation in specialty income, a number of other factors
leading to intraspecialty income variation are ignored. Kehrer (1974) found,
even after standardization for specialty and other personal and practice
attributes, that women physicians earn approximately seventy percent as much
per hour as men. Ernst et al. (1978) found that foreign medical
graduates work shorter hours and earn less than their U.S. counterparts.
They found net income to vary with the size of practice, and with the volume
of radiology and laboratory services provided by the practice.
Perhaps a less obvious flaw in the modelling approach used by both Sloan
and Hadley is the treatment of specialty income as an exogenous explanatory
variable in the specialty choice equation. If specialty income and specialty
choice are jointly determined by the medical students' background and
environment, then an estimation technique that takes this simultaneity into
account would seem to be more appropriate.

2. Selectivity bias in a simultaneous logit-OLS model

The recent development of estimation techniques appropriate in cases where
the sampled data is nonrandom has expanded and refined econometric
investigation of a number of important issues. Some issues currently being
explored include the determinants of union wages and union participation,
wages of secondary workers and secondary labor force participation, market
disequilibrium models, school training decisions, durable goods purchasing
decisions, and housing market demand.¹
As a simple example of the linear model with a dichotomous sampling
rule, consider the probit-regression model:

$$
\begin{aligned}
y_2 &= Xb + u_2, \\
y_1 &= 1 \quad \text{iff } u_1 < Z\gamma, \\
y_1 &= 0 \quad \text{iff } u_1 \ge Z\gamma, \\
y_2 &\ \text{observed iff } y_1 = 1.
\end{aligned}
$$

It has been shown that if $E(u_1 u_2) \neq 0$, then the OLS regression of $y_2$ on $X$
will yield biased and inconsistent estimates of $b$ (Olsen, 1975).
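
A small simulation makes the bias concrete. The Python sketch below assumes a single regressor that also drives selection (Z = X), standard normal disturbances with correlation 0.8, and arbitrary parameter values; all of these are choices made here for illustration and are not taken from Olsen (1975).

```python
import random

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(1)
b0, b1, gamma, rho = 1.0, 2.0, 1.0, 0.8   # hypothetical true values
x_all, y_all, x_sel, y_sel = [], [], [], []
for _ in range(20000):
    x = random.gauss(0.0, 1.0)
    u1 = random.gauss(0.0, 1.0)
    # u2 is correlated with u1, so E(u1 * u2) = rho != 0
    u2 = rho * u1 + (1.0 - rho ** 2) ** 0.5 * random.gauss(0.0, 1.0)
    y2 = b0 + b1 * x + u2
    x_all.append(x)
    y_all.append(y2)
    if u1 < gamma * x:                     # y1 = 1: y2 is observed
        x_sel.append(x)
        y_sel.append(y2)

full_fit = ols_line(x_all, y_all)          # uses data that would not be observed
selected_fit = ols_line(x_sel, y_sel)      # OLS on the nonrandom sample
print("all observations:     intercept, slope =", full_fit)
print("selected sample only: intercept, slope =", selected_fit)
```

The regression on the full simulated data recovers the slope, while the regression on the selected observations alone does not, even with a large sample.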
The estimation techniques developed thus far include the full information
maximum likelihood model and a two-stage limited information method.
With few exceptions (e.g., Olsen, 1980), all approaches assume that the
underlying disturbance structure across equations is joint normal, an assump-
tion that is in keeping with the mainstream of econometric theory, but one
that leads to significant computational complexities in situations such as the
probit-regression model outlined above. The likelihood function for such
models is not generally globally concave, and FIML algorithms will occasion-
ally wander away from the maximum likelihood values unless they iterate
from initial parameter values that are close to the true maximum (Nelson,
Since a two-stage limited information method has been developed for this
class of models - the Mill's ratio method (Heckman, 1976, 1979; Lee,
1976) that is consistent and relatively inexpensive to program - this method
is being used both to estimate this class of models and also to provide
consistent initial starting values for FIML parameter estimation. However,
even the Mill's ratio method requires a probit maximum likelihood estimator
in the first stage, and is thus computationally expensive. Olsen (1980) has
shown that by dropping the underlying assumption of joint-normal distur-
bances, a linear probability modification of the Mill's ratio method allows
standard OLS regression techniques to be used to correct for nonrandom
sample bias in this class of models.
Hay (1980) proposes a logistic modification of the Mill's ratio method.
While this modification has only a minor computational advantage over the
Mill's ratio method in the case of a dichotomous sampling rule, it may be the
only computationally feasible selectivity bias correction in the case of a
polytomous sampling rule, where the number of nonrandom samples in the
data is large. The Mill's ratio method can be extended theoretically to include
polytomous models of this type, but in practice it requires computationally
burdensome estimation of an n-variate first-stage probit model. While the
Olsen linear probability modification is quite tractable in the dichotomous
sampling rule case, there are no clear theoretical extensions of it to the
polytomous case.
Moreover, as will be shown below, the logistic modification may be
embedded in a well-known stochastic utility model of choice - the condi-
tional logit model - allowing the statistician to examine hypotheses regard-
ing the behavioral structure of the nonrandom sample selection process. This
is particularly useful in cases where selectivity bias is due to self-selection, as
in the union participation-union wages analysis, for example.

3. Derivation of the two-stage logit-OLS estimator

Consider an individual $i$ faced with choosing between two alternatives. The
individual is assumed to have a utility function that can be written in the
form

$$U_{ij} = W_{ij}\theta_j + e_{ij},$$

where $W_{ij}$ is a vector of observable attributes of the individual $i$ and the
choice set $j$, $\theta_j$ is a vector of parameters reflecting the weights that a
'representative' individual places on the characteristics $W_{ij}$, and $e_{ij}$ is a stochastic
error term, reflecting individual $i$'s idiosyncratic tastes or unobservable
characteristics. The individual is assumed to maximize $U_{ij}$ over the choice set
$\{j \mid j = 1, 2\}$. McFadden (1973) has shown that under weak assumptions, if $e_{i1}$,
$e_{i2}$ are i.i.d. Extreme Value distributed, then necessary and sufficient condi-
tions exist for the probabilities of choice to be logistic given $\{W_{i1}, W_{i2}\}$.
Letting $P_{ij}$ represent the conditional probability that individual $i$ chooses
choice $j$ given $\{W_{ij}\}$, then:

$$P_{ij} = \frac{\exp(W_{ij}\theta_j)}{\exp(W_{i1}\theta_1) + \exp(W_{i2}\theta_2)}, \qquad j = 1, 2.$$

Furthermore, $v_i \equiv e_{i2} - e_{i1}$ is logistically distributed with c.d.f.
$F(v_i) = (1 + e^{-v_i})^{-1}$.
We now consider the simplest model of sample selection bias. Extensions
of this 'probit-type' model to 'Tobit-type', etc., models (e.g., Heckman
(1976)) are relatively straightforward. The regression model of interest is:

$$y_i = X_i\beta + u_i, \tag{1}$$

where $y_i$ is observed if and only if $I_i = 1$, and

$$I_i = 1 \quad \text{iff } v_i < Z_i\gamma, \tag{2}$$

where $X_i$, $Z_i$ are row vectors of exogenous observable variables, and $\beta$, $\gamma$ are
column vectors of unknown coefficients. Equation (2) can be rewritten as:

$$
\begin{aligned}
I_i &= 1 \quad \text{iff } e_{i2} - e_{i1} < W_{i1}\theta_1 - W_{i2}\theta_2, \quad \text{i.e., if } U_{i1} > U_{i2}, \\
I_i &= 0 \quad \text{iff } e_{i2} - e_{i1} \ge W_{i1}\theta_1 - W_{i2}\theta_2, \quad \text{i.e., if } U_{i1} \le U_{i2},
\end{aligned}
$$

thus indicating how the underlying utility maximization behavior is taken to
affect the sample for which $y_i$ is observed.²
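We assume that:

$$
\begin{aligned}
E(u_i u_j) &= \begin{cases} \sigma_u^2 & i = j \\ 0 & \text{otherwise,} \end{cases} \\
E(v_i v_j) &= \begin{cases} \sigma_v^2 & i = j \\ 0 & \text{otherwise,} \end{cases} \\
E(u_i v_j) &= \begin{cases} \rho\,\sigma_u\sigma_v & i = j \\ 0 & \text{otherwise.} \end{cases}
\end{aligned}
$$

It follows from the linearity of $E(u_i \mid v_i)$ that we can decompose

$$u_i = \rho\,\frac{\sigma_u}{\sigma_v}\,v_i + \varepsilon_i,$$

where $\operatorname{var}(\varepsilon_i) = \sigma_u^2(1 - \rho^2)$ and $E(\varepsilon_i v_i) = 0$.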

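Given the sample selection rule, we only have nonzero observations on
the conditional distribution for $y_i$, $f(y_i \mid X_i, I_i = 1)$ or $f(y_i \mid X_i, v_i < Z_i\gamma)$. It
follows from the decomposition of $u_i$ that:

$$E(y_i \mid X_i, v_i < Z_i\gamma) = X_i\beta + \rho\,\frac{\sigma_u}{\sigma_v}\,E(v_i \mid v_i < Z_i\gamma).$$

We assume additionally,

$$\operatorname{Var}(y_i \mid X_i, v_i < Z_i\gamma) = \rho^2\,\frac{\sigma_u^2}{\sigma_v^2}\,\operatorname{Var}(v_i \mid v_i < Z_i\gamma) + \sigma_u^2(1 - \rho^2).$$

Up to this point, the analysis parallels that of the Mill's ratio method and
the Olsen modification. Defining

$$\lambda_i \equiv E(v_i \mid v_i < Z_i\gamma) = F(Z_i\gamma)^{-1}\int_{-\infty}^{Z_i\gamma} v f(v)\,dv, \qquad \delta \equiv \rho\,\frac{\sigma_u}{\sigma_v},$$

where $f(\cdot)$ and $F(\cdot)$ are respectively the pdf and cdf of $v_i$, we may write:

$$E(y_i \mid X_i, v_i < Z_i\gamma) = X_i\beta + \lambda_i\delta. \tag{6}$$

Given an estimate $\hat\lambda_i$ of $\lambda_i$, it is possible to estimate the parameters $\beta$, $\delta$ in
(6) using standard regression techniques. It is not necessary to make any
additional assumptions about the distribution of $u_i$. When $v_i$ is distributed as
standard normal,

$$\lambda_i = -f(Z_i\gamma)\,[F(Z_i\gamma)]^{-1}$$

is termed the Mill's ratio, where $f(\cdot)$ and $F(\cdot)$ are the standard normal pdf and
cdf respectively. When $v_i$ is uniformly distributed on the [0, 1] interval, $\lambda_i =
(1 - Z_i\gamma)$ is the Olsen (1980) linear probability modification of the Mill's
ratio.³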
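In the case proposed here, $v_i$ has a logistic distribution:

$$
\begin{aligned}
f(v_i) &= \frac{1}{\tau}\,\exp\!\left(-\frac{v_i - \alpha}{\tau}\right)\left[1 + \exp\!\left(-\frac{v_i - \alpha}{\tau}\right)\right]^{-2}, \\
F(v_i) &= \left[1 + \exp\!\left(-\frac{v_i - \alpha}{\tau}\right)\right]^{-1}
        = \tfrac{1}{2}\left[1 + \tanh\!\left(\frac{v_i - \alpha}{2\tau}\right)\right], \\
&\qquad -\infty < v_i < \infty, \quad -\infty < \alpha < \infty, \quad \tau > 0.
\end{aligned}
\tag{7}
$$

We will assume, without loss of generality, that $\alpha = 0$ ($\mu_v = 0$) and $\tau = 1$
($\sigma_v^2 = \pi^2/3$).⁴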
Given (7), it is possible to obtain a closed-form representation of
$E(v_i \mid v_i < Z_i\gamma)$:

$$
\lambda_i = F(Z_i\gamma)^{-1}\int_{-\infty}^{Z_i\gamma} \frac{v_i e^{-v_i}}{(1 + e^{-v_i})^2}\,dv_i
= -\left\{\left[\log(1 + e^{-Z_i\gamma})\right](1 + e^{-Z_i\gamma}) + Z_i\gamma\,e^{-Z_i\gamma}\right\}. \tag{8}
$$

Defining the logistic probability $F(Z_i\gamma) \equiv P_i$, (8) may be rewritten as:

$$
\lambda_i = \frac{1}{P_i}\left[P_i\log(P_i) + (1 - P_i)\log(1 - P_i)\right] = -H_i/P_i.
$$

$H_i$ is the entropy of the dichotomy $I_i$ conditional on $Z_i\gamma$. According to
established tenets of information theory,⁵ $H_i$ is the expected value of
information about the dichotomy $I_i$; $-P_i\lambda_i$ is equal to this entropy $H_i$. We
also find:

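$$
\Lambda_i = \int_{-\infty}^{Z_i\gamma} \frac{v_i^2 e^{-v_i}}{(1 + e^{-v_i})^2}\,dv_i
= \frac{\pi^2}{6} + C_i\sum_{n=1}^{\infty}(-1)^n\left[\left(|Z_i\gamma|^2 + \frac{2|Z_i\gamma|}{n} + \frac{2}{n^2}\right)e^{-n|Z_i\gamma|} - \frac{2}{n^2}\right], \tag{12}
$$

where $C_i = 1$ if $Z_i\gamma > 0$; $C_i = -1$ if $Z_i\gamma \le 0$.

The conditional variance,

$$\operatorname{Var}(y_i \mid X_i, v_i < Z_i\gamma) = \delta^2\left(F(Z_i\gamma)^{-1}\Lambda_i - \lambda_i^2\right) + \sigma_u^2(1 - \rho^2), \tag{13}$$

includes a term that is an infinite alternating series. (12) may also be
expressed as

where $\Gamma(\cdot)$ is the incomplete gamma function.⁶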
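
The alternating series for the truncated second moment can be checked numerically. The Python sketch below assumes that the series in (12) takes the form π²/6 + C Σ (-1)^n [(|c|² + 2|c|/n + 2/n²) exp(-n|c|) - 2/n²], with c = Zγ, and compares it with a direct trapezoidal integration of v² f(v) for the standard logistic density. The function names, the truncation point of the series and the integration grid are choices made here for illustration.

```python
import math

def second_moment_series(c, n_terms=5000):
    """Integral of v^2 f(v) from -infinity to c via the alternating series."""
    a = abs(c)
    total = 0.0
    for n in range(1, n_terms + 1):
        term = (a * a + 2.0 * a / n + 2.0 / n ** 2) * math.exp(-n * a) - 2.0 / n ** 2
        total += (-1) ** n * term
    sign = 1.0 if c > 0 else -1.0
    return math.pi ** 2 / 6.0 + sign * total

def second_moment_numeric(c, lower=-40.0, steps=200000):
    """Trapezoidal approximation of the same integral (tail below `lower` ignored)."""
    def g(v):
        e = math.exp(-abs(v))            # the logistic density is symmetric in v
        return v * v * e / (1.0 + e) ** 2
    h = (c - lower) / steps
    total = 0.5 * (g(lower) + g(c))
    for k in range(1, steps):
        total += g(lower + k * h)
    return total * h

for c in (1.5, -0.7):
    print(c, second_moment_series(c), second_moment_numeric(c))
```

As c becomes large the series tends to π²/3, the variance of the standard logistic distribution, and at c = 0 it equals π²/6, half of that value.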

The proposed estimation strategy may be outlined as follows:
1. In the first stage, maximum likelihood techniques are used to develop the
   logit estimator of $\gamma$, $\hat\gamma$.
2. In the second stage, consistent estimates of $\lambda_i$, $\hat\lambda_i = \lambda_i(\hat\gamma)$, are substituted
   into the set of regressors for Equation (5),

   $$y_i = X_i\beta + \hat\lambda_i\delta + \xi_i, \tag{14}$$

   and this equation is estimated using OLS on the sample for which
   nonzero data are observed on $y_i$.
The resulting estimates of $\beta$, $\delta$ ($\hat\beta$, $\hat\delta$) will be consistent, although not in
general efficient. If additional assumptions are made concerning the distribu-
tions of $u_i$, it may be possible to derive an alternate FIML estimator and
make efficiency comparisons with the two-stage logit-OLS (2SLO) estimator
outlined above.⁷
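
The two stages can be sketched on simulated data using only the Python standard library. This is an illustrative sketch, not the author's program: the data-generating values, the helper names (`solve`, `ols`, `logit_mle`, `logistic_lambda`) and the use of Newton-Raphson for the first stage are assumptions made here. The correction term is computed as λ = [P log P + (1 - P) log(1 - P)] / P with P = F(Zγ), i.e. -H/P.

```python
import math
import random

def solve(A, b):
    """Solve a small dense system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def ols(X, y):
    """Least squares through the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty)

def logit_mle(Z, I, n_iter=15):
    """First stage: Newton-Raphson for P(I = 1) = 1 / (1 + exp(-Z gamma))."""
    k = len(Z[0])
    g = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for z, ind in zip(Z, I):
            p = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(z, g))))
            for a in range(k):
                grad[a] += (ind - p) * z[a]
                for b in range(k):
                    info[a][b] += p * (1.0 - p) * z[a] * z[b]
        g = [gi + si for gi, si in zip(g, solve(info, grad))]
    return g

def logistic_lambda(c):
    """E(v | v < c) for a standard logistic v, i.e. -H/P with P = F(c)."""
    p = 1.0 / (1.0 + math.exp(-c))
    return (p * math.log(p) + (1.0 - p) * math.log(1.0 - p)) / p

# Simulated data with hypothetical true values.
random.seed(0)
beta, gamma, delta = (1.0, 2.0), (0.5, 1.0), 0.5
Z, I, sel_x, sel_z, sel_y = [], [], [], [], []
for _ in range(20000):
    z = random.gauss(0.0, 1.0)
    x = 0.5 * z + random.gauss(0.0, 1.0)
    q = random.uniform(1e-12, 1.0 - 1e-12)
    v = math.log(q / (1.0 - q))                # standard logistic disturbance
    u = delta * v + random.gauss(0.0, 1.0)     # E(u | v) is linear in v
    selected = v < gamma[0] + gamma[1] * z
    Z.append([1.0, z])
    I.append(1.0 if selected else 0.0)
    if selected:                               # y is observed only if I = 1
        sel_x.append(x)
        sel_z.append(z)
        sel_y.append(beta[0] + beta[1] * x + u)

gamma_hat = logit_mle(Z, I)                                        # stage 1
lam_hat = [logistic_lambda(gamma_hat[0] + gamma_hat[1] * z) for z in sel_z]
b_hat = ols([[1.0, x, l] for x, l in zip(sel_x, lam_hat)], sel_y)  # stage 2
print("gamma_hat:", gamma_hat)
print("beta_hat, delta_hat:", b_hat)
```

The second-stage standard errors reported by an ordinary OLS routine would not be valid here, since the disturbance is heteroskedastic and the correction term is itself estimated; the sketch reports point estimates only.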

4. Derivation of the polytomous 2SLO estimator

We turn now to the case where observations on a set of endogenous
variables are available only through a number of mutually exclusive non-
random samples. To fix ideas, consider for example the simultaneous
estimation of physician specialty choice and specialty income. Due to
differentials in factors determining specialty supply and demand, it may be
hypothesized that (reduced-form) specialty-specific income equations differ.
For example, one specialty may be particularly prestigious, thus attracting
many medical students and driving down its future income potential, while
practicing in another specialty might require living or working in a relatively
unpleasant environment and thus be fairly unpopular. According to the
compensating differentials theory of labor economics, the relative incomes in
alternative specialties will adjust relative supplies and demands to create
equilibria in each of the specialty submarkets.
However, specialty income is only observed for the nonrandom sample of
physicians who chose that specialty. It is not practical to measure what a GP
would have earned had he or she chosen to become a general surgeon or a
pathologist. As long as there are unobservable or unmeasured factors
affecting both specialty choice and specialty income, standard OLS specialty-
specific income equations will exhibit sample selection bias.
Development of estimation techniques for simultaneous systems of equa-
tions under polytomous sampling rules has been somewhat limited. Heckman
(1977) has proposed a general extension of the Mill's ratio method that is
theoretically applicable to any situation with nonrandom samples or simulta-
neous discrete and continuous endogenous variables. However, this approach
is again restricted in practice by the computational expense of multivariate
probit analysis.
Consider an individual $i$ faced with choosing between $J$ alternatives
$\{j = 1, 2, \ldots, J\}$. We consider estimation of the following 'multiple selection
regression' model,

$$y_{ij} = X_{ij}\beta_j + u_{ij}, \tag{15}$$

where $y_{ij}$, $u_{ij}$ are scalar-valued random variables, $X_{ij}$ is a row-vector of
exogenous variables, and $\beta_j$ is a column vector of parameters. It is assumed
that $y_{ij}$ is observed if and only if $I_{ij} = 1$, where

$$
\begin{aligned}
I_{ij} &= 1 \quad \text{iff } v_{ij} < Z_{ij}\gamma, \\
I_{ij} &= 0 \quad \text{otherwise,}
\end{aligned}
$$

where $I_{ij}$ is a scalar indicator variable, $v_{ij}$ is a column vector of i.i.d. logistic
random disturbances $\{v_{ij}^1, v_{ij}^2, \ldots, v_{ij}^k, \ldots, v_{ij}^J\}$, $Z_{ij}$ is a $(J - 1) \times K$ matrix
of exogenous variables with rows $\{Z_{ij}^1, Z_{ij}^2, \ldots, Z_{ij}^J\}$, and $\gamma$ is a
$K$-column vector of parameters. Moreover, for each $i$ there exists a
single $j \in \{1, 2, \ldots, J\}$ such that $I_{ij} = 1$. Returning to the stochastic
utility framework, the model implies:
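$$
\begin{aligned}
I_{ij} = 1 \quad \text{iff} \quad
v_{ij}^1 &= e_{i1} - e_{ij} < W_{ij}\theta_j - W_{i1}\theta_1 = Z_{ij}^1\gamma, \\
v_{ij}^2 &= e_{i2} - e_{ij} < W_{ij}\theta_j - W_{i2}\theta_2 = Z_{ij}^2\gamma, \\
&\ \ \vdots \\
v_{ij}^J &= e_{iJ} - e_{ij} < W_{ij}\theta_j - W_{iJ}\theta_J = Z_{ij}^J\gamma,
\end{aligned}
$$

and $I_{ij} = 0$ otherwise.

This is to say,

$$I_{ij} = 1 \quad \text{iff } U_{ij} > U_{ik} \ \text{for all } k \neq j.$$

The set of points for which $U_{ij} = U_{ij'}$ for any $j$, $j'$ is of probability measure
zero.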

E(uij)=E(vt)=O V jkE{1,2, ... ,J}

E(ulj.. u.,.,)
=a ujj"
i = i'
= 0 i ~ i',
If I f
= a 2v = n 2/3 i=i',j=j',k=k'
= n /3
2 i = i',j = k', k = j'
= n /6 i = i', (j, k, j', k') other than above

=0 i ~ i',
E(uij vt) = pp~2av i = i',
= 0 otherwise

where Pj = (Pj, Pj' ... , Pj)' is a column vector of order (J - 1) with Pj in

each position.8 These assumptions parallel those obtained in the multivariate
normal framework. Any bivariate conditional distribution exhibits a condi-
tional mean that is a linear function of the variable held fixed, weighted by
the variance and correlation coefficients. However, as Anderson (1958)
points out, this linearity property does not hold in the general case for
multivariate conditional distributions. It is certainly possible to construct
distributions obeying the linearity assumptions listed above. Consider, for
example, the case where $u_{ij}$ is normal conditional on $v_{ij}$.
The last equation above says that $u_{ij}$ has a conditional expectation equal to
a linear weighted sum of the elements of $v_{ij}$. As in the dichotomous case, we
can thus use the linear decomposition

$$
u_{ij} = \sigma_{u_{jj}}^{1/2}\,\boldsymbol{\rho}_j'\Sigma_v^{-1}v_{ij} + \varepsilon_{ij}, \qquad
\operatorname{Var}(\varepsilon_{ij}) = \sigma_{u_{jj}}\left(1 - \boldsymbol{\rho}_j'\Sigma_v^{-1}\boldsymbol{\rho}_j\right),
$$

to develop selectivity bias correction regressors for Equation (15).

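When we estimate Equation (15) we seek to calculate the conditional
mean $E(Y_{ij} \mid X_{ij}, I_{ij} = 1)$ or $E(Y_{ij} \mid X_{ij}, v_{ij} < Z_{ij}\gamma)$. It is assumed that,

$$
E(Y_{ij} \mid X_{ij}, v_{ij} < Z_{ij}\gamma) = X_{ij}\beta_j + \sigma_{ujj}^{1/2}\,\boldsymbol{\rho}_j'\,\Sigma_v^{-1}\,E(v_{ij} \mid v_{ij} < Z_{ij}\gamma),
$$

$$
\operatorname{Var}(Y_{ij} \mid X_{ij}, v_{ij} < Z_{ij}\gamma) = \sigma_{ujj}\,\boldsymbol{\rho}_j'\,\Sigma_v^{-1}\,\operatorname{Cov}(v_{ij} \mid v_{ij} < Z_{ij}\gamma)\,\Sigma_v^{-1}\,\boldsymbol{\rho}_j
+ \sigma_{ujj}\,\bigl(1 - \boldsymbol{\rho}_j'\,\Sigma_v^{-1}\,\boldsymbol{\rho}_j\bigr),
$$

where $\operatorname{Cov}(v_{ij} \mid v_{ij} < Z_{ij}\gamma)$ is the conditional variance-covariance matrix
of $v_{ij}$.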
As in the dichotomous case:



$$\alpha_j \equiv \sigma_{ujj}^{1/2}\,\boldsymbol{\rho}_j'\,\Sigma_v^{-1}.$$

In order to estimate Equation (18) it will be necessary to develop a
consistent estimator of $\lambda_{ij}$ based on the logit estimate of $\gamma$, $\hat{\gamma}$. Consider the
k'th element of $\lambda_{ij}$,9

$$
\lambda_{ij}^{k} = F(Z_{ij}\gamma)^{-1} \int \cdots \int_{-\infty}^{Z_{ij}\gamma} v_{ij}^{k}\, f(v_{ij})\, dv_{ij}
= (-1)^{(J+1)} \left[ \frac{-H_{ik}}{1 - P_{ik}} + \log\!\left( \frac{P_{ij}}{1 - P_{ik}} \right) \right],
\qquad (19)
$$

where $-H_{ik} = P_{ik}\log(P_{ik}) + (1 - P_{ik})\log(1 - P_{ik})$ is the dichotomous
entropy conditional on $Z_{ik}\gamma$, defined earlier.
Taking advantage of the formula for $\Sigma_v^{-1}$ displayed above, it is possible to
rewrite (18) as:

~ X,{J! +
,,' lf(J-l)-j- loge Pij ) + (20)

t 10g(Pjk )
( 1 ~'p, }i-1Y+',
k-I J

where $\alpha_j = \rho_j\,\sigma_{ujj}^{1/2}$.
Thus, as in the dichotomous case, $\lambda_{ij}$ is an analytically closed-form
expression composed of simple combinations of the underlying logistic
probabilities of sample selection.
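To make the closed form concrete, the following sketch computes the selection terms directly from a set of choice probabilities. It is an illustration only: the function name and the probabilities are hypothetical, and the leading sign factor in Equation (19) is taken to be +1, which is an assumption. Under that assumption the bracketed expression in (19) simplifies algebraically to $P_{ik}\log(P_{ik})/(1 - P_{ik}) + \log(P_{ij})$.

```python
import math

def selection_terms(probs, j):
    """Selection terms for every alternative k != j, given that j was chosen.

    probs : logit choice probabilities for one individual (they sum to 1)
    j     : index of the chosen alternative
    Each term is -H_k / (1 - P_k) + log(P_j / (1 - P_k)), where
    -H_k = P_k log P_k + (1 - P_k) log(1 - P_k).
    """
    terms = {}
    for k, p_k in enumerate(probs):
        if k == j:
            continue
        neg_entropy = p_k * math.log(p_k) + (1 - p_k) * math.log(1 - p_k)  # -H_k
        terms[k] = neg_entropy / (1 - p_k) + math.log(probs[j] / (1 - p_k))
    return terms

probs = [0.215, 0.163, 0.622]        # illustrative probabilities, not estimates
lam = selection_terms(probs, j=0)
```

Each term is negative, reflecting that the utility differences are truncated from above for the chosen alternative.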
The polytomous 2SLO estimation strategy may be outlined in the following
steps:
1. In the first stage, maximum likelihood techniques are used to develop the
logit estimator of $\gamma$, $\hat{\gamma}$.
2. In the second stage, consistent estimates of $\lambda_{ij}$, $\hat{\lambda}_{ij} = \lambda_{ij}(\hat{\gamma})$, are substituted
into the set of regressors for Equation (18), and this equation is estimated
using OLS on the sample for which nonzero data are observed on $Y_{ij}$.
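The second stage can be illustrated with a simulation. This is a sketch under several assumptions that are not in the original text: three alternatives, hypothetical first-stage slopes `g`, an outcome error built as a linear function of the utility differences plus noise, and, for brevity, the true choice probabilities used in place of first-stage logit estimates. The selection regressor is the sum of the closed-form selection terms, with any constant scale factor absorbed into its coefficient.

```python
import math
import random

def gumbel(rng):
    """One draw from the standard Extreme Value distribution."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))

def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)] +
         [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * z for x, z in zip(A[r], A[c])]
    return [A[c][k] / A[c][c] for c in range(k)]

def simulate(n=30_000, seed=1, beta=(2.0, 1.0), c=0.5):
    """Outcome for alternative 0 is observed only when alternative 0 is chosen."""
    rng = random.Random(seed)
    g = (0.0, 1.0, -1.0)                       # hypothetical first-stage slopes
    rows_naive, rows_corr, y = [], [], []
    for _ in range(n):
        z, x = rng.gauss(0, 1), rng.gauss(0, 1)
        e = [gumbel(rng) for _ in g]
        U = [gj * z + ej for gj, ej in zip(g, e)]
        if max(range(3), key=U.__getitem__) != 0:
            continue                           # outcome unobserved for this unit
        # outcome error correlated with the utility differences e_k - e_0
        u = c * sum(e[k] - e[0] for k in (1, 2)) + rng.gauss(0, 1)
        expV = [math.exp(gj * z) for gj in g]
        P = [v / sum(expV) for v in expV]
        S = sum(P[k] * math.log(P[k]) / (1 - P[k]) + math.log(P[0]) for k in (1, 2))
        y.append(beta[0] + beta[1] * x + u)
        rows_naive.append([1.0, x])            # OLS ignoring selection
        rows_corr.append([1.0, x, S])          # OLS with the selection regressor
    return ols(rows_naive, y), ols(rows_corr, y)

naive, corrected = simulate()
```

In this design the uncorrected intercept absorbs the mean of the omitted selection term and is pulled well below its true value of 2, while the corrected regression should recover a coefficient near `c` on the selection regressor.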

5. An empirical example

A model of physicians' specialty choice provides an appropriate setting for
the application of the 2SLO estimation procedure. It is quite plausible to
assume that a physician chooses that specialty among the various alternatives
which maximizes his or her utility. Income is a potential determinant of
specialty choice. However, selectivity bias may be present, suggesting that the
2SLO estimation procedure will be useful in deriving unbiased predictors of
specialty incomes for the full sample of physicians. The following model of
specialty income-specialty choice is proposed:
$$U_{ij} = W_{ij}\theta_j + \delta Y_{ij} + e_{ij}, \qquad (22)$$
$$Y_{ij} = X_{ij}\pi_j + v_i, \qquad j \in \{1, 2, \ldots, J\}, \qquad (23)$$
where $U_{ij}$ is the utility of the i'th physician in the j'th specialty. $W_{ij}$ is a vector
of exogenous variables assumed to affect utility, weighted by the taste
parameters $\theta_j$. $Y_{ij}$ is physician i's income in specialty j, and $e_{ij}$ is an i.i.d.
Extreme Value error term. $\delta$ is a parameter reflecting the marginal utility of
income. It is not assumed to be specialty-specific. $X_{ij}$ is a vector of individual-specific
variables, such as age, sex, and race, that would have the same values
regardless of the physician's choice of specialty. $\pi_j$ are parameters relating
the $Y_{ij}$ to $X_{ij}$. $v_i$ is a vector of individual-specific disturbances. Since our
underlying hypothesis is that $\delta > 0$, i.e., that income has a positive impact
on the probability of specialty choice, it will not be necessary to obtain
estimates of the structural parameters of the specialty labor markets where $Y_{ij}$
is determined. (23) is already a reduced-form equation. The reduced-form
for (22) is:
$$U_{ij} = W_{ij}\theta_j + X_{ij}\alpha_j + \delta v_i + e_{ij}, \qquad (24)$$
$$\alpha_j = \delta\pi_j, \qquad j \in \{1, 2, \ldots, J\}.$$

Regardless of the distribution of $v_i$, the restrictions imposed on (22)-(24)
assure that the probabilities of specialty choice are logistic and can be
estimated by FIML using the set of Equations (24). It is assumed that the
disturbances $e_{ij}$, $v_i$ obey the correlation assumptions of the 2SLO model.
The strategy to be employed in estimating (22) and (23) can be described
as follows:
1. First, FIML techniques are used to estimate the reduced-form Equations
(24) for each specialty alternative, and thus derive consistent estimates of
$\lambda_{ij}$.
2. The $\hat{\lambda}_{ij}$'s are inserted into the specialty-specific reduced-form income
Equations (23), generating consistent estimates of $\pi_j$, $\hat{\pi}_j$. This allows
calculation of instruments $\hat{Y}_{ij}$, for income in each alternative specialty j,
for each physician i.
3. The $\hat{Y}_{ij}$'s are put into Equation (22), and FIML logit estimation will
generate consistent estimates of $\theta_j$, $\delta$.
The data consist of a selected subsample of 2121 physicians from the
Seventh Periodic Survey of Physicians (PSP7) developed by the American
Medical Association Center for Health Research and Development. These
data are based on a 1970 AMA survey of approximately 6000 AMA
Masterfile physicians. This data base has been widely used by both the
AMA, and under proprietary arrangements by a small number of outside
research organizations (Goodman and Jensen, 1979; Yett et al., 1974).
The variables used in the analysis are defined as follows:
Yij = The continuous dependent variable. The physician's repor-
ted net income, calculated by subtracting total expenses
from reported gross income.
OWNERSHP = A dummy variable indicating ownership status of the
physician's medical school. 1 indicates a privately-owned
medical school, while 0 indicates a government-owned
medical school.
AAMCINDX = An index of the quality of the physician's medical school.
This index ranges from a low of 0 to a high of 10. It was
devised by the research branch of the Association of
American Medical Colleges. No values were available for
foreign medical schools, so to avoid losing all observations
on foreign medical graduates (FMG's), missing values were
coded as zero.
GRADAGE = The physician's age at medical school graduation.
AGE = The physician's age in 1970.
FMG = Dummy variable; 1 indicates that the physician is a foreign
medical graduate.
BLACKMD = Dummy variable; 1 indicates that the physician graduated
from Howard or Meharry Medical Schools, which in 1970
represented the educational institutions for a substantial
majority of black physicians.
SEX = Dummy variable; 1 indicates that the physician is female.
The results of the physician specialty choice-specialty income estimation
are presented in Table 5.1. These results are based on a three-alternative
model, the three specialty categories being GP (general or family practice),
IM (internal medicine), and OT (all other specialties). This categorization
preserves the primary care - nonprimary care distinction that is of interest
to policymakers concerned with perceived shortages of primary care physi-
cians. While a finer specialty categorization would be preferable, the analysis
is limited by both the small number of individual-specific exogenous
variables and computer capacity constraints on the QUAIL 3.5 program.10
The set of exogenous variables ($W_{ij} = X_{ij}$) are: OWNERSHP, AAMCINDX,
GRADAGE, AGE, FMG, SEX, and BLACKMD.
Column one of Table 5.1 presents the reduced-form logit estimation
results. The coefficients for the GP alternative are listed as 'EXOG-VAR'1
while the coefficients of the IM alternative are listed as 'EXOG-VAR'2. The
coefficients for the third OT alternative are -('EXOG-VAR'1 + 'EXOG-VAR'2).
Does income affect specialty choice? Using the 2SLO three-specialty
income equations, income predictions were made for each doctor in the
subsample regarding what he or she would have earned in the two other
specialty categories. Using the rule of putting each doctor into that specialty
category with the highest predicted (actual) income gave a 57.4 percent
sample success probability. This is clearly better than random assignment,
which would only predict correctly one-third of the time. It is also better than
a 'naive assignment' rule, which would assign doctors randomly based on the
sample frequencies of GP (0.215), IM (0.163), and OT (0.621), and would
have a prediction success rate of 45.9%. According to the Tchebycheff
inequality, the one-sided probability
Pr [p(income max) ≤ p(naive assign)] < 0.016.
While the full logit model has a successful prediction rate of 62.9%, this
simple income maximization rule correctly predicts the specialty of over half
of the sample of physicians, better than random or naive assignment rules.
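The benchmark success rates quoted above follow from simple arithmetic on the sample frequencies. The short sketch below reproduces them; the variable names are illustrative. Because the published frequencies are rounded to three decimals, the naive-assignment rate comes out at about 0.458 rather than exactly the 45.9% reported.

```python
shares = {"GP": 0.215, "IM": 0.163, "OT": 0.621}   # sample frequencies from the text

# Random assignment: each of the three categories is guessed with probability 1/3.
random_success = sum(p * (1 / 3) for p in shares.values())

# Naive assignment: category j is guessed with probability p_j and is correct
# with probability p_j, so the expected success rate is the sum of p_j squared.
naive_success = sum(p * p for p in shares.values())

print(f"random: {random_success:.3f}, naive: {naive_success:.3f}")
# prints "random: 0.333, naive: 0.458"
```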
The next test of the hypothesis that income influences specialty choice was
to include the 2SLO predicted incomes for each doctor in each specialty into
a logit framework of specialty choice (see Table 5.1, Column two). This
corresponds to estimating the structural Equation (22). The resulting
estimate suggests that predicted income is strongly positively correlated with
specialty choice, with a t-value of 2.15. The implication is that the marginal
utility of income is positive, and that an exogenous increase in income in a
given specialty will increase the probability that the physician will choose that
specialty.11

Table 5.1. Specialty choice equations logit estimation results.

                                    Structural equation   Structural equation
Variable name      Reduced form     2SLO fitted income    OLS fitted income

OWNERSHP1          -0.2543          -0.3682               -0.1714
                   (0.9428E-01)a    (0.1084)a             (0.1273)
AAMCINDX1          -0.1312          -0.7822E-01           -0.1666
                   (0.3062E-01)a    (0.3915E-01)a         (0.4768E-01)a
GRADAGE1           0.6308E-01       0.3206E-01            0.8437E-01
                   (0.1310E-01)a    (0.1828E-01)a         (0.2564E-01)a
AGE1               0.1677E-01       0.2904E-01            0.7617E-02
                   (0.3576E-02)a    (0.6782E-02)a         (0.1010E-01)
FMG1               -0.3444          -0.4263               -0.2765
                   (0.1458)a        (0.1507)a             (0.1623)a
SEX1               -0.1055          0.1572                0.3751E-01
                   (181.4)          (256.5)               (256.5)
BLACKMD1           0.3198           0.4950                0.2018
                   (0.4714)         (0.4780)              (0.4866)
OWNERSHP2          0.2339           0.1749                0.2786
                   (0.9800E-01)a    (0.1019)a             (0.1084)a
AAMCINDX2          0.8405E-01       0.6044E-01            0.9840E-01
                   (0.3355E-01)a    (0.3617E-01)          (0.3667E-01)a
GRADAGE2           -0.4145E-01      -0.3065E-01           -0.4509E-01
                   (0.1611E-01)a    (0.1800E-01)a         (0.1661E-01)a
AGE2               -0.1396E-01      -0.3025E-01           -0.1893E-02
                   (0.3901E-02)a    (0.8387E-01)          (0.1304E-01)
FMG2               0.1486           0.1555                0.1322
                   (0.1688)         (0.1889)              (0.1697)
SEX2               -0.1480E-01      -0.5133               -0.1316E-01
                   (181.4)          (258.5)               (256.5)
BLACKMD2           -0.9055          -1.777                -0.2935
                   (0.7008)         (0.8111)a             (0.9418)
INCM                                0.5663E-01            -0.4110E-01
                                    (0.2838E-01)a         (0.4241E-01)
CONSTANT1          -3.629           -2.172                -4.725
                   (0.5583)a        (0.8423)a             (1.262)a
CONSTANT2          -0.4679          1.192                 -1.893
                   (0.7047)         (1.056)               (1.612)

                   At convergence   At convergence        At convergence

Auxiliary Statistics.
LOG LIKELIHOOD              -1891.0    -1889.0    -1890.0
SUM OF SQUARED RESIDUALS     4397.0     4234.0     4390.0
DEGREES OF FREEDOM           4226.0     4225.0     4225.0
Goodness of Fit Statistics.
LIKELIHOOD RATIO INDEX       0.1885     0.1894     0.1888

Asymptotic standard errors are in parentheses.
a Indicates significance at at least the 5% level (two-tailed test).
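The auxiliary statistics in Table 5.1 can be cross-checked with a few lines of arithmetic. The sketch below assumes, as an interpretation rather than something stated in the text, that the likelihood ratio index is one minus the ratio of the fitted log likelihood to the log likelihood of a model assigning probability 1/3 to each specialty, and that the degrees of freedom count two independent choice indicators per physician less the number of estimated coefficients.

```python
import math

n_physicians = 2121          # subsample size reported in the text
n_alternatives = 3           # GP, IM, OT

# Log likelihood when every alternative is assigned probability 1/3
# (assumed here to be the benchmark behind the index).
ll_null = n_physicians * math.log(1 / n_alternatives)

log_likelihoods = {"reduced form": -1891.0, "2SLO": -1889.0, "OLS": -1890.0}
lr_index = {name: 1 - ll / ll_null for name, ll in log_likelihoods.items()}

# Degrees of freedom: (J - 1) independent indicators per physician,
# minus the number of estimated coefficients in each column of the table.
n_params = {"reduced form": 16, "2SLO": 17, "OLS": 17}
dof = {name: n_physicians * (n_alternatives - 1) - k for name, k in n_params.items()}
```

Under these assumptions the computed indices agree with the tabulated 0.1885, 0.1894, and 0.1888 to within about 0.0001 (the tabulated log likelihoods are rounded), and the degrees of freedom match 4226 and 4225.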
The other GP logit coefficients tend to support well-established prior
hypotheses. The IM coefficients are quite different from the GP coefficients,
confirming the necessity of separating IM specialists from other primary care
physicians. Internists are more likely to come from private, high 'quality'
medical schools. They are more likely to be younger, and tend to be younger
at graduation from medical school. Additional confirmation of prior
hypotheses is evidenced by the negative FMG1 coefficients (Stevens, 1978).12
What happens when selectivity bias is ignored? In column three of Table
5.1 the structural Equation (22) is reestimated using OLS fitted incomes,
without the selectivity bias correction factors. In this estimation the income
variable actually has a negative, though statistically insignificant parameter
value, suggesting that specialty income does not affect utility, and that ceteris
paribus increases in income in a given specialty will have an insignificant
(perhaps negative) impact on the probability of choosing that specialty.
It is interesting to observe that this result, based on an estimation that
ignores selectivity bias, is consistent with the conclusions of Sloan (1970) and
Hadley (1975) who also ignored selectivity bias. However, when selectivity
bias is taken into account, the more standard economic hypothesis of a
significant positive impact of income on the probability of occupational
(specialty) choice is observed.
In conclusion, the 2SLO model of specialty choice and specialty income
indicates that physician income equations are indeed subject to specialty-
specific selectivity bias. It reaffirms standard labor theory hypotheses that
income has a tendency to regulate occupational choice, even in the highly
imperfect physician services labor market, and suggests that previous
contrary findings are based on models that ignore the selectivity bias in
specialty-specific income equations.


* Research support from the National Center for Health Services Research (NCHSR)
(DASH Grant # 03150) is gratefully acknowledged. I would like to thank the Center for
Health Policy Research of the American Medical Association (AMA) for providing
physician survey data. The views expressed here are not necessarily those of either
NCHSR or the AMA.
1. See, for example, Lee (1976); Olsen (1975); Heckman (1974, 1977); Lee and Trost
(1978); Nelson (1977); Nelson and Olson (1978).
2. While a stochastic utility framework may also be developed in the case where Vi is
assumed to be joint-normal, it should be pointed out that this type of behavioral motiva-
tion is not applicable in the case where Vi is uniform. As Domencich and McFadden
(1975) point out, a uniformly distributed random variable cannot be written as the
difference between two identically independently distributed random variables.
3. In this case $\lambda_i$ includes a location shift due to the fact that $E(v_i) = 1/2 \neq 0$.
4. Since $v_i$ is only identified up to a scalar multiple, a normalization of this sort is required.

5. See Shannon (1948).
6. This series converges rapidly due to the exponential term $e^{-m|Z_i\gamma|}$. Furthermore, the error
magnitude of an m-term approximation of $\lambda_i$ must be less than the magnitude of the m +
1'th term.
7. See Hay (1980) for a derivation of consistent estimators for $\rho$, $\sigma_u^2$.
8. A more general formulation would allow the correlation of $u_{ij}$ and $v_{ij}^{k}$ to vary across k. As
is discussed elsewhere, the current formulation considerably simplifies the model and
reduces the number of parameters that must be estimated. See Hay (1980) for a further
discussion of this point.
9. See Hay (1980) for a derivation of this result.
10. A model extension to include a pediatrics category was explored. However, there was an
insufficient number of pediatricians in the sample to make this feasible.
11. It should be pointed out that while the coefficient estimates in column 2 of Table 5.1 are
consistent, the reported standard errors are not adjusted for the use of fitted income
rather than the true unobservable income in each specialty. They will be consistent under
the null hypothesis of no selectivity bias, but are only approximate under the alternative
hypothesis.
12. See e.g., Stevens (1978) where it is argued that the lack of specialty employment at home
is a major push factor leading FMGs to migrate to the U.S.


Amemiya, T. (1973), 'Regression Analysis When the Dependent Variable is Truncated
Normal', Econometrica 41, 996-1017.
Anderson, T. W. (1958), An Introduction to Multivariate Statistical Analysis, John Wiley &
Sons, New York.
Domencich, T. and McFadden, D. (1975), Urban Travel Demand: A Behavioral Analysis,
North Holland, Amsterdam.
Ernst, R. and Yett, D. (1985), Physicians' Specialty and Location Choices, Ann Arbor,
Michigan, Health Administration Press.
Goodman, L. and Jensen, L. (1979), 'Economic Surveys of Medical Practice: AMA's Periodic
Survey of Physicians, 1966-1978', paper presented at the Third Biennial Conference on
Health Survey Research Methods: Reston, Virginia.
Hadley, J. (1975), 'Models of Physicians' Specialty Choice and Location Decisions',
unpublished Ph.D. dissertation, Yale University.
Hay, J. (1980), 'Occupational Choice and Occupational Earnings: Selectivity Bias in a
Simultaneous Logit-OLS Model', Ph.D. dissertation, Yale University. Published by the
National Technical Information Service, Rockville, Maryland.
Heckman, J. (1974), 'Shadow Prices, Market Wages, and Labor Supply', Econometrica 42,
Heckman, J. (1976), 'The Common Structure of Statistical Models of Truncation, Sample
Selection, and Limited Dependent Variables and a Simple Estimator for Such Models',
Annals of Economic and Social Measurement 5, 475-592.
Heckman, J. (1977), 'Dummy Endogenous Variables in a Simultaneous Equation System',
Report 7726, Center for Mathematical Studies in Business and Economics, University of
Heckman, J. (1979), 'Sample Selection Bias as a Specification Error', Econometrica 47, 153-
Hogg, J. and Craig, A. (1965), Introduction to Mathematical Statistics, 2nd Ed., New York,
Macmillan Co.
Jennrich, R. (1969), 'Asymptotic Properties of Nonlinear Least Squares Estimators', Annals of
Mathematical Statistics 40, 633-643.
Johnson, N. and Kotz, S. (1970), Continuous Univariate Distributions, Volumes 2 and 3 in the
series: Distributions in Statistics, John Wiley and Sons, New York.
Lee, L. (1976), 'Two Stage Estimation of Limited Dependent Variable Models', Ph.D.
dissertation, Department of Economics, University of Rochester.
Lee, L. and Trost, R. (1978), 'Estimation of Some Limited Dependent Variable Models with
Application to Housing Demand', Journal of Econometrics 8.
McFadden, D. (1973), 'Conditional Logit Analysis of Qualitative Choice Behavior', in P.
Zarembka (ed.), Frontiers in Econometrics, New York, Academic Press.
Nelson, F. (1977), 'Censored Regression Models with Unobserved, Stochastic Censoring
Thresholds', Journal of Econometrics 6, 309-327.
Nelson, F. and Olson, L. (1978), 'Specification and Estimation of a Simultaneous-Equation
Model with Limited Dependent Variables', International Economic Review 19, 695-709.
Olsen, R. (1975), 'The Analysis of Models with One Continuous and One Dichotomous
Dependent Variable', unpublished working paper, Yale University.
Olsen, R. (1980), 'A Least Squares Correction for Selectivity Bias', Econometrica 48(3),
Shannon, C. (1948), Mathematical Theory of Communication, Urbana, Illinois.
Sloan, F. (1970), 'Lifetime Earnings and Physicians' Choice of Specialty', Industrial and Labor
Relations Review 24, 47-56.
Stevens, R. (1978), The Alien Doctors: Foreign Medical Graduates in American Hospitals,
Wiley & Sons, New York.
Yett, D. et al. (1974), An Original Comparative Economic Analysis of Group & Solo Practice,
National Technical Information Service, Rockville, Maryland.

Functioning: cost and financing

Subsidies, quality, and regulation in the U.S.
nursing home industry*

The RAND Corporation, 1700 Main Street, P.O. Box 2138, Santa Monica,
California 90406, U.S.A.

I. Introduction

In the nursing home industry, government regulators are concerned with
assuring a high standard of quality, providing the poor with access to care,
and controlling the expansion of the industry. To this end, the government
has created the Medicaid (and Medicare) patient subsidy programs and the
Certificate of Need (CON) cost containment program. This paper theoret-
ically and empirically analyzes the impact of Medicaid and CON policy on
nursing home behavior. The theoretical analysis is complicated by the fact
that both proprietary and 'not for profit' nursing homes exist, and the
empirical work is complicated by the problem that quality is not directly
observed.
Medicaid pays for the care of the financially indigent by directly reimbursing
nursing homes at a predetermined rate. As a result, nursing homes can
price discriminate between patients who finance their care privately and
patients whose care is financed by Medicaid. Nevertheless, nursing homes
are required to provide the same quality to both types of patients, where
quality is defined by the package of goods and services provided by the
nursing home. Quality is determined by the 'private pay' market. The greater
is 'private pay' demand relative to Medicaid, the higher is quality. If the
'private pay' market did not exist, then nursing homes would face only
Medicaid demand, which is insensitive to quality, and therefore, would have
no incentive to provide more quality than is necessary to obtain government
Typically, Medicaid reimbursement rates are set by a cost plus method,
where the reimbursement per patient is equal to average cost plus some
return referred to as the Medicaid 'plus' factor. Our results show that
Medicaid policy makers face a quality-access trade-off. Specifically, we find
that increases in the Medicaid 'plus' factor cause nursing homes to reduce
quality and substitute Medicaid patients for 'private pay' patients. The
increase in the 'plus' factor raises the marginal profit of a Medicaid patient.
Therefore, homes have incentive to substitute Medicaid patients for 'private

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 117-139.

© 1991 Kluwer Academic Publishers.
pay' patients. Homes reduce 'private pay' demand and operating costs by
lowering quality. These quality differences can be quite large. Specifically, in
our sample, we find that homes that receive high Medicaid 'plus' factors
provide hundreds of thousands of dollars less in goods and services than
homes that receive average Medicaid 'plus' factors, ceteris paribus.
CON attempts to control nursing home expenditures by limiting the
supply of beds with capacity constraints and entry barriers. CON policy
makers are forced to trade off containing the size of the industry (and
therefore total Medicaid payments) against quality and access. Specifically,
we find that the capacity constraints and the reduced competition from the
entry barriers lead to lower quality and fewer Medicaid patients receiving
care.
The theoretical analysis is based on Gertler (1985a, b).1 The model relies
on the notion that there are quantity and quality aspects to production, and
both quantity and quality are endogenous.2 Specifically, nursing homes
produce a series of commodities, such as medical care, room and board, and
social activities. The quality of nursing home care is the utility patients derive
from consuming this package. Nursing home output, then, is characterized by
the total number of patients and average quality. Proprietary homes are
assumed to maximize profits by choosing quality and 'private pay' price
(which determines the mix of 'private pay' and Medicaid patients) subject to
the CON capacity constraint. 'Not for profit' homes are specified as utility
maximizers subject to CON capacity and break even constraints.
The empirical analysis estimates a linear specification of the reduced form,
but is complicated because quality is not directly observed. Instead of
directly estimating the reduced form quality equation, we use the theory of
production to specify a model in which quality is a latent variable, but the
parameters of the reduced form quality equation are identified. The model
has a Multiple Indicator Multiple Cause (MIMIC) interpretation, where the
indicators are conditional input demands which have been filtered to remove
all sources of variation except quality and random disturbances.3
We discuss the regulatory environment in Section II, the theoretical model
in Section III, the empirical specification in Section IV, the data in Section V,
and the empirical results in Section VI, and summarize in Section VII.

II. The nursing home industry

The nursing home industry has expanded from approximately $190 million
in 1950 to over $18 billion in 1980.4 Most of the expansion took place after
1966, the year in which the Medicaid program began. As of 1980, the public
share of nursing home expenditures was over 65%. Health care regulators
have the task of trying to control this expansion, while simultaneously
providing the poor with access to nursing home care and promoting a high
standard of quality. The major forms of government intervention are the
Medicaid patient subsidy program and the CON cost containment program.
Medicaid is an entitlement program established under the Social Security
Act to provide the poor with a minimum floor of health services. Through
direct subsidies, the Medicaid program makes health care available to
individuals who otherwise could not afford it. It is jointly financed by State
and Federal governments, but administered on a State basis. The Medicaid
program reimburses nursing homes a set fee for the care of Medicaid
patients. Typically, States pay nursing homes via 'cost plus' reimbursement,
although a few States have opted for a prospectively set flat reimbursement
rate.
The CON cost containment program was passed into law in response to
the rapid growth of the health care industry during the late 1960's and early
1970's. It requires that, in order to expand an existing nursing home or build
a new one, the government must certify that the proposed facility is indeed
'needed'. Effectively, CON limits the capacity of existing nursing
homes and new entry into the market.5 It was thought that the expansion
could be contained by limiting the available supply of nursing home beds.
In essence, government regulation has turned nursing homes into price
discriminators. The Medicaid program creates a second market for nursing
home care, and CON restricts supply so that there is excess Medicaid patient
demand. Therefore, Medicaid demand is perfectly elastic at the Medicaid
reimbursement rate. The excess Medicaid demand hypothesis is supported
empirically in Scanlon (1980).6 Hence, nursing homes compete with each
other for 'private pay' patients knowing that they can always admit Medicaid
patients at the Medicaid reimbursement rate if they have excess capacity.

III. A theory of nursing home behavior

A. Assumptions and notation

Nursing homes face 'private pay' and Medicaid demand. 'Private pay'
demand is given by X(P, Q), where X is the number of 'private pay' patients,
P is the price charged 'private pay' patients, and Q is average quality. 'Private
pay' demand is increasing in Q, and decreasing in P. In contrast, Medicaid
demand is perfectly elastic at the Medicaid reimbursement rate.
CON imposes a capacity constraint on each home. Since there is excess
Medicaid demand, the constraint is binding. It is specified as
$$X + M = \bar{X}, \qquad (1)$$
where M is the number of Medicaid patients and $\bar{X}$ is the CON-allowed
capacity.
Nursing homes are required to provide all patients with the same level of
quality. Therefore, a nursing home's cost function can be specified as a
function of the total number of patients receiving care and average quality.
Let the cost function for providing quality level Q to $\bar{X}$ patients be given by
$C(\bar{X}, Q)$. It is assumed to be increasing and convex, with the marginal cost of
caring for another patient increasing in quality.
Finally, Medicaid reimburses a nursing home its average cost plus r, the
Medicaid 'plus' factor. Hence, the Medicaid reimbursement per patient is
$$R = r + C(\bar{X}, Q)/\bar{X}. \qquad (2)$$

B. Proprietary nursing home behavior

1. Equilibrium
Homes choose 'private pay' price and quality so as to maximize profits
subject to the CON capacity constraint. The profit function is
$$\Pi = P\,X(P, Q) + R\,[\bar{X} - X(P, Q)] - C(\bar{X}, Q). \qquad (3)$$
The first order conditions are
$$P X_P + X = R X_P \qquad (4)$$
$$P X_Q = (X/\bar{X})\,C_Q + R X_Q. \qquad (5)$$
'Private pay' price is chosen in (4) such that marginal 'private pay' revenue
equals the opportunity cost of Medicaid revenue, and quality is chosen in (5)
such that marginal 'private pay' revenue equals the marginal cost of quality
plus the opportunity cost of Medicaid revenue. Since the cost of caring for
Medicaid patients is recovered via Medicaid reimbursement, the marginal
cost of quality is weighted by the proportion of 'private pay' patients.
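The first order conditions can be checked numerically. The sketch below uses hypothetical functional forms (a linear demand function and a cost function quadratic in quality) and arbitrary parameter values, none of which come from the paper. It verifies two things: that substituting (2) into (3) gives the equivalent expression profit = (P - R) X + r * capacity, and that finite-difference derivatives of the profit function match the expressions behind (4) and (5) at an arbitrary point.

```python
def demand(P, Q):
    """Hypothetical linear 'private pay' demand X(P, Q)."""
    return 60.0 - 0.5 * P + 4.0 * Q

def cost(X_bar, Q):
    """Hypothetical cost of serving X_bar patients at quality Q."""
    return X_bar * (20.0 + 1.5 * Q ** 2)

def profit(P, Q, r, X_bar):
    X = demand(P, Q)
    R = r + cost(X_bar, Q) / X_bar                     # Equation (2)
    return P * X + R * (X_bar - X) - cost(X_bar, Q)    # Equation (3)

def foc_gaps(P, Q, r, X_bar, h=1e-5):
    """Numerical profit derivatives minus the analytic first order expressions."""
    X = demand(P, Q)
    R = r + cost(X_bar, Q) / X_bar
    X_P = (demand(P + h, Q) - demand(P - h, Q)) / (2 * h)
    X_Q = (demand(P, Q + h) - demand(P, Q - h)) / (2 * h)
    C_Q = (cost(X_bar, Q + h) - cost(X_bar, Q - h)) / (2 * h)
    dP = (profit(P + h, Q, r, X_bar) - profit(P - h, Q, r, X_bar)) / (2 * h)
    dQ = (profit(P, Q + h, r, X_bar) - profit(P, Q - h, r, X_bar)) / (2 * h)
    gap_P = dP - (P * X_P + X - R * X_P)                      # terms of Equation (4)
    gap_Q = dQ - (P * X_Q - (X / X_bar) * C_Q - R * X_Q)      # terms of Equation (5)
    return gap_P, gap_Q

P, Q, r, X_bar = 50.0, 3.0, 2.0, 100.0       # arbitrary evaluation point
X = demand(P, Q)
R = r + cost(X_bar, Q) / X_bar
identity_gap = profit(P, Q, r, X_bar) - ((P - R) * X + r * X_bar)
gap_P, gap_Q = foc_gaps(P, Q, r, X_bar)
```

The rewritten profit expression makes the role of the 'plus' factor visible: R acts as the opportunity cost of filling a bed with a 'private pay' patient, which is how it enters both conditions.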

2. Policy comparative statics

a. The medicaid 'plus' factor. An increase in the Medicaid 'plus' factor
causes nursing homes to lower quality, has an ambiguous effect on 'private
pay' price, and causes homes to increase their number of Medicaid patients
at the expense of fewer 'private pay' patients.7 The rationale behind this
result is that an increase in r raises marginal Medicaid revenue, making
Medicaid patients more profitable. Therefore, since the CON capacity
constraint is binding, homes want to substitute Medicaid patients for 'private
pay' patients. Homes can reduce 'private pay' demand by increasing their
'private pay' price and by lowering their quality. They surely lower quality,
since that also reduces their operating costs. From (4), given quality, 'private
pay' price is increasing in R. Therefore, 'private pay' price rises or falls
depending on the effect of r on R. From (2), the increase in r has a positive
direct effect on R, and a negative indirect effect via the reduction in average
cost from the fall in quality. The fall in quality also lowers 'private pay'
marginal revenue. Therefore, from (4), 'private pay' price rises (falls) if
'private pay' marginal revenue falls more (less) than the net fall in R.

b. The CON capacity constraint. An increase in capacity creates the incentive
for proprietary nursing homes to reduce quality, and has an ambiguous
effect on 'private pay' price. An increase in capacity raises the marginal cost
of quality. Consequently, marginal 'private pay' revenue becomes less
than the marginal cost of quality plus the opportunity cost of Medicaid
revenue in (5). In response, nursing homes reduce quality. Again, 'private
pay' price rises or falls depending on the effect on R. The increase in
capacity affects R through average cost. It has a positive direct effect on
average cost, and a negative indirect effect via the fall in quality. Further, the
fall in quality reduces marginal 'private pay' revenue. Therefore, from (4),
'private pay' price rises (falls) if 'private pay' marginal revenue falls more
(less) than R.

c. CON entry policy. Entry increases competition for 'private pay' patients,
and thus reduces 'private pay' demand and marginal revenue to each nursing
home. In response, homes lower quality. The fall in quality decreases both
'private pay' marginal revenue and R. Hence, 'private pay' price rises (falls) if
marginal 'private pay' revenue falls more (less) than the fall in R.

C. 'Not for profit' nursing home behavior

1. Objectives
Unlike proprietary nursing homes, the objectives of 'not for profit' homes are
not well defined. Economists typically model 'not for profit' firms as utility
maximizers.8 The arguments of a 'not for profit' firm's utility function are
debatable and depend upon the institutional setting. In the nursing home
industry, where the religious institution dominates, we assume that not for
profit homes are basically altruistic in nature. Therefore, we expect these
homes to be concerned with quality, and with providing care to the poor.
Hence, the 'not for profit' nursing home's utility function is assumed to be an
increasing function of quality and the number of Medicaid patients. 'Not for
profit' nursing homes are assumed to choose 'private pay' price and quality
so as to maximize utility subject to the CON capacity constraint and subject
to a break even constraint.9

2. Equilibrium
Let $G(Q, M)$ be the nursing home's utility function, and $\lambda$ the Lagrange
multiplier on the break even constraint. Then, assuming that the Kuhn-Tucker
conditions are satisfied at an interior solution, the first order conditions
are
$$P X_P + X - (1/\lambda)\,G_M X_P = R X_P \qquad (6)$$
$$P X_Q + (1/\lambda)(G_Q - G_M X_Q) = (X/\bar{X})\,C_Q + R X_Q. \qquad (7)$$

'Private pay' price is chosen in (6) such that marginal 'private pay' revenue
plus the marginal utility of Medicaid patients equals the opportunity cost of
Medicaid revenue, and quality is chosen in (7) such that the marginal 'private
pay' revenue plus the marginal utility of quality equals the marginal cost of
quality and marginal opportunity cost of Medicaid revenue.

3. Policy comparative statics

Adjustments in the policy variables affect the 'not for profit' nursing home's
first order conditions in exactly the same way as they affected the proprietary
first order conditions.10 Therefore, the directions of 'not for profit' nursing
homes' responses to policy changes are the same as those of proprietary nursing
homes, but the magnitudes of response may differ.

IV. Empirical specification and econometric methods

Here, we present the methodology for empirically investigating the impact of
Medicaid and CON policy on quality, 'private pay' price, and patient mix via
estimation of the reduced form. Except for the quality equation, estimation of
the reduced form is straightforward. Since quality is not directly observed,
the reduced form quality equation cannot be directly estimated. Instead, we
specify a model in which quality is a latent variable but the parameters of the
reduced form quality equation are identified.

A. The reduced form

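The reduced form assumes a linear solution to the equilibrium conditions. It
specifies the endogenous variables to be linear functions of exogenous supply
and demand variables, and of Medicaid and CON policy variables. Let the
reduced form for the ith nursing home be

$$X_i = \beta_{10} + \sum_{j=1}^{J} \beta_{1j} Z_{ij} + \sum_{k=1}^{K-J} \beta_{1,J+k} W_{ik} + \varepsilon_{1i}, \tag{9}$$

$$M_i = \beta_{20} + \sum_{j=1}^{J} \beta_{2j} Z_{ij} + \sum_{k=1}^{K-J} \beta_{2,J+k} W_{ik} + \varepsilon_{2i}, \qquad (i = 1, \ldots, n), \tag{10}$$

$$P_i = \beta_{30} + \sum_{j=1}^{J} \beta_{3j} Z_{ij} + \sum_{k=1}^{K-J} \beta_{3,J+k} W_{ik} + \varepsilon_{3i}, \tag{11}$$

$$Q_i = \beta_{40} + \sum_{j=1}^{J} \beta_{4j} Z_{ij} + \sum_{k=1}^{K-J} \beta_{4,J+k} W_{ik} + \varepsilon_{4i}, \tag{12}$$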

where the $Z_{ij}$'s include exogenous demand variables, the Medicaid 'plus'
factor, and the market concentration level implied by CON policy. The
$W_{ik}$'s are exogenous supply variables: input prices, capital stock, and
the CON capacity constraint. Finally, the $\beta$'s are unknown parameters,
and the $\varepsilon$'s are independently distributed random variables with
zero mean.
A binding CON capacity constraint implies several restrictions on the
reduced form. If the constraint is binding, then $X_i$ and $M_i$ sum to
$\bar{X}_i$ for all i, implying that Equations (9) and (10) sum to $\bar{X}_i$
for all i. Let $W_{i1}$ be $\bar{X}_i$. Then, a binding CON capacity
constraint implies

$$\beta_{1j} = -\beta_{2j} \quad \text{for } j = 0, 1, \ldots, J, J+2, \ldots, K, \tag{13}$$

$$\beta_{1,J+1} = 1 - \beta_{2,J+1}, \tag{14}$$

and a singular covariance matrix. These adding-up restrictions imply that
Equation (10) is redundant. Therefore, we exclude Equation (10) from the
estimation. The remaining equations have the same right hand side variables
and no cross equation restrictions, which allows us to efficiently estimate (9)
and (11) by least squares, and then recover the coefficients in (10) from the
restrictions in (13) and (14).
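
The recovery step can be illustrated with a few lines of Python. This is a minimal sketch, not the author's code: the function name and dictionary keys are hypothetical, and the inputs are the rounded proprietary coefficients on the policy variables from the 'private pay' patients equation as reported in Table 2.

```python
def recover_medicaid_coefficients(private_pay_coefs, capacity_name):
    """Apply the adding-up restrictions (13)-(14).

    Every Medicaid coefficient is minus the 'private pay' coefficient,
    except the one on the capacity variable, which is one minus it.
    """
    medicaid = {}
    for name, beta in private_pay_coefs.items():
        if name == capacity_name:
            medicaid[name] = 1.0 - beta   # restriction (14)
        else:
            medicaid[name] = -beta        # restriction (13)
    return medicaid


# Rounded proprietary estimates from the 'private pay' patients equation.
proprietary_private_pay = {
    "medicaid_plus_factor": -0.489,
    "con_capacity": 0.086,
    "market_concentration": -33.221,
}

proprietary_medicaid = recover_medicaid_coefficients(
    proprietary_private_pay, "con_capacity"
)
for name, beta in proprietary_medicaid.items():
    print(f"{name}: {beta:.3f}")
```

With these rounded inputs the capacity coefficient comes out as 0.914, against the 0.913 printed in Table 2; the gap is presumably rounding in the published figures.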

B. The reduced form quality equation

1. A model of quality and factor demand

Input demand functions depend on output quantities, input prices, and in the
short run, capital stock. A nursing home's output can be characterized by the
number of patients under its care, and the total amount of quality it provides
all of its patients. Let $\varphi_i$ be nursing home i's total quality level,
and $Y_{it}$ be nursing home i's demand for input t. Suppose the input demand
functions are linear. Then, nursing home i's input demand functions are

$$Y_{it} = \alpha_{0t} + \alpha_{\varphi t}\,\varphi_i + \sum_{k=1}^{K-J} \alpha_{kt} W_{ik} + v_{it}, \qquad (t = 1, \ldots, L), \tag{15}$$

where the $\alpha$'s are unknown coefficients and the $v_{it}$'s are random
disturbances. Total quality is just average quality times the total number of
patients (i.e. $\bar{X}_i Q_i$). Therefore, from (12), the reduced form total
quality equation is

$$\varphi_i = \beta_{40}\bar{X}_i + \sum_{j=1}^{J} \beta_{4j} Z_{ij}\bar{X}_i + \sum_{k=1}^{K-J} \beta_{4,J+k} W_{ik}\bar{X}_i + \varepsilon_{4i}\bar{X}_i. \tag{16}$$

Substitution of (16) into (15) gives the reduced form input demand equations

$$Y_{it} = \alpha_{0t} + \sum_{k=2}^{K-J} \alpha_{kt} W_{ik} + \gamma_{0t}\bar{X}_i + \sum_{j=1}^{J} \gamma_{jt} Z_{ij}\bar{X}_i + \sum_{k=1}^{K-J} \gamma_{J+k,t} W_{ik}\bar{X}_i + \eta_{it}, \qquad (t = 1, \ldots, L), \tag{17}$$

where

$$\eta_{it} = v_{it} + \alpha_{\varphi t}\,\varepsilon_{4i}\bar{X}_i, \tag{18}$$

$$\gamma_{0t} = \alpha_{1t} + \alpha_{\varphi t}\,\beta_{40}, \tag{19}$$

$$\gamma_{jt} = \alpha_{\varphi t}\,\beta_{4j} \quad \text{for } j > 0. \tag{20}$$

2. Identification and estimation

The reduced form quality and structural input demand equations are
identified with the imposition of several restrictions on the reduced form
input demand functions. First, quality must be normalized to some base. As
of now, it is measured in arbitrary units. If we divide quality by the
coefficient on quality in one of the input demand equations, then quality is
measured in the same units as that input. This is equivalent to restricting one
of the $\alpha_{\varphi t}$'s to unity. Second, the intercept of the reduced
form quality equation is identified by excluding $\bar{X}_i$ from one of the
input demand functions. This assumption requires one of the input demands to
depend only on total quality and not the mix of quality and number of
patients. Finally, there are the cross equation restrictions implied by (19)
and (20), which require the reduced form input demand equations to be
proportional to one another in the coefficients on the exogenous variables in
(16).
The reduced form quality and structural input demand equations are
estimated jointly using a minimum distance procedure described in
Chamberlain (1982). Chamberlain shows that this procedure consistently
and efficiently estimates a system of equations with nonlinear cross-equation
restrictions and general heteroskedasticity, and derives $\chi^2$ statistics
for hypothesis testing.
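
To make the proportionality restrictions concrete, the sketch below fits unrestricted coefficients gamma[t][j] by a product a[t] * b[j], with the first loading normalized to one. It is an illustration under loud assumptions, not Chamberlain's estimator: it weights all coefficients equally (an identity weight matrix), whereas the efficient procedure weights by the inverse covariance matrix of the unrestricted estimates, and it uses simple alternating least squares. The function name and the example numbers are hypothetical.

```python
def fit_proportional(gamma, iterations=50):
    """Equally weighted minimum distance fit of gamma[t][j] ~ a[t] * b[j].

    a[t] plays the role of the quality loading in input equation t and
    b[j] the reduced form quality coefficient on variable j.
    The normalization a[0] = 1 fixes the units of quality.
    """
    n_eq, n_var = len(gamma), len(gamma[0])
    a = [1.0] * n_eq
    b = [0.0] * n_var
    for _ in range(iterations):
        # Given the loadings a, the best b is a least squares projection.
        a_sq = sum(x * x for x in a)
        b = [sum(a[t] * gamma[t][j] for t in range(n_eq)) / a_sq
             for j in range(n_var)]
        # Given b, update the loadings the same way.
        b_sq = sum(x * x for x in b)
        a = [sum(gamma[t][j] * b[j] for j in range(n_var)) / b_sq
             for t in range(n_eq)]
        # Re-impose the normalization a[0] = 1.
        scale = a[0]
        a = [x / scale for x in a]
        b = [x * scale for x in b]
    distance = sum((gamma[t][j] - a[t] * b[j]) ** 2
                   for t in range(n_eq) for j in range(n_var))
    return a, b, distance


# Hypothetical check: build coefficients that satisfy the restrictions
# exactly and confirm that the fit recovers the loadings and coefficients.
true_a = [1.0, 0.24, 0.04]
true_b = [-0.01, -1.18, 0.04]
gamma_hat = [[a_t * b_j for b_j in true_b] for a_t in true_a]
loadings, quality_coefs, distance = fit_proportional(gamma_hat)
print(loadings, quality_coefs, distance)
```

In the efficient version the minimized distance, scaled by the sample size, is what gets compared with a $\chi^2$ critical value; the unweighted distance here has no such interpretation.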

3. Prediction and inference

This model can be interpreted as a MIMIC model, with filtered input
demands as indicators. The input demands are filtered to remove variation
due to input prices, total number of patients, and capital stock. The remain-
ing variation is due to quality and random disturbances. Therefore, the
indicators and quality have the same covariation with the explanatory
variables, but are measured in different units. The normalization bases
quality, the latent variable, to the same scale as one of the indicators, and the
proportionality constraints restrict the measurement model of quality to be a
one-factor model. The model can be thought of as a multivariate regression
of the filtered input demands against the explanatory variables with the right
hand sides being proportional to one another.
Since the intercept is identified, an index of quality can be predicted for
each home. These are indices of input quantities which have been normalized
to account for differences in input prices, total number of patients, and
capital stock. Therefore, they measure the volume or intensity of services
nursing homes provide their patients.
These measures of service intensity allow us to make inferences about
quality. A nursing home's quality is the utility patients derive from consuming its
package of goods and services. A home can improve its quality by adjusting
the composition of its package so as to be more in accordance with patients'
preferences or by increasing the quantity of any commodity. Therefore, if
patients prefer nursing services to social activities, the home can improve its
quality without raising its operating costs by shifting resources from social
activities to nursing services.
Proprietary homes choose the quantity of each component to maximize
profits. The more 'private pay' patients are willing to pay for a particular
component and the lower its marginal cost, the greater the equilibrium
quantity of that component. If there is an exogenous increase in 'private pay'
patient demand, then the home invests in those components that yield the
greatest marginal profit. Therefore, holding input prices constant, increases in
service intensity raise patient utility. Since quality is the utility patients derive
from consuming a nursing home's package, observed increases in a home's
service intensity are tantamount to increases in quality. Hence, the estimated
coefficients on the policy variables in (12) can be interpreted as partial
effects on quality.
A similar argument can be posed for 'not for profit' nursing homes. 'Not
for profit' nursing homes are characterized as maximizing utility subject to a
break even constraint. Since the majority of 'not for profit' nursing homes are
operated by religious organizations, a major argument of their utility func-
tions is likely to be quality of care. Therefore, one would expect 'not for
profit' homes to construct their package of goods and services in accordance
with the preferences of their patients. Consequently, observed increases in a
'not for profit' nursing home's service intensity are also tantamount to an
increase in quality.

V. Data

The data are constructed from New York State's 1980 survey of Long Term
Care Facilities. The sample consists of 455 nursing homes chosen from 798
possible cases. Excluded were government homes, hospital attached homes,
and non-reporting homes. In the sample are 288 proprietary and 167 'not for
profit' nursing homes. Unless otherwise specified, the variables are daily
averages, with the unit of observation being the nursing home. Descriptive
statistics are presented in Table 1.

Table 1. Descriptive statistics: means, with standard deviations in parentheses.

Variable                      All homes         Proprietary       Not for profit

 1. Private Pay Price         59.50 (19.42)     59.87 (17.68)     58.88 (22.15)
 2. Medicaid Patients         100.05 (90.67)    99.17 (85.88)     102.81 (98.61)
 3. Private Pay Patients      23.96 (20.49)     25.44 (18.81)     21.40 (22.95)
 4. Nursing Labor Hours       315.09 (266.27)   313.20 (224.27)   318.36 (327.01)
 5. Other Labor Hours         216.18 (176.39)   189.62 (125.88)   261.99 (233.15)
 6. Supplies Quantity Index   2.50 (2.04)       2.60 (2.18)       2.32 (1.78)
 7. Nursing Hourly Wage       7.82 (2.75)       7.67 (2.84)       8.06 (2.59)
 8. Other Labor Hourly Wage   8.78 (2.78)       7.97 (2.44)       8.79 (3.23)
 9. Supplies Price            1.41 (0.10)       1.38 (0.03)       1.44 (0.15)
10. Capital Stock             6.86 (25.51)      7.04 (31.77)      6.54 (5.84)
11. Per Capita Income         7.15 (1.40)       7.08 (1.44)       7.27 (1.32)
12. Population Over 65        1.07 (0.88)       0.98 (0.87)       1.23 (0.87)
13. % Patients Same County    0.74 (0.26)       0.77 (0.24)       0.70 (0.30)
14. Health Status Index       1.43 (0.61)       1.51 (0.57)       1.31 (0.67)
15. Medicaid Plus Factor      15.41 (15.21)     18.35 (13.96)     10.34 (15.96)
16. Total Patients            124.21 (94.04)    124.61 (89.33)    124.01 (101.94)
17. Market Concentration      0.12 (0.11)       0.12 (0.12)       0.11 (0.09)
18. # of Observations         455               288               167

In the reduced form, the dependent variables are 'private pay' price, the
number of Medicaid patients, and the number of 'private pay' patients. The
dependent variables in the input demand equations are 100's of nursing labor
hours, 100's of other labor hours, and a supplies quantity index.11
The exogenous supply variables are the input prices and capital stock. The
input prices are the hourly nursing wage rate, the hourly other labor wage
rate, and a supplies price index. Since the majority of capital owned by a
nursing home is the facility itself, capital stock is measured as total area of
the facility in 100,000's of square feet.
The exogenous demand variables are the per capita income of the people
living in the nursing home's market area, the population over age 65 in the
nursing home's market area, the proportion of 'private pay' patients in the
nursing home whose last residence before entering the nursing home was
located in the same county in which the nursing home is located, and an
index of health status of the patients in the nursing home. Income is
measured in 1000's of dollars, and population in 10,000's of people. The
income and population data come from the 1980 census. The proportion of
'private pay' patients from the same county is a measure of the distance of
the nursing home from the family and friends of its patients. Presumably,
nursing homes that are located closer to their patients' family and friends
are more attractive, ceteris paribus. The health index is really an index of
ill-health.12
Since nursing homes compete only for 'private pay' patients, the appropriate
market to analyze is the 'private pay' patient market. Each home's
geographic 'private pay' patient market is somewhat complicated to measure
since the data are reported on a county basis. The problem is that a nursing
home's market may not be completely commensurate with the county in
which it is located. In particular, homes that are located on county borders
certainly compete for 'private pay' patients from both counties. Instead,
separate market areas are defined for each nursing home based on 'private
pay' patient census data. For each home, the survey reports the number of
'private pay' patients from each county in New York State currently residing
in the home. Homes participate in several county private pay patient markets.
A home's participation in a county market is given by the proportion of the
home's 'private pay' patients from that county. Thus, a home's market area is
defined as the counties in which its 'private pay' patients last resided, and the
proportion of its 'private pay' patients from each county. This market
definition guides the computation of the market variables.13
The policy variables are the Medicaid plus factor, the CON capacity
constraint, and the concentration of the home's 'private pay' patient market
area. New York computes the plus factors based on owner's equity in the
facility. Therefore, there is cross-sectional variation in the Medicaid plus
factor. The CON capacity constraint is measured as the average daily census
of patients in the home, and CON entry policy is captured by a Herfindahl
index of the concentration of each home's 'private pay' market.14 Entry
reduces the concentration of a home's 'private pay' patient market.
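
The exact construction of the concentration measure is given in a footnote not reproduced here, so the following is only a sketch of one way such an index could be computed. It assumes (hypothetically) that a Herfindahl index is first computed for each county from the homes' shares of that county's 'private pay' patients, and that a home's index is the average of the county indices weighted by the proportion of its 'private pay' patients from each county. All names and numbers are illustrative.

```python
def herfindahl(shares):
    """Herfindahl index: the sum of squared market shares."""
    return sum(s * s for s in shares)


def county_index(patients_by_home):
    """Concentration of one county market from patient counts per home."""
    total = sum(patients_by_home)
    return herfindahl([p / total for p in patients_by_home])


def home_concentration(home_mix, county_indices):
    """Weight county indices by the home's patient proportions (assumed rule)."""
    return sum(weight * county_indices[county]
               for county, weight in home_mix.items())


# Hypothetical data: 'private pay' patients per home in two counties.
county_indices = {
    "A": county_index([50, 30, 20]),      # three homes, unequal shares
    "B": county_index([25, 25, 25, 25]),  # four equal-sized homes
}
# A home drawing 80% of its 'private pay' patients from A and 20% from B.
index = home_concentration({"A": 0.8, "B": 0.2}, county_indices)
print(round(index, 3))
```

Entry of a new home into county A lowers that county's index, and the effect on any one home scales with how much of its clientele comes from A.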

VI. Results

The models developed in Section III were estimated separately for the
proprietary and 'not for profit' samples. The estimated proprietary reduced
form equations are reported in Table 6, the 'not for profit' reduced form is
reported in Table 7, and the structural input demand equations are reported
in Tables 8 and 9. The estimates are quite reasonable. As expected,
hypothesis tests, reported in section A, uniformly reject pooling the
proprietary and 'not for profit' samples, and accept the restrictions that
identify the reduced form quality equation. In addition, the coefficients on
the policy variables are consistent with theory, and the signs on other
independent variables such as own price in the input demand equations are
generally as one would expect. The policy results are discussed in detail in
section B, and are summarized in Tables 2 through 5.

Table 2. Estimated coefficients on Medicaid and CON policy variables (t-statistics in parentheses).

                        Medicaid 'plus' factor  CON capacity constraint Market concentration
Dependent variable      Profit      Non-profit  Profit      Non-profit  Profit      Non-profit

Average quality         -0.010a     -0.014a     -0.077a,c   -0.137c     -1.182a     0.141
                        (2.107)     (2.135)     (3.029)     (1.138)     (4.125)     (0.216)
'Private pay' price     0.045       -0.594a     0.012       -0.014      -18.612b    19.288
                        (0.204)     (2.623)     (1.118)     (0.420)     (1.879)     (0.840)
'Private pay' patients  -0.489a     -0.253      0.086a      -0.016      -33.221a    -33.901
                        (1.963)     (1.000)     (6.872)     (0.433)     (2.956)     (1.318)
Medicaid patients       0.489a      0.253       0.913a      1.016a      33.221a     33.901
                        (1.963)     (1.000)     (72.820)    (26.835)    (2.956)     (1.318)

a Significantly different from zero at the 0.05 level.
b Significantly different from zero at the 0.1 level.
c Independent variable is measured in hundreds of patients.

A. Pooling and specification tests

We begin by testing the hypothesis that the proprietary and 'not for profit'
samples can be pooled in the reduced form 'private pay' price and 'private
pay' patients equations. The test statistics are 4.03 and 2.66, respectively, and
are distributed F(12, 431). The corresponding critical value at the 0.05
significance level is 1.77. Consequently, we reject pooling.
Before testing to see if the proprietary and 'not for profit' reduced form
quality and structural input demand equations can be pooled, we test to see if
the restrictions discussed in the identification section are valid. The test
statistics are 20.08 for the proprietary sample and 16.68 for the 'not for
profit' sample, and are distributed $\chi^2(27)$. The corresponding critical
value at the 0.05 significance level is 40.11. Consequently, we accept the
hypothesis that the restrictions are valid. Under the assumption that the
restrictions are valid, we test the hypothesis that the proprietary and 'not
for profit' samples can be pooled for the reduced form quality and structural
input demand equations. That test statistic is 99.74, and is also distributed
$\chi^2(27)$. Consequently, we reject pooling.
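
The decisions in this subsection follow mechanically from the reported statistics and critical values, as the short sketch below shows. The helper names are hypothetical, the critical values are those quoted in the text rather than computed, and the count of 12 coefficients per equation (a constant plus eleven regressors) is read off the reduced form tables.

```python
def pooling_f_df(n_obs, n_coefs, n_groups=2):
    """Degrees of freedom for an F test that n_groups samples share
    the same n_coefs regression coefficients."""
    numerator = n_coefs * (n_groups - 1)
    denominator = n_obs - n_coefs * n_groups
    return numerator, denominator


def rejects(statistic, critical_value):
    """Reject the null hypothesis when the statistic exceeds the critical value."""
    return statistic > critical_value


F_CRITICAL = 1.77      # 0.05 level, as quoted in the text
CHI2_CRITICAL = 40.11  # 0.05 level with 27 degrees of freedom, as quoted

df = pooling_f_df(n_obs=455, n_coefs=12)
decisions = {
    "pool price equation": rejects(4.03, F_CRITICAL),
    "pool private pay patients equation": rejects(2.66, F_CRITICAL),
    "identifying restrictions, proprietary": rejects(20.08, CHI2_CRITICAL),
    "identifying restrictions, not for profit": rejects(16.68, CHI2_CRITICAL),
    "pool quality and input demand equations": rejects(99.74, CHI2_CRITICAL),
}
print(df)
for hypothesis, rejected in decisions.items():
    print(f"{hypothesis}: {'reject' if rejected else 'do not reject'}")
```

The degrees of freedom come out as (12, 431), matching the F(12, 431) distribution cited above.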

Table 3. Comparative statics - the Medicaid 'plus' factor.

                        Mean        Mean + $5   Mean + $10  Mean + $15

Proprietary homes
Medicaid plus           $18.35      $23.35      $28.35      $33.35
                                    +27.25%     +54.50%     +81.74%
Quality index           2.27        2.22        2.17        2.12
                                    -2.20%      -4.41%      -6.61%
Total expenditures a    $2009       $1965       $1920       $1876
                                    -$44        -$89        -$133
Private pay price       $59.89      $60.12      $60.34      $60.57
                                    +0.38%      +0.75%      +1.13%
Patient mix ratio       3.90        4.42        5.06        5.88
                                    +13.31%     +29.84%     +50.84%
Private pay patients    25.44       23.00       20.55       18.11
                                    -9.61%      -19.22%     -28.83%
Medicaid patients       99.17       101.62      104.06      106.51
                                    +2.35%      +4.93%      +6.89%

'Not for profit' homes
Medicaid plus           $10.34      $15.34      $20.34      $25.34
                                    +48.36%     +96.71%     +145.07%
Quality index           2.77        2.70        2.63        2.56
                                    -2.53%      -5.05%      -7.58%
Total expenditures a    $2432       $2371       $2309       $2248
                                    -$61        -$123       -$184
Private pay price       $59.86      $56.89      $53.92      $50.95
                                    -4.96%      -9.92%      -14.88%
Patient mix ratio       4.80        5.17        5.58        6.06
                                    +7.59%      +16.20%     +26.04%
Private pay patients    21.40       20.14       18.87       17.60
                                    -5.91%      -11.82%     -17.73%
Medicaid patients       102.81      104.08      105.34      106.61
                                    +1.20%      +2.46%      +3.56%

a Total expenditures are measured in $1000.

B. Comparative statics

1. The Medicaid plus factor

The theoretical model predicts that an increase in the Medicaid 'plus' factor
causes nursing homes to reduce quality, adjust patient mix in favor of more
Medicaid patients at the expense of 'private pay' patients, and has an
ambiguous effect on 'private pay' price. As can be seen in Table 2, these
predictions are confirmed by the empirical results. The coefficients on the
Medicaid 'plus' factor in both the proprietary and 'not for profit' quality
equations are indeed negative and significantly different from zero. Further,
the coefficients are negative in both the proprietary and 'not for profit'
'private pay' patients equations, and are positive in the Medicaid patients
equations (they are significantly different from zero in the proprietary
reduced form, but insignificant in the 'not for profit' reduced form). In
addition, the coefficient is positive, but not significant, in the proprietary
'private pay' price equation, and is negative and significantly different from
zero in the 'not for profit' 'private pay' price equation.
In summary, these results imply that an increase in the Medicaid 'plus'
factor causes proprietary nursing homes to reduce quality, possibly increase
'private pay' price, and adjust patient mix in favor of more Medicaid patients.
Furthermore, 'not for profit' homes reduce quality and 'private pay' price,
and possibly adjust patient mix in favor of more Medicaid patients.

Table 4. Comparative statics - the CON capacity constraint.

                        Mean        Mean + 25   Mean + 50   Mean + 75

Proprietary homes
Total # of patients     124.61      149.61      174.61      199.61
                                    +20.06%     +40.13%     +60.19%
Quality index           2.27        2.25        2.23        2.21
                                    -0.85%      -1.70%      -2.54%
Total expenditures a    $2009       $1992       $1975       $1958
                                    -$17        -$34        -$51
Private pay price       $59.89      $60.19      $60.49      $60.79
                                    +0.50%      +1.00%      +1.50%
Patient mix ratio       3.90        4.42        4.87        5.26
                                    +13.38%     +24.86%     +34.79%
Private pay patients    25.44       27.59       29.74       31.89
                                    +8.45%      +16.90%     +25.35%
Medicaid patients       99.17       122.00      144.82      167.65
                                    +15.76%     +46.03%     +40.85%

'Not for profit' homes
Total # of patients     124.01      149.01      174.01      199.01
                                    +20.16%     +40.32%     +60.48%
Quality index           2.77        2.74        2.70        2.67
                                    -1.24%      -2.47%      -3.71%
Total expenditures a    $2432       $2402       $2372       $2342
                                    -$30        -$60        -$90
Private pay price       $59.86      $59.51      $59.16      $58.81
                                    -0.58%      -1.17%      -1.75%
Patient mix ratio       4.80        6.11        7.46        8.86
                                    +27.08%     +55.21%     +84.46%
Private pay patients    21.40       21.00       20.60       20.20
                                    -1.87%      -3.74%      -5.61%
Medicaid patients       102.81      128.21      153.61      179.01
                                    +24.74%     +49.41%     +42.57%

a Total expenditures are measured in $1000.

In order to gauge the magnitude of these effects, we compare predicted
equilibria for small, medium, and large differences in the Medicaid 'plus'
factor. Table 3 reports predictions of the endogenous variables at the mean
of the data, and for nursing homes whose Medicaid 'plus' factor is five
dollars, ten dollars, and fifteen dollars greater than the mean, respectively.
These differences are observed within the sample, as the standard deviation
of the Medicaid 'plus' factor is 15.21. Deviations from the mean are also
reported beneath the predicted values.

Table 5. Comparative statics - CON entry policy.

                        Mean        Mean + 0.01 Mean + 0.05 Mean + 0.1

Proprietary homes
Concentration index     0.12        0.13        0.17        0.22
                                    +8.33%      +41.67%     +83.33%
Quality index           2.27        2.26        2.21        2.15
                                    -0.52%      -2.60%      -5.22%
Total expenditures a    $2009       $1999       $1957       $1904
                                    -$10        -$52        -$105
Private pay price       $59.89      $59.70      $58.96      $58.03
                                    -0.31%      -1.55%      -3.11%
Patient mix ratio       3.90        3.96        4.24        4.63
                                    +1.62%      +8.73%      +18.82%
Private pay patients    25.44       25.11       23.78       22.12
                                    -1.31%      -6.53%      -13.06%
Medicaid patients       99.17       99.50       100.83      102.49
                                    +0.33%      +1.67%      +3.34%

'Not for profit' homes
Concentration index     0.11        0.12        0.16        0.21
                                    +9.09%      +45.45%     +90.91%
Quality index           2.77        2.77        2.78        2.78
                                    +0.05%      +0.25%      +0.51%
Total expenditures a    $2432       $2433       $2438       $2444
                                    +$1         +$6         +$12
Private pay price       $59.86      $60.05      $60.82      $61.79
                                    +0.32%      +1.61%      +3.22%
Patient mix ratio       4.80        4.90        5.30        5.90
                                    +1.94%      +10.39%     +22.74%
Private pay patients    21.40       21.06       19.70       18.01
                                    -1.58%      -7.92%      -15.84%
Medicaid patients       102.81      103.15      104.51      106.20
                                    +0.33%      +1.65%      +3.28%

a Total expenditures are measured in $1000.

Consider the differences in quality for proprietary homes. The 'mean'
home provides 2.27 units of quality per patient, the 'mean plus five' home
produces 2.22 units, the 'mean plus ten' home produces 2.17 units, and the
'mean plus fifteen' home produces 2.12 units. To get a more interpretable
measure of the magnitude of the quality differences, we translate these values
into a home's total expenditures on goods and services provided to patients.
In 1980, the average proprietary home's total expenditure (variable costs) on
goods and services provided to patients was 2009 thousand dollars. Therefore,
the average cost per unit of quality was 885 thousand dollars. Assuming a
constant marginal cost of quality, we can extrapolate total expenditures for
the other homes: the 'mean plus five' home provides 44 thousand dollars
less in goods and services than the 'mean' home, the 'mean plus ten' home
provides 89 thousand dollars less, and the 'mean plus fifteen' home provides
133 thousand dollars less.

Table 6. Proprietary reduced form: estimated coefficients, with t-statistics in parentheses.

Independent variable    'Private pay' price    Average quality    'Private pay' patients

1. Constant 89.22 0.51 22.68
(2.06) (0.36) (0.46)
2. Total # of patients 0.01 -0.08 0.09
(1.12) (10.52) (6.87)
3. Nursing hourly wage 2.11 -0.14 -3.57
(2.31) (30.03) (3.44)
4. Other labor's hourly wage 1.48 0.04 0.75
(1.51) (0.86) (0.68)
5. Supplies price -41.73 0.58 -7.07
(1.35) (0.56) (0.20)
6. Capital stock 0.04 0.02 0.02
(1.65) (1.70) (0.65)
7. Health status 7.69 1.27 3.25
(4.48) (29.98) (1.66)
8. Income 0.40 0.04 3.33
(0.57) (3.00) (4.21)
9. Population -4.23 -0.04 -3.32
(2.51) (4.01) (1.91)
10. Medicaid plus factor 0.05 -0.01 -0.49
(0.20) (2.11) (1.96)
11. % Patients same county -12.51 0.19 6.95
(3.45) (2.21) (1.69)
12. Market concentration -18.61 -1.18 -33.22
(1.88) (4.12) (2.96)
13. R-squared 0.39 0.97

Table 7. 'Not for profit' reduced form: estimated coefficients, with t-statistics in parentheses.

Independent variable    'Private pay' price    Average quality    'Private pay' patients

1. Constant 18.81 0.21 69.94
(0.92) (0.26) (3.04)
2. Total # of patients -0.01 -0.14 -0.01
(0.42) (1.14) (0.43)
3. Nursing hourly wage 1.04 0.06 -0.83
(0.84) (0.81) (0.60)
4. Other labor's hourly wage 0.67 -0.18 -1.02
(0.61) (2.98) (0.84)
5. Supplies price -5.33 0.69 -29.55
(0.42) (1.38) (2.06)
6. Capital stock 1.21 0.04 2.57
(2.10) (1.88) (3.98)
7. Health status 15.49 1.62 -8.96
(6.65) (23.30) (3.44)
8. Income 0.87 0.06 1.57
(0.73) (1.96) (1.17)
9. Population 4.59 -0.05 -7.09
(1.71) (0.66) (2.35)
10. Medicaid plus factor -0.59 -0.01 -0.25
(2.62) (2.13) (1.00)
11. % Patients same county -6.43 0.29 12.41
(1.38) (2.05) (2.56)
12. Market concentration -19.29 -0.14 -33.90
(0.84) (0.21) (1.32)
13. R-squared 0.48 0.97

The 'not for profit' quality differences are even larger. The 'mean' home
provides 2.77 units of quality per patient, the 'mean plus five' home provides
2.70 units, the 'mean plus ten' home provides 2.63 units, and the 'mean plus
fifteen' home provides 2.56 units. Average total expenditures of 'not for
profit' homes over the sample period were 2432 thousand dollars, and
therefore, the average cost per unit of quality was 878 thousand dollars.
Assuming a constant marginal cost of quality, the 'mean plus five' home
provides 61 thousand dollars less in goods and services than the 'mean'
home, the 'mean plus ten' home provides 123 thousand dollars less, and the
'mean plus fifteen' home provides 184 thousand dollars less.
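
The translation from quality differences into expenditures can be retraced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the inputs are the rounded mean quality, mean expenditure, and Medicaid 'plus' factor coefficients reported in the tables, so small rounding differences from the published figures are possible.

```python
def expenditure_schedule(mean_quality, mean_expenditure, quality_coef, deltas):
    """Predicted quality and change in expenditures ($1000) for each
    increase in the policy variable, holding the cost per unit of
    quality constant at its average value."""
    cost_per_unit = mean_expenditure / mean_quality
    rows = []
    for delta in deltas:
        quality = mean_quality + quality_coef * delta
        change = cost_per_unit * (quality - mean_quality)
        rows.append((delta, round(quality, 2), round(change)))
    return cost_per_unit, rows


plus_factor_steps = [5, 10, 15]  # dollars above the mean 'plus' factor

prop_cost, prop_rows = expenditure_schedule(2.27, 2009, -0.010, plus_factor_steps)
nfp_cost, nfp_rows = expenditure_schedule(2.77, 2432, -0.014, plus_factor_steps)

print(round(prop_cost), prop_rows)
print(round(nfp_cost), nfp_rows)
```

For these inputs the proprietary cost per unit of quality rounds to 885 and the 'not for profit' one to 878, and the expenditure changes round to the -44, -89, -133 and -61, -123, -184 thousand dollars quoted in the text.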
Not surprisingly, there is almost no difference in the proprietary 'private
pay' prices. On the other hand, the 'not for profit' prices show substantial
differences: the 'mean plus five' home charges 'private pay' patients 5% less
than the 'mean' home, the 'mean plus ten' home charges 10% less, and the
'mean plus fifteen' home charges 15% less.
Finally, consider the differences in patient mix as represented by the ratio
of the number of Medicaid to 'private pay' patients. For proprietary homes,
the 'mean plus five' home has a ratio 13% higher than the mean home, the
'mean plus ten' home has a ratio 30% higher, and the 'mean plus fifteen'
home has a ratio 50% higher. For 'not for profit' homes, the 'mean plus five'
home has a ratio 8% higher than the 'mean' home, the 'mean plus ten' home
has a ratio 16% higher, and the 'mean plus fifteen' home has a ratio 26%
higher.

Table 8. Proprietary input demand equations: estimated coefficients, with t-statistics in parentheses.

Independent variable    Nursing hours    Other labor hours    Supplies

1. Constant 3.39 -2.92 16.38
(1.37) (2.78) (6.12)
2. Total quality 1.00a 0.24 0.04
(12.41) (0.75)
3. Total # of patients 0.00b 0.86 2.04
(16.38) (12.76)
4. Nursing hourly wage 0.02 0.00 0.14
(0.44) (0.31) (2.90)
5. Other labor's hourly wage -0.05 -0.10 -0.11
(0.91) (3.95) (1.92)
6. Supplies price -2.20 -12.01
(1.22) (3.45) (6.14)
7. Capital stock -0.03 -0.04 -0.00
(1.72) (9.47) (0.60)

a Coefficient restricted to unity.
b Coefficient restricted to zero.

2. The CON capacity constraint

Theory predicts that an increase in capacity causes nursing homes to lower
quality, and has an indeterminate effect on 'private pay' price and patient
mix. Again, the empirical results are consistent with the theoretical
predictions. Since the CON capacity constraint is binding, the effect of the
constraint is captured by the 'total number of patients' variable. The
coefficient on this variable is negative and significantly different from zero
in the proprietary quality equation, and negative but insignificant in the
'not for profit' quality equation. It is positive in the proprietary 'private
pay' price equation, is negative in the 'not for profit' 'private pay' price
equation, and insignificant in both. In addition, the coefficient is positive
and significantly different from zero in the proprietary 'private pay'
patients and Medicaid patients equations, is positive and significant in the
'not for profit' Medicaid patients equation, and is negative but insignificant
in the 'not for profit' 'private pay' patients equation.
In summary, additional capacity causes proprietary homes to provide
lower quality, possibly charge higher 'private pay' prices, and fill approxi-
mately ten percent of new capacity with 'private pay' patients and ninety
percent with Medicaid patients. 'Not for profit' nursing homes may provide
lower quality, may charge lower 'private pay' prices, and fill 100% of new
capacity with Medicaid patients.
We gauge the magnitude of these effects by comparing predictions of the
endogenous variables at the mean of the data to small, medium, and large
differences in capacity. Table 4 reports predictions at the mean of the data,
at the mean plus 25 patients, at the mean plus 50 patients, and at the mean
plus 75 patients. Again these differences are observed in the sample, as the
standard deviation of the 'total number of patients' variable is 94.04.
Deviations from the mean are also reported beneath the predicted value.

Table 9. 'Not for profit' input demand equations: estimated coefficients, with t-statistics in parentheses.

Independent variable    Nursing hours    Other labor hours    Supplies

1. Constant 0.21 -1.85 6.61
(0.26) (2.87) (9.46)
2. Total quality 1.00a 0.26 0.08
(7.77) (2.00)
3. Total # of patients 0.00b 1.21 1.62
(7.61) (8.89)
4. Nursing hourly wage -0.17 0.07 0.01
(2.75) (1.92) (0.21)
5. Other labor's hourly wage 1.18 -0.06 -0.07
(2.99) (1.86) (1.89)
6. Supplies price -0.83 -0.04 -4.91
(1.08) (2.76) (9.08)
7. Capital stock 0.00 0.03 -0.01
(0.08) (1.19) (0.48)

a Coefficient restricted to unity.
b Coefficient restricted to zero.

Consider the differences in quality. The 'mean' proprietary home provides
2.27 units of quality per patient, the 'mean plus 25' home provides 2.25
units, the 'mean plus 50' home provides 2.23 units, and the 'mean plus 75'
home provides 2.21 units. Translating these differences into total expendi-
tures on goods and services, we find that the 'mean plus 25' home provides
17 thousand dollars less in goods and services than the 'mean' home, the
'mean plus 50' home provides 34 thousand dollars less, and the 'mean plus
75' home provides 51 thousand dollars less. Turning to 'not for profit'
homes, the 'mean' home provides 2.77 units of quality, the 'mean plus 25'
home provides 2.74 units, the 'mean plus 50' home provides 2.70 units, and
the 'mean plus 75' home provides 2.67 units. In terms of expenditures on goods
and services, the 'mean plus 25' home provides 30 thousand dollars less than
the 'mean' home, the 'mean plus 50' home provides 60 thousand dollars less,
and the 'mean plus 75' home provides 90 thousand dollars less.
Homes of all sizes charge the same 'private pay' prices, but larger homes
have higher Medicaid to 'private pay' patient ratios. In the proprietary case,
the 'mean plus 25' home has a ratio 13% higher than the 'mean' home, the
'mean plus 50' home has a ratio 25% higher, and the 'mean plus 75' home
has a ratio 35% higher. In the 'not for profit' case, the 'mean plus 25' home
has a ratio 27% higher than the 'mean' home, the 'mean plus 50' home has a
ratio 55% higher, and the 'mean plus 75' home has a ratio 84% higher.
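
The patient mix ratios quoted above can be recomputed from the predicted patient counts in Table 4. This is a small illustrative sketch with a hypothetical function name; it uses the rounded table entries, so the percentages agree with the published ones only to the nearest whole percent.

```python
def mix_ratios(medicaid, private_pay):
    """Medicaid to 'private pay' patient ratios, and the percentage
    difference of each ratio from the first ('mean') home."""
    ratios = [m / p for m, p in zip(medicaid, private_pay)]
    base = ratios[0]
    pct_higher = [100.0 * (r / base - 1.0) for r in ratios[1:]]
    return ratios, pct_higher


# Predicted patients at the mean and at the mean plus 25, 50, and 75 patients.
prop_ratios, prop_pct = mix_ratios(
    medicaid=[99.17, 122.00, 144.82, 167.65],
    private_pay=[25.44, 27.59, 29.74, 31.89],
)
nfp_ratios, nfp_pct = mix_ratios(
    medicaid=[102.81, 128.21, 153.61, 179.01],
    private_pay=[21.40, 21.00, 20.60, 20.20],
)

print([round(r, 2) for r in prop_ratios], [round(p) for p in prop_pct])
print([round(r, 2) for r in nfp_ratios], [round(p) for p in nfp_pct])
```

The 'not for profit' ratios climb much faster because those homes fill essentially all new capacity with Medicaid patients while their 'private pay' census barely moves.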

3. Entry
Entry into the market reduces the market's concentration. Hence, CON entry
policy can be analyzed by looking at the coefficients on the market
concentration index. Theory predicts that an increase in concentration causes
nursing homes to lower quality, and has an indeterminate effect on 'private
pay' price and patient mix. The empirical results in the proprietary model are
consistent with theory. In the proprietary reduced form, the coefficients on
the market concentration index are negative and significantly different from
zero in the quality, 'private pay' price, and 'private pay' patients equations,
and the coefficient is positive and significant in the Medicaid patients
equation. Hence, entry causes proprietary homes to raise quality, raise
'private pay' prices, and substitute 'private pay' for Medicaid patients.
On the other hand, all the coefficients on the concentration index in the
'not for profit' reduced form equations are positive and not significantly
different from zero. Alternatively, it may be the case that potential 'not for
profit' patients do not view proprietary and 'not for profit' nursing homes
care as close substitutes. This would suggest that proprietary and 'not for
profit' nursing homes do not compete in the same markets even if they are
located geographically close to one another. Hence, a joint index of concen-
tration heavily weighted towards the proprietary homes market would be
relevant for proprietary endogenous variables, but not for the 'not for profit'
endogenous variables.
Again, we gauge the magnitude of these effects for proprietary homes by
comparing predictions of the endogenous variables at the mean of the data to
predictions for small, medium, and large differences in market concentration.
Table 5 reports the predictions at the mean concentration level, at the mean
plus 0.01, at the mean plus 0.05, and at the mean plus 0.1. As usual, these
differences are observed within the sample. The 'mean plus 0.01' home
provides 10 thousand dollars less in goods and services than the 'mean'
home, the 'mean plus 0.05' home provides 52 thousand dollars less, and the
'mean plus 0.1' home provides 105 thousand dollars less. On the other hand,
increases in concentration appear to have little effect on 'private pay' price
and patient mix.

VII. Conclusions

Health care regulators are concerned with assuring a high standard of quality,
providing the poor with access to care, and controlling the expansion of the
industry. With these goals in mind, this paper has analyzed effects of
Medicaid and CON policy on proprietary and 'not for profit' nursing home
We find that Medicaid policy makers are faced with a quality-access trade-
off. Specifically, nursing homes whose Medicaid reimbursement rates include
higher Medicaid 'plus' factors care for more Medicaid patients, but provide
Subsidies, quality, and regulation in the nursing home industry 137

lower quality. The empirical work suggests that homes with high 'plus' factors
provide substantially lower quality than homes with mean 'plus' factors.
These quality differences when translated into nursing home expenditures on
goods and services provided patients are observed to be in the hundreds of
thousands of dollars. Further, high 'plus' factor homes have Medicaid to
'private pay' patient ratios possibly 50% higher than mean 'plus' factor
homes.
We also find that CON policy makers must trade off containing the size of
the industry (and therefore total Medicaid payments) against quality and
access. Specifically, we find that the capacity constraints and the reduced
competition from the entry barriers lead to lower quality and fewer Medicaid
patients receiving care.
Finally, we observe differences in proprietary and 'not for profit' re-
sponses to regulatory policy. Both types of homes adjust quality and patient
mix in the same direction, but in different magnitudes. 'Not for profit' homes
tend to have larger quality responses. On the other hand, they have opposite
'private pay' price responses.


* I am indebted to Ralph Andreano, Thomas Coleman, Arthur Goldberger, Michael
Grossman, George Jakubson, Salih Neftci, Robert Porter, Harvey Rosen, Warren
Sanderson, Amy Taylor, Charles Wilson, and the participants of the NBER conference
on Incentives in Government Financing for helpful suggestions. Of course, I am
responsible for any remaining errors.
1. See these papers for proofs of theoretical results in this paper.
2. This representation of a firm's output is similar to general models analyzed in Spence
(1975), Sheshinski (1976), and Leffler (1982), and to nursing home models analyzed in
Bishop (1980) and Palmer and Vogel (1983).
3. See Joreskog and Goldberger (1975) for a discussion of MIMIC models, and Aigner et
al. (1984) for a general survey of latent variable models.
4. The source of statistics referenced in this section is The U.S. Department of Health and
Human Services' publication, Health, United States 1980.
5. The CON review boards are not just rubber stamps. Indeed, there is some casual
evidence in support of binding CON capacity and entry constraints. First, most nursing
homes operate above 90% capacity. Second, there is a long list of individuals in hospitals
waiting for nursing home openings. Finally, States such as New York have imposed
moratoriums on nursing home expansion.
6. Scanlon tests and accepts the hypothesis that there is excess Medicaid patient demand for
nursing home care. This is consistent with the facts cited in footnote 5.
7. For example see Hansmann (1981), Newhouse (1970), and Pauly and Redisch (1973).
8. There are several alternatives to modeling 'not for profit' nursing homes in this fashion.
Some 'not for profit' homes could be profit maximizers who have obtained 'not for profit'
legal status in order to take advantage of the tax breaks. In this case, profits are taken in
the form of salary and rent. Other non-altruist 'not for profit' homes, could be operated
by non-owner managers who are personal utility maximizers. These managers may
manipulate the operation of the home so as to maximize their own income, prestige, and
security. The result is an inefficient employment of resources. Hence, these homes are not
cost minimizers. This case is discussed in Frech and Ginsburg (1980). The altruism
assumption suggests cost minimization.
9. The supplies quantity index is calculated as the total expenditures on supplies divided by
an index of the price of supplies. The supplies price index is a weighted average of the
prices of the commodities that constitute nursing home supplies. These commodities are
drugs, other medical supplies, food, energy, and other supplies. The prices of these
commodities are national price indices in 1977 dollars, and are reported in the Department
of Health and Human Services report, HEALTH CARE FINANCING TRENDS,
1980. The weights are the proportion of a home's total supplies expenditures accounted
for by the particular commodity.
10. The health status index is computed from disability scores assigned patients. The
disability level of each patient in eight functional areas is reported in the survey. The
disability level in each functional area takes on one of three values: self care, partial care,
and total care. The functional areas are walking, transferring, wheeling, eating, toileting,
bathing, dressing, and breathing. Walking, transferring, and wheeling were treated as
mutually exclusive categories. Each home reports the number of patients in each cell,
where a cell is defined by functional area and disability level. The cells are then
aggregated by disability level. After that, the self care aggregate is multiplied by zero, the
partial help aggregate by seven, and the total help aggregate by fourteen. The scores are
then summed and divided by the total number of patients times 100. The result is an
index of the average ill-health of the patients in a facility. This index is used for the
purposes of quality control (see Ullmann, 1983).
11. For each home, the market population is computed as a weighted sum of the number of
persons over age 65 in each county, using the home's proportions of private patients from
the counties as weights. Similarly, the per capita income of the population in a home's
market area is computed as the weighted sum of the counties' per capita incomes, using
the same weights.
12. Since each county is treated as a separate market, and each home participates in several
county markets, the concentration of a home's private pay patient market depends upon
its degree of participation in the various county markets. Therefore, the concentration
level of a home's private pay patient market is a weighted sum of the county market
concentration levels, using the home's proportion of private pay patients from each
county as weights. The concentration of a county private pay patient market is computed
using a Herfindahl index, which is based on each home's share (proportion) of a county's
private pay patients. Specifically, it is the sum of squared shares (see Scherer (1980) p.
58). Entry reduces the value of this index.
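
The construction described in note 12 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the data layout (a dictionary of private pay patient counts by home and county) and the numbers are hypothetical and do not come from the survey.

```python
from collections import defaultdict

def county_herfindahl(private_patients):
    """private_patients: {home: {county: number of private pay patients}}.
    Returns {county: sum of squared shares of the county's private pay patients}."""
    totals = defaultdict(float)
    for by_county in private_patients.values():
        for county, n in by_county.items():
            totals[county] += n
    hhi = defaultdict(float)
    for by_county in private_patients.values():
        for county, n in by_county.items():
            if totals[county] > 0:
                hhi[county] += (n / totals[county]) ** 2
    return dict(hhi)

def home_concentration(private_patients, home):
    """Weighted sum of the county indices; the weights are the home's own
    proportions of private pay patients drawn from each county."""
    hhi = county_herfindahl(private_patients)
    own = private_patients[home]
    total = sum(own.values())
    return sum((n / total) * hhi.get(c, 0.0) for c, n in own.items())

# Hypothetical data: three homes drawing patients from two counties.
patients = {
    "home_a": {"county_1": 60, "county_2": 20},
    "home_b": {"county_1": 40},
    "home_c": {"county_2": 80},
}
print(county_herfindahl(patients))
print(home_concentration(patients, "home_a"))
```

Adding a fourth hypothetical home to county_1 lowers that county's index, and therefore the index of every home that draws patients from it, which is the sense in which entry reduces the value of the index.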


Aigner, D., Hsiao, C., Kapteyn, A. and Wansbeek, T. (1984), 'Latent Variable Models in
Econometrics', In Z. Griliches and M. Intriligator (eds.), Handbook of Econometrics,
Vol. 2, North-Holland, Amsterdam, pp. 1321-1393.
Bishop, C. E. (1980), 'Nursing Home Cost Studies and Reimbursement Issues', Health Care
Financing Review 2, 47-64.
Chamberlain, G. (1982), 'Multivariate Models for Panel Data', Journal of Econometrics 18,
Frech, H. E. and Ginsburg, P. B. (1980), 'The Cost of Nursing Home Care in the United
States: Government Financing, Ownership, and Efficiency', In J. Van der Gaag and M.
Perlman (eds.) Health, Economics, and Health Economics, North Holland, Amsterdam,
Gertler, P. J. (1985a), 'Subsidies, Quality, and Regulation in the Nursing Home Industry',
unpublished Ph.D. dissertation, Department of Economics, University of Wisconsin at

Gertler, P. J. (1985b), 'Regulated Price Discrimination and Quality: The Implications of
Medicaid Reimbursement Policy for the Nursing Home Industry', Working Paper No. 267,
Department of Economics, State University of New York at Stony Brook.
Hansmann, H. (1981), 'Non-Profit Enterprises in the Performing Arts', Bell Journal of
Economics 12, 341-361.
Joreskog, K. and Goldberger, A. S. (1975), 'Estimation of a Model with Multiple Indicators
and Multiple Causes of a Single Latent Variable', Journal of the American Statistical
Association 70, 631-639.
Klein, B. and Leffler, K. B. (1981), 'The Role of Market Forces in Assuring Contractual
Performance', The Journal of Political Economy 89, 615-641.
Leffler, K. B. (1982), 'Ambiguous Changes in Product Quality', American Economic Review
Newhouse, J. P. (1970), 'Towards a Theory of Non-Profit Institutions: An Economic Model of
a Hospital', American Economic Review 60, 64-75.
New York State Department of Health, Office of Health Systems Management (1980), 'Title
XIX Nursing Home Cost Survey'.
Palmer, H. C. and Vogel, R. J. (1983), 'Models of the Nursing Home', In R. J. Vogel and H. C.
Palmer (eds.), Long-Term Care: Perspectives From Research and Demonstrations, Health
Care Financing Administration, U.S. Department of Health and Human Services.
Pauly, M. and Redisch, M. (1973), 'The Not for Profit Hospital as a Physicians' Cooperative',
American Economic Review 63, 87-100.
Scanlon, W. J. (1980), 'The Market for Nursing Home Care: A Case of an Equilibrium with
Excess Demand as a Result of Public Policy', unpublished Ph.D. dissertation, Department
of Economics, University of Wisconsin at Madison.
Scherer, F. M. (1980), Industrial Market Structure and Economic Performance, Rand-
Sheshinski, E. (1976), 'Price, Quality, and Quantity Regulation in Monopoly Situation',
Economica 43, 417-429.
Spence, M. (1975), 'Monopoly, Quality, and Regulations', Bell Journal of Economics 6, 417-
Ullmann, S. G. (1983), 'Cost Analysis and Facility Reimbursement in the Long-Term Health
Care Industry', Department of Economics Working Paper, University of Miami.
U.S. Department of Health and Human Services, Health, United States 1980, U.S. Government
Printing Office.
Waldo, D. R. (1981), Health Care Financing Trends, Health Care Financing Agency, Bureau
of Data Management and Strategy.
A Poisson process of which the parameter contains
a non-stationary error: application to the analysis
of a series of deaths in a large hospital

Ecole Superieure de Commerce et d'Administration,
des Entreprises de Bourgogne Franche Comte, 29 rue sambin 21000 Dijon

1. Introduction

The analysis of phenomena often leads to the study of chronological series
whose elements are nonnegative integers. That is the case of count data
that follow a Poisson law. 1
The analysis of such series has been the subject of studies for a long time.
For example, we could mention the analysis of rates of car accidents
according to the characteristics of drivers (Weber) [35], the analysis of mine
disasters (Jarret) [22] and mortality due to cancer (Breslow) [4]. In all these
situations, the problem is to 'explain', with some variables, the chronological
series of realizations of Poisson random variables whose parameter differs at
each period. 2 In their work on the relationship between patents and research
and development expenditures of American firms, Hausman et al. [19] analyse
count data. The standard hypothesis is that the counts follow a Poisson law
with a parameter depending on exogenous variables.
This model is often used when the relationship between the parameter and
the exogenous variables is known with precision. In that case Weber [35]
uses generalised least squares. According to Frome, Kutner and Beauchamp
[9], the estimation of the parameters of the model can be found equivalently
either by G-L-S (equivalent to the maximum likelihood under the hypothesis
of normality) or the method of minimum chi-square. These three procedures
are applied if the mean function of the count is twice differentiable. 3
The parameter of the Poisson law may contain an unobservable random
error. The first specification used is the independence of the errors. Thus,
Greenwood and Yule (in Cox and Lewis [7], Johnson and Kotz [23] and
Weber [35]) lay down the hypothesis that the law of the errors is a gamma one.
This parametric hypothesis is practical because the law of the realizations
becomes a negative binomial law. Cox and Lewis [7] consider the stationary
case and mention that the stationarity of the errors of the parameter of
the law is equivalent to the stationarity of the process of realizations. This
classical result is found in continuous time and the theoretical problem
becomes more complex for count data. These data cumulate the realizations
over periods of the same length; we suppose that the parameter is

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 141-157.

© 1991 Kluwer Academic Publishers.
142 B. Larcher

only varying at each period. The process of the counts does not stay
stationary and the results of Poisson's processes theory in continuous time
cannot be applied. The variance of the process is changing at each period of
time. The process of the error remains unknown; but their autocorrelations
can be indirectly estimated by the sample autocorrelations of another non
stationary process. This other process is the one of the difference between
the count and its unconditional expectation. Its knowledge increases the
predictability of the Poisson process and the coefficients of the unknown
process can be estimated.
Empirically, the hypothesis of independence of the residuals is often
inappropriate. The error is either a stationary process or a non stationary one.
We will show situations where the process is a stationary ARMA 4 process
or a non stationary ARMA one, the latter being called a process with time
changing coefficients and noted ARMA_t. We introduce a non stationary
error and we think that the relationship between the errors depends on the
day of the week; the process is then said to be cyclostationary.5 For convenience,
this new process is called CARMA. The cyclostationarity appears
when the memory of the system is linked to days. All the CARMA processes
have a VARMA representation under constraints of some elements of two
matrices. The interpretations made for multiple series can be adapted to the
case of the cyclostationary processes.
Under the hypothesis where the process of the error is a CARMA, the
exhibited new process is a model whose coefficients are varying with time
and present some similarity with the Kalman filter [24, 26]. In another way,
the variance of this new process depends on past errors but the model differs
from the ARCH model of Engle [9] and the GARCH model of Bollerslev [3].
We use the asymptotic framework of Gourieroux et al. [12, 13, 14]. For
convenience, we choose the mean square that is equivalent to the maximum
likelihood under the hypothesis of normality. Asymptotically, the choice of
the normal law leads to estimators equivalent to those established by all
other linear exponential laws. Gourieroux et al. [12, 13, 14] apply the pseudo
maximum likelihood method to the basic Poisson model and quasi gener-
alised pseudo maximum likelihood to the Poisson model, the parameter of
which contains an error. The law of the error is unknown but the first two
moments are supposed to exist. In their approach, the errors are independent.
Under the hypothesis of the dependence of the errors, the estimator of
quasi pseudo maximum likelihood has to be adapted and a new procedure
of estimation has to be constructed. We show how to adapt the procedure of
the quasi generalised pseudo maximum likelihood in this particular case.

2. Estimator of the quasi generalised pseudo maximum likelihood in the Poisson model
2.1. The standard Poisson model

At the end of each period, the cumulated realizations of the Poisson process
are observed: $y_t$ is the number of events that appear at the time $t$; $\lambda_t$ is the
parameter of the Poisson law at time $t$ and $P(y_t)$ is the probability of
occurrence of $y_t$ events in $t$.

$$P(y_t) = \exp(-\lambda_t)\,\frac{\lambda_t^{y_t}}{y_t!}. \tag{2.1:1}$$

$\lambda_t$ is a random variable composed of a deterministic part $\exp(X_t'\beta)$ and a
random multiplicative non observable error $\xi_t = \exp(\varepsilon_t)$. The mean of the
error is one and its variance is $\eta^2$.

$$E(\lambda_t) = \exp(X_t'\beta) \tag{2.1:3}$$
$$V(\lambda_t) = \exp(X_t'\beta)^2\, V(\exp(\varepsilon_t)) = \exp(2X_t'\beta)\,\eta^2 \tag{2.1:4}$$
$$V(\exp(\varepsilon_t)) = \eta^2 \tag{2.1:5}$$
$$E(\exp(\varepsilon_t)) = 1. \tag{2.1:6}$$
In this section, we suppose that the process of the errors $\xi_t$ is a white noise with
a mean equal to one and a variance equal to $\eta^2$, and that the covariance between
the errors is null for each lag $h \neq 0$.

$$C(\exp(\varepsilon_t), \exp(\varepsilon_{t-h})) = \eta_{t,t-h} = 0. \tag{2.1:7}$$
The use of the exponential function ensures that $\lambda_t$ is positive. The random
error allows us to introduce the imperfection in the determination of the
parameter of the Poisson law. In those conditions, the law of $y_t$ conditional on
$X_t$ and $\varepsilon_t$ is a Poisson law $[L(y_t \mid X_t, \varepsilon_t)]$; but the law of $y_t$ conditional on $X_t$
alone is not a Poisson law. The law of $y_t$ can easily be specified in the case
when the error $\exp(\varepsilon_t)$ follows a gamma law. This particular specification was
introduced by Greenwood and Yule in "accident-proneness" (in Cox and Lewis [7],
Johnson and Kotz [23], Weber [35]).
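
The effect of a gamma error on the counts can be checked by simulation. The sketch below is only an illustration: the parameter values are arbitrary, the helper names are hypothetical, and the Poisson draws use Knuth's multiplication method because the Python standard library has no Poisson sampler. It compares the sample mean and variance of the counts with $\exp(X_t'\beta)$ and $\exp(X_t'\beta) + \exp(2X_t'\beta)\eta^2$.

```python
import math
import random

def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for moderate lam."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_counts(mean, eta2, n, seed=0):
    """Draw y ~ Poisson(mean * xi) with xi ~ gamma law of mean 1 and variance eta2."""
    rng = random.Random(seed)
    shape, scale = 1.0 / eta2, eta2      # E[xi] = shape*scale = 1, V[xi] = shape*scale^2 = eta2
    return [poisson_draw(mean * rng.gammavariate(shape, scale), rng)
            for _ in range(n)]

mean, eta2 = 5.0, 0.2                    # illustrative values, not taken from the paper
y = simulate_counts(mean, eta2, 50_000)
m = sum(y) / len(y)
v = sum((yi - m) ** 2 for yi in y) / len(y)
# sample mean, sample variance, theoretical variance mean + mean^2 * eta2
print(m, v, mean + mean ** 2 * eta2)
```

The sample variance exceeds the sample mean, which is the overdispersion that a pure Poisson law cannot reproduce.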

2.2. Estimators of pseudo maximum likelihood

In other cases, the law of $y_t$ is unknown. Under general hypotheses, the
pseudo maximum likelihood gives consistent and asymptotically gaussian
estimators. The method is proposed by Gourieroux et al. [13, 14] after
Burguette et al. [6] and White [36]. These results are applied to the doubly
stochastic laws and in particular to Poisson's law with a random parameter.
If the mean and the variance of $y_t$ exist for all the laws of $y_t$, then the
estimators of maximum likelihood associated to any pseudo law are strongly
consistent 6 and asymptotically gaussian if and only if the pseudo law is
chosen among the linear exponential laws 7 (theorems 1, 2 and 3 of [14]). The law
$L[y_t \mid X_t, \varepsilon_t]$ is a Poisson law. The existence of the mean and the variance of
the error implies the existence of the mean and the variance of the law of $y_t$.
Under this hypothesis, the coefficients of the doubly stochastic Poisson
process can be estimated.
The variance of $y_t$ depends on the first two moments of the law $L[y_t \mid X_t, \varepsilon_t]$
and on the first two moments of the law of errors. For this law:
The variance of a Poisson law is equal to the difference between the moment
of order two and the square of the moment of order one. The variance of $y_t$ is
easily deduced. 8
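
As a worked step, and only as a sketch that assumes the independent multiplicative error of section 2.1 (mean one, variance $\eta^2$), the first two moments of $y_t$ follow from conditioning on the error $\xi_t = \exp(\varepsilon_t)$:

```latex
\begin{aligned}
E(y_t \mid X_t) &= E_\xi\!\left[E(y_t \mid X_t, \xi_t)\right]
                 = \exp(X_t'\beta)\,E(\xi_t) = \exp(X_t'\beta),\\
V(y_t \mid X_t) &= E_\xi\!\left[V(y_t \mid X_t, \xi_t)\right]
                 + V_\xi\!\left[E(y_t \mid X_t, \xi_t)\right]\\
                &= \exp(X_t'\beta) + \exp(2X_t'\beta)\,\eta^2 .
\end{aligned}
```

The first term is the Poisson variance averaged over the error, the second is the variance of the random parameter; the sum agrees with expression 2.4:2.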

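We take the formulation of Gourieroux et al. [13] and [14] in the univariate
case. In the univariate Poisson process, the asymptotic covariance is directly
deduced from multivariate formulations of asymptotic variance of the estimators
of P-M-L.

$$V_{as}(\beta) = \left[E_x\!\left(\frac{\partial f}{\partial \beta}\,\Sigma^{-1}\,{}^{t}\!\frac{\partial f}{\partial \beta}\right)\right]^{-1} E_x\!\left(\frac{\partial f}{\partial \beta}\,\Sigma^{-1}\,\Omega\,\Sigma^{-1}\,{}^{t}\!\frac{\partial f}{\partial \beta}\right) \left[E_x\!\left(\frac{\partial f}{\partial \beta}\,\Sigma^{-1}\,{}^{t}\!\frac{\partial f}{\partial \beta}\right)\right]^{-1} \tag{2.2:3}$$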
$\beta$ is the vector of coefficients of the model associated to the vector of
exogenous variables of the model:
$f(X, \beta)$ is the mean of $y_t$ (multivariate) conditional on $X$ and $\beta$.
$\Sigma(X, \beta)$ is the variance covariance of $y_t$ (multivariate) according to the
chosen pseudo law.
$\Omega(X, \beta)$ is the variance covariance of the true but unknown law of $y_t$.
$E_x$ is the expectation with respect to the exogenous variables.
When the parameter is disturbed by independent errors, the conditional
expectation of $y_t$ relative to $X_t'\beta$ is equal to the expectation of the conditional
expectation of $y_t$ relative to $X_t'\beta$ and $\xi_t$.

$$f(X, \beta) = E_\xi[\exp(X'\beta)\,\xi_t] = \exp(X'\beta). \tag{2.2:4}$$

The variance covariance of the true but unknown law of $y_t$ is equal to:

In the univariate case, $\Sigma(X, \beta)$ is the variance of $y_t$ according to the chosen
pseudo law. The asymptotic covariance may be estimated by sampling.
However, it is possible to define estimators with minimum variance that are
named estimators of quasi generalised pseudo maximum likelihood (Q-G-P-M-L).
This procedure is defined in [13] and [14] when the conditional law of $y_t$
does not depend on time. This condition is equivalent to the hypothesis of
the independence of the parameter errors ($\eta_{t,t-h} = 0$ for each $t$ at lag $h$).9

2.3. Estimators of quasi generalised pseudo maximum likelihood in a doubly stochastic Poisson model

The method of Q-G-P-M-L proposed by Gourieroux et al. [12, 13, 14, 34]
gives strongly consistent estimators, asymptotically gaussian and with minimum
variance (theorem 4 of [14]). The first step gives an estimator of the
coefficients; the second step gives an estimator of the variance of $y_t$ according
to the estimations of coefficients. The asymptotic variance of coefficients
does not depend on the variance of the chosen pseudo law.

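$$V_{as}(\beta) = \left[E_x\!\left(\frac{\partial f}{\partial \beta}\,\Omega^{-1}\,{}^{t}\!\frac{\partial f}{\partial \beta}\right)\right]^{-1} \tag{2.3:1}$$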

For estimation, the two steps can be repeated until the numerical conver-
gence of the algorithm.

2.4. Example of application of Q-G-P-M-L when the pseudo law is the normal law

In the doubly stochastic Poisson model, we choose the normal law. The
maximum likelihood consists in maximising the following expression at the
first step.10

$$\max_{\beta}\; -\sum_{t=1}^{T}\left(y_t - \exp(X_t'\beta)\right)^2$$

$$V(y_t) = E\left[(y_t - \exp(X_t'\beta))^2\right] = \exp(2X_t'\beta)\,\eta^2 + \exp(X_t'\beta). \tag{2.4:2}$$

The objective function is not concave. However, the estimation of $\beta$ is found
by an algorithm based on Newton Raphson (Minoux) [27]. In fact for this
function, we are able to choose correctly a starting solution that ensures the
convergence to the maximum.
The following regression, with variables depending on the estimation of $\beta$,
gives us an estimator of $\eta^2$.
The variance of $y_t$ can be estimated. At the second step, the estimation of the
variance of $y_t$ is introduced.

$$\max_{\beta}\; -\sum_{t=1}^{T}\frac{\left(y_t - \exp(X_t'\beta)\right)^2}{\exp(X_t'\hat\beta) + \exp(2X_t'\hat\beta)\,\hat\eta^2}. \tag{2.4:4}$$
The procedure is applied until the numerical convergence of the algorithm.
This model is defined if the errors are independent. We will now show
the consequences of abandoning this hypothesis.
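
The two steps can be sketched on simulated data. Several choices below are assumptions made for the illustration and are not taken from the paper: a single regressor with an intercept, a Gauss-Newton iteration in place of the Newton Raphson algorithm, a regression through the origin of $(y_t - \hat\mu_t)^2 - \hat\mu_t$ on $\hat\mu_t^2$ (suggested by 2.4:2) to estimate $\eta^2$, variance weights held fixed within each second step, and a gamma error to generate the data. All function names are hypothetical.

```python
import math
import random

def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for moderate lam."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def weighted_nls(y, x, w, b0, b1, iters=100):
    """Gauss-Newton for min over (b0, b1) of sum w_t (y_t - exp(b0 + b1 x_t))^2."""
    for _ in range(iters):
        a00 = a01 = a11 = g0 = g1 = 0.0
        for yt, xt, wt in zip(y, x, w):
            mu = math.exp(b0 + b1 * xt)
            r = yt - mu
            a00 += wt * mu * mu
            a01 += wt * mu * mu * xt
            a11 += wt * mu * mu * xt * xt
            g0 += wt * mu * r
            g1 += wt * mu * xt * r
        det = a00 * a11 - a01 * a01
        d0 = (a11 * g0 - a01 * g1) / det      # solve the 2 x 2 normal equations
        d1 = (a00 * g1 - a01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

def qgpml(y, x, rounds=5):
    """First step (unweighted), then variance estimation and weighted step, repeated."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0               # starting solution
    b0, b1 = weighted_nls(y, x, [1.0] * len(y), b0, b1)   # first step
    eta2 = 0.0
    for _ in range(rounds):
        mu = [math.exp(b0 + b1 * xt) for xt in x]
        # regression through the origin of (y - mu)^2 - mu on mu^2
        num = sum(m * m * ((yt - m) ** 2 - m) for yt, m in zip(y, mu))
        den = sum(m ** 4 for m in mu)
        eta2 = max(num / den, 0.0)
        w = [1.0 / (m + m * m * eta2) for m in mu]        # 1 / estimated V(y_t)
        b0, b1 = weighted_nls(y, x, w, b0, b1)            # second step
    return b0, b1, eta2

# Simulated data: true coefficients (1, 1) and eta^2 = 0.3, all illustrative.
rng = random.Random(1)
x = [rng.random() for _ in range(4000)]
y = [poisson_draw(math.exp(1.0 + xt) * rng.gammavariate(1 / 0.3, 0.3), rng)
     for xt in x]
b0, b1, eta2 = qgpml(y, x)
print(b0, b1, eta2)
```

A fixed number of rounds stands in for the convergence test; in practice the loop would stop when the coefficients no longer change.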

3. Poisson processes with a dependent error

The count data are modelled by a doubly stochastic Poisson process and the
error of the parameter may be stationary or not. This model is used to model
the series of deaths in a large hospital. The number of daily deaths in the
hospital appears as the realized sum of the indicator functions of all the
people present on that day. The probabilities for each one are unknown and
so is the class of risk. We consider different classes of risk. The number of
deaths in the class with high risk is represented by a Binomial law ($B(n_t, p_t)$)
with random parameters $(n_t, p_t)$ and the number of deaths in the class
with low risk is modelled by a Poisson law with a random parameter [$P(\lambda_t)$].
Moreover, in the high risk class, the parameters $p_t$ can be time dependent
and this dependence can be different for different days of the week. This
decomposition in classes of risk is unknown and the relationship between
parameters cannot be deduced. We therefore use a Poisson law whose parameters
are stochastically non stationary. We think that this model is well adapted to
these data.

3.1. Meaning of a Poisson law whose error of the parameter is an ARMA_t

This is the case of daily series the parameters of which depend on other
random events whose present and future effects differ according to the day
when the events appear. 11

3.1.3. Cyclostationary ARMA_t process

This process belongs to a specific class of processes quite different from the
class of non stationary processes that have an ARIMA or SARIMA representation.
The postulated cyclostationary process is a particular ARMA_t process but
not a cyclical cyclostationary process because the mean is already eliminated.
For a daily process, we lay down that the probabilistic relationships are
linked to the day they appear. This process is:

$$\exp(\varepsilon_t) = \Phi_{d(t)}(B)\exp(\varepsilon_t) + \Theta_{d(t)}(B)\,e_t \quad\text{and}\quad d(t) \in [1 \ldots D]\;\; \forall\, t \in [1 \ldots T] \tag{3.1:4}$$

$D$ = the number of periods in the cycle (7 for a week)
$d(t)$ = a cyclical function which gives the day of $t$ in the week.
$\Phi_{d(t)}(B)$ = the polynomial of the lag operator of the autoregressive part.
$\Theta_{d(t)}(B)$ = the polynomial of the lag operator of the moving average part.
$e_t$ = an error $\sim$ i.i.d. $L(0, \sigma^2)$
$\exp(\varepsilon_t)$ = the multiplicative error of the parameter of the Poisson law.
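
To make the role of $d(t)$ concrete, the following sketch simulates the simplest member of this family, a first order autoregression whose coefficient depends on the day. It is an assumption-laden illustration: the process is written on a zero mean deviation rather than on $\exp(\varepsilon_t)$, the innovations are gaussian, days are indexed from 0, and the coefficients are arbitrary. It checks that the variance is not constant but repeats with period $D$.

```python
import random

D = 7  # number of periods in the cycle (days in a week)

def day(t):
    """Cyclical function d(t): position of period t in the cycle, 0..D-1."""
    return t % D

def simulate_periodic_ar1(phi, sigma, n, seed=0):
    """z_t = phi[d(t)] * z_{t-1} + e_t, with e_t i.i.d. N(0, sigma^2)."""
    rng = random.Random(seed)
    z, out = 0.0, []
    for t in range(n):
        z = phi[day(t)] * z + rng.gauss(0.0, sigma)
        out.append(z)
    return out

def periodic_variances(phi, sigma, sweeps=200):
    """Iterate v_d = phi_d^2 * v_{d-1} + sigma^2 around the cycle until it settles."""
    v = [sigma ** 2] * D
    for _ in range(sweeps):
        for d in range(D):
            v[d] = phi[d] ** 2 * v[d - 1] + sigma ** 2   # v[-1] is the last day of the cycle
    return v

phi = [0.1, 0.8, 0.2, 0.7, 0.1, 0.3, 0.5]   # hypothetical day-dependent coefficients
z = simulate_periodic_ar1(phi, 1.0, D * 20000)
sample = [sum(v * v for v in z[d::D]) / len(z[d::D]) for d in range(D)]
theory = periodic_variances(phi, 1.0)
for d in range(D):
    print(d, round(sample[d], 3), round(theory[d], 3))
```

Each day has its own variance, so the process is not stationary, yet the pattern is identical from one week to the next.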

3.1.4. VARMA representation under constraints of a cyclostationary ARMA_t

A non stationary process whose coefficients depend on time and whose
covariances at lag $h$ depend on a cyclical function $d(t)$, is represented by a
VARMA process under constraints. $P_t$ is the vector of the sequence of the
observations of the $t$th cycle; $P_t = [P_t(1), P_t(2), \ldots, P_t(d), \ldots, P_t(D)]$. The
process $P_t$ is a multidimensional process and may be set in the following
form:

$$P_t = \sum_{j} \Psi_j E_{t-j} \tag{3.1:5}$$

$\Psi_j$ = a matrix of order $D \times D$
$E_{t-j}$ = a vector of a white noise of order $D$.
It can be decomposed in an autoregressive part and in a moving average part;
this decomposition is named vectorial ARMA or VARMA.

$$P_t = \sum_{l=0}^{p} \phi_l P_{t-l} + \sum_{k=0}^{q} \theta_k E_{t-k}. \tag{3.1:6}$$

The constraint is that $\phi_0$ is strictly lower triangular ($\phi_{0;i,j} = 0$ for each $j > i$).
The autoregressive part corresponding to $\phi_0$ represents instantaneous
causality in multiple series. $P_t$ represents the days of a week; the realization of
the day before the day $d$ of the week $t$ is that of the day $d - 1$, which is
in $P_t$ if $d > 1$. We have to impose that the elements of $\phi_0$ and $\theta_0$ which do not
correspond to $d - 1$ are necessarily equal to zero: $\phi_{0;i,j} = \theta_{0;i,j} = 0$ for each $j > i$.
$P_t$ = the vector of the realizations of a cycle (a week for instance).
$\phi_k$ = a matrix of order $D \times D$ corresponding to autoregressivity of lag $k$
$\theta_k$ = a matrix of order $D \times D$ corresponding to moving average coefficients
of lag $k$.
A cyclostationary process may be analysed as a multivariate process $P(1)$,
$P(2), \ldots, P(d), \ldots, P(D)$. In the case of daily data, $P(1)$ is the realization of
Monday and $P(2)$ is the realization of Tuesday and so on until Sunday. Each
univariate process is stationary and a multivariate process is called a
VARMA process (see Hannan [17], Granger and Newbold [16]).
The approach of the maximum likelihood is due to Akaike [1] and the
exact maximum likelihood to Hillmer and Tiao [20]. The procedure is
explained by Tiao and Box [33]. For an example of application, we may refer
to the analysis of pig market prices made by Cordier and Indjehagopian [21].
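
The passage from a daily series to the vectors $P_t$ amounts to cutting the series into complete cycles. The sketch below shows this stacking and the pattern of within-cycle coefficients that the constraint allows; the function names are hypothetical, and the mask reads "strictly lower triangular" literally, with a zero diagonal.

```python
def stack_cycles(series, D=7):
    """Cut a daily series into vectors P_t = [P_t(1), ..., P_t(D)], one vector
    per complete cycle (week); a trailing partial cycle is dropped."""
    n_cycles = len(series) // D
    return [series[c * D:(c + 1) * D] for c in range(n_cycles)]

def phi0_mask(D=7):
    """True where a within-cycle (lag 0) coefficient may be non zero:
    day i can only depend on an earlier day j < i of the same cycle."""
    return [[j < i for j in range(D)] for i in range(D)]

daily = list(range(1, 16))       # 15 hypothetical observations
weeks = stack_cycles(daily)      # two complete weeks, the 15th value is dropped
print(weeks)
print(phi0_mask(3))
```

With the series stacked in this way, each component of $P_t$ is the sub-series of one day of the week, and the usual tools for multiple series apply.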

3.1.5. Granger causality in VARMA processes

The advantage of using a VARMA representation is to apply the concept of
causality. A review of different concepts of causality and of different tests is
undertaken in the thesis of Lai Tong [25]. In a VAR or VARMA model and
for interpretation, each of the processes is stationary and is assimilated to a
variable. If $P_1 = [P_{1t}]$ and $P_2 = [P_{2t}]$ are two stationary processes, there is
Granger causality between the variables $P_1$ and $P_2$ if it is possible to predict
$P_{1t}$ better with the information of the past of $P_{2t}$ than with the information of
the past of $P_{1t}$ alone, once the information of the past of $P_{2t}$ contained in $P_{1t}$ is
taken away. Then, the variable $P_2$ causes the variable $P_1$. The relationship of
causality may be represented by a network which makes apparent the set of
all the dynamic relationships between the variables of the process. When the
variables are days, each variable is a day and the principle is applied to the

3.1.6. ARMA_t with VARMA representation in the Poissonian case

The concrete example of a Poisson model, whose errors of parameters are
modelled by an ARMA_t process, is the series of daily deaths in a large
hospital. We have a series of 2000 realizations. The $X_t$ are dummy
variables that represent the days of the week and the months of the year. The
sample cross autocorrelations between the residuals of the days show that the
errors of the parameter of Poisson's law are non stationary. For instance, the
coefficient of autocorrelation at the lag 2 is different for Monday and
Wednesday. The autocorrelation depends on the pair of days; a dozen
relationships are significant.
Preliminarily, we tested the relationship in a VAR (vectorial autoregressive
model). The results show the dynamic relationships between the days of
the week.
As an example, we give an explanation of the events with an ARMA_t
modelling of errors. Suppose that an exceptional event happens on a given
day, and induces an increase in the expectation of the number of deaths. This
event will have lagged effects on the parameter of the following days and
these effects are different according to the day when the initial event appears.
We consider a specific surgical operation that always takes place on Tuesday;
the two days of maximum risk of death are Tuesday and, two days later,
Thursday. We suppose that the number of these surgical operations is a
random variable.
This random variable will have two contributions to the number of deaths.
The first, the non conditional mean, is modelled by the coefficients $\beta$
associated to Tuesday and Thursday. The second is the variation of the mean
of Thursday conditionally on the number of surgical operations that happen
on Tuesday. If, for a given Tuesday, the number of these surgical operations
increases, the conditional mean of Thursday also increases. The state of
Poisson's law of Tuesday will cause the state of Poisson's law of Thursday.
Then, the residuals of Tuesday and Thursday will be correlated. If we
generalise to a set of surgical operations, the network of relationships
becomes complicated. The non stationary analysis of the series of residuals
gives the network of these relationships. But, without any additional informa-
tion, it is impossible to derive further explanations.
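
The Tuesday and Thursday mechanism can be reproduced in a small simulation. Everything in this sketch is hypothetical: the baseline parameter, the mean number of operations, the size of the effect and the function names are chosen for illustration only, and the parameter is taken to be additive for simplicity. The lag 2 correlation of the residuals is computed separately for each day of the week.

```python
import math
import random

D = 7  # days per week; index 0 = Monday, 1 = Tuesday, ..., 6 = Sunday

def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for moderate lam."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_weeks(n_weeks, base=4.0, ops_mean=3.0, effect=1.0, seed=0):
    """Daily deaths; the random number of Tuesday operations raises the
    Poisson parameter on Tuesday and again on Thursday."""
    rng = random.Random(seed)
    y = []
    for _ in range(n_weeks):
        ops = poisson_draw(ops_mean, rng)
        for d in range(D):
            lam = base + (effect * ops if d in (1, 3) else 0.0)
            y.append(poisson_draw(lam, rng))
    return y

def lagged_corr_by_day(y, lag):
    """Correlation between the residual of day d(t) and the residual `lag`
    days earlier, computed separately for each day of the week."""
    n_weeks = len(y) // D
    means = [sum(y[d::D]) / n_weeks for d in range(D)]
    resid = [y[t] - means[t % D] for t in range(len(y))]
    out = []
    for d in range(D):
        pairs = [(resid[t], resid[t - lag])
                 for t in range(d, len(y), D) if t - lag >= 0]
        sxy = sum(a * b for a, b in pairs)
        sxx = sum(a * a for a, _ in pairs)
        syy = sum(b * b for _, b in pairs)
        out.append(sxy / math.sqrt(sxx * syy))
    return out

r2 = lagged_corr_by_day(simulate_weeks(5000), 2)
print("Thursday with Tuesday:", round(r2[3], 3))
print("Wednesday with Monday:", round(r2[2], 3))
```

Under these assumptions the Thursday residual is correlated with the Tuesday residual, while the same lag shows no relationship for Wednesday and Monday, so the autocorrelation at a given lag depends on the pair of days.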

3.2. Relationship between the two processes of errors

At a lag different from zero, the covariance $C(y_t, y_{t-h})$ between the random
variables $y_t$ and $y_{t-h}$ is a function of the covariance between the errors of the
parameter of Poisson's law (Annex 1). $C(y_t, y_{t-h})$ may be estimated by a
method similar to the one which is used to estimate the variance in Q-G-P-M-L.

3.2.1. Variances and covariances of the two processes

Each count is written as the sum of its conditional mean plus an error that
comes from a Poisson lottery. For convenience, we consider an additive
process of error and a parameter equal to $X_t'\beta + \varepsilon_t = X_t'\beta + \Psi_t(B)e_t$. We
suppose that $E[(\Psi_t(B)e_t)^2]$ is small relative to $X_t'\beta$ in order to keep the
parameter $\lambda_t$ positive.

$$\begin{aligned} y_t &= X_t'\beta + \varepsilon_t + u_t\\ \varepsilon_t &= \Psi_t(B)e_t\\ w_t &= \Psi_t(B)e_t + u_t \end{aligned} \tag{3.2:1}$$

$\Psi_t(B)e_t$ = the non stationary process of the error of the parameter.
$u_t$ = the error of the Poisson lottery, of law $L(0, X_t'\beta + \Psi_t(B)e_t)$; the
variance of $u_t$ is equal to the variance of $y_t$ with $e_t$ the unknown
last innovation, $u_t \sim L(0, X_t'\beta + \sum_j \Psi_{t,j}e_{t-j} + \ldots)$
$w_t$ = the sum of the two processes.

et and u, are respectively time independent and independent between

themselves. E(e,e'_J) = E(u,ut - J) = E(e,ut_k) = for eachj rf 0 and for
each k. The error of the parameter of the Poisson law is non stationary and
cyclostationary in our model. The instantaneous variances and autocovari-
ances are equal in every 1 period (1 = 7 in our case).
P-M-L associated with the normal law and Q-G-P-M-L give estimators of
$\beta$. Those estimators of $\beta$ are unbiased, but implicitly we need to use the
covariance of $y_t$ to preserve the asymptotic properties. If we postulate
independence between the endogenous variables, the Q-G-P-M-L under the
pseudo-hypothesis of independence gives an unbiased estimator of $\beta$, and an
unbiased estimator of $w_t$ may be derived. The instantaneous variances and
covariances of $w_t$ can then be evaluated.
For a lag different from zero, the sample autocovariance is an asymptotically
unbiased estimator of the covariance. The autocovariances of $w_t$ are
equal to the autocovariances of $\Psi_t(B)e_t$. The following expression gives the
150 B. Larcher

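expectation of this sample autocovariance for lags different from 0.

$E(\hat{\gamma}_t^w(h)) = \gamma_t^w(h)\left(1 - \frac{1}{n}\right) - \frac{1}{n^2}\sum_{k=1}^{n-1}(n-k)\left[\gamma_t^w(h - lk) + \gamma_t^w(h + lk)\right]$      (3.2:2)
$\gamma_t^w(h) = E[(\varepsilon_t + u_t)(\varepsilon_{t-h} + u_{t-h})] = \gamma_t^\varepsilon(h).$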
The variance of this estimator is greater than the variance of the sample of
errors; unfortunately, it is impossible to observe a sample of errors. The
sample variance is not an asymptotically unbiased estimator of the variance
of error.

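$E(\hat{\gamma}_t^w(0)) = \gamma_t^w(0)\left(1 - \frac{1}{n}\right) - \frac{1}{n^2}\sum_{k=1}^{n-1}(n-k)\left[\gamma_t^w(-lk) + \gamma_t^w(lk)\right]$

$\gamma_t^w(0) = \gamma_t^\varepsilon(0) + E(u_t^2) = \sum_{j \geq \max(0,h)} \Psi_{t,j}^2\,\sigma_{t-j}^2 + E(u_t^2)$      (3.2:3)

$V(\varepsilon_t) + E(u_t^2) = V(\varepsilon_t) + X_t'\beta + \sum_j \Psi_{t,j}e_{t-j} + V(e_t).$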


$E(u_t^2)$ appears at the end of the last expression; it is the variance at $t$ of $y_t$,
which follows a Poisson law: $E(u_t^2) = V(y_t \mid \Psi_t(B)) = X_t'\beta + \sum_j \Psi_{t,j}e_{t-j} + \sigma_e^2$,
and an asymptotically unbiased estimator may be defined (note 15). The autocovariances
and the variances can be estimated by a procedure similar to the one used for
Q-G-P-M-L. The procedure has to be adapted to the non-stationarity of the
error, with $w_t = y_t - X_t'\beta$.
$(y_t - X_t'\hat{\beta})(y_{t-h} - X_{t-h}'\hat{\beta}) = \gamma_t^\varepsilon(h) + v_t$      (3.2:4)
$(y_t - X_t'\hat{\beta})^2 - X_t'\hat{\beta} = \gamma_t^\varepsilon(0) + v_t.$
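
The two moment conditions in (3.2:4) can be illustrated numerically. The Python fragment below is only a sketch under stated assumptions, not the author's procedure: it assumes the fitted means $X_t'\hat{\beta}$ are already available in an array (the name `fitted` is hypothetical), and it averages each moment expression over the observations that share the same position in the cycle, which is one simple way to respect cyclostationarity with period $l = 7$.

```python
import numpy as np

def cyclo_moments(y, fitted, h, period=7):
    """Moment estimates of gamma_t^eps(h), one value per position in the cycle.

    y, fitted : counts and fitted means X_t' beta_hat (assumed to be given).
    h         : lag, h >= 0.
    Entry s of the result averages over all t with t % period == s.
    """
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    w = y - fitted                       # w_t = y_t - X_t' beta_hat
    t = np.arange(h, len(y))             # indices for which t - h exists
    if h == 0:
        # (y_t - X_t'b)^2 - X_t'b : subtracts the Poisson part of the variance
        terms = w[t] ** 2 - fitted[t]
    else:
        # (y_t - X_t'b)(y_{t-h} - X_{t-h}'b)
        terms = w[t] * w[t - h]
    out = np.full(period, np.nan)
    for s in range(period):
        mask = (t % period) == s
        if mask.any():
            out[s] = terms[mask].mean()
    return out
```

With daily data and `period=7`, entry `s` of the result would be the estimate for one day of the week; how the days are aligned with `t` is left to the user.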
The variance of $w_t$ depends on the information about the process $\varepsilon_t$. We may
distinguish two cases. The first is the one used above. In the second,
$\Psi_t(B)e_t$ is unknown but the past of $w_t$ is known. According to
the theory of non-stationary processes, the process has an MA$_t(\infty)$, an
AR$_t(\infty)$ or an ARMA$_t(p_t, q_t)$ representation if and only if the moduli of the roots of the
polynomials $[1 - \Phi_t(B)]$ and $[1 - \Theta_t(B)]$ lie outside the unit circle.
$\varepsilon_t = \Psi_t(B)e_t = [1 - \Phi_t^\varepsilon(B)]^{-1}[1 - \Theta_t^\varepsilon(B)]e_t = \Pi_t^\varepsilon(B)e_t$
The process $\varepsilon_t$ has an AR$_t$ representation, and this process will be approximated
by the process $w_t = \Pi_t^w(B)w_t$. Under these conditions, we have to evaluate the
variance of $w_t$, $\gamma_t^w(0)$, by the method used in Annex 1.
$y_t = X_t'\beta + \Pi_t^w(B)w_t + e_t^w + u_t$
$\gamma_t^w(0) = X_t'\beta + \Pi_t^w(B)w_t + V(e_t^w)$      (3.2:6)
$V(u_t) = V(y_t \mid I_{t-1}^w) = X_t'\beta + \Pi_t^w(B)w_t + V(e_t^w).$

3.2.2. Observable process $w_t$ in the autoregressive case

The coefficients of the process $\Psi_t(B)e_t$ can be estimated only from the
estimates of the autocovariances. This method is asymptotically unbiased
but gives poor estimates in short series; moreover, knowledge of the
coefficients cannot increase predictability, because the process is unobservable.
For this reason, the only way to improve predictability is to use the
process $w_t$. We consider the simple case where the process $\Psi_t(B)e_t$ is
autoregressive, $\Psi_t(B)e_t = \Phi_t(B)\varepsilon_t + e_t$. The process $w_t$ is then autoregressive, and
the problem is now to estimate a model with an autoregressive error.
$y_t = X_t'\beta + \Phi_t^w(B)w_t + v_t = X_t'\beta + \Phi_t^w(B)w_t + e_t^w + u_t$      (3.2:7)
$w_t = \Phi_t^w(B)w_t + v_t = \Phi_t^w(B)w_t + e_t^w + u_t.$
The error $v_t$ is equal to $e_t^w + u_t$, with $e_t^w$ the residual error of $\varepsilon_t$ after
introducing the autoregressive part. $u_t$ is the error due to the Poisson process
of parameter $X_t'\beta + \Phi_t^w(B)w_t + e_t^w$.
If $\sum_{j \geq 1} \Psi_{t,j}e_{t-j} = E_u(\Phi_t^w(B)w_t)$, then $E(\lambda_t \mid \Phi_t^w(B)w_t) = E_u(\lambda_t \mid \sum_j \Psi_{t,j}e_{t-j})$. The
estimator $X_t'\beta + \Phi_t^w(B)w_t$ of $\lambda_t$, which is the mean of $y_t$, is unbiased but
not convergent. Knowledge of the process $w_t$ allows us to
find an unbiased estimator of the mean of $y_t$; however, $w_t = \Psi_t(B)e_t + u_t$
contains the error $u_t$ that is drawn in a Poisson lottery. The problem of
estimating $\Phi_t^w$ is a particular one: the variance of the process changes, and the
model becomes a model with varying coefficients $\Phi_t^w$. The variance depends
on the error of the drawing in the Poisson law.

3.2.3. Process with time changing coefficients for an AR

The process $w_t$ has a varying variance and, consequently, the
coefficients of the autoregressive process change whether $e_t$ is stationary or not. For
convenience, this may be shown in the case where the true process of the
unobservable error is an AR process, by using the Yule-Walker (note 16) equations
for the unknown process $\varepsilon_t$:
$\varepsilon_t = \sum_{j=1}^{p} \phi_j^\varepsilon\,\varepsilon_{t-j} + e_t.$      (3.2:9)

With $\gamma^\varepsilon$ the vector of the autocovariances at lags $h$ different from zero, $\phi^\varepsilon$
the vector of the coefficients and $\Gamma^\varepsilon$ the matrix of the autocovariances at lags
0 to $p - 1$, the Yule-Walker equations are:
$\gamma^\varepsilon = \Gamma^\varepsilon\phi^\varepsilon$ and $\phi^\varepsilon = (\Gamma^\varepsilon)^{-1}\gamma^\varepsilon.$      (3.2:10)
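
As an illustration of (3.2:10) in the simplest, stationary setting, the following Python sketch builds the matrix of autocovariances at lags 0 to $p - 1$ and solves the linear system for the AR coefficients. The function name and the input layout are assumptions made for this example; in the time-dependent case discussed in the text, the matrix would have to be rebuilt at each $t$.

```python
import numpy as np

def yule_walker_coeffs(gamma):
    """Solve Gamma * phi = g for the coefficients of an AR(p) process.

    gamma : autocovariances [gamma(0), gamma(1), ..., gamma(p)].
    Gamma is the p x p matrix with entries gamma(|i - j|) (lags 0..p-1),
    g is the vector of autocovariances at lags 1..p.
    """
    gamma = np.asarray(gamma, dtype=float)
    p = len(gamma) - 1
    lags = np.abs(np.arange(p)[:, None] - np.arange(p)[None, :])
    Gamma = gamma[lags]              # Toeplitz matrix of autocovariances
    g = gamma[1:]                    # right-hand side, lags 1..p
    return np.linalg.solve(Gamma, g)
```

For an AR(1) process with coefficient 0.5 and unit innovation variance, the autocovariances are 4/3 and 2/3, and the function recovers the coefficient 0.5.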
The coefficients $\phi^\varepsilon$ of the process $\varepsilon_t$ can also be estimated by solving
the Yule-Walker equations. For the process $w_t$, the Yule-Walker equations are:

$E(w_{t-1}w_t) = \sum_{j=1}^{p} \phi_{t,j}^w E(w_{t-1}w_{t-j}) + \underbrace{E(w_{t-1}e_t^w)}_{=0} + \underbrace{E(w_{t-1}u_t)}_{=0}$      (3.2:11)

$\gamma^w(1) = \sum_{j=1}^{p} \phi_{t,j}^w E(\gamma^w(j-1)).$

The diagonal elements of the matrix $\Gamma^w$ of the autocovariances of $w_t$ change;
they are $V(w_{t-i}) = V(y_{t-i}) = V(u_{t-i})$. The matrix $\Gamma_t^w$ (with subscript $t$)
depends on time, and the coefficients of the process are time dependent.

A part of $V(\varepsilon_t)$ has been explained by the process $\Phi_t(B)w_t$. The true
variance of $u_t$ is $V(u_t) = V(y_t) = X_t'\beta + \sum_j \phi_j\varepsilon_{t-j} + \sigma_e^2$ and depends on
unknown variables. The variance of $u_t$ conditional on the process $w_t$ is
$V(u_t^w) = V(y_t) = X_t'\beta + \Phi_t(B)w_t + V(e_t^w)$.
Generalised least squares cannot be used because the variance of the
exogenous variables $w_{t-1}, w_{t-2}, \dots, w_{t-p}$ changes over time. The process
$y_t - X_t'\beta - \Phi_t(B)w_t$ is then an orthogonal process. The variance-covariance
matrix changes with time. In a linear model with a homoskedastic
autoregressive error (Pierce [30]), O-L-S may be used to estimate the
coefficients of the AR($p$); the errors are recursively computed and introduced
in the model, and the Yule-Walker equations would give an equivalent
result. If the errors are heteroskedastic, O-L-S cannot be used to
estimate the coefficients $\Phi^w$ of the process.
If the variance of the error changes with time, the covariance matrix
$\Gamma_t^w$ of the exogenous variables changes with time, and $\Phi_t^w$ has to be
computed separately by inverting $\Gamma_t^w$ at each $t$. If we write the relationship for
$t$ and $t - 1$, then we derive the relationship between $\Phi_t^w$ and $\Phi_{t-1}^w$:
$\gamma^w = \gamma^\varepsilon = \Gamma_t^w\Phi_t^w = \Gamma_{t-1}^w\Phi_{t-1}^w$
$\Phi_t^w = (\Gamma_t^w)^{-1}\Gamma_{t-1}^w\Phi_{t-1}^w.$      (3.2:13)

3.2.4. Process with time changing coefficients for an ARt and comparison
with Kalman filter
Furthermore, when the process of the unobservable error is an AR$_t$ and in
particular a cyclostationary process, the autocovariances at a lag different
from 0 are cyclical; the coefficients of the process of the unobservable error
are cyclical, but the coefficients of the process $w_t$ are not. All the terms of $\Gamma_t^w$
now depend on $t$. If the procedure is described in a sequential form and if we
use a Bayesian formulation, then the system is similar to that of the Kalman
filter [24, 26]. In its Bayesian formulation, the model with changing coefficients
is that of the Kalman filter. In matrix form:
$w_t = F_t'\Phi_{t-1}^w + v_t$ and $F_t' = [w_{t-1}, w_{t-2}, \dots, w_{t-p}]$      (3.2:14)
$\Phi_t^w = P_t\Phi_{t-1}^w + \eta_t.$
The estimation procedure takes into account the properties
set out earlier and is presented in Annex 2.
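
To make the analogy with the Kalman filter concrete, here is a minimal Python sketch of a textbook Kalman recursion that tracks the coefficients of an autoregression on $w_t$. It is not the procedure of Annex 2. It rests on simplifying assumptions that are not in the text: the transition matrix $P_t$ is taken to be the identity (random-walk coefficients), and the state and observation noise variances `q` and `r` are fixed, hypothetical tuning values.

```python
import numpy as np

def kalman_tv_ar(w, p=1, q=1e-4, r=1.0):
    """Track time-varying AR(p) coefficients of the series w.

    Assumptions (illustrative only): identity transition matrix,
    state noise variance q, observation noise variance r.
    Returns an array of shape (len(w), p) with the filtered coefficients.
    """
    w = np.asarray(w, dtype=float)
    n = len(w)
    phi = np.zeros(p)                 # filtered coefficient vector
    S = np.eye(p)                     # covariance of the coefficient vector
    path = np.zeros((n, p))
    for t in range(n):
        if t >= p:
            F = w[t - p:t][::-1]                  # [w_{t-1}, ..., w_{t-p}]
            S_pred = S + q * np.eye(p)            # prediction step
            innov = w[t] - F @ phi                # one-step prediction error
            innov_var = F @ S_pred @ F + r        # its variance
            gain = S_pred @ F / innov_var         # Kalman gain
            phi = phi + gain * innov              # update of the coefficients
            S = S_pred - np.outer(gain, F) @ S_pred
        path[t] = phi
    return path
```

Setting `q=0` reduces the recursion to recursive least squares with constant coefficients; a positive `q` lets the coefficients drift, which is the situation described in the text.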

4. Concluding remarks

The analysis of the series of deaths gives an example of using the non
stationarity in a doubly stochastic Poisson model. Non stationarity is met in
many phenomena that are observed daily and must be correctly modelled by
using non stationarity and more precisely the cyclostationary properties. For
this reason, we think that the CARMA should be used in the future.
These processes may be used to define parameters of doubly stochastic
processes. In particular, the doubly stochastic
Poisson process gives us a flexible structure for modelling
phenomena whose Poisson processes in continuous time are unknown and
whose parameters depend on unknown variables. The introduction of a
non-stationary error allows us to improve predictability, but it requires
estimating a model whose parameters are time dependent.
In practice, the estimation and identification procedures for stationary
processes, before or after transformation, are well known; their popularity is
principally due to Box and Jenkins [2]. The underlying statistical foundations
can easily be extended to the cyclostationary case. Moreover, the set of
ARIMA models becomes a particular case of a model with a CARIMA error. If
the exogenous variables are dummy variables which represent time, the
process is equivalent to a CARIMA.
The different results concern the univariate case and may be extended to
multivariate series of Poisson whose parameter errors follow a vectorial
CARIMA; this new model can take into account the relationships between
the parameters of different Poisson series.
However, further investigations and simulations have to be made to test
the situations where there is significant and increased predictability.

Annex 1

The variances and covariances of a Poisson process whose parameter
contains a stationary error are a function of the variance of the error of the
parameter.
The characteristic function of the Poisson law and its first derivative are:
$\varphi_y(u) = e^{\lambda(e^{iu} - 1)}$
$\varphi_y'(u) = i\lambda e^{iu}e^{\lambda(e^{iu} - 1)}.$

The derivative of order $k$ at $u = 0$ is $i^k$ times the moment of order $k$.

$\varphi_y'(0) = i\lambda$
$\varphi_y''(0) = i^2\lambda + i^2\lambda^2$

$E(y_t) = \exp(X_t'\beta)$
$E(y_t^2) = E(\lambda + \lambda^2) = \exp(X_t'\beta) + \exp(2X_t'\beta)(\eta^2 + 1)$      (A3)
$V(y_t) = \exp(X_t'\beta) + \exp(2X_t'\beta)\eta^2.$

In the case where the model is additive:

$y_t = X_t'\beta + \varepsilon_t$
$V(y_t) = X_t'\beta + \eta^2.$      (A4)
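
The additive result (A4) can be checked by simulation. The Python sketch below is an illustration under assumptions chosen for convenience, not taken from the text: the mean $X_t'\beta$ is held constant at 10, and the error of the parameter is drawn from a uniform law on $[-3, 3]$, so that $\eta^2 = 3$ and the parameter stays positive.

```python
import numpy as np

def simulate_additive_variance(mean=10.0, half_width=3.0, n=200_000, seed=0):
    """Simulate y_t ~ Poisson(mean + eps_t), eps_t uniform on [-half_width, half_width].

    Returns (sample variance of y, theoretical value mean + eta^2),
    where eta^2 = half_width**2 / 3 is the variance of the parameter error.
    """
    rng = np.random.default_rng(seed)
    eps = rng.uniform(-half_width, half_width, size=n)   # error of the parameter
    lam = mean + eps                 # positive as long as mean > half_width
    y = rng.poisson(lam)             # the Poisson lottery
    eta2 = half_width ** 2 / 3.0
    return y.var(), mean + eta2
```

The sample variance should lie close to the theoretical value 13, that is, above the value 10 that a pure Poisson law with the same mean would give.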

The covariance of two independent random variables following a Poisson law
with dependent errors of parameters is:

$C(y_t, y_{t-h}) = E[(\exp(X_t'\beta)\xi_t)(\exp(X_{t-h}'\beta)\xi_{t-h})] - E(\exp(X_t'\beta)\xi_t)\,E(\exp(X_{t-h}'\beta)\xi_{t-h})$
$= \exp[(X_t' + X_{t-h}')\beta]\,\eta_{t,t-h}.$      (A5)

In the case where the model is additive, the covariance does not depend on
Annex 2

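Different steps of the algorithm

0.1. $\sigma(y_t) = 1$
0.2. $\Phi_t^w(B)w_t = 0$
0.3. $(\Gamma_t^w)^{-1}\gamma_t^w = 0.$
1.1. Estimation of $\beta$

$\max_\beta\; -\sum_{t=1}^{n} \frac{(y_t - \Phi_t^w(B)w_t - X_t'\beta)^2}{\sigma(y_t)}$

1.2. Estimation of $w_t$
$w_t = y_t - X_t'\hat{\beta}$

1.3. Estimation of $\gamma_t^\varepsilon(0)$

$(y_t - X_t'\hat{\beta})^2 - X_t'\hat{\beta} = \gamma_t^\varepsilon(0) + v_t.$

1.4. Estimation of $\gamma_t^w(h)$ for $h \neq 0$

$(y_t - X_t'\hat{\beta})(y_{t-h} - X_{t-h}'\hat{\beta}) = \gamma_t^\varepsilon(h) + v_t.$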

2.1. Estimation of $V(e_t^w)$
$V(e_t^w)$ is the residual variance of $\varepsilon_t$ and is equal to $V(\varepsilon_t)$ minus the part explained by
the autoregressive process.

2.2. Estimation of $\gamma_t^w(0) = V(y_t)$

$\gamma_t^w(0) = V(w_t) = X_t'\hat{\beta} + \Phi_t^w(B)w_t + V(e_t^w).$

2.3. Estimation of $\Phi_t^w$

$\Phi_t^w = (\Gamma_t^w)^{-1}\gamma_t^w.$
3. Stopping criteria
The procedure is applied until numerical convergence of the empirical
variability of $y_t$ is reached (the variability of $y_t$ is $\sum_t [y_t - X_t'\hat{\beta} - \Phi_t^w(B)w_t]^2$).
Numerical convergence has to be reached in phase 2 and in phases 1 and 2.

1. Theoretical justifications are based either on the theory of waiting-line models or on the
convergence of sums of indicator functions.
2. A non-homogeneous Poisson model is a process with a parameter changing with time.
3. The function can be linear or not; if not, the non linear optimization model has to be
4. An ARMA process is an autoregressive moving average process, the use of which is
largely due to Box and Jenkins [2].
5. We have taken the word cyclostationarity from signal theory (Papoulis [29], Gardner
and Franks [11], "characterization of the cyclostationary random signal").
6. We refer to convergence in probability, to almost sure convergence, or to
convergence almost everywhere in integration theory (Serfling [32]).
7. The main results on linear exponential laws and models are in Monfort [28].
8. These results are proved in Annex 1.
9. These results are proved in Annex 1.
10. At the first iteration, the variance is set equal to one because it is not possible to take into
account the variance of $y_t$, which depends on the unknown parameters $\beta$ and $\eta^2$.
11. The characteristics of non-stationary processes may be found in Priestley [31], p. 816,
and the ARMA$_t$ in Hassairi [18]. The generalisation of Wold's theorem to the
non-stationary case was made by Cramer [8].
12. The concept of causality may be applied to the relations between the elements of a
univariate series; this use is found in particular in Brockwell [5].
13. The procedure as shown in the following part is not orthodox because the variance of this
process changes at each time.
14. $n$ is conventionally the number of observations used to compute the variances.
15. If the mean is not significantly different from zero, the variances and the covariances are
computed from moments of order two and the estimators are unbiased.
16. For this, see Brockwell [5] or Gourieroux and Monfort [15].

[1] Akaike, H. (1973), 'Maximum Likelihood Identification of Gaussian Autoregressive
Moving Average Models', Biometrika 60(2), 255-265.
[2] Box, G. E. P. and Jenkins, G. M. (1976), 'Time Series Analysis, Forecasting and
Control', San Francisco: Holden Day.
[3] Bollerslev, T. P. (1986), 'Generalised Autoregressive Conditional Heteroskedasticity
with Applications in Finance', Ph.D University of San Diego.
[4] Breslow, N. E. (1984), 'Extra-Poisson Variation in Log-linear Models', Applied
Statistics, Royal Statistical Society, 33(1), 38-44.
[5] Brockwell, P. and Davis, R. (1987), Time Series, Theory and Methods. Springer Verlag.
[6] Burguette, J., Gallant, R. and Souza, G. (1982), 'On Unification of the Asymptotic
Theory of non Linear Econometric Models', Econometric Reviews 1(1), 151-190.
[7] Cox, D. R. and Lewis, P. A. (1966), 'The Statistical Analysis of Series of Events',
London: Methuen and Co Ltd.; New York: J. Wiley and Sons Inc.
[8] Cramer, H. (1961), 'On the Structure of Purely non-deterministic Stochastic Processes',
Arkiv for matematik (4), 249-266.
[9] Engle, R. F. (1982), 'Autoregressive Conditional Heteroskedasticity with Estimates of
the Variance of U.K. Inflation', Econometrica (50), 987-1008.
[10] Frome, E. L., Kutner, M. H. and Beauchamp, J. (1973), 'Regression analysis of Poisson-
distributed Data', Journal of the American Statistical Association 68(344), 935-940.
[11] Gardner, N. A. and Franks, L. E. (1975), Characterisation of Cyclostationary Random
Signal Processes, I.E.E.E. Transactions on Information Theory, IT 21.
[12] Gourieroux, C., Monfort, A. and Trognon, A. (1983), Estimation par la methode du
pseudo-maximum de vraisemblance, Cahier du seminaire d'econometrie, (25).
[13] Gourieroux, C., Monfort, A. and Trognon, A. (1984), 'Pseudo Maximum Likelihood
Methods: Application to Poisson Models', Econometrica (52), 701-720.
[14] Gourieroux, C., Monfort, A. and Trognon, A. (1984), 'Pseudo Maximum Likelihood
Methods', Econometrica (52), 681-700.
[15] Gourieroux, C. and Monfort, A. (1983), 'Cours de Series Temporelles', Economica ed.
[16] Granger, C. W. and Newbold, P. (1977), Forecasting Economic Time Series, Academic
Press ed. New York.
[17] Hannan, E. J. (1970), Multiple Time Series. New York, Wiley.
[18] Hassairi, A. (1979), Contribution a l'etude des processus ARMA a coefficients
dependant du temps [These de specialite en mathematiques appliquees]. Universite
Paul-Sabatier de Toulouse.
[19] Hausman, J., Hall, B. H. and Griliches, Z. (1984), 'Econometric Models for Count Data
with an Application to the Patents-R&D Relationship', Econometrica (52), 909-938.
[20] Hillmer, S. C. and Tiao, G. C. (1979), 'Likelihood Function of Stationary Multiple
Autoregressive Moving Average Models', Journal of the American Statistical Association
74(367), 652-660.
[21] Indjehagopian, J. P. and Cordier, J. (1986), 'Multidimensional Analysis of a Commodity
Price System', International Journal of Forecasting 2, 153-189.
[22] Jarret, R. (1979), 'A note on the Interval between Coal-mining Disasters', Biometrika
[23] Johnson, N. and Kotz, S. (1969), Discrete Distributions. Wiley ed.
[24] Kalman, R. E. (1960), 'A New Approach to Linear Filtering and Prediction Problems',
Journal of Basic Engineering 35-45.
[25] Lai Tong, H. W. (1984), 'Les tests de causalite en economie', [These de doctorat de
troisieme cycle, Universite d'Aix Marseille II]. Economie mathematique et econometrie.
[26] Mainhold, P. and Singpurwala, N. D. (1983), 'Understanding the Kalman Filter',
American Statistician 32, 127-133.
[27] Minoux, M. (1983), 'Programmation mathematique; theorie et algorithmes', Tome 1:
[28] Monfort, A. (1982), 'Cours de statistique mathematique', Economica.
[29] Papoulis, A. (1986), Probability, Random Variables and Stochastic Processes,
McGraw-Hill ed.: Second edition.
[30] Pierce, D. (1971), Least Squares Estimation in the Regression Model with
Autoregressive-Moving Average Errors, Biometrika 58(2), 299-312.
[31] Priestley, (1984), 'Spectral Analysis and Time Series', Vol. 2, Academic Press.
[32] Serfling, R. (1980), Approximation Theorems of Mathematical Statistics, John Wiley.
[33] Tiao, G. C. and Box, G. E. P. (1981), 'Modelling Multiple Time Series with Applications',
Journal of the American Statistical Association 76(376), 802-816.
[34] Trognon, A. 'Generalisation des tests asymptotiques au cas ou le modele est
incompletement specifie', Cahier du seminaire d'econometrie 26, 94-109.
[35] Weber, D. C. (1971), 'Accidental Rate Potential: An Application of Multiple Regression
Analysis of a Poisson Process', Journal of the American Statistical Association 66(334),
[36] White, H. (1982), 'Maximum Likelihood Estimation of Misspecified Models',
Econometrica 50(1), 1-25.
The construction of a model for medical cost and
labour (MEDIKA)


Department of Justice, PO Box 20301,
2500 EH, 's Gravenhage, The Netherlands

1. Introduction

In this paper, estimation results are presented for equations explaining
capacity, consumption and employment in a major part of the somatic health
care sector in the Netherlands. These equations help to understand the
functioning of the Dutch health care sector, and they contribute to more reliable
estimates of future cost and employment.
A comprehensive calculating scheme of the health care sector has been
developed and operationalised at the Central Planning Bureau. With this
scheme, the cost, employment and financing of each subsector of the Dutch
health care sector are estimated. The inputs of the scheme, however, are not
yet explicitly related to one another. The equations described in
Sections 2, 3 and 4 relate some essential inputs of the scheme.
Combining the calculation scheme with these equations gives a starting point for
a structural model for medical cost and labour (MEDIKA).
The results in this paper are preliminary and do not describe the model
MEDIKA completely. A final report about MEDIKA has been published in
Dutch (note 2). Unfortunately, this report has not been translated into English, and no
translation is expected in the near future. In the complete version of MEDIKA, the
model has been extended with equations for the paramedical staff and for medical
and other supplies in hospitals and nursing-homes. In the same way as for
nursing-homes, submodels have been established for mental hospitals and
institutions for the mentally handicapped. Finally, equations have been added for the
labour supply of physicians, dentists, nurses and physiotherapists. The
first-mentioned extension will influence the results of the
simulations in Section 5 of the present paper.
For a clear picture of the part of MEDIKA that will be discussed in this
paper, a division of the Dutch health care sector into four levels of medical
care (echelons) is useful. Patients usually enter the health care sector by
visiting the general practitioner, who provides primary (first level) care and
who decides whether the patient needs specialist care. The specialist in
turn decides whether the patient has to be treated in a policlinic (second
G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 159-185.

© 1991 Kluwer Academic Publishers.
160 R. J. A. M. van den Broek

level) or admitted into a hospital (third level). For long-term nursing a patient
will be admitted into a nursing-home (fourth level). This division into four
levels of medical care recurs in MEDIKA. However, no distinction could be
made between out-patient and in-patient treatments by specialists.
Four types of variables are distinguished in MEDIKA: capacity, consumption,
employment and cost variables. This division is shown in Figure 1.1.
1. capacity. A measure for capacity is defined for each of the four levels of
medical care in Section 2. The capacity of the third level of care, i.e. the
supply of beds in hospitals, is supposed to be endogenous.
2. consumption. Equations for the flows of patients between the different
levels of health care provision and for the consumption within these levels
are estimated in Section 3. The explanatory variables represent the
demand for and supply of health care facilities and the substitution
between these facilities.
3. employment. Equations explaining the employment of some important
categories of staff in hospitals and nursing-homes are estimated in Section
4. The changes in the number of staff are assumed to depend, among
other things, on capacity and consumption.
4. cost and financing. In this paper we will not describe the comprehensive
system of equations explaining the cost. Input variables are population,
numbers of insured, capacity, consumption, investments and employment
in the facilities concerned. Prices for each subsector are derived from
macro quantities.

[Figure 1.1. Structure of the somatic health care according to MEDIKA (except the part about
cost and financing). Columns: employment, capacity, consumption. Rows: primary care (1st level),
specialist care (2nd-3rd level), 3rd level, 4th level. The legend distinguishes exogenous from
endogenous variables, and a direction of dependency with or without a time-lag.]

We conclude this section with a short description of the insurance
schemes in the Netherlands. Almost all Dutch families with an annual income
below a certain level are publicly insured. The remaining 30% of the population
is privately insured. The publicly insured are completely covered for all
medical expenditures. In recent years the privately insured have been able to take out
insurance with deductibles for primary and specialist care. We therefore do not expect any
price elasticity for consumption by the publicly insured or for high-level consumption
by the privately insured. The way in which facilities are financed and doctors
are paid also differs between the two systems. In the public sector, general
practitioners and a part of the out-patient specialist care are paid on a restricted
capitation basis. In the private sector, doctors are paid on a fee-for-service
basis. Hospital expenditures are mainly reimbursed on the basis of the number of
patient days.
Medical expenditure in nursing-homes and in hospitals (if the
length of stay exceeds one year) has been completely covered since 1968 by way of
a collective levy (AWBZ premium) (note 3).

2. Capacity

1st-2nd level

The previous section described four levels of the somatic health care system.
The capacity in the first and the second level is defined, respectively,
as the "number" (note 4) of general practitioners per 1000 population (GPP) (note 5) and
the number of specialists per 1000 population (SPP). Both variables are
exogenous. How many specialists work in policlinics is not known, so the
variable SPP is also a measure for the capacity of the third level.

3rd level

The number of beds in hospitals per 1000 population (BEDP) is used,
besides the density of specialists, as a measure for the capacity of the third
level. The density of beds in hospitals increased to 0.56 beds per 100
population in 1973, after which a decrease occurred (Diagram 5.3). Quality
aspects, like the number of functions in hospitals, are not taken into account
by this variable. Although the government in principle controls the number of beds in
hospitals, other factors, especially consumption in the past,
appear to determine the development of the number of beds (Van den Broek
[3]). Therefore we specify the equation for the number of beds in hospitals
(BED) in the form of a stock-adjustment model:

$\frac{BED}{BED_{-1}} = OC90_{-2\frac{1}{2}}^{a_0} \cdot OC90_{-6}^{a_1} \cdot \left[\frac{BEDXP_{-2}}{BEDP_{-2}}\right]^{a_2}$      (2.1)

On the one hand, the number of beds depends with a time-lag on the
occupancy ratio (OC) in the past. We distinguish a short-term effect
($OC90_{-2\frac{1}{2}}$) and a long-term effect ($OC90_{-6}$). The following definitions hold:
$OC90_{-2\frac{1}{2}} = \frac{10}{9} \cdot \frac{1}{2}(OC_{-2} + OC_{-3})$      (2.2.a)
$OC90_{-6} = \frac{10}{9} \cdot \frac{1}{5}(OC_{-4} + OC_{-5} + OC_{-6} + OC_{-7} + OC_{-8}).$      (2.2.b)
Both variables are defined in such a way that they have no influence
on the number of beds when 90% of the beds are occupied.
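
The definitions (2.2.a) and (2.2.b) can be written out as a short Python sketch. The function names and the convention that `oc` is a sequence of yearly occupancy ratios indexed by year position are assumptions made for this illustration. The factor 10/9 normalises both indices to 1 at an occupancy of 90%, so that they drop out of a multiplicative equation at that level.

```python
def oc90_short(oc, t):
    """Short-term index (lag 2 1/2): 10/9 times the mean occupancy of years t-2 and t-3."""
    return (10.0 / 9.0) * 0.5 * (oc[t - 2] + oc[t - 3])

def oc90_long(oc, t):
    """Long-term index (lag 6): 10/9 times the mean occupancy of years t-4 to t-8."""
    return (10.0 / 9.0) * sum(oc[t - k] for k in range(4, 9)) / 5.0
```

The caller has to make sure that `t` is at least 8, so that all lagged positions exist in `oc`.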
On the other hand, the relative change in the number of beds depends with a
time-lag on a standard for the density of beds (BEDXP), in proportion to the
density of beds (BEDP). Two exogenous variables determine BEDXP:
The realization of a certain density of beds will depend on the
economic possibilities, represented by the Gross National Product per capita
(GNPP). After the establishment of the AWBZ (note 6) in 1968, a bed in a nursing-home
(BEDN) might form a substitute for a bed in a hospital. We take this into
account with the index IBEDNA. Substituting (2.3) in (2.1) results in:

$\frac{BED}{BED_{-1}} = OC90_{-2\frac{1}{2}}^{a_0} \cdot OC90_{-6}^{a_1} \cdot \left[\frac{a_5 \cdot GNPP_{-2\frac{1}{2}}^{a_3} \cdot IBEDNA_{-2}^{a_4}}{BEDP_{-2}}\right]^{a_2}$      (2.4)


The estimation results of this equation are given in Table 2.1.

The significant positive values of the coefficients $a_0$ and $a_1$ indicate that
the development of the number of beds in hospitals is highly dependent on
consumption in the past. This is not surprising: the consumption in a region
is an important indicator for the planning of beds in hospitals by the Department
of Welfare, Public Health and Culture and the Board for Hospital
Provision. The increase of the Gross National Product has, as was to be
expected, a positive influence on the realised capacity of beds. The coefficient
of the variable IBEDNA is not significantly different from zero.

Table 2.1. Estimation results of Equation (2.4): the relative change
in the number of beds in hospitals (BED/BED_{-1})

Variable               Coefficient   Value    t-value

OC90_{-2 1/2}          a0             0.426   2.4
OC90_{-6}              a1             0.392   2.6
BEDXP_{-2}/BEDP_{-2}   a2             0.645   5.1
GNPP_{-2 1/2}          a3             0.169   3.9
IBEDNA_{-2}            a4            -0.065   0.9
Constant               a5             1.257   1.1

R-bar^2 (a)                           0.8112
N (b)                                 25

a Corrected multiple correlation coefficient.
b Number of observations (= years).

4th level

The capacity in nursing-homes is defined as the number of beds in nursing-homes
per aged person (BEDNA). The variable BEDNA is supposed to be
3. Consumption

1st level

There are no time-series available on the consumption of primary care in
the Netherlands. The development of the consumption is assumed to be
identical with the development of the number of general practitioners.

2nd level

The flow of patients from the first to the second level is approximated by the
number of referrals of publicly insured (note 7). The referral rate of publicly insured
(REFPU) increased over the period 1953-1979 (Diagram 5.1).
Several cross-section analyses have been performed in the Netherlands to
quantify the influence of exogenous variables on the referral rate. The
following influences, relevant for a dynamic analysis, are found (Hoeksma
[15], pp. 30-63):
- a positive influence of the density of specialists;
- different results about the influence of the density of general practitioners;
- a small positive influence of the percentage of aged persons.

The influence of the ageing population is small (note 8). We use as explanatory
variables in the equation of the referral rate only the density of general
practitioners (GPP) and the density of specialists for internal diseases and
surgeons (note 9) (SPISP). The variable GPP is not sufficient as a measure for the
workload of general practitioners. The change in morbidity of the population,
represented by the average number of cases of illness per male worker
(ILLP), will influence the frequency of patient contacts with the general
practitioner. On the basis of former estimation results, we assume that the
elasticities of the variables ILLP and GPP are equal but opposite (Van den
Broek [5], pp. 11-12). In formula:

$REFPU = b_3 \cdot \left[\frac{GPP}{ILLP}\right]^{b_0} \cdot SPISP^{E(SPISP)}.$      (3.1)

Nieuwenhuis ([24], p. 8) thinks it plausible that during the last decades
the number of specialists has increased to such a degree that the demand for
specialist care (arising from the first level) has followed the supply less
closely. The hypothesis of a decreasing function E(SPISP) has been analysed
in Van den Broek [5], where the following relation was chosen:

$E(SPISP) = \frac{b_1}{SPISP - b_2}.$      (3.2.a)

The elasticity of the referral rate with respect to the density of
specialists, e(SPISP), can be derived:

$e(SPISP) = E(SPISP)\left[1 - \frac{SPISP \cdot \log(SPISP)}{SPISP - b_2}\right].$      (3.2.b)

The function E(SPISP) tends to infinity as SPISP approaches $b_2$ and is therefore
unsatisfactory for low values of SPISP. However, the tail of the function is the
relevant part for predictions, because of the rapidly increasing number of
specialists. The equation is estimated by OLS with a fixed value for $b_2$. The
coefficient $b_2$ is optimised on the basis of the correlation coefficient of the
equation under the restriction $e(SPISP) \leq 1$ (Table 3.1).
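
The step from (3.2.a) to (3.2.b) can be checked numerically. The Python sketch below evaluates the closed-form elasticity and compares it with a finite-difference derivative of $\log(SPISP^{E(SPISP)})$ with respect to $\log(SPISP)$. The function names are hypothetical, and the densities passed to them in any example are illustrative values, not data from the paper; only $b_1 = 0.030$ and $b_2 = 0.070$ are taken from Table 3.1.

```python
import math

def exponent_E(s, b1, b2):
    """E(SPISP) = b1 / (SPISP - b2), equation (3.2.a)."""
    return b1 / (s - b2)

def elasticity(s, b1, b2):
    """e(SPISP) = E(s) * (1 - s*log(s) / (s - b2)), equation (3.2.b)."""
    return exponent_E(s, b1, b2) * (1.0 - s * math.log(s) / (s - b2))

def elasticity_numeric(s, b1, b2, step=1e-6):
    """Central finite difference of log(s**E(s)) with respect to log(s)."""
    def log_term(x):
        return exponent_E(x, b1, b2) * math.log(x)   # log of SPISP**E(SPISP)
    up, down = s * (1 + step), s * (1 - step)
    return (log_term(up) - log_term(down)) / (math.log(up) - math.log(down))
```

With $b_1 = 0.030$ and $b_2 = 0.070$, an illustrative density of 0.35 gives an elasticity of about 0.25, and lower densities give higher elasticities, in line with the decreasing function assumed in the text.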
The elasticity between the density of general practitioners and the referral
rate is -0.39. Van der Gaag [14], p. 70, finds a lower value, -0.67, for the
years 1960-1972. Although one has to be careful in comparing results of
cross-section analyses with dynamic studies, it is remarkable that cross-section
analyses find only small negative or positive elasticities (Hooijmans
[16], Rutten [26], Posthuma and Van der Zee [25]). The elasticity
e(SPISP) has decreased from 1.0 in 1954 to 0.25 in 1979. Van der Gaag
[14], p. 70, finds for the years 1960-1972 an elasticity of 0.26, lower than the
result in Table 3.1. Hooijmans [16], p. 9, finds for 1974, 1975 and 1976
respectively 0.44, 0.39 and 0.35. In this last study one can perceive a
decreasing trend in the elasticity. The results of Hooijmans are comparable
to ours.

Table 3.1. Estimation results of Equation (3.1): the number of
referrals per 100 publicly insured (REFPU).

Variable    Coefficient   Value     t-value

GPP/ILLP    b0            -0.338     3.5
SPISP       b1             0.030     9.3
            b2             0.070
Constant    b3            27.553    21.1

R-bar^2                    0.9752
N                         26

2nd-3rd level

In this subsection we estimate an equation for the number of operations by
specialists per publicly insured person. The data do not allow a distinction
between out-patient and in-patient operations (for this, see the
cross-section analysis of Hooijmans [17]).
As Feldstein [10], pp. 9-11, describes, the specialist holds a particular
position in the market for specialist care. On the one hand he offers his
services to the patient. On the other hand he decides, as agent of the
patient, to what extent the patient needs specialist care. A specialist in
the Netherlands is paid per operation (note 10) and therefore has a financial
incentive to operate frequently. Insurance coverage is complete for the
publicly insured, so there are no financial limits as far as the patient is
concerned. This explains why there are no signs of saturation of the market
for specialist care, in spite of the rapid increase in the number of
operations in the past (Diagram 5.2). Considering the preceding remarks, one
may expect the number of operations (note 11) per 100 publicly insured (OPPU)
to be highly dependent on capacity: the density of specialists for internal
diseases and of surgeons, and the use of materials. This last factor is
approximated by an index of the quantity of medical supplies in hospitals per
capita (IMSP). An attempt to insert the referral rate in the equation did not
yield significant results. We omit the number of aged persons per capita as an
explanatory variable, because the difference between the number of operations
on aged persons and on persons below 65 is relatively small. The following
equation can then be specified.

We include the dummy D59 in Equation (3.3) to account for an important
change in the definition of the number of operations.
The high elasticity of 1.0 in Table 3.2 between the number of operations and
the number of specialists confirms the expectations. The quantity of medical
supplies has a moderate positive impact on the number of operations.

Table 3.2. Estimation results of Equation (3.3): the number of operations per
100 publicly insured (OPPU).

Variable      Coefficient     Value    t-value
SPISP         b4              1.017      4.1
IMSP          b5              0.286      3.9
D59           b6              0.065      5.0
Constant      b7             44.403     10.0

R²   0.9892
N    26

3rd level

Clinical consumption in hospitals is represented by the number of patients
admitted in a year (ADM) and the average length of stay (LS). The admission
rate in hospitals (ADMP) increased over the period 1953-1979. The average
length of stay remained constant at a level of 20 days until 1967, after which
it decreased to 14 days in 1979. As a result the number of occupied beds per
100 population rose to 0.5 in 1969, followed by a gradual decrease
(Diagram 5.4).
In some studies the number of patients admitted is regarded as a measure of
the output of a hospital. The equation for the number of admissions can then
be specified as a production function (Feldstein [9], Chapter 4; Van
Montfort [23]). In this study we choose a mixed specification, with
explanatory variables representing demand as well as supply. The demand for
hospital care depends on variables indicating the morbidity of the population
and on variables accounting for changes in the insurance system. The
quantities that represent the supply of medical care can be divided into
variables that measure the capacity of hospitals (3rd level) and variables
indicating the supply of facilities that may be substitutes for hospitals.
Mixed specifications of the equations for the admission rate and the average
length of stay are used in other countries (Feldstein [9], Davis and
Russell [7], etc.) as well as in the Netherlands (Posthuma and Van der Zee
[25], Rutten [26], Hooijmans [16], Van der Gaag [14]). The Dutch studies
refer only to the public sector. Nevertheless they are useful as a guideline
for the specification of the equations.

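$$\mathrm{ADMP} = b_{12}\, E_1(\mathrm{AGEP}) \left[\frac{365 \cdot \mathrm{BEDP}}{\mathrm{LS}}\right]^{b_8} \left[\frac{\mathrm{SP}}{\mathrm{BED}}\right]^{b_9} \cdots \qquad (3.4)$$

$$\mathrm{LS} = b_{18}\, E_2(\mathrm{AGEP}) \left[\frac{365 \cdot \mathrm{BED}}{\mathrm{ADM}}\right]^{b_{13}} \left[\frac{\mathrm{SP}}{\mathrm{BED}}\right]^{b_{14}} \cdots \qquad (3.5)$$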

An important development that has changed morbidity in the past is the
ageing of the population. To account for it we introduce in the model the
number of aged persons per capita (AGEP). It is assumed that a change in
AGEP, ceteris paribus, influences neither the admission rate of aged persons
nor the admission rate of persons under 65. The same is assumed for the
average length of stay of aged persons and of persons under 65. Given that an
aged person is admitted to hospital $b_{19}$ times as often and stays
$b_{20}$ times as long as a person under 65, the following relations hold for
the functions $E_i(\mathrm{AGEP})$ under these assumptions (see Appendix III).

$$E_1(\mathrm{AGEP}) = 1 + (b_{19} - 1)\, \mathrm{AGEP} \qquad (3.6.a)$$

$$E_2(\mathrm{AGEP}) = \frac{1 + (b_{19} b_{20} - 1)\, \mathrm{AGEP}}{E_1(\mathrm{AGEP})} \qquad (3.6.b)$$

Based on data on the consumption of aged persons and of persons under 65, we
choose the fixed values $b_{19} = 1.8$ and $b_{20} = 2.0$.
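
As an illustration, the age-correction factors of Equations (3.6.a) and
(3.6.b) can be evaluated numerically. The sketch below is not part of MEDIKA.
The shares of aged persons used in the loop (0.08 and 0.11) are hypothetical
values chosen for illustration only, and the helper names are ours. The
function e1_elasticity gives the elasticity of E1 with respect to AGEP, which
under a multiplicative specification is the elasticity of the admission rate
with respect to the share of aged persons.

```python
# Age-correction factors of Equations (3.6.a) and (3.6.b).
B19 = 1.8  # relative admission frequency of an aged person (fixed in the text)
B20 = 2.0  # relative length of stay of an aged person (fixed in the text)


def e1(agep, b19=B19):
    """E1(AGEP) = 1 + (b19 - 1) * AGEP."""
    return 1.0 + (b19 - 1.0) * agep


def e2(agep, b19=B19, b20=B20):
    """E2(AGEP) = (1 + (b19 * b20 - 1) * AGEP) / E1(AGEP)."""
    return (1.0 + (b19 * b20 - 1.0) * agep) / e1(agep, b19)


def e1_elasticity(agep, b19=B19):
    """Elasticity of E1 with respect to AGEP: d ln E1 / d ln AGEP."""
    return (b19 - 1.0) * agep / e1(agep, b19)


# Hypothetical shares of aged persons in the population (assumption).
for share in (0.08, 0.11):
    print(share, round(e1(share), 4), round(e2(share), 4),
          round(e1_elasticity(share), 3))
```

With shares of this order the elasticity of E1 lies between 0.06 and 0.08,
the same order of magnitude as the elasticities reported with Table 3.3.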

Insurance system
Costs for clinical care in hospitals are covered almost completely by
insurance. Therefore prices hardly influence the consumption of clinical care.
Only the maximum length of stay covered by the public insurance companies was
increased twice: from 42 to 70 days in 1955 and from 70 to 365 days in 1964.
We account for these changes in the insurance system with two dummies, DPU1
and DPU2.

The capacity of hospitals to admit patients is determined by the available bed
density given the length of stay (BEDP/LS). The average length of stay is
limited by the number of beds per admitted patient (BED/ADM). The variable
specialists per bed in hospitals (SP/BED) is a measure of the clinical as well
as the policlinical capacity. It is therefore difficult to predict the
influence of the number of specialists per bed on the admission rate. An
increasing number of specialists per bed may be expected to have a negative
influence on the average length of stay.
The general practitioner decides how many patients will be referred to a
higher level of care. In principle a patient can be admitted into a hospital
only after a referral by the general practitioner. Therefore we add the
referral rate (REFPU) to the equation for the admission rate (note 12). An
increase in the supply of beds in nursing-homes may result in a shift of a
relatively small number of patients with a long average length of stay from
the hospital to the nursing-home. This has been taken into account by
introducing the variable IBEDNA in the equation for the average length of
stay.
We now discuss the estimation results of Equations (3.4) and (3.5) in Tables
3.3 and 3.4. The equations are estimated with two-stage least squares.
It can be deduced that the increase in the percentage of aged persons raises
the elasticity of the admission rate with respect to the percentage of aged
persons from 0.06 in 1953 to 0.08 in 1979. The elasticity of the capacity to
admit (0.17) is lower than the value of 0.39 that Van der Gaag [14], p. 92,
finds for the publicly insured. Even higher elasticities are found in
cross-section analyses of the public sector (Rutten [26], p. 95; Hooijmans
[16], p. 22). An explanation is given by Hooijmans and Rutten [18], pp. 12-17.
They find that the clinical consumption of the privately insured in hospitals
is less dependent on capacity than the consumption of the publicly insured,
perhaps because the privately insured in general value their time more than
the publicly insured. An increase in the number of specialists per bed causes
a small increase in the admission rate, ceteris paribus. This may seem at odds
with the insignificant or even negative coefficient found for the public
sector. Hooijmans and Rutten ([18], pp. 13-14), however, do find a positive
value for the private sector. They give the following explanation: "Here we
may be confronted with an effect caused by differences in fees, that
specialists acquire by treating publicly or privately insured patients. The
fees in the private sector are considerable higher than those in the public
sector. In a region with a high number of specialists per bed, other things
being equal, the specialists will have to treat more privately insured
patients in order to reach the same level of income as in other regions." The
positive elasticity of 0.46 between the referral rate and the admission rate
is confirmed by the result of 0.40 of Van der Gaag [14], p. 92. Cross-section
analyses produce lower elasticities.

Table 3.3. Estimation results of Equation (3.4): the admission rate in
hospitals (ADMP).

Variable          Coefficient     Value    t-value
AGEP              b19             1.8
365 · BEDP/LS     b8              0.171      3.2
SP/BED            b9              0.051      2.0
REFPU             b10             0.464     16.8
DPU1              b11            -0.012      4.0
Constant          b12             1.289      3.7

R²   0.9981
N    26
According to Table 3.4 the length of stay in hospitals depends more on
capacity than the admission rate does. The elasticity of 0.69 is higher than
the value of 0.37 found by Van der Gaag [14], p. 92, for the publicly insured.
Also for the publicly insured, Hooijmans [16], p. 22, finds an elasticity
between 0.5 and 0.6 for the years 1974, 1975 and 1976. A priori we expected a
slightly lower elasticity in our analysis, because Equation (3.5) is estimated
for publicly and privately insured together. The coefficient $b_{14}$ has the
expected negative sign, but it is not significantly different from zero.
According to Table 3.4 there has been substitution between hospitals and
nursing-homes in the seventies. Rutten [26], p. 96, draws the following
conclusion from his cross-section analyses for the years 1971 and 1973: "In
contrast with the results for 1971 we find for 1973 a significantly negative
impact of the number of beds in nursing-homes. This may be related to the fact
that development towards stimulating substitution between hospital care and
nursing-home care started in the late sixties, so realization of substitution
of both forms of care could not have occurred until 1973".
A quantity that has not been included as an explanatory variable in
Equations (3.4) and (3.5) is the number of nurses employed by hospitals. In
some of the studies mentioned earlier a significantly positive coefficient has
been found between this variable and the admission rate (note 13). These
studies assume that the number of beds, given the length of stay, will be used
more intensely when more nursing staff is available (Van der Gaag [14],
p. 93). We think it plausible that the relation holds the other way around: a
relatively high number of admissions will be the cause of a high demand for
nurses (note 14). This reasoning is elaborated in Section 4.

Table 3.4. Estimation results of Equation (3.5): the average length of stay
in hospitals (LS).

Variable          Coefficient     Value    t-value
AGEP              b20             2.0
BED/ADM           b13             0.691      4.3
SP/BED            b14            -0.151      1.7
IBEDNA            b15            -0.266      5.3
DPU1_{-1}         b16             0.005      0.8
DPU2_{-1}         b17             0.012      3.2
Constant          b18             1.847      1.3

R²   0.9947
N    25
ρ*   0.57

* Correction for autocorrelation. Instead of
$y_t = a + \sum_i \beta_i x_{it} + u_t$, the equation
$(y_t - \rho y_{t-1}) = a(1 - \rho) + \sum_i \beta_i (x_{it} - \rho x_{i,t-1}) + (u_t - \rho u_{t-1})$
is estimated.
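
The correction for autocorrelation mentioned in the note to Table 3.4 amounts
to regressing quasi-differenced variables on each other. The following sketch
illustrates the transformation for a single regressor. It is an illustration
under our own assumptions, not the estimation procedure actually used: the
data are made up and noise-free, the helper names are ours, and the
autocorrelation coefficient is simply set to the value 0.57 reported in the
table. The transformation costs the first observation, which presumably
explains why N is 25 in Table 3.4 against 26 in Table 3.3.

```python
def quasi_difference(series, rho):
    """Return [z_t - rho * z_(t-1)] for t = 1..n-1; the first observation is lost."""
    return [cur - rho * prev for prev, cur in zip(series, series[1:])]


def ols_simple(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


RHO = 0.57           # autocorrelation coefficient reported in Table 3.4
A, BETA = 2.0, 0.5   # hypothetical true intercept and slope (assumption)
x = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]   # made-up regressor
y = [A + BETA * xi for xi in x]        # noise-free, so the fit is exact

# Regress (y_t - rho*y_(t-1)) on (x_t - rho*x_(t-1)).
intercept_star, slope_star = ols_simple(quasi_difference(x, RHO),
                                        quasi_difference(y, RHO))
intercept = intercept_star / (1.0 - RHO)  # undo the a(1 - rho) scaling
print(round(intercept, 6), round(slope_star, 6))
```

The slope of the transformed regression estimates the original coefficient
directly, while the intercept has to be divided by (1 - rho).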
The number of occupied beds per 100 population in hospitals is defined
by the admission rate and the average length of stay.

$$\mathrm{OBEDP} = \mathrm{ADMP} \cdot \mathrm{LS} / 365 \qquad (3.7)$$

The elasticities of the density of occupied beds in hospitals with respect to
the explanatory variables of Equations (3.4) and (3.5) can be calculated
easily. The most interesting result is the high elasticity of 0.71 between the
number of beds and the number of occupied beds. Feldstein [11], pp. 85-87,
states that the relation need not be a causal one. For instance, planned
capacity may be adjusted to the expected needs for the future. In MEDIKA the
supply of hospitals is endogenous: it is influenced by consumption in the
past (see Section 2). It is also possible that the physician has a built-in
preference for clinical treatment, which makes him dependent on the available
capacity (Fuchs [12], p. 96). In the Netherlands this preference would be
reinforced by the existing financing system, under which hospitals have to
maintain a certain level of bed occupancy to keep their finances balanced.
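
The elasticity of 0.71 between beds and occupied beds can be retraced from
the coefficients in Tables 3.3 and 3.4. The sketch below rests on assumptions
of ours: the equations are taken to be multiplicative (log-linear) in the
capacity terms, and population, SP/BED, REFPU and all other explanatory
variables are held constant. Writing a and l for the relative changes of the
admission rate and the length of stay after a relative change b in the number
of beds, the two equations give a = b8 (b - l) and l = b13 (b - a), and
Equation (3.7) gives the change in occupied beds as a + l.

```python
B8 = 0.171   # elasticity of ADMP w.r.t. 365 * BEDP / LS (Table 3.3)
B13 = 0.691  # elasticity of LS w.r.t. BED / ADM (Table 3.4)


def bed_elasticities(b8=B8, b13=B13):
    """Solve a = b8 * (1 - l) and l = b13 * (1 - a) for a unit log-change in beds.

    Returns the elasticities of the admission rate, of the length of stay and
    of occupied beds (their sum, by OBEDP = ADMP * LS / 365).
    """
    det = 1.0 - b8 * b13
    adm = b8 * (1.0 - b13) / det
    ls = b13 * (1.0 - b8) / det
    return adm, ls, adm + ls


adm_el, ls_el, obed_el = bed_elasticities()
print(round(adm_el, 3), round(ls_el, 3), round(obed_el, 3))
```

Under these assumptions most of the effect of extra beds runs through a
longer average stay rather than through additional admissions.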
4th level

The clinical consumption in nursing-homes is expressed by the variable
number of occupied beds per aged person (OBEDNA). The occupancy ratio in
nursing-homes remained high throughout the period 1972-1980 (95-97%),
presumably because the demand for nursing-home care exceeded supply.
According to the estimation results the number of occupied beds in
nursing-homes depends, with a time-lag, completely on capacity.

4. Employment

In this section we did not succeed in estimating the effects of a reduction
of working hours on the number of staff. We confined ourselves to correcting
all variables representing employment for differences in working hours.

2nd-3rd level

Policlinical staff and clinical staff cannot be distinguished. We divide the
staff in hospitals into three categories: (para)medical staff, nursing staff
and general staff.

A. (Para)medical staff
There is a lack of information about the development of medical treatment.
The rapid increase in the number of operations by specialists gives an
impression (see Diagram 5.2). It is clear that the increase of paramedical
staff (PMS) and medical staff (note 15) in hospitals (6%) over the years
1962-1979 has caused an intensification of treatment. This has had a
detectable influence on the development of the number of patient days in
hospitals (note 16). The increase of the paramedical staff was possible
because of the considerable inflow from the training courses. The variable is
assumed to be exogenous in this incomplete version of MEDIKA.

B. Nursing staff
We distinguish three types of nursing staff: qualified nursing staff (QNS),
student nursing staff (SNS) and a small remaining category, other nursing
staff (ONS). The last category is kept exogenous. Diagrams 5.5 and 5.6 show
that the number of in-service student nurses decreased in the seventies. The
training of nurses outside the hospital was started at the same time. The
work left undone by the departure of in-service student nurses had to be
carried out by qualified nurses.

B.l. Qualified nursing staff

The equation for the qualified nursing staff is formulated as a
stock-adjustment model. It is assumed that a change in the number of employed
qualified nurses depends on both the supply of nurses (SQNS) and the demand
for nurses (DQNS), each compared with the number of qualified nurses in the
previous year ($\mathrm{QNS}_{-1}$).

$$\mathrm{QNS} - \mathrm{QNS}_{-1} = c_0 (\mathrm{SQNS} - \mathrm{QNS}_{-1}) + c_1 (\mathrm{DQNS} - \mathrm{QNS}_{-1}) \qquad (4.1)$$

Labour supply is usually defined as the sum of the employed and the
recorded unemployed. The recorded unemployment of nurses was very low over
the period 1953-1979. To determine the real supply of qualified nurses,
however, it is also necessary to account for hidden unemployment. The number
of nurses who stop working is high. Some of them want to resume working after
a certain period, if the circumstances are favourable. As an imperfect
definition of the labour supply of nurses we use the sum of the number of
successful candidates in a year (CAND) (note 17), corrected for the average
working time of a full-time job (T), and a fraction $(1 - c_2)$ of the number
of qualified nurses in the previous year.

$$\mathrm{SQNS} = \mathrm{CAND} \cdot T + (1 - c_2)\, \mathrm{QNS}_{-1} \qquad (4.2)$$

Based on earlier research (Van den Broek [4], p. 42) we estimate the annual
share of nurses who stop working to be 20% ($c_2 = 0.2$). Because of the high
potential supply of nurses who do not work, one may not expect a strong
influence of SQNS on the development of the number of employed qualified
nurses.
The demand for qualified nurses is determined by the quantity of work
that has to be done. This quantity is assumed to be dependent on consumption:
the number of occupied beds (OBED) and admissions (ADM). Part of the work is
performed by student nurses (SNS) and other nurses (ONS), which makes
substitution (note 18) possible. This results in:

$$\mathrm{DQNS} = c_6 + c_3\, \mathrm{OBED}_{-1} + c_4\, \mathrm{ADM}_{-1} + c_5 (\mathrm{SNS}_{-1} + \mathrm{ONS}_{-1}) \qquad (4.3)$$

It should be noted that hospitals cannot attract unlimited numbers of
staff. This limitation might influence the coefficients of Equation (4.3).
Substitution of Equations (4.2) and (4.3) into Equation (4.1) gives:

$$\mathrm{QNS} - \mathrm{QNS}_{-1} = c_0 (\mathrm{CAND} \cdot T - c_2\, \mathrm{QNS}_{-1}) + c_1 \{ [c_6 + c_3\, \mathrm{OBED}_{-1} + c_4\, \mathrm{ADM}_{-1} + c_5 (\mathrm{SNS}_{-1} + \mathrm{ONS}_{-1})] - \mathrm{QNS}_{-1} \}$$

$$\text{with } c_0 + c_1 \sim 1. \qquad (4.4)$$
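
The stock-adjustment mechanism of Equations (4.1)-(4.4) can be sketched as a
one-year update of the number of qualified nurses. The coefficients are those
reported in Table 4.1. All input values in the example are hypothetical and
serve only to show the mechanics, since the units of the underlying series
are not given here. The function and argument names are ours, and cand_fte
stands for the product CAND · T.

```python
# Coefficients as reported in Table 4.1.
C0, C1, C2 = 0.476, 0.605, 0.2
C3, C4, C5, C6 = 3.550, 0.180, -0.678, -9340.1


def supply(cand_fte, qns_prev):
    """Equation (4.2): SQNS = CAND * T + (1 - c2) * QNS_{-1}."""
    return cand_fte + (1.0 - C2) * qns_prev


def demand(obed_prev, adm_prev, sns_prev, ons_prev):
    """Equation (4.3): DQNS as a linear function of lagged consumption and staff."""
    return C6 + C3 * obed_prev + C4 * adm_prev + C5 * (sns_prev + ons_prev)


def next_qns(qns_prev, cand_fte, obed_prev, adm_prev, sns_prev, ons_prev):
    """Equation (4.1): partial adjustment towards supply and towards demand."""
    sqns = supply(cand_fte, qns_prev)
    dqns = demand(obed_prev, adm_prev, sns_prev, ons_prev)
    return qns_prev + C0 * (sqns - qns_prev) + C1 * (dqns - qns_prev)


def next_qns_reduced(qns_prev, cand_fte, obed_prev, adm_prev, sns_prev, ons_prev):
    """Equation (4.4): the same update after substituting (4.2) and (4.3)."""
    dqns = demand(obed_prev, adm_prev, sns_prev, ons_prev)
    return qns_prev + C0 * (cand_fte - C2 * qns_prev) + C1 * (dqns - qns_prev)


# Hypothetical inputs (assumptions, not data from the study).
inputs = dict(qns_prev=20000.0, cand_fte=4000.0, obed_prev=6000.0,
              adm_prev=120000.0, sns_prev=15000.0, ons_prev=2000.0)
print(round(next_qns(**inputs), 1), round(next_qns_reduced(**inputs), 1))
```

Both functions return the same value, which confirms that (4.4) is the
structural form (4.1) with (4.2) and (4.3) substituted in.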
According to the estimation results in Table 4.1 the relation
$c_0 + c_1 \approx 1$ applies. This means that the number of qualified nurses
is adjusted to a combination of supply and demand. The demand for nurses
depends, as expected, positively on consumption (the variables OBED and ADM).
A positive coefficient is also found by Van Aert and Van Montfort [1], p. 17,
in a cross-section analysis. The coefficient $c_5$ indicates that, because of
their lower productivity, 10 student or other nurses can be replaced by 7
qualified nurses. Based on the number of working days and an assumption about
the productivity of students in each class, Legerman and Van der Hoef [21],
p. 10, estimate that the production of 10 student nurses equals the production
of 6 qualified nurses.

Table 4.1. Estimation results of Equation (4.4): the qualified nursing staff
in hospitals (QNS).

Variable                 Coefficient      Value    Elasticity*
SQNS                     c0               0.476       0.45
DQNS                     c1               0.605       0.62
                         c2               0.2
OBED_{-1}                c3               3.550       1.07
ADM_{-1}                 c4               0.180       1.12
SNS_{-1} + ONS_{-1}      c5              -0.678      -0.71
Constant                 c6           -9340.1

* Calculated at the mean values of the variables.

B.2. Student nursing staff

Most of the student nurses in hospitals are trained for a qualification as
general nurse in a course of 3.5 years. The equations of MEDIKA that explain
the flows of student nurses in hospitals are not described in this paper. It
suffices to note that the changes in the number of student nurses in
hospitals (SNS) and of successful candidates (CAND) can be explained by these
equations. The most important explanatory variable is the inflow of students
for general nursing into the first class (IS1).

C. General staff
The general staff can be divided into two categories: domestic staff (DS) and
administrative staff (AS). It is assumed, that the number of general staff
changes proportionally to the need for support because of other activities in
the hospital (medical treatment, nursing).

C.1. Domestic (and technical) staff

The increase of the domestic staff is about the same as the increase of the
number of beds, after a correction for the general reductions in working time
(Diagram 5.7). Domestic work is divided into a fixed quantity of work per
available bed and a variable quantity of work dependent on the number of
occupied beds.
According to Table 4.2 the major part of the domestic work depends on the
number of occupied beds.

Table 4.2. Estimation results of Equation (4.5): the domestic and technical
staff in hospitals (DS).

Variable      Coefficient     Value    t-value
BED           c7              0.359      3.8
OBED          c8              0.656      5.8
Constant      c9              2.589      2.2

R²   0.9624
N    19

C.2. Administrative staff

Diagram 5.8 shows that the administrative staff in hospitals has increased
rapidly. We expect that some activities have to be performed per bed, and
that a certain quantity of administrative work is necessary per admitted
patient. More intensive clinical treatment or an increase in policlinical
activities will also require more administration. Therefore we add the
paramedical staff (PMS) as an explanatory variable to the equation. The dummy
D72 is inserted because of a break in the data.
Table 4.3 shows that the increase of the administrative staff is caused
mainly by an increase in the number of admissions and an increase in the
paramedical staff. The influence of the number of beds is relatively small.

Table 4.3. Estimation results of Equation (4.6): the administrative staff in
hospitals (AS).

Variable      Coefficient     Value    t-value
BED           c10             0.279      3.9
ADM           c11             0.631      3.1
PMS           c12             0.627      7.5
D72           c13            -0.096     24.5
Constant      c14             0.002      3.8

R²   0.9989
N    19

4th level

The major part of the staff in nursing-homes consists of nurses and general
staff. In MEDIKA a set of equations similar to those for hospitals explains
the development of the qualified nursing staff (QNSN), the student nursing
staff (SNSN) and the general staff (DSN + ASN) (note 19) in nursing-homes.
The specification and the estimation results of these equations are not given
in this paper.

5. Simulations

In this section we try to gain some insight into the efficacy of MEDIKA.
First of all we have simulated the period 1953-1979. The most important
results are shown in Diagrams 5.1-5.8. The differences between realization
and simulation are considerable especially for the number of beds in
hospitals. This happens because differences between the simulation and
realization of the number of beds and of consumption in the past have an
important influence on the prediction of the number of beds (Table 2.1). The
error made in predicting the number of beds affects consumption directly
(Tables 3.3 and 3.4). The important turning-point in the development of the
number of beds in the seventies, however, is simulated correctly by MEDIKA.
The following four variants are simulated in order to get acquainted with the
model:
variant A1   SP(IS) +10%;
variant A2   GP +10%;
variant A3   (I)BEDN +10%;
variant A4   IS1 +10%.
In variant A1 we examine the consequences of a continued increase in the
number of specialists (note 20). It is assumed that the percentage of
specialists for internal diseases and surgeons remains constant. In variants
A2 and A3 the consequences of an expansion of the first and the fourth level
respectively are calculated. The last variant shows the consequences of an
increase in the inflow of students for general nursing for the total number
of nursing staff. The most important results are given in Table 5.1.
After a short period an increase of the number of specialists by 10%
causes an increase of the referral rate by 3.0% and of the number of
operations by 10.2%. The development of the referral rate and a direct
positive influence of the number of specialists result in a rise of the
admission rate in hospitals of 2.4%. Simultaneously the average length of stay
decreases by 3.0%. The qualified nursing staff and the administrative staff in
hospitals increase by 1.3% and 1.5% because of changes in consumption.
The domestic staff decreases by 0.5%.
The number of beds in hospitals will be 0.7% lower after twenty years
because of a drop in the occupancy ratio. This brings about a further
decrease of consumption.
In variant A2 it is assumed that the number of general practitioners
increases by 10%. This results in a drop of the referral rate by 3.6%,
followed by a drop of the admission rate in hospitals by 1.9%. Given a
constant number of beds in hospitals in the short run, this means that the
average length of stay increases by 1.4%. This increase will be smaller after
20 years (0.8%) as a consequence of the adjusted number of beds in hospitals.

Table 5.1. Consequences of the variants A1-A4 for the most important
endogenous variables of MEDIKA after 1 year (1Y) and after 20 years (20Y), in
percent.

                                A1            A2            A3            A4
                            SP(IS) +10%     GP +10%    (I)BEDN +10%    IS1 +10%
Variable                       1Y    20Y     1Y    20Y     1Y    20Y     1Y    20Y
REFPU                         3.0    2.9   -3.6   -3.6
OPPU                         10.2   10.2
ADMP                          2.4    2.4   -1.9   -1.9    0.5    0.6
LS                           -3.0   -3.6    1.4    0.8   -2.8   -5.3
OBED                         -0.7   -1.3   -0.6   -1.1   -2.4   -4.8
OC                           -0.7   -0.8   -0.6   -0.5   -2.3   -1.8
BED                                 -0.7          -0.6          -2.9
SNS + ONS                                                               5.4    8.5
QNS                           1.3    1.1   -1.5   -2.2   -0.8   -2.5   -0.8   -1.7
DS                           -0.5   -1.1   -0.4   -1.0   -1.5   -4.2
AS                            1.5    1.3   -1.2   -1.4    0.3   -0.5
OBEDN                                                     7.7    9.7
DSN + ASN                                                 2.4    2.4
QNSN + 0.7 (SNSN + ...) a                                 4.1    5.9

a Nursing staff in nursing-homes corrected for productivity differences.
An increase of the number of beds in nursing-homes by 10% (variant A3)
causes an immediate increase of general and nursing staff in nursing-homes of
2.4% and 4.1% respectively. On the other hand it reduces the average stay in
hospitals by 2.8%. This reduction will increase to 5.3% after 20 years. This
will result in a significant decrease of the staff employed in hospitals.
According to the results of variant A4 the inflow of students has two
consequences for the number of qualified nurses. On the one hand the
increase of the inflow of students by 10% will result after a few years in an
increase of 8.5% in the number of successful candidates for general nursing.
The number of qualified nurses will rise as a consequence of this increase of
supply. On the other hand the need for qualified nurses diminishes because of
the rising number of student nurses employed in hospitals. These two effects
result in a reduction of the nursing staff by 2.1% after 3 years. The
long-term reduction is less (1.7%).
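
The simulated changes in Table 5.1 can be cross-checked against identity
(3.7): with population unchanged, the relative change in occupied beds should
equal the compounded relative changes of the admission rate and the length of
stay. The sketch below performs this check for variants A1-A3. The assignment
of the printed figures to variants and horizons follows our reading of the
table and is an assumption; the function and variable names are ours.

```python
def combined_change(adm_pct, ls_pct):
    """Percentage change of occupied beds implied by OBEDP = ADMP * LS / 365."""
    return ((1.0 + adm_pct / 100.0) * (1.0 + ls_pct / 100.0) - 1.0) * 100.0


# (ADMP %, LS %, reported OBED %) per variant and horizon, as read from Table 5.1.
table_5_1 = {
    ("A1", "1Y"): (2.4, -3.0, -0.7),
    ("A1", "20Y"): (2.4, -3.6, -1.3),
    ("A2", "1Y"): (-1.9, 1.4, -0.6),
    ("A2", "20Y"): (-1.9, 0.8, -1.1),
    ("A3", "1Y"): (0.5, -2.8, -2.4),
    ("A3", "20Y"): (0.6, -5.3, -4.8),
}

for key, (adm, ls, reported) in table_5_1.items():
    implied = combined_change(adm, ls)
    print(key, round(implied, 2), reported)
```

The implied and the reported changes agree to within a tenth of a percentage
point, which is what rounding of the published figures would lead one to
expect.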
The possibilities of MEDIKA are illustrated by calculating the consequences
of the variants A1-A3 for cost and employment in the Dutch health care sector,
using the calculating scheme mentioned in the introduction. Not all inputs of
the scheme are determined by the structural equations described in this
paper. For these we use a few simple supplementary assumptions, which we will
not describe. As stated in the introduction, the model described in this
paper is incomplete. This influences the results of the variants, so the
exercise serves for illustration only. Table 5.2 shows the consequences for
employment of a change in the number of specialists, general practitioners or
beds in nursing-homes according to the variants A1-A3.
Other things being equal, 10 new specialists will in the short run lead to
the employment in hospitals of 2 members of the general staff and 5 qualified
nurses. On the other hand employment in hospitals decreases by 15 persons
when 10 general practitioners start working. According to A3, a rise in the
number of beds in nursing-homes will have no important consequences for total
employment after 20 years. The increase of employment in nursing-homes will
by then be compensated by decreased employment in hospitals.
Table 5.3 gives the results of calculations of the consequences of variants
A1-A3 for the cost of the health care sector. Like the calculations in
Table 5.2, the results only serve as an illustration of the possibilities of
MEDIKA.
According to variant A1 every new specialist will in the short run generate
extra cost of f 56,000 in a hospital. The purchase of medical equipment and
the paramedical staff and assistants that may be induced by the specialist
are not included in this amount.
In variant A2 we assume that the general practitioners will not save
expenses, even in the long run. The same holds for a reinforcement of the
fourth level, because only a part of the new patients in nursing-homes would
otherwise have been admitted into hospitals. In a complete analysis homes
for aged persons and care for the elderly should also be included.

Table 5.2. Consequences for employment in the Dutch health care sector per
specialist, per general practitioner and per bed in nursing-homes, after 1 and
after 20 years, in full-time equivalents.

                            A1                A2                A3
                      per specialist     per general        per bed in
                                         practitioner      nursing-homes
                        1Y      20Y       1Y      20Y       1Y      20Y
Primary care                             1.72a   1.72a
Specialist care b      1.00a   1.00a
Hospitals              0.71    0.42     -1.54   -2.29     -0.13   -0.42
- General staff        0.20   -0.02     -0.60   -0.92     -0.07   -0.24
- Nursing staff        0.50    0.44     -0.90   -1.37     -0.06   -0.18
Nursing-homes                                              0.30    0.40
- General staff                                            0.07    0.07
- Nursing staff                                            0.23    0.33
Total                  1.71    1.42      0.18   -0.57      0.17   -0.02

a Not in full-time equivalents.
b Including specialists on the pay-roll of hospitals.

Table 5.3. Consequences for the cost of the Dutch health care sector per
specialist, per general practitioner and per bed in nursing-homes, after 1 and
after 20 years, in Dutch florins (× 1000).

                            A1                A2                A3
                      per specialist     per general        per bed in
                                         practitioner      nursing-homes
                        1Y      20Y       1Y      20Y       1Y      20Y
Primary care                              219     219
Specialist care a       174     172       -42     -44        -1      -2
Hospitals                56      33      -106    -155        -6     -28
Nursing-homes                                                30      36
Total                   231     205        72      20        23       6

a Including specialists on the pay-roll of hospitals.
The conclusion may be drawn that MEDIKA offers possibilities to tackle some
problems existing in the health care sector. It will be necessary to enlarge
the model. Furthermore, attention will have to be paid to the connection
between the structural equations described in this paper and the calculating
scheme (see introduction). The calculations in Section 5 can only be seen as
an illustration.

Diagram 5.1. Referrals per 100 publicly insured, 1952-1980.
Diagram 5.2. Operations per 100 publicly insured, 1952-1980.
Diagram 5.3. Beds in hospitals per 100 population, 1952-1980.
Diagram 5.4. Occupied beds in hospitals per 100 population, 1952-1980.
Diagram 5.5. Qualified nursing staff in hospitals, 1952-1980.
Diagram 5.6. Student nursing staff in hospitals.
Diagram 5.7. Domestic staff in hospitals, 1952-1980.
Diagram 5.8. Administrative staff in hospitals, 1952-1980.


1. The author was employed by the Central Planning Bureau until November 1984
and after that by the Department of Welfare, Public Health and Culture.
Presently he is working at the Department of Justice.
2. "Arbeidsvraag en arbeidsaanbod in de gezondheidszorg op lange termijn". Project
personeelsvoorziening kwartaire sector. Bulletin no. 5. Centraal Planbureau, Sociaal en
Cultureel Planbureau. Den Haag, juni 1984.
3. Algemene Wet Bijzondere Ziektekosten.
4. In the paper we always correct for differences in working time over the years.
5. For the meaning of the symbols see Appendix II.
6. Algemene Wet Bijzondere Ziektekosten. Under this act the cost of nursing-home
care is paid out of a collective premium.
7. There are no data available about the privately insured.
8. Assuming a constant ratio between the referral rate of aged persons and that of
persons under 65, the referral rate has changed by only 1.6% over the years
1953-1979 as a consequence of the ageing population.
9. General practitioners do not refer patients to specialists in the fields of
anaesthesiology, bacteriology, clinical chemistry, pathological anatomy, radiology
and radiotherapy.
10. Minor policlinical operations are an exception. In the Netherlands the specialist is
paid a fixed amount for each publicly insured patient who is referred to him. This
amount depends on the duration of the policlinical treatment. Only major policlinical
operations are paid
11. We only use data on specialists for internal diseases and surgeons.
12. It is assumed that the referral rate for the privately insured shows the same
development.
13. Different results are found in relation to the average length of stay.
14. This assumption is supported by the scheme of directives for staff in hospitals.
According to these directives the allowed number of nursing staff depends on the
number of beds
15. Including specialists.
16. The influence of the variable SP/BED in Equations (3.4) and (3.5).
17. Only those candidates are counted who have graduated from a training course
relevant for hospitals.
18. The substitution between qualified nurses and student or other nurses is a linear
process. We therefore specify Equation (4.3) as linear.
19. General staff = domestic and administrative staff.
20. Including specialists who are on the pay-roll of hospitals.
21. Excluding eye-surgery.

Appendix I. References
[1] Aert, J. H. van, Montfort, A. P. W. P. van (1977), Basisonderzoek Kostenstructuur
Ziekenhuizen, deel 5. Personeelscategorieën en kostensoorten. Nationaal Ziekenhuisinstituut,
Utrecht. 73 pp.
[2] Altman, S. H. (1970), The Structure of Nursing Education and its Impact on Supply.
Empirical Studies in Health Economics, The Johns Hopkins Press, London, pp. 335-
[3] Broek, R. J. A. M. van den (1979), Eerste aanzet tot een simultane verklaring van
gebruiks- en capaciteitsontwikkeling van ziekenhuisbedden in Nederland. Centraal Planbureau,
notitienr. 11, hoofdafd. IV, Den Haag, 25 pp.
[4] Broek, R. J. A. M. van den (1981), Middellange termijnraming intramurale gezondheidszorg,
Centraal Planbureau, notitienr. 1, hoofdafd. IV, Den Haag, 81 pp.
[5] Broek, R. J. A. M. van den (1982), Het aantal verwijzingen door huisartsen. Centraal
Planbureau, notitienr. 4, hoofdafd. IV, Den Haag, 14 pp.
[6] Centraal Bureau voor de Statistiek (1981), Inkomens vrije beroepen. Een methodologische
studie, M13, 98 pp.
[7] Davis, K. and Russell, L. B. (1972), 'The Substitution of Hospital Outpatient Care for
Inpatient Care', The Review of Economics and Statistics 54, 109-120.
[8] Es, J. C. van and Pijlman, H. R. (1970), 'Het verwijzen van ziekenfondspatienten in 122
Nederlandse huisartsenpraktijken', Huisarts en Wetenschap 13, 433-449.
[9] Feldstein, M. S. (1967), Economic Analysis for Health Service Efficiency, North-
Holland Publishing Company, Amsterdam, 332 pp.
[10] Feldstein, M. S. (1973), Econometric Studies of Health Economics, Discussion paper
291. Harvard Institute of Economic Research, Cambridge, 94 pp.
[11] Feldstein, P. J. (1979), Health Care Economics, John Wiley & Sons, Inc. U.S.A., 457 pp.
[12] Fuchs, V. R. (1974), Who Shall Live? Basic Books Inc., New York.
[13] Fuchs, V. R. and Kramer, M. J. (1972), Determinants of Expenditures for Physicians'
Services in the United States 1948-1968, National Bureau of Economic Research,
Occasional Paper 117, DHEW Publication No. (HSM) 73-3013, 63 pp.
[14] Gaag, J. van der (1978), 'An Econometric Analysis of the Dutch Health Care System',
Proefschrift, Rijks Universiteit Leiden, 160 pp.
[15] Hoeksma, B. H. (1978), 'De polikliniek als schakel in de gezondheidszorg', Scriptie,
College voor Ziekenhuisvoorzieningen. Utrecht, 97 pp.
[16] Hooijmans, E. M. (1976), 'Estimates of a Model of the Dutch Health Care System over
the Years 1974, 1975, 1976', Report 82.01b Center for Research in Public Economics,
Leiden, 31 pp.
[17] Hooijmans, E. M. (1982), 'Het aantal klinische en poliklinische verrichtingen volgens
tarief 3', Center for Research in Public Economics, Leiden, 40 pp.
[18] Hooijmans, E. M. and Rutten, F. F. H. (1982), 'The Impact of Supply on the Use of
Hospital Facilities; Differences Between High and Low Income Groups in the Netherlands',
Report 82.04. Center for Research in Public Economics, Leiden, 21 pp.
[19] Hurd, R. W. (0000), 'Equilibrium Vacancies in a Labor Market Dominated by non
Profit Firms: The "Shortage" of Nurses', The Review of Economics and Statistics 55, 2.
[20] Jacobs, P. (1974), 'A Survey of Economic Models of Hospitals', Inquiry 11, 83-97.
[21] Legerman, A. and Hoof, J. A. M. van den (1977), 'Schatting van de macro-economische
effecten van de substitutie van in-service opleidingen voor de verpleegkunde diploma's
A, B en Z door MBO-v. VoMil', Stafbureau Raadsadviseur Lange Termijn Planning,
Leidschendam, 2 delen.
[22] Lempers, F. B. (1974), 'Een model voor de gezondheidszorg in Nederland', Centraal
Planbureau, notitienr. 12, Hoofdafd. IV, Den Haag, 43 pp.
[23] Monfort, A. P. W. P. van (1980), 'Production Functions for General Hospitals',
Proefschrift Katholieke Hogeschool Tilburg, 184 pp.
[24] Nieuwenhuis, A. (1975), 'Het verwijzen van ziekenfondspatienten door de huisarts',
Centraal Planbureau, notitienr. 15, Hoofdafd. IV, Den Haag, 16 pp.
[25] Posthuma, B. H. and Zee, J. van der (1977), 'Tussen eerste en tweede echelon 1',
Nederlands Huisartsen Instituut, Utrecht, 104 pp.
[26] Rutten, F. F. H. (1978), 'The Use of Health Care Facilities in the Netherlands',
Proefschrift, Rijks Universiteit Leiden, 225 pp.
[27] Yett, D. E. (1970), The Chronic "Shortage" of Nurses: A Public Policy Dilemma.
Empirical Studies in Health Economics, The Johns Hopkins Press, London, pp. 357-

Appendix II. Description of the variables

- The addition of P/PU/A indicates that the variable is divided by the population/publicly
  insured/number of aged persons in the Netherlands;
- Variables with an i as subscript are i years accelerated;
- Variables representing employment are corrected for differences in working hours.

Endogenous variables

ADM(P)     the number of admissions in hospitals (per 100 population)
AS(N)      the administrative staff in hospitals (nursing-homes)
BED(P)     the number of beds in hospitals (per 100 population)
BEDXP      a standard for the number of beds in hospitals per 100 population (Equation
CAND       the number of successful candidates for general nursing (only those candidates
           are counted who have graduated from a training relevant for hospitals)
DQNS       the demand for qualified nurses in hospitals (Equation (4.3))
DS(N)      the domestic staff in hospitals (nursing-homes)
LS         the average length of stay in hospitals
OBED(P)    the number of occupied beds in hospitals (per 100 population)
OBEDN(A)   the number of occupied beds in nursing-homes (per 100 aged persons)
OC(90)     the occupancy ratio in hospitals (with standard 0.9, Equation (2.2))
OP(PU)     the number of operations (per 100 publicly insured)
QNS(N)     the qualified nursing staff in hospitals (nursing-homes)
REF(PU)    the number of referrals (per 100 publicly insured)
SNS(N)     the student nursing staff in hospitals (nursing-homes)
SQNS       the supply of qualified nurses in hospitals (Equation (4.2))

Exogenous variables

AGE(P)     the number of aged persons (per capita)
BEDN(A)    the number of beds in nursing-homes (per 100 aged persons)
Di         dummy for the year 1900 + i
           = 0 if year < 1900 + i
           = 1 else
DPU1       dummy representing the extension of insurance coverage of the maximum
           length of stay in hospitals for publicly insured from 42 to 70 days in 1955
           = 0 if year < 1955
           = 1 else
DPU2       dummy representing the extension of insurance coverage of the maximum
           length of stay in hospitals for publicly insured from 70 to 365 days in 1964
           = 0 if year < 1964
           = 1 else
GNPP       the average Gross National Product per capita
GP(P)      the number of general practitioners (per 100 population)
IBEDNA     index of the number of beds in nursing-homes per aged person after the
           establishing of the AWBZ
           = 1 if year < 1969
ILLP       index of the average number of cases of illness per male worker (1970 = 1)
IMSP       index of the quantity of medical supply in hospitals per capita
IS1        the inflow of students for general nursing in the first class
P          the population of the Netherlands
ONS(N)     the other nursing staff in hospitals (nursing-homes)
PMS(N)     the paramedical staff in hospitals (nursing-homes)
SP(P)      the number of specialists (see note 21) (per 1000 population)
SPIS(P)    the number of specialists for internal diseases and surgeons (per 1000
T          index of the average working time for a full-time job (1953 = 1)

Appendix III. The calculation of the functions E1(AGEP) and E2(AGEP)

It is assumed that an aged person is admitted to hospital $b_{19}$ times more often, and stays
$b_{20}$ times longer, than a person under 65:

$$\frac{ADM_{65+}}{AGE} = b_{19}\,\frac{ADM_{65-}}{P - AGE} \tag{1.a}$$

$$LS_{65+} = b_{20}\,LS_{65-}. \tag{1.b}$$

By definition:

$$ADM = ADM_{65+} + ADM_{65-} \tag{2.a}$$

$$ADM \cdot LS = ADM_{65+} \cdot LS_{65+} + ADM_{65-} \cdot LS_{65-}. \tag{2.b}$$

The combination of (1.a) and (2.a) gives

$$ADMP = \frac{ADM_{65-}}{P - AGE}\,\bigl[1 + (b_{19} - 1)\,AGEP\bigr]. \tag{3.6.a}$$

The admission rate of persons under 65 is explained by the other exogenous variables. The
equation for the average length of stay is a bit more complicated.
From (1.a), (1.b), (2.b) and (3.6.a) it can be derived that

$$LS = LS_{65-}\,\frac{1 + (b_{19} b_{20} - 1)\,AGEP}{1 + (b_{19} - 1)\,AGEP}. \tag{3.6.b}$$
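
As a rough numerical check of (3.6.a) and (3.6.b), the sketch below rebuilds the aggregate admission rate and the average length of stay step by step from the definitions and compares them with the closed-form expressions. It reads (1.a) as a statement about admission rates per person. All input values (population, number of aged persons, the under-65 rates, b19 and b20) are hypothetical and chosen only for illustration; the function names are not taken from the original study.

```python
import math


def admission_rate_closed_form(rate_under65, agep, b19):
    """Equation (3.6.a): admissions per capita, given the under-65 admission rate."""
    return rate_under65 * (1 + (b19 - 1) * agep)


def length_of_stay_closed_form(ls_under65, agep, b19, b20):
    """Equation (3.6.b): average length of stay for the whole population."""
    return ls_under65 * (1 + (b19 * b20 - 1) * agep) / (1 + (b19 - 1) * agep)


def aggregates_from_definitions(p, age, rate_under65, ls_under65, b19, b20):
    """Build ADMP and LS directly from (1.a), (1.b), (2.a) and (2.b)."""
    adm_young = rate_under65 * (p - age)   # admissions of persons under 65
    adm_old = b19 * rate_under65 * age     # (1.a): aged persons are admitted b19 times as often
    ls_old = b20 * ls_under65              # (1.b): aged persons stay b20 times as long
    adm = adm_old + adm_young              # (2.a)
    ls = (adm_old * ls_old + adm_young * ls_under65) / adm  # (2.b) solved for LS
    return adm / p, ls


# Hypothetical inputs, for illustration only.
P, AGE = 14_000_000, 1_600_000
RATE_UNDER65, LS_UNDER65 = 0.08, 12.0
B19, B20 = 2.0, 1.5

admp_direct, ls_direct = aggregates_from_definitions(
    P, AGE, RATE_UNDER65, LS_UNDER65, B19, B20
)
agep = AGE / P
print(math.isclose(admp_direct, admission_rate_closed_form(RATE_UNDER65, agep, B19)))
print(math.isclose(ls_direct, length_of_stay_closed_form(LS_UNDER65, agep, B19, B20)))
```

Both comparisons should agree up to floating-point rounding, since the closed forms are algebraic rearrangements of the definitions.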
The microeconomics and econometrics of
bonus systems in health insurance

Institut für Empirische Wirtschaftsforschung, Universität Zürich


An impartial observer of the ongoing international discussion on cost
containment in health care would be struck by the one-sidedness of the
argument. Focus is invariably on cost-sharing as a sanction meted out to the
user of services, both in research and in policy debates (Cairns and Snell,
1978; Feldstein, 1973; Newhouse et al., 1981). The alternative of creating
incentives for non-users of medical care seems to be just about forgotten.
However, private health insurers in Western Germany have been offering
rebates as well as experience-rated bonuses to their members for several
years, and their experiences should be of interest internationally (Uleer,
1981). Therefore, a project was initiated by the Robert Bosch Foundation in
1981 with the objective of systematically investigating the properties of such
contracts. In particular, this contribution is based on two working hypotheses.
First, rebates and bonus options are deemed to be at least as attractive to
insureds as conventional cost sharing alternatives. Second, these new contracts
are predicted to dampen utilization of services very much like traditional
cost-sharing methods, with experience-rated bonus options having
even more impact than fixed rebates for no claims.
The plan of this paper is as follows. First, the rebate offer is compared to a
cost-sharing contract from the point of view of the consumer. Rebate and
bonus options are usually introduced into all policies written by a particular
private insurer; thus, if an insured does not like them, he must turn to
another company. In view of this mobility barrier, the mere fact that nine out
of ten leading private insurers in Western Germany offer rebates or bonuses
in 1985 whereas only one did so in 1980 is not sufficient to prove that these
new options are in the interest of their members. For this reason, a micro-
economic argument is developed showing that these new policies may be
superior to conventional ones from the consumers' point of view as well.
The second part of the paper contains a description of policies written by
three insurers: A, B, and C. A continues to offer plans with and without
deductibles and coinsurance; B pays back a rebate amounting to three

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 187-202.

© 1991 Kluwer Academic Publishers.
188 P. Zweifel

monthly premiums if the insured abstains from filing a claim during a year; C
offers a bonus amounting to two monthly premiums (as of 1982) in the first
year with no claims, three in the second year, and four starting from the third
consecutive year with no claims. Using a simple microeconomic model,
predictions are derived concerning the policies' respective ability to induce
an insured to forego ambulatory medical care given a minor impairment of
his health. These predictions are subjected to empirical tests in the third part
of the paper. Rebates and bonuses also create incentives for not submitting a
claim although medical care was consumed. Particular care is taken to
eliminate this submission effect because it amounts to a mere cost-shifting
between the insured and the insurer. Social savings only accrue if the insured
changes his behavior, i.e. if he reduces his utilization of medical care services.
But by doing so, he might jeopardize his future health. Therefore, the
empirical analysis is extended to cover two consecutive years in an attempt to
detect traces of a sawtooth pattern in medical utilization. The concluding
section of the paper contains a discussion and evaluation of the results, with
particular reference to the ongoing debate about the reform of social health
insurance.

The welfare economics of health insurance policies

At first sight, a rebate offer might seem to be nothing but a gimmick on the
part of the insurer for cashing in higher premiums earlier, with the insured
foregoing interest income he could have if his money was not tied up with the
insurance. This conception of a rebate or a bonus option presupposes two
things, however: First, there would have to be a full commitment on the part
of the insured to leave this earmarked money in the account, thus making
absolutely sure that cash will be available for covering the net cost of medical
treatment accruing under a conventional policy. Second, the degree of risk
aversion must be the same regardless of the state of health. By way of
contrast, Figure 1 below shows an individual that has a stronger degree of
risk aversion when ill than when healthy. There is at least some indirect
empirical evidence to support such an assumption, see Fuchs (1982). For
simplicity, a conventional cost sharing policy containing a deductible of D
Dollars is compared to a financially equivalent rebate offer. An insured
covered under the conventional policy has initial wealth of W0 - PD, with PD
symbolizing the premium of this policy. If he falls ill, the insured has to pay
D Dollars out of his own pocket, resulting in final wealth W0 - PD - D.
Using a probability of 50% for falling ill, his expected wealth amounts to
E(W)D = W0 - PD - D/2.
The rebate offer is selected such that it is financially equivalent to the
conventional alternative, i.e. E(W)R = E(W)D. With no discounting for
present value and 50% probability of illness, this simply means that the
premium of the rebate contract (PR) costs as much as the premium PD of the

Figure 1. Comparing a rebate option with a conventional cost sharing plan.

deductible contract plus one half of the deductible. Despite this financial
equivalence, the two alternatives are not equivalent as soon as the insured
values gains and losses differently, depending on health status (see Dionne
and Eeckhoudt, 1983, for some implications of status-dependent Von
Neumann-Morgenstern utility functions). If risk aversion is stronger given
bad health than given good health, the utility function Ui(W) exhibits a
higher degree of concavity from below than does the utility function Uh(W).
Under a conventional deductible plan, the individual starts at Ub when
healthy and ends up at U"D when ill, resulting in expected utility E(U)D.
If covered by a policy with a rebate offer, the same individual has initial
wealth of only W0 - PR, associated with utility U"R. But he has a 50% chance
of cashing in a rebate. Since he is healthy then, he moves up along utility
function Uh(W) to reach point ut in Figure 1. Due to that function's low
degree of concavity, expected utility amounts to E(U)R. Since E(U)R >
E(U)D, the rebate offer dominates a policy with cost sharing despite the fact
that the two plans are financially equivalent.
It should be emphasized that this argument does not prove that traditional
contracts (as well as self-insurance) are always inferior to rebate offers or
comparable bonus options. At least in theory, a conventional contract could
be complemented by another policy covering the net cost of medical care. If
indeed available, such a combination might well be preferred to a rebate
offer. Additionally, self-insurance could be a viable alternative if interest to
be earned on cash not tied up in insurance premiums is high enough; see
Zweifel (1985) for more details. This argument can be summed up in

Conclusion 1

For individuals whose degree of risk aversion is higher when ill than when
healthy, a rebate option will tend to dominate both a financially equivalent
policy with cost sharing as well as self-insurance.

This proposition cannot be tested empirically on the basis of available data,
for the records contain no information on the choice among private
insurers or the reasons for this choice.

Three health insurance policies compared

Given that individuals are insured by the same company, they are exposed to
much the same financial incentives. Therefore, generating predictions con-
cerning one type of policy only is not sufficient. Rather, differential predic-
tions among the three insurers must be derived, e.g. comparing insurer B
(offering a fixed rebate of three monthly premiums) to insurer C (offering an
experience-rated bonus of two to four monthly premiums). In order to be
able to deal with insurer C in the sequel, the planning horizon is two years
throughout. In Figure 2, a simple two goods model is shown with hours of
medical service (M) on the horizontal and all other goods (X) on the vertical
axis. In an attempt to mirror limited consumer sovereignty in health care,
choice is restricted between 0 and M1 physician hours (which are assumed to
cost 150 DM or 50 U.S. Dollars).
The straight line AA' depicts the budget constraint of a member of
insurance A with full coverage. It runs close to the origin because such a plan
has a high price. By way of contrast, BB' mirrors the lower premium of the
rebate policy written by insurer B. If a member of B accepts the treatment
offer comprising M1 hours of care, he attains utility level 1; at point B-. If he
saves his rebate by foregoing ambulatory care, his new budget constraint
originates from point B+ on the X axis. However, he would have to face the
full price for every physician minute, making him move along the steep
budget constraint towards point R. The two budget constraints intersect at
point R in Figure 2, indicating a submission threshold: For an annual bill
falling short of R (about 4 hours' work of ambulatory care), it is better to pay
it out of one's own pocket. Beyond point R, the insured fares better
submitting his medical bill.
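
The submission threshold R can be illustrated with a small sketch. It assumes a hypothetical monthly premium of 200 DM, so that the rebate of three monthly premiums equals 600 DM, or 4 physician hours at the 150 DM per hour assumed above. The function names and the optional deductible argument are illustrative additions and not part of the original model.

```python
HOURLY_FEE_DM = 150.0  # price of one physician hour assumed in the text


def submission_threshold(monthly_premium, rebate_months=3, deductible=0.0):
    """Annual bill (in DM) above which filing a claim starts to pay off."""
    return rebate_months * monthly_premium + deductible


def should_submit(annual_bill, monthly_premium, rebate_months=3, deductible=0.0):
    """True if submitting the bill beats paying it out of pocket and keeping the rebate."""
    return annual_bill > submission_threshold(monthly_premium, rebate_months, deductible)


premium = 200.0  # hypothetical monthly premium in DM
threshold = submission_threshold(premium)
print(threshold / HOURLY_FEE_DM)      # threshold in physician hours: 4.0
print(should_submit(450.0, premium))  # False: cheaper to pay and keep the rebate
print(should_submit(900.0, premium))  # True: the bill exceeds the rebate
```

With a deductible, the insured is reimbursed only the bill minus the deductible, which is why the deductible is added to the rebate when the threshold is computed.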

Figure 2. Incentives contained in rebate option (Insurer B), 2 years horizon. Horizontal axis: physician hours (at 50 $); vertical axis: all other goods (X).

Given the indifference curves shown in Figure 2, point B- (M1 hours of
medical care) yields slightly higher utility than alternative B+ (no medical
care and rebate saved). In this case, the rebate option fails to induce the
insured to go without medical care. A fortiori, the same individual would also
take advantage of the treatment offer if covered by A's plan because any
point along AA' ranks higher than point A on the X axis.
Now let the situation repeat itself in the following year, with the physician
fixing intensity of care at M1 hours and the health problem being of the same
severity as in the year before. Over the two-year period, the choice is
between 0 and M1 hours of care if the insured did not see the physician in
the first year and between M1 and 2M1 hours of care if the insured went to
see the doctor in the first year. But under the assumptions made, the
individual should decide exactly the same way in the second year as in the
first. If he saw the physician during the first year (as in Figure 2), he should
turn to him again. His second year budget constraint originates either at
point B-+ (saving his rebate) or B- (consuming care). According to
indifference curve J~, obtaining M1 (= 6 hours) of care is again superior to
having none. The same holds true if the individual were covered by the no
cost-sharing policy of A. On the other hand, if the enrollee of B went without
medical care during the first year, he should continue to do so, too. This
argument may be summed up by

Conclusion 2

Other things being equal, a member of insurance A will be more inclined to
demand medical care for the treatment of a minor health problem than a
similar member of insurance B.

Figure 3 depicts the decision problem facing an enrollee of insurer C. While
indifference curves are exactly the same as in Figure 2, budget constraints
differ due to the experience-rated bonuses offered by C. In view of the fact
that next year's bonus depends on this year's utilization of medical services,
dynamic optimization methods are called for in principle (Bellman, 1957;
Intriligator, 1971, Ch. 13). However, in a very simple special case (no
discounting of future receipts and outlays, identical utility functions in only
two consecutive periods of equal length), a graphical analysis is possible. The
limiting case of indifference serves as the point of departure in the first
period, allowing the optimal decision in the first period to be determined in
the light of the optimal decision made in the second period. Moreover, the
bonus to be reaped in the first year is assumed to amount to three months'
premiums, the same as the fixed rebate offered by insurer B (see Figure 2
above). This implies that the insured did not submit a claim in the previous
year.
Thus, the insured has a choice between two budget constraints in the first
year: If he falls back on insurance, the relevant boundary is CC'C", reflecting

Figure 3. Incentives contained in an experience-rated bonus option (Insurer C), 2 years horizon. Horizontal axis: physician hours (at 50 $); vertical axis: all other goods (X).

the fact that insurer C's plans have a deductible of 250 DM (1.67 physician
hours) throughout. If he saves his bonus, the constraint is given by the
straight line starting from point C3+. Should he decide to see the physician in
the first period, he ends up at point C3-, which is by assumption on the same
level of utility as C3+.
Budget constraints in the second year depend on the decision made in the
first year. If the individual took advantage of the physician's treatment offer
after all, his point of departure in the second year is C3- in Figure 3. Should
he fall back on insurance once more, the boundary defined by points C3-,
C3-', and C3-" applies. On the other hand, he could earn a bonus amounting
to two monthly premiums if he managed to refrain from consuming
medical care in the second period. But this incentive will not suffice to make
him go without care because the same individual under the same conditions
was just indifferent when the bonus at stake was as high as three monthly
payments. Thus, consuming M1 hours of medical care will rank higher than
the prospect of saving the bonus in the second year, given that ambulatory
care was preferred to the bonus in the first year.
Alternatively, the individual might have saved his bonus in the first year.
In that case, the attainable bonus in the second year will amount to four
monthly premiums, corresponding to the point of departure C4+ on the X
axis of Figure 3. This is to be compared to consuming M1 hours of care in the
second year, symbolized by point C4- in Figure 3. Since the individual
previously was indifferent between medical care and the bonus of three
monthly premiums, he will certainly save his bonus of four monthly premiums
in the second year. Thus, the optimal decision in the second year is to
refrain from consuming medical care, implying that the total bonus at stake
in the first year is not three but as much as seven monthly premiums. Under
these circumstances, the individual insured by C will refrain from calling on
the physician in both years. This argument leads up to

Conclusion 3

The prospect of saving an even higher bonus in the subsequent year may
induce members of insurance C to refrain from consuming ambulatory care
in the current as well as in the subsequent year whereas a member of
insurance B would have demanded medical care in both periods.
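
The backward-induction reasoning behind Conclusions 2 and 3 can be sketched in a few lines. This is a deliberately crude stand-in for the graphical analysis, under assumptions that are not in the original: the benefit of M1 hours of care is expressed as a money-equivalent value measured in monthly premiums (here a hypothetical 3.1, i.e. slightly above the rebate of three premiums, as in Figure 2), values simply add up over the two years without discounting, and a claim resets insurer C's bonus to two monthly premiums. The function names are illustrative.

```python
def best_plan(bonus_for_streak, streak, years_left, care_value):
    """Return (total value, list of yearly choices) found by backward induction.

    bonus_for_streak(s) is the bonus (in monthly premiums) earned this year if no
    claim is filed, given s consecutive claim-free years so far.
    """
    if years_left == 0:
        return 0.0, []
    # Option 1: forgo care, collect the bonus, extend the claim-free streak.
    skip_total, skip_path = best_plan(bonus_for_streak, streak + 1, years_left - 1, care_value)
    skip_total += bonus_for_streak(streak)
    # Option 2: consume care, lose the bonus, reset the streak.
    care_total, care_path = best_plan(bonus_for_streak, 0, years_left - 1, care_value)
    care_total += care_value
    if skip_total > care_total:
        return skip_total, ["no care"] + skip_path
    return care_total, ["care"] + care_path


def rebate_b(streak):
    """Insurer B: fixed rebate of three monthly premiums."""
    return 3


def bonus_c(streak):
    """Insurer C: two, three, then four monthly premiums (assumed reset after a claim)."""
    return {0: 2, 1: 3}.get(streak, 4)


CARE_VALUE = 3.1  # hypothetical money-equivalent of M1 hours of care, in monthly premiums

# Both insureds start with one claim-free year, so that the first-year bonus of C is three premiums.
print(best_plan(rebate_b, 1, 2, CARE_VALUE)[1])  # ['care', 'care']
print(best_plan(bonus_c, 1, 2, CARE_VALUE)[1])   # ['no care', 'no care']
```

Under insurer C the first-year choice is effectively between 3 + 4 = 7 monthly premiums of bonus and two rounds of care, which is the sense in which seven monthly premiums are at stake.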

It is important to see that the statement holds true only in the case of minor
illness. Should the physician deem the health problem severe enough to
warrant 2M1 hours of care in the first period, then even an enrollee of
insurer B would want to take advantage of this offer. In Figure 3, the newly
attainable point is in the neighborhood of C3+" (at M = 12), which certainly
ranks higher than point C4+ on the X axis, according to indifference curve
fl. Alternatively, the insured himself might rate his health problem as serious,
which would be reflected by indifference curves sloping down more steeply
than in Figure 3. In that event, an attainable point like C4- (corresponding to
M1 hours of care) is preferred to point C4+ (reflecting receipt of the rebate).

Empirical results

This section is devoted to empirical tests of the theoretical predictions
derived above. The sample used comprises segments of populations enrolled
by the three insurers, with full coverage policies selected from A and B. As
to insurer C, a mandatory deductible of 250 DM (85 U.S. Dollars at 1985
exchange rates) applies throughout.

Eliminating effects of submission

As a rule, insureds will submit their medical bills only if their total exceeds
the value of the rebate (plus a deductible if applicable), cf. intersection points
R and U in Figures 2 and 3. Beyond this threshold, the billings distribution
should be complete. Since conclusions 2 and 3 do not refer to the filing
decision but to impacts on actual demand for ambulatory care, analysis of the
billings distribution must focus on values at or above this submission
threshold.
Submission thresholds are determined for a two-year horizon since only
billings for the years 1981 and 1982 are jointly available for all three
insurers. This threshold is highest for enrollees of insurer C: In 1981, the
bonus at stake could be as high as seven monthly premiums (undiscounted).
For these insureds, the threshold amounts to 'seven monthly premiums plus
deductible 250 DM'. This is a conservative estimate because members who
had not attained maximum bonus in 1982 or did not count on saving their
bonus in subsequent years would have submitted smaller billings to their
insurance as well.
Members of insurances A and B must be assigned virtual submission
thresholds that would be in effect if they were enrolled by C. First, in each
age-sex-cell of insurer C's population, the maximum value of the submission
threshold is determined. Second, these values are assigned to members of
insurers A and B in the same age-sex-cell.
For empirical estimation, a binary dependent variable D82 is defined as
follows: If the billing for ambulatory medical care exceeds the submission
threshold, it takes on the value of 1, otherwise it is O. Table 1 below
documents the construction of this dependent variable, along with some
distributional information. Table 2 contains analogous information concern-
ing explanatory variables. Among other things, it shows that the three
insurers have a different age structure in that the share of members aged 45
to 54 varies between 12% with A and 27% with C.
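
The construction of the joint threshold and of D82 described above can be sketched as follows. The record layout (dictionaries with `age_group`, `sex`, `monthly_premium` and `bill_1982` keys), the sample values, and the function names are hypothetical; only the rule "seven monthly premiums plus a deductible of 250 DM, maximum per age-sex cell of insurer C" is taken from the text.

```python
DEDUCTIBLE_C_DM = 250.0  # mandatory deductible of insurer C
MAX_BONUS_MONTHS = 7     # bonus at stake over the two-year horizon (undiscounted)


def cell_thresholds(members_c):
    """Maximum submission threshold per (age group, sex) cell among members of C."""
    thresholds = {}
    for m in members_c:
        cell = (m["age_group"], m["sex"])
        limit = MAX_BONUS_MONTHS * m["monthly_premium"] + DEDUCTIBLE_C_DM
        thresholds[cell] = max(thresholds.get(cell, 0.0), limit)
    return thresholds


def d82(member, thresholds):
    """1 if the 1982 ambulatory care bill exceeds the threshold of the member's cell, else 0."""
    cell = (member["age_group"], member["sex"])
    return 1 if member["bill_1982"] > thresholds[cell] else 0


# Hypothetical members of insurer C in one age-sex cell.
members_c = [
    {"age_group": "A3544", "sex": "M", "monthly_premium": 150.0},
    {"age_group": "A3544", "sex": "M", "monthly_premium": 170.0},
]
thresholds = cell_thresholds(members_c)  # {("A3544", "M"): 1440.0}

# Hypothetical members of A and B receive the virtual threshold of their cell.
member_a = {"age_group": "A3544", "sex": "M", "bill_1982": 1600.0}
member_b = {"age_group": "A3544", "sex": "M", "bill_1982": 900.0}
print(d82(member_a, thresholds), d82(member_b, thresholds))  # 1 0
```

Taking the maximum within each cell keeps the threshold conservative: any bill above it would have been submitted under every policy considered.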

Table 1. Dependent variable and auxiliary variables used for its construction, 1982.

Variable    Definition                                 Mean    Std. dev.

D82         = 1: Costs of ambulatory care in           0.29
            excess of THRESHOLD
LIMITA      Limit above which insureds                    0        0
LIMITB      submit their medical bills,                 771      223
LIMITC      defined for subsamples A, B, and C         1534      347
THRESHOLD   equivalently defined for all
            insureds of given age and sex              1379      339

Table 2. Explanatory variables.

Variable   Definition                                            Means
                                                         A      B      C     (*)

A1924      = 1: Age between 19 and 24 years            0.04   0.03   0.03   0.03
A3544      = 1: Age between 35 and 44 years            0.33   0.40   0.27   0.33
A4554      = 1: Age between 45 and 54 years            0.12   0.19   0.27   0.20
A5564      = 1: Age between 55 and 64 years            0.06   0.08   0.18   0.11
A6599      = 1: Age above 65 years                     0.05   0.02   0.15   0.08
A1924F     = 1: Females with A1924 = 1                 0.02   0.01   0.01   0.01
A2534F     = 1: Females with A2534 = 1                 0.13   0.07   0.03   0.07
A3544F     = 1: Females with A3544 = 1                 0.09   0.09   0.09   0.09
A4554F     = 1: Females with A4554 = 1                 0.04   0.06   0.09   0.06
A5564F     = 1: Females with A5564 = 1                 0.02   0.03   0.07   0.04
A6599F     = 1: Females with A6599 = 1                 0.03   0.01   0.10   0.05
PRIV1      = 1: Hospital insurance: private room       0.31   0.18   0.31   0.27
PRIV3      = 1: Hospital insurance: common ward        0.20   0      0.04   0.08
INSB       = 1: Insured by B
           Rebate: 3 monthly premiums incl.
           ambulatory, hospital, dental care           0      1      0      0.31
INSC       = 1: Insured by C
           Bonus: 2, 3, or 4 monthly premiums
           (ambulatory care only) depending
           on damage experience                        0      0      1      0.38
D81A       = 1: Insured by A and D81 = 1
           (lagged dependent variable)                                      0.11
D81B       = 1: Insured by B and D81 = 1
           (lagged dependent variable)                                      0.09
D81C       = 1: Insured by C and D81 = 1
           (lagged dependent variable)                                      0.10

Note. (*) Total sample, used in the estimate shown on the left hand side of Table 3.

Short run impact of rebates and bonuses

Table 3 below contains estimation results of logit regression (Harrell, 1980),
using D82 as the dependent variable. On the left hand side of the table,
regressors refer to the current year 1982 only; later on, the list of explanatory
variables will be expanded to include utilization during the previous
year. Age gradients and sex differentials very much conform to expectations.
Here, discussion will focus on the crucial variables INSB and INSC,
indicating membership of insurer B and insurer C, respectively (enrollees of
A constitute the left-out category). Both coefficients are negative and highly
significant at a significance level of 0.001. Relative to members of insurance
A, members of B are 9.4 percentage points less likely to have ambulatory
care costs in excess of the joint threshold value defined above. This finding
confirms Conclusion 2. A similar result holds true for insureds of C, with the
estimated impact amounting to 11.5 percentage points. This differential in
favor of the experience-rated bonus offer is very much in accordance with
Table 3. Probability for ambulatory care costs to exceed the THRESHOLD defined in
Table 1, 1982.

             Current year only            Including lagged utilization
Variable     Coefficient    t-Value       Coefficient    t-Value

A1924        -0.171**        -2.62        -0.074          -0.86
A3544         0.056**         2.74         0.034           1.39
A4554         0.078**         3.27         0.029           1.00
A5564         0.180***        6.38         0.082*          2.41
A6599         0.278***        7.20         0.167***        3.41
A1924F        0.239**         2.79         0.017           0.14
A2534F        0.163***        6.12         0.096**         2.80
A3544F        0.115***        4.99         0.069*          2.49
A4554F        0.073*          2.53         0.036           1.03
A5564F        0.008           0.22        -0.020          -0.47
A6599F       -0.029          -0.70        -0.020          -0.36
PRIV1         0.066***        4.81         0.022           1.36
PRIV3        -0.074**        -2.83        -0.098*         -2.33
INSB         -0.094***       -5.71        -0.054*         -2.06
INSC         -0.115***       -7.10        -0.097***       -3.78
D81A                                       0.380***       14.28
D81B                                       0.396***       14.63
D81C                                       0.458***       17.93

             Chi2 = 255 / df = 15         Chi2 = 966 / df = 18
             N = 5784 / CONC = 0.605      N = 4655 / CONC = 0.742

Note. Intercepts not shown. Coefficients are estimated partial impacts on probability, derived
from multiplying the parameters of a logit regression by p(1 - p), with p = average probability
(Theil, 1972, p. 169). *(**, ***) denote 0.05 (0.01, 0.001) levels of statistical significance; t-
values to be interpreted asymptotically. df: degrees of freedom. N: number of observations.
CONC: share of concordant pairs between predicted and actual observations.
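
The conversion described in the note to Table 3 can be sketched as follows. The code assumes, purely for illustration, that the average probability p equals the sample mean of D82 reported in Table 1 (0.29); the original logit parameters are not reported, so the value backed out here is only an implied figure under that assumption. The function names are illustrative.

```python
def partial_impact(logit_parameter, p):
    """Partial impact on probability: the logit parameter multiplied by p(1 - p)."""
    return logit_parameter * p * (1.0 - p)


def implied_logit_parameter(impact, p):
    """Invert the conversion to recover the logit parameter behind a reported impact."""
    return impact / (p * (1.0 - p))


P_BAR = 0.29  # assumption: mean of D82 from Table 1 used as the average probability

# Reported partial impact of membership in insurer B (left hand side of Table 3).
beta_insb = implied_logit_parameter(-0.094, P_BAR)
print(beta_insb)                         # implied logit parameter under the assumption above
print(partial_impact(beta_insb, P_BAR))  # recovers the reported impact, up to rounding
```

Because p(1 - p) is largest at p = 0.5, the same logit parameter translates into a smaller change in probability the further the average probability lies from one half.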

the prediction of Conclusion 3. An analogous estimation using data for 1981
(not shown) produces quite similar results, admitting of

Conclusion 4

The predictions formulated in conclusions 2 and 3 concerning the incentives
of the three policies examined are confirmed without exception. Thus, the
fixed rebate of insurer B and to an even greater degree the experience-rated
bonus offer of insurer C reduce demand for ambulatory medical care at
billing levels that lie above a conservatively determined submission threshold.

This conclusion can be criticized on three major grounds.

1. It may be argued that members of insurance C have a planning horizon of
less than two years. This would imply that the impact of the bonus is still
intermingled with that of the deductible. However, in research using data
of insurer B only, the effect of the deductible was found to fade out
rather quickly with increasing values of the annual medical bill: A
deductible of 300 DM (100 U.S. Dollars at 1985 exchange rates) does
not appear to have any recognizable effect on billings beyond a threshold
of 1000 DM (330 U.S. Dollars). Among members of insurer A, a
deductible of 250 DM (85 U.S. Dollars) was found to lose its impact
beyond a threshold of as low as 350 DM (120 U.S. Dollars). These low
threshold values should be compared with the submission threshold of no
less than 1379 DM (460 U.S. Dollars) used here, cf. Table 1. Thus, at
billings of 1000 DM and more, a deductible of 250 DM is very unlikely
to have an impact of its own that would bias the estimate of the bonus
2. Attempting to save one's bonus or rebate might well jeopardize health in
subsequent years. Recent research based on the Rand Health Insurance
Study (Brook et al., 1983) suggests an absence of such negative side
effects. Moreover, the present study deals with ambulatory medical care
only. As soon as hospitalization is envisaged, saving one's rebate or bonus
is out of the question as a rule. Nevertheless, some additional empirical
evidence will be presented below concerning the occurrence of a tooth-
saw pattern that would be consistent with too little ambulatory care in the
first year, causing deterioration of health status and higher ambulatory
care outlays in the second year.
3. Since the privately insured can choose their insurer, good risks may conceivably
select the insurance offering the largest bonus for a sequence of years
with no claims. Conclusion 4 would then reflect not the impact of
incentives contained in different policies but merely the effects of self-
selection. However, changing from one insurer to another entails a great
deal of transaction costs, slowing down the process of self-selection. The
plans analyzed in this paper were launched no more than five years prior
to the observation period. Additionally, insurers offering plans that are
strongly exposed to moral hazard protect themselves by requiring physical
exams at entry. Nevertheless, the issue of self-selection is important
enough to merit some further empirical investigation.

Intermediate run aspects

Testing the validity of criticisms 2 and 3 in a comprehensive manner would

require a period of observation of five years at least. Effects that are deferred
by more than five years do not count much from an economic point of view
because of discounting to present value. Unfortunately, only data for the
years 1981 and 1982 are jointly available from the three participating
insurers. The results presented in the sequel thus are no more than prelimi-
nary indications of effects expected to hold in the intermediate to long run.
In a multi-period context, demand for medical care in the current year
probably will contribute greatly to explaining medical care consumed in the
following year and possibly in several following years. The reason for this
correlation over time lies with important determinants of demand for medical
services that do not enter insurance records. But the expected amount of
correlation over time depends on the maintained hypothesis.
In particular, if criticism No. 2 obtains, then members of insurer A
should be least characterized by a tooth-saw pattern of outlays for medical care.
Given full protection, they have no reason for spending too little on medical
care today, causing higher expenditures tomorrow. In other words, correlation
over time in outlays should be very marked among members of
insurer A but less so in the case of insurers B and C, whose enrollees
might be characterized by some degree of tooth-saw pattern of utilization.
This line of thought results in

Conclusion 5

Intertemporal stability of ambulatory medical care consumption should be

highest among members of insurer A under the hypothesis that financial
incentives cause negative side effects on health in the intermediate to long run.

Turning to the self-selection hypothesis advanced in criticism No. 3, there are
two likely effects. To the extent that good risks are attracted by an insurer
offering large rebates and bonuses for no claims, great intertemporal stability
should be observed for insurers B and C. But secondly, rebates and bonuses
may turn marginally bad risks into good ones, with bonuses creating the
incentive of keeping them in that category (see conclusion 4). Thus, the
probability for someone who has consumed medical care in the first period
to abstain from consuming medical care in the second period should be high
among members of insurer B, resulting in a reduced degree of intertemporal
stability. Moreover, the probability of not consuming any care in two
consecutive years should be highest for insurer C, resulting in maximum

intertemporal stability. This argument can be summed up in

Conclusion 6

If self-selection of risks is an important factor, then intertemporal stability

must be greatest among members of insurer C, followed by insurers B and
A. However, intertemporal stability should also be highest for C if financial
incentives transform marginally bad risks into permanent good ones.

The statements contained in conclusions 5 and 6 can be subjected to a

preliminary empirical test by including an explanatory variable that indicates
whether or not the medical bill of the previous year has exceeded the
pertinent threshold value. These are the variables D81A, D81B, and D81C
appearing in Table 3. On the right hand side of Table 3, it can be seen that
members of insurers B and C still are less likely to exceed equivalent
threshold values than members of insurer A, according to the extended
specification. In the case of insurer B, however, the estimated differential is
reduced to 5.4 percentage points. As could be expected, the three new
regressors D81A, D81B and D81C are highly significant. But the coefficient
pertaining to insurer A (D81A) has a value of only 0.380, which is less than
that of D81C, amounting to 0.458. This clearly is incompatible with the
prediction of conclusion 5 that a tooth-saw pattern holds among enrollees of
insurers B and in particular C. Interestingly enough, the coefficient of D81B
(0.396) is statistically indistinguishable from the coefficient of D81A
(0.380), whereas the coefficient of D81C is the highest (0.458). This is
exactly what one would have predicted if bonuses have the power to
transform marginally bad risks into permanent good ones. Summing up, there
is justification for

Conclusion 7

Conclusion 5 fails to be confirmed empirically; at least in the intermediate

run, neither insurers B nor C seem to be characterized by a tooth-saw
pattern. Moreover, there is evidence in favor of an 'educational impact' of the
dynamic bonus offer of insurer C, as stated in Conclusion 6.
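
The lagged regressors D81A, D81B and D81C used in this test can be illustrated with a short sketch. It is not the author's code: the function name, the dictionary layout and the threshold values are assumptions. Only the figure of 1379 DM appears in the text, and it is reused here for all three insurers purely for illustration.

```python
from typing import Dict


def lagged_threshold_dummies(
    insurer: str,
    bill_previous_year: float,
    thresholds: Dict[str, float],
) -> Dict[str, int]:
    """Build insurer-specific indicators in the spirit of D81A, D81B, D81C.

    An indicator equals 1 only for the person's own insurer and only if the
    previous year's medical bill exceeded that insurer's threshold.
    """
    dummies = {}
    for name, threshold in thresholds.items():
        exceeded = insurer == name and bill_previous_year > threshold
        dummies[f"D81{name}"] = int(exceeded)
    return dummies


# Hypothetical thresholds in DM (assumption: identical for all insurers).
THRESHOLDS = {"A": 1379.0, "B": 1379.0, "C": 1379.0}

if __name__ == "__main__":
    # A member of insurer C whose 1981 bill was above the threshold.
    print(lagged_threshold_dummies("C", 1500.0, THRESHOLDS))
```

Each indicator then enters the 1982 equation as an additional regressor, so that its coefficient measures how strongly exceeding the threshold in 1981 predicts exceeding it again in 1982 for that insurer.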

Concluding remarks

The starting point of this contribution is the fact that there is an alternative to
the negative sanctions contained in conventional health insurance policies.
This alternative takes the form of rebate options and experience-rated
bonuses as offered by German private health insurers. From the consumer's
point of view, these new options may well be preferable to plans featuring
deductibles and coinsurance (conclusion 1). At the same time, rebates and in
particular bonuses are predicted to have a dampening impact on demand for
ambulatory medical care (conclusion 2). Moreover, the experience-rated
bonus is predicted to continually reduce demand even more than a roughly
comparable rebate offer (conclusion 3).
These predictions are subjected to empirical tests using insurance files
from three German private health insurers for the years 1981 and 1982.
Great care is taken to eliminate effects on accounted billings that merely
reflect the insured's decision not to submit a bill. Rather, estimated effects on
the billings distribution should reflect modifications of behavior, as studied in
the theoretical model. Insureds exposed to a rebate offer are found to exceed
a threshold value of almost 1400 DM (465 U.S. Dollars) for ambulatory
medical care with a lower likelihood than insureds having a conventional
policy, with age, sex, and supplementary hospital insurance held constant.
This dampening effect is even more pronounced among members of an
insurance that offers an experience-rated bonus (conclusion 4). While
seeking to save their rebate or bonus, insureds might jeopardize their future
health. Statistically, such behavior would give rise to a tooth-saw pattern in
medical care cost (conclusion 5). On the other hand, these estimated impacts
could be due to a mere self-selection of risks. Under this premise, insureds
having either a rebate or a bonus option should have a particularly stable
pattern of utilization over time. Alternatively, a bonus might educate insureds
to become permanently good risks, resulting in very high intertemporal
stability (conclusion 6). An analysis of billings from two consecutive years
leads to rejection of the tooth-saw pattern and self-selection hypotheses while
yielding some evidence in favor of an 'educational effect' of bonuses
(conclusion 7).
Thus, these insurance options with their positive rather than negative
economic incentives can be commended for their dampening impact on
health care cost. In view of the notorious financing problems of almost all
Western social health insurance schemes, experiences made by private
insurers with their innovative plans may well be of relevance to social health
insurance in countries such as Austria, Belgium, France, Germany, and the
Netherlands. Since self-selection always plays a certain role in insurance
markets where consumers have a choice, a conclusive comparison of plans in
terms of their impact on medical care costs would require an experiment of
the type of the health insurance study initiated by the Rand Corporation
(Manning et al., 1981). Ever since Beck's (1974) study of the effects of cost-
sharing on the poor, the notion that copayment may restrain demand for
medical care much more at lower than at higher income levels has been a
major concern to policy makers. Although this study is based on a high
income segment of the population, the stability of estimated relationships was
investigated by dividing the samples of insurers A and B into three broad
socio-economic groups (Zweifel, 1988). Coinsurance rates, deductibles, and
rebates all turned out to have minimum effect on utilization among enrollees
of the uppermost group and maximum effect in the lowest group, which is
still very much middle class, of course. However, rebates and bonuses are
defined relative to an insurance premium, and to the extent that lower
income groups buy less insurance, financial incentives for refraining from
medical care consumption are scaled down along with income, which is not
true of deductibles and coinsurance rates. Moreover, these effects are small
compared to primary determinants of demand for medical care such as age
and sex. In conclusion, these innovations in health insurance merit serious
attention in the debate about future reforms of social health insurance.


References

Beck, R. G. (1974), 'The Effects of Co-payment on the Poor', Journal of Human Resources
Bellmann, R. (1957), Dynamic Programming, Princeton, N.J.: Princeton University Press.
Brook, R. H. et al. (1983), 'Does Free Care Improve Adults' Health? Results from a
Randomized Controlled Trial', New England Journal of Medicine 309(23), 1426-1434.
Cairns, J. A. and Snell, M. C. (1978), 'Prices and the Demand for Care', In Culyer, A. J. and
Wright, K. G. (eds.), Economic Aspects of Health Services, London: Martin Robertson, pp.
Dionne, G. and Eeckhoudt, L. (1983), Risk Aversion and State-dependent Preferences, Paper
submitted to the 10th Seminar of the European Group of Risk and Insurance Economists.
Feldstein, M. S. (1973), 'The Welfare Loss of Excessive Health Insurance', Journal of Political
Economy 81(1), 251-280.
Fuchs, V. R. (1982), 'Time Preference in Health', In Fuchs, V. R. (ed.), Economic Aspects of
Health, Chicago: University of Chicago Press, pp. 93-120.
Harrel, F. E. (1980), 'Procedure Logist', In SAS Supplementary User's Guide, Raleigh, N.C.:
SAS Institute Inc.
Intriligator, M. D. (1971), Mathematical Optimization and Economic Theory, Englewood
Cliffs: Prentice-Hall.
Manning, W. G. et al. (1981), 'A Two-part Model of the Demand for Medical Care:
Preliminary Results from the Health Insurance Study', In Van der Gaag, J. and Perlman,
M. (eds.), Health, Economics, and Health Economics, Amsterdam: North Holland, pp.
Newhouse, J. P., Marquis, K. H. and Morris, C. N. (1981), 'Some Interim Results from a
Controlled Trial of Cost Sharing in Health Insurance', New England Journal of Medicine
Theil, H. (1972), Statistical Decomposition Analysis, Amsterdam: North Holland.
Uleer, C. (1981), 'Erfahrungen der privaten Krankenversicherung mit der Selbstbeteiligung
(Experiences with Cost Sharing made by Private Health Insurers)', Pharmazeutische
Industrie 43(11), 1070-1076.
Zweifel, P. (1985), 'Experiences with Rebates for No Claims in Health Insurance', In Göppl,
H. and Henn, R., 3. Tagung über Geld, Banken und Versicherungen, Vol. II, Karlsruhe:
Verlag VVW, pp. 1519-1534.
Zweifel, P. (1988), 'Premium Rebates for No Claims: The West German Experience', In
Frech, H. E. III (ed.), Health Care in America, San Francisco: Pacific Research Institute,
Microsimulation of the costs of the health system in
the Federal Republic of Germany

Institut für Soziale Medizin, Freie Universität Berlin

1. Problems of providing a model of the German health system

In recent years the cost structure of the German health care system has
developed in much the same way as in other European countries. As elsewhere,
there are tendencies to reduce or stabilize the rate of cost increases and to
obtain more information about what is going on within the field of health care.
Unlike the American or British health system, the German system can be
characterized by two general restrictions: there exist detailed legal provisions
with regard to the scope, structure and type of care of the health service, and
the bulk of the services is provided as benefits in kind.
Two books of law, the Social Law Code (Sozialgesetzbuch) and the Imperial
Insurance Code (Reichsversicherungsordnung), together with numerous
administrative regulations, determine:
- who has to be insured in the Statutory Health Insurance scheme (SHI,
  Gesetzliche Krankenversicherung)
- which types of services may be provided and to what extent
- according to which criteria the services should be remunerated
- who is permitted to provide ambulatory (outpatient) and stationary
  (inpatient) services
The law also stipulates that the patient receives services by virtue of
being insured in the SHI and that the physician, the hospital and the
pharmacist must settle their accounts with the SHI through interconnected
collective institutions. The patient is usually kept in ignorance of the costs
which have arisen from his treatment.
This constellation results in considerable deficiencies in information,
which become especially noticeable in times of general economic stagnation
or recession. Because the costs of the health services arise from the
individual use made of those services, both the SHI and the politicians
become aware of the exact extent of the total expenditure of the SHI only
some considerable time after the end of the year. On the other hand,

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 203-224.
© 1991 Kluwer Academic Publishers.
204 R. Brennecke

the insurance contributions are levied in advance from the insured. Secondly,
the distribution of transfer income among groups of people selected according to
certain criteria cannot simply be established, since the patients do not know
the costs of their services. Thus, direct econometric analyses of the use made
of the health services, with prices and services, are not feasible.
Nevertheless, considerable interest is being shown in the development of
models of the health system by the administration and politicians as well
as by academics. This is due not only to the difficulties with regard to the
planning of health expenditure and the definition of adequate contributions
but also to other problems which cannot be gone into in any
detail here. The Ministry of Work and Social Affairs (Bundesministerium
für Arbeit und Sozialordnung) engaged a group of experts in 1970 to
develop a macromodel for the calculation of health expenditure. The study,
with the data and methods then at its disposal, was unable to come to a
satisfactory conclusion. In 1975 the SHI commissioned the German Institute
for Economic Research (Deutsches Institut für Wirtschaftsforschung) to
examine the possibilities of creating a model for the ascertainment of the
SHI's expenses. The Institute concluded that a great number of influential
factors, especially on an individual level, had an effect on expenditure and for
that reason the development of a model seemed, at that time, to be impossible.
In 1976, a regional model of demand, developed on a system-dynamics
basis, was published by Klimpke (1976). However, the model was
not directly suited to the estimation of costs. In the meantime another
macromodel was constructed by Camphausen (1983).
One might say that the quintessence of previous studies reveals that there
is almost no point in developing a model of the German health system based
on macrodata. Changes in the law and shifts in structure can be taken into
consideration only with great difficulty. A microanalytically oriented formu-
lation on the other hand, would appear to be more promising. Here however,
the quality of the results depends decisively on the available data. Research
in the Federal Republic has to face several problems in this respect in
connection with the data protection laws, problems which lead one to search
for second-best solutions.

2. The basic structure of the microsimulation system of the Sfb 3

Fortunately, the Special Collaborative Programme 3 (Sfb 3) of the Universities
of Frankfurt and Mannheim was able to obtain microdata with regard
to employment, income, consumption, health complaints and illnesses, visits
to physicians and purchases at the pharmacist's for households and individuals.
These data form the basis of the microsimulation model of the Sfb.
Figure 1 shows the simplified structure of the model (Galler and Wagner,
1981), which has now also been implemented at the Free University of Berlin.

[Figure 1. Simplified Structure of the Microsimulation Model of the Sfb 3.
Drawing according to Galler and Wagner, 1981. Legible box labels: period t;
employment; income; tax and ...; health care financing.]

The basis of the simulation is a representative survey for the Federal Republic
consisting of 40,000 households and their members. First, the change in

demographic characteristics is simulated for each person of the first house-
hold. The persons will be one year older in the time span chosen by the Sfb
and the possibility of children leaving the parental home, of marriage, birth
and death are stochastically determined. As a result it is possible that a new
household will be formed or that a household disperses. Subsequently, school
education and vocational training and changes in employment are simulated.
The income from independent and dependent work, which is reduced by tax
and added to by monetary transfer payments, is calculated on the basis of the
simulation of employment. The next module, the module of health care
financing, simulates health insurance status, insurance company, utilization,
and costs of utilization of health services (see below). Finally the Sfb will
attempt to simulate household expenditure on durable consumer goods as
well as on actual consumption.
The result of this sequence of simulations is a household whose characteristics
are changed according to the underlying stochastic hypotheses, or
a new household is created, e.g. through marriage. The simulation then
continues with the next household, and so on, until the characteristics of all
households and their members have been changed. If a household no longer
exists, for example owing to death or emigration, this fact is merely flagged
so that any resulting inheritances can be represented in the following year. After
completion of the simulations, a new simulation survey is available which

serves as the starting point for the next calculation.
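
The household-by-household sequence just described can be sketched as a simple loop. This is a deliberately reduced illustration under stated assumptions, not the Sfb 3 implementation: only ageing and death are simulated, a single death probability applies to everyone, and the names `Person`, `simulate_year` and `death_prob` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Person:
    age: int


def simulate_year(
    households: List[List[Person]],
    death_prob: float,
    rng: random.Random,
) -> List[List[Person]]:
    """Advance every household by one year and return the new survey."""
    next_survey = []
    for household in households:
        survivors = []
        for person in household:
            # Death is determined stochastically (assumed uniform probability).
            if rng.random() < death_prob:
                continue
            # Every surviving person is one year older in the next period.
            survivors.append(Person(age=person.age + 1))
        # A dissolved household is kept as an empty entry, i.e. merely flagged.
        next_survey.append(survivors)
    return next_survey


if __name__ == "__main__":
    survey = [[Person(40), Person(38)], [Person(71)]]
    print(simulate_year(survey, death_prob=0.01, rng=random.Random(0)))
```

The returned list plays the role of the new simulation survey that serves as the starting point for the following year.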
At present some of the hypotheses of the model are estimated on
the basis of aggregate data derived from official statistics published by
the Federal Statistical Office or the Ministry of Work and Social Affairs.
Other hypotheses, such as those modelling health insurance (Baumann and
Brennecke, 1990) and inpatient care (Düllings, 1989), were estimated
using data of the German Socio-Economic Panel, which started in 1984
(Hanefeld, 1987).

3. Structure of the health module

For the foreseeable future, the development of a microsimulation system of

health costs is possible only on the basis of microdata. Since the survey
always includes both households and individuals, for whom we no longer
need to develop the demographic and employment patterns, the starting
point for the creation of a model is very promising.
The individual in each household, taken as a whole with all his socio-
economic traits, is the starting point for the simulation of health. Figure 2 is
designed to make the connections clear (Brennecke, 1984).

[Figure 2. Simplified Structure of the Health Module of the Microsimulation
Model. Legible box labels: household with all variables of all persons at
period ...; determine type of health insurance funds; incapacity of work?;
determine length of incapacity; estimate costs of insurance; estimate length
of stay and ...; estimate frequencies of visits and costs of care; estimate
costs of prescriptions.]

On the basis of the available variables, and later with the help of the
results gained from the previous period, it is necessary to simulate the type
of sickness fund and the insurance status, because the treatments and the
type of fee vary among different insurances.
The health insurance scheme of the FRG consists - with respect to costs
- of five parts. The most important part is the SHI, which covers about 90%
of the population (Stone, 1980). The SHI is highly fragmented, being composed
of about 1200 independent sickness funds (RVO-funds) and 15
substitute funds (Ersatzkassen). Each fund calculates its contribution rate
on the basis of its own expenditure. Within the RVO-funds
there are four types of funds: local funds (Ortskrankenkassen), factory funds
(Betriebskrankenkassen), guild funds (Innungskrankenkassen) and agricultural
funds (Landkrankenkassen). Miners (Bundesknappschaft) and mariners
(Seekasse) have entirely separate funds. The funds of each of the four RVO
types have joined to form State Associations (Landesverbände, 11 for
each type), and the State Associations in turn form one Federal Association
for each type of fund (Bundesverbände).
About 7% of the population, mostly civil servants and the self-employed, but
also some employees, are covered by private health insurance. Less than
1% are not covered at all. The remaining population is covered by special
schemes, e.g. armed forces, civil servant state subsidy or government grant.
In addition, two further systems exist: the Statutory Accident
Insurance (SAI) and the pension insurance funds. All health care that is
caused by an accident is paid for by the SAI; rehabilitation is mostly
financed by the pension insurance funds.
Statutory Health Insurance and the pension insurance funds are financed
by contributions in the form of earmarked payroll taxes (Geissler, 1980).
50% of the contributions are paid by the employer, the rest by the insured.
Depending on their working status, employees are obliged to be insured in
the SHI. Blue collar workers are generally insured, white collar workers up to
an annually changing wage ceiling. Family members of insured persons
are covered free of charge as long as they are not employed. Until 1982 pensioners
were covered free of charge; now they have to contribute to the SHI with an
increasing proportion of their pension. The contributions to the SAI are
paid entirely by the employers; each employee is automatically insured.
After having determined the insurance for each person, the second
estimation sequence will cover strongly aggregated illness patterns, for each
individual. From the combination of complaints and socio-economic vari-
ables we estimate the number of visits to various consultants or specialists, as
well as to the family doctor and to the dentist (compare Section 4).
The introduction of cost information on the basis of received treatments is
not as satisfactory as it should be. When one studies the structure of the data of
the German health system it becomes clear why we are forced to select a
second best solution for the estimation of costs. The SHI data are available in
computerized form only for some sickness funds, where the conversion to
E.D.P. has already been completed. Nevertheless, the data contain detailed
information on visits to the doctor, the diagnosis and therapy recommended

by him and the resulting costs. The information on hospital stay with certain
variables and on all prescriptions is likewise stored in the data. However,
exact details for each individual with regard to household type, education,
employment, consumption, tax, and transfer payments etc., are not included
in the data since they are irrelevant to the settlement of the accounts. Thus,
we are unable to combine these data by a statistical linking process with our
data base.
We have therefore chosen to solve the problem by calculating so-called
cost matrices on the basis of SHI data. These matrices, organized according
to specialists, age and sex of the individual, the frequency of visits to the
doctor and type of illness, ought to tell us the average type of treatment and
its costs. These basic costs (in later years the newly introduced classification
by a point-system will be used) ought then to be multiplied by the relevant
rate of increase of the fee in each case. Under the assumption that demand is
accurately simulated, we hope, in this way, to be able to simulate the
ambulatory costs. The number of specialists will be simulated parallel to this
sequence and linked to the simulation of education and employment patterns.
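
The cost-matrix idea can be illustrated with a small lookup sketch. It rests on several assumptions that are not in the text: the matrix is stored as a dictionary keyed by (specialist, age group, sex, number of visits, illness type), the age-group label and the cost figure are invented, and "multiplied by the relevant rate of increase" is read as multiplication by (1 + rate).

```python
from typing import Dict, Tuple

# Key layout (assumed): specialist, age group, sex, visits, illness type.
CostKey = Tuple[str, str, str, int, str]


def ambulatory_cost(
    cost_matrix: Dict[CostKey, float],
    key: CostKey,
    fee_increase_rate: float,
) -> float:
    """Look up the average basic cost of a cell and update it by the fee increase."""
    base_cost = cost_matrix[key]
    # Assumption: a rate of 0.05 means fees have risen by 5% since the base year.
    return base_cost * (1.0 + fee_increase_rate)


# Hypothetical cell: general practitioner, men aged 40-49, two visits,
# respiratory complaints, with an invented basic cost of 100 DM.
EXAMPLE_MATRIX: Dict[CostKey, float] = {
    ("GENPAR", "40-49", "m", 2, "RESPIR"): 100.0,
}

if __name__ == "__main__":
    cell = ("GENPAR", "40-49", "m", 2, "RESPIR")
    print(ambulatory_cost(EXAMPLE_MATRIX, cell, fee_increase_rate=0.05))
```

Because the matrix only needs SHI settlement data, it avoids linking individual SHI records to the survey, which is the reason given above for choosing this second-best route.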
Likewise on the basis of SHI data, we hope to be able to determine
probability distributions for the issue of prescriptions. The prescription costs
ought then to be evaluated according to specific illness patterns. With regard
to hospital admissions we estimated logit-functions including measures of
physician utilization as independent variables. Hospital lengths of stay are
determined by means of OLS-regression functions.
According to our plan so far we have been unable to include a number of
various additional services of the health system. Nevertheless, with the
branches of the health service which are included in the study we hope to
simulate 90% of the total costs and thereby to have taken the major part of
the health system into consideration.

4. Preliminary results of estimation

During the development of the module our work has concentrated on the
estimation of various coefficients, of which only those used to simulate the
primary contact and the subsequent visits to a physician are mentioned
here. A number of problems had to be solved in this connection.
First of all, the simulation period of the model covers one year. Over such a
period one cannot argue that visits to different specialists are independent
of each other. It would therefore be necessary to estimate the coefficients with
simultaneous methods. On the other hand, it is impossible, for reasons of time
and cost, to solve a system of simultaneous equations for each of the nearly
120,000 persons in the simulation sequence of one year. As an alternative
we decided to estimate contacts and frequencies of visits for one quarter of
the year and to simulate four quarters.
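
The quarterly approach can be sketched as follows, as a rough illustration under stated assumptions rather than the model's actual code. A logistic function turns a linear index into the probability of a physician contact in a quarter, and a fixed number of visits is added whenever a contact occurs; the function names, the index value and the visit count are hypothetical.

```python
import math
import random


def contact_probability(linear_index: float) -> float:
    """Logistic transformation of a linear index into a contact probability."""
    return 1.0 / (1.0 + math.exp(-linear_index))


def simulate_annual_visits(
    linear_index: float,
    visits_given_contact: float,
    rng: random.Random,
    quarters: int = 4,
) -> float:
    """Simulate one year as a sequence of quarters and sum the visits."""
    total_visits = 0.0
    p_contact = contact_probability(linear_index)
    for _ in range(quarters):
        # Step 1: is there any contact with the physician in this quarter?
        if rng.random() < p_contact:
            # Step 2: number of visits, given that a contact took place.
            total_visits += visits_given_contact
    return total_visits


if __name__ == "__main__":
    # Hypothetical person: index 0.3 and 2.5 visits per quarter with contact.
    print(simulate_annual_visits(0.3, 2.5, random.Random(42)))
```

In the model itself both the index and the number of visits would depend on the person's characteristics; holding them fixed here keeps the two-step structure visible.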
Secondly, it is well known that different factors influence the primary
contact with a physician and the subsequent visits. Our data base contains a
large number of variables, of which those used are shown in Table 1. The types of
variables are very similar to those of comparable studies (Wan and Soifer 1974,
1975; Manning et al., 1981), and comments at this point are not necessary.
To test the hypothesis of different influences we used a binomial logit
estimator to calculate contacts and regression analysis to estimate the
number of visits.
A further problem was to define sex-specific reference groups with a satisfactory
number of cases and a meaningful combination of variables. After
various tests we chose for men the group
married and living in a two-persons-household in a region with up to
50,000 inhabitants, fully employed and insured at the local funds, low
degree of education and average state of subjective health and no illness
patterns of the list.
Accordingly the reference group for women was found as
married and living in a two-persons-household in a region up to 50,000
inhabitants, too, but out of work, low degree of education and insured in
local funds due to their husbands (without own contributions to the

Table 1. Explanation of variables.

Abbrev.    Meaning

CASES      number of cases
REFGRP     number of persons belonging to the reference group
R2         multiple correlation coefficient
RHO2       relative variation of the loglikelihood
DF         degrees of freedom
FSTAT      F-statistic of the equation
CONST      constant factor
YEAR       number of the year 1970 to 1977
QUARTx     = 1 if quarter x of the year is true
AGE        age of the person
AGE*2      (age/10)^2
AGE*3      (age/10)^3
LOG AGE    natural logarithm of the age
SINGLE     = 1 if only 1 person lives in the household
MARRIED    = 1 if person is married
OTHERH     = 1 if person belongs to another household
ADULD      number of adults in the household
CHILD      number of children in the household
EDUCAL     = 1 if low degree of education
EDUCAA     = 1 if average degree of education
EDUCAH     = 1 if high degree of education
FULEMP     = 1 if fully employed
PAREMP     = 1 if partly employed
JOBLESS    = 1 if unemployed, but no not-employment status
NOTEMP     = 1 if not employment status
FARMER     = 1 if self-employed farmer or their family member
SELEMP     = 1 if self-employed person or their family member
CIVILS     = 1 if civil servant
WHICOL     = 1 if white collar worker (Angestellter)
BLUCOL     = 1 if blue collar worker (Arbeiter)
PENSION    = 1 if pensioner
OTHESO     = 1 if other social status
SHICOM     = 1 if compulsorily insured at the Statutory Health Insurance (SHI)
SHI PEN    = 1 if insured at the SHI as pensioner
SHIVOL     = 1 if voluntarily insured at the SHI
SHIFAM     = 1 if person insured at the SHI as family member
PRIV IN    = 1 if fully privately insured
NOT IN     = 1 if not health insured
LOCFUN     = 1 if insured at local funds (Ortskrankenkassen)
FACFUN     = 1 if insured at factory funds (Betriebskrankenkassen)
OTHFUN     = 1 if insured at guild, agricultural, miners or mariners funds
SUB FUN    = 1 if insured at substitute funds (Ersatzkassen)
COM SMA    = 1 if community with less than 5,000 inhabitants
COM AVE    = 1 if community with 5,000-100,000 inhabitants
COMLAR     = 1 if community with more than 100,000 inhabitants
PHPOR      physician-population ratio
METABO     disturbances of metabolism
NEURO      sleeplessness, exhaustion, nervous diseases
CIRCU      circulatory, coronary disturbances
RESPIR     respiratory complaints
TEETH      toothache etc.
DIGEST     indigestion
KIDNEY     diseases of the kidneys, cystitis, abdomen
CUTAN      cutaneous diseases
SKEMUS     diseases of the skeleton, muscular tissue
VGOOD      = 1 if subjective state of very good health
GOODH      = 1 if subjective state of good health
AVERAH     = 1 if subjective state of average health
BADH       = 1 if subjective state of bad health
VBADH      = 1 if subjective state of very bad health
GENPAR     general practitioner
INTERN     specialist of internal diseases
GYNACO     gynaecologist
EAR NOS    ear, nose, and throat specialist
DERMAT     dermatologist
RADIOL     radiologist
ORTHO      orthopedist
UROLO      urologist
OCULI      oculist
DENT       dentist
Table 2. Results of the logit-estimation of contacts with physicians, men older than 13 years.


CASES 18916 18916 18916 18916 13357 7126 7126 7126 18916
REFGRP 146 146 146 146 83 37 37 37 124
RH0 2 0.12 0.12 0.06 0.25 0.08 0.11 0.23 0.05 .18
DF 38 38 38 38 38 38 38 38 39


CONST 0.52 3.6 5.04 18.0 4.40 11.0 5.21 10.0 4.21 10.0 4.66 6.1 7.75 6.9 4.16 6.9 3.98 18.0 (3
Time factor :::::
JEAR 0.01 1.3 0.04 3.4 0.04 2.1 0.09 3.4 om 0.4 0.15 2.3 0.22 2.6 -0.08 1.5 0.07 7.0 ~
Predisposing factors
AGE -0.Q3 3.9 0.03 2.8 -0.02 1.0 -om 0.7 0.02 0.8 -0.04 1.2 0.04 1.1 0.02 1.0 0.05 5.0 .Q.,
AGE*2 0.04 4.9 -0.Q3 2.3 0.01 0.4 om 0.3 -0.03 1.3 0.01 0.5 -0.01 0.3 -0.01 0.2 -0.07 5.7
SINGLE -0.06 1.2 -0.02 0.3 -om 0.0 0.14 0.9 0.14 1.0 -0.51 2.0 -0.12 0.4 0.27 1.7 -0.14 1.8 t")
MARRIED * * * * * * * * *
OTHERH -0.08 1.4 -0.28 2.5 -0.71 4.0 0.02 0.1 0.50 3.6 -0.47 1.9 0.02 0.1 0.57 3.2 -0.12 1.5
EDUCAL * * *
* * * * * * ...;:;-
EDUCAA -0.03 0.6 0.48 6.4 0.09 0.7 0.41 2.7 -om 0.0 0.09 0.5 0.56 2.3 0.23 1.5 0.16 2.5 n:.
EDUCAH -0.28 4.7 0.63 7.0 0.59 4.3 0.40 2.2 0.16 1.0 0.34 1.5 0.23 0.7 0.06 0.3 0.42 5.6 ;:;-
FULEMP * $::>
* * * * * * * * :::.:-
PAREMP 0.31 1.9 0.31 1.3 0.57 1.7 0.03 0.1 0.18 0.4 -0.33 0.4 0.98 1.3 0.38 0.7 0.48 2.2 ;:;-
JOBLESS -0.15 0.9 0.04 0.1 0.03 (1.1 -(LOS 0.1 -0.05 0.1 0.60 1.4 0.09 0.1 -0.17 0.4 -1.25 4.0 ~
NOTEMP 0.21 1.2 (l.O I 0.0 1.25 2.5 0.58 1.3 -0.11 0.2 -1.39 1.6 13: 0.0 -0.02 0.0 0.32 1.5 '"...n:.
FARMER -0.61 4.9 -0.14 0.5 -1.47 1.9 -1.98 1.4 -0.54 1.4 -0.81 0.8 -1.01 0.9 0.01 0.0 -0.43 2.1 ~
SELEMP -(U8 4.4 0.14 1.0 0.04 0.2 -0.52 1.6 -0.44 1.7 -0.56 1.3 0.47 1.0 -0.54 1.6 -0.07 0.5
CIVILS 0.07 0.7 0.34 -0.18 -0.29 -0.43 -0.44 2.6 N
2.3 0.8 0.9 -0.51 2.0 1.1 0.36 0.7 1.4 0.31 ,......

Table 2. (continued) ~

WHICOL -0.13 2.2 0.11 1.1 0.05 0.3 -0.29 1.3 -0.20 1.3 0.13 0.5 -0.20 0.6 -0.36 1.6 0.30 3.7 8""
BLUCOL * * * * * * * * *
PENSION -0.25 1.2 0.08 0.2 -1.84 2.9 -1.95 2.4 0.18 0.3 0.70 0.8 -12'- 0.0 -0.01 0.0 -0.16 0.5
OTHER SO 0.10 1.1 -0.26 1.1 -1.07 2.8 -0.05 0.2 0.19 0.8 -0.05 0.1 -13.- 0.0 0.03 0.1 0.29 2.2

Enabling factors
SHICOM * * * * * * * * *
SHIPEN 0.08 0.5 0.12 0.5 0.83 1.7 1.48 2.0 -0.57 1.3 0.82 1.4 0.62 0.9 -0.03 0.1 -0.03 0.1
SHIVOL -0.10 1.6 0.14 1.4 0.23 1.5 0.20 0.9 0.22 1.2 0.30 1.1 -0.14 0.4 0.36 1.6 0.13 1.6
SHIFAM -0.53 3.2 0.38 1.3 -0.01 0.0 -0.21 0.5 -0.06 0.1 1.93 2.6 -0.43 0.4 -0.24 0.5 -0.14 0.7
PRIVIN -0.34 3.6 0.43 3.0 0.08 0.3 0.56 1.8 -0.02 0.1 0.83 2.2 -0.34 0.6 0.38 1.1 0.34 2.9
NOT IN -0.63 3.0 0.51 1.7 0.20 0.4 -0.61 0.5 -0.34 0.5 0.60 0.6 -1.19 0.8 0.50 0.8 0.12 0.4
LOG FUN * * * * * * * * *
FACFUN 0.09 1.9 0.12 1.4 0.01 0.1 0.12 0.7 0.18 1.4 -0.17 0.8 0.15 0.6 0.12 0.7 0.06 0.9
OTHFUN 0.06 1.0 -0.38 3.4 -0.20 1.3 -0.26 1.3 0.24 1.7 -0.31 1.2 -0.16 0.4 -0.10 0.5 0.09 1.1
SUB FUN -0.03 0.6 0.24 3.0 0.08 0.7 0.14 0.9 -0.05 0.4 0.03 0.2 0.18 0.7 0.26 1.6 0.18 2.6
COM SMA * * * * * * * *
COM AVE -0.15 3.8 0.48 6.2 0.50 4.3 0.55 3.6 -0.10 0.8 0.62 3.1 0.12 0.5 0.24 1.6 0.14 2.4
COMLAR -0.30 7.3 0.80 11.0 0.52 4.4 0.70 4.6 0.24 2.1 0.74 3.6 0.14 0.6 0.44 2.9 0.26 4.3

Morbidity (need factors)

METABO 0.64 6.0 0.64 5.8 -0.18 0.7 0.14 0.5 -0.17 0.7 0.19 0.6 0.24 0.7 0.17 0.7 0.03 0.2
Table 2 (Continued)


NEURO 0.08 2.0 0.13 2.1 -0.01 0.0 0.11 0.9 0.31 3.1 -0.17 1.1 0.21 1.1 0.37 3.0 0.16 2.9
CIBCU 0.61 16.0 0.53 8.3 0.10 1.0 -0.16 1.2 0.37 3.6 -0.34 2.1 0.07 0.3 0.13 1.0 0.19 3.3 r:)'
RESDIR 0.44 13.0 -0.07 1.2 0.89 9.8 -0.02 0.2 0.13 1.4 -0.18 1.3 -0.13 0.7 0.05 0.5 0.Q1 0.1
DIGEST 0.39 9.9 0.42 7.1 0.23 2.5 0.09 0.7 0.61 6.5 -0.13 0.8 0.07 0.4 0.22 1.8 0.07 1.2 ....'"~
KIDNEY 0.33 4.5 0.42 4.7 0.10 0.7 0.16 0.8 0.73 5.5 -0.22 0.8 2.75 14.0 0.33 1.8 0.08 0.8 0:::
CUTAN 0.09 1.1 -0.22 1.7 0.19 1.1 3.38 30.0 -0.25 1.3 -0.12 0.4 0.11 0.3 0.21 1.0 0.25 2.5 i:i'
SKEMUS 0.23 6.2 0.10 1.7 0.18 1.9 -0.09 0.7 0.30 3.0 1.51 9.6 0.03 0.2 0.09 0.7 0.10 1.9 6'
TEETH x x x x x x x x 2.34 46.0
V GOON -0.33 5.9 -0.32 2.5 -0.27 1.6 -0.22 1.1 -0.42 2.1 -0.77 2.2 -1.09 1.8 -0.22 1.0 -0.06 0.8 .Q,
01< 01< 01< 01< So
GOODH * * * * * ~
AVERAH 0.38 9.8 0.55 7.4 0.28 2.6 0.18 1.3 0.25 2.1 0.73 4.2 0.02 0.1 0.20 1.5 -0.08 1.3
BADH 0.79 13.0 0.84 8.9 0.53 3.5 0.22 1.1 0.65 4.4 1.13 4.7 -0.30 1.0 0.36 1.9 -0.04 0.5 ~
VBADH 1.04 11.0 1.00 8.4 0.77 4.0 0.19 0.7 0.80 4.2 1.46 5.0 -0.27 0.7 0.29 1.2 -0.40 2.9
* Variable of the reference group.
x Variable not defined.
' Coefficient results from small cell frequencies.
Source: Logit estimations from 47,000 questionnaires of 24 surveys during the years 1970 to 1977. The analysis was done by Pia Zollmann.

Table 3. Results of the logit-estimation of contacts with physicians, women older than 13 years.

CASES 27H60 27H60 27H60 27860 27H60 19740 10490 10490 10490 27H60 ~
REFGRP 140 140 140 140 140 93 41 41 41 126 ~
RHO' 0.11 0.11 0.14 O.OH 0.24 0.07 0.10 0.19 0.06 0.16
DF 3H 3H 3H 3H 3H 38 38 38 38 39 8"

CONST 0.53 5.3 5.6 31.0 3.45 24.0 4.61 17'.0 4.33 14.0 4.64 15.0 6.23 11.0 9.46 9.9 3.05 8.4 2.97 22.0

Time factor
JEAR 0,0\ 0.9 0.06 6.3 0.11 14.0 0.01 0.2 0.05 2.5 0.06 2.4 O.OS 1.5 0.24 2.9 -0.01 0.4 0.05 5.6

Predisposing factors
AGE -0.01 3.0 0.05 6.7 0.08 11.0 -0.01 0.9 -0.02 1.6 0,0\ 0.4 0.04 2.1 0.08 2.5 -0.03 2.4 0.03 4.9
AGE*2 0.02 5.0 -0.04 5.6 -0.13 15.0 -0.01 0.3 0.01 0.4 -0.02 1.4 -0.06 2.6 -0.09 2.4 0.03 2.7 -0.05 6.9
SINGLE 0.02 0.6 -0.04 0.7 -0.16 3.0 0.09 0.9 -0.02 0.1 0.10 0.9 -0.05 0.3 0.21 0.8 -0.08 O.S -0.21 3.7
OTHER II -0.06 1.3 0.09 1.1 -0.26 3.9 -O.OS 0.6 0.03 0.2 0.08 0.6 0.03 0.2 0.10 0.3 -0.19 1.4 -0.26 3.9
EDUCAA -0.14 4.0 0.54 10.0 0.17 4.0 0.27 3.1 0.36 3.4 0.20 2.1 0.71 5.3 0.24 1.0 0.44 4.6 0.45 10.0
EDUCA II -0.32 5.0 0.75 8.6 0.36 5.0 0.33 2.3 0.64 3.9 0.13 O.S 0.12 0.4 -0.46 1.0 0.54 3.3 0.49 6.6
FULEMP 0.24 2.1 -0.23 1.2 -0.53 3.7 0.05 0.2 -0.69 1.9 -0.33 1.0 -0.2S 0.4 0.98 1.0 0.55 1.4 0.07 0.5
PAREMP 0.22 2.0 -0.16 0.8 -0.25 1.8 (l.O9 0.3 -0.44 1.2 -0.24 0.7 -0.57 0.8 0.86 0.9 0.63 1.5 OJ I 0.8
JOBLESS 0.13 0.7 -0.12 0.4 -0.30 1.4 0.16 0.4 -0.30 0.5 0.04 0.1 -0.21 0.2 0.42 0.3 0.22 0.4 0.03 0.1
FARMER -0.33 2.3 -0.44 1.3 -0.34 1.3 -14.' 0.1 -0.45 0.6 0.18 0.3 -0.21 0.2 -13.' 0.0 -0.04 0.1 -0.63 2.6
SEL EMP -0.34 2.0 0.22 I.l 0.06 0.4 0.27 0.9 -0.41 0.9 -0.15 0.4 0.36 0.5 -1.86 1.6 -0.51 1.1 -0.07 O.S
CIVILS 0.21 1.4 0.07 0.3 0.57 3.2 0.57 1.7 0.21 0.5 0.48 1.2 0.34 0.4 0.04 0.0 -0.44 0.9 -0.04 0.2
Table 3. (continued)


WHI COL -0.18 2.1 0.12 0.7 (UI 2.7 0.08 0.3 -0.01 0.0 0.19 0.7 0.10 0.2 -0.94 1.6 -0.32 1.0 0.05 0.4
BLU COL -0.10 1.1 -0.02 0.1 (UO 0.8 0.02 0.1 -0.04 0.1 0.28 0.9 0.02 0.0 -1.0 1.6 -0.13 0.4 -0.03 0.3
PENSION 0.22 3.1 -0.07 0.7 0.01 0.1 0.07 0.4 -0.16 0.6 -0.10 0.5 -0.21 0.6 0.43 0.7 0.34 1.7 -(J.09 0.8
OTHER SO * * r)'
Enabling factors '"§.
SHICOM 0.01 0.1 0.29 2.2 0.16 1.5 -0.18 0.8 0.62 2.1 0.01 0.0 0.20 0.5 0.36 0.4 -0.20 0.7 0.02 0.2 :;::
1.6 -0.03 0.2 -0.14 1.2 -0.05 0.2 0.49 1.5 0.08 0.1 0.25 1.4 0.13 1.2 !:)
SHI PEN -0.08 1.2 0.16 0.7 0.30 ...
SHI VOL -0.20 2.1 0.23 1.7 0.04 0.4 -0.38 1.6 0.47 1.7 0.13 0.6 0.08 0.2 -0.09 0.2 -0.01 (J.O 0.06 0.5 C·
PRIV IN -0.56 8.7 0.68 7.3 0.23 2~ ~06 0.4 0.10 0.5 0.18 1.0 0.29 1.1 0.19 0.4 -0.01 0.1 0.35 4.4
NOTIN -0.82 4.7 -0.25 0.7 -0.20 Oh ~25 0.6 -14: 0.1 -1.48 1.3 0.45 0.4 -12.0 0.0 -0.51 0.7 -0.21 0.7
FAC FUN -0.01 0.1 0.14 2.1 0.11 2.1 0.11 1.0 0.13 1.0 -0.06 0.5 0.15 0.8 O.S2 3.1 -0.02 0.2 0.n6 n.9 ~
OTH FUN -0.06 1.2 -0.02 0.3 OJ)7 I.l 0.19 1.4 -0.09 0.5 -OJl3 0.2 0.05 0.2 0.22 0.6 -0.08 0.6 (LOS 0.7
0.3 O.2S 5.9
SUB FUN -0.10 2.8 0.46 8.2 0.37 8.3 0.37 4.0 0.06 0.6 0.19 1.9 0.28 1.9 0.53 2.2 0.03

COM AVE -0.17 5.3 0.50 8..1 0.27 5.9 0.31 3.1 0.29 2.6 0.18 1.8 0.20 1.2 O.OS 0.3 0.18 1.7 0.1 1 2.3 ::::-
COM LAR -0.44 13.0 0.87 15.0 0.61 14.0 0.66 6.9 0.44 3.9 0.32 3.2 0.49 3.0 (US 0.7 0.30 2.8 0.19 4.2 ~
Morbidity (need factors) ~
METABO 0.67 8.5 0.37 4.5 (1.07 0.6 0.14 O./l -0.06 0.3 -0.10 0.6 -0.35 1.3 -O.S5 1.2 0.34 2.4 -0.10 0.9 ~


Table 3 (Continued)

NEURO 0.07 2.3 0.19 3.9 0.21 5.3 0.20 2.5 0.17 1.7 0.49 5.3 0.08 0.6 0.01 0.0 0.35 4.0 0.19 4.7
CIRCU 0.61 21.0 0.59 12.0 0.11 2.8 0.26 3.1 0.13 1.4 0.12 1.3 -0.13 1.0 -0.28 1.4 0.23 2.6 0.11 2.7
RESPIR 0.33 12.0 -0.10 2.3 -0.Q3 0.9 1.01 13.0 -0.01 0.1 0.28 3.7 -0.19 1.6 0.14 0.8 -0.01 0.1 0.11 3.0
DIGEST 0.21 6.9 0.35 8.0 0.19 4.8 0.05 0.6 0.13 1.4 0.43 5.5 0.14 1.1 0.01 0.1 0.26 3.2 0.10 2.5
KIDNEY 0.21 4.9 0.17 3.0 1.10 25.0 -0.15 1.6 -0.17 1.4 0.24 2.6 -0.24 1.5 2.61 13.0 0.16 1.5 0.01 0.1
CUTAN 0.06 1.0 0.21 2.4 0.06 0.8 0.49 4.1 3.26 38.0 0.14 1.0 -0.09 0.4 -0.15 0.4 0.37 2.6 0.19 2.5
SKEMUS 0.12 3.8 0.19 4.1 0.06 1.5 0.38 4.8 -0.09 0.9 0.35 4.2 1.27 9.1 -0.27 1.3 0.42 5.0 0.17 4.2
TEETH x x x x x x x x x 0.02 50.0
V GOOD -0.49 8.9 -0.33 2.8 -0.12 1.9 -0.54 3.0 -0.21 1.3 -0.33 1.6 0.32 1.2 0.36 0.8 -0.56 2.8 0.09 1.4
GOODH * * * * * *
AVEPAH 0.35 11.0 0.38 6.3 0.07 1.7 0.11 1.1 -0.07 0.6 0.25 2.3 0.50 2.9 0.46 1.7 -0.13 1.3 -0.09 2.0
BADH 0.71 16.0 0.73 10.0 -0.01 0.2 0.38 3.2 0.17 1.2 0.79 6.4 0.98 4.9 1.00 3.3 -0.10 0.8 -0.11 1.8
VBADH 0.72 10.0 0.93 10.0 -0.14 1.5 0.57 3.7 0.17 0.8 1.09 7.1 1.61 7.0 1.14 3.1 -0.13 0.7 -0.44 4.2

* Variable of the reference group.
x Variable not defined.
' Coefficient results from small cell frequencies.
Source: Logit estimation from 47,000 questionnaires of 24 surveys during the years 1970 to 1977. The analysis was done by Pia Zollmann.
Table 4. Results of the regression of frequencies of visits, men older than 13 years.


CASES 9547 1883 703 481 668 277 172 448 3286
REFGRP 209 32 14 7 12 4 3 5 51
FSTAT 100.2 18.9 3.4 4.9 5.5 4.7 3.6 3.9 19.4
R2 0.25 0.17 0.14 0.12 0.18 0.21 0.11 0.06


CONST 4.77 3.6 7.10 3.3 1..76 0.5 -5.66 1.1 7.49 2.6 -23.7 1.0 -3.40 0.2 4.80 0.5 0.96 0.7 C5

Time factors
YEAR -0.05 2.7 -0.06 2.1 0.03 0.6 0.11 1.6 -0.08 2.1 0.29 0.9 0.09 0.3 -0.06 0.5 0.01 0.5 $S"'
QUART2 0.05 o.s -0.15 0.7 -0.37 0.9 -0.02 0.1 -0.14 0.6 -2.82 1.1 0.26 0.1 1.85 1.6 0.13 0.9 0·
QUART3 -0.21 2.4 0.05 0.3 0.92 2.6 0.03 0.1 0.41 1.5 -0.66 0.3 -1.19 0.6 1.66 1.7 -0.19 1.6
QUART4 0.22 2.5 -0.49 2.5 0.56 1.6 -0.90 2.3 0.01 0.1 -1.40 0.6 0.70 0.3 -1.60 1.4 0.14 1.2 ~
Predisposing factors r,
AGE '"/;;'
AGE*2 0.01 2.5
LOG AGE 0.60 3.5 3.24 2.9 0.88 2.0 ~
SINGLE 0.37 1.8 0.25 2.0 S.

MARRIED * * * * * ~
* * * ~
OTHER II 0.52 2.3 0.24 1.7 1.69 2.7 1.19 1.8 ~

ADULT -0.14 1.8 0.18 3.3 ~

CHILD -0.08 3.7 -0.09 1.7 -0.05 1.9 ~
EDUCAL * * * * * * * * * '";:s~
EDUCAII -0.17 2.0 -0.45 3.1 -0.30 2.1
FULEMP * * * * * * * * * N
PAREMP 0.79 1.7

Table 4. (continued)
HOTEMP 0.59 9.0 -0.54 2.0 r"l
SELEMP -0.68 1.9 1.09 2.3
CIVILS 0.23 2.2 -0.63 1.8
>I< >I< >I< >I< >I< >I<
BLUCOL * * *
PENSION 0.31 2.4 1.02 2.4 -1.31 2.4
OTHESO -0.23 3.0 0.55 1.7 -0.23 1.9 -5.63 2.2 0.77 3.3

Enabling factors
>I< >I< >I< >I< >I< >I< >I< >I< >I<
SHIPEN 0.63 1.9 -0.80 1.8
SHIVOL -0.24 3.1 -0.56 2.1 0.33 2.4 -0.94 2.1
SHIFAM -0.28 2.1 -0.69 2.0 7.61 3.2
PRIVIN -0.47 4.5 -0.31 1.9 -0.60 1.9
NOT IN 2.36 3.0 -0.69 1.9
>I< >I< >I< >I< >I< >I<
LOCFUN * * *
OTHFUN 0.15 2.0 0.33 1.7
SUB FUN -0.18 2.9 -0.25 1.9 0.74 1.9
>I< >I< >I< >I< >I< >I<
COM SMA * * *
COMLAR 0.10 2.0 0.58 2.6 0.80 3.6 -0.26 1.8
PHPOR -0.20 3.2 -0.15 1.8
Table 4 (Continued)


Morbidity (need factors)

METABO 0.40 3.9 0.55 2.9 x
NEURO -0.52 2.6 x
CIRCU 0.44 8.8 0.70 6.2 -0.41 2.0 x
RESPIR -0.12 2.7 0.28 2.7 0.35 1.8 x
DIGEST 0.13 2.6 -0.22 2.2 -0.58 1.7 x
KIDNEY 0.22 2.7 x ~
CUTAN 0.20 2.0 -0.55 2.2 0.93 4.8 x
SEEMUS 0.11 2.2 1.56 4.1 -0.36 2.3 x C5
TEETH x x x x x x x x 0.7Sx 12.0 '"
VGOOD -0.55 5.7 -0.72 2.0 :;::
GOOD II -0.35 6.0 -0.24 1.7 -0.84
3.9 -0.19 2.9 :=to
AVERA II * * * * * * * * * ~
BAD II 1.07 16.0 0.76 5.4 0.52 1.7 0.93 2.1 ..Q.,
V BAD II 2.30 26.0 1.70 9.4 0.90 5.6 1.12 1.9 0.54 2.0 ~
Contacts with other physicians ~
GENPRA x -l.02 8.7 -0.35 1.8 -0.22 1.9 -0.84 2.4 -1.0 2.9 x 1;;-
INTERN 0.17 1.7 x -1.16 2.3 x ..Q.,
EAR NOS 0.28 2.3 x x ~
DERMAT -0.67 1.7 x 0.90 3.7 -1.90 2.2 0.71 2.2 x ;:;-
RADIOL 0.72 7.1 0.45 2.2 x -0.41 1.8 x ~
ORTHO 0.76 4.0 x x :::.-
* Variable of the reference group.
x Variable not defined.
Source: 47,000 questionnaires of 24 surveys during the years 1970 to 1977.
Table 5. Results of the regression of frequencies of visits, women older than 13 years.
CASES 12892 2652 4384 806 631 727 304 125 740 4297
REFGRP 361 42 130 13 14 15 6 3 8 106 ;3
FSTAT 134.7 32.4 13.9 4.6 6.2 4.6 3.7 4.1 3.8 25.6
R2 0.19 0.18 0.07 0.G7 0.12 0.07 0.21 0.34 0.05 0.06

CONST 4.62 5.6 4.58 2.6 6.49 6.5 3.81 1.0 4.81 1.2 -1.75 0.7 37.2 1.5 3.89 0.2 6.54 1.0 3.10 2.3

Time factors
YEAR -0.03 2.8 -0.02 1.0 -0.03 2.4 -0.06 1.3 -0.05 0.9 0.02 0.5 -0.46 1.4 -0.12 0.4 -0.07 0.8 -0.03 1.5
QUART2 -0.12 1.4 -0.09 0.5 0.17 1.8 0.64 1.8 -0.70 1.7 -0.09 0.4 3.14 1.2 2.71 1.1 0.67 0.8 0.08 0.6
QUART3 -0.14 2.0 -0.37 2.4 -0.01 0.0 0.31 1.0 -0.12 0.3 -0.16 0.6 3.09 1.5 4.94 2.4 0.56 0.9 -0.10 0.9
QUART4 -0.28 3.7 -0.43 2.5 -0.01 0.1 -0.46 1.4 0.21 0.6 0.03 0.1 -3.73 1.7 -6.51 2.7 -0.33 0.5 -0.01 0.1

Predisposing factors
AGE' 1 0.01 4.4 0.01 2.1 -0.Ql 2.5 -om 2.1 0.01 4.1 0.01 2.9
LOG AGE -1.54 5.7 2.05 2.6 0.98 2.0 1.16 2.7 4.12 3.6 0.44 2.6
SINGLE -0.23 2.7 1.36 3.2
MARRIED • * * * * * * *
OTHERH 1.23 2.3 0.30 1.7
ADULT -0.09 3.1 -0.11 2.5 -0.13 2.0
CHILD -0.04 1.8 0.36 1.8
EDUCAL * * * * * * •
FULEMP -0.36 2.2 -0.29 2.9 -1.26 1.8
Table 5. (continued)


PAREMP -0.49 2.6 -0.31 2.9 -1.92 2.3

HOTEMP * * * *
SELEMP 0.29 1.7 3.73 3.1
CIVILS -1.09 2.3
WHICOL 0.18 1.9 0.13 1.8 ~
BLUCOL 0.40 3.5 0.18 1.7 (:i'
PENSION 0.23 3.2 d
OTHESO * * * * * * '"§"
Enabling factors S'
SHICOM 0.33 2.0 1.17 1.7 0
SHIPEN 0.41 3.7 0.53 1.9 -1.16 2.2 -1.23 2.3
SHIVOL -0.27 2.2
* * * * * *
PRIVIN -0.41 4.4 §
LOCFUN * * * * * * .Q,
FACFUN 1.13 2.7
OTHFUN 0.41 2.5 1.33 2.1
SUB FUN -0.16 3.3 -0.16 1.7
COM SMA * * * * * * ""-
COMLAR 0.15 3.4 0.23 2.3 So
PHPOR 0.06 2.2 ~
Morbidity (need factors) ~
METABO 0.66 7.9 0.70 4.2 1.18 2.2 x tv

Table 5. (continued)
NEURO 0.10 2.3 x
CIRCU 0.45 9.8 0.23 2.2 0.38 1.8 ;S
-0.68 1.8 x (1)
PESPIP -0.13 2.8 0.87 2.5 x
DIGEST 0.19 4.3 x
KIDENY 0.13 2.2 0.55 11.0 -1.31 2.7 0.81 2.2 x
CUTAII 0.50 1.8 0.85 4.5 1.37 2.0 x
SKEMUS 0.19 4.3 0.22 2.4 x
TEETH x x x x x x x x x 0.90 14.0
V GOOD -0.55 5.2 -0.55 2.1 -0.34 3.9 -0.77 1.8 -0.67 1.8 -1.32 1.7 -0.27 2.6
GOOD II -0.34 6.4 -0.63 5.0 -0.10 1.8 -1.11 2.3 1.01 2.2
AVERA II * * * *
BAD II 1.00 19.0 1.05 9.5 0.21 3.0 0.56 2.1 0.40 3.7 0.73 1.7 0.32 3.5
VBADII 2.06 25.0 1.54 10.0 0.59 5.1 0.61 2.0 0.88 2.1 0.57 3.9 1.47 2.8 0.86 1.7 0.38 2.0 0.73 4.2

Contacts with other physicians

GENPRA x -1.26 13.0 -0.23 4.9 -0.44 2.2 x
INTPN 0.19 2.2 x -0.25 3.7 -0.91 3.2 -0.99 2.6 x
GYNACO x -0.43 2.2 -0.68 3.3 -0.20 2.1 x
EAR NOS 0.44 2.5 x x
DEPMAT x 0.41 2.1 x
RADIOL 0.75 7.2 x -1.16 2.5 x

* Variable of the reference group.
x Variable not defined.
Source: 47,000 questionnaires of 24 surveys during the years 1970 to 1977.
Microsimulation of the costs of the health system 223

insurance). They should state their subjective health as average, too, and
should claim no illness patterns of the list.
The logit results for contacts with physicians of various specialities are
shown in Tables 2 and 3. As expected, the strongest influences on primary
contacts with a physician are illness patterns and the subjective state of
health. The other variables are less important, although some of their
coefficients are also significant. Age, for instance, is important to a degree
that varies among the specialities, whereas employment status scarcely
matters. The social status of men is nearly irrelevant; that of women seems
to influence contacts with general practitioners and gynaecologists.
As Tables 4 and 5 show, the results for the frequencies of visits differ
somewhat from those for contacts. The coefficients of determination
(multiple R2) of all regressions are relatively low. Besides the effect of
microdata regression (Yett et al., 1979; Dworschak and Wagner, 1983;
Galler, 1983), one reason may be that variables which represent
physician-induced demand are scarcely included at all. The R2 of 0.11 for the
regression of frequencies of visits to oculists may show this particularly
clearly. Normally one visit is sufficient to establish the state of the eyes
and to be issued with new spectacles. Only diseases of the eyes necessitate
subsequent visits, for which the oculist requests the patient to return. We
are unable to include variables which represent this kind of demand.
For all specialists, specific illness patterns are generally significant; the
subjective state of health is significant only in some regressions. Age and
contact with other specialists are significant factors, too.
An interpretation of all coefficients of our analysis would be beyond the
scope of the paper (but compare Zollmann and Brennecke, 1984). Our
ongoing work will concentrate on introducing physician-population ratios
into the logit analysis and on combining the results with average costs
per visit, as described below.
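
The two-stage logic behind Tables 2 to 5 (a logit for whether any contact occurs, then a regression for the number of visits among those with a contact) can be sketched as follows. This is only a minimal illustration in Python, not the estimation code used for the tables: the coefficient values and the function names are hypothetical, and it assumes that a positive logit index raises the probability of a contact.

```python
import math


def contact_probability(coeffs, x):
    """Logit stage: P(contact) = 1 / (1 + exp(-(CONST + sum_k b_k * x_k)))."""
    index = coeffs["CONST"] + sum(coeffs[name] * value for name, value in x.items())
    return 1.0 / (1.0 + math.exp(-index))


def expected_visits(logit_coeffs, visit_coeffs, x):
    """Two-stage expectation: P(contact) times predicted visits given a contact."""
    p = contact_probability(logit_coeffs, x)
    visits = visit_coeffs["CONST"] + sum(visit_coeffs[n] * v for n, v in x.items())
    return p * max(visits, 0.0)  # a negative linear prediction is truncated at zero


# Hypothetical coefficients for illustration only, NOT taken from Tables 2-5.
logit_coeffs = {"CONST": -0.5, "CIRCU": 0.6, "BADH": 0.8}
visit_coeffs = {"CONST": 2.0, "CIRCU": 0.4, "BADH": 1.0}

reference_person = {"CIRCU": 0, "BADH": 0}  # all dummies at their reference values
patient = {"CIRCU": 1, "BADH": 1}

print(contact_probability(logit_coeffs, reference_person))
print(contact_probability(logit_coeffs, patient))
print(expected_visits(logit_coeffs, visit_coeffs, patient))
```

Under these assumptions a dummy coefficient b changes the odds of a contact by the factor exp(b) relative to the reference group, which is one way of reading the starred reference categories in the tables.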


Baumann, M. and Brennecke, R. (1990), Das Krankenversicherungsmodul im Mikrosimulationssystem
des Sfb 3, Sfb 3-Arbeitspapier Nr. 312, Frankfurt, Mannheim.
Brennecke, R. (1984), Zur Konstruktion des Gesundheitsmoduls im Mikrosimulationssystem
des Sonderforschungsbereichs 3, Sfb 3-Arbeitspapier Nr. 145, Frankfurt, Mannheim.
Camphausen, B. (1983), Auswirkungen demographischer Prozesse auf die Berufe und die
Kosten im Gesundheitswesen, Springer-Verlag, Berlin, Heidelberg, New York, Tokyo.
Düllings, J. (1989), Ein Mikrosimulationsmodell der stationären Krankenversorgung -
Datengrundlage und Hypothesenstruktur, Sfb 3-Arbeitspapier Nr. 289, Frankfurt, Mannheim.
Dworschak, F. and Wagner, G. (1983), Zur Fortschreibung von Erwerbseinkommen in
Mikrosimulationsmodellen, Das Beispiel des Sfb 3 Modells, Sfb 3-Arbeitspapier Nr. 80,
Frankfurt, Mannheim.
Galler, H. P. (1983), Zur Erklärung des individuellen Arbeitsangebotes, Sfb 3-Arbeitspapier
Nr. 104, Frankfurt, Mannheim.
Galler, H. P. and Wagner, G. (1983), 'Das Mikrosimulationsmodell', in Krupp et al. (ed.),
Alternativen der Rentenreform '84, Campus-Verlag, Frankfurt, New York.
Geissler, U. (1980), 'Health Care Cost Containment in the Federal Republic of Germany', in
Brandt et al. (ed.), Cost-Sharing in Health Care, Springer-Verlag, Berlin, Heidelberg, New
Klimpke, W. A. (1976), Dynamische Systemanalyse der ambulanten und stationären
Krankenversorgung einer Region, Wahl-Verlag, Karlsruhe.
Manning, W. G. et al. (1981), 'A Two-Part Model of the Demand for Medical Care:
Preliminary Results from the Health Insurance Study', in Gaag and Perlmann (ed.), Health,
Economics and Health Economics, North-Holland Publ. Co., Amsterdam, New York,
Stone, D. A. (1980), The Limits of Professional Power, National Health Care in the Federal
Republic of Germany, The University of Chicago Press, Chicago and London.
Wan, T. T. H. and Soifer, S. J. (1974), 'Determinants of Physician Utilization: A Causal
Analysis', in Journal of Health and Social Behavior, 15.
Wan, T. T. H. and Soifer, S. J. (1975), 'A Multivariate Analysis of the Determinants of Physician
Utilization', in Socio-Economic Planning Sciences 4.
Yett, D. E. et al. (1979), A Forecasting and Policy Simulation Model of the Health Care
Sector, Lexington Book, Lexington, Massachusetts, Toronto.
Zollmann, P. and Brennecke, R. (1984), Ein Zweistufen-Ansatz zur Schätzung der
Inanspruchnahme ambulanter ärztlicher Leistungen, Sfb 3-Arbeitspapier Nr. 137, Frankfurt, Mann-

Segmentation and classification.
An application to patients' risk estimation


URA 934 Batiment 101 University LYON I, Boulevard du 11 Novembre 1918,
69622 Villeurbanne, France

1. Introduction

In many fields of research we have to deal with the so-called "classification
problem", namely that of assigning an individual $x$ to one of several
prespecified classes $\omega_j$, $j = 1, \ldots, r$. That assignment is based on
some features or measurements made on the individual $x$, $Q^i(x)$,
$i = 1, \ldots, p$. This problem is difficult because there often exists a
substantial amount of variability in the measurements of individuals belonging
to the same class.
Various approaches have been developed according to the nature of the
exogenous variables $Q^i$ and the specificity of the application field [3, 4, 5].
In this paper, the exogenous variables are of the discrete type; we build a
tree structure on the training set, and information theory provides us with the
criterion for choosing the exogenous variable at each non-terminal node of
the tree.
In order to optimize the use of the information given by the training set,
we show that it is more efficient to build a non-arborescent structure.
This structure provides each class $\omega_i$ with a profile defined by the
variables $Q^i$. Thus, it can be used both as a segmentation tool and as a
classifier.
Finally, an illustration is given, which deals with the prognosis of the
evolution of burnt patients.

2. Notation

Let $X$ be the set of individuals; for simplicity (without loss of generality)
let us consider a problem with two classes $\omega_1$ and $\omega_2$, and binary
features $Q^j$, $j = 1, \ldots, p$.
Thus the endogenous variable $Q$ and the exogenous variables $Q^j$ are the
following mappings:
$$Q : X \to \{\omega_1, \omega_2\}$$
$$Q^j : X \to \{q_1^j, q_2^j\}, \qquad j = 1, \ldots, p,$$
where $q_1^j$, $q_2^j$ denote the two possible states of $Q^j$.

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 227-235.

© 1991 Kluwer Academic Publishers.
228 J. -Po Auray et al.

In the following, to simplify notation, we will not make any distinction
between the label $\omega_i$ and the subset of $X$ defined as follows:
$$\{x \in X : Q(x) = \omega_i\}.$$
The training step consists in recording, for a subset $T$ of $X$, the following
information:
$$(Q^j(x),\ j = 1, \ldots, p;\ Q(x)), \qquad \forall x \in T.$$
The problem is to classify individuals $y$, $y \notin T$, by observing as few
variables $Q^j$ as possible.

3. An arborescent procedure

3.1. Description

Let us consider the arborescent segmentation given by Figure 1, which deals
with a toy example where
$$\mathrm{card}(T) = 200, \qquad \mathrm{card}(T \cap \omega_i) = 100, \quad i = 1, 2,$$
and where $q_1^j$ is denoted by '0' and $q_2^j$ by '1', for all $j$.

Figure 1.

Each node $s$ of the graph corresponds to a subsample $T_s$ of $T$; for instance
the node $s_4$ corresponds to the individuals $x$ of $T$ such that:
$$Q^1(x) = 0 \quad \text{and} \quad Q^4(x) = 1.$$
To every node $s$ we associate the counts $n_s^i$, $i = 1, 2$, of individuals
belonging to $\omega_i \cap T_s$; in Figure 1, this information is given by the
mark attached to each node.

To each non-terminal node we associate a feature $Q^j$, and this association
aims to create new nodes $s$ such that one of the two numbers $n_s^1$, $n_s^2$
is near to zero.
In that toy (ideal) example, the five terminal nodes contain only individuals
belonging to one of the two classes.
Such a method clearly provides a description of the classes; for instance
class $\omega_2$ coincides on $T$ with the individuals $x$ such that:


3.2. An arborescent classifier tool

Now, the problem is to classify individuals $y$, $y \notin T$. Within the previous
framework (Figure 1), let us consider an individual $y$, $y \notin T$, such that
$$Q^1(y) = 0 \quad \text{and} \quad Q^4(y) = 0. \tag{I}$$
We will naturally associate $y$ with the node $s_3$ of the tree, and will consider
the subset $X_{s_3}$ of $X$ defined as follows:
$$X_{s_3} = \{x \in X : Q^1(x) = 0 \ \text{and} \ Q^4(x) = 0\}.$$
The likelihood of the class $\omega_1$ for the individual $y$, with respect to
information (I), can be measured by the probability $p(\omega_1 / X_{s_3})$, the
estimate of which is 2/9 on the training set.
Thus we can use the arborescent structure of Figure 1 as a classifier
tool by choosing decision thresholds.
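
The use of a terminal node as a classifier with decision thresholds can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the counts `(2, 7)` are hypothetical (chosen only so that the estimate equals the 2/9 quoted above, since the counts of Figure 1 are not reproduced here), and the two thresholds `lower` and `upper` and the function names are illustrative choices, not values given in the text.

```python
def class_probability(counts, class_index):
    """Estimate p(class | node) by the relative frequency n_i / (n_1 + n_2)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty node: no training individuals")
    return counts[class_index] / total


def decide(counts, lower=0.3, upper=0.7):
    """Return 'w1', 'w2' or None (no decision) from the estimate for class w1."""
    p1 = class_probability(counts, 0)
    if p1 >= upper:
        return "w1"
    if p1 <= lower:
        return "w2"
    return None  # estimate falls between the thresholds: decision risk too high


# Hypothetical counts (n1, n2) giving the estimate 2/9, as for node s3.
s3_counts = (2, 7)
print(class_probability(s3_counts, 0))
print(decide(s3_counts))
```

Leaving a band between the two thresholds where no decision is taken is one possible design; a single threshold would force a class on every individual.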

3.3. An algorithm for building the arborescent structure

In accordance with the previously defined objectives (3.1, 3.2), the terminal
nodes $s$ of the tree should verify the following property:
{one of the two numbers $n_s^1$, $n_s^2$ is equal to zero, or near to zero.}
This suggests using all the exogenous variables $Q^j$, $j = 1, \ldots, p$, and thus
considering the tree with $2^p$ terminal nodes. Unless $p$ is small or the size of
the training set is fairly large, this approach does not work. The 'quality' of
$n_s^i/(n_s^1 + n_s^2)$ as an estimator of $p(\omega_i / X_s)$, as measured for
example by a confidence interval, is directly proportional to the (square root
of the) sample size $(n_s^1 + n_s^2)$, and thus on average deteriorates as the
depth of $s$ in the tree increases.
Consequently, we have to prevent the exponential increase in the number
of nodes.
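
The loss of estimator quality with depth can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a full binary tree in which each split divides a node evenly, a training set of 200 individuals as in the toy example, and the usual binomial standard error as the measure of 'quality'; the function names are hypothetical.

```python
import math


def average_node_size(training_size, depth):
    """Average number of training individuals per node at a given depth
    of a full binary tree (2 ** depth nodes at that depth)."""
    return training_size / 2 ** depth


def standard_error(p, n):
    """Approximate standard error of a proportion p estimated from n individuals."""
    return math.sqrt(p * (1.0 - p) / n)


for depth in range(7):
    n = average_node_size(200, depth)
    print(depth, n, round(standard_error(0.5, n), 3))
```

Each additional level halves the average node size, so the standard error grows by a factor of about 1.41 per level while the number of nodes doubles.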
We propose a sequential procedure, which requires two choices:
- we have to decide on the conditions which will define a terminal node,
- we have to decide on the way of associating an exogenous variable with a
  non-terminal node.
For the last choice, we will use Information Theory. Let us consider an
exogenous variable $Q$ which has not yet been used between the root of
the tree and a given non-terminal node $s$. We compute (see 3.4) a measure
$I(Q, s)$ of the information conveyed by $Q$ at the node $s$ (note that $I(Q, s)$
is always positive).
We choose for $s$ the exogenous variable $Q^*$ which maximises $I(Q, s)$
among all the possible $Q$. We will say that a given node $s$ is terminal if one
of the two following conditions is fulfilled:
i) $I(Q^*, s) < \varepsilon$, where $\varepsilon$ is a given positive threshold,
ii) $n_s^1 + n_s^2 \le t$, where $t$ is a given integer.
These rules are clearly motivated by the concern not to increase the
number of nodes without a significant information gain.

3.4. Information conveyed by an exogenous variable at a non terminal node

Let $f(\omega_i / X_s)$ denote an estimator of the probability $p(\omega_i / X_s)$
(previously $n_s^i/(n_s^1 + n_s^2)$ was one such estimator, but further on we
will see other possible choices). We associate with the node $s$ an uncertainty
measure $h(s)$; for instance, by using the Shannon entropy, we define $h(s)$ as
follows:
$$h(s) = \sum_{i=1}^{2} f(\omega_i / X_s) \log \frac{1}{f(\omega_i / X_s)}.$$
It can be noticed that
$$h(s) \ge 0, \qquad h(s) = 0 \iff f(\omega_1 / X_s) = 1 \ \text{or} \ f(\omega_2 / X_s) = 1. \tag{1}$$
The choice of a feature $Q$, associated with the node $s$, leads to the creation
of two new nodes $s_1$, $s_2$ (in accordance with Figure 2), the characteristic
subsets of which are:
$$X_{s_1} = X_s \cap \{x \in X : Q(x) = 0\}$$
$$X_{s_2} = X_s \cap \{x \in X : Q(x) = 1\}.$$
We can now define $I(Q, s)$ as the average decrease in uncertainty:
$$I(Q, s) = h(s) - [f(X_{s_1}) h(s_1) + f(X_{s_2}) h(s_2)] \tag{2}$$
where $f(X_s)$ denotes an estimator of $P(X_s)$ (for instance
$f(X_s) = \mathrm{card}(T_s)/\mathrm{card}(T)$). The following property guarantees
the consistency of this choice:
$$I(Q, s) \ge 0. \tag{3}$$
In this short presentation we have worked with the Shannon entropy, but a
larger family of uncertainty measures is available within our framework. (In
fact an uncertainty measure is available as soon as it verifies the two
conditions (1) and (3), [7]).
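
The Shannon-based measure and the sequential procedure of 3.3 can be sketched together as follows. This is a minimal illustration in Python, not the authors' software: the data layout (a list of `(features, label)` pairs with labels 0 and 1), the function names and the toy sample are hypothetical, and the child weights are taken as each child's share of the parent node, which is an assumption about how $f(X_{s_1})$ and $f(X_{s_2})$ are normalised.

```python
import math


def shannon(probs):
    """h = sum_i f_i * log(1 / f_i), with the convention 0 * log(1/0) = 0."""
    return sum(f * math.log(1.0 / f) for f in probs if f > 0)


def node_entropy(counts):
    total = sum(counts)
    return shannon([n / total for n in counts]) if total else 0.0


def information(parent, child0, child1):
    """I(Q, s) = h(s) - [w0 * h(s1) + w1 * h(s2)], weights = share of the parent."""
    n = sum(parent)
    w0, w1 = sum(child0) / n, sum(child1) / n
    return node_entropy(parent) - (w0 * node_entropy(child0) + w1 * node_entropy(child1))


def counts_of(sample):
    n1 = sum(1 for _, label in sample if label == 0)
    return (n1, len(sample) - n1)


def split(sample, j):
    return ([r for r in sample if r[0][j] == 0], [r for r in sample if r[0][j] == 1])


def build(sample, unused, eps, t):
    """Sequential procedure: stop if the node is small (rule ii) or if the best
    feature conveys less information than eps (rule i); t is assumed >= 0."""
    node = {"counts": counts_of(sample), "feature": None, "children": ()}
    if len(sample) <= t or not unused:
        return node
    gains = {j: information(counts_of(sample), *map(counts_of, split(sample, j)))
             for j in unused}
    best = max(gains, key=gains.get)
    if gains[best] < eps:
        return node
    node["feature"] = best
    rest = [j for j in unused if j != best]
    node["children"] = tuple(build(part, rest, eps, t) for part in split(sample, best))
    return node


# Hypothetical toy sample: the class label equals the second binary feature.
sample = [((a, b), b) for a in (0, 1) for b in (0, 1)] * 5
tree = build(sample, unused=[0, 1], eps=0.01, t=2)
print(tree["feature"], [child["counts"] for child in tree["children"]])
```

In this toy sample the first feature carries no information about the class, so the procedure splits once on the second feature and then stops by rule i).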

Figure 2.

4. A non arborescent procedure

4.1. General description of the method

The toy example of Figure 3a shows a difficulty which may occur with the
arborescent algorithm described in the previous section.
The nodes $s_3$, $s_4$, $s_5$ are terminal (the parameter $t$ was equal to 10), and
the decision risk about the class of an individual $y$ belonging to $X_{s_3}$ or
$X_{s_4}$ is very high. This suggests 'fusing' the nodes $s_3$, $s_4$ (Figure 3b),
and in so doing, creating a new node $G$ which is not necessarily terminal.
Thus we can imagine that there exists a feature $Q$ operating the segmentation
of the node $G$ (Figure 3b), which leads to terminal nodes $G_1$, $G_2$, whose
associated decision risks are quite small compared with the decision risk
associated with the nodes $s_3$, $s_4$. Thereby we have transformed the
arborescent structure into a lattice structure, and the profiles associated with
the terminal nodes will be described by means of the operators 'AND' and
'OR'. For instance the characteristic subset associated with the node $G_1$
(Figure 3) in $T$ is the following:
$$T_{G_1} = \{x \in T : (Q^1(x) = 0 \ \text{AND} \ Q(x) = 1) \ \text{OR} \ (Q^1(x) = 1 \ \text{AND} \ Q^2(x) = 0 \ \text{AND} \ Q(x) = 1)\}$$
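
A profile built from 'AND' and 'OR' is a disjunction of conjunctions, and membership in it can be tested mechanically. The sketch below is a hypothetical illustration in Python: the dictionary keys `"Q1"`, `"Q2"` and `"Q"` stand for the features in the example above, and the representation of a profile as a list of dictionaries is an assumption made for the sake of the example.

```python
def in_profile(x, profile):
    """True if individual x satisfies at least one conjunction of the profile.

    profile: list of conjunctions, each a dict {feature name: required value}.
    """
    return any(all(x[name] == value for name, value in conj.items())
               for conj in profile)


# Profile of the fused-then-split node in the example:
# (Q1 = 0 AND Q = 1) OR (Q1 = 1 AND Q2 = 0 AND Q = 1)
profile_g1 = [
    {"Q1": 0, "Q": 1},
    {"Q1": 1, "Q2": 0, "Q": 1},
]

print(in_profile({"Q1": 0, "Q2": 1, "Q": 1}, profile_g1))
print(in_profile({"Q1": 1, "Q2": 1, "Q": 1}, profile_g1))
```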

4.2. An algorithm for constructing the lattice structure

For consistency a unique information measure must be used for the two
following operations:
- creation of new nodes ('splitting'),
- union of nodes ('fusing').

Figure 3 (panels 3.a and 3.b).

In accordance with this principle, the information measure $I(Q, s)$ defined in
paragraph 3 can no longer be used.
The fusing of the two nodes $s_1$, $s_2$ into node $s$ (Figure 4) is the inverse
operation of the splitting of $s$ by use of an imaginary feature $Q$.
As the $I(Q, s)$ measure is always positive, it would lead to an information
measure which would always be negative for the fusing operation.
Therefore we have looked for an uncertainty measure $h^\beta$ such that the
variation in uncertainty resulting from splitting (in accordance with (2)) is not
necessarily positive, especially for nodes whose sample size is small.
Thus, we have chosen to work with the Daroczy entropy [2] of order $\beta$
(where $\beta \in\, ]0, 1[$), and made the following choice for $f(\omega_i / X_s)$:
$$f(\omega_i / X_s) = \frac{n_s^i + \lambda}{n_s^1 + n_s^2 + 2\lambda}$$
where $\lambda$ is a positive parameter [8].
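
The effect of combining an entropy of order $\beta$ with the smoothed estimator can be sketched numerically. The Python sketch below rests on assumptions: it uses the standard two-class-normalised form of the entropy of order $\beta$ from the literature, $(\sum_i p_i^\beta - 1)/(2^{1-\beta} - 1)$, which may differ in normalisation from the expression intended in the text; the child weights are taken as each child's share of the parent; and the function names and example counts are hypothetical.

```python
def daroczy(probs, beta):
    """Entropy of order beta, normalised so that (1/2, 1/2) has entropy 1."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta is assumed to lie strictly between 0 and 1")
    return (sum(p ** beta for p in probs) - 1.0) / (2.0 ** (1.0 - beta) - 1.0)


def smoothed(counts, lam):
    """f_i = (n_i + lam) / (n_1 + n_2 + 2 * lam), with lam > 0."""
    total = sum(counts) + 2.0 * lam
    return [(n + lam) / total for n in counts]


def split_variation(parent, child0, child1, beta, lam):
    """Decrease in uncertainty when the parent node is split into two children.
    The gain of fusing the two children back is taken as the opposite value."""
    n = sum(parent)
    w0, w1 = sum(child0) / n, sum(child1) / n
    h = lambda counts: daroczy(smoothed(counts, lam), beta)
    return h(parent) - (w0 * h(child0) + w1 * h(child1))


# Large, clean split: uncertainty decreases, so splitting is worthwhile.
print(split_variation((100, 100), (100, 0), (0, 100), beta=0.99, lam=1.0))
# Tiny node: smoothing pulls the child estimates towards 1/2, the variation
# becomes negative, and fusing (the inverse operation) is favoured instead.
print(split_variation((3, 1), (2, 1), (1, 0), beta=0.99, lam=1.0))
```

The second call illustrates the property sought in the text: with small sample sizes the variation resulting from a split need not be positive.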

5. Experimental results

The center for burnt patients of the Hôpital E. Herriot (Lyon, France) has
requested a statistical approach to septic risk in burnt patients [6].
Three possible evolutions have been distinguished for burnt patients:
- alive without infection (this type of evolution will correspond to class
  $\omega_1$),
- alive with infection (class $\omega_2$),
- death (class $\omega_3$).

Figure 4.

In the training set, there were respectively 87, 60 and 27 individuals
belonging to classes $\omega_1$, $\omega_2$ and $\omega_3$.
In this approach six exogenous variables were considered:
AGE, HEIGHT, WEIGHT, and three different indices of burn
severity: BUS, UBS, BSA.
The lattice procedure produces the results shown in Figure 5
($\beta = 0.99$, $\lambda = 1$). We obtain three terminal nodes $s_1$, $s_2$, $s_3$:
- $s_1$ corresponds to a subpopulation of individuals with good chances of
  recovery without infection,
- $s_2$ corresponds to a subpopulation with good chances of recovery,
  although often after infection,
- $s_3$ corresponds to a high-risk subpopulation.

Figure 5.


[1] Auray, J. P., Duru, G., Terrenoire, M., Tounissoux, D. and Zighed, A. Un logiciel pour
une methode de segmentation, Revue Inf. Sci-Hum., n° 64, pp. 64-78.
[2] Daroczy, D. (1970), 'Generalized Information Functions', Information and Control 16,
[3] Devijver, P. and Kittler, J. (1982), Statistical Pattern Recognition, Prentice Hall.
[4] Duda, R. and Hart, P. (1973), Pattern Classification and Scene Analysis, Wiley, New York.
[5] Fu, K. S. (1982), Syntactic Pattern Recognition and Applications, Prentice Hall.
[6] Marichy, J., Buffet, G., Zighed, A. and Laurent, Ph. (1984), Early Detection of Scepticemia
in Burnt Patients, Actes 3rd int. Conf. on Syst. Sci. in Health Care, Munich, pp. 505-
[7] Tounissoux, D. (1981), 'Processus sequentiel adaptatif de reconnaissance de forme pour
l'aide au diagnostic', These Lyon 1.
[8] Zighed, A. (1985), 'Methodes et outils pour les processus d'interrogation non
arborescents', These Lyon 1.
A general equilibrium model of health care

State University of New York at Binghamton, School of Management, New York, U.s.A.
Erasmus Universiteit, Rotterdam, The Netherlands

1. Introduction

Our objective in this paper will be to develop a theoretical behavioral model
of each of the sectors below and link them; later on we intend to use
empirical observations to estimate the parameters and test the validity of the
model.
The health care sector here covers the following subsectors:
1. physicians;
2. patients;
3. hospitals;
4. medical drugs;
5. insurance.

2. Patients' behaviour model

2.1. Foundations

Let $U_k$ denote the utility function of the $k$-th individual, $k = 1, 2, \ldots, K$,
in the society postulated, where
$$U_k = u_k[q;\ h(v, f, e);\ v, f, e]. \tag{1}$$
The utility of the $k$-th patient at a given point of time is assumed to be a
function of all other goods and services, $q$, health $h$ (a function of the next
arguments), visits to the doctor $v$ (maybe measured in minutes), consumption
of medicines and drugs $f$, and visits to the hospital, $e$. The budget constraint
is expressed as follows (the consumer index being skipped for simplicity,
exogenous variables w.r.t. the complete model being starred):
$$r^* = p_q q + p_v v + p_f f + p_e e. \tag{2}$$
The consumer's objective is to choose the values of the arguments q, v, f
and e such that (1) is maximum subject to the budget constraint (2); the

G. Duru and J. H. P. Paelinck (eds.), Econometrics of Health Care, 237-248.

© 1991 Kluwer Academic Publishers.
238 M. Chatterji and J. H. P. Paelinck

Lagrange function is
$$L = u[q;\ h(v, f, c);\ v, f, c] - \lambda (q + p_v v + p_f f + p_c c - r^*) \tag{3}$$
the price of other goods and services being normalised to 1.
Setting the derivatives with respect to q, v, f and c equal to zero, one gets,
taking into account the fact that 'consumption' of f and c are functions of v,
m(v) and n(v) (doctors' prescriptions):¹
$$L'_q = u'_q - \lambda = 0 \tag{4}$$
$$L'_v = u'_h (h'_v + h'_f m' + h'_c n') + u'_v + u'_f m' + u'_c n' - \lambda (p_v + p_f m' + p_c n') = 0. \tag{5}$$
Substituting (4) into (5), one obtains
$$u'_h = \frac{u'_q (p_v + p_f m' + p_c n') - (u'_v + u'_f m' + u'_c n')}{h'_v + h'_f m' + h'_c n'} \tag{6}$$
From (6) one sees that the marginal utility of health will be declining if the
marginal utility of visits increases, either through its health effect (denomina-
tor) or its psychological effect (numerator), leading to more health expendi-
tures in the 'normal' case (declining marginal utilities).
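
As a rough numerical illustration of condition (6), the sketch below assumes a log-linear utility u = ln q + θ ln h + κ ln v, a health function h = v^a f^b c^d and linear prescription rules f = m(v) = m1·v and c = n(v) = n1·v. None of these functional forms or parameter values comes from the paper; they are chosen only so that the optimum can be checked. The code finds the optimal number of visits by bisection on the first-order condition and evaluates both sides of (6) at that point.

```python
# Illustrative assumptions (not from the paper): log utility, Cobb-Douglas health,
# linear prescription rules f = M1*v and c = N1*v, price of q normalised to 1.
THETA, KAPPA = 0.8, 0.2          # weights of health and of visits in utility
A, B, D = 0.5, 0.3, 0.2          # exponents of v, f, c in h(v, f, c)
M1, N1 = 2.0, 0.1                # m'(v) and n'(v)
P_V, P_F, P_C = 30.0, 10.0, 400.0
R = 2000.0                       # exogenous budget r*

FULL_PRICE = P_V + P_F * M1 + P_C * N1   # p_v + p_f m' + p_c n'
S = A + B + D                            # elasticity of h along the prescription path


def health(v):
    """Health level when drugs and hospital use follow the prescription rules."""
    return v**A * (M1 * v)**B * (N1 * v)**D


def dutility_dv(v):
    """Total derivative of utility w.r.t. v after substituting q = r* - FULL_PRICE * v."""
    q = R - FULL_PRICE * v
    return -FULL_PRICE / q + (THETA * S + KAPPA) / v


def optimal_visits(tol=1e-12):
    """Bisection on the first-order condition; the derivative is decreasing in v."""
    lo, hi = 1e-9, R / FULL_PRICE - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dutility_dv(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def condition_6(v):
    """Return the left and right sides of Equation (6) evaluated at v."""
    q = R - FULL_PRICE * v
    h = health(v)
    u_q, u_h, u_v = 1.0 / q, THETA / h, KAPPA / v   # u'_f = u'_c = 0 in this example
    dh_dv = S * h / v                               # h'_v + h'_f m' + h'_c n'
    return u_h, (u_q * FULL_PRICE - u_v) / dh_dv


v_star = optimal_visits()
lhs, rhs = condition_6(v_star)
print(f"v* = {v_star:.4f}, u'_h = {lhs:.6f}, right-hand side of (6) = {rhs:.6f}")
```

With these assumed forms the two sides agree at the optimum, and raising KAPPA (the direct, 'psychological' utility of visits) increases the optimal number of visits.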

2.2. Insurance

It is also well known that the prices of ethical drugs, visits and hospital
stays are not paid completely by the individual: a portion, $\pi$, is returned to the
patient. Let us assume that it is fixed and denoted by $\pi^*$, so budget Equation
(2) can be rewritten as²
$$r^{**} = r^* + \pi_v^* v + \pi_f^* f + \pi_c^* c. \tag{2'}$$
With that new constraint (2') Equation (6) can be rewritten as
$$u'_h = \frac{u'_q \left[(p_v - \pi_v^*) + (p_f - \pi_f^*) m' + (p_c - \pi_c^*) n'\right] - (u'_v + u'_f m' + u'_c n')}{h'_v + h'_f m' + h'_c n'} \tag{6'}$$
of which the interpretation is obvious.
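
A small sketch may make that interpretation concrete: reimbursement replaces each gross price by the price net of the refunded portion, so the full cost of one visit (including the drugs and hospital use it induces) falls. The demand rule used below follows from an assumed utility ln q + w·ln v with q = budget − net price × v; this utility, the weight w and all numbers are hypothetical and serve only as an illustration.

```python
def net_full_price(prices, refunds, m_prime, n_prime):
    """Net cost of one visit including induced drugs and hospital use."""
    p_v, p_f, p_c = prices
    pi_v, pi_f, pi_c = refunds
    return (p_v - pi_v) + (p_f - pi_f) * m_prime + (p_c - pi_c) * n_prime


def visits_demanded(budget, weight, net_price):
    """Optimal v for the assumed utility ln(q) + weight * ln(v), q = budget - net_price * v."""
    return budget * weight / (net_price * (1.0 + weight))


prices = (30.0, 10.0, 400.0)        # p_v, p_f, p_c (hypothetical)
m_prime, n_prime = 2.0, 0.1         # m'(v), n'(v) (hypothetical)

no_refund = net_full_price(prices, (0.0, 0.0, 0.0), m_prime, n_prime)
with_refund = net_full_price(prices, (15.0, 5.0, 300.0), m_prime, n_prime)

for label, net in (("no reimbursement", no_refund), ("with reimbursement", with_refund)):
    v = visits_demanded(budget=2000.0, weight=1.0, net_price=net)
    print(f"{label}: net price = {net:.1f}, visits demanded = {v:.2f}")
```

Under these assumptions a lower net price translates directly into more visits, which is the mechanism behind the extra health expenditure induced by insurance.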
Again for each individual patient k, the budget constraint can integrate the
fact that his/her budget equals his/her income plus the amount he/she gets
back for his/her health expenses minus a total tax payment, i.e., a
fixed $\alpha$-portion he/she has to contribute to health
insurance; with such a budget equation equalling $r^{***}$, Equation (6') can be
easily extended by correcting each $\pi^*$-term with a factor $(1 - \alpha)$, and the
new equation has again an obvious interpretation.

2.3. Social welfare

Let us now consider a welfare function for society as a whole, $\psi$. Let
$\mathbf{u} \triangleq [U_k]$ with $U_k$ as in (1). One now wants to maximize $\psi(\mathbf{u})$ subject to two
restrictions; first a global budget restriction
$$\sum_k \left[(q_k + p_v v_k + p_f f_k + p_c c_k) - (\pi_v^* v_k + \pi_f^* f_k + \pi_c^* c_k) + \alpha_k (q_k + p_v v_k + p_f f_k + p_c c_k)\right] \le \sum_k r_k \triangleq r^*. \tag{7}$$
The left hand side in (7) denotes total net outlays for the nation as a whole
and the right hand side is total income; a second condition is
$$(\alpha_l - \alpha_k)(r_k - r_l) \le 0 \tag{8}$$
which arises from the reasonable assumption that when $r_k > r_l$, $\alpha_k \ge \alpha_l$.
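
Condition (8) is a pairwise monotonicity requirement on the contribution rates. A minimal sketch of how one might check it for a given schedule follows; the income and rate figures are hypothetical.

```python
from itertools import combinations


def satisfies_condition_8(incomes, rates):
    """True if (alpha_l - alpha_k) * (r_k - r_l) <= 0 for every pair of individuals."""
    pairs = combinations(range(len(incomes)), 2)
    return all(
        (rates[l] - rates[k]) * (incomes[k] - incomes[l]) <= 0
        for k, l in pairs
    )


incomes = [12000.0, 25000.0, 60000.0]          # hypothetical r_k
progressive = [0.02, 0.04, 0.07]               # alpha_k rising with income
regressive = [0.07, 0.04, 0.02]                # alpha_k falling with income

print(satisfies_condition_8(incomes, progressive))   # True
print(satisfies_condition_8(incomes, regressive))    # False
```

Each unordered pair needs to be checked only once, because the product in (8) is unchanged when k and l are swapped.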
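The Lagrange function now is
$$L = \psi(\mathbf{u}) - \mu \left[\sum_k (q_k + p_v v_k + p_f f_k + p_c c_k) - \sum_k (\pi_v^* v_k + \pi_f^* f_k + \pi_c^* c_k) + \sum_k \alpha_k (q_k + p_v v_k + p_f f_k + p_c c_k) - \sum_k r_k\right] - \sum_{l > k} \mu_{lk} (\alpha_l - \alpha_k)(r_k - r_l). \tag{9}$$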

Differentiating L with respect to $\alpha_k$, $q_k$, $v_k$, $f_k$ and $c_k$ one gets
$$L'_{\alpha_k} = -\mu r_k + \mu_{lk} (r_k - r_l) = 0 \tag{10a}$$
leading up to
$$\frac{r_l}{r_k} = 1 - \frac{\mu}{\mu_{lk}}. \tag{10b}$$
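
The step from (10a) to (10b) is a one-line rearrangement; the short check below confirms it numerically. The multiplier and income values are hypothetical and have no empirical meaning.

```python
def income_ratio(mu, mu_lk):
    """r_l / r_k implied by (10b)."""
    return 1.0 - mu / mu_lk


def residual_10a(mu, mu_lk, r_k, r_l):
    """Left-hand side of (10a); zero when the first-order condition holds."""
    return -mu * r_k + mu_lk * (r_k - r_l)


mu, mu_lk, r_k = 0.5, 2.0, 40000.0     # hypothetical multipliers and income
r_l = income_ratio(mu, mu_lk) * r_k
print(r_l, residual_10a(mu, mu_lk, r_k, r_l))   # 30000.0 0.0
```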

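Furthermore, for each k:
$$L'_{v_k} = \psi'_k(\mathbf{u}) \left[u'_h (h'_v + h'_f m' + h'_c n') + u'_v + u'_f m' + u'_c n'\right] - \mu \left[(1 + \alpha_k)(p_v + p_f m' + p_c n') - (\pi_v^* + \pi_f^* m' + \pi_c^* n')\right] - \sum_{l > k} \mu_{lk} (\alpha_l - \alpha_k)(p_v + p_f m' + p_c n') = 0 \tag{12}$$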

and similarly for the other variables.


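Equations (10) through (12) and the analogous conditions can be solved to get the
demand functions of the patients, which follow:
$$q_k^d = q_k(r_k, p_v, p_c, p_f, \pi_v^*, \pi_c^*, \pi_f^*, \alpha_k) \tag{13a}$$
$$v_k^d = v_k(r_k, p_v, p_c, p_f, \pi_v^*, \pi_c^*, \pi_f^*, \alpha_k) \tag{13b}$$
$$f_k^d = f_k(r_k, p_v, p_c, p_f, \pi_v^*, \pi_c^*, \pi_f^*, \alpha_k) \tag{13c}$$
$$c_k^d = c_k(r_k, p_v, p_c, p_f, \pi_v^*, \pi_c^*, \pi_f^*, \alpha_k) \tag{13d}$$
Notice that to patient k, $\alpha_k$ is given, as is the set $p_v$, $p_c$, $p_f$, and in this
'social case', $r_k$ (not starred). Demand function (13a) will not interest us any
more; note that functions (13c) and (13d) are 'derived' demand functions (via
the functions m(v) and n(v)).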

2.4. Specifying differential premium rates according to income

In 2.3 one has considered different values of $\alpha$ for different k, as a function
of a constraint; however, one can introduce a specific function

The result in this case is that in place of the parameters $\mu_{lk}$ we have only
one parameter to compute, to wit $\gamma$; choosing different values of $\beta_0$, $\beta_1$ will
determine appropriate shares of tax contribution from people belonging to
different income brackets.
The formal derivation of the modified Lagrangian (w.r.t. (9)) is left as an
exercise to the reader.

2.5. Variable reimbursement rate

So far one has assumed that the reimbursement parameters $\pi_v^*$, $\pi_f^*$ and $\pi_c^*$
are arbitrarily determined by a central authority; however, the problem may
be how to choose values for which total social utility $\psi(\mathbf{u})$ is maximum.
In taking the derivative of (9) with respect to $\pi_v$, e.g., one should take into
account the demand and 'derived' demand functions (13), a tedious exercise
which is not pursued further here.

2.6. Private and social calculation

To understand the difference between 2.1 and 2.3, let us consider the
following simple example; let the utility function of the k-th individual be
$$U_k = u_k(\mathbf{q}) \tag{15}$$
with his budget constraint
$$\mathbf{p}^{*\prime} \mathbf{q} = r_k. \tag{16}$$
The solution for maximum $U_k$ with the corresponding budget constraint
will be
$$\frac{u'_{ki}}{p_i^*} = \lambda_k, \quad \forall i, \tag{17}$$
i.e., the classical result that the ratio of the marginal utility to the price of any
good i will be fixed for all k individuals in proportion to the marginal utility
of income of those individuals, $\lambda_k$.
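When one considers a preference function for society as a whole, as
before, and again denotes it as $\psi(\mathbf{u})$, where $\mathbf{u} \triangleq [U_k]$, the Lagrangian for
maximizing total utility under a budget constraint becomes
$$L = \psi(\mathbf{u}) - \mu \left(\sum_k \mathbf{p}^{*\prime} \mathbf{q}_k - \sum_k r_k\right) \tag{18}$$
which gives the condition
$$\psi'_k u'_{ki} - \mu p_i^* = 0 \tag{19a}$$
or
$$\frac{u'_{ki}}{p_i^*} = \frac{\mu}{\psi'_k}. \tag{19b}$$
If one compares (17) with (19b) one sees that as long as $\mu/\psi'_k$ equals $\lambda_k$,
the conditions for individual and social optimizing will lead to the same
equilibrium result.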
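
The private condition, that marginal utility per unit of price is the same for every good and equals the marginal utility of income $\lambda_k$, can be illustrated with a logarithmic utility. The utility u = Σ a_i ln q_i, the weights and the prices below are assumptions made for this sketch only; for that utility $\lambda_k = 1/r_k$, so richer individuals have a lower $\lambda_k$.

```python
def cobb_douglas_demand(shares, prices, income):
    """Demand q_i = a_i * r / p_i maximising sum(a_i * ln q_i) under p'q = r (shares sum to 1)."""
    return [a * income / p for a, p in zip(shares, prices)]


def marginal_utility_per_price(shares, prices, quantities):
    """u'_i / p_i for each good i, with u = sum(a_i * ln q_i)."""
    return [a / (q * p) for a, p, q in zip(shares, prices, quantities)]


shares = [0.5, 0.3, 0.2]            # hypothetical preference weights
prices = [1.0, 30.0, 10.0]          # hypothetical prices p*
for income in (1000.0, 4000.0):
    q = cobb_douglas_demand(shares, prices, income)
    ratios = marginal_utility_per_price(shares, prices, q)
    print(income, ratios)           # every ratio is (numerically) 1 / income
```

In this example the social condition coincides with the private one whenever the welfare weights satisfy $\mu/\psi'_k = 1/r_k$, which is one way of reading the equivalence stated above.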
Equation (19b) can be rewritten as




It is interesting to note that then in our health model


$$\hat{p}_f = p_f - \pi_f. \tag{22b}$$

3. Physician's behavior model³

The utility function of a physician j is postulated to be

$$u_j = u_j[r(v p_v, c \beta_c),\ h(v, f, c),\ p^*], \quad j = 1, \ldots, J. \tag{23}$$

The utility of the physician depends on his income, which again depends
on the number of visits received multiplied by his unit fee, and the number of
hospital patients times income obtained from sending patients to the hospital;
furthermore medical ethics plays a part, depending on the number of visits,
the number of prescriptions and the number of patients sent to the hospital;
finally intervenes the general price level, p*.
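The utility maximisation principle applied to (23) gives
$$\frac{du}{dp_v} = \frac{\partial u}{\partial r} \left(v + p_v \frac{\partial v}{\partial p_v}\right) + \frac{\partial u}{\partial h} \frac{\partial h}{\partial v} \frac{\partial v}{\partial p_v} = 0 \tag{24a}$$
or, in elasticity form,
$$\frac{du}{dp_v} = v \frac{\partial u}{\partial r} \left[1 + \left(1 + \left(\frac{\partial u}{\partial r}\right)^{-1} \frac{\partial u}{\partial h} \frac{\partial h}{\partial v}\, p_v^{-1}\right) \varepsilon_{v p_v}\right] = 0 \tag{24b}$$
where $\varepsilon_{v p_v}$ denotes the visiting price elasticity of the patients to the doctor's fee.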
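
Condition (24b) pins down the demand elasticity at which the physician stops adjusting the fee: the bracket vanishes when the elasticity equals −1/(1 + ethics term), where the ethics term is $(\partial u/\partial r)^{-1}(\partial u/\partial h)(\partial h/\partial v)\,p_v^{-1}$. The sketch below simply evaluates that expression; the marginal utilities and the fee are hypothetical inputs, not estimates.

```python
def required_elasticity(mu_income, mu_health, dh_dv, p_v):
    """Elasticity of visits w.r.t. the fee at which the bracket in (24b) vanishes."""
    ethics_term = (mu_health * dh_dv) / (mu_income * p_v)
    return -1.0 / (1.0 + ethics_term)


# With mu_health = 0 the physician simply maximises fee revenue (unit elasticity).
print(required_elasticity(mu_income=1.0, mu_health=0.0, dh_dv=0.2, p_v=30.0))   # -1.0
# A positive weight on patients' health moves the optimum to a less elastic point.
print(required_elasticity(mu_income=1.0, mu_health=30.0, dh_dv=0.2, p_v=30.0))
```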
The solution of Equation (24b) gives the supply function of the physician;
and proceeding in a similar fashion for clinic visits and medicines, the
physician's 'supply functions' for private visits, clinic visits and medicines will
be given by

$$v^s = v^s(p_v, \beta_c, p^*)$$
$$c^s = c^s(p_v, \beta_c, p^*)$$
$$f^s = f^s(p_v, \beta_c, p^*).$$

For office visits the demand and supply equations in partial equilibrium
now are


For a given number of patients in a community going to doctor j,⁴ the
demand and supply functions are:


Solving the supply and demand functions confronting the physician,
one gets the equilibrium price and quantity in a partial equilibrium setting;
one sees immediately that from fixing (centrally) $p_{jv}$, disequilibrium could
result.
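
The partial equilibrium for office visits can be sketched numerically. The constant-elasticity demand and supply functions below are hypothetical stand-ins for the patients' demand and the physician's supply; the paper does not specify functional forms. The code solves for the market-clearing fee by bisection and then shows the excess demand that appears when the fee is fixed centrally below that level.

```python
def demand(p_v, scale=200.0, elasticity=-0.8):
    """Hypothetical aggregate visits demanded from doctor j at fee p_v."""
    return scale * p_v ** elasticity


def supply(p_v, scale=0.5, elasticity=1.2):
    """Hypothetical visits doctor j is willing to supply at fee p_v."""
    return scale * p_v ** elasticity


def equilibrium_fee(lo=0.01, hi=1000.0, tol=1e-10):
    """Bisection on excess demand, which is decreasing in the fee."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if demand(mid) > supply(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p_eq = equilibrium_fee()
print(f"equilibrium fee = {p_eq:.3f}, visits = {demand(p_eq):.3f}")

p_fixed = 0.6 * p_eq                      # fee fixed centrally below the market level
print(f"excess demand at fixed fee = {demand(p_fixed) - supply(p_fixed):.3f}")
```

With these parameters the market clears at a fee of 20; any administratively fixed fee other than that value leaves either unmet demand or idle capacity.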

4. Hospitals

Assuming that the objective of the hospitals is to maximize their net income,
i.e., to choose the price of hospital admission $p_c$ and payments to doctors $\beta_c$
in (23) such that
$$\max_{p_c,\, \beta_c,\, x}\ x\, r(p_c, \beta_c) \tag{30}$$
$$x^2 - x = 0, \tag{31}$$
i.e. x is a binary variable, (30)-(31) being valid for each hospital h = 1, ...,
H; specification (30)-(31) makes it possible to respect the condition $r \ge 0$.
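The Lagrangian is
$$L = x\, r(p_c, \beta_c) - \sigma (x^2 - x). \tag{32}$$
Taking derivatives with respect to $p_c$, $\beta_c$ and x, and setting equal to zero,
results in:
$$\frac{\partial L}{\partial p_c} = x \frac{\partial r}{\partial p_c} = 0 \tag{33a}$$
$$\frac{\partial L}{\partial \beta_c} = x \frac{\partial r}{\partial \beta_c} = 0 \tag{33b}$$
$$\frac{\partial L}{\partial x} = r - \sigma (2x - 1) = 0. \tag{33c}$$

From (33c) one gets

$$x = \frac{1}{2} + \frac{r}{2\sigma} \tag{34}$$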
the sign of $r/2\sigma$ deciding the existence of the hospital.⁵ This hospital
sector model is too simplistic: hospital rates in most countries are fixed by
public authorities, and besides, management of a complex hospital depends
on many other factors such as case mix, technology, size, location, type of
hospital, etc. Such realities have to be incorporated in future versions of this
model.
5. Drug and pharmaceutical industry behaviour

The objective function of the ethical drug and pharmaceutical industries (d =
1, ..., D) is assumed to maximise profits
$$\max\ x\, s(i, p_f) \tag{35}$$
where i denotes investment in research and development, with the constraints
$$i \le i^* \tag{36}$$
$$x^2 - x = 0. \tag{37}$$
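The Lagrangian is given by
$$L = x\, s(i, p_f) - \eta (i - i^*) - \theta (x^2 - x) \tag{38}$$
leading up to
$$\frac{\partial L}{\partial i} = x \frac{\partial s}{\partial i} - \eta = 0 \tag{39a}$$
$$\frac{\partial L}{\partial p_f} = x \frac{\partial s}{\partial p_f} = 0 \tag{39b}$$
$$\frac{\partial L}{\partial x} = s - \theta (2x - 1) = 0 \tag{39c}$$
so that
$$x = \frac{1}{2} + \frac{s}{2\theta} \tag{39d}$$
with a similar interpretation as above.
It is realised that the complexities of the drug and pharmaceutical
industries require a more sophisticated formulation, to be left for further
research.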
6. Insurance company behaviour

The income function of the insurance companies (i = 1, ... I) is given by


the first argument representing premium receipts (Zik per definition), the
other ones disbursements. Consumer ik belongs to a subset of the K con-
sumers, supposed to be known a priori (but see later on).
So the Lagrangian first order maximum conditions are


Conditions such as
$$t_i \ge 0, \quad \forall i \tag{42a}$$
$$\pi_{iv} \le p_{v i_k} \tag{42b}$$
should be included, (42b) and (42d) taking into account the allocation of
consumers to doctors and clinics respectively. Again it is realised that the
above formulation for reimbursement is too simplistic: in some countries,
there is complete national health insurance, in others it is partial; there
are institutional arrangements like HMOs, the possibility of co-payments,
deductibles etc., which have to be incorporated in future developments.

7. Equilibrium

Neglecting qk, as said before, one can now draw a list of equations and
endogenous variables.

7.1. Equations

7.1.1. Consumer demand (Equations (13)):

$$v_k^d = v_k(r_k^*;\ p_{jv}, p_{hc}, p_f;\ \pi_{iv}, \pi_{ic}, \pi_{if}, \alpha_{ik}) \tag{13b}$$
$$f_k^d = f_k(r_k^*;\ p_{jv}, p_{hc}, p_f;\ \pi_{iv}, \pi_{ic}, \pi_{if}, \alpha_{ik}) \tag{13c}$$
$$c_k^d = c_k(r_k^*;\ p_{jv}, p_{hc}, p_f;\ \pi_{iv}, \pi_{ic}, \pi_{if}, \alpha_{ik}). \tag{13d}$$
This formulation supposes that the consumers have already allocated
themselves to J doctors and H clinics, and that the pharmaceutical industry
(one firm producing only one drug in this model) satisfies all consumers.⁶
The number of the relevant Equations (13) is 3K.

7.1.2. Doctors' supply (Equation (29))

$$v_j^s = v_j(p_{jv}, \beta_{jc}, p^*) \tag{29}$$
There are J such equations.
7.1.3. D-S equilibrium

$$\sum_k v_k^d \ge v_j^s \tag{44}$$

There are again J such equations.


7.1.4. Hospital supply

From Equations (33) one can derive the supply functions

$$c_h^s = c(p_{hc}) \tag{45}$$

and the bid-price functions

$$\beta_{hc} = \beta(p_{hc}). \tag{46}$$

There are 2H such equations.

7.1.5. D-S equilibrium


supposing the consumers to be already subdivided into H subsets, again
H equations.

supposing doctors to be allocated to clinics; there are J such equations.

7.1.6. Drug supply
This is given from conditions (39) as

$$f^s = f(p_f, i^*) \tag{49}$$

1 equation.
7.1.7. D-S equilibrium


1 equation.
7.1.8. Insurance supply and equilibrium
The conditions here are, consumers being supposed to be allocated to
insurance companies:

there being
I of these equations, the same applying to $\pi_{iv}$, $\pi_{if}$, $\pi_{ic}$ (3I).

7.2. Endogenous variables

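The list is as follows:

Variable         Number
$v_k^d$          K
$f_k^d$          K
$c_k^d$          K
$p_{jv}$         J
$p_{hc}$         H
$p_f$            1
$\pi_{iv}$       I
$\pi_{ic}$       I
$\pi_{if}$       I
$\alpha_{ik}$    I
$v_j^s$          J
$c_h^s$          H
$\beta_{jc}$     J
$\beta_{hc}$     H
$f^s$            1
Total: 3H + 4I + 3J + 3K + 2.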
The equations listed under 7.1 are equal in number, so a general health sector
equilibrium could exist with the usual provisos: existence, uniqueness,
economic meaningfulness, stability. Moreover, we have initially allocated
consumers to doctors (contrary to the habit of 'doctor shopping'!) and
hospitals, and doctors to hospitals, supposing moreover the number of
doctors and hospitals to be known (the same problem arises in fact with the
drug industry where we have postulated a unique firm with a unique
product). Especially in a spatial setting these aspects raise serious
analytical difficulties,⁷ left for further research.
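
The counting argument can be verified mechanically. The sketch below assumes the block sizes of Section 7.1 are 3K, J, J, 2H, H plus J, 1, 1 and 4I (I plus 3I) respectively, and compares their sum with the total of Section 7.2; the numbers of consumers, doctors, hospitals and insurers are hypothetical.

```python
def count_equations(K, J, H, I):
    """Sum of the equation counts per block of Section 7.1 (as assumed above)."""
    blocks = {
        "7.1.1 consumer demand": 3 * K,
        "7.1.2 doctors' supply": J,
        "7.1.3 D-S equilibrium (visits)": J,
        "7.1.4 hospital supply and bid price": 2 * H,
        "7.1.5 D-S equilibrium (hospitals, doctors-clinics)": H + J,
        "7.1.6 drug supply": 1,
        "7.1.7 D-S equilibrium (drugs)": 1,
        "7.1.8 insurance": 4 * I,
    }
    return sum(blocks.values())


def count_unknowns(K, J, H, I):
    """Total of Section 7.2: 3H + 4I + 3J + 3K + 2."""
    return 3 * H + 4 * I + 3 * J + 3 * K + 2


sizes = dict(K=1000, J=20, H=3, I=5)      # hypothetical numbers of agents
print(count_equations(**sizes), count_unknowns(**sizes))
```

Equality of the two counts is only a necessary bookkeeping check; it says nothing about existence, uniqueness or stability of the equilibrium.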

8. Conclusions

In this paper, we have sketched an outline of a health care system involving
major sectors, distinguishing endogenous and exogenous variables. The
system should be refined as said above, before one should pass on to an
econometric model of health care delivery to be estimated and tested. As
hinted at, spatial aspects should not be neglected ('spatial medicometrics').⁸


Notes

1. We skip the possibility of free purchase of drugs, which can be easily included in the
model.
2. For the sake of simplicity, we are not considering different possible types of health
insurance, i.e. complete national health insurance, private insurance, copayment,
deductibles, Health Maintenance Organisations, etc.
3. See also P. Zweifel, 1982.
4. This supposes that the set of patients with cardinal K is subdivided into J subsets of
cardinal $K_j$, $\sum_j K_j = K$; we will come back to that remark.
5. $\sigma$ will be positive, as r < 0 implies x = 0.
6. This has to be extended in later versions to multiple drugs and competition amongst firms
producing the same drug.
7. See J. H. P. Paelinck, 1985, pp. 35 ff; a possible treatment is through Tinbergen-Bos
systems (ibidem, pp. 52 ff).
8. See J. H. P. Paelinck, 1984.


References

Paelinck, J. H. P. (avec la collaboration de J.-P. Ancot, H. Gravesteijn, J. H. Kuiper et Th. ten
Raa) (1985), Eléments d'Analyse Economique Spatiale, Genève, ERESA.
Paelinck, J. H. P. (1984), 'Les difficultés de la médicométrie régionale', in A. S. Bailly et M.
Périat (eds.), Médicométrie régionale, Editions Anthropos, Paris, pp. 13-19.
Zweifel, P. (1982), Ein Oekonomisches Modell des Arztverhaltens, Springer, Berlin.
Advanced Studies in Theoretical and Applied Econometrics
1. Paelinck, J.H.P. (ed.): Qualitative and Quantitative Mathematical Economics. 1982
ISBN 90-247-2623-9
2. Ancot, J.P. (ed.): Analysing the Structure of Econometric Models. 1984
ISBN 90-247-2894-0
3. Hughes Hallet, A.J. (ed.): Applied Decision Analysis and Economic Behaviour.
1984 ISBN 90-247-2968-8
4. Sengupta, J.K.: Information and Efficiency in Economic Decision. 1985
ISBN 90-247-3072-4
5. Artus, P. and Guvenen, O. (eds.), in collaboration with Gagey, F.: International
Macroeconomic Modelling for Policy Decisions. 1986 ISBN 90-247-3201-8
6. Vilares, M.J.: Structural Change in Macroeconomic Models. Theory and Estima-
tion. 1986 ISBN 90-247-32n-8
7. Carraro, C. and Sartore, D. (eds.): Development of Control Theory for Economic
Analysis. 1987 ISBN 90-247-3345-6
8. Broer, D.P.: Neoclassical Theory and Empirical Models of Aggregate Firm
Behaviour. 1987 ISBN 90-247-3412-6
9. Italianer, A.: Theory and Practice of International Trade Linkage Models. 1986
ISBN 90-247-3407-X
10. Kendrick, D.A.: Feedback, A New Framework for Macroeconomic Policy. 1988
ISBN Hb: 90-247-3593-9; Pb: 90-247-3650-1
11. Sengupta, J.K. and Kadekodi, G.K. (eds.): Econometrics of Planning and Ef-
ficiency. 1988 ISBN 90-247-3602-1
12. Griffith, D.A.: Advanced Spatial Statistics. Special Topics in the Exploration of
Quantitative Spatial Data Series. 1988 ISBN 90-247-3627-7
13. Guvenen, O. (ed.): International Commodity Market Models and Policy Analysis.
1988 ISBN 90-247-3768-0
14. Arbia, G.: Spatial Data Configuration in Statistical Analysis of Regional Economic
and Related Problems. 1989 ISBN 0-7923-0284-2
15. Raj, B. (ed.): Advances in Econometrics and Modelling. 1989 ISBN 0-7923-0299-0
16. Aznar Grasa, A.: Econometric Model Selection. A New Approach. 1989
ISBN 0-7923-0321-0
17. Klein, L. R. and Marquez, J. (eds.): Economics in Theory and Practice. An Eclectic
Approach. Essays in Honor of F. G. Adams. 1989 ISBN 0-7923-0410-1
18. Kendrick, D. A.: Models for Analyzing Comparative Advantage. 1990
ISBN 0-7923-0528-0
19. Artus, P. and Barroux, Y. (eds.): Monetary Policy. A Theoretical and Econometric
Approach. 1990 ISBN 0-7923-0626-0
20. Duru, G. and Paelinck, J.H.P. (eds.): Econometrics of Health Care. 1990
ISBN 0-7923-0766-6

Kluwer Academic Publishers - Dordrecht / Boston / London