JOURNAL OF
HEALTH
ECONOMICS
Aims and Scope
This Journal seeks articles related to the economics of health and medical care. Its scope will include the following topics: production of health and health services; demand and utilization of health services; financing of health services; measurement of health; behavioral models of demanders, suppliers and other health care agencies; health behaviors and policy interventions; efficiency and distributional aspects of health policy; and such other topics as the Editors may deem appropriate. Applications to problems in both developed and less-developed countries are welcomed.

Editors
J. CAWLEY, Department of Policy Analysis and Department of Economics, Cornell University, Ithaca, NY, USA. E-mail: johncawley@cornell.edu
M. CHALKLEY, Centre for Health Economics, University of York, Heslington, York, UK. E-mail: martin.chalkley@york.ac.uk
M.E. CHERNEW, Department of Health Care Policy, Harvard Medical School, Boston, MA, USA. E-mail: chernew@hcp.med.harvard.edu
D. CUTLER, Department of Economics, Harvard University, Cambridge, MA, USA. E-mail: dcutler@harvard.edu
E. MEARA, The Dartmouth Institute of Health Policy & Clinical Practice, Dartmouth College, Lebanon, NH, USA. E-mail: ellen.r.meara@dartmouth.edu
N. RICE, Centre for Health Economics, University of York, Heslington, York, UK. E-mail: nigel.rice@york.ac.uk
L. SICILIANI, Department of Economics and Related Studies, University of York, Heslington, York, UK. E-mail: luigi.siciliani@york.ac.uk
A.D. STREET, Centre for Health Economics, University of York, Heslington, York, UK. E-mail: andrew.street@york.ac.uk

Associate Editors
J.E. ASKILDSEN, University of Bergen, Bergen, Norway.
K. BAICKER, Harvard School of Public Health, Boston, MA, USA.
P.P. BARROS, Universidade Nova de Lisboa, Lisbon, Portugal.
A. BASU, University of Washington, Seattle, WA, USA.
H. BLEICHRODT, Erasmus University, Rotterdam, The Netherlands.
R.P. ELLIS, Boston University, Boston, MA, USA.
J. GLAZER, Tel Aviv University, Tel Aviv, Israel.
S. GLIED, Columbia University, New York, NY, USA.
M. GROSSMAN, National Bureau of Economic Research, New York, NY, USA.
J. GRUBER, Massachusetts Institute of Technology, Cambridge, MA, USA.
R. KAESTNER, University of Illinois at Chicago, Chicago, IL, USA.
M. KIFMANN, University of Hamburg, Hamburg, Germany.
A. LLERAS-MUNEY, University of California at Los Angeles, Los Angeles, CA, USA.
J. MULLAHY, University of Wisconsin-Madison, Madison, WI, USA.
O.A. O’DONNELL, Erasmus University, Rotterdam, The Netherlands.
P. OLIVELLA, Universitat Autònoma de Barcelona, Barcelona, Spain.
C. PROPPER, University of Bristol, Bristol, UK.
A. SCOTT, University of Melbourne, Victoria, Australia.

Publication information: Journal of Health Economics (ISSN 0167-6296). For 2015, volumes 39–44 are scheduled for publication. Subscription prices are available upon request from the Publisher or from the Regional Sales Office nearest you or from this journal’s website (http://www.elsevier.com/locate/jhe). Further information is available on this journal and other Elsevier products through Elsevier’s website: (http://www.elsevier.com). Subscriptions are accepted on a prepaid basis only and are entered on a calendar year basis. Issues are sent by standard mail (surface within Europe, air delivery outside Europe). Priority rates are available upon request. Claims for missing issues should be made within six months of the date of dispatch.

Advertising information: If you are interested in advertising or other commercial opportunities please e-mail Commercialsales@elsevier.com and your enquiry will be passed to the correct person who will respond to you within 48 hours.

Funding body agreements and policies
Elsevier has established agreements and developed policies to allow authors whose articles appear in journals published by Elsevier, to comply with potential manuscript archiving requirements as specified as conditions of their grant awards. To learn more about existing agreements and policies please visit http://www.elsevier.com/fundingbodies

Orders, claims, and journal enquiries: Please contact the Elsevier Customer Service Department nearest you:
St. Louis: Elsevier Customer Service Department, 3251 Riverport Lane, Maryland Heights, MO 63043, USA; phone: (877) 8397126 [toll free within the USA]; (+1) (314) 4478878 [outside the USA]; fax: (+1) (314) 4478077; e-mail: JournalCustomerService-usa@elsevier.com
Tokyo: Elsevier Customer Service Department, 4F Higashi-Azabu, 1-Chome Bldg, 1-9-15 Higashi-Azabu, Minato-ku, Tokyo 106-0044, Japan; phone: (+81) (3) 5561 5037; fax: (+81) (3) 5561 5047; e-mail: JournalsCustomerServiceJapan@elsevier.com
Singapore: Elsevier Customer Service Department, 3 Killiney Road, #08-01 Winsland House I, Singapore 239519; phone: (+65) 63490222; fax: (+65) 67331510; e-mail: JournalsCustomerServiceAPAC@elsevier.com
Oxford: Elsevier Customer Service Department, The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK; phone: (+44) (1865) 843434; fax: (+44) (1865) 843970; e-mail: JournalsCustomerServiceEMEA@elsevier.com

Author enquiries
For enquiries relating to the submission of articles (including electronic submission) please visit this journal’s homepage at http://www.elsevier.com/locate/jhe. For detailed instructions on the preparation of electronic artwork, please visit http://www.elsevier.com/artworkinstructions. Contact details for questions arising after acceptance of an article, especially those relating to proofs, will be provided by the publisher. You can track accepted articles at http://www.elsevier.com/trackarticle. You can also check our Author FAQs at http://www.elsevier.com/authorFAQ and/or contact Customer Support via http://support.elsevier.com.

Illustration services
Elsevier’s WebShop (http://webshop.elsevier.com/illustrationservices) offers Illustration Services to authors preparing to submit a manuscript but concerned about the quality of the images accompanying their article. Elsevier’s expert illustrators can produce scientific, technical and medical-style images, as well as a full range of charts, tables and graphs. Image ‘polishing’ is also available, where our illustrators take your image(s) and improve them to a professional standard. Please visit the website to find out more.

USA mailing notice: Journal of Health Economics (ISSN 0167-6296) is published bimonthly (January, March, May, July, September and November) by Elsevier B.V. (P.O. Box 211, 1000 AE Amsterdam, The Netherlands). Periodicals postage paid at Jamaica, NY 11431 and additional mailing offices (not valid for journal supplements).
USA POSTMASTER: Send change of address to Journal of Health Economics, Elsevier Customer Service Department, 3251 Riverport Lane, Maryland Heights, MO 63043, USA.
AIRFREIGHT AND MAILING in USA by Air Business Ltd., c/o Worldnet Shipping Inc., 156-15, 146th Avenue, 2nd Floor, Jamaica, NY 11434, USA.

The paper used in this publication meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper).
Printed by Henry Ling Ltd, Dorchester, UK.

“For a full and complete Guide for Authors, please go to: http://www.elsevier.com/locate/Jhe”
doi:10.1016/S0167-6296(15)00058-2
Journal of Health Economics 42 (2015) A1
The Editors of the health economics journals named below believe that well-designed, well-executed empirical studies that address interesting and important problems in health economics, utilize appropriate data in a sound and creative manner, and deploy innovative conceptual and methodological approaches compatible with each journal’s distinctive emphasis and scope have potential scientific and publication merit regardless of whether such studies’ empirical findings do or do not reject null hypotheses that may be specified. As such, the Editors wish to articulate clearly that the submission to our journals of studies that meet these standards is encouraged.

We believe that publication of such studies provides properly balanced perspectives on the empirical issues at hand. Moreover, we believe that this should reduce the incentives to engage in two forms of behavior that we feel ought to be discouraged in the spirit of scientific advancement:

1. Authors withholding from submission such studies that are otherwise meritorious but whose main empirical findings are highly likely “negative” (e.g. null hypotheses not rejected).
2. Authors engaging in “data mining,” “specification searching,” and other such empirical strategies with the goal of producing results that are ostensibly “positive” (e.g. null hypotheses reported as rejected).

Henceforth we will remind our referees of this editorial philosophy at the time they are invited to review papers. As always, the ultimate responsibility for acceptance or rejection of a submission rests with each journal’s Editors.

American Journal of Health Economics
European Journal of Health Economics
Forum for Health Economics & Policy
Health Economics Policy and Law
Health Economics Review
Health Economics
International Journal of Health Economics and Management
Journal of Health Economics
http://dx.doi.org/10.1016/j.jhealeco.2015.06.002
0167-6296/© 2015 Published by Elsevier B.V.
Journal of Health Economics 42 (2015) 1–16
Article info

Article history:
Received 21 June 2013
Received in revised form 3 June 2014
Accepted 24 October 2014
Available online 24 February 2015

JEL classification: I1, L1, D8

Keywords: Information disclosure; Peer effects; Antibiotic overuse

Abstract

Mandatory information disclosure may allow sellers to observe and respond to other sellers’ attributes (seller peer effects) as well as informing consumers of the sellers’ attributes (consumer learning effect). Using the data from mandatory information disclosure of antibiotic prescription rates for the common cold in Korea, this paper shows that while average prescription rates decreased after the disclosure, more than 30% of the clinics increased their antibiotic prescriptions. Moreover, clinics that were prescribing relatively fewer antibiotics than other local clinics before the disclosure requirement were more likely to increase their prescription rate. The average prescription rates also declined less in markets with stronger clinic competition. These results are consistent with seller peer effects.

© 2015 Elsevier B.V. All rights reserved.
1. Introduction

When sellers have more information than buyers on the attributes of a product, the sellers can overstate the product’s quality and overcharge the buyers. Such an information asymmetry problem can lead to the collapse of markets (Stigler, 1961), distort investment decisions, and undermine the quality and safety of products and services including health care, foods, education, and the environment. Therefore, there is an increasing use of mandatory information disclosure as a regulatory mechanism to address this information asymmetry problem and to improve the quality of products and services.

However, mandatory information disclosure can reveal the attributes of products and services not only to consumers but also to other competing sellers. That is, even though the previous literature has largely focused on the effects of information disclosure to consumers, mandatory information disclosure can directly affect the interaction among sellers. In particular, when a seller learns that most other sellers were providing lower quality services, the seller may reduce its quality after the information disclosure.

In this paper, we consider a simple theoretical framework to distinguish whether (i) consumers learn the attributes of sellers from mandatory information disclosure and pressure the sellers to improve quality (called consumer learning effects) or (ii) sellers learn their competitors’ attributes from mandatory information disclosure and influence each other’s quality (called seller peer effects). Even though consumer learning effects suggest that mandatory information disclosure should increase the quality of all sellers, we show that seller peer effects may decrease the quality of some sellers.

Therefore, when introducing mandatory information disclosure, it is important for policy makers to understand the existence and the extent of seller peer effects. For example, in markets where sellers themselves do not know the attributes of other sellers, an information disclosure policy can introduce seller peer effects as well as consumer learning effects. Moreover, if the disclosed information is difficult for consumers to find or interpret, seller peer effects can dominate consumer learning effects.

Before we proceed further, it is worth clarifying the definition of peer effects in this paper. We define peer effects as a situation where an individual’s behavior or decisions are influenced by others’ behavior in a relevant peer group, called “endogenous peer effects” by Manski (1993). Such peer effects can arise from an intrinsic social preference for behaving like others. Such peer effects can also arise from rational decisions to obtain higher economic payoffs. For example, following the behavior of others can be cost-efficient and rational (Bikhchandani et al., 1992). While some may argue that peer effects arising from social preference are the true peer effects, we are more interested in whether consumers learn about sellers’ behavior from mandatory information disclosure or

∗ Corresponding author. Tel.: +82 2 880 8551.
E-mail addresses: ilkwon@snu.ac.kr (I. Kwon), dswin27@snu.ac.kr (D. Jun).
http://dx.doi.org/10.1016/j.jhealeco.2014.10.008
0167-6296/© 2015 Elsevier B.V. All rights reserved.
the competing sellers do. Therefore, in this paper, we do not distinguish between peer effects based on social preference and peer effects based on rational or strategic choice.

Empirically, we examine the effects of the 2006 mandatory public disclosure of the antibiotic prescription rates for the common cold of every clinic and hospital in Korea. In 2012, the director general of the World Health Organization (WHO) warned that overuse of antibiotics has led to widespread drug-resistant pathogens that are more difficult, toxic, and costly to treat.¹ However, antibiotics are still frequently prescribed for the common cold, often because of patient demands and hospital competition, even though they are not useful for fighting infections caused by viruses like the common cold, most sore throats, and bronchitis (Bennett et al., 2011; Robohm and Ruff, 2012). Thus, on February 9th of 2006, the Ministry of Health and Welfare in Korea began disclosing antibiotic prescription rates for the common cold online through the public disclosure website of the Health Insurance Review and Assessment Service (HIRA).

On average, we find that the antibiotic prescription rates for the common cold have decreased from 60% to 51% after the information disclosure. Surprisingly, however, we uncover a large amount of heterogeneity among the clinics. More than 30% of clinics have increased their antibiotic prescription rates after the information disclosure. In particular, among clinics whose antibiotic prescription rates were in the lowest quartile of local clinics before disclosure, almost half of them increased their prescription rates after disclosure. This finding is more consistent with seller (or clinic) peer effects. That is, when a clinic finds out that other clinics were prescribing relatively more antibiotics than itself, it is more likely to increase its antibiotic prescriptions.

Alternatively, consumers may prefer higher antibiotic prescription rates, and may have pressured the lower-than-average antibiotic prescribing clinics to increase their prescription rates. However, the evidence shows that for those clinics that were prescribing antibiotics relatively more than other local clinics before the information disclosure, consumers started visiting those clinics less after the disclosure. Moreover, in townships where consumers responded more negatively to the antibiotic prescription rates, the average antibiotic prescription rates decreased more. These results suggest not only that consumers learned from the information disclosure, but also that informed consumers prefer lower antibiotic prescription rates for the common cold.

We also find that in townships with relatively more clinics, the average antibiotic prescription rates after the information disclosure decreased less. This result suggests that stronger competition led to relatively higher antibiotic prescription rates and that the clinic peer effects triggered by mandatory information disclosure have reinforced this competition effect.

Overall, the empirical evidence supports both consumer learning effects and seller peer effects. The previous literature has implicitly assumed that sellers can observe their competitors’ attributes even before mandatory information disclosure, and has focused on consumer learning effects. This paper contributes to the literature by showing that when sellers cannot observe the attributes of their competitors’ products and services, mandatory information disclosure can allow the sellers to learn their competitors’ attributes and potentially trigger perverse peer effects. Because the seller peer effects can cancel out some of the consumer learning effects, our results may also explain why some previous studies have found no significant effect of information disclosure.

2. Background and previous literature

2.1. Information disclosure

Most previous literature has focused on the effect of information disclosure on consumers, or consumer learning effects. For example, without mandatory information disclosure, consumers may not be able to observe the quality of a product. Then, as Akerlof (1970) shows, firms cannot benefit from high quality and may leave the market, which can lead to the collapse of the whole market, called the lemon problem. In this case, quality information disclosure to consumers would benefit high quality firms, and provide incentives to improve quality.

Also, quality information disclosure can allow consumers to identify high quality firms more easily, and make them more sensitive to differences in quality. Then, information disclosure can lead to more competition among firms and may improve the quality of products (see, e.g., Stigler, 1961; Butters, 1977; Salop and Stiglitz, 1977; Jin and Leslie, 2003).²

However, the empirical evidence on the effect of information disclosure on quality (or other performance measures) is generally mixed. For example, Chipty and Witte (1998) find that information availability on the quality of child care has no significant effect on the quality of the care. However, Jin and Leslie (2003) find that information disclosure on restaurants’ hygiene has significantly improved their hygiene.

In the health care industry, Vladeck et al. (1988) do not find any significant differences in the occupancy rates between high- and low-mortality rate hospitals after the release of the HCFA (Health Care Financing Administration) data on hospital-specific mortality, while Mennemeyer et al. (1997) do find a small but significant effect. Longo et al. (1977) examine the impact of an obstetrics consumer report on hospital behavior in Missouri, and find that half of the hospitals improved the quality of their hospital care. Shekelle et al. (2008) provide a systematic survey of more recent studies, but show mixed results as well. In the electricity industry, many states in the US require electricity providers to disclose price and fuel mix so that consumers can compare prices and environmental impacts. However, these disclosure policies have not induced much consumer switching (Bird, 2009).

Note that the previous literature on mandatory information disclosure has mainly focused on the changes in consumers’ behavior from learning new information on product quality (consumer learning effect), which can induce the changes in firms’ behavior. Few studies, however, have considered the direct effect of information disclosure on firms’ behavior. Some exceptions include the studies on the effect of information disclosure on firms’ collusion (see, e.g., Albaek et al., 1997; Njoroge, 2003).³ However, these studies do not explain why firms often oppose mandatory information disclosure.⁴ Consequently, when the effect of mandatory

¹ Available from http://www.cbsnews.com/8301-504763_162-57398949-10391704/who-antibiotic-overuse-so-prevalent-scraped-knee-could-be-deadly/
² On the other hand, quality information disclosure may allow consumers to perceive the difference between firms, and increase product differentiation among firms. Then information disclosure would reduce competition (Nelson, 1974; Jin and Leslie, 2003).
³ There is also a theoretical literature that shows firms would disclose their quality voluntarily if they know each others’ quality, called the unraveling effect (Grossman and Hart, 1980; Milgrom, 1981). Therefore, it is a theoretical puzzle why firms in reality do not disclose their quality (see Board, 2009). Our empirical evidence suggests that firms may not know each others’ quality (see also Matthews and Postlewaite, 1985; Shavell, 1994).
⁴ For example, in 2000 the National Hospital Association opposed a proposal to impose mandatory information disclosure on fatal and other serious medical errors. (CNN News February 22, 2000) In 1998, the National Restaurant Association strongly opposed the mandatory display of hygiene “grade cards” (Food Council News, Vol. 5, Issue 1, January 2002). In 2006, the Korean Congress attempted
information disclosure is insignificant, studies often blame consumers’ mistrust, disinterest, or lack of understanding (see, e.g., Hibbard and Jewett, 1997; Marshall et al., 2000).

Jun and Chung (2011) also analyze the effect of information disclosure on antibiotic prescription rates in Korea. However, they focus on the change in average prescription rates, and do not analyze clinic heterogeneity or consumer response. More importantly, they do not consider the difference between consumer learning effects and seller peer effects.

2.2. Peer effects

Information disclosure regulations can provide new information to competing firms as well as to consumers. In the presence of peer effects, learning competitors’ behavior can directly affect firms’ behavior.

As Fortin et al. (2007) summarize, peer effects can arise for several reasons. In the context of antibiotic prescription for the common cold, doctors may feel less guilt and prescribe more antibiotics when they find that other doctors are prescribing potentially ineffective antibiotics as well (the social conformity effect).⁵ Also, when doctors find out that their peers are prescribing antibiotics for the common cold, they can learn that prescribing antibiotics for the common cold may have benefits without strong side effects, may not lead to drug-resistant bacteria, and may not induce strong regulatory resistance (the social learning effect). Finally, when other doctors prescribe antibiotics to attract more patients instead of educating the patients on the ineffectiveness and potential harms of antibiotics, doctors may feel it is unfair and become more likely to prescribe antibiotics to restore equity (the fairness effect).⁶

As Schelling (1978) and Akerlof (1980) point out, these peer effects can generate a “social multiplier effect”, as observing other doctors’ antibiotic prescription rates would encourage and reinforce the increase in prescription rates even further (see also Glaeser et al., 2003; Fischer and Huddart, 2008). Malani et al. (2008) argue that overuse of antibiotics is due to a social norm established among doctors and patients, and is difficult to change in the short-term.

Empirically, there is increasing evidence of peer effects in economics.⁷ However, the observed correlation among clinics’ prescription rates does not necessarily imply peer effects, because clinics in a given market may be subject to similar unobserved characteristics and shocks (exogenous effects and correlated effects). As Manski (1993) shows, it is generally difficult to distinguish endogenous peer effects from exogenous or correlated effects.

In this paper, we exploit the introduction of an information disclosure regulation. Note that the regulation can disclose new information (on prescription rates in our context) not only to consumers but also to competing clinics. If the regulation does disclose new information to the competing clinics (that is, the competing clinics did not know the disclosed information before), in the presence of peer effects, the disclosed information would trigger the social multiplier effects. However, the regulation would not change clinic or market characteristics, and consequently should not change the exogenous or correlated effects. Thus, the change in antibiotic prescription rates after the regulation would be due to the endogenous peer effects, not to the exogenous or correlated effects. On the other hand, the information disclosure would also change the consumers’ demand function. Therefore, we still need to distinguish between the changes due to clinic peer effects and the changes due to consumer learning effects.

Mas and Moretti (2009) distinguish between two types of peer effects. When a worker is observed by a high productivity worker, they find that the productivity of the worker being observed increases. However, when a worker observes another highly productive worker, the productivity of the first worker does not increase. In other words, they find a significant social pressure effect, but no social conformity effect. It is not clear whether this pattern will hold in other work environments, but their study shows that it is important to distinguish between the social pressure effect and the social conformity effect in discussing peer effects.

2.3. Antibiotic overuse and the common cold

The common cold, or Acute Upper Respiratory Tract Infection (ARTI), is one of the most common illnesses known to humans and one of the most common reasons patients visit hospitals. Annually, about $227 million are estimated to be spent on antibiotics for the treatment of the common cold in the United States.⁸ However, antibiotics do not treat upper respiratory infections caused by viruses like the common cold. Controlled clinical trials have consistently shown that antibiotic therapy does not treat the common cold. In addition, antibiotics may have caused many complications and side-effects. In particular, the overuse of antibiotics has led to a rise in antibiotic-resistant bacteria (Gonzales et al., 2001). Infections due to penicillin-resistant bacteria are especially difficult to treat. In the United States, 4–7 billion dollars are spent on the treatment of resistant infections each year (Lautenbach et al., 2001; Bennett et al., 2011).

The causes for the overuse of antibiotics, especially for the common cold, are controversial. Patients may demand antibiotics to mask symptoms and gain psychological comfort (Butler et al., 1998). Doctors may overprescribe antibiotics to retain their patients (Brody, 2005). Also, when it is not clear whether the infection is caused by a virus (that cannot be treated by antibiotics) or bacteria (that can be treated by antibiotics), doctors may simply prescribe antibiotics to avoid time-consuming medical tests to discern whether the infection is bacterial or viral. Or doctors may prescribe antibiotics in order to avoid potential lawsuits and conflicts for not providing antibiotics when it turns out to be a bacterial infection later. Currie et al. (2012) show that financial kickbacks (or rebates) from pharmaceutical companies for prescribing antibiotics are also important causes of overprescription in China.

According to OECD Health Data 2011, Korea has the sixth highest rate of antibiotic use among OECD countries. In particular, the average antibiotic prescription rates for the common cold in 2005 were over 60%. Consequently, the prevalence of S. pneumoniae with reduced susceptibility to penicillin is an alarming 70% in Korea, compared to 25% in the United States (Conly, 1998).

(footnote 4, continued) to pass a bill mandating a notice for antibiotics on the prescription, but failed mostly due to opposition by the hospitals. (See http://www.yakup.co.kr/news/index.html?cat=11&cat2=51&cat3=&mode=view&nid=82641&num_start=5170&pmode=, in Korean.)
⁵ Doctors often unwillingly prescribe antibiotics due to pressure from patients, knowing that antibiotics are ineffective for the treatment of the common cold. In other words, prescribing antibiotics for the common cold can have psychic costs to the doctors. Gordon (1989) and Myles and Naylor (1996) argue that in the case of tax evasion, individuals can derive a psychic payoff, or feel less guilt, from adhering to the average pattern of their reference group.
⁶ Spicer and Becker (1980) show that those who believe that they are treated unfairly by the tax system are more likely to evade taxes.
⁷ See, for example, Gaviria and Raphael (2001) for drug use, Wilson (2007) for cigarette smoking, Sacerdote (2001) for GPA, Carrell et al. (2008) for academic cheating, Duflo and Saez (2002) for investment decisions, Fortin et al. (2007) for tax evasion.
⁸ Cited from http://www.npcentral.net/ce/colds/cold.shtml.
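
As a reading aid for the identification argument in Section 2.2, the three effects distinguished by Manski (1993) are commonly written in a linear-in-means form. The notation below ($y_{ij}$, $x_{ij}$, $\beta$, $\gamma$, $\delta$, $u_j$) is an assumption of this sketch and is not taken from the paper; the linear form is only one possible specification, with $x_{ij}$ treated as a scalar for simplicity.

```latex
% Hedged sketch: linear-in-means model in the spirit of Manski (1993).
% Assumed notation (not the authors'):
%   y_{ij} : antibiotic prescription rate of clinic i in local market j
%   x_{ij} : observed characteristic of clinic i
%   u_j    : unobserved shock common to all clinics in market j
% Requires amssymb (or amsfonts) for \mathbb.
\begin{equation}
  y_{ij} = \alpha
         + \beta\,  \mathbb{E}[\, y \mid j \,]   % endogenous peer effect
         + \gamma\, \mathbb{E}[\, x \mid j \,]   % exogenous (contextual) effect
         + \delta\, x_{ij}                       % own characteristic
         + u_j                                   % correlated effect
         + \varepsilon_{ij}
\end{equation}
```

Read against this form, the argument in Section 2.2 amounts to assuming that the disclosure leaves $x_{ij}$ and $u_j$ unchanged and only makes $\mathbb{E}[y \mid j]$ observable to clinics, so that a pre/post change in $y_{ij}$ would not be attributed to the $\gamma$ or $u_j$ terms. Because disclosure may also shift consumer demand, such a change would still mix the endogenous term with consumer learning, which is the distinction the text goes on to draw.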
In part to reduce the overuse of antibiotic prescriptions, in 2000 the Korean government prohibited doctors from selling medications and allowed them to write prescriptions only. Before 2000, doctors were allowed to dispense/sell medications to their patients. As a result, under the old system doctors had financial incentives to prescribe and dispense more medications including antibiotics. In 2000, the Health Insurance Review and Assessment Service (HIRA) was established to monitor and encourage proper drug prescription. Despite these efforts to prevent overuse of antibiotics, antibiotic prescription rates have remained high (Ministry of Health and Welfare, 2006). On January 5th, 2006, the Seoul Administration Court gave a verdict mandating the disclosure of the antibiotic prescription rates for the common cold of every clinic and hospital in Korea.

3. Theoretical framework

3.1. Peer effects

On the supply side, we assume that antibiotic prescription rates are subject to peer effects among clinics. However, peer effects can arise only when the behavior (that is, antibiotic prescription rates) is observable by peers. Ali et al. (2011), for example, show that adolescents’ sleep habits and breakfast consumption (unobservable by their peers) are not influenced by peers, but that participation in sports or eating at fast food restaurants (observable by their peers) are influenced by peers.

Suppose that clinics could not observe other clinics’ antibiotic prescription rates before the information disclosure. Then mandatory information disclosure would allow clinics to observe the antibiotic prescription rates of other clinics, and trigger peer effects among the clinics.

According to Mas and Moretti (2009), there can be two types of peer effects. The first type is the social conformity effect. Before the information disclosure, when clinics could not observe other clinics’ antibiotic prescription rates, each clinic would prescribe antibiotics based on its own norm or expectations of other clinics’ antibiotic prescription rates. After the information disclosure,

(i) average antibiotic prescription rates would decrease;
(ii) an individual clinic’s antibiotic prescription rate can decrease if it is relatively higher than other clinics’ or if the social pressure effect is sufficiently strong;
(iii) an individual hospital’s antibiotic prescription rate can increase if it is relatively lower than other clinics’ and if the social pressure effect is sufficiently weak.

3.2. Consumer learning effects

On the demand side, from mandatory information disclosure, consumers may learn individual clinics’ actual antibiotic prescription rates, and update their belief on each clinic’s quality. We assume that average consumers regard lower antibiotic prescription rates for the common cold as a good signal for the clinic’s quality. Therefore, after the information disclosure, the market demand for relatively lower (higher) prescribing clinics would increase (decrease). Then consumer learning along with clinic competition would lead clinics to reduce their antibiotic prescription rates. This is called the consumer learning effect.

Note that consumers’ ideal levels of antibiotic prescription rates for the common cold can differ, and some consumers may want even higher antibiotic prescription rates for their clinics. However, market demand reflects the aggregate (or average) of individual demands. Thus, as long as consumers’ average ideal level of antibiotic prescription rates for the common cold is lower than clinics’ actual antibiotic prescription rates, the market would regard lower antibiotic prescription rates as a good signal for a clinic’s quality.

Yoo et al. (2009) show in a 2009 survey that 80% of Korean consumers think clinics are prescribing too many antibiotics. They also show that only 10.7% of consumers want antibiotic prescriptions for the common cold. Therefore, while consumers may want some level of antibiotic prescription for the common cold, it appears that most clinics are prescribing more antibiotics than the average consumer wants. For example, as discussed above, doctors may simply prescribe antibiotics to avoid time-consuming medical tests and consumer education to determine whether an infection is bacterial or viral. Or doctors may prescribe antibiotics in order to avoid
if a clinic finds out that many other local clinics were prescribing potential lawsuits and conflicts for not providing antibiotics when
relatively more (fewer) antibiotics than itself, the social confor- it turns out to be a bacterial infection, or possibly to gain financial
mity effect would lead the clinic to increase (decrease) its antibiotic kickbacks from pharmaceutical companies.
prescription rates. In the context of antibiotic prescription rates for the common
Note that the social conformity effect may arise from a social cold, mandatory information disclosure can also educate con-
preference for behaving like others or from rational decisions. For sumers about the difference between viral and bacterial infection
example, when a clinic finds out that it is prescribing more antibi- and the ineffectiveness of antibiotics for viral infections such as the
otics than the average, it may reduce its antibiotic prescription rates common cold. Thus, mandatory information disclosure can lower
because of possible social or regulatory pressure, learning of the consumers’ ideal level of antibiotic prescription rates for the com-
increased danger of drug-resistant bacteria or the potential side mon cold as well. Then, low (high) antibiotic prescription rates
effects of antibiotics. for the common cold would become an even better (worse) sig-
The second type of peer effect is the social pressure effect that nal for clinic quality, and this would put stronger market pressure
would pressure clinics to do the socially ‘right’ thing. In the context on clinics to reduce their antibiotic prescription rates. Also, con-
of antibiotic prescription for the common cold, the socially right sumers would become less likely to visit clinics for the common
thing would be not to prescribe antibiotics. cold in general, and more likely to get over-the-counter medicine
Since the social conformity effects for individual clinics are likely from pharmacies instead. That is, the total market demand for clinic
to cancel each other out, mandatory information disclosure would visits is likely to decrease.
lead to less average antibiotic prescription rates among clinics due Moreover, assuming that consumers prefer lower antibiotic pre-
to the social pressure effect. scription rates for the common cold (, which we will test empirically
Then, in the case where clinics could not observe other clin- below), the consumer learning effect should reduce both the aver-
ics’ antibiotic prescription rates before the information disclosure, age and individual antibiotic prescription rates of all clinics. In
we can summarize the peer effects triggered by the information contrast, from Proposition 1, clinic peer effects predict that some
disclosure in the following proposition. clinics would increase antibiotic prescription rates, while the aver-
age prescription rates may fall.
Proposition 1. Suppose that mandatory information disclosure Then, in the case where consumers could not observe clin-
triggers peer effects among clinics. Then, ics’ antibiotic prescription rates before the information disclosure,
I. Kwon, D. Jun / Journal of Health Economics 42 (2015) 1–16 5
[Fig. 1: panels (a) and (b), each with horizontal axis "Competition" and a series labeled "(after disclosure)".]
we can summarize the consumer learning effects triggered by the information disclosure as follows:

Proposition 2. Suppose that mandatory information disclosure triggers consumer learning effects. Then,
(i) market demand for relatively lower (higher) prescribing clinics would increase (decrease);
(ii) total market demand would decrease;
(iii) the average antibiotic prescription rates would decrease;
(iv) individual clinic’s antibiotic prescription rates would decrease.

3.3. Competition effects

Suppose that before the mandatory information disclosure regulation, neither the clinics nor the consumers could observe other clinics’ antibiotic prescription rates. As discussed earlier, clinics may prescribe antibiotics for cost saving or financial kickbacks. Also, when consumers cannot observe or compare antibiotic prescription rates, prescribing more antibiotics may increase demand as it can mask the symptoms, and reduce the number of tests. Then, with more clinics, stronger competition can lead to higher antibiotic prescription rates. For example, Fogelberg and Karlsson (2012) and Bennett et al. (2011) show that stronger competition has a positive effect on antibiotic prescription rates.
Suppose that after the mandatory information disclosure regulation, clinics observe other clinics’ antibiotic prescription rates but consumers cannot, possibly because the disclosed information is too difficult for consumers to find or interpret. In the presence of clinic peer effects, stronger competition would lead to even higher antibiotic prescription rates because if one clinic increases its prescription rates, other clinics would increase their prescription rates as well, in accordance with the social multiplier effect. That is, while Proposition 1 predicts that the average antibiotic prescription rates will fall due to the social pressure effect, if competition among clinics becomes stronger, the prescription rates after the disclosure would be relatively larger. Therefore, when competition is strong, information disclosure would decrease antibiotic prescription rates less in absolute value. (See Fig. 1(a) for an illustration.)
Now suppose that after mandatory information disclosure, consumers can observe and compare the antibiotic prescription rates of local clinics, and that they prefer lower prescription rates. With more clinics, consumers have more choices for clinics, and can switch clinics more easily. Then, after the information disclosure, stronger competition among clinics would lead to lower antibiotic prescription rates. That is, if information disclosure triggers consumer learning effects, the average antibiotic prescription rates would decrease even more when there is stronger clinic competition in the market. (See Fig. 1(b) for an illustration.)
To summarize,

Proposition 3.
(i) If mandatory information disclosure triggers clinic peer effects, it would reduce the average antibiotic prescription rates less (in absolute value) when competition is stronger.
(ii) If mandatory information disclosure triggers consumer learning effects, it would reduce the average antibiotic prescription rates even more (in absolute value) when competition is stronger.

As discussed in the beginning, one of the main justifications for the mandatory information disclosure policy has been based on the interaction between consumer learning effects and the competition effect. That is, when consumers are informed of the true quality of products and services, they would choose sellers with higher quality products and services. Therefore, with mandatory information disclosure, competition among sellers would force the sellers to increase the quality of products, and/or drive the low quality product sellers out of the market.
However, Proposition 3 suggests that if the sellers themselves, not the consumers, are informed of the true quality of their competitors’ products and services by mandatory information disclosure, stronger competition can reduce the effect of mandatory information disclosure, which undermines the conventional justification for a mandatory information disclosure policy.
It is also worth emphasizing that Proposition 3 provides a possible way to distinguish between consumer learning effects and clinic peer effects even when individual clinic level prescription data are not available. Propositions 1 and 2 show that individual clinics’ antibiotic prescription rates would respond to the information disclosure differently depending on whether the information disclosure triggers clinic peer effects or consumer learning effects. However, both effects predict that the market average antibiotic prescription rates will fall. Therefore, unless researchers have data on individual clinic level antibiotic prescription rates, it can be difficult to distinguish between clinic peer effects and consumer learning effects.
Proposition 3, however, shows that the interaction effect between competition and information disclosure on average antibiotic prescription rates would differ depending on whether the clinic peer effect or the consumer learning effect dominates. Therefore, even with market level data on average antibiotic prescription rates, one can potentially distinguish between the two effects.
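The sign prediction in Proposition 3 can be illustrated with a minimal Python sketch. It uses simulated data, not the paper's data, and regresses the prescription rate on a disclosure dummy, a competition measure, and their interaction. A positive interaction coefficient corresponds to the clinic peer effect pattern (disclosure lowers rates less where competition is stronger), and a negative one to the consumer learning pattern. The function names, variable names, and all simulated numbers are illustrative assumptions.

```python
import numpy as np

def interaction_coefficient(rate, disclosure, competition):
    """OLS of rate on [1, disclosure, competition, disclosure * competition].

    Returns the coefficient on the interaction term.
    """
    X = np.column_stack([
        np.ones_like(rate),
        disclosure,
        competition,
        disclosure * competition,
    ])
    coef, *_ = np.linalg.lstsq(X, rate, rcond=None)
    return coef[3]

def classify(coef):
    """Map the sign of the interaction to the case in Proposition 3."""
    return "clinic peer effects" if coef > 0 else "consumer learning effects"

# Simulated observations (hypothetical numbers, not the paper's data).
rng = np.random.default_rng(0)
n = 2000
disclosure = rng.integers(0, 2, size=n).astype(float)
competition = rng.uniform(0.5, 3.0, size=n)  # e.g. clinics per 1000 population
noise = rng.normal(0.0, 1.0, size=n)

# Case 1: disclosure lowers rates, but less so where competition is strong.
rate_peer = (60 + 2 * competition - 10 * disclosure
             + 1.5 * disclosure * competition + noise)
# Case 2: disclosure lowers rates more where competition is strong.
rate_learning = (60 + 2 * competition - 10 * disclosure
                 - 1.5 * disclosure * competition + noise)

b_peer = interaction_coefficient(rate_peer, disclosure, competition)
b_learning = interaction_coefficient(rate_learning, disclosure, competition)
print(classify(b_peer), "|", classify(b_learning))
```

The sketch ignores fixed effects and standard errors; it only shows how the sign of one coefficient separates the two cases when market level averages are all that is available.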
9
Ministry of Health and Welfare 2007-1-12 press kit “Results from a Survey on Medical Service and Provision After Information Disclosure of Antibiotics Prescription Rates”.
10
As of April 2014, various new performance ratings for several diseases (e.g. diabetes), operations (e.g. breast cancer), and prescriptions (e.g. antibiotics for acute otitis media in children) can also be found and compared on the same website. However, this information was not disclosed during our sample period (2005–2009).
11
For some patients, such as infants or those with a prior history of lower respiratory infections, prescribing antibiotics for the common cold proactively can be the right decision. However, such risk adjustments were not made. On the other hand, the clinics can prescribe antibiotics to such patients under different diagnostic codes.
12
The township characteristics are available from Seoul Statistics Information.
13
Given that the data report the antibiotic prescription rates for the common cold, most clinics in the data are specialized in general medicine, internal medicine, pediatrics, and otorhinolaryngology. However, some clinics in the data are specialized in surgery, tuberculosis, and dermatology. Thus, we will include the clinic or the specialty fixed effects in our analysis. Also, focusing on general medicine and internal medicine specialties only (47% of the sample) does not change the results.
Fig. 2. Online disclosure of antibiotic prescription rates for the common cold. Note: The English translations in the figure are made and inserted by the authors.
dummies, and 23 medical specialty dummies.14 The estimated effect of information disclosure is a 9.67 percentage point reduction in the average antibiotic prescription rate for the common cold. In column (2), we control for clinic characteristics and township characteristics. In column (3), we control for township fixed effects. In column (4), we control for clinic fixed effects. Finally, in column (5), we use the balanced sample only. These models show that the estimated effect of information disclosure is in the range of −9.67% to −8.98%. These estimates are close to the estimates from earlier studies. For example, Jun and Chung (2011) estimate the effect to be around −9.53% to −6.49%.
From Propositions 1(i) and 2(iii), the decline in the average antibiotic prescription rates is consistent with both clinic peer effects and consumer learning effects. That is, clinics may have reduced their antibiotic prescription rates either because of social pressure to do the right thing or because of market pressure to keep the informed patients.
However, we cannot rule out the possibility that the decline was driven by other concurrent unobserved shocks such as a sudden
14
Including year dummies or a linear time trend does not change the results.
Table 1
Summary statistics of selected variable.
District (gu) College degree (%) 42,020 23.19 7.12 11.74 38.20
decrease in the more severe type of common cold. Jun and Chung (2011) provide difference-in-difference estimates using those clinics whose antibiotic prescription rates were not disclosed online as a comparison group. Recall that the clinics that have written out fewer than one hundred prescriptions for antibiotics per quarter were not disclosed online. In columns (1)–(3) of Table 3, we replicate the results from Jun and Chung (2011) with the clinic fixed effects. “Open” is a dummy variable to indicate whether the prescription rates are disclosed online or not. Note that column (2) of Table 3 shows that the effect of disclosure for the comparison group is much smaller. Alternatively, in column (3), we use the full sample, and estimate the effect of the interaction term between disclosure and the open dummy variable. From these difference-in-difference estimates, the effect of mandatory information disclosure is about −5.2%.
However, even when a clinic’s antibiotic prescription rate is not disclosed online in one quarter, the clinic’s prescription rates can be disclosed online in other quarters when the number of its prescriptions becomes more than one hundred. In fact, the prescription rates of every clinic were disclosed at least once during our sample period. Then, the difference-in-difference estimates are likely to underestimate the true effect of the mandatory information disclosure. Therefore, unless specified otherwise, we will focus on the disclosed samples with the clinic fixed effects as in column (4) of Table 2 for a base specification, and use the difference-in-difference model in column (3) of Table 3 to check the robustness of the results.
Also, clinics may have known about the mandatory information disclosure and changed their prescription rates even before the actual disclosure on February 9th, 2006. Recall that the litigation for the information disclosure was filed in June 2005, and that the verdict for the disclosure was delivered on January 5th, 2006. But our disclosure dummy variable in Tables 2 and 3 assumes that the event took place in the beginning of the second quarter in 2006. Then, our estimates are likely to underestimate the true effect of mandatory information disclosure.
However, the court verdict on January 5th, 2006 came as a surprise because the government had argued that prescription rates were a part of business secrets and were exempted from any
Table 2
Change in antibiotic prescription rates for the common cold (dependent variable = antibiotic prescription rate (%)).
Notes: Column (3) controls for the township fixed effects. Columns (4) and (5) control for the clinic fixed effects. In column (5), we use the balanced sample only.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.
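The difference-in-difference comparison between disclosed (Open = 1) and non-disclosed (Open = 0) clinics can be illustrated with a two-by-two calculation in Python. This is a simplified sketch on made-up numbers: it works from four cell means and omits the clinic fixed effects and quarterly dummies used in the reported regressions. The function name and the data are hypothetical.

```python
import numpy as np

def did_estimate(rate, post, treated):
    """Difference-in-differences from the four group-by-period cell means."""
    rate, post, treated = map(np.asarray, (rate, post, treated))

    def cell(t, p):
        # Mean rate for group t (1 = disclosed) in period p (1 = after).
        return rate[(treated == t) & (post == p)].mean()

    change_treated = cell(1, 1) - cell(1, 0)
    change_control = cell(0, 1) - cell(0, 0)
    return change_treated - change_control

# Hypothetical clinic-quarter observations (illustrative numbers only).
rate    = [60.0, 62.0, 51.0, 53.0, 40.0, 42.0, 37.0, 39.0]
post    = [0,    0,    1,    1,    0,    0,    1,    1]
treated = [1,    1,    1,    1,    0,    0,    0,    0]

print(did_estimate(rate, post, treated))  # -6.0
```

Here the disclosed group falls by 9 points and the comparison group by 3, so the difference-in-difference estimate is −6. In a regression with a disclosure dummy, a group dummy, and their interaction, this quantity is the coefficient on the interaction term.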
[Fig. 3 panels: "(a) full sample" and a second panel; vertical axis "prescription rate(%)", horizontal axis "time", 2005q1–2009q1.]
Fig. 3. Average antibiotic prescription rates.
[Fig. 4: histogram; vertical axis "Density", horizontal axis from −100 to 100.]

information disclosure requirement. Therefore, it is unlikely that clinics changed their prescription rates before January 2006. On the other hand, it is possible that some clinics started changing their prescription rates right after the verdict even before the actual disclosure. In fact, Fig. 3 shows that the average prescription rates started to decline in the first quarter of 2006. Thus, in column (4) of Table 3, we have excluded the sample from the first quarter of 2006. Alternatively, in column (5), we redefined the disclosure dummy to one if it is strictly after the fourth quarter of 2005. However, the results do not change much.

Table 3
Change in antibiotic prescription rates for the common cold: robustness (dependent variable = antibiotic prescription rate (%)).
                     (1)          (2)          (3)          (4)          (5)
                     Open = 1     Open = 0     All          Open = 1     Open = 1
Disclosure           −9.3494***   −3.4096***   −4.0870***   −9.2608***
                     (0.1397)     (0.6693)     (0.3606)     (0.1418)
Open                                           3.4230***
                                               (0.5727)
Disclosure × open                              −5.2779***
                                               (0.3899)
Disclosure 1                                                             −9.3696***
                                                                         (0.1492)
Fixed effect         Clinic       Clinic       Clinic       Clinic       Clinic
Observations         42,020       7,594        49,614       39,400       42,020
R-squared            0.1057       0.0049       0.0747       0.1068       0.0947
Notes: All regressions include quarterly dummies and clinic fixed effects. Column (1) uses the sample of clinics whose prescription rates are disclosed online (Open = 1). Column (2) uses the sample of clinics whose prescription rates are not disclosed online because they had less than 100 prescriptions (Open = 0). In column (3), we use the full sample and estimate the interaction effect between “Disclosure” and “Open” dummies. In column (4), we exclude the observations in 2006Q1. In column (5), we redefine the “Disclosure” dummy to be one if date is strictly after 2005Q4.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.

out that there are large variations. For example, when we plot a histogram for the simple difference in the average antibiotic prescription rates before and after the information disclosure for each clinic, Fig. 4 shows that the changes in prescription rates vary widely across clinics. In particular, 30% of the clinics have increased their antibiotic prescription rates after the regulation.
Proposition 2(iv) shows that assuming consumers prefer fewer antibiotic prescriptions, consumer learning effects should lead to a fall in the prescription rates of every clinic. However, Proposition 1(ii) and (iii) show that clinic peer effects can cause individual clinics’ prescription rates to either increase or decrease depending on whether their prescription rates are relatively higher or lower than their peers’. Therefore, Fig. 4 is more consistent with Proposition 1, suggesting the existence of clinic peer effects.
More specifically, Fig. 5 shows each individual clinic’s prescription rate over time in one particular township. Note that those clinics with higher-than-average pre-disclosure prescription rates are more likely to decrease their rates post-disclosure (e.g. ID = E, F). Also, those clinics with lower-than-average prescription rates pre-disclosure are more likely to increase their rates (e.g. ID = C, D). These patterns are consistent with the clinic peer effects as discussed in Proposition 1(ii) and (iii), even though not all clinics follow these patterns.
To check whether the patterns in Fig. 5 generalize to other townships, we first create dummy variables, P25, P50, P75, and P100 where P25 = 1 if a clinic’s antibiotic prescription rate in 2005.Q1 is lower than the 25 percentile in a township, P50 = 1 if a clinic’s antibiotic prescription rate is between the 25 percentile and 50 percentile in a township, and so on. Then, for the townships with more than 100 observations, we estimate the following model by each township:

\[
\begin{aligned}
\text{prescription rate}_{it} = {} & \beta_0 + \beta_1\,\text{Disclosure}_t \times \text{P25}_i + \beta_2\,\text{Disclosure}_t \times \text{P50}_i + \beta_3\,\text{Disclosure}_t \times \text{P75}_i \\
& + \beta_4\,\text{Disclosure}_t \times \text{P100}_i + (\text{Quarter Dummies}_t) + \delta_i + \varepsilon_{it},
\end{aligned}
\tag{1}
\]

where $\text{Disclosure}_t = 1$ if date is after 2006.Q1, and $\delta_i$ is a clinic fixed effect.
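A model of the form of Eq. (1) can be estimated with the standard within (fixed effects) transformation. The Python sketch below is an illustration on simulated data, not a reproduction of the paper's estimation: it omits the quarter dummies to keep the example short, assigns clinics to quartile groups directly rather than from observed 2005.Q1 rates, and uses made-up coefficient values. All names are hypothetical.

```python
import numpy as np

def within_ols(y, X, clinic_ids):
    """Fixed-effects (within) estimator: demean y and X by clinic, then OLS."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    ids = np.asarray(clinic_ids)
    y_d = y.copy()
    X_d = X.copy()
    for c in np.unique(ids):
        m = ids == c
        y_d[m] -= y[m].mean()          # removes the clinic fixed effect
        X_d[m] -= X[m].mean(axis=0)
    coef, *_ = np.linalg.lstsq(X_d, y_d, rcond=None)
    return coef

# Simulated panel (hypothetical numbers, not the paper's data).
rng = np.random.default_rng(1)
n_clinics, n_quarters = 200, 8
clinic = np.repeat(np.arange(n_clinics), n_quarters)
quarter = np.tile(np.arange(n_quarters), n_clinics)
disclosure = (quarter >= 4).astype(float)     # 1 in the later quarters
group = clinic % 4                            # assumed quartile group: 0=P25 ... 3=P100
true_beta = np.array([3.0, -2.0, -8.0, -15.0])  # assumed post-disclosure changes

clinic_effect = rng.normal(50, 10, n_clinics)[clinic]
rate = (clinic_effect + true_beta[group] * disclosure
        + rng.normal(0, 2, clinic.size))

# Regressors: Disclosure x P25, Disclosure x P50, Disclosure x P75, Disclosure x P100.
X = np.column_stack([disclosure * (group == g) for g in range(4)])
beta_hat = within_ols(rate, X, clinic)
print(np.round(beta_hat, 2))
```

Each estimated coefficient is the average post-disclosure change for clinics in the corresponding pre-disclosure quartile, which is the interpretation used for the histograms of the estimated coefficients.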
[Fig. 5: one panel per clinic ID (A–J); vertical axis "prescription rate(%)", 0–100; horizontal axis "time", 2005q1–2009q1. Graphs by clinic ID.]
Fig. 6 shows the histograms for the estimated $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ for each township. Note that $\beta_1$ measures the change in antibiotic prescription rates for those clinics whose prescription rates before the information disclosure were in the lowest 25 percentile in the township. Likewise, $\beta_4$ measures the change in antibiotic prescription rates for those clinics whose pre-disclosure rates were in the top 25% in the township.
From Fig. 6(d), a majority (90%) of the township’s top 25% pre-disclosure clinics decreased their rates post-disclosure. However, from Fig. 6(a), almost a half (47%) of the lowest 25% pre-disclosure clinics increased their rates post-disclosure.
Alternatively, in Table 4, we estimate a probit model where the dependent variable is one if a clinic has increased its antibiotic prescription rates after the information disclosure regulation, and zero
[Fig. 6: four histogram panels; vertical axis "Density".]
Table 4
Increase in antibiotic prescription rates: probit analysis (dependent variable = 1 if a clinic has increased prescription rate after the information disclosure, = 0 otherwise.)
(0.0843) (0.1047)
P75 −0.4398*** −0.3572***
(0.0913) (0.1125)
P100 −0.6571*** −0.6148***
(0.0881) (0.1111)
Prescription rate ranking (before disclosure) −0.8416*** −0.7941***
(0.1074) (0.1342)
Medical speciality Yes Yes Yes Yes
Random effect Township Township Township Township
Notes: All regressions control for number of doctors, clinic age, number of clinics, population, the share of age 60 and older, medical speciality dummies and the township
random effects. In columns (2) and (4), we use the balanced sample only. P50 is equal to one if prescription rate before the information disclosure was between 25 percentile
and 50 percentile, and zero otherwise. P75 and P100 are defined in a similar way.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.
otherwise. Column (1) of Table 4 shows that clinics are relatively more likely to increase their rates if they were relatively lower within a township pre-disclosure. Also, when we control for the relative ranking of antibiotic prescription rates within a township before disclosure, column (3) shows similar results. In column (2) and (4), we restrict the analysis to the balanced sample, but the results do not change.
Therefore, not all clinics have decreased antibiotic prescription rates after the information disclosure regulation. In particular, clinics with relatively lower pre-disclosure prescription rates are much more likely to increase their rates after the regulation. These results are consistent with our hypothesis that information disclosure regulation can allow the clinics to learn their competitors’ prescription rates and trigger perverse peer effects.

5.3. Regression to the mean

Regression to the mean can be an alternative explanation for the finding that those clinics with lower than average antibiotic prescription rates pre-disclosure are more likely to increase prescription rates post-disclosure. To evaluate the extent of regression to the mean bias, we focus on the sample after 2007.Q1. Then we hypothetically assume that there was an information disclosure in 2008.Q1. From Fig. 3, the market prescription rates appear to have reached a new steady state equilibrium by 2007Q1. Thus, a hypothetical disclosure in 2008Q1 should not have any further effect on the prescription rates.

Table 5
Regression to the mean (dependent variable = antibiotic prescription rates (%)).
                            (1)           (2)           (3)
(a) 2007Q1–2009Q2
Disclosure (hypothetical)   0.1772        3.8382***     3.2614***
                            (0.1324)      (0.2923)      (0.3125)
Ranking (at 2007Q1)                       75.7823***    76.2029***
                                          (1.3518)      (1.6701)
Disclosure × ranking                      −6.4400***    −5.0534***
                                          (0.4431)      (0.4717)
Random effect               Clinic        Clinic        Clinic
Observations                21,854        21,223        15,547
(b) 2005Q1–2006Q4
Disclosure (actual)         −10.3428***   −1.4340***    −1.3909***
                            (0.1770)      (0.3762)      (0.4411)
Ranking (at 2005Q1)                       71.2326***    68.5265***
                                          (1.3069)      (1.6671)
Disclosure × ranking                      −15.0551***   −14.8017***
                                          (0.5615)      (0.6612)
Random effect               Clinic        Clinic        Clinic
Observations                19,306        17,363        12,240
Notes: All regressions control for the number of doctors, the number of doctors and staff, clinic age, number of clinics, population, share of age 65 and older, quarterly dummies, and medical speciality dummies. In column (3), only the balanced samples are included. In (a), “Ranking” is the relative ranking of antibiotic prescription rates within the township in 2007Q1 (=1 if the highest, =0 if the lowest). “Disclosure” is one if time is after 2008Q1. In (b), “Ranking” is the ranking of antibiotic prescription rates within the township in 2005Q1 (=1 if the highest, =0 if the lowest). “Disclosure” is one if time is after 2006Q1 and zero otherwise.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.

In Table 5(a), we measure the relative ranking (or CDF) of each clinic’s prescription rates within its township in 2007.Q1 (before the hypothetical information disclosure), called “Ranking”. Then, we estimate the effect of the interaction term between the hypothetical disclosure dummy and the ranking. Column (1) in Table 5(a) shows that the hypothetical disclosure has no significant effect on the antibiotic prescription rates as expected. Column (2) in Table 5(a), however, shows that the interaction term between the hypothetical disclosure dummy and the prescription rate ranking has a negative and significant effect. That is, those clinics with relatively higher prescription rates in 2007.Q1 have decreased their prescription rates after the hypothetical information disclosure in 2008.Q1. Because there was no real disclosure in 2008.Q1, this effect is likely to be due to the regression to the mean bias.
For comparison, in Table 5(b), we estimate the same model for the sample before 2007.Q1. Because the real information disclosure was implemented in 2006.Q1, the effect of the interaction term between the (real) disclosure dummy and the prescription rate ranking (in 2005.Q1) would include both the regression to the mean bias and the real policy effect. From column (2) in Table 5(a), the coefficient of the interaction term for the hypothetical disclosure is −6.4, while from column (2) of Table 5(b), that for the real disclosure is −15.05. Therefore, it seems that even after taking into account the regression to the mean bias, those clinics with relatively higher antibiotic prescription rates before the (real) information disclosure are more likely to reduce their prescription
Table 6
Competition and information disclosure (dependent variable = antibiotic prescription rates (%)).
Open=1 All
Notes: All regressions include quarterly dummies and clinic fixed effects. “Competition” is measured by the number of clinics per 1000 population. Columns (1) and (2) are estimated for those clinics whose antibiotic prescription rates are disclosed (Open=1). Columns (3) and (4) include those clinics whose antibiotic prescription rates are not disclosed (Open=0). Columns (2) and (4) use the balanced sample only.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.
rates after the disclosure.15 Again, these results are consistent with clinic peer effects.

5.4. Competition effects

As discussed in Proposition 3, competition can have different effects on the change in antibiotic prescription rates depending on who is learning from the information disclosure or whether information disclosure triggers clinic peer effects or consumer learning effects. If consumers learn from the disclosure, stronger competition would make the change in the prescription rate more negative. However, if mainly the clinics learn from the disclosure, stronger competition would make the change less negative. (See Fig. 1.)
Thus, in Table 6, we measure competition by the number of clinics per 1000 population, and estimate the effect of competition on the change in antibiotic prescription rates due to the information disclosure. Table 6 column (1) shows that the interaction between the disclosure dummy and competition has a positive and significant effect on the prescription rates. In column (2), we use the balanced sample only, but the results are robust. That is, when there are more competing clinics, the information disclosure regulation decreases the prescription rates less in absolute value.
In columns (3) and (4), we use the non-disclosed clinics as a comparison group (open = 0), and estimate the interaction effect between disclosure and competition. Despite our previous caveat that this comparison group is likely to underestimate the effect of information disclosure, the coefficient of the interaction term among disclosure, open, and competition is positive and significant. That is, higher competition reduces the effect of disclosure in absolute value. Overall, these results are more consistent with clinic peer effects as discussed in Proposition 3.
Alternatively, the effect of information disclosure may vary systematically across markets. And our competition measure may be correlated with some of the other market characteristics. Though our clinic fixed effects should control for other market characteristics, they do not control for the interaction effect between the disclosure dummy and other market characteristics. Therefore, we will first estimate the heterogeneity in market responses, and analyze the robustness of the competition effect when controlling for the interaction effects with other market characteristics.

6. Consumer learning effects

So far we have focused on the evidence for clinic peer effects on the supply side. In this section, we analyze the extent of consumer learning effects on the demand side.

6.1. Consumer preference and learning

Recall that from mandatory information disclosure, consumers can learn not only the antibiotic prescription rates of their local clinics but also the fact that antibiotics do not treat the common cold. Then, as discussed in Proposition 2, consumers would visit clinics less after the information disclosure. Also, if consumers prefer lower antibiotic prescription rates, they would visit those clinics with relatively high prescription rates even less frequently.
Therefore, in Table 7, we estimate how the number of patient visits for the common cold has changed after the information disclosure, especially depending on the clinics’ relative ranking by antibiotic prescription rates.16
Column (1) of Table 7 shows that after the information disclosure, the number of patient visits for the common cold decreased by 15.9%. That is, consumers may have learned that antibiotics do not treat the common cold, and stopped going to the clinics for the common cold. For example, as shown in Fig. 2(b), the information disclosure website displays a clear message that antibiotics do not treat the common cold.
Moreover, column (2) of Table 7 shows that the coefficient of the interaction between disclosure dummy and lag ranking of antibiotic prescription rates is negative and significant.17 That is, the number of patient visits to those clinics with relatively high antibiotic prescription rates has decreased even more. This result suggests that consumers did learn and compare local clinics’ antibiotic prescription rates. In particular, this result is consistent with
15
Alternatively, we have used the sample (2005Q1–2005Q4) before the actual
16
disclosure, and repeated the placebo test pretending that there was information As Fig. 2(a) shows, the clinics were disclosed in the reverse order of antibiotic
disclosure in 2005Q2. The results are essentially the same as Table 5(a). For example, prescription rate ranking.
17
the estimated coefficient for the interaction between disclosure and prescription We control for the lag ranking because only the prescription rates in the previous
rate ranking is −6.162. quarter are disclosed online.
I. Kwon, D. Jun / Journal of Health Economics 42 (2015) 1–16 13
Table 7
Patient visits and antibiotic prescription rates (dependent variable = log(number of patient visits for the common cold)).
Notes: All regressions include quarterly dummies and clinic fixed effects. Columns (3) and (5) are estimated for data before the information disclosure. Columns (4) and (6)
are estimated for data after the information disclosure. Ranking (t − 1) is the lag relative ranking of antibiotic prescription rates in a township where 0 is the lowest and 1
is the highest. Deviation is the lag difference between the clinic’s antibiotic prescription rate and the township median antibiotic prescription rate. Deviation+ is equal to
Deviation if Deviation > 0 and zero otherwise. Deviation− is equal to Deviation if Deviation < 0 and zero otherwise.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.
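The ranking and deviation variables described in the notes to Table 7 can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names, the example clinics and rates, and the handling of ties and single-clinic townships are hypothetical and are not taken from the paper. In the regressions these variables are built from the previous quarter's rates, so the input below stands for one township's lagged rates.

```python
from statistics import median


def relative_ranking(rates):
    """Rank clinics within one township on a 0-1 scale (0 = lowest rate, 1 = highest).

    Assumption: ties are broken by sort order and a single-clinic township
    gets rank 0; the paper does not spell out either case.
    """
    n = len(rates)
    if n == 1:
        return {clinic: 0.0 for clinic in rates}
    ordered = sorted(rates, key=rates.get)
    return {clinic: i / (n - 1) for i, clinic in enumerate(ordered)}


def deviation_terms(rates):
    """Split each clinic's deviation from the township median into its two parts."""
    township_median = median(rates.values())
    terms = {}
    for clinic, rate in rates.items():
        deviation = rate - township_median
        terms[clinic] = {
            "deviation": deviation,
            # Deviation+ equals Deviation when positive, zero otherwise.
            "deviation_plus": max(deviation, 0.0),
            # Deviation- equals Deviation when negative, zero otherwise.
            "deviation_minus": min(deviation, 0.0),
        }
    return terms


# Hypothetical lagged prescription rates (%) for three clinics in one township.
previous_quarter = {"clinic_a": 30.0, "clinic_b": 55.0, "clinic_c": 80.0}
ranks = relative_ranking(previous_quarter)
terms = deviation_terms(previous_quarter)
```

Keeping the positive and negative parts as separate regressors is what allows the two sides of the township median to carry different coefficients.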
our assumption that average consumers prefer lower antibiotic prescription rates for the common cold.

In columns (3) and (4), we estimate the model separately for before and after the information disclosure. From column (3), before the information disclosure, the relative ranking of a clinic’s antibiotic prescription rates has no significant effect on the number of patient visits. This result confirms that before the disclosure, consumers did not know individual clinics’ antibiotic prescription rates.

However, from column (4) of Table 7, after the information disclosure, the relative ranking of a clinic’s antibiotic prescription rates has a negative and significant effect. Again, this result is consistent with consumer learning effects and consumers’ preference for lower antibiotic prescription rates.

Alternatively, consumers may avoid clinics that deviate from the norm, including those clinics with much lower prescription rates than average. Thus, we measure the difference between each clinic’s prescription rate and the township median, called “Deviation”. Then, we define Deviation+ as equal to Deviation if Deviation is positive and zero otherwise. Likewise, we define Deviation− as equal to Deviation if Deviation is negative and zero otherwise.

Columns (5) and (6) of Table 7 show that both Deviation+ and Deviation− have negative and significant effects on the number of patient visits after the information disclosure. And the coefficients of the two variables are not statistically different.¹⁸ That is, consumers seem to prefer lower antibiotic prescription rates for all clinics.

Note that we have interpreted the finding that pre-disclosure low-prescribing clinics are more likely to increase their rates post-disclosure as evidence for clinic peer effects. However, such a finding can also arise if the consumers prefer higher antibiotics prescription rates, and insist that low-prescribing clinics should increase their antibiotics prescription rates. In fact, patient demand for antibiotics has been blamed as the main reason for the overuse of antibiotics.¹⁹ However, our evidence suggests that consumers actually prefer lower prescribing clinics.²⁰

¹⁸ The p-value is 0.45.
¹⁹ “A number of factors influenced the tendency to overprescribe [antibiotics], including mostly patient demand, but also time pressure to end patient visits sooner, fear of malpractice lawsuits if a prescription is denied,...” Forbes (July 9, 2012) available from http://www.forbes.com/sites/gerganakoleva/2012/07/09/private-physicians-drive-up-antibiotic-resistance-helped-along-by-patients/.
²⁰ Currie et al. (2012) also find that the overuse of antibiotics in China is not demand driven but is largely a supply-side phenomenon.

Table 8
Lag vs. contemporaneous ranking of antibiotic prescription rates (dependent variable = log(number of patient visits for the common cold)).

                   (1) Before    (2) After
Ranking (t − 1)    −0.0067       −0.1486***
                   (0.0320)      (0.0467)
Ranking (t)        −0.0168       −0.0471
                   (0.0320)      (0.0628)
Fixed effect       Clinic        Clinic
Observations       9550          4923
R-squared          0.3317        0.6069

Notes: All regressions include quarterly dummies and clinic fixed effects. Column (1) is estimated for data before the information disclosure. Column (2) is estimated for data after the information disclosure. Ranking (t − 1) is the lag relative ranking of antibiotic prescription rates in a township where 0 is the lowest and 1 is the highest. Ranking (t) is the contemporaneous ranking.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.

6.2. Consumer learning vs. clinic manipulation

An important caveat for our analysis is that clinics may have manipulated the disclosed antibiotic prescription rates. For example, because the denominator for antibiotic prescription rate is the number of patient visits, clinics can reduce antibiotic prescription rates by asking patients to visit clinics more frequently. However, as column (1) of Table 7 shows, the number of patient visits has decreased, rather than increased, after the information disclosure.

Alternatively, clinics can change the patients’ diagnostic codes for those patients prescribed with antibiotics. Because only the antibiotic prescription rates for the common cold, or acute upper respiratory tract infection, are disclosed, clinics can change the diagnostic codes, for example, to a lower respiratory tract infection such as pneumonia, and continue to prescribe antibiotics. Then,
the reported number of patient visits for the common cold would decline as shown in column (1) of Table 7.

However, if clinics have manipulated the patients’ diagnostic codes in such a way, both the reported number of patient visits for the common cold and the antibiotic prescription rates for the common cold would decline together at the same time at the clinic level. But both columns (2) and (4) of Table 7 show that when the ranking of antibiotic prescription rates of a clinic declined, the number of patient visits for that clinic increased.

Moreover, manipulation of diagnostic codes would affect the number of patient visits and the antibiotic prescription rates at the same time. On the other hand, because the antibiotic prescription rates in the previous quarter are disclosed online, consumer learning effects suggest that the number of patient visits would depend on the lag antibiotic prescription rates, not on the contemporaneous rates.

Thus, in columns (1) and (2) of Table 8, we control for both the lag ranking of antibiotic prescription rates and contemporaneous rates at the same time. Note that the contemporaneous rates are not significant either before or after the disclosure. However, the lag prescription rates are significant only after the disclosure. Therefore, while we cannot rule out the possible manipulation of disclosed prescription rates, our results appear to be driven by consumer learning effects rather than manipulation of disclosed prescription rates.²¹

²¹ Alternatively, consumers may have interpreted high antibiotic prescription rates as a signal for more infectious patients and avoided that clinic. This is unlikely, however, because the awareness of hospital-acquired infection has unfortunately been very low in Korea, especially during our sample period.

6.3. Heterogeneity in consumer learning

We consider consumer learning as a process where consumers use the disclosed information to (Bayesian) update their belief on clinics’ quality. Because consumers’ priors and weights on the disclosed information are subjective, the extent of learning can be heterogeneous depending on consumer and clinic characteristics.

Since we do not have data on individual consumer characteristics, in Table 9, we estimate how the average consumer characteristics in each township and clinic characteristics affect consumer learning. As a benchmark, in column (1) of Table 9, we estimate the effect of an interaction term between disclosure dummy and clinics’ antibiotic prescription ranking. The coefficient of this interaction term measures how much consumers’ response to the antibiotic prescription ranking has changed after the information disclosure, and thus provides a measure for consumer learning.

Table 9
The effect of clinic and consumer characteristics on learning (dependent variable = log(number of patient visits for the common cold)).
Note: All regressions include quarterly dummies and clinic fixed effects. Clinic age and pediatric dummy variable are measured by each clinic in 2007. Population density and the share of population aged 9 and under are measured by each township in 2009. The share of college graduates is measured by each district in 2009.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.

In columns (2)–(6), we estimate how this measure for consumer learning depends on clinic and consumer characteristics. Column (2) shows that the interaction term with the clinic age is positive and significant. Because consumers respond negatively to the antibiotics ranking after the information disclosure, this result implies that the consumer learning is smaller in absolute value if clinics are older. One interpretation is that for older clinics, consumers have observed their quality for a long time and have a strong prior on their quality. Then, the newly disclosed information would not change consumers’ posterior belief on their quality much. Thus, when old clinics’ antibiotic prescription ranking increases, consumers may respond less negatively.

Column (3) in Table 9 shows that the interaction term with population density has a negative and significant effect, which implies that consumer learning is larger in absolute value when population density within a township is high. Recall that a 2007 government survey showed that only 7% of consumers have actually visited the information disclosure website. The significant evidence for consumer learning despite the low website visit rates suggests that consumers may have learned from some of their neighbors. Thus, our result is consistent with a hypothesis that in more densely populated areas, consumer learning through neighbors is likely to be more important.

Column (4) shows that the interaction term with the share of college graduates has a positive and significant effect. That is, the consumer learning effect is smaller in absolute value in townships with a more highly educated population. This result may seem counter-intuitive. However, a 2010 survey shows that more highly educated respondents are more likely to believe that antibiotics can
treat the common cold.²² Also, more highly educated consumers may put more weight on their own prior belief and relatively less weight on the newly disclosed information by the government. Thus, the consumer learning effect can be smaller for more highly educated consumers.

Column (5) shows that the interaction term with the share of children has a negative and significant effect. That is, consumer learning is larger in absolute value in townships with more children. This result suggests that parents with children are either putting more weight on the disclosed information or are more sensitive to the perceived quality of clinics. Thus, they respond more sensitively to the disclosed antibiotics prescription ranking. Similarly, column (6) shows that patient visits to pediatric clinics have become more sensitive to the antibiotics ranking after the disclosure, compared with the other specialities.

In column (7), we control for all the interactions at the same time. The qualitative results do not change. But the interaction with the share of children becomes insignificant largely due to the control for the interaction with the pediatric dummy variable. The interaction with population density also becomes insignificant, but the sign of the coefficient does not change.

These results suggest that there is significant heterogeneity in consumer learning, depending on clinic and consumer characteristics.²³

²² Korea Food & Drug Administration PressKit (2011-04-26).
²³ We have also analyzed the effects of the share of elderly in the population and the size of local tax revenue. But the effects were not significant, and are not reported due to space constraints.

6.4. Heterogeneity in consumer learning and change in antibiotic prescription rates

The patterns of heterogeneity in consumer learning are important as they can help policy makers to predict when the information disclosure regulation will be more effective. For example, our evidence shows that for older clinics and more educated consumers, the extent of learning from the information disclosure is smaller. Thus, we can predict that the antibiotic prescription rates would fall less after information disclosure regulation for older clinics and for townships with more educated consumers.

To confirm such predictions, in Table 10, we estimate how the changes in antibiotic prescription rates vary with clinic and consumer characteristics. More specifically, as in Table 9, we interact clinic and consumer characteristics with the disclosure dummy variable. In our base specification, column (1), the average antibiotic prescription rates have decreased by 9.34% after the mandatory information disclosure. Column (2) shows that the interaction term between the disclosure dummy and clinic age is positive and significant. That is, the antibiotic prescription rates decreased less for older clinics. Note that this pattern is consistent with the pattern of consumer learning found in column (2) of Table 9.

Table 10
The effect of clinic and consumer characteristics on learning (dependent variable = antibiotic prescription rates (%)).
Note: All regressions include quarterly dummies and clinic fixed effects. Clinic age and pediatric dummy variable are measured by each clinic in 2007. Population density and the share of population aged 9 and under are measured by each township in 2009. The share of college graduates is measured by each district in 2009.
* Significant at 10%.
** Significant at 5%.
*** Significant at 1%.

Likewise, the estimates from columns (3)–(7) of Table 10 are remarkably consistent with those found in the corresponding columns (3)–(7) in Table 9. That is, Table 9 shows that the extent of consumer learning from information disclosure is larger in more densely populated townships with relatively more children or pediatric clinics but with fewer college graduates. Then, given consumer preference for lower antibiotic prescription rates, antibiotic prescription rates should decrease more in such townships. Consistent with this prediction, columns (3)–(7) of Table 10 show that the antibiotic prescription rates have decreased more in more densely populated townships with relatively more children or pediatric clinics but with fewer college graduates. Also note that the positive effect of the interaction between disclosure and competition is robust in all specifications.

7. Conclusion

Mandatory information disclosure is an increasingly popular regulatory device to reduce the information asymmetry problem between sellers and buyers. Consequently, the previous literature has focused on whether consumers learn from the information disclosure and pressure sellers to increase their quality, called consumer learning effects. This paper, however, shows that mandatory information disclosure can also allow sellers to observe their competitors’ attributes, and trigger peer effects among them. More specifically, in the context of antibiotic prescription rates for the common cold, this paper shows that some clinics have increased, instead of decreasing, their antibiotic prescription rates when they
found out that other local clinics were prescribing more antibiotics than they were. Therefore, even though the average prescription rates have decreased after the mandatory information disclosure, the decline was smaller when there were more peer clinics.

In the literature on peer effects, this paper provides an alternative way of identifying peer effects. While most previous studies on peer effects have attempted to find an exogenous change in the behavior of peers, it is difficult to find an exogenous shock that affects some peers but not others. This paper suggests that the exogenous change in the observability of peer behavior can be an alternative way of identifying the peer effects, which can avoid the “reflection problem” discussed by Manski (1993).

This paper also finds significant evidence for consumer learning effects. After the information disclosure, consumers were less likely to visit clinics with higher antibiotic prescription rates for the common cold. This result suggests that overuse of antibiotics may not be driven by patient demands, but by hospital competition and an information asymmetry problem. We also find significant heterogeneity in consumer learning. In particular, after the information disclosure, consumers became more sensitive to the ranking of antibiotic prescription rates in townships with younger clinics, higher population density, lower education, and more children or pediatric clinics. Consequently, we find that antibiotic prescription rates have also declined more in such townships. These results may explain why some previous studies have found that the information disclosure policy had a significant effect while others have not. These results also suggest when the information disclosure policy can be most effective.

One of the limitations of this study is that we do not analyze what caused clinic peer effects. For example, such peer effects may arise from rational selfish decisions such as social learning or strategic interactions, or from intrinsic social preference. Such an analysis could be an interesting topic for future studies.

References

Akerlof, G.A., 1970. The market for ‘lemons’: quality uncertainty and the market mechanism. Quarterly Journal of Economics 84 (3), 488–500.
Akerlof, G.A., 1980. A theory of social custom, of which unemployment may be one consequence. Quarterly Journal of Economics 94 (4), 749–775.
Albaek, S., Mollgaard, P., Overgaard, P.B., 1997. Government-assisted oligopoly coordination? A concrete case. Journal of Industrial Economics 45 (4), 429–443.
Ali, M.M., Heiland, F.W., 2011. Weight-related behavior among adolescents: the role of peer effects. PLoS ONE 6 (6), e21179.
Bennett, D., Che-Lun Hung, Tsai-Ling Lauderdale, 2011. Health care competition and antibiotic use in Taiwan. Harris School of Policy 12.
Bikhchandani, S., Hirshleifer, D., Welch, I., 1992. A theory of fads, fashion, custom, and cultural change as informational cascades. Journal of Political Economy 100 (5), 992–1026.
Bird, L., 2009. Disclosure Issues: Renewable Energy Purchasing. Mimeo.
Board, O., 2009. Competition and disclosure. Journal of Industrial Economics 57 (1), 197–213.
Brody, H., 2005. Patient ethics and evidence-based medicine – the good healthcare citizen. Cambridge Quarterly of Healthcare Ethics 14, 141–146.
Butler, C.C., Rollnick, S., Maggs-Rappaport, R.F., Slott, N., 1998. Understanding the culture of prescribing: qualitative study of general practitioners’ and patients’ perception of antibiotics for sore throats. BMJ 317 (7159), 637–642.
Butters, G.R., 1977. Equilibrium distributions of sales and advertising prices. Review of Economic Studies 44 (3), 465–491.
Carrell, S.E., Malmstrom, F.V., West, J.E., 2008. Peer effects in academic cheating. Journal of Human Resources 43 (1), 173–207.
Chipty, T., Witte, A.N., 1998. “Effects of Information Provision in a Vertically Differentiated Market,” NBER Working Paper No. 6493.
Conly, J., 1998. Controlling antibiotic resistance by quelling the epidemic of overuse and misuse of antibiotics. Canadian Family Physician 44, 1769–1784.
Currie, J., Lin, W., Meng, J., 2012. “Antibiotic Abuse in China: Supply or Demand?,” Working Paper.
Duflo, E., Saez, E., 2002. Participation and investment decisions in a retirement plan: the influence of colleagues’ choices. Journal of Public Economics 85, 121–148.
Fischer, P., Huddart, S., 2008. Optimal contracting with endogenous social norms. American Economic Review 98 (4), 1459–1475.
Sara, F., Karlsson, J., 2012. “Competition and Antibiotics Prescription,” IFN Working Paper No. 939.
Fortin, B., Lacroix, G., Villeval, M.-C., 2007. Tax evasion and social interactions. Journal of Public Economics 91 (11), 2089–2112.
Gaviria, A., Raphael, S., 2001. School based peer effects and juvenile behavior. Review of Economics and Statistics 83 (2), 257–268.
Glaeser, E., Shleifer, A., 2003. The rise of the regulatory state. Journal of Economic Literature 41, 401–425.
Gonzales, R., Malone, D.C., Maselli, J.H., Sande, M.A., 2001. Excessive antibiotic use for acute respiratory infections in the United States. Clinical Infectious Diseases 33 (6), 757–762.
Gordon, J.P.P., 1989. Individual morality and reputation costs as deterrents to tax evasion. European Economic Review 33 (4), 797–805.
Grossman, S.J., Hart, O.D., 1980. Disclosure laws and takeover bids. Journal of Finance 35 (2), 323–334.
Hibbard, J.H., Jewett, J.J., 1997. Will quality report cards help consumers? Health Affairs 16 (3), 218–228.
Jin, G.Z., Leslie, P., 2003. The effect of information on product quality: evidence from restaurant hygiene grade cards. Quarterly Journal of Economics 118 (2), 409–451.
Jun, D., Chung, G., 2011. Analysis on the effect of information disclosure – antibiotic prescription rates for the common cold in hospitals and clinics in Seoul. Korean Association for Policy Studies 20 (2), 109–142.
Lautenbach, E., Patel, J.B., Bilker, W.B., Edelstein, P.H., Fishman, N.O., 2001. Extended-spectrum β-lactamase-producing Escherichia coli and Klebsiella pneumoniae: risk factors for infection and impact of resistance on outcomes. Clinical Infectious Diseases 32 (8), 1162–1171.
Longo, D.R., Garland, L.G., Wayne Schramm, Judy Fraas, Barbara Hoskins, Vicky Howell, 1997. Consumer reports in health care: do they make a difference in patient care? Journal of the American Medical Association 278 (19), 1579–1584.
Marshall, M.N., Shekelle, P.G., Leatherman, S., Brook, R.H., 2000. The public release of performance data: what do we expect to gain: a review of the evidence. Journal of the American Medical Association 283 (14), 1866–1874.
Malani, A., Buchman, T.G., Dushoff, J., Effron, M.B., 2008. Antibiotic overuse: the influence of social norms. Journal of the American College of Surgeons 265.
Manski, C.F., 1993. Identification of endogenous social effects: the reflection problem. Review of Economic Studies 60 (3), 531–542.
Mas, A., Moretti, E., 2009. Peers at work. American Economic Review 99 (1), 112–145.
Matthews, S., Postlewaite, A., 1985. Quality testing and disclosure. RAND Journal of Economics, 328–340.
Mennemeyer, S.T., Morrisey, M.A., Howard, L.Z., 1997. Death and reputation: how consumers acted upon HCFA mortality information. Inquiry 34 (2), 117–128.
Milgrom, P., 1981. Good news and bad news: representation theorems and applications. Bell Journal of Economics 12 (2), 380–391.
Myles, G.D., Naylor, R.A., 1996. A model of tax evasion with group conformity and social customs. European Journal of Political Economy 12 (1), 49–66.
Nelson, P., 1974. Advertising as information. Journal of Political Economy 82 (4), 729–754.
Njoroge, K., 2003. Information pooling and collusion: implications for the livestock mandatory reporting act. Journal of Agricultural and Food Industrial Organization 1.
Robohm, C., Ruff, C., 2012. Diagnosis and treatment of the common cold in pediatric patients. Journal of the American Academy of Physician Assistants 25 (12), 43–47.
Sacerdote, B., 2001. Peer effects with random assignment: results for Dartmouth roommates. Quarterly Journal of Economics 116 (2), 681–704.
Salop, S., Stiglitz, J., 1977. Bargains and Ripoffs: a model of monopolistically competitive price dispersion. Review of Economic Studies 44 (3), 493–510.
Schelling, T.C., 1978. Micromotives and Macrobehavior. Norton, New York/London.
Shavell, S., 1994. Acquisition and disclosure of information prior to sale. RAND Journal of Economics, 20–36.
Shekelle, P.G., Lim, Y.-W., Mattke, S., Damberg, C., 2008. Does public release of performance results improve quality of care? A systematic review. The Health Foundation, London, UK.
Spicer, M.W., Becker, L.A., 1980. Fiscal inequity and tax evasion: an experimental approach. National Tax Journal 33 (2), 171–175.
Stigler, G.J., 1961. The economics of information. Journal of Political Economy 69 (3), 213–225.
Vladeck, B.C., Goodwin, E.J., Myers, L.P., Sinisi, M., 1988. Consumers and hospital use: the HCFA ‘death list’. Health Affairs 7 (1), 122–125.
Wilson, J., 2007. Peer effects and cigarette use among college students. Atlantic Economic Journal 35 (2), 233–247.
Yoo, H.J., Song, E., Lee, K.U., Lee, E.K., Lee, J.A., 2009. Misuse of antibiotics and related awareness of consumers. Journal of Korean Association for Crisis and Emergency Management 1, 98–122.
Journal of Health Economics 42 (2015) 17–28
Article info

Article history:
Received 24 September 2014
Received in revised form 10 March 2015
Accepted 11 March 2015
Available online 20 March 2015

Keywords: Mortality; Health; Recessions; Macroeconomic conditions

Abstract

Over the 1976–2010 period, total mortality shifted from strongly procyclical to being weakly related or unrelated to macroeconomic conditions. The association is likely to be poorly measured when using short (less than 15 year) analysis periods. Deaths from cardiovascular disease and transport accidents continue to be procyclical; however, countercyclical patterns have emerged for fatalities from cancer and external causes. Among the latter, non-transport accidents, particularly accidental poisonings, play an important role.

© 2015 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhealeco.2015.03.004
0167-6296/© 2015 Elsevier B.V. All rights reserved.
18 C.J. Ruhm / Journal of Health Economics 42 (2015) 17–28
Some investigations suggest that mortality has become less where Mkjt is the mortality rate from source k in state j at year
procyclical or countercyclical in recent years. Using methods and t, U is the state unemployment rate, X a vector of covariates, ˛
data similar to Ruhm (2000), Stevens et al. (2011) find that a one a state fixed-effect, a general time effect, T a state-specific lin-
percentage point increase in the state unemployment rate was ear time trend, ε is the error term, and ˆ provides the estimated
associated with a 0.40% reduction in total mortality from 1978 to macroeconomic effect of key interest9 .
1991, but a smaller 0.19% decrease when extending the analysis The year effects (kt ) hold constant determinants of death that
through 20066 . McInerney and Mellor (2012) estimate that a one- vary uniformly across locations over time (e.g. advances in widely
point rise in joblessness lowered the mortality rates of persons 65 used medical technologies or behavioral norms); the location fixed-
and over by 0.27% during 1976–1991, but raised them 0.49% from effects (˛kj ) account for those that differ across states but are
1994 to 2008. Svensson (2007) uncovers a positive relationship time-invariant (such as persistent lifestyle disparities between
between Swedish unemployment rates and heart attack deaths residents of Nevada and Utah). Since the supplementary time-
from 1987 to 20037 . varying state characteristics (Xjt ) do not necessarily control for all
Changes in health behaviors provide a potential mechanism for time-varying determinants of death, the models also include state-
the mortality response. Consistent with this, reductions in drink- specific trends (Tjkt )10 . The 1976–2010 analysis period reflects the
ing, obesity, smoking and physical inactivity during bad economic availability of consistent data on state unemployment and mor-
times have been demonstrated (Ruhm and Black, 2002; Ruhm, tality rates. The macroeconomic impact is then identified from
2005; Gruber and Frakes, 2006; Freeman, 1999; Xu, 2013), and within-location variations in mortality rates, relative to changes
Edwards (2011) shows that individuals spend more time socializ- in other states and after controlling for demographic characteris-
ing and caring for relatives during such periods. However, research tics and state-trends. Since the impact of national business cycles is
using recent data again raises questions about the strength and absorbed by the time effects, discussions of macroeconomic effects
direction of these relationships. Charles and DeCicca (2008) indi- refer to changes within-states rather than at the national level.
cate that male obesity is countercyclical; Arkes (2009) obtains a One way of investigating whether the impact of macroecono-
similar result for teenage girls (but not boys); Arkes (2007) shows that teenage drug use increases in bad times; Dávlos et al. (2012) uncover a countercyclical pattern for some types of alcohol abuse and dependence; Colman and Dave (2013) suggest that increased leisure-time exercise during periods of economic weakness is more than offset by reductions in work-related physical exertion. Such findings are provocative although, as shown below, they should be viewed with skepticism because the analysis periods are too short (eight years or less) to provide definitive results.

Using U.S. data covering 1976–2010, the present study examines whether the relationship between macroeconomic conditions and mortality has changed over time. Comparability with previous investigations is maximized by using empirical methods that conform closely to that research⁸. Three primary results emerge. First, total mortality has shifted from being strongly procyclical to being weakly related or unrelated to macroeconomic conditions. Evidence from prior research that deaths decline when the economy deteriorates largely reflects the inclusion of early sample years, when this was the case. Second, the results obtained using relatively short (less than 15-year) periods show considerable instability and should probably be viewed as unreliable. Third, fatalities due to cardiovascular disease and, to a smaller degree, transport accidents continue to be procyclical, whereas strong countercyclical patterns for cancer and some external sources of death (particularly accidental poisonings) have emerged.

2. Research design

This analysis uses variations of previously employed panel data methods (e.g. by Ruhm, 2000) to analyze the relationship between macroeconomic conditions and mortality rates. The estimating equation is:

$$\ln M_{kjt} = \alpha_{kj} + X_{jt}\beta + U_{jt}\gamma + \lambda_{kt} + T_{jkt} + \varepsilon_{kjt} \qquad (1)$$

mic conditions on mortality has changed is to compare predicted effects across sub-periods. However, since such estimates are often sensitive to the choice of starting or ending years, two alternative strategies are employed. First, models for total mortality are estimated with differing starting and ending dates, and with varying lengths of the analysis period. The second, and main, method specifies analysis periods of fixed duration and then sequentially estimates models for all alternative sample windows permitted by the data. Most commonly, 20-year periods are used with results obtained for 16 windows ranging from 1976–1995 to 1991–2010.

Figures are frequently provided with point estimates (and sometimes confidence intervals) on the unemployment rate coefficient presented for each analysis window. Tables are also supplied showing unemployment coefficients and standard errors for the first and last of the 20-year periods (1976–1995 and 1991–2010), denoted by $\hat{\gamma}_\tau$ and $\hat{s}_\tau$, respectively, where $\tau$ equals 1 (2) in the first (last) period. I test whether the macroeconomic effect has changed by providing estimates for $\hat{\Delta} = \hat{\gamma}_2 - \hat{\gamma}_1$.

Using Eq. (1), $(e^{\hat{\gamma}_k} - 1) \times 100\%$ provides the predicted percentage change in mortality from source k resulting from a one percentage point increase in the unemployment rate. While these estimates show the relative size of the macroeconomic effect, they do not directly indicate changes in the absolute number of predicted fatalities because, for example, large relative effects may imply small absolute changes for sources that are responsible for few deaths. These relative effect sizes are translated into absolute numbers through estimates of:

$$\left(e^{\hat{\Delta}_k} - 1\right) \times \sigma_k D \qquad (2)$$

where $\hat{\Delta}_k = \hat{\gamma}_{2k} - \hat{\gamma}_{1k}$, D is the average annual number of deaths (2,222,313) and $\sigma_k$ is the share of deaths due to source k over the 1976–2010 period.
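The percentage transformation of the unemployment coefficient and the conversion into absolute deaths in Eq. (2) can be checked numerically. The following is a minimal Python sketch rather than the author's code: the function names are hypothetical, and the inputs are the total-mortality coefficients for 1976–1995 and 1991–2010 that the article reports when discussing Table 3.

```python
import math

# D in Eq. (2): average annual number of deaths, 1976-2010 (figure taken from the text).
AVERAGE_ANNUAL_DEATHS = 2_222_313


def percent_effect(gamma):
    """Predicted % change in mortality for a one-point rise in unemployment."""
    return (math.exp(gamma) - 1.0) * 100.0


def change_in_deaths(gamma_first, gamma_last, share=1.0,
                     deaths=AVERAGE_ANNUAL_DEATHS):
    """Eq. (2): (exp(delta) - 1) * share * D, with delta = gamma_last - gamma_first."""
    delta = gamma_last - gamma_first
    return (math.exp(delta) - 1.0) * share * deaths


# Total-mortality coefficients reported in the article (share = 1 for all deaths).
gamma_1976_1995 = -0.00428101
gamma_1991_2010 = -0.00102266

early_pct = percent_effect(gamma_1976_1995)
late_pct = percent_effect(gamma_1991_2010)
extra_deaths = change_in_deaths(gamma_1976_1995, gamma_1991_2010)

print(f"1976-1995: {early_pct:.2f}% per point of unemployment")
print(f"1991-2010: {late_pct:.2f}% per point of unemployment")
print(f"Change in predicted annual deaths: {extra_deaths:.0f}")
```

For a specific source of death, the same function would be called with that source's two coefficients and its share of deaths in place of the default share of 1.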
6 The estimated reduction rises to 0.33% over the 1978–2006 period when using age-adjusted mortality rates.
7 Using time-series methods for the U.S. from 1961 to 2010, Lam and Piérard (2014) also argue that total and cardiovascular mortality have become less procyclical over time, while motor vehicle fatalities remain strongly procyclical.
8 One exception is the use of an uncommonly detailed set of age controls.
9 Unemployment rates are used to proxy macroeconomic conditions; however, a procyclical variation in mortality does not imply that the loss of a job improves health. To the contrary, Sullivan and von Wachter (2009) show that job loss is associated with increases in individual mortality rates.
10 Mortality trends vary considerably across sources of death, with large secular reductions for total mortality and that from cardiovascular disease and external sources, a relatively flat trend for cancer, and an increase for other disease deaths. State-year population weights were also sometimes incorporated but unweighted estimates are generally preferred (Wooldridge, 1999; Butler, 2000; Solon et al., 2015) and so are focused upon below.
C.J. Ruhm / Journal of Health Economics 42 (2015) 17–28 19
A potential concern is that calculations based on (2) do not account for the possibility that a portion of the trend in macroeconomic effects on overall mortality rates could reflect secular changes in the shares of deaths due to specific sources. This was examined using a variation of the Oaxaca (1973) and Blinder (1973) decomposition method. The method and results, which are summarized in the online appendix, indicate that almost all of the macroeconomic effect was due to changes in the coefficients, rather than in mortality shares, so that predictions obtained from (2) are useful.

3. Data and descriptive statistics

Annual average state unemployment rates, the main proxies for macroeconomic conditions, were obtained from the U.S. Department of Labor’s Local Area Unemployment Statistics Database (www.bls.gov/lau/lauov.htm), which provides monthly estimates of total employment and unemployment rates for census regions and divisions, states, metropolitan statistical areas, counties, and some cities¹¹. Concepts and definitions underlying the LAUS data come from the Current Population Survey. Mortality data are from the Centers for Disease Control and Prevention’s Compressed Mortality Files (CMF) (www.cdc.gov/nchs/data_access/cmf.htm), which contain information for every death of a U.S. resident including: state and county of residence, year of death, race and sex, Hispanic origin (after 1998), age group (16 categories), and underlying cause of death (ICD codes and CDC recodes). Data prior to 1988 are publicly available; those from 1989 to 2010 were obtained by special agreement with the CDC. Population data (the denominator in the mortality rate calculations) from 1981 on come from the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) program (http://www.seer.cancer.gov/data)¹². These were supplemented by census estimates, included in the CMF files, for 1976–1980.

In addition to total annual mortality rates, sex-specific death rates were constructed, as were fatality rates for five age groups (<25, 25–44, 45–64, 65–74, and ≥75 year olds) and deaths from major diseases and external causes¹³. The SEER data were additionally used to construct independent variables for the share of the state population who were female, nonwhite, Hispanic, and aged <1, 1–19, 45–54, 55–64, 65–74, 75–84 and ≥85 years old¹⁴.

The analysis of cause-specific mortality introduces complications. From 1976 to 1978, cause of death was categorized using the 8th revision of the International Classification of Diseases (ICD-8 codes). ICD-9 codes were used between 1979 and 1998, and ICD-10 categories since 1999. Crosswalks have been established between ICD-8 and ICD-9 and between ICD-9 and ICD-10 coding systems; however, the correspondence is imperfect. These issues are typically minor when looking at broad causes of death (e.g. those from cardiovascular disease) but are important for many specific sources of mortality. To provide information on this, the National Center for Health Statistics has calculated “estimated comparability ratios” indicating the relative number of deaths in 1996 attributed to a specific cause using ICD-9 and ICD-10 classifications (Anderson et al., 2001) and, similarly, for 1976 using ICD-8 versus ICD-9 codes (Klebba and Scott, 1980).

When the estimated comparability ratios are close to one (i.e. a similar number of deaths are reported using either ICD system), issues of data comparability are likely to be minor and well captured by the inclusion of regression year fixed-effects. For example, the estimated comparability ratios are 1.013 and 1.003 for CVD and cancer fatalities, when using ICD-8 and ICD-9 codes, and 0.998 and 1.007 for ICD-9 and ICD-10 categories. However, the potential problems are greater for some numerically important causes of death, and for others that have been analyzed in previous research¹⁵. Due to these concerns, the analysis of disease mortality is restricted to the major categories of CVD and malignant neoplasms, as well as a generic grouping of all other disease types¹⁶. A fuller investigation is provided for subcategories of external deaths, including those from transport accidents, non-transport (other) accidents, intentional self-harm (suicide), and homicide/legal intervention. Because non-transport accidents will be shown to be particularly important, separate analysis is conducted for the subcomponents: falls, drowning/submersion, smoke/fire/flames, and poisoning/exposure to noxious substances¹⁷.

Appendix Table A1 details the ICD codes used to classify causes of death. Means and sample standard errors for mortality rates (per 100,000 population) and state characteristics are detailed in the online appendix. Appendix Table A2 illustrates how the sources of death changed over the analysis period, showing numbers and shares of fatalities during 1976–1995 and 1991–2010. As expected, given increased life expectancy, the proportion of mortality accounted for by the elderly has grown substantially. Declines in the share of cardiovascular deaths have been offset by increases in mortality from other diseases. The fraction from external sources changed little, with reductions from fatal transport accidents and homicides being compensated for by increases in non-transport accidents, particularly poisoning deaths.

4. The declining procyclicality of total mortality

Fig. 1 supplies three ways of examining whether the procyclicality of total mortality has diminished over time, by estimating Eq. (1) for different time periods. Solid lines show point estimates and dotted lines the 95% confidence intervals.

Fig. 1A displays unemployment rate coefficients where the analysis period begins in 1976 and ends in years ranging between 1995 and 2010. The magnitude of the estimated macroeconomic effect declines monotonically but modestly as the sample is extended to more recent periods, ranging from −0.0043 when the last year is 1995 to −0.0034 when it is 2010. All of the coefficients are significantly different from zero and these results, which are largely

11 Some recent studies of macroeconomic patterns of health behaviors have analyzed county-level or MSA data (e.g. Charles and DeCicca, 2008; An and Liu, 2012). This has potential advantages (e.g. examining smaller regional economies) and disadvantages (e.g. greater measurement error). For this investigation, the major disadvantage is that a consistent data series of county unemployment rates only begins in 1990 and the Department of Labor cautions against using county level data prior to that time. Also, Lindo (2015) provides evidence that the health effects of macroeconomic conditions are understated when using more disaggregated (e.g. county rather than state) data. Preliminary analysis revealed similar results using state and county data starting in 1990.
12 The SEER data are designed to supply more accurate population estimates for intercensal years than standard census projections, and to adjust for population shifts in 2005, resulting from Hurricanes Katrina and Rita. Differences between the SEER and CMF population estimates are minuscule prior to 2000 but are sometimes reasonably large (up to 3%) after 2003.
13 I examined other age-specific death rates, including infant mortality rates, in preliminary analysis, but focus on these age groupings since the large majority of deaths occur to those who are relatively old.
14 Hispanic population shares are not provided prior to 1981. Therefore, shares for 1976–1980 were extrapolated as a linear trend for changes occurring between 1981 and 1986.
15 For instance, the ICD-10 to ICD-9 comparability ratios are 0.698, 1.232 and 1.554 for influenza/pneumonia, kidney disease (nephritis, nephrotic syndrome, nephrosis) and Alzheimer’s disease.
16 Ruhm (2013) provides a preliminary analysis examining three sub-components of CVD, five categories of malignant neoplasms and five types of other diseases.
17 These account for 65% of deaths due to non-transport accidents. The most important remaining category, “other and unspecified transport accidents and their sequelae”, is not comparable over time.
Fig. 1. Unemployment coefficients for total mortality using different analysis samples. (A) Sample begins in 1976 and continues through specified year. (B) Sample begins in specified year and continues through 2010. (C) Analysis sample covers 20-year period.
similar to those of previous research, do not alter the conclusion that mortality is procyclical.

The sensitivity of findings to the choice of sample periods can be seen more explicitly in Fig. 1B, where the sample always ends in 2010 but the starting year varies between 1976 and 1991. The unemployment coefficient attenuates from −0.0034 for the entire sample period to between −0.0029 and −0.0009 when the starting year is 1978 or later¹⁸. Perhaps more importantly, the data fail to reject the null hypothesis of no macroeconomic effect for periods beginning after 1988.

Fig. 1C displays results using 20-year sample windows beginning in the specified year. The left-most entry shows that the unemployment coefficient for 1976–1995 is −0.0043, while the farthest right result shows that it is −0.0010 for 1991–2010. Total mortality is significantly procyclical (negative unemployment rate coefficients) for all 20-year windows starting between 1976 and 1987, but the predicted effect diminishes steadily for windows beginning after 1982 and is small and insignificant for those starting in 1988 through 1991. This pattern is not caused by especially negative health consequences of the great recession of 2007–2009.

The choice of 20-year sample windows is arbitrary and may conceal an increased procyclical variation of mortality toward the end of the data period. This possibility is investigated in Fig. 2, which replicates Fig. 1C, but for periods of between 5 and 20 years. Two findings deserve mention. First, at shorter durations, the estimates become more volatile and less precise. For instance, when using 5-year windows, the unemployment coefficients fluctuate wildly for even small changes in timing (e.g. from 0.0120 for 1996–2000 to −0.0077 for 1999–2003) but almost always fail to reject the null hypothesis of no macroeconomic effect. Second, the standard errors have typically increased for more recent samples. As a result, the estimates obtained using 10-year or 15-year analysis windows, while less volatile than those using 5-year periods, still lack sufficient precision to determine whether the possible partial reversion of the macroeconomic effects in recent years (toward more procyclical mortality) is real or reflects statistical noise¹⁹. An important implication is that the findings of
18 The full sample estimate is in line with previous results. Ruhm (2000) obtains a slightly larger 0.5% reduction in total mortality but the current estimate is close to the 0.3% decrease obtained in Stevens et al.’s (2011) preferred specification.
19 When using 10-year periods, the average standard error is 46% larger for analysis windows beginning between 1989 and 2001 than for those starting between 1976 and 1988 (0.0018 versus 0.0013).
Fig. 2. Unemployment coefficients for total mortality using different sample windows.
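The fixed-duration sample windows behind Figs. 1C and 2 can be enumerated mechanically. The following is a minimal Python sketch under the assumption that every window of a given length fitting inside 1976–2010 is used; the function name is hypothetical and the regression that would be run on each window is omitted.

```python
def sample_windows(first_year, last_year, length):
    """Return every (start, end) window of `length` years inside the sample."""
    last_start = last_year - length + 1
    return [(start, start + length - 1)
            for start in range(first_year, last_start + 1)]


# Window lengths discussed in the text: 5, 10, 15 and 20 years.
for length in (5, 10, 15, 20):
    windows = sample_windows(1976, 2010, length)
    print(length, len(windows), windows[0], windows[-1])
```

With 20-year windows this gives the 16 samples running from 1976–1995 to 1991–2010 that the article describes; shorter windows yield more samples, each containing fewer years of data.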
unemployment coefficient between 1976–1995 and 1991–2010 is positive and statistically significant.

Table 1 shows the unemployment coefficients, from estimating Eq. (1), for total, sex-specific and age-specific mortality. Results

20 Charles and DeCicca’s (2008) analysis of male obesity used data from 1997 to 2001; Arkes’ (2007, 2009) investigation of teenage body weight utilized information from 1997 to 2004, Dávlos et al.’s (2012) study of alcohol abuse and dependence compared 2001–2002 and 2004–2005, Colman and Dave’s (2013) research on work and leisure-time physical activity covered 2003–2010, Cotti and Tefft’s (2011) analysis of alcohol-related vehicle fatalities used data from 2003 to 2009, and Tekin et al. (2013) investigated a variety of health outcomes and behaviors from 2005 to 2011.
21 For example, the unemployment rate coefficient (standard error) for the 1991–2010 analysis window is 0.0022 (0.0017), 0.0039 (0.0020), −0.0010 (0.0010) and −0.0019 (0.0009) for unweighted data without trends, weighted data without trends, unweighted data with trends and weighted data with trends.
Table 1
Estimated macroeconomic effects on specific sources of mortality.
Note: Dependent variable is the natural log of the specified state mortality rate, obtained from the Compressed Mortality Files, for 1976 to 2010 (n = 1785). The first two
columns show the coefficient on the state unemployment rate for 20-year subsamples (n = 1020) covering 1976–1995 and 1991–2010. The regressions also include vectors
of state and year dummy variables, state-specific linear time trends, and controls for the share of the state population who are: female, nonwhite, Hispanic, and aged <1,
1–19, 45–54, 55–64, 65–74, 75–84 and ≥85 years old. The third column shows the difference between the unemployment coefficients for the 1991–2010 and 1976–1995
subsamples. Robust standard errors, clustered at the state level, are shown in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.
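The role of the state and year dummy variables in these regressions can be illustrated on synthetic data. The following Python sketch is a deliberately simplified stand-in for Eq. (1), not the author's specification: it omits the demographic controls, the state-specific trends and the clustered standard errors, uses made-up data with a known coefficient, and relies on the fact that in a balanced panel, removing state and year means is equivalent to including both sets of dummies. The panel of 51 units over 20 years is an assumption inferred from the sample size of 1020 given in the note.

```python
import math
import random


def two_way_within(values):
    """Remove state and year means from a balanced {(state, year): value} panel."""
    states = sorted({s for s, _ in values})
    years = sorted({t for _, t in values})
    grand = sum(values.values()) / len(values)
    state_mean = {s: sum(values[s, t] for t in years) / len(years) for s in states}
    year_mean = {t: sum(values[s, t] for s in states) / len(states) for t in years}
    return {(s, t): values[s, t] - state_mean[s] - year_mean[t] + grand
            for s in states for t in years}


def unemployment_coefficient(log_mortality, unemployment):
    """OLS slope of demeaned log mortality on demeaned unemployment."""
    y = two_way_within(log_mortality)
    u = two_way_within(unemployment)
    return sum(u[k] * y[k] for k in u) / sum(u[k] ** 2 for k in u)


# Synthetic, noise-free panel (all numbers below are illustrative, not real data).
random.seed(0)
TRUE_GAMMA = -0.0043
states = range(51)
years = range(1976, 1996)
unemp = {(s, t): random.uniform(3.0, 10.0) for s in states for t in years}
state_effect = {s: random.gauss(0.0, 0.1) for s in states}
year_effect = {t: -0.01 * (t - 1976) for t in years}
log_m = {(s, t): 6.7 + state_effect[s] + year_effect[t] + TRUE_GAMMA * unemp[s, t]
         for s in states for t in years}

gamma_hat = unemployment_coefficient(log_m, unemp)
n_obs = len(log_m)
print(n_obs, gamma_hat, (math.exp(gamma_hat) - 1.0) * 100.0)
```

Because the synthetic data contain no noise, the estimator recovers the coefficient that was built in; with real data the estimate would carry sampling error, which is why the note reports standard errors clustered at the state level.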
in the first column refer to 1976–1995, those in the second to 1991–2010, with the final column showing the difference between the two. A one point rise in unemployment predicts a statistically significant 0.43% reduction in total mortality during 1976–1995 compared to a small and insignificant 0.10% decrease in 1991–2010 (see the first row). The 0.33% difference between these two periods is statistically significant and indicates that the procyclicality of mortality has largely disappeared in recent years²².

To address the possibility that the observed secular trends reflect a change in the relationship between unemployment rates and macroeconomic conditions, rather than in the health effects of economic conditions, I estimated specifications that controlled for nonemployment (the percentage of civilians aged 16 and over who are unemployed or out of the labor force) rather than unemployment rates²³. In these models a one percentage point increase in the nonemployment rate predicted a statistically significant 0.39% reduction in total mortality during 1976–1995 and an insignificant 0.02% decrease in 1991–2010²⁴. The highly significant 0.37% difference is slightly larger than that obtained using unemployment rates.

The remainder of Table 1 and Fig. 4 summarize subgroup analyses, stratified by gender and age. In Fig. 4, and subsequently, thicker lines indicate sources with relatively high mortality shares²⁵. The evidence suggests larger secular changes in macroeconomic effects for men than women. In 1976–1995, a one point rise in unemployment predicted a 0.44% reduction in male mortality and a 0.41% decrease for females. This effect completely disappeared by 1991–2010 for men but fell only half as much for women. The declining procyclicality of mortality has been particularly pronounced for males since 1982, while showing a steadier reduction for females (see Fig. 4A).

The trends also appear to be relatively pronounced for the young and middle-aged. Specifically, a one percentage point increase in joblessness reduced the predicted death rates of <25, 25–44, 45–64, 65–74, and ≥75 year olds by 1.6%, 0.8%, 0.2%, 0.4%, and 0.4% in 1976–1995 but increased them by 0.2%, 0.2%, 0.2%, −0.2%, and −0.2% in 1991–2010. We are unable to reject the null hypothesis of a zero unemployment rate effect in 1991–2010 for all age groups. The patterns for all possible 20-year windows are qualitatively similar (see Fig. 4B). The 95% confidence intervals (not shown) exclude positive unemployment rate coefficients for all two-decade periods beginning prior to 1986 for ≥75 year olds, before 1989 for <25 year olds, and earlier than 1983 for those aged 65–74. Conversely, a zero coefficient is rarely rejected for 45–64 year olds.

5. Heterogeneous effects across sources of death

Table 2 and Figs. 5 and 6 stratify disease versus external sources of death, and then separately examine three disease and four external causes. The three disease categories, namely cardiovascular, cancer and other diseases, accounted for 42%, 23% and 29% of deaths over the 1976–2010 period. The four external sources, namely transport accidents, other (non-transport) accidents, suicides and homicides, were responsible for 2.2%, 2.4%, 1.4% and 0.9%. Finally, four specific types of non-transport accidents are considered – falls, drowning/submersion, smoke/fires/flames and poisoning/exposure to noxious substances – which constituted 0.7%, 0.2%, 0.2% and 0.5% of fatalities.

Levels and trends of the macroeconomic effects differ markedly for mortality from disease versus external causes. A one point rise in joblessness lowered predicted disease mortality by 0.33% in 1976–1995 versus 0.14% in 1991–2010, a statistically insignificant change. By contrast, a much larger 1.5% reduction in external deaths was estimated for the earlier years versus a statistically significant 0.8% increase in the later ones, and the difference is highly significant. Fig. 5 suggests an almost monotonic but modest attenuation over time in the unemployment coefficient for deaths from disease, with statistically significant negative estimates obtained for all two-decade periods starting before 1986. Conversely, the predicted effect for external causes was negative and fairly stable for 20-year periods beginning between 1976 and 1982, but attenuated steadily for analysis windows starting from 1982 to 1990, with a statistically insignificant effect for those with first years between 1984 and 1989, and significantly positive unemployment rate coefficients obtained for those initiating in 1990 or 1991. These results help to explain the sharp reversal in the effects

22 Coefficients on the other time-varying state level covariates are provided in the online appendix.
23 For instance, declines in labor force participation rates were particularly pronounced during the “great recession” that began in 2007, when compared to other economic downturns (Shierholz, 2012).
24 A one percentage point increase in the unemployment rate predicts a 0.73 percentage point rise in nonemployment over the full period, in models controlling for state demographic characteristics, time trends, and state and year dummy variables. Changes in interstate migration are also unlikely to explain the results. Migrants tend to be healthy and to move from areas of higher to lower unemployment rates (Halliday, 2007), introducing a countercyclical mortality effect. Mortality might therefore have become less procyclical if migration rates were increasing over time. However, migration rates instead peaked around 1980 and have fallen sharply since then (Malloy et al., 2011).
25 For example, in Fig. 4B, the line for ≥75 year olds is thick because they account for 51% of mortality, from 1976 to 2010, whereas that for <25 year olds is thin because they are responsible for less than 4% of deaths.
Fig. 4. Unemployment coefficients for sex-specific and age-specific mortality. (A) Sex-specific mortality. (B) Age-specific mortality.
Table 2
Estimated macroeconomic effects on cause-specific mortality.
Note: See note on Table 1. *** p < 0.01, ** p < 0.05, * p < 0.1.
Fig. 6. Unemployment coefficients for deaths from specific diseases and external causes. (A) Specific diseases. (B) External causes. (C) Other accidents.
of macroeconomic conditions on deaths of younger persons, who disproportionately die from external causes²⁶.

5.1. Diseases

There are striking disparities across types of diseases. Cancer mortality was unrelated to the economy in 1976–1995 but strongly countercyclical by 1991–2010, whereas CVD mortality remained strongly procyclical throughout (see the top panel of Table 2 and Fig. 6A). The procyclicality of other disease mortality declined over time but the change between 1976–1995 and 1991–2010 was not quite significant at the 0.05 level²⁷.

Research for earlier time periods (e.g. Ruhm, 2000; Neumayer, 2004; Miller et al., 2009) documents a strong procyclicality of cardiovascular deaths but with little macroeconomic variation in cancer fatalities, and attributes this to the likelihood that short-term behavior changes (e.g. smoking, diet and exercise) more strongly influence the risk of CVD than cancer deaths. A conceivable explanation for the findings just described is that the relationship between macroeconomic conditions and health behaviors has remained relatively stable, while cancer mortality has become more sensitive to the availability of financial resources and access to (procyclical) health care due to improvements in expensive medical treatments and technologies²⁸.
26 For example, <45 year olds accounted for 56% of external deaths (from 1976 to 2010) but less than 10% of total mortality.
27 This result is sensitive to weighting. Using weighted data, the unemployment rate coefficient was −0.0029 in 1976–1995 and −0.0020 in 1991–2010. The difference was 0.0009 with a standard error of 0.0031.
28 The cost per cancer case rose from $47,000 in 1983 to $70,000 in 1999 (Philipson et al., 2012), with many expensive new medical treatments and chemotherapy agents coming into use in the 1990s and early 2000s (Cutler, 2008). The continued procyclicality of CVD mortality could also occur for other reasons, such as a (stable) deleterious health effect of air pollution (Heutel and Ruhm, 2013).
Table 3
Change in effect of macroeconomic conditions on predicted number of deaths.

                        Share of deaths   Predicted Δ in # of deaths   95% confidence interval
Sex-specific
  Males                 0.5121            4900***                      1625 to 8185
  Females               0.4879            2402                         −487 to 5298
Age-specific (years)
  <25                   0.0389            1655***                      1009 to 2305
  25–44                 0.0572            1202**                       138 to 2276
  45–64                 0.1865            1586**                       177 to 3000
  65–74                 0.2031            799                          −454 to 2054
  ≥75                   0.5140            2565*                        −206 to 5343

Note: Predicted changes are for a one-percentage point increase in unemployment. Share of deaths is for 1976–2010. “Predicted Δ in # of deaths” is calculated as $(e^{\hat{\Delta}_k} - 1) \times \sigma_k D$, where $\hat{\Delta}_k = \hat{\gamma}_2 - \hat{\gamma}_1$, for $\hat{\gamma}_\tau$ the predicted unemployment coefficient in period $\tau$, with $\tau = 1$ in 1976–1995 and $\tau = 2$ in 1991–2010; D is the average annual number of deaths during 1976–2010 (2,222,313); $\sigma_k$ is the share of deaths from source k. 95% confidence intervals are estimated as $(e^{(\hat{\Delta}_k \pm 1.96 \times \hat{s}_k)} - 1)\,\sigma_k D$, for $\hat{s}_k$ the standard error on $\hat{\Delta}_k$. *** p < 0.01, ** p < 0.05, * p < 0.1.
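The calculations described in the note can be traced through a single row of the table. The following Python sketch is illustrative only and the function names are hypothetical: it inverts the formula for the Males row to back out the implied coefficient change and standard error from the reported point estimate and lower bound, then checks that the confidence-interval formula reproduces the reported upper bound. It also expresses several rows as shares of the 7253-death change for total mortality reported in Section 6.

```python
import math

D = 2_222_313         # average annual deaths, 1976-2010 (from the table note)
TOTAL_CHANGE = 7253   # predicted change for all deaths, as reported in Section 6


def implied_coefficient(deaths_change, share):
    """Invert (exp(c) - 1) * share * D to recover the coefficient c."""
    return math.log(1.0 + deaths_change / (share * D))


def to_deaths(coefficient, share):
    """Apply the note's formula: (exp(c) - 1) * share * D."""
    return (math.exp(coefficient) - 1.0) * share * D


# Males row of Table 3: share of deaths, point estimate, lower confidence bound.
share, point, lower = 0.5121, 4900, 1625
delta = implied_coefficient(point, share)
std_error = (delta - implied_coefficient(lower, share)) / 1.96
upper = to_deaths(delta + 1.96 * std_error, share)
print(f"implied delta={delta:.5f}, se={std_error:.5f}, upper bound={upper:.0f}")

# Point estimates from Table 3, expressed as shares of the total change.
rows = {"Males": 4900, "<25": 1655, "25-44": 1202, "45-64": 1586}
male_share = rows["Males"] / TOTAL_CHANGE
under_45 = (rows["<25"] + rows["25-44"]) / TOTAL_CHANGE
under_65 = under_45 + rows["45-64"] / TOTAL_CHANGE
print(f"males {male_share:.0%}, under 45 {under_45:.0%}, under 65 {under_65:.0%}")
```

The recomputed upper bound lands within rounding of the tabulated 8185. The interval is asymmetric around the point estimate because the bounds are formed on the coefficient scale and then exponentiated.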
5.2. External causes

There is considerable heterogeneity in the effects for specific sources of external deaths (see the second panel of Table 2 and Fig. 6B). One of the most consistent previous research findings is that transport fatalities are procyclical²⁹. This effect persists but has weakened recently, with a one percentage point rise in the unemployment rate predicting a 2.6% decrease in 1976–1995 versus a 0.9% reduction in 1991–2010. Suicides increase with joblessness, consistent with most prior studies, and this effect has strengthened over time: a one point growth in unemployment was associated with an insignificant 0.4% rise in suicides during 1976–1995 versus a highly significant 1.7% increase in 1991–2010³⁰.

The most noteworthy finding is that fatal non-transport accidents have switched from being strongly procyclical to sharply countercyclical: a one point rise in unemployment reduced predicted mortality rates by 1.7% in 1976–1995 but increased them by 0.9% in 1991–2010, with nearly monotonic growth over time. The parameter estimates were negative and significant for all 20-year periods starting prior to 1985 but insignificantly positive for those beginning after 1987.

The bottom panel of Table 2 and Fig. 6C provide additional specificity on non-transport accident deaths, showing that the secular trends are dominated by changes for accidental poisonings, where the unemployment coefficient rose from −0.0148 in 1976–1995 to 0.0422 in 1991–2010, with a strong countercyclical pattern emerging for 20-year windows beginning after the early 1980s. There are more modest changes for deaths from falls or drownings and the procyclicality of fatalities from fires largely disappears in recent years but, as shown later, this is always a relatively minor source of mortality. Given these results, accidental poisonings receive special attention below.

6. Predicted changes in number of deaths

I next demonstrate that external sources of deaths, especially those for non-transport accidents and among these accidental poisonings, play a key role in explaining the declining macroeconomic responsiveness of total mortality. All numerical calculations are based on Eq. (2) and refer to a one percentage point increase in unemployment. The discussion focuses on predicted secular changes in macroeconomic effects, rather than levels for a single analysis period.

The first row of Table 3 shows that a one point increase in unemployment predicts 7253 more fatalities in 1991–2010 than in 1976–1995³¹. As noted, the procyclicality of mortality weakened over time more for men than women, so that males account for two-thirds of the overall change in the macroeconomic effect (see
29 Previous analyses have often examined motor vehicle deaths, which constituted 94% of transport accident fatalities from 1976 to 2010. Transport deaths are considered here because they are coded more consistently across time.
30 The unemployment coefficient was positive in all 20-year windows and statistically significant for those beginning after 1987.
31 The unemployment coefficient was −0.00428101 for 1976–1995 and −0.00102266 in 1991–2010. The resulting difference of 0.00325834 implies around 0.33% more fatalities, or 7253 additional deaths per year, based on 2,222,313 fatalities annually: (exp[−0.00102266 − (−0.00428101)] − 1) × 2,222,313 = 7252.86.
the second and third rows of Table 3)³². When decomposing by age, the most striking finding is the extent to which the declining procyclicality is concentrated among the relatively young: persons under 25 are responsible for less than 4% of deaths but 23% of the secular trend in the macroeconomic impact; those under 45 (65) comprise less than 10% (30%) of fatalities but almost two-fifths (over three-fifths) of the predicted change over time.

The remainder of Table 3 examines specific causes of mortality. A one percentage point rise in unemployment predicts almost 3600 more annual deaths from external sources in 1991–2010 than 1976–1995, or 49% of the change in total mortality. This occurs even though only 7% of all deaths are due to such causes, and helps to explain the large changes for males and <45 year olds (for whom external deaths account for over 40% of mortality). Cancer and other diseases also play a role, although the estimates are imprecise for the former and sensitive to the use of sampling weights for the latter³³. By contrast, cardiovascular disease – the number one killer – explains none of the secular change, as the unemployment rate coefficient actually becomes more negative in later years.

The third panel of Table 3 provides a more detailed decomposition of external deaths. Non-transport accidents are of special interest because the unemployment coefficient switches from large and negative to positive, accounting for over 1400 additional deaths annually, or 40% of the predicted rise in external cause mortality. This increase is 11% larger than for all cancers, even though non-transport accidents constitute only 2% of all fatalities, versus 22% for malignant neoplasms. Transport accidents are also important, explaining 880 extra deaths per year. The effects on suicides and homicides are in the same direction but of considerably smaller magnitude.

The bottom panel of Table 3 presents separate results for four sources of non-transport accidents, which together explain around two-thirds of deaths from this cause. The role of accidental poisonings is remarkable. Although accounting for just 0.5% of deaths, a one point rise in unemployment is predicted to result in 709 more annual poisoning fatalities in 1991–2010 than in 1976–1995, or half the change in non-transport accident fatalities, 20% of that for external deaths and almost 10% of the overall mortality effect.

7. Discussion

The strong procyclical pattern of mortality present in the 1970s and 1980s has been largely eliminated in recent years. The pattern varies across sources of deaths, with much larger secular changes observed for external than disease causes. All types of external deaths became less procyclical or more countercyclical during the analysis period, with particularly large changes for non-transport accidents, and within this category, for accidental poisonings. Among diseases, cardiovascular mortality continues to fall sharply when the economy deteriorates, whereas cancer deaths have become substantially countercyclical. These findings are relevant not only for understanding of the production of health but also for measuring the size and effects of business cycle fluctuations. Egan et al. (2013) argue that procyclical mortality implies that business cycle fluctuations are milder than when calculated using standard GDP measures, but this may have become less true in recent years.

Some estimates are sensitive to changing the starting and ending dates of analysis. Such parameter instability is particularly problematic when the sample window is short – probably anything less than 15 years – raising concerns about the findings of many recent related investigations that have used brief (often less than 10-year) timespans. One contribution of this study is to provide parsimonious methods of illustrating the sensitivity of the results to the length of the analysis window, and to the first and last years examined. Another caveat is that specific sources of death are implicitly treated here as being independent of each other, although some prior research (Yeung et al., 2014) identifies potential correlations between them. This is not an issue for the analysis of total mortality but may be important when considering specific causes of death as competing risks.

Mechanisms for the previously observed procyclical variation in mortality remain poorly understood, so it is speculative as to why the relationship has changed in recent years. Two possibilities are intriguing. First, the change dates to (20-year) analysis periods beginning in the early 1980s, which precisely coincides with the reduction in macroeconomic volatility that has been referred to as the “Great Moderation” (Stock and Watson, 2003; Bernanke, 2004)³⁴. This raises the possibility that the mortality patterns here
are part of a broader change in the effects of short-term changes
32 Since separate (unconstrained) models are estimated for different sources of death, the total contribution of changes in predicted group-specific mortality can sum to more or less than the effect predicted for total mortality.
33 With weighted data, a one point unemployment increase predicts 580 more deaths from other diseases in 1991–2010 than in 1975–1995
34 Productivity also shifted from procyclical to acyclical or slightly countercyclical at about the same time (Galí and van Rens, 2010).
C.J. Ruhm / Journal of Health Economics 42 (2015) 17–28 27
in macroeconomic performance, or in the role of unemployment rates as a proxy for macroeconomic conditions. With regard to the latter, it is notable that the residual variation in state-year unemployment rates after including controls (one minus the R-squared from regressing unemployment rates on state and year dummy variables, state-specific time trends, and the time-varying state demographic characteristics) fell from 0.177 in 1976–1995 to 0.094 in 1999–2010. There is also suggestive evidence that the procyclicality of mortality might have increased slightly in the most recent analysis periods, which include the severe 2007–2009 recession.
Second, the emerging importance of accidental poisoning fatalities occurred at the same time that deaths from this source increased dramatically for young and middle-aged adults (see Fig. 7). Over 90% of poisoning fatalities are now due to drug overdoses, with particularly important roles for prescription opioids (such as hydrocodone and oxycodone) and benzodiazepines (Warner et al., 2011; Ruhm, 2015). The higher death rates reflect greater availability of these drugs raising the ease of self-injury and accidental death during bad economic times.35 Moreover, economic weakness has long been associated with diminished mental health (Ruhm, 2000, 2003; Charles and DeCicca, 2008; Bradford and Lastrapes, 2014) and, to the extent these drugs are now being taken to address this, the increased procyclicality of poisoning deaths may be a physical manifestation of what was previously a mental health problem.36
Appendix A. Appendix
Tables A1 and A2.
Appendix B. Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jhealeco.2015.03.004.
Table A1
Definitions of specific causes of mortality.
Table A2
Sources of death by time period.
Cause of death                      #      %        #      %        #      %
Cardiovascular                923,419   41.6  957,085   46.0  892,420   37.7
Cancer                        502,882   22.6  463,278   22.3  548,661   23.2
Other diseases                638,973   28.8  510,290   24.5  765,354   32.3
External causes               157,039    7.1  151,283    7.3  160,917    6.8
  Transport accidents          48,545    2.2   50,581    2.4   45,750    1.9
  Other accidents              54,044    2.4   45,419    2.2   60,350    2.5
    Falls                      15,225    0.7   12,721    0.6   17,217    0.7
    Drowning/submersion          4181    0.2     4676    0.2     3565    0.2
    Smoke/fires/flame            4229    0.2     4984    0.2     3402    0.1
    Poison/noxious substance   12,090    0.5     5938    0.3   17,225    0.7
  Suicide                      30,755    1.4   29,391    1.4   32,171    1.4
  Homicide                     22,560    1.1   22,560    1.1   20,117    0.8
Note: Table shows average deaths per year for the specified age group or cause.
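As a quick consistency check on Table A2, the percentage columns can be recomputed from the death counts. The sketch below uses only the first pair of columns and assumes that the four top-level categories (cardiovascular, cancer, other diseases, external causes) are exhaustive, which the table does not state explicitly. The variable and function names are illustrative and do not come from the article.

```python
# Average annual deaths taken from the first "#" column of Table A2.
# Assumption: these four top-level categories sum to total deaths.
top_level = {
    "Cardiovascular": 923_419,
    "Cancer": 502_882,
    "Other diseases": 638_973,
    "External causes": 157_039,
}
# A subcategory of external causes, used to check the poisoning share.
poisoning = 12_090

total = sum(top_level.values())


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100.0 * count / total, 1)


shares = {cause: share(n, total) for cause, n in top_level.items()}
poisoning_share = share(poisoning, total)

for cause, pct in shares.items():
    print(f"{cause:16s} {pct:5.1f}%")
print(f"{'Poisoning':16s} {poisoning_share:5.1f}%")
```

Under that assumption the recomputed shares agree with the first percentage column of the table, including the 7.1% for external causes and the 0.5% for poisonings cited in the text.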
35 For example, per capita opioid sales more than tripled between 1999 and 2010 (Paulozzi, 2012).
36 See Ruhm (2013) for a more extensive discussion of these issues.
References
An, Ruopeng, Liu, Junyi, 2012. Local labor market fluctuations and physical activity among adults in the United States 1990–2009. In: ISRN Public Health, 2012, Article 318610, pp. 1–7.
Anderson, Robert N., Miniño, Arialdi M., Hoyert, Donna L., Rosenberg, Harry M., 2001. Comparability of cause of death between ICD-9 and ICD-10: preliminary estimates. National Vital Statistics Reports 49 (2), 1–32.
Ariizumi, Hideki, Schirle, Tammy, 2012. Are recessions really good for your health? Evidence from Canada. Social Science and Medicine 74, 1224–1231.
Arkes, Jeremy, 2007. Does the economy affect teenage substance use? Health Economics 16 (1), 19–36.
Arkes, Jeremy, 2009. How the economy affects teenage weight. Social Science and Medicine 68 (11), 1943–1947.
Bernanke, Ben, 2004. The Great Moderation, www.federalreserve.gov/boarddocs/speeches/2004/20040220/ (retrieved 6 March 2015).
Blinder, Alan S., 1973. Wage discrimination: reduced form and structural estimates. Journal of Human Resources 8 (4), 436–455.
Bradford, W. David, Lastrapes, William D., 2014. A prescription for unemployment? Recessions and the demand for mental health drugs. Health Economics 23 (11), 1301–1325.
Brenner, M. Harvey, 1979. Mortality and the national economy. The Lancet 314 (8142), 568–573.
Brenner, M. Harvey, 1971. Economic changes and heart disease mortality. American Journal of Public Health 61 (3), 606–611.
Buchmueller, Tom, Grignon, Michel, Jusot, Florence, 2007. Unemployment and mortality in France, 1982–2002. In: Center for Health Economics and Policy Analysis Working Paper 07-04. McMaster University.
Butler, J.S., 2000. Efficiency results of MLE and GMM estimation with sampling weights. Journal of Econometrics 96 (1), 25–37.
Charles, Kerwin Kofi, DeCicca, Philip, 2008. Local labor market fluctuations and health: is there a connection and for whom? Journal of Health Economics 27 (6), 1532–1550.
Colman, Gregory J., Dave, Dhaval M., 2013. Exercise, physical activity, and exertion over the business cycle. Social Science and Medicine 93 (September), 11–20.
Cotti, Chad, Tefft, Nathan, 2011. Decomposing the relationship between macroeconomic conditions and fatal car crashes during the great recession: alcohol- and non-alcohol-related accidents. B. E. Journal of Economic Analysis and Policy 11 (1), 1–48.
Cutler, David M., 2008. Are we finally winning the war on cancer? Journal of Economic Perspectives 22 (4), 3–26.
Dávalos, María E., Fang, Hai, French, Michael T., 2012. Easing the pain of an economic downturn: macroeconomic conditions and excessive alcohol consumption. Health Economics 21 (11), 1318–1335.
Economou, Athina, Nikolau, Agelike, Theodossiou, Ioannis, 2008. Are recessions harmful to health after all? Evidence from the European Union. Journal of Economic Studies 35 (5), 368–384.
Edwards, Ryan, 2011. American Time Use Over the Business Cycle. City University of New York, Mimeo.
Egan, Mark L., Mulligan, Casey B., Philipson, Tomas J., 2013. Adjusting measures of economic output for health: is the business cycle countercyclical? In: National Bureau of Economic Research Working Paper No. 19058.
Eyer, Joseph, 1977. Prosperity as a cause of death. International Journal of Health Services 7 (1), 125–150.
Fishback, Price V., Haines, Michael R., Kantor, Shawn, 2007. Births, deaths and new deal relief during the great depression. Review of Economics and Statistics 89 (1), 1–14.
Freeman, Donald G., 1999. A note on economic conditions and alcohol problems. Journal of Health Economics 18 (5), 661–670.
Galí, Jordi, van Rens, Thijs, 2010. The vanishing procyclicality of labor productivity. In: Kiel Institute Working Paper No. 1641 (August).
Gerdtham, Ulf-G., Ruhm, Christopher J., 2006. Deaths rise in good economic times: evidence from the OECD. Economics and Human Biology 43 (3), 298–316.
Gonzalez, Fidel, Quast, Troy, 2011. Macroeconomic changes and mortality in Mexico. Empirical Economics 40 (2), 305–319.
Gravelle, H.S.E., Hutchinson, G., Stern, J., 1981. Mortality and unemployment: a critique of Brenner’s time-series analysis. The Lancet 318 (8248), 675–679.
Gruber, Jonathan, Frakes, Michael, 2006. Does falling smoking lead to rising obesity? Journal of Health Economics 25 (2), 183–197.
Halliday, Timothy J., 2007. Business cycles, migration and health. Social Science and Medicine 64 (7), 1420–1424.
Heutel, Garth, Ruhm, Christopher J., 2013. Air pollution and procyclical mortality. In: National Bureau of Economic Research Working Paper No. 18959.
Kasl, Stanislav V., 1979. Mortality and the business cycle: some questions about research strategies when utilizing macro-social and ecological data. American Journal of Public Health 69 (8), 784–788.
Klebba, A. Joan, Scott, Joyce H., 1980. Estimates of selected comparability ratios based on dual coding of 1976 death certificates by the eighth and ninth revisions of the international classifications of diseases. Monthly Vital Statistics Report 28 (11), 1–19.
Lam, Jean-Paul, Piérard, Emmanuelle, 2014. The Time-Varying Relationship Between Mortality and Business Cycles in the U.S. University of Waterloo, Mimeo (June).
Lin, Shin-Jong, 2009. Economic fluctuations and health outcome: a panel analysis of Asia-Pacific countries. Applied Economics 41 (4), 519–530.
Lindo, Jason M., 2015. Aggregation and the relationship between unemployment and health. Journal of Health Economics 40 (2), 83–96.
Malloy, Raven, Smith, Christopher L., Wozniak, Abigail, 2011. Internal migration in the United States. Journal of Economic Perspectives 25 (3), 173–196, Summer.
McInerney, Melissa, Mellor, Jennifer M., 2012. Recessions and seniors’ health, health behaviors, and healthcare use: analysis of the medicare beneficiary survey. Journal of Health Economics 31 (5), 744–751.
Miller, Douglas L., Page, Marianne E., Huff Stevens, Ann, Filipski, Mateusz, 2009. Why are recessions good for your health? American Economic Review 99 (2), 122–127.
Neumayer, Eric, 2004. Recessions lower (some) mortality rates. Social Science & Medicine 58 (6), 1037–1047.
Ogburn, William F., Thomas, Dorothy S., 1922. The influence of the business cycle on certain social conditions. Journal of the American Statistical Association 18 (139), 324–340.
Oaxaca, Ronald, 1973. Male–female wage differentials in urban labor markets. International Economic Review 14 (3), 693–709.
Paulozzi, Leonard J., 2012. Prescription drug overdoses: a review. Journal of Safety Research 43 (4), 283–289.
Philipson, Tomas, Ebner, Michael, Lakdawalla, Darius N., Corral, Mitra, Conti, Rena, Goldman, Dana P., 2012. An analysis of whether higher health care spending in the United States versus Europe is ‘worth it’ in the case of cancer. Health Affairs 31 (4), 667–675.
Ruhm, Christopher J., 2000. Are recessions good for your health? Quarterly Journal of Economics 115 (2), 617–650.
Ruhm, Christopher J., 2003. Good times make you sick. Journal of Health Economics 22 (4), 637–658.
Ruhm, Christopher J., 2005. Healthy living in hard times. Journal of Health Economics 24 (2), 341–363.
Ruhm, Christopher J., 2012. Understanding the relationship between macroeconomic conditions and health. In: Jones, Andrew M. (Ed.), Elgar Companion to Health Economics, second ed. Edward Elgar, Cheltenham, UK, pp. 5–14.
Ruhm, Christopher J., 2013. Recessions, healthy no more? In: NBER Working Paper No. 19287 (August).
Ruhm, Christopher J., 2015. Drug Poisoning Deaths in the United States 1999–2012. University of Virginia, Mimeo.
Ruhm, Christopher J., Black, William E., 2002. Does drinking really decrease in bad times? Journal of Health Economics 21 (4), 659–678.
Shierholz, Heidi, 2012. Labor force participation: cyclical versus structural changes since the start of the great recession. In: Economic Policy Institute Issue Brief No. 333 (May 24).
Solon, Gary, Haider, Steven J., Wooldridge, Jeffrey, 2015. What are we weighting for? Journal of Human Resources (forthcoming).
Stevens, Ann Huff, Miller, Douglas L., Page, Marianne, Filipski, Mateusz, 2011. The best of times, the worst of times: understanding procyclical mortality. In: NBER Working Paper No. 17657.
Stock, James H., Watson, Mark W., 2003. Has the business cycle changed and why? In: Gertler, M., Rogoff, K. (Eds.), NBER Macroeconomics Annual 2002. MIT Press, Cambridge, pp. 159–218.
Sullivan, Daniel, von Wachter, Till, 2009. Job displacement and mortality: an analysis using administrative data. Quarterly Journal of Economics 124 (3), 1265–1306.
Stuckler, David, Basu, Sanjay, Suhrcke, Marc, Coutts, Adam, McKee, Martin, 2009. The public health effect of economic crisis and alternative policy responses in Europe: an empirical analysis. The Lancet 374 (9686), 315–323.
Svensson, Mikael, 2007. Do not go breaking your heart: do economic upturns really increase heart attack mortality? Social Science and Medicine 65 (4), 833–841.
Tapia Granados, José A., 2005. Recessions and mortality in Spain, 1980–1997. European Journal of Population 21 (4), 393–422.
Tapia Granados, José A., Diez Roux, Ana V., 2009. Life and death during the great depression. Proceedings of the National Academy of Sciences 106 (41), 17290–17295.
Tekin, Erdal, McClellan, Chandler, Jean Minyard, Karen, 2013. Health and health behaviors during the worst of times: evidence from the great recession. In: NBER Working Paper No. 19234.
Thomas, Dorothy Swaine, 1927. Social Aspects of the Business Cycle. Alfred A. Knopf, New York, NY.
Warner, Margaret, Chen, Hui Li, Makuc, Diane M., Anderson, Robert N., Miniño, Arialdi M., 2011. Drug poisoning deaths in the United States, 1980–2008. In: NCHS Data Brief No. 81. National Center for Health Statistics, Hyattsville, MD.
Wooldridge, Jeffrey, 1999. Asymptotic properties of weighted M-estimators for variable probability samples. Econometrica 67 (6), 1385–1406.
Xu, Xin, 2013. The business cycle and health behaviors. Social Science and Medicine 77 (January), 126–136.
Yeung, Gary Y.C., van den Berg, Gerard J., Lindeboom, Maarten, Portrait, France R.M., 2014. The impact of early-life economic conditions on cause-specific mortality during adulthood. Journal of Population Economics 27 (3), 895–919.
Journal of Health Economics 42 (2015) 29–43
Article info
Article history: Received 17 September 2013; Received in revised form 21 July 2014; Accepted 7 March 2015; Available online 17 March 2015
JEL classification: C41; I14; I24
Keywords: Education; Cognitive ability; Mortality; Structural equation model; Duration model
Abstract
We aim to disentangle the relative impact of (i) cognitive ability and (ii) education on health and mortality using a structural equation model suggested by Conti et al. (2010). We extend their model by allowing for a duration dependent variable (mortality), and an ordinal educational variable. Data come from a Dutch cohort born between 1937 and 1941, including detailed measures of cognitive ability and family background in the final grade of primary school. The data are linked to the mortality register 1995–2011, such that we observe mortality between ages 55 and 75. The results suggest that at least half of the unconditional survival differences between educational groups are due to a ‘selection effect’, primarily on the basis of cognitive ability. Conditional survival differences across those having finished just primary school and those entering secondary education are still substantial, and amount to a 4 years gain in life expectancy, on average.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction
Disparities in health and life expectancy across educational groups are striking and pervasive, and are considered one of the most compelling and well established facts in social science research (Mazumder, 2012). Even in an egalitarian country such as the Netherlands, with a very accessible health care system, the difference in life expectancy between the university educated and those who finished only primary school is 6–7 years (CBS, 2008). It is commonly assumed that a large part of this association derives from the causal effect of education on health outcomes. An abundant list of possible mechanisms has been proposed, among which occupational demands, health behavior, and the ability to process information are the most commonly mentioned (Ross and Wu, 1995; Cutler and Lleras-Muney, 2008).
Yet, the association between education and health could also stem from (i) ‘reverse causality’, in which childhood ill-health constrains educational attainment (Behrman and Rosenzweig, 2004; Case et al., 2005) and (ii) confounding ‘third factors’ such as ability, parental background and time preference that influence both education and health outcomes (Fuchs, 1982; Auld and Sidhu, 2005; Deary, 2008).
Studies based on natural experiments in education, such as changes in compulsory schooling laws, overcome the difficulty of
☆ Van Kippersluis gratefully acknowledges funding from the National Institute on Aging (NIA) under grant R01AG037398, from NETSPAR under the project “Income and health, work and care across the life cycle II”, and from the Netherlands Organization of Scientific Research (NWO Veni grant 016.145.082). The authors acknowledge access to linked data resources (DO 1995–2011) by Statistics Netherlands (CBS). We thank Mars Cramer and Mirjam van Praag for help in accessing the original data (see https://easy.dans.knaw.nl/ui/datasets/id/easy-dataset:39042), and are grateful to Mars Cramer, and attendants from the Empirical Health Economics conference in Munich 2013, the International Health Economics Association conference in Sydney 2013, the Multistate Event History Analysis in Hangzhou 2012, and seminar participants at the Chinese University of Hong Kong, Cornell University, Erasmus University Rotterdam, and the University of Southern California for helpful comments.
∗ Corresponding author at: Erasmus School of Economics, Erasmus University Rotterdam, PO Box 1738, 3000 DR Rotterdam, The Netherlands. Tel.: +31 10 4088837.
E-mail addresses: bijwaard@nidi.nl (G.E. Bijwaard), hvankippersluis@ese.eur.nl (H. van Kippersluis), veenman@ese.eur.nl (J. Veenman).
http://dx.doi.org/10.1016/j.jhealeco.2015.03.003
0167-6296/© 2015 Elsevier B.V. All rights reserved.
30 G.E. Bijwaard et al. / Journal of Health Economics 42 (2015) 29–43
separating the direct causal effect of education from third factor effects. The estimates based on these studies point towards a small effect (Lleras-Muney, 2005; Oreopoulos, 2006; Van Kippersluis et al., 2011; Meghir et al., 2013), or even an insignificant effect of education on health and mortality (Arendt, 2005; Albouy and Lequien, 2008; Mazumder, 2008; Braakmann, 2011; Clark and Royer, 2013). This suggests that confounding factors may well play an important role in shaping the strong association between education and health.
Surprisingly little research in economics has investigated the contribution of early childhood abilities and childhood social background in shaping the association between education and health.1 Some recent economic studies report associations between childhood cognitive and non-cognitive abilities, and health outcomes at ages 30–40 using the British Cohort Study (Murasko, 2007), the U.K. National Child Development Study (Carneiro et al., 2007), the U.S. National Longitudinal Study of Youth 1979 (Auld and Sidhu, 2005; Kaestner and Callison, 2011), or the Dutch ‘Brabant data’ (Cramer, 2012). It is established that cognitive ability and some non-cognitive factors such as self-esteem and conscientiousness are associated with health outcomes. Nonetheless, hardly anything is known about (i) the relative impact of education and childhood abilities on health outcomes, and in turn (ii) how much of the association between education and health is explained by these cognitive and non-cognitive abilities.
A notable contribution to the literature is a recent series of papers by Conti and Heckman (2010), Conti et al. (2010, 2011), and Heckman et al. (2014) who, using the British Cohort Study and the National Longitudinal Study of Youth (NLSY79), estimate a structural equation model in which the interdependence between education, health, and two latent factors capturing cognitive and non-cognitive abilities is explicitly modeled. The authors show that for most health outcomes around half of the association between education and health is driven by cognitive and non-cognitive abilities and early childhood social background. The other half is interpreted as the causal effect of education on health.
While the series of papers by Conti, Heckman and co-authors provided a significant contribution to the literature, there are two notable limitations. First, the health outcomes are measured at age 30, an age at which health differences by education may not have fully materialized. In fact, disparities in health and mortality seem to peak around middle-age (Cutler and Lleras-Muney, 2008). Secondly, the health measures are all self-reported, which may bias the estimates since education is related to subjective health perceptions (Bago d’Uva et al., 2008).
In this paper, we aim to disentangle the effects of education and cognitive ability on health outcomes. We will use the so-called ‘Brabant data’ – a representative cohort of primary school sixth graders in the Dutch province of Noord-Brabant – that has detailed information on cognitive ability and social background measured back in 1952. Three follow-up surveys in 1957, 1983 and 1993 contain information on education, employment, and self-reported health. We have linked these data to the mortality register 1995–2011, such that the impact on mortality can be analyzed.
The contribution of this paper is threefold. First, we study the relative impact of cognitive ability and education on mortality, as an objective health indicator. The second contribution is that, in contrast to existing studies that measure health outcomes at ages 30–40, we observe mortality during ages 55–75. Finally, we extend the structural equation model by Conti et al. (2010) by allowing for a duration dependent variable (mortality).2
The results show that for most ages, cognitive ability and family socioeconomic status explain around half of the raw differences in mortality across educational groups. Stated otherwise, education remains important in determining mortality even after controlling for cognitive ability, family socioeconomic status, and a range of other background variables. The conditional survival differences across educational groups remain remarkable, and amount to a 4-year gain in life expectancy for those entering at least secondary school compared to those that dropped out after primary school.
This paper is structured as follows. Section 2 presents the Brabant data including the available register data from Statistics Netherlands. Section 3 presents the structural equation model that we will use to disentangle the relative contributions of cognitive ability and education on health outcomes. Section 4 presents the results and Section 5 discusses them.
2. Data and descriptive statistics
The data are from a Dutch cohort born between 1937 and 1941. Very detailed information about individual intelligence, social background and school achievement is available for 5823 individuals. The survey was held in the spring and summer of 1952 among pupils of the sixth (last) grade of primary schools in the Dutch province of Noord-Brabant, and hence is referred to as the ‘Brabant data’. One-fourth of the province population was sampled, mainly by including every fourth child from the schools’ list of pupils.3 Hartog (1989) investigated the data and found no reason to doubt representativeness. There was no selective dropout of pupils before the data collection, as primary school was compulsory and enforcement of school attendance was strict (Dronkers, 2002).
Follow-up surveys took place in 1957, 1983 and 1993.4 In 1957 only a sub-sample – those who scored above-average on six tests – of the original cohort was interviewed about the school careers between 1952 and 1957, specifically to investigate school career choices of the most intelligent half of the cohort. In 1983 and 1993 attempts were made to trace all initial respondents of the Brabant cohort to investigate labour market behavior, with overall response rates of around 45 percent. The sample is reduced to 2998 individuals who have measurements in 1952 and in either 1983 or 1993, or both.5
The Brabant data are subsequently linked to administrative records from Statistics Netherlands. The basis for this linkage is identifying information on ZIP code, date of birth, and sex, provided in 1993 by Dutch municipalities, which includes information on all individuals living in the Netherlands. The administrative records have been available since 1995. Because of the two-year discrepancy only 86 percent of the 2998 individuals could be traced in the municipality register in 1995, leaving us with a working sample of 2579 individuals. Administrative records include the mortality register and the municipality register for the years 1995–2011 inclusive. The mortality register is used to identify dropout due to death in
1 See Gottfredson (2004) for an overview of the epidemiological literature.
2 Savelyev (2012) developed a similar structural equation model for mortality as ours, yet using a discrete-time hazard model and not taking into account dynamic selection of ability due to differential survivorship. His study is based on the Terman data, a cohort of individuals with IQ beyond 140. Hence, apart from differences in the model specification, his focus is on an extraordinary sample corresponding to the 99.6th percentile of the intelligence distribution, with very limited variation in cognitive ability. Not surprisingly, he examines the effect of higher education whereas we focus on secondary education.
3 Some schools had school years beginning in April rather than in September. For these schools, half the pupils of half the schools were included in the sample, which yielded 369 observations out of a total of 5823 (Hartog, 1989).
4 Mathijssen and Sonnemans (1958), Hartog and Pfann (1985), Van Praag (1992), and Hartog et al. (2002). The complete questionnaire is included in Van Praag (1992) ‘Brabantse zesdeklassers, 1952–2010’.
5 In Section 4.2 it is verified that selective attrition does not affect our results.
the follow-up period. Demographics are obtained from the municipality register.
2.1. Dependent variables
Our outcome variable is Mortality, which is identified from the mortality register in the period 1995–2011. Given that most pupils were born around 1940, this implies that we follow mortality from age 55 until 75.6 In our sample, 409 individuals, or 16 percent, died during the period 1995–2011. Close to 50 percent died from cancer, 25 percent from cardiovascular diseases, and 8 percent from respiratory diseases such as COPD and pneumonia. External causes such as accidents comprise only two percent, as do mental disorders (e.g. dementia), diseases of the digestive system (e.g. liver cirrhosis) and diseases of the nervous system (e.g. Parkinson).
2.2. Independent variables
Our main independent variable of interest is Education, here defined as the highest level of education attended, in three categories: (1) Lower Education, including those who attended at most (extended)7 primary school, (2) Lower Vocational Education, including those who attended at most lower vocational education such as the lower agricultural school or lower polytechnic schools, and (3) At least General Secondary School, including those who attended lower general secondary school, higher general secondary school, and higher vocational education or university. Education is retrieved mainly from the 1983 and 1993 survey variables on the highest level of education attended. The maximum of the two defines Education, and where missing we update our educational variable with information from the 1957 survey.
Table 1 presents descriptive statistics and shows that 14 percent did not continue school after primary school, forming the Lower Education category, 35 percent only attended Lower Vocational Education, and the other 51 percent attended At least General Secondary School. Fig. 1 shows the Kaplan–Meier survival curves for a binary indicator of education with threshold at Lower Education, and separately for the three education categories. It is clear that the largest survival differences are between those with only primary school and those above primary school, and that the difference grows with age to around ten percentage points near age 75.
Our second independent variable is Cognitive Ability. In the Brabant data there are two measurements for cognitive ability, both measured in the final grade of primary school (i.e. around age 12): (i) the Raven Progressive Matrices Test, and (ii) a Vocabulary test (picking synonyms).8 The timing of the intelligence tests implies that the plausible feedback effects from education to cognitive ability (Deary and Johnson, 2010; Brinch and Galloway, 2012; Meghir et al., 2013) will be seen as an education effect, while the impact of the cognitive ability endowment in the final grade of primary school on educational choice and later-life mortality is seen as a selection effect.9
The IQ p.m. (‘progressive matrices’) test focuses on mathematical ability and is a replication of the British Progressive Matrices test, designed by Raven (1958). It is considered to be a ‘pure’ measurement of problem solving abilities, as it does not require any linguistic or general knowledge (Dronkers, 2002). Hence, the Raven test is supposed to measure fluid or analytic intelligence (Carpenter et al., 1990). In this sense, the test can be compared to Spearman’s g test (1927). The term g refers to the determinants of the common variance within intelligence tests, being the core issue of intelligence measurement (Carpenter et al., 1990).
Table 1 shows that the ability test designed by Raven has a mean of 102 with a standard deviation of 13, while the vocabulary test has a mean of 101 with a standard deviation of 13. The correlation between the Raven test and the vocabulary test is 0.38. This suggests that while there seems to be some overlap between the two measurements, the tests additionally gauge some idiosyncratic part of cognitive ability. Therefore, we will use both measurements to build a comprehensive latent factor of cognitive ability. In a robustness check we solely use the Raven test to see whether the results differ.
2.3. Control variables
Apart from a fairly standard set of demographic control variables such as Age, whether Male, and Birth Rank, we also have information about the social and school environment of the individuals. Most of these variables are reported by the school principal. Family Socioeconomic Status is measured in three categories from lowest to highest depending on father’s occupation.10 We additionally know whether the child had to work in the parents’ farm or company, defining the binary indicator Child Works, which potentially signals part of the childhood health status. In this (historical) case, however, the variable is mainly dependent on the parents having a firm.
Available information regarding the school includes School Type and the Number of Teachers. Repeat defines the number of classes that children had to repeat. Further, we know the Teacher’s Advice regarding further education of the child, and the Preference of the Parents concerning the education of the pupil, categories of which are defined in Table 1, which also includes descriptive statistics.
We have no information about childhood health status, which prevents us from investigating the possibility of reverse causality from health to education in our sample. The sample is comprised of pupils who made it to the final grade of primary school. Hence, pupils with severe health problems impairing going to school in the first place will not be present in our sample. Moreover, in the 1983 wave of the survey male respondents were asked whether they served in the military. The main reason for disqualification of
6 Of the Dutch population 1940 cohort, only 6.8 percent died between the ages of 12 and 55 – Human Mortality Database, University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). Available at www.mortality.org or www.humanmortality.de (data downloaded on July 30, 2012).
7 At the time, pupils had to stay in school for at least 8 years, or until they reached the age of 14. Since regular primary school only consisted of 6 grades, some schools offered an additional 2-year extended primary school (“vglo”).
8 The data also contain the so-called LO-IV test, which consists of six sub-tests: regularities in series of numbers, analogies in figures, analogies in words, and similarities between concepts (equal, not-equal, cause). Since the quality of this test has been questioned (Hartog et al., 2002, p. 5) we will not use it in our analyses. There is also information on grades for specific courses (Dutch language, mathematics (arithmetics), history, physics, geography, health sciences, and traffic), but since these are not clean measures of cognitive ability and are relative to others in one’s classroom, we choose not to use these grades.
9 It should be emphasized however that there could be unobserved factors correlated to both cognitive ability in the final grade of primary school and later-life mortality, which our measure of cognitive ability would be picking up.
10 We classify lower administrative, agricultural, industrial, and other lower workers, and the disabled into the Lowest Socioeconomic Status. If the School Principal considered the family antisocial, the family is also classified into the Lowest Socioeconomic Status. Intermediary personnel, self-employed farmers, self-employed craftsmen, and the retired are categorized into the Intermediate Socioeconomic Status (following Cramer, 2012). Teachers, executives and academics are classified into the Highest Socioeconomic Status. In case father’s occupation is missing, we use father’s education for individuals in the 1957 survey. Father’s education is classified into 3 levels, which we directly translate into the three socioeconomic statuses. We use mother’s education in case the father died or was not present in the household.
32 G.E. Bijwaard et al. / Journal of Health Economics 42 (2015) 29–43
Table 1
Descriptive statistics of the Brabant data sample.

Variable                                          Mean     SD       N
Dependent variables
  Mortality                                       0.16     0.35     2579
Independent variables
  Education
    Lower education                               0.14     0.34     2537
    Lower vocational education                    0.34     0.48     2537
    At least general secondary school             0.51     0.35     2537
  Raven p.m. test                                 102.04   13.28    2579
  Vocabulary test                                 101.42   12.87    2579
Control variables
  Male                                            0.58     0.49     2579
  Birth rank                                      2.50     2.55     2412
  Family socioeconomic status
    Lowest                                        0.53     0.50     2409
    Middle                                        0.44     0.50     2409
    Highest                                       0.03     0.16     2409
  Child works                                     0.28     0.45     2256
  School religion
    Roman-Catholic                                0.76     0.43     2518
    Protestant                                    0.19     0.40     2518
    Special                                       0.03     0.17     2518
    Public                                        0.02     0.13     2518
  Number of teachers                              6.92     2.47     2452
  Repeat
    No repetition of grade                        0.64     0.48     2462
    Repeated once                                 0.27     0.45     2462
    Repeated twice or more                        0.09     0.28     2462
  Teacher’s advice
    Continue primary school                       0.24     0.43     2429
    Lower vocational education                    0.38     0.48     2429
    Lower secondary education                     0.24     0.43     2429
    Higher secondary education                    0.14     0.20     2429
  Preference of the parents
    Work in family company                        0.13     0.33     2200
    Paid work without vocational education        0.20     0.28     2200
    Paid work with vocational education           0.27     0.44     2200
    General secondary education                   0.41     0.49     2200

Notes: Author’s calculations on the basis of the Brabant data linked to the municipality register and the mortality register.
compulsory military duty is health problems.11 Since the fraction of individuals having served in the military is almost identical across educational levels, this provides some indirect evidence that health differences across educational levels were minimal during teenage years. We furthermore refer to Conti et al. (2010) who showed that in their sample childhood health, as measured by childhood height, was not an important determinant of educational choice. The lack of information on childhood health should therefore not be a major source of concern.

11 Other reasons were exemption owing to one’s brother’s service, grounds of conscience, or personal indispensability (e.g. Van Schellen and Nieuwbeerta, 2007).

3. Methodology

Our empirical approach is an extension of the structural equation framework developed by Conti et al. (2010). We briefly describe the Conti et al. model, after which we will present our two extensions: allowing for an objective duration dependent variable (mortality), and introducing an ordinal educational choice. Finally, we explain how we disentangle the effects of cognitive ability and education on the health outcomes.

3.1. Basic structural equation model

The basic Conti et al. model allows a way of modeling the interrelationships between abilities, education and health outcomes, where individuals potentially make their educational decisions depending on the perceived health gains. Hence, the educational choice is endogenous, and in practice it is assumed that selection into schooling can be fully accounted for by using observed characteristics and unobserved ability. The model consists of three parts: (i) a binary educational choice depending on latent abilities and other covariates, (ii) potential outcomes depending on the choice of education, latent abilities, and other covariates, and (iii) a measurement system for the latent abilities.

The binary indicator for education $D_i$ is defined as 1 if individual $i$ took any education beyond the compulsory schooling age, and 0 if not:

\[ D_i = \begin{cases} 1 & \text{if } D_i^* \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]

where we assume $D_i^*$ is an underlying latent utility which is continuous and linear, and depends on latent abilities $\theta$, and observed characteristics $X^D$:

\[ D_i^* = \gamma X_i^D + \alpha_D \theta_i + \nu_i^D \tag{2} \]

with $\nu^D$ being an error term independent of $X^D$ and $\theta$. We assume that $\nu^D$ is normally distributed, which implies that we have a probit model for the educational choice. We fix the variance at 1 since the variance is not identified in a probit model.

The second part is the potential outcomes part, in which there are two potential outcomes $Y_{i1}$ and $Y_{i0}$, where the former is the outcome in case the individual chose to pursue education beyond what is compulsory, and the latter is the outcome in case the individual dropped out of school right after the compulsory schooling age. Both $Y_{i1}$ and $Y_{i0}$ depend on latent ability $\theta$, and on observed characteristics $X^Y$:

\[ Y_{i1} = \beta_1 X_i^Y + \alpha_1 \theta_i + \nu_i^1 \tag{3} \]

\[ Y_{i0} = \beta_0 X_i^Y + \alpha_0 \theta_i + \nu_i^0 \tag{4} \]

with $(\nu^0, \nu^1)$ independent of $X^Y$ and $\theta$, independent of $\nu_i^D$ conditional on $X^Y$ and $\theta$, and both follow a normal distribution with variance $\sigma_1^2$ and $\sigma_0^2$, respectively.

The final part of the model is the measurement equation, where one or two measurements, $M_{ik}$ ($k = 1, 2$), implicitly define the latent ability $\theta$:

\[ M_{ik} = \delta_k X_i^M + \alpha_{M_k} \theta_i + \nu_i^{M_k} \tag{5} \]

with $\nu^{M_k}$ independent of $X^M$ and $\theta$. We assume that $\nu^{M_k}$ is normally distributed with variance $\sigma_{M_k}^2$.

Fig. 1. Kaplan–Meier survival function by education level in two categories (top) and three categories (bottom).

3.2. Allowing for a duration outcome as dependent variable

[…] equation for latent ability is defined by (5), where we have two measurements for latent cognitive ability.

It is common practice to define the potential outcomes of a duration variable like mortality in terms of the hazard that the outcome of interest occurs.12 We define $\lambda^{(1)}(t)$ as the hazard rate for an individual with education level beyond primary school ($D_i = 1$), and $\lambda^{(0)}(t)$ as the hazard rate for an individual with an education level equal to primary school ($D_i = 0$). We assume a Gompertz proportional hazard model for the two potential hazards, which has been shown to be an accurate representation of mortality between the ages of 30 and 80 (e.g. Gavrilov and Gavrilova, 1991; Cramer, 2012). Both potential hazards depend on the latent ability $\theta$,13 and observed characteristics $X^Y$:

\[ \lambda^{(0)}(t \mid X^Y, \theta) = \exp\left(a_0 t + \beta_0 X_i^Y + \alpha_0 \theta_i\right) \tag{6} \]

\[ \lambda^{(1)}(t \mid X^Y, \theta) = \exp\left(a_1 t + \beta_1 X_i^Y + \alpha_1 \theta_i\right) \tag{7} \]
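As a worked illustration of the Gompertz specification in Eqs. (6) and (7), the sketch below evaluates the hazard exp(a t + index) and the survival function obtained by integrating it, S(t) = exp(−e^{index}(e^{a t} − 1)/a), and then conditions on survival to an entry age t0. It is a minimal sketch, not the authors' code: the function names and all numerical values are hypothetical, `linear_index` stands for the sum of the covariate and latent-ability terms, and a is assumed to be non-zero.

```python
import math


def gompertz_hazard(t, a, linear_index):
    """Hazard exp(a*t + linear_index) at age t."""
    return math.exp(a * t + linear_index)


def gompertz_survival(t, a, linear_index):
    """Survival exp(-integral_0^t hazard(s) ds), using the closed-form integral (a != 0)."""
    cumulative_hazard = math.exp(linear_index) * (math.exp(a * t) - 1.0) / a
    return math.exp(-cumulative_hazard)


def survival_given_entry(t, t0, a, linear_index):
    """Survival to age t conditional on being alive at entry age t0 (left truncation)."""
    return gompertz_survival(t, a, linear_index) / gompertz_survival(t0, a, linear_index)


# Hypothetical values, chosen only for illustration.
a = 0.09
for label, index in (("primary only", -9.5), ("beyond primary", -9.8)):
    print(label, round(survival_given_entry(75, 55, a, index), 3))
```

Dividing by the survival at the entry age is what makes the comparison conditional on having reached that age, which matters when follow-up starts late in life.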
The effect of latent ability on the hazard is captured by $\alpha_0$ and $\alpha_1$. The corresponding potential survival rates are

\[ S^{(0)}(t \mid X^Y, \theta) = \exp\left(-\int_0^t \lambda^{(0)}(s \mid X^Y, \theta)\, ds\right) \tag{8} \]

\[ S^{(1)}(t \mid X^Y, \theta) = \exp\left(-\int_0^t \lambda^{(1)}(s \mid X^Y, \theta)\, ds\right) \tag{9} \]

Without additional restrictions on the distribution of the latent factors the model is not identified. However, because we have an intrinsically non-linear duration outcome instead of a linear outcome, the Ledermann bound on the number of measurements compared to the number of latent factors does not apply. Identification of our model is closely related to the identification in a mixed proportional hazard (MPH) model, where we assume that the unobserved heterogeneity has a log-normal distribution. A MPH model is identified when the unobserved heterogeneity term has finite mean and is independent of the other observed factors (Elbers and Ridder, 1982). When we assume a normal distribution for the latent ability, $\theta \sim N(0, \sigma_\theta^2)$, the implied unobserved heterogeneity in the hazard (Eqs. (6) and (7)) has mean $\exp\left(\tfrac{1}{2}\alpha_j^2 \sigma_\theta^2\right)$, for $j = 0, 1$. For identification $\alpha_j$ or $\sigma_\theta^2$ needs to be fixed. We choose to fix $\sigma_\theta^2 = 1$. Thus the latent ability follows a standard normal distribution.14

An important feature of duration data is that for some individuals we only know that he or she survived up to a certain time (often the end of the observation window). In this case an individual is (right) censored, $\Delta_i = 0$, and we use the survival function instead of the hazard in the likelihood function. Another feature of duration data is that only individuals are observed having survived up to a certain age. In our case, mortality follow-up is only available from age 55 onwards. In this case the individuals are left-truncated, and we need to condition on survival up to the age of first observation, $t_0$.

The likelihood contribution of individual $i$ in our duration model is

\[ L_i^{(j)} = \lambda^{(j)}(t)^{\Delta_i}\, S^{(j)}(t) / S^{(j)}(t_0), \qquad j = 0, 1 \tag{10} \]

With left-truncated data the distribution of latent ability among the survivors (up to the left-truncation time) changes. When only individuals are observed that have survived until age $t_0$ the likelihood contribution is

\[ L_i = \int \left[ \Phi\left(\gamma X_i^D + \alpha_D \theta\right) \cdot \lambda^{(1)}(t \mid X^Y, \theta)^{\Delta_i}\, \frac{S^{(1)}(t \mid X^Y, \theta)}{S^{(1)}(t_0 \mid X^Y, \theta)} \right]^{D_i} \times \left[ \Phi\left(-\gamma X_i^D - \alpha_D \theta\right) \cdot \lambda^{(0)}(t \mid X^Y, \theta)^{\Delta_i}\, \frac{S^{(0)}(t \mid X^Y, \theta)}{S^{(0)}(t_0 \mid X^Y, \theta)} \right]^{1 - D_i} \times \prod_{k=1}^{2} \frac{1}{\sigma_{M_k}}\, \phi\left(\frac{M_{ik} - \delta_k X_i^M - \alpha_{M_k} \theta}{\sigma_{M_k}}\right) dH(\theta \mid T > t_0) \tag{11} \]

where $h(\theta)$ is a normal distribution with variance $\sigma_\theta^2 = 1$. The maximum likelihood estimation of the parameters involves the calculation of an integral that does not have an analytical solution. However, Gaussian quadrature can approximate this one dimensional integral very well. Hence, we estimate the parameters using maximum likelihood on the basis of Gaussian quadrature approximation.15

14 In principle restricting the distribution of $\theta$ to a normal distribution is not necessary. In line with the literature on MPH models a discrete distribution with finite points of support would be an alternative choice (Heckman and Singer, 1984). However, using a normal distribution assumes a continuum of ability values rather than a finite number, and the distribution of intelligence is generally found to be close to normal (Gottfredson, 1997).
15 Gaussian quadrature is a numerical integration method based on Hermite polynomials (Press et al., 1993). It provides an efficient approximation for evaluating indefinite integrals based on normal distributions (Butler and Moffitt, 1982). A similar method has been applied before in survival analysis (Lillard, 1993).

3.3. Allowing for an ordered discrete educational choice

Usually education is available in more than two categories with a natural ordering of the alternative education levels. As a robustness check (see Section 4.2), we extend the standard model to account for this type of ordinal independent variable, where the starting point is, again, an index model with a single latent variable given as in (2). Assume there are $K$ education levels and define $D_i$ as the indicator of education that takes value $k$ if the individual has reached education level $k$:

\[ D_i = k \quad \text{if } \mu_{k-1} < D_i^* \le \mu_k \tag{13} \]

where $\mu_0 = -\infty$ and $\mu_K = \infty$. Then, assuming normally distributed $\nu^D$, we have an ordered probit model with $(K - 1)$ additional threshold parameters, $\mu_k$. Each education level now has a corresponding potential Gompertz hazard $\lambda^{(k)}$, that depends on exogenous characteristics $X^Y$ and on the unobserved latent ability, $\theta$, i.e.,

\[ \lambda^{(k)}(t \mid X^Y, \theta) = \exp\left(a_k t + \beta_k X_i^Y + \alpha_k \theta_i\right) \tag{14} \]

3.4. Disentangling the effects of ability and education

At the individual level, the main estimate of interest is the survival difference across the two educational levels, $S^{(1)}(t) - S^{(0)}(t)$, where $S^{(1)}(t)$ denotes the probability of surviving up to age $t$ for individuals with at least secondary education ($D = 1$), and $S^{(0)}(t)$ is the probability of surviving up to age $t$ for those with primary school only ($D = 0$). We are interested in the expected value of this difference for a given (sub)population. In the sample, the difference in the Kaplan–Meier survival curves is the unconditional survival difference between the two levels of educational attainment, $E[S^{(1)}(t) - S^{(0)}(t)]$. This unconditional difference can be interpreted as the association between education and mortality.

Here we are interested to what extent this association is driven by cognitive ability and other control variables. Using the estimated parameters, we define the conditional survival difference between the two levels of educational attainment, where conditioning is based on cognitive ability and the other control variables, as follows:

\[ \int E\left[ S^{(1)}(t) - S^{(0)}(t) \mid X = x, \theta = c \right] dF_{X,\theta}(x, c) \tag{15} \]

where $X$ are the covariates, and $\theta$ is the value of latent cognitive ability. We integrate over the joint distribution of the covariates and latent ability, $F_{X,\theta}(x, c)$.16 Note that these conditional survival differences are conditional on surviving to the initial age, which is 55 in our case.

Unfortunately, the integrals cannot be solved analytically, as the dimension of the covariates $X$ is too large. Another issue is that the comparison of the survival functions involves the counterfactual of surviving with another education level. Hence in order to illustrate the conditional survival differences we resort to simulation.17 For each education level we simulate the survival of 10,000 individuals. To each individual we assign observed characteristics based on the empirical distribution in the sample. The simulation procedure consists of four steps: […]

For the ordinal education measure the procedure is very similar. We have three potential hazards and three possible survival functions, one corresponding to each educational level. Although there are more possibilities now to compare the educational groups, we choose to focus on two binary comparisons of the particular educational level to the educational level directly preceding it. Hence, we estimate two different conditional survival differences: (i) lower vocational education compared to primary education only and (ii) at least general secondary education compared to lower vocational education.
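The integration over latent ability behind a conditional survival comparison can be sketched numerically. The code below is a minimal illustration under stated assumptions, not the authors' estimation or simulation code: it uses a fixed five-point Gauss–Hermite rule for a standard normal latent ability (the number of quadrature nodes used in the paper is not stated), a Gompertz hazard of the form exp(a t + xb + alpha theta), a single covariate index `xb` instead of a full covariate distribution, and hypothetical parameter values. Survival is averaged over latent ability at age t and at the entry age t0, and the ratio gives survival among those alive at t0.

```python
import math

# Five-point Gauss-Hermite rule for a standard normal variable:
# E[g(theta)] is approximated by sum(w * g(x)); the weights sum to 1.
_SQRT10 = math.sqrt(10.0)
_X1 = math.sqrt(5.0 - _SQRT10)
_X2 = math.sqrt(5.0 + _SQRT10)
_W0 = 8.0 / 15.0
_W1 = (7.0 + 2.0 * _SQRT10) / 60.0
_W2 = (7.0 - 2.0 * _SQRT10) / 60.0
NODES = (-_X2, -_X1, 0.0, _X1, _X2)
WEIGHTS = (_W2, _W1, _W0, _W1, _W2)


def normal_expectation(g):
    """Approximate E[g(theta)] for theta ~ N(0, 1)."""
    return sum(w * g(x) for x, w in zip(NODES, WEIGHTS))


def gompertz_survival(t, a, xb, alpha, theta):
    """S(t | X, theta) for the hazard exp(a*t + xb + alpha*theta), a != 0."""
    return math.exp(-math.exp(xb + alpha * theta) * (math.exp(a * t) - 1.0) / a)


def survival_among_entrants(t, t0, a, xb, alpha):
    """P(T > t | T > t0): ratio of theta-averaged survival at t and at t0."""
    numerator = normal_expectation(lambda th: gompertz_survival(t, a, xb, alpha, th))
    denominator = normal_expectation(lambda th: gompertz_survival(t0, a, xb, alpha, th))
    return numerator / denominator


# Hypothetical parameters for two education levels (illustration only).
high = dict(a=0.09, xb=-9.8, alpha=-0.2)
low = dict(a=0.09, xb=-9.5, alpha=-0.3)
difference = survival_among_entrants(75, 55, **high) - survival_among_entrants(75, 55, **low)
print(round(difference, 3))
```

Taking the ratio of the two averaged survival probabilities, rather than averaging the individual ratios, reflects that the distribution of latent ability among survivors to t0 differs from its distribution in the full population.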
4. Results
Table 2
Duration model – binary education variable, two measurements for ability.
[…] Fig. 5, which shows that cognitive ability explains the largest part of the selection effect. In fact, selection on other observable factors is even negative between ages 60 and 70. We have to emphasize, […]

18 These outcomes are reasonably close to the gender-education specific estimated […]

Fig. 4. Decomposition of unconditional difference in the Kaplan–Meier survival function into conditional differences and a selection effect based on observed characteristics and cognitive ability, with binary education variable and two measurements for cognitive ability. (Series: conditional survival difference with 90% upper and lower bound; selection effect. Horizontal axis: age, 55 to 75.)
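The decomposition in Fig. 4 starts from the raw difference between two Kaplan–Meier curves. The sketch below shows one way to compute such a curve with delayed entry (left truncation) and then split the raw difference, assuming the selection effect is simply the remainder after subtracting a model-based conditional difference. The data, the function names, and the value used for the conditional difference are all hypothetical.

```python
def kaplan_meier(entry_ages, exit_ages, died):
    """Kaplan-Meier survival curve with delayed entry (left truncation).

    Returns (age, survival) pairs at each observed death age.
    A person is at risk at age t if entry_age < t <= exit_age.
    """
    death_ages = sorted({x for x, d in zip(exit_ages, died) if d})
    survival, curve = 1.0, []
    for t in death_ages:
        at_risk = sum(1 for e, x in zip(entry_ages, exit_ages) if e < t <= x)
        deaths = sum(1 for e, x, d in zip(entry_ages, exit_ages, died) if d and x == t and e < t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve, age):
    """Step-function value of the Kaplan-Meier curve at a given age."""
    value = 1.0
    for t, s in curve:
        if t <= age:
            value = s
    return value


# Hypothetical toy data: everyone enters follow-up at age 55.
low = kaplan_meier([55] * 5, [60, 66, 70, 75, 75], [True, True, True, False, False])
high = kaplan_meier([55] * 5, [68, 75, 75, 75, 75], [True, False, False, False, False])

unconditional = survival_at(high, 75) - survival_at(low, 75)
conditional = 0.10  # hypothetical model-based conditional difference at age 75
selection = unconditional - conditional
print(round(unconditional, 2), round(selection, 2))
```

With real data the conditional difference would come from the estimated structural model rather than a fixed number.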
coefficient estimates of the exogenous variables are very similar to the ones presented for the binary educational variable.21 Fig. 6 presents the conditional survival differences for the three different educational levels. It is clear that there is a large, but insignificant, conditional survival difference between lower vocational school (level 2) and primary school (level 1). At age 75, those who only attended primary school are around four percentage points more likely to die than those who attended lower vocational school. The conditional survival difference between general secondary school and lower vocational school is practically zero.

If we decompose the unconditional survival differences between the three educational groups into a conditional survival difference and a selection effect, we obtain Fig. 7. This graph shows that the conditional survival difference between primary and vocational education is positive and becomes larger than the selection effect from age 70 onwards, in line with the findings of the dichotomous indicator for education. The conditional survival difference between vocational and higher education is negligible. Taken together, Figs. 6 and 7 clearly indicate that the largest difference is between those having finished primary school and those beyond primary school, such that the dichotomization in the previous subsection seems justified.

While mortality is an objective, and in some sense ‘the ultimate’, health outcome, the influence of education and cognitive ability may differ depending on the health outcome used. In the 1993 wave of our Brabant survey, hence around age 53 for our

21 All results not presented and the details of the models used in this section are available upon request.
Fig. 5. Decomposition of observed difference in the Kaplan–Meier survival function into conditional differences and a selection effect due to observed characteristics and cognitive ability, and other selection effects based on observed characteristics only, with binary education variable and two measurements for cognitive ability (with 90% confidence intervals, below). (Series: conditional survival difference; selection effect (other); selection effect (cognitive skills). Horizontal axis: age, 55 to 75.)

Fig. 7. Decomposition of observed difference in the Kaplan–Meier survival function into conditional differences and a selection effect based on observed characteristics and cognitive ability, with ordinal education variable and two measurements for cognitive ability (with 90% confidence intervals: lower left primary to vocational education; lower right vocational to higher education). (Horizontal axis: age.)
[…] Fig. 8. This suggests that the relative contributions of education and selection effects may well differ across objective health measures and the subjective health measures that are commonly used in the literature.

[…] ran all models separately for males and females. Strong disparities in survival across educational groups exist for both males and females. This can be deduced from Fig. 9 where the height of […]
Fig. 9. Decomposition of observed difference in the Kaplan–Meier survival function into conditional differences and a selection effect based on observed characteristics and cognitive ability, for males (left) and females (right). (Horizontal axis: age, 55 to 75.)
starts in 1995. This could lead to an attrition bias, if attrition is non-random. Unfortunately, we do not have access to the original data files such that we cannot investigate attrition directly. However, Hartog (1989) investigated the non-response for the 1983 survey and found no attrition bias in a wage analysis.22 Since the sample in 1983 has been shown to be representative, we reran all analyses on just the respondents that were observed in 1983 and found no substantial changes in the results. This suggests that selective attrition does not affect our results.

The data contain information about children from different years of birth. Most of those born in earlier years had to repeat a class and their average cognitive skills are lower. This could be a potential source of selection, if staying back was due to low cognitive ability, or, worse, to health reasons. We ran a robustness check in which we excluded individuals born in 1937, 1938 and 1941. The results show that the conditional survival difference hardly deviates from the base model; if anything the difference even becomes larger. Finally, we included an indicator for the year of measurement of the education level (1983 or 1993), and again did not find any deviation. All these results are summarized in Fig. 11 and the detailed estimation results are available upon request.

Finally, we varied the observed characteristics in the model. First by including additional variables among the exogenous variables such as family size, number of children, additional school characteristics (e.g. whether restricted to girls, restricted to boys, or mixed), and whether both parents were still alive. These variables were not statistically significant in any of the models, and did not alter the results. Second, we also checked robustness to excluding individuals with item non-response on some of the observed characteristics, in which case too the results remain similar.

5. Discussion

This paper estimates to what extent survival differences across educational groups are due to a ‘selection effect’ based on cognitive ability and other background variables. We extend the structural equation model of Conti et al. (2010) to allow for a duration dependent variable and an ordinal educational choice, and estimate the model on the basis of a Dutch cohort born around 1940 for which we observe mortality between ages 55 and 75. The most important conclusion is that the selection effect based on cognitive ability is responsible for around half of the raw differences in survival. Yet,

22 Following Hartog (1989) we investigated whether the attrition between 1993 and 1995 was related to observed characteristics. Literally all explanatory variables including education, family background, and intelligence were not related to attrition. The only exception was self-reported health; a worse health status increased the probability of attrition between 1993 and 1995.
References

Abbring, J., van den Berg, G., 2003. The nonparametric identification of treatment effects in duration models. Econometrica 71 (5), 1491–1517.
Albouy, V., Lequien, L., 2008. Does compulsory education lower mortality? Journal of Health Economics 28 (1), 155–168.
Arendt, J.N., 2005. Does education cause better health? A panel data analysis using school reforms for identification. Economics of Education Review 24 (2), 149–160.
Auld, M.C., Sidhu, N., 2005. Schooling, cognitive ability and health. Health Economics 14 (10), 1019–1034.
Bago d’Uva, T., O’Donnell, O., van Doorslaer, E., 2008. Differential health reporting by education level and its impact on the measurement of health inequalities among older Europeans. International Journal of Epidemiology 37 (6), 1375–1383.
Batty, G.D., Deary, I.J., Gottfredson, L.S., 2007. Premorbid (early life) IQ and later mortality risk: systematic review. Annals of Epidemiology 17 (4), 278–288.
Batty, G.D., Wennerstad, K.M., Smith, G.D., Gunnell, D., Deary, I., Tynelius, P., Rasmussen, F., 2009. IQ in early adulthood and mortality by middle age: cohort study of 1 million Swedish men. Epidemiology 20 (1), 100–109.
Behrman, J.R., Rosenzweig, M.R., 2004. Returns to birthweight. Review of Economics and Statistics 86 (2), 586–601.
Borghans, L., Golsteyn, B.H.H., Heckman, J.J., Humphries, J.E., 2011. IQ, Achievement, and Personality. University of Maastricht (Unpublished manuscript).
Braakmann, N., 2011. The causal relationship between education, health and health related behaviour: evidence from a natural experiment in England. Journal of Health Economics 30 (4), 753–763.
Brinch, C.N., Galloway, T.A., 2012. Schooling in adolescence raises IQ scores. Proceedings of the National Academy of Sciences of the United States of America 109 (2), 425–430.
Butler, J.S., Moffitt, R., 1982. A computationally efficient quadrature procedure for the one-factor multinomial probit model. Econometrica 50 (3), 761–764.
Carneiro, P., Crawford, C., Goodman, A., 2007. The impact of early cognitive and non-cognitive skills on later outcomes. In: CEE DP 92.
Carpenter, P.A., Just, M.A., Shell, P., 1990. What one intelligence test measures: a theoretical account of processing in the Raven progressive matrices test. Psychological Review 97 (3), 404–431.
Case, A., Fertig, A., Paxson, C., 2005. The lasting impact of childhood health and circumstance. Journal of Health Economics 24 (2), 365–389.
Centraal Bureau voor de Statistiek, CBS, 2008. Hoogopgeleiden leven lang en gezond. In: Gezondheid en zorg in cijfers. CBS.
Clark, D., Royer, H., 2013. The Effect of Education on Adult Mortality and Health: Evidence from Britain. American Economic Review 103 (6), 2087–2120.
Cockx, B., Picchio, M., 2012. Are short-lived jobs stepping stones to long-lasting jobs? Oxford Bulletin of Economics and Statistics 74 (5), 646–675.
Conti, G., Heckman, J.J., 2010. Understanding the early origins of the education-health gradient: a framework that can also be applied to analyze gene-environment interactions. Perspectives on Psychological Science 5 (5), 585–605.
Conti, G., Heckman, J.J., Urzua, S., 2010. The education-health gradient. American Economic Review Papers and Proceedings 100, 234–238.
Conti, G., Heckman, J.J., Urzua, S., 2011. Early Endowments, Education, and Health. University of Chicago, Department of Economics (Unpublished manuscript).
Cramer, J.S., 2012. Childhood Intelligence and Adult Mortality, and the Role of Socio-Economic Status. Tinbergen Institute Discussion Paper 2012-070/4.
Cutler, D., Lleras-Muney, A., 2008. Education and health: evaluating theories and evidence. In: House, J.S., Schoeni, R.F., Kaplan, G.A., Harold, P. (Eds.), Making Americans Healthier: Social and Economic Policy as Health Policy. Russell Sage Foundation, New York.
Deary, I., 2008. Why do intelligent people live longer? Nature 456, 175–176.
Deary, I., Johnson, W., 2010. Intelligence and education: causal perceptions drive analytic processes and therefore conclusions. International Journal of Epidemiology 39, 1362–1369.
Dronkers, J., 2002. Bestaat er een samenhang tussen echtscheiding en intelligentie? Mens & Maatschappij 77 (1), 25–42.
Elbers, C., Lanjouw, J.O., Lanjouw, P., 2003. Micro-level estimation of poverty and inequality. Econometrica 71 (1), 355–364.
Elbers, C., Ridder, G., 1982. True and spurious duration dependence: the identifiability of proportional hazards models. Review of Economic Studies 49, 403–410.
Fuchs, V.R., 1982. Time preference and health: an exploratory study. In: Fuchs, V. (Ed.), Economic Aspects of Health. The University of Chicago Press, Chicago.
Gavrilov, L.A., Gavrilova, N.S., 1991. The Biology of Life Span: A Quantitative Approach. Harwood Academic Publisher, New York, ISBN 3-7186-4983-7.
Gottfredson, L., 1997. Mainstream science on intelligence: an editorial with 52 signatories, history, and bibliography. Intelligence 24, 13–23.
Gottfredson, L., 2004. Intelligence: is the epidemiologists’ elusive fundamental cause of social class inequalities in health? Journal of Personality and Social Psychology 86 (1), 174–199.
Hartog, J., 1989. Survey non-response in relation to ability and family background: structure and effects on estimated earnings functions. Applied Economics 21, 387–395.
Hartog, J., Pfann, G., 1985. Vervolgonderzoek Noord-Brabantse zesdeklassers 1983, Verantwoording van hernieuwde gegevensverzameling onder Noordbrabantse zesdeklassers van 1952. University of Amsterdam, Amsterdam.
Hartog, J., Oosterbeek, H., 1998. Health, wealth and happiness: why pursue a higher education? Economics of Education Review 17 (3), 245–256.
Hartog, J., Jonker, N., Pfann, G., 2002. Documentatie Brabant data. Netherlands Institute for Scientific Information Services, Amsterdam.
Heckman, J.J., Humphries, J.E., Veramendi, G., Urzua, S., 2014. Education Health and Wages. NBER Working Paper No. 19971.
Heckman, J.J., Singer, B., 1984. A method for minimizing the impact of distributional assumptions in econometric models for duration data. Econometrica 52, 271–320.
Kaestner, R., Callison, K., 2011. Adolescent cognitive and non-cognitive correlates of health. Journal of Human Capital 5 (1), 29–69.
Lillard, L.A., 1993. Simultaneous equations for hazards: marriage duration and fertility timing. Journal of Econometrics 56, 189–217.
Lleras-Muney, A., 2005. The relationship between education and adult mortality in the United States. Review of Economic Studies 72, 189–221.
Mathijssen, M.A.J.M., Sonnemans, G.J.M., 1958. Schoolkeuze en schoolsucces bij VHMO en ULO in Noord-Brabant. Zwijssen, Tilburg.
Mazumder, B., 2008. Does education improve health: a reexamination of the evidence from compulsory schooling laws. Economic Perspectives 33 (2).
Mazumder, B., 2012. The effects of education on health and mortality. Nordic Economic Policy Review 2012, 261–301.
Meghir, C., Palme, M., Simeonova, E., 2013. Education, Cognition and Health: Evidence from a Social Experiment. NBER Working Paper 19002.
Murasko, J.E., 2007. A lifecourse study on education and health: the relationship between childhood psychosocial resources and outcomes in adolescence and young adulthood. Social Science Research 36 (4), 1348–1370.
Oreopoulos, P., 2006. Estimating average and local average treatment effects of education when compulsory school laws really matter. American Economic Review 96 (1), 152–175.
Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T., 1993. Numerical Recipes in C: The Art of Scientific Computing, 2nd ed. Cambridge UP, Cambridge.
Raven, J.C., 1958. Mill Hill Vocabulary Scale, 2nd ed. H.K. Lewis, London.
Ross, C.E., Wu, C.-L., 1995. The links between education and health. American Sociological Review 60 (5), 719–745.
Savelyev, P.A., 2012. Conscientiousness, Education, and Longevity of High-Ability Individuals. Vanderbilt University, Department of Economics (Unpublished manuscript).
Spearman, C., 1927. The Abilities of Man: Their Nature and Measurement. Macmillan, New York.
Van Kippersluis, H., O’Donnell, O., van Doorslaer, E., 2011. Long run returns to education: does schooling lead to an extended old age? Journal of Human Resources 46 (4), 695–721.
Van Praag, M., 1992. Zomaar een dataset: ‘Noordbrabantse zesde klassers’, Een presentatie van 15 jaar onderzoek. University of Amsterdam, Amsterdam.
Van Schellen, M., Nieuwbeerta, P., 2007. De invloed van de militaire dienstplicht op de ontwikkeling van crimineel gedrag. Mens & Maatschappij 82 (1), 5–27.
Journal of Health Economics 42 (2015) 44–63
Article info

Article history:
Received 30 July 2014
Received in revised form 12 March 2015
Accepted 12 March 2015
Available online 20 March 2015

JEL classification: C93, I12, O12, O13, Q53

Keywords: Household air pollution; Energy access; Technology adoption; Development economics; Biomass fuel

Abstract

Today, almost 3 billion people in developing countries rely on biomass as primary cooking fuel, with profound negative implications for their well-being. Improved biomass cooking stoves are alleged to counteract these adverse effects. This paper evaluates take-up and impacts of low-cost improved stoves through a randomized controlled trial. The randomized stove is primarily designed to curb firewood consumption, but not smoke emissions. Nonetheless, we find considerable effects not only on firewood consumption, but also on smoke exposure and, consequently, smoke-related disease symptoms. The reduced smoke exposure results from behavioural changes in terms of increased outside cooking and a reduction in cooking time. We conclude that in order to assess the effectiveness of a technology-oriented intervention, it is critical to account not only for the incidence of technology adoption (the extensive margin) but also for the way the new technology is used (the intensive margin).

© 2015 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhealeco.2015.03.006
0167-6296/© 2015 Elsevier B.V. All rights reserved.
G. Bensch, J. Peters / Journal of Health Economics 42 (2015) 44–63 45
strong implications for smoke emissions and thus cleanliness. It is hence still a matter of ongoing debate under which conditions ICSs can be considered as clean, also compared to modern fuels like electricity and gas.1

This paper presents findings from a Randomized Controlled Trial (RCT) among 253 households in twelve villages in Senegal to analyze behavioural responses and impacts following the introduction of an ICS. The ICS, which was assigned free of charge, is a low-cost and maintenance-free portable clay-metal stove. It is produced in a fairly standardized way by local manufacturers (potters and whitesmiths) in their workshops and is marketed at a retail price of around 10 US$. The stove has an expected life span of one to three years before it deteriorates and has to be replaced. It has already been widely used in large governmental dissemination programmes in urban and rural Africa. As such, this is the first study to assess a type of ICS whose design is geared towards fuel savings, ease of use, affordability and, hence, large-scale applicability, but one that lacks specific health-conducive technical features such as a cleaner burning process or a chimney. Without further changes in cooking behaviour, the reduction in particulate matter emissions that the randomized ICS can technically achieve would probably be insufficient to affect the health of users. This is because the non-linear particulate exposure-response relation found in medical research suggests that large reductions in smoke exposure are required in order to ensure positive health effects (see, for example, Ezzati and Kammen, 2001; Pope et al., 2011; Burnett et al., 2014).

The main impact indicators of this study are firewood consumption, time use, respiratory disease symptoms and eye infections. They are supplemented by various indicators along the results chain of the intervention with regard to cooking behaviour. Effects on these indicators were assessed 12 months after randomization following a baseline study in November 2009. The behavioural changes we look at (firewood usage patterns and smoke exposure) can be expected to materialize already in the first few months after ICS adoption. The changes in these indicators we observe after one year of ICS ownership therefore reflect impacts to be expected in the long run, as long as people continue to use the ICS and replace it by a new one once it is not functional anymore. The third wave of interviews in March 2013 is used to track the longer-term usage behaviour and the stove's durability at the end of what technically is the life span of the ICS.

A couple of factors contribute to a high external validity of this RCT for the African context: the study was implemented in an unobtrusive way in order to ensure that we observe real-world cooking behaviour. It was designed and conducted in cooperation with the ICS dissemination programme of the Government of Senegal, so that an upscaling of the intervention under real-world conditions would be possible. Furthermore, the dominating cooking fuel in our study area is firewood, which is also the case in most other African countries (Bonjour et al., 2013). Firewood scarcity in our study region and, consequently, the incentive to use more efficient stoves is pronounced and comparable to other dry areas in non-equatorial Africa.2

We find that the ICSs are taken up by virtually all households and intensively used, even after three and a half years. For the most part, people only give up using the stove when it is not functional anymore and not because they lose interest in using it. We furthermore observe substantial effects on firewood consumption, which confirm savings rates determined in lab tests. In addition, we find a decrease in early indicators for respiratory diseases and eye infections. These effects on people's health status cannot be explained only by the take-up of the new ICS and the firewood savings, but rather by an additional reduction in smoke exposure due to more outside cooking and a reduced cooking time that is enabled by the new stove.

Our findings add to the existing body of evidence on ICS impacts, which so far is mainly represented by two RCTs: the RESPIRE study in Guatemala (see, for example, Smith-Sivertsen et al., 2004, 2009; Díaz et al., 2007; Smith et al., 2011) and a study conducted by J-PAL in India (Hanna et al., 2012).3 Both studies used stationary chimney ICSs that are installed in the user's kitchen, with the difference that the RESPIRE stoves are of higher quality, thus more expensive (100–150 US$), and require less maintenance than those used in the Hanna et al. (2012) study. A more detailed comparison of technical features of the ICSs used in the different studies is provided in Appendix A. While the RESPIRE study detects a substantial reduction in household air pollution and a reduction in the risk of respiratory disease symptoms and eye problems, Hanna et al. observe reductions in smoke inhalation only in the first year but not over a four-year time horizon. This is mainly driven by maintenance being more and more neglected over time, which leads to a weak performance and low usage rates after some years.

Against this background, our paper is the first to add evidence on how people use an adapted and simple ICS in an unsupervised setup that is deemed to represent a more realistic study environment than the highly controlled medical trials conducted for RESPIRE. Our study contributes to the literature by providing compelling evidence that such a simpler and cheaper ICS can actually also trigger substantial impacts, provided that cooking behaviour also changes. Conceptually, these results confirm the findings of Hanna et al.: looking at the technical features of an ICS is not enough, since the real-world behaviour of users strongly co-determines the results. Unlike Hanna et al., though, we find that behavioural adaptations to a simple ICS may trigger sizable positive health effects.

These differences in findings of the two studies show the potentials of disseminating ICSs that are adapted to the target population and that facilitate cleaner cooking. The stove used in the Hanna et al. study requires regular maintenance, for which people in turn need to be trained (which not all of them were), while the stove randomized for our study is maintenance-free. Furthermore, our portable stove is well adapted to the local cooking habits, whereas the stove distributed in Hanna et al. interferes more with local cooking habits by requiring people to cook inside, which they are not accustomed to. In this sense, the stove in our study increases the number of choice variables for the users, while the one used in Hanna et al. decreases it.

In this broader behavioural context, our study adds to a nascent strand in the health economics literature studying adoption behaviour of households for health-relevant technologies and goods such as bednets (Cohen and Dupas, 2010; Tarozzi et al., 2014), point-of-use drinking water disinfectants (Luby et al., 2008; Kremer et al., 2009), deworming drugs (Kremer and Miguel, 2007), condoms (e.g. Kamali et al., 2003), or a range of such technologies (Wendland et al., 2015). More specifically, it demonstrates

1 See World Bank (2011) for a more detailed discussion of different types of improved cooking stoves and Martin et al. (2011) for a recent overview on the improved stoves and air pollution policy debate.
2 External validity and potential challenges to it are discussed further in Section 3.5 and Appendix D.
3 In addition to these two studies, further evidence with mixed results exists for China (Mueller et al., 2013; Yu, 2011), Mexico (Masera et al., 2007) and urban Senegal (Bensch and Peters, 2013). Burwen and Levine (2012) conducted an RCT in Ghana using a very simple mud stove. As a major difference to the present study as well as the RESPIRE and the J-PAL study, tests in a controlled field lab setting already find that the stove does not perform better than the traditional ones. The poor performance is also reflected in low usage rates after a few months.
that the analysis of technology adoption and related promotion programmes should encompass both a technical and an economic perspective, not only an assessment of the mechanical performance. This is in line with the concept of intensive and extensive margins of behaviour that has recently been brought into the debate on public health interventions (see Dupas, 2011): it is not only the mere technology adoption that counts (extensive margin). Rather, the full effect can only be determined if the way the new technology is used, the intensive margin, is accounted for as well.

The remainder of the paper is organized as follows: Section 2 reviews the country and intervention background and outlines the research design including the identification strategy. Section 3 presents the study results for all our impact indicators, and Section 4 concludes.

2. Programme background and methodological approach

2.1. Improved stove dissemination and cooking fuels in Senegal

Despite its seeming superiority to traditional biomass cooking, the ICS technology has not made significant inroads into African households. There may be various reasons for this, which are comprehensively discussed in Rehfuess et al. (2014) and Lewis and Pattanayak (2012). One explanation relevant for the rural setting is that firewood can typically be collected for free so that most of the benefits of ICS usage are not monetary ones. This makes it more difficult for households to finance the investment given liquidity and credit constraints. On the supply side, the stove design may fail to meet user needs in preparing local dishes with available fuels and cooking utensils. Earlier programmes in various African countries relied on subsidies for ICS production or distributed them for free. Most of these programmes did not succeed, however, in triggering sustainable ICS usage. Based on such experience, development practitioners frequently argue that people do not appreciate and use ICSs that they receive as a gift and, consequently, reject the option of distributing ICSs for free (Barnes et al., 1994; Martin et al., 2011).

This is also the spirit underlying the ICS dissemination programme Foyer Amélioré au Sénégal (FASEN), which is implemented by the Senegalese Ministry of Energy in cooperation with Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ).4 In contrast to earlier ICS interventions, FASEN focuses on establishing a sustainable and autonomous market for ICSs by testing performance, training producers and distributors, and supporting communication and promotional campaigns. Similar to programmes in other countries, FASEN has so far concentrated its ICS dissemination on charcoal ICSs in urban areas.

The main ICS type disseminated by FASEN since 2006, the Jambaar, is also used in the present RCT. It is a portable single-pot stove with a fired clay combustion centre enclosed by a metal casing. Owing to basic design improvements of the Jambaar compared to traditional stoves, the woodfuel burns more efficiently and the heat is better conserved and focused towards the cooking pot. Both charcoal and firewood models exist. We chose the firewood Jambaar for our experiment as firewood is the dominant fuel in rural Senegal with 89% of rural households using it as their primary cooking fuel (ANSD, 2006). In rural areas ICSs have not been available so far. Stove types used here are either three-stone stoves available at zero cost or traditional metal stoves and open fire grills that can be bought for between 500 and 2500 CFA Francs, which is equivalent to 1–5 US$ (see Appendix B for pictures of the ICS and other stove types used in the study region). The GIZ programme intends to expand its activities to rural areas and expects the price of the Jambaar for the rural market to be around 4000–5000 CFA Francs (8–11 US$), which is well below the prices of the more sophisticated ICS technologies widely disseminated in Latin America or Asia.

Cooking fuels are an issue of major importance in the daily life of Senegalese households. Households have the custom to cook inside, which leads to a higher exposure to smoke emissions than outside cooking. WHO (2009) holds household air pollution induced by solid fuel usage for cooking accountable for 6300 premature deaths every year in Senegal alone. Apart from agricultural land clearance, wood usage for cooking purposes is moreover the most important driving force of ongoing deforestation in the mostly arid and Sahelian country (see WEC/FAO, 1999; Tappan et al., 2004; FAO, 2005a,b). A constant population growth of 2.6% per year puts further pressure on fuelwood resources. As a consequence, households face an increasing scarcity of fuelwoods: firewood collection is becoming increasingly time-consuming, while fuelwood prices are rising. This circumstance applies particularly to the Bassin Arachidier, the study area of this evaluation, situated some 200 km southeast of Dakar.

2.2. Impact indicators

The first impact indicator of our study is the household consumption of firewood. This indicator aggregates each dish cooked in a typical week, with a dish being one component of a meal that is prepared on a separate stove, for example rice and sauce. We thereby account for the fact that several stoves may be used simultaneously for the preparation of a single meal. The rationale for this indicator is that a reduction in firewood consumption not only has immediate implications for wood scarcity and deforestation pressures, but is also a strong intermediate indicator for other ultimately relevant impacts such as health and time use.

Impacts on health and time use are examined directly. We investigate the indicator time spent by household members on firewood collection and cooking and the prevalence of diseases that are potentially related to firewood usage. For this purpose, we look at symptoms that are likely to be affected in the short term after smoke emissions are reduced; these are captured by the indicators household member with symptoms of respiratory diseases and household member with eye problems. We examine these indicators both on the household level and the household member level. For respiratory diseases, these symptoms are cough, asthma, or difficulty in breathing. They indicate acute respiratory infections and chronic obstructive pulmonary diseases, which are the leading causes of mortality and diseases induced by exposure to air pollution from solid fuels (Ezzati and Kammen, 2002). Exposure to particles could be detected as a causal agent of these and other serious respiratory diseases such as lung cancer or pneumonia (see Duflo et al., 2008b; Pattanayak and Pfaff, 2009).

Respiratory diseases and eye problems are elicited on a self-reporting basis: respondents are asked to give information on those household members who exhibited the symptoms of interest in the six months preceding the interviews. While such self-reported health indicators are sometimes viewed with concern because of potential measurement errors, the literature supports their application by highlighting the correlation with actual illnesses (see Idler and Benyamini, 1997; Miilunpalo et al., 1997; Peabody et al., 2006; Butrick et al., 2010). In particular, if specific symptoms are asked about precisely as was done in this study, respondents can be expected to report accurately. A deterioration in recall accuracy of reported morbidity as found in Das et al. (2012) and Kjellsson

4 GIZ provides technical assistance on behalf of the German Federal Ministry for Development and Economic Cooperation (BMZ) and is one of the largest bilateral development agencies in the world.
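The household firewood indicator of Section 2.2 combines the weighed firewood per dish with how often each stove application occurs in a typical week. The following is a minimal sketch of that aggregation, not the authors' code: the record layout (`StoveApplication`) and its field names are hypothetical, and the numbers are purely illustrative rather than study data.

```python
from dataclasses import dataclass


@dataclass
class StoveApplication:
    """One dish prepared on one stove (hypothetical record layout)."""
    stove_type: str         # e.g. "ICS" or "three-stone"
    firewood_kg: float      # weighed firewood for this dish (0 for other fuels)
    cooking_minutes: float  # self-reported cooking duration
    uses_per_week: float    # how often this application occurs in a typical week


def weekly_firewood_kg(applications):
    """Weekly household consumption: firewood per dish times weekly frequency."""
    return sum(a.firewood_kg * a.uses_per_week for a in applications)


def weekly_firewood_by_stove(applications):
    """Same aggregation, split by stove type."""
    totals = {}
    for a in applications:
        weekly = a.firewood_kg * a.uses_per_week
        totals[a.stove_type] = totals.get(a.stove_type, 0.0) + weekly
    return totals


# Illustrative values only, not data from the study.
day = [
    StoveApplication("ICS", 1.5, 60, 7),
    StoveApplication("three-stone", 2.0, 45, 7),
    StoveApplication("ICS", 1.0, 30, 3),
]
print(weekly_firewood_kg(day))
print(weekly_firewood_by_stove(day))
```

Keeping the dish as the unit of observation means that a meal cooked on two stoves at once contributes two records, which is the situation the indicator is designed to capture.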
et al. (2014) is a concern in this study but would only reduce the precision of our health estimates and not induce any bias.

To record firewood consumption and cooking time, the person responsible for cooking is asked to specify the number of people cooked for and the types of stoves used for every meal throughout a typical day. For each stove application, we then record the cooking duration and the cooking fuel type. In case of firewood, the cooking person is additionally asked to pile up the amount of firewood used for the respective stove application, which is then weighed with scales. In combination with information on the frequency with which the respective stoves are used throughout a typical week, this data serves to determine the weekly household consumption of firewood. Enumerators crosschecked stove usage as part of the interviews by verifying which stove was currently in use or had been used recently. The indicator time spent by household members on cooking aggregates the self-reported cooking duration for all meals of a typical day, whereas the time spent by household members on firewood collection aggregates the spells in which household members are occupied with gathering firewood in the course of a week.

Technically achievable savings rates for the Jambaar (referred to as ICS in the following) have already been determined in controlled cooking tests (CCTs), where a cooking person prepares the same meal on both a traditional stove and an improved stove in order to compare the woodfuel consumption of both stove types. However, the effective savings in real-life households might deviate from such laboratory field tests for various reasons summarized by Bensch and Peters (2013).5 The deficiencies of CCTs can be overcome by evaluating the woodfuel consumption based on a survey among a larger sample of households in which the diversity and dynamics of real-life cooking practices are captured. This is what is done in the present paper.

2.3. Identification strategy

We employ two approaches to estimate the impact of ICS usage in this experimental setup. The intention-to-treat effect (ITT) is obtained by simply comparing mean values of impact indicators for the treatment and control group, without accounting for non-compliance from households that were assigned to the treatment group but for some reason do not use the ICS. In our case, the ITT serves to estimate the effect of providing the ICS for free to households who do not yet own one. The average treatment effect on the treated (ATT), by contrast, accounts for non-usage in the treatment group and potential take-up in the control group and thereby serves to estimate the impact of effective ICS usage. For this purpose, instrumental variable (IV) estimations are applied with the random assignment into the treatment group as an instrument for the effective usage of the ICS. In our case, ITT and ATT are very similar given the high compliance rate in the treatment group and given that only one household in the control group acquired an ICS from another source. Although RCTs allow for a simple comparison of the impact indicators at the time of the follow-up, the precision of the estimates can be increased by controlling for other household characteristics that have been collected in a baseline survey. We therefore implement both the ITT and ATT approach with and without controlling for baseline household characteristics such as education and income using Ordinary Least Squares (OLS) regression.

In order to shed more light on how reductions in firewood consumption are induced by ICS usage, we also do an OLS regression on the individual dish level, additionally controlling for a set of potential dish- and meal-specific confounders such as the number of people cooked for. This dish-level regression has to be interpreted with some care, since, in spite of the random ICS assignment, the households that received a new stove can still choose whether to use the ICS or a traditional stove for the respective meal. This choice might then be driven by unobservable factors, which would distort the savings estimates if the unobservables are also correlated with firewood consumption.

Finally, we employ probit regressions on the health status of households and of individual household members. In principle, these estimations might as well suffer from some endogeneity induced by intra-household bargaining processes: healthier and more powerful women might bargain themselves out of cooking with the dirtier stove and into cooking with the cleaner ICS (see Pitt et al., 2006). This potentially leads to a spurious correlation between ICS ownership and improvements in the health status. In our context, though, this is very unlikely, since the assignment to the cooking duty does not seem to be a result of short-term negotiations, but is rather determined by cultural norms with one or two women per household being continuously responsible for cooking. Even if post-randomization selection processes occurred, they would be uncovered by the health indicators we use, because we observe both the people responsible for cooking and those who are not.

2.4. RCT design and implementation

The study design followed the guidelines on the implementation of RCTs provided in Duflo et al. (2008a). The first decision that had to be taken was the level on which to randomize the treatment: the village or the individual household. In the present case, it is sensible to randomize on the household level, since the decision about whether to adopt an ICS is taken in the household and not on a regional level. Furthermore, our impact indicators are measured on household level (or below). One reason to randomize on the village level instead of the household level would be to account for spillover effects. These are expected to be negligible, since the ICSs are only used by the households themselves and the penetration rate per village envisaged in this RCT is too low to affect, for example, local firewood supply.

The next decision regards the sample size, both in terms of households and villages. We determined the sample size based on a power calculation focusing on the indicator firewood savings. We approximated the relevant parameters ex-ante using the data collected for the quasi-experimental study presented in Bensch and Peters (2013). Taking into account these parameters and the probability of being assigned to the treatment group, we obtained a required sample size of 250 households spread across 12 villages (see Appendix C). We selected villages that are far away from GIZ-supported ICS producers in order to avoid treatment contamination that might occur if households randomly assigned to the control group obtain an ICS independently.6 Furthermore, we selected the

5 For example, the tests frequently concentrate on the main meal only and they cannot account for the fact that households might prepare more hot meals because cooking becomes cheaper due to the higher efficiency of the ICS (or less exhausting in terms of firewood collection), a phenomenon referred to as the rebound effect in the energy economics literature (see Frondel et al., 2008; Herring et al., 2009).
6 Two further channels exist through which the treatment may be contaminated. First, treatment households may share their stoves with control households. This did not occur. Second, the two household groups may exchange information about determinants of respiratory health, for example. Yet, the treatment did not involve any awareness raising and cooking is also a rather private issue, as stated in open interviews, that seems less of a talking point in women's conversations. As a consequence, only
Nove
ember 2009 few days 1, 2 and 7 months November March 2013
2
after th
he lottery after allocation
o 2010
12 villages from the target region of a planned GIZ rural elec- as a fuel-saving device, which requires a few precautions. House-
trification intervention so that we could introduce the study as holds were, for example, informed that, in contrast to open fires
preparatory field work related to the electrification project and, for which people typically use large branches or even trunks, the
thereby, reduce attention paid to the randomization. firewood has to be chopped first in order to fit the relatively small
In November 2009, we conducted the baseline survey among fuel feed entrance of the ICS. In line with what real-world users are
253 randomly sampled households (see Fig. 1 for the timeline of told about this type of ICS, households were briefly informed about
the RCT). Information was gathered using a structured question- the convenience co-benefits of fuel savings, which are a quicker
naire covering the socio-economic dimensions that characterize cooking process, less smoke and a cleaner kitchen (if cooking is
the relevant living conditions of the households. Since the study done indoors). No information about potential repercussions on
also served as a baseline in the context of the envisaged electrifi- the health status was provided. The complete instructions on the
cation intervention – a solar home system dissemination project functioning and proper usage of the ICS and related information
– a particular focus of the questionnaire was on energy sources provided are presented in Appendix E.
(including electricity) and energy services (including cooking). Between the baseline and the follow-up phase, local community
Consequently, the cooking-related parts of the interviews did workers conducted three preparatory visits in the survey villages
not draw particular attention. This is important to avoid auspices for the planned electrification project. It is worth highlighting that
biases and Hawthorne effects (see Appendix D). We complemen- the electrification intervention was not implemented in any of
tarily gathered qualitative information in focus group discussions the sampled villages before the end of this study. Furthermore,
and semi-structured interviews with key informants such as electricity is virtually never used for cooking in rural Africa, in par-
women’s groups, stove and charcoal producers, teachers, regional ticular not in the case of solar home systems whose capacity is
administrators, and village chiefs. not sufficient for cooking purposes. Once in the field, the commu-
The random assignment was put into practice through a lot- nity workers additionally checked if ICS households were using the
tery directly following the baseline interviews. We presented the ICS and whether they had encountered technical problems (which
prizes of this lottery, an ICS or a 5 kg bag of rice, as recompense were in any case very rare). Again, no further treatment in terms of
for participation in the baseline study. Participants were therefore awareness raising or usage encouragement was undertaken. While
not aware of being part of an experiment. The connotation of the ICS receivers as the treatment group and the bag of rice receivers as a control group was not communicated to the participants.7 In order to increase trust in the fairness of the lottery, we conducted it in each of the villages directly after completing the interviews and informed the households immediately about which recompense they would get. Hence, we applied simple stratified randomization with the villages as the stratification criterion. Of the 253 households interviewed for the baseline, 98 received an ICS and 155 a bag of rice. The rice and ICSs were distributed within three days of the baseline interview. The households that were drawn to get an ICS received a brief 15-min introduction on how to use the stove.8 The ICS and rice bag distribution as well as the instruction were done by field workers who were involved in the preparation of the electrification project and who were visiting the village anyhow. No specific village gathering was organized. The ICS was presented
a few of the households were not yet making frequent use of their new stove one month after ICS allocation, by the time of the second visit virtually all ICS households cooked regularly on the ICS.9 For the follow-up phase at the end of 2010, the same structured questionnaire was used as in the baseline phase. Attrition was very low: only four households either could not be located or had moved out of the village, three in the control and one in the treatment group. None of the households refused to participate in the follow-up survey.
We excluded two groups of households from the analysis: four households with affiliated Quran schools, where usually between 50 and 150 students live and eat and which are therefore not comparable to family households, as well as households that prior to the study had already received improved stoves other than the ICS used in the RCT from urban relatives. These six treatment and ten control households cannot be expected to have bought another ICS in a non-RCT world and therefore do not represent the population of interest. They were originally included in the randomization only because they were a priori not clearly discernible and since we conducted the randomization on-site and directly after the survey. No further restrictions were made on who to include in the sample
minor contamination effects are conceivable that, furthermore, would rather lead to an underestimation of effects.
7 The average rice consumption per capita in Senegal is 84 kg per year (GAIN, 2011). Hence, the bag of rice received by the control group corresponds to 0.5% of annual rice consumption for the average household size in our sample and will most likely not affect any of our impact indicators that were measured one year after the distribution.
8 Because many other ICS types require more extensive maintenance and more usage instructions, one might think of these instructions as a treatment in its own right, which might be introduced as a random second treatment arm. For our ICS, this is however not the case. Given the simplicity in the use of the ICS and given that it is virtually maintenance free, additionally randomizing the instruction within the treatment group in our case would not make a difference.
9 It is not likely that the delayed take-up was triggered by the visits or in any way related to them. Instead, the visits revealed that, first, a few housewives travelled outside the village and therefore had not used the ICS so far. Second, some women needed to adapt to the quicker cooking with the ICS, which at the beginning created a feeling of insecurity. Third, some households were reluctant at the beginning as they wanted to preserve their ICS and used it only sparsely. Fourth, a few polygamous households needed some time to decide on who would use the ICS and when.
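The village-level lottery described on this page (simple stratified randomization with villages as strata) can be illustrated with a minimal sketch. The village names, household identifiers and the within-village share of ICS recipients are hypothetical; the text does not state how the share was fixed in each village, so `share_treated` is an assumption made purely for illustration.

```python
import random

def village_lottery(households_by_village, share_treated, seed=0):
    """Draw ICS recipients separately within each village (the stratum).

    share_treated is a hypothetical parameter, not a figure from the paper.
    """
    rng = random.Random(seed)
    assignment = {}
    for village, households in households_by_village.items():
        shuffled = list(households)
        rng.shuffle(shuffled)  # random order within the village
        n_treated = round(share_treated * len(shuffled))
        for position, household in enumerate(shuffled):
            assignment[household] = "ICS" if position < n_treated else "rice"
    return assignment

# Hypothetical example data: two villages with 5 and 10 households.
villages = {
    "village_A": [f"A{i}" for i in range(5)],
    "village_B": [f"B{i}" for i in range(10)],
}
assignment = village_lottery(villages, share_treated=0.4)
```

Because the draw is made village by village, every village contributes both ICS and rice households, which is why village dummies are later added to the regressions.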
G. Bensch, J. Peters / Journal of Health Economics 42 (2015) 44–63 49
Table 1
Baseline characteristics of randomly assigned ICS owners and non-owners.
Socio-economic characteristics
Household size 12.88 (5.55) 12.94 (5.82) 0.94
Family structure (%) 0.96
Extended family 77.8 74.8
Nuclear family 15.6 18.0
Couple or monoparental family 6.6 7.2
Household head is of Wolof ethnicity (%) 52.2 50.4 0.92
Father with more than one wife (1 = yes) 33.7 30.2 0.58
Father’s education level (%) 0.88
None 12.5 9.8
Alphabetization 77.3 77.4
Primary 5.7 7.5
At least secondary 4.5 5.3
Main wife’s education level (%) 0.84
None 41.6 39.6
Alphabetization 51.7 53.9
Primary 6.7 5.8
At least secondary 0.0 0.7
Telecommunication expenditures (CFAF) 4250 (3830) 5090 (8640) 0.43
Ownership of bank account (1 = yes) 0.08 0.06 0.55
Household receives remittances (1 = yes) 0.42 0.45 0.65
Thatched roof (1 = yes) 0.67 0.67 0.97
Wall material of house is stone or brick (1 = yes) 0.49 0.51 0.75
Flooring material is soil (1 = yes) 0.36 0.27 0.16
Land is completely owned by household (1 = yes) 0.94 0.93 0.62
Ownership of sheep (1 = yes) 0.62 0.63 0.87
Number of mobile phones owned 1.86 (1.31) 2.04 (2.15) 0.48
Main wife is member of an association (1 = yes) 0.71 0.73 0.71
Note: sd – standard deviation; p-values are determined by means of t- and chi-square tests.
† The Os is a stove in which an open fire burns between three metal feet.
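As a rough illustration of how a balance p-value in Table 1 can be checked from the reported summary statistics, the sketch below applies a two-sample test with unequal variances to the household size row. Two assumptions are made here: the group sizes are taken to be the 90 treatment and 139 control households of the analysis sample, and the p-value uses a normal approximation rather than the exact t distribution the authors may have used.

```python
import math

def two_sample_p_value(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided p-value for a difference in means (normal approximation)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t_stat = (mean1 - mean2) / standard_error
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, p_value

# Household size row of Table 1: mean (sd) in each group.
# Group sizes 90 and 139 are an assumption taken from the analysis sample.
t_stat, p_value = two_sample_p_value(12.88, 5.55, 90, 12.94, 5.82, 139)
print(round(p_value, 2))
```

Under these assumptions the result agrees with the p-value of 0.94 reported in the table for household size.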
or the analysis. Altogether, the sample used for the subsequent impact analysis in Sections 3.2–3.4 comprised 229 households. As a robustness check shows, not discarding these two groups of households and, hence, performing the analysis with all 249 households for which baseline and follow-up data is available does not change any of our findings, neither when applying ITT nor ATT.
In March 2013, approximately three and a half years after the randomization, an ICS usage tracking survey among the households that had received an ICS was conducted by enumerators familiar with the ICS. All but one of the 90 ICS households included in the impact analysis could be retrieved for this interview wave. In addition to asking the households simple usage questions, the enumerators recorded their own assessment on the condition of the ICS. The results of this usage tracking survey are presented in Section 3.5.
3. Results
3.1. Socio-economic conditions and cooking behaviour
The primary purpose of this section is to scrutinize the balancing of the two randomized groups, since we abstained from explicitly balancing them through re-randomization before assigning the ICSs. The second purpose is to illustrate the socio-economic environment in which the RCT was implemented. Table 1 documents the baseline socio-economic and cooking-related characteristics of the 229 households before stove distribution. On average, households consist of 13 members; household size varies in a range between 2 and 42 persons per household. Larger households are more common: 78% are extended families and 16% nuclear families (two parents plus children). Four in five households are subsistence
farmers, the majority of them living in houses with thatched roofs. As can be seen from the p-values in the right-hand column, two-sided tests of equality of the values for the two compared groups do not reveal statistically significant differences. The groups are balanced in the relevant observable characteristics. In addition, Figs. 2 and 3 show the distribution in non-agricultural and agricultural income: the treatment and the control group strongly overlap. Accordingly, a two-sample Kolmogorov–Smirnov test cannot reject the null of identical distributions at the 10 percent level.10
Fig. 2. Distribution of non-farm income at baseline.
Fig. 3. Distribution of farm income at baseline.
Regarding the baseline stove usage patterns reflected in the table, two stove types dominate rural kitchens in Senegal: open fires (three-stone stoves or Os in which the open fire burns between metal feet) and traditional metal stoves, the Malagasy and Cire. LPG stoves are rarely used in rural Senegal; in our sample only three households mainly use LPG for cooking. 90% of dishes are prepared with firewood. Around 15% of all meals are prepared with more than one stove, primarily to prepare rice on one stove and a sauce on a second one. On average, each household prepares 21 hot dishes per week using one of its stoves. As sometimes more than one stove is used for one meal, the range of weekly stove applications is between 14 and 49.
The follow-up data on stove usage shows no changes in the control group: the most often used stove types are three-stone stoves (53%), traditional metal stoves (25%) or Os (20%). Accordingly, the savings potentials of ICS usage are relatively high with 73% of households mainly using open fire stoves in the absence of an ICS. For the treatment group, the follow-up data shows that the ICSs have achieved broad acceptance among users. There are only two non-compliers: one ICS was completely broken in an accident and one household did not use the new stove. Otherwise, as many as 95% of the distributed ICSs are used at least seven times per week; for 85% of treatment households the ICS became the predominantly used stove. The proportion of individual dishes prepared with the different stove types also mirrors this usage pattern (see Table 2). As such, our set-up mimics the most likely scenario where treatment households have one ICS at their disposal and continue to use less efficient traditional stoves, because one stove is not sufficient to prepare the required amount of food or because the ICS is too small for the pot sizes used in a few large households. The table also shows that treatment households increased the number of dishes prepared. This is probably not due to rebound effects (see footnote **), since the total number of hot meals cooked does not increase in the treatment group and households reported that the quantity and type of food prepared has not changed since receiving the ICS. Instead, the increase simply reflects the fact that ICS households have an additional stove at their disposal such that the different components of one meal that were formerly prepared on a single stove are now prepared on two stoves.
Table 2
Utilization rates of different stove types at follow-up.
Treatment Control
Note: The shares represent the ratio between the number of times the respective stove type is used and the total number of stove applications per household and week.
† ICS usage among the control group is due to the fact that one household which was not randomly assigned to receive an ICS acquired one individually after the randomization.
3.2. Firewood consumption
ITT and ATT estimates for the household consumption of firewood indicator are calculated both with and without the baseline household-level control variables taken from Table 1: in addition to income, telecommunication expenditures are used as a proxy for living standards. Bank account ownership is used as a proxy for the household’s access to credit and ability to pay. Housing conditions as a wealth indicator are captured by whether the flooring material in the household is soil and whether the wall material is stone or brick. As another wealth metric, we include a dummy indicating sheep ownership. The results do not change if other wealth and socio-economic indicators shown in the table are included. As suggested in Bruhn and McKenzie (2009), we additionally include village dummies in order to account for the stratified randomization. According to our findings presented in columns (1) and (2) of Table 3, firewood savings are substantial, with around 27 kg being saved per week in every household after introduction of the ICS. These are ITT results. ATT estimates differ only marginally being
10 We additionally ran a probit regression to check the correlation between ICS allocation and the joint set of cooking-related as well as socio-economic characteristics and village dummies. As part of this estimation, we performed a Likelihood Ratio chi-square test with the null hypothesis that all of the regression coefficients are simultaneously equal to zero. The p-value of 0.98 validates the findings from the univariate comparisons of no correlation. All tests have also been carried out with the original sample of 253 baseline households, for which statistically significant differences cannot be observed either.
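The two-sample Kolmogorov–Smirnov comparison of income distributions mentioned in Section 3.1 can be sketched as follows. The income values are hypothetical placeholders, and the critical value uses the standard large-sample approximation; whether the authors relied on this approximation or on exact p-values is not stated in the text.

```python
import math
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Largest vertical distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    distance = 0.0
    for x in a + b:  # the supremum is attained at an observed value
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance

def ks_critical_value(n, m, alpha=0.10):
    """Large-sample critical value for the two-sample KS statistic."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))

# Hypothetical incomes, for illustration only.
income_treatment = [120, 150, 90, 300, 210, 75, 180]
income_control = [110, 160, 95, 280, 220, 80, 170, 130]
d = ks_statistic(income_treatment, income_control)
threshold = ks_critical_value(len(income_treatment), len(income_control))
print("reject equality" if d > threshold else "cannot reject equality")
```

With the group sizes of the analysis sample (90 and 139), `ks_critical_value(90, 139)` gives the distance the two empirical distributions would have to exceed for a rejection at the 10 percent level.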
Table 3
Effect of ICS usage on firewood consumption per week and per dish.
Dependent variable: Firewood consumption per week in kg Firewood weight per dish in kg
Dish variables†
Dish is cooked on open fire Ref. Ref.
Dish is cooked on ICS −1.99*** (0.24) −2.04*** (0.24)
Dish is cooked on traditional metal stove −0.03 (0.53) Ref.
Main dish 1.00*** (0.33)
Short cooking (<30 min) −0.94*** (0.19)
Meal variables†
Number of people the meal is cooked for (in terms of the logarithm of adult equivalents) 1.96*** (0.57)
Lunch Ref.
Breakfast −1.66*** (0.18)
Dinner −0.32*** (0.10)
Multiple stoves −0.14 (0.33)
Household variables
Household with ICS −26.78*** (6.33) −26.96*** (6.10)
Average number of people cooked for (in terms of the logarithm of adult equivalents) 42.79*** (14.36)
Father has formal education 2.63 (8.27) 0.01 (0.31)
Mother has formal education 5.95 (5.39) 0.23 (0.20)
Household income (in logarithmic terms) 1.25 (2.48) 0.03 (0.07)
Telecommunication expenditures (in logarithmic terms) 0.56 (0.97) 0.03 (0.03)
Bank account ownership 1.66 (16.85) 0.46 (1.18)
Flooring material is soil −17.63** (7.09) −0.60** (0.25)
Wall material of house is stone or brick 2.55 (9.83) 0.42 (0.36)
Ownership of sheep −7.31 (7.80) −0.32 (0.29)
Association membership of the mother −7.02 (7.93) −0.71** (0.29)
Mean of treatment group 60.80 (3.92) 60.69 (3.24) 2.28 (0.16) 2.25 (0.12)
Mean of control group 87.58 (4.68) 87.65 (5.03) 4.27 (0.17) 4.29 (0.21)
Savings rate (%) 30.6 30.8 46.7 47.5
Note: Computations on household level (columns 1 and 2) are performed with heteroskedasticity corrected standard errors accounting for heterogeneity in treatment responses; standard errors for the dish-level estimations (columns 3 and 4) are clustered by household.
† For an explanation of the dish- and meal-level control variables, see Bensch and Peters (2013).
* Significance level of 10%.
** Significance level of 5%.
*** Significance level of 1%.
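The savings rates in the bottom row of Table 3 follow from the predicted group means reported just above them. The short sketch below reproduces the household-level figures in columns (1) and (2); the function name is illustrative and not taken from the paper.

```python
def savings_rate(mean_control, mean_treatment):
    """Share of control-group consumption saved by the treatment group, in %."""
    return 100 * (mean_control - mean_treatment) / mean_control

# Weekly firewood consumption in kg, group means from Table 3.
column_1 = savings_rate(mean_control=87.58, mean_treatment=60.80)
column_2 = savings_rate(mean_control=87.65, mean_treatment=60.69)
print(round(column_1, 1), round(column_2, 1))  # 30.6 30.8
```

The difference between the two means in each column also equals the coefficient on "Household with ICS" (26.78 and 26.96 kg per week).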
slightly higher. As these observations hold in the same way for the other impact indicators, we will only present the more conservative ITT estimates in the following (ATT estimates can be taken from Table F1 in Appendix F). Inserting in the regression the values 1 and 0 for the binary treatment variable and average values for the covariates gives us the absolute ICS consumption values shown at the bottom of the table. This implies that 30% of the households’ total firewood consumption is saved.
This is clearly less than the 40–50% found in CCTs. As noted above, rebound effects as one potential driver for the difference to CCT results do not seem to play a role. Another likely reason is the fact that treatment households do not switch completely to ICS usage and still prepare parts of their meals on traditional stoves. In order to assess the savings potentials in case they would fully switch to ICS usage, we additionally compare the firewood consumption for dishes prepared on an ICS in the treatment group to dishes prepared on traditional stove types in the control group. Even though the analysis of firewood savings on the dish level may be endogenous, it provides an upper bound estimate of savings potentials where households had access to several ICSs to potentially abandon traditional stoves completely. These estimations furthermore provide insights into how the savings materialize, since they make it possible to examine the influence of dish- and meal-specific factors. Table 3 shows in columns (3) and (4) the results for the OLS regression that controls for household characteristics and characteristics specific to the stove application. The results reveal the differential effects of various dish- and meal-specific variables whose coefficient signs are as expected and reflect consistent firewood consumption figures across dish types. The R² of 0.43 for the estimation including control variables in column 3 indicates that a good part of the variation in the dependent variable can be explained by observable factors. The statistically highly significant ICS coefficient would imply an average ICS savings rate of 47%. It is, thus, in the range of the CCT results.
An unbiased alternative to come up with a firewood savings estimate for the case of adopting ICS for the entire range of stove applications is to perform a slightly adapted version of the IV estimation in the calculation of the ATT for total firewood consumption. We now instrument a new treatment variable, ICS usage intensity, by the random assignment. Usage intensity is coded as a continuous variable obtained by dividing the number of dishes prepared on an ICS by the total number of dishes prepared
Table 4
Effect of ICS usage on time expenditures.
Mean (se) Mean (se) Mean (se) p-Value (H0 : Diff = 0) Mean (se) p-Value (H0 : Diff = 0)
(1) (2) (3) (4) (5) (6)
Duration of firewood collection per week (min) 719 (75.1) 867 (69.4) 148 (103.3) 0.15 136 (102.0) 0.19
Number of observations† 86 134
Cooking duration per day (min) 251 333 81 (21.8) 0.00*** 75 (20.7) 0.00***
Number of observations 90 139
Note: All values derived from ITT estimations with heteroskedasticity corrected standard errors (in parentheses) and including village dummies; se – standard error.
† For the firewood collection indicator, the nine missing observations (5 control and 4 treatment) are due to households that were not able to specify the firewood collection time spells.
*** Significance level of 1%.
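The Wald logic behind the ATT and the usage-intensity estimate in Section 3.2 can be illustrated with a back-of-the-envelope sketch: the ITT effect is divided by the difference in treatment uptake between the randomized groups. This is not the authors' IV regression. The uptake shares are assumptions: 88 of 90 treatment households (two non-compliers) and 1 of 139 control households are taken to use an ICS, and the usage intensity of 0.70 borrows the share of dishes prepared on an ICS among ICS users in 2010 reported in Section 3.5, with zero assumed for the control group.

```python
def wald_estimate(itt_effect, uptake_treated, uptake_control):
    """Scale an ITT effect by the first-stage difference in uptake."""
    first_stage = uptake_treated - uptake_control
    if first_stage == 0:
        raise ValueError("random assignment does not shift uptake")
    return itt_effect / first_stage

itt_kg_per_week = 26.78       # Table 3, column (1)
control_mean_kg = 87.58       # Table 3, column (1)

# Binary uptake (assumed counts, see lead-in): ATT in kg per week.
att_kg = wald_estimate(itt_kg_per_week, 88 / 90, 1 / 139)

# Usage intensity as treatment (assumed first stage of 0.70 versus 0.0):
# implied savings if every dish were cooked on an ICS.
full_switch_kg = wald_estimate(itt_kg_per_week, 0.70, 0.0)
full_switch_rate = 100 * full_switch_kg / control_mean_kg
```

Under these assumptions the ATT is only slightly above the ITT, and the full-switch savings rate comes out at roughly 44%, close to the 43.8–45.0% range reported for the Wald estimator in the text.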
in the respective household. It thus ranges from 0% to 100%. The resulting Wald estimator yields an average rate of 43.8–45.0% (with and without controls). This unbiased IV estimate still suffers from the fact that treatment households increased the number of stove applications over which they spread the food preparation. Nevertheless, we can conclude that if all meals in a household were cooked on an ICS, the savings rate obtained in columns (1) and (2) of Table 3 could well be around 40%.
The results on firewood consumption turn out to be robust to outliers just like the results for the time and health indicators assessed in the following two sections. The two robustness checks we applied were, first, to estimate the median of the dependent variables by using quantile regression techniques and, second, to exclude outliers defined as values more than two standard deviations away from the mean. The results can be taken from Table F2 in Appendix F.
3.3. Time use
As many as 96% of all households collect at least part of the firewood they use for cooking. A reduction in firewood consumption is likely to lead to households spending less time on firewood collection. In fact, the reduction in the aggregate time spent by household members on firewood collection is approximately two and a half hours per week, which corresponds to 16–17% (Table 4). The reduction, though, is statistically only borderline significant (p-values of between 0.15 and 0.19 for ITT with and without controls), a finding that does not seem to be fully consistent with the reduction in total firewood consumption of around 30% found in the previous section. Still, it is not surprising that time savings are less pronounced than savings in firewood. One reason for the lower savings is that ICS-using households might just collect less wood during one excursion instead of reducing the number of excursions. The lack of statistical significance of the difference might be due to inaccuracies in the time usage variable, which increase the standard error and, thus, reduce power. The inaccuracies are induced, for example, by the fact that 31% of households collect the wood on their own land while farming, which makes it difficult to disentangle time spent on the task of collection from time spent on ordinary field work. Also, some households use a variety of wood supply strategies depending on different factors, most notably the season: some, for example, do not collect the firewood every week but instead hold a stock that is typically replenished before the rainy season.
ICS households might moreover save time because cooking is facilitated and quicker. In qualitative interviews women repeatedly pointed out that the ICS allows them to regulate the temperature more easily, which, in turn, makes it easier to do other things while cooking. The cooking duration of all three meals throughout a typical day decreases significantly by more than 75 min (Table 4), where preparation of an individual meal on an ICS takes around one and a half hours. These savings far exceed the time that households additionally invest in cutting the firewood into smaller pieces, which takes not more than 15 min/day. Due to a lack of local job and business opportunities, a shift of time towards income-generating activities cannot be observed. The cooking women do not seem to sleep more either, since their time awake differs by a mere 5 min between the two compared groups. Qualitative discussions rather suggest that the facilitation of the cooking task helps them to execute household duties in a less hurried way and to take more rest during the day.
3.4. Health
The negative effect of firewood usage on people’s health may be alleviated by ICS usage via two channels. First, the reductions in firewood consumption found in Section 3.2 can be expected to reduce harmful smoke emissions, although it is – as discussed in the introduction – unclear whether simple ICSs like those used in this RCT reduce smoke emission sufficiently to induce positive health effects. Second, exposure to the emitted smoke might be reduced, either via reductions in the cooking duration (as found in Section 3.3) or if cooking behaviour changes because of the new stove. In general, smoke exposure is very high in rural Senegal, with around two-thirds of the household members responsible for cooking staying next to the stove most of the time they are cooking. Furthermore, the vast majority of households cook inside, predominantly in a separate kitchen. While in the control group the proportion of outside cookers stays stable, in the treatment group it doubles from 11% to 23% between baseline and follow-up. The main reason for this can be traced to the fact that the ICS better shields the fuel from wind than three-stone stoves; also, from the households’ perspective, wind and dust are indeed the main drawback. In addition, the ICS requires less supervision, allowing the cook to dedicate more of her attention to other tasks away from the smoke source.
Virtually all persons responsible for cooking are women, on average two per household with no difference between treatment and control. We examine whether chronic symptoms of respiratory diseases and eye infections prevail among the women responsible for cooking and, as placebo outcomes, among the women not responsible for cooking and male household members. We first look at two dummy variables: at least one household member with symptoms of respiratory diseases and at least one household member with eye problems take the value one if at least one household member of the respective group reports having suffered from these symptoms at some point in the last six months before the interview. The results are displayed in Table 5 and indicate the share of households for which these variables take the value one. The gender-differentiated data provides for striking indications of health effects: for women responsible for cooking, 9.0% of treated
Table 5
Effect of ICS usage on health status.
Mean Mean Mean p-Value (H0 : Diff = 0) Mean p-Value (H0 : Diff = 0)
(1) (2) (3) (4) (5) (6)
Note: Standard errors for the household level estimations are heteroskedasticity corrected, those for individual household member level estimations are clustered by household; all estimations include village dummies in order to account for the stratified randomization.
† ITT with inclusion of controls is not shown in this table, since for some control variables (bank account ownership, flooring material, village dummies) failure is perfectly predicted in the estimated probit regressions.
‡ Differences in the number of observations are due to a few missing values and some households without any woman not responsible for cooking.
§ The values in this analysis are marginal means and marginal effects derived from estimations that can be found in regression form in Appendix F, Table F3. They are conventionally calculated at the mean of the other independent variables taking into account the particularities of calculating margins for interaction terms in non-linear models and conditioning on household members who are cooks.
* Significance level of 10%.
** Significance level of 5%.
*** Significance level of 1%.
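The household-level health indicators defined in Section 3.4 take the value one if at least one member of a given group (for example, women responsible for cooking) reports a symptom. A minimal sketch of that aggregation is given below; the record layout, field names and example data are hypothetical and only serve to show the construction.

```python
from collections import defaultdict

def household_indicator(members, symptom):
    """Flag each (household, group) pair in which any member reports the symptom."""
    flags = defaultdict(bool)
    for member in members:
        key = (member["household"], member["group"])
        flags[key] = flags[key] or bool(member[symptom])
    return dict(flags)

def share_with_symptom(flags, group):
    """Share of households whose indicator equals one for the given group."""
    values = [flag for (_, g), flag in flags.items() if g == group]
    return sum(values) / len(values)

# Hypothetical member-level records.
members = [
    {"household": 1, "group": "cook", "respiratory": True},
    {"household": 1, "group": "cook", "respiratory": False},
    {"household": 1, "group": "male", "respiratory": False},
    {"household": 2, "group": "cook", "respiratory": False},
    {"household": 2, "group": "male", "respiratory": False},
]
flags = household_indicator(members, "respiratory")
cook_share = share_with_symptom(flags, "cook")
```

In this toy example one of the two households has a cook with respiratory symptoms, so `cook_share` is 0.5; the figures in Table 5 are shares of this kind computed per randomized group.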
households report at least one of them suffering from respiratory disease symptoms. The corresponding value for the control group of 17.7% is almost twice as large – with this difference being statistically significant.
If we look at the same proportion for male household members, who usually do not spend time around the cooking spot, treatment and control group households do not differ significantly from each other, nor do we find a difference for women not responsible for cooking. The same pattern is observable for eye infections: 14.0% of households report that at least one woman responsible for cooking suffers from eye problems in the control group compared with 4.5% in the treatment group. The difference is significantly different from zero. No such statistically significant difference is observed for men and women not responsible for cooking. With respect to the potential bargaining into or out of cooking selection processes outlined in Section 2.3, one would expect changes in prevalence rates in the group of women who are not responsible for cooking if that bargaining process was strong. However, this is not the case.
The bottom of Table 5 refers to results derived from ITT probit regressions for the same disease symptoms on the level of individual household members.11 We now look at the dummy variables household member with symptoms of respiratory diseases and household member with eye problems, which take the value one if the respective household member reports having suffered from these symptoms at some point in the last six months before the interview. We find prevalence rates of between 3% and 12% on the individual level.12 The results confirm the findings of the household level estimations. In the group of household members responsible for cooking the prevalence rates for both respiratory disease symptoms and eye infections go down by almost seven percentage points. Significance levels are even more pronounced, with p-values of 0.01 for both estimations with and without control variables respectively, reflecting the more accurate definition of the indicator and the larger sample size. The estimations also corroborate that the treatment has no effect at all on the group of household members not responsible for cooking.
Altogether, while the reduction in smoke due to fuel savings might be too modest to trigger perceivable health effects by itself, it is likely that the combination with the change in cooking behaviour enabled by the ICS explains the observed improvements in health indicators: the ICS facilitates outdoor cooking, the cooking duration is reduced, and the cooking and combustion process requires less supervision.
3.5. Impact sustainability and upscaling the intervention
Hitherto we have found quite strong and robust evidence for high take-up and impacts of ICS usage after one year that are, given the experimental set-up, internally valid. Internal validity, though, is only a necessary condition for high policy relevance. The decisive questions in a next step are, first, whether these usage rates and impacts persist over time, second, whether the intervention yields
11 We abstain from showing ATT estimations here, since the specification requires interacting the treatment status with the dummy variable that indicates the cooking responsibility. We would thus need to instrument the ICS uptake and the interaction term, respectively. Using the random assignment as instrument for both ICS uptake and also in the interaction term (which is a controversial procedure) does not deliver any result in our case, since the estimations do not converge.
12 Comparable data on respiratory system diseases and eye problems for Sub-Saharan-African countries is very sparse (van Gemert et al., 2011). Studies with indicator definitions that come closest to ours show comparable levels in these health problems (ANSD and ICF International, 2012; Adeloye et al., 2013).
benefits that outweigh the costs and if so, third, whether it can be upscaled.
In order to assess the sustainability of the observed impacts we conducted an ICS usage tracking survey three and a half years after the random assignment. This enables us to examine the durability of the randomized ICS under day-to-day rural cooking conditions and the usage behaviour over the full life-span of the ICS. In this second follow-up round, we did not collect information on impacts, because a majority of the stoves would have already exceeded their useable lifetimes. For statistical power reasons, this reduced sample size would have made an examination of impacts difficult. Considering an expected life span of one to three years, the proportion of 49% of treatment households still using the randomized ICS can, nevertheless, be considered surprisingly high. In the enumerators’ appraisal, half of these ICSs were still in good condition. The proportion of dishes prepared with an ICS among ICS users declined only slightly from 70% in 2010 to 62%. As can be seen in Fig. 4, those treatment households who do not use the ICS anymore (51%) only slowly ceased to use their ICS. All of them have done so because the stove has deteriorated and 90% of them still used their ICS two years after randomization.13
Against this background of persisting usage behaviour we conduct a simple cost–benefit analysis. The costs of the ICS are represented by the market price of around 10 US$. For a conservative estimate of the benefits, to begin with, we only account for reductions in firewood consumption. We take the average price of 0.02 US$/kg of firewood paid by firewood-purchasing households at the time of the follow-up survey as an upper bound of the shadow price for collected firewood. Valuing the firewood that ICS users save compared to traditional stove users shows that the savings amount to 2.03 US$ per month. Even with a lower shadow price for collected firewood, it is obvious already at this stage that the benefits of an ICS outweigh the costs by far over its life span. If health benefits and the reduction in cooking duration were taken into account, the benefits would be even greater. Similarly, benefits would turn out to be larger when social costs were additionally included, i.e. forest degradation, village air pollution, and carbon emissions. As a consequence, upscaling the intervention seems to be economically sensible.
However, some challenges for external validity of the RCT need to be considered when transferring the results to an upscaled intervention or to other regions. In Appendix D, we discuss the aspects raised by Duflo et al. (2008a): general equilibrium effects,
4. Discussion and conclusion
In this paper we evaluated take-up behaviour and impacts of improved cooking stoves (ICSs) in rural Senegal by means of a randomized controlled trial (RCT). ICSs are widely seen as an option for developing countries to combat the devastating effects of woodfuel usage for cooking purposes on people’s health, work load as well as the environment. The first finding is that ICS take-up was close to 100% among the randomly assigned households and that people only cease to use the ICS if it deteriorates. This sustainably high take-up rate comes as a surprise, since it is often argued among development practitioners that people would not use ICSs for which they have not paid. It also constitutes a major difference to the findings in Hanna et al. (2012). Major reasons for this are probably differences in how convenient and advantageous the ICS technology is from the household perspective and to which degree the ICS has a better performance than the existing stove portfolio. First, the ICS used in our study is maybe closer to the regular cooking habits of the target population. It is easier to use, does not require any particular maintenance and due to its portability households can decide themselves where to cook. Second, wood scarcity is probably higher in our study area thereby increasing the relevance of an ICS. Third, more than a fourth of the households in the study in India already also used cleaner fuels like electricity and gas before the randomization so that the randomized ICS did not necessarily represent an improvement for them.
The firewood savings were found to be statistically significant and substantial. They amount to around 30% per week in the most likely scenario where households have one ICS and continue to use traditional stoves complementarily. If these complementarily used traditional stoves were also replaced by ICSs, the savings could increase further up to around 40%. Such a reduction in firewood consumption is an important impact in an arid country like Senegal, where forests are permanently under pressure and firewood provision is a daily hardship for rural women. Moreover, the CO2 that is sequestered in both dead wood and green wood is set free with obvious implications for climate change processes. Deforestation and forest degradation are in fact a relevant source of global CO2 emissions. IPCC (2013) estimates that net land-use change, mainly deforestation, is responsible for about 10% of the total anthropogenic CO2 emissions. To the extent woodfuel usage contributes to these processes, dissemination of ICS as used in this study can help to reduce such losses of carbon sinks.
We also observe a reduction in firewood collection time, but this is only borderline significant. Furthermore, we find that cooking duration is decreased significantly by over 20%. In addition, the cooking process is facilitated so that the time the cook needs to be
13 Within the complete investigation period of three and a half years, the ICS was destroyed in two cases, once because of heavy rainfall and once because the kitchen wall collapsed. In four cases, the ICS was stolen.
in direct proximity to the cooking spot is reduced. Together with an increase in outdoor cooking, this leads to an evident reduction in exposure to harmful smoke. Consequently, we also find a clear indication of a decrease in respiratory disease symptoms and eye problems, with a drop of around 9 percentage points each for the women responsible for cooking.
Our self-reported health outcomes might of course feed criticism that objective indicators such as individual particulate matter exposure as measured in the RESPIRE study deliver more accurate information. Apart from the high costs of executing such a survey, there is also a trade-off between the increased accuracy and a Hawthorne effect. Study participants can be expected to behave differently if they are asked to wear exposure monitoring tools for 24 h, for example. Hence, self-reported and objective measurements can rather be seen as complements. In addition, one might suspect an auspices or courtesy bias in our data where respondents express their gratitude for having received the ICS or expect additional benefits from a satisfied implementing agency. In their stove study in Ghana, Burwen and Levine (2012) suspect that this effect biases their results, since the positive effects on self-reported health they observe are not plausible given that smoke exposure is not reduced. However, this bias is not likely in the present case, since participating households were not aware of the study's focus on ICSs. Even if some households noticed the role the ICS played in this study, they were unlikely to relate its usage to health outcomes. The fact that we did not observe any health effect among household members not responsible for cooking strongly underpins this view. Hence, unlike in the Burwen and Levine (2012) study, placebo outcome indicators corroborated our findings. Finally, the magnitude of observed savings is in the range of what is expected based on laboratory tests and, thus, does not feed the suspicion of biased responses.
Altogether, the substantial and statistically significant impacts on different levels of indicators, including positive external effects such as reduced deforestation and household air pollution, substantiate the efforts that the international community dedicates to the dissemination of ICSs. The findings on the health level fit into the concept of intensive and extensive margins of behaviour that has a longer tradition in agricultural economics (Feder et al., 1985) and has recently been brought into the debate on public health-relevant behaviour in developing countries (see Dupas, 2011). The present analysis suggests that not only the extensive margin of cooking should be addressed by disseminating cleaner stoves, but also the intensive margin by, for instance, raising awareness of the need to reduce smoke exposure. This behavioural dimension should also be taken into account by the Global Alliance for Clean Cookstoves and the United Nations in outlining future policies to increase access to improved or clean cooking stoves. Even ICSs that still emit considerable amounts of smoke might trigger positive health effects if they also induce exposure-relevant behavioural changes.
The almost universal take-up among randomly assigned ICS owners suggests that people use an ICS that is adapted to local cooking habits if they have an easy opportunity to obtain one. A simple back-of-the-envelope cost–benefit calculation further made it clear that investing in an ICS would be a profitable investment from the point of view of the individual households. The interplay of cash and credit constraints, the lack of information, and the fact that in many cases the women responsible for cooking do not manage the household budget nevertheless raise doubts about whether households would be able and willing to pay the market price for ICSs, even if the stoves were readily available on the market. The experience from long-standing pilot dissemination activities in neighbouring rural areas in Senegal seems to support the presumption that the majority of rural households would probably stick to the cheaper traditional three-stone or metal stoves.14 As the strategy of promoting the creation of sustainable ICS markets has already proven to be difficult in urban areas, where fuels are purchased and ICS benefits are clearly monetary ones, it can be expected to require even more efforts and resources in rural areas.
In combination, the high take-up and the positive external effects of ICS usage observed in this study would suggest that more direct options of ICS promotion should be reconsidered. This could mean, for example, directly subsidizing the production of ICSs in rural areas so that end-user prices can compete with traditional stoves. If the findings can be confirmed in other rural areas, it might even be an option to distribute ICSs directly to the households, either for free or at a very low, symbolic price. While this would be in contrast to the strategies pursued by most ICS dissemination programmes, and many practitioners are opposed to a free distribution policy, the empirical literature provides evidence from other field experiments that supports the idea. Paying a positive price does not necessarily lead to higher usage rates of health-relevant goods (Cohen and Dupas, 2010; Tarozzi et al., 2014), charging cost-sharing prices substantially reduces take-up (Kremer and Miguel, 2007) and there is only weak evidence yet that price serves to allocate the health-relevant goods to those with the most need (Okeke et al., 2013).
Any ICS promotion policy has to be designed in close cooperation with local stakeholders, putting particular effort into the choice of technically and culturally appropriate ICS models. Institutions have to be created to sustain the distribution of direct subsidies for the ICSs, thereby avoiding the flash-in-the-pan effect that has been observed in unsuccessful earlier ICS subsidization programmes.
As these recommendations can only be an interim conclusion, further research on the take-up behaviour and on the impacts of ICS usage has to follow up in other regions and potentially other seasons as well. The indication of positive health effects of the simpler ICS used in this RCT calls for taking into account cooking behaviour in these studies. As evidenced by the lower take-up of ICSs in the Hanna et al. (2012) study in India, the results may vary in different environments and if other ICS types are used. In addition, further experimental studies should examine the mechanisms behind take-up behaviour, such as the households' willingness-to-pay for ICSs, but also the role of credit constraints, information, and woodfuel scarcity. Such research efforts can substantiate – or contradict – the findings in this study and will thereby help to decide under which circumstances and to which degree subsidies might in fact be required to encourage rural people to obtain ICSs.

14 See also Miller and Mobarak (2013) for evidence on low purchase rates of ICS in Bangladesh.
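The back-of-the-envelope cost–benefit calculation can be retraced with the figures reported in the text: a stove price of around 10 US$, a firewood price of 0.02 US$/kg, and savings of 2.03 US$ per month. The following is only a minimal sketch; the function and constant names are illustrative, and it leaves out discounting, stove lifetime, and the health and time benefits.

```python
# Figures are taken from the text; all names are illustrative assumptions.
STOVE_PRICE_USD = 10.0             # approximate market price of the ICS
FIREWOOD_PRICE_USD_PER_KG = 0.02   # upper bound of the shadow price
MONTHLY_SAVINGS_USD = 2.03         # value of firewood saved per month


def payback_months(stove_price, monthly_savings):
    """Months until cumulative firewood savings cover the stove price."""
    return stove_price / monthly_savings


def implied_firewood_saved_kg(monthly_savings, price_per_kg):
    """Monthly firewood savings in kg implied by the monetary savings."""
    return monthly_savings / price_per_kg


months = payback_months(STOVE_PRICE_USD, MONTHLY_SAVINGS_USD)
kg_saved = implied_firewood_saved_kg(MONTHLY_SAVINGS_USD, FIREWOOD_PRICE_USD_PER_KG)
print(f"Payback period: {months:.1f} months")
print(f"Implied firewood savings: {kg_saved:.1f} kg per month")
```

Under these figures the stove price is recovered in roughly five months; a lower shadow price for collected firewood lengthens the payback period proportionally.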
56 G. Bensch, J. Peters / Journal of Health Economics 42 (2015) 44–63
| Study reference | Stove type/model name | Combustion chamber type | Fuel type | Feed type | Chimney | Portability | Approx. cost (US$) | Further stove references |
|---|---|---|---|---|---|---|---|---|
| This study | Jambaar wood | Ceramic | Wood | Continuous | No | Yes | 10 | GIZ (2011a) |
| Bensch and Peters (2013) | Jambaar charcoal | Ceramic | Charcoal | Batch fed | No | Yes | 9–19 | GIZ (2011b) |
| Burwen and Levine (2012) | Council of Scientific and Industrial Research (CSIR) improved stove | Mud | Wood | Continuous | Yes | No | <10 | – |
| Hanna et al. (2012) | Appropriate Rural Technology Institute (ARTI) improved stove | Mud | Wood | Continuous | Yes | No | 12.5 | – |
| Masera et al. (2007) | Patsari stove | Mud/brick | Wood | Continuous | Yes | No | 35 | Kshirsagar and Kalamkar (2014) |
| Miller and Mobarak (2013) | Bangladesh Council of Scientific and Industrial Research (BCSIR) “efficiency” (E) and “chimney” (C) stove | Clay | Wood | Continuous | E: No; C: Yes | E: Yes; C: No | E: 5.8; C: 10.9 | Mobarak et al. (2012) |
| RESPIRE | Plancha mejorada | Brick | Wood | Continuous | Yes | No | 100–150 | Díaz (2008) |

Notes: All listed stoves are direct combustion stoves with natural draft. Further main (and more advanced) combustion types are gasifier and rocket type direct combustion; forced draft is an alternative to natural draft. In addition, the combustion chamber may be metallic.
Since information on our primary impact variable, firewood consumption, was not available in existing data sets for the target region of our study, we took data collected in the quasi-experimental study presented in Bensch and Peters (2013) from urban Senegal to approximate the relevant parameters (prospective power analysis). After the follow-up survey, we verified these parameters by rerunning the analysis with the actual baseline data for those households included in the analysis (retrospective power analysis). The sample size n is given by the following formula:

$$ n = D\,\left[(Z_\alpha + Z_\beta)^2\right]\,\frac{\frac{1}{r}\left(sd_1^2 + sd_2^2\right)}{(X_2 - X_1)^2} $$

Table C1 provides the description, the values and the sources of the different parameters. The decisive parameter to be defined by the researcher is the minimum detectable effect size (ES), which reflects the smallest relative reduction in woodfuel consumption that we are able to detect at the given significance level (see Bloom, 1995). While the CCT suggest an effect size of 40%, we chose a minimum detectable effect size of 30% in order to account for the possibility of an overestimated effect size in the CCT. We defined the probability of being assigned to the control group to be 60% and that for the ICS treatment group to be 40%.
Taking these parameters into account, we obtain a required sample size of around 200 households, as is indicated in the last row of the column for the prospective analysis in Table C1. In order to account for the sensitivity of the different parameters in the power calculation and potential attrition or non-compliance, we built in a cushion and increased the number of households to be interviewed to 250.
With respect to health and time savings impacts, the sample size required to measure significant effects tends to be substantially higher. The reason is that the effect on respiratory diseases, for example, can be expected to be less pronounced. The implication of this is that the power of our study is not necessarily sufficient to detect all relevant health and time savings effects.

External validity prevails if a study's findings can be transferred from the study population to the policy population. In other words, external validity is concerned with whether findings obtained from a small sample group represent the wider population in real world situations. In the following, we discuss how our RCT design took into account the three dimensions of external validity as defined by Duflo et al. (2008a): general equilibrium effects, Hawthorne and John Henry effects as well as possible limitations to generalizations beyond our specific intervention and beyond our sample.
General equilibrium effects may occur in the present case if widespread ICS usage leads to a sizable reduction in firewood demand and, in turn, to a reduction in the costs of firewood provision, either because prices decrease or because firewood is less scarce and easier to collect. This might induce households to consume more of the now cheaper fuel. Although this would bring welfare benefits such as more hot meals, from a public health and resource saving perspective this might be considered an adverse second-round effect. Since most households in rural Senegal collect firewood and do not buy it, this effect can be expected to be less pronounced than for market-based energy sources.
Another major risk to the external validity of RCT results is if participants change their behaviour because they know that they are participating in an experiment or are somehow under observation. While so-called Hawthorne effects (if treatment group members change their behaviour) or John Henry effects (if control group members change their behaviour) can never be ruled out completely, we reduced the risk considerably through various precautionary measures: first, we embedded the interviews in a baseline survey for an electrification intervention under preparation in the studied areas (the intervention was not implemented in any of the sampled villages before the end of this study). The applied questionnaire covered a comprehensive set of socio-economic and energy-related dimensions such as electricity so that attention was not focused primarily on cooking-related parts of the interviews. Second, the lottery was framed as a reward for all households to recompense
Table C1 Parameters for power calculation.

| Parameter | Description | Prospective | Retrospective | Source |
|---|---|---|---|---|
| D = 1 + (m − 1)ρ | Design effect, accounting for the loss of variation in the data if clustered instead of simple random sampling is used | 1.59 | 2.25 | Household data† |
| ρ | Intra-cluster correlation, i.e. the proportion of the overall variance with respect to firewood consumption explained by within-village (cluster) variance in the data | 0.031 | 0.069 | Household data |
| m | Mean number of interviewed households per cluster (village) | 20 | 229/12 = 19.1 | Defined |
| Zα | Critical value (Z-score) for a given level of confidence α reflecting the probability that the null hypothesis is rejected given that it is in fact true | 1.96 (α = 5%) | 1.96 (α = 5%) | Defined (conventional) |
| Zβ | Z-score for a given level of confidence β reflecting the probability that the null hypothesis is rejected given that it is in fact false | 0.84 (β = 80%) | 0.84 (β = 80%) | Defined (conventional) |
| r | Ratio of treatment and control observations (ICS owners to non-owners) | 0.66 | 90/139 = 0.65 | Lottery outcome defined in sampling design |
| sd1 | Standard deviation of firewood consumption of ICS non-owners | 0.266 | 0.259 | Household data |
| sd2 | Standard deviation of firewood consumption of ICS owners | 0.186 | 0.181 | Implicitly defined through minimum detectable effect size (see below) |
| X1 | Per capita firewood consumption of ICS non-owners (in kg) | 0.384 | 0.411 | Household data |
| X2 | Expected per capita firewood consumption of ICS owners (in kg) | 0.269 | 0.288 | Implicitly defined through minimum detectable effect size (see below) |
| ES = \|X2 − X1\|/X1 | Minimum detectable effect size | 30% | 30% | Defined based on experiences with laboratory tests |
| n = n (ICS owners) + n (non-owners) | Result of power calculation: required minimum sample sizes for treatment and control group | 192 = 76 + 116 | 229 = 90 + 139 | |

† Household data refers to the data from the urban quasi-experimental study (“prospective”) and to the baseline data from the present study (“retrospective”) to corroborate the calculations of the prospective analysis.
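The prospective figures in Table C1 can be approximately retraced in a few lines. The sketch below rests on two assumptions that are not spelled out in the text: the design effect takes the standard form 1 + (m − 1)ρ, and the variance term is grouped as sd1² + sd2²/r, so that the formula returns the size of the non-owner (control) group. This grouping is chosen only because it comes close to the prospective column; all function names are illustrative.

```python
def design_effect(m, rho):
    """Standard design effect for clustered sampling (assumed form)."""
    return 1 + (m - 1) * rho


def control_group_size(d, z_alpha, z_beta, r, sd_control, sd_treated,
                       x_control, x_treated):
    """Required number of control households under the assumed grouping
    of the variance term: sd_control^2 + sd_treated^2 / r."""
    variance_term = sd_control ** 2 + sd_treated ** 2 / r
    return d * (z_alpha + z_beta) ** 2 * variance_term / (x_treated - x_control) ** 2


# Prospective inputs from Table C1
D = design_effect(m=20, rho=0.031)
n_control = control_group_size(D, 1.96, 0.84, 0.66, 0.266, 0.186, 0.384, 0.269)
n_treated = 0.66 * n_control           # r = ICS owners / non-owners
es = abs(0.269 - 0.384) / 0.384        # minimum detectable effect size

print(f"Design effect: {D:.2f}")
print(f"Non-owners: {n_control:.0f}, owners: {n_treated:.0f}")
print(f"Effect size: {es:.0%}")
```

With these inputs the sketch gives a design effect of about 1.59, roughly 116 non-owners and 77 owners, and an effect size of about 30%, which is close to the 192 = 76 + 116 households reported in the prospective column.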
Table F1 ATT results for household level indicators on firewood consumption, time expenditures, and health.
Difference in means Regression-adjusted difference in means
Table F2 Outlier analysis for household and dish level indicators.

Columns (1) and (2): outlier analysis using median regressions. Columns (3) and (4): outlier analysis using outlier exclusion. p-Values refer to H0: Diff = 0.

| Indicator | (1) Diff (se) | (1) p-Value | (2) Diff (se) | (2) p-Value | (3) Diff (se) | (3) p-Value | (4) Diff (se) | (4) p-Value |
|---|---|---|---|---|---|---|---|---|
| Firewood consumption per week (kg) | 26.50 (4.63) | 0.00*** | 26.51 (3.51) | 0.00*** | 19.15 (4.41) | 0.00*** | 18.53 (4.45) | 0.00*** |
| Firewood weight per dish (kg) | 1.54 (0.19) | 0.00*** | 1.77 (0.13) | 0.00*** | 1.50 (0.12) | 0.00*** | 1.61 (0.13) | 0.00*** |
| Duration of firewood collection per week (min) | 60 (58) | 0.30 | 76 (66) | 0.25 | 123 (66) | 0.06* | 117 (62) | 0.06* |
| Cooking duration per day (min) | 68 (21.6) | 0.00*** | 77 (16.0) | 0.00*** | 40 (17.1) | 0.02** | 40 (17.2) | 0.02** |

Note: Median regressions are quantile regressions that determine the median of the dependent variable conditional on the values of the independent variables. For outlier exclusion, outliers are defined as values more than two standard deviations away from the mean. All values are computed using robust standard errors; se – standard error.
* Significance level of 10%.
** Significance level of 5%.
*** Significance level of 1%.
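The outlier-exclusion rule described in the note to Table F2 (dropping values more than two standard deviations away from the mean) can be illustrated with a short sketch. The function name and the sample values are hypothetical, and the use of the sample standard deviation is an assumption, since the note does not say which variant was applied.

```python
from statistics import mean, stdev


def exclude_outliers(values, k=2.0):
    """Keep only values within k sample standard deviations of the mean."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumed variant)
    return [v for v in values if abs(v - mu) <= k * sd]


# Hypothetical weekly firewood consumption values in kg, not study data
weekly_firewood_kg = [10, 12, 11, 13, 12, 11, 60]
trimmed = exclude_outliers(weekly_firewood_kg)
print(trimmed)
```

In this hypothetical series only the value of 60 kg lies more than two standard deviations from the mean and is dropped before the difference in means is recomputed.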
Table F3 Probit regression on health status of household members.

Estimator: Probit, ITT. Coefficient (standard error in parentheses).

| Dependent variable | Household member with respiratory system disease | Household member with eye problem |
|---|---|---|
| Household variables | | |
| Average number of people cooked for (in terms of the logarithm of adult equivalents) | −0.34** (0.16) | −0.17 (0.17) |
| Father has formal education | −0.13 (0.17) | −0.26 (0.25) |
| Mother has formal education | 0.19 (0.12) | 0.20 (0.14) |
| Household income (in logarithmic terms) | 0.01 (0.04) | 0.01 (0.06) |
| Telecommunication expenditures (in logarithmic terms) | 0.04* (0.02) | 0.00 (0.02) |
| Bank account ownership | −0.50 (0.44) | −0.52 (0.38) |
| Wall material of house is stone or brick | −0.09 (0.14) | −0.30** (0.15) |
| Ownership of sheep | 0.04 (0.13) | −0.08 (0.14) |
| Association membership of the mother | 0.08 (0.12) | −0.09 (0.14) |
References

Adeloye, D., Chan, K.Y., Rudan, I., Campbell, H., 2013. An estimate of asthma prevalence in Africa: a systematic analysis. Croatian Medical Journal 54 (6), 519–531.
ANSD (Agence Nationale de la Statistique et de la Démographie), 2006. Résultats du troisième recensement général de la population et de l’habitat (2002): Rapport National de présentation, http://www.ansd.sn/publications/rapports enquetes etudes/enquetes/RGPH3 RAP NAT.pdf (last accessed 03.03.10).
ANSD (Agence Nationale de la Statistique et de la Démographie), ICF International, 2012. Enquête Démographique et de Santé à Indicateurs Multiples au Sénégal (EDS-MICS) 2010–2011. ANSD and ICF International, Calverton, MD, USA.
Armstrong, J.R., Campbell, H., 1991. Indoor air pollution exposure and lower respiratory infections in young Gambian children. International Journal of Epidemiology 20 (2), 424–429.
Barnes, D.F., Openshaw, K., Smith, K., Van der Plas, R., 1994. What Makes People Cook with Improved Biomass Stoves? A Comparative International Review of Stove Programs, World Bank Technical Paper No. 242. World Bank.
Bensch, G., Peters, J., 2013. Alleviating deforestation pressures? Impacts of improved stove dissemination on charcoal consumption in urban Senegal. Land Economics 89 (4), 676–698.
Bloom, H., 1995. Minimum detectable effect size – a simple way to report the statistical power of experimental designs. Evaluation Review 19 (5), 547–556.
Bonjour, S., Adair-Rohani, H., Wolf, J., Bruce, N.G., Mehta, S., Prüss-Ustün, A., Lahiff, M., Rehfuess, E.A., Mishra, V., Smith, K.R., 2013. Solid fuel use for household cooking: country and regional estimates for 1980–2010. Environmental Health Perspectives 121 (7), 784–790.
Bruhn, M., McKenzie, D., 2009. In pursuit of balance: randomization in practice in development field experiments. American Economic Journal: Applied Economics 1 (4), 200–232.
Burnett, R.T., Pope III, C.A., Ezzati, M., Olives, C., Lim, S.S., Mehta, S., Shin, H.H., Singh, G., Hubbell, B., Brauer, M., Anderson, H.R., Smith, K.R., Balmes, J.R., Bruce, N.G., Kan, H., Laden, F., Prüss-Ustün, A., Turner, M.C., Gapstur, S.M., Diver, W.R., Cohen, A., 2014. An integrated risk function for estimating the global burden of disease attributable to ambient fine particulate matter exposure. Environmental Health Perspectives 122 (4), 397–403.
Burwen, J., Levine, D.E., 2012. A rapid assessment randomized-controlled trial of improved cookstoves in rural Ghana. Energy for Sustainable Development 16 (3), 328–338.
Butrick, E., Peabody, J., Solon, O., DeSalvo, K., Quimbo, S., 2010. A comparison of objective biomarkers with a subjective health status measure among children in the Philippines. Asia-Pacific Journal of Public Health 24 (4), 565–576.
Campbell, H., Armstrong, J.R., Byass, P., 1989. Indoor air pollution in developing countries and acute respiratory infection in children. Lancet 1, 1012.
Cohen, J., Dupas, P., 2010. Free distribution or cost-sharing? Evidence from a randomized Malaria prevention experiment. Quarterly Journal of Economics 125 (1), 1–45.
Das, J., Hammer, J., Sánchez-Paramo, C., 2012. The impact of recall periods on reported morbidity and health seeking behaviour. Journal of Development Economics 98 (1), 76–88.
De Mel, S., McKenzie, D., Woodruff, C., 2008. Returns to capital in micro-enterprises: evidence from a field-experiment. Quarterly Journal of Economics 123 (4), 1329–1372.
Dherani, M., Pope, D., Mascarenhas, M., Smith, K.R., Weber, M., 2008. Indoor air pollution from unprocessed solid fuel use and pneumonia risk in children aged under 5 years: a systematic review and meta-analysis. Bulletin of the World Health Organization 86 (5), 390–398.
Díaz, E., Smith-Sivertsen, T., Pope, D., Lie, R., Diaz, A., McCracken, J., Arana, B., Smith, K., Bruce, N., 2007. Eye discomfort, headache and back pain among Mayan Guatemalan women taking part in a randomized stove intervention trial. Journal of Epidemiology & Community Health 61 (1), 74–79.
Díaz, E., (Ph.D. dissertation) 2008. Impact of Reducing Indoor Air Pollution on Women’s Health. RESPIRE Guatemala-Randomised Exposure Study of Pollution Indoors and Respiratory Effects. Department of Public Health and Primary Health Care, The University of Bergen.
Duflo, E., Glennerster, R., Kremer, M., 2008a. Using randomization in development economics research: a toolkit. In: Schultz, P., Strauss, J. (Eds.), Handbook of Development Economics. North Holland, Amsterdam, pp. 3895–3962.
Duflo, E., Greenstone, M., Hanna, R., 2008b. Indoor air pollution, health and economic well-being. Sapiens Journal 1 (1), 1–9.
Dupas, P., 2011. Do teenagers respond to HIV risk information? Evidence from a field experiment in Kenya. American Economic Journal: Applied Economics 3 (1), 1–34.
Ezzati, M., Kammen, D.M., 2001. Indoor air pollution from biomass combustion and acute respiratory infections in Kenya: an exposure–response study. The Lancet 358, 619–624.
Ezzati, M., Kammen, D.M., 2002. Household energy, indoor air pollution, and health in developing countries: knowledge base for effective interventions. Annual Review of Environment and Resources 27, 233–270.
Feder, G., Just, R.E., Zilberman, D., 1985. Adoption of agricultural innovations in developing countries: a survey. Economic Development and Cultural Change 33 (2), 255–298.
FAO (Food and Agriculture Organization of the United Nations), 2005a. State of the World’s Forests. Food and Agriculture Organization of the United Nations, Rome.
FAO (Food and Agriculture Organization of the United Nations), 2005b. Global Forest Resources Assessment. Food and Agriculture Organization of the United Nations, Rome.
Frondel, M., Peters, J., Vance, C., 2008. Identifying the rebound: evidence from a German household panel. Energy Journal 29 (4), 154–163.
GAIN (Global Agricultural Information Network), 2011. Senegal, Grain and Feed Annual. West Africa Rice Annual, http://gain.fas.usda.gov/Recent%20GAIN%20Publications/Grain%20and%20Feed%20Annual Dakar Senegal 5-6-2011.pdf (last accessed 08.02.12).
GIZ (Gesellschaft für Internationale Zusammenarbeit), 2011a. Firewood Jambar Stove, Senegal, https://energypedia.info/wiki/File:GIZ HERA 2011 Jambar Bois Senegal.pdf (last accessed 25.11.14).
GIZ (Gesellschaft für Internationale Zusammenarbeit), 2011b. Charcoal Jambar Stove, Benin, Kenya, Senegal. https://energypedia.info/images/b/b1/GIZ HERA 2011 Jambar Charbon Senegal.pdf (last accessed 25.11.14).
Hanna, R., Duflo, E., Greenstone, M., 2012. Up in Smoke: The Influence of Household Behavior on the Long-run Impact of Improved Cooking Stoves, CEEPR WP 2012-008. MIT Center for Energy and Environmental Policy Research.
Herring, H., Sorrell, S., Elliott, D., 2009. Energy Efficiency and Sustainable Consumption – The Rebound Effect. Palgrave Macmillan, New York.
Idler, E., Benyamini, Y., 1997. Self-assessed health and mortality: a review of twenty-seven community studies. Journal of Health and Social Behavior 38 (1), 21–37.
IPCC (Intergovernmental Panel on Climate Change), 2013. Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge/New York.
Kamali, A., Quigley, M., Nakiyingi, J., Kinsman, J., Kengeya Kayondo, J., Gopal, R., Ojwiya, A., Hughes, P., Carpenter, L.M., Whitworth, J., 2003. Syndromic management of sexually-transmitted infections and behaviour change interventions on transmission of HIV-1 in rural Uganda: a community randomised trial. The Lancet 361 (9358), 645–652.
Kan, X., Chiang, C.Y., Enarson, D.A., Chen, W., Yang, J., Chen, G., 2011. Indoor solid fuel use and tuberculosis in China: a matched case–control study. BMC Public Health 11 (1), 1–7.
Kjellsson, G., Clarke, P., Gerdtham, U.-G., 2014. Forgetting to remember or remembering to forget: A study of the recall period length in health care survey questions. Journal of Health Economics 35, 34–46.
Kremer, M., Miguel, E., 2007. The illusion of sustainability. Quarterly Journal of Economics 122 (3), 1007–1065.
Kremer, M., Miguel, E., Mullainathan, S., Null, C., Zwane, A.P., 2009. Making Water Safe: Price, Persuasion, Peers, Promoters, or Product Design? Mimeo.
Kshirsagar, M.P., Kalamkar, V.R., 2014. A comprehensive review on biomass cookstoves and a systematic approach for modern cookstove design. Renewable & Sustainable Energy Reviews 30, 580–603.
Lewis, J.J., Pattanayak, S.K., 2012. Who adopts improved fuels and cookstoves? A systematic review. Environmental Health Perspectives 120 (5), 637–645.
Luby, S., Mendoza, C., Keswick, B., Chiller, T.M., Hoekstra, R., 2008. Difficulties in bringing point-of-use water treatment to scale in rural Guatemala. The American Journal of Tropical Medicine and Hygiene 78 (3), 382–387.
Martin II, W.J., Glass, R.I., Balbus, J.M., Collins, F.S., 2011. A major environmental cause of death. Science 334 (6053), 180–181.
Masera, O., Edwards, R., Armendariz, C., Berrueta, V., Johnson, M., Rojas Bracho, L., Riojas-Rodríguez, H., Smith, K.R., 2007. Impact of Patsari improved cookstoves on indoor air quality in Michoacán, Mexico. Energy for Sustainable Development 11 (2), 45–56.
McCracken, J., Smith, K., Stone, P., Díaz, A., Arana, B., Schwartz, J., 2011. Intervention to lower household wood smoke exposure in Guatemala reduces ST-segment depression on electrocardiograms. Environmental Health Perspectives 119 (11), 1562–1568.
Miilunpalo, S., Vuori, I., Oja, P., Pasanen, M., Urponen, H., 1997. Self-rated health status as a health measure: the predictive value of self-reported health status on the use of physician services and on mortality in the working-age population. Journal of Clinical Epidemiology 50 (5), 517–528.
Miller, G., Mobarak, M., 2013. Gender Differences in Preferences, Intra-household Externalities, and Low Demand for Improved Cookstoves, NBER Working Paper 18964. National Bureau of Economic Research.
Mobarak, A.M., Dwivedi, P., Bailis, R., Hildemann, L., Miller, G., 2012. Low demand for nontraditional cookstove technologies. Proceedings of the National Academy of Sciences of the United States of America 109 (27), 10815–10820, http://dx.doi.org/10.1073/pnas.1115571109.
Mueller, V., Pfaff, A., Peabody, J., Liu, Y., Smith, K.R., 2013. Improving stove evaluation using survey data: who received which intervention matters. Ecological Economics 93, 301–312.
Okeke, E.N., Adepiti, C.A., Ajenifuja, K.O., 2013. What is the price of prevention? New evidence from a field experiment. Journal of Health Economics 32 (1), 207–218.
Pandey, M.R., 1984a. Prevalence of chronic bronchitis in a rural community of the hill region of Nepal. Thorax 39, 331–336.
Pandey, M.R., 1984b. Domestic smoke pollution and chronic bronchitis in a rural community of hill region of Nepal. Thorax 39, 337–339.
Pandey, M.R., Smith, K.R., Boleij, J.S.M., Wafula, E.M., 1989. Indoor air pollution in developing countries and acute respiratory infection in children. The Lancet 1, 427–429.
Pattanayak, S., Pfaff, A., 2009. Behavior, environment, and health in developing countries: evaluation and valuation. Annual Review of Resource Economics 1, 183–217.
Peabody, J.W., Nordyke, R.J., Tozija, F., Luck, J., Munoz, J.A., Sunderland, A., DeSalvo, K., Ponce, N., McCulloch, C., 2006. Quality of care and its impact on population health: a cross-sectional study from Macedonia. Social Science & Medicine 62 (9), 2216–2224.
Pitt, M.M., Rosenzweig, M.R., Hassan, M.N., 2006. Sharing the Burden of Disease: Gender, the Household Division of Labor and the Health Effects of Indoor Air Pollution. Mimeo.
Pope, C.A., Burnett, R.T., Turner, M.C., Cohen, A., Krewski, D., Jerrett, M., Gapstur, S.M., Thun, M.J., 2011. Lung cancer and cardiovascular disease mortality associated with ambient air pollution and cigarette smoke: shape of the exposure–response relationships. Environmental Health Perspectives 119, 1616–1621.
Rehfuess, E.A., Puzzolo, E., Stanistreet, D., Pope, D., Bruce, N., 2014. Enablers and barriers to large-scale uptake of improved solid fuel stoves: a systematic review. Environmental Health Perspectives 122 (2), 120–130.
Shindell, D., Kuylenstierna, J.C.I., Vignati, E., van Dingenen, R., Amann, M., Klimont, Z., Anenberg, S.C., Muller, N., Janssens-Maenhout, G., Raes, F., Schwartz, J., Faluvegi, G., Pozzoli, L., Kupiainen, K., Höglund-Isaksson, L., Emberson, L., Streets, D., Ramanathan, V., Hicks, K., Kim Oanh, N.T., Milly, G., Williams, M., Demkine, V., Fowler, D., 2012. Simultaneously mitigating near-term climate change and improving human health and food security. Science 335, 183–189.
Smith, K.R., McCracken, J.P., Weber, M.W., Hubbard, A., Jenny, A., Thompson, L.M., Balmes, J., Díaz, A., Arana, B., Bruce, N., 2011. Effect of reduction in household air pollution on childhood pneumonia in Guatemala (RESPIRE): a randomised controlled trial. The Lancet 378 (9804), 1717–1726.
Smith-Sivertsen, T., Díaz, E., Pope, D., Lie, R.T., Díaz, A., McCracken, J.P., Bakke, P., Arana, B., Smith, K.R., Bruce, N., 2009. Effect of reducing indoor air pollution on women’s respiratory symptoms and lung function: the RESPIRE randomized trial, Guatemala. American Journal of Epidemiology 170 (2), 211–220.
Tappan, G., Sall, M., Wood, E.C., Cushing, M., 2004. Ecoregions and land cover trends in Senegal. Journal of Arid Environments 59 (3), 427–462.
Tarozzi, A., Mahajan, A., Blackburn, B., Kopf, D., Krisham, L., Yoong, J., 2014. Micro-loans, insecticide-treated bednets, and malaria: evidence from a randomized controlled trial in Orissa, India. American Economic Review 104 (7), 1909–1941.
UNDP/WHO (United Nations Development Programme and World Health Organization), 2009. The Energy Access Situation in Developing Countries – A Review Focused on the Least Developed Countries and Sub-Saharan Africa. United Nations Development Programme, New York.
van Gemert, F., van der Molen, T., Jones, R., Chavannes, N., 2011. The impact of asthma and COPD in sub-Saharan Africa. Primary Care Respiratory Journal 20 (3), 240–248.
WEC/FAO (World Energy Council and Food and Agriculture Organization of the United Nations), 1999. The Challenge of Rural Energy Poverty in Developing Countries. World Energy Council, London.
Wendland, K.J., Pattanayak, S.K., Sills, E.O., 2015. National-level differences in the adoption of environmental health technologies: a cross-border comparison from Benin and Togo. Health Policy Planning 30 (2), 145–154.
WHO (World Health Organization), 2014. Burden of Disease from Household Air Pollution for 2012, http://www.who.int/phe/health topics/outdoorair/databases/FINAL HAP AAP BoD 24March2014.pdf (last accessed 26.04.14).
WHO (World Health Organization), 2009. Country Profile of Environmental Burden of Disease – Senegal. World Health Organization, Geneva.
World Bank, 2011. Household Cookstoves, Environment, Health, and Climate Change. A New Look on an Old Problem. World Bank, Washington.
Yu, F., 2011. Indoor air pollution and children’s health: net benefits from stove and behavioral interventions in rural China. Environmental and Resource Economics 50 (4), 495–514.
Zwane, A.P., Zinman, J., Van Dusen, E., Pariente, W., Null, C., Miguel, E., Kremer, M.,
Smith-Sivertsen, T., Díaz, E., Bruce, N., Díaz, A., Khalakdina, A., Schei, M.A., Karlan, D.S., Hornbeck, R., Giné, Y., Duflo, E., Devoto, F., Crepon, B., Banerjee,
McCracken, J.P., Arana, B., Klein, R., Thompson, L.M., Smith, K.R., 2004. Reducing A., 2011. Being surveyed can change later behavior and related parameter esti-
indoor air pollution with a randomized intervention design – a presentation of mates. Proceedings of the National Academy of Sciences of the United States of
the stove intervention study in the Guatemalan highlands. Norsk Epidemiologi America 108 (5), 1821–1826.
14 (2), 137–143.
Journal of Health Economics 42 (2015) 64–80
Article info

Article history:
Received 22 May 2014
Received in revised form 23 February 2015
Accepted 13 March 2015
Available online 23 March 2015

JEL classification:
I18
K32

Keywords:
Medical marijuana law
Marijuana use
Alcohol use
Natural experiment

Abstract

We estimate the effect of medical marijuana laws (MMLs) in ten states between 2004 and 2012 on adolescent and adult use of marijuana, alcohol, and other psychoactive substances. We find increases in the probability of current marijuana use, regular marijuana use and marijuana abuse/dependence among those aged 21 or above. We also find an increase in marijuana use initiation among those aged 12–20. For those aged 21 or above, MMLs further increase the frequency of binge drinking. MMLs have no discernible impact on drinking behavior for those aged 12–20, or the use of other psychoactive substances in either age group.

© 2015 Elsevier B.V. All rights reserved.
☆ The authors gratefully acknowledge the helpful comments on earlier drafts of this study from Sara J. Markowitz and David H. Howard. All errors are our own.
☆☆ The authors declare that they have no relevant or material financial interests that relate to the research described in this study. The study was approved by the Emory University Institutional Review Board (IRB) through an expedited review procedure.
∗ Corresponding author. Tel.: +1 4047911709.
E-mail addresses: hwen2@emory.edu (H. Wen), jason.hockenberry@emory.edu (J.M. Hockenberry), jrcummi@emory.edu (J.R. Cummings).
http://dx.doi.org/10.1016/j.jhealeco.2015.03.007
0167-6296/© 2015 Elsevier B.V. All rights reserved.

As of February 2015, 23 states and the District of Columbia have implemented medical marijuana laws (MMLs), which permit marijuana use for medical purposes. Three states (i.e., Maryland, Minnesota, and New York) adopted MMLs during 2014, and an additional 11 states¹ passed pro-medical marijuana legislation. Medical marijuana bills have also been considered in many of the remaining states and are likely to land on the legislative agenda in more states in the near future. Understanding the behavioral and public health implications of this evolving regulatory environment is critical for the ongoing implementation of MMLs and future iterations of marijuana policy reform. Despite the growing consensus about the relief medical marijuana can bring for a range of serious illnesses, concerns have been voiced that MMLs may give rise to increased marijuana use in the general population and increased use of other substances. Legislative and public attention have focused on these issues, but the empirical evidence is limited.

¹ 11 states with pro-medical marijuana legislation include Alabama, Florida, Iowa, Kentucky, Mississippi, Missouri, North Carolina, South Carolina, Tennessee, Utah, and Wisconsin.

We contribute to the literature on the effects of marijuana liberalization policies by examining the effect of the implementation of MMLs in ten states between 2004 and 2012 on a variety of substance use outcomes including marijuana use, alcohol use, pain medication misuse, and hard drug use in both adolescent and adult populations. To tease out the potential causal effect of MML implementation, we exploited the geographic identifiers in a restricted-access version of the National Survey on Drug Use and Health (NSDUH) micro-level data and estimated two-way fixed effects models with state-specific linear time trends and a rich set of individual- and state-level covariates.

We find that implementation of an MML leads to a relative 14 percent increase in the probability of past-month marijuana use and a 15 percent increase in the probability of almost daily/daily marijuana use among adults aged 21 or above. For this age group, MML implementation also results in a 10 percent increase in the probability of marijuana abuse/dependence. Among adolescents and young adults aged 12–20, we find a 5 percent increase in
the probability of past-year marijuana use initiation attributable to MML implementation.

In addition to the increases in marijuana use, implementation of an MML also increases the frequency of binge drinking among those aged 21 or above, partially through increasing simultaneous use of the two substances. In contrast, MML implementation does not affect underage drinking among those aged 12–20. In both age groups, non-medical use of prescription pain medication, heroin use, and cocaine use are unaffected.

Overall, our findings indicate that state implementation of an MML increases marijuana use, but has limited impacts on other types of substance use (i.e., underage drinking, pain medication misuse, and hard drug use), except for binge drinking among adults of legal drinking age.

The article proceeds as follows. Section 1 provides background information on medical marijuana and MMLs, outlines the theoretical framework, and summarizes the existing literature. Section 2 describes the data sources, variable measurement, and identification strategy. Section 3 presents the estimated policy effects, and the robustness checks. Concluding remarks are given in the last section of the article.

1. Background

1.1. Medical marijuana law and potential risks and medical value of marijuana

In the last two decades, growing evidence has lent support to the efficacy and safety of marijuana as medical therapy to alleviate symptoms and treat diseases (see, for instance, Ben Amar, 2006; Campbell and Gowran, 2007; Krishnan et al., 2009; Pertwee, 2012; Gloss and Vickrey, 2012). This growing body of clinical evidence on marijuana’s medicinal value has propelled many states toward a more tolerant legal approach to medical marijuana. In 1996, California signed the Compassionate Use Act into law (Proposition 215) and became the first state in the U.S. to permit the medical use of marijuana. And since then a total of 23 states and the District of Columbia have passed MMLs. These laws are intended to protect patients from state prosecution for their medical marijuana use (Hoffmann and Weber, 2010).²

² In contrast to the state MMLs, federal law continues to prohibit marijuana use for any purpose since the enactment of the Controlled Substances Act (CSA) of 1970. A 2005 Supreme Court decision (Gonzales v. Raich) reaffirmed that federal law enforcement has the authority to prosecute patients for medical marijuana use in accordance with state laws (Gostin, 2005). It is only recently that the Obama administration and the Department of Justice clarified the position that federal law enforcement resources should not be dedicated to prosecuting persons whose actions comply with their states’ permission of medical marijuana (Hoffmann and Weber, 2010). This change in the prosecutorial stance would strengthen the legitimacy of existing MMLs and pave the way for the passage of new MMLs.

Typically under an MML, a patient with an eligible condition should first obtain recommendation from a qualified doctor for the use of marijuana in medical treatment. With the doctor’s recommendation for medical marijuana use, the patient can then be issued a medical marijuana patient identification card by the state. The patient ID cardholder and his/her caregivers are allowed to possess a certain amount of marijuana through cultivation at home and/or purchase from a nonprofit retail dispensary licensed by the state (in some states called “compassionate center”).³ As such, MMLs in principle should only provide restricted legal protection and access to marijuana for a select group of patients. In practice however, the laws may have a spillover effect on marijuana use in the non-patient population.

³ Several more recent MMLs have taken innovative twists that are intended to tighten the regulation on access to medical marijuana. For instance, New York’s 2014 MML is the first in the U.S. to allow doctors in qualified hospitals to prescribe medical marijuana instead of recommending it. By allowing for medical marijuana prescription, the law in effect imposes more responsibility on the participating doctors for certifying patients’ medical need, as a doctor can be charged with a felony for prescribing marijuana to an ineligible patient.

The spillover effect may arise from four dimensions of the existing MMLs that create a de facto legalized environment for marijuana use in the general population (Pacula et al., 2013). First, although all MMLs specify a list of conditions that are eligible for medical marijuana,⁴ most MMLs include in the list a generic term “chronic pain”, rather than specific diseases causing the pain (e.g., neuropathy, fibromyalgia, rheumatoid arthritis, etc.) (Pacula et al., 2013). The interpretation of “chronic pain” can go far beyond the original legislative intent, analogous to the practice of off-label prescribing of other medications. Because pain can often be nondescript and difficult to verify clinically, a recreational user may pretend to be a pain patient in order to obtain a prescription for medical marijuana.

⁴ California is the only exception that allows medical marijuana for any condition “for which marijuana provides relief” and leaves the interpretation almost entirely to the discretion of doctors.

Second, some MMLs do not require establishment of a registry/renewal system to assess and monitor patient eligibility for medical marijuana. This, coupled with the loosely-defined eligibility criteria, further blurs the boundary between the patient and the non-patient population (Cohen, 2010).

Third, MMLs provide medical marijuana patients with access to the drug by allowing licensed retail dispensaries and/or home cultivation. These supply channels exist in a legal grey area and may proliferate as a result of the reduced threat of prosecution under the MMLs (Pacula et al., 2010). In particular, Anderson et al. (2013) provided empirical evidence that MMLs have led to a substantial increase in the supply of high-grade marijuana. As marijuana supply rises, it may become prohibitively expensive for law enforcement to ensure that the entire supply of marijuana intended for medical purpose ends up in the hands of legitimate patients, akin to how prescription opioids eventually find their way into the street drug market. This spillover to the non-patient population is likely to occur in places where marijuana possession is decriminalized, prosecution of a marijuana offense is local law enforcement’s “lowest priority”, and federal interference in marijuana regulation is limited (Sekhon, 2009).

In addition to those specific components of the law, an MML as a whole symbolizes liberalization of marijuana policy, which in turn, may give rise to the underestimation of the risks associated with marijuana use and the normalization of marijuana use for recreational purposes (Hathaway et al., 2011).

1.2. Literature on the effect of MML on marijuana use in the general population

Empirical evidence is inconclusive with respect to the effect of an MML on marijuana use in the general population. A review of this line of literature is beyond the scope of our paper. We direct readers to Chu (2014) for a comprehensive review. Briefly, however, we note that the mixed findings from the previous studies can be explained by the heterogeneity between different age groups examined and the variation in specific state laws covered by the studies.

Studies on youths generally find no significant effect of an MML on youth marijuana use (e.g., Harper et al., 2012; Lynne-Landsman et al., 2013; Anderson et al., 2011, 2012). The most comprehensive evidence comes from Anderson et al. (2011, 2012), which brings
together several commonly used data sets and covers an 18-year period from 1993 to 2011. The study findings suggest that implementing an MML does not lead to a significant increase in marijuana use among youths. Compared to the literature on youth marijuana use, the existing literature on the adult population is relatively thin and limited in scope and rigor (e.g., Harper et al., 2012; Anderson et al., 2011).

In addition to the potential heterogeneity in the response to an MML across age groups, MMLs may not be treated as a homogeneous set of laws between states and across time. The variation in specific state laws implemented during different periods may help reconcile the mixed findings from the previous studies. To explore this potential heterogeneity, a recent study by Pacula et al. (2013) uses the same data sets as Anderson et al. (2011, 2012) but replaces a single dichotomous MML indicator with a set of indicators that represent key provisions of MMLs. Although none of the estimates using a dichotomous MML indicator are significant, the MMLs that include a provision requiring patient registry/renewal are found to lower the marijuana use rates and marijuana-related treatment admissions. This protective effect of the patient registry/renewal requirement, however, is offset by another provision of MMLs that allows licensed retailers to dispense marijuana to medical marijuana patients. The third MML provision this study examines, the home cultivation provision, has inconsistent and sometimes counterintuitive effects on marijuana use. These study findings are informative as to the importance of distinguishing between MML provisions and recognizing the variation in state MMLs. A caveat, however, is that although Pacula et al. (2013) take a more nuanced approach to the classification of MMLs, they lump youths and adults together in their full-sample analysis. As a result, the aforementioned age heterogeneity may be obscured.

1.3. Spillover from marijuana use to the use of alcohol and other substances

On top of the spillover of marijuana use from medical marijuana patients to the non-patient population, the potential interdependence of substance use may lead to a further spillover from marijuana use to the use of other psychoactive substances.⁵ Assuming marijuana has a downward sloping demand curve, the effect of an MML on marijuana use should be unequivocally positive. The effect on other substance use, however, can be positive or negative, depending on the relative magnitude of the income and substitution effects (Chaloupka and Laixuthai, 1997; Pacula, 1998). Specifically, contemporaneous substitution of marijuana for another substance in response to the implementation of an MML is most likely to occur for substances that have pharmacological effects most similar to those of marijuana. A complementary relationship, on the other hand, is most likely to occur between marijuana and another substance if their combined use produces a synergistic interaction (Moore, 2010). In addition to the contemporaneous relationship between marijuana use and other substance use, there may also be a progression from the demand for marijuana to the craving and thus future demand for a more powerful substance with more intense and longer-lasting effects (Kandel, 1975, 2002).

⁵ However, if the increased marijuana use arising from an MML is not for recreational purpose (i.e., “intoxication”) but for medical purpose only, the use of other substances is unlikely to be affected.

1.3.1. Relationship between marijuana use and alcohol use

Marijuana and alcohol target many common neural pathways in human brains (Maldonado et al., 2006). On the one hand, marijuana use produces rewarding and sedative effects that are comparable to the effect of alcohol use (Boys et al., 2001; Heishman et al., 1997), especially low-dose alcohol consumption⁶ (King et al., 2011). In this case, when an MML lowers the cost of marijuana use, an individual may substitute marijuana for alcohol to achieve a similar experience such as a general sense of well-being, with perhaps fewer immediate negative physical symptoms (e.g., hangovers).

⁶ High-dose alcohol consumption, in contrast, tends to lower sedation and heighten stimulation (King et al., 2011).

On the other hand, the overall intoxication experience may be enhanced by the simultaneous use of marijuana and alcohol together. Evidence suggests that ethanol, especially when consumed in high doses, can facilitate the absorption of delta 9-tetrahydrocannabinol (THC) (Boys et al., 2001). In a randomized controlled trial (RCT) conducted by Lukas and Orozco (2001), participants reported significantly more episodes and longer durations of euphoria when consuming marijuana together with high doses of alcohol. The enhanced euphoria following simultaneous consumption of alcohol and marijuana may subsequently lead to a greater urge to drink even more. Such a scenario points toward a competing hypothesis that marijuana and alcohol, especially high-dose alcohol consumption, are complements rather than substitutes. In this case, an MML may result in the increased use of both substances.

The takeaway of these pharmacologic findings is that whether marijuana and alcohol are substitutes or complements may depend on individual motives for substance use. For instance, those who only expect a mild feeling of happiness and relaxation from substance use may consume one of the substances in place of the other. In contrast, those seeking intense euphoria would consume the two substances together, perhaps in higher doses.

1.3.2. Relationship between marijuana use and other substance use

Marijuana is also widely portrayed as a “gateway” drug, essentially inducing the use of drugs with more serious health, legal and social consequences (Kandel, 1975, 2002). One hypothesized pathway is through pharmacological mechanisms: once users tolerate the psychoactive effects of marijuana use, they may crave and seek out more powerful drugs with more intense and longer-lasting effects. This pharmacological mechanism would thus predict an increase in subsequent use of hard drugs such as heroin and cocaine attributable to the implementation of an MML.

An alternative to this pharmacological mechanism is that the observed sequence from marijuana use to hard drug use may simply reflect common predisposing factors rooted in genes or in the environment coupled with an exposure opportunity mechanism through which marijuana users may be introduced to a shared market or subculture of hard drugs (Morral et al., 2002; Wagner and Anthony, 2002a). If predisposing factors and exposure opportunities are the primary mechanisms that lead users to transition from marijuana use to hard drug use, an MML should not result in an increase in hard drug use because the predisposing factors and exposure opportunities⁷ for hard drug use remain unaffected.

⁷ The existing MMLs help marijuana users gain access to the drug through medical marijuana dispensaries and home cultivation, which are unlikely to expose the marijuana users to the market or subculture of hard drugs.

In contrast to the concern about MML’s “gateway” effect, there has been evidence that increased access to medical marijuana resulting from an MML may benefit certain individuals by reducing their opioid use. For instance, marijuana may provide analgesia for
patients with chronic pain (Lynch and Campbell, 2011). Thus, those who have already received opioid pain medication may experience improved pain relief and lower their opioid dose after they commence medical marijuana treatment. In addition, those who would have otherwise initiated opioid analgesics may choose medical marijuana instead (Abrams et al., 2011). Furthermore, marijuana may also benefit those with opioid misuse (i.e., non-medical use) by easing withdrawal symptoms and facilitating recovery (Scavone et al., 2013). Therefore, one would expect states with MMLs to see a reduction in prevalence of opioid use, or other downstream benefits such as reduced overdose mortality (Bohnert et al., 2011; Bachhuber et al., 2014).

1.4. Literature on the relationship between marijuana use and the use of alcohol and other substances

Through increased marijuana use, a further consequence of an MML could also be the spillover to alcohol use and the use of other psychoactive substances. Identification of the spillover effect in an observational study hinges on the isolation of the exogenous variation in substance use arising from policy/price shocks from the endogenous variation due to “common factors” or “exposure opportunities.”

Previous studies have exploited changes in state excise taxes on beer (Pacula, 1998), the minimum legal drinking age (MLDA) (DiNardo and Lemieux, 2001; Yörük and Yörük, 2011, 2013; Crost and Guerrero, 2012), composite market prices of alcohol (Saffer and Chaloupka, 1999) and market prices of cocaine (Saffer and Chaloupka, 1999; DeSimone and Farrelly, 2003) to tease out the exogenous changes in the use of alcohol or cocaine as well as the downstream use of marijuana. Although they generally find a direct policy/price effect on the use of the target substance itself (e.g., alcohol and cocaine) that follows a downward sloping demand curve, the downstream effect on marijuana use is mixed. Chaloupka and Laixuthai (1997), DiNardo and Lemieux (2001), Crost and Guerrero (2012), and Crost and Rees (2013) find evidence for a substitution between marijuana and alcohol. However, Pacula (1998), Saffer and Chaloupka (1999), and Yörük and Yörük (2011) find evidence supporting the complementarity hypothesis between marijuana and alcohol. Moreover, evidence from Saffer and Chaloupka (1999) and DeSimone and Farrelly (2003) suggests a complementarity between marijuana and cocaine.

Not only is there a lack of consistent evidence, it is also difficult to extrapolate the effect of an MML on the use of other substances from the estimated reduced-form effect of policy/price related to the other substances on the use of marijuana. This difficulty arises out of the nature of the underlying Marshallian demand function, which does not require symmetric relationships between substances (i.e., from substance A to B vs. from substance B to A), nor does it require symmetric responses to policy/price changes (i.e., permissive policy/lower price vs. restrictive policy/higher price). Thus it is possible for marijuana to be a substitute for alcohol when alcohol regulations become more restrictive but for alcohol to be a complement to marijuana when marijuana policies become more permissive.⁸

⁸ This asymmetric relationship between marijuana use and alcohol use may come into play in the context of the minimum legal drinking age (MLDA) vs. an MML: a teenager under the MLDA cannot legally acquire either alcohol or marijuana and may resort to illegal supply channels, whereas an experienced marijuana user living in an MML state can get both marijuana and alcohol with little effort. In essence, when identifying the relationship between marijuana and alcohol, using different policies may capture the decisions made by different groups from different choice sets. Thus, the results from one policy may not be applicable to another policy setting.

Within the context of MMLs, Anderson et al. (2013) provide evidence that states with MMLs see a reduction in alcohol-related traffic fatalities, alcohol consumption and beer sales. However, the authors do not have data on changes in marijuana use, thus their findings do not necessarily imply that marijuana is a substitute for (or a complement to) alcohol. In fact, when taking into account the key provisions of MMLs, the replication study by Pacula et al. (2013) suggests that the findings from traffic fatalities and alcohol consumption are more consistent with a complementarity hypothesis. Nonetheless, the authors are only able to assess two outcomes related to alcohol consumption, which limits the scope of their study.⁹

⁹ The first outcome in Pacula et al. (2013), any current alcohol use, may not carry as much weight as binge or heavy drinking in terms of health consequences and policy implications, especially for adults of legal drinking age. The other outcome, specialty alcohol abuse treatment admissions, may not show a clear picture of the alcohol abuse/dependence prevalence, since more than 90 percent of Americans who suffer from alcohol abuse/dependence do not receive any treatment for their conditions. Furthermore, a large proportion of those receiving the treatment only receive it in a self-help group (e.g., Alcoholics Anonymous) or in a primary care setting as opposed to a specialty alcohol abuse treatment setting (SAMHSA, 2013).

Another piece of evidence in the context of MMLs comes from Bachhuber et al. (2014), which assesses the mortality rate related to opioid overdose. The authors find a 25 percent reduction in the annual rate of opioid overdose mortality between 1999 and 2010 in states with MMLs compared to those without such laws. However, the unaccounted state heterogeneity in the underlying prevalence of opioid use or trajectory of overdose deaths may also contribute to the reduced mortality rate. Therefore, the reduction in opioid overdose mortality rate may not necessarily imply a substitution between marijuana and opioids.

In sum, the majority of the literature on the relationship between marijuana use and the use of alcohol and other substances relies on policy/price shocks other than MMLs for identification. Evidence from this line of literature is inconsistent and may not extrapolate to the effect of an MML. Existing literature in the context of MML, however, is relatively thin and limited in scope and rigor.

1.5. Significance of our study

To inform the current debate on MMLs and marijuana liberalization policies in general, we examine the effect of state implementation of MMLs between 2004 and 2012 on marijuana use, alcohol use, pain medication misuse, and hard drug use in both adolescent and adult populations. Our study advances the existing literature by: (i) providing one of the first estimates of the effect of MML implementation on adult marijuana use based on micro-level nationally-representative data, as well as the updated estimates for adolescent marijuana use based on the most recent data; (ii) estimating the effect of MML implementation on a variety of substance use outcomes with differential elasticities and expected harms; (iii) estimating the contemporaneous relationship between marijuana and alcohol and other substances within the context of MMLs; (iv) estimating explicitly the heterogeneous policy effects of key MML provisions between different age groups.

2. Methods

2.1. Data sources

We pooled nine years of cross-sectional data from a restricted-access version of the National Survey on Drug Use and Health (NSDUH) 2004–2012 (CBHSQ, 2013). NSDUH is a nationally and
state-representative¹⁰ survey sponsored by the Substance Abuse and Mental Health Services Administration (SAMHSA), and the primary source of information on substance use behavior by the U.S. civilian, noninstitutionalized¹¹ population aged 12 or above. The majority of the NSDUH interview is conducted by self-administered audio computer-assisted self-interviewing (ACASI), a highly private and confidential mode that encourages honest reporting of substance use and other sensitive behaviors (Johnson et al., 2010). The response rates range from 73 percent to 76 percent between 2004 and 2012.

2.2. Variable measurement

[…] during the past month.¹⁵ We created the following measures for alcohol use: (i) the total amount of drinks consumed during the past month,¹⁶ (ii) the unconditional frequency of binge drinking days, and (iii) the probability of being classified as having alcohol abuse/dependence during the past year according to the DSM-IV criteria. We also created two dichotomous indicators to assess: (iv) whether a respondent engaged both in marijuana use and in binge drinking during the past month, and (v) whether a respondent used marijuana while drinking alcohol (i.e., on the same occasion) during the past month.¹⁷ These two measures of simultaneous use of marijuana and alcohol can provide further insight into the contemporaneous complementarity between the two substances.
Table 1
Implementation and key provisions of state medical marijuana laws (MMLs).

State        Passed    Effective   Non-specific pain   Patient registry   Retail dispensary   Home cultivation
1996–2003 (8 states)
California   1996/11   1996/11     1996/11             n/a                1996/11             1996/11
Washington   1998/11   1998/11     1998/11             n/a                n/a                 n/a
Oregon       1998/11   1998/12     1998/12             2007/01            n/a                 1998/12
Alaska       1998/11   1999/03     1999/03             1999/03            n/a                 1999/03
Maine        1999/11   1999/12     n/a                 2009/12            2009/12             1999/12
Hawaii       2000/06   2000/12     2000/12             2000/12            n/a                 2000/12
Colorado     2000/11   2001/06     2001/06             2001/06            2001/06             2001/06
Nevada       2000/11   2001/10     2001/10             2001/10            n/a                 2001/10

Note: Maryland passed two laws, in 2003 and in 2011, favorable to medical marijuana, albeit not legalizing it.
a Despite the allowance for retail medical marijuana dispensaries under the laws, only four states actually opened their first dispensaries between 2004 and 2012: Colorado (2005/07), New Mexico (2009/06), Maine (2011/04), and New Jersey (2012/12).
b The effective date of the New Jersey MML is 2010/07 as specified in the statute, although Governor Chris Christie delayed its implementation.
c Most sections of the Connecticut MML came into effect upon its passage (2012/05), while a few sections took effect in 2012/10.
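
As a minimal sketch (not the authors' code), the dates in Table 1 can be turned into a state-by-month policy indicator that switches from 0 to 1 once an MML takes effect. The sketch assumes that the second date listed for each state in Table 1 is the effective date and that the indicator is defined at monthly resolution; the names `EFFECTIVE` and `mml_in_effect` are hypothetical. Only the eight states listed above are included, so states that implemented MMLs later would need their own entries before the helper could be used on real data.

```python
# Effective dates (year, month) for the eight 1996-2003 states in Table 1.
# Assumption: the second date listed for each state is the effective date.
EFFECTIVE = {
    "California": (1996, 11),
    "Washington": (1998, 11),
    "Oregon": (1998, 12),
    "Alaska": (1999, 3),
    "Maine": (1999, 12),
    "Hawaii": (2000, 12),
    "Colorado": (2001, 6),
    "Nevada": (2001, 10),
}


def mml_in_effect(state, year, month):
    """Return 1 if the state's MML is in effect in the given month, else 0.

    Hypothetical helper: states missing from EFFECTIVE are coded 0, which is
    only appropriate for illustration with this partial list of states.
    """
    start = EFFECTIVE.get(state)
    if start is None:
        return 0
    # Tuples compare lexicographically: first by year, then by month.
    return int((year, month) >= start)
```

The same pattern would extend to the four provision indicators by storing one date (or `None` for "n/a") per provision and state.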
In addition to examining the effect of implementation of an MML as a whole, Pacula et al. (2013) recognize the importance of scrutinizing the potential heterogeneous effects between individual components of an MML. As highlighted in their study, four key components that may be included in an MML and lead to heterogeneity in the policy effect are: (i) “non-specific pain” provision, which lists a generic “chronic pain” in the eligible conditions for medical marijuana, rather than specifying diseases causing the pain; (ii) “patient registry” provision, which requires a patient registry/renewal system; (iii) “retail dispensary” provision, which allows licensed marijuana retailers to dispense marijuana legally to medical marijuana patients; and (iv) “home cultivation” provision, which allows qualified patients and caregivers to grow a certain amount of marijuana plants indoors for the patients’ own medical use. Accordingly, we created four indicators, each representing the inclusion of a key MML provision. Note that for an MML state, the inclusion date of an MML provision may differ from the effective date of the MML, as the state may include the provision in the original statute, add it in a subsequent amendment, or not include it in the law until the end of the study period.
2.2.5. Covariates
We controlled for individual-level and state-level factors that are correlated with both the individual choice to use substances and with state decisions about MMLs. Individual-level covariates for adolescents and adults include a rich set of sociodemographic characteristics. State-level covariates include three time-varying measures reflecting the fluctuation in state economic conditions: (i) unemployment rate, (ii) average personal income, and (iii) median household income of the state, as well as two additional measures reflecting relevant changes in state policy environment. One major policy change during the study period concerns state implementation of beer taxes.21 The other policy change is marijuana decriminalization/depenalization: Massachusetts, California, and several cities and counties in other states relaxed penalties for recreational marijuana use or placed it at “the lowest law enforcement priority.” We therefore created a dichotomous indicator for the implementation of a decriminalization/depenalization policy in a given state during a given month.22 Table 2 provides a descriptive summary of the individual-level and state-level covariates discussed above.
2.3. Identification strategy
To identify the effect of MML implementation on individual marijuana use, alcohol use, pain medication misuse, and hard drug use, we estimated the following two-way fixed effects models:
$$Y_{ist} = \beta_0 + \beta_1 MML_{st} + \beta_2 X1_{ist} + \beta_3 X2_{st} + \delta_s + \tau_t + \theta_s t + \varepsilon_{ist} \quad (1)$$
where $i$ denotes an individual, $s$ denotes the state, and $t$ denotes the year. $Y_{ist}$ represents the substance use outcomes. $MML_{st}$ is the policy indicator for the implementation of an MML in state $s$ during year $t$. $X1_{ist}$ is the full vector of individual-level covariates. $X2_{st}$ is the full vector of state-level covariates. The two-way fixed effects are captured in our models by $\delta_s$ and $\tau_t$ to account for the time-invariant state heterogeneity as well as the national secular trend
the estimated policy effects on the main outcomes are very similar across the models.
21
We did not control for the market price of heroin or cocaine. The most commonly used source is the U.S. Drug Enforcement Administration’s System to Retrieve Information from Drug Evidence (STRIDE) data set. Empirical studies often find that STRIDE prices are not predictive or only weakly predictive of drug use (Horowitz, 2001). As French and Popovici (2011) pointed out, “part of difficulty here is that conventional prices for illicit drug are not readily available and alternative measures are not yet found.” Nonetheless, fluctuations in heroin prices and cocaine prices are unlikely to be correlated with the MML implementation, thus omitting these variables is unlikely to bias our results.
22
For lack of policy variations during the study period, the effect of a decriminalization/depenalization policy itself cannot be precisely estimated.
70 H. Wen et al. / Journal of Health Economics 42 (2015) 64–80
Table 2
Descriptive summary of individual- and state-level covariates, sampling-weight adjusted.
MML states No and always MML states MML states No and always MML states
and common shocks related to substance use. We also included state-specific linear time trends $\theta_s t$ to account for the unobserved state-level factors that evolve over time at a constant rate (e.g., social norms and public sentiments related to substance use).
Standard errors were clustered at the state level to correct for the serial correlation. The clustered standard errors allow for arbitrary within-state correlation in error terms but assume independence across the states (Bertrand et al., 2004).23
We stratified the sample into two age groups, adolescents and young adults aged 12–20 (N ≈ 269,500) and adults aged 21 or above (N ≈ 323,900). We chose age 21 as the cut-off point in light of the previous evidence of an age 21 discontinuity in both alcohol use and marijuana use (Crost and Guerrero, 2012; Yörük and Yörük, 2011, 2013). We tested four cut-off points in our analyses, age 18, age 21, age 25 and age 30. Only the age 21 stratification, which
also coincides with the legal drinking age, produces significant and meaningful differences in the estimated policy effect between age groups.
We estimated Probit regressions for the dichotomous dependent variables in our study. The other three discrete dependent variables we study (i.e., the conditional frequency of marijuana use days, the number of alcohol drinks, and the unconditional frequency of binge drinking days) possess positive skewness and/or “excess zeroes” compared to a standard normal distribution, which requires a more flexible estimation approach than an ordinary least squares (OLS) estimation. A generalized linear model (GLM) with a gamma distribution and log link24 was estimated for the total amount of drinks during the past month among those aged 21 or above. For the total amount of drinks among those aged 12–20, on the other hand, we estimated a two-part model (TPM) using Probit in the first part and GLM (gamma distribution and log link) in the second part. Because there is an explicit decision process regarding legality of alcohol consumption among those under 21, we use the TPM to model the decision to engage in underage drinking and the quantity consumed conditional upon deciding to engage in underage drinking as separate processes. We followed the same logic when estimating the frequency variables. Considering the underlying decision processes and the proportions of zero values, we estimated a zero-truncated negative binomial regression25 for the conditional frequency of marijuana use days and a zero-inflated negative binomial regression26 for the unconditional frequency of binge drinking days in both age groups.
For ease of interpretation, we converted the coefficient of MMLst in each of the estimations to the average marginal effect calculated at MMLst = 0 and the observed values of other covariates.
3. Results
3.1. Estimated effect of MML implementation on marijuana use
Fig. 1 shows an upward trend in past-month marijuana use rates among adults aged 21 or above in parallel with the implementation of MMLs. A relative increase in adult marijuana use in MML states emerges immediately after the laws take effect, and persists at least three years afterwards. Among adolescents and young adults aged 12–20, however, the corresponding trend in past-month marijuana use rates is not consistent. Bear in mind that the relative trends shown in Fig. 1 are equivalent to unadjusted DD estimates that only partial out the two-way fixed effects (i.e., time-invariant state heterogeneity and national secular trend in past-month marijuana use), but do not adjust for the individual- and state-level covariates or state-specific linear trends. Nonetheless, this observational trend-comparison suggests a potential association between MML implementation and increased current marijuana use among adults aged 21 or above, but not among adolescents and younger adults.
Table 3 presents the marginal effects of MML implementation on the four marijuana use outcomes, adjusted for the two-way fixed effects, the full vector of individual- and state-level covariates, and the state-specific linear trends. Among adults aged 21 or above, the implementation of an MML increases the probability of using marijuana during the past month by 1.32 percentage points (Panel B, Column 1, Row 1). This percentage point change can be translated into a 14 percent relative increase from a baseline predicted marijuana use probability of 9.33 percent.
The NSDUH data do not allow us to distinguish between medical marijuana patients and the non-patient population. Nonetheless, according to the registry data (Anderson et al., 2013), the number of registered medical marijuana patients accounts for an average of 0.8 percent of the population across the five MML states on which the registry information is available. Therefore, the 1.3 percentage point increase in the probability of marijuana use we find among adults aged 21 or above is not likely to come exclusively from an increase in use among registered patients. Though we cannot test this directly, it suggests that there may also be a considerable spillover effect of MML implementation on recreational marijuana use or self-medication by the non-patient population.
Among adults aged 21 or above, we also find a 0.58 percentage point or a 15 percent increase in the probability of almost daily/daily marijuana use (Panel B, Column 2, Row 1) attributable to MML implementation. Among adolescents and young adults aged 12–20, in contrast, no change in the probability or frequency of past-month marijuana use can be attributed to MML implementation (Panel A, Columns 1–3).
With regard to marijuana use initiation during the preceding year, MML implementation leads to a 0.32 percentage point or a 5 percent increase in the probability of first-time marijuana use among adolescents and young adults aged 12–20 (Panel A, Column 4, Row 1). Yet, the lack of a policy effect on the probability and frequency of past-month marijuana use among this age group suggests that many of these young people may be engaging in experimental use with relatively low health, behavioral, and social consequences. In other words, these findings are consistent with a scenario in which adolescents and young adults aged 12–20 who experiment with marijuana use in response to an MML are not transitioning to regular use, at least in the short term.
In contrast to the findings among adolescents and younger adults, we find no change in marijuana use initiation among those aged 21 or above (Panel B, Column 4) as a result of MML implementation, despite the aforementioned significant increases in any past-month marijuana use and almost daily/daily use (Panel B, Columns 1 and 2). These findings suggest that the adults who respond to an MML by increasing current and regular use come largely from those who first tried marijuana long before its medical use was permitted. After the introduction of an MML that helped reduce costs of marijuana use (i.e., market prices as well as non-market health, legal and social consequences), those with prior marijuana use experience would likely reinitiate or increase their marijuana use.
24
The selection of distribution family under the GLM was made based on the modified Park test results.
25
The likelihood ratio test for overdispersion rejects a Poisson distribution in favor of a negative binomial distribution.
26
The likelihood ratio tests for overdispersion reject a Poisson distribution in favor of a negative binomial distribution. Furthermore, the Vuong tests for zero-inflation confirm our choice of a zero-inflated model instead of an ordinary negative binomial model. The zero-inflated Poisson/negative binomial model assumes that the sample consists of two distinct groups of people: one group whose counts are generated by the standard Poisson/negative binomial model, and the other group, the so-called “absolute zero” group, who have zero probability of a count greater than zero; observed zeroes can come from either group (Greene, 2011; Wang, 2003). The absolute zero group, in our case, may be those who abstain from alcohol for religious, cultural, familial or other reasons. Thus, this group of people, as distinct from the majority of people who drink alcohol at least occasionally, have “absolute zero” risk of binge drinking.
An alternative to a zero-inflated regression is a hurdle model (i.e., a TPM for counts) with first-part Probit and second-part zero-truncated negative binomial. A practical challenge, however, is that cluster-adjusted standard errors are difficult to compute when combining the first- and second-part estimates from a hurdle model (Belotti et al., 2014). Nonetheless, the point estimates for the combined effects we obtained from the hurdle models (not shown) were very similar to the zero-inflated negative binomial estimates from our main analyses. In another set of sensitivity analyses, we also treated the count variables as continuous and estimated the combined marginal effects and their cluster-adjusted standard errors using the STATA command “TPM” (Belotti et al., 2014). The TPM estimates (not shown) were slightly larger and more significant than the zero-inflated negative binomial estimates.
Fig. 1. Pre- and post-trend in past-month marijuana use rates in medical marijuana law (MML) states relative to the control states. Note: The differences in past-month marijuana use rates are equivalent to unadjusted difference-in-differences (DD) estimates that partial out the two-way fixed effects but do not adjust for individual- and state-level covariates or state-specific linear trends. Time 0 is centered at the period when each MML state started to implement its law, so time 1 represents the first full month subsequent to the effective date of an MML. We calculate the differences between each MML state and the control states during each month, and average them across all 10 MML states and over a 3-month period (to smooth the fluctuations in the monthly rate). Whiskers indicate 95% confidence intervals.
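The event-time construction described in the Fig. 1 note can be sketched as follows. This is only an illustration under stated assumptions: the function names and the input layout (a dictionary from calendar-month index to use rate) are hypothetical, and the 3-month smoothing is assumed here to use non-overlapping bins, which the note does not specify.

```python
from collections import defaultdict

def event_time_gaps(mml_rate, control_rate, effective_month):
    """Gap (MML state minus control states) re-indexed so that the
    effective month of the law is event time 0."""
    return {m - effective_month: mml_rate[m] - control_rate[m]
            for m in mml_rate if m in control_rate}

def average_across_states(gaps_by_state):
    """Average the gap at each event time over the states observed there."""
    buckets = defaultdict(list)
    for gaps in gaps_by_state:
        for k, gap in gaps.items():
            buckets[k].append(gap)
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}

def three_month_bins(avg_gap):
    """Assumed smoothing: non-overlapping 3-month bins (floor division,
    so event months -3..-1 fall in bin -1 and 0..2 fall in bin 0)."""
    bins = defaultdict(list)
    for k, gap in avg_gap.items():
        bins[k // 3].append(gap)
    return {b: sum(v) / len(v) for b, v in sorted(bins.items())}

# Hypothetical toy data: two MML states with different effective months.
state_a = event_time_gaps({3: 10.0, 4: 10.0, 5: 12.0, 6: 12.0, 7: 12.0},
                          {3: 10.0, 4: 10.0, 5: 10.0, 6: 10.0, 7: 10.0}, 5)
state_b = event_time_gaps({3: 8.0, 4: 9.0, 5: 9.0, 6: 9.0},
                          {3: 8.0, 4: 8.0, 5: 8.0, 6: 8.0}, 4)
smoothed = three_month_bins(average_across_states([state_a, state_b]))
```

Centering each state on its own effective date is what allows laws adopted in different calendar months to be pooled into a single pre/post profile.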
3.2. Estimated effect of MML implementation on alcohol use
To the extent that alcohol is a complement to or substitute for marijuana, the effect of MML implementation on marijuana use may spread to alcohol use (Table 4). Our estimates indicate that, among adults aged 21 or above, MML implementation is not associated with the total number of drinks (Panel B, Column 1), but positively associated with the frequency of binge drinking. Our estimates indicate an effect size of 0.16 more binge drinking days or a relative increase of 10 percent (Panel B, Column 2, Row 1). The spillover increase in binge drinking implies a complementary relationship between marijuana use and high-dose alcohol consumption among adults aged 21 or above. Not only is this contemporaneous complementarity reflected in the independent measures of marijuana use and binge drinking, it is further confirmed by the measure of simultaneous use of the two substances. Among adults aged 21 or above, we find a 1.44 percentage point or a 22 percent increase in the probability of both marijuana use and binge drinking during the past month (Panel B, Column 3, Row 1) and a 0.82 percentage point or an 18 percent increase in the probability of marijuana use while drinking (i.e., on the same occasion) as a result of MML implementation (Panel B, Column 4, Row 1).
Among adolescents and young adults aged 12–20, we find no significant change in any measure of alcohol use (Panel A), which suggests that the increased marijuana use initiation we reported previously is unlikely to spread to underage drinking.
3.3. Immediate and delayed effect of MML implementation on other downstream outcomes
In addition to marijuana use and binge drinking, MML implementation may have a spillover effect on marijuana abuse/dependence, alcohol abuse/dependence, non-medical use of prescription pain medication, and the use of hard drugs such as heroin and cocaine. The progression from marijuana use and binge drinking to these downstream outcomes may be a gradual transition (Wagner and Anthony, 2002b). As such, we estimated not only the contemporary policy effect but also the one-year and two-year lagged policy effect (Table 5).
The effect arguably most salient to the public health implications of MMLs is the effect on marijuana abuse/dependence among adults aged 21 or above. We found a delayed policy effect on increasing the probability of marijuana abuse/dependence by a relative 10 percent (Panel B, Column 1, Rows 2 and 3). The
Table 3
Estimated marginal effect of implementation and provisions of medical marijuana laws (MMLs) on marijuana use.
Note:
Standard errors in parentheses are clustered at the state level.
Baseline predicted mean is calculated as the average of predicted probabilities/counts when setting MMLst to 0 and leaving the other covariates as the observed values.
*Significant at the 10 percent level.
**Significant at the 5 percent level.
***Significant at the 1 percent level.
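The baseline predicted mean and the marginal effect described in the table note can be illustrated with a small sketch for a Probit outcome. It assumes the marginal effect is the discrete change in predicted probability when the MML indicator moves from 0 to 1 with other covariates held at observed values; the function names and the toy linear indices are hypothetical. The last helper reproduces the relative changes quoted in the text from the reported effects and baselines.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_baseline_and_ame(index_without_mml, beta_mml):
    """index_without_mml: each person's fitted linear index with MML set to 0.
    Returns (baseline predicted mean, average discrete-change effect of MML)."""
    base = [norm_cdf(xb) for xb in index_without_mml]
    treated = [norm_cdf(xb + beta_mml) for xb in index_without_mml]
    n = len(base)
    baseline_mean = sum(base) / n
    ame = sum(t - b for t, b in zip(treated, base)) / n
    return baseline_mean, ame

def relative_change(effect_pp, baseline_pct):
    """Percentage-point effect expressed as a percent of the baseline."""
    return 100.0 * effect_pp / baseline_pct

# Reported values: 1.32 points on a 9.33 percent baseline (past-month use),
# 0.58 points on a 3.78 percent baseline (almost daily/daily use).
past_month_rel = relative_change(1.32, 9.33)
daily_rel = relative_change(0.58, 3.78)
```

Rounded to whole percents, these correspond to the 14 percent and 15 percent relative increases reported for adults aged 21 or above.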
Table 4
Estimated marginal effect of implementation and provisions of medical marijuana laws (MMLs) on alcohol use.
Note:
Standard errors in parentheses are clustered at the state level.
Baseline predicted mean is calculated as the average of predicted probabilities/counts when setting MMLst to 0 and leaving the other covariates as the observed values.
*Significant at the 10 percent level.
**Significant at the 5 percent level.
***Significant at the 1 percent level.
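For the number of drinks among those aged 12–20, the two-part model combines a Probit participation equation with a log-link GLM for the amount consumed among drinkers. A minimal sketch of how the two parts combine into an unconditional mean and a combined policy effect is given below; the function names and inputs are hypothetical, and the fitted linear indices are taken as given rather than estimated.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_part_mean(index_part1, index_part2):
    """E[y] = P(y > 0) * E[y | y > 0], with a Probit first part and a
    log-link second part (conditional mean = exp(linear index))."""
    return norm_cdf(index_part1) * math.exp(index_part2)

def two_part_effect(rows, b1_mml, b2_mml):
    """rows: (part-1 index, part-2 index) per person with MML set to 0.
    Returns the average change in E[y] when MML switches from 0 to 1."""
    diffs = [two_part_mean(a + b1_mml, b + b2_mml) - two_part_mean(a, b)
             for a, b in rows]
    return sum(diffs) / len(diffs)
```

Because the combined effect mixes coefficients from two separately estimated equations, its standard error does not follow directly from either part, which is the practical difficulty noted for hurdle-type models.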
increase in marijuana abuse/dependence of such magnitude is of concern. It suggests that those who used marijuana in response to MML implementation are at high risk of progressing to abuse/dependence.
For both age groups, we found neither an immediate nor a delayed effect of MML implementation on other downstream outcomes including alcohol abuse/dependence, non-medical use of prescription pain medication, heroin use and cocaine use.
3.4. Policy heterogeneity between key MML provisions
Our main estimates, in essence, capture the average policy effect across all ten MMLs implemented between 2004 and 2012. However, the policy effect of each of these laws may not necessarily have the same magnitude or even the same direction. As noted by Pacula et al. (2013), four key MML provisions, namely the ambiguity in “non-specific pain”, the requirement for patient registry/renewal
Table 5
Estimated immediate and delayed marginal effect of implementation of medical marijuana laws (MMLs) on marijuana abuse/dependence, alcohol abuse/dependence,
prescription pain medication misuse, cocaine use, and heroin use.
Note:
Standard errors in parentheses are clustered at the state level.
Baseline predicted means in square brackets are calculated as the average of predicted probabilities/counts when setting MMLst to 0 and leaving the other covariates as the
observed values.
*Significant at the 10 percent level.
**Significant at the 5 percent level.
***Significant at the 1 percent level.
system, the allowance for retail dispensaries, and the permission for home cultivation, may have different implications for people’s marijuana use behavior. Specifically, the “patient registry” provision may in effect reduce marijuana use in the general population. This protective effect of the “patient registry” provision, however, can be offset by the effect of the “retail dispensary” provision, which increases marijuana use significantly. In contrast to Pacula et al. (2013), our study finds no consistent protective or offsetting effect in either provision (Tables 3 and 4, Panels A and B, Rows 2–4). A plausible explanation is the discrepancy between the time when a “patient registry” provision or a “retail dispensary” provision was included in a state’s MML and the time when the state’s registry/renewal system or its legal dispensaries actually began to operate (Anderson and Rees, 2014). Due to the controversy and complexity surrounding its implementation, the time lag between the effective date of a “retail dispensary” provision and the actual opening of the first medical marijuana store may be particularly long.27 Although we find no consistent effect of “patient registry” or “retail dispensary”, we observe a consistent and significant effect of the “non-specific pain” provision on increasing marijuana use, binge drinking and simultaneous use of marijuana and alcohol among adults aged 21 or above. The observed effect of the “non-specific pain” provision suggests that including a generic term “chronic pain” in the eligible conditions for medical marijuana may extend the patient base to adults with less severe conditions or possibly those who pretend to be pain patients. Nonetheless, considering the limited policy variations across the four MML provisions during our study period, the estimated individual effects of these provisions should be interpreted with caution.28
3.5. Policy endogeneity of MML adoption
MMLs are geographically concentrated: the states that have adopted MMLs are all in the West and Northeast. This geographic similarity raises concern that there may be some past disturbances in marijuana use in these regions leading to their adoption of MMLs and not accounted for by the state fixed effects and the state-specific linear trends. In other words, MML adoption may be endogenous to marijuana use. To check for this potential policy endogeneity, specifications with a series of lagged and leading indicators for adopting an MML were estimated for the probability of past-month marijuana use (Table 6). We find that only the contemporary and 6-month lagged policy indicators had significant effects, and the indicators for approved but not implemented MMLs and the 12-month policy lag had moderate albeit imprecisely estimated effects. All the leads had small and statistically insignificant effects (Panel B, Column 2). These estimates suggest that it is in fact the policy shock from adopting an MML that drives the changes in marijuana use, rather than some past disturbances in marijuana use that drive the adoption of an MML.
3.6. State-aggregate effect of MML implementation
To further check the robustness of our individual-level estimates with regard to serial correlation, we aggregated the data to the state level and estimated the effect of MML implementation on state-level prevalence rates of our main individual-level findings.29
27
Anderson and Rees (2014) pointed out that, for instance, Colorado included a “retail dispensary” provision in its original MML effective in 2001, but medical marijuana dispensaries did not become commonplace until 2009. Moreover, Maine and Rhode Island added “retail dispensary” provisions to their MMLs in 2009, but the first legal dispensary in Maine did not open until 2011 and the first Rhode Island dispensary did not open until 2013.
28
From a statistical standpoint, a substantial policy effect from one or two states could potentially account for the overall findings. We tested for the heterogeneous policy effect between states by replacing the single indicator for MML implementation with ten separate indicators for MML implementation in each of the MML states. We find, in most cases, across-the-board significant policy effects in the same direction, albeit with varied effect sizes (Appendix 4). We cannot come to a conclusion, therefore, as to whether the heterogeneous policy effect comes from states’ unique experiences with implementing the MMLs or their inclusion/exclusion of certain provisions.
29
In Columns 1 and 3 of Table 7, we clustered the standard errors at the state level; while in Columns 2 and 4, we removed the time-series information from the standard errors by averaging the pre-MML data and the post-MML data (Donald and
Table 6
Robustness check for policy endogeneity by including policy leads and lags.
(1) (2)
% past-month marijuana use % past-month marijuana use
Note:
Standard errors in parentheses are clustered at the state level.
Baseline predicted mean in square brackets is calculated as the average of predicted probabilities/counts when setting MMLst to 0 and leaving the other covariates as the
observed values.
*Significant at the 10 percent level.
**Significant at the 5 percent level.
***Significant at the 1 percent level.
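A lead/lag specification of this kind replaces the single policy indicator with a set of timing indicators. The sketch below shows one way such indicators could be built for a state-month observation. The window boundaries and indicator names are assumptions chosen for illustration; the exact windows behind Table 6 are not given in this section.

```python
def policy_timing_indicators(month, effective_month):
    """Mutually exclusive lead/lag dummies for one state-month.
    k is the number of months since the law took effect (negative = before).
    Windows are hypothetical: 6-month bins for two leads, the first six
    months in effect, a 6-month lag, and 12 or more months in effect."""
    k = month - effective_month
    return {
        "lead_12": int(-12 <= k < -6),
        "lead_6": int(-6 <= k < 0),
        "current": int(0 <= k < 6),
        "lag_6": int(6 <= k < 12),
        "lag_12_plus": int(k >= 12),
    }

# Example: a law effective in month 100, observed 7 months later.
example = policy_timing_indicators(107, 100)
```

Observations more than 12 months before the effective date have all indicators equal to zero and serve as the reference period. Small and insignificant coefficients on the lead indicators are what supports reading the post-implementation changes as a response to the law rather than a continuation of earlier trends.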
The previously highlighted policy effects on youth marijuana use initiation, as well as on adult past-month marijuana use, marijuana almost daily/daily use, marijuana abuse/dependence, past-month binge drinking, and simultaneous use of marijuana and alcohol remain significant with similar effect sizes in these state-level estimates (Table 7).
4. Discussion
Three main pieces of evidence from our study inform the policy discussions of MMLs. First, we find a significant effect of MML implementation on increasing marijuana use. Estimates suggest that the populations responsive to MMLs are adolescents and young adults aged 12–20 who experimented with marijuana for the first time and adults aged 21 or above who tried marijuana prior to the introduction of the law. This latter group also has an increased risk of progression to almost daily/daily marijuana use and marijuana abuse/dependence.30 We caution that even if we assume the increases in any marijuana use and regular use come from those who use the drug for legitimate medical purposes, there may still be a possibility that marijuana abuse/dependence would increase as a result of MML implementation. The effect of MML implementation on marijuana abuse/dependence constitutes a potential public health concern similar to that of the prescription drug abuse epidemic in the U.S. (CDC, 2012).
Second, among those aged 21 or above, we find a spillover effect of MML implementation on the increasing frequency of binge drinking, possibly through increased use of the two substances simultaneously. The complementarity between marijuana use and binge drinking among adults of legal drinking age could magnify the expected harms of an MML. As Pacula and Sevigny (2014) commented, “even if consumption (of marijuana) were assumed to rise by 100 percent, the savings of liberalization policies would dwarf the known health costs associated with using marijuana. However, all potential savings . . . could be entirely erased, and tremendous losses incurred, if alcohol and marijuana turn out to be economic complements.” The 10 percent increase in the frequency of binge drinking and the 18–22 percent increase in the probability of
Table 7
Robustness check for serial correlation by examining state-aggregated data.
Note:
Standard errors in parentheses are clustered at the state level.
Baseline predicted mean in square brackets is calculated as the average of predicted probabilities/counts when setting MMLst to 0 and leaving the other covariates as the
observed values.
a
We average the pre-MML data and the post-MML data (Donald and Lang, 2007) following a two-step procedure described in Bertrand et al. (2004, p. 267). The second-step
equation is estimated based on pre- and post-MML two-period panels of 10 “MML states”. The standard errors are adjusted to take into account the small number of “MML
states” (Donald and Lang, 2007).
*Significant at the 10 percent level.
**Significant at the 5 percent level.
***Significant at the 1 percent level.
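The pre/post averaging described in note a can be sketched as below. This is a simplified illustration, not the procedure as implemented for Table 7: it assumes the first step (residualizing state-level outcomes on fixed effects and covariates) has already been done, treats the second step as a two-period panel with state fixed effects so that the estimate equals the mean within-state post-minus-pre difference, and uses a plain standard error without the small-sample adjustment mentioned in the note. All names are hypothetical.

```python
import statistics

def pre_post_means(residuals, effective_month):
    """residuals: dict month -> first-step residual for one MML state.
    Collapses the time series into one pre-MML and one post-MML average."""
    pre = [r for m, r in residuals.items() if m < effective_month]
    post = [r for m, r in residuals.items() if m >= effective_month]
    return statistics.mean(pre), statistics.mean(post)

def two_period_effect(states):
    """states: list of (residuals, effective_month), one entry per MML state.
    Returns the mean post-minus-pre difference and its unadjusted
    standard error across states."""
    diffs = []
    for residuals, effective_month in states:
        pre, post = pre_post_means(residuals, effective_month)
        diffs.append(post - pre)
    effect = statistics.mean(diffs)
    se = statistics.stdev(diffs) / len(diffs) ** 0.5
    return effect, se

# Hypothetical toy residuals for two states, both effective in month 3.
effect, se = two_period_effect([
    ({1: 0.0, 2: 0.0, 3: 1.0, 4: 1.0}, 3),
    ({1: 0.0, 2: 2.0, 3: 3.0}, 3),
])
```

Collapsing each state to two observations discards the within-state time series, so serial correlation in the monthly data can no longer understate the standard errors.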
simultaneous marijuana and alcohol use31 that we estimated may result in considerable economic and social costs from downstream health care expenditures and productivity loss (Naimi et al., 2003).
It is worth noting that this implied complementarity between marijuana use induced by an MML and binge drinking does not necessarily contradict a conclusion made by Anderson et al. (2011, 2012) that the implementation of an MML results in reduced traffic fatalities, and that the reduction is more pronounced in those involving alcohol. A possible interpretation that may reconcile our findings with theirs is that MML implementation may lead to a shift of alcohol consumption from public places such as restaurants and bars to one’s own home. Thus, we may see a reduction in traffic fatalities, even if the implementation of an MML, in effect, increases binge drinking and simultaneous use of both alcohol and marijuana. The reduced traffic fatalities may result from the fact that those potential high-risk drivers are now more likely to stay at home and less likely to engage in driving.
Third, neither underage drinking among those aged 12–20 nor other substance use (i.e., non-medical use of prescription pain medication, heroin use and cocaine use) in either age group is affected by MML implementation. In this regard, the often-voiced concerns about the potential gateway effect of marijuana are not supported by our findings. We caution that our study is not intended to refute the gateway hypothesis. Rather, it suggests that the gateway effect is not likely to occur in the context of an MML: for those who respond to MML implementation and use marijuana, their marijuana use is not likely to act as a gateway to more dangerous substance use through the pharmacological properties of marijuana.32 On the other hand, our findings do not lend support to an area of potential benefits of the law either, which is to benefit those who misuse opioid pain medication by helping them ease opiate withdrawal
31
The interaction between marijuana and alcohol may magnify the risks posed by the two substances individually (Liguori et al., 2002; Medina et al., 2007).
32
Nonetheless, marijuana may still be a gateway drug for other marijuana users through other pathways. For instance, those who use marijuana regardless of the laws or those who use marijuana in response to decriminalization may progress to hard drug use because marijuana introduces them to a shared market or subculture of hard drugs.
symptoms and achieve success in early recovery. However, NSDUH only includes questions about “non-medical use” of pain medication, so we cannot examine the effect of MML implementation on patients who use pain medication according to the prescription. The previously documented beneficial effect of an MML on reducing opioid overdose mortality may primarily come from this group of legitimate pain patients.33 An MML may benefit these patients by allowing them to start with medical marijuana treatment in lieu of opioid pain medication or to switch partially or entirely from opioids to marijuana. Whether and to what extent the legitimate pain patients may benefit from MML implementation merits further investigation, but is beyond the scope of our study.
Taken together, our study findings provide evidence for a significant effect of MML implementation on increasing marijuana use, and a spillover effect among adults of legal drinking age from increased marijuana use to increased binge drinking. The findings do not, however, provide evidence to support other types of substance use spillovers such as underage drinking, pain medication misuse, and hard drug use.
Appendix 1. DSM-IV criteria for substance abuse and substance dependence
Substance dependence
A maladaptive pattern of substance use leading to clinically significant impairment or distress, as manifested by 3 or more of the following occurring at any time in the same 12-month period:
1. Tolerance or markedly increased amounts of the substance to achieve intoxication or desired effect or markedly diminished effect with continued use of the same amount of substance.
2. Characteristic withdrawal symptoms or the use of certain substances to relieve or avoid withdrawal symptoms.
3. Use of a substance in larger amounts or over a longer period than was intended.
4. Persistent desire or unsuccessful efforts to cut down or control substance use.
5. Involvement in chronic behavior to obtain or use the substance, or recover from its effects.
6. Important social, occupational or recreational activities given up or reduced due to substance use.
7. Continued substance use despite knowledge of a persistent or recurrent physical or psychological problem that is likely to have been caused or exacerbated by the substance.
Substance abuse
A maladaptive pattern of substance use leading to clinically significant impairment or distress, as manifested by 1 or more of the following occurring at any time in the same 12-month period:
1. Recurrent substance use resulting in a failure to fulfill major role obligations at work, school, or home (e.g., repeated absences or poor work performance related to substance use; substance-related absences, suspensions, or expulsions from school; neglect of children or household).
2. Recurrent substance use in physically hazardous situations (e.g., driving an automobile or operating a machine when impaired by substance use).
3. Recurrent substance-related legal problems (e.g., arrests for obtaining or using the substance, substance-related disorderly conduct).
4. Continued substance use despite persistent or recurrent social and interpersonal problems caused or exacerbated by the substance (e.g., arguments with spouse about consequences of intoxication, physical fights).
33 More than 60 percent of the opioid pain medication users receive and take the drug according to the prescription (Bachhuber et al., 2014).
% past-month −0.43 (0.48) −0.52 (0.48) −0.34 (0.43) 1.32** (0.58) 1.22** (0.60) 1.37** (0.59)
marijuana use [10.68] [12.75] [10.39] [9.33] [10.57] [8.90]
% marijuana almost −0.25 (0.17) −0.28 (0.18) −0.21 (0.15) 0.58** (0.26) 0.50** (0.24) 0.62** (0.26)
daily/daily use [3.52] [4.29] [3.40] [3.78] [4.43] [3.56]
Cond. # marijuana use −0.28 (0.45) −0.28 (0.48) −0.23 (0.47) 0.17 (0.64) 0.08 (0.61) 0.29 (0.65)
days [12.29] [12.37] [12.25] [14.15] [14.36] [14.02]
% marijuana use 0.32** (0.16) 0.34** (0.13) 0.31** (0.15) 0.15 (0.23) 0.18 (0.28) 0.15 (0.22)
initiation [6.47] [7.14] [6.32] [0.92] [1.14] [0.95]
# past-month total −0.03 (1.57) −0.06 (1.31) −0.01 (1.46) 0.95 (1.18) 0.90 (1.11) 0.99 (1.20)
drinks [7.76] [7.70] [7.83] [18.69] [18.62] [18.75]
# binge drinking days 0.04 (0.18) 0.01 (0.14) 0.05 (0.18) 0.16** (0.08) 0.14** (0.06) 0.16* (0.09)
[0.66] [0.67] [0.66] [1.52] [1.48] [1.53]
% past-month marijuana −0.63 (0.39) −0.73 (0.46) −0.57 (0.36) 1.44*** (0.35) 1.41*** (0.38) 1.50*** (0.37)
use and binge drinking [6.41] [7.49] [6.30] [6.44] [7.09] [6.21]
% marijuana use while −0.38 (0.49) −0.42 (0.61) −0.34 (0.47) 0.82* (0.45) 0.71 (0.53) 0.84* (0.46)
drinking [4.10] [5.03] [4.01] [4.45] [5.17] [4.24]
% past-month marijuana use −0.43 (0.48) −0.47 (0.65) 1.32** (0.58) 1.19*** (0.40)
[10.68] [11.28] [9.33] [8.50]
% marijuana almost daily/daily −0.25 (0.17) −0.28 (0.25) 0.58** (0.26) 0.53*** (0.13)
use [3.52] [3.76] [3.78] [3.53]
Cond. # marijuana use days −0.28 (0.45) −0.38 (0.67) 0.17 (0.64) 0.19 (0.60)
[12.29] [12.44] [14.15] [14.49]
% marijuana use initiation 0.32** (0.16) 0.33** (0.11) 0.15 (0.23) 0.14 (0.17)
[6.47] [6.58] [0.92] [0.85]
# past-month total drinks −0.03 (1.57) −0.04 (2.32) 0.95 (1.18) 0.76 (0.87)
[7.76] [8.59] [18.69] [16.32]
# binge drinking days 0.04 (0.18) 0.04 (0.17) 0.16** (0.07) 0.15*** (0.04)
[0.66] [0.74] [1.52] [1.49]
% past-month marijuana use −0.63 (0.39) −0.68* (0.37) 1.44*** (0.35) 1.28*** (0.31)
and binge drinking [6.41] [6.86] [6.44] [5.74]
% marijuana use while drinking −0.38 (0.49) −0.43 (0.37) 0.82* (0.45) 0.69* (0.38)
[4.10] [4.44] [4.45] [3.73]
MML implementation 0.32** (0.16) 1.32** (0.58) 0.58** (0.26) 0.25** (0.11) 0.16** (0.08) 1.44*** (0.35)
State MMLs
∼ Vermont 0.57*** (0.18) 1.76*** (0.21) 2.27*** (0.12) 1.19*** (0.12) 0.11** (0.03) 2.03*** (0.15)
∼ Montana 0.05 (0.20) 4.92*** (0.27) 1.26*** (0.15) 2.51*** (0.15) 0.47*** (0.05) 4.16*** (0.20)
∼ Rhode Island 0.61*** (0.16) −0.64 (0.37) −0.42* (0.23) 0.29** (0.13) 0.12*** (0.03) 0.25* (0.14)
∼ New Mexico −0.04 (0.23) −0.23 (0.17) −0.15 (0.16) 0.16 (0.11) 0.46*** (0.04) 0.08 (0.18)
∼ Michigan 0.43** (0.22) 2.37*** (0.34) 1.44*** (0.22) 0.17 (0.10) 0.22*** (0.06) 1.57*** (0.28)
∼ New Jersey 0.007 (0.19) 1.52** (0.23) 0.94** (0.15) −0.15 (0.11) −0.10 (0.07) 2.17*** (0.18)
∼ District of Columbia 0.33* (0.19) 0.59** (0.27) 0.87*** (0.21) 0.30** (0.13) 0.15** (0.06) 0.70*** (0.21)
∼ Arizona −0.02 (0.20) −0.23 (0.20) 0.47*** (0.10) −0.25 (0.16) −0.12 (0.07) 0.10 (0.12)
∼ Delaware 0.79*** (0.25) 0.39* (0.23) 0.06 (0.13) 0.21* (0.12) 0.27*** (0.04) 0.58** (0.24)
∼ Connecticut 0.21 (0.18) 1.27** (0.21) 0.85** (0.12) 0.59*** (0.11) 0.05 (0.03) 1.09*** (0.17)
References

Abrams, D.I., Couey, P., Shade, S.B., Kelly, M.E., Benowitz, N.L., 2011. Cannabinoid–opioid interaction in chronic pain. Clinical Pharmacology & Therapeutics 90 (6), 844–851.
American Psychiatric Association (APA) (Ed.), 2000. Diagnostic and Statistical Manual of Mental Disorders: DSM-IV-TR. American Psychiatric Publishing, Arlington, VA.
Anderson, D.M., Hansen, B., Rees, D.I., 2011. Medical marijuana laws, traffic fatalities, and alcohol consumption. Institute for the Study of Labor (IZA) Discussion Paper Series No. 6112. http://ftp.iza.org/dp6112.pdf (accessed 10.10.13).
Anderson, D.M., Hansen, B., Rees, D.I., 2012. Medical marijuana laws and teen marijuana use. Institute for the Study of Labor (IZA) Discussion Paper Series No. 6592. http://ftp.iza.org/dp6112.pdf (accessed 10.10.13).
Anderson, D.M., Hansen, B., Rees, D.I., 2013. Medical marijuana laws, traffic fatalities, and alcohol consumption. Journal of Law and Economics 56, 333–369.
Anderson, D.M., Rees, D.I., 2014. The role of dispensaries: the devil is in the details. Journal of Policy Analysis and Management 33 (1), 235–240.
Bachhuber, M.A., Saloner, B., Cunningham, C.O., Barry, C.L., 2014. Medical cannabis laws and opioid analgesic overdose mortality in the United States, 1999–2010. JAMA Internal Medicine 174 (10), 1668–1673.
Belotti, F., Deb, P., Manning, W.G., Norton, E.C., 2014. TPM: estimating two-part models. The Stata Journal, http://econ.hunter.cuny.edu/people/economics-faculty/pdeb/ihea-minicourse/tpm-estimating-two-part-models-working-paper/at download/file (accessed 21.03.14; forthcoming).
Ben Amar, M., 2006. Cannabinoids in medicine: a review of their therapeutic potential. Journal of Ethnopharmacology 105 (1), 1–25.
Bertrand, M., Duflo, E., Mullainathan, S., 2004. How much should we trust differences-in-differences estimates? The Quarterly Journal of Economics 119 (1), 249–275.
Bohnert, A.S.B., Valenstein, M., Bair, M.J., Ganoczy, D., McCarthy, J.F., Ilgen, M.A., Blow, F.C., 2011. Association between opioid prescribing patterns and opioid overdose-related deaths. The Journal of the American Medical Association 305 (13), 1315–1321.
Boys, A., Marsden, J., Strang, J., 2001. Understanding reasons for drug use amongst young people: a functional perspective. Health Education Research 16 (4), 457–469.
Budney, A.J., Roffman, R., Stephens, R.S., Walker, D., 2007. Marijuana dependence and its treatment. Addiction Science and Clinical Practice 4 (1), 4.
Campbell, V.A., Gowran, A., 2007. Alzheimer’s disease: taking the edge off with cannabinoids? British Journal of Pharmacology 152 (5), 655–662.
Carpenter, C., Dobkin, C., 2009. The effect of alcohol consumption on mortality: regression discontinuity evidence from the minimum drinking age. American Economic Journal: Applied Economics 1 (1), 164.
Center for Behavioral Health Statistics and Quality (CBHSQ), 2013. National Survey on Drug Use and Health, 2004–2011 [data files and code books]. U.S. Dept. of Health and Human Services (HHS), Substance Abuse and Mental Health Services Administration (SAMHSA), Center for Behavioral Health Statistics and Quality (CBHSQ), Rockville, MD, https://www.datafiles.samhsa.gov, http://www.icpsr.umich.edu/icpsrweb/SAMHDA (accessed from 26.09.13 to 21.03.14).
Centers for Disease Control and Prevention (CDC), 2012. CDC grand rounds: prescription drug overdoses – a US epidemic. Morbidity and Mortality Weekly Report 61 (1), 10.
Chu, Y.L., 2014. The effects of medical marijuana laws on illegal marijuana use. Journal of Health Economics 38, 43–61.
Chaloupka, F.J., Laixuthai, A., 1997. Do youths substitute alcohol and marijuana? Some econometric evidence. Eastern Economic Journal 23 (3), 253–276.
Cohen, P.J., 2010. Medical marijuana 2010: it’s time to fix the regulatory vacuum. The Journal of Law, Medicine and Ethics 38 (3), 654–666.
Crost, B., Guerrero, S., 2012. The effect of alcohol availability on marijuana use: evidence from the minimum legal drinking age. Journal of Health Economics 31 (1), 112–121.
Crost, B., Rees, D.I., 2013. The minimum legal drinking age and marijuana use: new estimates from the NLSY97. Journal of Health Economics 32 (2), 474–476.
DeSimone, J., Farrelly, M.C., 2003. Price and enforcement effects on cocaine and marijuana demand. Economic Inquiry 41 (1), 98–115.
DiNardo, J., Lemieux, T., 2001. Alcohol, marijuana, and American youth: the unintended consequences of government regulation. Journal of Health Economics 20 (6), 991–1010.
Donald, S.G., Lang, K., 2007. Inference with difference-in-differences and other panel data. The Review of Economics and Statistics 89 (2), 221–233.
Gloss, D., Vickrey, B., 2012. Cannabinoids for epilepsy. Cochrane Database Systematic Reviews 6.
French, M.T., Popovici, I., 2011. That instrument is lousy! In search of agreement when using instrumental variables estimation in substance use research. Health Economics 20 (2), 127–146.
Gostin, L.O., 2005. Medical marijuana, American federalism, and the Supreme Court. The Journal of the American Medical Association 294 (7), 842–844.
Greene, W.H., 2011. Econometric Analysis, 5th edition. Prentice Hall, Upper Saddle River, NJ.
Harper, S., Strumpf, E.C., Kaufman, J.S., 2012. Do medical marijuana laws increase marijuana use? Replication study and extension. Annals of Epidemiology 22 (3), 207–212.
Hathaway, A.D., Comeau, N.C., Erickson, P.G., 2011. Cannabis normalization and stigma: contemporary practices of moral regulation. Criminology and Criminal Justice 11 (5), 451–469.
Heishman, S.J., Arasteh, K., Stitzer, M.L., 1997. Comparative effects of alcohol and marijuana on mood, memory, and performance. Pharmacology Biochemistry and Behavior 58 (1), 93–101.
Hoffmann, D.E., Weber, E., 2010. Medical marijuana and the law. New England Journal of Medicine 362 (16), 1453–1457.
Horowitz, J.L., 2001. Should the DEA’s STRIDE data be used for economic analyses of markets for illegal drugs? Journal of American Statistical Association 96, 1254–1271.
Johnson, T.P., Fendrich, M., Mackesy-Amiti, M.E., 2010. Computer literacy and the accuracy of substance use reporting in an ACASI survey. Social Science Computer Review 28 (4), 515–523.
Kandel, D.B., 1975. Stages in adolescent involvement in drug use. Science 190 (4217), 912–914.
Kandel, D.B. (Ed.), 2002. Stages and Pathways of Drug Involvement: Examining the Gateway Hypothesis. Cambridge University Press, Cambridge, UK.
King, A.C., de Wit, H., McNamara, P.J., Cao, D., 2011. Rewarding, stimulant, and sedative alcohol responses and relationship to future binge drinking. Archives of General Psychiatry 68 (4), 389–399.
Krishnan, S., Cairns, R., Howard, R., 2009. Cannabinoids for the treatment of dementia. Cochrane Database Systematic Reviews 2.
Liguori, A., Gatto, C.P., Jarrett, D.B., 2002. Separate and combined effects of marijuana and alcohol on mood, equilibrium and simulated driving. Psychopharmacology 163 (3/4), 399–405.
Lukas, S.E., Orozco, S., 2001. Ethanol increases plasma delta-9-tetrahydrocannabinol (THC) levels and subjective effects after marihuana smoking in human volunteers. Drug and Alcohol Dependence 64 (2), 143–149.
Lynch, M.E., Campbell, F., 2011. Cannabinoids for treatment of chronic non-cancer pain: a systematic review of randomized trials. British Journal of Clinical Pharmacology 72 (5), 735–744.
Lynne-Landsman, S.D., Livingston, M.D., Wagenaar, A.C., 2013. Effects of state medical marijuana laws on adolescent marijuana use. American Journal of Public Health 103 (8), 1500–1506.
Maldonado, R., Valverde, O., Berrendero, F., 2006. Involvement of the endocannabinoid system in drug addiction. Trends in Neurosciences 29 (4), 225–232.
Medina, K.L., Schweinsburg, A.D., Cohen-Zion, M., Nagel, B.J., Tapert, S.F., 2007. Effects of alcohol and combined marijuana and alcohol use during adolescence on hippocampal volume and asymmetry. Neurotoxicology and Teratology 29 (1), 141–152.
Moore, S.C., 2010. Substitution and complementarity in the face of alcohol-specific policy interventions. Alcohol and Alcoholism 45 (5), 403–408.
Morral, A.R., McCaffrey, D.F., Paddock, S.M., 2002. Reassessing the marijuana gateway effect. Addiction 97 (12), 1493–1504.
Naimi, T.S., Brewer, R.D., Mokdad, A., Denny, C., Serdula, M.K., Marks, J.S., 2003. Binge drinking among US adults. The Journal of the American Medical Association 289 (1), 70–75.
Pacula, R.L., 1998. Does increasing the beer tax reduce marijuana consumption? Journal of Health Economics 17 (5), 557–585.
Pacula, R.L., Kilmer, B., Grossman, M., Chaloupka, F.J., 2010. Risks and prices: the role of user sanctions in marijuana markets. The BE Journal of Economic Analysis & Policy 10 (1), 1–36.
Pacula, R.L., Powell, D., Heaton, P., Sevigny, E.L., 2013. Assessing the effects of medical marijuana laws on marijuana and alcohol use: the devil is in the details. National Bureau of Economic Research (NBER) Working Paper No. w19302.
Pacula, R.L., Sevigny, E.L., 2014. Marijuana liberalization policies: why we can’t learn much from policy still in motion. Journal of Policy Analysis and Management 33 (1), 212–221.
Pertwee, R.G., 2012. Targeting the endocannabinoid system with cannabinoid receptor agonists: pharmacological strategies and therapeutic possibilities. Philosophical Transactions of the Royal Society B: Biological Sciences 367 (1607), 3353–3363.
Saffer, H., Chaloupka, F., 1999. The demand for illicit drugs. Economic Inquiry 37 (3), 401–411.
Scavone, J.L., Sterling, R.C., Van Bockstaele, E.J., 2013. Cannabinoid and opioid interactions: implications for opiate dependence and withdrawal. Neuroscience 248, 637–654.
Sekhon, V., 2009. Highly uncertain times: an analysis of the executive branch’s decision to not investigate or prosecute individuals in compliance with state medical marijuana laws. Hastings Constitutional Law Quarterly 37 (3), 553–564.
Solon, G., Haider, S.J., Wooldridge, J., 2013. What are we weighting for? National Bureau of Economic Research (NBER) Working Paper No. w18859.
Substance Abuse and Mental Health Services Administration (SAMHSA), 2013. Results from the 2012 National Survey on Drug Use and Health: Summary of National Findings. U.S. Dept. of Health and Human Services (HHS), Substance Abuse and Mental Health Services Administration (SAMHSA), Rockville, MD http://www.samhsa.gov/data/nsduh/2k11results/nsduhresults2011.htm (accessed 10.10.13).
Wagner, F.A., Anthony, J.C., 2002a. Into the world of illegal drug use: exposure opportunity and other mechanisms linking the use of alcohol, tobacco, marijuana, and cocaine. American Journal of Epidemiology 155 (10), 918–925.
Wagner, F.A., Anthony, J.C., 2002b. From first drug use to drug dependence: developmental periods of risk for dependence upon marijuana, cocaine, and alcohol. Neuropsychopharmacology 26, 479–488.
Wang, P., 2003. A bivariate zero-inflated negative binomial regression model for count data with excess zeros. Economics Letters 78 (3), 373–378.
Wechsler, H., Dowdall, G.W., Davenport, A., Rimm, E.B., 1995. A gender-specific measure of binge drinking among college students. American Journal of Public Health 85 (7), 982–985.
Yörük, B.K., Yörük, C.E., 2011. The impact of minimum legal drinking age laws on alcohol consumption, smoking, and marijuana use: evidence from a regression discontinuity design using exact date of birth. Journal of Health Economics 30 (4), 740–752.
Yörük, B.K., Yörük, C.E., 2013. The impact of minimum legal drinking age laws on alcohol consumption, smoking, and marijuana use revisited. Journal of Health Economics 32 (2), 477–479.
Journal of Health Economics 42 (2015) 81–89
Article info

Article history:
Received 15 April 2014
Received in revised form 6 September 2014
Accepted 9 December 2014
Available online 17 December 2014

JEL classification:
I13
I18
L13

Keywords:
Risk selection
Risk adjustment
Health insurance
Discrete choice
Imperfect competition

Abstract

This paper analyzes the interaction of direct and indirect risk selection in health insurance markets. It is shown that direct risk selection – using measures unrelated to the benefit package like selective advertising or ‘losing’ applications of high risk individuals – nevertheless has an influence on the distortions of the benefit package caused by indirect risk selection. Direct risk selection (DRS) may either increase or decrease these distortions, depending on the type of equilibrium (pooling or separating), the type of DRS (positive or negative) and the type of cost for DRS (individual-specific or not). Regulators who succeed in reducing DRS by, e.g., banning excessive advertising or implementing fines for ‘losing’ applications, may therefore (unintendedly) mitigate or exacerbate the distortions of the benefit package caused by indirect risk selection. It is shown that the interaction of direct and indirect risk selection also alters the formula for optimal risk adjustment.
© 2015 Elsevier B.V. All rights reserved.
∗ Tel.: +49 651 2012624.
E-mail address: lorenzn@uni-trier.de
1 See van de Ven and Ellis (2000).
2 See Breyer et al. (2011).
3 van de Ven and van Vliet (1992) provide an extensive list of measures insurers may use for risk selection; for differential treatment of low and high risks’ applications see Bauhoff (2012).
4 See Shen and Ellis (2002).
5 See Frank et al. (2000), Cao and McGuire (2003) and Ellis and McGuire (2007).
http://dx.doi.org/10.1016/j.jhealeco.2014.12.003
0167-6296/© 2015 Elsevier B.V. All rights reserved.
82 N. Lorenz / Journal of Health Economics 42 (2015) 81–89
risk adjustment has been concerned with improving this underlying regression by, e.g., including additional variables or altering the grouping algorithm for diagnoses in morbidity based risk adjustment, so that a larger part of the variance of actual expenditures is explained. The larger the explained part of the variance, the closer transfers are to actual cost, and the lower the incentives for risk selection should be.

Initiated by the very influential study of Glazer and McGuire (2000), there has developed a small literature that departs from this statistical approach and instead explicitly models insurers’ incentives for risk selection. One study in this literature has shown that conventional, i.e. regression-based, risk adjustment may decrease welfare if there is imperfect competition, another, that it may even increase the extent of risk selection.6 These undesirable effects of conventional risk adjustment exemplify the need for what Glazer and McGuire (2000) have termed optimal risk adjustment. They have shown that a regulator can increase the effectiveness of a risk adjustment scheme by distorting the payments as calculated from a regression: there has to be overpayment for signals which are correlated with high risks and underpayment for signals which are correlated with low risks. If the over- and underpayments are chosen optimally, incentives for IRS can be eliminated completely.7 Optimal risk adjustment has also been derived for a setting where individuals differ in their elasticity to switch insurers or where insurers are allowed to vary their premium in some dimension, as is the case in the insurance exchanges in the US.8

A concern, already raised by Glazer and McGuire (2000) themselves, is that such over- and underpayments create incentives for DRS regarding the signal, but so far it has not been analyzed whether this has an influence on optimal risk adjustment. In fact, we are not aware of any theoretical study that explicitly models the interaction of direct and indirect risk selection, even in the absence of risk adjustment.9 In this study we therefore develop such a model and show that in general (the degree of) DRS has an influence on the distortions of the benefit package caused by IRS and that this alters the formula for optimal risk adjustment. DRS may either increase or decrease the distortions caused by IRS, depending on whether insurers try to attract the low risks (positive DRS) or to repel the high risks (negative DRS), whether a pooling or a separating equilibrium emerges, and whether the cost for DRS is individual-specific or not.

If insurers’ expenditures for DRS are at least to some degree individual-specific (and not just a fixed cost), they affect risk-type-specific cost: Positive DRS increases the cost per low risk, negative DRS the cost per high risk. In the first case, the cost difference between the risk types is reduced, in the second, it is increased. In the pooling equilibrium where both risk types pay the same premium, positive DRS therefore reduces the incentives for IRS, while negative DRS increases them.

In the separating equilibrium, insurers’ expenditures for positive DRS translate into a higher premium for the contract offered for the low risks; this makes this contract less attractive for the high risks, so the distortion of the benefit package necessary to repel the high risks is reduced. Negative DRS on the other hand creates a ‘substitution effect’: If insurers are (somewhat) successful in repelling the high risks by DRS, they can reduce the degree of IRS. For an overview of these results, see Table 1.

Table 1
Effect of DRS with individual-specific cost on the distortion of the benefit package.

Type of equilibrium       Positive DRS            Negative DRS
Pooling equilibrium       Distortion decreases    Distortion increases
Separating equilibrium    Distortion decreases    Distortion decreases

A regulator who succeeds in reducing negative DRS (by, e.g., charging a fine for ‘losing’ applications of high risk individuals) will therefore simultaneously reduce IRS in the pooling equilibrium, but unintendedly increase the distortion of the benefit package in the separating equilibrium. The distortions caused by IRS will also be increased if he succeeds in reducing positive DRS (by, e.g., banning excessive advertising), in this case for both the pooling and the separating equilibrium.

In three of the four cases, optimal risk adjustment then becomes even more important. We therefore derive the impact of DRS on the formula for optimal risk adjustment developed by Glazer and McGuire (2000). We show that the overpayment for a signal that indicates a high risk has to be increased exactly by insurers’ expenditures on positive DRS; likewise, the underpayment has to be reduced by the expenditures on negative DRS. With this modification, their formula can eliminate the incentives for IRS even in the presence of DRS.

In the literature on optimal risk adjustment, some of the results regarding the distortions caused by IRS have been derived under perfect competition, but DRS seems incompatible with such a setting where individuals are perfectly informed about all benefit packages and premiums and always choose the insurer that offers the best benefit package-premium combination. We therefore derive our results within a discrete choice model, which can easily capture different levels of competition. To keep the model simple, we assume that the benefit package is one-dimensional, but the model can be extended to a multi-dimensional benefit package. Also, to simplify the notation when deriving the results, we first consider the case of two risk types, but then show that the results also hold for an arbitrary number of risk types.

The remainder of this paper is organized as follows: In Section 2, we introduce the basic discrete choice model and show how DRS can be incorporated in such a model. We analyze the pooling equilibrium in Section 3 and the separating equilibrium in Section 4. Section 5 concludes.

2. The model

2.1. Basic model without DRS

Individual preferences regarding the benefit-premium bundle are given by

$$u = p_r v(m) - R, \tag{1}$$

where $R$ denotes the premium and $m$ the level of medical services (measured in monetary terms). $p_r$ is the probability of becoming ill, and there are two risk types $r = H, L$, with $p_H > p_L$; the share of the low risks is $\lambda$. The utility of receiving medical treatment, $v(m)$, is increasing at a decreasing rate, i.e. $v'(m) > 0$ and $v''(m) < 0$. The efficient level of medical services is implicitly defined by $v'(m^{FB}) = 1$.

There are $n$ insurers $j$, each offering a benefit-premium bundle $\{m^j, R^j\}$. The individual’s decision of which insurer to choose may, however, not only depend on these benefit-premium bundles, but

6 See Lorenz (2013) and Brown et al. (2011), respectively. In the empirical part of their study, Brown et al. (2011) find such an increase in the extent of risk selection for the Medicare Advantage program in the U.S.; however, there has been some disagreement on this finding, see Newhouse et al. (2012).
7 See also Glazer and McGuire (2002) and Jack (2006).
8 For the first setting, see Bijlsma et al. (2011), and for the second, McGuire et al. (2013) and Shi (2013).
9 Eggleston (2000) derives the optimal mix of supply and demand side cost sharing for a setting with a single (semi-altruistic) HMO that can influence the level of medical services (according to the outcome of a patient–provider bargaining process) and can dump a share of the high risks at some cost; however, there is no competition as there is only one provider.
also on some other factors, like perceived friendliness of personnel, 2.2. The model with positive and negative DRS
location, or which insurer was recommended by family or friends.
In a discrete choice model, these other factors are captured by aug- We consider positive DRS to be an activity each insurer is
menting the individual’s utility as given in (1) by an individual- and engaged in which generates some cost and increases the proba-
j
insurer-specific utility component εi ; the utility of an individual i bility of being chosen by the individual (or group of individuals)
(being of risk type r) when choosing an insurer j therefore is the activity is targeted at. We model this increase in the probability
of being chosen to stem from an increase in the utility the individ-
j
ui (mj , Rj ) = pr v(mj ) − Rj + εi . (2) ual receives, which may either be real (as, e.g., with a discount for
a fitness club membership) or just perceived (as with advertising).
j
If εi is assumed to be i.i.d. extreme value, the logit model with its We denote the cost by aj and the increase in utility by g(aj ), where
analytically tractable choice probabilities arises. Denoting risk type g(aj ) is increasing and concave.13 With positive DRS, the (perceived)
r’s utility of the benefit-premium bundle offered by insurer j by utility of individual i choosing an insurer j therefore is
j
j
Vr = pr v(mj ) − Rj , ui (mj , Rj ) = pr v(mj ) − Rj + g(aj ) + εi , (4)
j j so that insurer k’s market share among risk type r is given by14
and specifying the variance of εi as Var(εi ) = 2 2 /6, the probabil-
ity of individual i (being of risk type r) choosing a particular insurer
k k ))/
k is10 e(Vr +g(a
Prk = j +g(aj ))/
. (5)
j
e(Vr
Prob(i chooses k) = Prob(Vr k + εki > Vrl + εli ∀ l =
/ k)
Two cases regarding the cost aj can be distinguished:
k
eVr / non-individual-specific and individual-specific cost. With non-
= n j
. (3) individual-specific cost, total cost for DRS of an insurer j is
j=1
eVr /
independent of the number of individuals choosing this insurer.
The prime example for this case is selective advertising, where
Denote this probability by Prk ; it is also insurer k’s market share cost does not increase if an additional individual chooses insurer
among the individuals of risk type r. Prk is increasing in Vrk : a higher j. With individual-specific cost, total cost for DRS of an insurer
share of individuals of risk type r will choose insurer k, if this insurer j increases in the number of individuals choosing this insurer.
offers a higher level of medical services or charges a lower premium. One example here are additional benefits which the regulator (or
j
The variance of the additional utility component, Var(εi ) = society) considers not to be part of a ‘normal’ basic benefit package
2 2
/6, is a measure of the level of competition in this health insur- insurers are supposed to provide, like discounts for fitness club
j memberships or special counseling services. In this case, total
ance market. If is small, all the εi are very similar and therefore
only play a minor role in which insurer is chosen: Offering an only cost of DRS increases if an additional individual chooses insurer j.
somewhat higher utility level than all the other insurers will, in this It seems reasonable to assume that most risk selection activities
case, already attract a large share of all individuals; this implies a entail both non-individual-specific and individual-specific (i.e.
high level of competition. If, on the other hand, is large, the other fixed and variable) cost.
factors besides the benefit level and the premium – captured by Like positive DRS, we model negative DRS as an activity that
j generates some cost, but decreases the probability of being chosen
large positive and large negative $\varepsilon_i^j$ – are rather important, so that insurers, when increasing their premium (or reducing their benefit level), only lose a small share of their insured; a large level of $\sigma$ therefore corresponds to a low level of competition.

As shown by Lorenz (2013) who has analyzed this basic model without DRS, the level of competition determines which type of equilibrium emerges: If the level of competition is low (i.e., $\sigma$ is large), there will be a pooling equilibrium: all insurers offer the same benefit-premium bundle, so each individual $i$ chooses the insurer $j$ for which his $\varepsilon_i^j$ is maximal. If the level of competition is high (i.e., $\sigma$ is small), a separating equilibrium similar (but not identical) to the Rothschild–Stiglitz equilibrium under perfect competition arises.11 Some of the $n$ insurers offer a benefit-premium bundle designated for the low risks, the remaining insurers a contract designated for the high risks (with a larger benefit package and a higher premium). All the low risks choose the insurer with the highest $\varepsilon_i^j$ among the first set of insurers, but only most of the high risks choose the insurer with the highest $\varepsilon_i^j$ among the second set. Because a small share of the high risks chooses one of the insurers offering the contract designated for the low risks, the separation of risk types is not perfect.12

by a particular individual (or group of individuals). We denote the cost of negative DRS by $b^j$ and the utility decrease by $f(b^j)$, where $f(b^j)$ is increasing and concave. With negative DRS, the (perceived) utility of individual $i$ and insurer $k$'s market share are as given in (4) and (5), but with $+g(a)$ replaced by $-f(b)$.

Unlike with positive DRS, it is difficult to imagine some activity where the cost an insurer incurs for negative DRS is independent of the number of individuals choosing this insurer. 'Negative advertising' might be an example, where an insurer informs about some undesirable feature of its offer, like scrupulous utilization reviews, but this and similar examples may seem rather far-fetched. We think it is more realistic to consider negative DRS to be an activity insurers are engaged in during the application process, so that cost depends on the number of individuals applying at the insurer.15 Activities which fall into this category are that insurers require additional (unnecessary) paper work or involve the high risk individuals in lengthy phone calls in which they try to convince (or

10 See Train (2009, p. 40).
11 See Zweifel et al. (2009), chapter 7, for the Rothschild–Stiglitz equilibrium in this setting.
12 We explain why this occurs in Section 4.1.
13 Since we are interested in a setting where insurers are engaged in DRS, we assume $\lim_{a \to 0^+} g'(a) \to \infty$ to guarantee an interior solution.
14 To simplify the notation, we do not introduce different symbols for $P_j^k$ for the case of no, positive or negative DRS; we will, however, always make clear to which case we refer.
15 In Section 5 we argue that the main effects should be similar if negative DRS does not occur during the application process, but is targeted at individuals who already hold a contract with the insurer.
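The link between the dispersion of the idiosyncratic terms and the level of competition can be illustrated numerically. The following is a minimal sketch, assuming the standard logit form of market shares with a scale parameter (called `sigma` here); the function names and all numbers are hypothetical and chosen only for illustration.

```python
import math

def market_shares(values, sigma):
    """Logit market shares for deterministic utilities `values` and scale `sigma`."""
    weights = [math.exp(v / sigma) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def share_lost_after_premium_increase(sigma, n=5, premium_increase=0.1):
    """Fraction of its insured that one insurer loses when it alone raises its premium."""
    base = market_shares([0.0] * n, sigma)[0]
    raised = market_shares([-premium_increase] + [0.0] * (n - 1), sigma)[0]
    return (base - raised) / base

# Hypothetical scale values: a large scale means idiosyncratic tastes dominate.
low_competition = share_lost_after_premium_increase(sigma=2.0)
high_competition = share_lost_after_premium_increase(sigma=0.1)
print(low_competition, high_competition)
```

With a large scale the insurer loses only a small fraction of its insured after the same premium increase, which is the sense in which a large scale corresponds to a low level of competition.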
84 N. Lorenz / Journal of Health Economics 42 (2015) 81–89
Normalizing the mass of individuals to one and assuming profit maximization, the objective of insurer $k$ is

\[
\max_{m^k, R^k} \pi^k = \lambda P_L^k \pi_L^k + (1 - \lambda) P_H^k \pi_H^k, \qquad (6)
\]

where $\pi_r^k = R^k - p_r m^k$ denotes insurer $k$'s profit per individual of risk type $r$. The solution to this objective yields the following two conditions (see Appendix A.1): First,

\[
\lambda \pi_L^k + (1 - \lambda) \pi_H^k = \sigma \frac{n}{n-1}. \qquad (7)
\]

Average profit per insured decreases in $n$ and increases in $\sigma$: a higher level of competition (a larger number of insurers $n$ or a smaller level of $\sigma$) decreases profit. Secondly, the condition determining the distortion of the benefit level is given by

\[
\left[ 1 - \frac{\lambda (1 - \lambda)(p_H - p_L)^2}{\sigma \frac{n}{n-1} \bar{p}} \, m^k \right] v'(m^k) = 1, \qquad (8)
\]

where $\bar{p} = \lambda p_L + (1 - \lambda) p_H$. Because the fraction is positive, it is immediately apparent that $v'(m^k) > 1$, so that $m^k$ is distorted below the efficient level $m^{FB}$. As is to be expected, the distortion increases in the difference $p_H - p_L$ and in the level of competition (captured by $\sigma n/(n - 1)$).

3.2. The pooling equilibrium with positive DRS

Because the last term in the brackets $[\cdot]$ is positive, compared with (8), the condition if there is no DRS, $v'(m^k)$ has to decrease: $m^k$ increases, so the distortion is reduced. More generally, the larger the equilibrium level of $a^k$, the larger $m^k$.

Result 1. In the pooling equilibrium, the distortion of the benefit level decreases in the level of positive DRS if cost for DRS is individual-specific.

The incentive to distort the benefit level (with or without DRS) arises because profit per high risk is lower than profit per low risk, where the degree of the distortion depends on the difference between these two profits, which, in the case without DRS, is given by20 $\pi_L^k - \pi_H^k = (p_H - p_L) m^k$. With positive DRS, profit per low risk decreases because insurers waste part of this profit on their expenditures for DRS; this reduces the difference between net profits (including $a^k$), and thereby the incentive to distort the benefit level.

This is different for the case of non-individual-specific cost.21 Although in this case, expenditures for risk selection decrease total profit, they do not specifically decrease profit per low risk, so the difference between the risk-type-specific profits remains the same: Therefore, positive DRS has no influence on the distortion of the benefit level if cost is non-individual-specific. In the following, we will therefore only consider individual-specific cost.
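The conditions that pin down the benefit level in the pooling equilibrium can be solved numerically once a utility function is fixed. The sketch below is only an illustration under stated assumptions: it takes $v(m) = 2\sqrt{m}$ (so that $v'(m) = m^{-1/2}$ and the efficient level is $m^{FB} = 1$), uses `lam` for the share of low risks, `sigma` for the scale parameter and `delta` for the strength of the correlation between the signal and the risk type, and plugs in hypothetical parameter values. None of these choices come from the paper.

```python
def pooling_benefit_level(lam, p_low, p_high, sigma, n, delta=0.0, a=0.0):
    """Solve [1 - x*m + y] * v'(m) = 1 for m by bisection, assuming v'(m) = m**-0.5.

    x is the distortion term without DRS; y is the extra positive term that
    appears when insurers spend `a` per targeted individual on positive DRS.
    """
    p_bar = lam * p_low + (1 - lam) * p_high
    scale = sigma * n / (n - 1) * p_bar
    x = lam * (1 - lam) * (p_high - p_low) ** 2 / scale
    y = (p_high - p_low) * delta * a / scale

    def gap(m):
        # Zero exactly where the first-order condition holds; decreasing in m.
        return 1.0 - x * m + y - m ** 0.5

    lo, hi = 1e-12, 4.0  # gap(lo) > 0 > gap(hi) for the small terms used here
    for _ in range(200):
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical parameters, for illustration only.
params = dict(lam=0.8, p_low=0.2, p_high=0.6, sigma=1.0, n=5)
m_no_drs = pooling_benefit_level(**params)
m_with_drs = pooling_benefit_level(**params, delta=0.1, a=0.05)
print(m_no_drs, m_with_drs)
```

In this toy parameterization the benefit level without DRS lies below the efficient level of one, and individual-specific spending on positive DRS moves it closer to the efficient level, in line with Result 1.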
With positive DRS of the young and individual-specific cost, insurer $k$'s objective reads as22

\[
\pi^k = \sum_r \sum_s \mu_{rs} P_{rs}^k \pi_r^k - \sum_r \mu_{rY} P_{rY}^k a^k, \qquad (11)
\]

where only $P_{rY}^k$ contains $g(a^k)$ and $g(a^j)$. The distortion of the benefit level is now determined by23

\[
\left[ 1 - \frac{\lambda (1 - \lambda)(p_H - p_L)^2}{\sigma \frac{n}{n-1} \bar{p}} \, m^k + \frac{p_H - p_L}{\sigma \frac{n}{n-1} \bar{p}} \, \delta a^k \right] v'(m^k) = 1. \qquad (12)
\]

Because the last term in the brackets is positive, compared with no DRS, $v'(m^k)$ has to decrease, so the distortion is reduced. However, for a given level of $a^k$, the reduction of the distortion is of course not as large as with DRS against the risk type itself, since $\delta < \lambda(1 - \lambda)$. If there is less than perfect correlation, expenditures on DRS of the young not only increase the cost of the low risks, but also of the high risks and therefore translate into a smaller reduction of the cost difference between the risk types. As is to be expected, for a given level of $a^k$ the reduction of the distortion increases in the level of correlation (i.e. in $\delta$).

Result 2. In the pooling equilibrium, the distortion of the benefit level decreases in the level of positive DRS of a signal that is correlated with low risk if cost for DRS is individual-specific. A higher level of correlation (a higher $\delta$) increases the effect of a given level of DRS on the distortion of the benefit level.

3.3. Implications of DRS on optimal risk adjustment in the pooling equilibrium

We now discuss the implications of the interaction of direct and indirect risk selection for optimal risk adjustment. As shown by Glazer and McGuire (2000), if a regulator does not observe individuals' risk type, but only a signal that is correlated with risk type (like age), it is still feasible to eliminate the distortion of the benefit package by overpaying for a signal that indicates a high risk, and underpaying for a signal that indicates a low risk. In Section 3.3.1, we first show how to implement optimal risk adjustment in our setting with two risk types and two age groups if there is no DRS; we then determine how the optimal payments have to be modified if there is DRS in Section 3.3.2.

3.3.1. Optimal risk adjustment without DRS

With risk adjustment, each insurer receives a payment of $\mathrm{RAO}$ for each insured that is old; these payments are financed by a risk adjustment fee $\mathrm{RAF}$ which each insurer has to pay for each insured (including the old). The balanced budget constraint requires this fee to be $\mathrm{RAF} = (1 - \gamma)\mathrm{RAO}$. The insurer's objective with risk adjustment is then given by

\[
\pi^k = \sum_r \mu_{rY} P_{rY}^k (\pi_r^k - \mathrm{RAF}) + \sum_r \mu_{rO} P_{rO}^k (\pi_r^k - \mathrm{RAF} + \mathrm{RAO}). \qquad (13)
\]

Simplifying the optimality conditions now yields24

\[
\left[ 1 - \frac{\lambda (1 - \lambda)(p_H - p_L)^2}{\sigma \frac{n}{n-1} \bar{p}} \, m^k + \frac{p_H - p_L}{\sigma \frac{n}{n-1} \bar{p}} \, \delta \, \mathrm{RAO} \right] v'(m^k) = 1, \qquad (14)
\]

so that the distortion of the benefit level is eliminated for

\[
\mathrm{RAO} = \frac{\lambda (1 - \lambda)}{\delta} (p_H - p_L) m. \qquad (15)
\]

If there is perfect correlation, the share of the low risks equals the share of the young, so $\lambda = \gamma$; in addition, the mass of individuals in the lower left and the upper right corner in Table 2 has to be zero, which requires $\delta = \lambda(1 - \lambda) = \gamma(1 - \gamma)$. Replacing in (15) shows that with perfect correlation the efficient benefit level is therefore implemented for $\mathrm{RAO} = (p_H - p_L)m$, which is just the cost difference between the two risk types. With less than perfect correlation, $\delta < \lambda(1 - \lambda)$, so there is overpayment, and the lower the level of correlation (i.e., the lower $\delta$), the larger this overpayment has to be.

3.3.2. Optimal risk adjustment with positive DRS

With optimal risk adjustment there is overpayment for the old, so this is the group positive DRS will be targeted at.25 Insurer $k$'s objective in this case is given by (13) with $\mathrm{RAO}$ replaced by $\mathrm{RAO} - a^k$ and with $P_{rO}^k$ containing $g(a^k)$ and $g(a^j)$ as given in (5). The positive equilibrium level of $a^k$ then enters condition (14), where again $\mathrm{RAO}$ has to be replaced by $(\mathrm{RAO} - a^k)$.26 The efficient benefit level is therefore implemented with

\[
\mathrm{RAO} = \frac{\lambda (1 - \lambda)}{\delta} (p_H - p_L) m + a^k. \qquad (16)
\]

Because part of the overpayment for the old is spent on positive DRS, the optimal overpayment for the old has to be raised by exactly these expenditures, so that the net difference in payments (including $a^k$) equals the amount necessary to eliminate IRS as given in (15).

Result 3. With individual-specific cost, if there is positive DRS regarding a signal that is used for risk adjustment and that is correlated with high risk, the optimal overpayment of that signal to eliminate IRS has to be increased by the expenditures for DRS.

This result shows that the formula for optimal risk adjustment derived by Glazer and McGuire (2000) is not invalidated by DRS: there is still overpayment for a signal that is correlated with high risk, and underpayment for a signal correlated with low risk. Also, DRS does not invalidate their claim that optimal risk adjustment can implement the efficient benefit level. However, the formula to implement the efficient benefit level has to be modified and include insurers' expenditures on DRS if cost is individual specific. Whether these expenditures are negligible or significant is an empirical matter, but, e.g., the findings of Starc (2014), who reports that insurers spend a large part of potential profits on marketing and insurance brokers, indicate that these expenditures may be substantial.

3.4. More than two risk types

To keep the notation simple when deriving the results, so far we have considered the case of only two risk types. The results, however, also hold for an arbitrary number of risk types. Let the number of risk types be $\bar{r}$ and denote the probability of risk type $r$ by $p_r$ and its share by $\lambda_r$.

In the conditions determining the distortion of the benefit level, conditions (8), (12) and (14), the term $\lambda(1 - \lambda)(p_H - p_L)^2$ then has to be replaced by $\sum_{r=1}^{\bar{r}} \lambda_r (p_r - \bar{p})^2 > 0$; see Appendix B.1. This is just the variance of the illness probabilities, and the larger this variance, the larger the distortion of $m$.

22 Like in Section 3.2.1, we assume that positive DRS is profitable for only one of the two signal types; see footnote 18.
23 See Appendix A.3.
24 See Appendix A.4.
25 As before we assume that DRS is only targeted at one of the signal types.
26 See Appendix A.5.
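The overpayment rule of Section 3.3 lends itself to a short numerical illustration. The following is a minimal sketch, assuming the overpayment for the old takes the form (share of low risks) × (share of high risks) / (correlation measure) × (cost difference), plus the per-person DRS expenditure when there is positive DRS. The names `lam`, `delta` and `a` are labels chosen for this sketch, and all numbers are hypothetical.

```python
def optimal_overpayment(lam, delta, p_low, p_high, m, a=0.0):
    """Overpayment for the old that removes the benefit distortion in this sketch.

    lam   : share of low risks
    delta : correlation measure between signal and risk type, 0 < delta <= lam*(1-lam)
    a     : per-person expenditure on positive DRS targeted at the old (0 if none)
    """
    upper = lam * (1 - lam)
    if not 0 < delta <= upper + 1e-12:
        raise ValueError("delta must lie in (0, lam * (1 - lam)]")
    return upper / delta * (p_high - p_low) * m + a

# Hypothetical values, for illustration only.
lam, p_low, p_high, m = 0.8, 0.2, 0.6, 1.0
perfect = optimal_overpayment(lam, lam * (1 - lam), p_low, p_high, m)
imperfect = optimal_overpayment(lam, 0.08, p_low, p_high, m)
with_drs = optimal_overpayment(lam, 0.08, p_low, p_high, m, a=0.05)
print(perfect, imperfect, with_drs)
```

With perfect correlation the payment collapses to the cost difference between the two risk types; a weaker correlation requires a larger overpayment, and positive DRS adds the DRS expenditure on top.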
choosing this contract. Because at the boundary of the shaded area this second effect is of second order (since the density is about zero if $P_H^A \approx 0$), the two effects balance somewhere inside the shaded area, represented by contract $A_3$.32 Because a small share of the high risks choose contract $A_3$, this contract is somewhat above the iso-profit line $p_L$ (which would apply if only the low risks choose insurer A).

Due to the 'stochastic nature' of the incentive compatibility constraint, the separating equilibrium under imperfect competition is therefore not perfectly separating. Instead, a small share of the high risks chooses the contract designated for the low risks, but none of the low risks choose the contract designated for the high risks.

4.2. The separating equilibrium with DRS

4.2.1. The separating equilibrium with positive DRS

For positive DRS, we have to distinguish whether the utility increase $g(a)$ will only (or at least primarily) be experienced by the low risks (as, e.g., might be the case with a discount for a fitness club membership) or equally by both risk types (as, e.g., might be the case with advertising). We begin with the case that the utility increase can only be experienced by the low risks. Positive DRS will then only be performed by insurers of type A.33

The objective of insurers of type A with positive DRS of the low risks equals the objective as given in (9). The solution to this objective determines a positive equilibrium level of expenditures on DRS, which increases the premium charged by these insurers.34 In Fig. 2(a), this increase in $R^A$ can be shown by an upward shift of the corresponding iso-profit line. As is apparent, insurers of type A can then increase their benefit level before attracting the same share of the high risks as without DRS. Because a higher benefit level (accompanied by the corresponding increase in the premium, $p_L m^A$) increases the utility of the low risks, insurers will offer this higher benefit level (if there is some competition), so that the new equilibrium will be a contract like $A_4$. Since positive DRS reduces the attractiveness of the contract offered by insurers of type A for the high risks in the premium-dimension, they can increase the attractiveness of their contract in the benefit level-dimension.

Result 5. In the separating equilibrium, the distortion of the benefit level is reduced if there is positive DRS of the low risk and cost is individual-specific.

We now turn to the case that both risk types can experience the utility increase $g(a)$. In this case, DRS will be performed by all insurers: insurers of type A engage in DRS of the low risks and insurers of type B in DRS of the high risks (since high risks are profitable to insurers of type B). Because the equilibrium level of expenditures on DRS in this case is the same for both types of insurers,35 all insurers raise their premium by the same amount. In Fig. 2(a), both iso-profit lines are then shifted upwards by the same distance, so there is no effect on the benefit levels $m^A$ and $m^B$.

However, it seems reasonable to assume that positive DRS is at least somewhat more effective when targeted at the low risks than when targeted at the high risks. In equilibrium, insurers of type A will spend more on DRS and raise their premium by a higher amount than insurers of type B. The larger upward shift of their iso-profit line then allows insurers of type A to increase their benefit level (just as in the case where the iso-profit line of insurers of type B is not shifted at all). Therefore, as long as positive DRS reduces the attractiveness of the contract offered by insurers of type A relative to the contract offered by insurers of type B in the premium-dimension, insurers of type A can increase the attractiveness of their contract in the benefit level-dimension.

4.2.2. The separating equilibrium with negative DRS

Negative DRS will only be performed by insurers of type A, because only these insurers try to avoid being chosen by the high risks. As in Section 4.1, we explain the effect of negative DRS as if there was only one insurer of type A and one insurer of type B.

With negative DRS, a high risk chooses insurer A if $V_H^A - f(b^A) + \varepsilon_i^A > V_H^B + \varepsilon_i^B$. Because negative DRS reduces the utility as perceived by the high risks by $f(b^A)$, $V_H^A$ can be increased by exactly that amount without altering the number of the high risks choosing insurer A. This allows insurer A to increase his benefit level, see Fig. 2(b). There, $A_5^L$ denotes a contract offered by insurer A as perceived by the low risks, while $A_5^H$ denotes the same contract as perceived by the high risks. Compared to $A_5^L$, $A_5^H$ is shifted upwards by $f(b^A)$, the utility decrease of negative DRS (measured in monetary terms). The larger $b^A$, the larger $f(b^A)$, and therefore the larger $m^A$ can be without increasing the share of high risks choosing insurer A.

32 See Appendix C.1; for a more detailed derivation of this equilibrium, see Lorenz (2013).
33 Because contract B is far above the indifference curve $I^L$ associated with contract $A_3$, for the low risks there is a huge difference in utility between $A_3$ and B. Any moderate increase in the (perceived) utility of contract B due to positive DRS of insurers of type B will reduce this difference only to a small degree and not induce any of the low risks to choose this contract. Because insurers of type B cannot increase their market share among the low risks (which remains at zero), they will refrain from positive DRS.
34 See Appendix C.2.
35 This can be seen by replacing $\pi_H^k$ by $(\pi_H^k - a^k)$ in (6) for insurers of type B and comparing the respective optimality condition with the one for insurers of type A.

Because this effect occurs regardless of whether cost for DRS is individual-specific or not, there exists one case where
DRS influences the distortion caused by IRS even if cost is non-individual-specific.

Result 6. In the separating equilibrium, negative DRS against the high risks reduces the distortion of the benefit level of the low risks regardless of whether cost for negative DRS is individual-specific or not.

The main mechanism in this last case is therefore different from all the other cases we have considered. Here, IRS is substituted by DRS, while in all the other cases, DRS affects risk type specific profits and thereby the gains from IRS.

In this paper, the interaction of direct and indirect risk selection (DRS and IRS) has been analyzed. It has been shown that DRS, using measures unrelated to the benefit package, nevertheless has an influence on the distortions of the benefit package caused by IRS. If cost for DRS is (at least to some degree) individual-specific, DRS selectively reduces the profit per individual of the risk type it is targeted at. Positive DRS therefore reduces and negative DRS increases the difference in profits between the low and the high risks. Because in the pooling equilibrium the degree of the distortion depends on the difference between these profits, positive DRS reduces the distortion of the benefit level, while negative DRS increases it. In the separating equilibrium, DRS can act as a substitute for IRS and thereby reduce the distortion of the benefit level. In addition, it has been shown that the effects of DRS on type- and signal-specific profits also have an influence on the formula for optimal risk adjustment: The over- and underpayments have to be inflated by insurers' expenditures on positive DRS and reduced by their expenditures on negative DRS.

We have derived these results for a setting where negative DRS occurs during the application process, but the main mechanisms should also hold if it is targeted at the high risks who already hold a contract with the insurer. If an insurer attracts a larger share of the high risks, this will increase the cost of negative DRS, even if the activity of risk selection and the cost associated with it occur only later (when the insurer tries to induce the high risks to switch to another insurer). With negative DRS, high risks are more expensive than without DRS; anticipating these additional costs, insurers will therefore not make their contract as attractive for this risk group as they would without DRS. Therefore, negative DRS will increase the distortion in such a setting as well.

Appendix A.

A.1. The pooling equilibrium without DRS

Using the property that the derivative of $P_r^k$ with respect to $V_r^k$ can be expressed in terms of $P_r^k$ itself as $\partial P_r^k / \partial V_r^k = P_r^k (1 - P_r^k)/\sigma$, the FOCs to objective (6) are given by

\[
\frac{\partial \pi^k}{\partial R^k} = \lambda \left[ -\frac{P_L^k (1 - P_L^k)}{\sigma} \pi_L^k + P_L^k \right] + (1 - \lambda) \left[ -\frac{P_H^k (1 - P_H^k)}{\sigma} \pi_H^k + P_H^k \right] = 0. \qquad (17)
\]

The FOC of (9) with respect to $a^k$ is

\[
\frac{\partial \pi^k}{\partial a^k} = \lambda \left[ \frac{P_L^k (1 - P_L^k)}{\sigma} g'(a^k)(\pi_L^k - a^k) - P_L^k \right] = 0. \qquad (19)
\]

With $P_r^j = 1/n \ \forall j$, this can be simplified to $g'(a^k)(\pi_L^k - a^k) = \sigma n/(n - 1)$, which implicitly defines a positive level of $a^k$, because $\lim_{a \to 0^+} g'(a) \to \infty$. Replacing $\pi_L^k$ by $(\pi_L^k - a^k)$ in (17) and (18), solving (17) for $R^k$ and substituting in (18) then yields (10).

A.3. Positive DRS of a signal that is correlated with risk type

The FOC of (11) with respect to $a^k$ defines a positive equilibrium level of $a^k$. Solving the FOC with respect to $R^k$ for $R^k = \sigma (n/(n - 1)) + \bar{p} m^k + \gamma a^k$ and inserting in the FOC with respect to $m^k$ yields (12).

A.4. Optimal risk adjustment without DRS

Solving the FOC of (13) with respect to $R^k$ for $R^k = \sigma (n/(n - 1)) + \bar{p} m^k + \mathrm{RAF} - (1 - \gamma)\mathrm{RAO}$ and substituting in the FOC with respect to $m^k$ (using the definitions of $\mu_{rs}$ as given in Table 2) yields (14).

A.5. Optimal risk adjustment with DRS

The derivation of condition (16) is identical to the derivation of condition (14) described in Appendix A.4; the only difference is that $\mathrm{RAO}$ has to be replaced by $(\mathrm{RAO} - a^k)$.
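Two properties of logit market shares carry much of the argument: the derivative of a share with respect to its own deterministic utility can be written in terms of the share itself, and a utility loss from negative DRS can be offset one-for-one by a higher contract utility. The following is a minimal sketch that checks both numerically; it assumes logit shares with a scale parameter (`sigma`), and the function name and every number are hypothetical.

```python
import math

def logit_share(own_value, rival_values, sigma):
    """Share of the insurer offering `own_value` against rivals, logit with scale sigma."""
    own = math.exp(own_value / sigma)
    return own / (own + sum(math.exp(v / sigma) for v in rival_values))

sigma = 0.5
rivals = [0.0, 0.1, -0.2]   # hypothetical utilities offered by competing insurers
value = 0.05                # hypothetical utility offered by the insurer considered

# Property 1: dP/dV equals P * (1 - P) / sigma (checked by a central difference).
share = logit_share(value, rivals, sigma)
step = 1e-6
numeric = (logit_share(value + step, rivals, sigma)
           - logit_share(value - step, rivals, sigma)) / (2 * step)
analytic = share * (1 - share) / sigma

# Property 2: with one rival, a utility loss from negative DRS lowers the share of
# high risks choosing the insurer, and raising the contract utility by the same
# amount restores the original share.
utility_loss = 0.4
base = logit_share(-3.0, [0.0], sigma)
deterred = logit_share(-3.0 - utility_loss, [0.0], sigma)
compensated = logit_share(-3.0 + utility_loss - utility_loss, [0.0], sigma)
print(numeric, analytic, base, deterred, compensated)
```

The second check mirrors the mechanism of Section 4.2.2: the share of high risks depends only on the perceived utility difference, so the insurer can raise the benefit level by the monetary equivalent of the utility loss without attracting more high risks.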
References

Bauhoff, S., 2012. Do health plans risk-select? An audit study on Germany's Social Health Insurance. Journal of Public Economics 96 (9–10), 750–759.
Bijlsma, M., Boone, J., Zwart, G., 2011. Competition leverage: how the demand side affects optimal risk adjustment. TILEC Discussion Paper, 2011-039.
Breyer, F., Bundorf, M.K., Pauly, M.V., 2011. Health care spending risk, health insurance, and payment to health plans. In: Pauly, M.V., McGuire, T.G., Barros, P.P. (Eds.), Handbook of Health Economics, vol. 2. Elsevier, Amsterdam, pp. 691–762.
Brown, J., Duggan, M., Kuziemko, I., Woolston, W., 2011. How does risk selection respond to risk adjustment? Evidence from the Medicare Advantage Program. NBER Working Paper 16977.
Cao, Z., McGuire, T.G., 2003. Service-level selection by HMOs in Medicare. Journal of Health Economics 22 (6), 915–931.
Eggleston, K., 2000. Risk selection and optimal health insurance-provider payment systems. Journal of Risk and Insurance 67 (2), 173–196.
Ellis, R.P., McGuire, T.G., 2007. Predictability and predictiveness of health care spending. Journal of Health Economics 26 (1), 25–48.
Frank, R.G., Glazer, J., McGuire, T.G., 2000. Measuring adverse selection in managed health care. Journal of Health Economics 19 (6), 829–854.
Glazer, J., McGuire, T.G., 2000. Optimal risk adjustment in markets with adverse selection: an application to managed care. American Economic Review 90 (4), 1055–1071.
Glazer, J., McGuire, T.G., 2002. Setting health plan premiums to ensure efficient quality in health care: minimum variance optimal risk adjustment. Journal of Public Economics 84 (2), 153–173.
Jack, W., 2006. Optimal risk adjustment with adverse selection and spatial competition. Journal of Health Economics 25 (5), 908–926.
Lorenz, N., 2013. Adverse selection and risk adjustment under imperfect competition. Trier University Working Paper 5/13. http://ideas.repec.org/p/trr/wpaper/201305.html
Lorenz, N., 2014. The interaction of direct and indirect risk selection. Trier University Working Paper 12/14. http://ideas.repec.org/p/trr/wpaper/201412.html
McGuire, T.G., Glazer, J., Newhouse, J.P., Normand, S.-L., Shi, J., 2013. Integrating risk adjustment and enrollee premiums in health plan payment. Journal of Health Economics 32 (6), 1263–1277.
Newhouse, J.P., Price, M., Huang, J., McWilliams, J.M., Hsu, J., 2012. Steps to reduce favorable risk selection in Medicare Advantage largely succeeded, boding well for Health Insurance Exchanges. Health Affairs 31 (12), 2618–2628.
Olivella, P., Vera-Hernandez, M., 2010. How complex are the contracts offered by health plans? SERIEs 1 (3), 305–323.
Rothschild, M., Stiglitz, J., 1976. Equilibrium in competitive insurance markets: an essay on the economics of imperfect information. Quarterly Journal of Economics 90 (4), 629–649.
Shen, Y., Ellis, R.P., 2002. How profitable is risk selection? A comparison of four risk adjustment models. Health Economics 11 (2), 165–174.
Shi, J., 2013. Efficiency in plan choice with risk adjustment and premium discrimination in Health Insurance Exchanges. Unpublished.
Spiegel, 2011. Fragwürdige Beratung: Krankenversicherer wimmelt Senioren ab. Spiegel, May 9, 2011. http://www.spiegel.de/wirtschaft/soziales/fragwuerdige-beratung-krankenversicherer-wimmelt-senioren-ab-a-761384.html
Starc, A., 2014. Insurer pricing and consumer welfare: evidence from Medigap. RAND Journal of Economics 45 (1), 198–220.
Train, K.E., 2009. Discrete Choice Methods with Simulation, 2nd ed. Cambridge University Press, New York.
van de Ven, W.P., Ellis, R., 2000. Risk adjustment in competitive health plan markets. In: Culyer, A.J., Newhouse, J.P. (Eds.), Handbook of Health Economics. Elsevier, Amsterdam, pp. 755–845.
van de Ven, W.P., van Vliet, R.C., 1992. How can we prevent cream skimming in a competitive health insurance market? The great challenge for the 90's. In: Zweifel, P., Frech, H. (Eds.), Health Economics Worldwide. Kluwer Academic Publishers, Dordrecht, Boston, London, pp. 23–46.
Zweifel, P., Breyer, F., Kifmann, M., 2009. Health Economics, 2nd ed. Springer, Berlin, Heidelberg.
Journal of Health Economics 42 (2015) 90–103
Article info

Article history:
Received 27 January 2014
Received in revised form 24 February 2015
Accepted 26 February 2015
Available online 6 March 2015

JEL classification: Q56; I18; Q53; I12; O13

Keywords: Infant mortality; Air pollution; Environmental regulation; China

Abstract

This study explores the impact of environmental regulations in China on infant mortality. In 1998, the Chinese government imposed stringent air pollution regulations, in one of the first large-scale regulatory attempts in a developing country. We find that the infant mortality rate fell by 20 percent in the treatment cities designated as "Two Control Zones." The greatest reduction in mortality occurred during the neonatal period, highlighting an important pathophysiologic mechanism, and was largest among infants born to mothers with low levels of education. The finding is robust to various alternative hypotheses and specifications. Further, a falsification test using deaths from causes unrelated to air pollution supports these findings.

http://dx.doi.org/10.1016/j.jhealeco.2015.02.004
0167-6296/© 2015 Elsevier B.V. All rights reserved.
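The headline estimate rests on comparing the change in infant mortality in regulated cities with the change in unregulated cities. The following is a minimal sketch of that difference-in-differences logic, not the paper's regression specification: the group means in `imr` are invented for illustration only, and the last lines merely back out the counterfactual mortality level implied by the reported figures of 3.29 fewer deaths per 1000 live births and a 20 percent reduction.

```python
def difference_in_differences(pre_treated, post_treated, pre_control, post_control):
    """Change in the treated group minus change in the control group."""
    return (post_treated - pre_treated) - (post_control - pre_control)

# Hypothetical group means of the infant mortality rate (deaths per 1000 live births).
imr = {
    "tcz_pre": 20.0,      # regulated (TCZ) cities before the regulation
    "tcz_post": 14.0,     # regulated (TCZ) cities after the regulation
    "non_tcz_pre": 22.0,  # unregulated cities before
    "non_tcz_post": 19.0, # unregulated cities after
}
effect = difference_in_differences(
    imr["tcz_pre"], imr["tcz_post"], imr["non_tcz_pre"], imr["non_tcz_post"]
)
print(effect)

# Reported figures: 3.29 fewer deaths per 1000, described as a 20 percent reduction.
# The implied counterfactual rate is approximate because 20 percent is a rounded figure.
implied_counterfactual_imr = 3.29 / 0.20
print(implied_counterfactual_imr)
```

The simple four-mean comparison only identifies the treatment effect if mortality in both groups of cities would have followed parallel trends without the regulation, which is the assumption the robustness checks in the paper are meant to probe.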
S. Tanaka / Journal of Health Economics 42 (2015) 90–103 91
environment, wherein the intensity of exposure to the regulations can be defined by the TCZ regulatory status, and we are able to compare changes in infant mortality rate (IMR) before and after the policy reform, between the cities assigned and not assigned as the TCZ.

To implement the analysis, we draw IMR data from the Chinese Disease Surveillance Points (DSP) system that collected birth and death registrations for 145 nationally representative sites from 1991 through 2000. IMR, defined as the number of infant deaths under age one per 1000 live births in a given year, is available for each DSP site by year level, linked with detailed information on birth characteristics and parental attributes. We match this dataset to the TCZ regulatory status assigned to individual cities, based on the governmental report, and thereby estimate the treatment effect of the regulations.

We find that the air pollution regulations led to significant reductions in infant mortality. The difference-in-differences estimates suggest that the regulations have led to 3.29 fewer infant deaths per 1000 live births than would have occurred in the absence of the regulations. This corresponds to a 20 percent reduction in IMR. Sixty-three percent of the reduction in infant mortality occurred during the neonatal period, highlighting an important pathophysiologic mechanism, and the greatest reduction of mortality occurred among children born to mothers with low educational levels.

A major methodological challenge, however, is that the TCZ designation rule may not be orthogonal to unobserved characteristics that contribute to reductions in air pollution and infant mortality. The present study conducts a number of robustness checks and a falsification test to address this issue. First, we confirm that the TCZ status has little association with changes in observable covariates, assuring that there is no systematic difference in concurrent trends in observable characteristics between the TCZ and non-TCZ cities. Although this is not a direct test of the exclusion restriction, since it requires that TCZ status not be correlated with trends in unobservable factors, this result leads us to believe that the treatment effect is less likely to be confounded by differential trends in unobservable factors as well (Altonji et al., 2005). Second, the regression is also directly adjusted for differential pre-existing trends in mortality, yet the estimates are essentially unaffected. Lastly, the policy had no impact on infant deaths due to accidental causes. The absence of a causal mechanism linking air pollution to these causes of death serves as falsification evidence, suggesting that differences in access to or quality of medical services and technologies cannot be the sources of bias. Overall, there is no evidence that the estimates are driven by inappropriate identification assumptions, leading us to believe that the treatment effect based on the TCZ status is indeed not spurious but causal.

This study makes three major contributions to the existing literature. First, by exploiting regulation-induced changes in air quality, it addresses a policy-relevant question: to what extent do environmental regulations in developing countries lead to reductions in infant mortality? Several prior studies have focused on variation in air quality induced by recession (Chay and Greenstone, 2003a), weekly fluctuations (Currie and Neidell, 2005), wildfires (Jayachandran, 2009), or wind directions (Luechinger, 2014). Chay and Greenstone (2003b) provide compelling evidence for the linkage between the Clean Air Act of 1970 and infant mortality in the U.S. It remains to be determined, however, whether, and how effectively, environmental regulations can improve human health in developing countries.2 A recent study by Greenstone and Hanna (2011) examines regulations on air pollution and water pollution in India since 1987. They find these regulations efficacious in reducing air pollution, but such reductions led to modest and statistically insignificant reductions in infant mortality. Our study provides an interesting contrast, finding that infant mortality significantly responded to the environmental regulation. Further, the regulation we focus on targeted coal for energy generation, which is the major contributor to air pollution in many other developing countries as well, whereas Greenstone and Hanna (2011) focus on vehicular pollution.3,4 The findings in this study accordingly present relevant estimates for the effect of environmental regulations in developing countries implementing similar policies on coal in the power industry.

Second, the present study contributes to our understanding of the relationship between air pollution and infant mortality at greater concentration levels. Previous evidence is predominantly derived from the United States or other developed countries, where pollution is relatively low.5 Since we know little about the shape of the dose–response relationship, it is consequently difficult to predict the marginal impact of pollution reduction in the presence of a non-linear relationship.6 Air pollution in China is one of the highest in the world. Its total suspended particulates (TSP) level in 1995 was four times higher than the WHO standards, and four times higher than the level in the United States in 1970, when the Clean Air Act was amended, as examined in Chay and Greenstone (2003b). Thus, estimates in China provide compelling evidence applicable to the distinctive context of developing countries where air pollution levels are relatively high.

Third, there is extensive literature showing differential patterns according to socioeconomic status, yet it is still an open question as to whether air pollution also exhibits differential impact on infant mortality (Currie and Hyson, 1999; Case et al., 2002; Jayachandran, 2009). While infants in poor countries are considered to be the most susceptible to the effect of pollution, not only because of high pollution levels but also because families lack the resources or knowledge necessary to avoid exposure, the impacts may be small if air pollution does not have a first-order effect on them.7 Thus, the present study helps identify vulnerable populations in designing policies.

2 Extrapolating evidence in developed countries to developing countries is difficult for a number of reasons. First and foremost, we do not know whether environmental regulations in developing countries had any impact in reducing pollution due to weaker implemental mechanisms and an enforcement system. Second, even if pollution is successfully reduced to some extent, infant mortality may not fall if a concave relationship between mortality and pollution level leads to low marginal pollution effect at high concentration levels. Third, magnitudes of impacts in reducing air pollution should be greater if people have limited access to medical services, initially have lower health status, and/or have limited knowledge in avoiding pollution.
3 Since the most heavily affected industry is the power industry, which was a driving force behind China's rapid economic growth, the findings in this study accordingly highlight the important tradeoffs among economic growth, environmental quality, and human health. See Tanaka et al. (2014) for the impact of the environmental regulation on industrial performance.
4 A relatively smaller-scale air quality regulatory regime, targeting a different industry, in another developing country, can be found in the Indian transportation sector, which was mandated to use compressed natural gas vehicles in Delhi during working hours. Kumar and Foster (2007) show its effect on respiratory health.
5 Examining the pollution effect at low levels, especially levels lower than what is often considered to be the standard, is also an interesting question in itself. Currie and Neidell (2005) find that CO has significant impact on infant mortality in California over the 1990s at relatively low levels.
6 For example, Arceo et al. (2012) show a non-linear relationship between CO and health in Mexico. Evidence in developed countries may understate the impact of pollution reduction in developing countries if the marginal impact of pollution is higher at greater concentration levels.
7 For example, people with poor health tend to stay indoors with little exposure to pollution. Children from rich households, who tend to have better health, may be exposed more to pollution if they are more likely to have outside activities.

The current research design has several advantages over previous studies. First, this study focuses on infants, not only because
92 S. Tanaka / Journal of Health Economics 42 (2015) 90–103
they are particularly vulnerable to air pollution due to their weak respiratory system, but also because focusing on infant mortality mitigates complicating factors associated with adult mortality. For example, adult deaths correlate more closely to chronic disease conditions than to acute changes in air quality. In addition, adults may migrate into less polluted areas. Addressing infant mortality circumvents these issues, if not completely, because it is relatively less difficult to identify causes of death during the first year of life, and because migration rates are low for pregnant women and infants. Lastly, China is not only one of the most polluted countries but also one of the first developing countries to regulate air pollution on such a large scale. It is evident that China serves as a rare research environment in which to assess the impact of environmental regulations at greater concentration levels.
The rest of the paper is organized as follows. Section 2 provides the historical background on air pollution and national air pollution regulations in China. Section 3 describes the data and the descriptive statistics. Section 4 presents the econometric framework and its validity, and Section 5 presents empirical results. Section 6 concludes.
2. Background on air pollution and regulations in China
2.1. Brief history
China is infamous for its air pollution, due to emissions from a power sector that relies heavily on coal to generate electric power. As the world’s largest coal producer, China possesses abundant and relatively cheap coal, which constitutes the country’s primary energy resource endowment, accounting for 75.5 percent of total energy production in 1995 (National Bureau of Statistics of China, 2006). However, coal generally emits more pollutants than other fossil fuels. As China underwent rapid economic growth, total SO2 emissions increased from 18.4 million tons in 1990 to 23.7 million tons in 1995, and the ambient air pollution rose to levels detrimental to human health (State Environmental Protection Agency [SEPA], 1996).
Fig. 1 illustrates the world distribution of TSP (Panel A) and SO2 (Panel B) concentration levels in 1995. The TSP level in Beijing, the capital city of China, was 377 µg/m3, roughly four times the WHO guideline of 90 µg/m3, and its SO2 concentration level was 90 µg/m3, almost double the WHO guideline of 50 µg/m3 (WHO, 2002). SO2 is also an important precursor of acid rain. From the 1980s to the mid-1990s, the area of China experiencing acid rain expanded by more than 1 million km2 (Yang and Schreifels, 2003).
During that decade, elevated air pollution gave rise to increasing public concern about adverse impacts on human health.8
In response, the Chinese government formulated a series of environmental regulatory policies, the first version of which was enacted in 1987, known as the Air Pollution Prevention and Control Law (APPCL). This original law, however, failed to reduce air pollution, mainly because it excluded the power sector, the major contributor of SO2 emissions (Qian and Zhang, 1998). Even worse, SO2 emissions continued to surge, and areas affected by acid rain expanded.
APPCL was consequently amended in 1995. The major part of the amendment was to include a section to regulate pollutant emissions and coal combustion, particularly regarding the usage of high sulfur-content coal, at power plants (Hao et al., 2007). Although the 1995 APPCL still had a weak enforcement mechanism and limited efficacy, a prominent feature of the amendment was to propose a future regional strategy, which would identify priority regions to improve air quality and prevent the spread of acid rain.9,10
This was officially approved and implemented as the Two Control Zone (TCZ) policy in January 1998 (State Council, 1998). This legislation designated prefectures exceeding nationally mandated thresholds as either acid rain control zone or SO2 pollution control zone.11 Based on the records in preceding years,12 prefectures were designated as SO2 pollution control zone if:
• Average annual ambient SO2 concentrations exceeded the Class II standard,13
• Daily average concentrations exceeded the Class III standard, or
• High SO2 emissions were recorded.14
Alternatively, prefectures were designated as acid rain control zone if:
• Average annual pH values for precipitation were less than or equal to 4.5,
• Sulfate deposition was greater than the critical load, or
• High SO2 emissions were recorded.
In total, 175 prefectures out of 333 prefectures across 27 provinces were designated as TCZs. They accounted for 40.6 percent of the country’s population, 62.4 percent of GDP, and 58.9 percent of the total SO2 emissions in 1995 (Hao et al., 2001). The SO2 pollution control zone was concentrated in the north due to high SO2 emissions for heating,15 whereas the acid rain control zones were primarily in the south, where heat, humidity, and solar radiation combine to create high atmospheric acidity. Hence, acid rain in the
8 It is generally known that the smaller a particulate, the more detrimental it is to health. For example, PM10 or PM2.5, the particles with a diameter of 10 or 2.5 micrometers or less, respectively, or toxic gas, such as SO2, are considered to be the most hazardous because, when inhaled, these particulate matters or gas can penetrate deep into the lungs and interfere with internal gas exchange. Further, Laden et al. (2000) find that fine particles emitted from combustion sources (i.e., motor vehicles or coal combustion) have a stronger association with mortality than those from non-combustion sources. Alternatively, SO2 becomes sulfuric acid when it interacts with water, which is the main component of acid rain that may have a direct or indirect impact on health. Yet, epidemiological evidence of the impact of SO2 on mortality in developed countries is somewhat mixed. While Mendelsohn and Orcutt (1979) show close associations between the two, SO2 is also considered a less important determinant of mortality (Schwartz and Marcus, 1990; Nielsen and Ho, 2007). Hedley et al. (2002) is one of the few intervention studies that investigates changes in SO2 caused by an overnight restriction on all power plants and road vehicles in Hong Kong using fuel oil with a sulfur content of more than 0.5 percent. They found that the intervention resulted in an immediate reduction in ambient SO2 concentrations and a reduction in death rates particularly due to respiratory and cardiovascular reasons. See also Aunan and Pan (2004) and Matus et al. (2012) for health effects of air pollution in China.
9 Article 27 of the 1995 APPCL stipulates: “The environmental protection department under the State Council together with relevant departments under the State Council may, in light of the meteorological, topographical, soil and other natural conditions, delimit the areas where acid rain has occurred or will probably occur and areas that are seriously polluted by sulfur dioxide as acid rain control areas and sulfur dioxide pollution control areas, subject to approval by the State Council.”
10 It is a standard practice of policy experimentation in China to implement strategies in a particular region or for a set period of time, attempting to demonstrate their effectiveness before expanding their implementation to the entire nation.
11 In this sense, the legislation can be considered to be parallel to the attainment and nonattainment county designation by the Clean Air Acts in the United States.
12 The original document does not specify exactly which years of records they refer to.
13 According to the Chinese National Ambient Air Quality Standards (CNAAQS) for SO2, Class I standard designates an annual average concentration level not exceeding 20 µg/m3, Class II ranges 20 µg/m3 < SO2 < 60 µg/m3, and Class III ranges 60 µg/m3 < SO2 < 100 µg/m3. Cities should meet Class II, which is considered to be less harmful.
14 The original document does not specify the levels of SO2 emissions that are considered to be “high.”
15 See Almond et al. (2009) for the impact of heating policy, which created a discrepancy in air quality north and south of the Huai River.
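The designation rules listed in Section 2.1 can be read as a small decision procedure. The sketch below is illustrative only: the class and function names are hypothetical, the 60 µg/m3 annual limit is the Class II upper bound quoted in footnote 13, and the daily Class III limit and the critical load are caller-supplied inputs because the text does not state them. "High SO2 emissions" is passed as a flag because the original policy document does not define it (footnote 14).

```python
from dataclasses import dataclass

# Annual Class II upper bound for SO2 (µg/m3), as quoted in footnote 13.
CLASS_II_ANNUAL_SO2 = 60.0
# Annual mean precipitation pH at or below this value meets the acid rain criterion.
ACID_RAIN_PH_LIMIT = 4.5


@dataclass
class PrefectureRecord:  # hypothetical container, not from the paper
    annual_so2: float          # annual mean ambient SO2, µg/m3
    daily_so2: float           # daily mean ambient SO2, µg/m3
    annual_ph: float           # annual mean pH of precipitation
    sulfate_deposition: float  # same units as critical_load
    high_so2_emissions: bool   # "high" is not defined in the source


def meets_so2_zone_criteria(p: PrefectureRecord, daily_class_iii_limit: float) -> bool:
    """Any one of the three SO2 pollution control zone criteria suffices."""
    return (
        p.annual_so2 > CLASS_II_ANNUAL_SO2
        or p.daily_so2 > daily_class_iii_limit
        or p.high_so2_emissions
    )


def meets_acid_rain_zone_criteria(p: PrefectureRecord, critical_load: float) -> bool:
    """Any one of the three acid rain control zone criteria suffices."""
    return (
        p.annual_ph <= ACID_RAIN_PH_LIMIT
        or p.sulfate_deposition > critical_load
        or p.high_so2_emissions
    )


def is_tcz(p: PrefectureRecord, daily_class_iii_limit: float, critical_load: float) -> bool:
    """A prefecture is a TCZ if it qualifies for either control zone."""
    return meets_so2_zone_criteria(p, daily_class_iii_limit) or meets_acid_rain_zone_criteria(
        p, critical_load
    )


# Hypothetical prefecture; 90 µg/m3 is the 1995 Beijing SO2 level quoted in the text.
# The daily limit (250.0) and critical load (2.0) are placeholder values.
example = PrefectureRecord(
    annual_so2=90.0, daily_so2=120.0, annual_ph=5.6,
    sulfate_deposition=1.0, high_so2_emissions=False,
)
print(is_tcz(example, daily_class_iii_limit=250.0, critical_load=2.0))
```

Returning the two zone checks separately, rather than a single label, reflects the fact that the high-emissions criterion appears in both lists, so the text alone does not determine which zone such a prefecture joins.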
Fig. 1. Air pollution across countries. Notes: These figures present the world distribution of TSP (Panel A) and SO2 (Panel B) in 1995. Source: World Bank (1998).
south cannot necessarily be attributed to SO2 emissions traveling down from the north, but is rather due to local emissions. This is even more evident because acid deposition is the greatest in the summer, when wind direction is generally south to north.
The TCZ status enforced more stringent regulations mandating reduced use of high-sulfur coal and the development of clean coal technology. For example:
• No new coal mines producing coal with a sulfur content higher than 3 percent can be established, and existing mines that produce such coal must gradually be shut down or reduce output.
• Construction of any new coal-burning thermal power plants in large and medium-sized prefectures is prohibited.
• All new and renovated power plants are required to use coal with less than 1 percent sulfur content.
• Existing power plants using coal with sulfur content above 1 percent are required to install flue gas desulfurization (FGD) equipment.
2.2. Effectiveness of TCZ policy
Various studies have documented the effectiveness of TCZ regulatory actions in reducing pollutant emissions and improving air quality. For example, at the national level, SO2 emissions fell from 23.67 million tons in 1995 to 19.95 million tons in 2000, and the percentage of prefectures exceeding the Class II standard fell from
4. Empirical framework
The main objective of this study is to assess the effect of air pollution regulations on infant mortality. In an ideal research setting, the TCZ status is randomly assigned across cities, creating variation uncorrelated with baseline characteristics. In the absence of a randomized controlled trial, we first use a simple difference-in-differences (DID) approach, based on the TCZ regulatory status:
Fig. 2. Trends in infant mortality rate. Notes: This figure plots the trend of infant mortality rate due to internal causes between the TCZ and non-TCZ cities. The annual mean is calculated using the population as the weight. The solid vertical line indicates the timing of TCZ policy implementation in January 1998. Because each observation represents the annual average value, the solid vertical line is located between 1997 and 1998 to clarify the timing of the implementation.
Table 2
Balancing test by the TCZ status.
Trend difference
(1) (2)
All Years 1996–2000
Table 4
Identifying the biological mechanism.
Notes: Dependent variables are: birthweight in grams in column (1), the number of infants born weighing less than 2500 g per 1000 live births in column (2), gestation period in weeks in column (3), the number of infants born in less than 32 weeks per 1000 live births in column (4), IMR within one day in column (5), IMR within 28 days in column (6), and IMR within 6 months in column (7). The mean values of the respective variables (weighted by total population) and their standard deviations in square brackets are provided based on the observations before 1995.
* Significant at p < 0.1 level.
** Significant at p < 0.05 level.
share of gestation period below 32 weeks, the lowest one percentile level. Neither estimate is significant, indicating that the impact on low birthweight is not driven by changes in length of gestation period.
The results above suggest that fetal exposure to pollution affecting fetal development appears to be key. However, birthweight and length of gestation period may not fully capture fetal development. Hence, we now explore the effect on deaths occurring at different time periods. Infant deaths occurring during the neonatal period (within 28 days after birth) are generally considered to be associated with poor fetal development (Chay and Greenstone, 2003a,b).31 Column (5) presents the impacts on infant deaths that occurred within one day of birth. The point estimate is small yet marginally significant at the 10 percent level, implying that 26 percent of overall reduction in IMR occurred within one day. The magnitude is in line with Chay and Greenstone (2003b), who estimate that roughly 22 percent of overall infant deaths occurred within one day. Yet this contrasts with the finding in Chay and Greenstone (2003a), who attribute roughly 60 percent of overall impact to infant deaths within one day. Column (6) reveals that the regulations are disproportionately more associated with the probability of death during the neonatal period, indicating that 63 percent of the effect of the regulations on infant mortality is due to reductions in this period.32 This corresponds to the findings in Chay and Greenstone (2003a) and Chay and Greenstone (2003b), who find that 73–82 percent and 80 percent of infant mortality reductions, respectively, occurred in the same period. Overall, these findings highlight weak fetal development via maternal exposure as an important biological mechanism.
A related concern is that the regulations are confounded by other concurrent changes in factors contributing to the reductions in infant mortality. For example, in response to high exposure to pollution in the TCZ cities, the local governments may have increased healthcare spending, leading to an improvement in quality and/or quantity of healthcare services. Then, without directly controlling for a local health policy, a simple DID estimate would erroneously pick up effects through a healthcare policy, causing an upward bias. In order to rule out such a possibility, we examine the regulations’ effect on infant mortality by cause of death. If maternal exposure is a primary channel, we expect to see larger changes in mortality associated with prenatal disorders as opposed to postnatal causes. There is added significance to this analysis: the estimates are directly comparable to the falsification test that examines the effect on external causes of deaths unrelated to air pollution.
Given that the exact etiology and pathology of diseases caused by fetal exposure to air pollution are not yet known, the most important comparison is between internal and external causes of deaths. We compute IMR due to internal deaths to include all health-related, non-accidental causes that are potentially associated with air pollution, excluding infant deaths due to external causes, that is, those that clearly do not pertain to air pollution: injury and poisoning (Chay and Greenstone, 2003a,b).
As predicted, Table 5 shows statistically significant effects on mortality from health-related causes.33 When effects are estimated separately for four major health-related causes,34 the estimates correspond to a reduction of 49.7 percent in mortality from nervous system disorders and a reduction of 32.7 percent in circulatory system disorders. These findings are consistent with the vast literature that also finds strong associations between these birth defects and maternal smoking during pregnancy (which essentially works through a mechanism similar to air pollution exposure) (see, for example, Fried, 1995; Brennan et al., 1999).
On the other hand, we find no significant effects on infectious, parasitic and respiratory diseases. Low rates of respiratory diseases indicate that these are not typical causes of death for infants, particularly during the neonatal period, who spend most of their time indoors, while these diseases have been found to be a major cause of death for children or, at the youngest, in the post-neonatal period (see, for example, Woodruff et al., 1997; Borja-Aburto et al., 1997; Bobak and Leon, 1999; Woodruff et al., 2006).
Most importantly, we find no statistically significant effect on mortality from external causes such as injury and poisoning. If
31 According to the WHO, infant deaths during the neonatal period are also associated with preterm birth, intrapartum-related complications, and infections, and thus we need caution in attributing it solely to fetal development, as there is also a possibility that post-natal exposure plays a role.
32 Note that because the analysis is based on the birth record, we need to keep in mind that deaths in a shorter period are more likely to be recorded and be cautious about the estimates. We appreciate this discussion by an anonymous referee.
33 Over the study period between 1991 and 2000, 51 percent of infant deaths are due to diseases of the circulatory system, 19.4 percent come from nervous systems and sense organs, and 19 percent are due to external causes.
34 Note that this exercise of separately estimating effects for different “internal” causes does not itself test whether the fundamental cause is due to air pollution or others, since, as noted above, the exact pathway is not known. Indeed, Chay and Greenstone (2003a,b) do not disentangle effects among internal causes for this reason. We do this simply to highlight variation across disease types given our data capacity to do so.
Table 5
Effects on IMR by cause of death.
Accidental causes
Injury and poisoning 2.01 −0.39
[7.21] (0.48)
Notes: All specifications include year fixed effects, DSP site fixed effects, household attributes, and district attributes. The number of observations is 1281. Standard deviations
are reported in the square brackets, and robust standard errors are reported in the parentheses.
* Significant at p < 0.1 level.
*** Significant at p < 0.01 level.
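To make the surviving Table 5 row concrete, the sketch below converts the external-cause coefficient into a percentage of the sample mean and checks its significance. It assumes, following the table notes, that 2.01 [7.21] is the mean and standard deviation and that −0.39 (0.48) is the coefficient and robust standard error. The function names are hypothetical, and the normal approximation for the p-value is an assumption, since the excerpt does not state the reference distribution.

```python
import math


def percent_of_mean(coef: float, baseline_mean: float) -> float:
    """Express a coefficient as a percentage of the baseline mean."""
    return 100.0 * coef / baseline_mean


def two_sided_p_normal(coef: float, se: float) -> float:
    """Two-sided p-value for coef/se under a standard normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2.0))


# Injury and poisoning row of Table 5 (values copied from the table).
mean_external, coef_external, se_external = 2.01, -0.39, 0.48

print(round(percent_of_mean(coef_external, mean_external), 1))   # -19.4
print(two_sided_p_normal(coef_external, se_external) > 0.1)      # True
```

The point estimate is about 19 percent of the mean, but the ratio of coefficient to standard error is below one in absolute value, which is why the table reports no significance stars for this row.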
Notes: Each cell reports the coefficient of interest and standard errors in parentheses from separate regressions for the respective subsample. The dependent variables are IMR, the number of infants born at less than 2500 g per 1000 live births, and the number of infants born in less than 32 weeks per 1000 live births, respectively. Low maternal education is defined as educational attainment less than high school, and high maternal education is equal to or above high school completion. Note that the analysis based on mother’s education inevitably drops deaths associated with missing mother’s education.
* Significant at p < 0.1 level.
** Significant at p < 0.05 level.
*** Significant at p < 0.01 level.
5.3. Heterogeneity in the regulations’ effect
This part tests the hypothesis that the regulations on air pollution may have a heterogeneous impact on infant mortality across various subsamples. We first search for heterogeneity in the treatment effects between boys and girls due to biologically based gender differences given that, in the literature, male fetuses are considered to be more physiologically sensitive than female fetuses to environmental changes. On the other hand, heteroge-
have more access to health services to treat their children. Families with high socioeconomic status also have a greater degree of mobility to better areas, while the poor may continue to be exposed to greater pollution. On the other hand, the effect may be smaller among poor households, if long-term exposure to pollution in poor areas allowed them to be more adept at keeping infants indoors.
The third row reports the estimated effects for the sample of households where maternal education was less than a high school degree, and the fourth row for the sample of households where mothers attained at least a high school education. We find that the regulations’ effect is substantially higher among households with low maternal education.36 The finding suggests that the regulations’ effect on infant mortality should be stronger for the low-socioeconomic families that are more vulnerable to the effects of air pollution.
5.4. Additional robustness checks
The findings above leave little scope for confounding factors. First, little association between TCZ status and trends in observable characteristics limits the possibility that the main results erroneously reflect time trends that vary systematically between the TCZ and the non-TCZ cities. Second, the consistency of the regulations’ effect in both magnitude and statistical significance when controlling for the set of key determinants of infant mortality suggests that the estimated effects are robust to comparisons with similar characteristics. Third, the absence of treatment effect in the falsification exercise on external infant mortality provides strong evidence that health care system reform or medical technology advancement cannot be a source of bias.
Table 7
Robustness checks.
Specifications                     Coeff.
Use death record                   −3.45* (1.84)
Use 1996–2000                      −3.57** (1.44)
Only districts and cities          −3.62** (1.64)
Common support                     −3.24*** (1.12)
Eliminate outliers                 −2.73*** (0.95)
Include province × year effects    −3.50*** (1.27)
Control for 1995–1997              −4.08*** (1.56)
Weight by number of birth          −2.92** (1.26)
Use 1995 cut-off                   −2.92* (1.61)
Cluster at prefecture level        −3.205*** (1.076)
Only northern China                −3.138* (1.657)
Only southern China                −3.258** (1.497)
Use 1991–1997 sample               −1.967 (1.901)
Notes: The table provides robustness checks of the main results to various other hypotheses and specifications. See the text for their explanations.
* Significant at p < 0.1 level.
** Significant at p < 0.05 level.
*** Significant at p < 0.01 level.
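The significance markers in Table 7 can be cross-checked from the reported coefficients and the values in parentheses. The sketch below is an illustration under two assumptions that the excerpt does not confirm: that the parenthesized values are standard errors, and that a two-sided test under the standard normal approximation is an adequate stand-in for whatever reference distribution the paper used. Function names are hypothetical.

```python
import math


def two_sided_p_normal(coef: float, se: float) -> float:
    """Two-sided p-value for coef/se under a standard normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2.0))


def stars(p: float) -> str:
    """Map a p-value to the table's significance markers."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# (specification, coefficient, value in parentheses, stars as printed in Table 7)
TABLE7 = [
    ("Use death record", -3.45, 1.84, "*"),
    ("Use 1996-2000", -3.57, 1.44, "**"),
    ("Only districts and cities", -3.62, 1.64, "**"),
    ("Common support", -3.24, 1.12, "***"),
    ("Eliminate outliers", -2.73, 0.95, "***"),
    ("Include province x year effects", -3.50, 1.27, "***"),
    ("Control for 1995-1997", -4.08, 1.56, "***"),
    ("Weight by number of birth", -2.92, 1.26, "**"),
    ("Use 1995 cut-off", -2.92, 1.61, "*"),
    ("Cluster at prefecture level", -3.205, 1.076, "***"),
    ("Only northern China", -3.138, 1.657, "*"),
    ("Only southern China", -3.258, 1.497, "**"),
    ("Use 1991-1997 sample", -1.967, 1.901, ""),
]

for label, coef, se, reported in TABLE7:
    p = two_sided_p_normal(coef, se)
    print(f"{label:34s} z={coef / se:6.2f}  p={p:.3f}  computed={stars(p):3s} reported={reported}")
```

Under these assumptions the recomputed markers agree with the printed ones in every row, including the placebo row, whose ratio of about −1.03 is far from any conventional threshold.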
In this subsection, we extensively explore additional robustness checks to rule out other possible scenarios. First, we examine whether the finding is robust when using an alternative dataset. As discussed above, using the birth record, which reports the occurrence of deaths only within the calendar year, may result in understating the effect, if the number of infant deaths is truncated at zero. The death record allows us to compute IMR for all deaths occurring before the age of one. As expected, the first row of Table 7 shows that the size of the estimate becomes larger, though not substantially different from the main result, indicating that the effect is not sensitive to using the death record, while the main analysis may understate the overall impact, if any.
Second, we confirm that the treatment effect is not driven by other national or local policy changes. In the second row, we restrict the sample to the years between 1996 and 2000, which corresponds to the period of the 9th Five-Year Plan. The fact that the time period is shorter and falls within one policy regime helps reduce a set of potential confounders in the pre- and post-natal health environment other than pollution. Also, Table 2 shows that all observed characteristics had balanced trends during this short period. The estimated effect is similar, suggesting that various other national or local policy changes should not confound the effect.
Third, despite evidence that the treatment effect is robust to heterogeneous city-specific trends in IMR, as shown in Table 3, there may still be a concern that the TCZ status may be correlated with administrative divisions. For example, urban districts and county-level cities may be more likely to be treated, whereas poor counties may be less likely to be assigned as TCZ. To address this issue, we restrict the sample to only districts and county-level cities, mostly urban areas, in the third row. The estimate is larger and remains significant at the 5 percent level, indicating that the treatment effect is not driven by simple comparisons between urban and rural areas.
A major concern with any non-experimental study is the possibility that omitted heterogeneity may give rise to a spurious relationship. Controlling for the attributes, as in columns (2) and (3) in Table 3, may not solve this issue when we cannot compare the distribution of attributes across TCZ and non-TCZ cities. We address this issue in two ways. In the fourth row, we limit to the observations under common support using the propensity score, which potentially restricts the sample to DSP sites that have similar observed characteristics. In doing so, we first compute propensity scores of being TCZ based on the household and DSP attributes used in the main analysis. Then, we re-estimate the treatment effects using observations only under the common support. Alternatively, in the fifth row, we examine whether the main findings are robust to eliminating outliers. Specifically, we eliminate DSP sites whose IMR is above the 99th percentile. The magnitudes of both estimates are similar.
Another concern is that there may be unobserved policy changes that affected infant mortality. In addressing this, we control for province-by-year fixed effects in the sixth row. In this specification, the effect of the regulations on infant mortality is identified using variation in regulatory stringency before and after 1998 within the province, thus purging any potential effects resulting from any other policy changes at the provincial level. Further, in the seventh row, we control for the years between 1995 and 1997, the intermediate period after the APPCL and before the TCZ policy was in effect. Note that this additional variable controls for the immediate impacts of APPCL, if any, but does not directly rule out delayed impacts of APPCL after 1998. Both estimates again remain similar.
36 Note that, although neither of these estimates in themselves nor the differences between them are statistically significant, when we restrict the IMR to internal causes, the coefficient for low maternal education is −4.40 and significant at the 5 percent level, whereas that for high maternal education is −0.34 and not statistically significant.
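The common-support restriction described in Section 5.4 can be sketched as follows. The paper does not say which trimming rule was applied, so the min-max overlap rule used here is an assumption, chosen because it is a common default. The propensity scores are taken as given rather than estimated, and the site labels, scores, and function names are invented for illustration.

```python
def common_support_bounds(treated_scores, control_scores):
    """Min-max overlap: the score range covered by both groups."""
    lower = max(min(treated_scores), min(control_scores))
    upper = min(max(treated_scores), max(control_scores))
    return lower, upper


def restrict_to_common_support(sites):
    """Keep sites whose propensity score lies inside the overlap region.

    `sites` is a list of (site_id, is_tcz, propensity_score) tuples.
    """
    treated = [score for _, is_tcz, score in sites if is_tcz]
    control = [score for _, is_tcz, score in sites if not is_tcz]
    lower, upper = common_support_bounds(treated, control)
    return [site for site in sites if lower <= site[2] <= upper]


# Invented example: three TCZ sites and three non-TCZ sites.
sites = [
    ("A", True, 0.9), ("B", True, 0.6), ("C", True, 0.4),
    ("D", False, 0.7), ("E", False, 0.3), ("F", False, 0.1),
]
kept = restrict_to_common_support(sites)
print([site_id for site_id, _, _ in kept])  # ['B', 'C', 'D']
```

Here the overlap region is [0.4, 0.7], so the TCZ site with score 0.9 and the two non-TCZ sites with scores below 0.4 are dropped before the treatment effect would be re-estimated.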
Next, we re-estimate the main analysis using different specifications. Namely, we weight the regressions using the population aged 0 in the eighth row;37 we use 1995 as a cut-off year instead of 1998 to incorporate all years after the APPCL was amended in the ninth row;38 we cluster the standard error at the prefecture level in the tenth row;39 we use only northern China in the eleventh row; we use only southern China in the twelfth row.40 All these estimated effects remain similar, showing robustness to alternative specifications.
Lastly, we repeat the main analysis using samples only in the pre-reform period in the last row. We use 1995 as a placebo cut-off year, and thus years between 1995 and 1997 are defined as “post”-reform observations. The point estimate is smaller in magnitude and statistically indistinguishable from zero, reassuring that the trends between the TCZ and non-TCZ cities are similar in the pre-reform period.
Taken together, there is no evidence to indicate that the main results are driven by inappropriate identification assumptions, leading us to believe that the relationship is causal. The collection of these robustness checks substantially limits the scope of omitted variables, leading us to believe that the main findings substantiate the causal impact of environmental protection on infant mortality.
6. Conclusions
China suffers from notoriously bad air pollution, the health effects of which have been of increasing public concern. The TCZ policy, which went into effect in 1998, was one of the largest-scale air pollution regulatory schemes ever implemented in a developing country, imposing stringent regulations on pollutant emissions from power plants in cities exceeding the nationally mandated standards.
The major objective of this paper is to test the hypothesis that these regulations led to reductions in infant mortality within the TCZ cities subjected to particularly stringent regulations. Using the difference-in-differences approach, comparing changes in infant mortality between the cities assigned and not assigned as the TCZ, before and after the policy reform, we find substantial impacts: infant mortality decreased by 20 percent; a large fraction of the reduction occurred in the neonatal period, which can be attributed to fetal exposure; and infants of mothers with low levels of education benefited the most.
The set of falsification tests and robustness checks limits the role of omitted variables in biasing these estimates, leading us to believe that the linkage between the air pollution regulations and infant mortality reduction is causal. First, the estimates are robust to DSP site-specific trends in IMR. Second, we confirm that the treatment effect is absent for infant deaths caused by external causes, ruling out a potential mechanism through an improved local healthcare system. Lastly, the estimates are robust to various alternative hypotheses, i.e., using the death record, controlling for other policies at national or local levels, and limiting to only urban areas.
The findings in this study have important implications for policy. First, the question of whether, and to what extent, air pollution regulations in developing countries can lead to reducing infant mortality has remained largely unanswered. This study highlights a significant reduction in infant mortality correlating to air pollution reductions,
in contrast to Greenstone and Hanna (2011) who find that air pollution reductions in India had only modest and insignificant impact on infant mortality. Our results substantiate compelling evidence to support air pollution regulations in countries that suffer from high levels of air pollution. The size of the benefits itself may not be enough to justify environmental protection without considering their costs. In the United States, the Clean Air Act Amendments are found to have caused distortions on productivity (Gollop and Roberts, 1983; Barbera and McConnell, 1990; Greenstone et al., 2012), firms’ location decisions (Henderson, 1996; Becker and Henderson, 2000; List et al., 2003), employment (Greenstone, 2002; Deschenes, 2010; Walker, 2011, 2013), and foreign direct investment inflows and outflows (Eskeland and Harrison, 2003; Keller and Levinson, 2002; Hanna, 2010). On the other hand, Tanaka et al. (2014) provide compelling evidence that polluting firms in the TCZ cities substantially improved economic performance through increased market dynamics via the entry of more efficient firms and the exit of less efficient ones. In addition, costs of reducing
37 This is intended to approximate weighting the regression by the number of births, since we do not have information on the actual number of births.
38 The main analysis uses 1998 as a cut-off year for the post-reform period, rather than 1995 when the APPCL was amended, based on both theoretical and empirical rationales. For theoretical purposes, it is plausible to take two to three years before the regulations are carried out to the full extent. Chay and Greenstone (2005) rely on a similar argument when they use 1975 nonattainment status as an instrumental variable in estimating the effect of the 1970 Clean Air Act on housing prices between 1970 and 1980. In their context, the nonattainment status changes every year. By using the mid-decade regulation, they also take into account a two- to three-year lag before the policy was fully executed. This is also relevant in our context because the regulations required power plants to alter the energy sources and install costly technology (such as FGD). Informal conversations with officials at local power plants provide anecdotal evidence to support this assertion. Time lags are also likely in China because it is common for the government to set policy targets or guidelines, often very ambitious ones, without specifying the critical details until later, thereby largely leaving implementation up to the local governments or individual firms. Further, the 1995 amendment had a weak implementation mechanism, and more drastic actions (such as shutting down numerous inefficient power plants and enforcing stringent air pollution regulations) were enforced only after the TCZ policy in 1998. The empirical rationale for the cut-off date is based on the finding that the interaction term between the TCZ status and the post-1998 period better
pollution are likely to be smaller under convex marginal cost func-
balances important determinants of infant mortality, compared with using 1995 as
a cut-off year, suggesting that the former is less likely to be confounded. Further, tions. Therefore, our finding is likely to pass cost-benefit analysis
using a 1998 cut-off year enables us to restrict the sample to observations between and suggest further implementation of air pollution regulations in
1996 and 2000, where the interaction term is not correlated with any observable similarly polluted countries.
variables. Therefore, using 1998 better averts omitted variable bias. Second, while the precise mechanisms through which air pol-
39
The main results are clustered at DSP level for two reasons. (1) It is conventional
to cluster the standard errors at the treatment-site level, and in our case, the treat-
lution reductions lead to health benefits is not known, our findings
ment status varies at the DSP level, not the prefecture level. There are cases that one highlight substantial reductions in infant mortality during the
prefecture accommodates multiple DSP sites that have different treatment status, neonatal period, shedding light on maternal exposure to pollu-
i.e., one is a district or county-level city, while the other is a county. (2) Indeed, tion as a potential pathophysiologic mechanism. This necessitates
most DSP sites are in different prefecture. Out of 145 DSP sites, there are only 12
additional policy interventions to protect pregnant women against
cases that we observe multiple sites in a single prefecture (two of which are three
DSP sites in one prefecture, and the rest of the cases include two DSP sites in one environmental risks.
prefecture). This robustness check clustering at the prefecture level should thus be Third, our findings identify that children in households with
seen as accounting for serial correlations within prefecture. low maternal education are particularly susceptible to fluctua-
40
The northern and southern China is defined as prefectures accommodating SO2 tions in air quality. Although all individuals are potentially exposed
control zones (north) or acid rain control zone (south) (or more so if accommo-
dating both). We eliminated Tibet Autonomous Region, Qinghai Province, Xinjiang
to ambient pollution, the evidence indicates that socioeconomic
Autonomous Region, as they are not typically perceived as either northern or south- status cushions the effect of air pollution, either through behav-
ern China. ioral factors in avoiding pollution or socioeconomic factors such
102 S. Tanaka / Journal of Health Economics 42 (2015) 90–103
as increased access to medical care. As such, our findings provide justification for interventions targeting low-income households, including the provision of information about the effects of air pollution.

This study has clear policy implications for developing countries in general: namely, while climate change does not currently appear to be a sufficiently strong motivation for these countries to embark on more aggressive air pollution regulations, our findings leave little doubt that protecting the environment is vital to improving domestic public health.

Acknowledgements

I am indebted to Daniele Paserman, Dilip Mookherjee, Tavneet Suri, and Wesley Yin for invaluable advice and feedback. I am also grateful to Lucas Davis, Esther Duflo, Michael Greenstone, Hsueh-Ling Huynh, Kelsey Jack, Ginger Zhe Jin, Hiroaki Kaido, Kevin Lang, Adriana Lleras-Muney, Michael Manove, Kenneth A. Rahn, Leena Rudanko, Marc Rysman, Johannes Schmieder, Jeremy Smith, three anonymous reviewers, and seminar participants at Boston University, Hiroshima University, Loyola Marymount University, National University of Singapore, Tufts University, University of Washington, 2010 NEUDC, and the 2011 Royal Economic Society for their comments and suggestions. I also thank the Chinese Center for Disease Control and Prevention for sharing the data. Financial support from the Institute for Economic Development at Boston University, as well as the Hewlett/IIE Dissertation Fellowship in Population, Reproductive Health and Economics, is gratefully acknowledged. Jieshuang He provided excellent research assistance. All remaining errors are my own.

References

Almond, D., Chay, K.Y., Lee, D.S., 2005. The costs of low birth weight. Quarterly Journal of Economics 120 (3), 1031–1083.
Almond, D., Chen, Y., Greenstone, M., Li, H., 2009. Winter heating or clean air? Unintended impacts of China’s Huai River policy. American Economic Review Papers and Proceedings 99 (2), 184–190.
Altonji, J.G., Elder, T.E., Taber, C.R., 2005. Selection on observed and unobserved variables: assessing the effectiveness of catholic schools. Journal of Political Economy 113 (1), 151–184.
Arceo, E., Hanna, R., Oliva, P., 2012. Does the effect of pollution on infant mortality differ between developing and developed countries? Evidence from Mexico City. NBER Working Paper No. 18349.
Aunan, K., Pan, X.C., 2004. Exposure-response functions for health effects of ambient air pollution applicable for China: a meta-analysis. Science of the Total Environment 329, 3–16.
Barbera, A.J., McConnell, V.D., 1990. The impact of environmental regulations on industry productivity: direct and indirect effects. Journal of Environmental Economics and Management 18 (1), 50–65.
Becker, R., Henderson, V., 2000. Effects of air quality regulations on polluting industries. Journal of Political Economy 108 (2), 379–421.
Behrman, J.R., Rosenzweig, M.R., 2004. Returns to birthweight. Review of Economics and Statistics 86 (2), 586–601.
Bleakley, H., 2007. Disease and development: evidence from hookworm eradication in the American South. Quarterly Journal of Economics 122 (1), 73–117.
Bobak, M., Leon, D.A., 1999. The effect of air pollution on infant mortality appears specific for respiratory causes in the postneonatal period. Epidemiology 10 (6), 666–670.
Borja-Aburto, V.H., Loomis, D.P., Bangdiwala, S.I., Shy, C.M., Rascon-Pacheco, R.A., 1997. Ozone, suspended particulates, and daily mortality in Mexico City. American Journal of Epidemiology 145 (3), 258–268.
Brennan, P.A., Grekin, E.R., Mednick, S.A., 1999. Maternal smoking during pregnancy and adult male criminal outcomes. Archives of General Psychiatry 56 (3), 215–219.
Case, A., Fertig, A., Paxson, C., 2005. The lasting impact of childhood health and circumstance. Journal of Health Economics 24 (2), 365–389.
Case, A., Lubotsky, D., Paxson, C., 2002. Economic status and health in childhood: the origins of the gradient. American Economic Review 92 (5), 1308–1334.
Chay, K.Y., Greenstone, M., 2003a. The impacts of air pollution on infant mortality: evidence from geographic variation in pollution shocks induced by a recession. Quarterly Journal of Economics 118 (3), 1121–1167.
Chay, K.Y., Greenstone, M., 2003b. Air quality, infant mortality, and the Clean Air Act of 1970. NBER Working Paper No. 10053.
Chay, K.Y., Greenstone, M., 2005. Does air quality matter? Evidence from the housing market. Journal of Political Economy 113 (2), 376–424.
Currie, J., Hyson, R., 1999. Is the impact of health shocks cushioned by socioeconomic status? The case of low birthweight. American Economic Review 89 (2), 245–250.
Currie, J., Neidell, M., 2005. Air pollution and infant health: what can we learn from California’s recent experience? Quarterly Journal of Economics 120 (3), 1003–1030.
Currie, J., Neidell, M.J., Schmieder, J., 2009. Air pollution and infant health: lessons from New Jersey. Journal of Health Economics 28 (3), 688–703.
Dejmek, J., Selevan, S.G., Benes, L., Solansky, I., Sram, R.J., 1999. Fetal growth and maternal exposure to particulate matter during pregnancy. Environmental Health Perspectives 107 (6), 475–480.
Deschenes, O., 2010. Climate policy and labor markets. NBER Working Paper No. 16111.
Deschenes, O., Greenstone, M., Shapiro, J.S., 2012. Defensive investments and the demand for air quality: evidence from the NOX budget program and ozone reductions. NBER Working Paper No. 18267.
Eskeland, G.S., Harrison, A.E., 2003. Moving to greener pastures? Multinationals and the pollution haven hypothesis. Journal of Development Economics 70 (1), 1–23.
Fried, P.A., 1995. Prenatal exposure to marihuana and tobacco during infancy, early and middle childhood: effects and an attempt at synthesis. Archives of Toxicology Supplement 17, 233–260.
Gollop, F.M., Roberts, M.J., 1983. Environmental regulations and productivity growth: the case of fossil-fueled electric power generation. Journal of Political Economy 91 (4), 654–674.
Greenstone, M., 2002. The impacts of environmental regulations on industrial activity: evidence from the 1970 and 1977 Clean Air Act Amendments and the census of manufactures. Journal of Political Economy 110 (6), 1175–1219.
Greenstone, M., Hanna, R., 2011. Environmental regulations, air and water pollution, and infant mortality in India. NBER Working Paper No. 17210.
Greenstone, M., List, J.A., Syverson, C., 2012. The effects of environmental regulation on the competitiveness of U.S. manufacturing. NBER Working Paper No. 18392.
Hanna, R., 2010. US environmental regulation and FDI: evidence from a panel of US-based multinational firms. American Economic Journal: Applied Economics 2 (3), 158–189.
Hao, J., He, K., Duan, L., Li, J., Wang, L., 2007. Air pollution and its control in China. Frontiers of Environmental Science and Engineering in China 1 (2), 129–142.
Hao, J., Wang, S., Liu, B., He, K., 2001. Plotting of acid rain and sulfur dioxide pollution control zones and integrated control planning in China. Water, Air, and Soil Pollution 130, 259–264.
He, K., Huo, H., Zhang, Q., 2002. Urban air pollution in China: current status, characteristics, and progress. Annual Review of Energy and the Environment 27, 397–431.
Hedley, A.J., Wong, C.M., Thach, T.Q., Ma, S., Lam, T.H., Anderson, H.R., 2002. Cardiorespiratory and all-cause mortality after restrictions on sulphur content of fuel in Hong Kong: an intervention study. Lancet 360 (9346), 1646–1652.
Henderson, V., 1996. Effects of air quality regulation. American Economic Review 86 (4), 789–813.
Jayachandran, S., 2009. Air quality and early-life mortality: evidence from Indonesia’s wildfires. Journal of Human Resources 44 (4), 916–954.
Keller, W., Levinson, A., 2002. Pollution abatement costs and foreign direct investment inflows to U.S. states. Review of Economics and Statistics 84 (4), 691–703.
Knittel, C., Miller, D.L., Sanders, N.J., 2011. Caution, drivers! Children present: traffic, pollution, and infant health. NBER Working Paper No. 17222.
Kumar, N., Foster, A., 2007. Respiratory Health Effects of Air Pollution in Delhi and its Neighboring Areas. Mimeo, India.
Laden, F., Neas, L.M., Dockery, D.W., Schwartz, J., 2000. Association of fine particulate matter from different sources with daily mortality in six U.S. cities. Environmental Health Perspectives 108 (10), 941–947.
Levy, J.I., Hammitt, J.K., Spengler, J.D., 2000. Estimating the mortality impacts of particulate matter: what can be learned from between-study variability? Environmental Health Perspectives 108 (2), 109–117.
List, J.A., Millimet, D.L., Fredriksson, P.G., McHone, W.W., 2003. Effects of environmental regulations on manufacturing plant births: evidence from a propensity score matching estimator. Review of Economics and Statistics 85 (4), 944–952.
Luechinger, S., 2014. Air pollution and infant mortality: a natural experiment from power plant desulfurization. Journal of Health Economics 37, 219–231.
Matus, K., Nam, K.M., Selin, N.E., Lamsal, L.N., Reilly, J.M., Paltsev, S., 2012. Health damages from air pollution in China. Global Environmental Change 22 (1), 55–66.
Mendelsohn, R., Orcutt, G., 1979. An empirical analysis of air pollution dose–response curves. Journal of Environmental Economics and Management 6 (2), 85–106.
Moretti, E., Neidell, M., 2011. Pollution, health, and avoidance behavior: evidence from the ports of Los Angeles. Journal of Human Resources 46 (1), 154–175.
National Bureau of Statistics of China, 2006. http://www.stats.gov.cn/
Nielsen, C.P., Ho, M.S., 2007. Air pollution and health damages in China: an introduction and reviews. In: Ho, M.S., Nielsen, C.P. (Eds.), Clearing the Air: The Health and Economic Damages of Air Pollution in China. The MIT Press, Cambridge.
Perera, F.P., Jedrychowski, W., Rauh, V., Whyatt, R.M., 1999. Molecular epidemiological research on the effects of environmental pollutants on the fetus. Environmental Health Perspectives Supplements 107 (3), 451–460.
Qian, J., Zhang, K., 1998. China’s desulfurization potential. Energy Policy 26 (4), 345–351.
Qian, N., 2008. Missing women and the price of tea in China: the effect of sex-specific earnings on sex imbalance. Quarterly Journal of Economics 123 (3), 1251–1285.
Samet, J.M., Zeger, S.L., Dominici, F., Curriero, F., Coursac, I., Dockery, D.W., Schwartz, J., Zanobetti, A., 2000. The National Morbidity, Mortality, and Air Pollution Study. Part II. Morbidity, Mortality, and Air Pollution in the United States. Health Effects Institute, Boston.
Schwartz, J., Dockery, D.W., Neas, L.M., 1996. Is daily mortality associated specifically with fine particles? Journal of the Air and Waste Management Association 46 (10), 927–939.
Schwartz, J., Marcus, A., 1990. Mortality and air pollution in London: a time series analysis. American Journal of Epidemiology 131, 185–194.
State Council, 1998. Official Reply to the State Council Concerning Acid Rain Control Areas and Sulfur Dioxide Pollution Control Areas.
State Environmental Protection Agency (SEPA), 1996. The Report on Environmental Quality in 1991–1995. SEPA.
Tanaka, S., Yin, W., Jefferson, G., 2014. Environmental Regulation and Industrial Performance: Evidence from China. Mimeo.
United Nations Environment Programme, 2009. Two Control Zone Plan and Program to Control Sulfur Pollution. Available at: http://www.ekh.unep.org/files/GP-2.pdf
Walker, R.W., 2011. Environmental regulation and labor reallocation: evidence from the Clean Air Act. American Economic Review Papers and Proceedings 101 (3), 442–447.
Walker, R.W., 2013. The transitional costs of sectoral reallocation: evidence from the Clean Air Act and the workforce. Quarterly Journal of Economics 128 (4), 1787–1835.
Woodruff, T.J., Grillo, J., Schoendorf, K.C., 1997. The relationship between selected causes of postneonatal infant mortality and particulate air pollution in the United States. Environmental Health Perspectives 105 (6), 608–612.
Woodruff, T.J., Parker, J.D., Schoendorf, K.C., 2006. Fine particulate matter (PM2.5) air pollution and selected causes of postneonatal infant mortality in California. Environmental Health Perspectives 114 (5), 786–790.
World Bank, 1998. World Development Indicators. The World Bank, Washington, D.C.
World Health Organization (WHO), 2002. Air Quality Guidelines for Europe, 2nd ed. Copenhagen.
Yang, J., Cao, D., Ge, C., Gao, S., 2002. Air pollution control strategy for China’s power sector. In: Chinese Academy for Environmental Planning, Beijing.
Yang, J., Schreifels, J., 2003. Implementing SO2 emissions in China. In: Presented in OECD Global Forum on Sustainable Development. Emissions Trading, Paris.
Zivin, J.G., Neidell, M., 2009. Days of haze: environmental information disclosure and intertemporal avoidance behavior. Journal of Environmental Economics and Management 58 (2), 119–128.
Journal of Health Economics 42 (2015) 104–114
Article info

Article history:
Received 17 April 2014
Received in revised form 5 November 2014
Accepted 28 March 2015
Available online 8 April 2015

JEL classification:
I11
L11
L41
D4

Keywords:
Insurance
Competition
Hospitals
Premiums
Bargaining power

Abstract

The US health insurance industry is highly concentrated, and health insurance premiums are high and rising rapidly. Policymakers have focused on the possible link between the two, leading to ACA provisions to increase insurer competition. However, while market power may enable insurers to include higher profit margins in their premiums, it may also result in stronger bargaining leverage with hospitals to negotiate lower payment rates to partially offset these higher premiums. We empirically examine the relationship between employer-sponsored fully-insured health insurance premiums and the level of concentration in local insurer and hospital markets using the nationally-representative 2006–2011 KFF/HRET Employer Health Benefits Survey. We exploit a unique feature of employer-sponsored insurance, in which self-insured employers purchase only administrative services from managed care organizations, to disentangle these different effects of insurer concentration by constructing one concentration measure representing fully-insured plans’ transactions with employers and the other concentration measure representing insurers’ bargaining with hospitals. As expected, we find that premiums are indeed higher for plans sold in markets with higher levels of concentration relevant to insurer transactions with employers, lower for plans in markets with higher levels of insurer concentration relevant to insurer bargaining with hospitals, and higher for plans in markets with higher levels of hospital market concentration.

© 2015 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhealeco.2015.03.009
0167-6296/© 2015 Elsevier B.V. All rights reserved.
E.E. Trish, B.J. Herring / Journal of Health Economics 42 (2015) 104–114 105
of insurer concentration should lead to increased insurer market power in the markets where insurance is sold (to employers and individuals), likely resulting in relatively higher premiums due to higher plan profit margins, all else equal. On the other hand, insurers also engage in bilateral bargaining over transaction prices with providers, one of the key drivers of insurer costs. Thus, higher levels of insurer market concentration may yield stronger insurer bargaining leverage with local providers, thereby enabling insurers to negotiate lower provider prices, which may partly be passed on to insurance purchasers in the form of lower premiums. This purchasing power effect is particularly important, given the recent movement toward increased consolidation among provider markets driven by the ACA and other trends (Cutler and Scott Morton, 2013).

Moreover, the effects of insurer market power may depend on the amount of provider market power, and vice versa. The extent to which insurers can use their bargaining leverage to negotiate lower provider prices likely depends on the level of competition in the local provider market, as these prices may already be at or near the point at which economic profits are zero in relatively competitive provider markets. Furthermore, the extent to which hospitals can use their bargaining leverage likely depends on local insurance market conditions. A better understanding of the extent to which higher prices resulting from concentrated provider markets are passed through to consumers in the form of higher premiums (rather than simply representing a transfer of rents from insurers to providers) is particularly relevant for antitrust enforcement in terms of evaluating the extent to which hospital market consolidation ultimately harms consumers.2

1.1. Our empirical contribution

The second HHI market concentration measure focuses on the portion of the premium attributable to hospital prices, which is tied to the negotiations between insurers and hospitals. While self-insured enrollment represents a distinct product that is sold to employers, the insurer’s patient volume across the entire combined “book of business” (i.e., the fully-insured market and the self-insured market) represents its market share relevant to the price negotiations with hospitals. We therefore use these HealthLeaders-InterStudy data to measure each plan’s fully-insured and self-insured combined market share in this HHI calculation representing insurer bargaining with providers. We hypothesize that concentration in the fully-insured and self-insured markets combined will be associated with relatively lower health insurance premiums. (We also hypothesize that higher hospital market concentration, derived from the American Hospital Association’s (AHA) Annual Survey, will be associated with relatively higher health insurance premiums.)

Using plan-level premium data from the restricted-use Kaiser Family Foundation/Health Research and Educational Trust (KFF/HRET) Employer Health Benefits Survey for years 2006 through 2011, we find that premiums are indeed higher among markets with higher levels of insurer concentration representing fully-insured coverage sold to employers (and higher among more concentrated hospital markets), and we find that premiums are indeed lower among markets with higher levels of insurer concentration representing insurer bargaining with hospitals (derived from combined fully-insured and self-insured market shares).

The remainder of the paper is organized as follows. We first summarize the relevant literature on the effects of insurer and hospital concentration and then describe the conceptual framework. We then explain our empirical model, data, and market definitions. Our results, discussion, limitations, and conclusions follow.
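As a rough illustration of the two insurer concentration measures described above, the following minimal Python sketch computes one HHI from fully-insured enrollment alone and a second HHI from combined fully-insured and self-insured enrollment. The insurer names, the enrollment counts, the `hhi` helper, and the 0 to 10,000 scaling are illustrative assumptions, not the authors' data, code, or exact market definitions.

```python
def hhi(enrollment_by_insurer):
    """Herfindahl-Hirschman Index: sum of squared percentage market shares.

    Assumes the conventional 0-10,000 scale (shares expressed in percent).
    """
    total = sum(enrollment_by_insurer.values())
    if total <= 0:
        raise ValueError("market has no enrollment")
    return sum((100.0 * n / total) ** 2 for n in enrollment_by_insurer.values())


# Hypothetical enrollment in one local market:
# insurer -> (fully-insured enrollees, self-insured enrollees)
market = {
    "Insurer A": (60_000, 140_000),
    "Insurer B": (30_000, 20_000),
    "Insurer C": (10_000, 40_000),
}

# Measure 1: shares of fully-insured coverage sold to employers.
fully_insured = {name: fi for name, (fi, si) in market.items()}

# Measure 2: shares of the combined book of business, which is what
# matters for patient volume in bargaining with hospitals.
combined = {name: fi + si for name, (fi, si) in market.items()}

hhi_fully_insured = hhi(fully_insured)
hhi_combined = hhi(combined)

print(f"HHI, fully-insured only: {hhi_fully_insured:.0f}")
print(f"HHI, combined book:      {hhi_combined:.0f}")
```

In this made-up market the two indices differ (about 4600 versus about 5000) only because Insurer A's self-insured business is large relative to its fully-insured business, which is the kind of variation that lets the two measures be separated empirically.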
increased insurance concentration is associated with a substitution of nurses for physicians. Similarly, Dafny et al. (Forthcoming) exploit the uneven impact of United Healthcare’s non-participation in state exchanges to conclude that more concentrated insurance exchanges were associated with higher premiums in 2014.

Numerous recent studies document the significant inter- and intra-market variation in negotiated provider prices, including its association with market-level factors, such as insurer or hospital market concentration (for example, White et al., 2013; Berenson et al., 2012; Ginsburg, 2010; MedPAC, 2009; Massachusetts Attorney General, 2010; US GAO, 2005). Several recent papers examine the relationship between both insurer and hospital concentration and negotiated hospital prices. McKellar et al. (2013) and Moriya et al. (2010) both find that higher levels of insurance concentration are associated with lower hospital prices, but that higher levels of hospital concentration are not significantly associated with higher hospital prices. However, Melnick et al. (2011) find that higher hospital concentration is indeed associated with higher hospital prices, and also find that hospital prices are lower in the most concentrated health plan markets compared to more competitive health plan markets. Ho and Lee (2013) find heterogeneous effects of insurer competition on negotiated hospital prices; while increased insurer competition actually reduces hospital prices on average, they observe a positive and significant effect on the prices negotiated by the most attractive hospitals.3

3 Several other papers also focus on the effect of the type of hospital with respect to negotiation between hospitals and insurers (which we do not consider in our empirical analysis). Ho (2009) develops a sophisticated model of the insurer-hospital bargaining game, estimating the expected division of profits between insurers and hospitals. She finds that specific hospital features have important effects on the outcome of this bargaining game: “star” and capacity-constrained hospitals have stronger bargaining leverage with insurers and higher profits. This result is also documented by Berenson et al. (2012) who, using data from qualitative interviews with hospital and insurance executives from the Community Tracking Study, find that “must-have hospital systems ... can exert considerable market power to obtain steep payment rates from insurers.” Lewis and Pflum (2014) also find that multi-market participation by a hospital system may increase bargaining leverage.

Similar effects have been documented among physician markets. Schneider et al. (2008) find that higher physician concentration is associated with higher prices but find no effect of insurer concentration on prices, while Dunn and Shapiro (2012) find that physician prices are higher in concentrated physician markets and lower in concentrated insurance markets. Additionally, Dunn and Shapiro (2013) find that negotiated physician prices increased as a result of health reform in Massachusetts, including some evidence suggesting that these price increases are at least partly attributable to increased competition among insurers. However, the outcome of interest in each of these studies is the prices negotiated between insurers and providers, leaving open the question of whether and, if so, the extent to which such prices are ultimately passed through to consumers in the form of higher premiums.

In perhaps the most closely related paper to our study, Town et al. (2006) analyze the effects of hospital industry consolidation in the 1990s on HMO premiums. They derive a theoretical model demonstrating the effects of horizontal mergers in upstream markets on consumer prices in downstream markets, and apply this model to the hospital (i.e., the upstream input to the product of health insurance) and health insurance (i.e., the downstream output) industries. Their theory predicts that consolidation in the upstream industry will have differential effects on the price and quantity of the downstream product depending on the level of competition in the downstream product industry, and their empirical findings support this theory. Specifically, they find that the hospital mergers that occurred in the 1990s resulted in higher HMO premiums and reductions in insurance coverage, and that these effects were strongest among competitive insurance markets.

3. Conceptual framework for premiums

Premiums set by insurers for a given employer represent a combination of expected medical spending covered by the insurer and a loading factor. The loading factor reflects the insurer’s administrative costs (such as marketing and paying claims) and any possible mark-up in the profit margin resulting from the insurer exercising market power in selling the insurance policy. Expected medical spending is a function of prices and quantities of medical care to be consumed. Prices generally represent the outcome of negotiations between insurers and providers, and the expected quantity of healthcare consumed generally reflects the generosity of the plan and the health status and other features of the group covered.4

4 Quantity consumed is also a function of the price; however, here we are focusing on prices in terms of the total transaction price negotiated with the hospital by the insurer. Given the presence of insurance coverage, the portion of this price faced by the consumer seeking medical care is likely to be considerably smaller than this negotiated transaction price, so the price effects would likely reflect the change in consumer cost-sharing, rather than the change in overall price. In a similar paper which disentangles the price and quantity effects on physician services consumed, Dunn and Shapiro (2012) find very small price effects on quantity consumed in this state of insurance coverage. Additionally, McKellar et al. (2013) find that, despite an inverse correlation between market-level private prices and utilization, overall the price effect dominates, resulting in a positive relationship between prices and spending.

As noted above, an increase in the level of concentration in the insurance market likely has offsetting effects on premiums as market concentration may differentially affect the loading and expected spending components of the premium. Regarding the loading component of the premium, the most straightforward effect of increasing insurance market concentration is the likely positive effect on loading as the insurer gains more market power and attains higher profit margins on policies sold to employers.5 However, higher levels of insurer market concentration may potentially yield efficiencies in certain administrative costs such as lower advertising costs and an increased ability to spread certain fixed costs over a larger population. Would an insurer with increased market power ever pass any portion of these savings in administrative costs along to consumers in the form of relatively lower premiums? Consider the extreme case of one monopolist insurer setting the price of the premium such that its marginal revenue (from selling an additional policy) equals its marginal cost. Unless the aggregate demand for insurance is completely inelastic, any decrease in the marginal cost (from administrative efficiencies via larger market share) implies a partial decrease in the premium (to thus reduce marginal revenue in equilibrium). That said, the overall effect of increased market concentration would seem likely to be an increase in premiums, with the partial effect of increased profits on higher premiums exceeding the partial effect of reduced administrative costs on lower premiums, unless the aggregate demand for insurance is highly elastic.

5 Competitive pressures on insurers could also lead to improvements in quality for the insurance plan, holding spending constant, although Scanlon et al. (2008) find no evidence to support competition’s effect on quality.

Regarding the expected spending component of the premium, increased insurance market concentration may also result in lower healthcare spending, as the insurer gains stronger bargaining leverage with hospitals and is able to negotiate lower payment rates. However, the extent to which these lower provider prices attained by an insurer are passed along as savings to consumers in the form of lower premiums is also unclear. Similar to the
above consideration of reduced administrative costs, a reduction in negotiated provider prices is essentially a downward shift in the insurer’s marginal cost curve. If demand for insurance is completely price-inelastic, all of the savings from lower provider prices paid by the insurer would be retained by the insurer as higher profits. Otherwise (unless the price elasticity is extremely high), a portion of the savings from lower provider prices would likely be passed on to consumers as lower premiums (tied with the desire to sell more policies)6 while a portion of the savings from lower provider prices would be retained by the insurer as a higher profit margin.

Conversely, as hospital markets become more concentrated, hospitals may gain stronger leverage in the bargaining game with insurers, resulting instead in higher premiums via increased spending due to higher negotiated payment rates to hospitals. Moreover, as hospital markets become more competitive, insurer concentration may have a negligible impact on hospital prices if those prices cannot be negotiated downwards any further by insurers due to hospital solvency constraints.

As a result, the relative magnitudes of these potentially offsetting effects of increasing insurance concentration on health insurance premiums are not clear. Our study therefore aims to empirically isolate some of these potentially countervailing effects of health insurance concentration, and their interaction with local hospital market concentration, on health insurance premiums in the employer-sponsored insurance market.

4. Empirical model and data

4.1. Empirical overview

We run plan-level OLS regressions to test the relationship between insurer and hospital market concentration and employer-sponsored fully-insured premiums from the KFF/HRET Employer Health Benefits Survey from 2006 through 2011. These models use the logged single-employee’s total annual premium (i.e., the employer and employee shares combined) as the dependent variable of interest and include plan, firm, industry, and market-level controls for premiums. Our model uses continuous HHI measures for insurer and hospital market concentration and, as noted above, incorporates two separate measures of insurance market concentration to disentangle the effects of insurer market concentration in the market for selling fully-insured coverage to employers from those effects of insurer market concentration in the market for bargaining over service prices with hospitals.

An important limitation of our analysis is that we ultimately rely on cross-sectional geographic variation in these market concentration measures for both insurers and hospitals, and thus the endogeneity of these market concentration measures is a potential concern. A good instrument for cross-sectional variation in market concentration is simply not apparent to us. Many studies therefore use variation over time in these market concentration measures, but we think that firm decisions to merge with one another are also likely endogenous to market characteristics themselves. Regardless, there is very little within-market variation over time in either the insurer HHIs or the hospital HHIs during this 2006 through 2011 time period for our premium data. (Despite this lack of variation over time, we tested models including market-level fixed-effects models anyway, but these models yielded insignificant results.) Our prior is that exogenously-high insurer profits would increase insurer competition and exogenously-high hospital prices would increase hospital competition, leading to a bias against observing our hypothesized positive effects of HHIs on premiums.

4.2. Data

We obtain data on employer-sponsored health insurance premiums from a restricted-use version of the annual KFF/HRET Employer Health Benefits Survey for 2006–2011. (The public-use version of this dataset does not have geographic identifiers.) The KFF/HRET survey provides nationally representative data regarding employers’ health benefits offerings for roughly 2000 firms per year. The data include plan-level information on the largest plan of each type of plan (i.e., HMO, PPO, etc.) offered by the employer in the year. We obtain plan-level premiums, type of plan, and generosity factors such as deductible and out-of-pocket maximum information from these data and focus our regression analyses on single coverage. We also include firm-level control variables from these data including firm industry, size, unionization, and workforce characteristics. We restrict our analysis to employers purchasing fully-insured coverage by excluding self-insured employers. We also exclude rural employers from our analyses, as we ultimately link these data to market concentration measures constructed for urban markets. Finally, we exclude observations with premiums in the highest and lowest one percentile of the distribution of the data.

We also include time-variant market-level control variables in these premium regressions. These include mean per capita income at the CBSA-level, which we obtain from the Bureau of Economic Analysis, and the age, sex, and race-adjusted mean annual Medicare hospital reimbursement per enrollee at the HRR-level, obtained from the Dartmouth Atlas of Health Care, which we include to control for local variation in practice patterns that would be expected to affect utilization and therefore premiums. Additionally, we control for state-level premium tax rates and an index of high-cost state-mandated benefits,7 both of which may increase premiums for fully-insured coverage. We use a one year lag for all market-level variables, except the premium tax rates and mandated benefit index, which are contemporaneous (though highly invariant over the time period studied).

4.3. Market definitions

We construct HHI measures of insurance market concentration from the HealthLeaders-InterStudy census of private insurers and subsequently merge these market concentration measures to the KFF/HRET data.8 The HealthLeaders-InterStudy data include enrollment at the managed care organization (MCO)-product-county-level for each year.9 We construct two distinct measures

6 In the presence of adverse selection, insurers may also pass on savings in the form of lower premiums in an effort to attract a healthier risk pool. For example, Starc (2014) shows that medical spending is positively associated with premiums in the Medigap market and that adverse selection in this market somewhat restrains insurer premium markups despite insurer market power.

7 The index is constructed by summing the number of high cost benefit mandates in effect for the given year in the state in which the policy is sold. “High cost” mandates are defined as those for which associated healthcare spending is estimated to be more than 1% of overall premium by the Council for Affordable Health Insurance (2006–2011).

8 The HHI is the sum of the squared market shares of each competitor in the market, and is a commonly used measure of market competitiveness in horizontal merger analyses conducted by the DOJ and FTC. The measure ranges from 0 to 10,000 with 10,000 representing a perfect monopoly. We scale this by 100 points (such that the HHI ranges from 0 to 100) in all of our regression analyses for easier presentation of results.

9 The InterStudy data have been criticized for work on health insurance markets due to concerns with accuracy and consistency (see, for example, Dafny et al., 2011). One important point to note is that earlier criticisms of the
of insurance market concentration based on the market shares of the relevant transaction.

For the market transaction in which the insurer sells fully-insured coverage to employers, we define the product market as all fully-insured managed care insurance products and aggregate the MCO’s enrollment in these products within a defined geographic market (described below). We refer to this measure as the “Insurer:Employer HHI” representing the level of competition of the market in which the employer is purchasing a fully-insured managed care product to provide coverage for its employees.

For the market transaction in which the insurer uses its bargaining leverage to negotiate with hospitals, we define the insurance product market by aggregating each insurer’s enrollment in a geographic market for its entire commercial book of business (i.e., combined enrollment in fully-insured and self-insured managed care products) because that full set of commercially insured patients represents that insurer’s purchasing power. We refer to this measure as the “Insurer:Hospital HHI” measure. We exclude observations in markets in which the HealthLeaders-InterStudy data provide implausibly high, low, or variant total enrollment.

To construct the measure of market concentration for hospital services, we use data from the AHA annual survey. We include all non-federal short-term general acute care hospitals in the US. We define the product market as the number of private-pay inpatient days aggregated to the hospital system within the geographic market.10 This concentration measure, referred to as “Hospital HHI” represents the relative bargaining strength of the local hospital market with which insurers must negotiate hospital prices.

We exclude plan enrollment and hospital admissions among three specific integrated delivery systems – namely, Kaiser Permanente, Geisinger Health System, and Intermountain Healthcare – from the calculations of the Insurer:Hospital and Hospital HHI measures, respectively, in the geographic markets where their hospitals exclusively treat patients from the integrated insurer and there is thus no relevant hospital price negotiation. However, we do not remove the enrollment among these integrated delivery systems from the Insurer:Employer HHI calculations, as these plans compete with other plans for employer coverage.11

We consider two ways to define geographic markets: Core-Based Statistical Areas (CBSA) and counties. The CBSA is a geographic area defined by Office of Management and Budget to represent an area with commuting ties to an urban center. The 11 largest CBSAs (e.g., greater New York City, greater Chicago) are separated into smaller Metropolitan Divisions (e.g., four Divisions within New York City, three Divisions within Chicago), and so we use the smaller Metropolitan Division codes, when available, to define the geographic markets within these larger CBSAs.12 While we believe that HHIs using the CBSA as the geographic market definition should reasonably characterize the market transactions for relatively-smaller employers purchasing coverage among competing private health insurers and reasonably characterize the market transactions between private health insurers and hospital systems, we also construct HHI measures using counties as the geographic market. Accordingly, we run a regression model using the CBSA for these three HHIs and then a separate regression model using the county for these three HHIs and report results for both measures.13

The joint distribution of the Insurer:Employer and Insurer:Hospital CBSA-based HHI measures is shown in Fig. 1A. While the two measures are strongly correlated across markets (i.e., the correlation coefficient is 0.83), there is actually a considerable level of differences between the Insurer:Employer and Insurer:Hospital HHI measures, so that we are able to disentangle these opposing effects of higher profits and lower hospital prices on premiums.

The joint distribution of relative bargaining leverage (i.e., Insurer:Hospital HHI and Hospital HHI for CBSAs) is shown in Fig. 1B. The correlation coefficient is 0.22, indicating that there is a mix of markets where the insurers have more bargaining power than hospitals, insurers have less bargaining power than hospitals, and insurers and hospitals have comparable bargaining power. The DOJ/FTC Horizontal Merger Guidelines provide particular HHI cutoffs as one way to categorize the level of competition in a market; by these standards, markets with an HHI between 1500 and 2500 are considered moderately concentrated, and markets with an HHI greater than 2500 are considered highly concentrated (US DOJ/FTC, 2010).
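As a rough illustration of the cutoffs just cited, the sketch below computes an HHI from percentage market shares and assigns the corresponding DOJ/FTC (2010) band. The function names and example shares are hypothetical, and treating an HHI of exactly 1500 or 2500 as moderately concentrated is an assumption, since the text does not say how boundary values are handled.

```python
def hhi(shares_pct):
    """Sum of squared market shares, with shares in percent (0 to 10,000 scale)."""
    return sum(s ** 2 for s in shares_pct)


def concentration_band(hhi_points):
    """Band under the 2010 DOJ/FTC cutoffs; boundary handling is an assumption."""
    if hhi_points < 1500:
        return "un-concentrated"
    if hhi_points <= 2500:
        return "moderately concentrated"
    return "highly concentrated"


def scale_for_regression(hhi_points):
    """Rescale to the 0 to 100 range used in the regressions (divide by 100)."""
    return hhi_points / 100.0


# Hypothetical markets, not taken from the study data.
examples = {
    "ten equal firms": [10] * 10,
    "five equal firms": [20] * 5,
    "dominant insurer": [60, 20, 10, 10],
}
for label, shares in examples.items():
    h = hhi(shares)
    print(label, h, concentration_band(h), scale_for_regression(h))
```

The rescaled value is the one that would enter a regression in which a coefficient is read as the effect of a 100 point difference in HHI.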
For the Insurer:Employer HHI measure using the CBSA for the market definition, 2.6% of plans are in un-concentrated markets, 39.3% are in moderately concentrated markets, and 58.1% are in highly concentrated markets. For the Insurer:Employer HHI measure using the smaller county for the market definition, 1.8% of plans are in un-concentrated markets, 35.1% are in moderately concentrated markets, and 63.1% are in highly concentrated markets. For the Hospital HHI measure using the CBSA for the market definition, 27.2% of plans are in un-concentrated markets, 29.8% are in moderately concentrated markets, and 43.1% are in highly concentrated markets. For the Hospital HHI measure using the county for the market definition, 10.8% of plans are in un-concentrated markets, 20.1% are in moderately concentrated markets, and 69.1% are in highly concentrated markets. Only 1.0% of plans are in

InterStudy data related to the fact that they only measured enrollment in HMOs do not apply to our study, as information on PPOs and other products was added beginning in 2005 associated with combining with HealthLeaders. Nonetheless, there are still concerns regarding the validity and volatility of enrollment. We have addressed these concerns in several ways. In particular, we have removed some enrollment to address the double counting issue of “rental network” enrollment, particularly in 2007–2008, following our own analysis and discussions with database managers at HealthLeaders-InterStudy. Additionally, we have taken several steps to address volatility in the data. First, we have taken the average MCO-product enrollment of the two observations per year (January and July) and used this as MCO-product enrollment for the year. Next, we aggregate total managed care enrollment (fully- and self-insured) in the data at the market (CBSA/Division or county) level, and compare this enrollment to estimates of the under-65 population for the market, which we obtain from the Small Area Health Insurance Estimates. We exclude from our analyses any markets where the aggregate private enrollment in the HealthLeaders-InterStudy data is less than 30% or greater than 100% of the total under-65 population in the market. We believe these are conservative cutoffs, as the under-65 population includes not only those that are privately insured, but also those that are uninsured and those with Medicaid or another source of public coverage (such as VA, Medicare, etc.). Additionally, we drop any market-year observations for markets in which the HHI is more than 25% greater or less than the mean HHI of that same market across the six-year time period included in our study (i.e., an implausibly large insurance market one-year outlier). Overall, these restrictions result in an exclusion of about 20% of the total plan-level observations in the KFF/HRET data. These excluded markets tend to have higher levels of insurance market concentration (likely due to mis-measurement), but are otherwise similar to the markets retained in our study.

10 We also run models with alternative definitions of hospital product market, such as beds, total volume, total admissions, and Medicare discharges; hospital market concentration measures based on these alternatives are all very highly correlated and our results are robust to these alternative definitions. Moreover, we believe that using the system-level measures (rather than individual hospitals) more accurately represents the bargaining nature with insurers.

11 The results are qualitatively unchanged if we do not exclude this enrollment or if we simply drop observations in markets with integrated delivery systems present in the market.

12 The results are also qualitatively unchanged if we use CBSAs to define geographic markets without using the smaller Metropolitan Divisions within these 11 largest CBSAs.

13 We also run models using the Dartmouth Atlas’ Hospital Referral Region (HRR) as the geographic market for both Insurer:Hospital HHIs and Hospital HHIs, as the HRR has frequently been used as a geographic market for healthcare. HRRs are generally larger geographic areas than CBSAs, so CBSA-level markets are typically more highly concentrated than HRR-level markets.
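The two enrollment screens described in footnote 9 can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the example figures are hypothetical, and the authors' actual data processing is not shown in the text.

```python
from statistics import mean


def enrollment_plausible(private_enrollment, under65_population,
                         lower=0.30, upper=1.00):
    """Keep a market only if private enrollment is 30% to 100% of the under-65 population."""
    ratio = private_enrollment / under65_population
    return lower <= ratio <= upper


def stable_market_years(hhi_by_year, tolerance=0.25):
    """Drop years whose HHI is more than 25% above or below the market's own mean HHI."""
    market_mean = mean(hhi_by_year.values())
    return {
        year: value
        for year, value in hhi_by_year.items()
        if abs(value - market_mean) <= tolerance * market_mean
    }


# Hypothetical market with one implausible spike in 2010.
hhi_by_year = {2006: 2000, 2007: 2100, 2008: 1900,
               2009: 2000, 2010: 3500, 2011: 2000}
kept = stable_market_years(hhi_by_year)
print(sorted(kept))                             # years retained
print(enrollment_plausible(25_000, 100_000))    # below the 30% floor
print(enrollment_plausible(60_000, 100_000))    # within the plausible range
```

Note that the market mean here is taken over all six years, including the outlier year itself, which follows the wording of the footnote.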
CBSA markets where both insurer and hospital markets are un-concentrated. In contrast, 31.6% of plans are in CBSA markets where both insurer and hospital markets are highly concentrated.

[Fig. 1, Panel (B): Hospital HHI (vertical axis) versus Ins:Hosp HHI (horizontal axis).]

Fig. 1. Comparison of the two insurance market concentration measures and the joint distribution with hospital market concentration using CBSA market definitions.
Notes: The scatterplots depict the joint distribution of the Insurer:Employer and Insurer:Hospital CBSA-based HHI measures of insurance market concentration in Panel (A) and the joint distribution of Insurer:Hospital and Hospital CBSA-based HHI measures for market concentration in Panel (B). Each dot represents a plan from the 2006–2011 KFF/HRET Employer Health Benefits Survey. HHI is Herfindahl-Hirschman Index.

4.4. Empirical model

We estimate parameters from the following OLS plan-level premium regression:

$$\ln P_{pt} = \alpha + \beta\,\text{Ins:Emp HHI}_{m,t-1} + \phi\,\text{Ins:Hosp HHI}_{m,t-1} + \gamma\,\text{Hosp HHI}_{m,t-1} + \lambda X_p + \delta F_f + \theta M_{m,t-1} + \tau_t + \varepsilon_{pt} \qquad (1)$$

where the indices are plan p, firm f, market m, and year t. The Ins:Emp HHI term in this equation is the one-year lagged HHI of the market in which insurers sell fully-insured policies to employers, the Ins:Hosp HHI term is the one-year lagged HHI of the market in which insurers bargain with hospitals, and the Hosp HHI term is the one-year lagged HHI of the hospital market. The $X_p$ and $F_f$ covariates are plan-level and firm-level control variables, while the $M_{m,t-1}$ are market-level controls, including the one-year lagged and logged CBSA-level per capita income, the lagged and logged mean HRR-level Medicare hospital reimbursement values, the contemporaneous state premium tax rate, and the contemporaneous state mandated benefit index. (We lag the income and Medicare utilization values because they are essentially t − 1 forecasts by an insurer while the tax rates and benefit mandates are known in advance.) $\tau_t$ is a year-indicator variable and $\varepsilon_{pt}$ is the random error. We use

The first two columns of Table 1 include the weighted means and

The next three columns of Table 1 present the full results from the OLS regression model for the annual premium shown in Eq. (1). Before discussing the results for insurer and hospital market concentration, we note that the results for the control variables generally appear as expected, indicating that the overall data and model are well specified. For instance, plans with higher deductibles have lower premiums, unionized firms have higher premiums, and smaller firms have higher premiums.

In this model using the CBSA as the geographic market, the coefficient for a 100 point increase in the Insurer:Employer HHI is 0.0021 and the coefficient for a 100 point increase in the Hospital HHI is 0.0019. These findings, which are statistically significant at the 5% and 1% levels, respectively, support our hypothesis that higher levels of both Insurer:Employer and Hospital concentration are associated with higher employer-sponsored health insurance premiums. To put a relative magnitude on these coefficients, we consider their effect size in the commonly-used example of a standard “five-to-four” merger – a market in which two of five equally sized firms merge, resulting in an 800 point increase in HHI (i.e., an HHI increase from 2000 to 2800). These coefficient estimates imply that a simulated five-to-four merger in the Insurer:Employer market is associated with a 1.7% ($78) increase in premiums, and the same increase in the Hospital HHI is associated with a 1.5% ($67) increase in premiums.

The coefficient for a 100 point increase in the Insurer:Hospital HHI is −0.0024, also statistically significant at the 1% level. A simulated five-to-four merger in this insurer bargaining leverage market is associated with a 1.9% ($90) decrease in predicted premiums. This finding of a positive coefficient on the Insurer:Employer HHI term and a negative coefficient on the Insurer:Hospital HHI term provides support that there are indeed offsetting effects of increases in insurer concentration in terms of market power in selling insurance to employers (increasing premiums) versus negotiating leverage with hospitals (decreasing premiums).
Table 1
Summary statistics and premium regression results for insurance and hospital market concentration.
Notes: The left-hand side of the table shows enrollment-weighted means and standard deviations from the 2006–2011 KFF/HRET Employer Health Benefits Survey with
geographic markets defined as CBSAs. The right-hand side of the table shows enrollment-weighted OLS regression results from a plan-level regression of log annual premium
on the insurance and hospital market concentration measures. N = 5270; F(32, 1288) = 20.49 (p = 0.000); R² = 0.2176. Standard errors are robust cluster-corrected at the market-year
level. The insurance and hospital market concentration measures and other market controls are lagged by one year and HHIs are scaled by 100. Percentages may not sum to
100% due to rounding. The coefficient, standard error, and p-value included in the first row are for the intercept.
The results in Table 2 illustrate how the inclusion of the two distinct and potentially offsetting measures of insurance concentration (in the employer and hospital markets) appears to be necessary to disentangle the different effects of market power on the seller and buyer side. Table 2A’s second column (labeled Model 1) repeats our main results including all three HHI measures using the CBSA market definition, and the next six columns show the results from additional separate regressions to show the possible permutations of including/excluding these three concentration measures. Our ability to capture these related but offsetting effects of insurer concentration is supported by the fact that the magnitude and statistical significance of the coefficients on these terms are diminished when only one of them is included in the regression specification. This is illustrated in the other columns of Table 2A; when only one of these insurance market concentration measures is included without the other (i.e., Models 2 and 3 with the hospital HHI excluded, Models 6 and 7 with the hospital HHI included), the portion of the offsetting effect on premiums that is picked up by the measure limits its magnitude and significance in the direction that we expect. This provides support for the fact that these two insurance market concentration terms are indeed contributing unique information regarding the structure of the market in which the plan is sold and that in which the insurer bargains over hospital prices, and their relevant association with premiums. Finally, Model 4 indicates that our findings for insurance market concentration are not sensitive to the exclusion of hospital market concentration, and Model 5 indicates that our finding for the association between higher premiums and hospital market concentration is not sensitive to the exclusion of insurer market concentration.

Table 2B presents the results from this same set of regressions but instead using counties as the geographic market. While the mean concentration measures for the insurer and hospital markets are higher for county-defined markets compared to CBSA-defined markets (especially for hospital markets), overall the same pattern of regression results holds for the models using county as the market. In the model including all three concentration measures with
Table 2
Premium regression results for insurance and hospital market concentration measures excluding each measure, using (A) CBSA and (B) county market definitions.
(A)
Ins:Emp HHI 3107 0.0021 0.0005 0.0028 0.0001
SD/SE 1344 0.0010 0.0005 0.0010 0.0005
p-Value 0.029 0.308 0.004 0.768
(B)
Ins:Emp HHI 3225 0.0025 0.0005 0.0029 0.0004
SD/SE 1351 0.0009 0.0005 0.0009 0.0005
p-Value 0.008 0.351 0.002 0.416
Notes: These seven models show selected enrollment-weighted OLS regression results from a plan-level regression of log annual premium from the KFF/HRET survey on the
insurance and hospital market concentration measures using the CBSA as the geographic market in Panel (A) and the county as the geographic market in Panel (B). Standard
errors (SE) are robust cluster-corrected at the market-year level. The three market concentration measures in the regression models are lagged by one year and scaled by
100 (i.e., the coefficients represent the effect of an HHI that is 100 points higher).
markets defined at the county-level, the coefficient for a 100 point increase in the Insurer:Employer HHI is 0.0025 and the coefficient for a 100 point increase in the Hospital HHI is 0.0009, both significant at the 1% level. Additionally, the coefficient for a 100 point increase in the Insurer:Hospital HHI is −0.0025, also significant at the 1% level.

Table 3
Premium regression results for insurance and hospital market concentration measures, stratified by insurance market concentration.

ln(Premium)    (1) Full sample    (2) Ins:Emp HHI ≤2500    (3) Ins:Emp HHI >2500
Ins:Emp HHI    0.0021             0.0061                   0.0026
SE             0.0010             0.0039                   0.0010
p-Value        0.029              0.115                    0.026

5.1. Stratified analyses
higher negotiated prices, but that these higher prices are ultimately passed-through to consumers in the form of higher premiums, regardless of downstream insurer market structure.

The finding that the positive and negative associations between Insurer:Employer and Insurer:Hospital market concentration and premiums are strongest among the more concentrated insurance markets may suggest that the association between higher levels of insurance concentration and the ability to charge higher premiums to employers and to negotiate lower prices with providers may be particularly important among relatively more concentrated insurance markets.

Next we examined whether a similar pattern of results would emerge from stratifying the observations based on the level of hospital market concentration. Here we hypothesize that the association of the level of competition in the Insurer:Hospital market and premiums would be strongest among the more concentrated hospital markets, as the effects of increased insurer bargaining leverage may be less pronounced if hospital prices are already relatively low due to hospital market competition alone. Analogous to the above analysis, we stratified the observations into two groups according to level of concentration in the hospital market; i.e., Hospital HHI below or above 2500 using the CBSA-defined geographic markets.

The findings from this analysis stratified by hospital market concentration are presented in Table 4, where Column 1 again simply repeats the results from Table 2A’s Column 1, and Columns 2 and 3 show the results among plans sold in markets with more or less competitive hospital markets. We find that the statistical significance of the positive and negative associations between premiums and Insurer:Employer HHI and Insurer:Hospital HHI, respectively, holds only among the more concentrated hospital markets. We also find that the statistical significance of the positive association between Hospital HHI and premiums holds only among the more concentrated hospital markets. These findings provide support for our hypothesis that, while hospital prices may already be relatively lower among more competitive hospital markets, in more concentrated hospital markets, more concentrated insurers may leverage their stronger bargaining power to negotiate lower prices among concentrated hospital markets and use their market power in selling insurance to employers to increase their profit margins.

Taken together, these results (from Tables 1 and 2) suggest that the levels of competition in the health insurance and hospital markets are significantly associated with employer-sponsored health insurance premiums. Further, these results (from Tables 3 and 4) suggest that the negative relationship between premiums and insurer bargaining power with hospitals may be particularly pronounced among more highly concentrated insurer and hospital markets. Generally, they support the suggestion that the relative balance of insurer and hospital concentration also has important implications for insurance premiums, reflecting the underlying market structure that insurers must bargain with hospitals to set transaction prices, and thus the level of concentration in both insurer and hospital markets and their relative bargaining leverage jointly impact these negotiated prices. Importantly, they provide empirical evidence that these higher provider prices are often ultimately passed-through to consumers in the form of higher premiums, and that this pass-through also depends on relative market conditions. In general, we observe higher premiums among plans sold in markets with higher levels of hospital and Insurer:Employer concentration and lower premiums among plans sold in markets with higher levels of Insurer:Hospital concentration, especially among more highly concentrated markets.

5.2. Limitations

As noted earlier, the cross-sectional design of this study limits our ability to infer the causal relationship between the concentration of insurance and hospital markets and health insurance premiums. It is difficult to construct instruments that would represent an exogenous source of variation in market concentration that would be unrelated to premiums, particularly instruments that would be uniquely predictive of the two different insurer and the hospital concentration measures. One approach (for examining provider prices) is to instrument for insurer concentration using the underlying distribution of firms, but it is doubtful that this would be unrelated, independently, to employer-sponsored premiums. Another common approach is to exploit mergers as a source of variation in concentration over time, but there was little consolidation activity for insurers and hospitals during this time period for which we have these rich KFF/HRET data. Moreover, it is not clear that the merger itself would necessarily be exogenous. Yet another commonly-used approach is to construct measures of market concentration based on predicted, rather than actual patient flows, but we are limited by data and analytical resources to complete this exercise at the national level for hospitals and unsure how
one would apply this consumer-flow approach to insurer market shares. We have tried to alleviate these endogeneity concerns by lagging the market concentration and market control variables by one year; this at least implies a temporal relationship that is consistent with the hypothesis that the level of market concentration affects health insurance premiums. Additionally, if higher premiums in fact encourage market entry by other insurers, this would result in a more competitive health insurance market, which would bias our results downward. Nonetheless, we interpret our results as associations between market concentration and premiums and not necessarily a causal relationship.

Another limitation is that our model also measures the association between aggregate market-level measures of insurer and hospital concentration and the premium of a specific insurance plan purchased by an employer in that market. The KFF/HRET data do not allow for the identification of which insurer sold the policy to the employer, so we therefore are also unable to link the specific market share of that insurer to the observation. Similarly, we do not know anything about which hospitals are included in a given plan’s network, nor about the insurer-hospital contracts. Thus, due to data limitations, we are unable to model the bargaining between insurers and hospitals and to consider constructs such as “Option Demand/Willingness to Pay” (Capps et al., 2003) measures for inclusion of certain hospital systems and the effects that this may have on premiums. Nonetheless, while the market measures may not reflect the specific insurer and associated network from which the plan is purchased, they do represent the overall market conditions within which the employer is choosing a policy. Thus, we believe that they provide important information regarding the relationship between market conditions and policies sold in those markets. Additionally, while we control for some plan generosity features such as plan type, deductible, and out of pocket maximum, premium variation may also reflect differences in quality and plan generosity that are not accounted for by these control variables (including the plan’s network) which could conceivably be correlated with the extent of market concentration.

Additionally, our market concentration measures are reliant on how we have chosen to define the markets. While we believe that CBSAs represent a reasonable geographic market for employers purchasing insurance and a reasonable geographic market for hospital care, the extent to which our measures accurately represent the true level of competition in these markets depends on the degree to which they accurately reflect the markets in which insurance is purchased and hospital network inclusion negotiations occur, respectively. However, given that CBSAs are constructed to represent person-flows for commuting to employment, we believe that they represent reasonable choices for the markets we are trying to measure. We are also reassured by the observation that our results are similar when using counties instead of CBSAs.

Further, given that these KFF/HRET data are of plans actually purchased by employers, our analysis does not explicitly model the employers’ option to not offer coverage or to self-insure as an alternative choice.15 However, given that all the plans in our model are indeed purchased, they reflect choices that employers have made dependent on the local market conditions that do in fact exist, and thus we feel that they represent an appropriate association between market conditions and premiums.

Overall, we believe that our ability to construct two distinct insurance market concentration measures using the variation in fully-insured and self-insured enrollment represents an improvement in depicting these markets and their unique association with health insurance premiums. Nonetheless, further work on these open questions is warranted.

6. Conclusion

The US health insurance industry is highly concentrated and health insurance premiums are high and rising rapidly. Our data demonstrate that less than 3% of the markets in which employers purchase fully-insured coverage are considered un-concentrated by the guidelines set forth by the DOJ/FTC. Similarly, more than half of these markets are considered highly concentrated. Provisions included in the ACA are focused on increasing competition in these markets, with the expectation that increased competition within the health insurance industry would help to lower premiums. Though focused on the individual rather than the employer-sponsored market, early evidence suggests that not all state insurance markets are, in fact, becoming more competitive (Cox et al., 2014) but that more competitive exchanges have lower premiums (Dafny et al., Forthcoming).

However, health insurers operate in a complex bilateral oligopoly, whereby they must negotiate service prices with hospitals and other providers, and higher levels of market power may in fact result in stronger bargaining leverage with these providers to drive down prices, which could then be partially passed through in the form of lower premiums. Thus, the ultimate impact of the level of competition in the health insurance industry on premiums is unclear – but it seems likely that the underlying goal of reductions in insurer administrative overhead associated with increased insurer competition can generally not be achieved without the unintended consequence of higher provider prices associated with decreased bargaining power with providers. The analyses presented herein suggest that the effects of increasing competition in health insurance markets on health insurance premiums are likely to depend on the level of competition in local hospital markets, as well as the relative competitiveness of the fully-insured and self-insured markets and insurers’ overall bargaining leverage with hospitals and other local providers.

We find that employer-sponsored insurance premiums among a nationally representative sample of firms purchasing fully-insured products are higher in markets where insurance and/or hospital markets are highly concentrated, as compared to those in which they are more competitive. Further, we find that higher levels of concentration among the market in which insurance is sold to employers are associated with higher premiums, whereas higher levels of concentration among the market in which insurers bargain with hospitals are associated with lower premiums. Importantly, we find that higher levels of concentration in hospital markets are also associated with higher premiums – providing evidence that the well-documented higher prices resulting from consolidation among hospitals do in fact affect consumers in the form of higher premiums, and that local market conditions affect the extent of this pass-through.
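The concentration categories used in the conclusion ("un-concentrated", "highly concentrated") refer to the DOJ/FTC guidelines, which are cited in the reference list as the 2010 Horizontal Merger Guidelines. As a minimal sketch, assuming concentration is summarized by a Herfindahl-Hirschman Index (HHI) computed from percentage market shares, the classification could look like the following. The function names and the example shares are hypothetical and are not taken from the paper's data.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index: sum of squared market shares (shares in percent)."""
    return sum(share ** 2 for share in shares_percent)


def classify_2010_guidelines(index):
    """Concentration category under the 2010 Horizontal Merger Guidelines thresholds."""
    if index < 1500:
        return "unconcentrated"
    if index <= 2500:
        return "moderately concentrated"
    return "highly concentrated"


# Hypothetical markets (illustrative shares only, not from the KFF/HRET or AHA data).
example_markets = {
    "four_insurers_uneven": [40, 30, 20, 10],
    "ten_equal_insurers": [10] * 10,
    "four_equal_insurers": [25, 25, 25, 25],
}

for name, shares in example_markets.items():
    index = hhi(shares)
    print(name, index, classify_2010_guidelines(index))
```

The thresholds (below 1500, 1500 to 2500, above 2500) are those stated in the 2010 guidelines; whether a given market is defined by fully-insured or by total enrollment changes the shares that enter the index, not the classification rule.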
15 We have also examined firm-level decisions to self-insure using these KFF/HRET data, with a primary focus on examining the influence of state community rating rules for low-risk versus high-risk industries among firms with 25–100 workers (Trish and Herring, 2014). In those analyses, there were no statistically significant associations between insurer market concentration or hospital market concentration (included as control variables) and a firm’s decision to self-insure.

However, our findings, along with recent literature suggesting that hospital prices are lower among more concentrated insurance markets, suggest that higher levels of insurer bargaining leverage with hospitals may lead to lower health insurance premiums via lower negotiated hospital prices, as long as there is sufficient competition in the market for selling insurance to small employers that these lower prices get passed through to employers in the form of lower premiums. Recent policy changes,
such as the introduction of minimum medical loss ratios, may help to ensure that such savings are passed through, even in the absence of higher levels of competition in this market. Additionally, our results suggest that this important negative relationship between insurer bargaining power and premiums is particularly pronounced among more highly concentrated markets. Taken together, these findings suggest that ACA provisions to increase competition in health insurance markets may be unsuccessful by not also considering the level of concentration in local hospital markets, particularly if they dilute insurers’ overall bargaining leverage with local hospital systems. This may be particularly problematic due to recent provider consolidation trends and the strong incentives for such provider consolidation included in the ACA, such as increased horizontal and vertical integration resulting from the formation of Accountable Care Organizations. Therefore, efforts targeted toward reducing health insurance premiums may be better directed toward insurer–provider negotiations and rate regulation, or efforts to simultaneously reduce the level of concentration among insurers and providers.

Acknowledgements

This research was supported by Grant No. 69070 from the Robert Wood Johnson Foundation’s Changes in Healthcare Financing and Organization (HCFO) initiative. We would like to thank Anthony Damico, Matthew Rae, and Gary Claxton of the Kaiser Family Foundation for their assistance with the KFF/HRET Employer Health Benefits Survey data and Adele Shartzer for her assistance with the AHA data. We are grateful for helpful comments by two anonymous reviewers, Chris Garmon, Jeff Stensland, Abe Dunn, Vivian Wu, Rich Lindrooth, Lisa Dubay, and seminar participants at CBO, NCHS, Urban Institute, Emory University, Mathematica Policy Research, Analysis Group, University of Massachusetts-Amherst, RAND Corporation, University of Wisconsin, Colorado School of Public Health, Virginia Commonwealth University, an RWJF HCFO grantee briefing, AcademyHealth’s Annual Research Meeting, and the American Society of Health Economists Conference.

References

American Medical Association, 2013. Competition in Health Insurance. A Comprehensive Study of US Markets: 2013 Update.
Berenson, R., Ginsburg, P., Christianson, J., Yee, T., 2012. The growing power of some providers to win steep payment increases from insurers suggests policy remedies may be needed. Health Affairs (Millwood) 31 (5), 973–981.
Capps, C., Dranove, D., Satterthwaite, M., 2003. Competition and market power in option demand markets. RAND Journal of Economics 34 (4), 737–763.
Council for Affordable Health Insurance, 2006–2011. Health Insurance Mandates in the States 2006–11. The Council for Affordable Health Insurance.
Cox, C., Ma, R., Claxton, G., Levitt, L., 2014. Sizing Up Exchange Market Competition. Kaiser Family Foundation Issue Brief.
Cutler, D.M., Scott Morton, F., 2013. Hospitals, market share, and consolidation. Journal of the American Medical Association 310 (18), 1964–1970.
Dafny, L., 2010. Are health insurance markets competitive? American Economic Review 100 (4), 1399–1431.
Dafny, L., Dranove, D., Limbrock, F., Morton, F.S., 2011. Data impediments to empirical work on health insurance markets. The B.E. Journal of Economic Analysis & Policy 11 (2), Article 8.
Dafny, L., Duggan, M., Ramanarayanan, S., 2012. Paying a premium on your premium? Consolidation in the US Health Insurance Industry. American Economic Review 102 (2), 1161–1185.
Dafny, L., Gruber, J., Ody, C., 2015. More insurers lower premiums: evidence from initial pricing in the health insurance marketplaces. American Journal of Health Economics 1 (1), 53–81.
Dartmouth Atlas of Health Care, 2012. Data by Region, Available from: http://www.dartmouthatlas.org/data/region/
Dranove, D., Gron, A., Mazzeo, M., 2003. Differentiation and competition in HMO markets. Journal of Industrial Economics 51 (4), 433–454.
Dunn, A., Shapiro, A., 2012. Physician Market Power and Medical-Care Expenditures. BEA Working Papers 0084, Bureau of Economic Analysis.
Dunn, A., Shapiro, A., 2013. The Impact of Health Care Reform on Physician Payments: Evidence from Massachusetts. Working Paper 2013-36, Federal Reserve Bank of San Francisco.
Gaynor, M., Town, R., 2011. Competition in health care markets. In: Pauly, M., Mcguire, T., Barros, P. (Eds.), Handbook of Health Economics, vol. 2. Elsevier, Amsterdam, pp. 499–637.
Gaynor, M., Town, R., 2012, June. The Impact of Hospital Consolidation – Update. Robert Wood Johnson Foundation Synthesis Report.
Gaynor, M., Vogt, W., 2000. Antitrust and competition in health care markets. In: Culyer, A., Newhouse, J. (Eds.), Handbook of Health Economics, vol. 1B. Elsevier, Amsterdam, pp. 1405–1487.
Ginsburg, P., 2010. Wide Variation in Hospital and Physician Payment Rates Evidence of Provider Market Power. HSC Research Brief No. 16, Center for Studying Health System Change.
Government Accountability Office (GAO), 2005. Report to the Honorable Paul Ryan: House of Representatives: Federal Employees Health Benefits Program – Competition and Other Factors Linked to Wide Variation in Health Care Prices, Washington, DC.
Ho, K., 2009. Insurer–provider networks in the medical care market. American Economic Review 99 (1), 393–430.
Ho, K., Lee, R.S., 2013. Insurer Competition and Negotiated Hospital Prices. NBER Working Paper #19401, National Bureau of Economic Research.
Lewis, M.S., Pflum, K.E., 2015. Diagnosing hospital system bargaining power in managed care networks. American Economic Journal: Economic Policy 7 (1), 243–274.
Massachusetts Attorney General, 2010. Examination of Health Care Cost Trends and Cost Drivers. Report for Annual Public Hearing, Office of Attorney General Martha Coakley, Boston, MA.
McKellar, M.R., Naimer, S., Landrum, M.B., Gibson, T.B., Chandra, A., Chernew, M., 2013. Insurer market structure and variation in commercial health care spending. Health Services Research (Epub ahead of print: 5 December 2013).
Medicare Payment Advisory Commission (MedPAC), 2009. Report to the Congress: Medicare Payment Policy, Washington, DC.
Melnick, G., Shen, Y., Wu, V., 2011. The increased concentration of health plan markets can benefit consumers through lower hospital prices. Health Affairs (Millwood) 30 (9), 1728–1732.
Moriya, A., Vogt, W., Gaynor, M., 2010. Hospital prices and market structure in the hospital and insurance industries. Health Economics, Policy and Law 5, 459–479.
Robinson, J., 2004. Consolidation and the transformation of competition in health insurance. Health Affairs (Millwood) 23 (6), 11–24.
Scanlon, D.P., Swaminathan, S., Lee, W., Chernew, M., 2008. Does competition improve health care quality? Health Services Research 43 (6), 1931–1951.
Schneider, J., Li, P., Klepser, D., Peterson, N., Brown, T., Scheffler, R., 2008. The effect of physician and health plan market concentration on prices in commercial health insurance markets. International Journal of Health Care Finance and Economics 8, 13–26.
Starc, A., 2014. Insurer pricing and consumer welfare: evidence from Medigap. RAND Journal of Economics 45 (1), 198–220.
Town, R., Wholey, D., Feldman, R., Burns, L., 2006. The welfare consequences of hospital mergers. NBER Working Paper #12244, National Bureau of Economic Research.
Trish, E., Herring, B., 2014. Does Small Group Market Community Rating Affect Firm Self-Insurance? Working Paper.
US Department of Justice and the Federal Trade Commission, 2010. Horizontal Merger Guidelines, Available from: http://www.justice.gov/atr/public/guidelines/hmg-2010.pdf
Vogt, W.B., Town, R., 2006. How has Hospital Consolidation Affected the Price and Quality of Hospital Care? Research Synthesis Report No. 9, Robert Wood Johnson Foundation.
White, C., Bond, A.M., Reschovsky, J.D., 2013. High and Varying Prices for Privately Insured Patients Underscore Hospital Market Power. HSC Research Brief No. 27, Center for Studying Health System Change.
Wholey, D., Feldman, R., Christianson, J., 1995. The effect of market structure on HMO premiums. Journal of Health Economics 14, 81–105.
Journal of Health Economics 42 (2015) 115–124
ARTICLE INFO

Article history:
Received 12 July 2013
Received in revised form 30 October 2014
Accepted 5 November 2014
Available online 22 November 2014

JEL classification: I12, I18

Keywords: Education gradient in health; Schooling reform; Breast cancer

ABSTRACT

Breast cancer is a notable exception to the well documented positive education gradient in health. A number of studies have found that highly educated women are more likely to be diagnosed with the disease. Breast cancer is therefore often labeled as a “welfare disease”. However, it has not been established whether the strong positive correlation holds up when education is exogenously determined. We estimate the causal effect of education on the probability of being diagnosed with breast cancer by exploiting an education reform that extended compulsory schooling and was implemented as a social experiment. We find that the incidence of breast cancer increased for those exposed to the reform.

© 2014 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhealeco.2014.11.001
0167-6296/© 2014 Elsevier B.V. All rights reserved.
116 M. Palme, E. Simeonova / Journal of Health Economics 42 (2015) 115–124
that may be acquired along with prolonged education, or if it can be attributed to factors and individual characteristics correlated with both educational attainments and the probability to get breast cancer. The most common research strategy used in epidemiological studies is to add confounders that are known to be associated with educational attainments and potentially etiologically related to breast cancer, such as delayed childbearing, in a regression framework, and to investigate if the correlation remains (see e.g. Braaten et al., 2004; Danø et al., 2004; Heck and Pamuk, 1997). There are at least two problems with this strategy. First, there is an identification problem. Most confounders, such as fertility behavior, are likely to be endogenous to educational attainment. This means that it is still not clear if including them in the regression makes up for a causal relation with education, or if they just proxy individual characteristics correlated with educational attainments. Second, adding independent variables in a regression would in most cases aggravate the downward bias from measurement errors (see e.g. Greene, 2003).

An alternative strategy to analyze this research question is to use exogenous variations in educational attainments created by natural experiments. A number of influential studies have used this research strategy to study the relationship between education and measures of general health. Lleras-Muney (2005), Oreopoulos (2006) and Clark and Royer (2012) use variation induced by changes in compulsory schooling legislations in the US and the UK as a source of exogenous variation in education. Spasojevic (2010), Meghir et al. (2012) as well as Lager and Torssander (2012) investigate the health consequences of the introduction of comprehensive school reform in Sweden. An interesting related question is whether the health effects of education vary by gender2 and diagnosis.

In this paper we investigate whether there is a causal effect of education on the incidence and mortality from breast cancer in the population of women born in Sweden between 1940 and 1957 who survived until at least 1985. We make use of a compulsory schooling reform that increased the number of compulsory years of education from 7 or 8 depending on municipality to 9 years nationwide. We also compile a unique nationally representative dataset from various Swedish national data registries, including the Swedish Cancer Registry.

The Swedish setting is particularly well suited to study how education affects the incidence of a “welfare disease” such as breast cancer in women for several reasons. First, Sweden is ethnically and racially homogenous, especially in the cohorts under study. This reduces potential omitted confounders that could correlate both with the hereditary genetic make-up and the SES of some ethnic or racial subgroups. Second, health care is free at the point of access and the Swedish government provides free universal health insurance. Disparities arising from differential access to care due to financial constraints are unlikely to play a role in the Swedish setting. Breast cancer screening covers the entire female population in the critical ages and is free of charge. The screening program was adopted nation-wide in 1986 after the first results from the Swedish mammography trials became available (Tabar et al., 1985). The take up rate of this screening program after the first invitation to screen is about 80 percent (see e.g. Hussain et al., 2008). Third, the Swedish Cancer Registry is the oldest cancer registry and one of the best in terms of data quality and accuracy in the world today.

The closest study we are aware of is by Glied and Lleras-Muney (2008) who use the Surveillance, Epidemiology and End Results Program (SEER) data to estimate the effects of technological progress on cancer deaths by education, relying on US compulsory schooling laws for exogenous variation in educational attainment. They find that conditional on technological progress, extra education reduces overall cancer mortality in men, but not in women. Excluding cancers of the reproductive system, inclusive of breast cancer, makes the estimated effects for men and women consistent. The authors do not specifically test for the effects of education on survival from reproductive system cancers in women, relying on the findings in the medical literature we discuss above.

This study finds that attaining higher levels of education increases the risk of being diagnosed with breast cancer in women, confirming the results obtained from purely correlational studies. However, we also find that this heightened probability of diagnosis is later followed by an elevated probability of death from breast cancer among better educated women. Further, we investigate the potential role of fertility decisions, which has been pointed out as the mechanism linking education and the incidence of breast cancer. We find no convincing evidence in favor of this hypothesis. The curious association between education and the most common cancer diagnosis in women appears to be affected by qualities, behaviors, and risk factors acquired in the process of obtaining more education, rather than pre-existing characteristics that predispose some women to both get more education and be diagnosed with the disease.

2. The comprehensive school reform

2.1. The Swedish school system before and after the reform

Sweden implemented a compulsory schooling reform as a social experiment between 1949 and 1962. Prior to the implementation of the reform, pupils attended a common basic compulsory school (folkskolan) until grade six. After the sixth grade pupils were selected to continue either for one or, in mainly urban areas, two years in the basic compulsory school, or to attend the three year junior secondary school (realskolan). The selection of pupils into the two different school tracks was based on their past academic performance, measured by grades. The pre-reform compulsory school was in most cases administered at the municipality level. The junior secondary school was a prerequisite for the subsequent upper secondary school, which was itself required for higher education.

In 1948 a parliamentary committee proposed a school reform that implemented a new nine-year compulsory comprehensive school.3 The reform had three main elements:

1. An extension of the number of years of compulsory schooling to 9 years in the entire country.
2. Abolition of early selection and tracking based on academic performance. Although pupils in the comprehensive schools were able to choose between three tracks after the sixth grade – one track including vocational training, a general track, and an academic level preparing for later upper secondary school – they were kept in common schools and classes until the ninth grade.
3. Introduction of a national curriculum. The new curriculum replaced the pre-existing curriculum which varied between municipalities.
2 Clark and Royer (2012) as well as Meghir et al. (2012) investigate for differential effects of education by gender and find inconclusive evidence. Gathman et al. (2012) analyze a number of compulsory schooling reforms in Europe and find diverging effects of education on mortality by gender.

3 We offer a brief description of the main parts of the Swedish comprehensive school reform. The school reform and its development are described in Meghir and Palme (2003, 2005), and Holmlund (2007). For more detailed reference on the reform, see Marklund (1981).
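The measurement-error objection raised in the introduction (that adding regressors tends to aggravate the downward bias) has a standard textbook form under classical measurement-error assumptions. The following is only a sketch with notation introduced here for illustration, not the authors' own: let $x^*$ be true education, $x = x^* + u$ its observed value with noise $u$ uncorrelated with $x^*$, with the control $z$, and with the regression error, and let $\rho^2$ be the squared correlation between $x^*$ and a correctly measured control $z$. The first expression assumes the outcome depends on $x^*$ only; the second assumes it depends on $x^*$ and $z$.

```latex
\operatorname{plim}\,\hat{\beta}_{\text{no control}}
  = \beta \,\frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_{u}},
\qquad
\operatorname{plim}\,\hat{\beta}_{\text{with } z}
  = \beta \,\frac{\sigma^2_{x^*}\,(1-\rho^2)}{\sigma^2_{x^*}\,(1-\rho^2) + \sigma^2_{u}} .
```

Because controlling for $z$ removes part of the true variation in $x^*$ but none of the noise, the second attenuation factor is smaller than the first whenever $\rho^2 > 0$, which is the sense in which additional covariates correlated with education can worsen the bias.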
2.2. The social experiment

The social experiment with the new comprehensive nine-year compulsory school started during an assessment period between 1949 and 1962, when the final curriculum was decided.4 The proposed new comprehensive school system, as described above, was introduced in municipalities or parts of city communities, which in 1952 numbered 1055. The cohorts included in our empirical analysis, born between 1940 and 1957, cover the entire period of implementation of the comprehensive school. In 1962 it was decided that the new comprehensive school would become the standard education in Sweden. The last class that graduated from the old schooling system did so in 1970.

The selection of municipalities into the new comprehensive school was not based on random assignment. Still, the decision to select the areas was based on an attempt to choose locations that were representative for the entire country, both in terms of demographics as well as geographically. At first the National Board of Education contacted the municipalities, or sometimes they themselves applied to participate. From this pool of applicants a “representative” sample of municipalities was chosen. Municipalities could elect to implement the comprehensive school starting with first or fifth grade cohorts. Once the grade of implementation was fixed, all individuals from the cohort immediately affected and all subsequent cohorts went to comprehensive school. The older cohorts continued in the pre-reform school.

Meghir and Palme (2005) and Holmlund (2007) study the effect of the comprehensive school reform on educational attainments.5 The Meghir and Palme (2005) estimates for their entire sample are 0.252 additional years for males and 0.339 years for females; for low SES persons the estimates are 0.3 extra years for males and 0.512 for females.6 Holmlund has estimates in the range 0.21–0.61 additional years of schooling for men and 0.13–0.44 for women.

3. Data

This is a population-level study. We match data from the Swedish Cancer Register to Swedish population register data, the 1990 Swedish Education register, and the Cause of Death register. The population register contains information on the parish of birth for all individuals born in Sweden in the 1940–1957 cohorts. We use this register to assign municipality of birth for all women in the cohorts affected by the schooling reform. The municipality of birth is then used to assign the year in which the reform was implemented in that locality, and the reform treatment status to different cohorts of women who were born in the municipality. Note this means that all estimated effects are of the “intention to treat” type, but we avoid potential bias coming from selective migration. Holmlund (2007) offers a detailed exposition of the exact matching algorithm used.7

The Cause of Death register contains information on the date of death and the principal cause of death. The Census data provide information on the date of birth and the number of children born to the women in the 1940–1957 cohorts. We use this information to assign age at first childbearing and the total completed fertility per woman. The multi-generational register is used to link women in the sample to their parents. We then use the Education Register for the parents to determine the level of education of each woman’s father. Fathers who had more than the basic required (7 years) education are considered highly educated.

All women who died of breast cancer as a primary cause of death were found in the Cancer Register as having been previously diagnosed with the disease. We record all diagnoses and deaths until 2006. The Swedish Cancer Register is the oldest Cancer Register in the world and contains detailed information on all incidences of cancer diagnosis in Sweden. It is compiled from compulsory cancer diagnosis registrations by physicians, cytologists and pathologists and covers close to 100% of all cancer diagnoses in Sweden (Swedish National Board of Health and Welfare, 2006). Studies of the accuracy of the Cancer Register have shown that cases of breast cancer are the most reliably reported cancer diagnosis in the Register, with under-reporting rates of less than 1.1% of all cases diagnosed within the reporting year (Barlow et al., 2009). Importantly, the exact date of every diagnosis is recorded, and the data can be linked to the population registers through a unique person ID.

In the empirical analysis we use the population of all women born in Sweden between 1940 and 1957 and surviving until at least 1985. We exclude 414,214 women with missing parental education background and use the remaining sample of 562,814 women. Of those, 19,736 women were diagnosed with breast cancer after 1984. Of those who were diagnosed, 2370 women died, and breast cancer was noted as the cause of death on their death certificate. Another 401 of the women diagnosed with breast cancer after 1984 died from a different main diagnosis.8

Table 1 summarizes the main explanatory and control variables used in the analysis. The mortality data start in 1985 and include the exact date of death and the main cause of death as recorded in the death certificate. We restrict the time of first diagnosis to be after 1984 in order to avoid selection of women who were diagnosed previously and survived until the period after 1984. The women in our sample were aged between 28 and 45 in 1985 and (those surviving) between 49 and 66 in 2006. As a percent of total female mortality, breast cancer mortality peaks between ages 40 and 60 at around 15% of total deaths in the age group. This implies that we are capturing the interval in women’s lives during which they are most likely to be affected by breast cancer (as opposed to another lethal disease). Aggregate mortality in Sweden is very low at ages below 45 at 6 per 1000 (from data), and breast cancer mortality is even lower at 1 per 1000 (from data). A back of the envelope calculation suggests that we are potentially missing at most 100 deaths from breast cancer that may have occurred in our study population before 1985.9 This is a very small part of the total number of breast cancer deaths in the sample – less than 5%.

Several differences in the raw means between women of high and low SES family backgrounds are worth discussing. Unsurprisingly, on average women of higher SES obtained more years of education. They are less likely to have had any children and the average age at first childbearing in this group is about two years higher. High SES women are also more likely to have been
4 The official evaluation was mainly of administrative nature. Details on this evaluation are also described in Marklund (1981).

5 Holmlund (2007) does not have individual treatment status and imputes it from municipality of residence in 1960.

6 Note that Meghir and Palme (2005) use the exact reform assignment from the school registries for a random subset of the cohorts born in 1948 and 1953. Their estimates are free of measurement error in the reform assignment variable.

7 We are grateful to Helena Holmlund for sharing her algorithm with us.

8 The distribution of the main causes of death among those women is: 61 women died from ovarian cancer; 45 from lung cancer; 22 from AMIs; 15 from pancreatic cancer; 12 died from colon cancer; 9 from melanoma; 15 from unknown causes. The remaining 222 deaths are distributed across more than one hundred different causes.

9 Assuming mortality at missing ages is equivalent to mortality at those ages among observed cohorts; assuming also that cohort sizes at different ages are similar over time, which gives an upper bound estimate since demographic trends led to steady cohort size increase between 1940 and 1957.
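The intention-to-treat assignment described in Section 3 (municipality of birth determines the first cohort covered by the reform, and a woman's birth year then determines her exposure) can be sketched as below. This is only an illustration under stated assumptions: the municipality names, the lookup table, and the function name are hypothetical, and the actual matching algorithm is the one documented in Holmlund (2007), which is not reproduced here.

```python
from typing import Optional

# Hypothetical lookup: municipality of birth -> first birth cohort that attended
# the comprehensive school. Real values would come from the reform records.
FIRST_REFORM_COHORT = {
    "municipality_a": 1945,
    "municipality_b": 1952,
}


def reform_exposed(municipality_of_birth: str, birth_year: int) -> Optional[bool]:
    """Intention-to-treat status based on place and year of birth only.

    Returns None when the municipality has no recorded reform cohort, so that
    unmatched observations are not silently coded as untreated.
    """
    first_cohort = FIRST_REFORM_COHORT.get(municipality_of_birth)
    if first_cohort is None:
        return None
    # The first affected cohort and all subsequent cohorts count as exposed.
    return birth_year >= first_cohort


print(reform_exposed("municipality_a", 1944))  # older cohort, pre-reform school
print(reform_exposed("municipality_a", 1945))  # first affected cohort
print(reform_exposed("municipality_c", 1950))  # no reform year on record
```

Assigning status from municipality of birth rather than later residence is what makes the estimates intention-to-treat: a woman who moved before starting school keeps the status of her birthplace, which avoids bias from selective migration at the cost of some misclassification.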
118 M. Palme, E. Simeonova / Journal of Health Economics 42 (2015) 115–124
Table 1
Main explanatory and outcome variables of interest. Standard deviations are reported in square brackets under the mean. P-values of tests of differences in means between high and low SES background women are also reported.

Variable                          Low SES background      High SES background     P-value
                                  N          Mean         N          Mean
Years of education                360,240    11.155       180,612    12.823       0
                                             [2.805]                 [3.063]
No children (nulliparous)         372,894    0.11         187,702    0.134        0
                                             [0.315]                 [0.34]
Age at first childbearing         331,164    26.5         162,587    28.1         0
                                             [5.4]                   [5.5]
Age at diagnosis                  12,723     50.196       6654       49.513       0.088
                                             [6.581]                 [6.511]
Death year – year of diagnosis    1538       4.83         724        5.3          0
                                             [4.55]                  [4.63]
Diagnosed with breast cancer      372,894    0.035        187,702    0.036        0.012
                                             [0.183]                 [0.186]
Died from breast cancer           372,894    0.004        187,702    0.004        0.16
                                             [0.065]                 [0.063]
diagnosed with breast cancer, to have received the diagnosis at an earlier age and, conditional on dying from breast cancer, to have lived longer between their initial diagnosis and the time of death. There is no significant difference in the probability of death from breast cancer by SES background. These facts suggest that either (1) higher SES women are more likely to have been diagnosed earlier or that (2) higher SES women received better treatment, or both. The differences in age at diagnosis appear in favor of the first hypothesis, but we cannot draw any firm conclusions based on this evidence.
Here it is important to consider the role of breast cancer screening in early diagnosis and treatment. The large clinical trials that produced evidence on the beneficial effects of mammography were conducted in Sweden in the 1970s and 1980s (see Tabar et al., 1985). Thus, policy makers in Sweden were quite aware of the importance of breast cancer screening at the time our study period begins. After the first results of the randomized trials came out, the National Board of Health and Welfare issued guidelines in 1986 recommending that the county councils invite women ages 40–54 years to screening every 18 months and women ages 55–74 years every second year. Thus, national service screening with mammography was initiated in 1986. Local health administrations are in charge of running the screening programs. All women of eligible ages receive a letter giving a specific date and time for a mammography examination. Failure to attend the scheduled examination or re-schedule the appointment results in a further invitation, up to six consecutive invitations. A regional case study from Uppsala reports that of the 46,041 eligible women only 5.6% never attended after six attempted appointments. Non-attenders tend to be older (over the age of 60), foreign born and single. Note that all foreign-born women residing in Sweden are excluded from our sample by construction. Interestingly, the relationship between education and the probability of non-attendance is u-shaped, with women finishing high school, some college, and college more likely to attend than those with professional education or high school dropouts (Lagerlund et al., 2002).
Breast cancer is a common killer in our sample. In this relatively young population, 15% of all deaths are due to breast cancer. Cardio-vascular diseases account for an extra 13.5% of total mortality. Deaths from other cancers are responsible for another 33%. In total, cardio-vascular and cancer-related mortality account for close to two-thirds of all female deaths in the sample cohorts.

4. The relation between educational attainment and breast cancer incidence and mortality

We start the analysis by documenting correlations between the years of attained education and socio-economic background and the probability of diagnosis and death from breast cancer in Sweden. Table 2 presents the estimates. We use all available observations to maximize power. Coefficients and standard errors are multiplied by 1000 for better presentation. Women with an extra year of education are 3 percent (evaluated at the mean incidence of breast cancer diagnosis in the population) more likely to have been diagnosed with breast cancer than their less educated peers. The correlation with high socio-economic family background is larger – even if we control for years of attained education, women who were born in better-off families are 7 percent more likely to receive a breast cancer diagnosis.

Table 2
Correlations between years of educational attainment and diagnosis/death from breast cancer in women.

Panel A: Diagnosis                  (1)           (2)           (3)
Years of schooling (coef * 1000)    0.96*                       0.89*
                                    (0.09)                      (0.08)
High SES (coef * 1000)                            2.99*         1.92*
                                                  (0.54)        (0.56)
Observations                        541,135       560,596       541,135
Mean incidence per 1000             34.1          34.1          34.1
R-squared                           0.006         0.007         0.010
Empirical model                     Linear Prob   Linear Prob   Linear Prob

Panel B: Death from breast cancer   (1)           (2)           (3)
Years of schooling (coef * 1000)    −0.01                       −0.02
                                    (0.01)                      (0.01)
High SES                                          −0.09         0.12
                                                  (0.18)        (0.09)
Mean deaths per 1000                4.1           4.1           4.1
Observations                        541,135       560,596       541,135
Empirical model                     Linear Prob   Linear Prob   Linear Prob

Note: Robust standard errors in parentheses; *Significant at 1%; SE clustered on the municipality of birth level.

The effect of education on deaths from breast cancer is not as clear. None of the education coefficients attain statistical significance at the 10% level even though the coefficient on years of education implies a negative correlation both with and without
controls for parental SES background. Coupled with the evidence on the higher incidence of diagnoses among the more educated women, this suggests that conditional on being diagnosed with breast cancer, more educated women are more likely to survive. This is consistent both with evidence that educated people are more adept at using new medical technologies (Glied and Lleras-Muney, 2008; Lichtenberg and Lleras-Muney, 2005) and with earlier diagnosis and earlier treatment in higher SES background women. The table of means shows supportive evidence for the latter explanation. Even though the mortality point estimate suggests that education has a positive effect on survival, the precision of the estimates is not high enough to draw any strong conclusions.
The corresponding Cox proportional hazard estimates are: in the full sample one year of schooling reduces the probability of death from breast cancer by a statistically insignificant 1.6% relative to the mean (SE 0.0197), which is a larger estimate than the LP estimate evaluated at the mean (0.03%); the Cox estimate of high SES in model 2 is a statistically insignificant decrease of 2.4% relative to the mean, not too far from the LP estimate of 2.2% relative to the mean.

5. Empirical specification

We use two main types of outcomes in the empirical analysis. When we consider breast cancer mortality, we use the binary mortality outcome and the time to death as the outcome variables. When we study the incidence of breast cancer, we use a binary outcome variable equal to one if the woman was ever diagnosed with breast cancer after 1984 and zero otherwise. We use the same identification strategy for the effect of the reform for both types of outcomes. If the reform had been randomly distributed among Sweden’s 1000 or so municipalities, we could have simply compared the outcomes in the treated and non-treated municipalities conditional on year of birth. However, as has been discussed in previous studies (see e.g. Meghir et al., 2012), this was not the case. Therefore, we will control for both birth cohort and municipality of birth. We start with the following latent variable specification:

$$ y^{*}_{i,m,t} = \alpha + \beta_1 R_{i,m,t} + \gamma_1 T_i + \gamma_2 M_i + \varepsilon_{i,m,t}, \qquad (1) $$

where i, m and t are sub-indices for individual, municipality and birth cohort, respectively; y* is a latent variable for health status; T is a vector of dummy variables for year of birth; M is a corresponding vector of dummy variables for municipality of birth; finally, ε is an individual random disturbance.
The key identifying assumption is that the distribution f(·) of ε does not depend on the assignment to reform treatment, conditional on cohort and municipality. In practice we impose the stronger assumption that the distribution of ε is independent of all right hand side variables. It is important to note that the reform assignment in this analysis depends on the municipality of birth, rather than the municipality of schooling. On the one hand, this means that the estimates are of the “intention-to-treat” type. On the other hand we avoid selection issues coming from differential (and potentially endogenous) mobility.10
For the binary outcome, breast cancer diagnosis, we use linear probability models. The reason for using a linear probability model, rather than e.g. logit and probit, which restrict the probabilities to the [0, 1] interval and relax the linearity assumption, is computational convenience, since all models include about 1000 municipality indicator variables in addition to the 17 birth cohort dummies. For relatively small treatment effects, when both approaches have been used in a similar context, the results are almost identical.11
We also use linear probability models as one of two methods of estimating the probability of death from breast cancer. The linear probability model is handy because it can efficiently estimate a large number of dummy coefficients in specifications where we also include municipality-specific time trends. We complement the linear probability estimates with estimates from Cox semi-parametric proportional hazard models. For the time to death from breast cancer outcomes we use Cox proportional hazard models of this type:

$$ I_{1,i,m,t}(r \mid R_{i,m,t}, T_i, M_i) = I_0(r)\,\exp\{\alpha + \beta_1 R_{i,m,t} + \gamma_1 T_i + \gamma_2 M_i\}, \qquad (2) $$

where r is exposure time and $I_0(r)$ is the baseline hazard. This model is semi-parametric in the sense that no functional form assumption is imposed on the baseline hazard. Importantly, when we consider the hazard of death from breast cancer, we consider only deaths from breast cancer as the terminal event. Thus, all women who died from causes other than breast cancer are considered still living at the end of the observation window. Prior research has found that the compulsory schooling reform did not significantly affect life expectancy for (high and low SES) Swedish women (Meghir et al., 2012). Moreover, the age at first diagnosis in this sample is fairly young. These two facts suggest that a competing risks phenomenon is an unlikely explanation for our estimates. Nevertheless, as Honoré and Lleras-Muney (2006) show that decreasing cardio-vascular disease mortality in the US contributed to a steady (non-declining) cancer mortality rate between the 1970s and 2000s, we construct Peterson bounds on our estimates taking into account the association between cardio-vascular and cancer mortality risks. The assumption here would be that obtaining extra education, while reducing the likelihood of cardio-vascular mortality, indirectly increases the likelihood of breast cancer mortality. To control for unobserved differential trends that might affect municipalities differently depending on the timing of the education reform, we include linear trends by year of reform implementation. All municipalities that implemented the reform in the same year are assigned the same linear trend. The empirical results section reports the results from these preferred specifications.
Table 3 demonstrates the effects of being exposed to the schooling reform on the number of years of attained education for Swedish women of affected cohorts. We first show the effects on the entire sample and then split the sample according to the education level of the father. We expect that the education reform affected children from low SES families more as they were more likely to drop out of school earlier. The results confirm that women from low SES backgrounds increased their education by more than those from high SES backgrounds. The reform resulted in an average increase of 1.8 months of schooling for girls coming from relatively disadvantaged backgrounds. The corresponding estimate for the high SES group is about one third of the size and does not attain statistical significance at the 10% level. We present estimates from models including linear time trends grouped by year of implementation and from specifications including municipality-specific linear time trends. The estimated coefficients are very similar, which reassures us that unobserved municipality-specific changes coincidental with reform implementation are unlikely to bias our results.

10 Meghir and Palme (2005) show, however, that 90.1 percent have the same reform assignment based on predictions from their municipality of birth as their municipality of schooling; 5.3 percent moved from reform to non-reform municipalities; 4.6 percent moved in the other direction.
11 See for example Meghir et al. (2011).
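The linear probability version of specification (1) can be sketched in a few lines of Python. This is only an illustration on simulated data, not the authors' code: the sample sizes, the staggered rollout, the effect size and the function names `simulate_panel` and `lpm_reform_effect` are all assumptions made for the example, and R is taken to be a reform-exposure indicator.

```python
import numpy as np

def simulate_panel(n_muni=40, n_cohort=18, n_per_cell=100, beta=0.05, seed=0):
    """Simulate individuals in municipality-by-cohort cells (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    muni = np.repeat(np.arange(n_muni), n_cohort * n_per_cell)
    cohort = np.tile(np.repeat(np.arange(n_cohort), n_per_cell), n_muni)
    # Staggered rollout: each municipality has its own first treated cohort.
    first_treated = rng.integers(5, 15, size=n_muni)
    reform = (cohort >= first_treated[muni]).astype(float)
    muni_fx = rng.uniform(-0.02, 0.02, size=n_muni)
    cohort_fx = np.linspace(0.0, 0.02, n_cohort)
    # Linear index of equation (1), used here directly as a probability.
    prob = 0.05 + beta * reform + muni_fx[muni] + cohort_fx[cohort]
    y = (rng.random(prob.size) < prob).astype(float)
    return y, prob, reform, muni, cohort

def lpm_reform_effect(y, reform, muni, cohort):
    """OLS of y on the reform indicator plus cohort and municipality dummies."""
    cohort_d = np.eye(cohort.max() + 1)[cohort][:, 1:]  # first cohort is the reference
    muni_d = np.eye(muni.max() + 1)[muni][:, 1:]        # first municipality is the reference
    X = np.column_stack([np.ones(y.size), reform, cohort_d, muni_d])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]  # coefficient on the reform indicator (beta_1)

y, prob, reform, muni, cohort = simulate_panel()
beta_hat = lpm_reform_effect(y, reform, muni, cohort)
```

With roughly 1000 municipality dummies, as in the paper, one would normally absorb the fixed effects rather than build the dummy matrix explicitly; the explicit matrix is used here only to keep the sketch close to equation (1).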
Table 3
The effect of education reform on women’s educational attainment in years of education.
Note: Robust standard errors in parentheses; SE clustered on the municipality of birth level. + Significant at 10%; **significant at 5%; *significant at 1%.

Table 4
Educational reform and the risk of diagnosis and death from breast cancer.

                                                 Full sample            Low SES                High SES
                                                 (1)        (2)         (3)        (4)         (5)        (6)
Diagnosis
Reform (coef * 1000)                             1.5**      1.71**      1.6        1.92+       1          1.1
                                                 (0.78)     (0.775)     (1.1)      (1.025)     (1.5)      (1.47)
Mean dep var * 1000                              34.1                   33.7                   35
Linear trends by year of reform implementation   Yes                    Yes                    Yes
Municipality trend                                          Yes                    Yes                    Yes
Observations                                     560,596    560,596     372,894    372,894     187,702    187,702
R-squared                                        0.006      0.0075      0.006      0.009       0.010      0.015

Note: Municipality fixed effects included in all specifications; birth cohort dummies included in all specifications; robust standard errors in parentheses; standard errors clustered on the municipality of birth level; + significant at 10%; **significant at 5%; *significant at 1%.
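The magnitudes described in the text as "evaluated at the mean" divide a coefficient by the mean of the dependent variable, with both expressed per 1000. A minimal Python sketch of that arithmetic, using values printed in Tables 2 and 4 (the helper name `percent_of_mean` is hypothetical):

```python
def percent_of_mean(coef_per_1000: float, mean_per_1000: float) -> float:
    """Express a linear-probability coefficient as a percentage of the outcome mean."""
    return 100.0 * coef_per_1000 / mean_per_1000

# Table 2, Panel A, column (1): years of schooling, 0.96 against a mean of 34.1.
schooling_pct = percent_of_mean(0.96, 34.1)  # about 2.8, quoted as "3 percent"

# Table 4, diagnosis, column (1): reform coefficient 1.5 against a mean of 34.1.
reform_pct = percent_of_mean(1.5, 34.1)      # about 4.4
```

The same conversion applies to the mortality coefficients, using mean deaths per 1000 as the denominator.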
6. Results

6.1. Main findings

We next turn to the effects of the compulsory education reform on breast cancer incidence and death. Since we know from previous research (see e.g. Meghir and Palme, 2005) that the reform had very different effects on later-life economic wellbeing depending on parental SES, we run separate regressions by women’s family SES background. It is important to note that since the reform was not limited to simply increasing the number of compulsory years of education, but had additional elements, the results that follow are not directly comparable with the education correlations presented in Table 2. The results on how reform treatment affected the probability of diagnosed breast cancer are shown in the top panel of Table 4 and on mortality from the disease in the bottom panel. We present estimates with year of implementation specific linear trends for easy comparison with the semiparametric Cox estimates, as well as results from linear probability models including municipality-specific linear trends. We multiply all coefficients and standard errors by 1000 for ease of presentation.
There is a significantly positive effect of reform assignment on the probability of being diagnosed with breast cancer in the full sample. Although the precision in the estimate in the low SES subsample is somewhat inferior, it is obvious that the effect is attributable to the group originating from low SES families, who experienced the largest effect of the education reform. The point estimate of the magnitude of the effect suggests an elevated risk of 1.5 per 1000 (0.15 percentage points), which is somewhat more than the correlation estimate corresponding to one extra year of education, although the precision is not sufficient to make any definite conclusion.12
The linear probability estimation results in Table 4 also show that the reform causes a significantly increased risk of mortality with breast cancer as a primary cause of death.13 That is, the expected improvement in the effect of better response to cancer treatment from more education was not sufficient to offset the increased risk of being diagnosed with breast cancer.
In addition to the linear probability models we obtained Cox PH mortality estimates stratified at the municipality level. The Cox semi-parametric model imposes fewer restrictions on the estimates; however, it suffers from severe incidental parameters problems with a large number of dummy variables, such as would be included in a specification including municipality-specific linear trends. That is why we ran the Cox estimations with linear trends by year of implementation only, so these estimates are comparable to the linear probability coefficients reported in columns (1), (3), and (5). The hazard is stratified by municipality of birth, allowing for potentially different underlying breast cancer mortality hazards by municipality. The Cox estimates are as follows: full

12 If we consider only the reform’s effect on schooling attainment, we would multiply the reform estimates from Table 4 by 1/(estimated change in years of education from Table 3), resulting in much larger estimates of the effect of an additional year of education than what is obtained in the correlations reported in Table 2. We emphasize, however, that using the reform as an IV for years of attained education is most likely flawed, as the reform contained additional elements that challenge the exclusion restrictions.
13 In the analyses we exclude all women who have received a diagnosis of breast cancer pre-1985 to avoid selection bias. This is because our mortality data start in 1985 and survival following a breast cancer diagnosis could be related to the reform.
sample coefficient 0.18+ (SE 0.1), which is very similar to the LP estimate evaluated at the mean (15% increase); low father’s education sample estimate 0.2 (SE 0.15) – again very similar to the LP coefficient estimated at the mean – a 16% increase.

6.2. Competing risks

A potentially important concern in analyzing mortality by different causes has been raised by Honoré and Lleras-Muney (2006). Technological progress in medicine or any other factor that affects the treatment or detection of certain diseases would affect the probability of death from related diagnoses but also the probability of death from other conditions, which pose “competing risks”. In essence, failure to die from one condition at a given age increases the probability of death from another condition. Honoré and Lleras-Muney (2006) show in particular that cardio-vascular (CVD) and cancer deaths in the US are related in this manner. Improvements in the treatment of CVD led to decreased mortality from CVD but also to increased mortality from cancer compared to the counterfactual. This is important in our setting because education may have affected the early detection and proper treatment of CVD, leading to a reduction in the probability of death from CVD. Through the competing risks channel, this reduction may have increased the probability of death from breast cancer. To examine this hypothesis, we first estimate the effects of being exposed to the reform on CVD mortality and compute bounds for our estimates.
The probability of death from CVD is reduced by reform treatment by 0.53% (Cox estimate 0.99472, CI 0.8353–1.1845) in the full sample. In the subsample of low SES background women, the reform treatment leads to a 2.7% decrease in CVD mortality (Cox estimate 0.97304, CI 0.80374–1.178). Assuming that everyone who did not die from CVD died from breast cancer, we compute a lower bound on our breast cancer mortality estimates. The education reform increases the probability of death from breast cancer or CVD by 9.8% (hazard ratio 1.098, CI from 0.997 to 1.209). Thus, even if the entire reduction in CVD mortality is translated into breast cancer mortality, we still find a positive effect of the reform on the (combined) mortality rate, even though it is about half the size of the effect we obtain when we assume the risks are unrelated (18%).14

6.3. Parallel trends assumption

Our difference-in-differences analysis relies on the assumption of parallel trends in the incidence of diagnosed breast cancers before and after the cohort affected by the reform in each municipality. We implement two different tests of this assumption. First, Fig. 1 plots the conditional marginal effects of exposure in the 6 cohorts pre-implementation to 6 cohorts post-implementation sample. The reference cohort is the one born 2 years before the first treated cohort. The regressions control for municipality and year of birth fixed effects, as well as municipality group by year of implementation linear trends. As the figures demonstrate, the conditional probability of diagnosis and death is not significantly different from zero in cohorts born pre-implementation. There is, however, a sharp increase in the probability that starts with the cohort right before the first fully treated cohort and levels off at a new and increased level with the second fully treated cohort (1 year after year zero of the implementation in the figures below).
Second, we performed a number of placebo tests in which we pretend that the reform was implemented earlier or later than the actual implementation year. The placebo treatment groups are defined by falsely assigning treatment to women born 6, 4 and 2 years before the first fully treated cohort, as well as 2, 4 and 6 years after the first cohort. In the first arrangement women who were not treated receive false treatment status. In the latter arrangement we pretend that women who were (actually) treated and were born 2, 4, and 6 years from the first treated cohort were not treated. Thus, in this set-up treated women receive false untreated status. We present all these tests together in Fig. 2.
Every estimate is obtained from a separate regression including cohort and municipality fixed effects, as well as year of implementation linear trends. The regressions assigning treatment to untreated cohorts include only women from untreated cohorts. Similarly, the regressions assigning non-treatment to treated cohorts include only treated women. As the figures demonstrate, the largest in absolute value and only statistically significant effects are estimated when we assign the correct treatment values. Further, there are no particular discernible patterns, suggesting that there is nothing that systematically biases our estimates.

6.4. Changes in fertility behavior as a possible mechanism behind the results

The causal estimates confirm the positive correlations between education and the probability of breast cancer diagnosis. Medical studies have pointed to several channels that might contribute to these findings (see Nechuta et al., 2010 for a recent review). For two of these – the inverse relation between educational attainments and completed fertility as well as the positive relation between education and age at first birth – we have information in our data set allowing us to analyze how these two outcomes were affected by the schooling reform.
As a background to this analysis, Table 5 shows associations between attained education and women’s fertility behavior in Sweden using the same population we analyzed in the mortality regressions. Column (1) reports the correlation between years of schooling and the probability of having no children; column (2) displays the relation between years of schooling and number of children; finally, column (3) shows the correlation between years of schooling and age at first child.
As can be seen in Table 5, there is a statistically significant relation between years of schooling and each of the three outcomes under study. The point estimates suggest that one additional year of schooling is associated with a 0.003 increase in the probability of having no children; 0.013, or about 0.8 percent, fewer children; and, finally, almost half a year older age at first birth.
In Table 6 we turn to analyzing the effect of schooling reform on the same set of outcomes as those analyzed in Table 5. None of the point estimates attain statistical significance. Comparing the estimates for the effect of the reform with the correlations shown in Table 5, it is evident that the precision in the reform effect estimates for the probability of having no children as well as the total number of children is too low to enable us to reject

14 A separate issue emerges if we consider testing for the effect of the reform on deaths from breast cancer as one of a series of multiple mortality tests we could perform, including the reform effect on death from CVD and death from other causes. We performed an adjustment procedure to calculate the q-value, which is the P-value of the test adjusted for the false discovery rate. This methodology was developed by Storey and co-authors and software was created by Dabney and Storey (Storey, 2002; Storey and Tibshirani, 2003). We picked a π0 of 1 and an FDR threshold of 0.05. The q-value on the reform coefficient in the first linear probability regression of breast cancer mortality is 0.096 (the P-value is 0.032); the q-value on the reform coefficient in a linear probability regression with binary outcome “death from any other condition” is 0.141 (P-value 0.95); the q-value on the reform coefficient in a linear probability regression with binary outcome “death from CVD” is 0.79 (P-value 0.79). While a P-value threshold of 0.1 implies that 1 in every 10 tests will be a false positive, a q-value threshold of 0.1 implies that one in every 10 positive tests will be a false positive. Even after a conservative adjustment for multiple hypotheses testing, we obtain a reform effect that is still significant at the 10% level.
Fig. 1. Probability of breast cancer diagnosis and death from breast cancer among cohorts of women born close to the first cohort affected by the reform. Note: Conditional marginal effects plotted in solid line, 95% confidence intervals in dashed lines. The omitted category is women born 2 years before the first cohort that was affected by the reform. Cohort and municipality fixed effects included in the regressions, as well as municipality group by year of implementation-specific linear trends.

[Figure graphics not recoverable from the extraction: horizontal axis ticks run from −6 to 6; one panel is titled "Diagnosis".]

Fig. 2. Placebo tests assigning treatment status to untreated or untreated status to treated cohorts.

the hypothesis that the effects are the same as for one additional year of schooling. However, for the age at first child outcome, the point estimate is very different and the precision sufficient to allow us to reject that the effect is as large as the almost 0.5 years as suggested by the result in Table 5. The upper confidence limit for a 95 percent confidence interval for the reform effect is as small as 0.025 for age at first birth, and so we can conclude that it is unlikely that the mechanism behind our result of the reform effect on cancer diagnosis incidence is through delayed childbearing among those who had children. For the other two outcomes, the precision is too low for any definite conclusions.

Table 5
Education and women’s fertility behavior.
Note: + Significant at 10%; **significant at 5%; *significant at 1%; robust standard errors in parentheses clustered at the municipality of birth; linear trends by year of reform implementation included in all specifications; birth cohort dummies included in all specifications.
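The comparison made in the text, rejecting an effect of about 0.5 years because it lies above the upper 95 percent confidence limit of 0.025, can be sketched in Python as follows. The point estimate and standard error below are hypothetical placeholders chosen only so that the upper limit is close to 0.025; they are not values taken from Table 6, and the function names are illustrative.

```python
def ci95(estimate: float, std_error: float, z: float = 1.96) -> tuple:
    """Normal-approximation 95 percent confidence interval."""
    return estimate - z * std_error, estimate + z * std_error

def rejects(hypothesized: float, estimate: float, std_error: float) -> bool:
    """True if the hypothesized effect lies outside the 95 percent interval."""
    lower, upper = ci95(estimate, std_error)
    return not (lower <= hypothesized <= upper)

# Hypothetical reform effect on age at first birth (illustrative numbers only).
reform_estimate, reform_se = -0.03, 0.028
lower, upper = ci95(reform_estimate, reform_se)

# An effect as large as the 0.5-year association falls far above the interval.
reject_half_year = rejects(0.5, reform_estimate, reform_se)
```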
Meghir, C., Palme, M., Simeonova, E., 2013. Education, Cognition and Health: Evidence from a Social Experiment. NBER Working Paper 19002.
Nechuta, S., Paneth, N., Velie, E., 2010. Pregnancy characteristics and maternal breast cancer risk: a review of the epidemiologic literature. Cancer Causes and Control 21, 967–989.
Oreopoulos, P., 2006. Estimating average and local average treatment effects of education when compulsory school laws really matter. American Economic Review 96 (1), 152–175.
Spasojevic, J., 2010. Effects of education on adult health in Sweden: results from a natural experiment. Contributions to Economic Analysis 290, 179–199.
Storey, J., 2002. A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B 64, 479–498.
Storey, J., Tibshirani, R., 2003. Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences 100, 9440–9445.
Tabar, L., et al., 1985. Reduction in mortality from breast cancer after mass screening with mammography. The Lancet 325 (8433), 829–832.
Journal of Health Economics 42 (2015) 125–138
Article info

Article history:
Received 1 February 2015
Accepted 12 March 2015
Available online 28 March 2015

JEL classification:
C31
I10
I12

Keywords:
Obesity
Overweight
Peer effects
Social interactions
Fast food

Abstract

This paper aims at opening the black box of peer effects in adolescent weight gain. Using Add Health data on secondary schools in the U.S., we investigate whether these effects partly flow through the eating habits channel. Adolescents are assumed to interact through a friendship social network. We propose a two-equation model. The first equation provides a social interaction model of fast food consumption. To estimate this equation we use a quasi maximum likelihood approach that allows us to control for common environment at the network level and to solve the simultaneity (reflection) problem. Our second equation is a panel dynamic weight production function relating an individual’s Body Mass Index z-score (zBMI) to his fast food consumption and his lagged zBMI, and allowing for irregular intervals in the data. Results show that there are positive but small peer effects in fast food consumption among adolescents belonging to a same friendship school network. Based on our preferred specification, the estimated social multiplier is 1.15. Our results also suggest that, in the long run, an extra day of weekly fast food restaurant visits increases zBMI by 4.45% when ignoring peer effects and by 5.11%, when they are taken into account.
© 2015 Elsevier B.V. All rights reserved.
☆ We wish to thank Christopher Auld, Charles Bellemare, Luc Bissonnette, Vincent Boucher, Paul Frijters, Guy Lacroix, Paul Makdissi, Daniel L. Millimet, Kevin Moran, Bruce Shearer for useful comments and Yann Bramoullé, Badi Baltagi, Rokhaya Dieye, Habiba Djebbari, Tue Gorgens, Bob Gregory, Louis Hotte, Linda Khalaf, Lung fei Lee, Xin Meng and Rabee Tourky for useful discussions. The suggestions of two anonymous referees have substantially improved the paper. We are grateful to Rokhaya Dieye for outstanding research assistance. The usual disclaimer applies. Financial support from the Canada Research Chair in the Economics of Social Policies and Human Resources and le Centre interuniversitaire sur le risque, les politiques économiques et l’emploi is gratefully acknowledged. This research uses data from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill, and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations.
∗ Corresponding author. Tel.: +1 418 656 5678.
E-mail addresses: Bernard.Fortin@ecn.ulaval.ca (B. Fortin), m.yazbeck@uq.edu.au (M. Yazbeck).

1. Introduction

For the past few years, obesity has been one of the major concerns of health policy makers in the U.S. It has also been one of the principal sources of increased health care costs. In fact, the increasing trend in children’s and adolescents’ obesity (Ogden et al., 2012) has raised the annual obesity-related medical costs to $147 billion per year (Finkelstein et al., 2009). Obesity is also associated with increased risk of reduced life expectancy as well as with serious health problems such as type 2 diabetes (Maggio and Pi-Sunyer, 2003), heart disease (Calabr et al., 2009) and certain cancers (Calle, 2007), making obesity a real public health challenge.
Recently, a growing body of the health economics literature has tried to look into the obesity problem from a new perspective using a social interaction framework. An important part of the evidence suggests the presence of peer effects in weight gain. On the one hand, Christakis and Fowler (2007), Trogdon et al. (2008), Renna et al. (2008) and Yakusheva et al. (2014) point to the social multiplier as an important element in the obesity epidemic. As long as it is strictly larger than one, a social multiplier amplifies, at the aggregate level, the impact of any shock (such as the reduction in relative price of junk food) that may affect obesity at the individual level. This is so because the aggregate effect incorporates, in addition to the sum of the individual direct effects, positive indirect peer effects stemming from social interactions. On the other hand, Cohen-Cole and Fletcher (2008b) found that there is no evidence of peer effects in weight gain. Also, results from a placebo test performed by the same authors (Cohen-Cole and Fletcher, 2008a) indicate that there are peer effects in acne (!) in the Add Health data when one applies the Christakis and Fowler (2007) method discussed later on.
http://dx.doi.org/10.1016/j.jhealeco.2015.03.005
0167-6296/© 2015 Elsevier B.V. All rights reserved.
126 B. Fortin, M. Yazbeck / Journal of Health Economics 42 (2015) 125–138
While the presence (or not) of peer effects in weight has been widely researched,¹ the literature on the mechanisms through which peer effects flow is still scarce. Indeed, most of the relevant literature attempts to estimate the relationship between variables such as an individual’s Body Mass Index (BMI) and his peers’ average BMI, without exploring the channels at the source of this potential linkage.² The aim of this paper is to go beyond the black box approach to peer effects in weight gain and try to identify one potentially important mechanism through which peer effects in adolescent overweight may flow: eating habits (as proxied by fast food consumption).

Three reasons justify our interest in eating habits in analyzing the impact of peer effects on teenage weight. First of all, there is an important literature that points to eating habits as an important component in weight gain (e.g., Niemeier et al., 2006; Rosenheck, 2008).³ Second, one suspects that peer effects in eating habits are likely to be important in adolescence. Indeed, at this age, youngsters have increased independence in general and more freedom as far as their food choices are concerned. Usually vulnerable, they often compare themselves to their friends and may alter their choices to conform to the behaviour of their peers. Therefore, unless we scientifically prove that obesity is a virus, it is counterintuitive to think that one can gain weight by simply interacting with an obese person.⁴ This is why we are inclined to think that the presence of real peer effects in weight gain can be estimated using behavioural channels such as eating habits. Third, our interest in peer effects in youths’ eating habits is policy driven. There has been much discussion on implementing tax policies to address the problem of obesity (e.g., Caraher and Cowburn, 2007; Powell et al., 2013). As long as peer effects in fast food consumption are a source of externality that may stimulate overweight among adolescents, it may be justified to introduce a consumption tax on fast food. The optimal level of this tax will depend, among other things, on the social multiplier of eating habits, and on the causal effect of fast food consumption on adolescent weight.

In order to analyze the impact of peer effects in eating habits on weight gain, we propose a two-equation model. The first, linear-in-means equation relates an individual’s fast food consumption to his individual characteristics, his reference group’s mean fast food consumption (endogenous peer effect), and his reference group’s mean characteristics (contextual peer effects). The endogenous peer effect reflects the possibility that the eating behaviour of his friends influences a teenager’s own behaviour. For instance, one reason why an adolescent may want to go to a fast food restaurant is to be with his friends during lunch. Contextual effects, such as the average level of education of his friends’ mothers, may also affect a teenager’s eating habits. Thus, mothers with higher education may encourage not only their children but also their children’s friends to develop sound eating behaviour.

The second equation is a panel dynamic production function that relates an individual’s BMI adjusted for age (BMI z-score or zBMI) to his fast food consumption, his lagged zBMI and other control variables. The system of equations thus allows us to evaluate the impact of an exogenous shock to eating habits on an adolescent’s weight, when peer effects on fast food consumption are taken into account.

To estimate our two-equation model, we use three waves of the National Longitudinal Study of Adolescent Health (Add Health), that is, Wave II (1996), Wave III (2001) and Wave IV (2008).⁵ We define peers as the nominated group of individuals reported as friends within the same school. The consumption behaviour is depicted through the reported frequency (in days) of fast food restaurant visits in the past week.

Estimating our system of equations raises serious econometric problems. It is well known that the identification of peer effects (first equation) is a challenging task. These identification issues were first pointed out by Manski (1993) and discussed among others by Bramoullé et al. (2009) and Blume et al. (2015). On one hand, (endogenous + contextual) peer effects must be identified from correlated (or confounding) factors. For instance, students in the same friendship group may have similar eating habits because they share similar characteristics (i.e., homophily) or face a common environment (e.g., same school). On the other hand, simultaneity between an adolescent’s and his peers’ behaviour (referred to as the reflection problem by Manski) may make it difficult to identify separately the endogenous peer effect and the contextual effects.

We use a new approach based on Bramoullé et al. (2009) and Lee et al. (2010), and extended by Blume et al. (2015), to address these identification problems and to estimate the peer effects equation. First, we assume that in their fast food consumption decisions, adolescents interact through a friendship network. Each school is assumed to form a network. School fixed effects are introduced to capture correlated factors associated with network invariant unobserved variables (e.g., similar preferences due to self-selection in schools, same school nutrition policies, distance from fast food restaurants). The structure of friendship links within a network is allowed to be stochastic but, conditional on the school fixed effects and observable individual and contextual variables, is strictly exogenous. The possibility that friends select each other using unobservable traits that may be correlated with their fast food consumption decisions is an important issue and is discussed later on.

To solve the reflection problem, we exploit results by Bramoullé et al. (2009) who show that if there are at least two agents who are separated by a link of distance 3 within a network (i.e., there are two adolescents in a school who are not friends but are linked by two friends), both endogenous and contextual peer effects are identified. Finally, we exploit the similarity between the linear-in-means model and the spatial autoregressive (SAR) model with or without autoregressive spatial errors.⁶ The model is estimated using a quasi maximum likelihood (QML) approach as in Lee et al. (2010). The QML is appropriate when the estimator is derived from a normal likelihood but the error terms in the model are not truly normally distributed. We also estimate the model using the generalized spatial two-stage least squares approach proposed in Kelejian and Prucha (1998) and refined in Lee (2003), which is less efficient than QML.

The estimation of the production function (second equation) also raises serious econometric issues. First, fast food consumption is likely to be an endogenous variable correlated with the individual error term. Moreover, the short and the long term impacts of fast food consumption on zBMI may be different, suggesting the

¹ For a complete review see Fletcher et al. (2011), who conducted a systematic review of the literature showing that school friends are similar as far as body weight and weight related behaviours are concerned.
² One recent exception is Yakusheva et al. (2011) and Yakusheva et al. (2014), who look at peer effects in overweight and in weight management behaviours such as eating and physical exercise, using randomly assigned pairs of roommates in freshman year.
³ Indirect evidence of the relationship between eating habits and weight gain comes from the literature on the (negative) effect of fast food prices on adolescents’ BMI (see Auld and Powell, 2009; Powell, 2009; Powell and Bao, 2009). See also Cutler et al. (2003), which relates the declining relative price of fast food and the increase in fast food restaurant availability over time to increasing obesity in the U.S.
⁴ Of course, having obese peers may influence an individual’s tolerance for being obese and therefore his weight management behaviours.
⁵ Note that for the first equation we use wave II and for the second equation we use the three waves.
⁶ Our approach is more general than the SAR model as the latter usually ignores contextual effects and spatial fixed effects.
introduction of lagged zBMI as an explanatory variable. Finally, Add Health data waves are collected at irregular intervals. As a consequence, estimators obtained from standard dynamic panel data models are inconsistent. In order to deal with these problems, we use a nonlinear instrumental approach developed by Millimet and McDonough (2013).

Results suggest that there is a positive but small endogenous peer effect in fast food consumption among adolescents in general. Based on our QML specification, the estimated social multiplier is 1.15. Moreover, the production function estimates indicate that there is a positive significant impact of fast food consumption on zBMI. Combining these results, we find that, in the long run, an extra day of weekly fast food restaurant visits increases zBMI by 4.45% when peer effects are ignored and by 5.11% when they are taken into account.

The remainder of this paper is laid out as follows. Section 2 provides a survey of the literature on peer effects in obesity as well as its decomposition into the impact of peer effects on fast food consumption and the impact of fast food consumption on obesity. Section 3 presents the specification of our fast food equation with peer effects. Section 4 is devoted to our weight production function. In Section 5, we give an overview of the Add Health Survey and we provide descriptive statistics of the data we use. In Section 6, we discuss estimation results. Section 7 concludes.

2. Previous literature

In recent years, a number of studies found strong “social network effects” in weight outcomes. In a widely debated article, Christakis and Fowler (2007) found that an individual’s probability of becoming obese increased by 57% if he or she had a friend who became obese in a given interval.⁷ However, their analysis has been criticized for suffering from a number of limitations (see Cohen-Cole and Fletcher, 2008b; Lyons, 2011; Shalizi and Thomas, 2011).⁸ In particular, it ignores potential spurious correlations between two friends’ BMI resulting from the fact that they are exposed to the same environment. Both Shalizi and Thomas (2011) and Lyons (2011) show that relying on link asymmetries does not rule out shared environment as it claims. Also, the simultaneity problem between these two outcomes is not directly addressed by allowing the peer’s obesity to be endogenous.

In the same spirit, Trogdon et al. (2008) investigate the presence of peer effects in obesity using Add Health data. They include school fixed effects to account for the fact that students in the same school share the same surroundings. The authors also estimate their BMI peer model with an instrumental variable approach using information on friends’ parents’ obesity and health and friends’ birth weight as instruments for peers’ BMI. They find that a one point increase in peers’ average BMI increases own BMI by 0.52 point. Based on a similar approach and using the Add Health dataset, Renna et al. (2008) also find positive peer effects. These effects are significant for females only (=0.25 point).⁹ These analyses raise a number of concerns though. In particular, they assume no contextual variables reflecting peers’ mean characteristics. This rules out the reflection problem by introducing untested exclusion restrictions. In our approach, we introduce school fixed effects and, for each individual variable, the corresponding contextual variable at the reference group level.

Using the same dataset as Trogdon et al. (2008) and Renna et al. (2008), Cohen-Cole and Fletcher (2008b) exploit panel information (wave II in 1996 and wave III in 2001) for adolescents for whom at least one same-sex friend is also observed over time. Compared with Christakis and Fowler’s approach, their analysis introduces time invariant and time dependent environmental variables (at the school level). Friendship selection is controlled for by individual fixed effects. The authors find that peer effects are no longer significant with this specification.

All the studies discussed up to this point focus on peer effects in weight outcomes without analyzing quantitatively the mechanisms by which they may occur. The general issue addressed in this paper is whether the peer effects in weight gain among adolescents partly flow through the eating habits channel. This raises in turn two basic issues: (a) are there peer effects in fast food consumption?, and (b) is there a link between weight gain (or obesity) and fast food consumption? In this paper, we address both issues.

The literature on peer effects in eating habits (first issue) is recent and quite limited. In a recent paper, in which the formation of the network is randomized, Yakusheva et al. (2011) estimate peer effects in explaining weight gain among freshman girls using a similar set up but in school dormitories. In their paper, they test whether some of the student’s weight management behaviours (i.e., eating habits, physical exercise, use of weight loss supplements) can be predicted by her randomly assigned roommate’s behaviours. Their results provide evidence of the presence of negative peer effects in weight gain. Their results also suggest positive peer effects in eating habits, exercise and use of weight loss supplements. In a subsequent paper, Yakusheva et al. (2014) investigate the presence of peer effects in weight gain exploiting random assignment of roommates during the first year of college. The authors find evidence that suggests that peer effects in weight gain are predominantly significant among females.¹⁰

Our paper finds its basis in this literature as well as the literature on peer effects and obesity discussed above. However, while the works by Yakusheva et al. (2011) and Yakusheva et al. (2014) rely upon experimental data, we use observational non-experimental data. Peers are considered to have social interactions within a school network. This allows for the construction of a social interaction matrix that reflects how social interaction between adolescents in schools occurs in a more realistic setting (as in Trogdon et al., 2008; Renna et al., 2008). An additional originality of our paper lies in the fact that it relies upon a linear-in-means approach when relating an adolescent’s behaviour to that of his peers. Also, the analogy between the forms of the linear-in-means model and the spatial autoregressive (SAR) model allows us to exploit the particularities of this latter model, in particular the natural instruments that are derived from its structural form.

Regarding the second issue, i.e., the relationship between weight (or obesity) and fast food consumption, it is an empirical question that is still open to debate.¹¹ There is no clear evidence in support of a causal link between fast food consumption and obesity. Nevertheless, most of the literature in epidemiology finds evidence of a positive correlation between fast food consumption and obesity (see, Anderson et al., 2011; Rosenheck, 2008).

⁷ They used a 32-year panel dataset on adults from Framingham, Massachusetts and a logit specification.
⁸ For a response to these criticisms and others, see Fowler and Christakis (2008), VanderWeele (2011) and Christakis and Fowler (2013).
⁹ Also Ali et al. (2011) and Ali et al. (2011) provide evidence that there are peer correlations in weight related behaviours and peer influence in weight misperception respectively.
¹⁰ In contrast, De la Haye et al. (2010) provide evidence that close adolescent male friends tend to be similar in their consumption of high-calorie food.
¹¹ The literature on the impact of physical activity on obesity is also inconclusive. For instance, Berentzen et al. (2008) provide evidence that decreased physical activity in adults does not lead to obesity.
The economic literature tends to be conservative with respect to this question. It focuses on the impact of “exposure” to fast food on obesity. Dunn et al. (2012), using an instrumental variable approach, investigate the relationship between fast food availability and obesity. They find that an increase in the number of fast food restaurants has a positive effect on BMI among non-whites. Alviola et al. (2014), using a similar approach, provide evidence that the number of fast food restaurants has a significant impact on school obesity rates. Similarly, Currie et al. (2010) find evidence that proximity to fast food restaurants has a significant effect on obesity for 9th graders. Also, Anderson and Matsa (2011), exploiting the placement of Interstate Highways in rural areas to obtain exogenous variations in the effective price of restaurants, did not find any causal link between restaurant consumption and obesity. More generally, Cutler et al. (2003) and Bleich et al. (2008) argue that increased calorie intake (i.e., eating habits) plays a major role in explaining current obesity rates. Importantly, weight prior to adulthood sets the stage for weight in adulthood. While most of the economics literature analyses the relationship between adolescents’ fast food consumption and their weight using an indirect approach (i.e., the effect of fast food exposure), we adopt a direct approach that models weight as a function of fast food consumption, lagged weight and control variables.

In the next two sections, we present our two-equation model of the weight production function with peer effects in fast food consumption. We first propose a linear-in-means social interaction equation of fast food consumption (first equation) and discuss the econometric methods we use to estimate it. We then present our econometric weight production function, which relates the adolescent’s zBMI level to his fast food consumption (second equation).

3. Social interactions equation of fast food consumption

We assume a set of $N$ adolescents $i$ that are partitioned in a set of $L$ networks. A network is defined as a structure (e.g., school) in which adolescents are potentially tied by a friendship link. Each adolescent $i$ in his network has a set of nominated friends $N_i$ of size $n_i$ that constitute his reference group (or peers). We assume that $i$ is excluded from his reference group. Since peers are defined as nominated friends, the number of peers will not be the same for every network member. Let $G_l$ ($l = 1, \ldots, L$) be the social interaction matrix for a network $l$. Its element $g_{lij}$ takes a value of $1/n_i$ when $i$ is friend with $j$, and zero otherwise. Therefore, assuming no isolated individuals, the $G_l$ matrix is row normalized. We define $y_{li}$ as the fast food consumed by adolescent $i$ in network $l$, $x_{li}$ represents adolescent $i$’s observable characteristics, $y_l$ is the vector of fast food consumption in network $l$, and $x_l$ is the corresponding vector for individual characteristics. To simplify our presentation, we look at only one characteristic (e.g., the education of the adolescent’s mother).¹² The network invariant unobservable variables are captured through fixed network effects (the $\alpha_l$’s). They take into account unobserved factors such as preferences of school, school nutrition policies, or presence of fast food restaurants around the school. The $\varepsilon_{li}$’s are the idiosyncratic error terms. They capture $i$’s unobservable characteristics that are not invariant within the network. Formally, one can write the linear-in-means equation for adolescent $i$ as follows:

$$y_{li} = \alpha_l + \beta \frac{\sum_{j \in N_i} y_{lj}}{n_i} + \gamma x_{li} + \delta \frac{\sum_{j \in N_i} x_{lj}}{n_i} + \varepsilon_{li}, \qquad (1)$$

where $\frac{\sum_{j \in N_i} y_{lj}}{n_i}$ and $\frac{\sum_{j \in N_i} x_{lj}}{n_i}$ are respectively his peers’ mean fast food consumed and characteristics. In the context of our paper, $\beta$ is the endogenous peer effect. It reflects how the adolescent’s consumption of fast food is affected by his peers’ mean fast food consumption. It is standard to assume that $|\beta| < 1$. The contextual peer effect is represented by the parameter $\delta$.¹³ It captures the impact of his peers’ mean characteristic on his fast food consumption. It is important to note that the $G_l$ matrix and the $x_l$ vector are allowed to be stochastic but are assumed strictly exogenous conditional on $\alpha_l$, that is, $E(\varepsilon_{li} \mid x_l, G_l, \alpha_l) = 0$. This assumption is flexible enough to allow for correlation between the network’s unobserved common characteristics (e.g., school’s cafeteria quality) and observed characteristics (e.g., mother’s education). Nevertheless, once we condition on these common characteristics, mother’s education is assumed to be independent of the idiosyncratic error terms. Let $I_l$ be the identity matrix for a network $l$ and $\iota_l$ the corresponding vector of ones. Eq. (1) for network $l$ can be rewritten in matrix notation as follows:

$$y_l = \alpha_l \iota_l + \beta G_l y_l + \gamma x_l + \delta G_l x_l + \varepsilon_l, \quad \text{for } l = 1, \ldots, L. \qquad (2)$$

Note that Eq. (2) is similar to a SAR model (e.g., Cliff and Ord, 1981) generalized to allow for contextual and fixed effects (hereinafter referred to as the GSAR model). Since $|\beta| < 1$, $(I_l - \beta G_l)$ is invertible. Therefore, in matrix notation, the reduced form of Eq. (2) can be written as:

$$y_l = \frac{\alpha_l}{1 - \beta}\,\iota_l + (I_l - \beta G_l)^{-1}(\gamma I_l + \delta G_l)\,x_l + (I_l - \beta G_l)^{-1}\varepsilon_l, \qquad (3)$$

where we use the result that $(I_l - \beta G_l)^{-1} = \sum_{k=0}^{\infty} \beta^k G_l^k$, so that the vector of intercepts is $\frac{\alpha_l}{1 - \beta}\,\iota_l$, assuming no isolated adolescents.¹⁴

Eq. (3) allows us to evaluate the impact of a marginal shock in $\alpha_l$ (i.e., a common exogenous change in fast food consumption within the network) on an adolescent $i$’s fast food consumption, when the endogenous peer effect is taken into account. One has $\partial E(y_{li} \mid \cdot)/\partial \alpha_l = 1/(1 - \beta)$. This expression is defined as the social multiplier in our model. When $\beta > 0$ (strategic complementarity in fast food consumption), the social multiplier is larger than 1. In this case, the impact of the shock is amplified by social interactions as more fast food consumption by his peers induces an adolescent to adopt a similar behaviour.

We then perform a panel-like within transformation. More precisely, we average Eq. (3) over all students in network $l$ and subtract it from $i$’s equation. This transformation allows us to address problems that arise from the fact that adolescents are sharing the same environment or preferences. Let $K_l = I_l - H_l$ be the matrix that obtains the deviation from the network $l$ mean, with $H_l = \frac{1}{n_l}(\iota_l \iota_l')$. The network within transformation will eliminate the network fixed effect $\alpha_l$. Pre-multiplying (3) by $K_l$ yields the reduced form of the model for network $l$, in deviation:

$$K_l y_l = K_l (I_l - \beta G_l)^{-1}(\gamma I_l + \delta G_l)\,x_l + K_l (I_l - \beta G_l)^{-1}\varepsilon_l. \qquad (4)$$

3.1. Identification

Our peer effects structural Eq. (2) raises two basic identification problems.

¹² Later on, in Section 3.1.1, we will generalize the equation to account for many characteristics.
¹³ It is standard to assume the presence of a contextual effect for each individual characteristic influencing the outcome. Otherwise, the model may impose ad hoc exclusion restrictions which generate invalid instruments and inconsistent estimators.
¹⁴ When an adolescent is isolated, his intercept is $\alpha_l$.
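The linear-in-means model of Eqs. (1) to (4) can be illustrated numerically. The following is a minimal sketch, not the authors' estimation code: the five-student network, the parameter values and the helper names `interaction_matrix` and `reduced_form` are all hypothetical and chosen only for illustration. It builds a row-normalized $G_l$, checks that the reduced form (3) solves the structural Eq. (2), and verifies that a common shock to $\alpha_l$ is scaled by the social multiplier $1/(1-\beta)$ and removed by the within transformation $K_l$.

```python
import numpy as np

def interaction_matrix(friends, n):
    """Row-normalized G: g_ij = 1/n_i if j is a nominated friend of i, else 0."""
    G = np.zeros((n, n))
    for i, nominated in friends.items():
        for j in nominated:
            G[i, j] = 1.0 / len(nominated)
    return G

# Hypothetical 5-student network; nominations need not be reciprocal,
# and every student nominates at least one friend (no isolated students).
friends = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2, 4], 4: [3]}
n = 5
G = interaction_matrix(friends, n)

# Illustrative parameter values (assumptions, not estimates from the paper).
alpha, beta, gamma, delta = 1.0, 0.13, 0.5, 0.2
rng = np.random.default_rng(0)
x = rng.normal(size=n)     # one characteristic, e.g. mother's education
eps = rng.normal(size=n)   # idiosyncratic errors
I = np.eye(n)
iota = np.ones(n)

def reduced_form(a):
    """Eq. (3): y = a/(1-beta) iota + (I - beta G)^{-1} [(gamma I + delta G) x + eps]."""
    S = np.linalg.inv(I - beta * G)
    return a / (1 - beta) * iota + S @ ((gamma * I + delta * G) @ x + eps)

y = reduced_form(alpha)

# Right-hand side of the structural Eq. (2), evaluated at the reduced-form y.
structural_rhs = alpha * iota + beta * G @ y + gamma * x + delta * G @ x + eps

# Social multiplier: a unit shock to alpha shifts every y_i by 1/(1 - beta).
multiplier = reduced_form(alpha + 1.0) - y

# Within transformation K = I - (1/n) iota iota' removes the fixed effect.
K = I - np.outer(iota, iota) / n
y_dev_a = K @ reduced_form(alpha)
y_dev_b = K @ reduced_form(alpha + 10.0)

# Endogenous effect implied by the reported multiplier of 1.15.
implied_beta = 1 - 1 / 1.15
```

Under these assumptions, `multiplier` is constant across students, and `implied_beta` is the value of $\beta$ for which $1/(1-\beta)$ equals the reported multiplier of 1.15.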
3.1.1. Simultaneity

Simultaneity between individual and peer behaviour (the reflection problem) may prevent separating contextual effects from endogenous effects. This problem has been analyzed by Bramoullé et al. (2009) when individuals interact through social networks. They show that the conditions of identification depend on both the values of parameters and the structure of the network. More explicitly, let us first assume throughout that $\beta\gamma + \delta \neq 0$. Then define $G$ as the block-diagonal matrix with the $G_l$’s on its diagonal. Assume first the absence of fixed network effects (i.e., $\alpha_l = \alpha$ for all $l$). In this case, Bramoullé et al. (2009) show that the structural parameters of Eq. (2) are identified if the matrices $I$, $G$, $G^2$ are linearly independent. This condition is satisfied when there are at least two adolescents who are separated by a link of distance 2 within a network. This means that they are not friends but have a common friend.¹⁵ The intuition is that this provides exclusion restrictions in the model. More precisely, the friends’ friends’ mean characteristic can serve as an instrument for the friends’ mean fast food consumption. Of course, when fixed network effects are allowed, the identification conditions are more restrictive. Bramoullé et al. (2009) show that, in this case, the structural parameters are identified if the matrices $I$, $G$, $G^2$ and $G^3$ are linearly independent. This condition is satisfied when at least two adolescents are separated by a link of distance 3 within a network, i.e., we can find two adolescents who are not friends but are linked by two friends. In this case, $g^3_{lij} > 0$ while $g^2_{lij} = g_{lij} = 0$. Hence, no linear relation of the form $G^3 = \lambda_0 I + \lambda_1 G + \lambda_2 G^2$ can exist. This condition holds in most friendship networks and, in particular, in the data we use.¹⁶

3.1.2. Correlated effects

The presence of confounding unobservable variables affecting fast food consumption and correlated with the explanatory variables raises difficult identification problems. First, since adolescents are not randomly assigned into schools, endogenous self-selection through networks may be the source of potentially serious biases in estimating (endogenous + contextual) peer effects. Indeed, if the variables that drive this process of selection are not fully observable, correlations between unobserved network-specific factors and the regressors are potentially important sources of bias. In our approach, we assume that network fixed effects capture these factors. This is consistent with two-step models of link formation. Each adolescent joins a school in a first step, and forms friendship links with others in his school in a second step. In the first step, adolescents self-select into different schools with selection bias due to specific school characteristics. In the second step, link formation takes place within schools randomly or based on observable individual characteristics only. Recall also that network fixed effects take into account common unobservable variables at the school level that may influence fast food consumption (e.g., availability of fast food restaurants).

Of course, one limitation of using network fixed effects is that it ignores the possibility that link formation within a network depends on omitted variables. The matrix $G$ may be endogenous even when controlling for the network fixed effects and observable characteristics. Thus friends may select each other using unobservable traits that may be correlated with fast food consumption (e.g., impulsivity, a specific taste for sugar- and fat-rich food). Recently, some researchers (e.g., Hsieh and Lee, 2011; Goldsmith-Pinkham and Imbens, 2013; Liu et al., 2013; Badev, 2013) have made attempts to develop econometric models allowing for the joint estimation of network formation and network interactions. However, empirical results using Add Health data and focusing on outcomes such as smoking, sleeping behaviour, and school performance do not seem to detect much difference in peer effects when networks are assumed exogenous and when they are allowed to be endogenous.¹⁷

One specification of our peer effects equation also allows the error terms to be (first-order) autocorrelated within networks. Therefore its structure becomes analogous to that of a generalized spatial autoregressive model with network autoregressive disturbances (denoted as the GSARAR model). This model implies that, in addition to the endogenous and contextual effects, some unobserved characteristics of the friends are also interdependent. In this case, the error terms in (2) can be written as:

$$\varepsilon_l = \rho G_l \varepsilon_l + \xi_l, \qquad (5)$$

where the innovations, $\xi_l$, are assumed to be i.i.d. $(0, \sigma^2 I_l)$ and $|\rho| < 1$. Given these assumptions, we can write:

$$\varepsilon_l = (I_l - \rho G_l)^{-1} \xi_l. \qquad (6)$$

Allowing for many characteristics and performing a Cochrane-Orcutt-like transformation on the structural equation (4) in deviation, the latter is given by the following structural form:

$$K_l M_l y_l = \beta K_l M_l G_l y_l + K_l M_l X_l \gamma + K_l M_l G_l X_l \delta + \nu_l, \qquad (7)$$

where $X_l$ is the matrix of adolescents’ characteristics¹⁸ in the $l$th network, $M_l = (I_l - \rho G_l)$ and $\nu_l = K_l M_l \varepsilon_l = K_l \xi_l$.

Following Lee et al. (2010), we propose two approaches to estimate the peer effects equation (7): a quasi maximum likelihood (QML) approach and a generalized spatial two stage least squares (GS-2SLS) approach. The QML estimators are estimated assuming that the disturbances are normally distributed. However, we do allow the log-likelihood function to be partially misspecified, as standard errors are computed to be robust to non-normal disturbances (using a sandwich formula). Assuming that the error terms are i.i.d. and under a number of regularity assumptions (see Lee et al., 2010, p. 152), QML estimators are consistent but not asymptotically efficient. On the other hand, GS-2SLS estimators also assume that the error terms are i.i.d. but impose fewer regularity conditions than QML estimators. QML estimators are asymptotically more efficient than GS-2SLS estimators.¹⁹

4. Weight production function

In this section, we propose a dynamic (AR(1)) weight production function that relates an individual’s zBMI in period $t$ (assumed a year) to his lagged zBMI, his fast food consumption and his own characteristics in period $t$. Let $y^b_{it}$ be an individual $i$’s zBMI level in period $t$, and $y^f_{it}$ be the individual’s fast food consumption in period

¹⁵ More generally, Eq. (2) is identified when individuals do not interact in groups or interact in groups with at least three different sizes (see Bramoullé et al., 2009).
¹⁶ Identification fails, however, for a number of non trivial networks. This is notably the case for complete bipartite networks. In these graphs, the population of students is divided in two groups such that all students in one group are friends with all students in the other group, and there are no friendship links within groups. These include star networks, where one student, at the centre, is friend with all other students, who are all friends only with him.
¹⁷ One reason may be that this database includes a very large number of observable characteristics, some of them being used in the regressions. Another explanation is that, although statistically significant, the explanatory power of the individual characteristics on the probability that two individuals are friends is extremely small (Boucher, 2014).
¹⁸ Following the linear-in-means model, we allow the peers’ mean characteristic corresponding to each individual’s characteristic to have a potential effect on his fast food consumption. Therefore, we do not impose ad hoc (identifying) exclusion restrictions on the structural peer effects equation.
¹⁹ The derivations of the QML and GS-2SLS estimators are presented in the Appendix.
130 B. Fortin, M. Yazbeck / Journal of Health Economics 42 (2015) 125–138
t. Then, for a given vector of characteristics $\tilde{x}_{it}$, the data generating process (DGP) of the weight production function can be formally expressed as follows (for notational simplicity we suppress l):

$$y^b_{it} = \phi_1 y^b_{i,t-1} + \phi_2 y^f_{it} + \phi_3 \tilde{x}_{it} + \eta_i + \epsilon_{it}, \qquad (8)$$

where $\phi_1$ is the autoregressive parameter ($|\phi_1| < 1$), $\eta_i$ is individual i’s time-invariant error component (fixed effect) and $\epsilon_{it}$ his idiosyncratic error that may change across t. One difficult problem with (8) is that, in the Add Health data set, the waves are irregularly spaced. This means that the successive periods of observed data (that is, for 1996, 2001 and 2008) do not conform to successive (yearly) periods as defined by our underlying DGP. In that case, standard methods to estimate a dynamic panel model with endogenous variables (e.g., Anderson and Hsiao, 1981; Arellano and Bond, 1991) yield inconsistent estimators. To address this point, we follow the approach of Millimet and McDonough (2013) (hereinafter MM). By repeated substitution, Eq. (8) can be rewritten over the observed periods m = 1, 2, 3:

$$y^b_{im} = \phi_1^{g_m} y^b_{i,m-1} + \phi_2 y^f_{im} + \phi_3 \tilde{x}_{im} + \lambda \eta_i + \tilde{\epsilon}_{im}, \qquad (9)$$

where $g_m$ is the gap size or the number of years between observed periods m and m − 1 (which, in our case, are equal to $g_1 = 1$, $g_2 = 5$, $g_3 = 7$)20; $\lambda = (1 - \phi_1^{g_m})/(1 - \phi_1)$, and

$$\tilde{\epsilon}_{im} = \sum_{j=1}^{g_m-1} \left(\phi_2 y^f_{i,t(m)-j} + \phi_3 \tilde{x}_{i,t(m)-j}\right)\phi_1^{j} + \sum_{j=0}^{g_m-1} \phi_1^{j}\,\epsilon_{i,t(m)-j}. \qquad (10)$$

and serially uncorrelated. Also, while the estimators are inconsistent when T is fixed and N → ∞, Monte Carlo simulations by MM suggest that this approach has superior small sample properties compared to other dynamic panel data estimators.

Second, some covariates (in particular, the individual’s fast food consumption, $y^f$) are likely to be correlated with the unobserved effect and/or to be serially correlated. In the first case, Everaert (2013) suggests using Hausman and Taylor (1981) type instruments for these covariates, that is, deviations from individual sample means (e.g., $\ddot{y}^f$). Also, in the presence of serially correlated covariates, one solution suggested by MM is to impute data for the missing periods. For instance, we can use the current value of covariates to approximate missing covariates between periods m and m − 1.21 Therefore, in Eq. (10), we can write:

$$\sum_{j=1}^{g_m-1} \left(\phi_2 y^f_{i,t(m)-j} + \phi_3 \tilde{x}_{i,t(m)-j}\right)\phi_1^{j} \approx \left(\phi_2 y^f_{i,m} + \phi_3 \tilde{x}_{i,m}\right)\frac{\phi_1 - \phi_1^{g_m}}{1 - \phi_1}. \qquad (11)$$

In this paper, we estimate the weight production function given by Eqs. (9)–(11) using a nonlinear instrumental variable approach based on current values of covariates to approximate missing data for the missing periods. Following MM, we denote this estimator E-NLS-IV-C. We also present a GMM version of this estimator using a two-step approach to obtain an optimal weighting matrix (clustered at

20 One has g(1) = 1 since Wave I from Add Health data (which corresponds to m = 0) was collected in 1995.
21 No approximation is needed for variables such as age, for which we have perfect information at each period.
22 Adolescents were asked to nominate up to five female friends and five male friends.
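The geometric-sum factors that arise when Eq. (8) is iterated across an irregular gap can be checked numerically. The sketch below is illustrative only: the function names `gap_factors` and `brute_force_sums` and the value `ar_coef = 0.76` are assumptions introduced here, not part of the original analysis. It compares the closed forms appearing in Eqs. (9) and (11) with term-by-term sums for the three gap sizes stated in the text (1, 5 and 7 years).

```python
import math


def gap_factors(ar_coef: float, gap: int) -> tuple[float, float]:
    """Closed-form geometric sums for a gap of `gap` years.

    Returns (fe_factor, cov_factor), where
      fe_factor  = sum_{j=0}^{gap-1} ar_coef**j  (multiplies the fixed effect in Eq. (9))
      cov_factor = sum_{j=1}^{gap-1} ar_coef**j  (multiplies current covariates in Eq. (11))
    """
    if not abs(ar_coef) < 1:
        raise ValueError("the autoregressive parameter must satisfy |ar_coef| < 1")
    if gap < 1:
        raise ValueError("gap must be a positive number of years")
    fe_factor = (1 - ar_coef**gap) / (1 - ar_coef)
    cov_factor = (ar_coef - ar_coef**gap) / (1 - ar_coef)
    return fe_factor, cov_factor


def brute_force_sums(ar_coef: float, gap: int) -> tuple[float, float]:
    """The same two sums, computed term by term."""
    fe_factor = sum(ar_coef**j for j in range(gap))
    cov_factor = sum(ar_coef**j for j in range(1, gap))
    return fe_factor, cov_factor


if __name__ == "__main__":
    ar_coef = 0.76  # hypothetical value, for illustration only
    for gap in (1, 5, 7):  # gap sizes g1, g2, g3 stated in the text
        closed = gap_factors(ar_coef, gap)
        brute = brute_force_sums(ar_coef, gap)
        assert all(math.isclose(c, b, abs_tol=1e-12) for c, b in zip(closed, brute))
        print(gap, closed)
```

With a one-year gap the covariate factor is zero, so no imputation of missing covariates is needed; the approximation in Eq. (11) only matters for the five- and seven-year gaps.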
in-home in the subsequent waves in 1996 (wave II), 2001 (wave III) and 2008–2009 (wave IV). The extensive questionnaire was also used to construct the saturation sample that focuses on 16 selected schools (about 3000 students). Every student attending these selected schools answered the detailed questionnaire. There are two large schools and 14 other small schools. All schools are racially mixed and are located in major metropolitan areas except one large school that has a high concentration of white adolescents and is located in a rural area. Consequently, fast food consumption may be subject to downward bias if one accepts the argument that the fast food consumption among white adolescents is usually lower than that of black adolescents.

In this paper we use the saturation sample of wave II in-home survey to investigate the presence of peer effects in fast food consumption.23 One of the innovative aspects of this wave is the introduction of the nutrition section. It reports among other things food consumption variables (e.g., fast food, soft drinks, desserts, etc.). This allows us to depict food consumption patterns of each adolescent and relate it to that of his peer group. In addition, the availability of friend nomination allows us to retrace school friends and thus construct friendship networks. To estimate the weight production function, we considered information from wave I, wave II, wave III and wave IV.

We exploit friends nominations to construct the network of friends. Thus, we consider all nominated friends as network members regardless of the reciprocity of the nomination. If an adolescent nominates a friend then a link is assigned between these two adolescents (directed network with non symmetric links).

Table 1
Descriptive statistics.

Variable                                                  Mean      SD
Fast food consumption^a                                   2.33    1.74
Female                                                    0.50    0.50
Age                                                      16.36    1.44
White                                                     0.57    0.49
Black                                                     0.15    0.34
Asian                                                     0.01    0.09
Native                                                    0.13    0.33
Other                                                     0.14    0.35
Mother present                                            0.85    0.35
Mother education
  No high school degree                                   0.15    0.35
  High school/GED/vocational instead of high school       0.36    0.48
  Some college/vocational after high school               0.21    0.39
  College                                                 0.18    0.38
  Advanced degree                                         0.06    0.24
  Don’t know                                              0.04    0.20
Father education
  No high school degree                                   0.16    0.36
  High school/GED/vocational instead of high school       0.33    0.47
  Some college/vocational after high school               0.17    0.37
  College                                                 0.18    0.38
  Advanced degree                                         0.08    0.26
  Don’t know                                              0.06    0.24
  Missing                                                 0.02    0.16
Grade 7–8                                                 0.11    0.32
Grade 9–10                                                0.27    0.44
Grade 11–12                                               0.62    0.48
Allowance per week                                        8.28   11.65
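The construction of a directed friendship network from nominations can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `interaction_matrix`, the student identifiers and the row-normalization step (so that each row averages over nominated friends, as in a linear-in-means setting) are assumptions made here for the example.

```python
import numpy as np


def interaction_matrix(nominations: dict[str, list[str]]) -> tuple[list[str], np.ndarray]:
    """Build a row-normalized directed interaction matrix from friend nominations.

    A link i -> j exists whenever i nominates j, whether or not j nominates i.
    Students who nominate nobody keep a row of zeros.
    """
    ids = sorted(set(nominations) | {f for fs in nominations.values() for f in fs})
    index = {student: k for k, student in enumerate(ids)}
    adjacency = np.zeros((len(ids), len(ids)))
    for student, friends in nominations.items():
        for friend in friends:
            if friend != student:  # ignore self-nominations
                adjacency[index[student], index[friend]] = 1.0
    row_sums = adjacency.sum(axis=1, keepdims=True)
    # Divide each row by its number of nominated friends; leave empty rows at zero.
    G = np.divide(adjacency, row_sums, out=np.zeros_like(adjacency), where=row_sums > 0)
    return ids, G


if __name__ == "__main__":
    # Hypothetical nominations: ann names bob and cat, bob names ann, cat names nobody.
    ids, G = interaction_matrix({"ann": ["bob", "cat"], "bob": ["ann"], "cat": []})
    fast_food_days = np.array([2.0, 4.0, 0.0])  # illustrative weekly frequencies
    print(ids)
    print(G)
    print(G @ fast_food_days)  # mean consumption of each student's nominated friends
```

Because links are not required to be reciprocal, the resulting matrix is generally not symmetric: in the example, ann's row puts weight on cat while cat's row is empty.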
6. Results

6.1. Baseline: OLS peer effects estimates

We first estimate a naive OLS of the peer effects equation where we regress the fast food consumption of an adolescent on the average fast food consumption of his peers, his individual characteristics as well as the average characteristics of his peers. We then apply a panel-like within transformation to account for correlated effects (OLS w). It is clear that the estimates of naive OLS and OLS w are inconsistent. The former ignores both correlated effects and simultaneity problems while the latter ignores simultaneity problems. However, they are reported to provide a baseline for this study.

Estimation results reported in Table 2 show that there is a positive significant peer influence in fast food consumption. According to the naive OLS estimates, an adolescent would increase his weekly frequency (in days) of fast food restaurant visits by 0.21 in response to an extra day of fast food restaurant visits by his friends. On average, this corresponds to an increase of 9% (=0.21/2.33). The OLS w estimate is slightly lower (=0.15, or 6.6%). This reduction in the estimated effect may partly be explained by the fact that adolescents in the same reference group tend to choose a similar level of fast food consumption partly because they face a common environment or because adolescents with similar characteristics tend to attend the same school (homophily). As for the individual characteristics, age, father education and weekly allowance positively affect fast food consumption. Turning our attention to the contextual peer effects, we notice that fast food consumption decreases with mean peers’ mother’s education and increases with mean peers’ father’s education. The former result indicates that friends’ mother education negatively affects an adolescent’s fast food consumption.

6.2. GS-2SLS and QML peer effects estimates

Next, we estimate our peer effects equation with school fixed effects using GS-2SLS (with i.i.d. error terms and without imposing autoregressive disturbances: $\rho = 0$). We then estimate this equation using a QML approach with i.i.d. error terms. Also, we estimate another plausible version of this model by allowing network autoregressive disturbances (GSARAR model).

Estimation results are displayed in Table 3. The GS-2SLS approach (see last two columns) assumes that the instrument for $G^* y^*$ is given by $G^* \hat{y}^*$ (see Eq. 14).27 One can check whether this instrument is weak by regressing $G^* y^*$ on $G^* \hat{y}^*$, $X^*$ and $G^* X^*$ and performing a Stock-Yogo test (see Table 3). It consists in comparing the Cragg-Donald F statistic associated with the estimated coefficient of $G^* \hat{y}^*$ (= 17.80) with its critical value when one assumes a 10% tolerance28 for the size distortion of the 5% Wald test (=16.38). Based on this test, we reject that the instrument is weak. The endogenous effect resulting from GS-2SLS estimation is positive (=0.11 or 4.73%) but non significant. When using the GSAR QML approach, estimation results show a positive endogenous effect of 0.129 (or 5.3%). This estimate turns out to be very close to the one obtained by GS-2SLS and is slightly smaller than the OLS ones obtained in the previous sub-section. It is statistically significant at the 5% level if we perform a one-tail test (one-tail p-value = 0.039) but only at the 10% level if we consider a two-tail test (two-tail p-value = 0.0785).

While this result suggests that one has to be quite cautious when accepting the estimate, it can be argued that one should perform a one-tail test since one expects the endogenous peer effect to be either positive or zero. In that case, the social multiplier associated with an exogenous increase in an adolescent’s fast food consumption is 1.15 (= 1/(1 − 0.129)) and is significantly different from 1 at the 10% level, based on a one-tail test (its standard error is 0.096 using the delta method, with a one-tail p-value of 0.059). This reflects a relatively low endogenous peer effect.

How can we compare these results to those obtained previously in the related literature? Although there are few studies that investigated the presence of peer effects in fast food consumption using the linear-in-means equation, a richer body of literature has investigated a related issue: obesity. As compared with endogenous effects obtained in the literature on obesity, our peer effect is intermediate between studies that obtain no peer effects (Cohen-Cole and Fletcher, 2008b) and the literature that provides evidence that peer effects are strong, for instance, generating a social multiplier larger than 1.5 (e.g., Christakis and Fowler, 2007; Trogdon et al., 2008).29

To check the sensitivity of these results to the presence of SAR disturbances, we also estimate our model using a GSARAR QML specification. The estimated spatial autocorrelation coefficient is negative but not significant at the 5% level. Moreover the endogenous peer effect is large (=0.3655) but no longer significant even at the 10% level (one-tail test). Also a likelihood test does not reject the GSAR QML specification. Therefore we consider the latter as our preferred one. This suggests a much lower endogenous peer effect (=0.13), which can be interpreted as a lower bound to this parameter, at least when assuming that selection on unobservables is not an important source of biases, after controlling for network fixed effects and observable characteristics (see our discussion above).

To sum up, we can say that results in general are consistent with the hypothesis that fast food consumption is linked to issues of interactions with friends. However, our social multiplier estimate does not appear to be very strong (as the endogenous effects are less than 0.3), at least when we consider a specification which seems reasonable. This result, despite its small magnitude, addresses the puzzle around the behavioural channels through which peer effects in weight gain flow. Indeed, while Yakusheva et al. (2014), in their attempt to uncover the channels through which these effects flow, have tested for two behavioural channels, exercise and eating disorders (e.g., anorexia), they could not test for the presence of peer effects in eating habits due to data limitations.

As for estimated individual effects and focusing on the GSAR QML specification, they fairly closely follow the baseline model. Fast food consumption is positively associated with age and father’s education as well as positively associated with weekly allowance. Mother’s education seems to have a negative but non-significant impact on fast food consumption. It is important to note that while the general perception is that fast food is an inferior good, the empirical evidence suggests a positive income elasticity (Aguiar and Hurst, 2005). The positive relation between fast food consumption and allowance is thus in line with the positive relation between income and fast food consumption.

One advantage of our spatial approach is that it allows us to identify both endogenous and contextual peer effects. Turning our attention to the latter, we note in particular that an adolescent’s fast food consumption decreases with peers’ mother’s education

27 The star superscript indicates that the original variable has been transformed to eliminate the problem of singular variance matrix generated by the use of the within transformation to eliminate fixed network effects. See Appendix.
28 This level of tolerance is the smallest one that can be computed given that there is only one excluded instrument.
29 More specifically, Cohen-Cole and Fletcher (2008b) find a statistically insignificant social multiplier of 1.03, Christakis and Fowler (2007) find a statistically significant social multiplier of 2.63 and Trogdon et al. (2008) find a statistically significant social multiplier of 2.08.
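The social multiplier arithmetic discussed in this section can be sketched in a few lines. The example below is a hedged illustration rather than the authors' computation: the function names are invented here, and `se_beta = 0.073` is a hypothetical standard error chosen for the example, since the text reports the coefficient (0.129) but not its standard error. The multiplier is 1/(1 − β), and the delta method scales the coefficient's standard error by the derivative 1/(1 − β)².

```python
import math


def social_multiplier(beta: float, se_beta: float) -> tuple[float, float]:
    """Return the social multiplier 1 / (1 - beta) and its delta-method standard error."""
    if not 0 <= beta < 1:
        raise ValueError("beta must lie in [0, 1) for the multiplier to be well defined")
    multiplier = 1.0 / (1.0 - beta)
    # d/d(beta) of 1 / (1 - beta) is 1 / (1 - beta)**2
    se_multiplier = se_beta / (1.0 - beta) ** 2
    return multiplier, se_multiplier


def upper_tail_p(z: float) -> float:
    """One-tail (upper) p-value under a standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    beta = 0.129     # GSAR QML endogenous peer effect reported in the text
    se_beta = 0.073  # hypothetical standard error, for illustration only
    multiplier, se_multiplier = social_multiplier(beta, se_beta)
    z = (multiplier - 1.0) / se_multiplier  # test of H0: multiplier = 1
    print(round(multiplier, 3), round(se_multiplier, 3), round(upper_tail_p(z), 3))
```

Under these assumed inputs the multiplier rounds to the 1.15 reported above; the p-value printed by the sketch depends on the assumed standard error and need not match the published one.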
Table 2
Peer effects in fast food consumption.
OLS OLSw
N = 2339
* Significant at 10% level.
** Significant at 5% level.
*** Significant at 1% level.
but increases with mean peers’ father’s education. While the former causal effect seems natural as mothers with higher education may (directly and indirectly) encourage both their children and their friends to have better eating habits, the latter effect is rather puzzling. One partial explanation is that fathers with higher education are more likely to be absent from home. Therefore they have less positive influence on their children’s and friends’ eating habits.

6.3. Weight production function estimates

Estimation results presented in the earlier sections are consistent with the presence of peer effects in fast food consumption. Nevertheless, we still need to provide evidence of the presence of a relationship between fast food consumption and weight gain. In this section we report estimates of the weight production function presented earlier.

Results from the estimation of the production function are reported in Table 5. Specification (1) shows baseline OLS estimates of Eq. (8) (where t is replaced by m), specification (2) shows the NLS estimates of Eq. (9), specification (3) shows E-NLS-IV-C estimation results for Eq. (9) and finally specification (4) shows a GMM version of the previous estimator using a two-step approach with an optimal weighting matrix. All specifications are estimated using wave 3 and wave 4, but information from wave 1 and wave 2 is used to construct the instruments.
Table 3
Peer effects in fast food consumption: GSAR, GSARAR and GS-2SLS.

                               GSAR                   GSARAR                 GS-2SLS
                               Coef.        S.E.      Coef.       S.E.       Coef.        S.E.
Individual characteristics
  Female                      −0.0787       0.0589   −0.0861      0.0832    −0.0839       0.0780
  Age                          0.1401***    0.0479    0.1477      0.1386     0.1346**     0.0531
  White                       −0.0622       0.0795   −0.0582      0.2083    −0.0619       0.1169
  Mother present              −0.0319       0.0760   −0.0278      0.1097    −0.0375       0.0973
In line with our expectations, the general results indicate that lagged zBMI and current fast food consumption have a positive significant effect (which is between 0 and 1, in the case of the lagged zBMI) on current zBMI. These results seem to be robust across different specifications with some differences that can be explained by the differences in the assumptions made on the DGP. More specifically, results in specification (1), our baseline specification, are comparable to previous findings by Niemeier et al. (2006) who used the same data set. The impact of lagged zBMI is 0.7591 (compared to 0.7600 for Niemeier et al. (2006)). As for the impact of fast food consumption, it is 0.014 (compared to 0.020 for Niemeier et al. (2006)).30

30 It is important to note that Niemeier et al. (2006) used a different wave and a different approach.
Table 5
Weight production function.
here to take into account the fact that the estimated standard errors are panel clustered). The test rejects the null hypothesis that the instruments are weak.31

7. Conclusion

This paper investigates whether peer effects in adolescent weight partly flow through the eating habits channel. We first attempt to study the presence of significant endogenous peer effects in fast food consumption. New methods based on spatial econometric analysis are used to identify and estimate our model, under the assumption that individuals interact through a friendship social network. Our results indicate that an increase in his friends’ mean fast food consumption induces an adolescent to increase his own fast food consumption. This peer effect amplifies through a social multiplier the impact of any exogenous shock on fast food consumption. However, our estimated social multiplier based on our preferred (conservative) specification is small as it is equal to 1.15.

We also estimate a dynamic weight production function which relates the individual’s Body Mass Index to his fast food consumption. Our results reveal a positive significant impact of a change in fast food consumption on the change in zBMI. Specifically, in the long run, a one-unit increase in the weekly frequency (in days) of fast food consumption produces an increase in zBMI by 4.45%. This effect reaches 5.11% when the social multiplier is taken into account. This suggests the presence of a positive but low endogenous peer effect. In short, our results are intermediate between studies on overweight or obesity that report no peer effects (e.g., Cohen-Cole and Fletcher, 2008b) and others that provide evidence of strong peer effects (e.g., Trogdon et al., 2008; Christakis and Fowler, 2007).

Coupled with the reduction in the relative price of fast food and the increasing availability of fast food restaurants over time, the social multiplier could somewhat increase the prevalence of obesity in the years to come. Conversely, this multiplier may contribute to the decline of the spread of obesity and the decrease in health care costs, as long as it is exploited by policy makers through tax and subsidy reforms encouraging adequate eating habits among adolescents, or used to implement network based interventions to promote healthy eating behaviours (Fletcher et al., 2011).

There are many possible extensions to this paper. From a policy perspective, it would be interesting to investigate the presence of peer effects in physical activity of adolescents. A recent study by Charness and Gneezy (2009) finds that there is room for intervention in people’s decisions to perform physical exercise through financial incentives. It would thus be valuable to investigate whether there is a social multiplier that can be exploited to amplify these effects. Furthermore, in the same way, it would be interesting to study the presence of peer effects in weight perceptions. So far, most of the peer effects work has focused mainly on BMI outcomes. At the methodological level, a possible extension would be to assume a Poisson or a Negative Binomial distribution to account for the count nature of the consumption data at hand. As far as we know, no work has been carried out in this area. Finally, it would be most useful to develop a general approach that would allow same sex and opposite sex peer effects to be different for both males and females.

Appendix A. Quasi maximum likelihood (QML) of the peer effects model

Let us rewrite Eq. (7) for convenience:

$$K_l M_l y_l = \beta K_l M_l G_l y_l + K_l M_l X_l \gamma + K_l M_l G_l X_l \delta + \nu_l.$$

The elimination of fixed network effects using a within transformation leads to a singular variance matrix such that $E(\nu_l \nu_l' \mid X_l, G_l) = K_l K_l' \sigma^2 = K_l \sigma^2$. To resolve this problem of linear dependency between observations, we follow a suggestion by Lee et al. (2010), applied by Lin (2010). Let $[Q_l \; C_l]$ be the orthonormal matrix of $K_l$, where $Q_l$ corresponds to the eigenvalues of 1 and $C_l$ to the eigenvalues of 0. The matrix $Q_l$ has the following properties: $Q_l' Q_l = I_{n_l^*}$, $Q_l Q_l' = K_l$ and $Q_l' \iota = 0$, where $n_l^* = n_l - 1$ with $n_l$ being the number of adolescents in the lth network. Pre-multiplying (7) by $Q_l'$, the structural model can now be written as follows:

$$M_l^* y_l^* = \beta M_l^* G_l^* y_l^* + M_l^* X_l^* \gamma + M_l^* G_l^* X_l^* \delta + \nu_l^*, \qquad (12)$$

where $M_l^* = Q_l' M_l Q_l$, $y_l^* = Q_l' y_l$, $G_l^* = Q_l' G_l Q_l$, $X_l^* = Q_l' X_l$, and $\nu_l^* = Q_l' \nu_l$. With this transformation, our problem of dependency between the observations is solved, since we have $E(\nu_l^* \nu_l^{*\prime} \mid X_l, G_l) = \sigma^2 I_{n_l^*}$.

Assuming that $\nu_l^*$ is an $n_l^*$-dimensional i.i.d. normally distributed disturbance vector, the log-likelihood function of (12) is given by:

$$\ln L = -\frac{n^*}{2} \ln(2\pi\sigma^2) + \sum_{l=1}^{L} \ln\left|I_{n_l^*} - \beta G_l^*\right| + \sum_{l=1}^{L} \ln\left|I_{n_l^*} - \rho M_l^*\right| - \frac{1}{2\sigma^2} \sum_{l=1}^{L} \nu_l^{*\prime} \nu_l^*, \qquad (13)$$

where $n^* = \sum_{l=1}^{L} n_l^* = N - L$, and, from (12), $\nu_l^* = M_l^* (y_l^* - \beta G_l^* y_l^* - X_l^* \gamma - G_l^* X_l^* \delta)$. Maximizing (13) with respect to $(\beta, \gamma', \delta', \rho, \sigma)$ yields the maximum likelihood estimators of the model. Interestingly, the QML method is implemented after the elimination of the network fixed effects. Therefore, the estimators are not subject to the incidental parameter problem that may arise since the number of fixed effects increases with the size of the networks sample. To compute robust standard errors, we use a sandwich form $A^{-1} B A^{-1}$, where A is minus the expectation of the Hessian matrix and B is the expectation of the outer product of the gradient matrix. An advantage of this approach is that it allows us to obtain robust standard errors that are not driven by the normality assumption that ML imposes on the error term.

Appendix B. Generalized spatial two stage least squares (GS-2SLS) of the peer effects model

To estimate the model (12), we also adopt a generalized spatial two-stage least squares procedure presented in Lee et al. (2010). This approach provides a simple and tractable numerical method to obtain asymptotically efficient IV estimators within the class of IV estimators. In the case of our paper this method will consist of a two-step estimation.32 To simplify the notation, let $X^*$ be a block-diagonal matrix with $X_l^*$ on its diagonal, $G^*$ be a block-diagonal matrix with $G_l^*$ on its diagonal, and $y^*$ the concatenated vector of the $y_l^*$’s over all networks.

31 As the model is exactly identified, it is not possible to conduct an over identification test.
32 Note that for this particular case we impose $\rho = 0$ and thus $M_l = I_l$.
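As a numerical check on the within transformation used in Appendix A, the properties stated for the matrix of eigenvectors associated with the unit eigenvalues of the deviation-from-mean projector can be verified directly. The sketch below is only an illustration under stated assumptions: the function name `within_basis` is invented here, the network size of five is arbitrary, and `numpy.linalg.eigh` is used as one convenient way to obtain an orthonormal eigenbasis.

```python
import numpy as np


def within_basis(n: int) -> tuple[np.ndarray, np.ndarray]:
    """Return (K, Q) for a network of n members.

    K is the deviation-from-network-mean projector I - (1/n) * ones * ones'.
    Q collects orthonormal eigenvectors of K with eigenvalue 1, so Q is n x (n - 1).
    """
    ones = np.ones((n, 1))
    K = np.eye(n) - ones @ ones.T / n
    eigenvalues, eigenvectors = np.linalg.eigh(K)  # K is symmetric
    Q = eigenvectors[:, eigenvalues > 0.5]         # drop the single zero eigenvalue
    return K, Q


if __name__ == "__main__":
    n = 5  # hypothetical network size
    K, Q = within_basis(n)
    assert np.allclose(Q.T @ Q, np.eye(n - 1))  # Q'Q = I of dimension n - 1
    assert np.allclose(Q @ Q.T, K)              # QQ' = K
    assert np.allclose(Q.T @ np.ones(n), 0.0)   # Q' annihilates the vector of ones
    y = np.arange(n, dtype=float)               # illustrative outcome vector
    print((Q.T @ y).shape)                      # transformed vector has n - 1 entries
```

Premultiplying by the transpose of this matrix removes the network-specific constant and leaves n − 1 linearly independent observations per network, which is why the transformed disturbances have a nonsingular variance matrix.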
Now, let us denote by $\tilde{X}^*$ the matrix of explanatory variables such that $\tilde{X}^* = [G^* y^* \;\; X^* \;\; G^* X^*]$. Let P be the weighting matrix such that $P = S(S'S)^{-1}S'$, and S a matrix of instruments such that $S = [X^* \;\; G^* X^* \;\; G^{*2} X^*]$. In the first step, we estimate the following 2SLS estimator:

$$\hat{\theta}_1 = (\tilde{X}^{*\prime} P \tilde{X}^*)^{-1} \tilde{X}^{*\prime} P y^*,$$

where $\hat{\theta}_1$ is the first-step 2SLS vector of estimated parameters $(\hat{\gamma}_1', \hat{\delta}_1', \hat{\beta}_1)$ of the structural model. This estimator is consistent but not asymptotically efficient within the class of IV estimators.

Now, in the second step, we estimate a 2SLS using a new matrix of instruments $\hat{Z}$ given by:

$$\hat{Z} = [G^* \hat{y}^* \;\; X^* \;\; G^* X^*],$$

where $G^* \hat{y}^*$ is computed from the first-step 2SLS reduced form (pre-multiplied by $G^*$):

$$G^* \hat{y}^* = G^* (I - \hat{\beta}_1 G^*)^{-1} (X^* \hat{\gamma}_1 + G^* X^* \hat{\delta}_1). \qquad (14)$$

We then estimate:

$$\hat{\theta}_2 = (\hat{Z}' \tilde{X}^*)^{-1} \hat{Z}' y^*.$$

This estimator can be shown to be the consistent and asymptotically best IV estimator. Its asymptotic variance matrix is given by $N[Z' \tilde{X}^* R^{-1} \tilde{X}^{*\prime} Z]^{-1}$. The matrix R is consistently estimated by $\hat{R} = s^2 \hat{Z}'\hat{Z}/N$, where $s^2 = N^{-1} \sum_{i=1}^{N} \hat{u}_i^2$ and $\hat{u}_i$ are the residuals from the second step. It is important to note that, as in Kelejian and Prucha (1998), we assume that errors are homoscedastic. The estimation theory developed by Kelejian and Prucha (1998) under the assumption of homoscedastic errors does not apply if we assume heteroscedastic errors (Kelejian and Prucha, 2010).

References

Aguiar, M., Hurst, E., 2005. Consumption versus expenditure. Journal of Political Economy 113 (5), 919–948.
Ali, M.M., Amialchuk, A., Heiland, F.W., 2011. Weight-related behavior among adolescents: the role of peer effects. PLoS ONE 6 (6), e21179.
Ali, M.M., Amialchuk, A., Renna, F., 2011. Social network and weight misperception among adolescents. Southern Economic Journal 77 (4), 827–842.
Alviola, I.V., Nayga Jr., P.A., Thomsen, R.M.M.R., Danforth, D., Smartt, J., 2014. The effect of fast-food restaurants on childhood obesity: a school level analysis. Economics & Human Biology 12, 110–119.
Anderson, B., Lyon-Callo, S., Fussman, C., Imes, G., Rafferty, A.P., 2011. Peer reviewed: fast-food consumption and obesity among Michigan adults. Preventing Chronic Disease 8 (4).
Anderson, M., Matsa, D., 2011. Are restaurants really supersizing America? American Economic Journal: Applied Economics 3 (1), 152–188.
Anderson, T., Hsiao, C., 1981. Estimation of dynamic models with error components. Journal of the American Statistical Association 76, 598–606.
Arellano, M., Bond, S., 1991. Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies 58 (2), 277–297.
Auld, M.C., Powell, L.M., 2009. Economics of food energy density and adolescent body weight. Economica 76 (304), 719–740.
Badev, A., 2013. Discrete games in endogenous networks: Theory and policy, Working Papers 2-1-2013. University of Pennsylvania Scholarly Commons.
Berentzen, T., Petersen, L., Schnohr, P., Sørensen, T., 2008. Physical activity in leisure-time is not associated with 10-year changes in waist circumference. Scandinavian Journal of Medicine & Science in Sports 18 (6), 719–727.
Bleich, S., Cutler, D., Murray, C., Adams, A., 2008. Why is the developed world obese? Annual Review of Public Health 29, 273–295.
Blume, L., Brock, W., Durlauf, S., Jayaraman, R., 2015. Linear social interactions models. Journal of Political Economy 123 (2), 444–496.
Boucher, V., 2014. Conformism and Self-Selection in Social Networks. Mimeo.
Bramoullé, Y., Djebbari, H., Fortin, B., 2009. Identification of peer effects through social networks. Journal of Econometrics 150 (1), 41–55.
Calabrò, P., Golia, E., Maddaloni, V., Malvezzi, M., Casillo, B., Marotta, C., Calabrò, R., Golino, P., 2009. Adipose tissue-mediated inflammation: the missing link between obesity and cardiovascular disease? Internal and Emergency Medicine 4 (1), 25–34.
Calle, E., 2007. Obesity and cancer. British Medical Journal 335 (7630), 1107–1108.
Caraher, M., Cowburn, G., 2007. Taxing food: implications for public health nutrition. Public Health Nutrition 8 (08), 1242–1249.
Charness, G., Gneezy, U., 2009. Incentives to exercise. Econometrica 77 (3), 909–931.
Christakis, N.A., Fowler, J.H., 2013. Social contagion theory: examining dynamic social networks and human behavior. Statistics in Medicine 32 (4), 556–577.
Christakis, N., Fowler, J., 2007. The spread of obesity in a large social network over 32 years. New England Journal of Medicine 357 (4), 370–379.
Cliff, A., Ord, J., 1981. Spatial Processes: Models & Applications. Pion Ltd.
Cohen-Cole, E., Fletcher, J., 2008a. Detecting implausible social network effects in acne, height, and headaches: longitudinal analysis. British Medical Journal 337, a2533.
Cohen-Cole, E., Fletcher, J., 2008b. Is obesity contagious? Social networks vs. environmental factors in the obesity epidemic. Journal of Health Economics 27 (5), 1143–1406.
Currie, J., DellaVigna, S., Moretti, E., Pathania, V., 2010. The effect of fast food restaurants on obesity and weight gain. American Economic Journal: Economic Policy 2, 34–65.
Cutler, D., Glaeser, E., Shapiro, J., 2003. Why have Americans become more obese? Journal of Economic Perspectives 17, 93–118.
De la Haye, K., Robins, G., Mohr, P., Wilson, C., 2010. Obesity-related behaviors in adolescent friendship networks. Social Networks 32 (3), 161–167.
Dunn, R.A., Sharkey, J.R., Horel, S., 2012. The effect of fast-food availability on fast-food consumption and obesity among rural residents: an analysis by race/ethnicity. Economics & Human Biology 10 (1), 1–13.
Everaert, G., 2013. Orthogonal to backward mean transformation for dynamic panel data models. Econometrics Journal 16, 179–221.
Finkelstein, E.A., Trogdon, J.G., Cohen, J.W., Dietz, W., 2009. Annual medical spending attributable to obesity: payer- and service-specific estimates. Health Affairs 28 (5), w822–w831.
Fletcher, A., Bonell, C., Sorhaindo, A., 2011. You are what your friends eat: systematic review of social network analyses of young people’s eating behaviours and bodyweight. Journal of Epidemiology and Community Health, jech-2010.
Fowler, J., Christakis, N., 2008. Estimating peer effects on health in social networks. Journal of Health Economics 27 (5), 1400–1405.
Goldsmith-Pinkham, P., Imbens, G.W., 2013. Social networks and the identification of peer effects. Journal of Business and Economic Statistics 31 (3), 253–264.
Hausman, J.A., Taylor, W.E., 1981. Panel data and unobservable individual effects. Econometrica 49 (6), 1377–1398.
Hsieh, C.-S.-C., Lee, L.-F., 2011. A social interactions model with endogenous friendship formation and selectivity, Working paper. Mimeo.
Kelejian, H., Prucha, I., 1998. A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. Journal of Real Estate Finance and Economics 17 (1), 99–121.
Kelejian, H., Prucha, I., 2010. Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. Journal of Econometrics 157, 53–67.
Lee, L., 2003. Best spatial two-stage least squares estimators for a spatial autoregressive model with autoregressive disturbances. Econometric Reviews 22 (4), 307–335.
Lee, L.-F., Liu, X., Lin, X., 2010. Specification and estimation of social interaction models with networks structure. Econometrics Journal 13 (2), 143–176.
Levitsky, D., Halbmaier, C., Mrdjenovic, G., 2004. The freshman weight gain: a model for the study of the epidemic of obesity. International Journal of Obesity 28 (11), 1435–1442.
Lin, X., 2010. Identifying peer effects in student academic achievement by spatial autoregressive models with group unobservables. Journal of Labor Economics 28 (4), 825–860.
Liu, X., Patacchini, E., Rainone, E., 2013. The allocation of time in sleep: a social network model with sampled data, Working Papers w162. Center For Policy Research, The Maxwell School.
Lyons, R., 2011. The spread of evidence-poor medicine via flawed social-network analysis. Statistics, Politics, and Policy 2 (1), 2.
Maggio, C., Pi-Sunyer, F., 2003. Obesity and type 2 diabetes. Endocrinology and Metabolism Clinics of North America 32 (4), 805–822.
Manski, C.F., 1993. Identification of endogenous social effects: the reflection problem. Review of Economic Studies 60 (3), 531–542.
Millimet, D., McDonough, I., 2013. Dynamic panel data models with irregular spacing: with applications to early childhood development, Working paper.
Niemeier, H., Raynor, H., Lloyd-Richardson, E., Rogers, M., Wing, R., 2006. Fast food consumption and breakfast skipping: predictors of weight gain from adolescence to adulthood in a nationally representative sample. Journal of Adolescent Health 39 (6), 842–849.
Ogden, C.L., Carroll, M.D., Kit, B.K., Flegal, K.M., 2012. Prevalence of obesity and trends in body mass index among US children and adolescents, 1999–2010. Journal of the American Medical Association 307 (5), 483–490.
Powell, L., Bao, Y., 2009. Food prices, access to food outlets and child weight outcomes: a longitudinal analysis. Economics and Human Biology 7, 64–72.
Powell, L., Chriqui, J., Khan, T., Wada, R., Chaloupka, F., 2013. Assessing the potential effectiveness of food and beverage taxes and subsidies for improving public health: a systematic review of prices, demand and body weight outcomes. Obesity Reviews 14 (2), 110–128.
Powell, L.M., 2009. Fast food costs and adolescent body mass index: evidence from panel data. Journal of Health Economics 28 (5), 963–970.
Renna, F., Grafova, I.B., Thakur, N., 2008. The effect of friends on adolescent body weight. Economics and Human Biology 6 (3), 377–387.
Rosenheck, R., 2008. Fast food consumption and increased caloric intake: a systematic review of a trajectory towards weight gain and obesity risk. Obesity Reviews 9 (6), 535–547.
138 B. Fortin, M. Yazbeck / Journal of Health Economics 42 (2015) 125–138
Shalizi, C., Thomas, A., 2011. Homophily and contagion are generically confounded Yakusheva, O., Kapinos, K.A., Eisenberg, D., 2014. Estimating heterogeneous
in observational social network studies. Sociological Methods & Research 40 (2), and hierarchical peer effects on body weight using roommate assign-
211. ments as a natural experiment. Journal of Human Resources 49 (1),
Trogdon, J.G., Nonnemaker, J., Pais, J., 2008. Peer effects in adolescent overweight. 234–261.
Journal of Health Economics 27 (5), 1388–1399. Yakusheva, O., Kapinos, K., Weiss, M., 2011. Peer effects and the freshman 15:
VanderWeele, T., 2011. Sensitivity analysis for contagion effects in social networks. evidence from a natural experiment. Economics & Human Biology 9 (2),
Sociological Methods & Research 40 (2), 240. 119–132.
Journal of Health Economics 42 (2015) 139–150
Article info
Article history:
Received 10 December 2014
Received in revised form 20 March 2015
Accepted 28 March 2015
Available online 6 April 2015
Keywords:
Mental health care
Provider payment
Regression discontinuity design
Policy evaluation
Regulated competition
The Netherlands
JEL classification:
I11
I12
I18
Abstract
We evaluate the introduction of a reimbursement schedule for self-employed mental health care providers in the Netherlands in 2008. The reimbursement schedule follows a discontinuous discrete step function: once the provider has passed a treatment duration threshold the fee is flat until a next threshold is reached. We use administrative mental health care data of the total Dutch population from 2008 to 2010. We find an “efficiency” effect: on the flat part of the fee schedule providers reduce treatment duration by 2 to 7% compared to a control group. However, we also find unintended effects: providers treat patients longer to reach a next threshold and obtain a higher fee. The data shows gaps and bunches in the distribution function of treatment durations, just before and after a threshold. About 11 to 13% of treatments are shifted over a next threshold, resulting in a cost increase of approximately 7 to 9%.
© 2015 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhealeco.2015.03.008
0167-6296/© 2015 Elsevier B.V. All rights reserved.
140 R. Douven et al. / Journal of Health Economics 42 (2015) 139–150
(for excellent overviews see Chandra et al., 2012; McGuire, 2000). Most empirical evidence concerns the US and shows that fee-for-service payment provides incentives for overtreatment. Some of the first papers on this topic are Epstein et al. (1986), Hickson et al. (1987) and Stearns et al. (1992). Recently, in the Netherlands, similar behavioral responses have also been reported since the introduction of regulated competition in the Dutch hospital market (Douven et al., 2015) and the market for general practitioners (van Dijk et al., 2013). Less research has been done on case-mix-based funding in the mental health care market (Mason and Goddard, 2009). In the US, Jennison and Ellis (1987) found an 18% increase in the rate of visits per mental health provider per month when they shifted from a salaried basis to a fee-for-service basis. Rosenthal (2000) examined the effects of risk sharing with mental health care providers. She found that providers that received a salary reduced their number of visits by 20 to 25% compared to providers who were still paid for each visit. Bellows and Halpin (2008) studied the impact of Medicaid reimbursement on mental health quality indicators and found evidence of upcoding of quality indicators to increase reimbursement.
This is the first study to evaluate the introduction of a new reimbursement schedule in mental health care in the Netherlands. The reimbursement function follows a discontinuous discrete step function: once the provider has passed a treatment duration threshold the fee does not increase until a next threshold is reached. We look at two effects: efficiency and unintended effects. Our study shows that the unintended effects (i.e. providers treat patients longer to reach a next threshold and obtain a higher fee) outweigh the efficiency effect (i.e. on the flat part of the fee schedule providers treat patients shorter and prolong treatment only if marginal benefits to patients outweigh marginal costs). We separate out these two effects by using regression discontinuity design (see e.g. Lee and Lemieux, 2010)3. Providers’ behavior around discontinuous fee thresholds is most likely explained by the change in fee, and not by other contemporaneous factors such as medical quality, treatment outcome, location or other unobserved factors. We use a quasi-experimental design in which 10% of all mental health care providers are paid according to the new reimbursement schedule, while 90% of providers were not subject to the reform. This latter group serves as a control group. We find an efficiency effect: we estimate a reduction in treatment duration by 2 to 7% and lower costs by 3 to 6% compared to a control group. However, we also find unintended effects: in total, about 11 to 13% of treatments are shifted over a next threshold, resulting in a cost increase of approximately 7 to 9%.
The outline of our paper is as follows. Section 2 provides a concise overview of the Dutch mental health care system. Section 3 describes the economic theory relating to the new reimbursement schedule. Section 4 describes the data and Section 5 presents the estimation methods. Section 6 presents the results and Section 7 concludes.

2. The Dutch mental health care system

primary care, which is provided by a general practitioner, psychologist, psychotherapist or psychiatrist4. Patients with a more serious condition need specialized care and are referred to secondary care. Secondary care is split into curative care and long-term care. Long-term care patients usually remain in an institution such as a residence or other kind of mental health facility for longer than a year. Our study focuses on patients who receive curative care. They can receive care in an inpatient or outpatient setting and their treatment does not last longer than a year.
The reform to regulated competition in 2008 required many changes for providers, health insurers and regulators. The government decided upon a transition period between 2008 and 2010, in which health insurers became responsible for the services of mental health care providers. However, during the transition period insurers did not incur financial risk on providing mental health care5. Since 2008, providers are reimbursed on their case mix, called a DBC (Diagnosis Treatment Combination). A DBC refers to the complete treatment episode of a patient. It starts with the initial consultation and continues until the provider ends the treatment. Consider for example a patient with mild depression who for ten months receives 60 min of individual therapy each month from a psychotherapist (and no other form of medication or treatment). This patient’s treatment can be coded with the following DBC: “Depression, 250 to 800 min, no medication” (DBC Onderhoud, 2013). If a treatment episode lasts longer than one year, the DBC is closed automatically. After that year a new DBC is opened. With the closed DBC a provider can receive reimbursement from his patient’s health care insurer. The fee covers all labor and capital costs related to the treatment episode. The reimbursement fee for a DBC was fixed during our period of study and set prospectively by the Dutch Healthcare Authority (NZa). Patients’ out-of-pocket payments were limited6.
Most mental health providers worked in large regional institutions in the period under consideration. These institutions can be a regional facility for ambulatory care, but also a specialized psychiatric hospital. Often, many different types of mental health care specialists work together. Their payment was before (and after) 2008 still based on annual budgets. These budgets were based on expected case mix and several regional budget parameters (such as inflation, wages, capital costs etc.). Mental health care specialists who work at a budgeted institution received a fixed salary7. Negotiations with health care insurers only took place with the dominant health insurer in the geographical region between 2008 and 2010. These mostly large mental health care institutions account for about 90% of the sector (NZa, 2012). Henceforth, we will use these ‘budgeted’ or B providers in our study as a control group because their individual salaries during 2008–2010 were not related to the new reimbursement schedule.
About 10% of the mental health care providers chose to work independently, e.g. in private practices. Only this group of self-employed providers, and new providers that entered the market after January 1st of 2008, received their income according to the new reimbursement schedule. Contrary to B providers, the
12 Bajari et al. (2011) perform a similar analysis with figures.
13 The size of the spike has to be determined empirically. Around kl, locally holds ∂Pi(xi)/∂xi = −∞, implying an infinite spike. However, in practice the decision to prolong treatment is more discrete in nature. For treatment durations substantially shorter than the threshold kl, the provider has to trade off the costs associated with treating the patient longer against the size of the fee difference, equal to Pl+1 − Pl.
14 Suppose treatment duration is at a local optimum. The farther away this treatment duration is from a threshold duration k, the more costly it will be for a provider to move to the threshold k.
15 An exception could be a provider with (too) many patients in his practice. Such a provider may have a financial incentive to end a treatment after hitting a threshold because treating a new patient may be more rewarding (in terms of profits and total patient benefits). Vice versa, a provider with a shortage of patients may have an incentive to prolong treatment duration, securing his financial income. In our analysis we assume that these are second order effects.
16 Treatment durations below 250 min belong to primary mental health care. About 8% of the treatments had a treatment duration longer than 4000 min. Note that 4000 min is well below 6000 min, so estimation errors that occur because providers prolong treatment duration to 6000 min are likely to be small.
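The trade-off in footnotes 13 and 14 can be illustrated with a minimal sketch. The duration thresholds (250, 800, 1800, 3000 and 6000 min) come from the text; the fee levels, the constant cost per minute and the function names are hypothetical and serve only to show the mechanics of a step-function fee. Treating a duration exactly at a threshold as belonging to the higher step is also an assumption.

```python
from bisect import bisect_right

# Duration thresholds in minutes, as mentioned in the text.
THRESHOLDS = [250, 800, 1800, 3000, 6000]
# Hypothetical flat fees (euro) per step; the actual NZa tariffs are not given here.
FEES = [0, 900, 1700, 3000, 5200, 9000]


def fee(duration_min):
    """Flat fee for a treatment of the given duration (step function)."""
    return FEES[bisect_right(THRESHOLDS, duration_min)]


def gain_from_prolonging(duration_min, cost_per_min):
    """Fee jump at the next threshold minus the (hypothetical) cost of the extra minutes.

    Returns None when the treatment is already past the last threshold.
    """
    step = bisect_right(THRESHOLDS, duration_min)
    if step == len(THRESHOLDS):
        return None
    extra_minutes = THRESHOLDS[step] - duration_min
    fee_jump = FEES[step + 1] - FEES[step]
    return fee_jump - cost_per_min * extra_minutes


if __name__ == "__main__":
    # Close to a threshold, prolonging pays off; far away it does not (footnote 14).
    for duration in (750, 300):
        print(duration, fee(duration), gain_from_prolonging(duration, 2.0))
```

Under these made-up numbers a provider close to the 800 min threshold gains from prolonging treatment, while one far below it does not, which is the pattern that produces gaps just before and bunches just after a threshold.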
Table 1
Description of data.*
Table 2
Type of provider and number of DBCs (years 2008–2010).*
Table 3
Number of observations in various subsamples (years 2008–2010).**
providers, for NB providers we observe large gaps and spikes at thresholds. Similar figures are obtained if we plot subsamples of our dataset.
To estimate whether B providers treat on average longer or shorter than NB providers we use ideas from regression discontinuity design (RDD)21. However, while RDD studies use local linear smoothing around single thresholds to determine non-linear responses, we have reasonably large bunches and gaps at several thresholds that may be connected22. Therefore, we use a global estimation approach which allows us to estimate in one step the distribution functions for both types of providers.
We fit the non-linear regression equation (5) for each mental disorder category i and provider type j (in what follows we omit i, j):

$Y_t = f(\beta) + \eta_t$ with $\eta_t = B_t - G_t + \varepsilon_t$ (5)

where Y_t, t = 3, 3.5, 4, ..., 39, 39.5, is the distribution function of treatment durations defined in treatment duration classes of 50 min23. Like Lee and Lemieux (2010) we assume that all factors evolve “smoothly”. If there are no discontinuities (G_t = 0, B_t = 0) in the reimbursement schedule, f(β) would be a reasonable guess for explaining Y_t. This assumption is confirmed by estimates of f(β) for the distribution function of B providers.
In standard RDD applications, sudden shifts in the outcome variable result from an exogenous change. In this study we have the same. Bunches and gaps in treatment durations of the NB providers are caused by exogenous changes in the fee structure, and

21 RDD studies related to health care include Card et al. (2008, 2009), who study the discontinuity of health care utilization around age 65 when US citizens become eligible for Medicare. Sojourner et al. (2013) use RDD to study the effects of unionization of nursing homes. Shi (2013) finds evidence of income manipulation when studying labor supply responses to income cutoffs of a subsidized health insurance program in Massachusetts. Einav et al. (2013) study the response of drug expenditure to non-linear contracts in Medicare part D. These studies are all related to consumer responses. Our study is about provider responses and more related to Bajari et al. (2011), who study hospitals’ responses to discontinuities in linear reimbursement schedules. Their identification strategy is much more complicated than in our paper because reimbursement schedules are only discontinuous in the first derivative, and thresholds are not fixed but may differ across hospitals.
22 For example, combining several separate local linear estimation procedures to one distribution function may not necessarily result in a smooth function.
23 Thus, Y_3 represents all treatment durations in the 300–350 min time interval and Y_39.5 those in the 3950–4000 min time interval. The size of the surface of all distributions is normalized to 1. Note that we performed our analysis also for classes of 100 min time intervals. This yielded similar, but slightly less stable, results.
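The construction of Y_t described in footnote 23 can be sketched as follows. This is only an illustration: the function name, the treatment of interval boundaries (left-closed classes) and the reading of "normalized to 1" as shares summing to one are assumptions, not details taken from the paper.

```python
from collections import Counter


def duration_distribution(durations_min, lo=300, hi=4000, width=50):
    """Share of treatments per duration class of `width` minutes.

    Keys are class starts expressed in units of 100 min (3.0, 3.5, ..., 39.5),
    so the key 3.0 covers 300-350 min and 39.5 covers 3950-4000 min.
    Durations outside [lo, hi) are ignored; the shares sum to 1.
    """
    counts = Counter()
    for d in durations_min:
        if lo <= d < hi:
            counts[int((d - lo) // width)] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no durations inside the sample range")
    n_classes = (hi - lo) // width  # 74 classes for the default settings
    return {(lo + k * width) / 100: counts[k] / total for k in range(n_classes)}


if __name__ == "__main__":
    sample = [300, 349, 350, 3999, 4000, 100]  # made-up durations in minutes
    y = duration_distribution(sample)
    print(len(y), y[3.0], y[3.5], y[39.5])
```

With the default settings the function returns 74 classes, which matches the 74 observations used in the degrees-of-freedom correction of equation (8).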
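$\min_{\beta} \sum_{t=3}^{39.5} w_t \left( Y_t - f(\beta) \right)^2$ with restrictions:

$\sum_{t=5}^{10.5} \left( Y_t - f(\beta) \right) = 0, \quad \sum_{t=11}^{20.5} \left( Y_t - f(\beta) \right) = 0, \quad \sum_{t=21}^{32.5} \left( Y_t - f(\beta) \right) = 0, \quad \sum_{t=3}^{39.5} \left( Y_t - f(\beta) \right) = 0$ (7)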
our theory in Section 2: bunching after a threshold occurs through a shift of treatment durations from before to after a threshold.
t , G[30] = − t , B[30] = t .
18 21 30

24 Using standard smoothing functions in econometric software programs does not work here because these functions ‘try to explain’ the bunches and gaps as well.
26 This implies that all systematic shifts are explained by the three previous restrictions, that no treatments with a duration between 300 and 500 min are shifted over the 800 min threshold, and that none between 3300 and 4000 min are shifted over the 6000 min threshold.
27 In most cases we used w_t = 1; however, we sometimes experimented with somewhat higher weights to obtain smooth convergence. We performed our optimizations with the numerical non-linear global optimization function “NMinimize” of the software program Mathematica. To obtain convergence we sometimes had to
RDD-estimations at each individual threshold. The global approach allows us to connect the “bunches” and “gaps” estimates at individual thresholds, and convergence of our estimation procedure will only occur if our assumption of equal gaps and bunches is supported by the data.
Minimization procedure (7) generates β̂_1, ..., β̂_6. This allows us to compute η̂_t = Y_t − f(β̂). Next, we can compute our estimates for the gaps and bunches.
In order to present the significance of our estimates for bunches and gaps we need an estimate for our error term ε_t in (5). Because our computation does not allow us to compute B̂_t and Ĝ_t in (5) separately for each t, we cannot properly estimate the random error term ε_t. Therefore we assume ε̂_t = η̂^B_t, where η̂^B_t are the estimated errors of the budgeted providers after estimating (5). Thus, we assume the standard error of the non-budgeted providers s^NB in (5) equals the standard error of the budgeted providers s^B:28

$s^{NB} = s^{B} = \sqrt{\frac{1}{74 - 6} \sum_t \left( \hat{\eta}^{B}_t \right)^2}$ (8)

We use a 68 degrees of freedom correction (see e.g. Verbeek, 2004), 74 minus 6 (parameters β to estimate in (8)). After obtaining these statistics we can derive additional statistics such as an estimate of the average treatment duration, prolongation time as a result of shifting treatments and associated costs.

6. Estimation results

In this section we present our estimation results for all the samples described in Table 3. We first show our results graphically in Fig. 6 for the three samples 1, 2 and 2a in Table 3: “total sample”, “depression” and the subsample “depression with similar patient characteristics (GAF-scores 41–70)”. Fig. 6 contains three panels for each sample. The first panels, Fig. 6a, d and g, show Y_t and the corresponding estimate f(β̂) of the B provider, from which we will derive an estimate for our standard error. The estimates indicate that our exponential identification in (6) can fit f(β) to Y_t very well. The middle panels, Fig. 6b, e and h, indicate the unintended effects. Bunches and gaps are present in all three samples. The size of bunches and gaps is remarkably stable across subsamples. Bunches and gaps are largest (and significant) at the first two thresholds of 800 and 1800 min and positive at the threshold of 3000 min in all cases29. Differences in treatment duration between both providers, after controlling for the bunches and gaps, can be seen in the right three panels, Fig. 6c, f and i. For the total sample and depression sample (panels c and f) we observe large effects; on average NB providers treat patients much shorter than B providers. However, this effect almost disappears in the case of patients with similar characteristics (panel i). Controlling for patient characteristics is therefore crucial to identify possible differences in treatment duration between B and NB providers.
The estimation results of the three subsamples are summarized in Table 4. The first column presents the unintended effects: the percentage of treatments that are shifted over each of the three thresholds. In total about 11–13% of treatments are shifted over a next threshold. The second column in Table 4 presents average treatment duration. The difference between f(β̂) and Y_t for B providers is small, confirming the good fit and resulting in small standard errors s^B30. For NB providers the average treatment duration corresponding to f(β̂) is 19–24 min lower than Y_t, indicating that the increase in average treatment duration as a result of bunching is relatively small31. Note the large difference in average treatment duration between B and NB providers in the “Total sample”, 22.2%, and “Total sample depression”, 24.2%, indicating that B providers on average treat sicker patients. After controlling for patient characteristics (“Subsample depression, GAF scores 41–70”) the difference in treatment duration shrinks to 2.2%. In the third column of Table 4 we present average treatment costs. The unintended effects increase average costs per treatment by 137 to 157 euro, or a cost increase of 7.1 to 7.9%. The efficiency effect for the “Total Sample Depression, GAF scores 41–70” yields that on average treatments are 3.3% (or 39 euro) more expensive for B than NB providers. This effect is however more than offset by the unintended effects; summing both effects yields that NB providers treat on average patients 165–39 = 126 euro more expensive than B providers32.
In addition to the three subsamples, we have also looked into other mental illnesses (see Table 3 for the subsamples and the number of observations in each subsample). We performed the same estimations for these sixteen subsamples. The results are reported in Table 5. Columns (1)–(3) present the volume effects. Column (1) represents the size of the unintended effects: the percentage of treatments that are shifted over a next threshold. Column (2) shows the average treatment duration for the actual distribution Y_t and the estimated distribution f(β̂), and column (3) shows the differences in treatment duration or the “efficiency” effect: the percent change in treatment duration between NB and B providers. Columns (4)–(6) show the same effects but now for fees. Column (4) shows the average fee of a treatment for Y_t and f(β̂). Column (5) presents the unintended cost effects: the percent difference between the two variables. Finally, column (6) represents the cost difference related to the “efficiency” effect between NB and B providers.
The results in Table 5 confirm our previous findings. First of all, we observe that the unintended effects (column (1)) are present in all subsamples. The effects are fairly stable across all our subsamples and vary roughly between 11 and 13%, with some outliers33. This corresponds with a cost increase that varies between approximately 7 and 9% (column (5)). The efficiency effect in column (3) shows that B providers treat patients approximately 2–7% longer than NB providers, with corresponding cost increases of approximately 3–6% (column (6))34. Thus, for almost all cases we find that the

alter the minimization method in Mathematica (gradient-based and direct search methods), weights and starting values.
28 We make the reasonable assumption that the random errors and corresponding standard deviations s^B and s^NB are of the same order of magnitude. If there are small systematic errors in η̂^B_t we will overstate s^NB. Note that we calculate s^B from a Y_t distribution that has the same number of observations as the corresponding Y_t distribution of the NB providers.
29 The effects are smaller around the 3000 min threshold; there are fewer observations and it may be the case that the marginal benefit to patients is closer to zero (i.e. “flat of the curve”).
30 For smaller subsamples the graph Y_t is less smooth, increasing the size of the standard error s^B.
31 The average prolongation of treatment duration for treatments that are shifted over a next threshold is about 200 min.
32 We have tested the significance of the efficiency effect with the non-parametric Kolmogorov–Smirnov test. It rejected the hypothesis of similar f(β̂) distribution functions for B and NB providers in the first two samples in Table 4. However, it does not reject the hypothesis in the third sample. We therefore test various different subsamples in Table 5.
33 The estimation results for the unintended effects are all significant at the 0.01 level.
34 Only for the subsample adjustment disorders GAF: 61–70 do we find a 0.4 higher average treatment duration for NB providers. The efficiency effects are not significant at a 0.05 level (see footnote 29). However, we still conclude that efficiency effects are present in our data because we repeated our estimations many times (see column (3)) and our data covers the complete sample.
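The degrees-of-freedom correction behind equation (8) can be written out as a short sketch. It assumes that the standard error is the square root of the sum of squared residuals of the B providers divided by 74 − 6; the function name and the example residuals are hypothetical.

```python
import math


def residual_standard_error(residuals, n_params=6):
    """Square root of the residual sum of squares divided by (n - n_params)."""
    dof = len(residuals) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    return math.sqrt(sum(e * e for e in residuals) / dof)


if __name__ == "__main__":
    # 74 made-up residuals, one per 50-min duration class (t = 3, 3.5, ..., 39.5).
    eta_hat_b = [0.001, -0.001] * 37
    print(residual_standard_error(eta_hat_b))
```

The 74 residuals correspond to the 74 duration classes, and the 6 subtracted parameters are the elements of β, giving the 68 degrees of freedom mentioned in the text.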
Fig. 6. (a–c) Total sample. (d–f) Total sample depression. (g–i) Subsample depression, GAF scores 41–70.
Table 4
Estimation results for “total sample”, “total sample depression” and “total sample depression (GAF scores 41–70)”.a
Table 5
Estimation results for subsamples 2a–d, 3a–d, 4a–d, 5a–d (see Table 3).

Subsample             (1) Bunches,   (2) Avg. treatment   (3) “Efficiency”   (4) Avg. treatment   (5) Unintended   (6) “Efficiency”
                      gaps (%)       duration (min)       effect (%)         costs (euro)         effect (%)       effect (%)
                                     Y_t      f(β̂)                           Y_t      f(β̂)
Depression
  2a. GAF: 41–70      12.9           1204     1185        −3.3               2149     1986        8.2              −2.9
  2b. GAF: 41–50      14.1           1330     1300        −2.3               2353     2152        9.4              −2.4
  2c. GAF: 51–60      13.1           1215     1189        −4.0               2162     1987        8.8              −3.5
  2d. GAF: 61–70      11.8           1105     1083        −3.9               1996     1844        8.2              −3.2
Anxiety disorders
  3a. GAF: 41–70      12.1           1186     1161        −7.9               2094     1926        8.7              −7.5
  3b. GAF: 41–50      11.3           1325     1303        −7.1               2327     2146        8.4              −6.9
  3c. GAF: 51–60      12.8           1201     1175        −8.4               2118     1942        9.1              −8.0
  3d. GAF: 61–70      11.4           1096     1076        −7.0               1944     1800        8.0              −6.6
Adjustment disorders
  4a. GAF: 41–70      10.6           1054     1039        −2.2               1761     1645        7.1              −2.1
  4b. GAF: 41–50      10.6           1215     1174        −2.8               2010     1839        9.3              −2.1
  4c. GAF: 51–60      10.8           1061     1042        −4.5               1771     1646        7.6              −4.3
  4d. GAF: 61–70      9.5            1000     984         0.4                1682     1572        7.0              0.3
Personality disorders
  5a. GAF: 41–70      11.6           1391     1372        −5.5               2435     2251        8.1              −5.1
  5b. GAF: 41–50      11.9           1495     1475        −5.2               2598     2402        8.2              −5.2
  5c. GAF: 51–60      12.3           1422     1402        −5.3               2489     2290        8.7              −5.3
  5d. GAF: 61–70      10.4           1286     1277        −5.2               2265     2117        7.0              −4.4
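As a reading aid for Table 5, the sketch below recomputes column (5) from the two cost figures in column (4) for the depression subsamples. It assumes that the unintended cost effect is the percent difference between the average cost under the actual distribution Y_t and under the estimated distribution f(β̂), taken relative to the latter; the function and variable names are illustrative.

```python
# (subsample, avg. cost under Y_t, avg. cost under f(beta_hat), reported column (5))
# Values copied from the depression rows of Table 5.
ROWS = [
    ("2a", 2149, 1986, 8.2),
    ("2b", 2353, 2152, 9.4),
    ("2c", 2162, 1987, 8.8),
    ("2d", 1996, 1844, 8.2),
]


def unintended_cost_effect(actual_cost, smooth_cost):
    """Percent cost increase of the actual distribution over the smooth estimate."""
    return 100.0 * (actual_cost - smooth_cost) / smooth_cost


if __name__ == "__main__":
    for label, actual, smooth, reported in ROWS:
        computed = unintended_cost_effect(actual, smooth)
        print(f"{label}: computed {computed:.1f}%, reported {reported:.1f}%")
```

The recomputed percentages agree with the reported ones to within about 0.1 percentage point, a gap that is consistent with the rounding of the euro amounts in column (4).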
marginal loss line d is situated somewhat below the line α_j c (see Fig. 3 in Section 3). The unintended financial effects in column (5) are in all cases larger than the “efficiency” effects (column (6)).
To conclude, the unintended effects appear very clear in the data and are very stable across all subsamples. The “efficiency” effects are smaller and less certain because these effects are estimated by comparing B and NB providers. A limitation of our measure for the “efficiency” effects could be that there is still unobserved variation in the treatment and control group that we do not capture adequately. For example, we may have overestimated the efficiency effects if NB providers select more low severity patients (even for groups with similar GAF scores)35. Another possibility is that our “efficiency” effect captures not genuine efficiency but quality differences in outcome between B and NB providers. In future research we may be able to address some of these points if more information becomes available.

35 If this were the case then we would expect that selection effects are greater for less severe patients (groups with high GAF-scores). However, we do not observe that efficiency effects are stronger for those groups (see Table 5).

7. Discussion

We have evaluated the implementation of a new reimbursement schedule in Dutch mental health care. The reimbursement schedule follows a discontinuous discrete step function: once the provider has passed a treatment duration threshold the fee is flat until a next threshold is reached. We find an “efficiency” effect: on the flat part of the fee schedule providers prolong treatment only if marginal benefits to patients outweigh marginal costs. We estimate a reduction in treatment duration by 2 to 7% and lower costs by 3 to 6% compared to a control group. However, we also find unintended effects: providers treat patients longer to reach a next threshold and obtain a higher fee. The data shows gaps and bunches in the distribution function of treatment durations, just before and after a threshold. In total, about 11 to 13% of treatments are shifted over a next threshold, resulting in a cost increase of approximately 7 to 9%.
An important message of our study is that the unintended effects clearly demonstrate that mental health care providers react to financial incentives. Since financial rewards are high, one would expect that NB providers would anticipate ex-ante on the thresholds in the reimbursement schedule. Indeed, an article in a Dutch newspaper suggests that some providers have institutionalized the number of therapy sessions. A psychologist stated: “Our institution has calculated that each patient should receive eight or sixteen sessions. This would be financially very attractive but not be seen as fraud” (Effting, 2015).
Monitoring providers’ behavior is therefore an important element for the system to function properly. In the Dutch system of regulated competition health insurers have the role to discipline providers. However, until 2014 health insurers lacked information about the exact treatment duration of health care providers. They received only global information on treatment duration of individual providers, i.e. they received only information between which two treatment duration thresholds the provider performed the treatment, and not the exact treatment time. Thus, insurers had no possibility to perform the same analysis as we carried out in this paper. This is now gradually changing; since 2014 health insurers obtain exact information about treatment durations and are also becoming more financially responsible for mental health care cost containment.
We measure an “efficiency” effect. However, we cannot be certain that we measure genuine efficiency since we cannot rule out the possibility that patients may also have received too little care. Our efficiency arguments do hold if we assume α_j = 1 in our utility function (1), which is a fairly standard assumption (McGuire, 2000). In that case NB providers produce cost-efficiently on the flat part of the reimbursement schedule and bunching corresponds to overtreatment. Efficiency differences between B and NB providers could also be related to differences in practice styles or quality of treatments (see e.g. Chandra et al., 2012). To address these issues more properly, quality information about treatments would be necessary.
In 2014, the Dutch government decided to pay B providers also according to the new reimbursement schedule. Our findings suggest that this policy may lead to higher costs since the higher costs associated with the unintended effects outweigh the lower costs
of the efficiency effect. However, an important difference is that B providers correspond to large mental health institutions where doctors are still paid on a salary basis, so the unintended effects have to be induced by the management. Furthermore, there are still many external dynamic demand and supply factors that are difficult to assess. For example, B providers may put a lower weight on profits (lower agency parameter α_j in (1) for B providers) than NB providers because the latter category of providers is of a more entrepreneurial type. In sum, the unintended effects may therefore be lower for B providers than NB providers. Also, insurers may be better equipped to monitor providers’ treatment duration and, in the longer run, quality. Another important difference is that the Dutch government changed the flat reimbursement fees to maximum fees. Thus, health insurers can bargain with providers for lower reimbursement fees if providers’ performances turn out to be inadequate.
Also the conclusion that the introduction of the new reimbursement schedule for NB providers in 2008 led to higher costs is premature. Before 2008, NB providers received a fixed fee for each visit. A fee for each visit is similar to the reimbursement schedule in our study but now there are thresholds after each visit of 60 min. A fee for each visit is closer to a fee-for-service type of payment and may also result in overtreatment. Unfortunately, we have no data available for the period before 2008, so a comparison between the two regimes is not possible.
An important policy question is what an optimal reimbursement schedule for mental health care providers should look like. The reimbursement schedule that we study in this paper is interesting because it combines a prospective fee per episode of care with elements of fee-for-service, to prevent selection incentives. The drawback is that the unintended effects are quite large, which may make the schedule less attractive than salary or even fee-for-service. One possible way to proceed would be to improve the current reimbursement schedule. A first option would be to diminish the unintended effects by changing the position of the thresholds. Ideally, thresholds should be placed where the mass of the distribution function f(β) is small. If the mass before a threshold is small, unintended effects will diminish because there are only few treatments to shift over to a next threshold. Unfortunately, the threshold of 800 min is placed just after the top of the distribution function (see Fig. 4), thus exacerbating the unintended effects. Moving the 800 min threshold to 500 min, just before the top of the distribution function, would diminish the unintended effects. Thus by taking into account provider behavior the reimbursement could be made much more attractive. A second option to diminish the unintended effects would be to decrease (or even increase) the number of thresholds. Here there is a trade-off between efficiency, equity and selection. For example, removing all thresholds would yield a single prospective fee for the total treatment. This would remove all unintended effects, thereby increasing efficiency. However, if patients’ characteristics across providers differ substantially, it could also result in a larger income variation across providers, thereby diminishing equity across providers. As a result providers might increase their incentives for selecting more favorable patients (McGuire, 2000). Adding more thresholds might also be an improvement since more thresholds will diminish the financial incentives at each individual threshold. More research is necessary to study these trade-offs.
Another possible way is to change the nature of the payment system and to consider a mixed payment system of a prospective fee and a linear reimbursement schedule, as advocated by Ellis and McGuire (1986). The prospective fee reimburses the “non-contractible” activities that may vary across specialties while the linear reimbursement fee should be set less than marginal (and average) costs. This type of fee schedule is likely to diminish the unintended effects as well. Lastly, quality should be integrated into tariffs and providers should ultimately get paid taking into account the patient’s wellbeing as well. This is a long shot and much more research is needed to integrate quality aspects into the payment system.
In this study we rely on providers that register their own DBCs. We assume that providers register their treatment duration correctly and honestly in their administration. However, literature indicates that fraudulent behavior may also occur in payment systems based on DRGs in the US, or DBCs in the Netherlands. This fraudulent behavior is often referred to as ‘upcoding’ (Steinbusch et al., 2007). The Dutch reimbursement system may be vulnerable to this ‘upcoding’ because Dutch providers code DBCs themselves. They could tamper with the data. Especially in mental health care the risk for fraud may even be greater than for less discretionary treatments, such as hip or knee replacements. Third parties, such as health insurers, also might find it particularly difficult to verify and dispute mental health diagnoses.
Acknowledgement
We would like to thank the Dutch Healthcare Authority (NZa) for providing the data. The data are not publicly available. We are grateful to the NZa and DBC-Onderhoud for explaining the data. We would like to thank seminar participants at the BU/Harvard/MIT Health Economics seminar in Boston on April 4, 2014, the NZa-seminar in Utrecht on May 8, 2014, Academy of Health in San Diego, June 8–10, 2014, IHEA in Dublin, July 13–16, 2014, the CPB-seminar in The Hague, October 14, 2014, the ESE-seminar at Erasmus University, January 15, 2015, and ICMPE in Venice, March 28, 2015 for comments. Furthermore, we are grateful to two anonymous referees, Pieter van Baal, Pieter Bakx, Leon Bettendorf, Aaron Maras, Tom McGuire, Jan van Ours, Bastian Ravesteijn, Ingrid Seinen, Harry van Til and Gert Jan Verhoeven for providing comments on earlier versions of this paper.
References
Bajari, P., Hong, H., Park, M., Town, R., 2011. Regression discontinuity designs with an endogenous forcing variable and an application to contracting in health care. In: NBER Working Paper No. 17643.
Bellows, N.M., Halpin, H.A., 2008. Impact of Medicaid reimbursement on mental health quality indicators. Health Serv. Res. 43, 582–597.
Card, D., Dobkin, C., Maestas, N., 2008. The impact of nearly universal insurance coverage on health care utilization: evidence from Medicare. Am. Econ. Rev. 98 (5), 597–636.
Card, D., Dobkin, C., Maestas, N., 2009. Does Medicare save lives? Q. J. Econ. 123 (1), 597–636.
Chandra, A., Cutler, D., Song, Z., 2012. Who ordered that? The economics of treatment choices in medical care. In: Pauly, M.V., McGuire, T.G., Barros, P.P. (Eds.), Handbook of Health Economics, vol. II. Elsevier, Amsterdam, pp. 397–432.
Christianson, J.B., Conrad, D., 2011. Provider payment and incentives. In: Glied, S.A., Smith, P.C. (Eds.), The Oxford Handbook of Health Economics. Oxford University Press, Oxford, pp. 624–628.
Douven, R., Mocking, R., Mosca, I., 2015. The effect of physician remuneration on regional variation in hospital treatments. Int. J. Health Econ. Manage., 10.1007/s10754-015-9164-2.
van Dijk, C.E., van den Berg, B., Verheij, R.A., Spreeuwenberg, P., Groenewegen, P.P., de Bakker, D.H., 2013. Moral hazard and supplier-induced demand: empirical evidence in general practice. Health Econ. 22 (3), 340–352.
DBC Onderhoud, 2013. Spelregels, DBC-Registratie GGZ, Versie RG13a. DBC Onderhoud, Utrecht.
Einav, L., Finkelstein, A., Schrimpf, P., 2013. The response of drug expenditure to non-linear contract design: evidence from Medicare Part D. In: MIT Working Paper.
Effting, M., 2015. Tjak, tjak, volgende patient: de ontsporing van de ggz. De Volkskrant. (Dutch Newspaper).
Ellis, R.P., McGuire, T.G., 1986. Provider behavior under prospective reimbursement. Cost sharing and supply. J. Health Econ. 5, 129–151.
Ellis, R.P., McGuire, T.G., 1990. Optimal payment systems for health services. J. Health Econ. 9 (4), 375–396.
Epstein, A.M., et al., 1986. The use of ambulatory testing in prepaid and fee-for-service group practices: relation to perceived profitability. N. Engl. J. Med. 314, 1089–1093.
Frank, R.G., McGuire, T.G., 2000. Economics and mental health. In: Culyer, A.J., Newhouse, J.P. (Eds.), Handbook of Health Economics, vol. 1B. Elsevier, Amsterdam, pp. 893–954.
GGZ Nederland, 2010. Zorg op waarde geschat, update. In: Sectorrapport ggz 2010, Amersfoort (in Dutch).
Hickson, G.B., et al., 1987. Physician reimbursement by salary or fee-for-service: effect on a physician’s practice behavior in a randomized prospective study. Pediatrics 80, 744–750.
Jennison, K., Ellis, R.P., 1987. Comparison of psychiatric service utilization in a single group practice. In: McGuire, Scheffler (Eds.), The Economics of Mental Health Services: Advances in Health Economics and Health Services Research, vol. 8. JAI Press, Greenwich, USA, pp. 175–194.
Lee, D.S., Lemieux, T., 2010. Regression discontinuity designs in economics. J. Econ. Lit. 48, 281–355.
Mason, A., Goddard, M., 2009. Payment by Results in Mental Health: A Review of the International Literature and an Economic Assessment of the Approach in the English NHS, Research Paper 50. Centre for Health Economics, The University of York.
McGuire, T.G., 2000. Physician Agency. In: Culyer, A.J., Newhouse, J.P. (Eds.), Handbook of Health Economics, vol. 1A. Elsevier, Amsterdam, pp. 461–536.
NZa, 2007. Tariefbeschikking DBC GGZ 2008. Nederlandse Zorgautoriteit, Utrecht (in Dutch).
NZa, 2008. Tariefbeschikking DBC GGZ 2009. Nederlandse Zorgautoriteit, Utrecht (in Dutch).
NZa, 2009. Tariefbeschikking DBC GGZ 2010. Nederlandse Zorgautoriteit, Utrecht (in Dutch).
NZa, 2010. Invoering Prestatiebekostiging Curatieve GGZ: Advies op Hoofdlijnen. Dutch Healthcare Authority, Utrecht (in Dutch).
NZa, 2010. De curatieve GGZ in 2009: Ontwikkelingen in aanbod en volume. In: Monitor. Nederlandse Zorgautoriteit, Utrecht (in Dutch).
NZa, 2011. Curatieve GGZ 2010 Een sector in ontwikkeling. In: Monitor. Nederlandse Zorgautoriteit, Utrecht (in Dutch).
NZa, 2012. Marktscan Geestelijke Gezondheidszorg. Weergave van de markt 2008–2011. Nederlandse Zorgautoriteit, Utrecht (in Dutch).
Rekenkamer, 2013. Indicatoren voor kwaliteit in de zorg. In: Algemene Rekenkamer, March 28. Algemene Rekenkamer, The Netherlands, The Hague.
Rosenthal, M.B., 2000. Risk sharing and the supply of mental health services. J. Health Econ. 19 (6), 1047–1065.
Shi, J., 2013. Labor supply response to income cutoffs of health insurance in the Massachusetts reform. In: Working Paper. Boston University.
Sojourner, A.J., Grabowski, D.C., Town, R.J., Chen, M.C., Frandsen, B.R., 2013. Impacts of unionization on quality and productivity: regression discontinuity evidence from nursing homes. In: Working Paper, https://economics.byu.edu/frandsen/Documents/Nursing Home Unions.pdf.
Stearns, S., Wolfe, B., Kindig, D., 1992. Physician responses to fee-for-service and capitation payment. Inquiry 29, 416–425.
Steinbusch, P.J.M., Oostenbrink, J.B., Zuurbier, J.J., Schaepkens, F.J.M., 2007. The risk of upcoding in casemix systems: a comparative study. Health Policy 81 (2–3), 289–299.
Van de Ven, W.P.M.M., Schut, F.T., 2008. Universal mandatory health insurance in The Netherlands: a model for the United States? Health Affairs 27 (3), 771–781.
Verbeek, M., 2004. A Guide To Modern Econometrics, second ed. Wiley, New York, NY.
VWS, 2010. Interdepartementaal beleidsonderzoek curatieve GGZ. In: Attachment by the Report: Heroverweging curatieve zorg. Ministry of Health, Welfare and Sport, The Hague (in Dutch).
Journal of Health Economics 42 (2015) 151–164
Article info
Article history: Received 19 August 2014; Received in revised form 12 April 2015; Accepted 21 April 2015; Available online 28 April 2015
JEL classification: I15; J24; J43
Keywords: Investment; Health; Productivity; Agriculture; Malaria
Abstract
We evaluate the productivity effects of investment in preventive health technology through a randomized controlled trial in rural Zambia. In the experiment, access to subsidized bed nets was randomly assigned at the community level; 516 farmers were followed over a one-year farming period. We find large positive effects of preventative health investment on productivity: among farmers provided with access to free nets, harvest value increased by US$ 76, corresponding to about 14.7% of the average output value. While only limited information was collected on farming inputs, shifts in the extensive and the intensive margins of labor supply appear to be the most likely mechanism underlying the productivity improvements observed.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction
Despite the rapid speed of urbanization over the past decades, rural small-scale farming remains the primary source of food and income for a majority of the population in developing countries (World Bank, 2007). In most settings, the degree of agricultural mechanization is limited, so that agricultural production remains primarily dependent on the availability and productivity of human labor. While labor is abundant in principle in most developing countries (Pitt and Rosenzweig, 1986), labor inputs can be compromised by episodes of ill health and can result in output losses if absent labor cannot be replaced immediately.
In this paper we investigate the economic impact of short-term morbidity on agricultural output in the context of small-scale farming in Zambia. The study setting is representative of many rural areas in the developing world both in terms of the general lack of advanced farming technology and in terms of the dominant role of farming as source of nutrition and income. With farming land available free of charge in most communities, a large majority of the working-age population engages in agriculture, while formal sector jobs are generally scarce. Despite major government efforts to reduce the burden of the disease in recent years (NMCC, 2010; Zambia Ministry of Health, 2006), malaria continues to be the primary cause of short-term morbidity in the country, with children and adults experiencing up to five episodes of malaria per year (NMCC, 2010; WHO, 2009). Since the planting season tends to overlap with the malaria season, health related absences from field work are frequent, and are commonly cited by local farmers as primary cause of lost field work and income.1
To evaluate the degree to which health affects agricultural productivity, we conducted a cluster-randomized field experiment with 516 farmers in Katete District, Zambia, from December 2009 to August 2010. As part of the experiment, farmers were randomly selected for bed net programs, which allowed them to obtain long-lasting insecticide treated nets (LLITNs) through agricultural loan program schemes at differentially subsidized prices. The basic intuition underlying the experiment is relatively straightforward: as long as household labor and consumption decisions are non-separable from household production decisions2 (Benjamin, 1992),
1 On average, farmers surveyed at baseline claimed that their harvest would increase by 30% if field work was not interrupted by episodes of ill health.
2 If consumption and production decisions were perfectly separable, family labor could be perfectly substituted for by hired labor.
∗ Corresponding author at: Harvard School of Public Health, Boston, MA 02115, USA. Tel.: +1 6174327389.
E-mail address: gfink@hsph.harvard.edu (G. Fink).
http://dx.doi.org/10.1016/j.jhealeco.2015.04.004
0167-6296/© 2015 Elsevier B.V. All rights reserved.
152 G. Fink, F. Masiye / Journal of Health Economics 42 (2015) 151–164
decreased exposure to malaria should increase the time and energy farmers can spend on their fields, and thus also increase the final harvest amounts.
In a first paper based on this experiment, we analyzed the impact of the additional LLITNs distributed on self-reported morbidity (Fink and Masiye, 2012). In this paper, we analyze the impact of the net programs on agricultural productivity, the main outcome variable of the trial. In the first part of our analysis, we analyze the impact of the interventions on net ownership and usage. Consistent with recent work by Tarozzi et al. (2014), we find a substantial fraction of farmers to be willing to purchase LLITNs at full or partially subsidized prices when financing options are provided. On average, farmers in the loan group acquired 0.9 nets, resulting in a 24 percentage point increase in the average fraction of sleeping spaces covered at the household level.
In the second part of the paper, we estimate the impact of the bed net programs on agricultural production. In order to facilitate a rapid distribution of bed nets, treatments were randomly assigned at the cluster level prior to the collection of baseline data in the experiment. The non-stratified cluster-level randomization resulted in a rather unbalanced sample, with treated farmers on average both larger and more productive than farmers in the control group. To address these imbalances, we focus on analyzing changes in production outcomes between the 2009 (pre-intervention) and the 2010 (post-intervention) farming seasons. The point estimates from our preferred specification suggest that the returns to bed nets in the study sample were large: on average, we find that access to free bed nets (three nets for a typical household) increased agricultural output by US$ 76, which corresponds to 14.7% of the average annual harvest value. To address omitted variable bias concerns, we include a large set of covariates in our empirical models, and run an extensive series of robustness and heterogeneity checks. Overall, treatment effects appear largest among more educated farmers as well as farms with more diversified portfolios, and larger for cotton (as the more labor intensive crop) than for maize.
In the last part of our analysis, we explore potential mechanisms underlying the productivity impacts observed. Unfortunately only limited and self-reported data on malaria incidence (and no data on parasitemia or asymptomatic malaria) were collected as part of this project. However, the general patterns observed in the data suggest that the programs likely induced substantial reductions in the days of field work lost due to ill health. Given that full recovery from acute malaria is often slow, reduced exposure to malaria can increase the marginal product of labor (Nur, 1993), particularly in cases where malaria induces anemia (Ehrhardt et al., 2006). While there is theoretically also the possibility that the reduced exposure to ill health may have been associated with a reduction in direct medical expenditure, most malaria treatment in the area appears to be provided for free, so that no evidence of lower health expenditure was found.
Even though this paper is to our knowledge the first one using experimental data to evaluate the productivity effects of malaria, several studies have analyzed agricultural output in the context of nutrition and other diseases. Following the initial work by Strauss (1986) as well as Pitt and Rosenzweig (1986), Behrman et al. (1997) document a rather robust association between nutritional improvements and production in agricultural settings. Loureiro (2009) and Ulimwengu (2009) find positive associations between health and productivity using stochastic frontier regression techniques. Audibert and Etard (2003) examine the effect of schistosomiasis among rice-growers, and find that exposure to schistosomiasis reduces production by 26%. Fox et al. (2004) analyze the productivity declines associated with HIV positivity, and find that HIV-positive workers earn on average 16–17% less over a two year period. Similarly, Baranov et al. (2012) show that maize production increases by up to 31% with HIV treatment, and attribute this increase to increased overall labor supply and improved physical and mental health. Similar effects were, however, not found for iron supplementation and deworming among tea pluckers in Bangladesh (Gilgen et al., 2001). Most similar to the results presented in this paper are two cross-sectional studies using harvest data to compute the agricultural output effect of malaria: Girardin et al. (2004) analyze vegetable farming in Côte d’Ivoire, and find that farmers who reported being sick more often had 47% lower yields. Morel et al. (2008) use total farming output to quantify the agricultural loss generated by work days lost due to malaria in Vietnam, and find an average cost of US$ 11 per case of malaria, suggesting returns to malaria prevention similar to the ones identified in this paper. Conceptually, an overwhelming majority of this literature suggests strong links between health and agricultural production in low-income settings; this suggests that household production, labor and consumption decisions are generally not separable (Benjamin, 1992), a finding which is also supported by recent evidence from Zambia (Fink et al., 2014).
While this study primarily focuses on household-level outcomes, the results presented here naturally also link to the broader literature on the relation between health and income. Most of the micro-level literature in this area has focused on the long term benefits of improved childhood health in terms of education and labor market outcomes (Bleakley, 2007; Bleakley and Lange, 2009; Clarke et al., 2008; Kremer and Miguel, 2004). This paper highlights a more immediate and direct effect of health on income similar to the results shown in Thomas et al. (2010) for iron supplementation; this effect will clearly not apply in all low resource settings, but may be of particular importance among rural and frequently impoverished populations.
The rest of the paper is structured as follows: we provide a detailed description of the study site and local agriculture practices in Section 2. In Section 3, we present the study design and provide details on study implementation. In Section 4, we analyze the effects of the bed net programs on net ownership and net usage. In Section 5, we estimate the impact of the net programs on productivity. Section 6 shows some evidence on the mechanisms underlying the main productivity results. We conclude with a short summary and discussion in Section 7.
2. Study background
Fig. 1 shows the geographic location of the study site within Zambia. Katete district is one of eight districts within Zambia’s Eastern Province. Eastern Province is one of the least developed regions of Zambia, with a majority of the population living below the one-dollar-per-day poverty line, and an estimated under-5 mortality rate of 151 per 1000 live births (Macro International, 2007). Katete district is similar in its topography to the Western part of Malawi, which is located about 100 km east of the district. The current district population is estimated at 250,000, approximately half of which live in the urban centers of Sinda and Katete (Zambia Central Statistic Office, 2011a).
Malaria is endemic in most parts of Zambia, and the primary cause of short term morbidity in the country (Zambia Ministry of Health, 2012). The regional climate displays pronounced seasonal fluctuations, with virtually no rainfall from May to November, followed by a period of major rainfall from December to April. The strong seasonal patterns are directly reflected in the seasonal fluctuations of malaria. Malaria in the area is considered endemic and seasonal, with a majority of the transmission occurring between December and May, when continued rainfalls support the breeding of the Anopheles mosquito larvae. According to the latest round of the Malaria Indicator Survey, Eastern region is among the areas
Fig. 1. Zambia (white) and Katete District (shaded red). (For interpretation of the references to color in this text, the reader is referred to the web version of the article.)
with the highest parasite prevalence in the country, with parasites detected in 22 percent of children under the age of five in early 2010 (NMCC, 2011). Fig. 2 shows the clinical burden of Plasmodium falciparum malaria in terms of the number of clinical cases per year and per 5 km² as computed by the Malaria Atlas Project in 2007 (Hay and Snow, 2006). Katete district ranks among the most highly exposed areas of the country, with an estimated annual burden of approximately 500 cases per 5 km².
Since 2006, Zambia has made major efforts to reduce the burden
of malaria (Ashraf et al., 2010; Zambia Ministry of Health, 2006).
As part of the internationally supported Rollback Malaria Initia-
tive, four principal strategies have been employed by the country
through the National Malaria Control Centre: indoor-residual
spraying (for densely populated and primarily urban areas), mass
distribution of long-lasting insecticide treated nets (LLITNs), inter-
mittent preventive treatment of malaria in pregnancy (IPTp) and
case management through diagnostics and artemisinin-based com-
bination therapies (NMCC, 2007, 2009, 2010; Zambia Ministry of
Health, 2006). Between 2006 and 2008, 96,000 LLITNs were dis-
tributed in Katete district (NMCC, 2011). At the time of the 2008
Malaria Indicator Survey, the average number of LLITNs in Eastern
Province was 0.96 nets (NMCC, 2009), a level very similar to the
one observed in the study area at the beginning of the study in
December 2009.
The rural part of Katete targeted in this study is sparsely popu-
lated, with clusters of family-run farms grouped into small villages.
With an average size of approximately 4 ha (10 acres), the typical
farm is small, and most planting and harvesting is done without
machinery. Farm land is generally owned by communities, who
allocate the land to families via local headmen and chiefs. The
amount of farm land families can get access to is – at least the-
oretically – not limited; any individual can claim additional land
from the chief as long as they can show they have the manpower
and skills to use the land (Nolte, 2012).
To enhance productivity and strengthen commercial links with farmers, cotton ginners have set up a variety of agricultural loan schemes, which allow farmers to receive cotton seeds and “chemicals” (fertilizer and pesticides) as well as agricultural machinery throughout the planting and growing season on a loan basis. Upon receipt of the cotton, ginners deduct the outstanding loan amount from the final sales price, and pay out the remaining balance in cash. Agricultural loans are generally provided free of interest, and are given under the assumption that farmers will sell their harvest to the cotton company offering them the agricultural loan. At the time of the study, our partner organization (Dunavant Cotton) was providing loans to approximately 80,000 cotton farmers across Zambia. According to the latest national estimates, 67% of the Zambian labor force is employed in agriculture (Zambia Central Statistic Office, 2011b), which corresponds to approximately 1.3 million farming households. The share of farmers working with Dunavant is relatively small (approximately 6% of all farming households), both because Dunavant is only one of about 10 cotton ginners in the country and because only about 25% of small-scale farmers in the country grow cotton (Goeb, 2011).
3.2. The intervention
In order to assess farmers’ willingness to pay for nets within existing agricultural loan schemes, two different net programs were implemented as part of the experiment: a free net program and a bed net loan program. Under the free net program, selected farmers were allowed to obtain one free bed net for each uncovered sleeping space in the household. Clusters in the loan arm were assigned to one of two loan types: a “full price loan”, which required farmers to repay the full price3 of the net (Zambian Kwacha (ZK) 25,000, US$ 5) at the end of the harvesting season, and a “subsidized price loan”, which required farmers to repay ZK 12,500 (US$ 2.5) at the end of the harvesting season. Both prices are substantially higher than what households reported to have paid for bed nets in the past: 90% of bed nets found in households at baseline were received through free governmental or NGO programs (most likely as part of the large national programs); 7% had been bought for ZK 3000 (US$ 0.6), the price public health facilities charged prior to the national mass distribution; and for the remaining 3%, households reported having paid between ZK 5000 and ZK 10,000 when acquiring nets through door-to-door sales.
All nets were distributed at the beginning of the weeding season (late December) in order to provide protection for farmers during the peak malaria season (December–May).
3.3. Sampling frame, enrollment and program assignment
The sampling frame for the study was provided by Dunavant Zambia. Dunavant has nine regional offices, which distribute farming inputs and acquire cotton through local sheds and distributors. Each distributor handles between 10 and 50 farmers in his community. At the time the study was launched, Dunavant was working with 96 distributors in the study area. In order to reduce the risk of local spillovers, we restricted the sample to distributors operating in spatially separated areas, with a minimum distance of 3 km between any two locations. This left us with a final sample of 49 distributors and their respective villages. On average, each of the 49 distributors was working with about 20 farmers, a listing of which was provided to the study by Dunavant. We randomly selected 11 farmers from each distributor, and visited them for a baseline interview in December 2009. Only the 11 farmers selected for the study were allowed to receive the free or subsidized nets; with an average village size of about 50 households, this means that the program covered about 20% of the average village population. Out of 539 farmers invited to participate in the study, 516 farmers (95.7%) were enrolled in the study, and completed the baseline interview in December 2009.
3 The “full price” we charged reflects the current wholesale price, which is about 30% below regional retail prices of about ZK 35,000 (US$ 7).
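The sampling and pricing figures quoted in Sections 3.2 and 3.3 can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on numbers stated in the text; the variable names are illustrative and not taken from the study's materials.

```python
# Figures as quoted in the text; variable names are illustrative only.
distributors = 49
farmers_per_distributor = 11
enrolled = 516
village_households = 50          # "about 50 households" per village

invited = distributors * farmers_per_distributor
enrollment_rate = enrolled / invited
village_coverage = farmers_per_distributor / village_households

dunavant_farmers = 80_000
farming_households = 1_300_000
dunavant_share = dunavant_farmers / farming_households

wholesale_price_zk = 25_000      # "full price" charged in the loan arm
retail_price_zk = 35_000         # regional retail price (footnote 3)
wholesale_discount = 1 - wholesale_price_zk / retail_price_zk

print(f"invited farmers:      {invited}")
print(f"enrollment rate:      {enrollment_rate:.1%}")
print(f"village coverage:     {village_coverage:.0%}")
print(f"Dunavant share:       {dunavant_share:.1%}")
print(f"wholesale vs. retail: {wholesale_discount:.1%} below")
```

The rounded statements in the text ("about 20%", "approximately 6%", "about 30%") are consistent with these ratios.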
Table 1
Descriptive statistics.
                                   Control (N = 153)     Loans (N = 185) Free nets (N = 155)             Equal means test (p-value)a
Variable                              Mean  St. dev.      Mean  St. dev.      Mean  St. dev.   C vs. L   C vs. F   L vs. F All equal
Farmer age                           39.24     12.59     41.32     13.24     39.70     12.90      0.27      0.80      0.31      0.50
Farmer is married                     0.83      0.38      0.83      0.37      0.88      0.32      0.96      0.21      0.15      0.30
Farmer years of education             4.34      3.41      4.14      3.66      4.21      3.88      0.63      0.81      0.90      0.89
Members under age 5                   0.87      0.88      0.95      1.00      0.86      0.86      0.55      0.93      0.39      0.66
Members age 5–14                      1.40      1.36      1.88      1.50      1.92      1.55      0.00      0.00      0.82      0.00
Members age 15–59                     2.44      1.22      2.76      1.53      2.84      1.58      0.09      0.03      0.71      0.07
Members age 60+                       0.18      0.49      0.22      0.51      0.17      0.44      0.54      0.97      0.51      0.78
Chicken, geese and ducks              4.58      6.20      6.89      8.45      6.26      7.96      0.01      0.07      0.49      0.01
Goats, pigs and sheep                 2.57      3.40      3.41      4.18      3.00      3.68      0.12      0.35      0.41      0.29
Cows                                  1.47      3.51      2.34      4.09      1.92      3.24      0.07      0.32      0.37      0.19
Bicycles                              0.77      0.56      0.93      0.81      0.87      0.60      0.02      0.22      0.44      0.08
Mobiles                               0.23      0.45      0.36      0.80      0.35      0.75      0.14      0.17      0.95      0.23
TVs                                   0.07      0.25      0.14      0.37      0.06      0.27      0.04      0.98      0.05      0.08
Mosquito nets                         1.09      1.33      1.53      1.25      0.48      0.69      0.08      0.01      0.00      0.00
Cars, tractors and trucks             0.00      0.00      0.06      0.43      0.03      0.20      0.05      0.05      0.33      0.03
Maize planting area (ha)              1.78      1.17      2.16      3.16      1.86      1.13      0.19      0.71      0.27      0.38
Cotton planting area (ha)             1.36      1.09      1.38      1.25      1.17      0.73      0.94      0.21      0.32      0.33
Other crops (ha)                      0.58      0.77      0.73      0.84      0.83      0.93      0.32      0.15      0.55      0.39
Cotton harvest 2009 (bales)           9.46      8.59     12.16     11.12     11.60     11.33      0.09      0.19      0.75      0.19
Maize harvest 2009 (bags)            20.13     15.62     29.83     35.96     28.16     25.98      0.01      0.00      0.68      0.00
Total harvest value 2009 (US$)       453.2     304.0     649.0     589.1     613.8     476.0      0.00      0.00      0.63      0.00
Notes: Based on 493 observations with complete information. All variables reflect baseline conditions as collected in December 2009. In the p-value columns, C = Control, L = Loans, F = Free nets; "All equal" tests that all three means are equal.
a p-Values based on cluster-bootstrapped standard errors. Each cluster corresponds to one distributor and 11 randomly selected farmers working with the distributor.
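The table note states that the p-values rest on cluster-bootstrapped standard errors but does not spell out the procedure. As a rough illustration only, the sketch below shows one common variant: whole clusters (distributors) are resampled with replacement within each arm, the difference in pooled means is recomputed on each draw, and a normal approximation converts the bootstrap standard error into a two-sided p-value. The function names, the normal approximation, and the simulated harvest values are all assumptions made for this example and are not the authors' code or data.

```python
import random
import statistics


def pooled_mean(clusters):
    """Mean over all farmer-level values in a list of clusters."""
    values = [v for cluster in clusters for v in cluster]
    return sum(values) / len(values)


def cluster_bootstrap_test(arm_a, arm_b, n_boot=2000, seed=0):
    """Difference in means with a cluster-bootstrapped SE and p-value.

    arm_a, arm_b: lists of clusters; each cluster is a list of values.
    Returns (difference, bootstrap standard error, two-sided p-value).
    """
    rng = random.Random(seed)
    observed = pooled_mean(arm_a) - pooled_mean(arm_b)
    diffs = []
    for _ in range(n_boot):
        # Resample entire clusters, not individual farmers.
        draw_a = [rng.choice(arm_a) for _ in arm_a]
        draw_b = [rng.choice(arm_b) for _ in arm_b]
        diffs.append(pooled_mean(draw_a) - pooled_mean(draw_b))
    se = statistics.stdev(diffs)
    if se == 0:
        return observed, se, (1.0 if observed == 0 else 0.0)
    z = observed / se
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return observed, se, p_value


# Hypothetical data: 15 clusters of 11 farmers per arm (simulated values).
data_rng = random.Random(1)
control = [[data_rng.gauss(450, 300) for _ in range(11)] for _ in range(15)]
free_nets = [[data_rng.gauss(620, 450) for _ in range(11)] for _ in range(15)]

diff, se, p = cluster_bootstrap_test(free_nets, control)
print(f"difference = {diff:.1f}, SE = {se:.1f}, p = {p:.3f}")
```

Resampling at the distributor level rather than the farmer level keeps any within-village correlation intact in each bootstrap draw, which is the reason for clustering when treatment is assigned by cluster.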
In order to accelerate the distribution of bed nets,4 bed net loan programs were randomized at the distributor level prior to the collection of baseline data. Randomization was done using a simple random number draw generated by Stata. Out of the 49 eligible distributors, 15 were assigned to the control group (30%), and 15 distributors (30%) were selected for the free net program. Since we were particularly interested in the loan group and wanted to assess differences in uptake with and without subsidy, a slightly larger number of distributors (20% of distributors in each loan program) were randomized into the net loan programs. The spatial distribution of treatment assignment is illustrated in Fig. 3. All farms in the net program arms were informed about the programs at the end of the baseline interview, and given 48 h to decide on the number of nets they wanted to receive. Ordered nets were delivered within 10 days of the baseline survey, between December 20 and December 31, 2009.

4 Due to delays in funding and IRB approval, baseline surveys got pushed back to December, so that we decided to do the distribution of nets immediately after baseline to make sure farmers would benefit from them during the peak rainy season.

A first follow-up or midline survey was conducted in April 2010, during which information on recent illness episodes was collected. The endline survey was conducted in July and August 2010 with a primary focus on harvest outcomes and harvest sales.

Out of the 516 farmers initially enrolled in the study, 510 (98.8%) were followed up successfully throughout the subsequent farming and harvesting season. One farmer passed away, three farmers moved, and two farmers refused to participate in the follow-up surveys. An additional two surveys were excluded from analysis due to missing planting and harvesting information. Sixteen further surveys have missing values on at least one of the extended list of covariates used in some of the specifications, resulting in a final analytical sample of 493 farmers.

3.4. Descriptive statistics

Table 1 shows descriptive statistics for the households enrolled and followed up in the study by study arm. Average plot size was 4.15 ha (median 3.1) in 2009, and average harvest value in 2009 was US$ 577 (median US$ 463). With an average household size of close to six members, this implies average per-capita resources of approximately US$ 0.26 per day, placing the majority of these households well below the international poverty threshold of US$ 1.25 per day, even when input-related expenses (such as cotton loans) are not accounted for. Cotton farms are on average slightly larger than other farms;5 at the national level, the average plot size among small- and medium-scale (non-commercial) farmers is 3.1 ha (Jayne et al., 2008). On average, farms owned one bed net at baseline; with a mean of three sleeping spaces per household, this implies that two thirds of household members were not covered by nets at the beginning of the study.

5 Cotton ginners recommend using at least one hectare of land for growing cotton, which means that farms growing cotton rarely use less than two hectares of land.

While the randomized assignment of net programs across clusters generated a fairly balanced sample with respect to household head characteristics, the same was unfortunately not true for farm size, with farms in the free net and net loan arms on average larger and more productive than farms in the control group. As Table 1 shows, the largest and most productive farms were found in the net loan group, followed by the free net group. Detailed data collected on 2009 harvest outcomes suggest that the average value of farm production (sum of all crops harvested multiplied by the median sales prices of the respective crops in the area in 2009 – see Table 1 for further details) was US$ 453 in the control group, US$ 614 in the free net group, and US$ 649 in the net loan group. Similar differences were found for household size: on average, households in the loan and free net groups had 0.8 and 0.9 more household members in the 5–59 age range than households in the control group.

These differences between farmers in the control and farmers in the two intervention groups are large and statistically significant, and complicate statistical inference, since endline differences will be at least partially attributable to differences observed at baseline. The imbalance does not appear to be driven by differences in the spatial distribution or by spatial clustering of farms, but rather
156 G. Fink, F. Masiye / Journal of Health Economics 42 (2015) 151–164
Table 2
Program effect on ownership and coverage. Panel A: ownership and usage of nets.
Columns: (1) Number of nets received through program; (2) Number of nets owned at endline; (3) Number of nets used at endline^a; (4) Number of nets received through program^b; (5) Number of nets owned at endline^b; (6) Number of nets used at endline^b.
seems to reflect presumably random variations in productivity level across farmers.6 While a large number of meta-reviews in the medical literature suggest that results from imbalanced trials controlling for baseline characteristics do on average not yield different results from fully balanced trials (Berger, 2010; Knottnerus and Tugwell, 2012; Riley et al., 2013), there is clearly a concern that observable differences may be correlated with other unobservable characteristics such as malaria knowledge or farming skills. To deal with these concerns, we follow the approach proposed by Glennerster and Takavarasha (2013) as well as by Bennett et al. (2014) and estimate both models where we control for lagged dependent variables and models where we use differences in outcome measures between the 2009 (pre-intervention) and 2010 (intervention) seasons as the dependent variable. These models exclusively identify changes in the outcome measures over time, and thus directly eliminate confounding or omitted variable bias concerns due to time-invariant farm-specific differences prior to the intervention. The resulting estimates will yield unbiased program impact assessments as long as the random treatment assignment is not correlated with changes in unobservable characteristics between baseline and endline conditional on initial values, which seems reasonable given the relatively short time period analyzed.

6 The baseline table looks virtually the same when specific areas (Eastern, Western or central parts) or villages are excluded.

4. Program impact on bed net ownership, usage and coverage

Table 2 shows the main results for bed net ownership and usage. As described above, endline surveys were conducted in July and August 2010. During both visits, net status was verified by interviewers. We consider a bed net “used” if the net was observed hanging by interviewers during the visit.7 In Panel A of Table 2, we show the impact of the net programs on the number of nets received (columns 1 and 4), the number of nets owned at endline (columns 2 and 5) and the number of nets in use at endline (columns 3 and 6). In Panel B of Table 2 we show the impact of the two treatments on bed net coverage in the household, i.e. the number of nets hanging relative to the number of sleeping spaces used by the household. In columns 1–3 of both panels we show unadjusted differences between the three groups; in columns 4–6, we show estimates with a full set of baseline covariates to control for pre-treatment differences in household size and bed net ownership.

7 Nets are generally tied to a knot during the day to keep them clean, which means that observing a net as hanging does not necessarily mean it was used the previous night.

As documented in previous studies (Cohen and Dupas, 2010; Dupas, 2014; Tarozzi et al., 2014), the demand for bed nets is highly price elastic; on average, households in the loan group obtained 0.8 nets, compared to 2.4 nets in the free net group. It is worth
highlighting that despite the high price elasticity, demand is strictly positive in this setting with credit financing even when the full price of the net is charged. This is rather different from the zero demand found for nets with prices over US$ 1 in settings where net [...] Tarozzi et al. (2014), who find that 52% of households purchase bed nets when financing options are provided.

Given that households in the control group had more nets at baseline, the differences in ownership at endline are smaller than the differences in the number of nets received; on average, households in the loan group owned 0.9 more nets than households in the control group, while households in the free net group owned 1.6 nets more than households in the control group.

Consistent with Cohen and Dupas (2010) as well as Tarozzi et al. (2014), we found no effect of net pricing on usage. On average, 90% of nets owned were actively used by the household during our second follow-up, with no differences in utilization rates between intervention and control groups. The remaining nets were generally found stored for future use in the households, which is consistent with the generally high levels of appreciation of nets in this population.

In terms of sleeping space coverage, both treatments had a sizeable impact. As shown in Panel B of Table 2, 41% of sleeping spaces were covered on average at endline in the control group; this fraction increased to 65% in the loan group, and to 88% in the free net group. Along the same lines, the likelihood of no sleeping space being covered by a bed net decreased from 39% in the control group to 13% in the loan group and to less than 1% in the free net group.

The main hypothesis investigated in this experiment is whether short-term fluctuations in labor supply generated by ill health lead to lower agricultural output. To measure production, we focus on three different measures: maize harvest, cotton harvest, and total production value. Maize is by far the most common crop in the country, and grown on approximately 50% of all plots in the sample. Maize is traded in standard bags of 50 kg, which makes measuring total production of maize relatively easy. The second most common crop in our sample is cotton. Cotton is used as a cash crop, and, as described above, sold to cotton ginners at the end of the harvesting season. While ginners pay by kilogram (2009 prices were US$ 0.30 per kg), farmers generally deliver cotton in large bags, which are referred to as “bales”, and generally contain about 80 kg each. Even though cotton and maize account for the large majority of farming land and production in this sample, most farmers use small plots to grow a variety of other plants such as sunflowers, beans, groundnuts, and sweet potatoes. The diversity in crop portfolios means that crop-specific quantities cannot easily be compared across farmers or treatment groups. On the other hand, not accounting for the resources generated by these crops would clearly mean that the effects of additional labor inputs may not be fully captured.

To measure total farm production, detailed harvest information was obtained from all crops, and then converted into monetary values using the median 2009 market prices reported among farmers who sold the respective crops. In theory, one may wish to use farm-specific crop prices to account for differences in market access or production quality; in practice, this is unfortunately not feasible since a large number of farms do not sell specific crops at all, but rather use them for their own consumption.8

8 The only cash crop in the sample is cotton; all other crops are mostly used for own consumption, with some occasional sales to cover additional cash needs.

In order to have a comprehensive measure of farm productivity, we defined the total economic value of production (TEVP) as

$$\mathrm{TEVP} = \sum_{i=1}^{8} q_i P_{50,i}, \qquad (1)$$

where $q_i$ is the quantity of crop $i$ and $P_{50,i}$ is the median price for one unit of the respective crop.9 In the study sample, eight crops were grown: cotton, maize, ground nuts, sweet potato, sunflower, soy beans, tomato and cassava. In general, price variations across farmers were low: for the two major crops (cotton and maize) prices are negotiated and established at the national level by the government. For some of the minor crops like sunflowers and soy beans, prices are established at local markets. For all crops, individual prices reported rarely deviated by more than 10% from the most commonly reported market prices.

9 We assume that the quality of the produced crops is comparable across farmers; this assumption is empirically always true for cotton and maize (where prices are fixed), but may not necessarily reflect prices for local small-quantity trades.

For our analysis, we focus on cotton and maize as the two most commonly grown crops as well as the aggregate TEVP variable. Fig. 4 shows kernel density estimates of all three variables in 2010 on an absolute (top panel) and logarithmic (bottom panel) scale.

The average cotton harvest in 2010 was 6 bales, which corresponds to about 480 kg of cotton harvested with an average plot of 1.3 ha. Average maize harvest was 28 bags, which implies an average yield of about 1400 kg for an area of 2 ha. Both yields are small when compared to industrial farmers, who frequently achieve yields of over 10 tons of maize per hectare (13 times the sample average) and over 2 tons of cotton per hectare (about 9 times the sample average).10 The average total economic value of all crops harvested in 2010 was 2.6 million Kwacha (US$ 517).

10 See http://www.indexmundi.com/agriculture/ for a country ranking for crop productivity.

The basic empirical (intention-to-treat) model we estimate to identify the treatment effects of interest is given by

$$y_{ij} = \alpha + T_j \beta + X_{ij} \gamma + \varepsilon_{ij}, \qquad (2)$$

where $y_{ij}$ is the harvest outcome of interest observed for farm $i$ in cluster $j$ in 2010, $T_j$ is a vector of indicator variables capturing the distributor-level treatment assignment, and $X_{ij}$ is a vector of baseline covariates. Given the limited use of fertilizer in the region, yields tend to display mean-reverting patterns over time, with good yield years depleting the soil and being followed by less productive years. Fig. 5 illustrates these patterns, showing the year-over-year change in output as a function of total output in the 2009 (pre-intervention) period.

To control for baseline differences as well as the observed mean-reverting patterns, we first estimate models where we include the lagged (2009) value of the outcome variable of interest. With lagged outcome variables, the main estimated equation becomes

$$y_{ijt} = \alpha + \rho\, y_{ij,t-1} + T_j \beta + \varepsilon_{ij}, \qquad (3)$$

where $y_{ij,t-1}$ is the lagged value of the dependent variable, i.e. the harvest outcome in the 2009 season. Following the approach taken in Bennett et al. (2014), we also test an alternative model where we take changes in the outcome variables as dependent variable, and control for an extensive set of baseline covariates, which is given by

$$\Delta y_{ij} = \alpha + T_j \beta + X_{ij} \gamma + \varepsilon_{ij}. \qquad (4)$$
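The TEVP measure is a price-weighted sum of harvested quantities, with each crop valued at the median price reported by farmers who sold it. A minimal sketch of that calculation follows. The function names and the sales records are hypothetical; only the cotton price (US$ 0.30 per kg at about 80 kg per bale, i.e. US$ 24 per bale) and the average 2010 harvest of 6 bales and 28 bags are taken from the text, and the maize prices are invented for illustration.

```python
from statistics import median

def median_prices(sales):
    """Map each crop to the median unit price among farmers who sold it.

    `sales` is an iterable of (crop, unit_price) records.
    """
    by_crop = {}
    for crop, price in sales:
        by_crop.setdefault(crop, []).append(price)
    return {crop: median(prices) for crop, prices in by_crop.items()}

def tevp(harvest, prices):
    """Total economic value of production: sum of quantity times median price."""
    return sum(qty * prices[crop] for crop, qty in harvest.items())

# Hypothetical sales records; prices are per bale (cotton) and per bag (maize).
sales = [
    ("cotton", 24.0), ("cotton", 24.0), ("cotton", 24.0),
    ("maize", 9.0), ("maize", 10.0), ("maize", 12.0),
]
prices = median_prices(sales)

# A farm with the average 2010 harvest reported in the text.
farm = {"cotton": 6, "maize": 28}
farm_value = tevp(farm, prices)
print(farm_value)  # 6 * 24 + 28 * 10 = 424.0
```

Valuing every farm at the same median price means that farms which consumed their whole harvest still receive a value, at the cost of ignoring any farm-specific price differences. In this sketch a crop that nobody sold would raise a `KeyError`, so a real implementation would need a rule for such crops.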
[Figure: six kernel density panels (Epanechnikov kernel), absolute scale in the top row and logarithmic scale in the bottom row.]
Fig. 4. Agricultural outcomes: Kernel density estimates. Notes: total values are in rebased Zambian Kwachas (ZKR); one rebased Kwacha corresponds to 1000 “old” Zambian Kwachas.
In this specification, $\Delta y_{ij}$ is the change (difference) in the outcome between the 2009 (pre-intervention) and the 2010 harvests, and $X_{ij}$ is a vector of baseline covariates. Table 3 shows the main treatment effect results. In Panel A of Table 3 we show unconditional differences across the three groups. The observed differences in total harvest value across the three groups are large, with farms in the loan and free net groups showing additional yields of US$ 180 and US$ 156, respectively (Panel A, column 3). Given the large differences in total yields at baseline, a substantial fraction of this differential is clearly attributable to differences in baseline covariates. In Panel B of Table 3, we directly control for these baseline differences in productivity by including lagged dependent variables in our model as outlined in Eq. (3). As expected, the lagged [...] than one. The adjusted model suggests that the loan program increased total harvest value by an average of US$ 69, while the free net program increased yields by about US$ 65 – only the latter [...]

[Figure: production value 2010 (US$) plotted against production value 2009 (US$).]
Fig. 6. Fractional polynomial predictions: farm yields 2010 as function of 2009 farm yields.

[...] crops in 2009, family members under 5, family members 5–14, family members 60 and older, household wealth, mosquito nets at baseline, ownership of chickens, goats and sheep, farmer age, education and marital status. Cluster-bootstrapped standard errors in parentheses. Each cluster corresponds to one distributor and 11 randomly selected farmers working with the distributor. *** p < 0.01, ** p < 0.05, * p < 0.1

[Figure legend: Control, Loans, Free nets.]
Fig. 7. Changes in farm yields at the cluster level.
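The role of the lagged outcome in the adjusted specification can be illustrated with a small simulation. The sketch below is not a replication: the data are simulated, the true program effects (US$ 60), the persistence parameter (0.6), the single covariate `plot_ha`, and the helper `ols` are all assumptions chosen so that treated arms start from a higher 2009 baseline, as in Table 1.

```python
import numpy as np

rng = np.random.default_rng(42)
n_clusters, farms_per_cluster = 45, 11
arm_of_cluster = np.repeat([0, 1, 2], n_clusters // 3)  # 0 control, 1 loan, 2 free
arm = np.repeat(arm_of_cluster, farms_per_cluster)       # assignment is per cluster
n = arm.size

loan = (arm == 1).astype(float)
free = (arm == 2).astype(float)

# Imbalanced baseline (hypothetical means loosely mirroring Table 1).
baseline_mean = np.array([450.0, 650.0, 615.0])
y09 = rng.normal(loc=baseline_mean[arm], scale=120.0)
plot_ha = 1.0 + y09 / 200.0 + rng.normal(0.0, 0.5, n)    # hypothetical covariate

# Mean-reverting outcome with an assumed true effect of 60 in both arms.
y10 = 100.0 + 0.6 * y09 + 60.0 * loan + 60.0 * free + rng.normal(0.0, 50.0, n)

def ols(y, *regressors):
    """Least-squares coefficients; the first entry is the intercept."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

raw_gap = y10[arm == 1].mean() - y10[arm == 0].mean()  # unconditional difference
b_cov = ols(y10, loan, free, plot_ha)                   # form of Eq. (2)
b_lag = ols(y10, y09, loan, free)                       # form of Eq. (3)
b_diff = ols(y10 - y09, loan, free, plot_ha)            # form of Eq. (4)

print(f"raw loan gap:            {raw_gap:7.1f}")
print(f"Eq. (2) loan estimate:   {b_cov[1]:7.1f}")
print(f"Eq. (3) loan estimate:   {b_lag[2]:7.1f}  (lag coefficient {b_lag[1]:.2f})")
print(f"Eq. (4) loan estimate:   {b_diff[1]:7.1f}")
```

In this simulated setting the unconditional gap mixes the program effect with the baseline advantage of the treated arms, while the lagged-outcome model recovers something close to the assumed effect. Point estimates only are shown; inference would still require standard errors that respect the clustered assignment.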
the outside whiskers show the lower and upper adjacent values11
Table 4
Robustness checks and heterogeneous treatment effects.
Sample | Education of head > 0 | Medium diversification: 3–5 crops | Excluding bottom 2009 productivity quintile | Excluding top 2009 productivity quintile | Excluding clusters in top or bottom quintile in 2009
Observations | 341 | 395 | 386 | 400 | 293
R-squared | 0.467 | 0.456 | 0.470 | 0.392 | 0.414
Notes: All specifications control for lagged dependent variables as well as for plot sizes used for cotton, maize and other crops in 2009, family members under 5, family members 5–14, family members 60 and older, household wealth, mosquito nets at baseline, ownership of chickens, goats and sheep, farmer age, education and marital status. Cluster-bootstrapped standard errors in parentheses. Each cluster corresponds to one distributor and 11 randomly selected farmers working with the distributor. *** p < 0.01, ** p < 0.05, * p < 0.1
Table 5
Labor supply and health expenditure effects.
 | Days of field work lost due to own sickness in last 2 weeks | Days of field work lost due to other sickness in last 2 weeks | Total days of field work lost in last 2 weeks | Total health expenditure last two weeks | Days of field work lost due to own sickness in last 2 weeks | Days of field work lost due to other sickness in last 2 weeks | Total days of field work lost in last 2 weeks | Total health expenditure last two weeks
Any program^a | −0.174 | −0.129 | −0.303 | 0.0783 | −0.170 | −0.0818 | −0.252 | 0.0740
 | (0.149) | (0.129) | (0.258) | (0.0604) | (0.154) | (0.117) | (0.234) | (0.0573)
Loan program | −0.206 | −0.0764 | −0.283 | 0.0842 | −0.182 | 0.00153 | −0.180 | 0.0972
 | (0.157) | (0.136) | (0.257) | (0.0676) | (0.145) | (0.125) | (0.239) | (0.0668)
Free net program | −0.134 | −0.192 | −0.326 | 0.0713 | −0.154 | −0.196 | −0.350 | 0.0423
 | (0.170) | (0.138) | (0.273) | (0.0850) | (0.192) | (0.152) | (0.292) | (0.0905)
Control group mean | 0.391 | 0.360 | 0.752 | 0.072 | 0.391 | 0.360 | 0.752 | 0.072
Notes: All specifications control for plot sizes used for cotton, maize and other crops in 2009 as well as in 2010, family members under 5, family members 5–14, family members 60 and older, household wealth, mosquito nets at baseline, ownership of chickens, goats and sheep, farmer age, education and marital status, total maize harvest 2009, total cotton harvest 2009 and total economic value of production 2009. Cluster-bootstrapped standard errors in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.
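To put the Table 5 point estimates in proportion, the short calculation below relates the "Any program" estimate for total days of field work lost (third column) to the control group mean and to its standard error. The variable names are illustrative; the three numbers are copied from the table, and the 1.96 threshold is the usual normal approximation rather than the bootstrap-based inference used in the paper.

```python
# Values copied from Table 5, third column (total days of field work lost).
control_mean_days_lost = 0.752
any_program_effect = -0.303
std_error = 0.258

# Effect expressed as a share of the control group mean.
relative_change = any_program_effect / control_mean_days_lost

# Ratio of estimate to standard error, as a rough gauge of precision.
t_ratio = any_program_effect / std_error

print(f"relative change: {relative_change:.1%}")  # about -40%
print(f"estimate / SE:   {t_ratio:.2f}")          # about -1.17
```

The point estimate corresponds to roughly 40% fewer days lost than in the control group, but the ratio of estimate to standard error is well below conventional thresholds, which matches the absence of significance stars in the table.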
highly endemic areas like the one studied, asymptomatic malaria infections are common, and likely to substantially reduce farmers’ ability to complete field work tasks even when they do not suffer from acute infections (Nur, 1993). As part of the follow-up interviews, we also collected information on health expenditure. Zambia’s health sector – in particular in rural areas – is dominated by public health facilities which generally provide basic health services (including malaria testing and drugs) for free. In our sample, 90% of respondents indicated zero out-of-pocket expenditure for household members getting treated when sick. Given this, the very small and not significant estimated program impact on total health expenditure found in Table 5 is not surprising.

7. Discussion

The estimates presented in this paper suggest a rather large positive impact of malaria prevention on agricultural productivity. While the data collected as part of the project does unfortunately not allow us to precisely identify the mechanism underlying this impact, increased labor inputs appear to be the most plausible causal pathway. Given that labor is in principle abundant in Zambia, one might argue that short-term labor supply shocks should not matter at all for farm production; in settings where labor can be freely hired, morbidity should not affect farming output, since any lacking labor could be hired in local markets. During the first follow-up rounds, we directly questioned farmers about labor substitution. In total, 167 instances were reported where the head of household was not able to work on the field because he or she was sick.15 Out of these 167 episodes, substitute labor was hired only in 10 cases (6%). In 7 out of these 10 substitution cases, farmers found somebody to work for free; only three farmers reported paying for labor, with wages ranging between 0.5 and 11 dollars per day. Anecdotally, local piece work (“ganyu”) labor is widely available (Fink et al., 2014); in practice, however, most small-scale farmers do not appear to have the resources to hire such labor.

15 Note that the total number of episodes here relates to the four-month period between the baseline and the midline survey, and thus is different from the 85 illness episodes reported for the two-week period preceding the interview analyzed in the previous section of the paper.

Even if one is willing to accept that hiring short-term labor may be hard for farmers, the estimated impact numbers appear large. As shown in Goldberg (2014), agricultural wages for day labor are frequently less than US$ 1 in rural Malawi. In the Zambia settings, wages appear slightly higher, with median daily wages varying between KR 12.5 and KR 25 (US$ 2.5–5) reported for the peak labor season in focus groups. Even at US$ 5, direct labor costs are unlikely to account for the full differences in output observed.
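The labor-cost argument can be made concrete with a back-of-the-envelope calculation. The sketch below values the 3–5 working days that the paper estimates full net coverage saves per farm at the upper focus-group wage of US$ 5 per day, and recomputes the share of illness episodes in which substitute labor was hired. Variable names are illustrative; all inputs are figures quoted in the text.

```python
# Inputs quoted in the text.
days_saved_low, days_saved_high = 3, 5   # working days saved per farm
daily_wage_usd = 5.0                     # upper end of the reported US$ 2.5-5 range
episodes, episodes_with_hired_labor = 167, 10

# Value of the saved days at the market wage.
cost_low = days_saved_low * daily_wage_usd
cost_high = days_saved_high * daily_wage_usd

# Share of illness episodes in which substitute labor was arranged.
hire_rate = episodes_with_hired_labor / episodes

print(f"wage value of saved days: US$ {cost_low:.0f}-{cost_high:.0f}")
print(f"substitute labor hired in {hire_rate:.0%} of episodes")
```

Even valued at the highest reported wage, the saved days are worth US$ 15–25, and substitute labor was arranged in only about 6% of episodes, so lost household labor was rarely replaced through the market.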
Our estimates suggest that having all sleeping spaces covered with bed nets will save the average farm approximately 3–5 working days, which translates to about US$ 15–25, and therefore to 30% of the estimated effects at most. Two factors may at least partially explain the remaining gap: first, recovery from malaria is a slow process, frequently taking more than two weeks, during which farm workers are likely less productive (Nur, 1993) even if they report back to work on the field and thus would not be counted as “unable to work” in our analysis. A second possibility is that farmers in the study may have only reported major health events, so that the reported numbers do not fully capture the true program impact. In the 2007 Zambia Demographic and Health Survey (Macro International, 2007), 20% of children under-5 were reported to have suffered from fever or diarrhea over the 2-week period preceding the survey. Even if adults are substantially less prone to be sick, the reported morbidity prevalence seems very low: 32 episodes across 153 households in the control group implies about one episode for every 25 individuals or a fever prevalence of about 4%, which is about one fifth of the under-5 prevalence in the DHS.

One alternative explanation for the relatively large effects observed is local spillover effects, which could potentially also undermine the internal validity of the study: as demonstrated by Apouey and Picone (2014), social interactions in the realm of malaria and malaria prevention are likely, and may lead to stronger associations between health behaviors and health outcomes at the village or regional level. Even though it is quite likely that farmers interact with other farmers outside the 3 km radius chosen for the randomization, large spillover effects do not seem very likely in our setting: first, by working only with small-scale farmers having an outgrower contract with Dunavant, we covered only a small fraction (<25%) of farmers in each village, which means that changes in net coverage levels at the village level were relatively small. Theoretically, our intervention could also have increased knowledge or improved behavior. However, given that we did not provide any information or encourage usage of nets at all but simply provided access to subsidized nets, large changes in knowledge or behavior among untreated farms do not seem likely. In terms of the actual nets distributed, we closely monitored their usage and ownership, and found that virtually all nets (97%) remained in the households who originally acquired them throughout the study period, which makes us relatively confident that direct spillovers to control villages (nets reaching control villages) did not occur. One could also have expected the intervention to increase the appreciation and utilization of bed nets – we do not find any differences in utilization across groups.

Even though we think that the most likely mechanism from bed nets to agricultural productivity runs through the direct and indirect costs associated with ill-health, one alternative interpretation of the results presented is that subsidized or free nets constitute an upfront financial transfer, which allows farmers to spend money originally earmarked for bed nets on other farming related items. One could argue that farmers could sell bed nets, and use the resources for their own consumption or agriculture related expenditure. Given the data collected as part of this project, this appears rather unlikely. As discussed in Section 3 of the paper, less than 10% of nets available at baseline were actively purchased by farmers; in general, bed net purchases appear to be more of an exception. We also checked all households for the nets distributed: as stated above, out of 547 nets distributed in December 2009, 528 (97%) were located in the original households16 during the second follow-up interview in July, so that frequent sales of nets can be excluded as an alternative pathway.

16 Respondents were asked about the source of each net in their household as part of both follow-up surveys.

Given that the field work and net distribution were supported by Dunavant and respondents may associate interviewers with the company (even though no information provided by the farmer was shared with the company), farmers in the loan groups might selectively under-report their cotton production in order to not be obliged to repay their full loan to Dunavant. To investigate this hypothesis, we compared repayment and cotton sales (the amount of cotton sold from farmers to Dunavant) across the three groups in Dunavant records.

Out of the 493 farmers in our main analytical sample, administrative records could be found for 450 farmers (91.3%), with no differences in tracking rates across arms. Table 6 compares three outcomes from Dunavant's perspective: the likelihood of a farmer’s default, the likelihood of a farmer’s partial default, and the total amount of cotton sold to Dunavant. While all the estimates suggest that farmers in the two nets arms did slightly better than farmers in the control arm (consistent with the documented productivity effects), we do not find any evidence of farmers in the loan program displaying higher default rates; if anything, the point estimates suggest that farmers in the loan program perform best of all three groups. It also seems important to highlight that general repayment levels are high, with 84% of farmers fully repaying loans right after the harvest, and an additional 3% making partial payments. This suggests that farmers are fully cognizant of their credit commitments, and unlikely to just have signed up for the loan programs with the expectation of not repaying them.

Table 6
Treatment and agricultural loan performance.
 | No payment | Partial payment | Cotton sales to Dunavant (kg)
 | (4) | (5) | (6)
Net loan program | −0.003 | −0.038 | 98.1
 | (0.06) | (0.02) | (92.6)
Free net program | −0.001 | −0.006 | 89.3
 | (0.06) | (0.02) | (108.6)
Observations | 450 | 450 | 450
R-squared | 0.045 | 0.064 | 0.119
Notes: All specifications control for plot sizes used for cotton, maize and other crops in 2009 as well as in 2010, family members under 5, family members 5–14, family members 60 and older, household wealth, mosquito nets at baseline, ownership of chickens, goats and sheep, farmer age, education and marital status, total maize harvest 2009, total cotton harvest 2009 and total economic value of production 2009. Robust standard errors in parentheses are clustered at the cluster level. *** p < 0.01, ** p < 0.05, * p < 0.1.

In terms of cotton production, the overall sales estimates are not statistically significant, but consistent with the numbers reported in Table 3. The numbers reported in column 3 of Table 6 indicate that farmers sold on average an additional 90 kg of cotton to Dunavant. Given that each bale corresponds to about 80 kg of cotton, these numbers are very similar in magnitude to the 1.37 bales reported in Panel C of Table 3.

8. Summary and conclusion

All results presented from our experiment suggest that preventive health investment in the form of LLITNs can lead to substantial improvements in agricultural output among small-scale farmers. The estimates presented in this paper suggest that a typical household can increase harvest revenues by approximately US$ 76 if it covers all sleeping places with bed nets. These numbers seem rather large in absolute terms, but also seem consistent with farmer perceptions: when prompted regarding the expected health benefits of reducing the burden of malaria at the beginning of the harvesting season, farmers indicated that their farms would on average have 30% higher yields if nobody in the household would fall sick; given that bed nets reduce the incidence of fevers by 30–50% (Fink and
Masiye, 2012; Lengeler, 2004), the 15% increase in yields observed in this study appears to be within the range expected by farmers. It is worth highlighting that the study site was intentionally identified as an area with traditionally high malaria exposure: even though the overall farm structure in the sample analyzed appears fairly similar to other small-scale farms in rural Zambia and neighboring countries in terms of their size and profitability, malaria prevalence and incidence differ widely across regions, and will almost certainly affect the overall impact of similar programs in other areas.
Nevertheless, the high expected and realized returns to health investment documented in this paper naturally raise the question of why private investment in bed nets remains low. From a private sector perspective, the provision of bed nets may seem an attractive investment for larger companies interacting with farmers such as cotton ginners, both to generate goodwill and higher yields. Following the study presented in this paper, our study partner, Dunavant Cotton, did indeed decide to distribute bed nets to close to half of the farmers with the support of the World Bank. After an initial review, Dunavant opted against a continuation of the program because the overall benefits in terms of contracts signed did not appear large enough to support net programs on a continued basis from the ginner's perspective.17
17 See http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2358045 for a more detailed report on this initiative.
While ginning companies may not be able to capture a large enough fraction of additional farming outputs, the limited willingness of farming households to invest in bed nets and their own health appears a bit puzzling. Even though similarly low farmer willingness to invest has been documented for other high-yield investment options such as fertilizer or enhanced farming technologies (Duflo et al., 2008; Udry and Anagol, 2008), the low levels of (unsubsidized) adoption appear particularly puzzling in the context of nets. Nets are a technology well-known by farmers, and, if farmers' own statements are to be believed, a technology widely recognized as effective. When prompted at baseline, farmers in the study indicated that the burden of malaria in their households could be reduced by approximately 50% if full bed net coverage was available, a number which largely coincides with the WHO's current net effectiveness estimates (Lengeler, 2004). The fact that farmers of the study area believe in the effectiveness of nets in the area is also underlined by the fact that more than 90% of nets were used throughout the study; overall, lacking belief in net effectiveness does not appear to be a big issue in the studied setting.
Two possible explanations for the lack of preventive investment in the context of malaria are missing capital markets or credit constraints more generally. The results presented in this study clearly show some evidence in support of this hypothesis. Compared to the very small fraction of women purchasing bed nets at prices above US$ 1 when upfront payments are required (Cohen and Dupas, 2010), the demand for nets appears substantially increased when financing options are made available, as was done in this study and in Tarozzi et al. (2014). These increases in demand are both consistent with lacking access to credit (Udry and Anagol, 2008) and with models of hyperbolic discounting (Duflo et al., 2010); demand for nets may also have been particularly strong because repayment was due during the harvesting season, when farmers are relatively well endowed with cash (Duflo et al., 2010).
However, even when full (zero-interest) financing was offered to farmers in this study, uptake was only modest. One possible explanation for this is that farmers may expect to receive bed nets for free due to the large number of free or highly subsidized distribution campaigns run by the Government of Zambia in the region over the past years. Dupas (2014) investigates this hypothesis empirically, but does not find any evidence for free distribution campaigns affecting subsequent willingness to pay in the context of bed nets. A similar argument would be that free nets do not change willingness to pay directly, but may affect the perceived probability of receiving free nets in the future. While we cannot fully rule out this hypothesis, waiting for future distributions appears to be rather risky as a strategy given that net benefits are immediate and go substantially beyond the productivity improvements observed.
Under the assumption that farmers are risk-averse, a more plausible explanation for the reluctant uptake of pre-financed nets might simply be that nets constitute a relatively large investment, which may not be attractive to farmers even if the mean return to the investment is positive and large. With an average disposable (cash) income of less than US$ 200 per year, committing to an end-of-season payment of US$ 10–15 may appear difficult to farmers given the already high uncertainty faced with respect to final harvest outcomes. Recent evidence from Ghana suggests that reducing risk exposure with insurance programs may be central to increasing farm investment and productivity (Karlan et al., 2012); further research in this area will be needed to better understand the determinants of farm behavior and to design optimal policies in this area.
Acknowledgements
The authors would like to thank the Milton Foundation for funding this project, as well as Dunavant Cotton and in particular Rodrick Masaiti for the invaluable logistical support during all stages of the field work. We would also like to thank Richard Sedlmayr and Felix Lam for their input into the study design and the coordination of the field work, Peter Mulenga for the coordination of data entry, and Jenny Aker, Nava Ashraf, David Atkin, Jessica Cohen, Erica Field, Maggie McConnell, Kelsey Jack, Michael Kremer, Zoe McLaren and John Strauss as well as the participants at the NEUDC conference and the development seminar at Bocconi University for their comments and suggestions.
References
Apouey, B., Picone, G., 2014. Social interactions and malaria preventive behaviors in sub-Saharan Africa. Health Economics 23, 994–1012.
Arnold, B.F., Galiani, S., Ram, P.K., Hubbard, A.E., Briceno, B., Gertler, P.J., Colford Jr., J.M., 2013. Optimal recall period for caregiver-reported illness in risk factor and intervention studies: a multicountry study. American Journal of Epidemiology 177, 361–370.
Ashraf, N., Fink, G., Weil, D.N., 2010. Evaluating the effects of large scale health interventions in developing countries. In: The Zambian Malaria Initiative (Ed.), NBER Working Paper, vol. 16069. National Bureau of Economic Research, Cambridge, MA.
Audibert, M., Etard, J-F., 2003. Productive benefits after investment in health in Mali. Economic Development and Cultural Change 51, 769–782.
Baranov, V., Bennett, D., Kohler, H.-P., 2012. The Indirect Impact of Antiretroviral Therapy. PSC Working Paper Series, 9-27-2012.
Behrman, J.R., Foster, A.D., Rosenzweig, M.R., 1997. The dynamics of agricultural production and the calorie-income relationship. Journal of Econometrics 77, 187–208.
Benjamin, D., 1992. Household composition labor markets, and labor demand: testing for separation in agricultural household models. Econometrica 60, 287–322.
Bennett, D., Naqviy, S.A.A., Schmidt, W-P., 2014. Learning, Hygiene and Traditional Medicine. Working Paper.
Berger, V.W., 2010. Testing for baseline balance: can we finally get it right? Journal of Clinical Epidemiology 63, 939–940, author reply 940-932.
Bleakley, H., 2007. Disease and development evidence from hookworm eradication in the American South, February 2007. Quarterly Journal of Economics 122, 73–117.
Bleakley, H., Lange, F., 2009. Chronic disease burden and the interaction of education, fertility and growth. Review of Economics and Statistics 91, 52–65.
Clarke, S.E., Jukes, M.C.H., Njagi, J.K., Khasakhala, L., Cundill, B., Otido, J., Crudde, C., Estambale, B.B.A., Brooke, S., 2008. Effect of intermittent preventive treatment of malaria on health and education in schoolchildren: a cluster-randomised, double-blind, placebo-controlled trial. Lancet, 127–138.
Cohen, J., Dupas, P., 2010. Free distribution or cost-sharing. Evidence from a randomized malaria prevention experiment. Quarterly Journal of Economics 125, 1–45.
Das, J., Hammer, J., Sánchez-Paramo, C., 2012. The impact of recall periods on reported morbidity and health seeking behavior. Journal of Development Economics 98, 76–88.
Duflo, E., Kremer, M., Robinson, J., 2008. How high are rates of return to fertilizer. Evidence from field experiments in Kenya. American Economic Review Papers and Proceedings 98, 482–488.
Duflo, E., Kremer, M., Robinson, J., 2010. Nudging farmers to use fertilizer: theory and experimental evidence from Kenya. American Economic Review 101, 2350–2390.
Dupas, P., 2014. Short-run subsidies and long-run adoption of new health products: evidence from a field experiment. Econometrica 82, 197–228.
Ehrhardt, S., Burchard, G., Mantel, C., Cramer, J., Kaiser, S., Kubo, M., Otchwemah, R., Bienzle, U., Mockenhaupt, F., 2006. Malaria, anemia, and malnutrition in African children – defining intervention priorities. Journal of Infectious Diseases 194, 108–114.
Fink, G., Masiye, F., 2012. Assessing the impact of scaling-up bednet coverage through agricultural loan programmes: evidence from a cluster randomised controlled trial in Katete, Zambia. Transactions of the Royal Society of Tropical Medicine and Hygiene 106, 660–667.
Fink, G., Jack, B.K., Masiye, F., 2014. Seasonal Credit Constraints and Agricultural Labor Supply: Evidence from Zambia. NBER Working Paper., pp. 20218.
Fox, M.P., Rosen, S., MacLeod, W.B., Wasunna, M., Bii, M., Foglia, G., Simon, J.L., 2004. The impact of HIV/AIDS on labour productivity in Kenya. Tropical Medicine & International Health 9, 318–324.
Frigge, M., Hoaglin, D.C., Iglewicz, B., 1989. Some implementations of the box plot. The American Statistician 43, 50–54.
Gilgen, D.D., Mascie-Taylor, C.G., Rosetta, L.L., 2001. Intestinal helminth infections, anaemia and labour productivity of female tea pluckers in Bangladesh. Tropical Medicine & International Health 6, 449–457.
Girardin, O., Daoa, D., Koudou, B.G., Essé, C., Cissé, G., Yao, T., N'Goran, E.K., Tschannen, A.B., Bordmannd, G., Lehmannc, B., Nsabimana, C., Keiser, J., Killeen, G.F., Singer, B.H., Tanner, M., Utzinger, J., 2004. Opportunities and limiting factors of intensive vegetable farming in malaria endemic Côte d'Ivoire. Acta Tropica 89, 109–123.
Glennerster, R., Takavarasha, K., 2013. Running Randomized Evaluations: A Practical Guides. Princeton University Press, Princeton, NJ.
Goeb, J.C. (Ed.), 2011. Impacts of Government Supports on Smallholder Cotton Production in Zambia, vol. Master of Science. Michigan State University, Lansing.
Goldberg, J., 2014. Kwacha gonna do? Experimental evidence about labor supply in rural Malawi. Working paper.
Hay, S.I., Snow, R.W., 2006. The malaria atlas project: developing global maps of malaria risk. PLOS Medicine 3, 473.
Jayne, T.S., Zulu, B., Kajoba, G., Weber, M.T., 2008. Access to land and poverty reduction in rural Zambia: connecting the policy issues. In: Food Security Research Project Working Paper, p. 34.
Karlan, D., Osei, R., Osei-Akoto, I., Udry, C., 2012. Agricultural Decisions after Relaxing Credit and Risk Constraints. Mimeo.
Knottnerus, J.A., Tugwell, P., 2012. Good baseline balance – a prerequisite for valid comparison. Journal of Clinical Epidemiology 65, 119–120.
Kremer, M., Miguel, E., 2004. Worms identifying impacts on education and health in the presence of treatment externalities. Econometrica 72, 159–217.
Lengeler, C., 2004. Insecticide treated bednets and curtains for preventing malaria. Cochrane Database of Systematic Reviews, CD000363.
Loureiro, M.L., 2009. Farmers' health and agricultural productivity. Agricultural Economics 40, 381–388.
Macro International, 2007. Zambia: DHS, 2007 – Final Report (English). Macro International, Calverton, MD.
Morel, C.M., Thang, N., Xa, N., Hung, L.X., Thuan, L.K., Ky, P.V., Erhart, A., Mills, A.J., D'Alessandro, U., 2008. The economic burden of malaria on the household in south-central Vietnam. Malaria Journal, 7.
NMCC, 2007. Zambia Malaria Indicator Survey 2006. Zambia Ministry of Health, Lusaka.
NMCC (Ed.), 2009. Zambia Malaria Indicator Survey 2008. Zambia Ministry of Health, Lusaka, Zambia.
NMCC, 2010. Zambia Malaria Indicator Survey 2010. Zambia Ministry of Health, Lusaka.
NMCC (Ed.), 2011. ITN Distribution Data Base. NMCC, Lusaka.
Nolte, K., April 2012. Large scale agricultural investments under poor land governance systems: actors and institutions in the case of Zambia. In: World Bank Conference on Land and Poverty Paper 2012.
Nur, E.T.M., 1993. The impact of malaria on labour use and efficiency in the Sudan. Social Science & Medicine 37, 1115–1119.
Pitt, M.M., Rosenzweig, M.R., 1986. Agricultural prices, food consumption, and the health and productivity of Indonesian farmers. In: Singh, I.J., Squire, L., Strauss, J. (Eds.), Agricultural Household Models. Johns Hopkins University Press, Baltimore.
Riley, R.D., Kauser, I., Bland, M., Thijs, L., Staessen, J.A., Wang, J., Gueyffier, F., Deeks, J.J., 2013. Meta-analysis of randomised trials with a continuous outcome according to baseline imbalance and availability of individual participant data. Statistics in Medicine 32, 2747–2766.
Strauss, J., 1986. Does better nutrition raise farm productivity. Journal of Political Economy 94, 297–320.
Tarozzi, A., Mahajan, A., Blackburn, B., Kopf, D., Krishnan, L., Yoong, J., 2014. Micro-loans insecticide-treated bednets malaria evidence from a randomized controlled trial in Orissa (India). American Economic Review 104, 1909–1941.
Thomas, D., Frankenberg, E., Friedman, J., Habicht, J.-P., Ingwersen, N., McKelvey, C., Mohammed Hakimi, J., Pelto, G., Sikoki, B., Seeman, T., Smith, J.P., Sumantri, C., Suriastini, W., Wilopo, S., 2010. Causal Effect of Health on Labor Market Outcomes: Experimental Evidence.
Udry, C., Anagol, S., 2008. The Return to Capital in Ghana.
Ulimwengu, J., 2009. Farmers' health and agricultural productivity in rural Ethiopia. African Journal of Agricultural and Resource Economics 3, 83–100.
WHO (Ed.), 2009. World Malaria Report 2008. WHO, Geneva.
World Bank (Ed.), 2007. World Bank Development Indicators CD-ROM.
Zambia Central Statistic Office (Ed.), 2011a. 2010 Census of Population and Housing. Zambia Central Statistic Office, Lusaka, Zambia.
Zambia Central Statistic Office (Ed.), 2011b. Living Conditions Monitoring Survey Report 2006 and 2010. Zambia CSO, Lusaka, Zambia.
Zambia Ministry of Health, 2006. A 6-year Strategic Plan A Road Map for Impact on Malaria in Zambia 2006–2011s. NMCC, Lusaka.
Zambia Ministry of Health (Ed.), 2012. Health Management Information System (HMIS). Zambia Ministry of Health, Lusaka.
Journal of Health Economics 42 (2015) 165–173
Article info
Article history: Received 3 April 2014; received in revised form 9 April 2015; accepted 10 April 2015; available online 20 April 2015
Keywords: Learning; Diversification; Comparative effectiveness research; Economic evaluation; Instrumental variables; Heterogeneity
JEL classification: C1; C9; D6; I1
Abstract
Using Roy's model of sorting behavior, I study welfare implications of learning about medical care quality through the current health care data production infrastructure that relies on solicitation of research subjects. Due to severe adverse-selection issues, I show that such learning could be biased and welfare decreasing. Direct diversification of treatment receipt may solve these issues but is infeasible. Unifying Manski's work on diversified treatment choice under ambiguity and Heckman's work on estimating heterogeneous treatment effects, I propose a new infrastructure based on temporary diversification of access that resolves the prior issues and can identify nuanced effect heterogeneity.
© 2015 Elsevier B.V. All rights reserved.
One of the fundamental challenges in health care markets is lack of information about the quality of medical care and technology (Arrow, 1963). Information on medical product quality is usually generated by employing an artificial form of 'learning by doing' mechanism where a selected group of individuals (doers) is allowed to consume alternative medical products (e.g. using standard statistical designs, such as randomized assignment of patients to products). Wisdom from their experiences is disseminated to other individuals, who will face the choice of using these medical products in the near future, and to inform other decision makers, who devise social policies on access2. Most public and private stakeholders that are engaged in data production on medical quality signals have employed such mechanisms. Recently, substantial public investments were made in the US, under the umbrella terms "comparative effectiveness research" (CER) and patient-centered outcomes research (PCOR)3, to facilitate production of such data on alternative medical technologies that are currently being used in clinical practice, albeit with incomplete knowledge about their comparative qualities4.
∗ Corresponding author at: Department of Pharmacy, Health Services and Economics, University of Washington, Seattle, WA 98195-7660, United States. Tel.: +1 206 616 2986; fax: +1 206 543 3964.
E-mail address: basua@uw.edu
1 I am grateful for comments from Karl Claxton, David Meltzer, Justin Robertson and two anonymous reviewers and support from NIH research grants RC4CA155809 and R01CA155329. Opinions expressed are mine and do not reflect those of the University of Washington or the NBER.
2 There are situations where learning from one's own doing is common, such as the repeated use of pharmaceutical products in chronic illnesses.
3 Patient Protection and Affordable Care Act of 2009, H.R. 3590, 111th Congress §6301 (2010).
4 Throughout this paper, I assume the CER compares two medical technologies that have been approved for use based on meeting the minimum safety thresholds such as those set by the Food and Drug Administration of the United States. The discussion does not encompass evaluation of experimental therapies; that is left to future work. Also see Philipson (1997) and Malani (2008), who make distinct arguments about selection in trials of experimental therapies in the presence of health insurance.
http://dx.doi.org/10.1016/j.jhealeco.2015.04.001
0167-6296/© 2015 Elsevier B.V. All rights reserved.
In this paper, using a simple Roy's model (Roy, 1951) of sorting behavior, I prove that, when incremental treatment effects are heterogeneous across patients who have access to these treatments under insurance, a data production infrastructure for comparative medical quality that relies on soliciting voluntary participation of subjects fails to identify any interpretable treatment effect parameter. Therefore, evidence generated through this process fails to inform, objectively, either the individual patient on optimal medical care use or a social insurer on optimal medical care insurance coverage5.
5 Note that my assertions about optimality are very general and do not depend on specific welfare functions. What I prove is that the structural target parameters on which information is required to maximize any welfare function are not informed by current data production infrastructure.
Unfortunately, such a data production infrastructure is and has been the norm for CER randomized clinical trial (RCT) studies. There are many examples of such failures in the literature. For example, Ioannidis and Lau (1997) show that in human immunodeficiency virus-related trials and trials of magnesium in acute myocardial infarction, when the benefit or toxicity from a treatment varies with the baseline risk of each patient, the treatment effect may be markedly different in populations with a different representation of high- and low-risk patients. I show that such differential representation of the population in trials may be driven more fundamentally by patient and physician behaviors and therefore the problems of interpretation of trial results are systemic.
The implications of this finding are substantial. Incomplete comparative quality information generated by CER RCT research has the potential to misguide treatment choices since ex-ante perception of benefits does not coincide with the ex-post accrual of the same, resulting in welfare losses (Basu, 2011). These inefficiencies in the choice of medical products can also accentuate the inefficiencies due to moral hazard stemming from health insurance (Arrow, 1963; Pauly, 1968), translating to higher premiums and less protection against risk, in both competitive and non-competitive insurance markets (Basu, 2011). In this paper, I focus on understanding why current data production infrastructure leads to incomplete information.
I begin in the next section by laying out the ideal target parameters in CER evaluations; those that we would like to obtain estimates for in order to guide treatment decisions at the individual level and policy decisions on coverage at the social level. In Sections 2 and 3, I highlight the current data production infrastructure and prove why it would produce incomplete information. I study the implications of such incompleteness for decision-making and welfare. In Section 4, I introduce a new framework for data production that can efficiently resolve the biases inherent in the current data production infrastructure by using diversification of access to create a conduit for learning about meaningful and decision-relevant effect parameters. This work unifies two broad themes in the econometrics literature, one based on Manski's work on treatment choice under ambiguity (Manski, 2000, 2004, 2009) that utilizes the concept of diversification of treatment and the other based on Heckman, Vytlacil and others' works on estimating heterogeneous treatment effects (Heckman, 1997, 2001; Heckman and Vytlacil, 1999, 2001; Heckman et al., 2006). I show how this framework can help overcome inefficiencies in health care markets that stem from incomplete information.
1. Defining the true population average effect of a treatment
Let us begin with a problem of evaluating the comparative effectiveness of a new (approved) treatment compared to a control/standard treatment for a population of $N$ patients indexed by $i$. Standard treatment may also include the do-nothing option. Let the individual-level true treatment effects represent the benefits (net of harms) of the new treatment over the control and be denoted by $b_i$. Let $p$ denote the price of the new treatment, which is also the marginal cost for manufacturing the new treatment6. Patients are members of risk classes $\Omega$, $\Omega = 1, 2, \ldots, k$; $k \le N$, which determine heterogeneity in treatment effects across individuals through a production function:
$$b_i = \sum_k \alpha_k \times I(\Omega_i = k) \qquad (1)$$
where $I()$ is an indicator function and $\alpha_k$ is interpreted as the true comparative effect of the new treatment over the standard treatment in risk class $k$. Let's assume that this comparative effect is expressed in monetary terms. That is, the effectiveness unit is multiplied by some predefined threshold representing the monetary value of the marginal unit of benefit7.
6 Assume for now that the marginal cost is constant.
7 Under the welfare economic foundations, this threshold is the inverse marginal utility of income (Weinstein and Zeckhauser, 1973; Garber and Phelps, 1997; Meltzer, 1997).
A population-level average effect parameter is given as
$$\mu = \sum_k \Pr(\Omega = k) \times \alpha_k \qquad (2)$$
There are two types of decision makers: (1) the patient-physician dyad, which I will refer to as the individual decision maker, is assumed to always have knowledge about their risk class; and (2) an insurer or social planner who decides the coinsurance rate for providing health insurance coverage for the new treatment.
2. Data production and incompleteness in quality information
A first-best scenario can be achieved under complete information, where both the insurer and the individuals are aware of the risk classes and the production function and are able to perfectly predict $b_i$. If individuals had full insurance they would choose treatment if $b_i \ge 0$. Since the insurer can fully anticipate this individual behavior, she can provide full coverage for treatment only for those individuals who would experience benefits greater than cost and not provide coverage for the rest. Thus, there is no efficiency loss due to moral hazard.
Under the second-best scenario, there exists an asymmetry of information where, even though individuals are assumed to be aware of $\Omega_k$ and $b()$ and to be able to combine them to predict $b_i$ perfectly, the insurer cannot, as they have either no or only partial information on $\Omega_k$ (Arrow, 1963; Pauly and Blavin, 2008). Consequently, the insurer cannot exclude patients from coverage who would get treatment benefits lower than the cost of treatment (i.e. $b_i - p < 0$). This leads to moral hazard (Pauly, 2008). To counter this, the insurer may offer coverage with a fractional coinsurance rate ($r$), which is the fraction of price a patient must pay in order to receive treatment. When $r = 1$, the new medical product is not covered through insurance.
I assume individuals choose treatment by maximizing a generic Net-Benefit criterion that is based on their perceived benefits from treatment net of the demand price they face in acquiring the treatment. I also assume that the social insurer's goal is to maximize consumer surplus as is realized ex post based on individual level choices. Throughout this paper, I express the realized population level benefits under different levels of coverage for the new treatment as changes to the total outcomes had all patients taken the
standard treatment. Under any co-insurance rate $r$, $r \in [0, 1]$, this population level benefit, $H_0$, is given as
$$H_0 = \sum_i \sum_k I(\alpha_k - r \times p \ge 0) \times \alpha_k \times I(\Omega_i = k) \qquad (3)$$
When individuals have complete information, they choose to receive the new treatment only if $\alpha_k \ge r \times p$. Individuals who would expect to get harmed by treatment (i.e. $\alpha_k < 0$) would not select treatment even if it were available to them for free, thereby self-limiting the magnitude of moral hazard.
From a social insurer's point of view, an optimal co-insurance rate may be expressed as a solution to maximizing $H_0$ net of costs and taking into account the social value of risk protection provided by the insurance (Manning and Marquis, 1996). Consequently, the moral hazard (welfare loss) under optimal coinsurance rate ($r^*$) in a second-best scenario is given as
$$L_{2nd}(r^*) = \sum_i \sum_k I(r^* \times p \le \alpha_k < p) \times (p - \alpha_k) \times I(\Omega_i = k) = \sum_{\alpha_k = r^* \times p}^{p} N_k \times (p - \alpha_k) \qquad (4)$$
which constitutes the welfare loss due to the total number of individuals in each risk group ($N_k$) who would choose treatment given the lower demand price ($r \times p$) but ultimately obtain benefits lower than the price of treatment, i.e. $r^* \times p \le \alpha_k < p$.
Reality, however, deviates from both the first and second best scenarios, because both individuals and the social decision maker face incomplete comparative information. To understand this incompleteness, one must study the data production mechanisms in place. I consider and compare the circumstances before and after a CER study. I begin by understanding the consequences of incomplete information before a CER is conducted and why added investments for data productions, such as those provisions by the [...]
[...] generated during the process of approving the use of this new medical product would determine an individual patient's anticipated belief about the incremental benefits of treatment given one's own risk class. Let this evidence suggest that the average effect of treatment is $\bar{\alpha}$, a random draw8 from Normal$(\mu, \sigma^2/n)$, where $\mu$ is the average effect parameter defined in (2), $\sigma^2$ represents the heterogeneity and $\sigma$ is the standard deviation of the effect in the population. Let individual beliefs, $\alpha_i$, be given as a single draw from the distribution Normal$(\bar{\alpha}, s^2)$ where $s$ is the estimated standard deviation from prior evidence. It is assumed that $s^2$ is a consistent estimator of $\sigma^2$. The schedule of $\alpha_i$ across individual patients determines the marginal benefits curve in the population in the absence of a CER, which is based on aggregation of individuals' perceived marginal benefits. Moreover, the social insurer may not have perfect information on either $\alpha_k$ or $\alpha_i$. However, she may have information about the average effect, $\bar{\alpha}$. The best a social insurer can do at this point is to calculate the average net monetary benefits of treatment,
$$\bar{\alpha} - p \qquad (5)$$
and recommend coverage9 if $\bar{\alpha} - p \ge 0$.
8 I take a conservative approach in assuming that $\bar{\alpha}$ and $\alpha_i$ are consistent estimators of $\mu$. To the extent this is not true, the welfare losses described below may be higher.
9 This is, in fact, the standard method used in most cost-effectiveness modeling studies that try to evaluate the cost-effectiveness of a new approved treatment for which there is no head-to-head comparison with its alternatives.
Without loss of generality,
Assumption 1. Let $\mu > 0$, the true population average treatment effect is positive, but the $\alpha_k$'s span the whole real line.
Assumption 2. Let $\bar{\alpha} - p \ge 0$ and full coverage was recommended, i.e. $r^* = 0$.
Theorem 1. Under Assumptions 1 and 2, $L_{PRE}(0) > L_{2nd}(r^*)$ for $\forall r^* \in [0, 1]$ if $\alpha_i \ge 0$, $\forall i$. The welfare loss under pre-CER information with full insurance coverage is strictly larger than the welfare loss under any second-best scenario as long as all individuals perceive a positive benefit from treatment.
Proof. Under the Pre-CER scenario, two groups of individuals making inefficient choices drive the welfare loss. The first group consists of people who fail to receive treatment because their $\alpha_i < 0$ but they belong to risk groups where the treatment produces incremental benefits that are more than the price of the treatment (i.e. $\alpha_k > p$). The second group consists of individuals who would receive treatment as they are led to believe that they would get a positive benefit ($\alpha_i \ge 0$) but realize a benefit less than the price (i.e. $\alpha_k < p$). Therefore total welfare loss is given by
$$L_{PRE,CER} = \sum_i \sum_k I(\alpha_i < 0) \times I(\alpha_k - p > 0) \times (\alpha_k - p) \times I(\Omega_i = k) + \sum_i \sum_k I(\alpha_i \ge 0) \times I(\alpha_k - p < 0) \times (p - \alpha_k) \times I(\Omega_i = k)$$
$$\cdots + \sum_{\alpha_k = 0}^{p} \left[\Phi\left(\frac{\bar{\alpha}}{s}\right) - 1\right] \times N_k \times (p - \alpha_k) + \sum_{\alpha_k = -\infty}^{0} \Phi\left(\frac{\bar{\alpha}}{s}\right) \times N_k \times (p - \alpha_k) \qquad (7)$$
The first and the third terms in (7) are the incremental losses due to incomplete information pre-CER. The first term is the same as in (6) and comprises individuals who fail to take treatment but would have benefited more than its price. The loss represented in the third term emanates from risk groups where $\alpha_k < 0$ and a fraction of individuals in these risk groups take treatment based on their perceived positive benefits, which was not the case under the second-best scenario.
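The comparison in Theorem 1 can be illustrated numerically. The following is a minimal sketch, not the paper's own computation: it evaluates the second-best loss of Eq. (4) and the pre-CER loss under full coverage as described in the proof (patients treat when their perceived benefit is non-negative) on a made-up population. The names `Patient`, `PRICE` and `POPULATION`, the function names, and all numerical values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    alpha_k: float  # true monetary benefit in the patient's risk class
    alpha_i: float  # perceived (pre-CER) benefit; hypothetical value


def loss_second_best(patients, p, r):
    """Eq. (4): informed patients treat when alpha_k >= r*p; a loss of
    (p - alpha_k) accrues for every treated patient with alpha_k < p."""
    return sum(p - pt.alpha_k for pt in patients if r * p <= pt.alpha_k < p)


def loss_pre_cer(patients, p):
    """Pre-CER loss under full coverage (r = 0), following the proof:
    choices depend on the belief alpha_i, not on the true alpha_k."""
    # Group 1: forgo treatment (alpha_i < 0) although alpha_k > p.
    forgone = sum(pt.alpha_k - p for pt in patients
                  if pt.alpha_i < 0 and pt.alpha_k > p)
    # Group 2: take treatment (alpha_i >= 0) although alpha_k < p.
    overuse = sum(p - pt.alpha_k for pt in patients
                  if pt.alpha_i >= 0 and pt.alpha_k < p)
    return forgone + overuse


# Assumed toy inputs: one patient per risk class, all perceiving a positive benefit.
PRICE = 10.0
POPULATION = [Patient(alpha_k=a, alpha_i=6.0) for a in (-5.0, 4.0, 8.0, 15.0)]

if __name__ == "__main__":
    print("pre-CER loss, full coverage:", loss_pre_cer(POPULATION, PRICE))
    for r in (0.0, 0.5, 1.0):
        print(f"second-best loss at r={r}:",
              loss_second_best(POPULATION, PRICE, r))
```

In this toy population every belief is positive, so the pre-CER loss also counts the patient with a negative true effect who takes treatment, which is why it exceeds the second-best loss at every coinsurance rate tried.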
The second term in (7) is a pervasive benefit of incomplete for ˛K ≤ p as by construction ˛K < ˛ ¯ for all ˛K ≤ p. Therefore,
information compared to the second-best scenario (expressed as negative loss). The benefits emanate from risk groups where $0 < \alpha_k < p$ and a fraction of individuals in these risk groups forgo treatment based on their perceived benefits (i.e. their $\tilde{\alpha}_i < 0$), which was not the case under the second-best scenario, thereby generating welfare gains.

Under Assumptions 1 and 2, if one assumes that $\tilde{\alpha}_i \ge 0\ \forall i$, that is, every individual perceives a positive benefit from treatment, Theorem 1 is proved from (7) as the first two terms drop out and $L_{PRE,CER}(0) > L_{2nd}(0)$. Thus, naturally, $L_{PRE,CER}(0) > L_{2nd}(r^*)$ for all $r^* \in [0, 1]$, since $L_{2nd}(0) > L_{2nd}(r^*)$ for all $r^* \in [0, 1]$.

2.2. An ideal role for CER

Often an ideal CER is construed as one having a larger sample size. In fact, much of the value-of-information literature in medicine has focused on estimating the marginal value of a trial with additional patients enrolled (see the literature on the Expected Value of Sample Information, EVSI). However, it is not clear whether, in the presence of heterogeneity, such an approach to CER is welfare enhancing. For example, an increase in sample size cannot overcome the issues about selection into trials that we discuss later. Nevertheless, for now, let us assume that those concerns are not relevant. Then, as $n \to \infty$, $\bar{\alpha} \xrightarrow{p} \mu$ and $s^2 \xrightarrow{p} \sigma^2$. Consequently, following Eq. (6) and substituting the post-CER estimate of the average effect in it, the welfare loss post CER with an infinite sample will be:

$$L_{POST,CER(n\to\infty)} = \sum_{\alpha_k = p}^{\infty} \left[1 - \Phi\left(\frac{\mu}{\sigma}\right)\right] \times N_k \times (\alpha_k - p)$$

$L_{POST^*}(r^*) < L_{PRE}(r^*)\ \forall r^* \in [0, 1]$.

This unambiguous dominance of an ideal CER over the pre-CER scenario arises because individuals are able to better self-select their optimal treatment based on the risk-group-specific knowledge generated from an ideal CER. In fact, as $\sigma_k \to 0$, $\Phi(\alpha_k/\sigma_k) \to 1$ for $\alpha_k \ge 0$ and $\Phi(\alpha_k/\sigma_k) \to 0$ for $\alpha_k < 0$. Consequently, $L_{POST^*}(r) \to L_{2nd}(r)$. Therefore, one can potentially approach a second-best scenario under any level of insurance coverage if new CER studies are able to generate information that can enable individuals to better self-select treatments based on their risk classes, even if the social insurer is unaware of these heterogeneous effects. The growing awareness of the potential value of such CER has led to considerable federal investment in CER. New legislation such as the Affordable Care Act of 2010 has also identified the need to risk stratify comparative effectiveness.

However, the current data production infrastructure for CER may not be aligned with the goals of such legislation. The gold standard of data production in medical care involves controlled experiments, where alternative treatments under investigation are allocated to a selected group of patients by a chance mechanism. I will refer to such a mechanism as a randomized clinical trial (RCT) henceforth. I consider two issues within this data-generating infrastructure that contribute towards the inability of the current CER infrastructure to resolve incompleteness in information: selection in RCT enrollment and target parameters for RCTs.

2.3. Non-ideal design and implementation of CER studies: Understanding selection into randomized trials

Unlike evaluation of experimental therapy where enrollments may be more likely driven by altruistic motives, CER and economic
value of CER is maximized when $\tilde{\alpha}_i \perp b_i$, where $\perp$ denotes statistical independence. In practice, however, it is common to find some dependency between $\tilde{\alpha}_i$ and $b_i$. Such dependencies may arise from biological knowledge about the treatment's mechanism of action, past experiences of physicians in using similar treatments on certain patient risk groups, and patients' own learning by doing in a chronic disease setting. Under such dependencies, the effect of selection into an RCT becomes non-trivial. Specifically, I show

Theorem 2. A CER randomized trial produces an unbiased estimate of the population average treatment effect ($\mu$ in Eq. (2)) if and only if $\tilde{\alpha}_i \perp b_i$. If $\mathrm{Corr}(\tilde{\alpha}_i, b_i) > 0$, RCTs will typically find small positive benefits of treatment that are likely to be biased estimates of the true average treatment effect.

Proof. I formalize selection into a CER RCT following Roy's model (Roy, 1951) of self-selection using the following notation

$$S_i = I(U_i^* \ge 0) \tag{10}$$

where $S_i$ is an indicator for enrolling in an RCT that is driven by the latent utility $U_i^*$ for enrolling. Again, without loss of generality, $U_i^*$ is interpreted as the anticipated incremental net benefits (net of costs) of enrolling versus not enrolling in an RCT for individual $i$ given that the individual anticipates a positive benefit from treatment (i.e. $\tilde{\alpha}_i > 0$)13

$$U_i^* = (\pi_R \times \tilde{\alpha}_i - C_{RCT}) - (\tilde{\alpha}_i - C_{OUT}) = (C_{OUT} - C_{RCT}) - (1 - \pi_R) \times \tilde{\alpha}_i, \quad \text{if } \tilde{\alpha}_i > 0 \tag{11}$$

where $C_{RCT}$ and $C_{OUT}$ are the costs of accessing the treatment within and outside an RCT, respectively; $\pi_R$ is the known random probability of receiving the new treatment within the CER RCT. In the presence of uncertainty, the population probability of an RCT enrollment is given by

$$\pi = \Pr(S_i) = E\left[I(U_i^* \ge 0)\right] = \Pr\left(0 < \tilde{\alpha}_i < \frac{C_{OUT} - C_{RCT}}{1 - \pi_R}\right) \tag{12}$$

Therefore, only patients who anticipate positive benefits but whose magnitudes are less than the expected incremental cost of accessing treatment outside the RCT would enroll. Interestingly, when $C_{OUT} \approx C_{RCT}$, enrollment in CER can be quite difficult. In contrast, if $C_{OUT} \gg C_{RCT}$, then the probability of enrollment will approach one for everyone. Similarly, as $\pi_R$ decreases, it reduces the cost differential between accessing the new treatment outside and within the RCT, thereby lowering the probability of RCT enrollment. These factors severely limit the generalizability of results from CER RCTs. For example, in one of the few surveys ever conducted to understand the factors that determine RCT enrollment, it was found that only 2.7% of eligible patients enrolled in clinical oncology trials (Movsas et al., 2007).

Target Parameters for RCT. The goal of an RCT is to estimate a structurally meaningful population-level parameter such as the average treatment effect (ATE) of treatment compared to the control. Instead, the target population ends up being defined by the RCT enrollees. Consequently, its target population defines the target parameter that an RCT tries to estimate. This parameter is an average effect that is a weighted average of risk-class-specific effects, where the weights are arbitrarily defined based on the risk-class-specific propensity to enroll in the RCT. Therefore, the target parameter for an RCT is given by

$$\mu_{RCT} = \sum_k w_k \times \alpha_k \tag{13}$$

where the weights $w_k = \pi_k \times \Pr(\Omega = k) / \sum_k \pi_k \times \Pr(\Omega = k)$ and $\pi_k = \Pr\left(0 < \tilde{\alpha}_i < (C_{OUT} - C_{RCT})/(1 - \pi_R) \mid \Omega_i = k\right)$. The degree of selection in the trial determines the target parameter of an RCT. The touted internal validity of RCTs rests on obtaining a consistent estimate for this target parameter.

When $\tilde{\alpha}_i \perp b_i$, $F(\tilde{\alpha}_i \mid \Omega_i = k) = F(\tilde{\alpha}_i)\ \forall k$, where $F()$ is the cumulative distribution function. This implies $\pi_k = \pi\ \forall k \Rightarrow w_k = \Pr(\Omega = k)\ \forall k \Rightarrow \mu_{RCT} = \mu$ (according to Eqs (2) and (13)). On the contrary, if $\tilde{\alpha}_i \not\perp b_i$, $\mu_{RCT} \ne \mu$ since the weights would vary depending on which risk classes are more likely to enroll in the RCT.

In fact, if perceived benefits are positively correlated with true benefits, i.e. $\mathrm{Corr}(\tilde{\alpha}_i, b_i) > 0$, it implies $\mathrm{Corr}(w_k, \alpha_k) < 0$ for $\tilde{\alpha}_i > 0$ and $\mathrm{Corr}(w_k, \alpha_k) > 0$ for $\tilde{\alpha}_i \le 0$. Individuals who correctly anticipate large positive or negative benefits from treatment are less likely to enroll in RCTs. In fact, Eqs (12) and (13) suggest that the marginal individual who enrolls in an RCT anticipates a moderate positive magnitude of benefits from treatment. This implies that RCT results would typically find small positive benefits of a newer treatment and the generalizability of these results to the whole target population remains severely compromised14. ∎

Consequently, in the presence of any anticipatory knowledge about true treatment effects, the average effect from an RCT is not a consistent estimator for either the population average effect or the average effect of any segment of the population: $E(\hat{\mu}_{RCT}) \ne \mu$ and $E(\hat{\mu}_{RCT}) \ne \alpha_k\ \forall k$. Next, I study how such results can mislead individual-level decision-making and create inefficiencies both through population-level coverage decisions and individual treatment selections.

Assumption 3. In what follows, I will assume $\mathrm{Corr}(\tilde{\alpha}_i, b_i) > 0$ even in the absence of a formal CER.

2.4. Implications of incompleteness for decision-making

Theorem 3.

(a) Under Assumption 3, a CER RCT may misguide a social planner to provide coverage on treatments with negative average net health benefits and to withhold coverage on treatments with positive average net health benefits.
(b) Under Assumption 3, $L_{POST}(0) \gtreqless L_{PRE}(0)$. The welfare loss under post-CER information can be larger than that under the pre-CER scenario with full insurance coverage.

Proof.

(a) Based on CER RCT results, the social planner updates her belief over the average effect of the new treatment using a Bayesian updating rule (Basu et al., 2011):

$$\bar{\bar{\alpha}} = \lambda \times \bar{\alpha} + (1 - \lambda) \times \hat{\mu}_{RCT} \tag{14}$$

where the weight $\lambda$ is determined by a weighted average of prior uncertainty $\sigma^2$ and the sampling variance of $\hat{\mu}_{RCT}$, and calculates the average net monetary benefits of treatment to be $\bar{\bar{\alpha}} - p$. Under Assumption 3, Theorem 2 proves that $E(\hat{\mu}_{RCT}) > 0$ but $E(\hat{\mu}_{RCT}) \gtreqless \mu$. Therefore, since $E(\bar{\alpha}) =$

13 If an individual anticipates a negative benefit from treatment, he would not consider enrolling in the first place.
14 It is possible that under an ideal symmetric condition, the weights are such that equivalent portions of the risk groups with large positive effects and those with large negative effects select out of enrolling and the average effect among the enrollees still reflects the population average. However, such a scenario is highly unlikely.
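The selection mechanism in Eqs. (10)–(13) can be illustrated with a small numerical sketch. It is not part of the original analysis: it assumes, purely for illustration, that perceived benefits within each risk class are normally distributed, and the class effects, class shares, and enrollment cutoff are hypothetical numbers. The function name `rct_target` and its arguments are likewise made up for this example.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def rct_target(alpha, prob, perceived_mean, threshold, sd=1.0):
    """Return (w_k, mu_RCT, mu) for a list of risk classes k.

    alpha          : true mean effect alpha_k per class
    prob           : Pr(Omega = k) per class
    perceived_mean : mean perceived benefit per class (hypothetical input)
    threshold      : (C_OUT - C_RCT) / (1 - pi_R), the enrollment cutoff in Eq. (12)
    sd             : within-class spread of perceived benefits (assumed normal)
    """
    # pi_k = Pr(0 < perceived benefit < threshold | class k)
    pi_k = [normal_cdf((threshold - m) / sd) - normal_cdf((0.0 - m) / sd)
            for m in perceived_mean]
    total = sum(p * q for p, q in zip(pi_k, prob))
    w = [p * q / total for p, q in zip(pi_k, prob)]   # enrollment weights w_k
    mu_rct = sum(wk * a for wk, a in zip(w, alpha))   # Eq. (13)
    mu = sum(q * a for q, a in zip(prob, alpha))      # population average effect
    return w, mu_rct, mu


alpha = [-2.0, 0.5, 4.0]        # hypothetical class-level true effects
prob = [1 / 3, 1 / 3, 1 / 3]    # hypothetical class shares

# Beliefs track true effects (positive dependence): the trial target moves away from mu.
w_dep, mu_rct_dep, mu = rct_target(alpha, prob, perceived_mean=alpha, threshold=1.0)

# Beliefs unrelated to class: every class enrolls at the same rate, so mu_RCT equals mu.
w_ind, mu_rct_ind, _ = rct_target(alpha, prob, perceived_mean=[0.5, 0.5, 0.5], threshold=1.0)

print(mu, mu_rct_dep, mu_rct_ind)
```

Under these assumed numbers the class with a moderate positive effect dominates enrollment when beliefs track true effects, so the trial target is a small positive number below the population average. When beliefs carry no information about class, the weights collapse to the class shares, as in the independence case of Theorem 2.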
170 A. Basu / Journal of Health Economics 42 (2015) 165–173
$\mu$, $E(\bar{\bar{\alpha}} - p) < E(\mu - p)$ if $E(\hat{\mu}_{RCT}) < \mu$ and $E(\bar{\bar{\alpha}} - p) > E(\mu - p)$ if $E(\hat{\mu}_{RCT}) > \mu$.

This implies that a CER RCT may misguide a social planner to provide coverage on treatments with negative average net health benefits or to withhold coverage on treatments with positive average net health benefits. This also highlights the fact that economic evaluations based on CER RCT studies can be misleading. ∎

(b) Individual beliefs, $\tilde{\alpha}_i$, about comparative effects will also evolve following the CER RCT using a similar Bayesian updating rule (Basu et al., 2011):

$$\tilde{\tilde{\alpha}}_i = \lambda_i \times \tilde{\alpha}_i + (1 - \lambda_i) \times \hat{\mu}_{RCT} \tag{15}$$

where the weights $\lambda_i$ are determined by a weighted average of prior uncertainty $s^2$ and the sampling variance of $\hat{\mu}_{RCT}$. It is important to note that even though original beliefs may have been consistent, i.e., $E(\tilde{\alpha}_i) = b_i$, after CER, $E(\tilde{\tilde{\alpha}}_i) \ne b_i$. Most importantly, $\tilde{\tilde{\alpha}}_i > \tilde{\alpha}_i$ if $\tilde{\alpha}_i < 0$, under Assumption 3, since $E(\hat{\mu}_{RCT}) > 0$. That is, some patients who would have originally anticipated a negative effect from treatment, maybe rightfully so, are now led to believe in a larger, presumably positive, effect from treatment. Similarly, patients who would have rightfully anticipated large benefits from treatment would have their updated anticipation moderated by the small effect size estimated in the RCTs. Thus, the average result from a CER study that is based on voluntary participation actually misleads individuals about their own comparative effectiveness. Following (6), the welfare loss with the post-CER information is given by

$$L_{POST,CER} = \sum_{\alpha_k = p}^{\infty} \left[1 - \Phi\left(\frac{\mu_k(\tilde{\tilde{\alpha}}_i)}{s_k(\tilde{\tilde{\alpha}}_i)}\right)\right] \times N_k \times (\alpha_k - p) + \sum_{\alpha_k = -\infty}^{p} \Phi\left(\frac{\mu_k(\tilde{\tilde{\alpha}}_i)}{s_k(\tilde{\tilde{\alpha}}_i)}\right) \times N_k \times (p - \alpha_k), \tag{16}$$

where $\mu_k(\tilde{\tilde{\alpha}}_i) = E(\tilde{\tilde{\alpha}}_i \mid \Omega_i = k)$ and $s_k^2(\tilde{\tilde{\alpha}}_i) = \mathrm{Var}(\tilde{\tilde{\alpha}}_i \mid \Omega_i = k)$. Since $\mu_k(\tilde{\tilde{\alpha}}_i)/s_k(\tilde{\tilde{\alpha}}_i) \gtreqless \bar{\alpha}/s$ for any $k$, it proves that the welfare loss under post-CER information can be larger than that under the pre-CER scenario with full insurance coverage. Only when $\mu_k(\tilde{\tilde{\alpha}}_i)/s_k(\tilde{\tilde{\alpha}}_i) \ge \bar{\alpha}/s$ for $\alpha_k \ge p$, implying that more individuals with $\alpha_k \ge p$ are drawn to take up the treatment, and $\mu_k(\tilde{\tilde{\alpha}}_i)/s_k(\tilde{\tilde{\alpha}}_i) < \bar{\alpha}/s$ for $\alpha_k < p$, implying that more individuals with $\alpha_k < p$ are drawn to not use the treatment, is the CER infrastructure welfare enhancing (i.e. $L_{POST,CER} < L_{PRE,CER}$). ∎

3. Learning through diversification (LtD): A new framework for data production

As I have shown in the previous sections, under some general assumptions, the current CER-RCT framework that relies on voluntary participation fails to consistently inform either the population-level or individual-level comparative effect parameters and cannot lead us towards the second-best solutions (in fact, it may lower welfare through evidence-based misguidance).

Manski (2009) proposed that one way a social decision maker can maximize welfare is through fractional allocations, where a random fraction of the patient population receives one treatment while the other receives the alternative. Manski argued that, given the ambiguity of evidence on counterfactual outcomes, such an allocation would maximize a broad set of utilitarian welfare functions for the social decision maker. Manski (2009) also pointed out that such an allocation automatically creates randomized experiments, which are particularly important for learning treatment responses. The current proposal builds on this idea of "diversified treatment" proposed by Manski (2009). However, our proposal takes into account two realities in the context of health care.

First, it is almost impossible, at least in the United States, to completely restrict "receipt" of a treatment that has crossed the regulatory and evidentiary hurdles and has been approved on the basis of safety and efficacy. Therefore, diversification of treatment allocation in terms of "receipt", which is essential to answer CER and PCOR type questions, is usually not possible.

Second, the social decision maker in the context of health care is typically involved in deciding on insurance coverage of medical treatment, while individual subjects are typically left to decide on the choice of treatment given insurance coverage. Therefore, a social decision maker's problem can be viewed as a two-step process (Dehejia, 2005). Under any information set (i.e. alternative coverage decisions and CER information), first the physician decides whether to prescribe treatment for each individual. Second, given these potential allocations, the social decision maker decides on the level of coverage for treatment that would improve population health either by sustaining the individual choices or by incentivizing to alter them if needed. That is, a social decision maker

[Fig. 1. Learning through Diversification (LtD) infrastructure. Panel labels: fractional coverage via technology lottery; outcomes evaluation; sufficiency of evidence (crossing evidentiary thresholds); full coverage for sub-groups; no coverage for sub-groups.]
studies these potential allocations of treatments under different information sets and makes the optimal coverage policy that would generate the highest population benefits driven by the individual allocation of treatments that would follow that policy. To the extent that one can combine the ideas of diversified treatment for learning with that of the two-step process of social decision making on optimal coverage, one can improve the decision making for both the individual patients and the social decision maker. Here I propose a "Learning through Diversification" (LtD) infrastructure (Fig. 1) that can potentially mimic the ideal CER designs discussed in Section 2.2. The three main features of an LtD infrastructure are:

(1) Fractional coverage can be achieved using a technology lottery, Ð: For each new product, develop a random order based on, say, birth dates so that this new product with uncertain effectiveness profile will be paid at varying levels by insurance (in continuous fashion) in the first year. That is, such coverage creates a completely stochastic distribution of co-insurance rates in the population, $F(b_i \mid Ð) = F(b_i)$. Note that the lottery is done anew for each new technology so that the probability that any one person would be denied coverage for all new technologies will approach zero with increasing number of technologies.
(2) Outcomes evaluation: Using the randomization inherent in the lottery, evaluating patient outcomes across different levels of coinsurance rates will directly answer the economic evaluation questions on expanding coverage for the target population. Additionally, the lottery would serve as a perfect instrumental variable (IV) to study the comparative effectiveness of receiving the new product versus its competitor and the heterogeneity in these effects in the population (Heckman, 1996, 1997, 2001; Heckman and Vytlacil, 1999, 2005; Heckman et al., 2006; Basu, 2014) without the challenges of voluntary participation in CER RCTs.
(3) Sequential decision making: Based on the outcome evaluation results, fractional allocation rules can be adapted over time for specific risk groups. Fractional allocation would continue within risk groups where ambiguity persists. Optimal stopping rules for fractional coverage can be developed using Bayesian methods.

3.1. Key Features of the Learning through Diversification Infrastructure

3.1.1. Coinsurance (demand price) as an instrument

Traditional IV analyses focus around the debate on whether a chosen instrument is contaminated, given that the strength of the instrument is testable. In the LtD framework, the lottery, by design, is orthogonal to all confounders and therefore sidesteps the typical debates in this literature. The strength of the instrument is driven by variation in out-of-pocket payments by patients, which in turn will depend on the market price of the new technology and the price elasticity of demand. The LtD infrastructure would be most efficient in data production for CER for new technologies that come at a high price tag, which aligns with the notion that most welfare can be generated if we can properly identify people who would and would not benefit from the most expensive technologies.

3.1.2. The target parameters in the LtD infrastructure

Meaningful and interpretable structural parameters for evaluation can be recovered using data arising out of an LtD infrastructure (Heckman and Vytlacil, 1999, 2001; Heckman et al., 2006; Basu, 2014)15. For example, local instrumental variable approaches can be used to identify Marginal Treatment Effect (MTE) parameters (Heckman and Vytlacil, 1999):

$$\frac{\partial E(Y \mid Ð, \Omega)}{\partial p} = E(Y_1 - Y_0 \mid \Omega, V = v) = MTE(\Omega, v), \tag{17}$$

where $Y = D \times Y_1 + (1 - D) \times Y_0$ is the observed outcome, unobserved confounders are $V \sim \mathrm{Uniform}[0, 1]$ by construction, and the probability of treatment choice given the lottery can be represented by $p(Ð, \Omega)$. Basu (2014) extends the LIV methods to identify Person-centered treatment (PeT) effects, which, for persons who choose treatment, follow

$$E_{V \mid \Omega, P(Ð), D}\, E(Y_1 - Y_0 \mid \Omega, P(Ð), D = 1) = E(Y_1 - Y_0 \mid \Omega, V < P(Ð)) = P(Ð)^{-1} \int_0^{P(Ð)} MTE(\Omega, v)\, dv \tag{18}$$

Similarly, the conditional effect for a person who did not choose treatment is obtained by integrating MTEs over values of $V$ greater than $p$. Mean treatment effect parameters, $\alpha_k$ (Eq. (1)) or $\mu$ (Eq. (2)), are readily obtained by averaging PeT effects over the respective subgroups (Basu, 2014).

3.1.3. Welfare effects

Let us take a two-period model in which the first period is the pre-CER period during which a CER study is being conducted. At the end of the first period, the CER study results are disseminated and therefore the second period represents the post-CER world. Therefore, under Assumptions 1–3, the total welfare loss over the two periods in a CER-based data production world is given as:

$$L_{CER} = L_{PRE,CER} + L_{POST,CER} \tag{19}$$

where $L_{PRE,CER}$ and $L_{POST,CER}$ are given in Eqs (7) and (16), respectively.

Under the LtD framework of data production, welfare losses in both periods will be different. Let the total welfare loss over the two periods in an LtD-based data production world be given as:

$$L_{LtD} = L_{PRE,LtD} + L_{POST,LtD}$$

Since the LtD infrastructure allows for consistent estimation of the mean treatment effect parameters, in the second period, subjective beliefs about the benefits of treatment will align with the true values for subjects in each risk group, $E(\hat{\alpha}_k) = \alpha_k$. Moreover, given that these estimates were generated using data at the population scale, $\Phi(\hat{\alpha}_k/\hat{\sigma}_k) \to 1$ for $\alpha_k \ge 0$ and $\Phi(\hat{\alpha}_k/\hat{\sigma}_k) \to 0$ for $\alpha_k < 0$. Consequently,

$$L_{POST,LtD} = \sum_{\alpha_k = p}^{\infty} \left[1 - \Phi\left(\frac{\hat{\alpha}_k}{\hat{\sigma}_k}\right)\right] \times N_k \times (\alpha_k - p) + \sum_{\alpha_k = -\infty}^{p} \Phi\left(\frac{\hat{\alpha}_k}{\hat{\sigma}_k}\right) \times N_k \times (p - \alpha_k) \to 0 \tag{20}$$

Therefore, an LtD infrastructure will always be welfare enhancing compared to the CER infrastructure as long as

$$L_{LtD} - L_{CER} = (L_{PRE,LtD} + L_{POST,LtD}) - (L_{PRE,CER} + L_{POST,CER}) < 0 \;\Rightarrow\; L_{PRE,LtD} < L_{PRE,CER} + L_{POST,CER}$$

15 Note that it is important to pay close attention to dealing with essential heterogeneity within the LtD infrastructure. This is because treatment receipt will be correlated with factors such as income (because of the price differentials that the lottery creates), which in turn may be correlated with gains and losses from the new treatment.
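The role of the lottery as an instrument can be sketched with a short simulation. This is an illustration only, not the estimator used in the paper: it assumes a homogeneous treatment effect, a single unobserved confounder, and a uniformly distributed lottery-assigned coinsurance rate, and the function name `simulate_lottery_iv` and all parameter values are hypothetical. With essential heterogeneity, the local IV and PeT methods cited above would be needed in place of the simple ratio used here.

```python
import random


def simulate_lottery_iv(n=200_000, true_effect=1.0, seed=0):
    """Compare a naive treated-vs-untreated contrast with a lottery-based IV estimate."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        coins = rng.random()       # lottery-assigned coinsurance rate in [0, 1)
        u = rng.gauss(0.0, 1.0)    # unobserved confounder (assumed, for illustration)
        # Uptake falls with the randomized price and rises with the confounder.
        treated = 1.0 if (1.0 - 2.0 * coins + u + rng.gauss(0.0, 1.0)) > 0 else 0.0
        outcome = true_effect * treated + 2.0 * u + rng.gauss(0.0, 1.0)
        z.append(coins)
        d.append(treated)
        y.append(outcome)

    def mean(v):
        return sum(v) / len(v)

    def cov(a, b):
        ma, mb = mean(a), mean(b)
        return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

    treated_y = [yi for yi, di in zip(y, d) if di == 1.0]
    control_y = [yi for yi, di in zip(y, d) if di == 0.0]
    naive = mean(treated_y) - mean(control_y)   # confounded comparison
    iv = cov(y, z) / cov(d, z)                  # Wald-type IV ratio, lottery as instrument
    return naive, iv


naive_est, iv_est = simulate_lottery_iv()
print(naive_est, iv_est)
```

Because the coinsurance rate is drawn independently of the confounder, the IV ratio should land near the assumed true effect, while the naive contrast absorbs the confounder. The simulation says nothing about the welfare inequality above; it only illustrates why a randomized price can identify treatment effects without voluntary trial enrollment.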
That is, fractional allocation should be designed in a way that the loss during the initial data generation process is not greater than the combined losses under the CER infrastructure both during and after the data generation process. To obtain the most power for analyses and consistency within the LtD infrastructure, it may be useful to set the mean coinsurance rate to be 0.5. The welfare losses, if any, during the data production period (that may be one or a few years) can be easily recuperated from the welfare gains from LtD in the post data-production period that typically lasts for many years.

4. Conclusions

Regulatory bodies often approve a new medical treatment based on its potential safety profile and its incremental efficacy compared to either placebo or a basic control treatment. Often superiority of the new treatment is not established and its comparative effectiveness compared to current clinical practice remains ambiguous. Nevertheless, the treatment becomes available for consumption at a substantial price in anticipation of positive effectiveness claims based on efficacy results. Variability in the effectiveness profile remains far from known. Under such ambiguity, a social insurer faces the challenge of deciding whether to pay for the treatment. In the US, public health insurance providers like Medicare usually extend full coverage of these new treatments as long as there are positive efficacy signals. Other countries, like the UK and Canada, formally look at the budget impact of coverage by comparing the costs of treatments (inclusive of price) to the projected effectiveness based on efficacy signals. When coverage is allowed, a large welfare loss may ensue even when the new treatment can genuinely produce higher effectiveness in a certain margin of the population. This is due to the lack of evidence on how to match patients to alternative treatments.

In this paper, I show that under the status quo policy of extending coverage to a new treatment in the absence of complete information on its effectiveness profile, welfare loss can be substantial. These losses can be minimized by investments in studies that aim at generating such evidence. However, I also show, following a Roy model of sorting behavior, that the current infrastructure for data production for this purpose suffers from severe self-selection issues since the incentives to enroll in research studies are eroded by the low demand prices of obtaining medical care outside these studies. Consequently, the parameters identified from these studies do not inform any of the decision-relevant parameters, either at the individual or the population level. I show that if one takes the normative approach of a social insurer who is forward looking and wants to maximize any given social welfare function over a period of time (typically over the longevity of the new technology being considered), then it makes sense for the social insurer, irrespective of what coverage decision is made today, to devise ways to learn about variations in incremental effectiveness of treatment in the population so that she can encourage/discourage appropriate subgroups to uptake/discard the new treatment. In fact, generating such public evidence can directly inform individuals within the population to use this new treatment appropriately without additional effort by the social insurer, thereby approaching the second-best solutions. Based on this normative framework, I propose a positive Learning through Diversification (LtD) infrastructure, through which a social insurer can achieve her objectives.

The LtD infrastructure comprises introducing the new treatment with fractional coverage based on random individual-level co-insurance rates. One then uses these co-insurance rates as an artificially created, but almost perfect, instrumental variable to study treatment effect heterogeneity based on a spectrum of econometric tools available to researchers. Both clinical guidelines and coverage decisions can then be sequentially revised to reflect this evidence. I show that under non-stringent conditions, the LtD infrastructure will be welfare enhancing compared to the current data production infrastructure, such as CER.

One aspect of the LtD infrastructure that would appear to be politically challenging is the notion of fractional coverage, albeit for a short time during the introduction of the new treatment. A full legal and ethical consideration of such random allocation is beyond the scope of this paper. However, it is important to note that unlike earlier discussions in this line of reasoning that revolved around quasi-random treatment prescription (Manski, 2009), the LtD infrastructure does not withhold treatment from anyone, but rather changes the cost of accessing it in a random fashion. The potential for patient welfare and the richness of scientific and policy questions that this infrastructure can answer should play a part in deciding its ultimate feasibility.

References

Arrow, K.J., 1963. Uncertainty and the welfare economics of medical care. The American Economic Review 53 (5), 941–973.
Basu, A., 2011. Economics of individualization in comparative effectiveness research and a basis for a patient-centered health care. Journal of Health Economics 30 (3), 549–559.
Basu, A., 2014. Person-centered treatment (PeT) effects using instrumental variables: An application to evaluating prostate cancer treatments. Journal of Applied Econometrics 29 (4), 671–691.
Basu, A., Jena, A.B., Philipson, T.J., 2011. The impact of comparative effectiveness research on health and health care spending. Journal of Health Economics 30 (4), 695–706.
Dehejia, R.H., 2005. Program evaluation as a decision problem. Journal of Econometrics 125, 141–173.
Garber, A., Phelps, C., 1997. Economic foundations of cost-effectiveness analysis. Journal of Health Economics 16, 1–31.
Heckman, J.J., 1996. Randomization as an instrumental variable. The Review of Economics and Statistics 78 (2), 336–341.
Heckman, J.J., 1997. Instrumental variables: a study of implicit behavioral assumptions used in making program evaluations. Journal of Human Resources 32 (3), 441–462.
Heckman, J.J., 2001. Accounting for heterogeneity, diversity and general equilibrium in evaluating social programmes. The Economic Journal 111, F654–F699.
Heckman, J.J., Urzua, S., Vytlacil, E., 2006. Understanding instrumental variables in models with essential heterogeneity. Review of Economics and Statistics 88 (3), 389–432.
Heckman, J.J., Vytlacil, E., 1999. Local instrumental variables and latent variable models for identifying and bounding treatment effects. Proceedings of the National Academy of Sciences 96 (8), 4730–4734.
Heckman, J.J., Vytlacil, E., 2001. Local instrumental variables. In: Hsiao, C., Morimune, K., Powell, J.L. (Eds.), Nonlinear Statistical Modeling: Proceedings of the Thirteenth International Symposium in Economic Theory and Econometrics: Essays in the Honor of Takeshi Amemiya. Cambridge University Press, New York, NY, pp. 1–46.
Heckman, J.J., Vytlacil, E., 2005. Structural equations, treatment effects and econometric policy evaluation. Econometrica 73 (3), 669–738.
Ioannidis, J.P., Lau, J., 1997. The impact of high-risk patients on the results of clinical trials. Journal of Clinical Epidemiology 50 (10), 1089–1098.
Malani, A., 2008. Patient enrollment in medical trials: selection bias in a randomized experiment. Journal of Econometrics 144, 341–351.
Manning, W.G., Marquis, M.S., 1996. Health insurance: the trade-off between risk pooling and moral hazard. Journal of Health Economics 15 (5), 609–639.
Manski, C., 2000. Identification problems and decisions under ambiguity: empirical analysis of treatment response and normative analysis of treatment choice. Journal of Econometrics 95, 415–442.
Manski, C., 2004. Statistical treatment rules for heterogeneous populations. Econometrica 72, 1221–1246.
Manski, C., 2009. The 2009 Lawrence R. Klein Lecture: Diversified treatment under ambiguity. International Economic Review 50 (4), 1013–1041.
Meltzer, D., 1997. Accounting for future costs in medical cost-effectiveness analysis. Journal of Health Economics 16, 33–64.
Movsas, B., Moughan, J., Owen, J., Coia, L.R., Zelefsky, M.J., Hanks, G., Wilson, J.F., 2007. Who enrolls onto clinical oncology trials? A radiation patterns of care study analysis. International Journal of Radiation Oncology, Biology, Physics 68 (4), 1145–1150.
Pauly, M.V., 1968. The economics of moral hazard: comment. The American Economic Review 58 (3), 531–537.
Pauly, M.V., 2008. Adverse selection and moral hazard: implications for health insurance markets. In: Sloan, F., Kasper, H. (Eds.), Incentives and Choice in Health and Health Care. MIT Press, Cambridge, MA.
Pauly, M.V., Blavin, F.E., 2008. Moral hazard in insurance, value-based cost sharing, and the benefits of blissful ignorance. Journal of Health Economics 27, 1407–1417.
Philipson, T.J., 1997. The evaluation of new health care technology: the labor economics of statistics. Journal of Econometrics 76, 375–395.
Roy, A.D., 1951. Some thoughts on the distribution of earnings. Oxford Economic Papers 3 (2), 135–146.
Weinstein, M., Zeckhauser, R., 1973. Critical ratios and efficient allocation. Journal of Public Economics 2, 147–158.
Journal of Health Economics 42 (2015) 174–185
Article info

Article history:
Received 24 October 2013
Received in revised form 13 January 2015
Accepted 4 March 2015
Available online 3 April 2015

JEL classification: I10, I18, C21, J14

Keywords: Informal care; Regression adjusted matching; Propensity score matching; Mental health; Physical health

Abstract

In this paper, we present estimates of the effect of informal care provision on female caregivers' health. We use data from the German Socio-Economic Panel and assess effects up to seven years after care provision. The results suggest that there is a considerable negative short-term effect of informal care provision on mental health which fades out over time. Five years after care provision the effect is still negative but smaller and insignificant. Both short- and medium-term effects on physical health are virtually zero throughout. A simulation analysis is used to assess the sensitivity of the results with respect to potential deviations from the conditional independence assumption in the regression adjusted matching approach.

© 2015 Elsevier B.V. All rights reserved.
1. Introduction

Europe's societies are getting older. Low birthrates and population ageing due to technological progress in medicine shift the age structure towards higher shares of elderly individuals. This has strong implications for labour markets and social security systems, with the long-term care sector as one important part of those. The World Alzheimer Report, for instance, expects, as a result of growing numbers of people in need of long-term care, publicly funded costs of long-term care in the European Union (EU 27) to increase from 1.2% of GDP in 2007 to 2.5% in 2060 (Alzheimer's Disease International, 2013).

Already today, costs are one reason why many governments prefer informal care (care provision by close relatives and friends) over professional formal care provision. In Germany, for instance, the public long-term care insurance paid €700 per month in 2012 for care recipients of the highest care level who are cared for by family members and €1550 per month to the same recipient cared for by professional caregivers. Germany is a country in which long-term care is still predominantly regarded as the task of the family (Schulz, 2010) and informal care is more common than in comparable states like the Netherlands (Bakx et al., 2015). More than one million official care recipients (about 46% of all) are exclusively cared for by family members, rendering informal care the most important part of the German long-term care system.

However, provision of informal care is both mentally and physically challenging. We, therefore, analyse the question of whether there are some hidden costs, or costs often neglected in the public debate, that make informal care provision not as economic as often thought. This could be the case if informal care provision goes along with health impairments of the caregivers. Other costs (not

☆ We thank Martin Fischer, Pilar García-Gómez, Audrey Laporte, Jan Marcus, Jürgen Maurer, Alfredo Paloyo, Stefan Pichler, and Arndt Reichert for valuable suggestions. We further thank two anonymous referees and two editors for very helpful comments. Moreover, we are grateful for comments at the 22nd European Workshop on Econometrics and Health Economics (Rotterdam), the meeting of the health economics section of the VfS (Hamburg), the annual meeting of the dggö (Essen), the CINCH health economics seminar in Essen, the CINCH academy, the economics of disease conference in Darmstadt, and seminars in Bayreuth and Paderborn. All errors are our own. Financial support by the Fritz Thyssen Stiftung is gratefully acknowledged.
∗ Corresponding author at: University of Paderborn, Warburger Strasse 100, 33098 Paderborn, Germany. Tel.: +49 5251 603213.
E-mail address: hendrik.schmitz@uni-paderborn.de (H. Schmitz).
http://dx.doi.org/10.1016/j.jhealeco.2015.03.002
0167-6296/© 2015 Elsevier B.V. All rights reserved.
H. Schmitz, M. Westphal / Journal of Health Economics 42 (2015) 174–185 175
considered here but heavily analysed in the economic literature¹) are forgone income for those who leave the labour force to provide care.

¹ E.g., Carmichael and Charles, 2003; Heitmueller, 2007; Heitmueller and Inglis, 2007; Bolin et al., 2008; Leigh, 2010; Van Houtven et al., 2013; Meng, 2013.

The economic literature on health effects of caregiving is fairly scarce.² To the best of our knowledge, there are only three studies on the effect of care provision on health in a narrow sense. Coe and van Houtven (2009) estimate health effects of informal caregiving in the US using seven waves of the Health and Retirement Survey (HRS). They use sibling characteristics and the death of the mother as instrumental variables that control for selection into and out of caregiving in order to identify causal effects. They find that continued caregiving leads to a significant increase in depressive symptoms for both sexes while physical health does not seem to be affected. Do et al. (2015) use data from South Korea where informal care is quite common among females caring for their parents-in-law. The data allow identifying a health effect for daughters-in-law where selection into care is taken into account by instrumenting the informal care decision with parents-in-law’s health endowment. Their findings suggest that providing informal care increases the probability of worse physical health. Di Novi et al. (2013) use the first two waves of SHARE to estimate the effect of caregiving on self-rated health and quality of life, measured by the CASP-12. They find positive effects of care provision on self-rated health (seen as a measure of physical health) and mixed evidence regarding quality of life (seen as a measure of mental health).

² In the medical literature, there is a fair amount of studies on the relationship of health and care provision. They mainly stem from the US (see e.g., Schulz et al., 1995; Stephen et al., 2001; Gallicchio et al., 2002; Tennstedt et al., 1992; Beach et al., 2000; Ho et al., 2009; Shaw et al., 1999; Lee et al., 2003; Dunkin and Anderson-Hanley, 1998; Colvez et al., 2002). In general, these studies use non-representative samples and widely disregard endogeneity problems. Furthermore, they often concentrate on more specific definitions of care, such as caring for people with dementia.

Two further papers evaluate the relationship of caregiving and caregiver drug utilisation. On the one hand, drug intake could be seen as an objective measure of poor health. On the other hand, it sheds light on direct costs of caregiving. Van Houtven et al. (2005) assess the impact of caring on the intake of drugs using data on caregivers for US veterans. One finding is that the intensive care margin is an important factor for drug intake. Schmitz and Stroka (2013) exploit data of a large German sickness fund that allow them to consider prescriptions of anti-depressants and drugs to restore physical health. Their results support Van Houtven et al. (2005), providing some evidence that caregiving increases the intake of anti-depressants, in particular if coupled with having a job. Other studies look at broader welfare consequences of caring and use life satisfaction as a proxy (Bobinac et al., 2010, Van den Berg and Ferrer-i Carbonell, 2007, Leigh, 2010, van den Berg et al., 2014). One issue with these studies is that they do not address reverse causality and selection problems based on time-varying unobserved heterogeneity.

We use representative household data from the German Socio-Economic Panel to estimate the effects of informal care provision on female caregivers’ health. The outcome variables are mental and physical summary scale measures (called MCS and PCS) for the years 2002 to 2010 that capture the multidimensional nature of health. Our contributions to the literature on health and informal care are twofold: First, we use a different approach to address selection into and out of care provision. Except for Di Novi et al. (2013), previous studies that deal with endogeneity problems all use instrumental variables approaches. We try to identify the effect of caring using different assumptions that can put the literature on a broader basis and thereby complement it. Our approach is to fully exploit the time dimension and richness of panel data in order to justify the conditional independence assumption that would allow for a causal interpretation of the results. To be more precise, we use a regression adjusted matching approach. Although we argue below that, given our data, we can justify the conditional independence assumption, we allow in a sensitivity analysis that follows Ichino et al. (2008) for certain deviations from this assumption.

Second, to the best of our knowledge, this is the first study that looks not only at contemporaneous, or short-term, effects of informal care provision on health, but also at medium-term effects of up to seven years after care provision. By medium-term effects we mean: if a woman provides care in a certain year, what is her expected change in health up to seven years afterwards? This adds to work by Coe and van Houtven (2009) who also discuss persistence of health effects but need to stick to a two-year period. Medium-term consequences could be more severe than instantaneous short-term health impacts restricted to the period of providing care. Moreover, knowledge about the persistence of health effects is arguably more important for policy makers than knowledge about short-run effects only.

The results suggest that there is a considerable negative short-term effect of informal care provision on mental health which, however, fades out over time. Five years after care provision the effect is still negative but smaller and insignificant. Both short- and medium-term effects on physical health are virtually zero throughout. The sensitivity analysis suggests that sensible deviations from the conditional independence assumption do not change these results.

The paper is organized as follows. Section 2 briefly outlines the institutional setting of long-term care in Germany. Section 3 discusses the empirical approach, Section 4 presents the data. The results are reported in Section 5 while Section 6 assesses the sensitivity of the results. Section 7 concludes.

2. Institutional background

The German social long-term care insurance system was introduced in 1995 as a pay-as-you-go system. It is financed by a mandatory payroll tax deduction of currently 2.35% of gross labour income (2.6% for employees without children). In order to qualify for benefits, individuals need to be officially defined as care recipients and be classified into one of three care levels. In care level one individuals need support in physical activities for at least 90 min per day and household help several times a week. Individuals in need of more care are classified into care levels two or three, where benefits increase with the care level.

Benefits also depend on the type of care, where monthly payments for informal care range from €235 (level one) to €700 (level three), for professional ambulatory care from €450 to €1550 and for professional nursing home care from €1023 to €1500. The latter, in particular, does not fully cover the expenses for nursing home visits and copayments of up to 50% are standard. Copayments for professional ambulatory care are smaller and amount to an average of €247 or about 20% (Schmidt and Schneekloth, 2011). Social welfare may step in if individuals are not able to bear the copayment. Thus, the decision for formal or informal ambulatory care is usually not driven by financial aspects as each care recipient who is assigned a care level is entitled to benefits for all kinds of care.

The introduction of the insurance system in 1995 stressed the family as the main provider of care, as it is thought to provide care more cheaply, more agreeably, and more efficiently. From the care recipient’s perspective, the decision to receive informal care typically expresses a preference for being cared for by familiar relatives or friends. In some cases, informal care recipients are additionally supported by professional carers. These are, on average, older recipients with a higher care level and, thus, a higher care burden
(Schulz, 2010). Apart from the care burden, a reason for professional care can be the absence of appropriate informal caregivers, either because they chose to participate only in the labour market or because their own physical or mental health conditions prohibit the full amount of necessary care provision.

Fig. 1. Basic time structure.

From the caregiver’s perspective, affection and sense of responsibility towards a loved parent or spouse mainly drive the decision to provide care. Although the insurance benefits for informal care are often passed on to the care provider, this comparably small amount cannot be regarded as a financial incentive to provide care, as it is also needed to cover other expenses for care provision (see
Schmidt and Schneekloth, 2011 for all points). However, the insurance funds do pay pension contributions for informal carers who provide care for at least 14 h a week (Schulz, 2010). In 2002, people cared on average 14 h per week for care recipients whose assessed needs fall at least into the lowest official category (Schneekloth and Leven, 2003).

Between 2001 and 2011 there were only minor adjustments to the German long-term care system. They were minor because benefits were increased but only to keep pace with inflation (Rothgang, 2010) and, thus, did not change the incentives to provide care. As of 2008, employed individuals are allowed to take a 10-day (not repeatable) unpaid leave to organize or provide care in the event of care dependency in the family. However, only very few caregivers make use of this.³ Thus, the tasks of informal caregivers, the composition of caregivers and care recipients as well as financial incentives remained fairly similar over time.

3. Empirical strategy

We aim at estimating the effect of informal care provision on health. Certainly, the decision to provide care is not random per se. Given that someone close becomes care dependent, some individuals choose to provide care while others do not. The willingness to provide care depends on factors such as the financial and temporal affordability, own health endowment as well as innate tendencies such as personality traits.

To deal with this problem we apply the model of Rubin (1974). Following his notation we observe $Y = T \cdot Y_1 + (1 - T) \cdot Y_0$, where $T$ indicates whether an individual is assigned to treatment (2 h of informal daily care provision, but we will also consider alternative definitions) or control group, $Y$ is the outcome (health), and the index {0, 1} indicates the potential health outcome of being a caregiver or not. If we simply compare the realized outcomes, i.e., $E(Y_1 \mid T = 1) - E(Y_0 \mid T = 0)$, selection bias will most likely arise due to the non-randomness of care provision. However, the average treatment effect on the treated (ATT) can be identified if the conditional independence assumption holds and assignment to treatment is random conditional on controls: $Y_1, Y_0 \perp T \mid X$. That is, if all the determinants that simultaneously influence the health outcome and the selection into treatment are observed. Then, $\mathrm{ATT} = E(Y_1 - Y_0 \mid T = 1, X)$ is the causal ceteris paribus impact of informal care provision on health.

We use propensity score methods to estimate this effect and combine matching with regression methods, thus employing the so-called regression adjusted matching approach (see, for example, Rubin, 1979). The advantage over using only matching or only linear regression is that it yields consistent estimates even if one of the two methods fails to remove the selection bias. This is called the double robustness property (Bang and Robins, 2005). Nevertheless, this method rests on the conditional independence assumption and if it does not hold and both the regression model and the propensity score estimation are wrongly specified, the estimates are biased. The estimation strategy is a two-step process, originally proposed by Bang and Robins (2005). As a first step, the probability of being a caregiver (the propensity score) conditional on relevant covariates is estimated with a probit model. Subsequently, treatment and control group are matched. We use an Epanechnikov kernel with a bandwidth of 0.03 in the basic specification. To further increase the comparability, the sample is restricted to the common support of the propensity scores of the treatment and control group.

As a second step, the health outcome is regressed on informal care and, again, all control variables where the observations are weighted by the kernel weights $W$ estimated by the matching algorithm: $\hat{\beta} = (X'WX)^{-1} X'Wy$. Standard errors are computed according to the suggestion of Marcus (2014) who employs robust standard errors of the regression above since they are slightly more conservative but easier to estimate than bootstrapped standard errors that, in addition, are not formally justified.⁴ However, we cluster standard errors on the individual level since individuals appear several times in the data set.

We employ the time structure as presented in Fig. 1. Assignment to treatment T occurs in t = 0. We condition on a large set of covariates in t = −1, thus reducing the potential problem that covariates are affected by the treatment status. We then compute the treatment effect four times: 1 year after treatment, 3 years after treatment, 5 years after treatment, and 7 years after treatment. Note that conditioning variables and treatment group assignment are always the same and determined in t = −1 and t = 0, respectively. As explained in Section 4, the outcome variable is available biannually between 2002 and 2010 in our data set. Since we condition on pre-treatment outcome (see explanation below), the earliest possible treatment year is 2003. We use the maximum available information in the data and pool it into one sample. Then, individuals treated in 2003 (call this wave 1) can be followed until t = 7 in 2010 whereas individuals treated in 2009 (call this wave 4) can only be followed until t = 1. Hence, the effect in t = 1 will be measured more precisely than the one in t = 7.

Even though we condition on a large set of covariates that are supposed to capture the process of the decision to provide care, there are probably some threats to the conditional independence assumption. First, there might be health-driven selection into treatment. Individuals who are confronted with the question to provide care but are themselves in poor health might not be able to do so. As informal care provision is both physically and mentally challenging, this possible selection holds for both dimensions of health. If this is indeed the case and informal care provision has negative health effects, ignoring this reverse causality problem would lead to an underestimation of the true effects (in absolute values). We follow, e.g., Lechner (2009a) and García-Gómez (2011) and match individuals on pre-treatment outcomes (here, health status in t = −1), thus only comparing individuals of the same

³ Schmidt and Schneekloth (2011) report that only 9000 out of possibly 150,000 made use of this until 2011. The most frequent reason for not making use in their survey was that individuals were not aware of the possibility.
⁴ We can confirm this finding in our data. Bootstrapping yields slightly less conservative standard errors.
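The two-step regression adjusted matching described in Section 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the propensity scores are taken as given (the probit step is omitted), the toy data and all function names are hypothetical, treated units are assumed to receive weight 1, and clustered standard errors are not computed.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, zero otherwise."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_weights(ps_treated, ps_control, bandwidth=0.03):
    """Sum, over treated units, of each control's normalised kernel weight."""
    weights = [0.0] * len(ps_control)
    for p_t in ps_treated:
        k = [epanechnikov((p_c - p_t) / bandwidth) for p_c in ps_control]
        total = sum(k)
        if total == 0.0:
            continue  # no control within the bandwidth (off common support)
        for j, k_j in enumerate(k):
            weights[j] += k_j / total
    return weights


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def wls(rows, y, w):
    """Weighted least squares: solves (X'WX) beta = X'Wy."""
    k = len(rows[0])
    xtwx = [[sum(w_i * r[a] * r[b] for r, w_i in zip(rows, w)) for b in range(k)]
            for a in range(k)]
    xtwy = [sum(w_i * r[a] * y_i for r, y_i, w_i in zip(rows, y, w))
            for a in range(k)]
    return solve(xtwx, xtwy)


# Hypothetical toy data: the outcome is exactly 50 - 2*T + 3*x (no noise).
ps_treated, x_treated = [0.10, 0.12], [1.0, 2.0]
ps_control, x_control = [0.09, 0.11, 0.13, 0.50], [0.5, 1.5, 2.5, 9.0]

w_control = kernel_weights(ps_treated, ps_control)
rows = [[1.0, 1.0, x] for x in x_treated] + [[1.0, 0.0, x] for x in x_control]
y = [50.0 - 2.0 * r[1] + 3.0 * r[2] for r in rows]
w = [1.0] * len(x_treated) + w_control
beta = wls(rows, y, w)  # [intercept, coefficient on treatment, slope on x]
```

In this sketch the control with a propensity score of 0.50 lies outside the bandwidth of every treated unit and therefore receives zero weight, which mimics the restriction to the common support.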
Fig. 2. Group assignment rules. Note: 1 = providing care; 0 = not providing care; X = care status not specified (= either 1 or 0). Right panel does not include all possible paths
but only a small excerpt.
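The time structure set out in Section 3 can be sketched in a few lines. The treatment years 2005 and 2007 are an assumption inferred from the text (outcomes every other year from 2002 to 2010, wave 1 in 2003 and wave 4 in 2009); all names are hypothetical.

```python
# Outcome (MCS/PCS) years: every other year from 2002 to 2010.
OUTCOME_YEARS = range(2002, 2011, 2)
# Treatment years (t = 0); 2005 and 2007 are inferred, not stated in the text.
TREATMENT_YEARS = [2003, 2005, 2007, 2009]
# Follow-up horizons in years after treatment.
HORIZONS = [1, 3, 5, 7]


def observable_horizons(treatment_year):
    """Horizons at which an outcome is observed for a given treatment year."""
    return [h for h in HORIZONS if treatment_year + h in OUTCOME_YEARS]


# Which horizons can be followed for each treatment wave.
schedule = {year: observable_horizons(year) for year in TREATMENT_YEARS}
# How many treatment waves contribute to each horizon.
waves_per_horizon = {h: sum(h in hs for hs in schedule.values()) for h in HORIZONS}
# Pre-treatment outcomes (t = -1) must also fall in an outcome year.
pre_treatment_observed = all(y - 1 in OUTCOME_YEARS for y in TREATMENT_YEARS)
```

Under these assumptions four waves contribute to the effect in t = 1 but only one wave to the effect in t = 7, which is why the former is estimated more precisely.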
Table 2
Sample size.
Source: SOEP, own calculations. Number in parentheses is the share among all individuals with positive hours of care. Hours of care are measured in t = 0 only.
not allow for a link between caregiver and care recipient. Hence, we have no information on the care recipient and we are not able to stratify our analysis with respect to her (e.g., in order to evaluate differences between caring for spouses or parents). This is a common shortcoming in this literature.

Table 3 gives an impression of the duration of care episodes. It counts the consecutive years individuals provide care of at least 2 h per day. In presenting the numbers we distinguish between uncensored spells (of individuals that are observed to provide no care before and after a care episode) and censored spells (individuals that either enter the sample as caregivers or are caregivers at the end of the observation period). Due to the sample construction, there are many right-censored individuals, which complicates the interpretation of the table somewhat. What should be taken away from it is that the vast majority has care spells of about one to three years. Therefore, the effects after seven years are mainly driven by individuals who had shorter caregiving episodes. Individuals who constantly care over many years hardly add to the results.⁹

The two outcome measures are a mental and a physical health score that are based on information from the SF-12v2 questionnaire, a component of the SOEP, which includes twelve questions on mental and physical health. All items capture the general current mental and physical health status since all questions relate to the past four weeks, see the questionnaire in Table A2 in the Appendix. Answers to these questions are collapsed into the Mental Component Summary Scale (MCS) and the Physical Component Summary Scale (PCS) by exploratory factor analysis (see Andersen et al., 2007). Thus, both variables capture the multidimensional aspect of health. The scales range from 0 to 100, normalised to mean values of 50 and standard deviations of 10 in the 2004 reference sample. Higher values mean a better health status. MCS loads information on perceived melancholy, time pressure, mental balance and emotional problems into one summary scale.¹⁰ The SF-12 is commonly used to measure general health and functioning in epidemiological research (Ware et al., 1996). It includes information on subjective health but the component summary scales are correlated with actual health diagnoses. For example, Gill et al. (2007) find that MCS “is a useful screening instrument for depression and anxiety disorders in the general community, and thus, a valid measure of mental health”. This view is supported by Vilagut et al. (2013) who find “acceptable results for detecting both active and recent depressive disorders in general population samples”. This property could build the bridge between the short-term symptoms that are measured and longer-lasting health consequences that are thus also captured by this summary scale. Salyers et al. (2000) regard it as a valid and reliable instrument to measure health-related quality of life. Recently, MCS has also been used in the economic literature where it was shown to be correlated with, e.g., unemployment (Schmitz, 2011; Reichert and Tauchmann, 2011), and unemployment of spouses (Marcus, 2013). MCS and PCS were first introduced in the SOEP in 2002 and subsequently sampled every other year. This is why we restrict our observation period to the years 2002–2010.

⁹ This is due to the very low number of observations. Moreover, these 19 individuals caring throughout in our sample exhibit a mean MCS of 45.81 (compared to 49.38 overall). Thus, they do not affect the results in a quantitatively important way.
¹⁰ The physical component comprises: Physical fitness (2 Questions), general health, bodily pain, role physical (2). The mental component comprises: Mental health (2), role emotional (2), social functioning, vitality. See the questionnaire in Table A2 in the Appendix.
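Because MCS and PCS are normalised to a mean of 50 and a standard deviation of 10 in the reference sample, differences in score points translate directly into shares of a standard deviation. The following sketch assumes a simple linear rescaling of a raw factor score; the function names and the raw-score inputs are hypothetical, and the actual SOEP scoring procedure may differ in detail.

```python
REFERENCE_MEAN = 50.0  # mean of the scale in the reference sample
REFERENCE_SD = 10.0    # standard deviation of the scale in the reference sample


def to_norm_scale(raw, raw_mean, raw_sd):
    """Rescale a raw factor score to the mean-50, SD-10 metric (assumed linear)."""
    return REFERENCE_MEAN + REFERENCE_SD * (raw - raw_mean) / raw_sd


def in_sd_units(points):
    """Express a difference in score points as a share of a standard deviation."""
    return points / REFERENCE_SD


# Gap between long-term carers (45.81) and the overall mean (49.38), footnote 9.
gap_sd = in_sd_units(45.81 - 49.38)
# A raw score one raw SD above the raw mean maps to 60 on the normalised scale.
example_score = to_norm_scale(1.5, 0.5, 1.0)
```

The gap reported in footnote 9 thus corresponds to roughly 0.36 of a standard deviation.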
Table 3
Care duration.
thereof:
Left censored Observations 80 27 16 11 10 6 4 19 173
Share 46% 16% 9% 7% 6% 3% 2% 11% 100%
Source: SOEP, own calculations. Uncensored individuals did not provide care in t = −1 and stopped caregiving some time before t = 7. Therefore, the maximum observable
care duration is 7 years. In contrast to the empirical analysis in the rest of the paper, this table uses information up to the wave of 2011 or t = 8 in order to be able to calculate
the number of individuals who exactly care for 7 years.
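The spell counting behind Table 3 can be sketched as follows. The input format (one 0/1 care indicator per observed year) and the function name are assumptions made for illustration; the classification follows the definitions of censored and uncensored spells given in the text.

```python
def care_spells(status):
    """Return (start index, length, kind) for each run of consecutive care years.

    status is a list of 0/1 indicators, one per observed year, where 1 means
    care of at least 2 h per day in that year.
    """
    spells = []
    start = None
    for i, s in enumerate(status + [0]):  # trailing 0 closes an open spell
        if s == 1 and start is None:
            start = i
        elif s == 0 and start is not None:
            left = start == 0                 # already caring when first observed
            right = i - 1 == len(status) - 1  # still caring when last observed
            if left and right:
                kind = "left and right censored"
            elif left:
                kind = "left censored"
            elif right:
                kind = "right censored"
            else:
                kind = "uncensored"
            spells.append((start, i - start, kind))
            start = None
    return spells


uncensored_example = care_spells([0, 1, 1, 0, 0])
censored_example = care_spells([1, 1, 0, 1])
```

Only uncensored spells have a known duration; for censored spells the observed length is a lower bound.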
We now turn to the selection of the control variables. Taking on the burden of care could theoretically be modeled as a three-stage process. Women provide care if (i) they need to. Given that they need to provide care, they (ii) must be willing to do so. Finally, (iii), they need to be able to provide care.¹¹

¹¹ Note that we do not explicitly model this three-stage process but that we just have it in mind. Which variable belongs to which stage is then just a matter of interpretation.

At the first stage, the event that someone close becomes care dependent is a prerequisite of the need to provide informal care. This first stage in general depends on the age and the intra-familial social environment. We model the social environment by using indicators whether parents are alive, their age as well as the number of siblings.¹² The latter can reduce the need to provide care for frail parents as siblings could step in. Variables on this stage are sometimes employed as instruments for care provision in other studies.

¹² However, the number of brothers does not seem to play a role statistically. Thus, in the empirical model we only focus on the number of sisters. An alternative specification that, among others, also uses the number of brothers can be found in the Supplementary Material.

At the second stage, given that someone close is in need of care, the willingness to provide care can be modeled as a function of socio-economic characteristics and personality traits. Socio-economic characteristics grouped in here are, e.g., own age, marital status, employment status, and level of education. Note, however, that family background variables might also belong to the first stage. For instance, singles do not need to care for a spouse or parents-in-law. Furthermore, we use character traits measured in the Big Five Inventory (Big5), well-known in psychology for being a proxy of human personality (see McRae and John, 1992 or Dehne and Schupp, 2007) as well as positive and negative reciprocity. Although the SOEP captures each item of the Big5 with relatively few questions in the 2005 and 2009 questionnaires, surveys revealed sufficient validity and reliability (see Dehne and Schupp, 2007). The items of the Big5 are: neuroticism, the tendency to experience negative emotions; extraversion, the tendency to be sociable; openness, the tendency to be imaginative and creative; agreeableness, the dimension of interpersonal relations; and conscientiousness, the dimension of being moral and organized (see Budria and Ferrer-i Carbonell, 2012). There are three questions for each of these items which are gathered on a 7-point scale. Furthermore, there is positive reciprocity, the tendency of being cooperative, and negative reciprocity, the tendency of being retaliatory. For each personality measure, the score is generated by averaging over the outcome of the corresponding questions per individual. Although these questions are asked only twice in the SOEP and in years after the treatment assignment,¹³ they are useful controls because these measures are supposed to be stable over a shorter period of time. The individual average of each measure is taken over all years as a proxy for time invariant personality.

¹³ The Big5 are included in the surveys in 2005 and 2009, whereas questions on negative and positive reciprocity are asked in 2005 and 2010.

Finally, on the third stage, the own health status determines the ability to provide care. As discussed in Section ‘2, we control for pre-treatment health (MCS and PCS). Moreover, we control for health satisfaction and life satisfaction. All control variables are listed in Table 4. Variables that might theoretically belong in the model but were not significant in the propensity score regression are left out. This holds, for instance, for income, the age of the father, the number of brothers, or calendar year dummies.

5. Results

5.1. Matching quality

Table 4 reports descriptive statistics of all covariates for different subgroups. It reveals that the mean as well as the standard deviation of the covariates are significantly different in the unweighted baseline sample. Column 4 gives the standardized difference between both means. Without matching almost all confounders are different at the 5% significance level between the carer and non-carer sample. In particular age, the age of the mother, and marital status exhibit large differences but also personality traits seem to be quite strong predictors of care provision. The kernel matching algorithm equalizes both samples by assigning different weights to each member of the control group. In order to compute these weights, we employ an Epanechnikov kernel with a bandwidth of 0.03. Whereas a bandwidth of 0.06 does not succeed in equalizing all covariates, a bandwidth of half the size balances every control variable to a standardized bias of around 5 or less.

As regards the propensity score, the regions of common support are roughly [0.04, 0.14] for the stratum of women who did not provide care in t = −1 and [0.23, 0.87] for those who did provide care. The overlap within each stratum is good as we do not lose treatment observations by restricting the sample to the common support.¹⁴ The low probabilities in the first stratum are simply due to the small number of caregivers. This indicates that there is a large unobserved component determining caregiving. But we argue that this unobserved heterogeneity is not a big concern given the estimation strategy outlined in Section 3. Yet, there is one advantage of this fuzziness: It brings about a sufficiently large number

¹⁴ Of course, this also means that the required overlap condition stating that some randomness is needed is ensured in our model (see Heckman et al., 1998).
Table 4
Descriptive statistics according to treatment and matching status.
The standardized difference is calculated according to $\mathrm{Diff} = 100 \cdot \dfrac{\bar{x}_1 - \bar{x}_0}{\sqrt{\frac{1}{2}\left(\sigma_1^2 + \sigma_0^2\right)}}$, where 0.06 and 0.03 refer to the employed kernel bandwidth. While the bandwidth of 0.06 is only shown for sake of illustration, 0.03 is used in the estimations.
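The standardized difference from the note to Table 4 can be computed as in the following sketch. It assumes that matching weights enter only the control-group mean while the variances are taken from the unweighted samples; this convention, the function name and the toy numbers are assumptions, not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treated, control, control_weights=None):
    """100 * (mean_1 - mean_0) / sqrt(0.5 * (var_1 + var_0)).

    If control_weights is given, the control mean is a weighted mean
    (e.g. with kernel matching weights); variances stay unweighted.
    """
    if control_weights is None:
        control_mean = mean(control)
    else:
        total = sum(control_weights)
        control_mean = sum(w * x for w, x in zip(control_weights, control)) / total
    pooled_sd = sqrt(0.5 * (variance(treated) + variance(control)))
    return 100.0 * (mean(treated) - control_mean) / pooled_sd


# Hypothetical covariate values for three treated and three control units.
treated = [1.0, 2.0, 3.0]
control = [0.0, 1.0, 2.0]
raw_diff = standardized_difference(treated, control)
matched_diff = standardized_difference(treated, control, [0.0, 1.0, 2.0])
```

In the toy data, weighting the controls moves their mean towards the treated mean and shrinks the standardized difference, which is the sense in which matching balances a covariate.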
of observations in the control group having a similar value of the estimated propensity score. This provides a hint that the results are not sensitive to a different choice of the matching methods.

5.2. Estimation results

The baseline estimation results are reported in Fig. 3 for both outcome variables MCS (3(a)) and PCS (3(b)). For convenience, we restrict this section to a graphical presentation of the results. Table A1 in the Appendix gives an overview of all results shown in this section. The dotted lines denote 95% confidence bands for the corresponding effect. Fig. 3 reports the results for both pre-treatment strata separately. Care starters (black points) are those who did not care in t = −1 and care continuers (light grey points) those who did care in t = −1. The confidence bands are wider for care continuers, since this is a much smaller group. The weighted average over both effects has confidence bands comparable to the black ones in Fig. 3.

Fig. 3. Baseline results MCS and PCS. Source: SOEP. Own calculations. Note: The dotted lines indicate 95% confidence bands.

The effects are remarkably similar for both groups. If a woman cares at least 2 h per day, her mental health score decreases by 2.00 units (or 20% of a standard deviation, SD)¹⁵ in the first year, all other things equal. Three years after treatment assignment, this effect reduces to 16% of an SD before settling at below 12% five and seven years afterwards. That is, women who provide care in t = 0 can expect to have a mental health score reduced by 12% of an SD seven years after. The confidence bands indicate significant results at the 5% level one and three years after assignment to treatment. The effects five and seven years after are insignificant because the point estimates attenuate but in the first place because the numbers of observations strongly drop. The magnitude of the effect after seven years, however, is still 60% of the baseline effect and thus not negligible. All in all it is fair to note that, independent of the previous care status, there is a considerable short-term effect of care provision on mental health (in line with findings from previous studies, e.g., Coe and van Houtven, 2009) which decreases over time without being completely irrelevant in its extent to those who care.

¹⁵ For convenience we already report the average effect over both groups here.

In contrast, for PCS (right panel), there is basically a zero effect throughout all periods and for both strata, providing evidence for negligible effects of informal care provision on physical health. Given the absence of physical health effects, we restrict our analysis to mental health in the following. Moreover, we only report averaged effects of both strata of care provision in t = −1. Fig. 4 presents the results for alternative daily care intensities and different definitions of the control group.¹⁶ In Fig. 4(a) we compare the effect when care provision of at least 2 h per day is used to define the treatment indicator (light grey-dashed line, the baseline specification) with 1 h per day (black line) and 3 h per day (dark grey-dashed line). There are basically no differences in the effect between 1 and 2 h of care as a definition. As regards 3 h of care we find a considerably stronger short-term effect with a reduction of MCS by 31% of an SD. This probably reflects a higher burden of higher care intensities. Subsequently, however, the effect does not remain on this high level. It immediately drops back to regions similar to those for 1 and 2 h. Most notably, the qualitative result of a considerable short-term effect and a much smaller medium-term effect remains unchanged regardless of the care intensity.¹⁷

The definition of treatment and control group only in t = 0 allows for cases where individuals in the control group start to provide care in later years. This is in fact the case for some 15% of all observations in the control group. It might be suspected that these individuals suffer from a short-term mental health drop later which, compared with the effects in the treatment group, leads to the observed relative decline in the mental health drop of the treatment group.

In order to test if this drives the results, we exclude all individuals from the control group that provided care in any year between t = 1 and t = 7. That is, we only use individuals in the lowest path in Fig. 2(b). In principle, this is not a desirable specification as it bases the control group definition on later outcomes. Thus, it should only be regarded as a brief check whether these individuals drive the results observed above. Fig. 4(b) shows that this is not the case. The results are largely the same.

The results suggest a significant short-term effect of informal care provision on mental health while there is a smaller and not significant medium-term effect. Given that the vast majority of individuals provide care for about one to three years, the main pathway of these effects is probably the following one. Contemporaneously, care provision is a mentally burdensome task. The short-term effects are mostly generated by individuals who just stopped providing care or who are still providing care in t = 1. As to be expected, this effect increases in care intensity. Yet, after the care episode has ceased, individuals recover and their mental health status approaches former levels.

The short-term effect is not necessarily entirely due to care provision. It might be a joint effect of care and the observation of the decline of a beloved person. Like most of the previous literature, we cannot disentangle the family effect from the active caregiving effect. As results of Bobinac et al. (2010) suggest, the overall effect is a mixture of both but a caregiving effect remains after controlling for the family effect. Yet, this does not affect the interpretation of the medium-run effect of almost no mental health impairment a couple of years after care provision. Given that the effect in t = 7 is very small, it can be concluded that there is little evidence for a scarring effect of care provision. Moreover, since only a handful of individuals in the sample care throughout the entire observation period, this result can apparently not be explained by an adaptation effect of care providers to their new situation.

In Section 4 we mentioned that we cannot stratify the analysis with respect to the care recipient as we do not have information on who is being cared for. We can, however, approach such an analysis by splitting the sample into caregivers below and above the age of 60. The former group has a higher likelihood to care for a parent while the latter should be more likely to care for a spouse.
Note that stronger restrictions such an age cutoff at 70 or groups
such as unmarried women with at least one parent alive are hardly
16
In the Supplementary Material, we also report the results for females caring 4 h
feasible due to strongly reduced numbers of observations. Fig. 5
and more. The results are comparable.
17
Although not shown here, also the PCS results are robust to these different
shows the effect over time for both subgroups. Initially, they coin-
definitions. cide nearly perfectly. Five years after care is observed, they deviate
182 H. Schmitz, M. Westphal / Journal of Health Economics 42 (2015) 174–185
from each other. Whereas younger carers drop back almost to the initial level, for older carers the impact on their mental score is even stronger. The results could be interpreted such that the active caregiving effect does not depend on the care recipient. However, a likely family effect might arguably be stronger in case of care provision for a spouse than for oldest old parents. However, the effects come closer after seven years and due to large confidence bands one should interpret these results cautiously.
Fig. 4. Alternative definitions of treatment and control groups (MCS only). Source: SOEP. Own calculations. Note: The dotted lines indicate 95% confidence bands.
Fig. 5. Alternative definitions of treatment and control groups (MCS only). Source: SOEP. Own calculations. Note: The dotted lines indicate 95% confidence bands.
Altogether, the results from this section could be interpreted as good news. While there is a considerable negative short-term effect of contemporaneous caregiving, the scarring effect is less likely to be prevalent. One negative interpretation of these results could, however, be an increased consumption of antidepressants as found by Van Houtven et al. (2005) and Schmitz and Stroka (2013) for the short run. If this held for the long run, the mental health score might increase over time due to drug consumption and not due to improved health. Whether this is the case or not requires long-term data on care and drug consumption and is left for future research.
In the Supplementary Material we report results from alternative specifications of the propensity score, the treatment indicator and a subgroup analysis for unmarried women with at least one parent alive who arguably can be identified as caring for their parents.
6. Sensitivity analysis
Thus far, we have argued that our estimation strategy allows us to interpret the results in a causal manner since, by fully exploiting the panel information in the SOEP, the conditional independence assumption is likely to hold. However, this inherently untestable assumption might nevertheless fail. For example, in the context of care, it might be particularly challenging to properly control for intrinsic willingness to provide care. Yet, the conditional independence assumption is not necessarily an “all or nothing” assumption and there might be different degrees of its violation. To examine to what extent the magnitude and the significance of our results depend on the potential exclusion of a relevant variable, we follow an approach by Ichino et al. (2008) who refined the suggestions for sensitivity analyses by Rosenbaum and Rubin (1983) and Imbens (2003). This analysis is also in the spirit of the one suggested by Altonji et al. (2005) without the need to make strong parametric assumptions.
Assume that the conditional independence assumption does not hold but that the failure is due to an unobserved variable U. If we could condition on it, we would be able to restore conditional independence:
$Y_0 \perp\!\!\!\perp T \mid (X, U).$
18 This section contains a non-technical and intuitive discussion of the analysis. A more detailed account is provided in the Supplementary Material published online. For an extensive treatment, refer to Ichino et al. (2008).
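The role of the unobserved variable U can be illustrated with a small simulation. The sketch below is only an illustration of the idea that conditioning on U restores conditional independence; it is not the procedure of Ichino et al. (2008) or the sensatt command. All parameter values, the data-generating process, and the function names are hypothetical assumptions, and observed covariates X are omitted for brevity.

```python
import random

def simulate(n=20000, seed=0, true_effect=-2.0, u_effect=-3.0,
             p_u=0.5, p_treat_u1=0.6, p_treat_u0=0.2):
    """Draw (outcome, treated, u) triples from a hypothetical process.

    U is a binary confounder: it raises the probability of treatment
    and lowers the outcome. All numbers are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.random() < p_u
        treated = rng.random() < (p_treat_u1 if u else p_treat_u0)
        outcome = 50.0 + true_effect * treated + u_effect * u + rng.gauss(0.0, 5.0)
        rows.append((outcome, treated, u))
    return rows

def mean(values):
    return sum(values) / len(values)

def diff_in_means(rows):
    """Treated-minus-control difference in mean outcomes."""
    treated = [y for y, t, _ in rows if t]
    control = [y for y, t, _ in rows if not t]
    return mean(treated) - mean(control)

def att_conditional_on_u(rows):
    """Effect on the treated, comparing treated and controls within each U stratum."""
    n_treated_total = sum(1 for _, t, _ in rows if t)
    estimate = 0.0
    for u_value in (False, True):
        stratum = [row for row in rows if row[2] == u_value]
        n_treated = sum(1 for _, t, _ in stratum if t)
        estimate += diff_in_means(stratum) * n_treated / n_treated_total
    return estimate

rows = simulate()
naive = diff_in_means(rows)            # ignores U, so it absorbs the confounding
adjusted = att_conditional_on_u(rows)  # conditions on U
print(f"naive: {naive:.2f}, conditional on U: {adjusted:.2f}")
```

In this hypothetical setup the naive comparison overstates the negative effect because U is more common among the treated, while the comparison within U strata comes close to the assumed true effect of −2.0. A sensitivity analysis asks how strong such a U would have to be to change the conclusions.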
19 We use a modified version of the user-written Stata command sensatt (Nannicini, 2007).
20 http://www.bmg.bund.de/ministerium/presse/english-version.html
problems, or professional short-term care (also overnight) in case of short-term absence of informal care providers due to sickness, obligations in the job, or holidays. Thus, while family members will certainly continue to play an important role in care provision, these measures are thought to assist them and to reduce the most stressful aspects of care.
The measured effect in this study is an average effect over different groups of care providers. Schmitz and Stroka (2013), for instance, focus on individuals who not only provide informal care but also work full-time. This double burden might well also have health effects in the longer run. This question is left for future research. The main limitations in this study arise from the imperfect data set. Both measures of care provision as well as health indicators are self-reported and potentially measured with error. We do not observe any characteristics of the care recipient. Hence, we cannot distinguish between the family effect that occurs just because a close relative is in need of care and the caregiving effect. However, this should not qualitatively affect the interpretation of the already small medium-term effect. Likewise, as it is not observed whether care recipients receive additional professional care or only informal care, we cannot discriminate between cases in which the caregiver assists professional care and in which she is the only care provider. Moreover, due to data restrictions we are not able to identify the cumulative effect of care provision for many consecutive years. This might go along with even long-run health impairments. However, our representative data suggest that only a very small group of women is faced with the need (and willingness) to provide care for many consecutive years. Moreover, we argue that our approach allows us to answer a question that is more relevant from an individual perspective: if I provide care today, what is my expected health outcome in seven years (irrespective of future events that I cannot control today).
Appendix A.
Tables A1 and A2
Table A1
Table of results.
                                   t=1         t=3         t=5         t=7
Care 2 h per day (baseline)        −2.00***    −1.64***    −1.01       −1.19
                                   (0.39)      (0.47)      (0.62)      (0.86)
...care starters                   −2.03***    −1.67***    −1.02       −1.21
                                   (0.40)      (0.49)      (0.64)      (0.88)
...continued care                  −1.42**     −0.93*      −0.70       −0.94
                                   (0.61)      (0.74)      (0.89)      (1.51)
Care 3 h per day                   −3.02***    −1.44**     −1.00       −1.64
                                   (0.53)      (0.68)      (0.78)      (1.12)
Care 1 h per day                   −1.90***    −1.59***    −0.48       −0.97
                                   (0.31)      (0.39)      (0.48)      (0.67)
Observations                       28,622      20,288      12,254      5,552
Only never carers in control group
Care 2 h per day                   −2.08***    −1.56***    −1.25**     −0.787
                                   (0.27)      (0.34)      (0.46)      (0.69)
Observations                       25,914      18,464      11,301      5,166
PCS as outcome
Care 2 h per day                   0.14        0.14        0.08        −1.01
                                   (0.33)      (0.40)      (0.49)      (0.84)
Source: SOEP, own calculations. Note: * p < 0.1; ** p < 0.05; *** p < 0.01 indicate the corresponding significance level. Standard errors are in parentheses.
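As a reading aid for Table A1, the following sketch turns the baseline estimates and standard errors (care 2 h per day, MCS) into approximate 95% confidence intervals. It assumes a normal approximation with a critical value of 1.96; this is an assumption made for illustration and not necessarily how the confidence bands in the figures were computed.

```python
# Baseline MCS estimates and standard errors copied from Table A1
# (row "Care 2 h per day (baseline)"), keyed by period t.
BASELINE = {
    1: (-2.00, 0.39),
    3: (-1.64, 0.47),
    5: (-1.01, 0.62),
    7: (-1.19, 0.86),
}

Z_95 = 1.96  # assumption: two-sided 95% critical value of the normal distribution

def ci95(estimate, std_error, z=Z_95):
    """Return the (lower, upper) bounds of an approximate 95% interval."""
    return estimate - z * std_error, estimate + z * std_error

intervals = {t: ci95(*BASELINE[t]) for t in sorted(BASELINE)}

for t, (lower, upper) in intervals.items():
    excludes_zero = upper < 0 or lower > 0
    print(f"t={t}: [{lower:.2f}, {upper:.2f}] excludes zero: {excludes_zero}")
```

Under this approximation the intervals for t = 1 and t = 3 exclude zero while those for t = 5 and t = 7 include it, which matches the significance stars reported for the baseline row.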
Table A2
SF-12v2 questionnaire in the SOEP.
When you ascend stairs, i.e. go up several floors on foot: Does your
state of health affect you greatly, slightly or not at all?
And what about having to cope with other tiring everyday tasks,
i.e. where one has to lift something heavy or where one requires
agility: Does your state of health affect you greatly, slightly or
not at all?
Please think about the last four weeks. How often did it occur within this period of time, . . . (Always / Often / Sometimes / Almost never / Never)
Appendix B. Supplementary Data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jhealeco.2015.03.002
References
Altonji, J.G., Elder, T.E., Taber, C.R., 2005. Selection on observed and unobserved variables: assessing the effectiveness of catholic schools. Journal of Political Economy 113 (1), 151–184.
Alzheimer’s Disease International, 2013. World Alzheimer Report 2013. Journey of Caring – An Analysis of Long-Term Care for Dementia. Technical Report. Alzheimer’s Disease International (ADI), London.
Andersen, H.H., Mühlbacher, A., Nübling, M., Schupp, J., Wagner, G.G., 2007. Computation of standard values for physical and mental health scale scores using the SOEP version of SF-12v2. Schmollers Jahrbuch 127, 171–182.
Augurzky, B., Reichert, A., Schmidt, C.M., 2012. The effect of a bonus program for preventive health behavior on health expenditures. Ruhr Economic Papers 373, Essen.
Bakx, P., de Meijer, C., Schut, F., van Doorslaer, E., 2015. Going formal or informal, who cares? The influence of public long-term care insurance. Health Economics 24 (6), 631–643.
Bang, H., Robins, J.M., 2005. Doubly robust estimation in missing data and causal inference models. Biometrics 61 (4), 962–973.
Beach, S.R., Schulz, R., Yee, J.L., Jackson, S., 2000. Negative and positive health effects of caring for a disabled spouse: longitudinal findings from the caregiver health effects study. Psychology and Aging 15 (2), 259–271.
Bobinac, A., van Exel, N.J.A., Rutten, F.F., Brouwer, W.B., 2010. Caring for and caring about: disentangling the caregiver effect and the family effect. Journal of Health Economics 29 (4), 549–556.
Bolin, K., Lindgren, B., Lundborg, P., 2008. Your next of kin or your own career? Caring and working among the 50+ of europe. Journal of Health Economics 27 (3), 718–738.
Budria, S., Ferrer-i Carbonell, A., 2012. Income comparisons and non-cognitive skills. In: SOEPpapers No. 441., pp. 1–29.
Carmichael, F., Charles, S., 2003. The opportunity costs of informal care: does gender matter? Journal of Health Economics 22 (5), 781–803.
Coe, N.B., van Houtven, C.H., 2009. Caring for mom and neglecting yourself? The health effects of caring for an elderly parent. Health Economics 18 (9), 991–1010.
Colvez, A., Joel, M.-E., Ponton-Sanchez, A., Royer, A.-C., 2002. Health status and work burden of Alzheimer patients’ informal caregivers: comparisons of five different care programs in the European Union. Health Policy 60 (3), 219–233.
Dehne, M., Schupp, J., 2007. Persoenlichkeitsmerkmale im Sozio-ökonomischen Panel (SOEP) – Konzept, Umsetzung und empirische Eigenschaften. Technical report.
Di Novi, C., Jacobs, R., Migheli, M., 2013. The Quality of Life of Female Informal Caregivers: From Scandinavia to the Mediterranean Sea. CHE Research Paper 84. Centre for Health Economics, University of York.
Do, Y.K., Norton, E.C., Stearns, S., Houtven, C.H.V., 2015. Informal care and caregiver’s health. Health Economics 24 (2), 224–237.
Dunkin, J.J., Anderson-Hanley, C., 1998. Dementia caregiver burden – a review of the literature and guidelines for assessment and intervention. Neurology 51 (1), 53–60.
Gallicchio, L., Siddiqi, N., Langenberg, P., Baumgarten, M., 2002. Gender differences in burden and depression among informal caregivers of demented elders in the community. International Journal of Geriatric Psychiatry 17 (2), 154–163.
García-Gómez, P., 2011. Institutions, health shocks and labour market outcomes across Europe. Journal of Health Economics 30 (1), 200–213.
Gill, S.C., Butterworth, P., Rodgers, B., Mackinnon, A., 2007. Validity of the mental health component scale of the 12-item short-form health survey (mcs-12) as measure of common mental disorders in the general population. Psychiatry Research 152 (1), 63–71.
Heckman, J., Ichimura, H., Smith, J., Todd, P., 1998. Characterizing selection bias using experimental data. Econometrica 66 (5), 1017–1098.
Heitmueller, A., 2007. The chicken or the egg? Endogeneity in labour market participation of informal carers in England. Journal of Health Economics 26 (3), 536–559.
Heitmueller, A., Inglis, K., 2007. The earnings of informal carers: wage differentials and opportunity costs. Journal of Health Economics 26 (4), 821–841.
Ho, S.C., Chan, A., Woo, J., Chong, P., Sham, A., 2009. Impact of caregiving on health and quality of life: a comparative population-based study of caregivers for elderly persons and noncaregivers. The Journals of Gerontology, Series A: Biological Sciences and Medical Sciences 64 (8), 873–879.
Ichino, A., Mealli, F., Nannicini, T., 2008. From temporary help jobs to permanent employment: what can we learn from matching estimators and their sensitivity. Journal of Applied Econometrics 23, 305–327.
Imbens, G.W., 2003. Sensitivity to exogeneity assumption in program evaluation. American Economic Review 93 (2), 126–132.
Lechner, M., 2009a. Long-run labour market and health effects of individual sport activities. Journal of Health Economics 28, 839–854.
Lechner, M., 2009b. Sequential causal models for the evaluation of labor market programs. Journal of Business and Economic Statistics 27, 71–83.
Lechner, M., Miquel, R., 2010. Identification of the effects of dynamic treatments by sequential conditional independence assumptions. Empirical Economics 39 (1), 111–137.
Lechner, M., Miquel, R., Wunsch, C., 2011. Long-run effects of public sector sponsored training in West Germany. Journal of the European Economic Association 9 (4), 742–784.
Lee, S., Colditz, G.A., Berkman, L.F., Kawachi, I., 2003. Caregiving and risk of coronary heart disease in U.S. women: a prospective study. American Journal of Preventive Medicine 24 (2), 113–119.
Leigh, A., 2010. Informal care and labor market participation. Labour Economics 17 (1), 140–149.
Marcus, J., 2013. The effect of unemployment on the mental health of spouses – evidence from plant closures in Germany. Journal of Health Economics 32, 546–558.
Marcus, J., 2014. Does job loss make you smoke and gain weight? Economica 324 (81), 626–648.
McRae, R.R., John, O.P., 1992. An introduction to the five factor model and its applications. Journal of Personality and Social Psychology 60 (2), 175–215.
Meng, A., 2013. Informal home care and labor-force participation of household members. Empirical Economics 44 (2), 959–979.
Nannicini, T., 2007. Simulation-based sensitivity analysis for matching estimators. Stata Journal 7 (3), 334–350.
Reichert, A., Tauchmann, H., 2011. The Causal Impact of Fear of Unemployment on Psychological Health. Ruhr Economic Papers 266.
Rosenbaum, P.R., Rubin, D.B., 1983. Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. Journal of the Royal Statistical Society, Series B: Methodological 45 (2), 212–218.
Rothgang, H., 2010. Social insurance for long-term care: an evaluation of the German model. Social Policy and Administration 44 (4), 436–460.
Rubin, D.B., 1974. Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology 56 (5), 688–701.
Rubin, D.B., 1979. Using multivariate matched sampling and regression adjustment to control bias in observational studies. Journal of the American Statistical Association 74 (366), 318–328.
Salyers, M.P., Bosworth, H.B., Swanson, J.W., Lamb-Pagone, J., Osher, F.C., 2000. Reliability and validity of the sf-12 health survey among people with severe mental illness. Medical Care 38 (11), 1141–1150.
Schmidt, M., Schneekloth, U., 2011. Abschlussbericht zur Studie “Wirkungen des Pflege-Weiterentwicklungsgesetzes”.
Schmitz, H., 2011. Why are the unemployed in worse health? The causal effect of unemployment on health. Labour Economics 18 (1), 71–78.
Schmitz, H., Stroka, M.A., 2013. Health and the double burden of full-time work and informal care provision: evidence from administrative data. Labour Economics 24, 305–322.
Schneekloth, U., Leven, I., 2003. Hilfe und Pflegebedürftige in Privathaushalten in Deutschland 2002.
Schulz, E., 2010. The Long-Term Care System for the Elderly in Germany. DIW Discussion Paper 1039, DIW Berlin.
Schulz, R., O’Brien, A., Bookwala, J., Fleissner, K., 1995. Psychiatric and physical morbidity effects of dementia caregiving: prevalence, correlates, and causes. Gerontologist 35 (6), 771–791.
Shaw, W.S., Patterson, T.L., Ziegler, M.G., Dimsdale, J.E., Semple, S.J., Grant, I., 1999. Accelerated risk of hypertensive blood pressure recordings among Alzheimer caregivers. Journal of Psychosomatic Research 46 (3), 215–227.
Stephen, M.A., Townsend, A.L., Martire, L.M., Druley, J.A., 2001. Balancing parent care with other roles: interrole conflict of adult daughter caregivers. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences 56 (1), P24–P31.
Tennstedt, S., Cafferata, G.L., Sullivan, L., 1992. Depression among caregivers of impaired elders. Journal of Ageing and Health 4 (1), 58–76.
Van den Berg, B., Ferrer-i Carbonell, A., 2007. Monetary valuation of informal care: the well-being valuation method. Health Economics 16 (11), 1227–1244.
van den Berg, B., Fiebig, D.G., Hall, J., 2014. Well-being losses due to care-giving. Journal of Health Economics 35 (0), 123–131.
Van Houtven, C., Wilson, M., Clipp, E., 2005. Informal care intensity and caregiver drug utilization. Review of Economics of the Household 3 (4), 415–433.
Van Houtven, C.H., Coe, N.B., Skira, M.M., 2013. The effect of informal care on work and wages. Journal of Health Economics 32 (1), 240–252.
Vilagut, G., Forero, C.G., Pinto-Meza, A., Haro, J.M., de Graaf, R., Bruffaerts, R., Kovess, V., de Girolamo, G., Matschinger, H., Ferrer, M., Alonso, J., 2013. The mental component of the short-form 12 health survey (sf-12) as a measure of depressive disorders in the general population: results with three alternative scoring methods. Value in Health 16 (4), 564–573.
Wagner, G.G., Frick, J.R., Schupp, J., 2007. The German Socio-Economic Panel Study (SOEP), scope, evolution, and enhancements. Journal of Applied Social Science Studies (Schmollers Jahrbuch: Zeitschrift für Wirtschafts- und Sozialwissenschaften) 127 (1), 139–169.
Ware, J.E., Kosinski, M., Keller, S.D., 1996. A 12-item short-form health survey: construction of scales and preliminary tests of reliability and validity. Medical Care 34 (3), 220–233.
Journal of Health Economics 42 (2015) 186–196
Article info
Article history: Received 23 April 2014; Received in revised form 13 April 2015; Accepted 20 April 2015; Available online 29 April 2015
JEL classification: I18; I12; D12
Keywords: Economic evaluation; Cigarette taxes; Package warnings; Advertising bans; Tobacco control
Abstract
We analyzed a nationwide registry of all pregnancies in Uruguay during 2007–2013 to assess the impact of three types of tobacco control policies: (1) provider-level interventions aimed at the treatment of nicotine dependence, (2) national-level increases in cigarette taxes, and (3) national-level non-price regulation of cigarette packaging and marketing. We estimated models of smoking cessation during pregnancy at the individual, provider and national levels. The rate of smoking cessation during pregnancy increased from 15.4% in 2007 to 42.7% in 2013. National-level non-price policies had the largest estimated impact on cessation. The price response of the tobacco industry attenuated the effects of tax increases. While provider-level interventions had a significant effect, they were adopted by relatively few health centers. Quitting during pregnancy increased birth weight by an estimated 188 g. Tobacco control measures had no effect on the birth weight of newborns of non-smoking women.
© 2015 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhealeco.2015.04.002
0167-6296/© 2015 Elsevier B.V. All rights reserved.
J.E. Harris et al. / Journal of Health Economics 42 (2015) 186–196 187
not address the quantitative contributions of individual campaign components. Pursuing that objective here, we classify the interventions implemented in Uruguay during 2007–2013 into three categories: (1) provider-level interventions aimed at the treatment of nicotine dependence, (2) national-level increases in cigarette taxes, and (3) national-level non-price regulation of cigarette packaging and marketing. We study the effects of these individual campaign components on a critical target population – pregnant women.
Studying the population of pregnant women is important not only for the well-recognized adverse health consequences of smoking during pregnancy (Permutt and Hebel, 1989; da Veiga and Wilder, 2008; McCowan et al., 2009), but also for the narrow nine-month window during which pregnant women have heightened susceptibility to health-related interventions. We take advantage of a continuous nationwide registry of all live pregnancies from 2007 to 2013 to study the effects of the campaign on two main outcomes: the probability that a pregnant smoker will quit smoking by her third trimester and her infant’s birth weight.
To identify the effect of the provider-level interventions, we use a difference-in-differences (DID) approach, exploiting the fact that these policies were implemented at different health centers at different times. To assess the effect of taxes, we rely upon a series of discrete tax increases during our study period. Finally, to assess the effects of non-price regulation of packaging and marketing, we take advantage of the fact that these nationwide measures went into effect at different times. As an additional control, we compare the effect of these interventions on the birth weight of children whose mothers smoked during pregnancy with the corresponding effect, if any, on the offspring of mothers who did not smoke.
Our study contributes to an extensive literature evaluating the impact of such tobacco control policies as tax increases, control of environmental tobacco smoke, cigarette pack warnings, restrictions on cigarette marketing, regulation of tobacco constituents, mass media anti-smoking campaigns, and the treatment of addiction (Saffer and Chaloupka, 2000; Wakefield and Chaloupka, 2000; Powell et al., 2005; Blecher, 2008; Carpenter and Cook, 2008; DeCicca et al., 2008; Anger et al., 2011; Hammond, 2011; Hoek et al., 2011; Chaloupka et al., 2012; Emery et al., 2012; Mons et al., 2013). Our work is distinguishable in that we exploit an extensive micro database to evaluate the relative impacts of multiple types of interventions in the context of a nationwide tobacco control campaign conducted in a developing country.
We find persuasive evidence on the impact of each of the three policy categories analyzed – provider-level interventions, taxes, and non-price policies – on the likelihood of quitting smoking during pregnancy and on birth weight. In terms of the relative contributions of each of these policies to the observed increase in quit rates, the regulation of marketing and packaging had the strongest effect, accounting for 71% of the total observed variation in quit rates during 2007–2013. While interventions to treat nicotine dependence had a strong effect at the level of the individual provider, relatively few prenatal care centers adopted these interventions during the study period, thus contributing little to the overall increase in the quit rate. Tax increases, on the other hand, explained an estimated 25% of the variation in quit rates during 2007–2013. While real taxes increased 122% during that time, the tobacco industry passed on only a fraction of the tax increases to consumers, so that the real cigarette price increased by only 17%. Finally, we find that smoking cessation was associated with a significant increase in birth weight. By contrast, the tobacco control policies under study had no effect on the birth weight of offspring of mothers who did not smoke.
2. Background and data
2.1. Nationwide anti-smoking policies
In 2005, one year after the legislature had ratified the Framework Convention on Tobacco Control, Uruguay’s newly elected administration launched a National Program for Tobacco Control that formed the basis for a succession of progressively more stringent tobacco control policies (Abascal et al., 2012). In March 2006, all enclosed public spaces and all public and private workspaces were declared 100% smoke-free. In June 2008, the scope of tobacco-free spaces was extended to taxis, buses, airplanes and other public transport.
These curbs on environmental tobacco smoke were paralleled by a series of advertising restrictions on tobacco products. In May 2005 the government banned cigarette advertising on television during children’s viewing hours (before 9:30 pm) and prohibited advertising, promotion or sponsorship by tobacco companies of all sporting events. These restrictions were subsequently codified in March 2008, when comprehensive tobacco control legislation (Law 18.256) prohibited all advertising and promotion of tobacco products except at point of sale. In October 2008, logos, trademarks and other tobacco-related symbols were banned on non-tobacco products. In May 2014, all advertising was prohibited, even at the point of sale.
In addition, the Uruguayan government promulgated warning requirements on cigarette packages and imposed restrictions on manufacturers’ branding practices. A May 2005 ministerial decree banned all references to “light,” “ultra light,” “mild,” “low tar” and other descriptors that might misleadingly imply reduced harm. The decree also mandated a series of rotating warnings with images covering 50% of the front and back of each cigarette pack. The deadline for compliance with the first round of these rotating warnings was April 2006. Subsequent rounds had respective deadlines of December 2007, February 2009, February 2010, January 2012, and April 2013. A “single presentation rule,” issued as a ministerial decree along with the third round of warnings, barred the marketing of multiple versions of the same brand, such as Silver or Blue. Finally, a 2009 decree mandated that the size of the warnings be increased to 80% of the front and back of each pack. This requirement was implemented with the fourth round of warnings and became effective by February 2010.1
Fig. 1 shows a timeline summarizing the major nationwide non-price regulatory measures from 2005 to 2013. The blue text describes each of the six rounds of package warnings, while the boldface red text describes regulatory measures other than the mandated warnings. The black lines point to the compliance deadlines for each regulatory measure.2
Fig. 2 further describes the six rounds of rotating package warnings. In each round, we show only one of several mandated images. The relative sizes of the images in the figure correspond to their relative sizes on each pack, with the last three rounds reflecting the required increase from 50% to 80% of the front and back surfaces.
1 This “80% rule” was promulgated 3 months before the issuance of the fourth round of images. However, we have no evidence of significant compliance with the 80% rule before the deadline for compliance with the fourth round of images.
2 With the exception of the comprehensive tobacco control law, all measures provided for a 180-day compliance period. By specifying the end of the compliance period as the effective date of each measure, we assumed that tobacco manufacturers waited until each deadline to comply.
2.2. Smoking cessation programs directed at healthcare providers
In 2008, the comprehensive tobacco control law mandated that every primary care provider, whether public or private, incorporate
the diagnosis and treatment of tobacco dependence into its menu of basic services. Pursuant to this legislation, in 2009 the Ministry of Public Health and the National Resource Fund (“Fondo Nacional de Recursos” or FNR), the governmental agency responsible for financing resource-intensive medical technologies, established national guidelines for primary care providers on the diagnosis and treatment of nicotine dependence. Through a set of agreements (“convenios”) with the FNR, healthcare institutions were eligible to receive training and free nicotine patches and bupropion in return for setting up a smoking cessation program with little or no patient copayments (Esteves et al., 2011). Health centers that had no agreements with the FNR were required to provide smoking cessation services in accordance with the guidelines, but they were permitted to charge nontrivial copayments to patients. In what follows, we refer to these agreements between health centers and the FNR as “provider-level agreements.”
Among all sites providing prenatal care to pregnant women, the proportion with FNR agreements increased from 7 to 12% during 2005–2013. Concurrently, the proportion of all pregnant women receiving prenatal care at sites with FNR agreements increased from 13% in 2005 to 36% in 2007, but then declined to 31% by 2013.
[Figure 1: timeline with a horizontal axis running from 2005 to 2014.]
Fig. 1. Timeline of nationwide non-price tobacco control measures. The blue text refers to the deadlines for each of the six rounds of rotating package warnings, while the boldface red text refers to other tobacco control measures.
Fig. 2. Timeline of six rounds of rotating package warnings. Each round displays only one of several mandated images. The relative sizes of the images correspond to their
relative sizes on each pack, with the last three rounds reflecting the required increase from 50% to 80% of the front and back surfaces.
[Figure 3: line chart of real price and real taxes (excise + VAT), in Uruguayan pesos per pack of 20 cigarettes (base = December 2010), with vertical lines labeled A to H.]
Fig. 3. Real price and real taxes per pack of cigarettes, 2001–2013. The vertical lines show the timing of non-price nationwide policy measures, as described in Fig. 1.
(A) Smoking prohibited in all enclosed public spaces and all public and private workspaces. (B) 1st round of package warnings. (C) 2nd round of package warnings. (D)
Comprehensive tobacco control legislation. (E) 3rd round of package warnings; brands restricted to a single presentation. (F) 4th round of package warnings; warnings must
cover 80% of front and back. (G) 5th round of package warnings. (H) 6th round of package warnings.
2.3. Cigarette tax increases
In addition to the foregoing policy interventions, the Uruguayan government increased its indirect taxes on tobacco products. Imposed solely at the national level, these taxes consist of an excise tax (“impuesto específico interno” or IMESI) and a value added tax (“impuesto al valor agregado” or IVA). The IMESI, which was first applied to cigarettes in 1993, underwent a series of discrete increases in June 2002, May 2003, July 2007, June 2009, and February 2010. The IVA, by contrast, was first applied to cigarettes in July 2007 and since then has constituted 22% of the pre-tax price including the IMESI or, equivalently, 18% of the retail price.
Fig. 3 shows the estimated real price and real tax on a pack of cigarettes during 2001–2013. Only 45% of the abrupt increase in cigarette taxes in July 2007 was passed on to consumers in the form of higher retail prices. The construction of the real price and tax series is described in our working paper (Harris et al., 2014).
In Uruguay, an estimated 99% of tobacco users smoke manufactured cigarettes, hand-rolled cigarettes, or both (Abascal et al., 2012). Manufactured cigarettes, in particular, make up more than 85% of taxable cigarette consumption (Dirección General Impositiva, 2012). During 2004–2012, by one estimate, contraband cigarette sales constituted approximately 12% of total cigarette consumption on average (Curti, 2013). With the possible exception of less densely populated provinces (“departamentos”) along Uruguay’s borders with Brazil and Argentina, where contraband tobacco use appears more prevalent, there has been little effective geographical variation in retail price.
2.4. Perinatal information system (SIP)
Our source of micro data on the smoking practices of pregnant women was the Perinatal Information System (“Sistema Informático Perinatal” or SIP), a mandatory nationwide electronic registry operating in all prenatal care clinics in Uruguay since 1990. Developed and overseen by the Latin American Center for Perinatology (“Centro Latinoamericano de Perinatología” or CLAP) of the Pan American Health Organization, the database contained information at the level of the individual pregnancy on maternal characteristics, self-reported smoking behavior, current and past obstetric history, the timing of prenatal care, the sites of prenatal care and delivery, and birth outcomes including birth weight (CLAP, 2001). In 2012, the SIP covered an estimated 94% of all live births in Uruguay.
Our analyses relied upon the following individual-level maternal characteristics, derived from the SIP registry: the timing of the first prenatal visit (first-trimester prenatal care); the mother’s age (<16, 17–19, 20–34, 35–39, and 40+ years), marital status (single, married, cohabiting, other), and educational attainment (primary, secondary, university); the number of prior deliveries (0, 1, 2, 3, 4+); number of prior abortions; a history of diabetes or hypertension; whether any complications of pregnancy were observed, in particular, the presence of preeclampsia or eclampsia; the mother’s body mass index based on her self-reported height and weight prior to the pregnancy (underweight, normal weight, overweight, obese); the mother’s use of alcohol or illicit drugs; the sites of prenatal care; and the newborn’s sex and birth weight.3
3 To avoid loss of observations, we included dummy variables equal to 1 when some maternal characteristics were missing. For further details on maternal and pregnancy characteristics, including descriptive statistics, see our working paper (Harris et al., 2014).
Prior to 2007, each individual record in the SIP database contained the pregnant woman’s smoking status only at the time of initiation of prenatal care. It did not show changes in smoking, if any, during the course of her pregnancy. Under a new data entry system beginning in 2007, the prenatal record noted the woman’s smoking status separately in each trimester of her pregnancy. For example, if a woman initiated prenatal care in her second trimester, the healthcare provider recorded her smoking status in the first trimester, based on her recall, as well as in the current trimester. Her smoking status would subsequently be recorded in a follow-up prenatal visit during her third trimester. The perinatal data derived from this new system, which we refer to as the “new SIP,” were the focus of our analysis.
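The equivalence stated in Section 2.3 between a 22% value added tax on the pre-tax price and roughly 18% of the retail price can be checked with a line of arithmetic. The sketch below is illustrative only; the function name `vat_share_of_retail` is a hypothetical helper and not part of the original analysis.

```python
def vat_share_of_retail(vat_rate: float) -> float:
    """Share of the tax-inclusive retail price accounted for by a VAT
    levied at `vat_rate` on the pre-VAT price (here, price including IMESI)."""
    # Retail price = pre-VAT price * (1 + vat_rate), so the VAT share
    # of the retail price is vat_rate / (1 + vat_rate).
    return vat_rate / (1.0 + vat_rate)


share = vat_share_of_retail(0.22)
print(f"{share:.3f}")  # 0.180
```

A tax quoted on a tax-exclusive base always corresponds to a smaller share of the tax-inclusive price, which is why the two percentages differ.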
3. Principal endpoints: smoking cessation and birth weight
To assess the impact of Uruguay’s tobacco control campaign, we focused our attention primarily on pregnant women who smoked cigarettes at any time from the first prenatal visit onward. Within this target group of pregnant smokers, our principal endpoint was smoking cessation by the third trimester. Our analysis of this endpoint was confined to the interval from 2007 to 2013, when data on smoking habits during each trimester were available through the new SIP system. Fig. 4 shows the progressive increase in the annual mean quit rate among pregnant smokers from 15.4% in 2007 to 42.7% in 2013. Our main research objective was to determine what proportion of this substantial rise in quit rates could be attributed to Uruguay’s multi-component campaign.
Why did we not focus on the prevalence of smoking at the onset of pregnancy? Unfortunately, within the limitations of Uruguay’s SIP system, this alternative endpoint was subject to a critical source of potential measurement bias. Starting in July 2008, the Uruguayan Ministry of Health established a new system of financial incentives for providers to increase the completeness of data reporting on SIP records. As Fig. 5 shows, the prevalence of smoking at the first prenatal visit remained stable at about 25% from 2000 to 2005. Thereafter, with the onset of the tobacco control campaign, the prevalence had declined toward 15% by the early months of 2009. In April 2009, however, nine months after the institution of the new system of financial incentives, the prevalence abruptly increased. At the same time, as shown in Fig. 5, the proportion of records with missing data declined dramatically from about 20% in 2009 to 1–2% by 2013.
The best explanation for the abrupt break in prevalence is that women with missing data on smoking at the onset of pregnancy were more likely to be smokers. Moreover, as the Ministry of Health imposed increasingly strict goals for data completeness, the effect of including these previously unreported smokers in the prevalence calculation is likely to have grown ever larger. While we did have some data on the Ministry’s data-completeness goals, we concluded that the task of correcting for this missing data bias and at the same time identifying the effects of post-2009 tobacco control measures would prove intractable.
On the other hand, we did not detect any comparable break in the monthly time series of smoking cessation rates. We thus considered quitting to be a more reliable and sensitive endpoint than smoking prevalence. Still, to the extent that the new system of data-completeness goals recorded an increasing number of hard-core smokers, our estimates in Fig. 4 of the quit rate during 2009–2013 would be biased downward. We stress that our principal endpoint represents the rate of cessation conditional upon smoking on or after the first prenatal visit.
Smoking cessation during pregnancy is known to reduce the adverse health effects of smoking. Accordingly, as an additional endpoint, we studied the impact of Uruguay’s tobacco control campaign on birth weight. As shown in Fig. 6, the difference in birth weight between non-smokers and continuing smokers was on the order of 210 g, while the corresponding difference between non-smokers and quitters was only about 25 g. Although the smaller number of quitters in the SIP database during 2007–2008 decreased the precision of the mean birth weight estimates, Fig. 6 still shows a background increase in birth weight for all three groups. The analysis of birth weight also helped us address the concern that the SIP registry contained self-reported data on smoking habits. If women had falsely reported having quit smoking with increasing frequency as the campaign progressed, we would have expected to see a growing reduction in the apparent favorable effects of cessation on birth weight.
3.1. Impact on quitting during pregnancy
3.1.1. Individual-level analysis
We first investigate the effect of the provider-level agreements and national-level tobacco control policies on the quit rate during pregnancy. To that end, we begin with a linear probability model based upon observations at the level of the individual pregnant woman:
$$y_{ist} = c_i \alpha_0 + x_{st} \alpha_1 + z_t \alpha_2 + D_s + e_{ist} \qquad (1)$$
where the subscript $i$ indexes each woman, the subscript $s$ refers to the health center where she received prenatal care, and the subscript $t$ refers to the calendar date corresponding to the midpoint of her third trimester.4 The data $y_{ist}$ are binary variables representing smoking cessation, where $y_{ist} = 1$ if the woman quit and $y_{ist} = 0$ if she continued to smoke through her third trimester. The vector of exogenous variables $c_i$ represents individual-level maternal characteristics, while $x_{st}$ represents the presence or absence of a provider-level agreement at health center $s$ on calendar date $t$, and the vector $z_t$ represents those national-level policy variables, including cigarette taxes, that were in effect at calendar date $t$. The parameters $D_s$ represent health center-specific fixed effects. We assume that the unobserved error terms $e_{ist}$ are uncorrelated with the observed explanatory variables and have zero means.
4 For the individual-level model of Eq. (1), we assigned a woman to calendar date $t$ based on the midpoint of her prenatal care visits during her third trimester of pregnancy. If she had no prenatal visits, we assigned her a date $t$ equal to 30 days prior to delivery.
Since the presence or absence of a provider-level agreement $x_{st}$ varied by health center and calendar date, the impact of these programs was identifiable via a DID model. With respect to the national-level policies $z_t$, however, we needed to be careful about what policy impacts could and could not be identified from the data. Successive increases in cigarette taxes during the 2007–2013 observation period, beginning with the inclusion of tobacco in the value-added tax on July 1, 2007 (Fig. 3), permitted us to identify the impact of this policy. On the other hand, we could not identify the impacts of the two non-price policies that went into effect before the start of our observation period: the prohibition of smoking in public places and enclosed workspaces (March 1, 2006) and the requirement that all packs contain rotating images with warnings covering 50% of the front and back of each pack (April 18, 2006). Moreover, the effective dates of the second, third and fourth rounds of warnings were either close to or coincident with other policy measures, and thus their impacts could not be separately identified without additional strong assumptions concerning their effects over time (Fig. 1).
That left us with five non-price, national-level policies: the comprehensive tobacco legislation banning nearly all advertising (in effect from March 6, 2008 onward); the single presentation rule (in effect from February 14, 2009 onward); the increase in the warning size from 50% to 80% of the front and back of each pack (in effect from February 28, 2010 onward); the fifth round of warnings (in effect from January 7, 2012 through April 7, 2013); and the sixth round of warnings (in effect from April 8, 2013 onward).
Column (A) of Table 1 shows our results. We estimated the parameters of Eq. (1) by ordinary least squares (OLS) with Huber-White robust standard errors implemented at the individual level. The presence of an agreement between the woman’s health center and the FNR increased the probability of quitting by an estimated 4.8 percentage points (p = 0.024). The coefficient of the log tax per pack was 0.079 (p = 0.031). At the sample mean value of the dependent variable equal to 0.377, the estimated elasticity of smoking cessation with respect to cigarette taxes was
[Fig. 4 plot omitted. Vertical axis ticks: 0–50. Point labels recovered from the plot: 6623, 6581, 6583, 5888, 2582, 638.]
Fig. 4. Increase in mean quit rate among pregnant smokers, 2007–2013. Vertical bars represent 95% confidence intervals. Adjacent to each point is the number of pregnant
smokers in the SIP database with data on smoking status in the third trimester. Each smoker was assigned to the calendar year of the mean date of all her third-trimester
prenatal visits.
0.079/0.377 = 0.21. All five of the non-price nationwide policies had significant effects, with the comprehensive tobacco control law assuming the dominant role. All five non-price policies combined increased the probability of quitting by an estimated 26.9 percentage points (p < 0.001).
We used the results of our individual-level analysis to compute the relative contributions of each of the three categories of tobacco control policy to the overall change in smoking cessation rates observed during the 2007–2013 study period. To that end, we first computed the predicted values $\hat{y}_{ist}$ derived from Eq. (1) and then calculated the corresponding sum $\hat{Y} = \sum_{i,s,t} \hat{y}_{ist}$ over all observations. We then recomputed the predicted values $\hat{y}_{ist}$ and corresponding sums $\hat{Y}$, successively setting all values of the provider-level agreement variable $x_{st}$ equal to 0, then all values of the log real tax rate equal to the initial level in January 2007, and then all indicators of non-price policies $z_t$ equal to 0. Based on this decomposition procedure, we estimated the following attributable proportions: provider-level agreements, 4.3%; tax increases, 25.2%; and regulations of packaging and marketing, 70.5%.
3.2. Health center level analysis
It is arguable that an individual-level analysis of Eq. (1) overstates the precision of estimated policy impacts. The central thrust of this criticism is that the agreements for smoking cessation services were made with health centers rather than with individual patients. Moreover, the tax increases and non-price policies were carried out at the national rather than the individual level. To address these concerns, we performed a series of aggregate analyses at both the health center level and national level. Following
[Fig. 5 plot omitted. Left axis: prevalence of smoking at first prenatal visit (%). Right axis: proportion of records with missing data (%). Series: % prevalence; % missing data.]
Fig. 5. Monthly prevalence of smoking at the first prenatal visit. Monthly proportion of records with missing data. Smoking prevalence is measured on the left axis, while
the proportion with missing data is measured on the right. The diameter of each data point is proportional to the number of observations.
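The missing-data argument in Section 3 (records with missing smoking status were more likely to belong to smokers, so measured prevalence rises as reporting becomes more complete) can be illustrated with a small calculation. Every number below is hypothetical and chosen only to make the mechanism visible; none is an estimate from the SIP registry, and `recorded_prevalence` is an illustrative helper.

```python
def recorded_prevalence(missing_share: float,
                        initial_missing: float = 0.20,
                        prev_recorded: float = 0.15,
                        prev_missing: float = 0.40) -> float:
    """Smoking prevalence among records with non-missing smoking status.

    Hypothetical setup: initially `initial_missing` of records lack smoking
    data. Women with recorded data smoke at rate `prev_recorded`; women with
    missing data smoke at the higher rate `prev_missing`. As the missing
    share falls, previously unrecorded women enter the calculation.
    """
    newly_recorded = initial_missing - missing_share
    smokers = (1 - initial_missing) * prev_recorded + newly_recorded * prev_missing
    return smokers / (1 - missing_share)


for m in (0.20, 0.10, 0.02):
    print(f"missing share {m:.0%}: recorded prevalence {recorded_prevalence(m):.1%}")
```

Under these assumed values, the recorded prevalence drifts upward toward the underlying 20% as completeness improves, even though no woman's smoking behavior has changed.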
[Fig. 6 plot omitted. Vertical axis: birth weight (gm), 3000–3400. Plot labels recovered: non-smokers, continuing smokers, 95% CI.]
Fig. 6. Annual mean birth weight among non-smoking pregnant women, continuing smokers, and smokers who quit, 2007–2013. Non-smokers were pregnant women who
did not report smoking at any prenatal visit. Continuing smokers were pregnant women who reported smoking at one or more prenatal visits, but had not quit smoking by
the third trimester. Smokers who quit likewise reported smoking at one or more prenatal visits, but had quit by the third trimester.
Table 1
Estimated effects on probability of quitting by the third trimester of pregnancy.
Variable                           (A)                (B)                 (C)                (D)                (E)
Provider-level agreements          0.048** (0.021)    0.056* (0.034)      0.056** (0.025)    0.056** (0.025)    -
Log real tax per pack              0.079** (0.036)    0.106** (0.053)     0.105* (0.054)     -                  0.136** (0.055)
Tobacco control law                0.094*** (0.017)   0.138*** (0.038)    0.138*** (0.026)   -                  0.139*** (0.023)
Single presentation rule           0.075*** (0.014)   0.053** (0.023)     0.053*** (0.019)   -                  0.039** (0.015)
80% rule                           0.028*** (0.011)   0.029* (0.015)      0.030* (0.015)     -                  0.032* (0.016)
Fifth round of warnings            0.037*** (0.008)   0.030*** (0.011)    0.030*** (0.011)   -                  0.036*** (0.007)
Sixth round of warnings            0.035*** (0.011)   0.0355** (0.0147)   0.035** (0.015)    -                  0.038*** (0.010)
Five non-price measures combined   0.269*** (0.020)   0.286*** (0.044)    0.285*** (0.027)   -                  0.283*** (0.022)
No. observations                   31,230             1422                1400               1400               28
Standard errors in parentheses. * Significant at p < 0.10. ** Significant at p < 0.05. *** Significant at p < 0.01.
A. OLS estimation on individual maternal-level observations, based on Eq. (1). White-Huber robust standard errors. Coefficients of individual maternal characteristics and health center fixed effects not shown.
B. OLS estimation on observations grouped by health center and calendar quarter, based on Eq. (3). Standard errors adjusted for clustering at level of health center. Observations weighted by inverse of standard errors of fixed effects derived from Eq. (2).
C. FGLS estimation on observations grouped by health center and calendar quarter, based on Eq. (3). Standard errors adjusted for clustering at level of health center. Uniform first-order serial correlation estimated to equal 0.0210. Observations weighted by inverse of standard errors of fixed effects derived from Eq. (2).
D. FGLS estimation on observations grouped by health center and calendar quarter, based on Eq. (3). Standard errors adjusted for clustering at level of health center. Uniform first-order serial correlation estimated to equal 0.0237. Observations weighted by inverse of standard errors of fixed effects derived from Eq. (2).
E. OLS estimation on observations grouped by calendar quarter, based upon Eq. (4). Newey-West standard errors adjusted for heteroskedasticity and serial correlation. Observations weighted by inverse of standard errors of fixed effects derived from Eq. (2).
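The tax elasticities of quitting quoted in the text follow from the Table 1 coefficients on the log real tax per pack. In a linear probability model with a logged regressor, the coefficient measures the change in quit probability per unit change in log tax, so dividing by the mean quit rate gives an elasticity. The sketch below reproduces that arithmetic for columns (A) and (E); the function name `tax_elasticity` is illustrative.

```python
def tax_elasticity(coef_log_tax: float, mean_quit_rate: float) -> float:
    """Elasticity of the quit probability with respect to the tax, evaluated
    at the sample mean of the dependent variable:
    (dy / d ln tax) / y = coefficient / mean quit rate."""
    return coef_log_tax / mean_quit_rate


MEAN_QUIT_RATE = 0.377  # sample mean of the dependent variable reported in the text

print(round(tax_elasticity(0.079, MEAN_QUIT_RATE), 2))  # column (A): 0.21
print(round(tax_elasticity(0.136, MEAN_QUIT_RATE), 2))  # column (E): 0.36
```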
Amemiya (1978), Hansen (2007), Imbens and Wooldridge (2014) and others, we specified the following two-stage model.
$$y_{ist} = c_i \beta_0 + F_{st} + u_{ist} \qquad (2)$$
$$F_{st} = x_{st} \beta_1 + z_t \beta_2 + G_s + v_{st} \qquad (3)$$
Here, the subscript $s = 1, \ldots, S$ continues to index health centers, while the subscript $t = 1, \ldots, T$ now indexes calendar quarters. The subscript $i = 1, \ldots, N_{st}$ now indexes women who received prenatal care at health center $s$ and whose third trimester occurred in calendar quarter $t$.
In the first stage (Eq. (2)), the data $y_{ist}$ are binary variables representing smoking cessation and the exogenous variables $c_i$ represent individual-level maternal characteristics, while $F_{st}$ are fixed effects for each of the $ST$ combinations of health center and calendar quarter. In the second stage (Eq. (3)), the data $x_{st}$ represent the presence or absence of a provider-level agreement at health center $s$ during calendar quarter $t$, while the data $z_t$ denote the national-level policies in effect in calendar quarter $t$. The parameters $G_s$ are $S$ health center-specific fixed effects. We assumed that the unobserved error terms $u_{ist}$ and $v_{st}$ were uncorrelated with the observed explanatory variables and with each other, and had zero means.
To estimate the model parameters, we first ran OLS on Eq. (2), thus obtaining estimates $\hat{F}_{st}$ of the parameters $F_{st}$. In effect, these OLS estimates represented the predicted quit rate in each health center $s$ and calendar quarter $t$ of a pregnant smoker in the reference category of maternal characteristics.5 We then estimated the parameters of the DID model (3), where the dependent variable $F_{st}$ was replaced by its estimated value $\hat{F}_{st}$. We weighted the observations by the inverse of the standard errors of the estimates $\hat{F}_{st}$.
5 A woman in the reference category was married, aged 20–34 years, had less than a high school education, did not seek prenatal care in her first trimester, had no prior abortions or deliveries, a pre-pregnancy body mass index 18.5–24.9 kg/m², no history of diabetes, hypertension, eclampsia or pre-eclampsia, and gave birth to a singleton female. For further details on maternal and pregnancy characteristics, including descriptive statistics, see our working paper (Harris et al., 2014).
Within the context of this aggregate model, researchers have expressed concerns that traditional OLS estimation of DID models ignores serial correlation of errors and thus overstates the precision of the coefficient estimates (Bertrand et al., 2004; Cameron et al., 2008). In our context, serial correlation would arise when health centers were subject to common unobserved shocks that persisted for more than a calendar quarter. To address these concerns, we estimated the parameters of Eq. (3) by feasible generalized least squares (FGLS) under varying specifications of the covariance matrix $V = E[v_{st} v_{s't'}]$.
Columns (B) and (C) in Table 1 show our regression estimates of the parameters $\beta_1$ and $\beta_2$ of Eq. (3) under two different assumptions concerning the covariance matrix. In column (B), we assumed clustering of errors within each health center. In column (C), we assumed first-order temporally correlated errors with a uniform correlation coefficient among health centers. We found quite similar results when we estimated Eq. (3) under the assumption that the coefficient of serial correlation varied by health center (not shown in Table 1). In comparison with the maternal-level estimates (column A), we observed larger coefficients for two policies: the log real cigarette tax and the comprehensive tobacco control law. However, their corresponding standard errors were also increased, so that we could not reject the hypothesis that their impacts equalled those estimated from the individual-level data. Finally, in both columns (B) and (C), the combined effect of the five non-price policies was indistinguishable from the effect estimated from individual-level data.
3.3. National level analysis
It is arguable that the 2-stage model of Eqs. (2) and (3) still overstates the precision of the estimated impacts of cigarette taxes and non-price regulations of cigarette packaging and marketing, as these policies were carried out at the national level. To address this criticism, we extended our two-stage model of Eqs. (2) and (3) to three stages. In particular, we continued to specify the first-stage model of Eq. (2), but replaced Eq. (3) with the following two equations:
$$F_{st} = x_{st} \gamma_1 + H_s + J_t + \varepsilon_{st} \qquad (4)$$
$$J_t = z_t \gamma_2 + \eta_t \qquad (5)$$
As before, we assumed that the unobserved error terms in (2), (4) and (5) were uncorrelated with their respective explanatory variables.
The second-stage Eq. (4) differs from (3) in that $F_{st}$ now depends on the presence or absence of a provider-level agreement $x_{st}$ at health center $s$ during calendar quarter $t$, as well as health center fixed effects $H_s$, calendar quarter fixed effects $J_t$, and an unobserved random error $\varepsilon_{st}$ with mean zero. To estimate the policy impact parameters $\gamma_1$ in (4), we replaced the fixed effects $F_{st}$ with their estimates $\hat{F}_{st}$ from Eq. (2). As in the two-stage model, Eq. (4) was estimated by feasible GLS under various assumptions concerning the covariance matrix of the errors, where the observations were weighted by the inverse of the standard errors of the estimates $\hat{F}_{st}$.
In third-stage Eq. (5), the calendar quarter fixed effects $J_t$ in turn depend on national level policies as well as an unobserved random error $\eta_t$ with mean zero. To estimate the policy impact parameters $\gamma_2$ in (5), we replaced the fixed effects $J_t$ with their estimates $\hat{J}_t$ from the second stage Eq. (4). We then estimated the linear model (5) by OLS with Newey–West standard errors with a maximum lag of 4 calendar quarters to take account of possible heteroskedasticity and serial correlation of the error terms (Newey and West, 1987). Similarly, we weighted the observations by the inverse of the standard errors of the estimates $\hat{J}_t$.
Fig. 7 motivates the logic underlying the third stage of our analysis. The vertical axis measures the estimated fixed effects $\hat{J}_t$, which represent calendar quarter-specific quit rates adjusted for individual maternal characteristics, health center fixed effects, and the presence or absence of a provider-level agreement. The horizontal axis measures the corresponding calendar quarter $t$. Highlighted are the dates on which the value added tax was imposed on tobacco, comprehensive tobacco legislation was passed, only brands with a single presentation were permitted, images with warnings were mandated to cover 80% of the front and back of each pack, and the fifth and sixth rounds of images went into effect.
Columns (D) and (E) show the results of our three-stage procedure. Column (D) shows the estimate of $\gamma_1$ in Eq. (4) in the case of FGLS with serially correlated errors. As in the previous models, the presence of a provider-level agreement increased the probability of smoking cessation by 5.6 percentage points (p = 0.027). Column (E) shows the estimates of the policy impact parameters $\gamma_2$ in Eq. (5). The estimated policy impacts are comparable to those derived from the 2-stage model (columns B and C). The point estimate of the impact of cigarette taxes is larger and statistically significant (p = 0.045), implying an estimated elasticity of quitting equal to 0.36. Still, we could not reject the hypothesis that the estimated tax impact equalled that estimated from maternal-level data. Finally, the estimated effect of the five non-price policies combined was indistinguishable from the estimates based on the individual-level and two-stage models.
4. Impact on birth weight
We relied upon maternal-level data to assess the effect of quitting smoking during pregnancy on birth weight. Following Permutt and Hebel (1989), we adopted a simultaneous equation framework in which Uruguay’s anti-smoking policies served as instruments for the endogenous variable of smoking cessation. In addition, we took advantage of the birth-weight endpoint to strengthen our identification of the causal effect of Uruguay’s tobacco control campaign. Employing the population of non-smoking pregnant women as a control group, we performed a falsification test to determine whether the same policy instruments had any effect on the birth weight of infants delivered by non-smokers.
We estimated the following equation on our sample of smoking pregnant women:
$$w_{ist} = \delta_1 y_{ist} + c_i \delta_0 + K_s + \varsigma_{ist} \qquad (6)$$
where the variable $w_{ist}$ denotes the birth weight of the infant delivered by mother $i$ in health center $s$ and the calendar date $t$ refers to the midpoint of her third trimester of pregnancy. In Eq. (6), birth weight depends on smoking cessation ($y_{ist}$), individual maternal characteristics ($c_i$), a health center fixed effect ($K_s$), as well as a random error term ($\varsigma_{ist}$) with zero mean.
If the random error $\varsigma_{ist}$ is correlated with $y_{ist}$, then the coefficient $\delta_1$ will be biased when Eq. (6) is estimated by OLS. There is, in fact, good reason to believe that this is the case. For example, a woman with a propensity to engage in risky behaviors will tend not to quit smoking and deliver a low-weight baby. In that case, failure to account for the unobserved heterogeneity will cause the parameter $\delta_1$ to be overestimated. Alternatively, a woman who runs into complications during her pregnancy will be under pressure to quit smoking and
[Fig. 7 plot omitted. Vertical axis ticks: 0.1–0.5. Plot labels recovered: Anti-Smoking Law, Advertising Ban, Single Presentation Rule, 80% Rule, 6th Round Warnings.]
Fig. 7. Trend in quit rate estimated from national-level model. The vertical axis measures the fixed effects Ĵt estimated from Eq. (4), which represent calendar quarter-specific
quit rates adjusted for individual maternal characteristics, health center fixed effects, and the presence or absence of a provider agreement. The horizontal axis measures the
corresponding calendar quarter t. Highlighted are the dates on which various national-level tobacco control measures went into effect.
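The instrumental-variables logic of Section 4 can be sketched on simulated data. The example below is a toy illustration under loudly stated assumptions, not the authors' specification: it uses a single binary policy instrument (so two-stage least squares reduces to the Wald ratio), omits maternal covariates, health center fixed effects and clustered standard errors, and every numeric value (the true effect of 150 g, the error scales, the sample size) is hypothetical. An unobserved factor raises the chance of quitting and lowers birth weight, mimicking the pregnancy-complications story, so OLS understates the effect while the instrument recovers it approximately.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def simulate(n=40_000, true_effect=150.0, seed=0):
    """Simulate (instrument z, quit indicator y, birth weight w); all values hypothetical."""
    rng = random.Random(seed)
    z, y, w = [], [], []
    for _ in range(n):
        zi = rng.randint(0, 1)   # policy exposure, independent of the confounder
        ui = rng.gauss(0, 1)     # unobserved factor (e.g. pregnancy complications)
        # Both the policy and the unobserved factor push the woman toward quitting.
        yi = 1.0 if 0.8 * zi + ui + rng.gauss(0, 1) > 0.5 else 0.0
        # The unobserved factor lowers birth weight, so quitting is endogenous.
        wi = 3000.0 + true_effect * yi - 100.0 * ui + rng.gauss(0, 150)
        z.append(float(zi))
        y.append(yi)
        w.append(wi)
    return z, y, w


z, y, w = simulate()
ols_estimate = cov(w, y) / cov(y, y)  # slope from regressing w on y
iv_estimate = cov(w, z) / cov(y, z)   # Wald / 2SLS estimate with one instrument
print(f"OLS: {ols_estimate:.1f} g, IV: {iv_estimate:.1f} g (true effect: 150 g)")
```

With several instruments and covariates, as in the paper, the same idea is implemented by regressing the quit indicator on the instruments in a first stage and using the fitted values in the birth-weight equation.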
tend to deliver a low-weight baby. In that case, the parameter $\delta_1$ will be underestimated.
To address the potential endogeneity of the quit variable $y_{ist}$, we therefore estimated Eq. (6) by two-stage least squares (2SLS), where the first stage is defined by Eq. (1) and the instruments for smoking cessation are the policy variables $x_{st}$ and $z_t$. As shown in Table 2, estimation of Eq. (6) by OLS, with clustering of standard errors by health center, gave an estimate of $\hat{\delta}_1$ = 123.2 g (p < 0.001). By contrast, estimation of (6) by 2SLS, similarly with standard errors clustered by health center, gave an estimate of $\hat{\delta}_1$ = 187.9 g (p = 0.028). While 2SLS increased the estimated standard error of $\hat{\delta}_1$, all tests rejected the hypotheses of weak instruments or over-identification. Although the 95% confidence intervals of the two estimates of $\hat{\delta}_1$ overlapped substantially, the results still suggest that the OLS estimate may be biased downward.
For our falsification test, we used the parameters estimated from the first-stage Eq. (1) to predict the values of $y_{ist}$ for those women who did not report smoking during pregnancy. We then estimated the model of Eq. (6) on all women in this comparison group, replacing $y_{ist}$ with its predicted value. If the tobacco control policies captured in the variables $x_{st}$ and $z_t$ in fact improved birth weight by increasing the rate of quitting, then the estimate of $\hat{\delta}_1$ in (6) should be indistinguishable from zero in the comparison group of non-smoking pregnant women. On the other hand, if these policies are simply correlated with other unobserved factors that improved birth weight, then the estimate of $\hat{\delta}_1$ in (6) should be significantly positive for non-smokers as well. In fact, as shown in Table 2, the estimate of $\hat{\delta}_1$ was indistinguishable from zero among non-smokers (−27.8 g, p = 0.527).
Table 2
Estimated effect of smoking cessation on birth weight.a,b
Population         OLS               2SLS             Falsification test
Smokers            123.2*** (10.8)   187.9** (85.5)   -
Non-smokers        -                 -                −27.8 (43.8)
No. observations   31,186            31,186           126,504
** Significant at p < 0.05. *** Significant at p < 0.01.
a All estimates correspond to the parameter $\delta_1$ in Eq. (6).
b All standard errors adjusted for clustering by health center.
5. Discussion and conclusions
To assess the impact of Uruguay’s nationwide tobacco control campaign, we analyzed a comprehensive nationwide registry of pregnancies ending in a live birth during 2007–2013. We focused sharply on smoking cessation among those women who reported smoking at any time during pregnancy, as well as the consequences of smoking cessation for birth weight. We observed a striking increase in the proportion of pregnant smokers who had quit by their third trimester, from 15.4% in 2007 to 42.7% in 2013.
We employed a difference-in-differences approach to evaluate the effects of provider-level agreements on quit rates during pregnancy. We found that smoking cessation programs established under agreements between health centers and the National Resource Fund increased quit rates by between 4.8 and 5.6 percentage points. Unfortunately, no more than one-third of pregnant smokers received care at health centers with contractual agreements during the period of analysis, and by 2013, the proportion of women exposed to such treatments had declined. As a result, these provider-level agreements contributed little to the overall increase in smoking cessation observed during 2007–2013.
Although real taxes per pack increased by 122% during our study period (Fig. 3), tax increases alone explained only about 20% of the overall rise in smoking cessation during pregnancy. In addition, our estimates of the tax elasticity of quitting, ranging from 0.21 to 0.36, were lower than those reported in the U.S. (Ringel and Evans, 2001; Colman et al., 2003). The principal explanation for this limited influence is that manufacturers moderated their retail prices in response to the application of the value added tax to cigarettes in July 2007 and other non-price regulatory policies enacted during 2007–2009 (Fig. 3) (Harris et al., 2014). As a result, the real retail price of cigarettes increased by only 17% during 2007–2013. Endogenous responses of the tobacco industry to state and nationwide tax increases have been previously documented in the U.S. (Harris, 1987; Harris et al., 1996; Chaloupka et al., 2010; Miura, 2010).
Instead, most of the observed increase in quit rates was attributable to non-price regulation of the marketing and packaging of cigarettes. While the combined effect of all five non-price policies
J.E. Harris et al. / Journal of Health Economics 42 (2015) 186–196 195
was indistinguishable across the models that we tested (Table 1), the social pressure on a pregnant woman to deny smoking when
it was nonetheless difficult to identify with precision the quanti- queried by her obstetrician. While we cannot completely rule out
tative impacts of the individual component policies. We could not misreporting bias, we note that if women had falsely reported
distinguish immediate versus long-run effects of policies, nor could having quit smoking with increasing frequency as the campaign
we identify synergies between policies. Our analysis at best identi- progressed, we would have expected to see a growing reduction
fied the average impacts of the tobacco control measures over time, in the apparent favorable effect of quitting on birth weight. That is
conditional upon a specific temporal sequence of policy interven- not what we observed in Fig. 6. Moreover, if the apparent increase
tions. Thus, the increase in the size of warnings from 50 to 80% of in smoking cessation during 2007–2013 were solely the result of
the front and back of each pack of cigarettes, in effect since February increasing misreporting, then we would have observed no relation
28, 2010, was associated with a 3 percentage point increase in the between quitting and birth weight. That is not what we observed
cessation rate (Table 1). However, this estimated impact was in the in Table 2. In the context of smoking during pregnancy, a number
context of the coincident fourth round of warnings, as well as a of authors have found a strong correlation between self-reported
ban on smoking in public places and private workspaces (March 1, cigarette consumption and objectively measured levels of nicotine
2006), the comprehensive tobacco control law (March 6, 2008), the metabolites (Castellanos et al., 2000; Althabe et al., 2008; Himes
single presentation rule (February 14, 2009) and other policies in et al., 2013).
effect at the same time (Figs. 1 and 2). During 2005–2009, the prevalence of smoking during pregnancy
While we had micro data on smoking cessation at the individual declined from approximately 25% to 15% (Fig. 5). Unfortunately,
level, the tobacco control policies that we evaluated were carried missing data biases seriously complicated the interpretation of
out at either the health center level or the national level. Accord- prevalence trends thereafter. Moreover, there is evidence that some
ingly, if there were significant inter-correlation of quitting behavior women who quit during pregnancy later resumed smoking and
among women who attended the same health center or who were then quit once again during the next pregnancy (Harris et al.,
pregnant in the same calendar quarter, then the effective number of 2014). Still, if the trend observed during 2005–2009 is an accurate
observations could be far less than the approximate 31,000 shown indicator of an overall decline in prevalence, then our results on
in column A of Table 1. However, our estimates on data aggregated smoking cessation may substantially understate the overall impact
at the health center level (columns B and C) and at the national level of Uruguay’s tobacco campaign.
(columns D and E) generally confirmed our individual-level results. Our results have important implications for future research
Some economists have suggested that the impact of smoking and for the future design of tobacco control policies. Our find-
cessation during pregnancy on birth weight, as estimated from ings suggest that enhanced targeting of healthcare providers in the
cross-sectional databases, may be exaggerated by the presence of implementation of tobacco cessation programs, as well as increased
unobserved heterogeneity (Lien and Evans, 2005; Abrevaya, 2006; recruitment of patients, could have a high payoff. At the same time,
Abrevaya and Dahl, 2008; Walker et al., 2009; Juarez and Merlo, our findings strongly suggest that non-price policies, in particu-
2013). That is, a woman who tends to engage in risky behaviors lar regulation of marketing and packaging, can have an important
will continue to smoke during pregnancy and have lower weight impact in reducing tobacco use.
babies. However, the presence of unobserved heterogeneity can
also result in an underestimate of the impact of smoking cessa-
tion. Thus, a woman who encounters complications in the third Financial support
trimester, such as intrauterine growth retardation, will quit smok-
ing and have a lower weight baby. Our OLS estimate of the effect We gratefully acknowledge the financial support of the
of quitting on birth weight was 123 g (95% CI, 102–145). When we Bloomberg Foundation through an unrestricted grant to the Min-
used Uruguay’s tobacco control policies as instruments for quit- istry of Public Health (Ministerio de Salud Pública), Uruguay.
ting smoking, our 2SLS estimate was 188 g (95% CI, 20–356). While Neither the Bloomberg Foundation nor the Ministry of Public Health
our 2SLS estimate reinforces the conclusions that smoking cessa- exerted any influence on the conduct of this study or the drafting
tion during pregnancy, even in the third trimester, has a significant of this manuscript.
positive effect on birth weight, the confidence interval surrounding
that estimate was too wide to draw definitive conclusions about the
Conflicts of interest
direction of bias, if any, in the OLS estimate. Our results confirm that
even delayed cessation of smoking can reduce the adverse effects
We have no conflicts of interest to declare.
of smoking during pregnancy (Lieberman et al., 1994; Raatikainen
et al., 2007; Batech et al., 2013; Yan and Groothuis, 2013).
As we have already noted, Uruguay’s tobacco control campaign Authors’ contributions
was associated with declines in per capita cigarette consumption,
adult smoking prevalence and adolescent cigarette use that were All three coauthors contributed to the conceptualization and
significantly greater than those observed in Argentina, a country design of this study, the analysis of the data, and the writing of
with a common international border, language and culture that this report.
served as a control (Abascal et al., 2012). Unfortunately, we were
unable to locate reliable nationwide data from Argentina on the
smoking practices of pregnant women during 2007–2013, and thus Acknowledgments
could not construct a comparable external control group for this
study. Still, our falsification test took advantage of an internal We thank the Area Sistema Informático Perinatal from the
control group, namely, pregnant Uruguayan women who did not Epidemiology Division of the Ministry of Public Health (UINS)
smoke during pregnancy. We found, in particular, that the tobacco for providing us with the perinatal data. We acknowledge
control policies implemented in Uruguay during 2007–2013 had no valuable inputs from Winston Abascal, Rafael Aguirre, Wanda
effect on the birth weight of mothers in this non-smoking control Cabella, Fernando Esponda, Elba Estévez, Marinés Figueroa, Ana
group. Lorenzo, Luis Mainero, Anna Mikusheva, and Giselle Tomasso.
Our data on smoking from the SIP registry were self-reported. It The opinions expressed in this paper are ours and ours
is conceivable that Uruguay’s tobacco control campaign increased alone.
196 J.E. Harris et al. / Journal of Health Economics 42 (2015) 186–196
References

Abascal, W., Esteves, E., Goja, B., Gonzalez Mora, F., Lorenzo, A., Sica, A., Triunfo, P., Harris, J.E., 2012. Tobacco control campaign in Uruguay: a population-based trend analysis. Lancet 380 (9853), 1575–1582.
Abrevaya, J., 2006. Estimating the effect of smoking on birth outcomes using a matched panel data approach. Journal of Applied Econometrics 21 (4), 489–519.
Abrevaya, J., Dahl, C.M., 2008. The effects of birth inputs on birthweight: evidence from quantile estimation on panel data. Journal of Business and Economic Statistics 26 (4), 379–397.
Althabe, F., Colomar, M., Gibbons, L., Belzan, J.M., Buekens, P., 2008. Tabaquismo durante el embarazo en Argentina y Uruguay [Smoking during pregnancy in Argentina and Uruguay]. Medicina (Buenos Aires) 68, 48–54.
Amemiya, T., 1978. A note on a random coefficient model. International Economic Review 19 (3), 793–796.
Anger, S., Kvasnicka, M., Siedler, T., 2011. One last puff? Public smoking bans and smoking behavior. Journal of Health Economics 30 (3), 591–601.
Batech, M., Tonstad, S., Job, J.S., Chinnock, R., Oshiro, B., Allen Merritt, T., Page, G., Singh, P.N., 2013. Estimating the impact of smoking cessation during pregnancy: the San Bernardino County experience. Journal of Community Health 38 (5), 838–846.
Bertrand, M., Duflo, E., Mullainathan, S., 2004. How much should we trust differences-in-differences estimates? Quarterly Journal of Economics 119 (1), 249–275.
Blecher, E., 2008. The impact of tobacco advertising bans on consumption in developing countries. Journal of Health Economics 27 (4), 930–942.
Cameron, A.C., Gelbach, J.B., Miller, D.R., 2008. Bootstrap-based improvements for inference with clustered errors. Review of Economics and Statistics 90 (3), 414–427.
Carpenter, C., Cook, P.J., 2008. Cigarette taxes and youth smoking: new evidence from national, state, and local Youth Risk Behavior Surveys. Journal of Health Economics 27 (2), 287–299.
Castellanos, M.E., Munoz, M.I., Nebot, M., Paya, A., Rovira, M.T., Planasa, S., Sanroma, M., Carreras, R., 2000. Validez del consumo declarado de tabaco en el embarazo [Validity of the declared tobacco consumption in pregnancy]. Aten Primaria 26 (9), 629–632.
Chaloupka, F.J., Peck, R., Tauras, J.A., Xu, X., Yurekli, A., 2010. Cigarette Excise Taxation: the Impact of Tax Structure on Prices, Revenues, and Cigarette Smoking. Cambridge, Massachusetts, National Bureau of Economic Research, Working Paper 16287, August.
Chaloupka, F.J., Yurekli, A., Fong, G.T., 2012. Tobacco taxes as a tobacco control strategy. Tobacco Control 21 (2), 172–180.
CLAP, 2001. Sistema Informático Perinatal en el Uruguay 15 Años de Datos 1985–1999. Centro Latinoamericano de Perinatologia y Desarrollo Humano, Publicación Científica del CLA, Montevideo, Uruguay, pp. P1485.
Colman, G., Grossman, M., Joyce, T., 2003. The effect of cigarette excise taxes on smoking before, during and after pregnancy. Journal of Health Economics 22 (6), 1053–1072.
Curti, D., 2013. El comercio ilícito en Uruguay y su relación con los impuestos: resultados de investigación [Illicit trade in Uruguay and its relation to taxes: research results]. Centro de Investigación para la Epidemia de Tabaquismo (CIET), May 9, Montevideo.
da Veiga, P.V., Wilder, R.P., 2008. Maternal smoking during pregnancy and birthweight: a propensity score matching approach. Maternal Child Health Journal 12 (2), 194–203.
DeCicca, P., Kenkel, D., Mathios, A., 2008. Cigarette taxes and the transition from youth to adult smoking: smoking initiation, cessation, and participation. Journal of Health Economics 27 (4), 904–917.
Dirección General Impositiva, 2012. Volúmenes físicos de bienes gravados por el IMESI – Series anuales (archivo xls). Montevideo Dirección General Impositiva, República Oriental del Uruguay.
Emery, S., Kim, Y., Choi, Y.K., Szczypka, G., Wakefield, M., Chaloupka, F.J., 2012. The effects of smoking-related television advertising on smoking and intentions to quit among adults in the United States: 1999–2007. American Journal of Public Health 102 (4), 751–757.
Esteves, E., Gambogi, R., Saona, G., Cenández, A., Palacio, T., 2011. Tratamiento de la dependencia al tabaco: experiencia del Fondo Nacional de Recursos [Treatment of tobacco dependence: experience of the National Resource Fund]. Revista Uruguaya de Cardiología 26 (3), 78–83.
Hammond, D., 2011. Health warning messages on tobacco products: a review. Tobacco Control 20 (5), 327–337.
Hansen, C.B., 2007. Generalized least squares inference in panel and multilevel models with serial correlation and fixed effects. Journal of Econometrics 140, 670–694.
Harris, J.E., 1987. The 1983 increase in the federal excise tax on cigarettes. In: Summers, L.H. (Ed.), Tax Policy and the Economy, vol. 1. MIT Press, Cambridge, MA, pp. 87–111.
Harris, J.E., Balsa, A.I., Triunfo, P., January 2014. Tobacco Control Campaign in Uruguay: Impact on Smoking Cessation During Pregnancy and Birth Weight. National Bureau of Economic Research, Working Paper No. 19878, Cambridge, MA.
Harris, J.E., Connolly, G.N., Brooks, D., Davis, B., 1996. Cigarette smoking before and after an excise tax increase and an antismoking campaign – Massachusetts, 1990–1996. MMWR Morbidity and Mortality Weekly Report 45 (44), 966–970.
Himes, S.K., Stroud, L.R., Scheidweiler, K.B., Niaura, R.S., Huestis, M.A., 2013. Prenatal tobacco exposure, biomarkers for tobacco in meconium, and neonatal growth outcomes. Journal of Pediatrics 162 (5), 970–975.
Hoek, J., Wong, C., Gendall, P., Louviere, J., Cong, K., 2011. Effects of dissuasive packaging on young adult smokers. Tobacco Control 20 (3), 183–188.
Imbens, G., Wooldridge, J.M., 2014. New Developments in Econometrics. Econometrics of Cross Section and Panel Data. Lecture 7. Cluster Sampling. Centre for Microdata Methods and Practice (CEMMAP), London.
Juarez, S.P., Merlo, J., 2013. Revisiting the effect of maternal smoking during pregnancy on offspring birthweight: a quasi-experimental sibling analysis in Sweden. PLOS ONE 8 (4), e61734.
Lieberman, E., Gremy, I., Lang, J.M., Cohen, A.P., 1994. Low birthweight at term and the timing of fetal exposure to maternal smoking. American Journal of Public Health 84 (7), 1127–1131.
Lien, D.S., Evans, W.N., 2005. Estimating the impact of large cigarette tax hikes: the case of maternal smoking and infant birth weight. Journal of Human Resources 40 (2), 373–392.
Mathers, C.D., Boerma, T., Ma Fat, D., 2008. The Global Burden of Disease: 2004 Update. World Health Organization, Geneva.
McCowan, L.M., Dekker, G.A., Chan, E., Stewart, A., Chappell, L.C., Hunter, M., Moss-Morris, R., North, R.A., 2009. Spontaneous preterm birth and small for gestational age infants in women who stop smoking early in pregnancy: prospective cohort study. British Medical Journal 338, b1081.
Miura, M., 2010. Regulating Tobacco Product Pricing: Guidelines for State and Local Governments. Tobacco Control Legal Consortium, Saint Paul, MN.
Mons, U., Nagelhout, G.E., Allwright, S., Guignard, R., van den Putte, B., Willemsen, M.C., Fong, G.T., Brenner, H., Potschke-Langer, M., Breitling, L.P., 2013. Impact of national smoke-free legislation on home smoking bans: findings from the International Tobacco Control Policy Evaluation Project Europe Surveys. Tobacco Control 22 (e1), e2–e9.
Newey, W.K., West, K.D., 1987. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55 (3), 703–708.
Permutt, T., Hebel, J.R., 1989. Simultaneous-equation estimation in a clinical trial of the effect of smoking on birth weight. Biometrics 45 (2), 619–622.
Powell, L.M., Tauras, J.A., Ross, H., 2005. The importance of peer effects, cigarette prices and tobacco control policies for youth smoking behavior. Journal of Health Economics 24 (5), 950–968.
Raatikainen, K., Huurinainen, P., Heinonen, S., 2007. Smoking in early gestation or through pregnancy: a decision crucial to pregnancy outcome. Preventive Medicine 44 (1), 59–63.
Ringel, J.S., Evans, W.N., 2001. Cigarette taxes and smoking during pregnancy. American Journal of Public Health 91 (11), 1851–1856.
Saffer, H., Chaloupka, F., 2000. The effect of tobacco advertising bans on tobacco consumption. Journal of Health Economics 19 (6), 1117–1137.
Wakefield, M., Chaloupka, F., 2000. Effectiveness of comprehensive tobacco control programmes in reducing teenage smoking in the USA. Tobacco Control 9 (2), 177–186.
Walker, M.B., Tekin, E., Wallace, S., 2009. Teen Smoking and Birth Outcomes. Southern Economic Journal 75 (3), 892–907.
World Health Organization, 2012. WHO Global Report: Mortality Attributable to Tobacco. World Health Organization, Geneva.
Yan, J., Groothuis, P.A., August 2013. Timing of Prenatal Smoking Cessation or Reduction and Infant Birth Weight: Evidence from the United Kingdom Millennium Cohort Study. Appalachian State University, Department of Economics Working Paper, Number 13–16, Boone, NC.
Journal of Health Economics 42 (2015) 197–208
Article info

Article history:
Received 2 July 2014
Received in revised form 10 March 2015
Accepted 20 April 2015
Available online 15 May 2015

JEL classification: H24, I14, I18

Keywords: Health care financing; Provider payment; Service quality; Cost containment; Political economy

Abstract

Health care financing and funding are usually analyzed in isolation. This paper combines the corresponding strands of the literature and thereby advances our understanding of the important interaction between them. We investigate the impact of three modes of health care financing, namely, optimal income taxation, proportional income taxation, and insurance premiums, on optimal provider payment and on the political implementability of optimal policies under majority voting. Considering a standard multi-task agency framework we show that optimal health care policies will generally differ across financing regimes when the health authority has redistributive concerns. We show that health care financing also has a bearing on the political implementability of optimal health care policies. Our results demonstrate that an isolated analysis of (optimal) provider payment rests on very strong assumptions regarding both the financing of health care and the redistributive preferences of the health authority.
© 2015 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhealeco.2015.04.003
0167-6296/© 2015 Elsevier B.V. All rights reserved.
198 R. Nuscheler, K. Roeder / Journal of Health Economics 42 (2015) 197–208
income. Given this heterogeneity an allocation is assessed along environment. It may relate to the quality elasticity of demand as
three dimensions: quality, effort, and the distribution of income (or, in the first three papers, or to the complementarity between the
equivalently, the numéraire commodity). If optimal income taxa- different dimensions under consideration (the latter two papers).
tion is feasible — or in the absence of redistributive concerns — and We use a simplified version of their models enabling us to integrate
if quality and effort are contractible, the first-best allocation can health care financing. There are only two articles we are aware
be implemented. When quality and effort are non-contractible, the of that consider a median voter approach to provider payment,
health authority uses a linear cost-sharing arrangement to steer namely, Gravelle (1999) and Nuscheler (2003). These papers look at
the provider’s incentives to invest in quality and to exert effort. how optimal capitation payments for physicians relate to the ones
As the health authority has two margins but only one instrument that would be implemented by majority voting. Both papers do not
the first-best allocation can no longer be implemented. The health consider a multi-task agency framework and remain silent about
authority then uses the cost-sharing parameter to optimally trade health care financing.
off the inefficiencies in quality and effort. If health care financing Second, the health care financing literature. The normative liter-
is through optimal income taxes, this tradeoff is not blurred by ature typically takes an optimal income taxation approach and asks
any redistributive consequences which the financing of health care whether there is a case for redistributive social health care finan-
provision might have. cing in the presence of progressive income taxation. Blomqvist and
The second-best allocation is then contrasted with allocations Horn (1984) and Cremer and Pestieau (1996), for instance, show
under alternative financing regimes, namely, proportional income that the desirability of social health insurance in parallel to an opti-
taxes and insurance premiums. We call the resulting allocations mal income taxation scheme crucially depends on the correlation
third-best. With proportional income taxation, income is redis- between income and health risk. For the empirically relevant case
tributed from high-income agents to low-income ones and from of a negative correlation, a redistributive public health care sys-
low-risk agents to high-risk ones. Depending on the distributional tem can improve on a purely private health care market.2 Breyer
characteristics of risk and income the third-best policy may imply and Haufler (2000) advocate for a strict separation of income redis-
more cost-sharing than the second-best policy and with it higher tribution and health care financing as this would allow for better
quality and less effort, causing health care expenses to be higher. health insurance contracts (in terms of ex post moral hazard) and
When health care financing is through insurance premiums the more efficient public financing in general (lower shadow costs of
second-best quality-effort tradeoff is affected if and only if insur- public funds). Political feasibility of optimal policies and provider
ance premiums involve some pooling. Then, premiums redistribute reimbursement are ignored. The positive literature on health care
income from low-risk agents to high-risk ones with the extent financing aims at explaining the existence of public health care,
being governed by the degree of pooling. The comparison between its size and its form of financing. Epple and Romano (1996a) and
the second-best and third-best allocations hinges on the distribu- Gouveia (1997) were the first to address these issues.3 The former
tional characteristic of risk and on the extent of pooling. paper considers agent heterogeneity in income and shows that
To complete the picture, we derive the allocations under major- there is an ‘ends against the middle’ equilibrium when public health
ity voting and contrast them with the optimal policies for both care can be topped up by actuarially fair private health insurance.
proportional income taxes and insurance premiums. While the Gouveia (1997) shows that this result continues to hold when het-
redistributive preferences of the health authority are governed by erogeneity in risk is added to the framework. Both papers derive
the distributional characteristics, the preferences of the median conditions under which a mixed health care system with pub-
voter depend on individual heterogeneity. This implies that, only lic and private health care financing arises. The mode of public
in knife-edge cases, can the optimal (third-best) policies be imple- financing, however, is taken as given. We explicitly analyze the con-
mented as political equilibria. For the case of proportional income sequences of alternate financing regimes on economic allocations.
taxes the comparison of the two allocations depends on how the Rather than taking a reduced form approach where a health good
relative inequity between risk and income compares to the relative is uniformly distributed to those who need it, we add a multi-task
distributional characteristics between these two dimensions. For provider payment setting to the model. Finally, Epple and Romano
insurance premiums it is only the inequity in risk together with the (1996a) and Gouveia (1997) offer no normative analysis. By con-
extent of pooling and its relation to the distributional characteristic trast, the current article studies normative and positive allocations
of risk that matters. and demonstrates how they compare to one another. Kifmann
Finally, it should be noted that, rather remarkably, risk-rated (2005) extends Gouveia’s analysis by introducing a constitutional
premiums imply second-best optimal health care provision for both stage where voters have a say on the mode of health care financing.
the third-best allocation and the political outcome. The reason But, again, a normative analysis is missing as well as the integration
being that risk-rated premiums preclude any form of redistribution. of provider payment.
There is then no conflict in the electorate about how to shape the Finally, our paper relates to the normative literature that ana-
health care system and second-best health care provision results. lyzes both, health care financing and funding. Zeckhauser (1970)
From the normative end it does not pay off to distort the opti- was the first to simultaneously analyze provider payment and
mal policy away from the second-best as the associated efficiency health care financing. Ma and McGuire (1997) generalized this
losses are not compensated by redistributive gains. As the resulting framework. These papers analyze optimal health insurance in an
income distribution may not be optimal the equilibrium alloca- ex post moral hazard setting. We consider a multi-task agency set-
tion may not be second-best efficient. Our results demonstrate that up instead and investigate a much richer set of financing regimes.
studies on optimal provider payment that neglect health care finan- Moreover, their frameworks are normative in nature. An analysis
cing rest on very strong assumptions regarding the redistributive
motives of the health authority, or on health care financing.
This article relates to two strands of the health economics
literature. First, provider payment. The papers of Chalkley and 2
Kifmann and Roeder (2011) extend the analysis to premium subsidies and exam-
Malcomson (1998a,b), Ma (1994), and, more recently, Eggleston ine whether this approach is superior to social health insurance from a welfare
(2005), and Kaarbøe and Siciliani (2011), show that mixed payment perspective. For a negative correlation they find that combining premium subsidies
with social health insurance is the optimal policy.
systems, i.e., a combination of capitation payments and cost- 3
Epple and Romano (1996b) is another example. In this article the authors inves-
sharing, will generally be optimal. Whether quality incentives tigate a framework where individuals can opt out the public plan and buy private
are high powered or low powered depends on the respective health insurance. As a result, preferences are no longer single-peaked.
R. Nuscheler, K. Roeder / Journal of Health Economics 42 (2015) 197–208 199
Health l pl rl l >0.5
where the budget constraint amounts to xij = yi − Tij , xij = (1 − t)yi ,
risk h ph rh h < 0.5
and xij = yi − [(1 − ϕ)j + ϕ]p for financing regimes T, t, and p,
p > 0.5 r < 0.5 1 respectively.
of how their outcomes relate to those that would be implemented 2.2. The health care provider
in a political process is missing. Our model is considerably richer
in this respect allowing us to analyze the important interaction When treating a patient the provider incurs treatment costs K.
between health care financing and funding in normative and pos- As health care always includes a random element, these costs are
itive settings in great detail and thereby to contribute to a rather uncertain. From an ex ante perspective only expected treatment
slim financing and funding literature. costs, c = E(K), matter. Despite this uncertainty the HCP has an influ-
The remainder of the paper is organized as follows. Section 2 ence on costs and we suggest that c is a function of the quality of
introduces the basic framework, followed by a normative analysis care, q, and the provider’s effort to keep treatment costs down, e.7
in Section 3. Political outcomes are derived and compared to the We let the expected treatment cost function c(q, e) satisfy cq > 0,
respective normative allocations in Section 4. We discuss model cqq > 0, ce < 0, cee ≥ 0 and ceq = 0. Expected treatment costs, thus,
extensions in Section 5. Section 6 concludes. increase with quality and do so at an increasing rate. More effort
implies lower treatment costs in expectation but at a decreasing
2. The model rate.8 The provider’s effort to contain costs involves a disutility v(e)
per patient, where ve > 0 and vee > 0. Finally, the HCP incurs a fixed
2.1. Individuals cost F.
Total provider reimbursement is denoted P and the actual
We consider a continuum of individuals of size one who differ payment is determined by a simple linear cost-sharing arrange-
along the two most important dimensions when it comes to health ment, P = K + . The parameter ∈ [0, 1] is the extent to which the
care financing and health care provision, namely, income and risk. health authority (HA) is willing to share treatment costs. As an addi-
There are two income types i = r, p (rich and poor) with income lev- tional compensation the HCP may receive a lump-sum payment
els yr > yp > 0. The probability of falling ill is denoted j and assumes ∈ R per patient treated. The expected reimbursement is given by
the value l ∈ (0, 1) for low-risk individuals and h ∈ (l , 1) for high-
risk individuals. Both income and risk are exogenously given. The two-dimensional heterogeneity gives rise to ij-types and we denote their share in the population ij ∈ [0, 1/2), where the upper bound is introduced for the sake of interest.4 To ease notation we define i ≡ il + ih and j ≡ pj + rj . In the following, we assume p > 0.5, that is, median income, yp , is below average income, y = p yp + r yr . In addition, the majority of individuals is exposed to a low health risk, l > 0.5. This implies that median risk, l , is smaller than average risk, = l l + h h . Table 1 summarizes.

Sickness inflicts a disutility L > 0 on individuals. Through the receipt of care these costs can be mitigated but not eliminated. We refer to this reduction as the benefit of treatment and denote it b ∈ [0, L). There is one health care provider (HCP), e.g., a hospital, specialist, or general practitioner, who treats all patients in need of care. We summarize all provider activities that aim at increasing b in a single, one-dimensional quality index q.5 We let bq > 0 and bqq ≤ 0, that is, an improvement in health care quality increases the benefits from treatment but at a decreasing rate.

Health care is publicly financed. We distinguish between three financing regimes indexed n ∈ {T, t, p}. The government either uses individualized lump-sum transfers Tij (optimal income taxation, n = T), proportional income taxes with tax rate t ≥ 0 (n = t), or insurance premiums (n = p) to generate the revenue required to reimburse the HCP. Insurance premiums can be either risk-based, pooled, or be a mixture of both. More precisely, the insurance premium is given by [(1 − ϕ)j + ϕ]p, where p is the price of health care and ϕ ∈ [0, 1] the extent of pooling.

In addition to the (expected) benefits from health care, agents derive utility from consumption of a numéraire commodity

E(P) = c(q, e) + . (2)

In addition to reimbursement the HCP derives utility from the patients’ benefits generated through treatment. This gives rise to the HCP’s expected payoff:

H(q, e) = [αb(q) − c(q, e) − v(e) + E(P)] − F, (3)

where is the share of agents treated, that is, all individuals who are sick. The parameter α captures the HCP’s degree of altruism towards patients’ health benefits (see, e.g., Ellis and McGuire, 1990; Eggleston, 2005; Jack, 2005). For α = 1, the HCP is a ‘perfect agent’ and for α < 1 an imperfect one.9

Substituting E(P) in Eq. (3) by Eq. (2) we arrive at the HCP’s optimization program

max_{q,e} H(q, e) = [αb(q) − (1 − )c(q, e) − v(e) + ] − F. (4)

Our assumptions about the benefit function b(q) and the cost functions c(q, e) and v(e) ensure that the HCP’s problem is concave. The first-order conditions with respect to q and e amount to

∂H(q, e)/∂q = 0 ⇔ αbq − (1 − )cq = 0, (5)

4 In our political game an ij-type with ij ≥ 1/2 could dictate the allocation.
5 This assumption rules out multi-task quality issues when it comes to optimal provider payment (Chalkley and Malcomson, 1998a,b; Kaarbøe and Siciliani, 2011).
6 We assume that patients passively accept the quality of treatment the HCP is willing to provide.
7 Although we are not concerned with multiple quality dimensions we have a multi-task agency problem: the HCP chooses quality and effort.
8 The assumption that the cross derivative vanishes is made for analytical convenience.
9 Note that even in the case of perfect agency, the individuals’ well-being does not fully enter the HCP’s utility function. The reason is that even though the HCP considers the patients’ utility from treatment, it does not take into account the financial costs of service delivery, that is, the taxes or premiums that are borne by the patients (see also Jack, 2005). So, the HCP’s and the patient’s valuation of health care services may differ even if α = 1.
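The provider's problem can be made concrete with a minimal numerical sketch. Everything specific in it is an assumption chosen for tractability rather than taken from the article: the functional forms b(q) = 2·sqrt(q), c(q, e) = q + (1 − e)²/2 and v(e) = e²/2, the function names, and the variable name `cost_share` for the cost-sharing parameter. The upper bound L on the treatment benefit is ignored.

```python
import math


def hcp_best_response(cost_share: float, alpha: float) -> tuple[float, float]:
    """Quality and effort chosen by the HCP for a given cost-sharing rate.

    Assumed (illustrative) forms: b(q) = 2*sqrt(q),
    c(q, e) = q + (1 - e)**2 / 2, v(e) = e**2 / 2.
    """
    if not 0.0 <= cost_share < 1.0:
        raise ValueError("cost_share must lie in [0, 1)")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    # Quality condition: alpha * b_q = (1 - cost_share) * c_q,
    # with b_q = 1 / sqrt(q) and c_q = 1 under the assumed forms.
    quality = (alpha / (1.0 - cost_share)) ** 2
    # Effort condition: -(1 - cost_share) * c_e = v_e,
    # with c_e = -(1 - e) and v_e = e under the assumed forms.
    effort = (1.0 - cost_share) / (2.0 - cost_share)
    return quality, effort


def foc_residuals(cost_share: float, alpha: float) -> tuple[float, float]:
    """Residuals of the two first-order conditions at the best response."""
    quality, effort = hcp_best_response(cost_share, alpha)
    res_quality = alpha / math.sqrt(quality) - (1.0 - cost_share)
    res_effort = (1.0 - cost_share) * (1.0 - effort) - effort
    return res_quality, res_effort


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5):
        q, e = hcp_best_response(s, alpha=0.5)
        print(f"cost_share={s:.2f}  quality={q:.3f}  effort={e:.3f}")
```

Under these assumed forms, quality rises and cost-reducing effort falls as a larger share of treatment cost is reimbursed, and quality also rises with the altruism parameter.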
200 R. Nuscheler, K. Roeder / Journal of Health Economics 42 (2015) 197–208
e ≡ de(; α)/d = ce /[(1 − )cee + vee ] < 0. (8)

The more cost-based the HCP’s payment the larger the quality he is willing to supply and the smaller the effort he exerts to reduce cost. The first effect is due to the lower price the HCP has to pay for quality improvements when the cost-sharing parameter increases. The less expensive the provision of quality the more the HCP is willing to offer. The second effect is a moral hazard effect. An increase in the share of reimbursed treatment costs undermines the HCP’s incentives to contain costs. Additionally, q( ; α) positively depends on α: the more the HCP cares about the patients’ health care benefits, the higher the quality of health care services he delivers.

2.3. The public health care scheme

The government or the health authority (who is the purchaser of health care services) faces two constraints, a participation constraint and a budget constraint. We assume throughout that the benefit from treatment is sufficiently large so that the HA always wants to contract with the HCP. As the HCP cannot be forced to provide health care services, the lump-sum transfer per patient and the cost-sharing parameter must be chosen such that the HCP is willing to accept the contract. With a reservation utility of zero the participation constraint reads as10

αβb(q) − (1 − )c(q, e) − v(e) + − F ≥ 0. (9)

The parameter β ∈ {0, 1} allows us to distinguish between two scenarios. For β = 0 the monetary part of the HCP’s utility needs to be non-negative. In contrast, for β = 1 the participation constraint is less demanding as the (partial) internalization of patients’ health benefits allows for negative monetary payoffs. The functional form of (9) and the parameters α and β are common knowledge.

As already noted above, the expected health care expenses, HCE = [c(q, e) + ], can either be financed by optimal income taxes, proportional income taxes, or insurance premiums. To balance the public budget, public revenues need to be equal to the expected health care expenses. We have

Σij ij Tij ≡ T = HCE, (10)

t Σi i yi ≡ ty = HCE, (11)

p Σj j [(1 − ϕ)j + ϕ] = p = HCE, (12)

10 A strictly positive reservation utility would simply add to the fixed costs of being active in the market.

dT()/d = y dt()/d = dp()/d = [(cq − αβbq )q + (ce + ve )e ] > 0. (16)

A higher cost-sharing parameter requires higher public revenues. This is a direct consequence of the HCP’s response to an increase in cost-sharing: more cost-sharing implies higher quality (7) and less cost-reducing effort (8), both increasing expected health care spending and with it the revenues needed to balance the public budget.

2.4. The economic equilibrium

The following definition introduces the notion of economic equilibrium into our model economy.

Definition 1. (Economic equilibrium)
An allocation xpl , xph , xrl , xrh , q, e with policy instruments Tij /t/p, and constitutes an equilibrium of the economy if the following conditions hold:

(i) the utility of all agents is maximized, i.e., the program given in Eq. (1) is solved,
(ii) the health care provider’s participation constraint (9) is satisfied and its payoff is maximized, i.e., the program given in Eq. (4) is solved, and
(iii) the government’s budget constraint is balanced, i.e., for optimal income taxation Eq. (10) holds, for proportional income taxation Eq. (11) holds, and for insurance premiums Eq. (12) holds.

In an economic equilibrium the utility level obtained by type-ij agents can be expressed by their indirect utility function Vijn (), where

VijT () = yi − Tij () + j [b(q()) − L], (17)

Vijt () = (1 − t())yi + j [b(q()) − L], (18)

Vijp () = yi − [(1 − ϕ)j + ϕ]p() + j [b(q()) − L]. (19)

The indirect utility function represents an individual’s preferences over the cost-sharing parameter, . Inspection of Eqs. (17)–(19) reveals that these preferences depend on the mode of health care financing. This already points to the different distributional properties of the three financing regimes. With individualized lump-sum transfers the government can perfectly redistribute between individuals so as to equalize their (marginal) utilities. With proportional
income taxation, income can still be redistributed from rich to poor using health care as a vehicle (see, e.g., Gouveia, 1997). While any form of redistribution is ruled out when premiums are risk-based (ϕ = 0), partial pooling, ϕ ∈ (0, 1], implies redistribution from low-risk to high-risk agents with the extent of redistribution being increasing in ϕ. In Sections 3 and 4 we carefully analyze how these differences affect the allocations in normative and political economy environments.

3. Optimal financing and funding

In this section we study the optimal interaction between health care financing and funding considering four different allocations: first-best, second-best, and two third-best allocations. To determine the first-best allocation we consider quality and effort contractible and let optimal income taxation be feasible, that is, health care financing is through individualized lump-sum transfers. By maintaining the optimal income taxation assumption and introducing non-contractibility for both quality and effort we obtain the second-best allocation. This allocation is then compared to two third-best allocations where quality and effort are still considered non-contractible but alternative health care financing regimes are considered, namely, proportional income taxation and insurance premiums.

Throughout our analysis we assume an HA or government who may have redistributive concerns, that is, who may aim at redistributing to the doubly disadvantaged in society, namely high-risk low-income individuals. We incorporate redistributive considerations into the analysis by letting the HA’s objective function be

W = Σij ij (Uij ), (20)

where (·) is a strictly increasing and weakly concave function of individual utility levels, i.e., (·) > 0 and (·) ≤ 0. This formulation comprises a utilitarian HA as limiting case, (·) = 0. Due to quasi-linearity every income distribution would then be optimal. In other words, all redistributive concerns of the HA are contained in . An immediate implication is that a necessary condition for a difference between the second-best and the two third-best allocations is the presence of redistributive concerns. In the following, we investigate how these concerns shape optimal health policy.

3.1. Normative benchmarks: optimal income taxation

3.1.1. First-best
The HA maximizes (20) with respect to Tij , q and e subject to the HCP’s participation constraint (9) and the public budget constraint (10). The optimization problem is11

∂W ∗ (Tij , q, e)/∂e = ce + ve = 0, (24)

where is the Lagrangean multiplier on the budget constraint. The first condition states that individualized lump-sum transfers should be chosen such that marginal utilities are equalized. For < 0 this implies that, for a given risk type, high-income individuals have to pay more taxes than low-income agents: Trj > Tpj . Similarly, for a given income type, low-risk agents have to pay more taxes than high-risk agents as the latter suffer the utility loss from illness with a higher probability than the former: Til > Tih . We get the following ordering of individual lump-sum taxes: Trl > max {Tpl , Trh } ≥ min {Tpl , Trh } > Tph . Whether Tpl ≶ Trh depends on the relative inequity between income and risk. Without redistributive concerns, = 0, all Tij that balance the public budget would be optimal.

Eq. (23) states that first-best quality, q* , should be expanded until the marginal benefit to patients, bq , is equal to the marginal costs of providing health care services, cq − αβbq . Obviously, the costs of expanding quality are lower when the participation constraint of the HCP includes an altruistic component (β = 1) as compared to a situation where the HCP has to break even in monetary terms (β = 0). As a result, the first-best efficient quality level is higher in the former case and the quality differential is increasing with α. The first-best cost reduction effort, e* , equates the marginal disutility of exerting effort, ve , to the marginal expected treatment cost savings, −ce ; see Eq. (24).

3.1.2. Second-best
The government can still impose optimal income taxes but both quality and effort are non-contractible.12 Using the reimbursement system the HA can steer the HCP’s incentives to invest in quality and to contain costs. We focus on linear contracts as given in Eq. (2) that specify a lump-sum transfer per patient and a cost-sharing parameter .

Again, the HA maximizes the welfare function subject to the participation constraint (9) of the HCP and the public budget constraint (10). The HA, however, no longer optimizes over Tij , q and e, but over Tij and . In other words, she loses one degree of freedom as compared to the first-best problem. In determining the optimal cost-sharing parameter the HA has to consider the HCP’s optimal quality and effort responses to cost-sharing arrangements as given by Eqs. (5) and (6). Specifically, the optimization problem is given by

max_{Tij , } W T (Tij , ) = Σij ij (yi − Tij + j [b(q) − L]) (25)
s.t. (5), (6) and (13).
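The second-best problem can be sketched numerically: the HA picks a cost-sharing rate while anticipating the HCP's quality and effort responses. The sketch below rests on loudly stated assumptions that are not part of the article: a utilitarian HA, so that the choice reduces to maximizing per-patient surplus (1 + αβ)b − c − v, and the illustrative forms b(q) = 2·sqrt(q), c(q, e) = q + (1 − e)²/2, v(e) = e²/2. All function and variable names are hypothetical.

```python
def hcp_response(cost_share: float, alpha: float) -> tuple[float, float]:
    """HCP's quality and effort under the assumed functional forms."""
    quality = (alpha / (1.0 - cost_share)) ** 2
    effort = (1.0 - cost_share) / (2.0 - cost_share)
    return quality, effort


def surplus(cost_share: float, alpha: float, beta: float) -> float:
    """Per-patient surplus (1 + alpha*beta) * b - c - v at the HCP's response."""
    quality, effort = hcp_response(cost_share, alpha)
    benefit = 2.0 * quality ** 0.5
    treatment_cost = quality + (1.0 - effort) ** 2 / 2.0
    effort_cost = effort ** 2 / 2.0
    return (1.0 + alpha * beta) * benefit - treatment_cost - effort_cost


def second_best_cost_share(alpha: float, beta: float,
                           grid_points: int = 9901) -> float:
    """Grid search over cost-sharing rates in [0, 0.99]."""
    grid = [0.99 * k / (grid_points - 1) for k in range(grid_points)]
    return max(grid, key=lambda s: surplus(s, alpha, beta))


if __name__ == "__main__":
    for beta in (0.0, 1.0):
        s = second_best_cost_share(alpha=0.5, beta=beta)
        q, e = hcp_response(s, alpha=0.5)
        print(f"beta={beta:.0f}  cost_share={s:.4f}  quality={q:.3f}  effort={e:.3f}")
```

With a single instrument the maximizer has to compromise between the two margins: under these assumed forms a rate of zero would leave effort undistorted, whereas efficient quality would call for a strictly positive rate, so the grid search settles in between.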
∂W T (Tij , )/∂ = (−cq + [1 + αβ]bq )q − (ce + ve )e = 0. (27)

As in the first-best solution, individualized lump-sum transfers should be chosen such that marginal utilities are equalized. Eq. (27) yields the optimal degree of cost-sharing, T , which, using the HCP’s first order conditions (5) and (6), dictates qT and eT :

αbq (qT ) = (1 − T )cq (qT ) and −(1 − T )ce (eT ) = ve (eT ). (28)

Implementation of the first-best allocation is generally impossible as the HA has only one instrument, the cost-sharing parameter , but two margins, namely, q and e. It can easily be verified that a cost-sharing parameter = 0 implements the efficient effort level as the provider is the residual claimant for his cost savings. Quality provision, however, will then be inefficient unless α(1 − β) = 1. As

3.2. Third-best: the optimal financing and funding interaction

In addition to the resource constraints and the non-contractibility of quality and effort the HA now faces financing constraints. When optimal income taxation is no longer feasible the HA has to resort to alternative financing regimes, proportional income taxation and insurance premiums being the most prominent arrangements. We investigate the resulting third-best allocations in turn.

3.2.1. Proportional income taxes
With proportional income taxes the policy instruments are given by the tax rate t and the cost-sharing parameter . The optimization problem reads as

max_{t, } W t (t, , ) = Σij ij ((1 − t)yi + j [b(q) − L]) (29)
s.t. (5), (6) and (14).

Inserting the constraints, the optimization problem simplifies to

W t () = Σij ij ((1 − ( /y)[c(q(), e()) + v(e()) − αβb(q()) + F/ ])yi + j [b(q()) − L]).

The optimal cost-sharing parameter, t , is implicitly determined by the following first order condition

dW t ()/d = Σij ij ij {−( /y)yi [(cq − αβbq )q + (ce + ve )e ] + j bq q } = 0, (30)

where q and e are defined as in Eqs. (7) and (8). To gain a better understanding of how redistributive concerns affect optimal cost-sharing under proportional income taxation t and thus optimal quality qt and effort et , we introduce the distributional characteristics of health risk and income (see Feldstein, 1972):

≡ (Σij ij ij j − Σij ij ij Σij ij j )/(Σij ij ij Σij ij j ), (31)

y ≡ (Σij ij ij yi − Σij ij ij Σij ij yi )/(Σij ij ij Σij ij yi ). (32)

These measures amount to the standardized covariances between the welfare weights the government attaches to a particular ij-type and health risk and income, respectively. For a utilitarian HA, = 0, all individuals have the same welfare weight irrespective of health and income, that is, ij = > 0 ∀ ij. The absence of redistributive concerns is reflected in the distributional characteristics both assuming the value zero, = y = 0. Similarly, without agent heterogeneity the distributional characteristics would be zero: l = h implies = 0 and yp = yr implies y = 0.13

Using Eqs. (31) and (32), Eq. (30) can be rewritten as

(i) ≥ y ⇔ t ≥ T ⇔ qt ≥ qT and et ≤ eT .
(ii) The third-best allocation with proportional income taxes is identical to the second-best allocation if and only if y = = 0.

The intuition for part (i) is as follows.14 Whenever > 1, which is equivalent to > y , the marginal benefit of quality provision is higher under proportional income taxes than under optimal income taxation. The reason is the distributional consequences of improvements in quality. High-risk individuals are more likely to benefit from higher quality care but, ceteris paribus, do not pay more taxes than low-risk individuals. Additionally, rich individuals contribute more to the financing of better quality without benefitting more than poor individuals. The increase in quality, thus, implies more redistribution towards poor and high-risk individuals. These redistributive benefits have to be weighed against the additional distortion of effort incentives and the third-best optimal degree of cost-sharing does so optimally. For < 1 cost-sharing in the third-best is less pronounced than in the second-best and quality and effort levels compare accordingly. Optimal cost-sharing in the third-best and second-best optima is identical if and only if = 1. If = y ≠ 0, then health care provision is second-best efficient. As distributional concerns matter, however, the allocation is not second-best efficient. As a consequence of suboptimal health care financing the income redistribution is inefficient. Only in the absence of redistributive concerns, = 0, or when there is no need for redistribution, l = h and yp = yr , do the third-best and second-best allocations coincide (ii).

One can certainly speculate about how the distributional characteristics compare to one another. We conjecture that the HA attaches a higher welfare weight to the disadvantaged, that is, to individuals with high health risks, Cov( ij , j ) > 0, and low income, Cov( ij , yi ) < 0. This is equivalent to > 0 and y < 0 which implies > 1 and with it more cost-sharing, higher quality and less effort under proportional income taxation as compared to optimal income taxation.

13 Interpretation of the distributional characteristics is not straightforward. Note that we made no assumptions about the correlation between income and risk. This implies that any assumption on the sign of the distributional characteristics implicitly includes assumptions on the sign and size of this correlation.
14 Analytically, the result follows, like in all other propositions and corollaries, from a simple comparison of the coefficients on quality in the respective first order conditions, here Eqs. (27) and (33).
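The distributional characteristics described above, standardized covariances between welfare weights and either health risk or income, can be computed directly. The following is a minimal sketch; the function name, the dictionary layout, and all numbers (type shares, welfare weights, risks, incomes) are hypothetical and serve only to illustrate the sign pattern discussed in the text.

```python
def distributional_characteristic(shares: dict, weights: dict, values: dict) -> float:
    """Standardized covariance between welfare weights and a characteristic.

    shares  -- population share of each type (should sum to one)
    weights -- welfare weight attached to each type
    values  -- the characteristic of each type (health risk or income)
    """
    mean_weight = sum(shares[k] * weights[k] for k in shares)
    mean_value = sum(shares[k] * values[k] for k in shares)
    mean_product = sum(shares[k] * weights[k] * values[k] for k in shares)
    return (mean_product - mean_weight * mean_value) / (mean_weight * mean_value)


# Hypothetical four-type economy: p/r = poor/rich, l/h = low/high risk.
SHARES = {"pl": 0.40, "ph": 0.20, "rl": 0.25, "rh": 0.15}
RISK = {"pl": 0.1, "ph": 0.3, "rl": 0.1, "rh": 0.3}
INCOME = {"pl": 1.0, "ph": 1.0, "rl": 2.0, "rh": 2.0}
# Hypothetical welfare weights favouring poor and high-risk types.
WEIGHTS = {"pl": 1.0, "ph": 1.5, "rl": 0.5, "rh": 1.0}

if __name__ == "__main__":
    print("risk characteristic:  ",
          distributional_characteristic(SHARES, WEIGHTS, RISK))
    print("income characteristic:",
          distributional_characteristic(SHARES, WEIGHTS, INCOME))
```

With weights that favour the disadvantaged, the risk characteristic comes out positive and the income characteristic negative; with identical weights for all types both are zero.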
3.2.2. Insurance premiums
With insurance premiums the optimization problem of the HA is given by

max_{p, } W p (p, , ) = Σij ij (yi − [(1 − ϕ)j + ϕ]p + j [b(q) − L]) (34)
s.t. (5), (6) and (15).

Employing the same approach as in the previous section and using Eq. (31), the first order condition for the optimal cost-sharing parameter p can be written as

(−cq + [(1 + )/(1 + (1 − ϕ)) + αβ]bq )q − (ce + ve )e = 0. (35)

This condition allows us to state our next proposition where we compare the results with insurance premiums to the second-best allocation.

Proposition 2. (Optimal policy with insurance premiums)

(i) When insurance premiums entail some redistribution, ϕ > 0, then: ≥ 0 ⇔ p ≥ T ⇔ qp ≥ qT and ep ≤ eT .
(ii) For purely risk-based premiums, ϕ = 0, health care provision is second-best efficient while health care financing is not.
(iii) The third-best allocation with insurance premiums is identical to the second-best allocation if and only if y = = 0.

The intuition is similar to the one of Proposition 1. Suppose that insurance premiums imply some redistribution from low-risk to high-risk individuals as stated in part (i) of the proposition. Then an increase in quality triggered by more cost-sharing benefits high-risk individuals with a higher probability than low-risk individuals. The resulting increase in insurance premiums, however, is not shared accordingly when premiums are partially pooled. High-risk individuals disproportionately benefit from an increase in quality making such improvements desirable whenever the HA attaches a higher welfare weight to high-risk individuals than to low-risk individuals, i.e., whenever > 0. In this case it pays off to distort quality and effort away from their second-best levels as the resulting improvement in the distribution of income outweighs the associated losses in efficiency. There is no such distributional benefit when premiums are purely risk-rated, (ii). This result carries an important policy message: second-best optimal health care provision can be achieved by both financing regimes, optimal income taxation and risk-rated insurance premiums. It emphasizes that an isolated (second-best) analysis of provider payment rests on strong assumptions regarding the financing of health care, unless, (iii), the HA has no redistributive concerns.

Combining Propositions 1 and 2 we find that the optimal policy is sensitive to the mode of health care financing when redistributive concerns matter. Only in the special case of no redistributive concerns is the redistribution of income, implied by a change from one health care financing regime to the other, welfare neutral. We emphasize this in the following corollary.

Corollary 1. (Optimal policy: proportional income taxes vs. insurance premiums)
(1 − ϕ) ≥ y ⇔ t ≥ p ⇔ qt ≥ qp and et ≤ ep .

independent of the extent of pooling ϕ. As a result, health care quality in tax-based systems is expected to exceed quality in insurance-financed systems. It is the other way round for cost-reducing effort, leading to higher expected health care expenses in the former systems than in the latter.

4. Political implementability of optimal policies

In the previous section we assumed a normative perspective and investigated the optimal health care policies under optimal income taxation (first-best and second-best) and contrasted them with alternative financing regimes (third-best). Implementing an optimal cost-sharing parameter, however, may not be feasible politically. Under majority voting, policy makers will commit to policies that maximize the number of votes rather than the social objective. Taking a median voter approach, this section derives the political equilibria under proportional income taxation and insurance premiums and contrasts them with the respective third-best and second-best policies.

With a continuum of individuals each voting agent has zero mass, so that no individual vote can change the outcome of the election. We let all agents alive cast a ballot over the cost-sharing parameter . Although it appears more natural to let voters decide on the parameter which determines the financing of health care rather than on one parameter of the reimbursement scheme, this approach is without loss of generality. Remember that in an economic equilibrium only one of the three policy instruments can be set freely. The other two are then residually determined by the HCP’s participation constraint and the HA’s budget constraint. It is thus irrelevant whether voters decide on , t/p or . We opt for the former as it allows for a direct comparison of the equilibrium allocations of the two approaches, normative and positive.

4.1. Proportional income taxes

Individuals maximize their indirect utility function (18) with respect to the cost-sharing parameter subject to the constraint on the tax rate t given by Eq. (14). The first order condition of an ij-type amounts to15

dVijt ()/d = (−δi cq + [δj + αβδi ]bq )q − δi (ce + ve )e = 0, (36)

where we defined δi ≡ yi /y and δj ≡ j /. The above equation implicitly determines the most preferred cost-sharing parameter ijt of a type-ij individual.16 To see how the most preferred cost-sharing parameter depends on income and risk we apply the implicit function theorem which, observing the first order conditions of the HCP, Eqs. (5) and (6), yields

dijt /dδi = [(cq + α(1 − β)bq )q + ce e ]/SOC ij < 0 and (37)

dijt /dδj = −bq q /SOC ij > 0. (38)

These inequalities allow us to order the four types according to their most preferred cost-sharing parameter and with it to identify the median voter.
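How a voter's most preferred cost-sharing rate varies with relative income and relative risk under proportional income taxes can be illustrated with a small numerical sketch. It is not the article's derivation: it assumes the illustrative forms b(q) = 2·sqrt(q), c(q, e) = q + (1 − e)²/2, v(e) = e²/2, writes the voter's objective (up to constants and scaling) as delta_risk·b − delta_income·(c + v − αβb), and uses hypothetical names and parameter values throughout.

```python
def hcp_response(cost_share: float, alpha: float) -> tuple[float, float]:
    """HCP's quality and effort under the assumed functional forms."""
    quality = (alpha / (1.0 - cost_share)) ** 2
    effort = (1.0 - cost_share) / (2.0 - cost_share)
    return quality, effort


def voter_payoff(cost_share: float, delta_income: float, delta_risk: float,
                 alpha: float, beta: float) -> float:
    """Cost-share dependent part of a voter's indirect utility (rescaled)."""
    quality, effort = hcp_response(cost_share, alpha)
    benefit = 2.0 * quality ** 0.5
    net_cost = (quality + (1.0 - effort) ** 2 / 2.0 + effort ** 2 / 2.0
                - alpha * beta * benefit)
    # Benefits scale with relative risk, the tax burden with relative income.
    return delta_risk * benefit - delta_income * net_cost


def preferred_cost_share(delta_income: float, delta_risk: float,
                         alpha: float = 0.3, beta: float = 0.0,
                         grid_points: int = 9901) -> float:
    """Grid search for the voter's most preferred rate in [0, 0.99]."""
    grid = [0.99 * k / (grid_points - 1) for k in range(grid_points)]
    return max(grid, key=lambda s: voter_payoff(s, delta_income, delta_risk,
                                                alpha, beta))


if __name__ == "__main__":
    # Hypothetical economy: incomes 1 and 2 (mean 1.4), risks 0.1 and 0.3 (mean 0.17).
    d_p, d_r = 1.0 / 1.4, 2.0 / 1.4
    d_l, d_h = 0.1 / 0.17, 0.3 / 0.17
    for name, (di, dj) in {"rl": (d_r, d_l), "pl": (d_p, d_l),
                           "rh": (d_r, d_h), "ph": (d_p, d_h)}.items():
        print(name, round(preferred_cost_share(di, dj), 4))
```

In this illustration the preferred rate falls with relative income and rises with relative risk, so rich low-risk voters want the least cost-sharing and poor high-risk voters the most.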
Lemma 1. (Median voter: proportional income taxes)
The most preferred cost-sharing parameters can be ordered as follows: rlt < min{plt , rht } ≤ max{plt , rht } < pht . The median voter is a pl-type agent, i.e., mt = plt .

Eq. (37) implies rjt < pjt . For a given risk type high-income agents contribute more to the financing of the health care scheme than low-income agents. As the risk type is given, the benefit from more cost-sharing in terms of higher quality is uniform across income types making cost-sharing less attractive for high-income individuals. Similarly, Eq. (38) implies ilt < iht . For a given income, the expected benefits from quality improvements triggered by more cost-sharing are higher for high-risk individuals than for low-risk individuals. As financing costs are uniform across risk types, high-risk agents prefer more cost-sharing than low-risk agents. Although the comparison of the most preferred cost-sharing parameters of pl-types and rh-types generally depends on the distribution of types, identification of the median voter is straightforward: as neither the high-risks nor the rich can form a majority ( h < 0.5 and r < 0.5, respectively), the median voter is a type-pl agent who either forms a majority with low-income high-risk individuals or with high-income low-risk individuals.

Inserting the median voter’s type into Eq. (36) and dividing by δp allows a direct comparison with the third-best policy under proportional income taxation.

(−cq + [ + αβ]bq )q − (ce + ve )e = 0, (39)

where ≡ δl /δp . While δl ≤ 1 and δp ≤ 1 measure the inequity in health and income respectively, their fraction, , is a measure of the relative inequity between the two dimensions.

Proposition 3. (Political outcome with proportional income taxes)

< ⇔ δl < 1 + . High-risk individuals disproportionately benefit from quality improvements. As the median voter is a low-risk agent, he aims at limiting the redistribution towards high-risk individuals and he does so by implementing a low cost-sharing rate. As the distributional preferences of the HA will usually point in the opposite direction, > 0, cost-sharing is more pronounced in the third-best allocation than in the political outcome. These arguments extend to the case with inequities in both income and risk.

> : the cost-sharing benefit accruing to the median voter is larger than in the social objective as, due to the sufficiently low health risk inequality as compared to income inequality, the benefits of better quality care are distributed more evenly. In that sense the redistribution is better targeted towards the median voter. The HA’s redistributive concerns are lower so that her cost-sharing incentives are weaker.

= : although this is a knife-edge case it is noteworthy that the distributional preferences of the median voter and the HA are aligned so that the political outcome is identical to the third-best allocation.

If the inequality in health risk and income is identical, = 1, then the political outcome with health care financing through proportional income taxation leads to second-best efficient health care provision. The reason is that the median voter’s benefits from cost-sharing through improvements in the distribution of income are exactly compensated by the worsening of the distribution of health benefits. Finally, this allocation is second-best efficient if = y = 0.

We argued above that the HA likely attaches a higher welfare weight to high-risk or low-income individuals than to low-risk or high-income ones. But then > 1 and the third-best can only be implemented through a majority voting process if the inequality in health risk is smaller than the inequality in income such that = > 1. In general, however, the second-best cannot be achieved as a political equilibrium.
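The median voter argument of Lemma 1 can be checked mechanically: sort the types by their most preferred cost-sharing rate and find the type at which the cumulative population share first exceeds one half. The sketch below assumes single-peaked preferences; the function name, the type shares, and the preferred rates are hypothetical numbers chosen only to respect the ordering stated in the lemma.

```python
def median_voter(peaks: dict, shares: dict) -> str:
    """Type whose preferred rate is the share-weighted median of all peaks."""
    cumulative = 0.0
    for voter_type in sorted(peaks, key=peaks.get):
        cumulative += shares[voter_type]
        if cumulative > 0.5:
            return voter_type
    raise ValueError("shares must sum to more than one half")


# Hypothetical shares: the poor (pl + ph) and the low risks (pl + rl)
# each form a majority, while no single type does.
SHARES = {"pl": 0.40, "ph": 0.20, "rl": 0.25, "rh": 0.15}

# Two hypothetical orderings consistent with Lemma 1: the ranking of the
# pl-type and the rh-type is not pinned down in general.
PEAKS_PL_BELOW_RH = {"rl": 0.20, "pl": 0.50, "rh": 0.60, "ph": 0.80}
PEAKS_RH_BELOW_PL = {"rl": 0.20, "rh": 0.45, "pl": 0.50, "ph": 0.80}

if __name__ == "__main__":
    print(median_voter(PEAKS_PL_BELOW_RH, SHARES))
    print(median_voter(PEAKS_RH_BELOW_PL, SHARES))
```

In both orderings the pl-type is pivotal: it completes a majority either with the rich low-risk types below it or with the poor high-risk types above it.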
Proposition 4. (Political outcome with insurance premiums)

(i) When insurance premiums entail some redistribution, ϕ > 0, then the relationship between cost-sharing arrangements in the third-best optimum and the political equilibrium depends on how the inequity in risk relates to the distributional characteristic of risk. More precisely, we have: 1 + ≥ δl ⇔ p ≥ pm ⇔ qp ≥ qpm and ep ≤ epm .
(ii) Health care provision is second-best efficient if and only if ϕ = 0. Second-best efficiency of the allocation requires the absence of redistributive concerns, = y = 0.

If the HA aims at redistributing towards high-risk individuals, > 0, then the condition formulated in part (i) of the proposition always holds. High cost-sharing parameters imply generous redistribution towards high-risk individuals. Unlike the HA, the median voter, a low-risk agent, dislikes redistribution towards high-risk types. Accordingly, cost-sharing is less pronounced in the political outcome than under the third-best policy. This ordering can only be reversed if the HA has a sufficiently strong preference to redistribute towards low-risk type agents. Remarkably, for purely risk-based premiums, ϕ = 0, the incentives of voters are aligned with those of the HA in both the second-best and the third-best environment.18 Although this result appears somewhat surprising, the intuition is relatively straightforward. The optimal allocations trade off inefficiencies along cost reduction effort and quality and this tradeoff is not blurred by redistributive motives. Similarly, in the political game redistribution plays no role as risk-based premiums preclude any form of redistribution. The third-best allocation can thus be implemented through a majority voting process. In contrast to the proportional income taxation case, this result holds independent of agent heterogeneity. Finally, note that for purely risk-based premiums health care provision is second-best efficient. The resulting income distribution may still be inefficient unless there are no redistributive concerns.

Combining Propositions 3 and 4 we arrive at the following corollary that facilitates comparison of political outcomes under alternative financing regimes.

Corollary 2. (Political outcome: proportional income taxes vs. insurance premiums)
δp − δl ≤ ϕ(1 − δl ) ⇔ tm ≥ pm ⇔ qtm ≥ qpm and etm ≤ epm .

The comparison of cost-sharing across financing regimes hinges on how risk pooling, as measured by ϕ, relates to the relative inequity between income and risk. Only in the knife-edge case, δp − δl = ϕ(1 − δl ), is health care provision under majority voting invariant to a change from insurance premiums to proportional income taxes.

5. Discussion

One can think of many extensions of the framework analyzed in this article. Here we discuss two important ones: progressive income taxation and endogenous choice of the health care financing regime. We discuss these extensions in turn.19

income taxation. Although somewhat unrealistic, the former financing arrangement offers a valuable benchmark (the second-best allocation) against which the allocations under alternative financing regimes can be compared. The latter financing arrangement, proportional premiums, is the rule rather than the exception in the financing of social health insurance systems. 14 out of 16 OECD countries with such a system use proportional contributions to finance health care.20 The only exceptions are Hungary and Switzerland. While the former country relies, in part, on progressive income taxes, Switzerland is an example of a regressive financing scheme that collects partially pooled insurance premiums. Redistribution across risk types is limited insofar as individuals can choose from a set of contracts that differ in both premiums and deductibles. As individual selection is, among other things, driven by risk, pooling is incomplete, ϕ < 1. Income is relevant insofar as individuals below a canton-specific threshold receive premium subsidies.21

Tax financed national health service systems tend to use progressive taxation to finance health care. Such a system is more redistributive than a proportional financing regime and likely less redistributive than a system that applies optimal income taxes. Therefore, the resulting allocation under progressive taxation will be some sort of mixture between the allocations under proportional income taxation and optimal income taxation. In the following, we investigate the normative and positive allocations for a simplified progressive income tax scheme.

Suppose that there are two proportional income tax rates and that the one for the rich exceeds the one for the poor. Such a scheme is more redistributive than a proportional income tax system, mitigating the role of health care in redistributive policies. This implies that the HA would optimally implement a lower degree of cost-sharing.

For the positive analysis of progressive income taxes we first have to identify the median voter. Rich low-risk individuals are still the ones who stand to benefit the least from publicly financed health care and poor high-risk individuals are the ones who benefit the most. As rich high-risk individuals can never form a majority with risk types or income types of their kind, poor low-risk individuals remain pivotal. It is, thus, the most preferred cost-sharing level of the type-pl agent that is being implemented. With progressive income taxation their net marginal benefit of health care provision is larger than with proportional income taxation as high-income individuals now contribute more to the financing of the health care scheme. This implies more cost-sharing under progressive than under proportional income taxation.

We can now compare the normative to the positive outcome. As with proportional income taxation, the comparison hinges on the marginal incentives for redistribution. As argued above, in the normative setting, these incentives are likely to be smaller under progressive income taxation than under proportional income taxation. This is the other way round in the positive setting. Consequently, the set of parameter values for which the positive outcome entails more cost-sharing than the normative outcome is larger under progressive income taxation (see Proposition 3).
from low-risk to high-risk individuals. The extent to which the health authority distorts provider payment away from its second-best (and with it health care provision) depends on individual heterogeneity in risk and income and on the welfare weights the health authority attaches to the different types. We show that Feldstein’s (1972) distributional characteristics can be used to characterize the third-best allocation. If the distributional characteristics of income and risk are identical, so are second-best and third-best health care provision. As income distributions might differ, the third-best allocation is not second-best efficient unless both distributional characteristics are nil. This is the case when the health authority has no redistributive concerns, or in the absence of individual heterogeneity. The political outcome is governed by the preferences of the median voter who happens to be a poor, low-risk agent. With proportional income taxation the median voter contributes less to the financing of the health care scheme than the average voter. But, at the same time, he needs health care with a lower probability than the average voter. The median voter’s preferences for cost-sharing are thus driven by the relative inequity between risk and income. These preferences will likely differ from the ones of the health authority so that the third-best allocation can generally not be implemented as a median voter equilibrium. Fig. 1 below gives the condition under which the two are identical. It should be noted that the political outcome may yield second-best health care provision. This is the case whenever the inequality in risk is identical to the inequality in income. Only in the absence of individual heterogeneity or redistributive concerns is this allocation second-best efficient.

When considering insurance premiums as a mode of health care financing, redistributive consequences are driven by the extent of risk pooling. In the extreme case of risk-based premiums (no pooling) the third-best allocation yields second-best health care provision. The reason is that risk-based premiums preclude any form of redistribution so that the health authority refrains from distorting health care provision away from its second-best. But this also implies that there is no conflict in the electorate over how to shape health care provision. The third-best allocation can thus be implemented as a political equilibrium and health care provision is second-best efficient. As the income distribution might not be optimal, the resulting allocations will generally not be second-best efficient. In the event of (partially) pooled premiums, health care financing through insurance premiums involves a redistributive element. Income is redistributed from low-risk to high-risk individuals. Redistribution from high- to low-income individuals is ruled out. Consequently, the third-best allocation and the political outcome will depend on the distributional characteristic of risk and the inequity in risk, respectively. Only in the event of aligned cost-sharing preferences of the health authority and the median voter can the third-best allocation be implemented by majority voting. Again, the respective condition is given in Fig. 1.

Finally, we derived conditions under which health care financing is irrelevant in third-best and political economy environments. It turns out that health care financing will generally matter unless risk pooling relates to the distributional characteristics and individual heterogeneity in the way laid out in Fig. 1.

The framework we adopted only considers public health care and abstracts from additional public programs like, most importantly, an income transfer scheme. An immediate implication is that the former program implicitly assumes the redistributive role of the latter program (Cremer and Gahvari, 1997). Our results are largely robust to adding an income transfer scheme in the sense that individual heterogeneity, redistributive preferences, and health financing would still matter. The reason is that, apart from risk-based premiums, health care financing is inherently redistributive. Only in rather extreme and unrealistic cases would the size of the public health care program be invariant to health care financing and be independent of individual heterogeneity and redistributive preferences. More precisely, if no additional redistribution over and above the redistribution via the transfer scheme was beneficial (the transfer scheme is financed by optimal income taxes26) or if the public health care program and the transfer scheme have identical redistributive properties (e.g., if both programs are financed by proportional income taxes and transfers within the income scheme are perfectly correlated with risk).

Our results make very clear that an isolated analysis of provider payment rests on very strong assumptions regarding the mode of health care financing (optimal income taxation or risk-based premiums) or the redistributive concerns of the health authority (none). This has a bearing on the applicability of the results derived in the optimal provider payment literature. The quality-cost containment tradeoff will generally differ internationally. In principle, our framework could be used to derive testable hypotheses regarding the relationship between health care financing and supply-side cost-sharing. Actual hypothesis testing would require accurate data on the degree of cost-sharing, Feldstein’s distributional characteristics, and individual heterogeneity. To the best of our knowledge, such data are currently not available so that we leave the empirical analysis for future research.

26 This is, of course, equivalent to assuming health care financing through optimal income taxes.

Appendix A.

A.1. Second-order conditions

A.1.1. The health care provider

Since the two leading principal minors alternate in sign:

\[
|H_{qq}| = \alpha b_{qq} - (1-\gamma)c_{qq} < 0
\]

\[
\begin{vmatrix} H_{qq} & H_{qe} \\ H_{eq} & H_{ee} \end{vmatrix}
=
\begin{vmatrix} \alpha b_{qq} - (1-\gamma)c_{qq} & 0 \\ 0 & -\left((1-\gamma)c_{ee} + v_{ee}\right) \end{vmatrix}
> 0
\]

due to $b_{qq} < 0$, $c_{qq} > 0$, $c_{ee} \ge 0$ and $v_{ee} > 0$, the HCP’s chosen quality level, $q(\gamma; \alpha)$, and cost-reducing effort, $e(\gamma; \alpha)$, are a maximum of the objective function $H(q, e)$.

A.1.2. Individuals

The second-order condition of individual $ij$ amounts to

\[
\frac{d^2 V_{ij}^t(\gamma)}{d\gamma^2}
= \underbrace{\left(-\delta_i c_{qq} + (\delta_j + \alpha\beta\delta_i) b_{qq}\right)(q')^2}_{<0}
+ \underbrace{\left(-\delta_i c_q + (\delta_j + \alpha\beta\delta_i) b_q\right) q''}_{\gtrless 0}
\underbrace{{} - \delta_i (c_{ee} + v_{ee})(e')^2}_{<0}
\underbrace{{} - \delta_i (c_e + v_e)\, e''}_{<0} .
\]

A.2. Comparative statics

Taking the total derivative of the HCP’s first order conditions (6) and (5) yields

\[
\left(\alpha b_{qq} - (1-\gamma)c_{qq}\right) dq = -c_q\, d\gamma - b_q\, d\alpha \tag{41}
\]

\[
\left((1-\gamma)c_{ee} + v_{ee}\right) de = c_e\, d\gamma \tag{42}
\]

as we assume $c_{qe} = 0$. The above system of equations can be written as

\[
\begin{pmatrix} dq \\ de \end{pmatrix}
= \frac{1}{D}
\begin{pmatrix} (1-\gamma)c_{ee} + v_{ee} & 0 \\ 0 & \alpha b_{qq} - (1-\gamma)c_{qq} \end{pmatrix}
\begin{pmatrix} -c_q\, d\gamma - b_q\, d\alpha \\ c_e\, d\gamma \end{pmatrix}, \tag{43}
\]
208 R. Nuscheler, K. Roeder / Journal of Health Economics 42 (2015) 197–208
where $D = \left(\alpha b_{qq} - (1-\gamma)c_{qq}\right)\left((1-\gamma)c_{ee} + v_{ee}\right) < 0$. Hence, we have

\[
\frac{dq}{d\gamma} = \frac{-\left((1-\gamma)c_{ee} + v_{ee}\right) c_q}{D} = \frac{-c_q}{\alpha b_{qq} - (1-\gamma)c_{qq}} > 0 \tag{44}
\]

\[
\frac{de}{d\gamma} = \frac{\left(\alpha b_{qq} - (1-\gamma)c_{qq}\right) c_e}{D} = \frac{c_e}{(1-\gamma)c_{ee} + v_{ee}} < 0. \tag{45}
\]

Additionally, we have

\[
\frac{d^2 q}{d\gamma^2} = \frac{c_q c_{qq}}{\left(\alpha b_{qq} - (1-\gamma)c_{qq}\right)^2} > 0 \tag{46}
\]

\[
\frac{d^2 e}{d\gamma^2} = \frac{c_e c_{ee}}{\left((1-\gamma)c_{ee} + v_{ee}\right)^2} < 0. \tag{47}
\]

A.3. The health care financing and funding interaction

Appendix B. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jhealeco.2015.04.003

References

Blomqvist, A., Horn, H., 1984. Public health insurance and optimal income taxation. Journal of Public Economics 24, 353–371.
Breyer, F., Haufler, A., 2000. Health care reform: separating insurance from income redistribution. International Tax and Public Finance 7, 445–461.
Chalkley, M., Malcomson, J.M., 1998a. Contracting for health services with unmonitored quality. The Economic Journal 108, 1093–1110.
Chalkley, M., Malcomson, J.M., 1998b. Contracting for health services when patient demand does not reflect quality. Journal of Health Economics 17, 1–19.
Chalkley, M., Malcomson, J.M., 2000. Government purchasing of health services. Handbook of Health Economics 1a, 847–890.
Cremer, H., Gahvari, F., 1997. In-kind transfers, self-selection and optimal tax policy. European Economic Review 41, 97–114.
Cremer, H., Pestieau, P., 1996. Redistributive taxation and social insurance. International Tax and Public Finance 3, 281–295.
Dranove, D., Kessler, D., McClelland, M., Satterthwaite, M., 2003. Is more information better? The effects of “Report Cards” on health care providers. Journal of Political Economy 111, 555–588.
Eggleston, K., 2005. Multitasking and mixed systems for provider payment. Journal of Health Economics 24, 211–223.
Ellis, R.P., McGuire, T.G., 1990. Optimal payment systems for health services. Journal of Health Economics 9, 375–396.
Epple, D., Romano, R.E., 1996a. Ends against the middle: determining public service provision when there are private alternatives. Journal of Public Economics 62, 297–325.
Epple, D., Romano, R.E., 1996b. Public provision of private goods. Journal of Political Economy 104, 57–84.
Feldstein, M.S., 1972. Distributional equity and the optimal structure of public prices. American Economic Review 62, 32–36.
Gouveia, M., 1997. Majority rule and the public provision of a private good. Public Choice 93, 221–244.
Gravelle, H., 1999. Capitation contracts: access and quality. Journal of Health Economics 18, 315–340.
Jack, W., 2005. Purchasing health care services from providers with unknown altruism. Journal of Health Economics 24, 73–93.
Kaarbøe, O.M., Siciliani, L., 2011. Multitasking, quality and pay for performance. Health Economics 20, 225–238.
Kifmann, M., 2005. Health insurance in a democracy: why is it public and why are premiums income related? Public Choice 124, 283–308.
Kifmann, M., Roeder, K., 2011. Premium subsidies and social health insurance: substitutes or complements? Journal of Health Economics 30, 1207–1218.
Ma, C.A., 1994. Health care payment systems: cost and quality incentives. Journal of Economics & Management Strategy 3, 93–112.
Ma, C.A., McGuire, T.G., 1997. Optimal health insurance and provider payment. American Economic Review 87 (4), 685–704.
Nuscheler, R., 2003. Physician reimbursement, time consistency, and the quality of care. Journal of Institutional and Theoretical Economics, 302–322.
Zeckhauser, R., 1970. Medical insurance: a case study of the tradeoff between risk spreading and appropriate incentives. Journal of Economic Theory 2, 10–26.
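
Numerical illustration of Appendix A.2. The comparative statics in (44) and (45) can be checked numerically. The Python sketch below is only an illustration under stated assumptions, not part of the original analysis: it writes the cost-sharing parameter as gamma, assumes the provider maximizes alpha*b(q) - (1 - gamma)*c(q, e) - v(e) plus payments that do not depend on q and e (a form consistent with (41) and (42)), and picks hypothetical quadratic functional forms and parameter values that satisfy b_qq < 0, c_qq > 0, c_ee >= 0, v_ee > 0 and c_qe = 0. It then compares the slopes implied by (44) and (45) with central finite differences of the provider's closed-form choices.

```python
# Assumed (hypothetical) functional forms, chosen only to satisfy the sign
# assumptions of Appendix A; they are not taken from the article:
#   b(q)    = A*q - 0.5*M*q**2      (b_qq = -M < 0)
#   c(q, e) = 0.5*K*q**2 - S*e      (c_qq = K > 0, c_e = -S < 0, c_ee = 0, c_qe = 0)
#   v(e)    = 0.5*W*e**2            (v_ee = W > 0)
A, M, K, S, W = 2.0, 1.0, 1.5, 0.8, 2.0  # illustrative parameter values


def provider_choice(gamma, alpha):
    """Closed-form solution of the assumed first-order conditions
    alpha*b_q = (1 - gamma)*c_q  and  -(1 - gamma)*c_e = v_e."""
    q = alpha * A / (alpha * M + (1.0 - gamma) * K)
    e = (1.0 - gamma) * S / W
    return q, e


def analytic_slopes(gamma, alpha):
    """dq/dgamma and de/dgamma as given by equations (44) and (45)."""
    q, _ = provider_choice(gamma, alpha)
    b_qq, c_q, c_qq = -M, K * q, K
    c_e, c_ee, v_ee = -S, 0.0, W
    dq = -c_q / (alpha * b_qq - (1.0 - gamma) * c_qq)
    de = c_e / ((1.0 - gamma) * c_ee + v_ee)
    return dq, de


def numeric_slopes(gamma, alpha, h=1e-6):
    """Central finite differences of the closed-form choices."""
    q_hi, e_hi = provider_choice(gamma + h, alpha)
    q_lo, e_lo = provider_choice(gamma - h, alpha)
    return (q_hi - q_lo) / (2.0 * h), (e_hi - e_lo) / (2.0 * h)


if __name__ == "__main__":
    for gamma in (0.1, 0.5, 0.9):
        print(gamma, analytic_slopes(gamma, 0.6), numeric_slopes(gamma, 0.6))
```

Under these assumed forms the two sets of slopes agree up to finite-difference error, with quality increasing and cost-reducing effort decreasing in gamma, in line with the signs stated in (44) and (45). Other functional forms satisfying the same sign assumptions could be substituted in `provider_choice` and `analytic_slopes`.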