
Introduction to Economic Growth - Daron Acemoglu - 2006

© Attribution Non-Commercial (BY-NC)


Daron Acemoglu MIT Department of Economics January 2006

14.451: Introduction to Economic Growth

Contents

I Introduction

1 Stylized Facts of Economic Growth and Development
    1.1 A Quick Look at the Facts
    1.2 Interpretation
    1.3 The Agenda
2 The Solow Growth Model
    2.1 The Basic Model in Discrete Time
        2.1.1 The Production Structure
        2.1.2 Endowments
        2.1.3 Fundamental Law of Motion of the Solow Model
        2.1.4 Definition of Equilibrium
        2.1.5 Equilibrium Without Population Growth and Technological Progress
        2.1.6 Transitional Dynamics in the Solow Model
    2.2 The Solow Model in Continuous Time
        2.2.1 From Difference to Differential Equations
        2.2.2 The Fundamental Equation of the Solow Model in Continuous Time
        2.2.3 A First Look at Sustained Growth
    2.3 Solow Model with Technological Progress
        2.3.1 Balanced Growth
        2.3.2 Neutral Technological Progress
        2.3.3 The Steady-State Technological Progress Theorem
        2.3.4 The Solow Growth Model with Technological Progress: Continuous Time
3 The Solow Model and the Data
    3.1 Growth Accounting
    3.2 Solow Model and Cross-Country Income Differences
        3.2.1 Solow Model with Human Capital
        3.2.2 Problems with the Mankiw, Romer and Weil Approach
        3.2.3 The Macro Mincer Approach (Bils-Klenow-Rodriguez-Hall-Jones)
    3.3 An Alternative Approach to Estimating Productivity Differences (Trefler)
4 Fundamental Determinants of Differences in Income
    4.1 From Proximate to Fundamental Causes
    4.2 Hypotheses
    4.3 Europe's Expansion and Colonial Origins of Institutions

II Neoclassical Growth

5
    5.1 Representative Consumer
    5.2 Problem Formulation
    5.3 Welfare Theorems
    5.4 Optimal Growth in Discrete Time
    5.5 Optimal Growth in Continuous Time
6 Dynamic Programming and Optimal Growth
    6.1 Brief Review of Dynamic Programming
    6.2 Digression: Technical Details
        6.2.1 Contraction Mappings
        6.2.2 Application of Contraction Mappings to Dynamic Programming
    6.3 Back to the Fundamentals of Dynamic Programming
        6.3.1 Basic Equations
        6.3.2 Dynamic Programming Versus the Sequence Problem
    6.4 Optimal Growth in Discrete Time
    6.5 Competitive Equilibrium Growth
7 Brief Review of Optimal Control
    7.1 Finite-Horizon Optimal Control
        7.1.1 The Fundamental Problem
        7.1.2 Variational Arguments
        7.1.3 Simplified Maximum Principle
        7.1.4 Generalizations
        7.1.5 Limitations
    7.2 Infinite-Horizon Optimal Control
        7.2.1 The Basic Problem: Necessary and Sufficient Conditions
        7.2.2 Lack of Transversality Conditions
        7.2.3 Discounted Infinite-Horizon Optimal Control
8
    8.1 Preferences, Technology and Demographics
    8.2 Characterization of Equilibrium
        8.2.1 Definition of Equilibrium
        8.2.2 The Consumer Problem
        8.2.3 Equilibrium Prices
    8.3 Optimal Growth
    8.4 Steady-State Equilibrium
    8.5 Transitional Dynamics
    8.6 Technological Change and the Canonical Neoclassical Model
    8.7 The Role of Policy
    8.8 Quantitative Evaluations
        8.8.1 Policy Differences
        8.8.2 Extensions
9
    9.1 Problems of Infinity
    9.2 Overlapping Generations and Overaccumulation
        9.2.1 Demographics, Preferences and Technology
        9.2.2 Consumption Decisions
        9.2.3 Equilibrium
        9.2.4 More Specific Utility Functions
        9.2.5 Pareto Optimality
    9.3
        9.3.1 Fully Funded Social Security
        9.3.2 Unfunded Social Security
10
    10.1 The Brock-Mirman Model
    10.2 Application: Risk, Diversification and Growth
        10.2.1 The Environment
        10.2.2 Equilibrium
        10.2.3 Dynamics
        10.2.4 Efficiency
        10.2.5 Implications
        10.2.6 Inefficiency with Alternative Market Structures

III Endogenous Growth

11
    11.1 AK Model Revisited
        11.1.1 Demographics, Preferences and Technology
        11.1.2 Equilibrium
        11.1.3 Transitional Dynamics
        11.1.4 The Role of Policy
    11.2 The Extended AK Model
    11.3 Growth with Externalities
        11.3.1 Preferences and Technology
        11.3.2 Equilibrium
        11.3.3 Pareto Optimal Allocations
12 Multiple Equilibria and the Process of Development
    12.1 Multiple Equilibria From Aggregate Demand Externalities
        12.1.1 Preferences and Technology
        12.1.2 Equilibrium
    12.2 Human Capital Accumulation with Imperfect Capital Markets
        12.2.1 A Simple Case With No Borrowing
        12.2.2 The Galor and Zeira Model
    12.3 Learning-by-Doing, Structural Change and Non-Balanced Growth
        12.3.1 Demographics, Preferences and Technology
        12.3.2 Equilibrium
13 Interdependence and Growth in the Open Economy
    13.1 Human Capital and Technology (Nelson-Phelps)
    13.2 Trade and Technology Diffusion
        13.2.1 The Basic Krugman Model
        13.2.2 Understanding the Effects of Trade
    13.3 Trade, Specialization and the World Income Distribution
        13.3.1 The Model
        13.3.2 Equilibrium
        13.3.3 Implications
    13.4 Growth with Factor Price Equalization

IV

14
    14.1 The Lab-Equipment Model of Growth with Product Varieties
        14.1.1 Demographics, Preferences and Technology
        14.1.2 Digression on Continuous Time Value Functions
        14.1.3 Characterization of Equilibrium
        14.1.4 Definition of Equilibrium
        14.1.5 Steady State
        14.1.6 Transitional Dynamics
        14.1.7 Pareto Optimal Allocations
        14.1.8 Policy in the Endogenous Technology Model
    14.2 Growth with Knowledge Spillovers
        14.2.1 The Role of Competition Policy
    14.3 Growth without Scale Effects
15 Models of Quality Competition
    15.1 Baseline Model
    15.2 Pareto Optimality
16 Directed Technical Change
    16.1 Basics and Definitions
        16.1.1 Definitions
        16.1.2 Basic Model
        16.1.3 Implications
    16.2 Equilibrium Technology Bias: Some More General Results
    16.3 Endogenous Labor-Augmenting Technological Change
        16.3.1 Demographics, Preferences and Technology
        16.3.2 Consumer and Firm Decisions
        16.3.3 Asymptotic and Balanced Growth Paths
        16.3.4 The Balanced Growth Path
        16.3.5 Transitional Dynamics
        16.3.6 Policy Implications
17 Recitation Material: Appropriate Technology
    17.1 Differences in Capital-Labor Ratios (Atkinson-Stiglitz)
    17.2 The Role of Human Capital (Acemoglu-Zilibotti)
        17.2.1 A Model
        17.2.2 Implications
        17.2.3 Calibration
18 Epilogue: Political Economy of Growth
    18.1 Thinking of Institutions and Growth
        18.1.1 The Impact of Institutions
        18.1.2 Modeling Institutional Differences
        18.1.3 Institutions in Action
    18.2 A Simple Model of Non-Growth Enhancing Institutions
        18.2.1 Baseline Model
        18.2.2 Economic Equilibrium
        18.2.3 Inefficient Policies
        18.2.4 Revenue Extraction
        18.2.5 Factor Price Manipulation
        18.2.6 Revenue Extraction and Factor Price Manipulation Combined
        18.2.7 Political Consolidation
        18.2.8 Subgame Perfect Versus Markov Perfect Equilibria
        18.2.9 Lack of Commitment: Holdup
        18.2.10 Technology Adoption and Holdup
        18.2.11 Inefficient Economic Institutions
    18.3 Modeling Political Institutions
        18.3.1 Dictatorship of the Middle Class
        18.3.2 Democracy
        18.3.3 Inefficiency of Political Institutions and Inappropriate Institutions
        18.3.4 Institutional Change and Persistence

Part I Introduction

We start with a quick look at the stylized facts of economic growth and the most basic model of growth, the Solow growth model. The purpose is both to prepare us for the analysis of more modern models of economic growth with forward-looking behavior and explicit capital accumulation and technological progress, and to give us a way of mapping the simplest model to the data. I will also discuss differences between proximate and fundamental causes of economic growth and development.

1.1 A Quick Look at the Facts

There are very large differences in income per capita or output per worker across countries today. Countries at the top of the world income distribution are thirty times as rich as countries at the bottom in PPP-adjusted dollars. For example, in 2000, GDP per capita in the United States was $32500 (valued at 1995 $ prices). In contrast, income per capita is much lower in many other countries: $9000 in Mexico, $4000 in China, $2500 in India, $1000 in Nigeria, and much lower still in some other sub-Saharan African countries such as Chad, Ethiopia and Mali (all figures adjusted for purchasing power parity). The gap is larger when there is no PPP adjustment. The next figure shows a cross-sectional look at these income-level differences in the year 2000.

Should we care about cross-country income differences? The answer is a big yes. High income levels reflect high standards of living. It is true that together with economic growth, pollution increases and individual aspirations may also increase so that the same bundle of consumption may no longer make an individual as happy. But at the end of the day, when one compares an advanced, rich country with a less-developed one, there are striking differences in the quality of life, standards of living and health. In fact, it is even difficult for us to imagine the burden of poverty at the levels experienced by countries in sub-Saharan Africa. There is little doubt that the consumption level, living standards and health level of richer countries are appreciably higher than those of countries with lower income per capita. These gaps represent big welfare differences. Understanding how some countries can be so rich while some others are so poor is one of the most important, perhaps the most important, challenges facing social science.

How could a country be 30 times or so richer than another? The answer lies in differences in growth rates. Take two countries, A and B, with the same level of income to start with. Imagine that country A has 0% growth per capita, so its income per capita remains constant, while country B grows at 2% per capita. In 200 years' time country B will be more than 52 times richer than country A. Therefore, the United States is considerably richer than Nigeria because it has grown steadily over an extended period of time, while Nigeria has not. In fact, even in the historically brief postwar era, we see tremendous differences in growth rates across countries. This is shown in the next picture for the postwar era:
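The compounding arithmetic behind the two-country example can be checked with a minimal Python sketch. The function names `relative_income` and `years_to_reach` are illustrative choices for this note only, and the calculation assumes constant annual growth rates and equal starting incomes, exactly as in the thought experiment above.

```python
import math

def relative_income(growth_a, growth_b, years):
    """Income of country B relative to country A after `years`,
    assuming equal starting incomes and constant annual growth rates."""
    return ((1.0 + growth_b) / (1.0 + growth_a)) ** years

def years_to_reach(ratio, growth_a, growth_b):
    """Years needed for B to become `ratio` times as rich as A."""
    return math.log(ratio) / math.log((1.0 + growth_b) / (1.0 + growth_a))

# Country A: 0% growth per capita; country B: 2% growth per capita.
ratio_200 = relative_income(0.00, 0.02, 200)

# How long a 2-point growth gap takes to open up a 30-fold income gap.
years_30x = years_to_reach(30.0, 0.00, 0.02)

print(f"B relative to A after 200 years: {ratio_200:.1f}")
print(f"Years for a 30-fold gap at a 2-point growth difference: {years_30x:.0f}")
```

Under these assumptions a 2 percentage point growth gap takes a little over 170 years to produce a 30-fold income difference, which is why sustained growth over a long period matters more than a short burst.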

This picture shows how East Asian tigers have grown at much higher rates than the rest of the world over the past 40 years, while a number of countries in sub-Saharan Africa and Central America have experienced negative growth.

However, the substantial growth differences in the postwar era do not mean that these growth differences are responsible for the current differences in income levels. For one thing, it may be precisely the poor countries that are growing faster. For instance, Hong Kong, South Korea, Singapore and Taiwan were substantially poorer than the United States and Western Europe in 1960. For another, these growth differences may be small relative to those necessary to cause large per capita income level differences. The next question is therefore when this growth gap opened up. The answer is that much of the divergence happened during the 19th century and early 20th century. There are striking growth differences during the postwar era, but the world income distribution has been more or less stable, with a slight tendency towards becoming more unequal.

For example, despite some big growth successes and disasters, countries that were rich in 1960 are very very likely to be rich today. A regression of log income per worker in 1990 on log income per worker in 1960 gives the following relationship:

\[
\ln y_{1990} = \underset{(0.48)}{0.56} + \underset{(0.06)}{1.00}\,\ln y_{1960} \tag{1.1}
\]

If we look at output or income per worker, the overall shape of the world income distribution has been relatively stable in the postwar period. There is certainly no narrowing of income gaps. Instead, there is a small but notable increase in the dispersion of incomes. This is shown in the next figure, which depicts the standard deviation of log income per capita in the world and the ratio of the income of the five richest to the five poorest countries in the world.

Moreover, there is also a pattern of stratification, whereby some of the middle-income countries of the 1960s appear to have joined either the low-income or the high-income club. This is shown in the next figure:

The above statements refer to the unconditional distribution; that is, they refer to whether the income gap between two countries increases or decreases irrespective of these countries' characteristics. Alternatively, we can look at the conditional distribution (e.g., Barro and Sala-i-Martin, 1992). Here the picture is one of conditional convergence: in the postwar period, the income gap between countries that share the same characteristics typically closes over time (though it does so quite slowly). How do we capture conditional convergence? Consider a typical Barro growth regression:

\[
g_{t,t-1} = \beta \ln y_{t-1} + X_{t-1}'\alpha + \varepsilon_t \tag{1.2}
\]

where \(g_{t,t-1}\) is the annual growth rate between dates \(t-1\) and \(t\), \(y_{t-1}\) is per capita income at date \(t-1\) and \(X\) is a set of variables that the regression is conditioning on (in theory, the determinants of steady state income and/or growth). When no covariates are included, this regression leads to a positive or zero estimate of \(\beta\), reiterating the absence of unconditional convergence as shown in the estimation of equation (1.1) above. In fact, without covariates, this is really identical to the regression equation (1.1), since \(g_{t,t-1} \approx \ln y_t - \ln y_{t-1}\), so equation (1.2) can be written as \(\ln y_t \approx (1+\beta)\ln y_{t-1} + \varepsilon_t\), which is identical to (1.1) above. The estimate of \((1+\beta)\) in (1.1) equal to 1 implies that \(\beta \approx 0\), thus no unconditional convergence. But when \(X_{t-1}\) includes some human capital-related variables such as years of schooling or life expectancy, \(\beta\) is estimated to be approximately \(-0.02\), indicating that the income gap between countries that have the same human capital endowment has been typically narrowing over the postwar period, roughly at the rate of 2 percent a year.

If we look at a longer period, for example, from 1870 to today, the pattern is quite different, however. Here, there is divergence. The income gap between countries was much smaller during the 19th century than today. Pritchett illustrates this point using data from Angus Maddison and deriving an absolute lower bound on country incomes due to subsistence. He argues that $250 in terms of 1985 purchasing power parity is a practical lower bound below which the death rate would be extremely high. This suggests that in 1870, the U.S. was at most eight times as rich as the poorest country in the world, while it is over 30 times as rich today. Therefore there has been significant divergence over the past 130 years. This is illustrated in the next figure:
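The claim that, without covariates, the growth regression (1.2) is the levels regression (1.1) in disguise can be illustrated numerically. The sketch below is only an illustration on synthetic data: the sample size, the range of initial log incomes and the noise level are assumptions made here, and only the coefficients 0.56 and 1.00 are borrowed from equation (1.1). The helper `ols_slope` is a hypothetical name for a one-regressor least squares slope. The last lines convert a convergence coefficient of -0.02 into the number of years needed to close half of a log income gap, assuming the gap shrinks by 2 percent each year.

```python
import math
import random

def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (intercept included)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x

random.seed(0)

# Synthetic cross-section (assumed values, not real data).
log_y_1960 = [random.uniform(6.0, 10.0) for _ in range(100)]
log_y_1990 = [0.56 + 1.00 * v + random.gauss(0.0, 0.5) for v in log_y_1960]

# Levels regression, as in (1.1): slope estimates (1 + beta).
levels_slope = ols_slope(log_y_1960, log_y_1990)

# Growth regression without covariates, as in (1.2): slope estimates beta.
growth = [later - earlier for earlier, later in zip(log_y_1960, log_y_1990)]
beta_hat = ols_slope(log_y_1960, growth)

print(f"levels slope (1 + beta): {levels_slope:.4f}")
print(f"growth slope (beta):     {beta_hat:.4f}")

# Years to close half of a log income gap if it shrinks 2% per year.
beta_conditional = -0.02
half_life = math.log(0.5) / math.log(1.0 + beta_conditional)
print(f"half-life of the gap: {half_life:.1f} years")
```

The two slopes differ by exactly one up to floating point error, because subtracting the regressor from the dependent variable shifts the OLS slope by one. The half-life calculation shows why a 2 percent annual convergence rate counts as slow: under this assumption it takes more than three decades to close half of the gap.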

If we go even further back, the pattern may be one of reversal: Acemoglu, Johnson and Robinson (2002) show that in 1500, among the societies that were later to be colonized by European powers, those that were relatively prosperous are today relatively poor. How do we measure/proxy economic prosperity in 1500? It turns out that urbanization rates and population density are good proxies for prosperity during preindustrial periods (and urbanization rates are also good proxies even today). A variety of evidence shows that in 1500 the Mughals, Aztecs and Incas were much more urbanized and densely settled than the civilizations in North America, New Zealand and Australia. Today the U.S., Canada, New Zealand and Australia are orders of magnitude richer than the countries now occupying the territories of the Mughal, Aztec and Inca Empires, such as India, Ecuador or Peru. Therefore, among this set of countries there was a pattern of reversal, whereby those that were relatively prosperous in 1500 have become relatively poor today. The reversal is not confined to this set of countries, and is more widespread among the former European colonies. This is shown in the next two figures, the first using urbanization, the second population density as proxies for prosperity in 1500:

[Figure: former European colonies (labeled by country code) plotted against urbanization in 1500]

[Figure: former European colonies (labeled by country code) plotted against population density in 1500]

When did this reversal take place? Consistent with the discussion from Pritchett's paper above, the evidence suggests that the reversal among the former European colonies took place during the 19th century as well. Up to the late 18th century, previously prosperous places continued to be somewhat more prosperous. It was the age of industrialization, the 19th century, when previously less-prosperous former colonies became rapidly urbanized, industrialized and increased their GDP per capita. The next two pictures give a sense of these processes:

[Figure: Urbanization in ex-colonies with low and high urbanization in 1500 (averages weighted within each group by population in 1500), years 800 to 1920]

[Figure: Industrial Production Per Capita, UK in 1900 = 100 (from Bairoch), years 1750 to 1953; series: US, Australia, Canada, New Zealand, Brazil, Mexico, India]

1.2 Interpretation

This discussion points to the following set of facts and questions that are central to an investigation of the determinants of long-run differences in income levels and growth:

1. The major pattern to be explained is why there are such large differences in income per capita and worker productivity across countries. This immediately takes us to questions of why some countries grow (or have grown) while other countries have failed to grow and stagnated.

2. The relative stability of the postwar income distribution has suggested to many economists that we should look for differences across countries leading to very large permanent differences in income, but not necessarily large permanent differences in growth rates in recent decades. This is based on the following reasoning: with substantially different long-run growth rates (as in models of endogenous growth, where countries that invest at different rates grow at different rates), we should expect significant divergence. We saw above that despite some widening between the top and the bottom, the cross-country distribution of income across the world is relatively stable. So this reasoning might have some merit. Furthermore, economists with this view argue that the finding of conditional convergence suggests the presence of transitional dynamics taking countries towards their steady state values as in the basic Solow and neoclassical models.

3. Nevertheless, we have seen that there is still some notable (though perhaps not so large) divergence in the world income distribution. Clearly, countries have not settled into a stationary world income distribution. It is important to understand why even in this age of free flow of technology some countries are growing faster than others. Equally puzzling is how the very large income differences we observe today can persist in this age of free flow of technology, trade and financial integration.

4. Moreover, the divergence from the 19th century to today suggests that we might want to look for a set of theories where the large differences in income per capita, at least to some extent, reflect technological or institutional changes that took place during the 19th and early 20th centuries. For example, some countries may have taken advantage of industrialization opportunities, while other societies have failed to do so, or may have only started adopting technologies very late. We therefore need theories which can shed light on why certain societies may fail to take advantage of better technologies.

5. The reversal (among the former European colonies) suggests that theories that emphasize differences in (economic and perhaps political) institutions or social organization or more generally man-made factors as key determinants of economic performance may be more promising than theories emphasizing fixed environmental factors such as geography or climate. (With such environmental factors as the main determinants of income differences, we should expect countries that were relatively rich 200 or 500 years ago to be also relatively rich today, i.e., persistence, not a reversal.) More ambitiously, we may want to investigate whether and why certain characteristics that make countries richer at some point contribute to their relative poverty during other episodes. Alternatively, we may want to see what type of shocks could cause a reversal in the relative incomes of countries over long periods.

1.3 The Agenda

In the rest of the class, we will look at models that can help us understand the mechanics of economic growth. This means understanding a variety of models that underpin the way economists think about the process of capital accumulation, technological progress, and productivity growth. Only by understanding these mechanics can we have a framework for thinking about the causes of why some countries are growing and some others are not, and why some countries are rich and some others are not. Therefore, the approach will be two-pronged: on the one hand, we want to understand the mathematical structure of these models as well as possible; on the other, we want to understand what these models and others have to say about which key parameters or key economic processes are different across countries and why.

2.1 The Basic Model in Discrete Time

2.1.1 The Production Structure

We start with the simplest growth model, sometimes referred to as the Solow-Swan model after two economists who developed versions of it, or simply as the Solow growth model after our own Bob Solow, who was awarded the Nobel prize for his contributions to growth theory. This is a closed economy, with a unique final good. The economy is in discrete time running to infinite horizon, so that time is indexed by t = 0, 1, 2, .... Time periods here can correspond to days, weeks, or years. So far we do not need to take a position on this.

The economy is inhabited by a large number of households, and for now we are going to make relatively few assumptions on the households because in this baseline model, they will not be optimizing. To fix ideas, you may want to assume that all households are identical, so that the economy admits a representative consumer. We return to what this assumption of the representative consumer involves below. As an aside, you should know from basic general equilibrium theory that most economies do not admit a representative consumer. In fact, the celebrated Debreu-Mantel-Sonnenschein theorem states that we can say relatively little about the preferences of a consumer obtained by aggregating a number of well-behaved neoclassical consumers. But much of macroeconomics (unfortunately) ignores this basic theorem, and works with representative consumers. In many situations this can be justified on the basis of parsimony. Here I will adopt the same defense and for much of this course I will limit myself to models with representative consumers. Heterogeneity of preferences, abilities and income are in fact quite important to understand the process of economic growth, but many of these topics are beyond the scope of this class.

The key assumption of the Solow model will be that each household saves an exogenous fraction s of their income. Much of the neoclassical growth theory is about understanding exactly how much individuals save and how capital accumulates. In the basic model this is taken as exogenous.

The other key agents in the economy are firms. Let us assume that the economy also admits an aggregate production function for the unique final good

\[
Y(t) = F[K(t), L(t), A(t)] \tag{2.1}
\]

where $Y(t)$ is the total amount of production of the final good, $K(t)$ is the capital stock, $L(t)$ is total employment and $A(t)$ is technology. The capital stock here denotes the quantity of machines used in production. Both the capital stock and technology are taken to be single indices, and at some level, they are treated as black boxes; we will later discuss how such models can be extended to think of multiple types of technologies and capital goods. For now, the important assumption is that technology is free: it is publicly available as a non-excludable, non-rival good. Thus the firm does not have to pay for it.

As an aside, you might want to note that some authors use $x_t$ or $K_t$ when working with discrete time and reserve the notation $x(t)$ or $K(t)$ for continuous time. Since I will go back and forth between continuous time and discrete time, I use the latter notation throughout, except when discussing dynamic programming, where the subscripts are the usual notation. Throughout, I will drop time dependence when this causes no confusion, but include it when there is any chance of such confusion.

The production function $F : \mathbb{R}^3_+ \to \mathbb{R}_+$ is, for simplicity, assumed to be twice continuously differentiable and increasing in all of its arguments, and to be strictly concave in each of $K$ and $L$. In particular, we have:

Assumption 1 (Continuity, Differentiability, Positive Marginal Products, Concavity and Constant Returns to Scale) $F$ is twice continuously differentiable in $K$ and $L$, and satisfies

$$F_K(K, L, A) \equiv \frac{\partial F(K, L, A)}{\partial K} > 0, \qquad F_L(K, L, A) \equiv \frac{\partial F(K, L, A)}{\partial L} > 0,$$

$$F_{KK}(K, L, A) \equiv \frac{\partial^2 F(K, L, A)}{\partial K^2} < 0, \qquad F_{LL}(K, L, A) \equiv \frac{\partial^2 F(K, L, A)}{\partial L^2} < 0.$$

Moreover, $F$ exhibits constant returns to scale in $K$ and $L$.

All of the components of Assumption 1 are important. It specifies that marginal products are positive (thus ruling out some production functions), but more importantly that there are diminishing returns both to capital and labor, i.e., $F_{KK} < 0$ and $F_{LL} < 0$. We will see below that the degree of diminishing returns to capital will play a very important role in many of the results of the basic growth model. The other important assumption is that of constant returns to scale. Recall that $F$ exhibits constant returns to scale in $K$ and $L$ if it is linearly homogeneous (homogeneous of degree 1) in these two variables. More specifically:

Definition 1 Let $z \in \mathbb{R}^K$ for some $K \geq 1$. The function $g(x, y, z)$ is homogeneous of degree $m$ in $x \in \mathbb{R}$ and $y \in \mathbb{R}$ if and only if

$$g(\lambda x, \lambda y, z) = \lambda^m g(x, y, z) \quad \text{for all } \lambda \in \mathbb{R}_+ \text{ and } z \in \mathbb{R}^K.$$

Linearly homogeneous (constant returns to scale) production functions are particularly useful because of the following theorem:

Theorem 1 (Euler's theorem) Suppose that $g : \mathbb{R}^{K+2} \to \mathbb{R}$ is continuously differentiable in $x \in \mathbb{R}$ and $y \in \mathbb{R}$, with partial derivatives denoted by $g_x$ and $g_y$, and is homogeneous of degree $m$ in $x$ and $y$. Then

$$m g(x, y, z) = g_x(x, y, z)\, x + g_y(x, y, z)\, y \quad \text{for all } x \in \mathbb{R},\ y \in \mathbb{R} \text{ and } z \in \mathbb{R}^K.$$

Moreover, $g_x(x, y, z)$ and $g_y(x, y, z)$ are themselves homogeneous of degree $m - 1$ in $x$ and $y$.

Proof. We have that $g$ is continuously differentiable and

$$g(\lambda x, \lambda y, z) = \lambda^m g(x, y, z). \tag{2.2}$$

Differentiate both sides of equation (2.2) with respect to $\lambda$, which gives

$$m \lambda^{m-1} g(x, y, z) = g_x(\lambda x, \lambda y, z)\, x + g_y(\lambda x, \lambda y, z)\, y$$

for any $\lambda$. Setting $\lambda = 1$ yields the first result. To obtain the second result, differentiate both sides of equation (2.2) with respect to $x$:

$$\lambda g_x(\lambda x, \lambda y, z) = \lambda^m g_x(x, y, z).$$

Dividing both sides by $\lambda$ establishes the desired result.

Throughout this course we are going to assume that all factor markets are competitive. Until we come to models of endogenous technological change, we will further assume that product markets are also competitive, so ours will be a prototypical competitive general equilibrium model. Moreover, as noted above, we will work with aggregate production functions as a representation of the underlying production structure of the economy. This would be the case, for example, when the economy consists of a large number of firms all having access to the same constant returns to scale production function, for example $F$ above. In that case, there is no difference between assuming an aggregate production function and working with a large number of firms competing for factors of production. Notice, however, that the assumption of an aggregate production function could be quite restrictive. In particular, it rules out heterogeneity of productivity among firms, and it also creates problems when there are non-constant returns to scale (can you see what would go wrong with decreasing returns to scale?).
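The two claims of Theorem 1 can be checked numerically for a concrete constant returns to scale technology. The sketch below is only an illustration: the CES functional form, its parameter values and the helper names `ces` and `partial` are assumptions made here, not part of the notes, and derivatives are approximated by finite differences.

```python
def ces(K, L, A=1.0, alpha=0.4, rho=0.5):
    """Hypothetical CES production function, homogeneous of degree 1 in (K, L)."""
    return A * (alpha * K**rho + (1 - alpha) * L**rho) ** (1 / rho)


def partial(fn, K, L, wrt, h=1e-6):
    """Central finite-difference approximation of dF/dK (wrt='K') or dF/dL."""
    if wrt == "K":
        return (fn(K + h, L) - fn(K - h, L)) / (2 * h)
    return (fn(K, L + h) - fn(K, L - h)) / (2 * h)


K, L, lam = 3.0, 2.0, 1.7

# Homogeneity of degree m = 1: F(lam*K, lam*L) - lam*F(K, L) should be ~0.
scaled_gap = ces(lam * K, lam * L) - lam * ces(K, L)

# Euler's theorem with m = 1: F - (F_K*K + F_L*L) should be ~0.
F_K = partial(ces, K, L, "K")
F_L = partial(ces, K, L, "L")
euler_gap = ces(K, L) - (F_K * K + F_L * L)

# Marginal products are homogeneous of degree m - 1 = 0,
# so scaling both inputs leaves F_K unchanged.
degree_zero_gap = partial(ces, lam * K, lam * L, "K") - F_K
```

All three gaps are zero up to floating-point and finite-difference error; replacing `ces` with a function that is not linearly homogeneous should make them visibly nonzero.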

2.1.2 Endowments

Let us imagine that all factors of production are owned by households. In particular, households own all of the labor, which they supply inelastically. If there is population growth, this can be thought of as existing households becoming larger, or new households being born. For our purposes here this does not matter. The households also own the capital stock of the economy, and we take their initial holdings of capital, $K(0)$, as given (as part of the description of the environment), and this will determine the initial condition of the dynamical system we will be analyzing. For now, how this initial capital stock is distributed among the households is not important. The more important point is that the households will rent their capital to firms. Let the rental price of capital be denoted by $R(t)$ and the rental price of labor by $w(t)$. Then in competitive markets a representative firm solves a profit maximization problem.

Another important set of issues involves how to think of capital. There are many different ways of conceptualizing capital, and some of them are beyond the scope of this course. Loosely speaking, we want to think of capital as corresponding to machines. But for now let us make the rather heroic assumption that capital is essentially the same as the final good. So the economy consists of corn, and it can use some amount of this corn as input into producing further corn. Then $K(0)$ is the amount of corn that individual households have at the beginning of period $t = 0$, which they can eat or rent to firms to enable them to produce further corn. [...These types of models are sometimes referred to as putty-putty, since capital is totally malleable both before and after it is designated as capital. Alternatives include putty-clay models where corn can be used as capital, but once it is in place, it becomes fixed and it cannot be turned back into consumption goods, and certain features of it, for example, at which capital-labor ratio it can be used, cannot be changed...]

Given this structure, there is a natural choice of numeraire in this economy, which is to normalize the price of the final good in each period to 1. Recall that we always have to choose a numeraire, but here we are making a normalization in each period. But this is without loss of any generality, because the interest rate between periods will play the role of relative prices. This discussion should already alert you to a central fact: you should think of all of the models we are going to be talking about as general equilibrium economies, where different commodities correspond to the same good at different dates. Recall from basic general equilibrium theory that the same good at different dates (or in different states or in different localities) is a different commodity. Therefore, in almost all of the models that we will study in this course, there will be an infinite number of commodities (because time runs to infinity). This raises a number of special issues in the theory of general equilibrium which we will touch on as we go along.

Now returning to our treatment of the basic model, the next important assumption is that capital depreciates. We assume that this depreciation takes an exponential form. This means that capital depreciates (exponentially) at the rate $\delta$, so that out of 1 unit of capital this period, only $1 - \delta$ is left for next period. This depreciation in general stands for the wear and tear of the machinery, as well as, in more realistic models, the replacement of old machines by new machines. For now it is treated as a black box. The importance of this for a household is that, combined with the normalization of the price of the final good to 1, it implies that the (gross) rate of return faced by the household will be $r(t) = R(t) + 1 - \delta$. Recall that every unit of capital can be eaten now or rented to firms. In the latter case, the household will receive $R(t)$ units of good as the rental price, but will get back only $1 - \delta$ units of the capital, since the rest has depreciated. This implies that the individual has given up one unit of commodity dated $t - 1$ for $r(t)$ units of commodity dated $t$.

Now let us consider the problem of a representative firm. This firm will maximize profits, which implies

$$\max_{L(t),\, K(t)} \; F[K(t), L(t), A(t)] - w(t) L(t) - R(t) K(t). \tag{2.3}$$

A few remarks about this problem:

1. I set up the problem in terms of aggregate variables. This is without loss of any generality given the representative firm (or the existence of an aggregate production function).
2. There is nothing multiplying the $F$ term, since the price of the final good has been normalized to 1.
3. This way of writing the problem already imposes competitive factor markets, since the firm is taking the prices of labor and capital, $w(t)$ and $R(t)$, as given.
4. This is a concave problem, since $F$ is concave (though not necessarily strictly so).

The first-order necessary conditions of the firm's problem (combined with differentiability of $F$) imply that the competitive factor returns are equal to their marginal products:

$$w(t) = F_L[K(t), L(t), A(t)], \tag{2.4}$$

and

$$R(t) = F_K[K(t), L(t), A(t)]. \tag{2.5}$$

An immediate corollary of Theorem 1 combined with competitive factor markets is:

Proposition 1 In equilibrium, firms make no profits, and in particular,

$$Y(t) = w(t) L(t) + R(t) K(t).$$

Proof. This follows immediately from Theorem 1 for the case of $m = 1$, i.e., constant returns to scale.

This result is convenient, since it implies that firms make no profits, so, in contrast to the basic general equilibrium theory, the ownership of firms does not need to be specified. All we need to know is that firms are profit-maximizing entities.
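As a quick illustration of Proposition 1, the sketch below computes factor prices from (2.4) and (2.5) and confirms that factor payments exhaust output. The Cobb-Douglas form $F = A K^{\alpha} L^{1-\alpha}$, the value $\alpha = 1/3$ and the function names are assumptions made for this example only.

```python
def cobb_douglas(K, L, A=1.0, alpha=1/3):
    """Hypothetical Cobb-Douglas technology with constant returns to scale."""
    return A * K**alpha * L ** (1 - alpha)


def factor_prices(K, L, A=1.0, alpha=1/3):
    """Competitive factor prices: w = F_L and R = F_K, as in (2.4) and (2.5)."""
    Y = cobb_douglas(K, L, A, alpha)
    w = (1 - alpha) * Y / L   # marginal product of labor
    R = alpha * Y / K         # marginal product of capital
    return w, R


K, L = 8.0, 27.0
Y = cobb_douglas(K, L)
w, R = factor_prices(K, L)

profits = Y - w * L - R * K       # zero under constant returns to scale
capital_share = R * K / Y         # equals alpha for Cobb-Douglas
```

With this particular functional form the capital share of income is constant and equal to $\alpha$, which is a feature of the Cobb-Douglas assumption rather than of Proposition 1 itself.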

In addition to these standard assumptions on the production function, in growth theory we often impose the following additional boundary conditions, referred to as Inada conditions.

Assumption 2 (Inada conditions) $F$ satisfies the Inada conditions

$$\lim_{K \to 0} F_K(K, L, A) = \infty \quad \text{and} \quad \lim_{K \to \infty} F_K(K, L, A) = 0 \quad \text{for all } L > 0 \text{ and all } A,$$

$$\lim_{L \to 0} F_L(K, L, A) = \infty \quad \text{and} \quad \lim_{L \to \infty} F_L(K, L, A) = 0 \quad \text{for all } K > 0 \text{ and all } A.$$

2.1.3 Fundamental Law of Motion of the Solow Model

Finally, we can write the law of motion of the capital stock of the economy. Recall that $K$ depreciates exponentially at the rate $\delta$, so that the law of motion of the capital stock is given by

$$K(t+1) = (1 - \delta) K(t) + I(t), \tag{2.6}$$

where $I(t)$ is investment at time $t$. From national income accounting for a closed economy, we have

$$Y(t) = C(t) + I(t) + G(t), \tag{2.7}$$

where $C(t)$ is consumption and $G(t)$ is government spending. For now, we take $G(t) = 0$, so that national income is divided between consumption and investment. Therefore, using (2.1), (2.6) and (2.7), feasible dynamic allocations in this economy would have to satisfy

$$K(t+1) \leq F[K(t), L(t), A(t)] + (1 - \delta) K(t) - C(t).$$

The question is to determine the equilibrium dynamic allocation among the set of feasible dynamic allocations. Here the behavioral rule of the constant savings rate simplifies the structure of equilibrium considerably. It is important that the constant savings rate is a behavioral rule; it is not derived from a well-defined utility function. This means that any welfare comparisons based on the Solow model have to be taken with a grain of salt. We have no idea what the utility function of the individuals is.

First note that given $G(t) = 0$ (and the closed economy assumption), aggregate investment is equal to savings,

$$S(t) = I(t) = Y(t) - C(t).$$

Now recall that individuals are assumed to save a constant fraction $s$ of their income, i.e.,

$$S(t) = s Y(t), \tag{2.8}$$

so that they consume the remaining $1 - s$ fraction of their income:

$$C(t) = (1 - s) Y(t). \tag{2.9}$$

Thus combining (2.1), (2.6) and (2.8), we have the key dynamic (difference) equation of the Solow growth model:

$$K(t+1) = s F[K(t), L(t), A(t)] + (1 - \delta) K(t). \tag{2.10}$$

In the Solow growth model, the equilibrium is essentially described by this equation together with laws of motion for $L(t)$ and $A(t)$.

2.1.4 Definition of Equilibrium

The Solow model is a mixture of an old-style Keynesian model and a modern dynamic macroeconomic model. Households do not optimize when it comes to their savings/consumption decisions. Instead, their behavior is captured by a behavioral rule. But firms maximize and factor markets clear. Thus it is useful to start defining equilibria in the way that is customary in modern dynamic macro models.

Definition 2 In the basic Solow model, for a given sequence of $\{L(t), A(t)\}_{t=0}^{\infty}$ and an initial capital stock $K(0)$, an equilibrium path is a sequence of capital stocks, output levels, consumption levels, wages and rental rates $\{K(t), Y(t), C(t), w(t), R(t)\}_{t=0}^{\infty}$ such that $K(t)$ satisfies (2.10), $Y(t)$ is given by (2.1), $C(t)$ is given by (2.9), and $w(t)$ and $R(t)$ are given by (2.4) and (2.5).
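Definition 2 translates directly into a short recursion. The following is a minimal sketch, assuming a Cobb-Douglas production function $Y = A K^{\alpha} L^{1-\alpha}$ and illustrative parameter values; the function name `equilibrium_path` and its arguments are hypothetical and chosen only for this example.

```python
def equilibrium_path(K0, L, A, s=0.2, delta=0.1, alpha=1/3):
    """Compute {K, Y, C, w, R} for given sequences L and A (equal-length lists).

    Assumes Y = A * K**alpha * L**(1 - alpha); this functional form is an
    assumption of the example, not something imposed by Definition 2.
    """
    path = {"K": [], "Y": [], "C": [], "w": [], "R": []}
    K = K0
    for L_t, A_t in zip(L, A):
        Y = A_t * K**alpha * L_t ** (1 - alpha)   # production function (2.1)
        path["K"].append(K)
        path["Y"].append(Y)
        path["C"].append((1 - s) * Y)             # consumption rule (2.9)
        path["w"].append((1 - alpha) * Y / L_t)   # wage = F_L, (2.4)
        path["R"].append(alpha * Y / K)           # rental rate = F_K, (2.5)
        K = s * Y + (1 - delta) * K               # law of motion (2.10)
    return path


T = 50
path = equilibrium_path(K0=1.0, L=[1.0] * T, A=[1.0] * T)
```

Passing non-constant lists for `L` and `A` gives the path under population growth or technological change without changing the recursion.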

2.1.5 Equilibrium Without Population Growth and Technological Progress

We can make more progress by exploiting the constant returns to scale nature of the production function. To do this, let us make some further assumptions:

1. Let us assume that population is constant and individuals supply labor inelastically, so that $L(t) = L$.
2. Let us also assume that there is no technological progress, so that $A(t) = A$.

We will relax these assumptions later. For now, let us define the capital-labor ratio of the economy as

$$k(t) \equiv \frac{K(t)}{L}.$$

Then using the constant returns to scale assumption we have that income per capita, $y(t) \equiv Y(t)/L$, is given by

$$y(t) = F\left[\frac{K(t)}{L}, 1, A\right] \equiv f(k(t)). \tag{2.11}$$

In other words, with constant returns to scale, income per capita is simply a function of the capital-labor ratio. Given Theorem 1, we also have

$$R(t) = f'(k(t)) > 0 \quad \text{and} \quad w(t) = f(k(t)) - k(t) f'(k(t)) > 0. \tag{2.12}$$

The fact that both of these factor prices are positive follows from Assumption 1, which imposed that the first derivatives of $F$ with respect to capital and labor are always positive (with more general production functions, zero factor prices are possible over certain ranges). Given this, we can divide both sides of (2.10) by $L$ and obtain a simpler difference equation

$$k(t+1) = s f(k(t)) + (1 - \delta) k(t). \tag{2.13}$$

Since this difference equation is derived from (2.10), it also can be referred to as the equilibrium difference equation of the Solow model, in that it describes the equilibrium behavior of the key object of the model, the capital-labor ratio, and the other equilibrium quantities can be obtained from the capital-labor ratio $k(t)$. At this point, we can also define a steady-state equilibrium for this model without technological progress and population growth.

Definition 3 A steady-state equilibrium without technological progress and population growth is an equilibrium path in which $k(t) = k^*$ for all $t$.

In other words, in the steady-state equilibrium the capital-labor ratio remains constant. Most of the models we will analyze in this course will admit a steady state equilibrium, and typically the economy will tend to this steady state equilibrium over time (but often never reach it in finite time). This is also the case for this simple model.

This can be seen by plotting the difference equation which governs the equilibrium behavior of this economy, (2.13). The intersection of the right hand side with the 45° line gives the steady-state value of the capital-labor ratio $k^*$, which satisfies

$$\frac{f(k^*)}{k^*} = \frac{\delta}{s}. \tag{2.14}$$

An alternative visual representation of the steady state is to view it as the intersection between a ray through the origin with slope $\delta$ (representing the function $\delta k$) and the function $s f(k)$. The next figure shows this picture, which is also useful in seeing the level of consumption and investment in a single figure.

This establishes:

Proposition 2 Consider the basic Solow growth model and suppose that Assumptions 1 and 2 hold. Then there exists a unique steady state where the capital-labor ratio is equal to $k^* \in (0, \infty)$ and is given by (2.14), per capita output is given by

$$y^* = f(k^*) \tag{2.15}$$

and per capita consumption is given by

$$c^* = (1 - s) f(k^*). \tag{2.16}$$

Proof. The preceding argument establishes that (2.14) is a steady state, i.e., a zero of the difference equation (2.13). To establish existence, note that from Assumption 2, $\lim_{k \to 0} f(k)/k = \infty$ and $\lim_{k \to \infty} f(k)/k = 0$. Moreover, $f(k)/k$ is continuous from Assumption 1, so there exists $k^*$ such that (2.14) is satisfied. To see uniqueness, differentiate $f(k)/k$ with respect to $k$, which gives

$$\frac{\partial [f(k)/k]}{\partial k} = \frac{f'(k) k - f(k)}{k^2} = -\frac{w}{k^2} < 0, \tag{2.17}$$

where the last equality uses (2.12). Since $f(k)/k$ is everywhere decreasing, there can only exist a unique value $k^*$ that satisfies (2.14). Equations (2.15) and (2.16) then follow by definition.

So far the model is very parsimonious, and does not have many parameters. But what we are most interested in is to understand how cross-country differences in certain parameters translate into differences in growth rates or income levels. This will be done in the next proposition. But before doing so, let us generalize the production function in one simple way, and assume that

$$f(k) = a \tilde{f}(k),$$

so that $a$ is a shift parameter, with greater values corresponding to greater productivity of factors. This type of productivity is referred to as Hicks-neutral as we will see below, but for now it is just a convenient way of looking at the impact of productivity differences across countries. Since $f(k)$ satisfies the regularity conditions imposed above, so does $\tilde{f}(k)$.

Proposition 3 Suppose Assumptions 1 and 2 hold and $f(k) = a \tilde{f}(k)$. Denote the steady-state level of the capital-labor ratio by $k^*(a, s, \delta)$ and the steady-state level of output by $y^*(a, s, \delta)$ when the underlying parameters are given by $a$, $s$ and $\delta$. Then we have

$$\frac{\partial k^*(a, s, \delta)}{\partial a} > 0, \quad \frac{\partial k^*(a, s, \delta)}{\partial s} > 0 \quad \text{and} \quad \frac{\partial k^*(a, s, \delta)}{\partial \delta} < 0,$$

$$\frac{\partial y^*(a, s, \delta)}{\partial a} > 0, \quad \frac{\partial y^*(a, s, \delta)}{\partial s} > 0 \quad \text{and} \quad \frac{\partial y^*(a, s, \delta)}{\partial \delta} < 0.$$

Proof. The proof follows immediately by writing

$$\frac{\tilde{f}(k^*)}{k^*} = \frac{\delta}{a s},$$

which holds for an open set of values of $k^*$. Now apply the implicit function theorem to obtain the results. For example,

$$\frac{\partial k^*}{\partial s} = \frac{\delta (k^*)^2}{s^2 w^*} > 0,$$

where $w^* = f(k^*) - k^* f'(k^*) > 0$. The other results follow similarly.

Therefore, countries with higher savings rates and better technologies will have higher capital-labor ratios and will be richer. Those with greater (technological) depreciation will tend to have lower capital-labor ratios and will be poorer. All of the results in Proposition 3 are intuitive, and start giving us a sense of some important determinants of the capital-labor ratios and income levels across countries.

The same comparative statics with respect to $a$ and $\delta$ immediately apply to $c^*$ as well. However, it is straightforward to see that $c^*$ will not be monotonic in the savings rate (think, for example, of the case where $s = 1$!). To obtain the steady state relationship between $c^*$ and $s$, let us suppress the other parameters and write

$$c^*(s) = (1 - s) f(k^*(s)) = f(k^*(s)) - \delta k^*(s),$$

where the second equality uses the steady-state condition $s f(k^*) = \delta k^*$ from (2.14). Now differentiating this expression with respect to $s$ (again using the implicit function theorem), we have

$$\frac{\partial c^*(s)}{\partial s} = \left[ f'(k^*(s)) - \delta \right] \frac{\partial k^*}{\partial s}.$$

Since from Proposition 3 we have $\partial k^*/\partial s > 0$, consumption can only be maximized when $f'(k^*(s)) = \delta$. Let $s_{gold}$ denote the savings rate at which this condition holds. Moreover, when $f'(k^*(s)) = \delta$, it can be verified that $\partial^2 c^*(s)/\partial s^2 < 0$, so $f'(k^*(s)) = \delta$ is indeed a local maximum. That $f'(k^*(s)) = \delta$ is also the global maximum follows from the following observations: for all $s \in [0, 1]$, we have $\partial k^*/\partial s > 0$ and moreover, when $s < s_{gold}$, $f'(k^*(s)) - \delta > 0$ by the concavity of $f$, so $\partial c^*(s)/\partial s > 0$ for all $s < s_{gold}$, and by the converse argument, $\partial c^*(s)/\partial s < 0$ for all $s > s_{gold}$. Therefore, only $s_{gold}$ satisfies $f'(k^*(s)) = \delta$ and gives the unique global maximum of consumption per capita.

The relationship between consumption and the savings rate takes the form plotted in the next figure. Consequently, we have established:

Proposition 4 In the basic Solow growth model, the highest level of steady-state consumption is reached for $s_{gold}$, with the corresponding steady state capital level $k^*_{gold}$ such that

$$f'(k^*_{gold}) = \delta.$$

In other words, there exists a unique savings rate and the corresponding capital-labor ratio which will maximize steady-state consumption. This is shown in the next figure with the consumption-maximizing savings rate denoted by $s_{gold}$ and the corresponding consumption per capita by $c^*_{gold}$:

Below this savings rate, the society has too low a capital-labor ratio to maximize consumption, and above this rate, the capital-labor ratio is too high, i.e., individuals are investing too much and not consuming enough. This is the essence of what people refer to as dynamic inefficiency, which we will encounter in greater detail in models of overlapping generations. However, recall that there is no explicit utility function here, so statements about inefficiency have to be considered with caution and skepticism. In fact, the reason why such dynamic inefficiency will not arise once we endogenize consumption-saving decisions of individuals will be apparent to many of you already.
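The golden rule can be located numerically by evaluating steady-state consumption on a grid of savings rates. The sketch below assumes $f(k) = a k^{\alpha}$, for which (2.14) has the closed form $k^* = (s a/\delta)^{1/(1-\alpha)}$; the functional form, the parameter values and the function names are assumptions made for this illustration.

```python
ALPHA, DELTA, A_SHIFT = 0.3, 0.1, 1.0   # illustrative parameter values


def steady_state_k(s):
    """k* solving f(k)/k = delta/s when f(k) = a * k**alpha (assumed form)."""
    return (s * A_SHIFT / DELTA) ** (1 / (1 - ALPHA))


def steady_state_c(s):
    """Steady-state consumption c*(s) = (1 - s) * f(k*(s))."""
    return (1 - s) * A_SHIFT * steady_state_k(s) ** ALPHA


# Search over savings rates 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]
s_gold = max(grid, key=steady_state_c)
k_gold = steady_state_k(s_gold)

# Marginal product of capital at the golden-rule capital-labor ratio.
mpk_gold = ALPHA * A_SHIFT * k_gold ** (ALPHA - 1)
```

Under this Cobb-Douglas assumption the grid search returns a savings rate equal to $\alpha$, and the marginal product of capital at $k^*_{gold}$ equals $\delta$, in line with Proposition 4. Steady-state consumption rises in $s$ below this point and falls above it.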

2.1.6 Transitional Dynamics in the Solow Model

Proposition 2 establishes a unique steady state equilibrium. Recall, however, that an equilibrium path does not refer simply to the steady state but to the entire path of capital stock, output, consumption and factor prices. To determine what this equilibrium path looks like we need to study the transitional dynamics of the equilibrium difference equation (2.13) starting from an arbitrary capital-labor ratio, $k(0)$. Of special interest is the answer to the question of whether the economy will tend to this steady state starting from such an arbitrary capital-labor ratio, and how it will behave along the transition path. It is important to consider an arbitrary capital-labor ratio, since, as noted above, the total amount of capital at the beginning of the economy, $K(0)$, is taken as a state variable, while for now, the supply of labor $L$ is fixed. Therefore, at time $t = 0$, the economy starts with $k(0) = K(0)/L$ as its initial value and then follows the law of motion given by the difference equation (2.13). Thus the question is whether the difference equation (2.13) will take us to the unique steady state.

Before doing this, recall some definitions and key results from the theory of dynamical systems. Consider the nonlinear system of autonomous difference equations,

$$x(t+1) = F(x(t)), \tag{2.18}$$

where $x(t) \in \mathbb{R}^n$ and $F : \mathbb{R}^n \to \mathbb{R}^n$. Let $x^*$ be a zero (equilibrium) of this system, which means a fixed point of the mapping $F(\cdot)$, i.e., $x^* = F(x^*)$.

Definition 4 An equilibrium point $x^*$ is (locally) asymptotically stable if there exists an open set $B(x^*) \ni x^*$ such that for any solution $\{x(t)\}_{t=0}^{\infty}$ to (2.18) with $x(0) \in B(x^*)$, we have $x(t) \to x^*$. Moreover, $x^*$ is globally asymptotically stable if for all $x(0) \in \mathbb{R}^n$, for any solution $\{x(t)\}_{t=0}^{\infty}$, we have $x(t) \to x^*$.

Theorem 2 Consider the following linear difference equation system

$$x(t+1) = A x(t) \tag{2.19}$$

with initial value $x(0)$, where $x(t) \in \mathbb{R}^n$ for all $t$ and $A$ is an $n \times n$ matrix. Suppose that all of the eigenvalues of $A$ are strictly inside the unit circle (i.e., the modulus of each eigenvalue is strictly less than 1). Then the difference equation (2.19) is globally asymptotically stable, in the sense that starting from any $x(0) \in \mathbb{R}^n$, the unique solution $\{x(t)\}_{t=0}^{\infty}$ satisfies $x(t) \to x^*$, where $x^*$ is the steady state (zero) of the difference equation given by $A x^* = x^*$.

The proof of this theorem can be found in any textbook on dynamical systems, for example, David Luenberger, Introduction to Dynamic Systems: Theory, Models and Applications, John Wiley & Sons, 1979, and a version of it for differential equations is in Carl Simon and Lawrence Blume, Mathematics for Economists, Norton, 1994.

Next let us return to the nonlinear autonomous system (2.18). Unfortunately, much less can be said about nonlinear systems, but the following is a standard local stability result.

Theorem 3 Consider the following nonlinear autonomous system

$$x(t+1) = F[x(t)] \tag{2.20}$$

where $F : \mathbb{R}^n \to \mathbb{R}^n$ and suppose that $F$ is continuously differentiable, with initial value $x(0)$. Let $x^*$ be a zero of this system, i.e., $F(x^*) = x^*$. Define

$$A \equiv \nabla F(x^*),$$

the Jacobian of $F$ evaluated at $x^*$, and suppose that all of the eigenvalues of $A$ are strictly inside the unit circle. Then the difference equation (2.20) is locally asymptotically stable, in the sense that there exists an open neighborhood of $x^*$, $B(x^*) \subset \mathbb{R}^n$, such that starting from any $x(0) \in B(x^*)$, we have $x(t) \to x^*$.
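The eigenvalue condition in Theorems 2 and 3 is about the modulus of each eigenvalue, which matters when eigenvalues are complex. The sketch below illustrates this for two hypothetical $2 \times 2$ matrices whose eigenvalues have the same real part (0.5) but different moduli; the matrices and the helper names are chosen here purely for illustration.

```python
import cmath
import math


def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


def iterate(A, x0, periods):
    """Apply x(t+1) = A x(t) for the given number of periods."""
    x = list(x0)
    for _ in range(periods):
        x = [A[0][0] * x[0] + A[0][1] * x[1],
             A[1][0] * x[0] + A[1][1] * x[1]]
    return x


stable = [[0.5, -0.6], [0.6, 0.5]]     # eigenvalues 0.5 +/- 0.6i, modulus < 1
unstable = [[0.5, -1.0], [1.0, 0.5]]   # eigenvalues 0.5 +/- 1.0i, modulus > 1

stable_moduli = [abs(lam) for lam in eigenvalues_2x2(stable)]
unstable_moduli = [abs(lam) for lam in eigenvalues_2x2(unstable)]

stable_norm = math.hypot(*iterate(stable, [1.0, 1.0], 200))
unstable_norm = math.hypot(*iterate(unstable, [1.0, 1.0], 200))
```

The first system spirals into the origin while the second spirals outward, even though the real parts of the eigenvalues are identical in the two cases.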

Therefore, for nonlinear systems, we can have local stability results. An immediate corollary of these results is:

Corollary 1 Let $x(t) \in \mathbb{R}$, then the linear difference equation $x(t+1) = a x(t) + b$ is asymptotically stable (in the sense that $x(t) \to x^* = b/(1 - a)$) if $|a| < 1$. Moreover, let $g : \mathbb{R} \to \mathbb{R}$ be a continuous function, differentiable at $x^*$ where $g(x^*) = x^*$. Then, the nonlinear difference equation $x(t+1) = g(x(t))$ is locally asymptotically stable if $|g'(x^*)| < 1$.

Now let us apply this result to (2.13):

Proposition 5 Suppose that Assumptions 1 and 2 hold, then the equilibrium of the Solow growth model described by the difference equation (2.13) is asymptotically stable, and starting from any $k(0) > 0$, $k(t) \to k^*$.

Proof. From (2.13), we have

$$k(t+1) = s f(k(t)) + (1 - \delta) k(t), \tag{2.21}$$

with a unique zero at $k^*$. Now recall that $f(\cdot)$ is concave from Assumption 1 and satisfies $f(0) = 0$ from Assumption 2. For any strictly concave function, we have that

$$\begin{aligned} f(k) &> f(0) + k f'(k) \\ &= k f'(k), \end{aligned} \tag{2.22}$$

where the second line uses the fact that $f(0) = 0$. Now linearizing (2.21) around $k^*$, we have

$$k(t+1) - k^* \simeq \left[ s f'(k^*) + (1 - \delta) \right] (k(t) - k^*).$$

Since from (2.14), $\delta k^* = s f(k^*)$, (2.22) implies that $\delta = s f(k^*)/k^* > s f'(k^*)$, and thus $[s f'(k^*) + (1 - \delta)] \in (0, 1)$, establishing local asymptotic stability for the Solow model from Corollary 1.

Moreover, (2.21) also implies that for all $k(t) \in (0, k^*)$, we have $k(t+1) - k(t) > 0$, and for all $k(t) > k^*$, we have $k(t+1) - k(t) < 0$. Consequently, the solution to (2.21), $\{k(t)\}_{t=0}^{\infty}$, always approaches $k^*$, and thus $k^*$ must be globally stable.
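The convergence claim in Proposition 5 is easy to see by iterating (2.21) from initial conditions on either side of the steady state. This is a minimal sketch assuming $f(k) = a k^{\alpha}$ with illustrative parameter values; `simulate_k` and the constants are hypothetical names introduced for the example.

```python
S, DELTA, ALPHA, A_SHIFT = 0.2, 0.1, 1 / 3, 1.0   # illustrative parameters


def simulate_k(k0, periods):
    """Iterate k(t+1) = s*f(k(t)) + (1 - delta)*k(t) with f(k) = a*k**alpha."""
    path = [k0]
    for _ in range(periods):
        k = path[-1]
        path.append(S * A_SHIFT * k**ALPHA + (1 - DELTA) * k)
    return path


# Closed-form steady state for the assumed Cobb-Douglas f.
k_star = (S * A_SHIFT / DELTA) ** (1 / (1 - ALPHA))

from_below = simulate_k(0.5, 300)    # starts with k(0) < k*
from_above = simulate_k(10.0, 300)   # starts with k(0) > k*

# Slope of the linearized map at k*: s*f'(k*) + (1 - delta), inside (0, 1).
slope_at_k_star = S * ALPHA * A_SHIFT * k_star ** (ALPHA - 1) + (1 - DELTA)
```

Both paths approach `k_star` monotonically, one from below and one from above, and the slope at the steady state lies strictly between 0 and 1, as in the proof.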

This stability result is easier to see diagrammatically, which is shown in the next figure. The following corollary is then immediate:

Corollary 2 Suppose that Assumptions 1 and 2 hold, and $k(0) < k^*$, then $\{w(t)\}_{t=0}^{\infty}$ is an increasing sequence and $\{R(t)\}_{t=0}^{\infty}$ is a decreasing sequence. If $k(0) > k^*$, the opposite results apply.
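Corollary 2 can be illustrated by tracking the factor prices in (2.12) along a simulated transition. The sketch assumes $f(k) = a k^{\alpha}$ and illustrative parameter values; the function name `factor_price_paths` is hypothetical.

```python
def factor_price_paths(k0, periods, s=0.2, delta=0.1, alpha=1/3, a=1.0):
    """Wages and rental rates along the path generated by (2.13).

    Assumes f(k) = a * k**alpha, so R = f'(k) and w = f(k) - k*f'(k)
    reduce to alpha*f/k and (1 - alpha)*f respectively.
    """
    k, wages, rentals = k0, [], []
    for _ in range(periods):
        f = a * k**alpha
        rentals.append(alpha * f / k)     # R(t) = f'(k(t))
        wages.append((1 - alpha) * f)     # w(t) = f(k(t)) - k(t) f'(k(t))
        k = s * f + (1 - delta) * k       # law of motion (2.13)
    return wages, rentals


# Start below the steady state, which is about 2.83 for these parameters.
wages, rentals = factor_price_paths(k0=0.5, periods=100)
```

Starting instead from a capital-labor ratio above the steady state reverses both patterns.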

Intuitively, if the economy starts with too little capital relative to its labor supply, there will be capital deepening (capital accumulation relative to labor), and as a result the marginal product of capital will fall given the diminishing returns to capital feature embedded in Assumption 1, and the wage rate will increase. Conversely, if it starts with too much capital, it will decumulate capital, and in the process the wage rate will decline and the rate of return to capital will increase. The next figure shows this process diagrammatically, emphasizing that the trade-off is between the replacement of the capital stock per effective labor due to depreciation (and perhaps population growth and technological change) and the capital to effective labor ratio:

Therefore, the Solow growth model has a number of nice properties: a unique steady state, asymptotic stability, and simple and intuitive comparative statics.
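These properties can be checked numerically without a closed form by solving the steady-state condition (2.14) with bisection, which works because $f(k)/k$ is strictly decreasing. The sketch below uses $f(k) = a k^{\alpha}$ only as a test case; the solver name `steady_state`, the bracket and the parameter values are assumptions made for this illustration.

```python
import math


def steady_state(f, s, delta, lo=1e-9, hi=1e9):
    """Solve f(k)/k = delta/s by bisection in logs on the bracket [lo, hi].

    Relies on f(k)/k being strictly decreasing, as shown in (2.17), and on the
    bracket containing the solution (an assumption about the chosen f).
    """
    target = delta / s
    for _ in range(200):
        mid = math.sqrt(lo * hi)      # geometric midpoint: bracket spans many orders
        if f(mid) / mid > target:
            lo = mid                  # f(k)/k still too high, so k* lies above mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def make_f(a, alpha=1/3):
    """Hypothetical per-capita production function f(k) = a * k**alpha."""
    return lambda k: a * k**alpha


base = steady_state(make_f(1.0), s=0.2, delta=0.1)
higher_a = steady_state(make_f(1.2), s=0.2, delta=0.1)
higher_s = steady_state(make_f(1.0), s=0.25, delta=0.1)
higher_delta = steady_state(make_f(1.0), s=0.2, delta=0.12)
```

The comparisons reproduce the signs in Proposition 3: the steady-state capital-labor ratio rises with `a` and `s` and falls with `delta`. Any other `f` satisfying Assumptions 1 and 2 can be passed to the same solver.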

So far, however, it has no growth. The steady state is the point at which there is no growth in the capital-labor ratio, no more capital deepening, and no growth in income per capita. The Solow model typically incorporates economic growth by allowing technological change. Before doing this, however, it is useful to look at the mapping between discrete time and continuous time.

2.2 The Solow Model in Continuous Time

2.2.1 From Difference to Differential Equations

Recall from the discussion above that the time periods could refer to days, weeks, months or years. In some sense, the time unit is not important. This suggests that perhaps it may be more convenient to look at dynamics by making the time unit as small as possible, i.e., by going to continuous time. The continuous time setup in general has a number of advantages, since some pathological results of discrete time disappear in continuous time (see Problem Set 1). Moreover, especially in the presence of uncertainty, continuous time models have more flexibility both in doing dynamics and for providing explicit form solutions. For us, they are useful particularly because a lot of growth theory is cast in continuous time. Let us start with a simple difference equation

$$x(t+1) - x(t) = g(x(t)). \tag{2.23}$$

This equation states that between time $t$ and $t + 1$, the absolute growth in $x$ is given by $g(x(t))$. Let us now consider the following approximation

$$x(t + \Delta t) - x(t) \simeq \Delta t \cdot g(x(t)),$$

for any $\Delta t \in [0, 1]$. When $\Delta t = 0$, this equation is just an identity. When $\Delta t = 1$, it gives (2.23). In-between it is a linear approximation, which should not be too bad if the distance between $t$ and $t + 1$ is not very large, so that $g(x) \simeq g(x(t))$ for all $x \in [x(t), x(t+1)]$ (however, you should also convince yourself that this approximation could in fact be quite bad if you take a very nonlinear function $g$, for which the behavior changes significantly between $x(t)$ and $x(t+1)$). Now divide both sides of this equation by $\Delta t$, and take limits to obtain

$$\lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t} = \dot{x}(t) \simeq g(x(t)),$$

as a differential equation representing the same dynamics as the difference equation (2.23) for the case in which the distance between $t$ and $t + 1$ is small. Recall that here $\dot{x}(t)$ denotes the time derivative $\partial x(t)/\partial t$.
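The quality of this approximation can be examined by shrinking the step $\Delta t$ and comparing with the exact solution of the differential equation. In the sketch below, the linear function $g(x) = -0.5x$ is a hypothetical choice made because $\dot{x} = -0.5x$ has the known solution $x(t) = x(0) e^{-0.5 t}$; the function name `iterate_with_step` is likewise an assumption of the example.

```python
import math


def iterate_with_step(g, x0, dt, horizon=1.0):
    """Apply x(t + dt) = x(t) + dt * g(x(t)) until t reaches the horizon."""
    x = x0
    for _ in range(round(horizon / dt)):
        x += dt * g(x)
    return x


def g(x):
    """Hypothetical linear law of motion, chosen for its closed-form solution."""
    return -0.5 * x


exact = math.exp(-0.5)   # x(1) for dx/dt = -0.5*x with x(0) = 1

# Gap between the stepped solution and the continuous-time solution at t = 1.
errors = {dt: abs(iterate_with_step(g, 1.0, dt) - exact)
          for dt in (1.0, 0.1, 0.01, 0.001)}
```

With $\Delta t = 1$ the recursion is exactly the difference equation (2.23), and the gap to the continuous-time solution is sizable; the gap shrinks steadily as the step is refined.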

2.2.2 The Fundamental Equation of the Solow Model in Continuous Time

We can now repeat all of the analysis so far using the continuous time representation. Nothing has changed on the production side, so we continue to have (2.4) and (2.5) as the factor prices, but now these refer to instantaneous rental rates (i.e., $w(t)$ is the flow of wages that the worker receives for an instant, etc.). Savings are again given by

$$S(t) = s Y(t),$$

while consumption is given by (2.9) above. Also, let us now introduce population growth into this model, and assume that the labor force $L(t)$ grows proportionally, i.e.,

$$L(t) = \exp(nt) L(0). \tag{2.24}$$

The purpose of doing so is that in many of the classical analyses of economic growth, population growth plays an important role, so it is useful to see how it affects things here. We are not introducing technological progress yet, which will be done below.

Recall that

$$k(t) \equiv \frac{K(t)}{L(t)},$$

which implies that

$$\frac{\dot{k}(t)}{k(t)} = \frac{\dot{K}(t)}{K(t)} - n.$$

The law of motion of the capital stock, from the limiting argument in the previous subsection, is given by:

$$\dot{K}(t) = s F[K(t), L(t), A(t)] - \delta K(t).$$

Now using the definition of $k(t)$ as the capital-labor ratio and the constant returns to scale properties of the production function, so that $F[K(t), L(t), A(t)]/K(t) = f(k(t))/k(t)$, we obtain the fundamental law of motion of the Solow model in continuous time for the capital-labor ratio as

$$\dot{k}(t) = s f(k(t)) - (n + \delta) k(t). \tag{2.25}$$

Therefore we have:

Definition 5 In the basic Solow model in continuous time with population growth at the rate $n$, no technological progress and an initial capital stock $K(0)$, an equilibrium path is a sequence of capital stocks, labor, output levels, consumption levels, wages and rental rates $[K(t), L(t), Y(t), C(t), w(t), R(t)]_{t=0}^{\infty}$ such that $k(t) \equiv K(t)/L(t)$ satisfies (2.25), $L(t)$ satisfies (2.24), $Y(t)$ is given by (2.1), $C(t)$ is given by (2.9), and $w(t)$ and $R(t)$ are given by (2.4) and (2.5).

As before, a steady-state equilibrium involves $k(t)$ remaining constant. As before, we will refer to the steady-state equilibrium capital-labor ratio as $k^*$.
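The law of motion (2.25) can be integrated numerically to see the capital-labor ratio settle at the point where $s f(k) = (n + \delta) k$. The sketch below uses simple Euler steps and assumes $f(k) = k^{\alpha}$ with illustrative parameter values; the function name, the step size and the horizon are all choices made for this example.

```python
S, DELTA, N, ALPHA = 0.2, 0.05, 0.01, 1 / 3   # illustrative parameters


def simulate_continuous(k0, horizon, dt=0.01):
    """Euler steps on dk/dt = s*f(k) - (n + delta)*k with f(k) = k**alpha."""
    k = k0
    for _ in range(round(horizon / dt)):
        k += dt * (S * k**ALPHA - (N + DELTA) * k)
    return k


# Zero of (2.25) for the assumed Cobb-Douglas f: s*k**alpha = (n + delta)*k.
k_star = (S / (N + DELTA)) ** (1 / (1 - ALPHA))

k_T = simulate_continuous(k0=1.0, horizon=500.0)

# The same steady state with faster population growth (n = 0.03 instead of 0.01).
k_star_high_n = (S / (0.03 + DELTA)) ** (1 / (1 - ALPHA))
```

In this steady state $k(t)$ is constant while $K(t) = k^* L(t)$ grows at the rate $n$, and a higher population growth rate lowers the steady-state capital-labor ratio.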

It is easy to verify that the equilibrium differential equation (2.25) has a unique zero at $k^*$, which is given by a slight modification of (2.14) above to incorporate population growth:

$$\frac{f(k^*)}{k^*} = \frac{n+\delta}{s}. \tag{2.26}$$

In other words, going from discrete to continuous time has not changed any of the basic economic features of the model, and again the steady state can be plotted in the familiar figure used above (now with the population growth rate featuring in there as well):

We immediately obtain:

Proposition 6 Consider the basic Solow growth model in continuous time and suppose that Assumptions 1 and 2 hold. Then there exists a unique steady state equilibrium where the capital-labor ratio is equal to $k^* \in (0, \infty)$ and is given by (2.26), per capita output is given by $y^* = f(k^*)$ and per capita consumption is given by $c^* = (1-s) f(k^*)$.

Moreover, again let $f(k) = a\tilde f(k)$. Then we have

Proposition 7 Suppose Assumptions 1 and 2 hold and $f(k) = a\tilde f(k)$. Denote the steady-state equilibrium level of the capital-labor ratio by $k^*(a, s, \delta, n)$ and the steady-state level of output by $y^*(a, s, \delta, n)$ when the underlying parameters are given by $a$, $s$, $\delta$ and $n$. Then we have

$$\frac{\partial k^*(a,s,\delta,n)}{\partial a} > 0, \quad \frac{\partial k^*(a,s,\delta,n)}{\partial s} > 0, \quad \frac{\partial k^*(a,s,\delta,n)}{\partial \delta} < 0 \quad \text{and} \quad \frac{\partial k^*(a,s,\delta,n)}{\partial n} < 0,$$

$$\frac{\partial y^*(a,s,\delta,n)}{\partial a} > 0, \quad \frac{\partial y^*(a,s,\delta,n)}{\partial s} > 0, \quad \frac{\partial y^*(a,s,\delta,n)}{\partial \delta} < 0 \quad \text{and} \quad \frac{\partial y^*(a,s,\delta,n)}{\partial n} < 0.$$

The new result relative to the earlier comparative static proposition is that now a higher population growth rate, $n$, also reduces the capital-labor ratio and income per capita. The reason for this is simple. A higher population growth rate means there is more labor to use the existing amount of capital, which only accumulates slowly, and consequently the equilibrium capital-labor ratio ends up lower. This result implies that countries with higher population growth rates will have lower incomes per person (or per worker).

The stability analysis is also unchanged. To do this in detail, we simply need to remember the equivalents of the above theorems for differential equations. In particular we have:
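The steady-state condition (2.26) and the signs in Proposition 7 can be checked numerically. The sketch below is illustrative only: it assumes the functional form $f(k) = a k^{1/3}$ and hypothetical parameter values, neither of which comes from the text, and it finds $k^*$ by bisection on $s f(k) - (n+\delta)k$.

```python
import math

def steady_state_k(f, s, delta, n, lo=1e-9, hi=1e9, tol=1e-12):
    """Solve s*f(k) = (n + delta)*k for k > 0 by bisection on a log scale.

    Assumes s*f(k) - (n + delta)*k is positive at `lo` and negative at `hi`,
    which holds for a concave f satisfying the Inada-type conditions.
    """
    def excess(k):
        return s * f(k) - (n + delta) * k

    if not (excess(lo) > 0 and excess(hi) < 0):
        raise ValueError("bracket does not contain the steady state")
    while hi / lo > 1 + tol:
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def cobb_douglas(a, alpha=1 / 3):
    """Illustrative f(k) = a * k**alpha (an assumed functional form)."""
    return lambda k: a * k ** alpha

def k_star(a, s, delta, n):
    return steady_state_k(cobb_douglas(a), s, delta, n)

# Hypothetical baseline parameters, chosen only for illustration.
base = dict(a=1.0, s=0.2, delta=0.05, n=0.01)

k_base = k_star(**base)
k_high_n = k_star(**{**base, "n": 0.02})  # faster population growth
k_high_s = k_star(**{**base, "s": 0.25})  # higher saving rate
print(k_base, k_high_n, k_high_s)
```

With these assumed values, raising $n$ lowers the computed $k^*$ and raising $s$ increases it, in line with the signs stated in Proposition 7.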

Theorem 4 Consider the following linear differential equation system

$$\dot x(t) = A x(t) \tag{2.27}$$

with initial value $x(0)$, where $x(t) \in \mathbb{R}^n$ for all $t$ and $A$ is an $n \times n$ matrix. Suppose that all of the eigenvalues of $A$ have negative real parts. Then the differential equation (2.27) is asymptotically stable, in the sense that starting from any $x(0) \in \mathbb{R}^n$, $x(t) \to x^*$, where $x^*$ is the steady state (zero) of the system given by $A x^* = 0$.

Theorem 5 Consider the following nonlinear autonomous differential equation

$$\dot x(t) = F[x(t)] \tag{2.28}$$

where $F: \mathbb{R}^n \to \mathbb{R}^n$, and suppose that $F$ is continuously differentiable, with initial value $x(0)$. Let $x^*$ be a zero of this system, i.e., $F(x^*) = 0$. Define $A = \nabla F(x^*)$, and suppose that all of the eigenvalues of $A$ have negative real parts. Then the differential equation (2.28) is locally asymptotically stable, in the sense that there exists an open neighborhood of $x^*$, $B(x^*) \subset \mathbb{R}^n$, such that starting from any $x(0) \in B(x^*)$, we have $x(t) \to x^*$.

Corollary 3 Let $x(t) \in \mathbb{R}$. Then the linear differential equation $\dot x(t) = a x(t)$ is asymptotically stable (in the sense that $x(t) \to 0$) if $a < 0$. Moreover, let $g: \mathbb{R} \to \mathbb{R}$ be continuous and differentiable at $x^*$, where $g(x^*) = 0$. Then the nonlinear differential equation $\dot x(t) = g(x(t))$ is locally asymptotically stable if $g'(x^*) < 0$.

Finally, with continuous time, we also have another useful theorem:

Theorem 6 Let $g: \mathbb{R} \to \mathbb{R}$ be a continuous function, and suppose that there exists a unique $x^*$ such that $g(x^*) = 0$. Moreover, suppose $g(x) < 0$ for all $x > x^*$ and $g(x) > 0$ for all $x < x^*$. Then the nonlinear differential equation $\dot x(t) = g(x(t))$ is (globally) asymptotically stable, and starting with any $x(0)$, $x(t) \to x^*$.

Notice that the equivalent of Theorem 6 is not true in discrete time, and this will be illustrated by one of the problems in Problem Set 1. In view of these results, Proposition 5 immediately generalizes:

Proposition 8 Suppose that Assumptions 1 and 2 hold. Then the basic Solow growth model in continuous time with population growth at the rate $n$ and no technological change is asymptotically stable, and starting from any $k(0) > 0$, $k(t) \to k^*$.

Proof. The proof of stability is now simpler and follows immediately from Theorem 6 by noting that whenever $k < k^*$, $s f(k) - (n+\delta) k > 0$ and whenever $k > k^*$, $s f(k) - (n+\delta) k < 0$.

It is also useful at this point to look at one of the most common examples of the production function used in macroeconomics, the Cobb-Douglas production function:

Example 1 Suppose the aggregate production function is given by

$$F[K, L] = A K^{\alpha} L^{1-\alpha} \quad \text{with } 0 < \alpha < 1.$$

You should remember from basic micro theory that the Cobb-Douglas production function is extremely special, in particular because it has an elasticity of substitution equal to 1 between capital and labor. This production function is very easy to work with, but it also has many special features that are far from general. It is a good vehicle to illustrate issues, but you should not think that all production functions are Cobb-Douglas!

One very important feature of the Cobb-Douglas production function is that factor shares are constant. It can be immediately calculated that, with competitive factor markets, the share of capital is constant irrespective of the capital-labor ratio:

$$\alpha_K(t) = \frac{R(t) K(t)}{Y(t)} = \frac{F_K(K(t), L(t))\, K(t)}{Y(t)} = \frac{\alpha A [K(t)]^{\alpha-1} [L(t)]^{1-\alpha} K(t)}{A [K(t)]^{\alpha} [L(t)]^{1-\alpha}} = \alpha.$$

Similarly, the share of labor is $\alpha_L(t) = 1 - \alpha$. With this production function, we have that $f(k) = A k^{\alpha}$, so the steady state is given again from (2.26) (with population growth at the rate $n$) as

$$A (k^*)^{\alpha-1} = \frac{n+\delta}{s}$$

or

$$k^* = \left( \frac{sA}{n+\delta} \right)^{\frac{1}{1-\alpha}},$$

which is a very nice and simple interpretable form for the steady-state capital-labor ratio. Transitional dynamics are also straightforward in this case. In particular, we have:

$$\dot k(t) = sA [k(t)]^{\alpha} - (n+\delta)\, k(t)$$

with initial condition $k(0)$. To solve this equation, let $x(t) \equiv k(t)^{1-\alpha}$, so the equilibrium law of motion of the capital-labor ratio can be written in terms of $x(t)$ as

$$\dot x(t) = (1-\alpha) sA - (1-\alpha)(n+\delta)\, x(t),$$

which is a linear differential equation, with a general solution

$$x(t) = \frac{sA}{n+\delta} + \left[ x(0) - \frac{sA}{n+\delta} \right] \exp\left( -(1-\alpha)(n+\delta)\, t \right)$$

or in terms of the capital-labor ratio

$$k(t) = \left\{ \frac{sA}{n+\delta} + \left[ [k(0)]^{1-\alpha} - \frac{sA}{n+\delta} \right] \exp\left( -(1-\alpha)(n+\delta)\, t \right) \right\}^{\frac{1}{1-\alpha}}.$$

This solution illustrates that starting from any $k(0)$, the equilibrium $k(t) \to k^* = (sA/(n+\delta))^{1/(1-\alpha)}$, and in fact, the rate of adjustment is related to $(1-\alpha)(n+\delta)$. This is intuitive: a higher $\alpha$ implies less diminishing returns to capital, which slows down dynamics. Similarly, a smaller $\delta$ means less replacement of depreciated capital and a smaller $n$ means slower population growth, both of those slowing down the adjustment of capital per worker and thus transitional dynamics.
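As a sanity check on the closed-form solution for the Cobb-Douglas case, the following sketch compares it with a direct numerical integration of $\dot k = sAk^{\alpha} - (n+\delta)k$. The parameter values are hypothetical, and the forward Euler scheme is simply one convenient integrator chosen for illustration, not something prescribed by the text.

```python
import math

def k_closed_form(t, k0, s, A, alpha, delta, n):
    """Closed-form k(t) obtained via the change of variable x = k**(1 - alpha)."""
    x_star = s * A / (n + delta)          # steady-state value of x
    rate = (1 - alpha) * (n + delta)      # speed of adjustment of x
    x_t = x_star + (k0 ** (1 - alpha) - x_star) * math.exp(-rate * t)
    return x_t ** (1 / (1 - alpha))

def k_euler(t, k0, s, A, alpha, delta, n, steps=100_000):
    """Integrate k' = s*A*k**alpha - (n + delta)*k with forward Euler steps."""
    dt = t / steps
    k = k0
    for _ in range(steps):
        k += dt * (s * A * k ** alpha - (n + delta) * k)
    return k

# Hypothetical parameter values, for illustration only.
params = dict(k0=1.0, s=0.2, A=1.0, alpha=1 / 3, delta=0.05, n=0.01)

k_exact = k_closed_form(50.0, **params)
k_numeric = k_euler(50.0, **params)
k_limit = (params["s"] * params["A"] / (params["n"] + params["delta"])) ** (
    1 / (1 - params["alpha"])
)
print(k_exact, k_numeric, k_limit)
```

The two computed paths should agree closely at $t = 50$, and both approach $k^* = (sA/(n+\delta))^{1/(1-\alpha)}$ as $t$ grows.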

2.2.3 A First Look at Sustained Growth

Before discussing technological progress, it is useful to see how the model we have developed so far can generate sustained growth (without technological progress). The Cobb-Douglas example above already shows that when $\alpha$ is close to 1, adjustment of the capital-labor ratio back to its steady-state level can be very slow. A very slow adjustment towards a steady state has the flavor of sustained growth rather than the system settling down to a stationary point quickly. In fact, the simplest model of sustained growth essentially takes $\alpha = 1$ in terms of the Cobb-Douglas production function above. To do this, let us relax Assumptions 1 and 2 (which do not allow $\alpha = 1$), and suppose that

$$F[K(t), L(t), A(t)] = A K(t), \tag{2.29}$$

where $A > 0$ is a constant. This is the so-called AK model, and in its simplest form output does not even depend on labor. The results I would like to highlight apply with a more general constant returns to scale production function, for example,

$$F[K(t), L(t), A(t)] = A K(t) + B L(t), \tag{2.30}$$

but it is simpler to illustrate the main insights with (2.29), leaving the analysis of the richer production function (2.30) to Problem Set 1. With this production function, the fundamental law of motion of the capital stock is given by (again with population growth given by (2.24)):

$$\frac{\dot k(t)}{k(t)} = sA - \delta - n.$$

Therefore, if $sA - \delta - n > 0$, there is sustained growth in the capital-labor ratio, and given (2.29), there is sustained growth in income per capita. This immediately establishes the following proposition:

Proposition 9 Consider the Solow growth model with the production function (2.29) and suppose that $sA - \delta - n > 0$. Then in equilibrium, there is sustained growth of income per capita at the rate $sA - \delta - n$. In particular, starting with a capital-labor ratio $k(0) > 0$, the economy has

$$k(t) = \exp\left( (sA - \delta - n)\, t \right) k(0)$$

and

$$y(t) = \exp\left( (sA - \delta - n)\, t \right) A k(0).$$

This proposition not only establishes the possibility of endogenous growth, but also shows that in this simplest form, there are no transitional dynamics. The economy always grows at a constant rate $sA - \delta - n$, irrespective of what level of the capital-labor ratio it starts from. The next figure shows this equilibrium diagrammatically, denoting the growth rate of the economy (and the capital-labor ratio by K ):
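For reference, the growth rate in the AK case follows in two lines from the aggregate law of motion. This is a sketch of the algebra behind the statement above, using only (2.24), (2.29) and the constant saving rate.

```latex
\begin{align*}
\dot K(t) &= sAK(t) - \delta K(t)
  \quad\Longrightarrow\quad \frac{\dot K(t)}{K(t)} = sA - \delta, \\
\frac{\dot k(t)}{k(t)} &= \frac{\dot K(t)}{K(t)} - n = sA - \delta - n
  \quad\Longrightarrow\quad
  k(t) = \exp\big((sA - \delta - n)\,t\big)\,k(0).
\end{align*}
% Since y(t) = A k(t) under (2.29), income per capita grows at the same rate.
```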

2.3 Solow Model with Technological Progress

2.3.1 Balanced Growth

The models analyzed so far did not feature technological progress. We now introduce changes in $A(t)$ to capture improvements in the technological know-how of the economy. There is little doubt that what human societies know how to produce, and how efficiently they can produce it, has progressed tremendously over the past 200 years, and even more tremendously over the past 1000 or 10,000 years. An attractive way of introducing economic growth is to allow technological progress.

The question is how to do this. At some level we will see that the production function $F[K(t), L(t), A(t)]$ is too general to achieve our objective. In particular, with this general structure, we may not have balanced growth. By balanced growth, we mean a path of the economy in which, while income per capita increases, the capital-output ratio and the distribution of income between capital and labor are roughly constant. These are sometimes referred to as the Kaldor facts. The next picture, for example, shows the evolution of the share of capital in national income in the United States.

[Figure: Labor and capital shares in total value added, United States, 1929-1994.]

Despite fairly large fluctuations, there is no trend. This and the relative constancy of capital-output ratios until the 1970s have made many economists prefer models with balanced growth to those without. (Since the 1970s capital-output ratios may or may not be constant depending on how you measure them.) Also for future reference, note that the capital share in national income is about 1/3, while the labor share is about 2/3. We are ignoring the share of land here as we did in the analysis so far: land is not a major factor of production. This is clearly not the case for the poor countries, and we should think about how incorporating land into this picture changes the patterns. In any case, this pattern of factor distribution of income, combined with economists' desire to work with simple models, often makes them choose an aggregate production function of the form $AK^{1/3}L^{2/3}$ as an approximation to reality (especially since it ensures that factor shares are constant by construction). This production function does a good job in certain circumstances, but of course it is very special.

For us, the most important characteristic of balanced growth is that it is much easier to handle than non-balanced growth. So it is an advantage to have models featuring balanced growth. In reality, growth has many non-balanced features. For example, the share of different sectors changes systematically over the growth process, with agriculture shrinking, and manufacturing first increasing and then shrinking. Ultimately, we would like to have models that combine certain quasi-balanced features with these types of structural transformations embedded in them. These are interesting frontiers of research, but for this course, we will largely focus on models with balanced growth.

2.3.2 Neutral Technological Progress

What are some convenient special forms of the general production function $F[K(t), L(t), A(t)]$? First we could have

$$F[K(t), L(t), A(t)] = A(t)\, \tilde F[K(t), L(t)],$$

so that technological progress simply multiplies output. This is known as Hicks-neutral technological progress. Intuitively, in this case if we think of the isoquants in the L-K space, technological progress simply corresponds to a relabeling of the isoquants (without any change in their shape).

Another alternative is to have capital-augmenting or Solow-neutral technological progress, in the form

$$F[K(t), L(t), A(t)] = \tilde F[A(t) K(t), L(t)].$$

This is referred to as capital-augmenting progress, because a higher $A(t)$ is equivalent to the economy having more capital. This type of technological progress corresponds to the isoquants shifting with technological progress in a way that they have constant slope at a given labor-output ratio.

Finally, we can have labor-augmenting or Harrod-neutral technological progress

$$F[K(t), L(t), A(t)] = \tilde F[K(t), A(t) L(t)],$$

whereby an increase in technology increases output as if the economy had more labor. Equivalently, the slope of the isoquants is constant along rays with constant capital-output ratio.

Of course, in practice technological change can be a mixture of these, so we could have a vector-valued index of technology and a production function that looks like

$$F[K(t), L(t), A(t)] = A_H(t)\, \tilde F[A_K(t) K(t), A_L(t) L(t)].$$

It turns out that, although all of these forms of technological progress look equally plausible ex ante, balanced growth forces us to one of these types of neutral technological progress. In particular, balanced growth necessitates that all technological progress be labor augmenting or Harrod-neutral. This is a very surprising result, and it is also somewhat troubling, since we have no idea why technological progress should take this form. We now state and prove the relevant theorem here.

2.3.3 The Steady-State Technological Progress Theorem

A version of the following theorem was first proved by Uzawa in 1961. For simplicity and without loss of any generality, let us focus on continuous time models. The key elements of balanced growth, as suggested by the discussion above, are the constancy of factor shares and the constancy of the capital-output ratio, $K(t)/Y(t)$. Since there is only labor and capital in this model, by factor shares, we mean

$$\alpha_L(t) \equiv \frac{w(t) L(t)}{Y(t)} \quad \text{and} \quad \alpha_K(t) \equiv \frac{R(t) K(t)}{Y(t)}.$$

By Assumption 1 and Theorem 1, we have that $\alpha_L(t) + \alpha_K(t) = 1$. Here I present a version of Uzawa's proof along the lines of the more recent paper by Jones and Scrimgeour (2005), and then also give a more heuristic proof.

Theorem 7 (Uzawa) Consider a growth model with a constant returns to scale aggregate production function $F[K(t), L(t), A(t)]$ and capital accumulation equation

$$\dot K(t) = F[K(t), L(t), A(t)] - C(t) - \delta K(t).$$

Suppose also that there is a constant growth rate of population, i.e., $L(t) = \exp(nt) L(0)$. If a balanced growth path exists with constant capital-output ratio and per capita growth rate, i.e., $\dot y(t)/y(t) = g > 0$, and factor shares are nonzero and constant, i.e., $\alpha_K(t) = \alpha(x^*) \in (0,1)$ as $t \to \infty$, then asymptotically, the production function can be represented as:

$$Y^*(t) = \tilde F[K^*(t), A(t) L(t)],$$

where *'s denote asymptotic steady-state values, and

$$\frac{\dot A(t)}{A(t)} = g.$$

Proof. Let us look at the following derivative

$$v = \frac{\partial \log y(t)}{\partial \log (k(t)/y(t))} = \frac{1}{\dfrac{\partial \log K(t)}{\partial \log Y(t)} - 1} = \frac{1}{\dfrac{1}{\alpha_K(t)} - 1} = \frac{\alpha_K(t)}{1 - \alpha_K(t)},$$

where the last line uses the definition of $\alpha_K(t)$. Now let $x(t) \equiv K(t)/Y(t)$, and by hypothesis asymptotically $\alpha_K(t) = \alpha(x^*)$, where $x^*$ refers to the steady state value of $K/Y$, and the share of capital in national income is potentially a function of this capital-output ratio. Therefore, asymptotically, we have the following partial differential equation:

$$\frac{\partial \log y(t)}{\partial \log x(t)} = \frac{\alpha(x^*)}{1 - \alpha(x^*)}.$$

Integrating both sides and noting that the right hand side does not depend on time, we have

$$\log y(t) = a(t) + \int \frac{\alpha(x^*)}{1 - \alpha(x^*)} \frac{dx^*}{x^*}$$

for some function $a(t)$, which only depends on time. Taking exponents, we have

$$y(t) = A(t)\, \phi(x^*),$$

where $A(t) \equiv \exp(a(t))$ and $\phi(x^*) \equiv \exp\left( \int \frac{\alpha(x^*)}{1 - \alpha(x^*)} \frac{dx^*}{x^*} \right)$. Notice, also, for future use that from the inverse function theorem, $\phi(x^*)$ is invertible in the neighborhood of $x^*$, with inverse denoted by $\phi^{-1}(y/A)$.

Since $\phi(x^*)$ is constant and $\dot y(t)/y(t) = g$, we must have $A(t) = \exp(gt) A(0)$. Finally, note that, by definition, $k(t) = x(t)\, y(t)$, which implies asymptotically (in steady state) that

$$\frac{k(t)}{A(t)} = \phi^{-1}\left( \frac{y(t)}{A(t)} \right) \frac{y(t)}{A(t)} \equiv f^{-1}\left( \frac{y(t)}{A(t)} \right)$$

or

$$\frac{y(t)}{A(t)} = f\left( \frac{k(t)}{A(t)} \right),$$

and thus

$$Y(t) = A(t) L(t)\, f\left( \frac{K(t)}{A(t) L(t)} \right),$$

which, under constant returns to scale, is another way of writing

$$Y^*(t) = \tilde F[K^*(t), A(t) L(t)],$$

completing the proof.

For a more heuristic reasoning, consider a production function of the form $\tilde F[A_K(t) K(t), A_L(t) L(t)]$. Balanced growth requires factor shares to be constant, which can only be the case when total capital inputs, $A_K(t) K(t)$, and total labor inputs, $A_L(t) L(t)$, grow at the same rate; otherwise, the share of either capital or labor will be increasing over time. Capital accumulation implies that $K(t)$ will grow at the same rate as $A_L(t) L(t)$. Thus balanced growth can only be possible if $A_K(t)$ is asymptotically constant.

There is one exception to this, which is the Cobb-Douglas production function, where we can have

$$Y(t) = [A_K(t) K(t)]^{\alpha} [A_L(t) L(t)]^{1-\alpha}$$

and both $A_K(t)$ and $A_L(t)$ could grow asymptotically, while maintaining balanced growth. However, notice that Theorem 7 does not require that $Y(t) = \tilde F[K(t), A(t) L(t)]$, but that it should have a representation of the form $Y(t) = \tilde F[K(t), A(t) L(t)]$. It is quite straightforward to see that in this Cobb-Douglas example we can define $A(t) = [A_K(t)]^{\alpha/(1-\alpha)} A_L(t)$, and the production function can be represented as

$$Y(t) = [K(t)]^{\alpha} [A(t) L(t)]^{1-\alpha},$$

in other words, technological change can be represented as purely labor augmenting, which is what Theorem 7 requires.

Notice finally that this theorem does not state that technological change has to be labor augmenting all the time. But it requires that it has to be labor augmenting asymptotically, i.e., along the balanced growth path.

Based on these ideas, it is possible to give the more heuristic proof of Theorem 7.

Alternative Proof of Theorem 7: Suppose that $Y(t) = A_H(t) \tilde F[A_K(t) K(t), A_L(t) L(t)]$, and since we are interested in asymptotic states, suppose that $A_H(t)$, $A_K(t)$ and $A_L(t)$ are growing asymptotically at the rates $g_H$, $g_K$ and $g_L$. Normalize $A_H(0)$, $A_K(0)$ and $A_L(0)$ to 1. Then we can write that asymptotically

$$\frac{Y(t)}{K(t)} = \exp\left( (g_H + g_K)\, t \right) \tilde F\left[ 1, \frac{A_L(t) L(t)}{A_K(t) K(t)} \right] \equiv \exp\left( (g_H + g_K)\, t \right) f\left( \exp\left( (g_L - g_K)\, t \right) \frac{L(t)}{K(t)} \right).$$

Now we also have

$$\frac{\dot K(t)}{K(t)} = s \frac{Y(t)}{K(t)} - \delta,$$

and in steady state, according to the hypotheses of the theorem, we have $Y(t)/K(t)$ constant, so $\dot K(t)/K(t) = g$, i.e., capital grows at the same rate as total output. Combined with the hypothesis that $L(t) = \exp(nt) L(0)$, this then implies (for $L(0)$ normalized to 1),

$$\frac{Y(t)}{K(t)} = \exp\left( (g_H + g_K)\, t \right) f\left( \exp\left( (g_L - g_K + n - g)\, t \right) \right).$$

But from this equation $Y(t)/K(t)$ can remain constant only under one of the two following circumstances:

1. $\exp((g_H + g_K)\, t)$ is constant and $\exp((g_L - g_K + n - g)\, t)$ is constant, i.e., $g_H = g_K = 0$, and $g = g_L + n$.

2. $\exp((g_H + g_K)\, t)$ increases exactly at the same rate as $f(\exp((g_L - g_K + n - g)\, t))$ decreases, which is only possible when $f(x) = x^{\beta}$ for some $\beta$. Then, if we impose Assumption 1 (or just CRS and positive marginal products), we get $\beta \in (0, 1)$.

This completes the alternative proof of Theorem 7.
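The Cobb-Douglas exception discussed above is easy to verify numerically. The sketch below checks that the factor-augmenting form and the purely labor-augmenting representation with $A = A_K^{\alpha/(1-\alpha)} A_L$ give the same output; the function names and input values are hypothetical and chosen only for illustration.

```python
import math

def output_factor_augmenting(K, L, A_K, A_L, alpha):
    """Y = (A_K K)^alpha (A_L L)^(1 - alpha)."""
    return (A_K * K) ** alpha * (A_L * L) ** (1 - alpha)

def output_labor_augmenting(K, L, A_K, A_L, alpha):
    """Same output written as Y = K^alpha (A L)^(1 - alpha)."""
    A = A_K ** (alpha / (1 - alpha)) * A_L  # combined labor-augmenting index
    return K ** alpha * (A * L) ** (1 - alpha)

# Hypothetical inputs, for illustration only.
args = dict(K=120.0, L=35.0, A_K=1.7, A_L=2.4, alpha=0.3)

y_two_indices = output_factor_augmenting(**args)
y_one_index = output_labor_augmenting(**args)
print(y_two_indices, y_one_index)
```

The two numbers coincide up to floating-point rounding, which is the sense in which growth in $A_K$ can be relabeled as labor-augmenting progress in the Cobb-Douglas case.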

2.3.4 The Solow Growth Model with Technological Progress: Continuous Time

Now we are ready to analyze the Solow growth model with technological progress. I will only present the analysis for continuous time (the discrete time case is equivalent). From Theorem 7, we know that the production function must take the form $F[K(t), A(t) L(t)]$, with purely labor-augmenting technological progress asymptotically. For simplicity, let us assume that it takes this form throughout. Moreover, suppose that there is technological progress at the rate $g$, i.e.,

$$\frac{\dot A(t)}{A(t)} = g, \tag{2.31}$$

and population growth at the rate $n$,

$$\frac{\dot L(t)}{L(t)} = n.$$

Again using the constant savings rate we have

$$\dot K(t) = sF[K(t), A(t) L(t)] - \delta K(t). \tag{2.32}$$

The simplest way of analyzing this economy is again to express everything in terms of a normalized variable. Since effective units of labor are given by $A(t) L(t)$, and $F$ exhibits constant returns to scale in its two arguments (by virtue of exhibiting constant returns to scale in capital and labor), we can define

$$k(t) \equiv \frac{K(t)}{A(t) L(t)}. \tag{2.33}$$

Now differentiating this expression with respect to time, we obtain

$$\frac{\dot k(t)}{k(t)} = \frac{\dot K(t)}{K(t)} - g - n. \tag{2.34}$$

The quantity of output per unit of effective labor can be written as

$$\hat y(t) \equiv \frac{Y(t)}{A(t) L(t)} = F\left[ \frac{K(t)}{A(t) L(t)}, 1 \right] \equiv f(k(t)).$$

Income per capita is $y(t) \equiv Y(t)/L(t)$, i.e., $y(t) = A(t)\, \hat y(t)$.

Now substituting for $\dot K(t)$ from (2.32) into (2.34), we have

$$\frac{\dot k(t)}{k(t)} = \frac{sF[K(t), A(t) L(t)]}{K(t)} - \delta - g - n.$$

Now using (2.33),

$$\frac{\dot k(t)}{k(t)} = \frac{s f(k(t))}{k(t)} - \delta - g - n, \tag{2.35}$$

which is very similar to the law of motion of the capital-labor ratio in the continuous time model, (2.25). An equilibrium in this model is defined similarly to before. Consequently, we have:

Proposition 10 Consider the basic Solow growth model in continuous time, with Harrod-neutral technological progress at the rate $g$ and population growth at the rate $n$. Suppose that Assumptions 1 and 2 hold, and define the effective capital-labor ratio as in (2.33). Then there exists a unique steady state equilibrium where the effective capital-labor ratio is equal to $k^* \in (0, \infty)$ and is given by

$$\frac{f(k^*)}{k^*} = \frac{\delta + g + n}{s}.$$

Per capita output and consumption grow at the rate $g$.

The comparative static results are also similar to before, with the additional comparative static with respect to the initial level of the labor-augmenting technology, $A(0)$ (since the level of technology later, $A(t)$, is completely determined by $A(0)$ given the assumption in (2.31)).

Proposition 11 Suppose Assumptions 1 and 2 hold and let $A(0)$ be the initial level of technology. Denote the balanced growth path level of the effective capital-labor ratio by $k^*(A(0), s, \delta, n)$ and the level of income per capita by $y^*(A(0), s, \delta, n, t)$ (the latter is a function of time since it is growing over time). Then we have

$$\frac{\partial k^*(A(0), s, \delta, n)}{\partial A(0)} = 0, \quad \frac{\partial k^*(A(0), s, \delta, n)}{\partial s} > 0, \quad \frac{\partial k^*(A(0), s, \delta, n)}{\partial n} < 0 \quad \text{and} \quad \frac{\partial k^*(A(0), s, \delta, n)}{\partial \delta} < 0,$$

and also

$$\frac{\partial y^*(A(0), s, \delta, n, t)}{\partial A(0)} > 0, \quad \frac{\partial y^*(A(0), s, \delta, n, t)}{\partial s} > 0, \quad \frac{\partial y^*(A(0), s, \delta, n, t)}{\partial n} < 0 \quad \text{and} \quad \frac{\partial y^*(A(0), s, \delta, n, t)}{\partial \delta} < 0.$$

Finally, we also have very similar transitional dynamics.

Proposition 12 Suppose that Assumptions 1 and 2 hold. Then the Solow growth model with Harrod-neutral technological progress and population growth in continuous time is asymptotically stable, and starting from any $k(0) > 0$, the effective capital-labor ratio converges to a steady-state value $k^*$, i.e., $k(t) \to k^*$.

Therefore, the comparative statics and dynamics are very similar to the model without technological progress (and without population growth). The major difference, of course, is that now the model generates growth in income per capita, so it can be mapped to the data much better. However, the disadvantage is that this growth is driven entirely exogenously. The growth rate is exactly the same as the exogenous growth rate of the technology stock. The model does not specify where this technology stock comes from and how fast it grows.
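To see Propositions 10 and 12 at work, the sketch below integrates (2.35) for an assumed Cobb-Douglas technology $f(k) = k^{\alpha}$ and checks that the effective capital-labor ratio settles at $k^*$ while income per capita ends up growing at the rate $g$. The functional form, the parameter values, and the forward Euler scheme are all assumptions made for illustration.

```python
import math

def simulate_effective_k(k0, s, alpha, delta, g, n, T, dt=0.01):
    """Forward Euler path of k' = s*k**alpha - (delta + g + n)*k on [0, T]."""
    path = [k0]
    k = k0
    for _ in range(int(round(T / dt))):
        k += dt * (s * k ** alpha - (delta + g + n) * k)
        path.append(k)
    return path

def income_per_capita(k, t, A0, g, alpha):
    """y(t) = A(t) * f(k(t)) with A(t) = A0 * exp(g t) and f(k) = k**alpha."""
    return A0 * math.exp(g * t) * k ** alpha

# Hypothetical parameter values, for illustration only.
s, alpha, delta, g, n, A0 = 0.2, 1 / 3, 0.05, 0.02, 0.01, 1.0
dt, T = 0.01, 300.0

path = simulate_effective_k(0.5, s, alpha, delta, g, n, T, dt)
k_star = (s / (delta + g + n)) ** (1 / (1 - alpha))

steps_per_year = int(round(1 / dt))
y_T = income_per_capita(path[-1], T, A0, g, alpha)
y_T_minus_1 = income_per_capita(path[-1 - steps_per_year], T - 1, A0, g, alpha)
growth = math.log(y_T / y_T_minus_1)  # growth rate of y over the final year
print(path[-1], k_star, growth)
```

Starting from other values of `k0` should lead to the same long-run values, since the long-run growth rate of income per capita is pinned down by $g$ alone in this model.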

3 The Solow Model and the Data

One of the important uses of the aggregate production function approach and the basic Solow model is that they provide us with a simple vehicle to look at the data, both at growth over time and income-level differences (and growth rate differences) across countries. I start here with over-time changes, i.e., growth accounting, and then will move to the more important application for the purposes of this course, which involves looking at cross-country differences.

3.1 Growth Accounting

Let us go back to the most general form of the aggregate production function given by (2.1), whereby

$$Y(t) = F[K(t), L(t), A(t)].$$

Differentiate this function with respect to time on both sides to obtain (dropping time dependence)

$$\frac{\dot Y}{Y} = \frac{F_A A}{Y} \frac{\dot A}{A} + \frac{F_K K}{Y} \frac{\dot K}{K} + \frac{F_L L}{Y} \frac{\dot L}{L}.$$

Recalling the definition of factor shares above, denoting $g \equiv \dot Y/Y$, $g_K \equiv \dot K/K$ and $g_L \equiv \dot L/L$, and also defining

$$x \equiv \frac{F_A A}{Y} \frac{\dot A}{A}$$

as the contribution of technology to growth, we have

$$x = g - \alpha_K g_K - \alpha_L g_L.$$

This is the fundamental growth accounting equation. This equation lets us estimate the contribution of technological progress to economic growth from factor shares, output growth, labor force growth and capital stock growth. This contribution from technological progress is also referred to as Total Factor Productivity (TFP) or sometimes as Multi Factor Productivity. In particular, denoting an estimate by ^, we have the estimate of TFP growth as:

$$\hat x = g - \alpha_K g_K - \alpha_L g_L.$$

If we are interested in $\dot A/A$ rather than $x$, we need to make further assumptions. For example, if we assume that the production function takes the standard labor-augmenting form

$$Y(t) = \tilde F[K(t), A(t) L(t)],$$

then we have

$$\frac{\dot A}{A} = \frac{1}{\alpha_L} \left[ g - \alpha_K g_K - \alpha_L g_L \right],$$

but this equation is not particularly useful, since $\dot A/A$ is not something we are inherently interested in. Much more interesting is precisely $x$.

In continuous time, this equation is exact. In practice, of course, instead of instantaneous changes, we look at changes over discrete time periods, for example over a year (or sometimes with the better data, perhaps over a quarter or a month). In this case, there is a problem, since over the time horizon in question, factor shares can change. It can be shown that this could lead to serious biases. The most common way of dealing with this is to use factor shares calculated as the average of the two points in time. Therefore in discrete time, for a change between times $t$ and $t+1$, we have

$$\hat x_{t,t+1} = g_{t,t+1} - \bar\alpha_{K,t,t+1}\, g_{K,t,t+1} - \bar\alpha_{L,t,t+1}\, g_{L,t,t+1},$$

where

$$\bar\alpha_{K,t,t+1} \equiv \frac{\alpha_{K,t} + \alpha_{K,t+1}}{2}$$

and $\bar\alpha_{L,t,t+1}$ is defined similarly.

Applying this method, Solow found that much of economic growth over the 20th century was due to technological progress. This has been a landmark finding, focusing the attention of economists on sources of technology differences over time, across nations, across industries and across firms. Since then, many economists, most notably Dale Jorgenson, have attempted to reduce the amount due to the residual technology by adjusting for the quality of labor and capital inputs. This is still an active research area, partly because there are conceptual issues about how far one should go in adjusting the quality of inputs. For example, better computers can translate into more capital, reducing the TFP residual, but at the end of the day better computers are a result of better technology. We will return to these issues again below.
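The discrete-time growth accounting formula can be sketched in a few lines. The function below is a minimal illustration under two assumptions that are not in the text: growth rates are measured as log differences, and the labor share is taken to be one minus the capital share. The data are synthetic, generated from a Cobb-Douglas technology with a known 2 percent technology improvement, so that the residual can be checked against a known answer.

```python
import math

def tfp_growth(Y0, Y1, K0, K1, L0, L1, alpha_K0, alpha_K1):
    """Growth accounting residual between two dates with averaged factor shares.

    Growth rates are log differences (an assumption of this sketch) and the
    labor share is set to one minus the capital share.
    """
    g = math.log(Y1 / Y0)
    g_K = math.log(K1 / K0)
    g_L = math.log(L1 / L0)
    avg_alpha_K = 0.5 * (alpha_K0 + alpha_K1)
    avg_alpha_L = 1.0 - avg_alpha_K
    return g - avg_alpha_K * g_K - avg_alpha_L * g_L

def cobb_douglas_output(A, K, L, alpha):
    """Synthetic data generator: Y = A * K**alpha * L**(1 - alpha)."""
    return A * K ** alpha * L ** (1 - alpha)

# Synthetic example: technology rises by 2 percent (in logs) between the dates.
alpha = 1 / 3
Y0 = cobb_douglas_output(1.0, 100.0, 50.0, alpha)
Y1 = cobb_douglas_output(math.exp(0.02), 104.0, 50.5, alpha)

x_hat = tfp_growth(Y0, Y1, 100.0, 104.0, 50.0, 50.5, alpha, alpha)
print(x_hat)
```

Because the synthetic factor shares are constant, the residual recovers the technology term here; with shares that change between the two dates, averaging them is what the formula above is designed to handle.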

3.2 Solow Model and Cross-Country Income Differences

We are now in a position to take the basic Solow model to the data. The simplest way of doing this is to follow the approach of Mankiw, Romer and Weil (1992). These authors basically estimated a cross-country regression inspired by the above model. However, a basic estimation which does not take human capital into account proved to be inadequate. Therefore, Mankiw, Romer and Weil (1992) used an augmented Solow model also incorporating human capital. I first develop this model briefly, and then look at the empirical evidence.

Since our purpose here is to look at cross-country income differences, from the beginning, I present the model for a cross-section of countries. Here already there is a major (and at some level very problematic) assumption, adopted by many authors, among them Mankiw, Romer and Weil (1992), Barro (1991) and much of Barro and Sala-i-Martin (2004), which is that the world consists of a cross-section of countries which do not interact. In other words, these countries do not trade financial assets or goods, and there is no slow diffusion of technology across these countries. These countries inhabit the world, but they are all islands unto themselves. I start with this case of no interdependence, but interdependences arising from technology flows and international trade will be discussed below.

3.2.1 Solow Model with Human Capital

The production function for country $j$ is

$$Y_j = K_j^{\alpha} H_j^{\beta} (A_j L_j)^{1-\alpha-\beta}, \tag{3.1}$$

where I have dropped time to simplify notation. We have $\alpha, \beta \geq 0$, $\alpha + \beta \leq 1$, $j$ denotes country, $Y$ is total output, $H$ is human capital, $L$ is labor, and $A$ is labor-augmenting technical change. The important assumption here is that human capital is taken to be a different factor of production rather than simply augmenting labor (i.e., equation (3.1) rather than $Y_j = K_j^{\alpha} (A_j H_j)^{1-\alpha}$ with $H_j$ interpreted as efficiency units of labor, see (3.6) below). In fact, this latter approach is much more in line with the Becker model of human capital, and writing the model in this way is not without loss of any generality (as we will see below). But before seeing why this is, we should solve the model.

First, we can use the usual trick of the neoclassical growth model of transforming variables to per capita effective units:

$$k_j \equiv \frac{K_j}{A_j L_j} \quad \text{and} \quad h_j \equiv \frac{H_j}{A_j L_j},$$

so that income per capita is

$$y_j = A_j k_j^{\alpha} h_j^{\beta}. \tag{3.2}$$

Suppose also that population grows at a constant rate $n_j$ in country $j$. This model cannot be easily taken to the data because we have no idea what $A_j$ is. A key assumption of Mankiw, Romer and Weil (1992), which enables them to take the augmented-Solow model to the data, is the following:

Common technology advances assumption: $A_j(t) = \bar A_j \exp(gt)$.

That is, countries may differ according to their technology level, but they share the same common technology growth rate, $g$. This is in part motivated by the relative stability of the world income distribution discussed earlier. In the absence of this assumption, countries would grow at different rates, and the world income distribution would become more and more dispersed.

Next, consider constant savings rates for human and physical capital, as a direct generalization of the standard Solow model:

$$\dot K_j = s_j^k Y_j - \delta_k K_j \tag{3.3}$$

and

$$\dot H_j = s_j^h Y_j - \delta_h H_j. \tag{3.4}$$

As in our baseline models, in steady state, both $k_j$ and $h_j$ have to be constant. Thus setting $\dot k_j = 0$ and $\dot h_j = 0$ in (3.3) and (3.4) and solving yields the following steady-state values of physical capital and human capital ratios to effective labor:

$$k_j^* = \left[ \left( \frac{s_j^k}{n_j + g + \delta_k} \right)^{1-\beta} \left( \frac{s_j^h}{n_j + g + \delta_h} \right)^{\beta} \right]^{\frac{1}{1-\alpha-\beta}}$$

$$h_j^* = \left[ \left( \frac{s_j^k}{n_j + g + \delta_k} \right)^{\alpha} \left( \frac{s_j^h}{n_j + g + \delta_h} \right)^{1-\alpha} \right]^{\frac{1}{1-\alpha-\beta}}. \tag{3.5}$$
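The discussion that follows treats (3.5) as a regression equation containing $\ln A_j$. A sketch of the algebra that leads from the steady-state ratios to such a log-linear equation, derived here from (3.2) and the common technology advances assumption rather than quoted from the text, is:

```latex
\begin{align*}
y_j^*(t) &= A_j(t)\,(k_j^*)^{\alpha}\,(h_j^*)^{\beta}, \\
\ln y_j^*(t) &= \ln \bar A_j + g t
  + \frac{\alpha}{1-\alpha-\beta}
    \ln\!\left(\frac{s_j^k}{n_j + g + \delta_k}\right)
  + \frac{\beta}{1-\alpha-\beta}
    \ln\!\left(\frac{s_j^h}{n_j + g + \delta_h}\right).
\end{align*}
% The exponents combine because
% alpha(1 - beta) + alpha*beta = alpha and alpha*beta + beta(1 - alpha) = beta.
```

Here $\bar A_j$ denotes the country-specific technology level in $A_j(t) = \bar A_j \exp(gt)$. Setting $\beta = 0$ gives a coefficient of $\alpha/(1-\alpha)$ on the physical investment term, which matches the coefficient quoted below for the model without human capital.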

This is an equation which can be estimated using cross-country data if we have measures of $s_j^h$. In addition, we can use investment rates (investments/GDP) for $s_j^k$, population growth rates $n_j$, and the standard depreciation rates for $\delta_k$. This is what Mankiw, Romer and Weil do (or they estimate a version of this with $\delta_h = \delta_k$). They approximate $s_j^h$ using the fraction of the working age population enrolled in school [... is this a good proxy for investment in human capital?...].

However, with all of these assumptions, equation (3.5) can still not be estimated, because the term ln Aj is unobserved to the econometrician, and could be correlated with all of the other right hand side variables. Therefore implicitly, Mankiw, Romer and Weil make another crucial assumption, considerably stronger than the common technology advances assumption:

With these assumptions, Mankiw, Romer and Weil estimate equation (3.5). The estimation is a success for the augmented Solow model. If human capital is not included, the fit is not very good and the estimates are not reasonable. This is shown in the next table.

Without human capital, the coefficient in front of the investment/GDP ratio should be $\alpha/(1-\alpha)$, thus the estimate suggests $\alpha \approx 0.6$, which is far too high bearing in mind that, given the factor distribution of income, we expect the exponent of capital in the production function to be closer to 1/3.

But for the augmented model with human capital, the fit is very good, as shown in the next table. Now the parameter estimates imply $\alpha \approx 1/3$, $\beta \approx 1/3$ and $R^2 \approx 0.78$.

At face value, these results provide strong support for the augmented Solow model. The estimate of $\alpha$ is consistent with a capital share of one-third in national income, and the $R^2$ implies that almost 80 percent of the differences in income per capita can be explained by investment decisions (human and physical capital differences).

3.2.2 Problems with the Mankiw, Romer and Weil Approach

But there are two major (and related) problems with this approach:

1. The orthogonal technology assumption is too strong. When $A_j$ varies across countries, it will plausibly be correlated with our measures of $s_j^h$ and $s_j^k$, so there will be an omitted variable bias leading to overestimates of $\alpha$ and $\beta$ as well as an exaggeration of the $R^2$.

2. The coefficient on $s_j^h$ is too large. To see this, recall that Mankiw, Romer and Weil use the fraction of the working age population enrolled in school. This variable ranges from 0.4 to over 12 in the sample of countries used for this regression. Their estimates therefore imply that a country with approximately 12 for this variable should have income per capita about 9 times that of a country with $s_j^h = 0.4$! (This is holding all other variables constant.) More explicitly, the predicted log difference in incomes between these two countries is

$$\frac{\beta}{1-\alpha-\beta} \left( \ln 12 - \ln 0.4 \right) \approx 0.66 \times 3.40 \approx 2.24,$$

and $\exp(2.24) \approx 9.4$, that is, an income ratio of about 9.

In practice, the difference in average years of schooling between any two countries over this time period is less than 12. The labor literature suggests that an additional year of schooling is associated with a 6 to 10 percent increase in individual earnings (e.g., in the individual-level Mincer regression $\ln w_i = X_i'\gamma + \phi E_i$, where $w_i$ is wage income, $X_i$ is a set of demographic controls, and $E_i$ is years of schooling, $\phi$ is estimated to be between 0.06 and 0.1). This implies that a worker with one more year of schooling is typically about 6 to 10 percent more productive. So in the absence of human capital externalities, a country with 12 more years of average schooling should be between $\exp(0.06 \times 12) \approx 2.05$ and $\exp(0.10 \times 12) \approx 3.3$ times as rich, not 9 times as rich! Even allowing for human capital externalities, one would need very large human capital externalities in order to get this type of result (existing estimates of human capital externalities, for example, Acemoglu and Angrist, 2000, show that they are rather small).

To understand this last point, consider a simple competitive economy. Suppose that each firm has a production function

$$y = k^{1-\alpha} (Ah)^{\alpha}.$$

Firms face cost of capital $r$, and human capital is a function of schooling, with the standard exponential form $h_i = \exp(\phi E_i)$. The first-order condition from firm maximization gives

$$r = (1-\alpha) \left( \frac{Ah}{k} \right)^{\alpha}.$$

In other words, all workers, irrespective of their level of schooling, will work at exactly the same physical to human capital ratio. Wages are equal to marginal product, so

$$w(h) = \alpha (1-\alpha)^{(1-\alpha)/\alpha} A \, r^{-(1-\alpha)/\alpha} \, h.$$

So wages are linear in human capital due to constant returns to scale. Taking logs of this equation, we end up with the standard log-linear wage equation

$$\ln w_i = \text{cst} + \phi E_i,$$

with the slope coefficient on education measuring the relationship between education and human capital.

Now consider two economies with the same technology and the same interest rate (for example, because of open capital accounts), but in one economy all workers have $E_1$ years of schooling, while in the other they have $E_2 > E_1$ years of schooling. How large should the income gap between these two countries be? Using the fact that with the same interest rate both economies will function at the same physical to human capital ratio, we immediately obtain

$$Y_i = A (1-\alpha)^{(1-\alpha)/\alpha} r^{-(1-\alpha)/\alpha} \exp(\phi E_i),$$

Or, taking logs, we obtain that

$$\log Y_2 - \log Y_1 = \phi (E_2 - E_1).$$

So if one economy has on average one more year of schooling, and $\phi$ is about 6 percent, its income should be 6 percent higher. In the data, there are much larger differences. For example, a cross-country regression of income per capita on average years of schooling in 1985 gives

$$\log Y = \underset{(0.027)}{0.313} \, E$$

In other words, the correlation between income and schooling is too strong relative to what we should expect on the basis of micro evidence. In particular, the effect of schooling on income is much larger than the 6-10 percent difference expected.

This result is not simply explained by the fact that interest rates vary across countries. Notice that we can write $r = (1-\alpha) Y/K$, so including the (log) capital-output ratio would be one way to control for interest rate differences. In this regression, the log capital-output ratio should have a coefficient of $(1-\alpha)/\alpha$, approximately 0.5 taking $\alpha$ as 2/3. Running this regression with 1985 data, we obtain

$$\log Y = \underset{(0.033)}{0.266} \, E + \underset{(0.178)}{0.408} \, \log \frac{K}{Y}$$

So, there is still a very large effect of education on income, and the quantitative effect of capital (as a proxy for interest rates) is plausible.

This relationship between education and income may reflect human capital externalities. For example, we might have the productivity term, $A$, as a function of average human capital in the economy. In this case, the rate of return to human capital in the Mincer regressions would only reflect the private return, that is, the increase in the individual's wage as a function of his own human capital, holding average human capital constant. But regressions using aggregate data would capture the total effect of an increase in human capital on income, that is, the private plus the external effect of schooling. Therefore, one possibility is that there are large human capital externalities. However, as noted above, existing evidence indicates that human capital externalities are limited.

The alternative interpretation of the patterns is that there are differences in technologies, the $A_j$'s, and these are correlated with human capital differences. Such a pattern of correlation may arise because human capital responds to technology, or because some third factor affects both human capital and technology.
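The orders of magnitude in this section can be checked with a few lines of Python. This is only a sketch: the value 0.66 for the schooling coefficient is an assumption chosen because it reproduces the log difference of about 2.24 quoted above, and the function name is hypothetical.

```python
import math

# Macro implication of the Mankiw-Romer-Weil estimates.
# 0.66 is an assumed value for beta / (1 - alpha - beta); it reproduces the
# log income difference of about 2.24 quoted in the text.
schooling_coef = 0.66
log_gap_macro = schooling_coef * (math.log(12) - math.log(0.4))
ratio_macro = math.exp(log_gap_macro)

def mincer_income_ratio(phi, extra_years):
    """Income ratio implied by h = exp(phi * E) at a common interest rate,
    using log Y2 - log Y1 = phi * (E2 - E1)."""
    return math.exp(phi * extra_years)

ratio_micro_low = mincer_income_ratio(0.06, 12)
ratio_micro_high = mincer_income_ratio(0.10, 12)

# Cross-country slope of log income on years of schooling reported in the text.
macro_slope = 0.313

print(f"macro log gap {log_gap_macro:.2f}, income ratio {ratio_macro:.1f}")
print(f"micro income ratio for 12 extra years: "
      f"{ratio_micro_low:.2f} to {ratio_micro_high:.2f}")
print(f"macro slope is {macro_slope / 0.10:.1f} to {macro_slope / 0.06:.1f} "
      f"times the Mincer return")
```

The comparison of `ratio_macro` with the two micro-based ratios is the whole argument in miniature: the cross-country estimates imply an effect of schooling several times larger than individual-level wage regressions can account for.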

3.2.3 The Macro Mincer Approach (Bils-Klenow-Rodriguez-Hall-Jones)

A related approach is to use calibration/levels accounting rather than regression analysis and make use of the findings of Mincer (micro wage) regressions. This is the approach first taken by Bils and Klenow, and then by Klenow and Rodriguez and Hall and Jones. The advantage of the calibration approach is that the omitted variable bias underlying the estimates of Mankiw, Romer and Weil will be less important (since micro-level evidence is being used to anchor the contribution of human capital). The disadvantage is that certain assumptions on functional forms have to be taken much more seriously, and we explicitly have to assume no human capital externalities.

Here let me follow Hall and Jones. Consider the following production function

$$Y_j = K_j^{1-\alpha} (A_j H_j)^{\alpha} \tag{3.6}$$

with $H_j$ interpreted as efficiency units of labor. Assume the following Mincer-type relationship between human capital and education

$$H_j = \sum_{E} \exp\{\phi(E)\} L_j(E)$$

where $\phi(E)$ is the rate of return to $E$ years of schooling and $L_j(E)$ is the number of individuals in country $j$ with $E$ years of schooling. We can use different values for $\phi(E)$ and construct alternative estimates of $H_j$. Hall and Jones (1999) use a piecewise linear specification for $\phi(E)$ based on work by Psacharopoulos from less developed countries (showing returns to earlier years of schooling that are greater than to higher education).

Once we have a series for $H_j$ and one for $K_j$, which can be constructed using standard perpetual inventory methods, we can construct predicted incomes, for example, as

$$\hat{Y}_t^j = (K_t^j)^{1/3} (A_t^{US} H_t^j)^{2/3}$$

and compare these predicted incomes with actual incomes. Alternatively, we could back out country-specific technology terms (relative to the U.S.) as

$$\frac{A_t^j}{A_t^{US}} = \left( \frac{Y_t^j}{Y_t^{US}} \right)^{3/2} \left( \frac{K_t^{US}}{K_t^j} \right)^{1/2} \left( \frac{H_t^{US}}{H_t^j} \right)$$

Hall and Jones perform this exercise using output per worker rather than income per capita. They find:

1. Differences in physical and human capital still matter a lot, accounting for as much as 50 percent of the actual differences in output per worker.

2. But there are also significant productivity differences.
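The calibration steps described above can be sketched in Python as follows. The slopes of the piecewise-linear return function (13.4, 10.1 and 6.8 percent) are the values usually attributed to Hall and Jones (1999) and should be treated as illustrative; the capital stocks, schooling distributions and technology levels are hypothetical numbers, and the function names are not from the original paper.

```python
import math

def phi(E):
    """Piecewise-linear cumulative return to E years of schooling.
    Assumed slopes: 0.134 for years 0-4, 0.101 for years 4-8, 0.068 beyond 8."""
    return (0.134 * min(E, 4)
            + 0.101 * min(max(E - 4, 0), 4)
            + 0.068 * max(E - 8, 0))

def human_capital(workers_by_schooling):
    """H_j = sum over E of exp(phi(E)) * L_j(E)."""
    return sum(math.exp(phi(E)) * L for E, L in workers_by_schooling.items())

def output(K, H, A):
    """Y = K^(1/3) * (A*H)^(2/3), i.e. equation (3.6) with alpha = 2/3."""
    return K ** (1 / 3) * (A * H) ** (2 / 3)

def relative_technology(Y, K, H, Y_us, K_us, H_us):
    """A_j / A_US backed out from outputs, capital stocks and human capital."""
    return (Y / Y_us) ** 1.5 * (K_us / K) ** 0.5 * (H_us / H)

# Hypothetical data: {years of schooling: number of workers}.
H_us = human_capital({8: 20.0, 12: 50.0, 16: 30.0})
H_j = human_capital({0: 40.0, 4: 40.0, 8: 20.0})
K_us, K_j = 300.0, 30.0
A_us, A_j = 1.0, 0.5  # "true" technology levels used to generate the fake data

Y_us = output(K_us, H_us, A_us)
Y_j = output(K_j, H_j, A_j)
Y_j_predicted = output(K_j, H_j, A_us)  # predicted income under U.S. technology

print("actual / predicted income:", Y_j / Y_j_predicted)
print("backed-out relative technology:",
      relative_technology(Y_j, K_j, H_j, Y_us, K_us, H_us))
```

Because the fake data are generated from the same production function, the backed-out relative technology should equal the ratio `A_j / A_us` used to build them; with real data it is a residual and inherits every measurement and functional form error.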

The next figure and the table show a summary of their results:

The conclusion of this calibration exercise is therefore very similar to the one that followed from the regression analysis presented in the previous section. Naturally, some of the assumptions of this calibration exercise can be relaxed. For example, instead of assuming a Cobb-Douglas production function, one could do levels accounting. Essentially, we rank the countries according to their capital-labor ratio (or capital-output ratio), and then use the equivalent of the growth accounting equation above. In particular, we can write

$$\hat{x}_{j,j+1} = g_{Y,j,j+1} - \bar{\alpha}_{K,j,j+1} \, g_{K,j,j+1} - \bar{\alpha}_{L,j,j+1} \, g_{L,j,j+1}$$

where $j$ stands for country, $g_{Y,j,j+1}$ is the proportional difference in output between countries $j$ and $j+1$, $g_{K,j,j+1}$ is the proportional difference in capital stock between countries $j$ and $j+1$, $g_{L,j,j+1}$ is the proportional difference in labor supply between the two countries, $\bar{\alpha}_{K,j,j+1}$ and $\bar{\alpha}_{L,j,j+1}$ are the average factor shares of the two countries, and $\hat{x}_{j,j+1}$ is the TFP difference. With this method, and taking one of the countries, for example the United States, as the base country, we can calculate relative technology differences across countries. Of course, for this we need to have good measures of factor shares in different countries, which are not always available.
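A minimal sketch of this levels accounting procedure is given below. It assumes that proportional differences are measured as log differences, that the factor shares of the two countries being compared are simply averaged, and that the labor share is one minus the capital share; the country data are hypothetical and generated from a Cobb-Douglas technology only so that the answer is known in advance.

```python
import math

def tfp_log_differences(countries):
    """TFP differences between adjacent countries in a ranked list.
    Each country is a dict with output Y, capital K, labor L and capital share share_K."""
    diffs = []
    for low, high in zip(countries, countries[1:]):
        g_Y = math.log(high["Y"] / low["Y"])
        g_K = math.log(high["K"] / low["K"])
        g_L = math.log(high["L"] / low["L"])
        share_K = 0.5 * (low["share_K"] + high["share_K"])
        share_L = 1.0 - share_K  # assumes constant returns to scale
        diffs.append(g_Y - share_K * g_K - share_L * g_L)
    return diffs

def relative_log_tfp(countries, base_index):
    """Cumulate adjacent differences; express log TFP relative to a base country."""
    levels = [0.0]
    for d in tfp_log_differences(countries):
        levels.append(levels[-1] + d)
    return [level - levels[base_index] for level in levels]

def make_country(A, K, L, share_K=1 / 3):
    """Hypothetical country with Y = A * K^share_K * L^(1 - share_K)."""
    return {"Y": A * K ** share_K * L ** (1 - share_K),
            "K": K, "L": L, "share_K": share_K}

# Countries ranked by capital-labor ratio; the last one plays the role of the base.
countries = [make_country(0.2, 10.0, 50.0),
             make_country(0.5, 100.0, 60.0),
             make_country(1.0, 1000.0, 100.0)]
relative_tfp = [math.exp(x) for x in relative_log_tfp(countries, base_index=2)]
print(relative_tfp)
```

When factor shares differ across countries, the chained differences are only an approximation, and the result depends on the ranking used to order the countries.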

3.3

In the above approach, productivity/technology differences are obtained as residuals from a calibration exercise, so we have to trust the functional form assumptions used in this exercise. An alternative is to use additional data. This is what Trefler does to test an augmented version of the Heckscher-Ohlin approach to international trade. Although Trefler does not emphasize the implications of his findings for productivity differences, a byproduct of his analysis is a series of estimates for differences in factor productivities across countries.

Trefler starts from the standard Heckscher-Ohlin model of international trade, but allows for factor-specific productivity differences across countries. Other than these factor-specific differences, all countries share the same technology (i.e., there are no differences in industry technologies) and share the same homothetic preferences (in particular, they allocate consumption expenditures across goods in the same manner). It is important that technology differences take the form of factor productivity differences. In particular, one unit of labor (or one college graduate) in the U.S. could be more productive than one unit of labor (or one college graduate) in Nigeria. The same applies to capital. This specification of course is more general than the production function in (3.1), since capital-augmenting technology differences are allowed and the elasticity of substitution between different factors is not assumed to be equal to 1.

A standard equation in international trade is that, in the absence of any trading frictions and with identical (or homothetic) preferences, the net export of factor $f$ embedded in the exports of country $j$, $X_j^f$, is

$$X_j^f = \pi_j^f V_j^f - s_j \sum_{i=1}^{N} \pi_i^f V_i^f \tag{3.7}$$

where $V_j^f$ is the endowment of factor $f$ in country $j$, $\pi_j^f$ is the factor productivity of factor $f$ in country $j$, and $s_j$ is the share of country $j$ in world consumption (this uses the assumption that all countries have the same homothetic preferences). $N$ is the total number of countries.

Given estimates of the net export of factor contents, the $X_j^f$'s, equation (3.7) solves for a unique sequence of $\pi_j^f$'s taking one of the countries as the base. So from this equation we can obtain an estimate of the differences in factor productivities. At this level, this may be viewed simply as an untested strong hypothesis. The major contribution of Trefler's paper is to note that if there is factor price equalization, we should also have

$$\frac{w_j^f}{\pi_j^f} = \frac{w_{j'}^f}{\pi_{j'}^f} \tag{3.8}$$

for any pair of countries, $j$ and $j'$, where $w_j^f$ is the price of factor $f$ in country $j$. With data on factor prices, we can therefore construct alternative series for the $\pi_j^f$'s. It turns out that the series implied by (3.7) and (3.8) are very similar, so there appears to be some validity to this approach. The following figure shows his estimates:

Given this validation, we can presume that there is some information in the numbers that Trefler obtains. These numbers imply that there are very large differences in labor productivity, and some substantial, but much smaller, differences in capital productivity. For example, labor in Pakistan is 1/25th as productive as labor in the United States. In contrast, capital productivity differences are much more limited than labor productivity differences. For example, capital in Pakistan is only half as productive as capital in the United States.
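The logic of backing out factor productivities from (3.7) and cross-checking them with (3.8) can be sketched as follows. The code normalizes the productivity of a base country to 1, writes `pi` for the factor productivity term, and uses entirely hypothetical endowments, consumption shares and factor prices; it is not Trefler's estimation procedure, only the algebra behind it for a single factor.

```python
def solve_factor_productivities(X, V, s, base=0):
    """Solve X_j = pi_j * V_j - s_j * W for pi_j, where W = sum_i pi_i * V_i.
    With pi_base = 1, the base-country equation gives
    W = (V_base - X_base) / s_base, and then pi_j = (X_j + s_j * W) / V_j."""
    W = (V[base] - X[base]) / s[base]
    return [(X_j + s_j * W) / V_j for X_j, V_j, s_j in zip(X, V, s)]

def productivities_from_factor_prices(w, base=0):
    """Under factor price equalization, w_j / pi_j is the same in all countries,
    so pi_j relative to the base equals w_j / w_base."""
    return [w_j / w[base] for w_j in w]

# Hypothetical world with three countries and one factor.
true_pi = [1.0, 0.5, 0.04]
V = [100.0, 80.0, 200.0]   # factor endowments
s = [0.6, 0.3, 0.1]        # consumption shares, summing to one
world = sum(p * v for p, v in zip(true_pi, V))
X = [p * v - s_j * world for p, v, s_j in zip(true_pi, V, s)]  # net factor exports

from_trade = solve_factor_productivities(X, V, s, base=0)
from_prices = productivities_from_factor_prices([20.0, 10.0, 0.8], base=0)
print(from_trade)
print(from_prices)
```

In this constructed example the two series agree by design. In the data, agreement between the trade-based and price-based series is the empirical check described above.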


4.1 From Proximate to Fundamental Causes

The use of the Solow model and the production function approach illustrated how cross-country income differences can be understood as resulting from physical capital differences, human capital differences and technology differences. These technology differences, themselves, may represent actual differences in the technologies used by countries, or other efficiency differences in the use of the factors. At this level, the framework we have does a very good job of helping us understand the proximate causes of income differences. The same procedure also helps us understand the proximate causes of the process of economic growth.

However, the observation that a country is poorer than another because it has worse technology, less physical capital and less human capital immediately poses the next question: why does it have worse technology, less physical capital, less human capital? This question is, in some sense, about the fundamental causes of income per capita (and growth) differences across countries. Growth theory is useful in highlighting the proximate causes, in providing us with a framework for thinking about the fundamental causes, and also in clarifying the mechanics of the process of growth, so that we can more carefully evaluate different theories and approaches. But we have to take this additional step of looking for fundamental causes, otherwise what we have learned will be only partial.

4.2 Hypotheses

Why do some countries invest more in physical and human capital and possess better technologies? There are four sets of broad hypotheses:

1. Luck: some countries just turned out to be lucky. It is difficult to operationalize this approach, and at some level, it is quite similar to the other hypotheses, but less specific (one way of operationalizing it may be by using the multiple equilibrium models we will discuss below). A version of this hypothesis where such differences are transitory is clearly not supported by the evidence presented so far, which points to very persistent differences over long periods. A version of this hypothesis where a small difference caused by luck may lead to large persistent differences is also difficult to reconcile with the data given the reversal documented above. So I will place less emphasis on the importance of luck. Nevertheless, some of the theories presented below will show how small differences in initial conditions can lead to large ultimate differences.

2. Geography: This view has become very popular recently. It claims that differences in economic performance reflect, to a large extent, differences in geographic, climatic and ecological characteristics across countries. The most common is the view that climate has a direct effect on income through its influence on work effort. This idea dates back to Machiavelli and Montesquieu. Alfred Marshall (1890) similarly wrote: "vigor depends partly on race qualities: but these, so far as they can be explained at all, seem to be chiefly due to climate." Gunnar Myrdal (1968) wrote that climate exerts everywhere a powerful influence on all forms of life, and that "serious study of the problems of underdevelopment... should take into account the climate and its impacts on soil, vegetation, animals, humans and physical assets, in short, on living conditions in economic development."

The recent bestseller by Jared Diamond, Guns, Germs and Steel, suggests that the timing of the Neolithic revolution has had a long-lasting effect by determining which societies were the first ones to develop strong armies and technology. For example, he states that: "...proximate factors behind Europe's conquest of the Americas were the differences in all aspects of technology. These differences stemmed ultimately from Eurasia's much longer history of densely populated... societies dependent on food production" (1997, p. 358). Diamond argues that differences in the nature and history of food production, in turn, are due to the types of crops, domesticated animals, and the axis of agricultural technology diffusion in different continents, all of which are geographically determined characteristics.

In economics circles, Jeff Sachs has been pushing for this view. He argues that "Certain parts of the world are geographically favored. Geographical advantages might include access to key natural resources, access to the coastline and sea-navigable rivers, proximity to other successful economies, advantageous conditions for agriculture, advantageous conditions for human health." (2000, p. 30). He further suggests that "Tropical agriculture faces several problems that lead to reduced productivity of perennial crops in general and of staple food crops in particular" (2000, p. 32), and that "The burden of infectious disease is similarly higher in the tropics than in the temperate zones" (2000, p. 32). Finally, Sachs argues that the greater population in the temperate areas over the past centuries led to more rapid advances in technologies appropriate for these areas relative to technologies necessary for development in the tropics (2001, p. 3 and 2000, pp. 33-34).

The following figure shows the geographical distribution of income per capita, which is consistent with some geographic factors (such as climate) having an effect on the long-run distribution of income across countries:

3. Institutions: according to this view, differences in economic performance largely reflect differences in the organization of society. Societies that provide incentives and opportunities for investment will be richer than those that fail to do so. There are many versions of this hypothesis, some of them suggesting that institutions that support property rights and rule of law are important, others suggesting that limited government, or equal opportunity, or specific government policies are important for investment and efficiency (of course, whether these policies are adopted is in turn determined by other factors).

4. Culture and social capital: this view instead emphasizes whether societies are able to engender the values conducive to entrepreneurship or cooperation among agents. Popular versions of this story include the thesis by Max Weber on the importance of religion for capitalism, and the recent work by Robert Putnam on social capital and cooperation (which is in turn related to some early work by Banfield on lack of cooperation in the South of Italy).

There are two major differences between the institutions view and the culture view. First, in the institutions view, it is the social organization of the society, which, at least in theory, is changeable, that is responsible for prosperity. Instead, in the culture view, culture or social capital, to a first approximation, cannot be changed. Second, the institutions view emphasizes much more the importance of conflict between different groups or individuals as a determinant of social outcomes, whereas there is a more cooperative undertone to the culture view (especially in the social capital versions of this view). Finally, many versions of the culture view, such as those of Max Weber or David Landes, emphasize religion or other predetermined factors as crucial determinants of individuals' approach to life and economic success.

Can we say anything about the relative importance of geography, institutions and culture? Measures of each are strongly correlated with income per capita or other determinants of income. This is borne out both by growth regressions and by level regressions. For example, returning to growth regressions of the type (1.2), the variables in $X$ that enter significantly can be interpreted as determinants of cross-country differences in growth. There is a very large literature on regressions of this sort. These regression analyses find a variety of variables to be important in explaining growth. First, investment rates in physical and human capital are found to be important. But this does not inform us much about the ultimate sources of differences in economic performance, since differences in physical and human capital investments must in turn be caused by other factors. Among these other factors, openness, the role of government, institutions, geography, political instability, share of natural resources, financial development, and demographics are typically included in these types of empirical analyses and found to be important.

The big problem with all this literature is that there is very little attempt to formally establish causality. Much of the correlation may be no more than just that: spurious correlation, reflecting the importance of other omitted factors. This lack of causality could be important especially when we think about the broad hypotheses outlined above. In particular, institutions and culture are endogenous, so one might be tempted to think that the correlation between these variables and income is more likely to be spurious, and therefore give more importance to geography. This reasoning is invalid, since the particular historical development of the world economy may have brought about a correlation between institutional development and geography.

4.3

As discussed above, in Acemoglu, Johnson and Robinson (2002), we looked at the horserace between geography and institutions. The geography explanation predicts persistence in income, since the geographic, ecological and climatic factors that should matter change only a little over periods as long as 500 years. Although the institutions view also suggests persistence, a major shock could disrupt persistence, or even create a reversal. In this context, the expansion of European overseas empires provides a natural experiment. Europeans affected the institutions of many societies through their colonization.

More strikingly, it turns out that Europeans introduced worse institutions (in the sense of institutions discouraging investment) in previously prosperous places. Therefore, while the geography view predicts persistence between 1500 and today among the former European colonies, the institutions view suggests the possibility of a reversal. The data, as discussed above, strongly suggest that there was a reversal in relative rankings across this set of countries.

This discussion suggests that geographic or climatic differences across countries are not of first-order importance in shaping the differences in income we observe today. At the very least, these geographic factors appear to be less important than other factors. Nevertheless, this observation does not give us a direct estimate of the effect of institutions/social organization on economic performance.

To go beyond a simple horserace of geography versus institutions, and to estimate the impact of institutions on economic performance, we need a source of exogenous variation in institutions. In Acemoglu, Johnson and Robinson (2001), we proposed a theory of institutional differences among countries colonized by Europeans, and exploited this theory to derive a possible source of exogenous variation. Our theory rests on three premises:

1. There were different types of colonization policies which created different sets of institutions. At one extreme, European powers set up extractive states, exemplified by the Belgian colonization of the Congo. These institutions did not introduce much protection for private property, nor did they provide checks and balances against government expropriation. At the other extreme, many Europeans migrated and settled in a number of colonies. The settlers in many areas tried to replicate European institutions, with strong emphasis on private property and checks against government power. Primary examples of this include Australia, New Zealand, Canada, and the United States.

2. The colonization strategy was influenced by the feasibility of settlements. In places where the disease environment was not favorable to European settlement, extractive policies were more likely.

3. The colonial state and institutions persisted even after independence.

Based on these three premises, we use the mortality rates expected by the first European settlers in the colonies as an instrument for current institutions in these countries. Summarizing this schematically:

(potential) settler mortality → settlements → early institutions → current institutions → current performance

The results show a large effect of institutions on income, and generate no evidence that geography matters. The following three figures summarize most of the findings. The first shows the cross-sectional relationship between income per capita and a measure of economic institutions, protection against expropriation risk. This is one of many potential variables capturing the institutional features of a country that can be used. Its advantage is that it is directly about protection of property rights, and thus intimately related to the economic incentives that are highlighted by the institutions approach.

[Figure: income per capita against protection against expropriation risk, by country.]

The second shows the first-stage relationship between log (potential) settler mortality and protection against expropriation risk (so that higher scores correspond to better protection against expropriation by government or elites, or generally to better property rights protection), and the third shows the reduced form between income per capita and settler mortality. The latter two figures together give the two-stage least squares estimate of the effect of broad economic institutions on long-run income per capita differences.
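How the first stage and the reduced form combine into a two-stage least squares estimate can be illustrated with simulated data. The sketch below is not the Acemoglu, Johnson and Robinson dataset: every number, including the "true" effect of institutions, is a hypothetical choice, and the single-instrument case is used so that the estimate is simply the ratio of the two slopes.

```python
import random

def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

random.seed(0)
n = 5000
true_effect = 0.9  # hypothetical causal effect of institutions on log income

# Instrument (think: log settler mortality) and an unobserved factor that
# affects both institutions and income, which is what biases OLS.
mortality = [random.gauss(0, 1) for _ in range(n)]
unobserved = [random.gauss(0, 1) for _ in range(n)]
institutions = [-0.6 * m + 0.8 * u + random.gauss(0, 1)
                for m, u in zip(mortality, unobserved)]
income = [true_effect * q + 1.0 * u + random.gauss(0, 1)
          for q, u in zip(institutions, unobserved)]

first_stage = slope(mortality, institutions)   # institutions on the instrument
reduced_form = slope(mortality, income)        # income on the instrument
iv_estimate = reduced_form / first_stage       # 2SLS with one instrument
ols_estimate = slope(institutions, income)     # contaminated by the omitted factor

print("first stage:", first_stage)
print("reduced form:", reduced_form)
print("IV estimate:", iv_estimate, "OLS estimate:", ols_estimate)
```

In this simulation the instrument is valid by construction, so the ratio recovers a value close to `true_effect` while OLS does not. With real data the exclusion restriction cannot be verified in this way.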

[Figure: protection against expropriation risk against log (potential) settler mortality, by country (first stage).]

[Figure: income per capita against log (potential) settler mortality, by country (reduced form).]

Acemoglu, Johnson and Robinson (2001) conduct a variety of checks to show that this relationship is robust, and likely due to the institutional channel (but, as with all instrumental variable strategies, there is always the possibility that the instrument does not satisfy the exclusion restriction).

Even taking this evidence at face value, can we distinguish between culture and institutions? Some of the results in Acemoglu, Johnson and Robinson (2001), which control for proxies of cultural differences, religion and the identity of the colonial power, suggest that it is political and economic institutions, not culture, that matter more. But this is not conclusive. Future and smarter work is needed to make progress in distinguishing between culture-based and institutional explanations.

For the rest of the course, we will look deeper into a range of models in order to understand how differences in various policies, technologies, preferences and institutions translate into growth rates and cross-country differences. Thus the rest of the course will be about the mechanics of economic growth. But throughout, you may want to bear in mind how these mechanics may relate to fundamental causes as discussed in this chapter.

In this part, we discuss the basic neoclassical approaches to economic growth, focusing on models with exogenous technological progress. The most important advance over what we have seen so far is that these models explicitly incorporate consumer preferences and consumer behavior, so we can make meaningful statements about savings rates being endogenous and also think about the welfare of consumers.

At this point, let us take a step back. The entire Solow growth model was predicated on a constant savings rate. Instead, it would be much more satisfactory to specify the preference orderings of individuals as in standard general equilibrium theory and go from there. To prepare for this, let us consider an economy consisting of a unit measure of infinitely-lived households. These households can be truly infinitely lived, or could consist of overlapping generations with full (or partial) altruism linking generations within the household. Then the problem would be one in which each household $i$ has an instantaneous utility function given by

$$u_i(c_i(t)),$$

where $u_i : \mathbb{R}_+ \to \mathbb{R}$ is increasing and concave and $c_i(t)$ is the consumption of household $i$. This means that the individual does not derive any utility from the consumption of other households, so consumption externalities are ruled out. Throughout, we will assume that individuals discount the future proportionally (also referred to as exponentially), so that in discrete time and ignoring uncertainty, their preferences at time $t = 0$ are given by

$$\sum_{t=0}^{\infty} \beta_i^t u_i(c_i(t)),$$

where $\beta_i \in (0, 1)$ is the discount factor of household $i$. In addition, we can have differences in households' income processes; for example, each household could have effective labor endowments of $\{h_i(t)\}_{t=0}^{\infty}$, and thus a sequence of labor income of $\{w(t) h_i(t)\}_{t=0}^{\infty}$, where $w(t)$ is the equilibrium wage rate per unit of effective labor.

Unfortunately, at this level of generality, this problem is too hard. Even though we may be able to establish some existence results, it would be impossible to go beyond that. To avoid the complexities involved in this general formulation, the standard approach in macroeconomics and economic growth is to assume the existence of a representative consumer.
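As a small illustration of exponentially discounted preferences, the sketch below evaluates the discounted sum of utilities over a finite consumption path. The finite horizon, the log utility default and the two consumption paths are assumptions made purely for illustration.

```python
import math

def discounted_utility(consumption, beta, u=math.log):
    """Sum over t of beta**t * u(c(t)) for a finite consumption path."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in (0, 1)")
    return sum(beta ** t * u(c) for t, c in enumerate(consumption))

# Hypothetical paths with the same total consumption but different timing.
early = [3.0, 2.0, 1.0]
late = [1.0, 2.0, 3.0]

for beta in (0.99, 0.50):
    gap = discounted_utility(early, beta) - discounted_utility(late, beta)
    print(f"beta = {beta}: utility(early) - utility(late) = {gap:.3f}")
```

With log utility the gap between the two paths is proportional to $1 - \beta^2$, so a less patient household (lower `beta`) values front-loaded consumption by a wider margin.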

5.1 Representative Consumer

Instead of the more general framework mentioned above, we will look at economies that admit a representative consumer. This means that the preference side of the economy can be represented as if there were a single consumer making the consumption and saving decisions (and the labor supply decisions when these are endogenized). One way of having a representative consumer is to assume that each household has the same utility function $u(c_i(t))$, where $u : \mathbb{R} \to \mathbb{R}$ is increasing and concave and $c_i(t)$ is the consumption of household $i$, the same discount factor $\beta$, and the same sequence of effective labor endowments $\{h(t)\}_{t=0}^{\infty}$. The advantage of this approach is that the economy indeed has a representative consumer, so the representative consumer has a normative meaning as well as a positive meaning. In other words, we can represent the savings and consumption decisions as if they were coming from a representative consumer, and we can use the same preferences to evaluate aggregate welfare.

Alternatively, we could assume that there is heterogeneity among households, but that aggregate behavior can be represented as if it were the outcome of the maximization problem of a representative consumer. In this case, the representative consumer will have positive meaning, but no normative meaning.

In any case, with the representative consumer assumption in discrete time, the preference side can be represented as the following maximization problem starting at time $t = 0$:
\[ \max \sum_{t=0}^{\infty} \beta^{t} u(c(t)), \]

where $\beta \in (0, 1)$ is the common discount factor of all the households, and $c(t)$ is the consumption level of the representative household. This is an extremely convenient assumption, though as the next theorem shows, most models do not admit representative consumers:

Theorem 8 (Debreu-Mantel-Sonnenschein) Consider an exchange economy with a finite number $N < \infty$ of commodities and $H < \infty$ households, each with potentially different preferences. Let $p$ be the vector of prices and $x(p)$ be the vector of aggregate excess demands for these commodities at the price vector $p$. For $\varepsilon > 0$, let $P_{\varepsilon} = \{ p \in \mathbb{R}^{N}_{+} : p_j / p_{j'} \ge \varepsilon \text{ for all } j \text{ and } j' \}$. Then for any $\varepsilon > 0$, any continuous function $x : P_{\varepsilon} \to \mathbb{R}^{N}_{+}$ that satisfies Walras' Law and homogeneity of degree 0 can be an aggregate excess demand function.

Proof. See Debreu (1974) or Mas-Colell, Whinston and Green (1995), Proposition 17.E.3.

Essentially, this theorem states that in general, the fact that there are optimizing individuals in the background imposes no restriction (such as being downward sloping, satisfying the weak axiom of revealed preference, or possessing a negative semi-definite Jacobian) on aggregate (market) excess demand functions. This is therefore a negative result warning us against the use of models with representative consumers. Nevertheless, this result is partly an outcome of very strong income effects. Special but approximately realistic preference functions, as well as restrictions on the distribution of income across individuals, enable us to rule out arbitrary aggregate excess demand functions. Here the following aggregation theorem is particularly useful. To state this theorem, recall that an indirect utility function for household $i$ is $v_i(p, y_i)$, which specifies the household's (ordinal) utility as a function of the price vector $p$ and the household's income (wealth) $y_i$.

Theorem 9 (Gorman) Consider an economy with a finite number $N < \infty$ of commodities and $H < \infty$ households. Suppose that the preferences of household $i$ lead to an indirect utility function of the form
\[ v_i(p, y_i) = a_i(p) + b(p)\, y_i \quad \text{for } i = 1, \ldots, H. \]
Then these preferences can be aggregated and represented by those of a representative consumer, with indirect utility
\[ v(p, y) = \sum_{i=1}^{H} a_i(p) + b(p)\, y, \]
where $y \equiv \sum_{i=1}^{H} y_i$ is aggregate income.
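One way to see why the common slope $b(p)$ matters in Theorem 9 is through Roy's identity. What follows is only a sketch, assuming that $a_i$ and $b$ are differentiable, that $b(p) > 0$, and that demands are interior. The notation $x_i^{j}$ for household $i$'s demand for good $j$ is introduced here for illustration and does not appear in the text.

```latex
\begin{align*}
x_i^{j}(p, y_i)
  &= -\frac{\partial v_i(p, y_i) / \partial p_j}{\partial v_i(p, y_i) / \partial y_i}
   = -\frac{1}{b(p)} \left[ \frac{\partial a_i(p)}{\partial p_j}
      + \frac{\partial b(p)}{\partial p_j}\, y_i \right], \\
\sum_{i=1}^{H} x_i^{j}(p, y_i)
  &= -\frac{1}{b(p)} \left[ \sum_{i=1}^{H} \frac{\partial a_i(p)}{\partial p_j}
      + \frac{\partial b(p)}{\partial p_j}\, y \right],
  \qquad y \equiv \sum_{i=1}^{H} y_i .
\end{align*}
```

The second line is exactly what Roy's identity gives when applied to $v(p, y) = \sum_{i=1}^{H} a_i(p) + b(p)\, y$, so under these assumptions aggregate demand depends on the distribution of income only through its total $y$.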

Proof. The proof follows from basic micro theory, and is left to you as an exercise.

Therefore, when there is a special form of quasi-linearity in the preferences, it is possible to aggregate them into the preferences of a representative consumer. In this context, it is interesting to consider the CRRA (constant relative risk aversion) utility function in the infinite-horizon economy
\[ U = \begin{cases} \displaystyle \sum_{t=0}^{\infty} \beta^{t}\, \frac{c(t)^{1-\theta} - 1}{1-\theta} & \text{if } \theta \neq 1 \text{ and } \theta \ge 0, \\[2ex] \displaystyle \sum_{t=0}^{\infty} \beta^{t} \ln c(t) & \text{if } \theta = 1, \end{cases} \]
where $\theta$ is the coefficient of relative risk aversion and also the inverse of the intertemporal elasticity of substitution, which regulates how willing individuals are to substitute consumption over time. This class of utility functions satisfies the conditions of Theorem 9. We will see below that CRRA preferences have a special role in models of economic growth, because they are the unique class of utility functions that are consistent with balanced growth. Therefore, if we wish to impose balanced growth, the assumption that the economy admits a representative consumer is not as restrictive as in models in which we wish to analyze growth without making the balanced growth assumption.
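A small numerical sketch may help fix ideas about the CRRA family. It assumes nothing beyond the formula above; the function names and the finite-difference step are hypothetical choices made for illustration. It evaluates the per-period utility, shows that values of $\theta$ near 1 approach the log case, and recovers $\theta$ as $-c\,u''(c)/u'(c)$ numerically.

```python
import math


def crra_utility(c, theta):
    """Per-period CRRA utility (c^(1-theta) - 1) / (1 - theta); log case at theta = 1."""
    if c <= 0:
        raise ValueError("consumption must be positive")
    if theta < 0:
        raise ValueError("theta must be nonnegative")
    if math.isclose(theta, 1.0):
        return math.log(c)
    return (c ** (1.0 - theta) - 1.0) / (1.0 - theta)


def discounted_utility(consumption_path, beta, theta):
    """Discounted sum of CRRA utilities over a finite consumption path."""
    return sum(beta ** t * crra_utility(c, theta)
               for t, c in enumerate(consumption_path))


def relative_risk_aversion(c, theta, h=1e-4):
    """Approximate -c u''(c) / u'(c) with central finite differences."""
    u = lambda x: crra_utility(x, theta)
    u_prime = (u(c + h) - u(c - h)) / (2.0 * h)
    u_second = (u(c + h) - 2.0 * u(c) + u(c - h)) / h ** 2
    return -c * u_second / u_prime


if __name__ == "__main__":
    print(crra_utility(2.0, 1.0001), math.log(2.0))   # nearly equal
    print(relative_risk_aversion(1.5, 2.0))           # close to theta = 2
    print(discounted_utility([1.0, 1.1, 1.2], beta=0.95, theta=2.0))
```

Subtracting 1 in the numerator is what makes the $\theta \to 1$ limit well defined; it does not affect choices, since it only shifts utility by a constant.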

5.2 Problem Formulation

Let us now make the representative consumer assumption. Suppose that each household's utility function in discrete time starting at time $t = 0$ is (ignoring uncertainty)
\[ \sum_{t=0}^{\infty} \beta^{t} u(c(t)), \tag{5.1} \]
where $\beta \in (0, 1)$ is the discount factor of the households. In continuous time, this utility function becomes
\[ \int_{0}^{\infty} \exp(-\rho t)\, u(c(t))\, dt. \tag{5.2} \]

Where does the exponential form of the discounting in (5.2) come from? Discounting in the discrete time case is also called exponential, so the link should be apparent. But to see it more precisely, imagine we are trying to calculate the value of \$1 in $T$ periods, and divide the interval $[0, T]$ into $T/\Delta t$ equally-sized subintervals. Let the interest rate in each subinterval be equal to $\Delta t \cdot r$. It is important that the quantity $r$ is multiplied by $\Delta t$; otherwise, as we vary $\Delta t$, we would be changing the interest rate. Clearly the value of \$1 in $T$ periods at this interest rate is given by
\[ v(T \mid \Delta t) \equiv (1 + \Delta t \cdot r)^{T/\Delta t}. \]
Now we want to take the continuous time limit by letting $\Delta t \to 0$, i.e., we wish to calculate
\[ v(T) \equiv \lim_{\Delta t \to 0} v(T \mid \Delta t) \equiv \lim_{\Delta t \to 0} (1 + \Delta t \cdot r)^{T/\Delta t}. \]
Since the exponential function is continuous, the limit can be taken inside it, and we can write
\[ v(T) \equiv \exp\left[ \lim_{\Delta t \to 0} \ln (1 + \Delta t \cdot r)^{T/\Delta t} \right] = \exp\left[ \lim_{\Delta t \to 0} \frac{T}{\Delta t} \ln (1 + \Delta t \cdot r) \right]. \]
However, the term in square brackets has a limit of the form 0/0. Let us next write this as
\[ \lim_{\Delta t \to 0} \frac{\ln (1 + \Delta t \cdot r)}{\Delta t / T} = \lim_{\Delta t \to 0} \frac{r / (1 + \Delta t \cdot r)}{1/T} = rT, \]
where I used l'Hôpital's rule to obtain the first equality, and then took the limits in the numerator and denominator to obtain the second equality. Therefore,
\[ v(T) = \exp(rT). \]

With the same reasoning, \$1 in $T$ periods is worth $\exp(-rT)$ today. The same reasoning applies to discounting utility, so discounting in continuous time takes the exponential form, with $\rho$ as the discount rate.
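The limit argument above can be checked numerically. The following is a minimal sketch; the values of $r$ and $T$ and the function name are arbitrary illustrative assumptions. It compounds \$1 over ever finer subintervals and compares the result with $\exp(rT)$.

```python
import math


def compounded_value(r, T, dt):
    """Value at time T of $1 when interest dt * r is paid in each of T / dt subintervals."""
    return (1.0 + dt * r) ** (T / dt)


if __name__ == "__main__":
    r, T = 0.05, 10.0            # illustrative interest rate and horizon
    limit = math.exp(r * T)      # the continuous-time value exp(rT)
    for dt in (1.0, 0.1, 0.01, 0.001):
        v = compounded_value(r, T, dt)
        print(f"dt = {dt:<6} v(T | dt) = {v:.6f}   gap to exp(rT) = {limit - v:.6f}")
```

The gap shrinks roughly in proportion to the subinterval length, which is consistent with the first-order expansion of $\ln(1 + \Delta t \cdot r)$ used implicitly in the derivation.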

5.3 Welfare Theorems

Ultimately, we are interested in equilibrium growth. But in competitive economies such as those analyzed so far, we know that there should be an intimate connection between Pareto optima and competitive equilibria (so far we were not able to exploit these connections, since utility functions were not specified, so we could not talk of preferences explicitly). To recall these theorems, denote the vector of prices for a finite dimensional commodity vector by $p$, the vector of production across commodities and firms by $q$, and the vector of consumption across commodities and households by $x$. Also denote the vector of endowments across households by $\omega$ and the vector of utilities by $u$. Denote the set of households by $I$, each household by $i$, and use $x_i$ for the vector of consumption of household $i$, and $p \cdot x_i$ for the inner product of the vector of prices and the vector of consumption of household $i$, which is, by definition, equal to the total expenditure of household $i$. Other inner products and subvectors are defined similarly. Recall also that by a competitive economy, we refer to an environment without any externalities and where all commodities are traded competitively (recall that here goods at different dates are different commodities). Then we have the following two important theorems. Both of these theorems are proved in the most elegant fashion in Debreu's Theory of Value for finite commodity spaces, and easier versions of the proofs are contained in Mas-Colell, Whinston and Green (1995). Here I give sketch proofs.

Theorem 10 (First Welfare Theorem) Consider a competitive economy with a finite number of individuals with preferences satisfying non-satiation, a finite number of commodities, and an endowment vector $\omega$. Suppose a competitive equilibrium $(p^*, q^*, x^*)$ exists. Then it is Pareto optimal.

Proof. First recall that $(p^*, q^*, x^*)$ being a competitive equilibrium implies that each household maximizes its utility by choosing $x_i^*$ at the price vector $p^*$ and income level
\[ y_i(p^*) \equiv p^* \cdot \omega_i + \sum_{f} \theta_{if}\, p^* \cdot q_f^*, \]
where $\theta_{if}$ is the share of profits of firm $f$ held by household $i$, and $q_f^*$ is the competitive equilibrium production vector of firm $f$. We have that $\sum_{i} \theta_{if} = 1$ by virtue of being shares.

Suppose, to obtain a contradiction, that there exists a feasible $(p, q, x)$ which Pareto dominates $(p^*, q^*, x^*)$. Then it must be the case that for all households $i \in I$, $x_i$ is weakly preferred to $x_i^*$, i.e., $x_i \succsim_i x_i^*$, and for at least one $i' \in I$, the new allocation is strictly preferred to $x_{i'}^*$, i.e., $x_{i'} \succ_{i'} x_{i'}^*$. Since $(p^*, q^*, x^*)$ is a competitive equilibrium, it must be the case that for all $i \in I$,
\[ p^* \cdot x_i \ge y_i(p^*), \tag{5.3} \]
where $y_i(p^*)$ is the income of household $i$ at price vector $p^*$ defined above. Suppose not. We know that by non-satiation $p^* \cdot x_i^* = y_i(p^*)$. Then if $p^* \cdot x_i < y_i(p^*)$, household $i$ could choose more of each commodity, i.e., $x_i + \varepsilon$ for $\varepsilon$ small enough, and again by non-satiation reach higher utility than that given by $x_i^*$. This would contradict the hypothesis that $x_i^*$ maximizes the utility of household $i$ at the price vector $p^*$. Moreover, for $i' \in I$, it must be that
\[ p^* \cdot x_{i'} > y_{i'}(p^*). \tag{5.4} \]
Now summing (5.3) and (5.4) over $I$, we have
\[ \sum_{i \in I} p^* \cdot x_i > \sum_{i \in I} y_i(p^*) = \sum_{i \in I} \left( p^* \cdot \omega_i + \sum_{f} \theta_{if}\, p^* \cdot q_f^* \right) = \sum_{i \in I} p^* \cdot \omega_i + \sum_{f} p^* \cdot q_f^*, \]
where the last equality uses $\sum_{i \in I} \theta_{if} = 1$. Since profit maximization at $p^*$ implies $\sum_{f} p^* \cdot q_f^* \ge \sum_{f} p^* \cdot q_f$, we have that
\[ \sum_{i \in I} p^* \cdot x_i > \sum_{i \in I} p^* \cdot \omega_i + \sum_{f} p^* \cdot q_f. \tag{5.5} \]
Feasibility of $(q, x)$, however, requires
\[ \sum_{i \in I} x_i \le \sum_{i \in I} \omega_i + \sum_{f} q_f, \]
and hence, with nonnegative prices, $\sum_{i \in I} p^* \cdot x_i \le \sum_{i \in I} p^* \cdot \omega_i + \sum_{f} p^* \cdot q_f$, which contradicts (5.5). Consequently, the competitive equilibrium allocation $(p^*, q^*, x^*)$ is not Pareto dominated by any other feasible allocation, and is thus Pareto optimal.

Notice that the proof of the first welfare theorem only uses the summation of the values of commodities at a given price vector. No convexity assumption is necessary, but the fact that the sums above exist is essential for the proof. Finiteness of the number of commodities and of the number of individuals was sufficient to guarantee the existence of the sums. Naturally, the sums may exist under other conditions, but with an infinite number of commodities they may fail to exist, in which case the proof, and perhaps even the First Welfare Theorem, may not apply.

Theorem 11 (Second Welfare Theorem) Consider a Pareto optimal allocation yielding utility vector $u^*$ to households. Then provided that all production sets and preferences are convex, there exists an endowment vector $\omega$ such that the resulting competitive equilibrium $(p^*, q^*, x^*)$ yields exactly the utility vector $u^*$.

Proof. (idea) Given convexity of preference and production sets, $(p^*, q^*, x^*)$ is a point of tangency between the aggregate production possibilities set and the aggregate preference set, both of which are convex. Then by the standard separating hyperplane theorem, there exists a hyperplane separating these two sets. This hyperplane gives relative prices that can decentralize the competitive equilibrium at an appropriately chosen endowment vector.

The second welfare theorem is the harder theorem because of the convexity requirement. In many ways, it is also the more important one. It states that any Pareto optimal allocation can be decentralized as a competitive allocation. This motivates many macroeconomists to look for the set of Pareto optimal allocations instead of explicitly characterizing competitive equilibria. This is especially useful in dynamic models, where competitive equilibria can sometimes be quite difficult to characterize or even to specify, while social welfare maximizing allocations are more straightforward.

Motivated by this, we could start by looking at optimal growth, that is, a capital accumulation, saving and consumption path that is Pareto optimal given the preferences of a representative household. Although this is standard practice, there is a technical problem here, since the classical welfare theorems apply when there is a finite number of commodities, whereas in growth models there is an infinite number of commodities. The welfare theorems can be extended to an infinite number of commodities under certain circumstances, and the exact conditions will be discussed below. For now, let us suppose that the Second Welfare Theorem applies in this environment. In fact, having an infinite-dimensional commodity space does not create a problem for the Second Welfare Theorem, as long as convexity continues to hold. The problem arises for the First Welfare Theorem because of the issue of existence of sums as discussed above.

Given this, we can start the analysis of economic growth with optimizing agents by looking at the social planner's choice of an allocation that maximizes the representative household's lifetime discounted utility. This is the optimal growth approach.

5.4

Let us continue to consider an economy characterized by an aggregate production function and a representative consumer (household). The optimal growth problem in discrete time with no population growth or technological progress can be written as follows:
\[ \max_{\{c(t), k(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^{t} u(c(t)) \]
subject to
\[ k(t+1) = f(k(t)) - c(t) + (1 - \delta)\, k(t), \tag{5.6} \]
$k(t) \ge 0$ and given $k(0) > 0$. In other words, in this optimal growth problem, the social planner chooses an entire sequence of consumption levels and capital stocks in order to maximize the discounted sum of the utility of the representative consumer. The constraint (5.6) embeds the capital accumulation equation together with the production function. We have also specified that the initial level of the capital stock is $k(0)$, but this gives a single initial condition. We will see later that we need another boundary condition, though not in the form of an initial condition. Instead, it will come from the optimality of a dynamic plan, in the form of a transversality condition.

This maximization problem can be solved in a number of different ways, for example, by setting up an infinite dimensional Lagrangian. But the most convenient and common way of approaching it is by using dynamic programming.

Even if our purpose were not to characterize the Pareto optimal allocations, but to find the equilibrium, we would have to solve a problem similar to this. In particular, each household would be solving the following problem:
\[ \max_{\{c(t), a(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^{t} u(c(t)) \]
subject to the constraint (5.7) and given $a(0)$, where $a(t)$ denotes the assets of the household at time $t$, $r(t)$ is the rate of return on assets and $w(t)$ is wage income. The constraint (5.7) is the flow budget constraint, meaning that it links tomorrow's assets to today's assets. Here we need an additional condition so that this flow budget constraint eventually converges (i.e., so that $a(t)$ does not go to negative infinity). This can be ensured by imposing a lifetime budget constraint, but the flow budget constraint is often more convenient to work with, so we need to augment it with another condition, as we will see later.

5.5

The formulation of the optimal growth problem in continuous time is very similar. In particular, we have
\[ \max_{[c(t), k(t)]_{t=0}^{\infty}} \int_{0}^{\infty} \exp(-\rho t)\, u(c(t))\, dt \]
subject to
\[ \dot{k}(t) = f(k(t)) - c(t) - \delta k(t), \]
$k(t) \ge 0$ and given $k(0)$. Once again, this problem lacks one boundary condition, which will come from the transversality condition. The most convenient way of characterizing the solution to this problem is via optimal control. We next discuss dynamic programming and optimal control briefly.


Chapter 6

Here I provide a very brief overview of infinite horizon optimization in discrete time, in particular of stationary dynamic programming. I also include some technical details, which are not essential for the purposes of this course, but may be useful for those of you who want to understand some of the tools better.

6.1

Using abstract but simple notation, the canonical dynamic optimization program in discrete time can be written as

Problem A1:
\[ v^*(x_0) = \sup_{\{x_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^{t} F(x_t, x_{t+1}) \]
subject to
\[ x_{t+1} \in \Gamma(x_t) \quad \text{for all } t \ge 0, \]
where $x_t \in X \subset \mathbb{R}^{K}$ for some $K \ge 1$. In many economic applications, we will have $K = 1$, so that $x_t \in \mathbb{R}$. Here I used sup rather than max, since there is no guarantee that the maximal value is attained by any feasible plan. Here $F$ is the payoff function, depending on $x_t$, which is the state variable, and $x_{t+1}$, which corresponds to the control variable. In this simple formulation, $x_{t+1}$ will also directly become the state variable in the next time period. The constraint on the problem is written as $x_{t+1} \in \Gamma(x_t)$, where $\Gamma : X \rightrightarrows X$ is a correspondence determining what values of $x_{t+1}$ are allowed given the state variable $x_t$. Notice that this problem is stationary in the sense that the payoff function $F$ is not time-dependent. It only depends on $x_t$ and $x_{t+1}$.

Of particular importance is the function $v^*(x_0)$, which can be thought of as the value function, meaning the value of pursuing the optimal strategy starting with initial condition $x_0$. Notice also that I have already simplified life by writing the objective function as a discounted sum. This is the class of problems in which dynamic programming will be most useful. If instead we had a much more general problem, for example,
\[ \sup_{\{x_{t+1}\}_{t=0}^{\infty}} F(x_0, x_1, \ldots), \]
then because there is no discounted structure, dynamic programming could not be used (at least in its simplest form). Moreover, problems that do not have an exponentially discounted structure pose another difficulty: they are generally not time-consistent, in the sense that the original plan that maximizes the initial objective function is not necessarily what an individual would like to stick to if he or she is carrying out the optimization period by period. Time consistency is both a very natural property and one that makes the mathematical analysis much simpler. In many ways, it is also the essence of dynamic programming.

For concreteness, let us recall the optimal growth problem from above:
\[ \max_{\{c(t), k(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^{t} u(c(t)) \]
subject to
\[ k(t+1) = f(k(t)) - c(t) + (1 - \delta)\, k(t), \]
$k(t) \ge 0$ and given $k(0)$. To map this problem into the form here, let $x_t = k(t)$ and $x_{t+1} = k(t+1)$. Then use the constraint to write
\[ c(t) = f(k(t)) - k(t+1) + (1 - \delta)\, k(t), \]
and substitute into the objective function to obtain
\[ \max_{\{k(t+1)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^{t} u\big( f(k(t)) - k(t+1) + (1 - \delta)\, k(t) \big) \]
subject to $k(t) \ge 0$ (which is the simplest form of a constraint correspondence $\Gamma$).

Problem A1, also referred to as the sequence problem, is one of choosing an infinite sequence $\{x_t\}_{t=0}^{\infty}$ from some (vector) space of infinite sequences (for example, $\{x_t\}_{t=0}^{\infty} \in L^{\infty}$, where $L^{\infty}$ is the vector space of infinite sequences that are bounded with the $\|\cdot\|_{\infty}$ norm, which I will denote throughout by the simpler notation $\|\cdot\|$). Such problems sometimes have nice features, but are often difficult to characterize both analytically and numerically.

The basic idea of dynamic programming is to turn the sequence problem into a functional equation, i.e., one of finding a function rather than a sequence. This often gives better economic insights, similar to the logic of comparing today to tomorrow. It is also often easier to characterize analytically or numerically. In this particular case, the relevant functional equation can be written as

Problem A2:
\[ v(x) = \sup_{y \in \Gamma(x)} \big[ F(x, y) + \beta v(y) \big], \quad \text{for all } x \in X. \tag{6.1} \]

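In fact, this form of the problem suggests itself naturally from the formulation of Problem A1. Suppose Problem A1 has a maximum with optimal sequence denoted by $\{x_t^*\}_{t=0}^{\infty}$ starting from $x_0^* = x_0$. Then
\begin{align*}
v^*(x_0) &= \sum_{t=0}^{\infty} \beta^{t} F(x_t^*, x_{t+1}^*) \\
&= F(x_0, x_1^*) + \beta \sum_{j=0}^{\infty} \beta^{j} F(x_{j+1}^*, x_{j+2}^*) \\
&= F(x_0, x_1^*) + \beta v^*(x_1^*).
\end{align*}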

This equation encapsulates the basic idea of dynamic programming: the principle of optimality. Essentially, an optimal plan can be broken into two parts, what is optimal to do today, and the optimal continuation path. Dynamic programming exploits this principle and provides us with a set of powerful tools to analyze optimization in discrete time infinite horizon problems.

Part of the theory of dynamic programming is about specifying the conditions under which Problems A1 and A2 are equivalent. These are not central for us to focus upon here, but I will return to some of these issues below.

Problem A2 is commonly referred to as the Bellman equation, after Richard Bellman, who introduced dynamic programming to operations research and engineering applications (though identical tools and reasoning, including the contraction mapping theorem, were used earlier by Lloyd Shapley in his work on stochastic games).

A couple of points are immediately worth noting. First, $v(x)$ is a function; more formally, $v : X \to \mathbb{R}$. Unlike in other maximization problems, here the maximization itself defines the function $v$, as the notation makes clear with the sup (or max) defining the function. Therefore, instead of finding a sequence $\{x_t\}_{t=0}^{\infty} \in L^{\infty}$, we will try to find a function $v$ that satisfies (6.1).

Second, because the function $v$ is defined recursively, in the sense that it is on the right hand side of (6.1) as well, this is often referred to as the recursive formulation. What makes this formulation useful is that the solution will often be a time invariant policy function, $g : X \to X$, determining what value of $x_{t+1}$ to choose for a given value of the state variable $x_t$. [In general, there are two complications: first, a control reaching the optimal value may not exist, which was the reason why we originally used the notation sup; second, we may not have a policy function, but a policy correspondence $g : X \rightrightarrows X$, because there may be more than one maximizer for a given state variable. Let me avoid these complications for now, and assume that $g(\cdot)$ is single valued, thus a function; conditions to guarantee this are provided below.] Moreover, as we will see, once the value function $v$ is determined, the policy function is given straightforwardly. In particular, by definition it must be the case that
\[ v(x) = F(x, g(x)) + \beta v(g(x)), \quad \text{for all } x \in X, \]
which is one way of determining the policy function. This equation simply follows from the fact that $g(x)$ is the optimal policy, so it reaches the maximal value $v(x)$.

The usefulness of the recursive formulation as in (6.1) comes from the fact that there are some powerful tools which not only establish existence of the solution, but also some of its properties. These are not essential for understanding the application of these tools to economic growth models, but they are useful for working with these tools in general and with growth models in particular.
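As a numerical companion to the recursive formulation, the sketch below iterates on the functional equation for the optimal growth example. It is only an illustration under assumptions that are not in the text: log utility, $f(k) = k^{\alpha}$, full depreciation ($\delta = 1$), the parameter values, the capital grid, and all function names are hypothetical choices. Under these particular assumptions the policy function is known in closed form, $g(k) = \alpha \beta k^{\alpha}$, which gives something to compare the grid solution against. That repeated application of the operator converges is taken on trust here.

```python
import math

ALPHA, BETA = 0.3, 0.9   # illustrative parameter values (assumptions)


def make_grid(k_min=0.05, k_max=0.5, n=100):
    """Evenly spaced grid of capital stocks; x_t and x_{t+1} both live on it."""
    step = (k_max - k_min) / (n - 1)
    return [k_min + i * step for i in range(n)]


def bellman_step(v, grid):
    """Apply v -> max_{k'} [ log(k^alpha - k') + beta * v(k') ] on the grid."""
    new_v, policy = [], []
    for k in grid:
        resources = k ** ALPHA           # f(k) + (1 - delta) k with delta = 1
        best_val, best_next = -math.inf, grid[0]
        for j, k_next in enumerate(grid):
            c = resources - k_next
            if c <= 0:                   # grid is increasing: larger k' is infeasible too
                break
            val = math.log(c) + BETA * v[j]
            if val > best_val:
                best_val, best_next = val, k_next
        new_v.append(best_val)
        policy.append(best_next)
    return new_v, policy


def value_function_iteration(grid, tol=1e-6, max_iter=1000):
    """Iterate the Bellman step from v = 0 until successive values are within tol."""
    v = [0.0] * len(grid)
    policy = list(grid)
    for _ in range(max_iter):
        new_v, policy = bellman_step(v, grid)
        gap = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if gap < tol:
            break
    return v, policy


if __name__ == "__main__":
    grid = make_grid()
    v, policy = value_function_iteration(grid)
    for k, g in list(zip(grid, policy))[::25]:
        print(f"k = {k:.3f}  grid policy = {g:.3f}  closed form = {ALPHA * BETA * k ** ALPHA:.3f}")
```

The grid policy can only take values on the grid, so it matches the closed form up to a discretization error of the order of the grid spacing.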

6.2

6.2.1 Contraction Mappings

We say that $(S, \rho)$ is a metric space if $S$ is a space and $\rho$ is a metric defined over this space with the usual properties (loosely corresponding to distance between elements of $S$).

Definition 6 Let $(S, \rho)$ be a metric space and $T : S \to S$ be an operator mapping $S$ into itself. $T$ is a contraction mapping (with modulus $\beta$) if for some $\beta \in (0, 1)$,
\[ \rho(Tx, Ty) \le \beta \rho(x, y), \quad \text{for all } x, y \in S. \]
In other words, a contraction mapping brings elements of the space $S$ closer to each other. For example, let us take a simple interval of the real line as our space, $S = [a, b]$, with the usual metric on this space, $\rho(x, y) = |x - y|$. Then $T : S \to S$ is a contraction if for some $\beta \in (0, 1)$,
\[ \frac{|Tx - Ty|}{|x - y|} \le \beta < 1, \quad \text{all } x, y \in S \text{ with } x \neq y. \]

Definition 7 A fixed point of $T$ is any element of $S$ satisfying $Tx = x$.

Recall also that a metric space $(S, \rho)$ is complete if every Cauchy sequence in $S$ converges to an element in $S$.

Theorem 12 (Contraction Mapping Theorem) Let $(S, \rho)$ be a complete metric space, and $T : S \to S$ be a contraction. Then there exists a unique $\hat{v} \in S$ such that $T\hat{v} = \hat{v}$, i.e., a unique fixed point.

Proof. Note $T^{n} x = T(T^{n-1} x)$ for any $n = 1, 2, \ldots$. Now take $v_0 \in S$, and a sequence $\{v_n\}_{n=0}^{\infty}$ with each element in $S$, such that $v_{n+1} = T v_n$, so that $v_n = T^{n} v_0$. This implies that
\[ \rho(v_2, v_1) = \rho(T v_1, T v_0) \le \beta \rho(v_1, v_0), \]
where the last inequality uses the contraction property of $T$. Moreover, by induction, we have
\[ \rho(v_{n+1}, v_n) \le \beta^{n} \rho(v_1, v_0), \quad n = 1, 2, \ldots \tag{6.2} \]
Hence, for any $m > n$,
\begin{align*}
\rho(v_m, v_n) &\le \rho(v_m, v_{m-1}) + \cdots + \rho(v_{n+2}, v_{n+1}) + \rho(v_{n+1}, v_n) \\
&\le \left( \beta^{m-1} + \cdots + \beta^{n+1} + \beta^{n} \right) \rho(v_1, v_0) \\
&= \beta^{n} \left( \beta^{m-n-1} + \cdots + \beta + 1 \right) \rho(v_1, v_0) \\
&\le \frac{\beta^{n}}{1 - \beta}\, \rho(v_1, v_0),
\end{align*}
where the first line uses the triangle inequality (which is true by definition for any metric), and the second line uses (6.2). The last line implies that as $n, m \to \infty$, $v_m$ and $v_n$ are getting closer, so $\{v_n\}_{n=0}^{\infty}$ is a Cauchy sequence. Since $S$ is complete, this establishes that $v_n \to \hat{v} \in S$.

Now note that for any $v_0 \in S$ and any $n \in \mathbb{N}$, we have
\begin{align*}
\rho(T\hat{v}, \hat{v}) &\le \rho(T\hat{v}, T^{n} v_0) + \rho(T^{n} v_0, \hat{v}) \\
&\le \beta \rho(\hat{v}, T^{n-1} v_0) + \rho(T^{n} v_0, \hat{v}),
\end{align*}
where the first line again uses the triangle inequality, and the second line the definition of the contraction. The above argument shows that both of the terms on the right tend to zero as $n \to \infty$, which implies that $\rho(T\hat{v}, \hat{v}) = 0$, establishing that $T\hat{v} = \hat{v}$; thus a fixed point exists.

Uniqueness is proved by contradiction. Suppose that there exist $\hat{v}, \tilde{v} \in S$ such that $T\hat{v} = \hat{v}$ and $T\tilde{v} = \tilde{v}$ with $\hat{v} \neq \tilde{v}$. This implies
\[ 0 < a = \rho(\hat{v}, \tilde{v}) = \rho(T\hat{v}, T\tilde{v}) \le \beta \rho(\hat{v}, \tilde{v}) = \beta a. \]
Since $\beta < 1$, this yields a contradiction, proving uniqueness.

The usefulness of the contraction mapping theorem is that it can be applied to any complete metric space, so in particular to a space of functions. Applying it to equation (6.1) will establish the existence of a unique value function $v$, greatly facilitating the analysis of such dynamic models. Naturally, for this we have to prove that the recursion in (6.1) defines a contraction mapping. We will see below that this is often straightforward.

Before doing this, let us consider another useful result. First, recall that if $(S, \rho)$ is a complete metric space and $S'$ is a closed subset of $S$, then $(S', \rho)$ is also a complete metric space.

Theorem 13 Let $(S, \rho)$ be a complete metric space, and $T : S \to S$ be a contraction mapping with $T\hat{v} = \hat{v}$. If $S'$ is a closed subset of $S$, and $T(S') \subset S'$, then $\hat{v} \in S'$. Moreover, if $T(S') \subset S'' \subset S'$, then $\hat{v} \in S''$.

Proof. Take an arbitrary $v_0 \in S'$, and consider the sequence $\{T^{n} v_0\}_{n=0}^{\infty}$. Each element of this sequence is in $S'$ by the fact that $T(S') \subset S'$. Moreover, $T^{n} v_0 \to \hat{v}$ from Theorem 12. Since $S'$ is closed, $\hat{v} \in S'$, proving the first claim in the theorem. If in addition we have that $T(S') \subset S''$, then by virtue of the fact that $\hat{v} \in S'$, $T\hat{v} = \hat{v} \in S''$, proving the second part.

The second part of this theorem is very important for proving results such as strict concavity or that a function is strictly increasing. This is because the set of strictly concave functions and the set of strictly increasing functions are not closed. The second part of the theorem enables us to avoid this complication.
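The constructive step in the proof of Theorem 12, iterating $v_{n+1} = T v_n$ from an arbitrary starting point, can be tried out directly. The sketch below is an illustration only: the map $T x = \tfrac{1}{2} \cos x$ on the real line (a contraction with modulus $\tfrac{1}{2}$ by the mean value theorem), the starting points, and the function names are assumptions chosen for the example. It also checks the bound $\rho(v_n, \hat{v}) \le \frac{\beta^{n}}{1 - \beta} \rho(v_1, v_0)$, which follows from the last line of the chain of inequalities by letting $m \to \infty$.

```python
import math

BETA = 0.5   # modulus of the illustrative contraction below


def T(x):
    """Example contraction on the real line: |T'(x)| = 0.5 |sin x| <= 0.5."""
    return BETA * math.cos(x)


def iterate_to_fixed_point(operator, v0, tol=1e-12, max_iter=10_000):
    """Return the path v_0, v_1 = T v_0, v_2 = T v_1, ... until steps are below tol."""
    path = [v0]
    for _ in range(max_iter):
        path.append(operator(path[-1]))
        if abs(path[-1] - path[-2]) < tol:
            break
    return path


path = iterate_to_fixed_point(T, v0=3.0)
v_hat = path[-1]                            # numerical stand-in for the fixed point
first_step = abs(path[1] - path[0])         # rho(v_1, v_0)

# Check rho(v_n, v_hat) <= beta^n / (1 - beta) * rho(v_1, v_0) along the path.
bound_holds = all(
    abs(v_n - v_hat) <= BETA ** n / (1.0 - BETA) * first_step + 1e-12
    for n, v_n in enumerate(path)
)

# A different starting point should lead to the same fixed point (uniqueness).
other_limit = iterate_to_fixed_point(T, v0=-10.0)[-1]

if __name__ == "__main__":
    print("fixed point:", v_hat, " residual:", abs(T(v_hat) - v_hat))
    print("error bound holds:", bound_holds)
    print("limit from another start:", other_limit)
```

The same loop, with numbers replaced by functions and the absolute value replaced by the sup norm, is the idea behind iterating on (6.1).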

How do we check that a mapping is a contraction? Here, the following theorem is useful, especially in the context of dynamic programming. Let us use the notation $(f + a)(x) = f(x) + a$ for some $a \in \mathbb{R}$. Then:

Theorem 14 (Blackwell's sufficient conditions for a contraction) Let $X \subseteq \mathbb{R}^{K}$, and $B(X)$ be the space of bounded functions $f : X \to \mathbb{R}$ defined on $X$. Suppose that $T : B(X) \to B(X)$ is an operator satisfying the following two conditions:

1. (monotonicity) For any $f, g \in B(X)$, $f(x) \le g(x)$ for all $x \in X$ implies $(Tf)(x) \le (Tg)(x)$ for all $x \in X$.

2. (discounting) There exists $\beta \in (0, 1)$ such that
\[ [T(f + a)](x) \le (Tf)(x) + \beta a, \quad \text{for all } f \in B(X),\ a \ge 0,\ x \in X. \]

Then $T$ is a contraction with modulus $\beta$.

Proof. Let $f \le g$ stand for $f(x) \le g(x)$ for all $x \in X$. By definition, for any $f, g \in B(X)$,
\[ f \le g + \|f - g\|, \]
where again $\|\cdot\|$ is the sup norm. Now applying the operator $T$ on both sides, we have
\[ Tf \le T(g + \|f - g\|) \le Tg + \beta \|f - g\|, \]
where the first inequality uses monotonicity and the second discounting. Applying the same argument in reverse establishes
\[ Tg \le Tf + \beta \|f - g\|. \]
Combining these two inequalities yields
\[ \|Tf - Tg\| \le \beta \|f - g\|, \]
proving that $T$ is a contraction.
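Blackwell's two conditions are easy to check numerically for the operator on the right hand side of (6.1) when the state space is finite. The sketch below is a hypothetical toy example, not a model from the text: the five states, the payoff function, the value of $\beta$, the test functions $f$ and $g$, and all names are assumptions made for illustration, and every next state is taken to be feasible.

```python
import random

BETA = 0.95
STATES = range(5)   # a small finite state space (assumption)


def payoff(x, y):
    """Hypothetical bounded payoff F(x, y) for moving from state x to state y."""
    return -(x - y) ** 2 + 0.5 * x


def bellman(f):
    """(T f)(x) = max over y of [ F(x, y) + BETA * f(y) ], with every y feasible."""
    return [max(payoff(x, y) + BETA * f[y] for y in STATES) for x in STATES]


def sup_norm(f, g):
    """Sup-norm distance between two functions stored as lists."""
    return max(abs(a - b) for a, b in zip(f, g))


random.seed(0)
f = [random.uniform(-10, 10) for _ in STATES]
g = [random.uniform(-10, 10) for _ in STATES]
a = 3.0

Tf, Tg = bellman(f), bellman(g)

# Discounting: T(f + a) = T f + BETA * a for this operator (so the inequality holds).
T_shifted = bellman([value + a for value in f])
discounting_ok = all(abs(ts - (tf + BETA * a)) < 1e-9 for ts, tf in zip(T_shifted, Tf))

# Monotonicity: h >= f pointwise implies T h >= T f pointwise.
h = [max(fv, gv) for fv, gv in zip(f, g)]
monotone_ok = all(tf <= th + 1e-12 for tf, th in zip(Tf, bellman(h)))

# Conclusion of the theorem: ||T f - T g|| <= BETA * ||f - g||.
contraction_ok = sup_norm(Tf, Tg) <= BETA * sup_norm(f, g) + 1e-12

if __name__ == "__main__":
    print(discounting_ok, monotone_ok, contraction_ok)
```

For this operator discounting holds with equality, because adding a constant to $f$ shifts every candidate value inside the max by the same amount $\beta a$.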

6.2.2

Let us now apply the above tools to the problem of dynamic programming, outlined at the beginning. Consider a sequence $\{x_{t+1}\}_{t=0}^{\infty}$ which attains the supremum of Problem A1. We will now show that this sequence will satisfy the recursive equation of dynamic programming
\[ v(x_t) = F(x_t, x_{t+1}) + \beta v(x_{t+1}), \quad \text{for all } t = 0, 1, 2, \ldots, \tag{6.3} \]
and moreover, under some boundedness conditions, any sequence that is a solution to (6.3) is a solution to Problem A1, in the sense that it attains its supremum. In other words, we will establish some equivalence results between the solutions to Problem A1 and Problem A2.

To prepare for these results, let us define the set of feasible sequences or plans starting with initial value $x_0$:
\[ \Pi(x_0) = \left\{ \{x_{t+1}\}_{t=0}^{\infty} : x_{t+1} \in \Gamma(x_t),\ t = 0, 1, \ldots \right\}. \]
Let us denote a typical element of the set by $\mathbf{x} = (x_0, x_1, \ldots) \in \Pi(x_0)$, and assume:

Assumption 3 $\Gamma(x)$ is nonempty for all $x \in X$; and for all $x_0 \in X$ and $\mathbf{x} \in \Pi(x_0)$, $\lim_{n \to \infty} \sum_{t=0}^{n} \beta^{t} F(x_t, x_{t+1})$ exists.

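Let $u(\mathbf{x}) \equiv \lim_{n \to \infty} \sum_{t=0}^{n} \beta^{t} F(x_t, x_{t+1})$ denote the value of the plan $\mathbf{x}$, which is well defined under Assumption 3. Next define the supremum function $v^* : X \to \bar{\mathbb{R}}$, where $\bar{\mathbb{R}}$ is the extended real line, by
\[ v^*(x_0) = \sup_{\mathbf{x} \in \Pi(x_0)} u(\mathbf{x}). \]
Thus $v^*(x_0)$ is the supremum in Problem A1 (i.e., the value of the program in Problem A1). Note that it follows by definition that $v^*$ is the unique function satisfying the following three conditions for Problem A1, or the sequence problem, SP:

1. if $|v^*(x_0)| < \infty$, then
\[ v^*(x_0) \ge u(\mathbf{x}), \quad \text{all } \mathbf{x} \in \Pi(x_0); \tag{6.4} \]
and for any $\varepsilon > 0$,
\[ v^*(x_0) \le u(\mathbf{x}) + \varepsilon, \quad \text{some } \mathbf{x} \in \Pi(x_0); \tag{6.5} \]

2. if $v^*(x_0) = +\infty$, then there exists a sequence $\{\mathbf{x}^{k}\}$ in $\Pi(x_0)$ such that $\lim_{k \to \infty} u(\mathbf{x}^{k}) = +\infty$; and

3. if $v^*(x_0) = -\infty$, then $u(\mathbf{x}) = -\infty$ for all $\mathbf{x} \in \Pi(x_0)$.

Conversely, we will say that $v$ is a solution to Problem A2 (and thus satisfies the functional equation (6.3)) if the following three conditions for FE hold:

1. if $|v(x_0)| < \infty$, then
\[ v(x_0) \ge F(x_0, y) + \beta v(y), \quad \text{all } y \in \Gamma(x_0), \tag{6.6} \]
and for any $\varepsilon > 0$,
\[ v(x_0) \le F(x_0, y) + \beta v(y) + \varepsilon, \quad \text{some } y \in \Gamma(x_0); \tag{6.7} \]

2. if $v(x_0) = +\infty$, then there exists a sequence $\{y^{k}\}$ in $\Gamma(x_0)$ such that
\[ \lim_{k \to \infty} \left[ F(x_0, y^{k}) + \beta v(y^{k}) \right] = +\infty; \tag{6.8} \]

3. if $v(x_0) = -\infty$, then
\[ F(x_0, y) + \beta v(y) = -\infty, \quad \text{all } y \in \Gamma(x_0). \tag{6.9} \]

We now have the following simple lemma:

Lemma 1 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 3. Then for any $x_0 \in X$ and any $\mathbf{x} = (x_0, x_1, \ldots) \in \Pi(x_0)$,
\[ u(\mathbf{x}) = F(x_0, x_1) + \beta u(\mathbf{x}') \]
with $\mathbf{x}' = (x_1, x_2, \ldots)$.

Proof. Under Assumption 3, for any $x_0 \in X$ and any $\mathbf{x} \in \Pi(x_0)$,
\begin{align*}
u(\mathbf{x}) &= \lim_{n \to \infty} \sum_{t=0}^{n} \beta^{t} F(x_t, x_{t+1}) \\
&= F(x_0, x_1) + \beta \lim_{n \to \infty} \sum_{t=0}^{n} \beta^{t} F(x_{t+1}, x_{t+2}) \\
&= F(x_0, x_1) + \beta u(\mathbf{x}').
\end{align*}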

This lemma basically says that the utility from any feasible plan can be decomposed into two parts, the current return and the continuation value. It therefore formalizes the principle of optimality introduced more informally above.

Theorem 15 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 3. Then the function $v^*$ is a solution to Problem A2.

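Proof. If $\beta = 0$, the result is trivial. Suppose that $\beta > 0$, and choose $x_0 \in X$.

Suppose $v^*(x_0)$ is finite. Then SP conditions (6.4) and (6.5) hold, and it is sufficient to show that this implies that the FE conditions (6.6) and (6.7) hold. To establish (6.6), let $x_1 \in \Gamma(x_0)$ and $\varepsilon > 0$ be given. Then by SP (6.5) there exists $\mathbf{x}' = (x_1, x_2, \ldots) \in \Pi(x_1)$ such that $u(\mathbf{x}') \ge v^*(x_1) - \varepsilon$. Note also that $\mathbf{x} = (x_0, x_1, x_2, \ldots) \in \Pi(x_0)$. Hence it follows from SP (6.4) and Lemma 1 that
\[ v^*(x_0) \ge u(\mathbf{x}) = F(x_0, x_1) + \beta u(\mathbf{x}') \ge F(x_0, x_1) + \beta v^*(x_1) - \beta \varepsilon \]
for any $\varepsilon > 0$, establishing FE (6.6).

To establish FE (6.7), choose $x_0 \in X$ and $\varepsilon > 0$. From SP (6.5) and Lemma 1, it follows that one can choose $\mathbf{x} = (x_0, x_1, \ldots) \in \Pi(x_0)$ so that
\[ v^*(x_0) \le u(\mathbf{x}) + \varepsilon = F(x_0, x_1) + \beta u(\mathbf{x}') + \varepsilon, \]
where $\mathbf{x}' = (x_1, x_2, \ldots)$. It then follows from SP (6.4) that
\[ v^*(x_0) \le F(x_0, x_1) + \beta v^*(x_1) + \varepsilon. \]
Since $x_1 \in \Gamma(x_0)$, this establishes FE (6.7).

If $v^*(x_0) = +\infty$, then there exists a sequence $\{\mathbf{x}^{k}\}$ in $\Pi(x_0)$ such that $\lim_{k \to \infty} u(\mathbf{x}^{k}) = +\infty$. Since $x_1^{k} \in \Gamma(x_0)$, all $k$, and
\[ u(\mathbf{x}^{k}) = F(x_0, x_1^{k}) + \beta u(\mathbf{x}'^{k}) \le F(x_0, x_1^{k}) + \beta v^*(x_1^{k}), \quad \text{all } k, \]
it follows that FE (6.8) holds for the sequence $\{y^{k} = x_1^{k}\}$ in $\Gamma(x_0)$.

If $v^*(x_0) = -\infty$, then
\[ u(\mathbf{x}) = F(x_0, x_1) + \beta u(\mathbf{x}') = -\infty, \quad \text{all } \mathbf{x} \in \Pi(x_0), \]
where $\mathbf{x}' = (x_1, x_2, \ldots)$. Since $F$ is real-valued (thus does not take the values $-\infty$ or $+\infty$), it follows that
\[ u(\mathbf{x}') = -\infty, \quad \text{all } x_1 \in \Gamma(x_0),\ \text{all } \mathbf{x}' \in \Pi(x_1). \]
Hence $v^*(x_1) = -\infty$, all $x_1 \in \Gamma(x_0)$. Since $F$ is real-valued and $\beta > 0$, (6.9) follows immediately.

Under an additional boundedness condition, we have the following converse to this theorem:

Theorem 16 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 3. If $v$ is a solution to Problem A2 and satisfies
\[ \lim_{n \to \infty} \beta^{n} v(x_n) = 0, \quad \text{all } (x_0, x_1, \ldots) \in \Pi(x_0),\ \text{all } x_0 \in X, \tag{6.10} \]
then $v = v^*$.

Proof. (sketch) Condition (6.10) implies that $v$ cannot take on the values $+\infty$ or $-\infty$. Hence $v$ satisfies (6.6) and (6.7), and it is sufficient to show that this implies $v$ satisfies (6.4) and (6.5). Write $u_n(\mathbf{x}) \equiv \sum_{t=0}^{n} \beta^{t} F(x_t, x_{t+1})$ for the partial sums. Since $v$ is a solution to Problem A2, (6.6) implies that for all $x_0 \in X$ and $\mathbf{x} \in \Pi(x_0)$,
\begin{align*}
v(x_0) &\ge F(x_0, x_1) + \beta v(x_1) \\
&\ge F(x_0, x_1) + \beta F(x_1, x_2) + \beta^{2} v(x_2) \\
&\ \ \vdots \\
&\ge u_n(\mathbf{x}) + \beta^{n+1} v(x_{n+1}).
\end{align*}
Now taking the limit as $n \to \infty$ and using the convergence property from (6.10), we obtain (6.4) for any $\mathbf{x} \in \Pi(x_0)$.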

Now for a given $x_0 \in X$ and $\varepsilon > 0$, choose an arbitrary sequence $\{\delta_n\}_{n=1}^{\infty}$ in $\mathbb{R}_+$ such that $\sum_{n=1}^{\infty} \beta^{n-1}\delta_n \le \varepsilon/2$. Since (6.7) holds, we can choose $x_{t+1} \in \Gamma(x_t)$ so that

$$v(x_t) \le F(x_t, x_{t+1}) + \beta v(x_{t+1}) + \delta_{t+1}.$$

Using these inequalities, we obtain that for the plan $x = (x_0, x_1, x_2, \ldots) \in \Pi(x_0)$ constructed in this way,

$$v(x_0) \le u_n(x) + \beta^{n+1} v(x_{n+1}) + (\delta_1 + \beta\delta_2 + \cdots + \beta^n \delta_{n+1}) \le u_n(x) + \beta^{n+1} v(x_{n+1}) + \varepsilon/2, \quad n = 1, 2, \ldots$$

Since (6.10) implies that for $n$ sufficiently large the second term is also less than $\varepsilon/2$, it follows that as $n \to \infty$, $v(x_0) \le u(x) + \varepsilon$, completing the proof.

An important implication is that although Problem A2 may have many solutions, only one of those will satisfy the convergence condition (6.10). In general, we can make a lot of progress by studying solutions to Problem A2, but sometimes we need to impose (6.10) in order to pick the right solution (this is similar to working with necessary conditions for optimization, where we then also need to impose the sufficiency conditions). Naturally, our interest is mainly with optimal plans. For this we have:

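Theorem 17 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 3. Let $x^* \in \Pi(x_0)$ be a feasible plan that attains the supremum in Problem A1 starting with initial state $x_0$. Then

$$v^*(x_t^*) = F(x_t^*, x_{t+1}^*) + \beta v^*(x_{t+1}^*), \quad t = 0, 1, 2, \ldots \tag{6.11}$$

Proof. Since $x^*$ attains the supremum, writing $x^{*\prime} = (x_1^*, x_2^*, \ldots)$ and $x' = (x_1, x_2, \ldots)$, we have

$$v^*(x_0) = u(x^*) = F(x_0, x_1^*) + \beta u(x^{*\prime}) \ge u(x) = F(x_0, x_1) + \beta u(x'), \quad \text{all } x \in \Pi(x_0). \tag{6.12}$$

Now choose $x_1 = x_1^*$; (6.12) still holds. Since $(x_1^*, x_2, x_3, \ldots) \in \Pi(x_1^*)$ implies that $(x_0, x_1^*, x_2, x_3, \ldots) \in \Pi(x_0)$, it follows that $u(x^{*\prime}) \ge u(x')$ for all $x' \in \Pi(x_1^*)$.

Therefore $u(x^{*\prime}) = v^*(x_1^*)$. Substituting this into (6.12) yields (6.11) for $t = 0$. Continuing by induction establishes (6.11) for all $t$.

Finally, the converse to this theorem is:

Theorem 18 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 3. Let $x^* \in \Pi(x_0)$ be a feasible plan from $x_0$ satisfying (6.11), and with

$$\limsup_{t\to\infty} \beta^t v^*(x_t^*) \le 0. \tag{6.13}$$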

Then $x^*$ attains the supremum in Problem A1 for initial state $x_0$.

Proof. Suppose that $x^* \in \Pi(x_0)$ satisfies (6.11) and (6.13). Then it follows by induction on (6.11) that

$$v^*(x_0) = u_n(x^*) + \beta^{n+1} v^*(x_{n+1}^*), \quad n = 1, 2, \ldots$$

Then using (6.13), we find that $v^*(x_0) \le u(x^*)$. Since $x^* \in \Pi(x_0)$, the reverse inequality holds, establishing the result.

The above theorems are useful in showing the equivalence of Problem A1 and Problem A2. The usefulness of the dynamic programming formulation in Problem A2, and hence of the contraction mapping theorem, comes from the fact that its solution is often easy to characterize. For this purpose, take the following version of the dynamic programming problem (Problem A2):

$$v(x) = \max_{y \in \Gamma(x)} \left[F(x, y) + \beta v(y)\right], \tag{6.14}$$

where $\beta < 1$. As before, $X$ is the possible set of values for the state variable and $\Gamma : X \rightrightarrows X$ is the correspondence describing the constraints on the problem. We now make an additional assumption, which is not necessary, but greatly simplifies the analysis.

Assumption 4 $X$ is a compact subset of $\mathbb{R}^K$, and $\Gamma$ is nonempty, compact-valued and continuous. Moreover, let $A = \{(x, y) \in X \times X : y \in \Gamma(x)\}$ and $F : A \to \mathbb{R}$ be bounded and continuous.

The importance of Assumption 4 is that it allows us to focus on the space of bounded functions. Most importantly, since $F$ is bounded over its effective domain, there exists some $B < \infty$ such that $|F(x, y)| < B$ for all $(x, y) \in A$. This immediately implies that $|v^*(x)| \le B/(1 - \beta)$, all $x \in X$. Consequently, we can focus our attention on value functions in the space $C(X)$ of continuous bounded functions defined on $X$, with the natural norm on this space, the sup norm, $\|f\| = \sup_{x \in X} |f(x)|$. In particular, to see the usefulness of the contraction mapping theorem, now define the operator $T$ such that

$$(Tf)(x) = \max_{y \in \Gamma(x)} \left[F(x, y) + \beta f(y)\right]. \tag{6.15}$$

A fixed point of this operator, $v = Tv$, will be a solution to (6.14), establishing the desired results. Then we can derive the policy functions from the value function.
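To see the operator $T$ in (6.15) at work, the following is a minimal sketch of value function iteration on a finite grid. It is only an illustration under assumptions that are not in the text: the payoff is taken to be $F(x, y) = \log(x^{\alpha} - y)$ (log utility, Cobb-Douglas output, full depreciation), and the parameter values, the grid and all function names are hypothetical. For this particular example the optimal policy is known to be $y = \alpha\beta x^{\alpha}$, which gives something to compare against.

```python
import math


def value_function_iteration(alpha=0.3, beta=0.9, n=100, tol=1e-6, max_iter=1000):
    """Iterate (Tf)(x) = max_y [F(x, y) + beta * f(y)] on a finite grid.

    Illustrative assumptions (not from the text): F(x, y) = log(x**alpha - y),
    i.e. log utility, Cobb-Douglas output and full depreciation.
    """
    k_ss = (alpha * beta) ** (1.0 / (1.0 - alpha))  # steady state of this example
    grid = [k_ss * (0.2 + 1.6 * i / (n - 1)) for i in range(n)]

    # payoff[i][j] = F(grid[i], grid[j]); infeasible pairs get minus infinity.
    payoff = []
    for x in grid:
        row = []
        for y in grid:
            c = x ** alpha - y
            row.append(math.log(c) if c > 0 else -math.inf)
        payoff.append(row)

    v = [0.0] * n      # initial guess f = 0
    policy = [0] * n   # index of the maximizing y for each x
    gaps = []          # sup-norm distance between successive iterates
    for _ in range(max_iter):
        new_v = []
        for i in range(n):
            row = payoff[i]
            j_best = max(range(n), key=lambda j: row[j] + beta * v[j])
            policy[i] = j_best
            new_v.append(row[j_best] + beta * v[j_best])
        gaps.append(max(abs(a - b) for a, b in zip(new_v, v)))
        v = new_v
        if gaps[-1] < tol:
            break
    return grid, v, [grid[j] for j in policy], gaps


grid, v, policy, gaps = value_function_iteration()
print("iterations:", len(gaps))
print("largest policy error vs alpha*beta*k**alpha:",
      max(abs(p - 0.3 * 0.9 * k ** 0.3) for k, p in zip(grid, policy)))
```

The list `gaps` records the sup-norm distance between successive iterates, which should shrink by at least the factor $\beta$ each round if $T$ is a contraction of modulus $\beta$. The grid policy can only match the closed-form policy up to the grid spacing.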


Theorem 19 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumption 4 and let $C(X)$ be the space of bounded continuous functions $f : X \to \mathbb{R}$, with the sup norm. Then the operator $T$ maps $C(X)$ into itself, i.e., $T : C(X) \to C(X)$, and has a unique fixed point $v \in C(X)$ satisfying (6.14).

Proof. Formulated in this way, it is immediate that $T$ is a contraction. Since the maximization problem on the right-hand side of (6.15) is one of maximizing a continuous function over a compact set, it has a solution. Consequently, $T$ is well defined and is easily seen to satisfy the sufficient conditions for a contraction in Theorem 14. Therefore, applying Theorem 12, a unique $v \in C(X)$ satisfying (6.14) exists.

Corollary 4 Let $G : X \rightrightarrows X$, defined as

$$G(x) = \{y \in \Gamma(x) : v(x) = F(x, y) + \beta v(y)\}, \tag{6.16}$$

be the policy function (correspondence). Under the assumptions of Theorem 19, $G$ is compact-valued and upper hemi-continuous.

Proof. This follows immediately from Berge's maximum theorem.

We can next see how Theorem 13 enables us to establish more properties of the value function and the policy correspondence. In particular, let us assume

Assumption 5 For each $y$, $F(\cdot, y)$ is strictly increasing in each of its first $K$ arguments, and $\Gamma$ is monotone in the sense that $x \le x'$ implies $\Gamma(x) \subseteq \Gamma(x')$.

Theorem 20 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumptions 4 and 5, and let $v$ be the unique solution to (6.14). Then $v$ is strictly increasing.

Proof. Let $C'(X) \subset C(X)$ be the set of bounded, continuous, nondecreasing functions on $X$, and let $C''(X) \subset C'(X)$ be the set of strictly increasing functions. Since $C'(X)$ is a closed subset of the complete metric space $C(X)$, by Theorem 13, it is sufficient to show that $T[C'(X)] \subseteq C''(X)$. Assumption 5 immediately implies that for any nondecreasing $f$, $Tf$ is strictly increasing, establishing the result.

Furthermore, let us impose

Assumption 6 $F$ is strictly concave, i.e.,

$$F[\theta(x, y) + (1 - \theta)(x', y')] \ge \theta F(x, y) + (1 - \theta) F(x', y'), \quad \text{all } (x, y), (x', y') \in A \text{ and all } \theta \in (0, 1).$$

In addition, the inequality is strict if $x \ne x'$. Moreover, $\Gamma$ is convex in the sense that for any $0 \le \theta \le 1$ and $x, x' \in X$, $y \in \Gamma(x)$ and $y' \in \Gamma(x')$ implies $\theta y + (1 - \theta) y' \in \Gamma[\theta x + (1 - \theta) x']$.

This assumption imposes enough concavity on the problem; in particular, it rules out increasing returns of any form.

Theorem 21 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumptions 4, 5 and 6, let $v$ satisfy (6.14), and let $G$ satisfy (6.16). Then $v$ is strictly concave and $G$ is a continuous, single-valued function.

Proof. The proof again follows from Theorem 13. Let $C'(X) \subset C(X)$ be the set of bounded, continuous, (weakly) concave functions on $X$, and let $C''(X) \subset C'(X)$ be the set of strictly concave functions. Since $C'(X)$ is a closed subset of the complete metric space $C(X)$, by Theorem 13, $T[C'(X)] \subseteq C''(X)$ would establish the results. To see this, let $f \in C'(X)$ and let $x_0 \ne x_1$, $\theta \in (0, 1)$, and $x_\theta = \theta x_0 + (1 - \theta) x_1$.

Let $y_i \in \Gamma(x_i)$ attain $(Tf)(x_i)$, for $i = 0, 1$. Then Assumption 6 implies that $y_\theta = \theta y_0 + (1 - \theta) y_1 \in \Gamma(x_\theta)$, so that

$$\begin{aligned}
(Tf)(x_\theta) &\ge F(x_\theta, y_\theta) + \beta f(y_\theta) \\
&> \theta\left[F(x_0, y_0) + \beta f(y_0)\right] + (1 - \theta)\left[F(x_1, y_1) + \beta f(y_1)\right] \\
&= \theta (Tf)(x_0) + (1 - \theta)(Tf)(x_1),
\end{aligned}$$

where the first line is a simple implication of (6.15) and the fact that $y_\theta \in \Gamma(x_\theta)$; the second line uses the hypothesis that $f$ is concave and the concavity restriction on $F$ from Assumption 6. Since these relationships are true for any $f \in C'(X)$, they establish $T[C'(X)] \subseteq C''(X)$, so that the unique fixed point $v$ is strictly concave. Since, from Assumption 6, $F$ is also concave and for each $x \in X$, $\Gamma(x)$ is convex, it follows that the maximum in (6.15) is attained at a unique $y$ value. Hence $G$ is a single-valued function, and its continuity follows from the fact that it is upper hemi-continuous.

Finally, by also assuming differentiability, we can prove that the value function is differentiable.

Assumption 7 $F$ is continuously differentiable on the interior of its domain $A$.

Theorem 22 Let $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumptions 4, 5, 6 and 7. Furthermore, let $v$ satisfy (6.14) and $G$ satisfy (6.16). Suppose also that $x_0 \in \mathrm{Int}\,X$ and $G(x_0) \in \mathrm{Int}\,\Gamma(x_0)$; then $v$ is continuously differentiable at $x_0$.

Proof. From Theorem 21, $G$ is a function (i.e., single-valued). Moreover, since $G(x_0) \in \mathrm{Int}\,\Gamma(x_0)$ and $\Gamma$ is continuous, it follows that $G(x_0) \in \mathrm{Int}\,\Gamma(x)$ for all $x$ in some neighborhood $D$ of $x_0$. Define $W(\cdot)$ on $D$ by

$$W(x) = F[x, G(x_0)] + \beta v[G(x_0)].$$

Since $F$ is concave (Assumption 6) and differentiable (Assumption 7), it follows that $W(\cdot)$ is concave and differentiable. Moreover, since $G(x_0) \in \Gamma(x)$ for all $x \in D$, it follows that

$$W(x) \le \max_{y \in \Gamma(x)} \left[F(x, y) + \beta v(y)\right] = v(x) \quad \text{for all } x \in D, \tag{6.17}$$

with equality at $x_0$. Now, we show that (6.17) implies that $v(\cdot)$ is differentiable. For this note that $v(\cdot)$ is concave, thus $-v(\cdot)$ is convex, and by a standard result in convex analysis, it possesses subgradients. Moreover, any subgradient $-p$ of $-v$ at $x_0$ must satisfy

$$p \cdot (x - x_0) \ge v(x) - v(x_0) \ge W(x) - W(x_0) \quad \text{for all } x \in D,$$

where the first inequality uses the definition of a subgradient and the second uses the fact that $W(x) \le v(x)$, with equality at $x_0$ as established in (6.17). Since $W$ is differentiable at $x_0$, $p$ is unique, and again by a standard result in convex analysis, any convex function with a unique subgradient at an interior point $x_0$ is differentiable at $x_0$. This establishes that $-v(\cdot)$, thus $v(\cdot)$, is differentiable as desired.

6.3

6.3.1 Basic Equations

Now consider the functional equation

$$v(x) = \max_{y \in \Gamma(x)} \left[F(x, y) + \beta v(y)\right], \quad \text{for all } x \in X. \tag{6.18}$$

We know that the solution to our problem has to satisfy this functional equation. Moreover, let us assume (as proved under some conditions above) that the value function $v$ is differentiable (we take the payoff function $F$ to be differentiable everywhere). Moreover, consider $y^* \in \mathrm{Int}\,\Gamma(x^*)$; in other words, the constraints on the problem are not binding. Then we can write a convenient Euler equation for this problem (again using $*$'s to denote optimal values) as

$$\nabla_y F(x^*, y^*) + \beta \nabla_y v(y^*) = 0.$$

Let us first focus on the case where both $x$ and $y$ are real numbers. Then, we have the simpler condition:

$$\frac{\partial F(x^*, y^*)}{\partial y} + \beta v'(y^*) = 0. \tag{6.19}$$

This is very intuitive; it requires the sum of the marginal gain today from increasing $y$ and the discounted marginal gain from increasing $y$ on the value of all future returns to be equal to zero. For example, we can think of $F$ as being decreasing in $y$ and increasing in $x$ (recall for example the representation of the basic growth model with $F(x, y)$ corresponding to $u(f(x) - y + (1 - \delta)x)$ or $u(f(k(t)) - k(t+1) + (1 - \delta)k(t))$). In this case, equation (6.19) requires the current cost of increasing $y$ to be compensated by higher values tomorrow. In the context of growth, this corresponds to the current cost of reducing consumption being compensated by higher consumption tomorrow.

This is a very nice condition, but it involves $v'(y^*)$, i.e., the derivative of the value function, which we do not know. Here we can use the equivalent of the Envelope Theorem for dynamic programming, and differentiate (6.18):

$$\nabla_x v(x^*) = \nabla_x F(x^*, y^*),$$

where $\nabla_z f$ denotes the gradient vector of function $f$ with respect to the vector $z$. In the case of one-dimensional variables, we have the more intuitive equation:

$$v'(x^*) = \frac{\partial F(x^*, y^*)}{\partial x}. \tag{6.20}$$

These equations follow from the fact that $x$ does not appear directly anywhere else (and its effects through $y$, i.e., $\nabla_x y$ or $\partial y/\partial x$, can be ignored, given the optimality condition (6.19)). Now in the one-dimensional case, combining (6.20) together with (6.19), we have the following very useful condition:

$$\frac{\partial F(x^*, y^*)}{\partial y} + \beta \frac{\partial F(y^*, g(y^*))}{\partial x} = 0,$$

where $\partial x$ denotes the derivative with respect to the first argument and $\partial y$ with respect to the second argument, and $g(x)$ is the optimal policy given state variable $x$. Alternatively, we could write this with the time subscripts as

$$\frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_{t+1}} + \beta \frac{\partial F(x_{t+1}^*, x_{t+2}^*)}{\partial x_{t+1}} = 0. \tag{6.21}$$
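As a quick numerical check of how (6.18), (6.19) and (6.20) fit together, here is a small sketch for one special case in which the value function is known in closed form. Everything specific in it is an assumption made for illustration and is not taken from the text: the payoff $F(x, y) = \log(x^{\alpha} - y)$, the parameter values, and the function names. For this payoff the value function is $v(x) = A + B\log x$ with $B = \alpha/(1 - \alpha\beta)$, and the policy is $g(x) = \alpha\beta x^{\alpha}$.

```python
import math

ALPHA, BETA = 0.3, 0.95  # hypothetical parameter values


def F(x, y):
    """Assumed payoff: log utility, output x**ALPHA, full depreciation."""
    return math.log(x ** ALPHA - y)


def F_x(x, y):
    """Derivative of F with respect to its first argument."""
    return ALPHA * x ** (ALPHA - 1.0) / (x ** ALPHA - y)


def F_y(x, y):
    """Derivative of F with respect to its second argument."""
    return -1.0 / (x ** ALPHA - y)


def g(x):
    """Optimal policy for this particular example."""
    return ALPHA * BETA * x ** ALPHA


# Closed-form value function v(x) = A + B * log(x) for this example.
B = ALPHA / (1.0 - ALPHA * BETA)
A = (math.log(1.0 - ALPHA * BETA) + BETA * B * math.log(ALPHA * BETA)) / (1.0 - BETA)


def v(x):
    return A + B * math.log(x)


def v_prime(x, h=1e-6):
    """Central finite-difference approximation of v'(x)."""
    return (v(x + h) - v(x - h)) / (2.0 * h)


def residuals(x):
    """Residuals of (6.18), (6.19), (6.20) and the combined Euler condition."""
    y = g(x)
    bellman = v(x) - (F(x, y) + BETA * v(y))
    first_order = F_y(x, y) + BETA * v_prime(y)
    envelope = v_prime(x) - F_x(x, y)
    euler = F_y(x, y) + BETA * F_x(y, g(y))
    return bellman, first_order, envelope, euler


for x in (0.05, 0.2, 0.5, 1.0):
    print(x, ["%.1e" % abs(r) for r in residuals(x)])
```

All four residuals should be zero up to rounding and finite-difference error. The point of the exercise is that the Euler condition can be evaluated without knowing $v$ at all, which is what makes it useful in practice.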

However, this Euler equation is not sufficient for optimality. In addition we need the transversality condition. In the more general case this is equivalent to:

$$\lim_{t\to\infty} \beta^t \nabla_{x_t} F(x_t^*, x_{t+1}^*) \cdot x_t^* = 0,$$

where $\cdot$ denotes the inner product operator. In the one-dimensional case, we have the simpler transversality condition:

$$\lim_{t\to\infty} \beta^t \frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_t}\, x_t^* = 0. \tag{6.22}$$

In words, this condition requires that the product of the marginal return from the state variable $x$ times the value of this state variable does not increase asymptotically at a rate faster than $1/\beta$. We will see why this transversality condition makes sense shortly. But for now, we can note the following theorem:

Theorem 23 Let $X \subset \mathbb{R}^K_+$, and suppose that $X$, $\Gamma$, $F$, and $\beta$ satisfy Assumptions 4, 5, 6 and 7. Then the sequence $\{x_{t+1}^*\}_{t=0}^{\infty}$, with $x_{t+1}^* \in \mathrm{Int}\,\Gamma(x_t^*)$, $t = 0, 1, \ldots$, is optimal for Problem A1 given $x_0$, if it satisfies (6.21) and (6.22).

Proof. Let $x_0$ be given; let $\{x_t^*\}$ be a feasible (nonnegative) sequence satisfying (6.21) and (6.22) and $\{x_t\}$ another feasible (nonnegative) sequence. Assumptions 4, 6 and 7 imply that $F$ is continuous, concave, and differentiable, so let us define

$$\Delta \equiv \lim_{T\to\infty} \sum_{t=0}^{T} \beta^t \left[F(x_t^*, x_{t+1}^*) - F(x_t, x_{t+1})\right]$$

as the difference of the objective function between the feasible sequences $\{x_t^*\}$ and $\{x_t\}$. If we establish that $\Delta$ is nonnegative for any feasible nonnegative sequence $\{x_t\}$, then we will have established that $\{x_t^*\}$ yields no lower utility than any feasible $\{x_t\}$, thus it must be optimal. Now by definition of a concave function, we have

$$\Delta \ge \lim_{T\to\infty} \sum_{t=0}^{T} \beta^t \left[F_x(x_t^*, x_{t+1}^*) \cdot (x_t^* - x_t) + F_y(x_t^*, x_{t+1}^*) \cdot (x_{t+1}^* - x_{t+1})\right].$$

Since $x_0^* - x_0 = 0$, rearranging terms gives

$$\Delta \ge \lim_{T\to\infty} \left\{ \sum_{t=0}^{T-1} \beta^t \left[F_y(x_t^*, x_{t+1}^*) + \beta F_x(x_{t+1}^*, x_{t+2}^*)\right] \cdot (x_{t+1}^* - x_{t+1}) + \beta^T F_y(x_T^*, x_{T+1}^*) \cdot (x_{T+1}^* - x_{T+1}) \right\}.$$

Since $\{x_t^*\}$ satisfies (6.21), the terms in the summation are all zero. Therefore, substituting from (6.21) into the last term (and shifting the time index by one, which does not affect the limit) gives

$$\Delta \ge -\lim_{T\to\infty} \beta^T F_x(x_T^*, x_{T+1}^*) \cdot (x_T^* - x_T) \ge -\lim_{T\to\infty} \beta^T F_x(x_T^*, x_{T+1}^*) \cdot x_T^*,$$

where the last inequality uses the fact that from Assumption 5, $F$ is increasing in $x$, i.e., $F_x \ge 0$, and $x_t \ge 0$, all $t$. $\Delta \ge 0$ then immediately follows from (6.22), establishing the desired result.
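The comparison argument in this proof can be illustrated numerically. The sketch below is not part of the original argument and rests on assumptions introduced only for the example: the payoff $F(x, y) = \log(x^{\alpha} - y)$, the parameter values, a truncation of the infinite sums at a large date, and alternative feasible paths generated by constant savings rates. The candidate path $x_{t+1} = \alpha\beta x_t^{\alpha}$ satisfies the Euler equation (6.21) for this payoff, so the code checks that its transversality term in (6.22) vanishes and that the truncated difference $\Delta$ against each alternative path is positive.

```python
import math

ALPHA, BETA = 0.3, 0.95  # hypothetical parameter values
HORIZON = 300            # truncation date for the infinite sums


def F(x, y):
    """Assumed payoff: log utility, output x**ALPHA, full depreciation."""
    return math.log(x ** ALPHA - y)


def F_x(x, y):
    """Derivative of F with respect to its first argument."""
    return ALPHA * x ** (ALPHA - 1.0) / (x ** ALPHA - y)


def path(x0, savings_rate, horizon):
    """Feasible path x_{t+1} = savings_rate * x_t**ALPHA for t = 0..horizon."""
    xs = [x0]
    for _ in range(horizon + 1):
        xs.append(savings_rate * xs[-1] ** ALPHA)
    return xs


def discounted_sum(xs, horizon):
    """Truncated objective: sum over t = 0..horizon of BETA**t * F(x_t, x_{t+1})."""
    return sum(BETA ** t * F(xs[t], xs[t + 1]) for t in range(horizon + 1))


def transversality_term(xs, t):
    """BETA**t * F_x(x_t, x_{t+1}) * x_t, the expression inside the limit in (6.22)."""
    return BETA ** t * F_x(xs[t], xs[t + 1]) * xs[t]


x0 = 0.5
candidate = path(x0, ALPHA * BETA, HORIZON)
deltas = {s: discounted_sum(candidate, HORIZON) - discounted_sum(path(x0, s, HORIZON), HORIZON)
          for s in (0.1, 0.2, 0.5, 0.8)}

print("transversality term at t = 0, 100, 300:",
      [transversality_term(candidate, t) for t in (0, 100, HORIZON)])
print("Delta against constant-savings alternatives:", deltas)
```

Only a handful of alternative paths are tried here, so this is a sanity check rather than a proof. The theorem is what guarantees that no feasible path does better.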

6.3.2

To get more insights into dynamic programming, let us return to the sequence problem. Also, let us suppose that $x_t$ is one-dimensional and that there is a finite horizon $T$. Then the problem becomes

$$\max_{\{x_{t+1}\}_{t=0}^{T}} \sum_{t=0}^{T} \beta^t F(x_t, x_{t+1})$$

subject to $x_{t+1} \ge 0$ with $x_0$ as given. Moreover, let $F(x_T, x_{T+1})$ be the last period's utility, with $x_{T+1}$ as the state variable left after the last period (this utility could be thought of as the salvage value for example), since the world ends after date $T$.

In this case, we have a finite-dimensional optimization problem and we can simply look at first-order conditions. Moreover, let us again assume that the optimal solution lies in the interior of the constraint set, i.e., $x_t^* > 0$, so that we do not have to worry about boundary conditions and complementary-slackness type conditions. Given these, the first-order conditions of this finite-dimensional problem are exactly the same as the above Euler equation. In particular, we have for any $0 \le t \le T - 1$,

$$\beta^t \frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_{t+1}} + \beta^{t+1} \frac{\partial F(x_{t+1}^*, x_{t+2}^*)}{\partial x_{t+1}} = 0,$$

or, for any $0 \le t \le T - 1$,

$$\frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_{t+1}} + \beta \frac{\partial F(x_{t+1}^*, x_{t+2}^*)}{\partial x_{t+1}} = 0,$$

which are identical to the Euler equations for the infinite-horizon case. In addition, for $x_{T+1}$, we have the following boundary condition

$$x_{T+1}^* \ge 0, \quad \text{and} \quad \beta^T \frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_{T+1}}\, x_{T+1}^* = 0. \tag{6.23}$$

Intuitively, this boundary condition requires that $x_{T+1}^*$ should be positive only if an interior value of it maximizes the salvage value at the end. Again, returning to the growth example for a second, recall that $F(x, y) = u(f(x) + (1 - \delta)x - y)$, with the mapping $x = k_t$ and $y = k_{t+1}$. Now in this case at the last date $T$, we have

$$\frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_{T+1}} = -u'(c_T^*) < 0.$$

Therefore, we must have $k_{T+1}^* = 0$, i.e., there will be no capital left at the end of the world.

This is very intuitive. If any of it were left, utility could be improved by consuming that capital either at the last date or at some earlier date.
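The finite-horizon logic can be made concrete with a small sketch. It is an illustration under assumptions not made in the text: log utility, output $k^{\alpha}$, full depreciation, and hypothetical parameter values and function names. In this special case the optimal plan has the form $k_{t+1} = s_t k_t^{\alpha}$, the boundary condition (6.23) gives $s_T = 0$, and substituting into the Euler equation gives the backward recursion $s_t/(1 - s_t) = \alpha\beta/(1 - s_{t+1})$.

```python
ALPHA, BETA = 0.3, 0.95  # hypothetical parameter values
T = 10                   # last date of the finite horizon


def savings_rates(horizon):
    """Backward recursion s_t / (1 - s_t) = ALPHA * BETA / (1 - s_{t+1}), with s_T = 0."""
    s = [0.0] * (horizon + 1)
    for t in range(horizon - 1, -1, -1):
        ratio = ALPHA * BETA / (1.0 - s[t + 1])
        s[t] = ratio / (1.0 + ratio)
    return s


def simulate(k0, horizon):
    """Capital path k_0, ..., k_{T+1} under k_{t+1} = s_t * k_t**ALPHA."""
    s = savings_rates(horizon)
    k = [k0]
    for t in range(horizon + 1):
        k.append(s[t] * k[t] ** ALPHA)
    return k


def euler_residuals(k, horizon):
    """-u'(c_t) + BETA * f'(k_{t+1}) * u'(c_{t+1}) for t = 0..T-1, with u = log."""
    c = [k[t] ** ALPHA - k[t + 1] for t in range(horizon + 1)]
    return [-1.0 / c[t] + BETA * ALPHA * k[t + 1] ** (ALPHA - 1.0) / c[t + 1]
            for t in range(horizon)]


capital = simulate(0.2, T)
residual_list = euler_residuals(capital, T)
print("savings rates:", [round(s, 4) for s in savings_rates(T)])
print("terminal capital k_{T+1}:", capital[-1])
print("largest Euler residual:", max(abs(r) for r in residual_list))
```

The savings rate falls toward zero as the final date approaches, and terminal capital is exactly zero, in line with the discussion of (6.23).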

Now, heuristically we can derive the transversality condition as an extension of condition (6.23) to $T \to \infty$. Take this limit, which implies

$$\lim_{T\to\infty} \beta^T \frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_{T+1}}\, x_{T+1}^* = 0.$$

Moreover, the Euler equation at date $T$ is

$$\frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_{T+1}} + \beta \frac{\partial F(x_{T+1}^*, x_{T+2}^*)}{\partial x_{T+1}} = 0.$$

Substituting this into the previous condition gives

$$-\lim_{T\to\infty} \beta^{T+1} \frac{\partial F(x_{T+1}^*, x_{T+2}^*)}{\partial x_{T+1}}\, x_{T+1}^* = 0,$$

or canceling the negative sign, and without loss of any generality, changing the timing:

$$\lim_{T\to\infty} \beta^T \frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_T}\, x_T^* = 0,$$

which is exactly the transversality condition (6.22). This derivation also emphasizes that alternatively we could have had the transversality condition as

$$\lim_{T\to\infty} \beta^T \frac{\partial F(x_T^*, x_{T+1}^*)}{\partial x_{T+1}}\, x_{T+1}^* = 0,$$

which emphasizes that there is no unique transversality condition, but we generally need a boundary condition at infinity, which would be one of multiple potential conditions. This issue will return when we look at optimal control in continuous time.

Therefore, a slightly different (and more heuristic) way of obtaining Theorem 23 is to consider the above sequence problem with $T \to \infty$, i.e.,

$$\max_{\{x_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t F(x_t, x_{t+1}).$$

By taking the limit of the above finite-dimensional conditions, we obtain the Euler equation:

$$\frac{\partial F(x_t^*, x_{t+1}^*)}{\partial x_{t+1}} + \beta \frac{\partial F(x_{t+1}^*, x_{t+2}^*)}{\partial x_{t+1}} = 0 \quad \text{for all } t \ge 0,$$

and now the transversality condition (6.22) is also necessary, which can be established by using a variational argument, or heuristically, as the limit of the boundary condition as derived above.

6.4

We are now in a position to apply the methods developed so far to the problem of optimal growth. In this section, I will limit myself to optimal growth. Recall the optimal growth problem as

$$\max_{\{c(t), k(t)\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t u(c(t)) \tag{6.24}$$

subject to

$$k(t+1) = f(k(t)) + (1 - \delta) k(t) - c(t) \tag{6.25}$$

and $k(t) \ge 0$, with given $k(0)$. We continue to make the standard assumptions on the production function as in Assumptions 1 and 2. In addition, we assume that:

Assumption 8 $u : [\underline{c}, \infty) \to \mathbb{R}$ is continuously differentiable and strictly concave.

This is considerably stronger than what we need. In fact, concavity or even continuity is enough for most of the results. But this assumption helps us avoid inessential technical details. The lower bound on consumption is imposed to have a compact set of consumption possibilities.

The first step is to write the optimal growth problem as a (stationary) dynamic programming problem. This follows immediately from what we have done so far:

$$V(k) = \max_{c \in \Gamma(k)} \left\{u(c) + \beta V[f(k) + (1 - \delta)k - c]\right\} \tag{6.26}$$

with $\Gamma(k)$ given by the interval $[\underline{c}, f(k) + (1 - \delta)k]$ given the nonnegativity of the capital stock. Given the above theorems, in particular Theorems 15-22, the following proposition immediately follows:

Proposition 13 Given Assumptions 1, 2 and 8, the optimal growth model as specified in (6.24) and (6.25) has a stationary solution characterized by the value function $V(k)$ and consumption function $c(k)$. The amount $s(k)$ is the capital stock of the next period, where $s(k) = f(k) + (1 - \delta)k - c(k)$. Moreover, $V(k)$ is strictly increasing and concave in $k$ and $s(k)$ is nondecreasing.

Proof. Optimality of the solution to the value function (6.26) for the problem (6.24) and (6.25) follows from Theorems 15-18. That $V(k)$ exists follows from Theorem 19, and the fact that it is increasing and strictly concave, with the policy correspondence being a policy function, follows from Theorem 21. Thus we only have to show that $s(k)$ is nondecreasing. This can be proved by contradiction. Suppose, to arrive at a contradiction, that $s(k)$ is decreasing, i.e., there exist $k$ and $k' > k$ such that $s(k) > s(k')$. Since $k' > k$, $s(k)$ is feasible when the capital stock is $k'$. Moreover, since, by hypothesis, $s(k) > s(k')$, $s(k')$ is feasible at capital stock $k$.

By optimality and feasibility, we must have:

$$\begin{aligned}
V(k) &= u(f(k) + (1 - \delta)k - s(k)) + \beta V(s(k)) \ge u(f(k) + (1 - \delta)k - s(k')) + \beta V(s(k')), \\
V(k') &= u(f(k') + (1 - \delta)k' - s(k')) + \beta V(s(k')) \ge u(f(k') + (1 - \delta)k' - s(k)) + \beta V(s(k)).
\end{aligned}$$

Combining and rearranging these, we have

$$\begin{aligned}
u(f(k) + (1 - \delta)k - s(k)) - u(f(k) + (1 - \delta)k - s(k')) &\ge \beta\left[V(s(k')) - V(s(k))\right] \\
&\ge u(f(k') + (1 - \delta)k' - s(k)) - u(f(k') + (1 - \delta)k' - s(k')).
\end{aligned}$$

Or denoting $z \equiv f(k) + (1 - \delta)k$ and $x \equiv s(k)$, and similarly for $z'$ and $x'$, we have

$$u(z - x') - u(z - x) \le u(z' - x') - u(z' - x). \tag{6.27}$$

But clearly,

$$(z - x') - (z - x) = (z' - x') - (z' - x),$$

which combined with the fact that $z' > z$ and that $u$ is strictly concave and increasing implies that

$$u(z - x') - u(z - x) > u(z' - x') - u(z' - x),$$

contradicting (6.27). This establishes that $s(k)$ must be nondecreasing everywhere.

In addition, Assumption 2 (the Inada conditions) implies that savings and consumption levels have to be interior, thus Theorem 22 applies and immediately establishes:

Proposition 14 Given Assumptions 1, 2 and 8, the value function $V(k)$ defined above is differentiable.

Consequently, from Theorem 23, we can look at the Euler equations. To do this, let us write the recursive formulation as

$$V(k) = \max_{s \in \Gamma(k)} \left\{u(f(k) + (1 - \delta)k - s) + \beta V(s)\right\}.$$

In this case the Euler equation takes the simple form:

$$u'(c) = \beta V'(s),$$

where $s$ denotes the next date's capital stock. Applying the envelope condition, we have

$$V'(k) = \left[f'(k) + (1 - \delta)\right] u'(c).$$

Consequently, we have the familiar-looking condition

$$u'(c_t) = \beta\left[f'(k_{t+1}) + (1 - \delta)\right] u'(c_{t+1}).$$

A steady state is defined as usual as an allocation in which the capital-labor ratio and consumption do not depend on time, so again denoting this by *, we have the steady-state capital-labor ratio as

$$\beta\left[f'(k^*) + (1 - \delta)\right] = 1, \tag{6.28}$$

which is a remarkable result, because it shows that the steady-state capital-labor ratio does not depend on the form of the utility function $u$, but only on technology, depreciation and the discount factor. We will obtain an analogue of this result in the continuous-time neoclassical model as well. Moreover, since $f(\cdot)$ is strictly concave, $k^*$ is uniquely defined. Thus we have

Proposition 15 In the neoclassical optimal growth model specified in (6.24) and (6.25) with Assumptions 1, 2 and 8, there exists a unique steady-state capital-labor ratio $k^*$ given by (6.28), and starting from any initial $k_0$, the economy monotonically converges to this unique steady state, i.e., if $k_0 < k^*$, then the equilibrium capital stock sequence $k_t \uparrow k^*$ and if $k_0 > k^*$, then the equilibrium capital stock sequence $k_t \downarrow k^*$.

Proof. Uniqueness and existence were established above. To establish monotonic convergence, simply use the fact that $k_{t+1} = s(k_t)$ with $s(\cdot)$ defined in Proposition 13, which was shown to be nondecreasing. Since the steady state is unique, $k_t \ne k^*$ implies $s(k_t) \ne k_t$, so the sequence must be strictly monotone (increasing when $k_0 < k^*$). Next, note that $s(k_t)$ is nonnegative and can never exceed a finite upper bound $\bar{s}$ determined by $f$ and $\delta$, which exists, is unique and finite by Assumption 2. Consequently, when $k_0 < k^*$, $s(k_t)$ is an increasing sequence in the compact set $[0, \bar{s}]$. A monotonically increasing sequence in a compact set necessarily converges, thus $s(k_t) \to s_\infty$ for some $s_\infty$. However, any limit point of $s(k_t)$ must be equal to $k^*$, since this is the unique steady state, thus $s(k_t) \to k^*$, completing the proof.

Consequently, in the optimal growth model there exists a unique steady state and the economy monotonically converges to it, for example by accumulating more and more capital (if it starts with too low a capital-labor ratio). Finally, we can also show that consumption monotonically increases (or decreases) along the path of adjustment to the unique steady state:

Proposition 16 $c(k)$ defined in Proposition 13 is nondecreasing. Moreover, if $k_0 < k^*$, then the equilibrium consumption sequence $c_t \uparrow c^*$ and if $k_0 > k^*$, then $c_t \downarrow c^*$, where $c^*$ is given by $c^* = f(k^*) - \delta k^*$.

The proof of Proposition 16 is left as an exercise to you. This treatment shows that the optimal growth model is very tractable, and we can do the usual exercises we performed with the Solow growth model, including incorporating population growth and technological change. There is no immediate counterpart of a savings rate, since this depends on the utility function. But interestingly, and very differently from the Solow growth model, the steady-state capital-labor ratio and steady-state income level do not depend on the savings rate anyway. We will return to all of these issues, and provide a more detailed discussion of equilibrium growth in the context of the continuous time model. But for now, it is also useful to see how this optimal growth allocation can be decentralized, i.e., in this particular case we can use the second welfare theorem to show that the optimal growth allocation is also a competitive equilibrium.
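The steady-state condition (6.28) is easy to evaluate once a production function is chosen. The sketch below assumes $f(k) = k^{\alpha}$ and specific parameter values, neither of which comes from the text, and the function names are hypothetical. It solves (6.28) both in closed form and by bisection, and then computes $c^* = f(k^*) - \delta k^*$. Note that no utility parameter appears anywhere in the calculation.

```python
ALPHA, BETA, DELTA = 0.3, 0.95, 0.1  # hypothetical parameter values


def f(k):
    """Assumed production function in intensive form."""
    return k ** ALPHA


def f_prime(k):
    return ALPHA * k ** (ALPHA - 1.0)


def steady_state_closed_form():
    """Solve BETA * (f'(k) + 1 - DELTA) = 1 analytically for f(k) = k**ALPHA."""
    return (ALPHA / (1.0 / BETA - 1.0 + DELTA)) ** (1.0 / (1.0 - ALPHA))


def steady_state_bisection(lo=1e-8, hi=1e6, iterations=200):
    """Solve the same condition numerically; the left-hand side is decreasing in k."""
    def excess(k):
        return BETA * (f_prime(k) + 1.0 - DELTA) - 1.0

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


k_star = steady_state_closed_form()
k_bisect = steady_state_bisection()
c_star = f(k_star) - DELTA * k_star

print("k* (closed form):", k_star)
print("k* (bisection):  ", k_bisect)
print("c* = f(k*) - delta k*:", c_star)
```

Bisection works here because $f$ is strictly concave, so the left-hand side of (6.28) crosses one exactly once, which is the uniqueness argument in the text.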

6.5

To show that the Pareto optimal growth allocation can be decentralized is very straightforward. Suppose that all households are identical, with utility function given by $u(c)$ as above, and normalize their measure to 1. Suppose they all start with capital stock $k_0$. The other side of the economy consists of competitive firms. Households rent their capital to firms. It is straightforward to see that households will receive a rental price of $R_t = f'(k_t)$ because of competitive market prices. They will therefore face a gross rate of return equal to

$$r_t = \left[f'(k_t) + (1 - \delta)\right] \tag{6.29}$$

for renting one unit of capital at time $t$ in terms of date $t + 1$ goods. In addition, they will receive the wage rate of $w_t = f(k_t) - k_t f'(k_t)$.

Now consider the maximization problem of the representative household:

$$\max \sum_{t=0}^{\infty} \beta^t u(c_t)$$

subject to the flow budget constraint

$$a_{t+1} = r_t a_t + w_t - c_t,$$

where $a_t$ denotes asset holdings at time $t$, and also subject to a no-Ponzi constraint which requires the individual's asset holdings not to go to minus infinity. I will discuss this in greater detail below. For now, it suffices to see that by exactly the same Euler equation type arguments, we have

$$u'(c_t) = \beta r_{t+1} u'(c_{t+1}).$$

Imposing steady state implies that $c_t = c_{t+1}$, therefore, we must have $\beta r_{t+1} = 1$. Next, market clearing immediately implies that $r_{t+1}$ is given by (6.29), so the capital-labor ratio of the competitive equilibrium is given by

$$\beta\left[f'(k_{t+1}) + (1 - \delta)\right] = 1.$$

The steady state is given by

$$\beta\left[f'(k^*) + (1 - \delta)\right] = 1.$$

Both are exactly as in the optimal growth problem, i.e., equations (6.28) and (6.29). In fact, a similar argument establishes that the whole competitive equilibrium path is identical to the optimal growth path. This is, of course, not surprising in view of the second (and first) welfare theorems we saw above. We will discuss many of the implications of competitive economic growth in the neoclassical model once we go through the continuous time version as well.
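A small consistency check on the decentralization at the steady state is sketched below. It assumes $f(k) = k^{\alpha}$ and hypothetical parameter values, none of which come from the text. Factor prices are computed from the formulas above, and the code then checks three things: that $\beta r = 1$, that factor payments exhaust output, and that a household holding $a = k^*$ and consuming $c^* = f(k^*) - \delta k^*$ ends the period with the same assets under the flow budget constraint.

```python
ALPHA, BETA, DELTA = 0.3, 0.95, 0.1  # hypothetical parameter values


def f(k):
    """Assumed production function in intensive form."""
    return k ** ALPHA


def f_prime(k):
    return ALPHA * k ** (ALPHA - 1.0)


# Steady-state capital-labor ratio from BETA * (f'(k) + 1 - DELTA) = 1.
k_star = (ALPHA / (1.0 / BETA - 1.0 + DELTA)) ** (1.0 / (1.0 - ALPHA))

rental_price = f_prime(k_star)                 # R = f'(k)
gross_return = rental_price + 1.0 - DELTA      # r as in (6.29)
wage = f(k_star) - k_star * rental_price       # w = f(k) - k f'(k)
consumption = f(k_star) - DELTA * k_star       # steady-state consumption

# Flow budget constraint a' = r a + w - c, evaluated at a = k*.
next_assets = gross_return * k_star + wage - consumption

print("beta * r:", BETA * gross_return)
print("R k + w - f(k):", rental_price * k_star + wage - f(k_star))
print("next-period assets minus k*:", next_assets - k_star)
```

All three printed quantities should equal 1, 0 and 0 respectively, up to rounding.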


Chapter 7

The continuous time problem brings a number of new issues. The main reason is that even with a finite horizon, the maximization is with respect to an infinite-dimensional object (in fact an entire function, $y : [t_0, t_1] \to \mathbb{R}$). This requires us to review some basic ideas from the calculus of variations and from optimal control, but most of the tools and ideas that are necessary for this course are very straightforward.

I will start with the finite-horizon problem and the simplest treatment (which is much more similar to the calculus of variations than to optimal control), to give you the basic idea, and then provide the more powerful theorems from optimal control.

7.1

7.1.1 The Fundamental Problem

Consider the following finite-horizon continuous time problem

$$\max_{x(t),\, y(t),\, x_1} J(x(t), y(t)) \equiv \int_0^{t_1} f(t, x(t), y(t))\, dt \tag{7.1}$$

subject to

$$\dot{x}(t) = g(t, x(t), y(t)) \tag{7.2}$$

and

$$y(t) \in Y(t) \text{ for all } t, \quad x(0) = x_0 \text{ and } x(t_1) = x_1. \tag{7.3}$$

Here $x(t) \in \mathbb{R}$ is the state variable, whose behavior is governed by the differential equation (7.2). $y(t) \in Y(t) \subset \mathbb{R}$ is the control variable. In addition, we assume that $f$ and $g$ are continuously differentiable functions. This is the simplest optimal control problem because it has boundary conditions that regulate when the planning horizon ends (more generally, $t_1$ can be a choice variable as well, or it could extend to infinity as we will see later). The difficulty of this problem arises from two features:

1. We are choosing a function $y : [0, t_1] \to Y$ rather than a vector or a finite-dimensional object.
2. The constraint takes an unusual form of a differential equation.

These features make it difficult for us to know what type of optimal policy to look for. For example, $y$ may be a very discontinuous function. It may often hit the boundary of the feasible set, etc.

7.1.2 Variational Arguments

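Before going into greater detail, let us try to understand the essence of the problem, which can be done by using the variational principle of the calculus of variations. For this purpose, let us suppose that there exists a continuous function $\hat{y}(\cdot)$ defined over $[0, t_1]$ with $\hat{y}(t) \in \mathrm{Int}\,Y(t)$ which achieves the optimum in this problem. Therefore, we are ruling out both boundary solutions and discontinuities. Now consider the following variation

$$y(t, \varepsilon) \equiv \hat{y}(t) + \varepsilon \eta(t),$$

where $\eta(t)$ is an arbitrary fixed continuous function. We refer to this as a variation, because given $\eta(t)$, by varying $\varepsilon$, we obtain different paths of controls. The problem, of course, is that some of these may be infeasible, i.e., $y(t, \varepsilon) \notin Y(t)$ for some $t$. However, since $\hat{y}(t) \in \mathrm{Int}\,Y(t)$, and a continuous function over a compact set $[0, t_1]$ is bounded, for any function $\eta(\cdot)$ we can always find $\varepsilon_\eta > 0$ such that $\hat{y}(t) + \varepsilon \eta(t) \in \mathrm{Int}\,Y(t)$ for all $|\varepsilon| < \varepsilon_\eta$. Thus we can conduct variational arguments for small $\varepsilon$'s. But, in analogy with regular calculus, the argument that there is no gain from a variation for small $\varepsilon$'s is essentially what we need.

To prepare for these arguments, let us fix an arbitrary $\eta(\cdot)$, and define $x(t, \varepsilon)$ as the path of the state variable corresponding to the path of control variable $y(t, \varepsilon)$. This implies that $x(t, \varepsilon)$ is given by:

$$\dot{x}(t, \varepsilon) \equiv g(t, x(t, \varepsilon), y(t, \varepsilon)) \text{ for all } t \in [0, t_1], \text{ with } x(0, \varepsilon) = x_0. \tag{7.4}$$

Define

$$\mathcal{J}(\varepsilon) \equiv \int_0^{t_1} f(t, x(t, \varepsilon), y(t, \varepsilon))\, dt. \tag{7.5}$$

By the fact that $\hat{y}(t)$ is optimal, and that for $|\varepsilon| < \varepsilon_\eta$, $y(t, \varepsilon)$ (and thus $x(t, \varepsilon)$) is feasible, we have $\mathcal{J}(\varepsilon) \le \mathcal{J}(0)$ for all $|\varepsilon| < \varepsilon_\eta$. Next, rewrite equation (7.4), so for all $t \in [0, t_1]$:

$$g(t, x(t, \varepsilon), y(t, \varepsilon)) - \dot{x}(t, \varepsilon) \equiv 0.$$

Now for any continuously differentiable function $\lambda : [0, t_1] \to \mathbb{R}$, it must be the case that

$$\int_0^{t_1} \lambda(t) \left[g(t, x(t, \varepsilon), y(t, \varepsilon)) - \dot{x}(t, \varepsilon)\right] dt = 0. \tag{7.6}$$

The function $\lambda(\cdot)$, chosen suitably, will be the costate variable, with a similar interpretation to the Lagrange multipliers in regular (constrained) optimization. Now add (7.6) to (7.5) to obtain:

$$\mathcal{J}(\varepsilon) \equiv \int_0^{t_1} \left[f(t, x(t, \varepsilon), y(t, \varepsilon)) + \lambda(t)\left[g(t, x(t, \varepsilon), y(t, \varepsilon)) - \dot{x}(t, \varepsilon)\right]\right] dt.$$

Now we want to evaluate this term. Start by considering the integral $\int_0^{t_1} \lambda(t) \dot{x}(t, \varepsilon)\, dt$. Integrating this by parts, we have

$$\int_0^{t_1} \lambda(t) \dot{x}(t, \varepsilon)\, dt = \lambda(t_1) x(t_1, \varepsilon) - \lambda(0) x_0 - \int_0^{t_1} \dot{\lambda}(t) x(t, \varepsilon)\, dt.$$

Substituting this back, we obtain:

$$\mathcal{J}(\varepsilon) \equiv \int_0^{t_1} \left[f(t, x(t, \varepsilon), y(t, \varepsilon)) + \lambda(t) g(t, x(t, \varepsilon), y(t, \varepsilon)) + \dot{\lambda}(t) x(t, \varepsilon)\right] dt - \lambda(t_1) x(t_1, \varepsilon) + \lambda(0) x_0.$$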

14.451: Introduction to Economic Growth Now dierentiate this term with respect to . This is feasible by Leibnizs rule, since f and g are continuously dierentiable, and so is y (t, ) in by construction. Denoting their derivatives by x and y , and the derivatives of f and g by ft , fx , fy etc., dierentiation gives ()

0

t1

0

t1

[fy (t, x (t, ) , y (t, )) + (t) gy (t, x (t, ) , y (t, ))] (t) dt

t1

(t1 ) x (t1 , 0) . where x (t) denotes the path of the state variable corresponding to the optimal plan, y (t). As with the standard nite-dimensional optimization, if there exists some function (t) for which 0 (0) 6= 0, this means that the value of the program can be improved. Therefore, we need to have 0 (0) 0 for all (t) . This can only be possible if the second integral is equal to zero for all (t), i.e., only if Z

t1

t1

[fy (t, x (t) , y (t)) + (t) gy (t, x (t) , y (t))] (t) dt = 0 for all (t) ,

which is only possible if (t)) + (t) gy (t, x (t) , y (t)) 0 for all t [0, t1 ] . fy (t, x (t) , y 153 (7.7)

By the same reasoning, $x_\varepsilon$ is also arbitrary, so we need the first integral to be identically equal to zero, or

$$\dot{\lambda}(t) = -\left[f_x(t,\hat{x}(t),\hat{y}(t)) + \lambda(t)g_x(t,\hat{x}(t),\hat{y}(t))\right], \tag{7.8}$$

and therefore $\lambda(t_1)=0$. (Equivalently, since $\lambda(\cdot)$ is ours to choose, we can pick it to solve this differential equation with terminal value $\lambda(t_1)=0$, which eliminates every term involving $x_\varepsilon$.) This derivation (from the calculus of variations) has therefore established the following theorem:

Theorem 24 (Necessary Conditions) Suppose that the problem of maximizing (7.1) subject to (7.2) and (7.3), with $f$ and $g$ continuously differentiable, has an interior solution $\hat{y}(t)\in \mathrm{Int}\,Y(t)$ with corresponding path of the state variable $\hat{x}(t)$. Then there exists a continuously differentiable costate function $\lambda(\cdot)$ defined over $t\in[0,t_1]$ such that (7.2), (7.7) and (7.8) hold.

7.1.3

The conditions (7.7) and (7.8) should remind you of a Lagrangian maximization. By analogy with the Lagrangian, a much more economical way of expressing Theorem 24 is to construct the equivalent of the Lagrangian in this case, the Hamiltonian:

$$H(t,x,y,\lambda) \equiv f(t,x(t),y(t)) + \lambda(t)g(t,x(t),y(t)). \tag{7.9}$$

Then we have:

Theorem 25 (Simplified Maximum Principle) Suppose that the problem of maximizing (7.1) subject to (7.2) and (7.3), with $f$ and $g$ continuously differentiable, has an interior solution $\hat{y}(t)\in \mathrm{Int}\,Y(t)$ with corresponding path of the state variable $\hat{x}(t)$. Let $H(t,x,y,\lambda)$ be given by (7.9). Then the optimal control $\hat{y}(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy the following necessary conditions:

$$H_y(t,\hat{x}(t),\hat{y}(t),\lambda(t)) = 0 \ \text{ for all } t\in[0,t_1], \tag{7.10}$$

$$\dot{\lambda}(t) = -H_x(t,\hat{x}(t),\hat{y}(t),\lambda(t)) \ \text{ for all } t\in[0,t_1], \ \text{ and } \lambda(t_1)=0, \tag{7.11}$$

$$\dot{x}(t) = H_\lambda(t,\hat{x}(t),\hat{y}(t),\lambda(t)) \ \text{ for all } t\in[0,t_1], \ \text{ and } x(0)=x_0. \tag{7.12}$$
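The conditions of Theorem 25 can be checked numerically on a small example. The sketch below uses an illustrative linear-quadratic problem that is not taken from the text: maximize $\int_0^1 -(x^2+y^2)\,dt$ subject to $\dot{x}=y$ and $x(0)=1$. For this assumed problem the Hamiltonian is $H=-(x^2+y^2)+\lambda y$, and the candidate solution $\hat{x}(t)=\cosh(1-t)/\cosh(1)$ is worked out by hand from (7.10)-(7.12). The function names (`x_hat`, `y_hat`, `lam`, `residuals`, `objective`) and the variation $\eta(t)=\sin(\pi t)$ are likewise assumptions made for illustration.

```python
import math

# Illustrative problem (an assumption, not from the text):
#   maximize  int_0^1 -(x(t)^2 + y(t)^2) dt   s.t.  x'(t) = y(t),  x(0) = 1.
# Hamiltonian: H = -(x^2 + y^2) + lam * y.
T1 = 1.0

def x_hat(t):
    """Candidate optimal state path."""
    return math.cosh(T1 - t) / math.cosh(T1)

def y_hat(t):
    """Candidate optimal control, equal to the time derivative of x_hat."""
    return -math.sinh(T1 - t) / math.cosh(T1)

def lam(t):
    """Costate implied by H_y = -2*y + lam = 0."""
    return 2.0 * y_hat(t)

def deriv(fn, t, h=1e-6):
    """Central finite-difference derivative."""
    return (fn(t + h) - fn(t - h)) / (2.0 * h)

def residuals(t):
    """Residuals of (7.10)-(7.12) at time t; each should be close to zero."""
    r_control = -2.0 * y_hat(t) + lam(t)        # H_y = 0
    r_costate = deriv(lam, t) - 2.0 * x_hat(t)  # lam' = -H_x = 2x
    r_state = deriv(x_hat, t) - y_hat(t)        # x' = H_lam = y
    return r_control, r_costate, r_state

def objective(eps, n=2000):
    """W(eps) for the variation y = y_hat + eps*eta, eta(t) = sin(pi*t) (trapezoid rule)."""
    dt = T1 / n
    total = 0.0
    for i in range(n + 1):
        t = i * dt
        y = y_hat(t) + eps * math.sin(math.pi * t)
        # x(t, eps) = x_hat(t) + eps * int_0^t sin(pi*s) ds, since x' = y.
        x = x_hat(t) + eps * (1.0 - math.cos(math.pi * t)) / math.pi
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * -(x * x + y * y) * dt
    return total

if __name__ == "__main__":
    print("residuals at t=0.3:", residuals(0.3))
    print("lam(t1):", lam(T1))
    print("W(0), W(0.1), W(-0.1):", objective(0.0), objective(0.1), objective(-0.1))
```

In this example the boundary condition $\lambda(t_1)=0$ holds exactly, and moving $\varepsilon$ away from zero in either direction lowers the objective, which is the inequality $W(\varepsilon)\le W(0)$ that drives the variational argument.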

Theorem 25 is a simplified version of the celebrated Maximum Principle; the more general version will be given below. For now, a few features are worth noting:

1. As in the usual constrained maximization problems, we find the optimal solution by looking jointly for a set of multipliers and the optimal path of the control and state variables. Here the multipliers are referred to as the costate variables.

2. Again as in the usual constrained maximization problems, the costate variables are informative about the value of relaxing the constraint. Here $\lambda(t)$ is the value of an infinitesimal increase in $x(t)$ at time $t$.

3. With this interpretation, it makes sense that $\lambda(t_1)=0$ is part of the necessary conditions. After the planning horizon, there is no value to having more $x$. This is therefore the finite-horizon equivalent of the transversality condition we encountered above.

While Theorem 25 gives necessary conditions, as in regular optimization problems these may not be sufficient. Sufficiency is again guaranteed by imposing concavity. The following theorem provides conditions under which the necessary conditions are also sufficient to characterize the optimal plan.

Theorem 26 (Mangasarian Sufficient Conditions) Consider the problem of maximizing (7.1) subject to (7.2) and (7.3), with $f$ and $g$ continuously differentiable. Define $H(t,x,y,\lambda)$ as in (7.9), and suppose that an interior solution $\hat{y}(t)\in \mathrm{Int}\,Y(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy (7.10)-(7.12). Suppose also that, for the resulting costate variable $\lambda(t)$, $H(t,x,y,\lambda)$ is jointly concave in $(x,y)$ for all $t\in[0,t_1]$. Then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.1).

An alternative set of sufficient conditions is provided by Arrow:

Theorem 27 (Arrow Sufficient Conditions) Consider the problem of maximizing (7.1) subject to (7.2) and (7.3), with $f$ and $g$ continuously differentiable. Define $H(t,x,y,\lambda)$ as in (7.9), and suppose that an interior solution $\hat{y}(t)\in \mathrm{Int}\,Y(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy (7.10)-(7.12). Given the resulting costate variable $\lambda(t)$, define $M(t,x,\lambda)\equiv H(t,x,\hat{y}(t),\lambda)$. If $M(t,x,\lambda)$ is concave in $x$ for all $t\in[0,t_1]$, then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.1).

The proofs of these theorems are long and not necessary for what follows, so they are omitted. As stated, Theorem 26, and even Theorem 27, is difficult to apply: given the concavity or convexity of the function $g(\cdot)$, the concavity of the Hamiltonian depends on the sign of the costate variable $\lambda(t)$. The following lemma (proof again omitted) provides some information on the sign of $\lambda(t)$:

Lemma 2 Suppose that $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ are the optimal solutions to maximizing (7.1) subject to (7.2) and (7.3), with corresponding costate variable $\lambda(t)$. Then:

1. If $f_x(t,\hat{x}(t),\hat{y}(t))>0$ for all $t\in[0,t_1]$, then $\lambda(t)>0$ for all $t\in[0,t_1)$.

2. If $f_x(t,\hat{x}(t),\hat{y}(t))<0$ for all $t\in[0,t_1]$, then $\lambda(t)<0$ for all $t\in[0,t_1)$.

3. If $f_x(t,\hat{x}(t),\hat{y}(t))=0$ for all $t\in[0,t_1]$, then $\lambda(t)=0$ for all $t\in[0,t_1)$.

Therefore, as in standard maximization problems, there is an intimate relationship between the sign of the multiplier and the returns from increasing the stock of the state variable, but here we need the effect of the state variable to be positive everywhere in order for the multiplier to be positive, etc.

The usefulness of Lemma 2 comes from the fact that if $\lambda(t)\ge 0$ for all $t$ (which follows from $f_x>0$ for all $t$), then the Hamiltonian given in (7.9) is a concave function of $x$ and $y$ for given $\lambda(t)$ when $f$ and $g$ are concave functions. In that case, the sufficient conditions in Theorem 26 are very straightforward to check (though often quite restrictive).

7.1.4 Generalizations

The above theorems can be immediately generalized to the case in which the state variable and the controls are vectors rather than scalars, and also to the case in which there are constraints. The constrained case requires constraint qualification conditions, as in the standard finite-dimensional optimization case. These are slightly messier to express, and since we will make no use of the constrained maximization problems, I will not state these theorems. The vector-valued theorems are direct generalizations of the ones presented above, and are useful in growth models with multiple capital goods. In particular, consider the problem

$$\max_{x(t),y(t),x_1} J(x(t),y(t)) \equiv \int_0^{t_1} f(t,x(t),y(t))\,dt \tag{7.13}$$

subject to

$$\dot{x}(t) = g(t,x(t),y(t)), \tag{7.14}$$

and

$$y(t)\in Y(t) \ \text{ for all } t, \quad x(0)=x_0 \ \text{ and } \ x(t_1)=x_1. \tag{7.15}$$

Here $x(t)\in\mathbb{R}^K$ for some $K\ge 1$ is the state variable and again $y(t)\in Y(t)\subset\mathbb{R}^N$ for some $N\ge 1$ is the control variable. In addition, we again assume that $f$ and $g$ are continuously differentiable functions. We then have:

Theorem 28 (Maximum Principle) Suppose that the problem of maximizing (7.13) subject to (7.14) and (7.15), with $f$ and $g$ continuously differentiable, has an interior solution $\hat{y}(t)\in \mathrm{Int}\,Y(t)$ with corresponding path of the state variable $\hat{x}(t)$. Let $H(t,x,y,\lambda)$ be given by

$$H(t,x,y,\lambda) \equiv f(t,x(t),y(t)) + \lambda(t)\cdot g(t,x(t),y(t)), \tag{7.16}$$

where $\lambda(t)\in\mathbb{R}^K$. Then the optimal control $\hat{y}(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy the following necessary conditions:

$$\nabla_y H(t,\hat{x}(t),\hat{y}(t),\lambda(t)) = 0 \ \text{ for all } t\in[0,t_1], \tag{7.17}$$

$$\dot{\lambda}(t) = -\nabla_x H(t,\hat{x}(t),\hat{y}(t),\lambda(t)) \ \text{ for all } t\in[0,t_1], \ \text{ and } \lambda(t_1)=0, \tag{7.18}$$

$$\dot{x}(t) = \nabla_\lambda H(t,\hat{x}(t),\hat{y}(t),\lambda(t)) \ \text{ for all } t\in[0,t_1], \ \text{ and } x(0)=x_0. \tag{7.19}$$

Moreover, we have straightforward generalizations of the sufficiency conditions:

Theorem 29 (Mangasarian Sufficient Conditions) Consider the problem of maximizing (7.13) subject to (7.14) and (7.15), with $f$ and $g$ continuously differentiable. Define $H(t,x,y,\lambda)$ as in (7.16), and suppose that an interior solution $\hat{y}(t)\in \mathrm{Int}\,Y(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy (7.17)-(7.19). Suppose also that, for the resulting costate variable $\lambda(t)$, $H(t,x,y,\lambda)$ is jointly concave in $(x,y)$ for all $t\in[0,t_1]$. Then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.13).

Theorem 30 (Arrow Sufficient Conditions) Consider the problem of maximizing (7.13) subject to (7.14) and (7.15), with $f$ and $g$ continuously differentiable. Define $H(t,x,y,\lambda)$ as in (7.16), and suppose that an interior solution $\hat{y}(t)\in \mathrm{Int}\,Y(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy (7.17)-(7.19). Given the resulting costate variable $\lambda(t)$, define $M(t,x,\lambda)\equiv H(t,x,\hat{y}(t),\lambda)$. If $M(t,x,\lambda)$ is concave in $x$ for all $t\in[0,t_1]$, then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.13).

7.1.5 Limitations

The limitations of what we have done so far are obvious. First, we have assumed that a continuous and interior solution to the optimal control problem exists. This is in general a very strong assumption. Second, and equally important for our purposes, we have so far looked at the finite-horizon case, whereas the analysis of growth models requires us to solve infinite-horizon problems. To deal with both of these issues, we need to look at the more modern theory of optimal control. This is done in the next section.

7.2

7.2.1

The Basic Problem: Necessary and Sufficient Conditions

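Consider the following infinite-horizon version of the problem of maximizing (7.1) subject to (7.2) and (7.3):

$$\max_{x(t),y(t)} J(x(t),y(t)) \equiv \int_0^{\infty} f(t,x(t),y(t))\,dt \tag{7.20}$$

subject to

$$\dot{x}(t) = g(t,x(t),y(t)), \tag{7.21}$$

and

$$y(t)\in\mathbb{R} \ \text{ for all } t, \quad x(0)=x_0 \ \text{ and } \ \lim_{t\to\infty} x(t)\ge x_1. \tag{7.22}$$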

The main difference is that now time runs to infinity, and there is no choice of endpoint $x_1$. In addition, I have simplified the problem by removing the feasibility set on the control $y(t)$, simply requiring this function to be real-valued. For this problem, we call a pair $(x(t),y(t))$ admissible if $y(t)$ is a piecewise continuous function of time and $x(t)$ is a piecewise smooth function of time satisfying (7.21) given $y(t)$ (since $x(t)$ is given by a continuous differential equation, the piecewise continuity of $y(t)$ ensures the piecewise smoothness of $x(t)$). Notice that this is a significant generalization of the above approach, since discontinuous controls are allowed as long as they are piecewise continuous.

There are a number of technical difficulties when dealing with the infinite-horizon case, which are similar to those in the discrete-time analysis. Primary among those is the fact that the value of the functional in (7.20) may not be finite. We will deal with some of these issues below.

The main theorem for the infinite-horizon optimal control problem is the following maximum principle:

Theorem 31 Suppose that the problem of maximizing (7.20) subject to (7.21) and (7.22), with $f$ and $g$ continuously differentiable, has an interior solution $\hat{y}(t)$ with corresponding path of the state variable $\hat{x}(t)$. Let $H(t,x,y,\lambda)$ be given by (7.9). Then the optimal control $\hat{y}(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy the following necessary conditions:

$$H_y(t,\hat{x}(t),\hat{y}(t),\lambda(t)) = 0 \ \text{ for all } t\in\mathbb{R}_+, \tag{7.23}$$

$$\dot{\lambda}(t) = -H_x(t,\hat{x}(t),\hat{y}(t),\lambda(t)) \ \text{ for all } t\in\mathbb{R}_+, \tag{7.24}$$

$$\dot{x}(t) = H_\lambda(t,\hat{x}(t),\hat{y}(t),\lambda(t)) \ \text{ for all } t\in\mathbb{R}_+, \quad x(0)=x_0 \ \text{ and } \ \lim_{t\to\infty}x(t)\ge x_1. \tag{7.25}$$

Notice an important difference between Theorem 25 and the current theorem. There is no boundary condition in Theorem 31 corresponding to $\lambda(t_1)=0$ of Theorem 25. Consequently, the necessary conditions in Theorem 31 will not uniquely pin down a solution path. To do this we need an infinite-horizon version of the transversality condition. One might be tempted to impose a condition of the form

$$\lim_{t\to\infty}\lambda(t) = 0$$

as the transversality condition, but this is not in general the case. We will shortly see an example where this does not apply. A milder transversality condition of the form

$$\lim_{t\to\infty} H(t,x,y,\lambda) = 0$$

always applies, but is not easy to check. Stronger transversality conditions apply when we put more structure on the problem. Before we do this, there are immediate generalizations of the sufficiency theorems to this case.

Theorem 32 (Mangasarian Sufficient Conditions for Infinite Horizon) Consider the problem of maximizing (7.20) subject to (7.21) and (7.22), with $f$ and $g$ continuously differentiable. Define $H(t,x,y,\lambda)$ as in (7.9), and suppose that a solution $\hat{y}(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy (7.23)-(7.25). Suppose also that, for the resulting costate variable $\lambda(t)$, $H(t,x,y,\lambda)$ is jointly concave in $(x,y)$ for all $t\in\mathbb{R}_+$ and that $\lim_{t\to\infty}\lambda(t)\left(x(t)-\hat{x}(t)\right)\ge 0$ for all $x(t)$ implied by an admissible control path $y(t)$. Then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.20).

Theorem 33 (Arrow Sufficient Conditions for Infinite Horizon) Consider the problem of maximizing (7.20) subject to (7.21) and (7.22), with $f$ and $g$ continuously differentiable. Define $H(t,x,y,\lambda)$ as in (7.9), and suppose that a solution $\hat{y}(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy (7.23)-(7.25). Given the resulting costate variable $\lambda(t)$, define $M(t,x,\lambda)\equiv H(t,x,\hat{y}(t),\lambda)$. If $M(t,x,\lambda)$ is concave in $x$ and $\lim_{t\to\infty}\lambda(t)\left(x(t)-\hat{x}(t)\right)\ge 0$ for all $x(t)$ implied by an admissible control path $y(t)$, then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.20).

Notice that both of these sufficiency theorems include the difficult-to-check condition that $\lim_{t\to\infty}\lambda(t)\left(x(t)-\hat{x}(t)\right)\ge 0$ for all $x(t)$ implied by an admissible control path $y(t)$. This condition will disappear once we can impose a proper transversality condition.

7.2.2

The following example, which is very close to the original Ramsey model, illustrates that the finite-horizon transversality condition does not, in general, carry over to the infinite-horizon case.

Example 2 Consider the following problem:

$$\max \int_0^{\infty}\left[\log(c(t)) - \log c^*\right]dt$$

subject to

$$\dot{k}(t) = [k(t)]^{\alpha} - c(t) - \delta k(t) \quad \text{and} \quad \lim_{t\to\infty}k(t)\ge 0,$$

with $k(0)$ given, where $c^* \equiv [k^*]^{\alpha} - \delta k^*$ and $k^* \equiv (\alpha/\delta)^{1/(1-\alpha)}$. In other words, $c^*$ is the maximum level of consumption that can be sustained in this model. This way of writing the objective function makes sure that the integral converges and takes a finite value (since $c(t)$ cannot exceed $c^*$ forever). The Hamiltonian is straightforward to construct and takes the form

$$H(k,c,\lambda) = \left[\log c - \log c^*\right] + \lambda\left[k^{\alpha} - c - \delta k\right],$$

and implies the following necessary conditions (dropping time dependence to simplify the notation):

$$H_c = \frac{1}{c} - \lambda = 0,$$

$$H_k = \lambda\left(\alpha k^{\alpha-1} - \delta\right) = -\dot{\lambda}.$$

It can be verified that $c(t)\to c^*$ satisfies the necessary conditions, and must be part of any optimal path. This, however, implies that

$$\lim_{t\to\infty}\lambda(t) = \frac{1}{c^*} > 0 \quad \text{and} \quad \lim_{t\to\infty}k(t) = k^*.$$

Therefore, the equivalent of the standard finite-horizon transversality condition does not hold. It can be verified, however, that along the optimal path

$$\lim_{t\to\infty} H(k(t),c(t),\lambda(t)) = 0.$$
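The limiting values in Example 2 can be confirmed with a few lines of arithmetic. The sketch below evaluates the Hamiltonian and the costate equation at $(k^*,c^*,1/c^*)$; the parameter values $\alpha=0.3$ and $\delta=0.1$ and the function names are assumptions chosen for illustration, not values from the text.

```python
import math

ALPHA = 0.3  # assumed capital exponent
DELTA = 0.1  # assumed depreciation rate

# Limiting values from Example 2.
k_star = (ALPHA / DELTA) ** (1.0 / (1.0 - ALPHA))
c_star = k_star ** ALPHA - DELTA * k_star
lam_star = 1.0 / c_star  # from H_c = 1/c - lam = 0

def hamiltonian(k, c, lam):
    """H(k, c, lam) = [log c - log c*] + lam * [k^alpha - c - delta*k]."""
    return (math.log(c) - math.log(c_star)) + lam * (k ** ALPHA - c - DELTA * k)

def costate_drift(k, lam):
    """Time derivative of lam implied by H_k = -lam'."""
    return -lam * (ALPHA * k ** (ALPHA - 1.0) - DELTA)

if __name__ == "__main__":
    print("k*, c*, lam*:", k_star, c_star, lam_star)
    print("H at the limit point:", hamiltonian(k_star, c_star, lam_star))
    print("lam' at the limit point:", costate_drift(k_star, lam_star))
```

Under these assumed parameters the costate settles at a strictly positive constant while the Hamiltonian is zero, which is the pattern the example describes: $\lim\lambda(t)=0$ fails but $\lim H=0$ holds.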

7.2.3

Part of the difficulty, especially regarding the absence of a transversality condition, comes from the fact that we did not impose enough structure on the functions $f$ and $g$. As discussed above, our interest is in growth models where utility is discounted exponentially. The problem then takes the more special form

$$\max_{x(t),y(t)} J(x(t),y(t)) \equiv \int_0^{\infty}\exp(-\rho t)\,u(x(t),y(t))\,dt \tag{7.26}$$

subject to

$$\dot{x}(t) = g(t,x(t),y(t)), \tag{7.27}$$

and

$$y(t)\in\mathbb{R} \ \text{ for all } t, \quad x(0)=x_0 \ \text{ and } \ \lim_{t\to\infty}x(t)\ge x_1. \tag{7.28}$$

The special feature of this problem is that the payoff function, the equivalent of $f$, depends on time only through exponential discounting. The Hamiltonian in this case would be

$$H(t,x(t),y(t),\lambda(t)) = \exp(-\rho t)\,u(x(t),y(t)) + \lambda(t)g(t,x(t),y(t))$$
$$= \exp(-\rho t)\left[u(x(t),y(t)) + \mu(t)g(t,x(t),y(t))\right],$$

where the second line defines $\mu(t)\equiv\exp(\rho t)\lambda(t)$. This equation makes it clear that the Hamiltonian depends on time explicitly only through the $\exp(-\rho t)$ term. In fact, in this case, rather than working with the standard Hamiltonian, we can work with the current-value Hamiltonian, defined as

$$\hat{H}(x(t),y(t),\mu(t)) \equiv u(x(t),y(t)) + \mu(t)g(t,x(t),y(t)), \tag{7.29}$$

which is not explicitly a function of time. We have the following result, which not only states necessary conditions similar to those of Theorem 31, but also shows the necessity of a transversality condition:

Theorem 34 (Maximum Principle for Discounted Infinite-Horizon Problems) Suppose that the problem of maximizing (7.26) subject to (7.27) and (7.28), with $u$ and $g$ continuously differentiable, has a solution $\hat{y}(t)$ with corresponding path of the state variable $\hat{x}(t)$. Let $\hat{H}(x,y,\mu)$ be the current-value Hamiltonian given by (7.29). Then the optimal control $\hat{y}(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy the following necessary conditions:

$$\hat{H}_y(\hat{x}(t),\hat{y}(t),\mu(t)) = 0 \ \text{ for all } t\in\mathbb{R}_+, \tag{7.30}$$

$$\rho\mu(t) - \dot{\mu}(t) = \hat{H}_x(\hat{x}(t),\hat{y}(t),\mu(t)) \ \text{ for all } t\in\mathbb{R}_+ \ \text{ and } \ \lim_{t\to\infty}\left[\exp(-\rho t)\mu(t)\hat{x}(t)\right] = 0, \tag{7.31}$$

$$\dot{x}(t) = \hat{H}_\mu(\hat{x}(t),\hat{y}(t),\mu(t)) \ \text{ for all } t\in\mathbb{R}_+, \quad x(0)=x_0 \ \text{ and } \ \lim_{t\to\infty}x(t)\ge x_1. \tag{7.32}$$
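The costate condition for the current-value Hamiltonian follows from the present-value condition $\dot{\lambda}=-H_x$ by the change of variable $\lambda(t)=\exp(-\rho t)\mu(t)$. The short derivation below is a sketch of that step, using only the definitions of $H$ and $\hat{H}$ given above:

```latex
\begin{align*}
H(t,x,y,\lambda) &= \exp(-\rho t)\,\hat{H}(x,y,\mu)
  &&\Rightarrow\quad H_x = \exp(-\rho t)\,\hat{H}_x, \\
\lambda(t) &= \exp(-\rho t)\,\mu(t)
  &&\Rightarrow\quad \dot{\lambda}(t) = \exp(-\rho t)\left[\dot{\mu}(t) - \rho\mu(t)\right], \\
\dot{\lambda}(t) &= -H_x
  &&\Rightarrow\quad \rho\mu(t) - \dot{\mu}(t) = \hat{H}_x(x,y,\mu).
\end{align*}
```

The same substitution turns a condition on $\lambda(t)x(t)$ into one on $\exp(-\rho t)\mu(t)x(t)$, which is where the discount factor in the transversality condition comes from.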

The important feature of Theorem 34, which is the most useful theorem for the rest of the course, is that it also shows that the transversality condition

$$\lim_{t\to\infty}\left[\exp(-\rho t)\mu(t)\hat{x}(t)\right] = 0$$

is necessary. Notice that, compared to the transversality condition before, there is the additional term $\exp(-\rho t)$. This is because the transversality condition applies to the original costate variable $\lambda(t)$, i.e., $\lim_{t\to\infty}\left[x(t)\lambda(t)\right]=0$, and as shown above the current-value costate variable $\mu(t)$ is given by $\mu(t)=\exp(\rho t)\lambda(t)$. The sufficiency theorems can also be strengthened now by incorporating the transversality condition and expressing the conditions in terms of the current-value Hamiltonian:

Theorem 35 (Mangasarian Sufficient Conditions for Discounted Infinite-Horizon Problems) Consider the problem of maximizing (7.26) subject to (7.27) and (7.28), with $u$ and $g$ continuously differentiable. Define $\hat{H}(x,y,\mu)$ as the current-value Hamiltonian as in (7.29), and suppose that a solution $\hat{y}(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy (7.30)-(7.32). Suppose also that, for the resulting current-value costate variable $\mu(t)$, $\hat{H}(x,y,\mu)$ is jointly concave in $(x,y)$ for all $t\in\mathbb{R}_+$. Then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.26).

Theorem 36 (Arrow Sufficient Conditions for Discounted Infinite-Horizon Problems) Consider the problem of maximizing (7.26) subject to (7.27) and (7.28), with $u$ and $g$ continuously differentiable. Define $\hat{H}(x,y,\mu)$ as the current-value Hamiltonian as in (7.29), and suppose that a solution $\hat{y}(t)$ and the corresponding path of the state variable $\hat{x}(t)$ satisfy (7.30)-(7.32). Given the resulting current-value costate variable $\mu(t)$, define $M(t,x,\mu)\equiv\hat{H}(x,\hat{y},\mu)$. If $M(t,x,\mu)$ is concave in $x$, then $\hat{y}(t)$ and the corresponding $\hat{x}(t)$ achieve the unique global maximum of (7.26).

We are now ready to start our analysis of the standard neoclassical growth model (also known as the Ramsey, or Cass-Koopmans, model). This model differs from the Solow model only in explicitly modeling the consumer side and endogenizing savings (i.e., allowing consumer optimization). Beyond its use as a basic growth model, this model has become a workhorse for many areas of macroeconomics, including the analysis of fiscal policy, taxation, business cycles, and even monetary policy.

8.1

The economy is an infinite-horizon economy in continuous time (the discrete-time version was analyzed above). We assume that the economy admits a representative household with instantaneous utility function

$$u(c(t)), \tag{8.1}$$

and we make the following standard assumptions on this utility function:

Assumption 9 $u(c)$ is strictly increasing, twice continuously differentiable with derivatives $u'$ and $u''$, and concave, and satisfies the following Inada-type assumptions:

$$\lim_{c\to 0}u'(c) = \infty \quad \text{and} \quad \lim_{c\to\infty}u'(c) = 0.$$

Alternatively, we can think of the economy as consisting of a unit measure of identical households, each with the instantaneous utility function given by (8.1). Population within each household grows at the rate $n$, starting with $L(0)=1$, so that total population is

$$L(t) = \exp(nt). \tag{8.2}$$

All members of the household supply their labor inelastically. We assume that each household maximizes overall utility $U(0)$ at time $t=0$, given by

$$U(0) \equiv \int_0^{\infty}\exp(-(\rho-n)t)\,u(c(t))\,dt, \tag{8.3}$$

where $c(t)$ is consumption per capita at time $t$, $\rho$ is the subjective discount rate, and the effective discount rate is $\rho-n$, since it is assumed that the household derives utility from the consumption of its additional members in the future as well. We assume throughout that:

Assumption 10 $\rho>n$.

This assumption ensures that there is in fact discounting of future utility streams. Otherwise, (8.3) would have infinite value, and standard optimization techniques would not be useful in determining what an optimal plan is (we would need to use overtaking-type criteria, etc.). More generally, there is something somewhat strange about models in which utility is equal to infinity. Assumption 10 makes sure that in the model without growth, discounted utility is finite. When there is growth, we will strengthen this assumption.

We start with an economy without any technological progress. Factor and product markets are competitive, and the production possibilities set of the economy is represented by the aggregate production function

$$Y(t) = F[K(t),L(t)],$$

which is a simplified version of the production function (2.1) used in the Solow growth model above, where the simplification comes from the fact that there is no technology term. As in the analysis there, we impose the standard constant returns to scale and Inada assumptions embedded in Assumptions 1 and 2. The constant returns to scale feature enables us to work with the per capita production function $f(\cdot)$ such that output per capita is given by

$$y(t) \equiv \frac{Y(t)}{L(t)} = F\left[\frac{K(t)}{L(t)},1\right] \equiv f(k(t)), \quad \text{where} \quad k(t)\equiv\frac{K(t)}{L(t)}.$$

Competitive factor markets then imply that, at all points in time, the rental rate of capital and the wage rate are given by

$$R(t) = F_K[K(t),L(t)] = f'(k(t)) \tag{8.5}$$

and

$$w(t) = F_L[K(t),L(t)] = f(k(t)) - k(t)f'(k(t)). \tag{8.6}$$
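As a quick illustration of (8.5) and (8.6), the sketch below assumes a Cobb-Douglas technology $F(K,L)=K^{\alpha}L^{1-\alpha}$ with $\alpha=0.33$; both the functional form and the parameter value are assumptions made here, not part of the text. It computes the factor prices from the per capita production function and compares the wage with a finite-difference estimate of $F_L$.

```python
ALPHA = 0.33  # assumed capital share in the illustrative Cobb-Douglas technology

def F(K, L):
    """Assumed aggregate production function F(K, L) = K^alpha * L^(1 - alpha)."""
    return K ** ALPHA * L ** (1.0 - ALPHA)

def f(k):
    """Per capita production function f(k) = F(k, 1)."""
    return k ** ALPHA

def f_prime(k):
    """Derivative of f with respect to k."""
    return ALPHA * k ** (ALPHA - 1.0)

def rental_rate(k):
    """R = f'(k), as in (8.5)."""
    return f_prime(k)

def wage(k):
    """w = f(k) - k f'(k), as in (8.6)."""
    return f(k) - k * f_prime(k)

def F_L_numeric(K, L, h=1e-6):
    """Central finite-difference estimate of the marginal product of labor."""
    return (F(K, L + h) - F(K, L - h)) / (2.0 * h)

if __name__ == "__main__":
    K, L = 4.0, 2.0
    k = K / L
    print("R:", rental_rate(k), " w:", wage(k), " numeric F_L:", F_L_numeric(K, L))
    print("R*k + w - f(k):", rental_rate(k) * k + wage(k) - f(k))
```

With constant returns to scale, factor payments exhaust output per capita, so $R(t)k(t)+w(t)=f(k(t))$; the last printed number is the numerical check of that identity.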

The household optimization side is more complicated, since each household will solve a continuous-time optimization problem in deciding how to use its assets and allocate consumption over time. To prepare for this, let us denote the asset holdings of the representative household at time $t$ by $A(t)$. Then we have the following law of motion for the total assets of the household:

$$\dot{A}(t) = r(t)A(t) + w(t)L(t) - c(t)L(t),$$

where $c(t)$ is consumption per capita of the household, $r(t)$ is the risk-free market rate of return on assets, and $w(t)L(t)$ is the total labor income of the household. Note that $r(t)$ is now a flow return, not a gross return as used before. Defining per capita assets as

$$a(t) \equiv \frac{A(t)}{L(t)},$$

we obtain

$$\dot{a}(t) = (r(t)-n)a(t) + w(t) - c(t). \tag{8.7}$$

In practice, household assets can consist of capital stock, $K(t)$, which they rent to firms, and government bonds, $B(t)$. In models with uncertainty, households would have a portfolio choice between the capital stock of the corporate sector and riskless bonds. Government bonds play an important role in models with uncertainty and heterogeneity, allowing households to smooth idiosyncratic shocks. But in representative household models without government, their only use is in pricing assets (for example, riskless bonds versus equity), since they have to be in zero net supply, i.e., the total supply of bonds has to be $B(t)=0$. Consequently, assets per capita are equal to the capital stock per capita (or the capital-labor ratio in the economy), i.e.,

$$a(t) = k(t).$$

Moreover, since there is no uncertainty here and capital depreciates at the rate $\delta$, the market rate of return on assets is given by

$$r(t) = R(t) - \delta. \tag{8.8}$$

Equation (8.7) is only a flow constraint, and it is not sufficient to act as a proper budget constraint on the individual. To see this, consider a finite-horizon economy ending at time $T$. In this case, we could express the entire set of constraints on the household as a single budget constraint of the form

$$\int_0^T c(t)L(t)\exp\left(\int_t^T r(s)\,ds\right)dt + A(T) = \int_0^T w(t)L(t)\exp\left(\int_t^T r(s)\,ds\right)dt + A(0)\exp\left(\int_0^T r(s)\,ds\right), \tag{8.9}$$

which requires the household's budget constraint to hold at time $T$ (hence all income and expenditures are carried forward to date $T$ units). Clearly, differentiating this expression with respect to $T$ and dividing by $L(T)$ gives (8.7). And yet (8.7) by itself does not guarantee that the level of $A(T)$ is such that this lifetime budget constraint holds. Therefore, in the finite-horizon case, we would simply impose this lifetime budget constraint as a boundary condition. In the infinite-horizon case, we need a similar boundary condition. This is generally referred to as the no-Ponzi-game condition, and takes the form

$$\lim_{t\to\infty} a(t)\exp\left(-\int_0^t (r(s)-n)\,ds\right) \ge 0. \tag{8.10}$$

This condition is stated as an inequality, to ensure that the individual does not asymptotically tend to a negative wealth. But we will see from the transversality condition of the individual's problem that the individual would never want to have positive wealth asymptotically, so the no-Ponzi-game condition can alternatively be stated as

$$\lim_{t\to\infty} a(t)\exp\left(-\int_0^t (r(s)-n)\,ds\right) = 0. \tag{8.11}$$

In what follows we will use (8.10), and then derive (8.11) using the transversality condition explicitly.

The name no-Ponzi-game condition comes from chain-letter schemes, sometimes called Ponzi games, in which an individual can continuously borrow from a competitive financial market (or more often, from unsuspecting souls who become part of the chain-letter scheme) and pay his or her previous debts using current borrowings.

To understand where this form of the no-Ponzi-game condition comes from, multiply both sides of (8.9) by $\exp\left(-\int_0^T r(s)\,ds\right)$ to obtain

$$\int_0^T c(t)L(t)\exp\left(-\int_0^t r(s)\,ds\right)dt + \exp\left(-\int_0^T r(s)\,ds\right)A(T) = \int_0^T w(t)L(t)\exp\left(-\int_0^t r(s)\,ds\right)dt + A(0),$$

then divide everything by $L(0)$ and note that $L(t)$ grows at the rate $n$, to obtain

$$\int_0^T c(t)\exp\left(-\int_0^t (r(s)-n)\,ds\right)dt + \exp\left(-\int_0^T (r(s)-n)\,ds\right)a(T) = \int_0^T w(t)\exp\left(-\int_0^t (r(s)-n)\,ds\right)dt + a(0).$$

Now take the limit as $T\to\infty$ and use the no-Ponzi-game condition (8.11) to obtain

$$\int_0^{\infty} c(t)\exp\left(-\int_0^t (r(s)-n)\,ds\right)dt = a(0) + \int_0^{\infty} w(t)\exp\left(-\int_0^t (r(s)-n)\,ds\right)dt,$$

which requires the discounted sum of expenditures to be equal to initial assets plus the discounted sum of labor income. This equation is therefore a direct extension of (8.9) to the infinite horizon. The derivation makes it clear that the no-Ponzi-game condition (8.11) essentially ensures that the individual's lifetime budget constraint holds in the infinite-horizon case.
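The link between the flow constraint (8.7) and the discounted budget identity can be checked numerically. The sketch below assumes constant $r$, $n$, $w$ and $c$ (a simplification made here for illustration), integrates (8.7) with a classical fourth-order Runge-Kutta step, and compares the two sides of the finite-$T$ identity in per capita terms. All function names and parameter values are hypothetical.

```python
import math

def simulate_assets(a0, r, n, w, c, T, steps=20000):
    """Integrate a'(t) = (r - n) a + w - c from 0 to T with RK4 (constant r, n, w, c)."""
    dt = T / steps
    a = a0

    def drift(assets):
        return (r - n) * assets + w - c

    for _ in range(steps):
        k1 = drift(a)
        k2 = drift(a + 0.5 * dt * k1)
        k3 = drift(a + 0.5 * dt * k2)
        k4 = drift(a + dt * k3)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

def discounted_flow(x, r, n, T):
    """Closed form of int_0^T x * exp(-(r - n) t) dt for a constant flow x."""
    rate = r - n
    return x * (1.0 - math.exp(-rate * T)) / rate

def discounted_terminal_assets(a0, r, n, w, c, T):
    """exp(-(r - n) T) * a(T): the term that the no-Ponzi-game condition restricts."""
    return math.exp(-(r - n) * T) * simulate_assets(a0, r, n, w, c, T)

def budget_gap(a0, r, n, w, c, T):
    """Left-hand side minus right-hand side of the per capita budget identity."""
    lhs = discounted_flow(c, r, n, T) + discounted_terminal_assets(a0, r, n, w, c, T)
    rhs = discounted_flow(w, r, n, T) + a0
    return lhs - rhs

if __name__ == "__main__":
    print("budget gap:", budget_gap(1.0, 0.05, 0.01, 1.0, 1.02, 50.0))
    # Consumption above what lifetime resources allow: discounted assets turn negative.
    print("discounted a(T), c = 1.10:", discounted_terminal_assets(1.0, 0.05, 0.01, 1.0, 1.10, 200.0))
```

The identity holds for any consumption level, which is why the flow constraint alone does not discipline the household. In this constant-parameter case the discounted terminal assets converge to $a(0)+(w-c)/(r-n)$, so a consumption level that is too high drives the limit below zero and violates (8.10).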

8.2 Characterization of Equilibrium

8.2.1 Definition of Equilibrium

We are now in a position to define an equilibrium in this dynamic economy. I will provide two definitions, the first somewhat less formal, and the second more useful in characterizing the equilibrium below.

A competitive equilibrium of the Ramsey economy consists of paths of consumption, capital stock, wage rates and rental rates of capital, $[C(t),K(t),w(t),R(t)]_{t=0}^{\infty}$, such that the representative household maximizes its utility given the initial capital stock $K(0)$ and the time path of prices $[w(t),R(t)]_{t=0}^{\infty}$, and the time path of prices $[w(t),R(t)]_{t=0}^{\infty}$ is such that, given the time path of capital stock and labor $[K(t),L(t)]_{t=0}^{\infty}$, all markets clear.

Notice that in equilibrium we need to determine the entire time path of real quantities and the associated prices. This is a very important point. In dynamic models, whenever we talk of equilibrium, this refers to the entire path of quantities and prices. In some models we will focus on the steady-state equilibrium, but equilibrium always refers to the entire path.

Since everything can be equivalently defined in terms of per capita variables, let me state the alternative definition in terms of those:

Definition 8 A competitive equilibrium of the Ramsey economy consists of paths of per capita consumption, capital-labor ratio, wage rates and rental rates of capital, $[c(t),k(t),w(t),R(t)]_{t=0}^{\infty}$, such that the representative household maximizes (8.3) subject to (8.7) and (8.10) given the initial capital-labor ratio $k(0)$ and factor prices $[w(t),R(t)]_{t=0}^{\infty}$, with the rate of return on assets $r(t)$ given by (8.8), and factor prices $[w(t),R(t)]_{t=0}^{\infty}$ are given by (8.5) and (8.6).

8.2.2

Let us start with the problem of the representative consumer. From the definition of equilibrium we know that this is to maximize (8.3) subject to (8.7) and (8.10). Let us ignore the no-Ponzi-game condition at first, and set up the current-value Hamiltonian:

$$\hat{H}(a,c,\mu) = u(c(t)) + \mu(t)\left[w(t) + (r(t)-n)a(t) - c(t)\right],$$

with state variable $a$, control variable $c$ and current-value costate variable $\mu$. From Theorem 34, the following are necessary conditions:

$$\hat{H}_c(a,c,\mu) = 0 = u'(c(t)) - \mu(t),$$

$$\hat{H}_a(a,c,\mu) = -\dot{\mu}(t) + (\rho-n)\mu(t) = \mu(t)(r(t)-n),$$

$$\lim_{t\to\infty}\left[\exp(-(\rho-n)t)\mu(t)a(t)\right] = 0,$$

and the transition equation (8.7). Notice that the transversality condition is written in terms of the current-value costate variable.

Moreover, for any $\mu(t)$, $\hat{H}(a,c,\mu)$ is a concave function of $(a,c)$, and thus, from Theorem 35, these conditions are sufficient for a solution.

Rearranging the second condition, we have

$$\frac{\dot{\mu}(t)}{\mu(t)} = -(r(t)-\rho), \tag{8.12}$$

which states that the multiplier changes depending on whether the rate of return on assets is currently greater than or less than the discount rate of the household. The first condition, on the other hand, implies

$$u'(c(t)) = \mu(t).$$

To make more progress, let us differentiate this with respect to time and divide by $\mu(t)$, which yields

$$\frac{u''(c(t))c(t)}{u'(c(t))}\frac{\dot{c}(t)}{c(t)} = \frac{\dot{\mu}(t)}{\mu(t)}.$$

Substituting this into (8.12), we have

$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}(r(t)-\rho), \tag{8.13}$$

where

$$\varepsilon_u(c(t)) \equiv -\frac{u''(c(t))c(t)}{u'(c(t))} \tag{8.14}$$

is the elasticity of the marginal utility $u'(c(t))$. More importantly, $\varepsilon_u(c(t))$ is also the inverse of the intertemporal elasticity of substitution, which plays a crucial role in most macro models. The intertemporal elasticity of substitution regulates the willingness of individuals to substitute consumption (or labor or any other attribute that yields utility) over time. This elasticity for dates $t$ and $s>t$ is defined as

$$\sigma_u(t,s) = -\frac{d\log\left(c(s)/c(t)\right)}{d\log\left(u'(c(s))/u'(c(t))\right)}.$$

As $s\downarrow t$, we have

$$\sigma_u(t,s)\to\sigma_u(t) = -\frac{u'(c(t))}{u''(c(t))c(t)} = \frac{1}{\varepsilon_u(c(t))}.$$
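To make the elasticity in (8.14) concrete, the sketch below evaluates it by finite differences for a CRRA marginal utility $u'(c)=c^{-\theta}$, which is an assumed functional form chosen for illustration (with $\theta=1$ corresponding to log utility). For this family the elasticity equals $\theta$ at every consumption level, so (8.13) reduces to consumption growth of $(r-\rho)/\theta$. Function names and parameter values are hypothetical.

```python
def crra_marginal_utility(c, theta):
    """u'(c) = c^(-theta) for CRRA utility (assumed functional form)."""
    return c ** (-theta)

def elasticity_of_marginal_utility(u_prime, c, h=1e-5):
    """eps_u(c) = -u''(c) c / u'(c), with u'' approximated by a central difference."""
    u_second = (u_prime(c + h) - u_prime(c - h)) / (2.0 * h)
    return -u_second * c / u_prime(c)

def consumption_growth(r, rho, eps_u):
    """Growth rate of consumption from (8.13): (r - rho) / eps_u."""
    return (r - rho) / eps_u

if __name__ == "__main__":
    theta = 2.0
    eps = elasticity_of_marginal_utility(lambda c: crra_marginal_utility(c, theta), 1.5)
    print("eps_u at c = 1.5:", eps)
    print("intertemporal elasticity of substitution:", 1.0 / eps)
    print("consumption growth at r = 0.05, rho = 0.02:", consumption_growth(0.05, 0.02, eps))
```

A higher $\theta$ means a lower intertemporal elasticity of substitution, so the same gap between $r$ and $\rho$ produces a flatter consumption profile.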

This is not surprising, since the concavity of the utility function $u(\cdot)$, and thus the elasticity of marginal utility, determines how willing individuals are to substitute consumption over time.

Next, note also that integrating (8.12), we have

$$\mu(t) = \mu(0)\exp\left(-\int_0^t (r(s)-\rho)\,ds\right) = u'(c(0))\exp\left(-\int_0^t (r(s)-\rho)\,ds\right),$$

where the second equality uses the first optimality condition of the current-value Hamiltonian at time $t=0$. Now substituting into the transversality condition, we have

$$\lim_{t\to\infty}\left[\exp(-(\rho-n)t)\,a(t)\,u'(c(0))\exp\left(-\int_0^t (r(s)-\rho)\,ds\right)\right] = 0,$$

$$\lim_{t\to\infty}\left[a(t)\exp\left(-\int_0^t (r(s)-n)\,ds\right)\right] = 0,$$

which implies that the strict no-Ponzi-game condition (8.11) has to hold.

We can derive further results on the consumption behavior of households. In particular, notice that the term $\exp\left(-\int_0^t r(s)\,ds\right)$ is a present-value factor that converts a unit of income at time $t$ to a unit of income at time $0$. In the special case where $r(s)=r$, this factor would be exactly equal to $\exp(-rt)$. But more generally, we can define an average interest rate between dates $0$ and $t$ as

$$\bar{r}(t) = \frac{1}{t}\int_0^t r(s)\,ds.$$

In that case, we can express the conversion factor between dates $0$ and $t$ as

$$\exp(-\bar{r}(t)t).$$

Now recalling that the solution to the differential equation

$$\dot{y}(t) = b(t)y(t)$$

is

$$y(t) = y(0)\exp\left(\int_0^t b(s)\,ds\right),$$

we can integrate (8.13) to obtain

$$c(t) = c(0)\exp\left(\int_0^t \frac{r(s)-\rho}{\varepsilon_u(c(s))}\,ds\right)$$

as the consumption function. Thus, once we determine $c(0)$, the initial level of consumption, the path of consumption can be exactly solved out. In the special case where $\varepsilon_u(c(s))$ is constant, for example $\varepsilon_u(c(s))=\theta$, this equation simplifies to

$$c(t) = c(0)\exp\left(\left(\frac{\bar{r}(t)-\rho}{\theta}\right)t\right),$$

and substituting for $c(t)$ into the lifetime budget constraint derived above, in this iso-elastic case we obtain

$$c(0) = \frac{a(0) + \int_0^{\infty} w(t)\exp\left(-(\bar{r}(t)-n)t\right)dt}{\int_0^{\infty}\exp\left(-\left(\frac{(\theta-1)\bar{r}(t)}{\theta} + \frac{\rho}{\theta} - n\right)t\right)dt}.$$
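In the iso-elastic case with a constant interest rate and wage, the initial consumption level has a simple closed form. Substituting $c(t)=c(0)\exp(((r-\rho)/\theta)t)$ into the lifetime budget constraint gives $c(0)=\left[(r-n)-(r-\rho)/\theta\right]\left[a(0)+w/(r-n)\right]$, provided both $r-n$ and the bracketed term are positive. The sketch below implements this special case and checks it by numerically integrating discounted consumption; the constancy of $r$ and $w$, the parameter values and the function names are all assumptions made for illustration.

```python
import math

def initial_consumption(a0, w, r, n, rho, theta):
    """c(0) under constant r and w with iso-elastic utility (illustrative special case)."""
    growth = (r - rho) / theta   # growth rate of consumption from (8.13)
    kappa = (r - n) - growth     # net rate at which consumption is discounted
    if r <= n or kappa <= 0.0:
        raise ValueError("need r > n and (r - n) > (r - rho) / theta for finite values")
    lifetime_wealth = a0 + w / (r - n)
    return kappa * lifetime_wealth

def discounted_consumption(c0, r, n, rho, theta, T=2000.0, steps=200000):
    """Trapezoid approximation of int_0^T c(t) exp(-(r - n) t) dt."""
    growth = (r - rho) / theta
    dt = T / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * c0 * math.exp((growth - (r - n)) * t) * dt
    return total

if __name__ == "__main__":
    a0, w, r, n, rho, theta = 1.0, 1.0, 0.05, 0.01, 0.02, 2.0
    c0 = initial_consumption(a0, w, r, n, rho, theta)
    print("c(0):", c0)
    print("discounted consumption:", discounted_consumption(c0, r, n, rho, theta))
    print("lifetime wealth:", a0 + w / (r - n))
```

The two printed totals agree up to numerical error, confirming that this $c(0)$ exhausts lifetime wealth exactly, which is what the no-Ponzi-game condition (8.11) requires.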

8.2.3 Equilibrium Prices

Equilibrium prices are straightforward and are given by (8.5) and (8.6). This implies that the market rate of return for consumers, $r(t)$, is given by (8.8), i.e.,
$$r(t) = f'(k(t)) - \delta.$$
Substituting this into the consumer's problem, we have
$$\frac{\dot c(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(f'(k(t)) - \delta - \rho\right) \tag{8.16}$$
as the equilibrium version of the consumption growth equation, (8.13). Equation (8.15) in the iso-elastic utility case also similarly generalizes.

8.3 Optimal Growth

Before characterizing the equilibrium further, it is useful to look at the optimal growth problem, defined as the capital and consumption path chosen by a benevolent social planner trying to achieve a Pareto optimal outcome. In particular, suppose that the social planner gives exactly the same weights to people in different generations, so that it solves the problem
$$\max_{[k(t),c(t)]_{t=0}^{\infty}} \int_0^\infty \exp\left(-(\rho-n)t\right) u(c(t))\,dt$$
subject to
$$\dot k(t) = f(k(t)) - (n+\delta)k(t) - c(t)$$
and $k(0) > 0$. To solve this problem, once again set up the current-value Hamiltonian, which in this case takes the form
$$\hat H(k,c,\mu) = u(c(t)) + \mu(t)\left[f(k(t)) - (n+\delta)k(t) - c(t)\right],$$
with state variable $k$, control variable $c$ and current-value costate variable $\mu$. From Theorem 34, the following are necessary conditions:
$$\begin{aligned}
\hat H_c(k,c,\mu) &= 0 = u'(c(t)) - \mu(t), \\
\hat H_k(k,c,\mu) &= -\dot\mu(t) + (\rho-n)\mu(t) = \mu(t)\left(f'(k(t)) - \delta - n\right), \\
\lim_{t\to\infty}&\left[\exp\left(-(\rho-n)t\right)\mu(t)\,k(t)\right] = 0.
\end{aligned}$$

Going exactly through the same steps as before, it is straightforward to see that these optimality conditions imply
$$\frac{\dot c(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(f'(k(t)) - \delta - \rho\right),$$
which is identical to (8.16), and the transversality condition
$$\lim_{t\to\infty}\left[k(t)\exp\left(-\int_0^t \left(f'(k(s)) - \delta - n\right)ds\right)\right] = 0,$$
which is identical to (8.11). This establishes that the competitive equilibrium is a Pareto optimum, and that the natural Pareto allocation can be decentralized as a competitive equilibrium with exactly the initial endowments. This result is stated in the next proposition:

Proposition 17 In the neoclassical growth model described above, with Assumptions 1, 2, 9 and 10, the equilibrium is Pareto optimal and coincides with the optimal growth path maximizing the utility of the representative household.

8.4 Steady-State Equilibrium

Now let us characterize the steady-state equilibrium (or equivalently the steady-state optimal allocation). In steady state, consumption per capita will be constant, thus $\dot c(t) = 0$. From (8.16), this implies that irrespective of the exact utility function, we must have a capital-labor ratio $k^*$ such that
$$f'(k^*) = \rho + \delta, \tag{8.17}$$
which is the equivalent of the steady-state relationship in the discrete-time optimal growth model, and as is the case there, it pins down the steady-state capital-labor ratio only as a function of the production function, the discount rate and the depreciation rate. This also corresponds to the modified golden rule, rather than the golden rule we saw in the Solow model. The capital stock is not chosen to maximize steady-state consumption, because earlier consumption is preferred to later consumption: with discounting, the objective gives a higher weight to earlier consumption. Given $k^*$, the steady-state consumption level is straightforward to determine as
$$c^* = f(k^*) - (n+\delta)k^*, \tag{8.18}$$

which is similar to the consumption level in the basic Solow model, but the steady-state capital-labor ratio is determined differently. Moreover, given Assumption 10, a steady state where the capital-labor ratio and thus output are constant necessarily satisfies the transversality condition. This analysis therefore establishes:

Proposition 18 In the neoclassical growth model described above, with Assumptions 1, 2, 9 and 10, the steady-state equilibrium capital-labor ratio, $k^*$, is uniquely determined by (8.17), and is independent of the utility function. The steady-state consumption per capita, $c^*$, is given by (8.18).
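As a numerical illustration of (8.17) and (8.18), the following sketch assumes a Cobb-Douglas technology $f(k) = k^\alpha$, which the text does not impose at this point, together with purely hypothetical parameter values. It computes the modified-golden-rule steady state and compares it with the golden-rule capital stock, which solves $f'(k_{gold}) = n + \delta$.

```python
# Illustrative sketch: f(k) = k**alpha and all parameter values are assumptions.

def modified_golden_rule(alpha, rho, delta, n):
    """Steady state implied by (8.17)-(8.18): f'(k*) = rho + delta."""
    k_star = (alpha / (rho + delta)) ** (1.0 / (1.0 - alpha))
    c_star = k_star ** alpha - (n + delta) * k_star
    return k_star, c_star


def golden_rule(alpha, delta, n):
    """Capital stock maximizing steady-state consumption: f'(k) = n + delta."""
    k_gold = (alpha / (n + delta)) ** (1.0 / (1.0 - alpha))
    c_gold = k_gold ** alpha - (n + delta) * k_gold
    return k_gold, c_gold


# Hypothetical parameter values with rho > n (Assumption 10).
ALPHA, RHO, DELTA, N = 1.0 / 3.0, 0.05, 0.05, 0.01

k_star, c_star = modified_golden_rule(ALPHA, RHO, DELTA, N)
k_gold, c_gold = golden_rule(ALPHA, DELTA, N)

print(f"modified golden rule: k* = {k_star:.3f}, c* = {c_star:.3f}")
print(f"golden rule:          k  = {k_gold:.3f}, c  = {c_gold:.3f}")
```

Because $\rho > n$, the computed $k^*$ lies below $k_{gold}$ and steady-state consumption is lower than at the golden rule, which is the effect of discounting described above.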

8.5 Transitional Dynamics

Next, we can determine the transitional dynamics of this model. Recall that transitional dynamics in the basic Solow model were given by a single differential equation with an initial condition. This is no longer the case, since the equilibrium is determined by two differential equations, repeated here for convenience:
$$\dot k(t) = f(k(t)) - (n+\delta)k(t) - c(t)$$
and
$$\frac{\dot c(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(f'(k(t)) - \delta - \rho\right).$$
Moreover, we have an initial condition $k(0) > 0$, but also a boundary condition at infinity, of the form
$$\lim_{t\to\infty}\left[k(t)\exp\left(-\int_0^t \left(f'(k(s)) - \delta - n\right)ds\right)\right] = 0.$$

This combination of an initial condition and a transversality condition is quite typical for optimal control problems where we are trying to pin down the behavior of both state and control variables. This means that the notion of stability has to be different from that in Theorems 4, 5 and 6. In particular, the consumption level (or equivalently the costate variable $\mu$) is the control variable, and its initial value $c(0)$ (or equivalently $\mu(0)$) is free. It has to adjust so as to satisfy the transversality condition at infinity. Therefore, rather than requiring all eigenvalues of the linear system or the linearized system to be negative, what we want is saddle-path stability, which requires the number of negative eigenvalues to be the same as the number of state variables. In particular, we have the following straightforward generalizations of Theorems 4 and 5:

Theorem 37 Consider the following linear differential equation system
$$\dot x(t) = A\,x(t) \tag{8.19}$$
with initial value $x(0)$, where $x(t) \in \mathbb{R}^n$ for all $t$ and $A$ is an $n \times n$ matrix. Suppose that $m \le n$ of the eigenvalues of $A$ have negative real parts. Then there exists an $m$-dimensional manifold $M$ of $\mathbb{R}^n$ such that starting from any $x(0) \in M$, the differential equation (8.19) has a unique solution with $x(t) \to x^*$, where $x^*$ is the steady state (zero) of the system given by $A x^* = 0$.

Theorem 38 Consider the following nonlinear differential equation system
$$\dot x(t) = F[x(t)] \tag{8.20}$$
where $F : \mathbb{R}^n \to \mathbb{R}^n$ and suppose that $F$ is continuously differentiable, with initial value $x(0)$. Let $x^*$ be a zero of this system, i.e., $F(x^*) = 0$. Define $A \equiv \nabla F(x^*)$, and suppose that $m \le n$ of the eigenvalues of $A$ have negative real parts and the rest have positive real parts. Then there exists an open neighborhood of $x^*$, $B(x^*) \subset \mathbb{R}^n$, and an $m$-dimensional manifold $M \subset B(x^*)$ such that starting from any $x(0) \in M$, the differential equation (8.20) has a unique solution with $x(t) \to x^*$.

Put differently, these two theorems state that only a lower-dimensional subset of the original space leads to stable solutions. However, in this context this is exactly what we require, since $c(0)$ will adjust in order to place us on exactly such a lower-dimensional subset of the original space. There are two ways of seeing this. The first one is simply by analyzing the above system diagrammatically. This is done in the next picture:

The inverse U-shaped curve is the locus of points where $\dot k = 0$. The vertical line, on the other hand, is the locus of points where $\dot c = 0$. The shape of the first one can be understood by analogy to the diagram where we saw the golden rule. If the capital stock is too low, steady-state consumption is low, and if the capital stock is too high, then steady-state consumption is again low. There exists a unique level, $k_{gold}$, which maximizes the steady-state consumption per capita. The reason why the $\dot c = 0$ locus is just a vertical line simply follows from the fact that only the unique level of $k^*$ given by (8.17) can keep per capita consumption constant. Once these two loci are drawn, the rest of the diagram can be completed by looking at the direction of motion according to the differential equations. Given this direction of movements, it is clear that there exists a unique stable arm, the lower-dimensional manifold tending to the steady state. All points away from this stable arm diverge, and eventually reach zero consumption or zero capital stock as shown in the figure. The next important observation is that the initial consumption level $c(0)$ has to adjust to be on this stable arm. To see this, note that if it were above the stable arm, the capital stock would reach 0 in finite time with positive consumption, violating feasibility. If it were below the stable arm, consumption would tend to zero and capital would accumulate continuously, thus violating the transversality condition. This establishes that the transitional dynamics in the neoclassical growth model will take the following simple form: $c(0)$ will jump to the stable arm, and then $(k, c)$ will monotonically travel along this arm towards the steady state. This establishes:

Proposition 19 In the neoclassical growth model described above, with Assumptions 1, 2, 9 and 10, there exists a unique equilibrium path starting from any $k(0) > 0$ and converging to the unique steady state $(k^*, c^*)$ with $k^*$ given by (8.17). Moreover, if $k(0) < k^*$, then $k(t) \uparrow k^*$ and $c(t) \uparrow c^*$, whereas if $k(0) > k^*$, then $k(t) \downarrow k^*$ and $c(t) \downarrow c^*$.
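The argument that $c(0)$ must jump to the stable arm can be turned into a simple shooting procedure. The sketch below is only illustrative: it assumes CRRA utility with constant elasticity `THETA`, Cobb-Douglas technology $f(k) = k^\alpha$, hypothetical parameter values, a crude Euler discretization, and $k(0) < k^*$. A guess for $c(0)$ is classified as too high if capital starts to fall and as too low if capital overshoots $k^*$; bisection then narrows in on the stable arm.

```python
# Illustrative sketch: CRRA utility, f(k) = k**alpha and all parameter values
# are assumptions made for this example only.

ALPHA, RHO, DELTA, N, THETA = 1.0 / 3.0, 0.05, 0.05, 0.01, 2.0

K_STAR = (ALPHA / (RHO + DELTA)) ** (1.0 / (1.0 - ALPHA))  # from (8.17)
C_STAR = K_STAR ** ALPHA - (N + DELTA) * K_STAR            # from (8.18)


def classify(k0, c0, dt=0.02, t_max=200.0):
    """Euler-integrate the two differential equations from (k0, c0), k0 < k*.

    Returns -1 if capital starts to fall (c0 lies above the stable arm),
            +1 if capital overshoots k* (c0 lies below the stable arm),
             0 if neither happens before t_max.
    """
    k, c = k0, c0
    for _ in range(int(t_max / dt)):
        k_dot = k ** ALPHA - (N + DELTA) * k - c
        if k_dot < 0.0:
            return -1
        c_dot = c * (ALPHA * k ** (ALPHA - 1.0) - DELTA - RHO) / THETA
        k += dt * k_dot
        c += dt * c_dot
        if k >= K_STAR:
            return 1
    return 0


def initial_consumption(k0, iterations=60):
    """Bisect on c(0) between zero and the k-dot = 0 locus."""
    if not 0.0 < k0 < K_STAR:
        raise ValueError("this sketch only handles 0 < k(0) < k*")
    low, high = 0.0, k0 ** ALPHA - (N + DELTA) * k0
    mid = 0.5 * (low + high)
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        verdict = classify(k0, mid)
        if verdict > 0:
            low = mid      # too little consumption: capital overshoots k*
        elif verdict < 0:
            high = mid     # too much consumption: capital starts to fall
        else:
            break          # path stays near the stable arm until t_max
    return mid


k0 = 0.5 * K_STAR
c0 = initial_consumption(k0)
print(f"k(0) = {k0:.3f}, c(0) on the stable arm = {c0:.4f}, c* = {C_STAR:.4f}")
```

Bisection is used here only because it mirrors the verbal argument; a more careful treatment would integrate backwards from the steady state or use a higher-order solver.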

An alternative way of establishing the same result is by linearizing the set of differential equations, and looking at their eigenvalues. Recall the two differential equations determining the equilibrium path:
$$\dot k(t) = f(k(t)) - (n+\delta)k(t) - c(t)$$
and
$$\frac{\dot c(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(f'(k(t)) - \delta - \rho\right).$$
Linearizing these equations around the steady state $(k^*, c^*)$, we have (suppressing time dependence)
$$\begin{aligned}
\dot k &= \text{constant} + \left(f'(k^*) - n - \delta\right)(k - k^*) - c, \\
\dot c &= \text{constant} + \frac{c^* f''(k^*)}{\varepsilon_u(c^*)}(k - k^*).
\end{aligned}$$
Moreover, from (8.17), $f'(k^*) - \delta = \rho$, so the eigenvalues of this two-equation system are given by the values of $\xi$ that solve the following quadratic equation:
$$\det\begin{pmatrix} \rho - n - \xi & -1 \\ \dfrac{c^* f''(k^*)}{\varepsilon_u(c^*)} & 0 - \xi \end{pmatrix} = 0.$$

It is straightforward to verify that, since $c^* f''(k^*)/\varepsilon_u(c^*) < 0$, there are two real eigenvalues, one negative and one positive. This implies that there exists a one-dimensional stable manifold converging to the steady state, exactly as the stable arm in the above figure. Therefore, the local analysis also leads to the same conclusion. However, the local analysis can only establish local stability, whereas the above analysis established global stability.
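The sign pattern of the eigenvalues can be checked numerically. The sketch below assumes CRRA utility, so that $\varepsilon_u(c^*)$ equals a constant `THETA`, and Cobb-Douglas technology $f(k) = k^\alpha$; both, like the parameter values, are assumptions made only for illustration. Expanding the determinant gives $\xi^2 - (\rho - n)\xi + c^* f''(k^*)/\varepsilon_u(c^*) = 0$, which the code solves directly.

```python
import math

# Illustrative sketch: CRRA utility, f(k) = k**alpha and all parameter values
# are assumptions.
ALPHA, RHO, DELTA, N, THETA = 1.0 / 3.0, 0.05, 0.05, 0.01, 2.0

k_star = (ALPHA / (RHO + DELTA)) ** (1.0 / (1.0 - ALPHA))   # (8.17)
c_star = k_star ** ALPHA - (N + DELTA) * k_star             # (8.18)

# Lower-left entry of the matrix: c* f''(k*) / eps_u(c*), negative by concavity.
f_second = ALPHA * (ALPHA - 1.0) * k_star ** (ALPHA - 2.0)
b = c_star * f_second / THETA

# Characteristic equation: xi**2 - (rho - n) * xi + b = 0.
trace = RHO - N
root = math.sqrt(trace ** 2 - 4.0 * b)
xi_neg = 0.5 * (trace - root)
xi_pos = 0.5 * (trace + root)

# Slope of the stable arm at the steady state, from the first linearized
# equation: c - c* = (rho - n - xi_neg) * (k - k*).
stable_arm_slope = trace - xi_neg

print(f"eigenvalues: {xi_neg:.4f} and {xi_pos:.4f}")
print(f"slope of the stable arm at the steady state: {stable_arm_slope:.4f}")
```

Since the product of the two roots equals the negative constant term, one root is negative and one is positive for any parameter values satisfying the assumptions, which is the saddle-path configuration required by the theorems above.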

8.6

The above analysis was for the neoclassical growth model without any technological change. Let us now extend the production function to
$$Y(t) = F[K(t), A(t)L(t)], \tag{8.21}$$
where
$$A(t) = \exp(gt)\,A(0).$$
Notice that the production function (8.21) imposes purely labor-augmenting, or Harrod-neutral, technological change. This is a consequence of Theorem 7 above, which was proved in the context of the constant savings rate model, but equally applies in this context. Only purely labor-augmenting technological change is consistent with balanced growth. We continue to adopt all the other assumptions, in particular Assumptions 1, 2 and 9. Assumption 10 will be strengthened further in order to ensure finite discounted utility in the presence of sustained economic growth. The constant returns to scale feature again enables us to work with normalized variables. Now let us define
$$\tilde y(t) \equiv \frac{Y(t)}{A(t)L(t)} = F\left[\frac{K(t)}{A(t)L(t)}, 1\right] \equiv f(\tilde k(t)),$$
where
$$\tilde k(t) \equiv \frac{K(t)}{A(t)L(t)} \tag{8.22}$$
is the capital to effective labor ratio, taking into account that effective labor is increasing because of labor-augmenting technological change. In addition to the assumption on technology, we also need to impose a further assumption on preferences in order to ensure balanced growth. We define balanced growth as growth consistent with the Kaldor facts of constant capital-output ratio and capital share in national income. These two observations together also imply that the rental rate of return on capital, $R(t)$, has to be constant, which, from (8.8), implies that $r(t)$ has to be constant. In addition, balanced growth requires that consumption and output grow at a constant rate. The Euler equation implies that
$$\frac{\dot c(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(r(t) - \rho\right).$$
If $r(t) \to r^*$ in BGP (balanced growth path), then $\dot c(t)/c(t) \to g_c$ is only possible if $\varepsilon_u(c(t)) \to \varepsilon_u$, i.e., if the elasticity of marginal utility of consumption is asymptotically constant. Therefore, balanced growth is only consistent with utility functions that have asymptotically constant elasticity of marginal utility of consumption. Given this restriction, we might as well start with a utility function that has this feature throughout. As noted above, the unique family of utility functions with this feature is CRRA preferences, given by
$$u(c(t)) = \begin{cases} \dfrac{c(t)^{1-\theta} - 1}{1-\theta} & \text{if } \theta \neq 1 \text{ and } \theta \ge 0, \\[2mm] \ln c(t) & \text{if } \theta = 1, \end{cases}$$

where the elasticity of marginal utility of consumption, $\varepsilon_u$, is given by the constant $\theta$. When $\theta = 0$, these represent linear (risk-neutral) preferences, whereas when $\theta = 1$, we have log preferences. As $\theta \to \infty$, these preferences become infinitely risk-averse, and infinitely unwilling to substitute consumption over time. More specifically, we now assume that the economy admits a representative consumer with CRRA preferences
$$\int_0^\infty \exp\left(-(\rho-n)t\right)\frac{c(t)^{1-\theta} - 1}{1-\theta}\,dt. \tag{8.23}$$

I refer to this model, with labor-augmenting technological change and CRRA preferences as given by (8.23), as the canonical model, since it is the model used in almost all applications with steady growth (unless non-balanced growth is the purpose, as will be discussed in some of the structural change models below). Clearly, the Euler equation in this case takes the simpler form:
$$\frac{\dot c(t)}{c(t)} = \frac{1}{\theta}\left(r(t) - \rho\right). \tag{8.24}$$

Let us first characterize the steady-state equilibrium in this model with technological progress. Since with technological progress there will be growth in per capita income, $c(t)$ will grow. Instead, in analogy with $\tilde y(t)$, let us define
$$\tilde c(t) \equiv \frac{C(t)}{A(t)L(t)} \equiv \frac{c(t)}{A(t)}.$$
We will see that this normalized consumption level will remain constant along the BGP. In particular, we have
$$\frac{\dot{\tilde c}(t)}{\tilde c(t)} \equiv \frac{\dot c(t)}{c(t)} - g = \frac{1}{\theta}\left(r(t) - \rho - \theta g\right).$$
Moreover, for the accumulation of capital stock, we have
$$\dot{\tilde k}(t) = f(\tilde k(t)) - \tilde c(t) - (n+g+\delta)\,\tilde k(t).$$
The transversality condition, in turn, can be expressed as
$$\lim_{t\to\infty}\left\{\tilde k(t)\exp\left(-\int_0^t \left[f'(\tilde k(s)) - g - \delta - n\right]ds\right)\right\} = 0. \tag{8.25}$$

In addition, $r(t)$ is still given by (8.8), so
$$r(t) = f'(\tilde k(t)) - \delta.$$
Since in steady state $\tilde c(t)$ must remain constant, we have $r(t) = \rho + \theta g$, or
$$f'(\tilde k^*) = \rho + \delta + \theta g, \tag{8.26}$$
which pins down the steady-state value of the normalized capital ratio $\tilde k^*$ uniquely, in a way similar to the model without technological progress. The level of normalized consumption is then given by
$$\tilde c^* = f(\tilde k^*) - (n+g+\delta)\,\tilde k^*, \tag{8.27}$$
while per capita consumption grows at the rate $g$. The only additional condition in this case is that because there is growth, we have to make sure that the transversality condition is in fact satisfied. Substituting (8.26) into (8.25), we have
$$\lim_{t\to\infty}\left\{\tilde k(t)\exp\left(-\int_0^t \left[\rho - (1-\theta)g - n\right]ds\right)\right\} = 0,$$
which can only be the case if the exponent tends to minus infinity, i.e., if $\rho - (1-\theta)g - n > 0$, or alternatively if the following assumption is satisfied:

Assumption 11 $\rho - n > (1-\theta)g$.

Note that this assumption strengthens Assumption 10 when $\theta < 1$. Alternatively, recall that in steady state we have $r^* = \rho + \theta g$ and the growth rate of output is $g + n$. Therefore, Assumption 11 is equivalent to requiring that $r^* > g + n$. We will encounter conditions like this throughout, and they will also be related to issues of dynamic efficiency as we will see below. For now, we have the following immediate generalization of Proposition 18:

Proposition 20 Consider the neoclassical growth model with labor-augmenting technological progress at the rate $g$ and preferences given by (8.23). Suppose that Assumptions 1, 2, 9 and 11 hold. Then there exists a unique balanced growth path equilibrium with a normalized capital to effective labor ratio of $\tilde k^*$, given by (8.26), and output per capita and consumption per capita grow at the rate $g$.
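To make Proposition 20 concrete, the following sketch computes the balanced growth path under the additional assumption of Cobb-Douglas technology $f(\tilde k) = \tilde k^\alpha$, with hypothetical parameter values. The function name `balanced_growth_path` is introduced here for illustration only. It first checks Assumption 11 and then applies (8.26) and (8.27).

```python
# Illustrative sketch: f(k) = k**alpha and all parameter values are assumptions.

def balanced_growth_path(alpha, rho, delta, n, g, theta):
    """Return (k_tilde*, c_tilde*, r*) from (8.26) and (8.27)."""
    # Assumption 11: rho - n > (1 - theta) * g, equivalently r* > g + n.
    if not rho - n > (1.0 - theta) * g:
        raise ValueError("Assumption 11 fails: transversality condition violated")
    k_tilde = (alpha / (rho + delta + theta * g)) ** (1.0 / (1.0 - alpha))
    c_tilde = k_tilde ** alpha - (n + g + delta) * k_tilde
    r_star = rho + theta * g
    return k_tilde, c_tilde, r_star


# Hypothetical parameter values.
ALPHA, RHO, DELTA, N, G, THETA = 1.0 / 3.0, 0.05, 0.05, 0.01, 0.02, 2.0

k_tilde, c_tilde, r_star = balanced_growth_path(ALPHA, RHO, DELTA, N, G, THETA)
print(f"k~* = {k_tilde:.3f}, c~* = {c_tilde:.3f}")
print(f"r* = {r_star:.3f} versus g + n = {G + N:.3f}")
```

With a low $\theta$ and a high $g$, for example $\theta = 0.5$ and $g = 0.10$ with the other values unchanged, the check fails and the function raises an error, reflecting that discounted utility would not be finite.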

Interestingly, the result that the steady-state capital-labor ratio was independent of preferences is no longer the case, since now $\tilde k^*$ given by (8.26) depends on the elasticity of marginal utility (or the inverse of the intertemporal elasticity of substitution), $\theta$. The reason for this is that there is now growth, so the willingness of individuals to substitute consumption today for consumption tomorrow determines how much they will accumulate and thus the equilibrium capital to effective labor ratio. A similar analysis to before also leads to an immediate generalization of Proposition 19, which is stated here. The proof is left as a homework exercise, but the next figure already gives a sketch.

Proposition 21 Consider the neoclassical growth model with labor-augmenting technological progress at the rate $g$ and preferences given by (8.23). Suppose that Assumptions 1, 2, 9 and 11 hold. Then there exists a unique equilibrium path of normalized capital and consumption, $(\tilde k(t), \tilde c(t))$, converging to the unique steady state $(\tilde k^*, \tilde c^*)$ with $\tilde k^*$ given by (8.26). Moreover, if $\tilde k(0) < \tilde k^*$, then $\tilde k(t) \uparrow \tilde k^*$ and $\tilde c(t) \uparrow \tilde c^*$, whereas if $\tilde k(0) > \tilde k^*$, then $\tilde k(t) \downarrow \tilde k^*$ and $\tilde c(t) \downarrow \tilde c^*$.

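It is also useful to briefly look at an example with Cobb-Douglas technology.

Example 3 Consider the model with CRRA utility and labor-augmenting technological progress at the rate $g$. Assume that the production function is given by $F(K, AL) = K^\alpha (AL)^{1-\alpha}$, so that
$$f(\tilde k) = \tilde k^\alpha,$$
and thus $r = \alpha \tilde k^{\alpha-1} - \delta$. Suppressing time dependence, the Euler equation and the capital accumulation equation become
$$\frac{\dot{\tilde c}}{\tilde c} = \frac{1}{\theta}\left(\alpha \tilde k^{\alpha-1} - \delta - \rho - \theta g\right),$$
$$\frac{\dot{\tilde k}}{\tilde k} = \tilde k^{\alpha-1} - \delta - g - n - \frac{\tilde c}{\tilde k}.$$
Now define $z \equiv \tilde c/\tilde k$ and $x \equiv \tilde k^{\alpha-1}$, which implies that $\dot x/x = (\alpha-1)\,\dot{\tilde k}/\tilde k$. Therefore, these two equations can be written as
$$\frac{\dot x}{x} = -(1-\alpha)\left(x - \delta - g - n - z\right) \tag{8.28}$$
and
$$\frac{\dot z}{z} = \frac{\dot{\tilde c}}{\tilde c} - \frac{\dot{\tilde k}}{\tilde k},$$
thus
$$\begin{aligned}
\frac{\dot z}{z} &= \frac{1}{\theta}\left(\alpha x - \delta - \rho - \theta g\right) - x + \delta + g + n + z \\
&= \frac{1}{\theta}\left((\alpha-\theta)x - (1-\theta)\delta - \rho + \theta n\right) + z.
\end{aligned} \tag{8.29}$$

The two differential equations (8.28) and (8.29) together with the initial condition $x(0)$ and the transversality condition completely determine the dynamics of the system. In Problem Set 4, you will be asked to complete this example for the special case in which $\theta \to 1$ (i.e., log preferences).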
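The change of variables in Example 3 can be verified numerically. The sketch below uses hypothetical parameter values and checks three things: that the two lines of (8.29) agree, that the growth rates in (8.28) and (8.29) vanish at $x^* = (\rho + \delta + \theta g)/\alpha$ and $z^* = x^* - \delta - g - n$ (obtained by setting both equations to zero), and that mapping back to $(\tilde k, \tilde c)$ recovers (8.26) and (8.27).

```python
# Illustrative sketch: all parameter values are assumptions.
ALPHA, RHO, DELTA, N, G, THETA = 1.0 / 3.0, 0.05, 0.05, 0.01, 0.02, 2.0


def x_growth(x, z):
    """Right-hand side of (8.28): growth rate of x = k_tilde**(alpha - 1)."""
    return -(1.0 - ALPHA) * (x - DELTA - G - N - z)


def z_growth(x, z):
    """Second line of (8.29): growth rate of z = c_tilde / k_tilde."""
    return ((ALPHA - THETA) * x - (1.0 - THETA) * DELTA - RHO + THETA * N) / THETA + z


def z_growth_unsimplified(x, z):
    """First line of (8.29), before collecting terms."""
    return (ALPHA * x - DELTA - RHO - THETA * G) / THETA - x + DELTA + G + N + z


# Steady state of the (x, z) system.
x_star = (RHO + DELTA + THETA * G) / ALPHA
z_star = x_star - DELTA - G - N

# Map back to normalized capital and consumption.
k_tilde = x_star ** (1.0 / (ALPHA - 1.0))
c_tilde = z_star * k_tilde

print(f"x* = {x_star:.4f}, z* = {z_star:.4f}")
print(f"k~* = {k_tilde:.4f}, c~* = {c_tilde:.4f}")
```

Working with $x$ and $z$ is convenient because both equations are linear in these variables apart from the multiplication by $x$ and $z$ themselves.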

8.7

In the above model, the growth rates of per capita consumption and output are determined exogenously, by the growth rate of labor-augmenting technological progress. The level of income, on the other hand, depends on preferences, in particular, on the intertemporal elasticity of substitution, $1/\theta$, the discount rate, $\rho$, the depreciation rate, $\delta$, the population growth rate, $n$, and naturally the form of the production function $f(\cdot)$. If we were to go back to the proximate causes of differences in income per capita or growth across countries, this model would give us a way of understanding those differences only in terms of preference and technology parameters. However, the model can be easily enriched to include policy variables, which, at least according to the institutions view, play an equally important role in accounting for differences in physical (and human) capital and technology across countries. Let us do this in the simplest possible way here and suppose that returns on capital net of depreciation are taxed at the rate $\tau$ and the proceeds of this are redistributed back to the consumers. In that case, the capital accumulation equation, in terms of normalized capital, still remains
$$\dot{\tilde k}(t) = f(\tilde k(t)) - \tilde c(t) - (n+g+\delta)\,\tilde k(t),$$
while the net interest rate faced by households becomes
$$r(t) = (1-\tau)\left(f'(\tilde k(t)) - \delta\right).$$
This implies
$$\begin{aligned}
\frac{\dot{\tilde c}(t)}{\tilde c(t)} \equiv \frac{\dot c(t)}{c(t)} - g &= \frac{1}{\theta}\left(r(t) - \rho - \theta g\right) \\
&= \frac{1}{\theta}\left((1-\tau)\left(f'(\tilde k(t)) - \delta\right) - \rho - \theta g\right),
\end{aligned}$$
so that the steady-state capital to effective labor ratio is given by
$$f'(\tilde k^*) = \delta + \frac{\rho + \theta g}{1-\tau}.$$
A higher tax rate $\tau$ increases the right-hand side, and since from Assumption 1, $f'(\cdot)$ is decreasing, it reduces $\tilde k^*$. Therefore, higher taxes on capital have the effect of depressing capital accumulation. In the next section, we will discuss how large these effects can be and whether they could account for the differences in cross-country incomes.

For now, we can also note that similar results would be obtained if instead of taxes being imposed on returns from capital, they were imposed on the amount of investment. We will see this in the next section.
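The effect of the capital income tax on the steady state can be illustrated with a short computation. The sketch assumes Cobb-Douglas technology $f(\tilde k) = \tilde k^\alpha$ and hypothetical parameter values, and inverts the steady-state condition $f'(\tilde k^*) = \delta + (\rho + \theta g)/(1-\tau)$ for several tax rates.

```python
# Illustrative sketch: f(k) = k**alpha and all parameter values are assumptions.

def steady_state_capital(tau, alpha=1.0 / 3.0, rho=0.05, delta=0.05, g=0.02, theta=2.0):
    """Normalized steady-state capital when net capital returns are taxed at tau."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("tau must lie in [0, 1)")
    required_marginal_product = delta + (rho + theta * g) / (1.0 - tau)
    return (alpha / required_marginal_product) ** (1.0 / (1.0 - alpha))


tax_rates = [0.0, 0.1, 0.3, 0.5]
capital_levels = [steady_state_capital(tau) for tau in tax_rates]

for tau, k_tilde in zip(tax_rates, capital_levels):
    print(f"tau = {tau:.1f}: k~* = {k_tilde:.3f}")
```

At $\tau = 0$ the function reproduces the untaxed steady state from (8.26), and the computed capital stock falls monotonically as $\tau$ rises.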

8.8 Quantitative Evaluations

8.8.1 Policy Differences

For a quantitative evaluation of the effect of policy differences, let us follow Jones (1995) and Chari, Kehoe and McGrattan (1997). Imagine that the main policy difference across countries is in terms of a tax structure that affects the relative price of capital goods. Chad Jones uses data from the Summers-Heston data set on the price of investment goods relative to consumption goods and shows that there are large differences in the relative price of capital goods (compared to consumption goods); he also shows that a high relative price of capital goods is associated with low growth over the postwar period. This has led a number of economists, for example, Chari, Kehoe and McGrattan (1997) or Parente and Prescott (1994), to argue that a major difference across countries is the extent of distortions arising from taxes, corruption or other policy differences, which affect the relative price of capital. Although this is a plausible starting point, once you look at the data, the differences in the relative price come not from the fact that investment goods are much more expensive in some countries, but from the fact that consumption goods are cheaper. We will discuss this below, but for now let us stick with the traditional approach and think of differential policies affecting the relative price of capital. Suppose that all countries admit a representative consumer with identical preferences,
$$\int_0^\infty \exp(-\rho t)\,\frac{C_j^{1-\theta} - 1}{1-\theta}\,dt, \tag{8.30}$$

where $j \in J$ denotes country $j$. There is no population growth, so $C_j$ interchangeably refers to total or per capita consumption. All countries also have access to the same production technology given by the Cobb-Douglas production function
$$Y_j = K_j^{1-\alpha}\left(AH_j\right)^{\alpha}, \tag{8.31}$$
with $H_j$ representing the exogenously given stock of effective labor (human capital). The accumulation equation is
$$\dot K_j = I_j - \delta K_j.$$
The only difference across countries is in the budget constraint for the representative consumer, which takes the form
$$(1+\tau_j)\,I_j + C_j \le Y_j, \tag{8.32}$$
where $\tau_j$ is the tax on investment. This tax varies across countries, for example because of policies or differences in institutions/property rights enforcement. Notice that $1+\tau_j$ is also the relative price of investment goods (relative to consumption goods): one unit of consumption goods can only be transformed into $1/(1+\tau_j)$ units of investment goods. Note that the right-hand side variable of (8.32) is still $Y_j$, which implicitly assumes that $\tau_j I_j$ is wasted, rather than simply redistributed to some other agents in the economy. This is without any major consequence, since, as noted in Theorem 9 above, CRRA preferences as in (8.30) have the nice feature that they can be exactly aggregated across individuals, so we do not have to worry about the distribution of income in the economy.

The competitive equilibrium can be characterized as the solution to the maximization of (8.30) subject to (8.32) and the capital accumulation equation. With the same steps as above, the Euler equation of the representative consumer is
$$\frac{\dot C_j}{C_j} = \frac{1}{\theta}\left(\frac{1-\alpha}{1+\tau_j}\left(\frac{AH_j}{K_j}\right)^{\alpha} - \delta - \rho\right).$$
Consider the steady state. Because $A$ is assumed to be constant, the steady state corresponds to $\dot C_j/C_j = 0$. (Alternatively, we could have $A$ growing at a constant rate and $\dot C_j/C_j$ equal to the growth rate of $A$.) This immediately implies that
$$K_j = \frac{(1-\alpha)^{1/\alpha}\,AH_j}{\left[(1+\tau_j)(\rho+\delta)\right]^{1/\alpha}}.$$
So countries with higher taxes on investment will have a lower capital stock in steady state. Equivalently, they will also have lower capital per worker, or a lower capital-output ratio (using (8.31), the capital-output ratio is simply $K/Y = (K/AH)^{\alpha}$). Now substituting this into (8.31), and comparing two countries with different taxes (but the same human capital), we obtain the relative incomes as
$$\frac{Y(\tau)}{Y(\tau')} = \left(\frac{1+\tau'}{1+\tau}\right)^{\frac{1-\alpha}{\alpha}}. \tag{8.33}$$
So countries that tax investment, either directly or indirectly, at a higher rate will be poorer. The advantage of using the neoclassical growth model for quantitative evaluation relative to the Solow growth model is that the extent to which different types of distortions (here captured by the tax rates on investment) will affect income and capital accumulation is determined endogenously. In contrast, in the Solow growth model, what matters is the savings rate, so we would need other evidence to link taxes or distortions to savings (or to other determinants of income per capita such as technology).

How large is this effect? Put differently, can such policy differences have quantitatively large effects, generating income differences comparable to what we observe in practice? Recall that a plausible value for $\alpha$ is 2/3, since this is the share of labor income in national product which, with the Cobb-Douglas production function, is equal to $\alpha$. The Summers-Heston data suggest that there is a large amount of variation in the relative price of investment goods. For example, countries with the highest relative price of investment goods have almost eight times as high a value as countries with the lowest relative price. Then, using $\alpha = 2/3$, equation (8.33) implies that the income gap between two such countries should be approximately threefold:
$$\frac{Y(\tau)}{Y(\tau')} \approx 8^{1/2} \approx 3.$$
Therefore, differences in capital-output ratios or capital-labor ratios caused by taxes or tax-type distortions, even very large differences in taxes or distortions, are unlikely to account for differences in income per capita anywhere near as large as those we observe in practice. This is not surprising. The discussion above showed that differences in income per capita across countries cannot be accounted for by differences in capital per worker alone. Instead, to explain such large differences in income per capita across countries, we need sizable differences in the efficiency with which these factors are used. Such differences do not feature in this model. Therefore, the simplest model does not provide a good starting point. Nevertheless, many authors have tried to use this model to go further.
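The arithmetic behind this calculation follows directly from (8.33). In the sketch below, `price_ratio` stands for $(1+\tau)/(1+\tau')$ of the high-distortion country relative to the low-distortion one, and `alpha` is the labor share as in (8.31); the function name is introduced here for illustration only. The second call, with a labor share of 1/3, is included only to show how sensitive the implied gap is to the exponent $(1-\alpha)/\alpha$.

```python
# Illustrative sketch of equation (8.33).

def income_gap(price_ratio, alpha):
    """Income of the low-tax country relative to the high-tax country.

    price_ratio: (1 + tau) / (1 + tau') for the high-tax versus low-tax country.
    alpha: share of labor (non-accumulable factors) in output, as in (8.31).
    """
    return price_ratio ** ((1.0 - alpha) / alpha)


gap_baseline = income_gap(8.0, 2.0 / 3.0)       # labor share 2/3
gap_low_labor_share = income_gap(8.0, 1.0 / 3.0)  # labor share 1/3, for comparison

print(f"alpha = 2/3: income gap = {gap_baseline:.2f}")
print(f"alpha = 1/3: income gap = {gap_low_labor_share:.2f}")
```

With $\alpha = 2/3$ the exponent is 1/2, so an eightfold difference in the relative price of investment translates into a gap of $\sqrt{8}$, which the text rounds to three.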

8.8.2 Extensions

Basically, these authors start from a one-sector model, and try to generate large responses to distortions. But there is a constraint in this exercise: the share of capital in GDP is 1/3, and in the simple Cobb-Douglas production function as in (8.31), this also turns out to determine the elasticity of output to distortions (this elasticity is simply $(1-\alpha)/\alpha$ as shown by (8.33)). This is intuitive: if capital only has a small share, in this setup this means that variations in capital will have only a small effect on income, so distortions that affect capital can only have a small effect on income. One line of attack is taken by Chari, Kehoe and McGrattan, who suggest that the correct value for $1-\alpha$, the share of accumulable factors, is 2/3. They think that human capital is not exogenous, but accumulates in exactly the same way as physical capital. In particular, they posit
$$(1+\tau_j)\left(I_j + X_j\right) + C_j \le Y_j$$
and
$$\dot H_j = X_j - \delta H_j,$$
where $X_j$ denotes investment in human capital. With this reasoning, and using the numbers implied by Mankiw, Romer and Weil's regression analysis discussed above, they take $\alpha = 1/3$ (or they take the share of accumulable factors in GDP to be 2/3). In this case, (8.33) implies income differences as large as 64-fold. They therefore conclude that the augmented Solow model is capable of explaining income differences across countries quantitatively based on distortions on investment. However, this conclusion is subject to exactly the same caveats as Mankiw, Romer and Weil's analysis. A share of 2/3 for the accumulable factors in GDP is too high, and implies implausibly large effects of education on income as pointed out above.

8.9

Parente-Prescott argue that the simple neoclassical model is not sufficient to account for the large differences in income per capita across countries. Consistent with the evidence presented above, they suggest that we have to take differences in technology into account. Their approach for technology differences, however, is very similar to the tax-type distortions affecting physical capital (and human capital) decisions in the neoclassical model. In particular, they argue that technology differences arise because there are barriers to technology adoption, inducing economies with worse distortions not to adopt superior technologies. Essentially, this explanation turns technology into an accumulable factor, with a neoclassical production function. Consequently, even though Parente-Prescott argue that their model is different from the neoclassical model, it is really a variant of it, and I will treat it that way here. Therefore, this explanation circumvents the problems of the neoclassical models without being forced to increase the share of capital in national income. Moreover, the Parente-Prescott formulation does this while keeping exogenous growth. However, there is a sense in which what is being done here is to add a degree of freedom, and to interpret technology as part of a broader concept of capital. Here is a very simple version of their model. Suppose that output is given by
$$Y_t = A_t N_t,$$
where $A_t$ is technology/knowledge which will be accumulated endogenously. Each firm can at most employ $\bar N$ workers (so there will be $N_t/\bar N$ firms in this economy). This limit on firm-level employment is imposed because the production technology exhibits increasing returns to scale, and otherwise all workers would be employed in one firm.

14.451: Introduction to Economic Growth In acquiring their technology, countries/rms benet from world knowledge, which progresses exogenously as follows Tt = T0 (1 + g )t In order to benet from world technology, countries/rms need to undertake some investments. In particular, the investment required to improve technology from At to At+1 is Xt =

1

At+1

At

S Tt

dS

Intuitively, each incremental improvement between At and At+1 costs an amount that depends on the distance of this improvement to the frontier technology, and also a shift parameter . As before, a high level of corresponds to better technology for absorbing world knowledge, and a low level of corresponds to signicant distortions in the process of technology adoption. Solving this integral, we obtain Xt =

1

(1/) Tt

At+1 At

1/

1/

(1)/

1/ (1)/

At Tt

1/

(1)/

as the eective knowledge stock. Then, we have the law of motion of this knowledge stock as Zt+1 At+1 / (1 + g) Tt . So Zt+1 = 0 Zt + Xt as the law of motion, where 0 is a constant. This equation makes it clear that the modeling 200

14.451: Introduction to Economic Growth question is very similar to a neoclassical model, except that Xt has replaced investments and Zt has replaced the capital stock. Then, output per capita dierences in two economies can be expressed as functions of their knowledge stocks. Denoting output per capita in country j by y j , we have ! j yt Ztj = . 0 j0 yt Ztj Will dierences in now lead to large output dierences? The answer depends on . If is small, as in the neoclassical model with a small share of capital in national product, there will only be quantitatively small dierences in output per capita across countries. However, if is large, for example = 0.7, this economy will behave similar to the neoclassical model with a capital share equal to 2/3. So therefore the basic dierence between this model and the standard neoclassical model is that here instead of capital, we have knowledge being accumulated, and is assumed to be large so knowledge is taken to be very important in production (e.g., corresponding to a large share of payments to technology in GDP, if everything was priced according to marginal productbut because of the increasing returns to scale this is not the case). In other words, by introducing the knowledge stock and increasing returns to scale, Parente and Prescott take us back to a production function of the form Y = K 0.7 L0.3 but with K replaced by Z . As a result, they obtain signicantly larger eects of distortions on income than implied by the neoclassical production function with Y = K 0.3 L0.7 . Although the explanation is plausible, the model does not generate further insights than the statement that distortions lead to lower input use, and knowledge is just another input. As we will see below, many models of endogenous technological progress will have a similar avor of technology being accumulated by purposeful investments, but they will allow for 201


The models analyzed so far were assumed to admit a representative household or consumer. These models are useful in providing us with a tractable framework for the analysis of capital accumulation and neoclassical growth. Moreover, they had the nice feature that the competitive equilibrium coincided with the natural Pareto optimal allocation. In many situations, however, the assumption of a representative household is not appropriate. This was already discussed above. But one specific set of circumstances where we have to depart from this assumption is the one where we look at an economy in which new households (individuals) arrive over time. These models, first analyzed by Paul Samuelson and then later Peter Diamond, are referred to as overlapping generations models, since different generations are born at different points in time. For economic growth, these models are useful, first, because they provide a tractable alternative to the infinite-horizon representative agent models; second, because they have some very different implications; and third, because they allow a discussion of national debt and Social Security type issues.

9.1 Problems of Infinity

Let us start by considering the following economy, analyzed by Shell (1971). It is a static economy with a countably infinite number of agents, $i \in \mathbb{N}$, and a countably infinite number of commodities, $j \in \mathbb{N}$. Assume all agents behave competitively (alternatively, we can have more than one agent of each type in order to ensure this). Agent $i$ has preferences given by

$$u_i = c_i + c_{i+1},$$

that is, he enjoys the consumption of the commodity with the same index as his own and of the next indexed commodity. Moreover, the endowment vector of the economy is as follows: each agent has one unit endowment of the commodity with the same index as his. Let us choose the price of the first commodity as the numeraire, i.e., $p_0 = 1$. The following is straightforward to see:

Proposition 22 In the above-described economy, the price vector $\bar{p}$ such that $\bar{p}_j = 1$ for all $j \in \mathbb{N}$ is a competitive equilibrium price vector and induces an equilibrium with no trade, denoted by $\bar{x}$.

Proof. At $\bar{p}$, each individual has income equal to 1, and thus a budget constraint of the form $c_i + c_{i+1} \le 1$. This implies that consuming his endowment is optimal for each individual, establishing that $\bar{p}$ and no trade, $\bar{x}$, constitute a competitive equilibrium.

However, the competitive equilibrium in Proposition 22 is not Pareto optimal. To see this, consider the following alternative allocation, referred to as $\tilde{x}$. Consumer $i = 0$ consumes one unit of good $j = 0$ and one unit of good $j = 1$, and consumer $i > 0$ consumes one unit of good $i + 1$. This is a feasible allocation. All consumers with $i > 0$ are as well off in this allocation as in the competitive equilibrium at $\bar{p}$, and individual $i = 0$ is strictly better off. This establishes

Proposition 23 In the above-described economy, the competitive equilibrium at $(\bar{p}, \bar{x})$ is not Pareto optimal.

Therefore, in this economy the First Welfare Theorem, Theorem 10, does not hold. This is the reason why Theorem 10 was stated under the assumption of a finite number of commodities. This finiteness assumption was used in the proof in making sure that summations existed and were finite. When there are an infinite number of commodities and an infinite number of individuals, summing over the value of the consumption basket of all individuals may (and often will) lead to infinite sums, and the proof of Theorem 10 does not work. The reason why this admittedly artificial economy is relevant is that it is isomorphic to the overlapping generations economy which we will analyze next. The same issue of Pareto suboptimality will arise in this overlapping generations economy. However, recall also that Theorem 11 did not make use of summations in the same way as Theorem 10; instead, it made use of convexity. So one might conjecture that in this model, which is convex, while the competitive equilibrium may be suboptimal, Pareto optima must be decentralizable. This is in fact true, and the following proposition shows how the allocation $\tilde{x}$ can be decentralized as a competitive equilibrium through a suitable reallocation of endowments:

Proposition 24 In the above-described economy, there exists a reallocation of the endowment vector $\omega$ to $\tilde{\omega}$, and an associated competitive equilibrium $(\bar{p}, \tilde{x})$ that is Pareto optimal, where $\tilde{x}$ is as described above and $\bar{p}$ is such that $\bar{p}_j = 1$ for all $j \in \mathbb{N}$.

Proof. Consider the following reallocation of the endowment vector $\omega$: the endowment of agent $i \ge 1$ is given to agent $i - 1$. Consequently, at the new endowment vector $\tilde{\omega}$, individual $i = 0$ has one unit of good $j = 0$ and one unit of good $j = 1$, while all other agents $i$ have one unit of good $i + 1$. At the price vector $\bar{p}$, individual 0 has a budget set $c_0 + c_1 \le 2$, and thus chooses $c_0 = c_1 = 1$. All other agents have budget sets given by $c_i + c_{i+1} \le 1$, thus each $i > 0$ is happy to consume one unit of good $i + 1$, which is within his budget set and gives as high utility as any other allocation within his budget set, establishing that $\tilde{x}$ is a competitive equilibrium.
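The comparison between the no-trade allocation and the alternative allocation can be checked on a truncation of the economy. The sketch below is only an illustration: the truncation to the first `N` agents and the helper names (`utility`, `no_trade`, `shifted`, `total_demand`) are assumptions introduced here, not part of Shell's construction.

```python
# Truncated illustration of the Shell economy: agent i values goods i and i+1.
# N and all function names are hypothetical and used only for this sketch.

N = 10  # number of agents inspected (the economy itself is infinite)


def utility(i, bundle):
    """u_i = c_i + c_{i+1}; bundle maps good index -> quantity consumed."""
    return bundle.get(i, 0) + bundle.get(i + 1, 0)


def no_trade(i):
    """Competitive equilibrium allocation: each agent consumes his endowment."""
    return {i: 1}


def shifted(i):
    """Alternative allocation: agent 0 gets goods 0 and 1, agent i > 0 gets good i+1."""
    return {0: 1, 1: 1} if i == 0 else {i + 1: 1}


def total_demand(allocation, n_agents):
    """Total consumption of each good by the first n_agents agents."""
    totals = {}
    for i in range(n_agents):
        for good, qty in allocation(i).items():
            totals[good] = totals.get(good, 0) + qty
    return totals


u_eq = [utility(i, no_trade(i)) for i in range(N)]
u_alt = [utility(i, shifted(i)) for i in range(N)]
```

In this truncation every good is consumed at most once, so the alternative allocation does not use more than the unit endowment of any good, and only agent 0's utility changes. The truncation hides the feature that makes this possible in the infinite economy: there is no last agent who is left without a good.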

9.2

9.2.1

In this economy, time is discrete and runs to infinity. Each individual lives for two periods. For example, all individuals born at time $t$ live for dates $t$ and $t + 1$. For now let us assume a general utility function for individuals born at date $t$, of the form

$$U(t) = u(c_1(t)) + \beta u(c_2(t+1)), \tag{9.1}$$

where $u(\cdot)$ satisfies the conditions in Assumption 9, $c_1(t)$ denotes the consumption of the individual born at time $t$ during the first period of his life (which is at date $t$), and $c_2(t+1)$ is the consumption during the second period of his life (at date $t + 1$). Also, $\beta \in (0,1)$ is the discount factor. Individuals can only work in the first period of their lives, and supply one unit of labor inelastically, earning the equilibrium wage rate $w(t)$. Let us also assume that there is population growth, so that total population is $L(t) = (1+n)^t L(0)$.

The production side of the economy is the same as before, characterized by a set of competitive firms, and represented by the standard constant returns to scale aggregate production function, satisfying Assumptions 1 and 2. Also assume that capital fully depreciates after being used. As a result, the gross rate of return to saving equals the rental rate of capital, i.e.,

$$1 + r(t) = R(t) = f'(k(t)), \tag{9.2}$$

where $f(k)$ is the standard per capita production function described above, and the wage rate is

$$w(t) = f(k(t)) - k(t) f'(k(t)). \tag{9.3}$$

9.2.2 Consumption Decisions

Let us start with the individual consumption decisions. Denoting savings by $s(t)$, this is a solution to the following maximization problem

$$\max_{c_1(t), c_2(t+1), s(t)} u(c_1(t)) + \beta u(c_2(t+1))$$

subject to

$$c_1(t) + s(t) \le w(t)$$

and

$$c_2(t+1) \le R(t+1) s(t),$$

where we are using the convention that old individuals rent their savings of time $t$ as capital to firms at time $t + 1$, and so receive the gross rate of return $R(t+1)$. The second constraint incorporates the notion that individuals will only spend money on their own end-of-life consumption (there is no consumption term for descendants etc.). I have not imposed the constraint that $s(t) \ge 0$, since with negative savings, individuals would violate their second-period budget constraint (given non-negativity of consumption). It is clear that both constraints will hold as equalities given that $u(\cdot)$ is strictly increasing. Then the first-order condition for a maximum implies:

$$u'(c_1(t)) = \beta R(t+1) u'(c_2(t+1)). \tag{9.4}$$

Solving these equations for consumption and thus for savings, we also obtain the following implicit solution for savings

$$s(t) = s(w(t), R(t+1)). \tag{9.5}$$

The function $s(\cdot, \cdot)$ is increasing in its first argument, and may be increasing or decreasing in its second argument.

9.2.3 Equilibrium

An equilibrium in this economy is an allocation in which firms maximize and consumers optimize. Therefore, the factor price sequence $\{R(t), w(t)\}_{t=0}^{\infty}$ is given by (9.2) and (9.3), while individual consumption and saving decisions are given by (9.4) and (9.5). Consequently, given the full depreciation assumption, the law of motion of the capital stock is given by

$$K(t+1) = N(t)\, s(w(t), R(t+1)),$$

or in words, simply by the savings of the newly born at time $t$, $N(t)$. Writing everything in terms of per worker units, this implies

$$k(t+1) = \frac{s(w(t), R(t+1))}{1+n},$$

or substituting for $R(t+1)$ and $w(t)$ from (9.2) and (9.3), we obtain

$$k(t+1) = \frac{s\big(f(k(t)) - k(t) f'(k(t)),\; f'(k(t+1))\big)}{1+n} \tag{9.6}$$

as the fundamental law of motion of the overlapping generations economy. A steady state is given by a solution to this equation such that $k(t+1) = k(t) = k^*$, i.e.,

$$k^* = \frac{s\big(f(k^*) - k^* f'(k^*),\; f'(k^*)\big)}{1+n}. \tag{9.7}$$

Since the savings function $s(\cdot, \cdot)$ can take essentially any form, the difference equation (9.6) can lead to quite complicated dynamics, and multiple steady states are possible. The next figure shows some potential plots of equation (9.6), which can lead to a unique stable equilibrium, to multiple equilibria, or to an equilibrium with zero capital stock.


9.2.4

To get more insights, let us now specialize the above setup by assuming CRRA utility functions, in particular,

$$U(t) = \frac{c_1(t)^{1-\theta} - 1}{1-\theta} + \beta\left(\frac{c_2(t+1)^{1-\theta} - 1}{1-\theta}\right), \tag{9.8}$$

where $\theta > 0$ and $\beta \in (0,1)$. Furthermore, assume that technology is Cobb-Douglas, so that $f(k) = k^{\alpha}$. Everything else is the same as above. This simplifies the first-order condition for consumer optimization and implies

$$\frac{c_2(t+1)}{c_1(t)} = \left(\beta R(t+1)\right)^{1/\theta}.$$

You can recognize this expression as the discrete-time counterpart of the Euler equation from the Ramsey model, which was $\dot{c}/c = (r - \rho)/\theta$. Alternatively, the first-order condition can be written as

$$s(t)^{-\theta} \beta R(t+1)^{1-\theta} = (w(t) - s(t))^{-\theta}. \tag{9.9}$$

Equation (9.9) implies that savings can be written as

$$s(t) = \frac{w(t)}{\psi(t+1)}, \tag{9.10}$$

where

$$\psi(t+1) \equiv \left[1 + \beta^{-1/\theta} R(t+1)^{-(1-\theta)/\theta}\right] > 1,$$

ensuring that savings are always less than earnings. The relationship between savings and factor prices is given by

$$s_w \equiv \frac{\partial s(t)}{\partial w(t)} = \frac{1}{\psi(t+1)}, \qquad s_r \equiv \frac{\partial s(t)}{\partial R(t+1)} = \left(\frac{1-\theta}{\theta}\right)\left(\beta R(t+1)\right)^{-1/\theta} \frac{s(t)}{\psi(t+1)}.$$

Note that $0 < s_w < 1$. Moreover, $s_r > 0$ if $\theta < 1$, but $s_r < 0$ if $\theta > 1$, and $s_r = 0$ if $\theta = 1$. The relationship between the rate of return on savings and the level of savings reflects the counteracting influences of income and substitution effects you are familiar with from basic micro. The case of $\theta = 1$, i.e., log preferences, is of special importance and is often used in applied models. With log preferences, income and substitution effects exactly cancel each other, and thus changes in the interest rate (and therefore changes in the capital-labor ratio of the economy) have no effect on the savings rate. Now equation (9.6) implies

$$k(t+1) = \frac{s(t)}{1+n} = \frac{w(t)}{(1+n)\psi(t+1)}, \tag{9.11}$$

or more explicitly,

$$k(t+1) = \frac{f(k(t)) - k(t) f'(k(t))}{(1+n)\left[1 + \beta^{-1/\theta} f'(k(t+1))^{-(1-\theta)/\theta}\right]}. \tag{9.12}$$

The steady state then involves a solution to the following implicit equation:

$$k^* = \frac{f(k^*) - k^* f'(k^*)}{(1+n)\left[1 + \beta^{-1/\theta} f'(k^*)^{-(1-\theta)/\theta}\right]}.$$

Now using the Cobb-Douglas formula, we have that the steady state is the solution to the equation

$$(1+n)\left[1 + \beta^{-1/\theta}\left(\alpha (k^*)^{\alpha-1}\right)^{(\theta-1)/\theta}\right] = (1-\alpha)(k^*)^{\alpha-1}. \tag{9.13}$$

For simplicity, define $R^* \equiv \alpha (k^*)^{\alpha-1}$ as the marginal product of capital in steady state, in which case equation (9.13) can be rewritten as

$$(1+n)\left[1 + \beta^{-1/\theta}(R^*)^{(\theta-1)/\theta}\right] = \frac{1-\alpha}{\alpha} R^*. \tag{9.14}$$

The steady-state value of $R^*$ and thus $k^*$ can now be determined from equation (9.14), which always has a unique solution. We can next investigate the stability of this steady state. To do this, substitute for the Cobb-Douglas production function in (9.12):

$$k(t+1) = \frac{(1-\alpha) k(t)^{\alpha}}{(1+n)\left[1 + \beta^{-1/\theta}\left(\alpha k(t+1)^{\alpha-1}\right)^{(\theta-1)/\theta}\right]}. \tag{9.15}$$

Now using (9.15), the following proposition can be proved (proof left for Problem Set 5):

Proposition 25 In the overlapping-generations model with two-period lived agents, Cobb-Douglas technology and CRRA preferences, there exists a unique steady-state equilibrium with the capital-labor ratio $k^*$ given by (9.13), and as long as $\theta \ge 1$, this steady-state equilibrium is globally stable for all $k(0) > 0$.

The next figure shows the dynamics diagrammatically in this particular (well-behaved) case, which look very similar to the dynamics of the basic Solow model:

Figure 9.1:
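The savings rule and the implicit law of motion for the CRRA and Cobb-Douglas case can be explored numerically. The following is a minimal sketch, assuming utility $U(t) = u(c_1(t)) + \beta u(c_2(t+1))$ with CRRA coefficient $\theta$, $f(k) = k^{\alpha}$ and full depreciation; the parameter values, the bisection helper and all function names are illustrative assumptions rather than part of the notes.

```python
# Sketch of the CRRA / Cobb-Douglas overlapping generations dynamics.
# Parameter values and function names are illustrative assumptions.

ALPHA, BETA, N_GROWTH = 0.3, 0.9, 0.2


def bisect(fn, lo, hi, tol=1e-13, max_iter=400):
    """Root of fn on [lo, hi], assuming fn(lo) and fn(hi) have opposite signs."""
    sign_lo = fn(lo) > 0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if (fn(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def psi(R, theta):
    """psi = 1 + beta^(-1/theta) * R^(-(1 - theta)/theta)."""
    return 1.0 + BETA ** (-1.0 / theta) * R ** (-(1.0 - theta) / theta)


def savings(w, R, theta):
    """Savings of the young: s = w / psi."""
    return w / psi(R, theta)


def next_k(k, theta):
    """Solve the implicit law of motion for k(t+1) given k(t)."""
    wage = (1.0 - ALPHA) * k ** ALPHA

    def gap(x):
        R_next = ALPHA * x ** (ALPHA - 1.0)
        return (1.0 + N_GROWTH) * x - savings(wage, R_next, theta)

    # Since psi > 1, k(t+1) cannot exceed wage / (1 + n).
    return bisect(gap, 1e-12, wage / (1.0 + N_GROWTH))


def steady_state_k(theta):
    """Solve (1 + n) psi(R*) = ((1 - alpha) / alpha) R*, then invert R* = alpha k^(alpha - 1)."""
    def gap(R):
        return (1.0 + N_GROWTH) * psi(R, theta) - (1.0 - ALPHA) / ALPHA * R

    R_star = bisect(gap, 1e-9, 1e9)
    return (R_star / ALPHA) ** (1.0 / (ALPHA - 1.0))


def simulate(k0, theta, periods):
    """Iterate the law of motion starting from k0."""
    path = [k0]
    for _ in range(periods):
        path.append(next_k(path[-1], theta))
    return path
```

For example, `simulate(0.01, 2.0, 100)` traces a path that can be compared with `steady_state_k(2.0)`. With `theta = 1.0` the function `savings` does not depend on `R`, which is the log-preferences case discussed above.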


9.2.5 Pareto Optimality

Let us now return to the general problem, and compare the overlapping-generations equilibrium to the choice of a social planner wishing to maximize a weighted average of all generations' utilities. In particular, the social planner in question maximizes

$$\sum_{t=0}^{\infty} \beta_S^t U(t),$$

where $\beta_S$ is the discount factor of the social planner. Substituting from (9.1), this implies:

$$\sum_{t=0}^{\infty} \beta_S^t \left(u(c_1(t)) + \beta u(c_2(t+1))\right)$$

subject to the resource constraint

$$F(K(t), N(t)) = K(t+1) + N(t) c_1(t) + N(t-1) c_2(t),$$

which can alternatively be divided by $N(t)$ and written in per capita terms as

$$f(k(t)) = (1+n) k(t+1) + c_1(t) + \frac{c_2(t)}{1+n}.$$

This maximization problem immediately implies

$$u'(c_1(t)) = \beta f'(k(t+1)) u'(c_2(t+1)),$$

which is identical to (9.4), noting that $R(t+1) = f'(k(t+1))$. This is not surprising, since the social planner would allocate the consumption of a given individual in exactly the same way as the individual himself would do. However, the social planner's and the competitive economy's allocations across individuals will differ, since the social planner is giving different weights to different generations, as captured by the parameter $\beta_S$. In particular, it can be shown that the socially planned economy will converge to a steady state with capital-labor ratio $k^S$ such that

$$\beta_S f'(k^S) = 1 + n,$$

which is similar to the modified golden rule we saw in the context of the Ramsey growth model. In particular, it does not depend on preferences (the utility function $u(\cdot)$) and does not even depend on the individual rate of time preference, $\beta$. Clearly, $k^S$ is typically different from the steady-state value of the competitive economy, $k^*$, given by (9.7), which is not surprising given the different preferences that are being maximized. More interesting is the question of whether the competitive equilibrium is Pareto optimal. The example from Shell in the previous section suggests that it may not be. In particular, exactly as in Shell's example, we cannot use the First Welfare Theorem (Theorem 10) because of the infinite number of commodities. In fact, the competitive equilibrium is not in general Pareto optimal. The simplest way of seeing this is that the steady-state level of the capital stock, $k^*$, given by (9.7), can be so high that it is in fact greater than $k_{gold}$, that is, the economy is to the right of the golden rule, so that by reducing savings, consumption can increase for every generation. More specifically, note that in steady state we have

$$f(k^*) - (1+n)k^* = c_1^* + (1+n)^{-1} c_2^* \equiv c^*,$$

where the equality follows from the accounting identity, and the definition introduces $c^*$ as total steady-state consumption. Therefore

$$\frac{\partial c^*}{\partial k^*} = f'(k^*) - (1+n),$$

and $k_{gold}$ is defined as $f'(k_{gold}) = 1 + n$. Now if $k^* > k_{gold}$, then $\partial c^*/\partial k^* < 0$, so reducing savings can increase (total) consumption for everybody. If this is the case, the economy is referred to as dynamically inefficient. Another way of expressing dynamic inefficiency is that $r^* < n$, that is, the steady-state interest rate $r^* = R^* - 1$ is less than the rate of population growth. Recall that in the infinite-horizon Ramsey economy, the transversality condition (which follows from individual optimization) required that $r > g + n$; therefore, dynamic inefficiency could never arise in the Ramsey economy. Dynamic inefficiency arises because of the heterogeneity inherent in the overlapping generations model, which removes the transversality condition. In particular, suppose we start from steady state at time $T$ with $k^* > k_{gold}$. Consider the following variation where the capital stock for next period is reduced by a small amount, i.e. changed by $-\Delta k$, where $\Delta k > 0$, and we move immediately to a new steady state. Then we have

$$\Delta c(T) = (1+n)\Delta k > 0,$$
$$\Delta c(t) = -\left(f'(k^* - \Delta k) - (1+n)\right)\Delta k \quad \text{for all } t > T.$$

Since $k^* > k_{gold}$, for small enough $\Delta k$, $f'(k^* - \Delta k) - (1+n) < 0$, thus $\Delta c(t) > 0$ for all $t \ge T$. The increase in consumption for each generation can be allocated equally during the two periods of their lives, thus necessarily increasing their utility (by the assumption that $u(\cdot)$ is strictly increasing from Assumption 1). This variation clearly creates a Pareto improvement in which all generations are better off. This establishes:

Theorem 39 In the overlapping-generations economy the competitive equilibrium is not necessarily Pareto optimal. More specifically, whenever $r^* < n$ and the economy is dynamically inefficient, it is possible to reduce the capital stock starting from the competitive steady state and increase the consumption level of all generations.

As the above derivation makes clear, the possible lack of Pareto efficiency in the competitive equilibrium is intimately linked with dynamic inefficiency. Dynamic inefficiency, that is, a rate of interest below the rate of population growth, is not a theoretical curiosum. In Problem Set 5, you will be working through a numerical example that will illustrate under what conditions dynamic inefficiency can happen.
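As an illustrative special case (a sketch assuming the CRRA and Cobb-Douglas specification of the previous subsection with log preferences, $\theta = 1$, and full depreciation), the steady-state condition (9.14) can be solved in closed form, which gives a simple parametric condition for dynamic inefficiency:

```latex
\begin{align*}
(1+n)\left(1 + \beta^{-1}\right) = \frac{1-\alpha}{\alpha}\, R^*
\quad &\Longrightarrow \quad
R^* = \frac{\alpha}{1-\alpha}\,\frac{(1+\beta)(1+n)}{\beta}, \\
r^* < n \iff R^* < 1+n \quad &\iff \quad \frac{\alpha}{1-\alpha} < \frac{\beta}{1+\beta}.
\end{align*}
```

Under these assumptions the condition does not depend on $n$. With $\alpha = 0.2$ and $\beta = 0.9$, for instance, the left-hand side is $0.25$ and the right-hand side is about $0.47$, so the steady state is dynamically inefficient; with $\alpha = 1/3$ the left-hand side is $0.5$ and it is not.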

9.3

We now briefly discuss how Social Security can be introduced as a way of dealing with overaccumulation in the overlapping-generations model. We first consider a fully funded system, in which the young make contributions to the Social Security system and their contributions are paid back to them in their old age. The alternative is an unfunded or pay-as-you-go Social Security system, where transfers from the young go directly to the current old.

9.3.1

In a fully funded system, the government at date $t$ raises some amount $d(t)$ from the young (by compulsory contributions to their Social Security accounts etc.), invests it in the only productive asset of the economy, the capital stock, and pays the workers an amount $R(t+1) d(t)$ when they are old. More specifically, the individual maximization problem is now

$$\max_{c_1(t), c_2(t+1), s(t)} u(c_1(t)) + \beta u(c_2(t+1))$$

subject to

$$c_1(t) + s(t) + d(t) \le w(t)$$

and

$$c_2(t+1) \le R(t+1)\left(s(t) + d(t)\right),$$

for a given choice of $d(t)$ by the government. Notice that now the total amount invested in capital accumulation is $s(t) + d(t) = (1+n) k(t+1)$. It is also no longer the case that individuals will always choose $s(t) > 0$, since they have the income from Social Security. Therefore this economy can be analyzed under two alternative assumptions, with the constraint that $s(t) \ge 0$ and without. It is clear that as long as $s(t)$ is free, whatever the sequence of Social Security payments $\{d(t)\}_{t=0}^{\infty}$ (as long as it is feasible), the competitive equilibrium without Social Security still applies. When $s(t) \ge 0$ is imposed as a constraint, the competitive equilibrium applies if, given the sequence $\{d(t)\}_{t=0}^{\infty}$, the privately optimal saving sequence $\{s(t)\}_{t=0}^{\infty}$ is such that $s(t) > 0$ for all $t$.

This discussion immediately establishes:

Proposition 26 Consider a fully funded Social Security system in the above-described environment whereby the government collects $d(t)$ from young individuals at date $t$.

1. Suppose that the constraint $s(t) \ge 0$ is imposed for all $t$. If, given the feasible sequence $\{d(t)\}_{t=0}^{\infty}$ of Social Security payments, the utility-maximizing sequence of savings $\{s(t)\}_{t=0}^{\infty}$ is such that $s(t) > 0$ for all $t$, then the set of competitive equilibria without Social Security is the same as the set of competitive equilibria with Social Security.

2. Without the constraint $s(t) \ge 0$, given any feasible sequence $\{d(t)\}_{t=0}^{\infty}$ of Social Security payments, the set of competitive equilibria without Social Security is the same as the set of competitive equilibria with Social Security.

This is very intuitive: the $d(t)$ taken out by the government is fully offset by a decrease in $s(t)$ as long as individuals were saving enough (or always, when there are no constraints forcing private savings to be positive).

9.3.2

The situation is very different with unfunded Social Security. Now the government collects $d(t)$ from the young at time $t$ and distributes this to the current old with per capita transfer $b(t) = (1+n) d(t)$ (which takes into account that there are more young than old because of population growth). Therefore, the individual maximization problem becomes

$$\max_{c_1(t), c_2(t+1), s(t)} u(c_1(t)) + \beta u(c_2(t+1))$$

subject to

$$c_1(t) + s(t) + d(t) \le w(t)$$

and

$$c_2(t+1) \le R(t+1) s(t) + (1+n) d(t+1),$$

for a given feasible sequence of Social Security payment levels $\{d(t)\}_{t=0}^{\infty}$. What this implies is that the rate of return on Social Security payments is $1 + n$ rather than $R(t+1)$, because unfunded Social Security is a pure transfer system. Only $s(t)$ goes into capital accumulation. Therefore, intuitively we expect unfunded Social Security to reduce capital accumulation, and in economies with dynamic inefficiencies, this may be a good thing. This leads to the following proposition (proof left for Problem Set 5).

Proposition 27 Consider the above-described overlapping generations economy and suppose that there is dynamic inefficiency in the decentralized competitive equilibrium. Then there exists a feasible sequence of unfunded Social Security payments $\{d(t)\}_{t=0}^{\infty}$ which will constitute a competitive equilibrium starting from any date $t$.

Intuitively, unfunded Social Security reduces the overaccumulation and improves the allocation of resources. In many ways, this is equivalent to commodities being transferred from high indexed agents to low indexed agents in the Shell example above.
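The different effects of the two systems on private saving can be illustrated for a single generation. The sketch below is a partial-equilibrium illustration only: it assumes log utility, $\log c_1(t) + \beta \log c_2(t+1)$, takes the wage `w` and the gross return `R` as given, ignores the constraint $s(t) \ge 0$, and uses hypothetical function names and parameter values.

```python
# Partial-equilibrium sketch: private saving of one generation under log utility.
# All names and numbers are illustrative assumptions.

BETA, N_GROWTH = 0.9, 0.2


def saving_no_ss(w):
    """No Social Security, log utility: s = beta / (1 + beta) * w."""
    return BETA / (1.0 + BETA) * w


def saving_funded(w, d):
    """Fully funded: c2 = R (s + d), so s + d equals saving without Social Security."""
    return saving_no_ss(w) - d


def saving_unfunded(w, R, d, d_next):
    """Unfunded: c2 = R s + (1 + n) d_next; the first-order condition solved for s."""
    return (BETA * R * (w - d) - (1.0 + N_GROWTH) * d_next) / ((1.0 + BETA) * R)


w, R, d = 1.0, 1.5, 0.1
capital_no_ss = saving_no_ss(w)                 # invested in capital without Social Security
capital_funded = saving_funded(w, d) + d        # s + d is invested in capital
capital_unfunded = saving_unfunded(w, R, d, d)  # only s is invested in capital
```

With these numbers, `capital_funded` coincides with `capital_no_ss`, while `capital_unfunded` is lower, in line with the intuition that only the unfunded system crowds out capital accumulation.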


10.1 The Brock-Mirman Model

The classic analysis of economic growth with stochastic shocks was undertaken by Brock and Mirman in their 1972 paper. This was done in the context of optimal growth. However, if the economy admits a representative household, it turns out that despite the stochastic shocks, the First and Second Welfare Theorems still hold, so equilibrium growth is the same as optimal growth. In fact, the Brock-Mirman model is the starting point of the Real Business Cycle models you will study later. For now, it suffices to note that this model, for all practical purposes, is identical to the non-stochastic model, except that we have to think of expectations. In particular, it is a solution to the following program:

$$\max_{\{c(t), k(t)\}_{t=0}^{\infty}} E_0 \sum_{t=0}^{\infty} \beta^t u(c(t)) \tag{10.1}$$

subject to

$$k(t+1) = A(t) f(k(t)) + (1-\delta) k(t) - c(t) \quad \text{and} \quad k(t) \ge 0, \tag{10.2}$$

with given $k(0)$. Here $E_0$ is the expectations operator conditional on information available at time $t = 0$. The budget constraint with the production function substituted in, equation (10.2), requires some care in interpreting. $A(t)$ is now introduced as a stochastic productivity term. The expectations are taken because the time path of the sequence $\{A(t)\}_{t=0}^{\infty}$ is not known in advance. This implies that strategies have to satisfy the proper measurability conditions. In particular, in general we can do this by assuming that information at time $t$ is represented by a partition $\mathcal{F}_t$, so that $E_t[x] = E[x \mid \mathcal{F}_t]$, and variables chosen at time $t$ have to be measurable with respect to $\mathcal{F}_t$. This simply means that they cannot be conditioned on realizations of future-dated stochastic variables. The above model can be enriched by assuming that there are stochastic preference shocks, for example by augmenting the utility function $u(c(t))$ by a shock $b(t)$, so that $u(c(t) \mid b(t))$ is also a random function dependent on the realization of $b(t)$. In addition, analysis of growth under uncertainty makes the standard assumptions on the production function as in Assumptions 1 and 2 above, and the standard assumption on preferences as in Assumption 8. Given this setup, the problem can again be written as a dynamic programming problem, but now it is a stochastic dynamic programming problem. In particular, it takes the form

$$V(k) = \max_{c(k)} \left\{u(c) + \beta E V\left[A f(k) + (1-\delta) k - c\right]\right\}, \tag{10.3}$$

where the expectation is included because there is uncertainty about future values of the stochastic variable $A$. The rest of the analysis is very similar to the non-stochastic case, except that the Euler equations also include expectations. For example, assuming that $A(t)$ is known at time $t$, the key Euler equation becomes:

$$u'(c(t)) = \beta E_t\left[\left(A(t+1) f'(k(t+1)) + (1-\delta)\right) u'(c(t+1))\right].$$
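The stochastic Euler equation can be checked in the one special case where this model has a well-known closed-form solution: log utility, Cobb-Douglas technology $f(k) = k^{\alpha}$ and full depreciation ($\delta = 1$), for which the optimal policy is $k(t+1) = \alpha\beta A(t) k(t)^{\alpha}$. These functional forms, the lognormal shock distribution and all names below are assumptions made for this illustration; the text above keeps $u$ and $f$ general.

```python
import math
import random

# Special case with a closed form: u(c) = log c, f(k) = k**ALPHA, delta = 1.
# Parameter values, the shock distribution and function names are assumptions.
ALPHA, BETA = 0.3, 0.95


def policy(k, a):
    """Closed-form policy: invest the fraction ALPHA * BETA of output."""
    output = a * k ** ALPHA
    k_next = ALPHA * BETA * output
    return output - k_next, k_next  # (consumption, next-period capital)


def euler_residual(k, a, shocks):
    """u'(c_t) - beta * E_t[A' f'(k') u'(c')], with E_t taken as a sample average."""
    c, k_next = policy(k, a)
    expectation = 0.0
    for a_next in shocks:
        c_next, _ = policy(k_next, a_next)
        expectation += a_next * ALPHA * k_next ** (ALPHA - 1.0) / c_next
    expectation /= len(shocks)
    return 1.0 / c - BETA * expectation


rng = random.Random(0)
shocks = [math.exp(rng.gauss(0.0, 0.1)) for _ in range(1000)]
residual = euler_residual(0.2, 1.1, shocks)
```

In this special case the shock $A(t+1)$ cancels inside the expectation, so the residual is zero up to rounding error for any set of draws. With partial depreciation or other utility functions no such closed form is available and the policy has to be computed numerically.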

10.2

I now present the model from Acemoglu-Zilibotti (JPE 1997), which aims to capture the interaction between diversification of risks and capital accumulation, and emphasizes the endogenous generation of risks in the growth process. This model will give an example of stochastic growth and also illustrate how the productivity of capital can change endogenously over the development process, and differ across countries. Finally, this model will also introduce some tools that are useful for analyses of dynamic stochastic economies.

10.2.1 The Environment

Consider the following model. There is a continuum of equally likely states represented by the unit interval. Agents have to invest their savings in intermediate sectors, which will then pay off in the form of capital in the next period. Intermediate sector $j \in [0,1]$ pays a positive return only in state $j$ and nothing in any other state. This formulation implies that investing in a sector is equivalent to buying a basic Arrow security that only pays in one state of nature. More formally, an investment of $F^j$ in sector $j$ generates capital of the amount $RF^j$ if state $j$ occurs and $F^j \ge M_j$, and nothing otherwise. There is also a safe project, which transforms one unit of savings into $r < R$ units of capital. The requirement $F^j \ge M_j$ implies that all intermediate sectors have linear technologies but some require a certain minimum size, $M_j$, before being productive. The distribution of minimum size requirements is given by

$$M_j = \max\left\{0, \frac{D}{1-\gamma}(j - \gamma)\right\}.$$

Sectors $j \le \gamma$ have no minimum size requirement, and for the rest of the sectors the minimum size requirement increases linearly. The next figure shows the minimum size requirements diagrammatically, and will also be used for determining the equilibrium once demand for assets is introduced.

There are two important features here:
1. risky investments have a higher expected return than the safe asset (i.e. $R > r$);
2. different projects are imperfectly correlated so that there is safety in variety.

A convenient implication of this formulation is that if a portfolio consists of an equiproportional investment $F$ in all projects $j \in \bar{J} \subseteq [0,1]$, and the measure of the set $\bar{J}$ is $p$, then the portfolio pays the return $RF$ with probability $p$, and nothing with probability $1 - p$.

These features imply that if the aggregate production set were convex (i.e. $D = 0$), all agents would invest an equal amount in all intermediate goods sectors and diversify all risks. However, in the presence of nonconvexities, as captured by the minimum size requirements, there is a trade-off between insurance and high productivity. The preferences of consumers over final goods are defined as:

$$E_t U(c_t, c_{t+1}) = \log(c_t) + \beta \int_0^1 \log\left(c_{t+1}^j\right) dj, \tag{10.4}$$

which again ensure a constant savings rate. Note that integration over $[0,1]$ is over the states of nature. The individual life cycle and decisions are summarized in the next figure. The final good is produced with the aggregate production function

$$Y_t = A K_t^{\alpha} L_t^{1-\alpha}. \tag{10.5}$$

The aggregate capital stock depends on the realization of the state of nature. If the state of nature is $j$, then $K_{t+1}^j = \int_{\Omega_t} \left(r\phi_{h,t} + RF_{h,t}^j\right) dh$, where $F_{h,t}^j$ is the amount of savings invested by agent $h \in \Omega_t$ in sector $j$, $\phi_{h,t}$ is the amount invested in the safe asset, and $\Omega_t$ is the set of young agents at time $t$. Since both labor and capital trade in competitive markets, equilibrium factor prices in state $j$ are given as:

$$W_{t+1}^j = (1-\alpha) A \left(K_{t+1}^j\right)^{\alpha} = (1-\alpha) A \left(\int_{\Omega_t} \left(r\phi_{h,t} + RF_{h,t}^j\right) dh\right)^{\alpha}, \tag{10.6}$$

$$\rho_{t+1}^j = \alpha A \left(K_{t+1}^j\right)^{\alpha-1} = \alpha A \left(\int_{\Omega_t} \left(r\phi_{h,t} + RF_{h,t}^j\right) dh\right)^{\alpha-1}. \tag{10.7}$$

10.2.2 Equilibrium

Now consider the portfolio decisions of households. Each household takes the set of traded securities as given, and maximizes its utility by allocating its savings across different assets. Securities are labeled by the indices of the project to which they are attached. Therefore, one unit of security $j$ entitles its holder to $R$ units of $t + 1$ capital in state of nature $j$. Denote the unit price of security $j$ (in terms of savings of time $t$) by $P_{j,t}$. Assume that the intermediates are supplied by financial intermediaries. Since 1 unit of savings invested in a project that is open yields one unit of capital, competition among financial intermediaries ensures that in equilibrium $P_{j,t} = 1$, that is, all projects will be offered to households at marginal cost. Therefore, denoting the set of open projects at time $t$ by $J_t$, optimal consumption, savings and portfolio decisions can be characterized by:

$$\max_{s_t, \phi_t, \{F_t^j\}_{0 \le j \le 1}} \log(c_t) + \beta \int_0^1 \log\left(c_{t+1}^j\right) dj \tag{10.8}$$

subject to

$$\phi_t + \int_0^1 F_t^j \, dj = s_t, \tag{10.9}$$

$$c_{t+1}^j = \rho_{t+1}^j \left(r\phi_t + RF_t^j\right), \tag{10.10}$$

$$F_t^j = 0 \quad \text{for all } j \notin J_t, \tag{10.11}$$

$$c_t + s_t \le w_t. \tag{10.12}$$

It is important that these agents take not only $w_t$ and $\rho_{t+1}^j$, but also the set of risky assets $J_t$, as given. A static equilibrium given wage earnings of young agents, $W_t$ (or given $K_t$), is a solution to the maximization problem (10.8) subject to (10.9)-(10.12), such that $F_t^j \ge M_j$ for all open sectors. A dynamic equilibrium is a sequence of static equilibria linked to each other through (10.6). Because preferences are logarithmic, the following saving rule is obtained irrespective of the risk-return trade-off:

$$s_t^* \equiv s^*(w_t) = \frac{\beta}{1+\beta} w_t. \tag{10.13}$$

Given this result, a household's optimization problem can be broken into two parts: first, the amount of savings is determined, and then an optimal portfolio is chosen. Next observe that in equilibrium we will have:

1. $F_t^j = F_t^{j'}$ for all $j, j'$ that are open (i.e. $j, j' \in J_t$). Since each individual is facing the same price for all of the traded symmetric Arrow securities, he would want to purchase an equal amount of each, i.e., a balanced portfolio.

2. The set of open projects will be $J_t = [0, n_t]$ for some $n_t \in [0,1]$. This states that when only a subset of projects can be opened in equilibrium, small projects are opened before large projects. As a result, if a sector $j$ is open, all sectors $j' \le j$ must also be open.

Given this result, the maximization problem simplifies to:

$$\max_{\phi_t, F_t} \; n_t \log\left[\rho_{t+1}^{(q_G)}\left(RF_t + r\phi_t\right)\right] + (1 - n_t) \log\left[\rho_{t+1}^{(q_B)}\left(r\phi_t\right)\right], \tag{10.14}$$

subject to

$$\phi_t + n_t F_t = s_t^*, \tag{10.15}$$

where $n_t$ and the $\rho_{t+1}^j$'s are taken as parametric by the agent, and $s_t^*$ is given by (10.13). $\rho_{t+1}^{(q_B)} = \alpha A (r\phi_t)^{\alpha-1}$ is the marginal product of capital in the bad state, when the realized state is $j > n_t$ and no risky investment pays off. $\rho_{t+1}^{(q_G)} = \alpha A (RF_t + r\phi_t)^{\alpha-1}$ applies in the good state, i.e. when the realized state is $j \le n_t$. Maximization of (10.14) gives:

$$\phi_t^* = \frac{(1 - n_t) R}{R - r n_t} s_t^*, \tag{10.16}$$

$$F_t^{j*} = \begin{cases} F_t^* \equiv \dfrac{R - r}{R - r n_t} s_t^* & \text{for } j \le n_t \\ 0 & \text{for } j > n_t. \end{cases} \tag{10.17}$$

Let

$$n_t^*(K_t) = \begin{cases} \dfrac{(R + r\gamma) - \left\{(R + r\gamma)^2 - 4r\left[D^{-1}(R - r)(1 - \gamma)\Gamma K_t^{\alpha} + \gamma R\right]\right\}^{1/2}}{2r} & \text{if } K_t \le \left(\dfrac{D}{\Gamma}\right)^{1/\alpha} \\ 1 & \text{if } K_t > \left(\dfrac{D}{\Gamma}\right)^{1/\alpha}, \end{cases} \tag{10.18}$$

where $\Gamma \equiv A(1-\alpha)\dfrac{\beta}{1+\beta}$. Then there exists a unique equilibrium such that $s_t^* = \dfrac{\beta}{1+\beta}(1-\alpha) A K_t^{\alpha}$, and $\phi_t^*$, $F_t^{j*}$ are given by (10.16) and (10.17) with $n_t = n_t^*(K_t)$.

This equilibrium can be expressed as the intersection of the aggregate demand for each risky asset, $F^*(n_t)$, with the thick curve that traces minimum size requirements in the figure.
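The static equilibrium can be computed directly: the share of open sectors is the point where the demand for each risky asset equals the minimum size requirement of the marginal sector. The sketch below assumes the linear minimum size schedule $M_j = \max\{0, D(j - \gamma)/(1 - \gamma)\}$, the log saving rule and the portfolio rules stated above; the parameter values and all function names are illustrative assumptions.

```python
import math

# Illustrative parameters (assumptions). They satisfy R >= (2 - GAMMA) * r, which
# keeps the smaller root of the quadratic below 1 until all sectors are open.
R, r, GAMMA, D = 2.0, 1.0, 0.25, 1.0
A, ALPHA, BETA = 1.0, 0.35, 0.9
BIG_GAMMA = A * (1.0 - ALPHA) * BETA / (1.0 + BETA)


def min_size(j):
    """Minimum size requirement M_j = max{0, D (j - gamma) / (1 - gamma)}."""
    return max(0.0, D * (j - GAMMA) / (1.0 - GAMMA))


def savings(K):
    """s* = beta / (1 + beta) * w, with w = (1 - alpha) A K^alpha."""
    return BIG_GAMMA * K ** ALPHA


def n_star(K):
    """Share of open sectors: smaller root of F*(n) = M_n, capped at 1."""
    s = savings(K)
    if s >= D:
        return 1.0
    b = R + r * GAMMA
    disc = b * b - 4.0 * r * ((R - r) * (1.0 - GAMMA) * s / D + GAMMA * R)
    return (b - math.sqrt(disc)) / (2.0 * r)


def portfolio(K):
    """Return (n*, safe holding phi*, per-sector risky holding F*)."""
    n, s = n_star(K), savings(K)
    phi = (1.0 - n) * R / (R - r * n) * s
    F = (R - r) / (R - r * n) * s
    return n, phi, F


n, phi, F = portfolio(1.0)
```

At `K = 1.0` only part of the sectors is open, the risky holding `F` equals the minimum size of the marginal sector `min_size(n)`, and `phi + n * F` exhausts savings. For a large enough capital stock, `n_star` returns 1 and all risk is diversified.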

10.2.3 Dynamics

Next, it is straightforward to characterize the full stochastic equilibrium process, the equilibrium law of motion of Kt is: Kt+1 =

r(1n t) RKt Rrn t

RKt

prob. 1 n t prob. n t

(10.19)

The capital stock follows a Markov process in which the level of capital next period depends on whether the economy is lucky in the current period (which happens when the risky investments pay off, with probability $n_t^*$). Moreover, the probability of this event changes over time. As the economy develops, it can afford to open more sectors, and the probability of transferring a large capital stock to the next period, $n_t^*$, increases. Also from (10.19), the expected productivity of an economy depends on its level of development and diversification. To see this, define expected total factor productivity (conditional on the proportion of sectors open) by

$$\sigma^e(n^*(K_t)) = (1 - n^*) \frac{r(1 - n^*)}{R - r n^*} R + n^* R. \tag{10.20}$$
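A minimal sketch of (10.19) and (10.20) makes it possible to check numerically how expected productivity varies with the share of open sectors. It assumes that savings take the form `scale * K**alpha`, which is consistent with Cobb-Douglas wages and a constant saving share. The names `next_capital`, `expected_tfp` and `scale`, and all parameter values, are hypothetical.

```python
def next_capital(K, n, lucky, R, r, scale, alpha):
    """One step of the law of motion (10.19), given the share of open sectors n."""
    savings = scale * K ** alpha          # assumed form of equilibrium savings
    if lucky:                             # probability n: the risky investments pay off
        return R * savings
    return r * (1.0 - n) / (R - r * n) * R * savings   # probability 1 - n

def expected_tfp(n, R, r):
    """Expected total factor productivity (10.20), conditional on n."""
    return (1.0 - n) * r * (1.0 - n) / (R - r * n) * R + n * R

# Hypothetical parameter values, for illustration only.
R, r, scale, alpha = 2.0, 1.0, 0.5, 0.3
values = [expected_tfp(i / 10, R, r) for i in range(11)]

# Expected next-period capital equals expected TFP times savings.
K, n = 1.5, 0.4
expected_K = (n * next_capital(K, n, True, R, r, scale, alpha)
              + (1.0 - n) * next_capital(K, n, False, R, r, scale, alpha))

print(values)
print(expected_K, expected_tfp(n, R, r) * scale * K ** alpha)
```

In this parameterization the measure runs from $r$ at $n = 0$ (only the safe asset is held) to $R$ at $n = 1$ (full diversification).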

Simple differentiation establishes that as $n_t^*$ increases, this measure also increases. To formalize the dynamics of development, define the following concepts:

(i) QSSB: quasi steady state of an economy which always has unlucky draws. An economy would converge to this quasi steady state if it follows the optimal investments characterized above but the sectors invested in never pay off due to bad luck.

(ii) QSSG: quasi steady state of an economy which always receives good news.

From (10.19), and with $\Gamma$ the savings constant from (10.18), these satisfy

$$K^{QSSB} = \left[\frac{r(1 - n^*(K^{QSSB}))}{R - r n^*(K^{QSSB})} R\Gamma\right]^{\frac{1}{1-\alpha}} \quad \text{and} \quad K^{QSSG} = (R\Gamma)^{\frac{1}{1-\alpha}}. \tag{10.21}$$

If uncertainty could be completely removed, that is $n^*(K^{QSSG}) = 1$, then there would never be bad news, and the good quasi steady state would be a real steady state: a point, if reached, from which the economy would never depart. From equations (10.18) and (10.21), the condition for this steady state to exist is that the saving level corresponding to $K^{QSSG}$ be sufficient to ensure a balanced portfolio of investments, of at least $D$, in all the intermediate sectors. Thus, if

$$D < \Gamma^{\frac{1}{1-\alpha}} R^{\frac{\alpha}{1-\alpha}}, \tag{10.22}$$

a steady state will exist and we denote it by $K^{SS}$. The following figure is useful in understanding the dynamics.

At very low levels of capital, the Inada conditions of the production function guarantee positive growth even conditional on bad news (both curves lie above the 45° line). Then, there is a range (region II) in which growth only occurs conditional on good draws (the bad draws curve is below the 45° line). Regions I and II are separated by $K^{QSSB}$. When they are below this level, all economies will grow towards it. When they are above this level, their output will fall in case they receive bad shocks, and the probability of bad news is very high when the economy has a level of capital stock just above $K^{QSSB}$. As good news is received, the capital stock will grow and the probability of a further lucky draw will increase. Note that even when it grows, the economy is still exposed to large undiversified risks, and will typically experience some setbacks. Finally, provided (10.22) is satisfied, the economy will eventually enter region III, where all diversifiable risks will be removed (since all sectors are open and an equal amount is invested in all sectors), and there will be deterministic convergence to $K^{SS}$.

10.2.4 Efficiency

Since all agents are price takers, it may be conjectured that the decentralized equilibrium here is efficient. This turns out not to be the case. To illustrate this, consider the portfolio allocation that a social planner maximizing the welfare of the current generation of savers would choose, taking the amount of savings as given. The difference between the social planner's allocation and a decentralized equilibrium is that the social planner explicitly chooses $F_t^j$ and the set of open sectors, $J_t$. It is straightforward to see that the subset of projects in which the planner will invest will be of the form $J^{FB} = [0, n^{FB}]$, similar to the decentralized equilibrium. Therefore, subject to feasibility, the planner will solve the maximization problem (10.23), choosing $n_t$, $\phi_t$ and $\{F_t^j\}_{0 \le j \le n_t}$, with an objective that integrates over the open projects $j \in [0, n_t]$.

Let $n^*(K_t)$ be given by (10.18) and $S_t = s_t^*$ denote total savings. Then, for all $S_t < D$, $n^{FB}(K_t) > n^*(K_t)$, $\phi^{FB}(K_t) < \phi^*(K_t)$, and each agent receives the following portfolio of assets:

$$F_t^{j,FB} = \begin{cases} M_{j_t^*} > M_j & \text{if } j < j_t^* \\ M_j & \text{if } j_t^* \le j \le n^{FB}(K_t) \\ 0 & \text{if } j > n^{FB}(K_t) \end{cases} \tag{10.24}$$

In other words, the social planner will always open more sectors/projects than the decentralized equilibrium, and will finance this by investing less in the sectors without the minimum size requirement. This is shown in the next picture:

Why is the decentralized equilibrium inefficient? The answer is a pecuniary externality due to missing markets. As an additional sector opens, all existing projects become more attractive relative to the safe asset because the amount of undiversified risk they carry is reduced, and as a result, risk-averse agents are more willing to buy the existing securities. Since each agent ignores his impact on others' diversification opportunities, the externality is not internalized. It is important to also note that the decentralized equilibrium did not correspond to an Arrow-Debreu equilibrium, and this gives the technical intuition for the inefficiency. In an Arrow-Debreu equilibrium, all commodities, even those that are not traded in equilibrium, are priced, whereas such a price schedule does not exist in this economy because of the non-convexities of the production set. Instead, the analysis here uses a more natural competitive equilibrium notion (common in general equilibrium analyses of monopolistic competition) where only commodities traded in equilibrium are priced.

10.2.5 Implications

An important implication of this analysis is that there will be systematic differences in productivity across countries depending on the realization of past shocks: economies with low capital, that is, those that have received bad shocks in the past, will have fewer sectors open, and therefore they will correctly fear undiversified risks. As a result, these economies will invest in the low-productivity safe assets, and achieve low productivity. The analysis also implies a systematic relationship between the variability of current performance and the level of development (the level of the capital stock). Richer countries will have less variable growth rates. This is a pattern we see in the data.

10.2.6

Would the market failure in portfolio choices be overcome if some financial institution could coordinate households' investment decisions? Imagine that rather than all agents acting in isolation and ignoring their impact on each other's decisions, funds are intermediated through a financial coalition-intermediary. This intermediary can collect all the savings and offer to each saver a complex security (as different from a Basic Arrow Security) that pays $R F_t^{j,FB} + r\phi_t^{FB}$ in each state $j$, where $F_t^{j,FB}$ and $\phi_t^{FB}$ are as in the optimal portfolio. Holding this security would make each consumer better off compared to the equilibrium.

Although from this discussion it may appear that the inefficiency we identified may not be robust to the formation of more complex financial institutions, we will show that this is not the case. Unless some rather strong assumptions are made about the set of contracts that a financial intermediary can offer, the unique equilibrium allocation with unfettered competition among intermediaries will be identical to the one we characterized as the equilibrium above.

In order to model the endogenous formation of coalitions, let us now assume that savings can be intermediated by some households who decide to act as middlemen and run an investment fund. Put differently, following Townsend (1983), some agents initiate the formation of a coalition of households which buys securities on behalf of its members. In return, participants in the financial coalition can be charged an intermediation fee. Projects are still run by individual households. Let me now introduce the following three assumptions for the coalition-formation game:

1. An agent cannot be part of two coalitions at the same time.

2. Coalitions at all points maximize a weighted utility of their members. In particular, a coalition cannot commit to a path of action that will be against the interests of its members in the continuation game.

3. Coalitions cannot exclude other agents (or coalitions) from investing in a particular project.

The first assumption is introduced to simplify the objective function of coalitions. In fact, this assumption makes it easier for an efficient allocation to be sustained as an equilibrium. The second one is the most important assumption. We view this as a very natural assumption along the lines of subgame perfection, and its importance will be discussed further below.

Assumption 3 is also mainly expositional. We will see below that as long as Assumption 2 holds, coalitions would never want to exclude others, and thus this assumption is only imposed to simplify the exposition.

Formally, the game now has three stages. In the first stage, each household $h$ can announce that he is willing to act as an intermediary for a specified set of households (an element of the set of all subsets of the set of households, over which we use the Lebesgue measure). In general, only a subset of the agents named in the offer will accept it. Note that because of Assumption 1, in equilibrium the set of households will be partitioned into disjoint coalitions. The intermediary $h$ will invest the savings he collects (net of his commission) in shares of both risky and safe projects so as to maximize the total utility of the agents who accept his offer. A first-stage strategy for household $h$ is an announcement $Z_h^{(1)}$ consisting of a commission in $\mathbb{R}_+$ and a set of households. If agent $h$ announces that he will not act as an intermediary, then $Z_h^{(1)} = \emptyset$. Among the possible non-null announcements, there is autarky, i.e. $Z_h^{(1)} = (0, \{h\})$, which means that $h$ will only intermediate (at most) his own savings. Finally, we denote the set of first-stage announcements of all agents by $Z^{(1)}$.

In the second stage, each agent $h$ can announce his plan to run at most one project and sell the corresponding Basic Arrow Security, i.e. $h$ announces a pair $(j, P_{j,h})$, as in the game discussed in Section 3. But now securities are sold to financial intermediaries rather than directly to households. Formally, the second-stage announcement for agent $h$ is $Z_h^{(2)} = (j, P_{j,h}) \in [0, 1] \times \mathbb{R}_+$, and $Z^{(2)}$ is the set of all second-stage announcements. We will also denote the set of minimum security prices announced in the second stage of the game by $P = \{P^j\}_{j \in J}$.

In the third stage, each household takes the set of prior announcements, $Z^{(1)}$ and $Z^{(2)}$, as given, and chooses which coalition to join. Equivalently, $Z_h^{(3)}$ is $h$'s choice of an intermediary from $M_h(Z^{(1)})$, the set of coalitions which announced his name. Note that although the set $M_h(Z^{(1)})$ could be empty, this will never be the case in equilibrium, since any agent can costlessly make the autarky announcement in the first stage. Finally, after all agents announce which coalition-intermediary they will belong to, each intermediary makes the optimal investment decision. We still use the notation $\phi_h$, $F_h^j$ to denote the investment of an agent (through a coalition) in the safe and risky assets. More precisely, if a coalition invests a given amount in project $j$, then $F_h^j$ will be the share of agent $h$ in this coalition times that amount.

Definition 9 A (perfect) equilibrium is a set of announcements $Z = (Z^{(1)}, Z^{(2)}, Z^{(3)})$ at each stage of the game, a price function $P(Z)$ for all Basic Arrow Securities, a saving decision $s_h(Z)$, and induced holdings of the safe asset $\phi_h(Z)$ and securities $F_h^j(Z)$ for all agents, and factor payments $W$ and $\rho$ such that given the announcements of the previous stage(s) and the announcements of all other agents in the current stage, every household chooses $Z_h^{(i)}$ that maximizes its utility as given by (5), and factor returns are determined by (10.6) and (10.7).

Note that the definition of equilibrium used so far was also subgame perfect. Here we emphasize perfection in order to reiterate the importance of Assumption 2 in our analysis. The first observation is that free entry will drive profits (commissions) to zero in both the first and second stages. This is established by the following lemma (proof omitted).

Lemma 3 In equilibrium, (i) $P^j(Z) = 1$ for all $j$; (ii) the commission of every intermediary $h$ is zero.

With this remark, it is now possible to establish the following proposition:

Proposition 28 The set of (perfect) equilibria is non-empty and all allocations in this set have the following characteristics:

1. For all $h$, $M_h \ne \emptyset$ (all agents are included in some coalition).

2. Let $n_t^*$ be defined in (10.18). Then, for all $h$, either $Z_h^{(2)} = \emptyset$ or $Z_h^{(2)} = (j, 1)$ where $j \in [0, n_t^*]$. And, for all $j \in [0, n_t^*]$, there exists $h$ such that $Z_h^{(2)} = (j, 1)$.

3. Portfolio holdings are given by equations (10.16) and (10.17).

This result implies that even with unrestricted coalitions, the inefficiency cannot be prevented. The key feature is that each agent would be creating a positive externality by holding a non-balanced portfolio like the one necessary for efficiency, and they will typically find a way of moving towards a balanced portfolio, undermining efforts to sustain the efficient allocation.

Until now, we investigated economic growth models without growth in the sense that either the economy settled into a steady state without any economic growth, or growth came exogenously from the unmodeled process of labor-augmenting technological progress. The rest of the lectures will look into issues of endogenous growth and how the process of development, whereby a society makes the transition from being a less-developed economy to a more developed one, takes place endogenously.

The first-generation models of endogenous growth made a big advance relative to the neoclassical growth model in generating sustained growth. Two approaches are noteworthy here. The first one basically keeps the essence of the neoclassical approach, with competitive markets and no externalities. The second makes the first attempt at endogenizing technology by introducing externalities and knowledge spillovers (flows) across firms.

11.1 AK Model Revisited

Let us start with the simplest neoclassical model of sustained growth, which we already encountered in the context of the Solow growth model. This is the so-called AK model, where the production technology is linear in capital. We will also see that in fact what matters is that the accumulation technology is linear, not necessarily the production technology. But for now it makes sense to start with the simpler case of the AK economy.

11.1.1

Since there will be growth and we are, at least at first, interested in balanced growth, we are forced to use preferences that are asymptotically consistent with balanced growth. We may as well assume these preferences from the beginning, and thus choose the standard CRRA preferences of the canonical model. More specifically, let us assume that the economy admits an infinitely-lived representative household with utility given by

$$U = \int_0^\infty \exp(-(\rho - n)t) \, \frac{c^{1-\theta} - 1}{1 - \theta} \, dt, \tag{11.1}$$

subject to the constraint

$$\dot{a} = (r - n)a + w - c, \tag{11.2}$$

where $a$ is assets per person, $r$ is the interest rate, $w$ is the wage rate, and $n$ is the growth rate of population. I have now suppressed time dependence to simplify notation. We again impose the no-Ponzi game constraint:

$$\lim_{t \to \infty} a(t) \exp\left(-\int_0^t [r(s) - n] \, ds\right) \ge 0. \tag{11.3}$$

The Euler equation for the representative household is the same as before and gives:

$$\frac{\dot{c}}{c} = \frac{1}{\theta}(r - \rho), \tag{11.4}$$

and the transversality condition is:

$$\lim_{t \to \infty} a(t) \exp\left(-\int_0^t [r(s) - n] \, ds\right) = 0. \tag{11.5}$$

The production sector is similar to before, except that Assumptions 1 and 2 are not satisfied. More specifically, we have that per capita output is given by

$$y = f(k) = Ak, \tag{11.6}$$

with $A > 0$ being a constant. Equation (11.6) has a number of notable differences from our standard production function satisfying Assumptions 1 and 2. First, output is only a function of capital, and there are no diminishing returns (i.e., it is no longer the case that $f''(\cdot) < 0$). More important is the fact that the Inada conditions embedded in Assumption 2 are no longer satisfied. In particular,

$$\lim_{k \to \infty} f'(k) = A > 0.$$

This feature is essential for sustained growth. The conditions for profit-maximization are very similar to before, and require that the marginal product of capital be equal to the rental price of capital, $R = r + \delta$. Since, as is obvious from equation (11.6), the marginal product of capital is constant and equal to $A$, we also have that the net rate of return on savings is constant and equal to:

$$r = A - \delta. \tag{11.7}$$

Since the marginal product of labor is zero, the wage rate, $w$, is zero. This is a somewhat extreme result, and again it can be relaxed as we will see below. Alternatively, in this model we can think of $k$ as a combination of physical and human capital, in which case there will be labor income coming from human capital, which will be accumulating in the same way as physical capital (in particular linearly).

11.1.2 Equilibrium

To characterize the equilibrium, which is defined in exactly the same way as in the basic neoclassical model, we again use $a = k$, $r = A - \delta$, and $w = 0$, and substitute these into equations (11.2), (11.4), and (11.5), to obtain:

$$\dot{k} = (A - \delta - n)k - c, \tag{11.8}$$

$$\frac{\dot{c}}{c} = \frac{1}{\theta}(A - \delta - \rho), \tag{11.9}$$

$$\lim_{t \to \infty} k(t) e^{-(A - \delta - n)t} = 0. \tag{11.10}$$

The important result immediately follows from equation (11.9). There is a constant rate of consumption growth (as long as $A - \delta - \rho > 0$), and this is entirely independent of the level of capital stock per person, $k$. This will also imply that there are no transitional dynamics in this model. Starting from any $k(0)$, the economy will immediately start growing at a constant rate. To see this, integrate equation (11.9) starting from some initial level of consumption $c(0)$ [still to be determined]. This gives

$$c(t) = c(0) \exp\left(\frac{1}{\theta}(A - \delta - \rho)t\right). \tag{11.11}$$

Since there is growth in this economy, we have to ensure that the transversality condition is satisfied (i.e., that lifetime utility is bounded away from infinity), and also we want to ensure positive growth. Therefore we impose:

Assumption 12 $A > \rho + \delta > (1 - \theta)(A - \delta) + \theta n + \delta$.

The first part of this condition ensures that there will be positive consumption growth, while the second part is the analogous condition to $\rho + \theta g > g + n$ in the neoclassical growth model with technological progress, which was imposed to ensure bounded utility (and thus was used in proving that the transversality condition was satisfied).

11.1.3 Transitional Dynamics

We now more explicitly show there are no transitional dynamics; that is, not only the growth rate of consumption, but the growth rates of capital and output are also constant at all points in time, and equal the growth rate of consumption given in equation (11.9). To do this, let us substitute for $c(t)$ from equation (11.11) into equation (11.8), which yields

$$\dot{k} = (A - \delta - n)k - c(0) \exp\left(\frac{1}{\theta}(A - \delta - \rho)t\right), \tag{11.12}$$

which is a first-order, non-autonomous linear differential equation in $k$. This type of equation can be solved easily. In particular recall that if $\dot{z} = az + g(t)$, then the solution is

$$z(t) = z_0 \exp(at) + \exp(at) \int_0^t \exp(-as) \, g(s) \, ds$$

for some constant $z_0$ chosen to satisfy the boundary conditions. Therefore, equation (11.12) solves for:

$$k(t) = \kappa \exp((A - \delta - n)t) + \left[(A - \delta)(\theta - 1)/\theta + \rho/\theta - n\right]^{-1} \left[c(0) \exp\left((1/\theta)(A - \delta - \rho)t\right)\right], \tag{11.13}$$

where $\kappa$ is a constant to be determined. Notice also that Assumption 12 ensures that $(A - \delta)(\theta - 1)/\theta + \rho/\theta - n > 0$. From (11.13), it may look like capital is not growing at a constant rate, since it is the sum of two components growing at different rates. However, this is where the transversality condition becomes useful. Let us substitute from (11.13) into the transversality condition, (11.10), which yields

$$\lim_{t \to \infty} \left[\kappa + \left[(A - \delta)(\theta - 1)/\theta + \rho/\theta - n\right]^{-1} c(0) \exp\left(-\left[(A - \delta)(\theta - 1)/\theta + \rho/\theta - n\right]t\right)\right] = 0.$$

Note that $(A - \delta)(\theta - 1)/\theta + \rho/\theta - n > 0$, so the second term in this expression converges to zero as $t \to \infty$. But the first term is a constant. Thus the transversality condition can only be satisfied if $\kappa = 0$. Therefore we have from (11.13) that:

$$k(t) = \left[(A - \delta)(\theta - 1)/\theta + \rho/\theta - n\right]^{-1} \left[c(0) \exp\left((1/\theta)(A - \delta - \rho)t\right)\right] = k(0) \exp\left((1/\theta)(A - \delta - \rho)t\right), \tag{11.14}$$

where the second equality immediately follows from the fact that the boundary condition has to hold for capital at $t = 0$. This equation naturally implies that capital and output grow at the same rate as consumption. It also pins down the initial level of consumption exactly as

$$c(0) = \left[(A - \delta)(\theta - 1)/\theta + \rho/\theta - n\right] k(0). \tag{11.15}$$
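The claim that the AK economy jumps straight onto its balanced growth path can be sketched numerically. The snippet below builds the candidate paths for $k(t)$ and $c(t)$ from (11.11), (11.14) and (11.15) and evaluates the residual of the resource constraint (11.8). The function name `ak_path` and the parameter values are hypothetical, chosen only so that Assumption 12 holds.

```python
import math

def ak_path(k0, A, delta, rho, theta, n):
    """Balanced growth path of the AK model implied by (11.11), (11.14), (11.15)."""
    g = (A - delta - rho) / theta                                   # common growth rate, (11.9)
    bracket = (A - delta) * (theta - 1.0) / theta + rho / theta - n  # bracketed term in (11.15)
    c0 = bracket * k0                                               # initial consumption, (11.15)

    def k(t):
        return k0 * math.exp(g * t)

    def c(t):
        return c0 * math.exp(g * t)

    return g, c0, k, c

# Hypothetical parameter values, for illustration only.
A, delta, rho, theta, n = 0.10, 0.03, 0.02, 2.0, 0.01
g, c0, k, c = ak_path(1.0, A, delta, rho, theta, n)

# Residual of (11.8): kdot - [(A - delta - n) k - c], with kdot = g * k on this path.
residuals = [g * k(t) - ((A - delta - n) * k(t) - c(t)) for t in (0.0, 5.0, 50.0)]
print(g, c0, residuals)
```

The residuals should vanish up to floating-point error at every date, reflecting that $(A - \delta - n)$ minus the bracketed term in (11.15) equals $(A - \delta - \rho)/\theta$.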

It is also interesting to note that in this simple AK model, growth is not only endogenous in the sense of being sustained, but it is also endogenous in the sense of being affected by underlying parameters. For example, consider an increase in the rate of discount, $\rho$. Recall that in the Ramsey model, this affected the level of income per capita, but had no effect on the growth rate, which was determined by the exogenous labor-augmenting rate of technological progress. Here, clearly it will reduce the growth rate. Similarly, changes in $A$ and $\theta$ affect the levels and growth rates of consumption, capital and output. In Problem Set 5, you will also see that policy can now affect the growth rate of output permanently. Finally, we can calculate the saving rate in this economy. It is defined as total investment (increase in capital plus replacement investment) divided by output:

$$s = \frac{\dot{K} + \delta K}{Y} = \frac{\dot{k}/k + n + \delta}{A} = \frac{A - \rho + \theta n + (\theta - 1)\delta}{\theta A}, \tag{11.16}$$

where naturally $\dot{k}/k = (A - \delta - \rho)/\theta$. Consequently, the savings rate, which was taken as exogenous in the basic Solow model, is now a function of parameters, and more specifically of exactly the same parameters that determine the per capita growth rate. Summarizing, we have:

Proposition 29 Consider the above-described AK economy, with a representative household with preferences given by (11.1), and the production technology given by (11.6). In this economy there exists a unique equilibrium path in which consumption, capital and output all grow at the same rate $g \equiv (A - \delta - \rho)/\theta$ starting from any initial positive capital stock per worker $k(0)$, and the savings rate is endogenously determined by (11.16).

One important implication of the AK model is that since all markets are competitive, there is a representative household, and there are no externalities, the competitive equilibrium will be Pareto optimal. This can be proved either using First Welfare Theorem type reasoning, or by directly constructing the optimal allocation. The result is stated in the next proposition (and left for you to prove):

Proposition 30 Consider the above-described AK economy, with a representative household with preferences given by (11.1), and the production technology given by (11.6). In this economy the unique equilibrium path in which consumption, capital and output all grow at the same rate $g \equiv (A - \delta - \rho)/\theta$ is Pareto optimal.
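Returning to the saving rate in (11.16), a short sketch can confirm that the definition (investment over output) and the closed form in terms of parameters agree. The function names and parameter values below are hypothetical.

```python
def saving_rate_from_definition(A, delta, rho, theta, n):
    """(kdot/k + n + delta) / A, using kdot/k = (A - delta - rho) / theta."""
    g = (A - delta - rho) / theta
    return (g + n + delta) / A

def saving_rate_closed_form(A, delta, rho, theta, n):
    """Right-hand side of (11.16), written in terms of parameters only."""
    return (A - rho + theta * n + (theta - 1.0) * delta) / (theta * A)

# Hypothetical parameter values, for illustration only.
params = dict(A=0.10, delta=0.03, rho=0.02, theta=2.0, n=0.01)
s_def = saving_rate_from_definition(**params)
s_closed = saving_rate_closed_form(**params)
print(s_def, s_closed)
```

Both expressions depend on the same parameters that pin down the growth rate, so any change in $\rho$ or $\theta$ moves the saving rate and the growth rate together.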

11.1.4

It is straightforward to incorporate policy into this framework. The simplest and arguably one of the most relevant classes of policies is, as also discussed above, that which affects the rate of return to accumulation. In particular, suppose that there is an effective tax rate of $\tau$ on the rate of return from capital income. Repeating the analysis above immediately implies that this will adversely affect the growth rate of the economy, which will now become:

$$g = \frac{(1 - \tau)(A - \delta) - \rho}{\theta},$$

which is a decreasing function of $\tau$ if $A - \delta > 0$. Therefore, in this model, the savings rate is constant in equilibrium as in the basic Solow model, but in contrast to that model, it responds endogenously to policy.
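The effect of the capital income tax on growth can be illustrated with a few lines of code. This is a sketch under the growth formula stated above; the function name and parameter values are hypothetical.

```python
def growth_rate_with_tax(tau, A, delta, rho, theta):
    """Growth rate of the AK economy with a tax rate tau on capital income."""
    return ((1.0 - tau) * (A - delta) - rho) / theta

# Hypothetical parameter values, for illustration only.
A, delta, rho, theta = 0.10, 0.03, 0.02, 2.0
tax_rates = [0.0, 0.1, 0.2, 0.3]
growth_rates = [growth_rate_with_tax(tau, A, delta, rho, theta) for tau in tax_rates]

for tau, g in zip(tax_rates, growth_rates):
    print(f"tau = {tau:.1f} -> g = {g:.4f}")
```

Each additional unit of $\tau$ lowers growth by $(A - \delta)/\theta$, so the tax has a permanent growth effect rather than only a level effect.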

11.2

The model studied in the previous section is attractive in many respects. It generates sustained growth, which responds to policy, to underlying preferences and to technology. Moreover, it is a very close cousin of the neoclassical model. In fact, as argued there, the endogenous growth equilibrium is Pareto optimal. One unattractive feature of this model, however, is that all of national income accrues to capital. Essentially, it is a one-sector model with only capital as the factor of production. This makes it unattractive as an application to real world situations. It also blurs what is the key underlying characteristic driving growth in this model. As I pointed out above, it is not that the production technology is AK, but the related feature that the accumulation technology is linear. In this section, I will briefly illustrate this by developing a more workable version of the AK model with two sectors, which in fact features a constant share of capital in national income less than 1.

The preferences and demographics are the same as in the model of the previous section; in particular, equations (11.1)-(11.5) apply as before (but with a slightly different interpretation for the interest rate in (11.4), as will be discussed below). Moreover, to simplify the analysis I will shut down population growth, so $n = 0$, and the total amount of labor in the economy is equal to $L$ and is supplied inelastically. The main difference is in the production technology. Rather than a single good used for consumption and investment, we now envisage an economy with two sectors. Sector 1 produces consumption goods with the following technology

$$C(t) = B (K_C(t))^\alpha L_C(t)^{1-\alpha}, \tag{11.17}$$

where the subscript $C$ denotes that these are capital and labor used in the consumption sector, which has a Cobb-Douglas technology. In fact, the Cobb-Douglas assumption here is quite important in ensuring that the share of capital in national income is constant [can you see why?]. The capital accumulation equation is given by:

$$\dot{K}(t) = I(t) - \delta K(t),$$

where $I(t)$ denotes investment. Investment goods are produced with a different technology than (11.17), however. In particular, we have

$$I(t) = A K_I(t). \tag{11.18}$$

The distinctive feature of the technology for the investment goods sector, (11.18), is that it is linear in the capital stock and does not feature labor. This is an extreme version of an assumption often made in two-sector models that the investment-good sector is more capital-intensive than the consumption-good sector. In the data, there seems to be some support for this, though the capital intensities of many sectors have been changing over time as the nature of consumption and investment goods has changed. Market clearing implies:

$$K_C(t) + K_I(t) \le K(t)$$

for capital, and

$$L_C(t) \le L$$

for labor (since labor is only used in the consumption sector). An equilibrium in this economy is defined similarly to that in the neoclassical economy, but also features an allocation decision of capital between the two sectors. Moreover, since the two sectors are producing two different goods, consumption and investment goods, there will be a relative price between the two sectors which will adjust endogenously.

Since both market clearing conditions will hold as equalities (the marginal product of both factors is always positive), we can simplify notation by letting $\kappa(t)$ denote the share of capital used in the investment sector, so that $K_C(t) = (1 - \kappa(t)) K(t)$ and $K_I(t) = \kappa(t) K(t)$. From profit maximization, the rate of return to capital has to be the same when it is employed in the two sectors. Let the price of the investment good be denoted by $p_I(t)$ and that of the consumption good by $p_C(t)$; then we have

$$p_I(t) A = p_C(t) \alpha B \left(\frac{L}{(1 - \kappa(t)) K(t)}\right)^{1-\alpha}. \tag{11.19}$$

Define a steady state (a balanced growth path) as an equilibrium path in which $\kappa(t)$, the share of capital used in the investment sector, is constant and equal to, say, $\kappa$. Moreover, let us choose the consumption good as the numeraire, so that $p_C(t) = 1$ for all $t$. Then differentiating (11.19) implies that at the steady state:

$$\frac{\dot{p}_I(t)}{p_I(t)} = -(1 - \alpha) g_K, \tag{11.20}$$

where $g_K$ is the steady-state (BGP) growth rate of capital. As noted above, the Euler equation for consumers, (11.4), still holds, but the relevant interest rate has to be for consumption-denominated loans, denoted by $r_C(t)$. In other words, it is the interest rate that measures how many units of consumption good an individual will receive tomorrow by giving up one unit of consumption today. Since the relative price of consumption goods and investment goods is changing over time, the proper calculation goes as follows. By giving up one unit of consumption, the individual will buy $1/p_I(t)$ units of capital goods. This will have an instantaneous return of $r_I(t)$. In addition, the individual will get back the one unit of capital, which has now experienced a change in its price of $\dot{p}_I(t)/p_I(t)$, and finally, he will have to buy consumption goods, whose prices changed by $\dot{p}_C(t)/p_C(t)$. Therefore, the general formula of the rate of return denominated in consumption goods in terms of the rate of return denominated in investment goods is

$$r_C(t) = \frac{r_I(t)}{p_I(t)} + \frac{\dot{p}_I(t)}{p_I(t)} - \frac{\dot{p}_C(t)}{p_C(t)}.$$

In our setting, given our choice of numeraire, we have $\dot{p}_C(t)/p_C(t) = 0$. Moreover, $\dot{p}_I(t)/p_I(t)$ is given by (11.20). Finally,

$$\frac{r_I(t)}{p_I(t)} = A - \delta$$

given the linear technology in (11.18). Therefore, we have

$$r_C(t) = A - \delta + \frac{\dot{p}_I(t)}{p_I(t)},$$

and in steady state, from (11.20), the steady-state consumption-denominated rate of return is:

$$r_C = A - \delta - (1 - \alpha) g_K.$$

From (11.4), this implies a consumption growth rate of

$$g_C \equiv \frac{\dot{C}(t)}{C(t)} = \frac{1}{\theta}\left(A - \delta - (1 - \alpha) g_K - \rho\right). \tag{11.21}$$

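Finally, differentiate (11.17) and use the fact that labor is always constant to obtain

$$\frac{\dot{C}(t)}{C(t)} = \alpha \frac{\dot{K}_C(t)}{K_C(t)},$$

which, from the constancy of $\kappa(t)$ in steady state, implies the following steady-state relationship: $g_C = \alpha g_K$. Substituting this into (11.21), we have

$$g_K = \frac{A - \delta - \rho}{1 - \alpha(1 - \theta)} \tag{11.22}$$

and

$$g_C = \alpha \frac{A - \delta - \rho}{1 - \alpha(1 - \theta)}. \tag{11.23}$$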
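The steady-state growth rates in (11.22) and (11.23) can be checked against the Euler equation (11.21). The sketch below does so for one hypothetical parameterization; the function name and values are illustrative only.

```python
def two_sector_growth(A, delta, rho, theta, alpha):
    """Steady-state growth rates of capital (11.22) and consumption (11.23)."""
    g_K = (A - delta - rho) / (1.0 - alpha * (1.0 - theta))
    g_C = alpha * g_K
    return g_K, g_C

# Hypothetical parameter values, for illustration only.
A, delta, rho, theta, alpha = 0.10, 0.03, 0.02, 2.0, 1.0 / 3.0
g_K, g_C = two_sector_growth(A, delta, rho, theta, alpha)

# Consumption growth implied by the Euler equation (11.21) at this g_K.
r_C = A - delta - (1.0 - alpha) * g_K      # consumption-denominated return
g_C_euler = (r_C - rho) / theta

print(g_K, g_C, g_C_euler)
```

With $\alpha < 1$, capital grows faster than consumption, while the falling relative price of investment goods lowers the consumption-denominated return below $A - \delta$.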

What about wages? Now since labor is being used in the consumption good sector, there will be positive wages. Since labor markets are competitive, the wage rate at time $t$ is given by

$$w(t) = (1 - \alpha) p_C(t) B \left(\frac{(1 - \kappa(t)) K(t)}{L}\right)^{\alpha}.$$

Therefore, in the balanced growth path, we obtain

$$\frac{\dot{w}(t)}{w(t)} = \frac{\dot{p}_C(t)}{p_C(t)} + \alpha \frac{\dot{K}(t)}{K(t)} = \alpha g_K,$$

which implies that wages also grow at the same rate as consumption. Moreover, with exactly the same arguments as in the previous section, it can be established that there are no transitional dynamics in this economy. This establishes the following result:

Proposition 31 In the above-described extended AK economy, starting from any $K(0) > 0$, consumption and labor income grow at the constant rate given by (11.23), while the capital stock grows at the constant rate (11.22).

It is straightforward to conduct policy analysis in this model, and as in the basic AK model, taxes on investment income will depress growth. Similarly, a lower discount rate will increase the equilibrium growth rate of the economy.

One important implication of this model, different from the neoclassical growth model, is that there is continuous capital deepening. Capital grows at a faster rate than consumption and output. Whether this is a realistic feature or not is debatable. The Kaldor facts, discussed above, include a constant capital-output ratio as one of the requirements of balanced growth. Here we have steady state and balanced growth without this feature. For much of the 20th century, the capital-output ratio has been constant, but it has been increasing steadily over the past 30 years. Part of the reason why it has been increasing recently but not before is relative price adjustments. New capital goods are of higher quality, and this needs to be incorporated in calculating the capital-output ratio. These calculations have only been performed in the recent past, which may explain why the capital-output ratio has been constant in the earlier part of the century, but not recently.

11.3

The model that started much of the interest in endogenous growth is Romer (1986). Romer wanted to explicitly model the process of knowledge accumulation, but realized that this would be difficult in the context of a competitive economy. His initial solution (later updated and improved in his and others' work during the 1990s) was to consider knowledge as a byproduct of production that accumulates by itself. I now present this model.

11.3.1

Consider an economy without any population growth (we will see why this is important) and a production function with labor-augmenting knowledge (technology) that satises the standard assumptions, Assumptions 1 and 2. For reasons that will become clear, instead of working with the aggregate production function, let us look at the production function 256

facing each one of the many infinitesimal final good producers (each indexed by $i$):
$$Y_i(t) = F(K_i(t), A(t)L_i(t)), \tag{11.24}$$

where $K_i(t)$ and $L_i(t)$ are capital and labor rented by firm $i$. Notice that $A(t)$ is not indexed by $i$, since it is technology common to all firms. Let us normalize the measure of final good producers to 1, so that we have the following market clearing conditions:
$$\int_0^1 K_i(t)\,di = K(t)$$
and
$$\int_0^1 L_i(t)\,di = L,$$

where $L$ is the constant level of labor (supplied inelastically) in this economy. Firms are competitive in all markets, which implies that they will all hire the same capital to effective labor ratio, and moreover, factor prices will be given by their marginal products, thus
$$w(t) = \frac{\partial F(K(t), A(t)L)}{\partial L}, \qquad R(t) = \frac{\partial F(K(t), A(t)L)}{\partial K(t)}.$$
The key assumption of Romer (1986) is that although firms take $A(t)$ as given, this stock of technology (knowledge) advances endogenously for the economy as a whole. In particular, Romer assumes that this takes place because of spillovers across firms, and attributes spillovers to physical capital. Lucas (1988) develops a similar model in which the structure is identical, but spillovers work through human capital (i.e., while Romer has physical capital externalities, Lucas has human capital externalities). The idea of externalities is not uncommon to economists, but both Romer and Lucas make an extreme assumption of sufficiently strong externalities such that $A(t)$ can grow

continuously at the economy level. In particular, Romer assumes
$$A(t) = BK(t), \tag{11.25}$$

i.e., the knowledge stock of the economy is proportional to the capital stock of the economy. This can be motivated by learning-by-doing, whereby greater investment in certain sectors increases the experience (of firms, workers, managers) in the production process, making the production process itself more productive. Alternatively, the knowledge stock of the economy could be a function of the cumulative output that the economy has produced up to now, thus giving it more of a flavor of learning-by-doing. The reason why the externalities work through capital might be justified along the lines of the structural change model we will discuss below, where it is assumed that the manufacturing sector, which is more capital-intensive, is more important for generating externalities (whether this is so or not is not very clear, and in any case, there is no compelling evidence that such externalities are very large). In any case, substituting (11.25) into (11.24) and using the fact that all firms are functioning at the same capital-effective labor ratio, we obtain the production function of the representative firm as
$$Y(t) = F(K(t), BK(t)L).$$

Using the fact that $F(\cdot,\cdot)$ is homogeneous of degree one, we have
$$\frac{Y(t)}{K(t)} = F(1, BL) \equiv \tilde{f}(L).$$

Alternatively, output per capita can be written as:
$$y(t) \equiv \frac{Y(t)}{L} = \frac{Y(t)}{K(t)}\,\frac{K(t)}{L} = k(t)\tilde{f}(L),$$

where again $k(t) \equiv K(t)/L$ is the capital-labor ratio in the economy. As in the standard growth model, marginal products and factor prices can be expressed in terms of the normalized production function, now $\tilde{f}(L)$. In particular, we have
$$w(t) = K(t)\tilde{f}'(L) \tag{11.26}$$
and
$$R(t) = R = \tilde{f}(L) - L\tilde{f}'(L), \tag{11.27}$$
which is constant.
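As a worked check on the two factor prices, the following sketch derives them from $F$ directly. The notation $F_1$, $F_2$ for the partial derivatives of $F$ with respect to its first and second arguments is introduced here for illustration and is not the notes' own; the derivation assumes only that $F$ is homogeneous of degree one, that firms take $A(t)$ as given, and that $A(t) = BK(t)$ in equilibrium.

```latex
\begin{align*}
\tilde{f}(L) &\equiv F(1, BL), \qquad \tilde{f}'(L) = B\,F_2(1, BL), \\
w(t) &= \frac{\partial F(K(t), A(t)L)}{\partial L}
      = A(t)\,F_2(K(t), A(t)L)
      = BK(t)\,F_2(1, BL)
      = K(t)\,\tilde{f}'(L), \\
R(t) &= F_1(K(t), A(t)L) = F_1(1, BL)
      = F(1, BL) - BL\,F_2(1, BL)
      = \tilde{f}(L) - L\,\tilde{f}'(L).
\end{align*}
```

The second line uses the fact that $F_2$ is homogeneous of degree zero, and the third uses Euler's theorem, $F(1, BL) = F_1(1, BL) + BL\,F_2(1, BL)$.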

11.3.2

Equilibrium

An equilibrium is defined similarly to the neoclassical growth model, as a path of consumption and capital stock for the economy, $[C(t), K(t)]_{t=0}^{\infty}$, that maximizes the utility of the representative household, and wage and rental rates $[w(t), R(t)]_{t=0}^{\infty}$ that clear markets. The important feature is that because the knowledge spillovers, as specified in (11.25), are external to the firm, factor prices are given by (11.26) and (11.27); that is, they do not price the role of the capital stock in increasing future productivity. Since the market rate of return is $r(t) = R(t) - \delta$, it is also constant. This immediately implies that consumption in this economy, given by the usual Euler equation, grows at the constant rate
$$g_C = \frac{\dot{C}(t)}{C(t)} = \frac{1}{\theta}\left(\tilde{f}(L) - L\tilde{f}'(L) - \delta - \rho\right). \tag{11.28}$$
It is also clear that capital grows at exactly the same rate as consumption, so the growth rates of capital, output and consumption are all given by $g_C$ in (11.28). Let us assume that
$$\tilde{f}(L) - L\tilde{f}'(L) - \delta - \rho > 0, \tag{11.29}$$
so that there is positive growth, but also that growth is not fast enough to violate the transversality condition, in particular,
$$(1-\theta)\left(\tilde{f}(L) - L\tilde{f}'(L) - \delta\right) < \rho. \tag{11.30}$$

It is also straightforward to verify that, as in the AK model above, there are no transitional dynamics in this model. This establishes:
Proposition 32 In the above-described Romer model with physical capital externalities, as long as conditions (11.29) and (11.30) are satisfied, there exists a unique equilibrium path where, starting with any level of capital stock $K(0) > 0$, capital, output and consumption grow at the constant rate (11.28).
You can also see now why population was assumed constant in this model. To do this, first note that there is a scale effect here: since $\tilde{f}(L) - L\tilde{f}'(L)$ is always increasing in $L$ (by Assumption 1), the growth rate of the economy will be higher when population (labor force) $L$ is higher. Moreover, if population is growing constantly, the economy will not admit a steady state and the growth rate of the economy will increase over time (output reaching infinity in finite time and violating the transversality condition).
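The scale effect can be illustrated numerically. The sketch below is not part of the original notes: it assumes a Cobb-Douglas form $F(K, AL) = K^{a}(AL)^{1-a}$, so that $\tilde{f}(L) = (BL)^{1-a}$ and the private return is $\tilde{f}(L) - L\tilde{f}'(L) = a(BL)^{1-a}$. It also assumes that the growth rate takes the standard form $g_C = (R - \delta - \rho)/\theta$ with CRRA parameter $\theta$, discount rate $\rho$ and depreciation rate $\delta$. All function names and parameter values are hypothetical.

```python
def private_return(L, B, a):
    """Private return R = f~(L) - L f~'(L) under the assumed Cobb-Douglas form.

    With F(K, AL) = K^a (AL)^(1-a) and A = B K, f~(L) = (B L)^(1-a),
    so f~(L) - L f~'(L) = a (B L)^(1-a).
    """
    return a * (B * L) ** (1.0 - a)


def equilibrium_growth(L, B, a, theta, rho, delta):
    """Decentralized growth rate g_C = (R - delta - rho) / theta."""
    return (private_return(L, B, a) - delta - rho) / theta


def conditions_hold(L, B, a, theta, rho, delta):
    """Check positive growth and the transversality condition."""
    R = private_return(L, B, a)
    positive_growth = R - delta - rho > 0
    transversality = (1.0 - theta) * (R - delta) < rho
    return positive_growth and transversality


# Illustrative (hypothetical) parameter values.
params = dict(B=0.5, a=1.0 / 3.0, theta=2.0, rho=0.02, delta=0.05)

if __name__ == "__main__":
    for L in (1.0, 2.0, 4.0):
        g = equilibrium_growth(L, **params)
        ok = conditions_hold(L, **params)
        print(f"L = {L:.1f}: g_C = {g:.4f}, conditions hold: {ok}")
```

Under these assumptions a larger labor force raises the private return to capital and therefore the growth rate, which is why a growing population would rule out a constant growth path.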

11.3.3

Given the presence of externalities, it is not surprising that the decentralized equilibrium characterized in Proposition 32 is not Pareto optimal. To characterize the allocation that maximizes the utility of the representative household, let us again set up the current-value Hamiltonian, noting that the per capita accumulation equation for this economy can be written as
$$\dot{k} = \tilde{f}(L)k - c - \delta k.$$
The current-value Hamiltonian is
$$\hat{H}(k, c, \mu) = \frac{c^{1-\theta} - 1}{1-\theta} + \mu\left[\tilde{f}(L)k - c - \delta k\right],$$
and has the necessary conditions:
$$\hat{H}_c(k, c, \mu) = c^{-\theta} - \mu = 0,$$
$$\hat{H}_k(k, c, \mu) = \mu\left[\tilde{f}(L) - \delta\right] = -\dot{\mu} + \rho\mu.$$
These equations imply that the social planner's allocation will also have a constant growth rate for consumption (and output) given by
$$g_C^S = \frac{1}{\theta}\left(\tilde{f}(L) - \delta - \rho\right),$$
which is always greater than $g_C$ as given by (11.28), since $\tilde{f}(L) > \tilde{f}(L) - L\tilde{f}'(L)$. Essentially, the social planner takes into account that by accumulating more capital, she is improving productivity in the future. Since this effect is external to the firms, the decentralized economy fails to internalize this externality. Therefore we have:

Proposition 33 In the above-described Romer model with physical capital externalities, the decentralized equilibrium is Pareto suboptimal and grows at a slower rate than the allocation that would maximize the utility of the representative household.
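The gap between the two growth rates can be made concrete with a small sketch. It is an illustration under assumptions that are not in the notes: a Cobb-Douglas form $F(K, AL) = K^{a}(AL)^{1-a}$, so that $\tilde{f}(L) = (BL)^{1-a}$, a decentralized growth rate $(\tilde{f}(L) - L\tilde{f}'(L) - \delta - \rho)/\theta$ and a planner growth rate $(\tilde{f}(L) - \delta - \rho)/\theta$. Function names and parameter values are hypothetical.

```python
def f_tilde(L, B, a):
    """Normalized production function f~(L) = F(1, B L) = (B L)^(1-a)."""
    return (B * L) ** (1.0 - a)


def decentralized_growth(L, B, a, theta, rho, delta):
    """Growth rate when firms ignore the spillover: the private return is a * f~(L)."""
    return (a * f_tilde(L, B, a) - delta - rho) / theta


def planner_growth(L, B, a, theta, rho, delta):
    """Growth rate when the planner internalizes the spillover: the social return is f~(L)."""
    return (f_tilde(L, B, a) - delta - rho) / theta


def growth_gap(L, B, a, theta, rho, delta):
    """Planner minus decentralized growth, equal to L f~'(L) / theta."""
    return planner_growth(L, B, a, theta, rho, delta) - decentralized_growth(
        L, B, a, theta, rho, delta
    )


# Illustrative (hypothetical) parameter values.
example = dict(L=1.0, B=0.5, a=1.0 / 3.0, theta=2.0, rho=0.02, delta=0.05)

if __name__ == "__main__":
    print("decentralized:", round(decentralized_growth(**example), 4))
    print("planner:      ", round(planner_growth(**example), 4))
```

In this parameterization the gap equals $(1-a)(BL)^{1-a}/\theta$, the part of the return to capital that works through the knowledge spillover and that private factor prices do not reward.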


The models discussed so far generated sustained economic growth, which is important both for understanding why some countries are much richer today than others, and for understanding the historical process of economic growth leading to the modern world. However, the process of economic development is not simply a linear sustained growth process. The process of development, as emphasized by Simon Kuznets, is also one of the transformation of the economy. Agriculture becomes less important, manufacturing becomes more important (and then later services become more important). Urbanization increases. Simultaneously, there is a process of coordination, or perhaps cumulative causation (where an economic process becomes self-sustaining once underway) going on, in which the increase in demand for certain goods and services (especially coming from cities) fuels further growth. Many economic and social institutions also change in the process. To do justice to these topics, we need to delve much deeper into issues of development economics and political economy, which are

beyond the scope of the current course. But, we can start getting a sense of these processes by quickly looking at models of economic development emphasizing multiple equilibria, or multiple steady states, and also looking at a very simple version of a model of structural change incorporating the features emphasized by Simon Kuznets.

12.1

Let us start with a very simple model of multiple equilibria arising from aggregate demand externalities. Below, in discussing models of endogenous technological change, monopolistic competition will play a crucial role, since firms that discover new machines will become the monopolistic suppliers of these machines or of goods produced with these machines. However, the focus there will not be on multiple equilibria. Here we start with a simple two-period model of an economy with monopolistic competition, which will lead to multiple equilibria. The model is a version of Murphy, Shleifer and Vishny's (1989) "Big Push" paper. As the name of the paper suggests, the idea is to think of the development process as a move from one equilibrium to another, likely due to a coordinated move, a "big push".

12.1.1

Consider the following two-period economy. All agents have preferences given by
$$U = \frac{C_1^{1-\theta} - 1}{1-\theta} + \beta\,\frac{C_2^{1-\theta} - 1}{1-\theta},$$

where $C_1$ and $C_2$ denote consumption at the two dates. $\theta$ plays a similar role to before, with $1/\theta$ being the intertemporal elasticity of substitution, regulating how willing individuals are

to substitute consumption between date 1 and date 2, and $\beta$ is the discount factor of the households. The resource constraints for the economy are
$$C_1 + I_1 \le Y_1, \qquad C_2 \le Y_2,$$
where $I_1$ denotes investment in the first date, $Y_t$ is total output at date $t$, and investment is only possible in the first date. Individuals can borrow and lend, so an individual's budget constraint is
$$c_1 + \frac{c_2}{R} \le w_1 + \pi_1 + \frac{w_2 + \pi_2}{R},$$

where $\pi_t$ denotes the profits accruing to the representative consumer, and $w_t$ is the wage rate at time $t$. $R$ is the gross interest rate. Although individuals can borrow and lend, in the aggregate, the resource constraints have to hold, so $R$ will be determined in equilibrium to ensure this. The new feature in this model is that output is an aggregate of intermediates. In particular, there is a continuum of differentiated intermediate goods, with their total measure normalized to 1, and the aggregate production at time $t$ is given by:
$$Y_t = \left(\int_0^1 y_t(i)^{\frac{\varepsilon-1}{\varepsilon}}\,di\right)^{\frac{\varepsilon}{\varepsilon-1}},$$

where $y_t(i)$ is the output level of intermediate $i$ at date $t$. This production function has the standard love-for-variety feature first introduced by Dixit and Stiglitz. This functional form can be used either for aggregating intermediates or directly as a utility function. Its advantage is that it provides an extremely tractable model of substitution between different

goods, both with competition and monopolistic competition, because the elasticity of demand for each good is constant. We will make extensive use of these preferences in the rest of the course. For now, note that $\varepsilon$ is the elasticity of substitution between intermediate goods within a given period, and is assumed to be strictly greater than one, i.e., $\varepsilon > 1$. The economy has a total labor supply of $L$, supplied inelastically. The production function of each good is as follows:
$$y_1(i) = l_1(i)$$
and
$$y_2(i) = \begin{cases} l_2(i) & \text{with old technology} \\ \alpha\, l_2(i) & \text{with new technology} \end{cases} \tag{12.1}$$
where $\alpha > 1$ and $l_t(i)$ denotes labor devoted to the production of intermediate good $i$ at time $t$. Labor market clearing, naturally, requires
$$\int_0^1 l_t(i)\,di \le L. \tag{12.2}$$

At date 1, there is a designated producer for each intermediate, but a competitive fringe can also enter and produce each good as productively as the designated producer. At date 1, the designated producer can also invest in the new technology, which costs $F$ per firm. If this investment is undertaken, this producer's productivity at date 2 will be higher by a factor $\alpha$, as indicated by equation (12.1). In contrast, the fringe will not benefit from this technological improvement, thus the designated producer will have some degree of monopoly power. All firms are assumed to be owned equally by all the consumers. They will maximize profits taking the market prices (especially the market interest rate) as given.

12.1.2

Equilibrium

Since this is a two-period economy, we will be looking for a subgame perfect equilibrium. Moreover, to simplify the discussion, let us focus on symmetric subgame perfect equilibria, SSPE. An SSPE consists of an allocation of labor across firms, investment decisions for firms, wages for both periods and an interest rate linking consumption between the two periods. First, since all goods are symmetric, the first period labor market clearing is straightforward and we will have $l_1(i) = L$ for all $i \in [0, 1]$ (recall that the measure of sectors and firms is normalized to 1). This implies that $Y_1 = L$. At date 2, the equilibrium will depend on how many firms have adopted the new technology. Since we are looking at the symmetric equilibrium (SSPE), we only consider the two extremes where all firms adopt and no firm adopts. In either case, the marginal productivity of all sectors is again the same, so labor will be allocated equally, i.e., $l_2(i) = L$ for all $i \in [0, 1]$. Consequently, when the technology is not adopted, we have $Y_2 = L$, and when the technology is adopted by all the firms, we have $Y_2 = \alpha L$. We now turn to the pricing decisions. In the first date, the designated producers have no monopoly power because of the competitive fringe, thus they charge price equal to marginal

cost, which is $w_1$, and make zero profits. Since total output is equal to $Y_1 = L$, this also implies that the equilibrium wage rate is equal to $w_1 = 1$. In the second date, if the technology is not adopted, the same situation repeats, and we have $w_2 = 1$ and no profits. In this case there is also no investment, so consumption at both dates is equal to $L$, thus the interest rate that makes individuals happy to consume this amount in both periods is
$$\hat{R} = \beta^{-1}. \tag{12.3}$$
To see this more formally, recall that the standard Euler equation in this case is
$$C_1^{-\theta} = \beta R\, C_2^{-\theta}, \tag{12.4}$$
which can only be satisfied with $C_1 = C_2$ if the gross interest rate is $\hat{R}$ as given in (12.3). Next consider the situation in which the designated producers have invested in the advanced technology. Now they can produce $\alpha$ units of output with one unit of labor, while the fringe of competitive firms still produces one unit of output with one unit of labor. This implies that the designated producers have some monopoly power. The extent of this monopoly power depends on the comparison of $\alpha$ and $\varepsilon$. Let us first find the demand facing each producer, which is given as a solution to the following program of profit maximization for the final goods sector:
$$\max_{[y_2(i)]_{i\in[0,1]}} \left(\int_0^1 y_2(i)^{\frac{\varepsilon-1}{\varepsilon}}\,di\right)^{\frac{\varepsilon}{\varepsilon-1}} - \int_0^1 p_2(i)\,y_2(i)\,di,$$

where $p_2(i)$ is the price of intermediate $i$ at date 2. The first-order condition to this program implies
$$y_2(i)^{-1/\varepsilon}\,Y_2^{1/\varepsilon} = p_2(i),$$
or
$$y_2(i) = (p_2(i))^{-\varepsilon}\,Y_2. \tag{12.5}$$

This expression is useful in laying the foundations for the aggregate demand externalities, which we will discuss soon; the demand for good $i$ depends on the total amount of production, $Y_2$. [However, you should ask yourself why this actually causes an externality; even with perfectly competitive markets, the demand for my goods may depend on the supply of other goods in the economy. So why is there an externality here?] A nice feature of the demand curve implied by equation (12.5) is that it is iso-elastic (i.e., the demand elasticity is constant). This will be a very convenient feature in many of the models using this class of utility or production functions below. To make more progress, first imagine the situation in which there is no fringe of competitive producers. In that case, each designated producer will act as an unconstrained monopolist and maximize its profits given by price minus marginal cost times quantity, i.e.,
$$\pi_2(i) = \left(p_2(i) - \frac{w_2}{\alpha}\right) y_2(i).$$
Substituting from (12.5), the firm maximization problem is
$$\max_{p_2(i)} \pi_2(i) = \left(p_2(i) - \frac{w_2}{\alpha}\right)(p_2(i))^{-\varepsilon}\,Y_2,$$
which has a first-order condition
$$(p_2(i))^{-\varepsilon}\,Y_2 - \varepsilon\left(p_2(i) - \frac{w_2}{\alpha}\right)(p_2(i))^{-\varepsilon-1}\,Y_2 = 0, \quad\text{or}\quad p_2(i) = \frac{\varepsilon}{\varepsilon-1}\,\frac{w_2}{\alpha}.$$

This is the standard monopoly price formula of a markup related to demand elasticity over the marginal cost, $w_2/\alpha$. Here the markup is constant because the demand elasticity is constant. However, the monopolist can only charge this price if the competitive fringe could not enter and make profits by stealing the entire market at this price. Since the competitive fringe can produce one unit using one unit of labor, the monopolist can only charge this price if
$$\frac{\varepsilon}{\varepsilon-1}\,\frac{1}{\alpha} \le 1.$$
Otherwise, the price would be too high and the competitive fringe would enter. Let us assume that $\alpha$ is not so high as to make the monopolist unconstrained. In other words, let us impose
Assumption 13
$$\frac{\varepsilon}{\varepsilon-1}\,\frac{1}{\alpha} > 1.$$
Under this assumption, the monopolist will be forced to charge a limit price. It is straightforward to see that this equilibrium limit price would be $p_2 = w_2$. If it were any higher, the competitive fringe would enter, steal the whole market and make positive profits. If it were any lower, the monopolist could increase its price without losing the market, and thus increase its profits. This implies that under Assumption 13, each monopolist would make per unit profits equal to
$$w_2 - \frac{w_2}{\alpha} = \frac{\alpha-1}{\alpha}\,w_2.$$

The profits of firms are then obtained from substituting from (12.5) as:
$$\pi_2 = \frac{\alpha-1}{\alpha}\,w_2\,Y_2. \tag{12.6}$$
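The pricing logic can be summarized in a short sketch. It is an illustration, not the notes' own formulation: it writes the productivity gain of the new technology as `alpha` and the elasticity of substitution as `eps` (both symbols are assumptions about the notation), takes the unconstrained monopoly price to be `eps / (eps - 1) * w2 / alpha`, and caps it at the fringe's marginal cost `w2`.

```python
def date2_price(w2, alpha, eps):
    """Price charged at date 2 by a designated producer that has invested.

    The unconstrained monopoly price is a markup eps / (eps - 1) over the
    marginal cost w2 / alpha. The competitive fringe, with marginal cost w2,
    caps the price at w2 (the limit price).
    """
    if eps <= 1.0 or alpha <= 1.0:
        raise ValueError("this sketch assumes eps > 1 and alpha > 1")
    monopoly_price = eps / (eps - 1.0) * w2 / alpha
    limit_price = w2
    return min(monopoly_price, limit_price)


def per_unit_profit(w2, alpha, eps):
    """Price minus marginal cost for the investing producer."""
    return date2_price(w2, alpha, eps) - w2 / alpha


if __name__ == "__main__":
    # Constrained case: the markup would push the price above the fringe's cost.
    print(date2_price(w2=1.0, alpha=1.5, eps=2.0))
    # Unconstrained case: productivity is high enough to undercut the fringe.
    print(date2_price(w2=1.0, alpha=3.0, eps=2.0))
```

In the constrained case the per unit profit is `(alpha - 1) / alpha * w2`, which is the margin that enters the profit expression (12.6).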

The wage rate can be determined from income accounting. Total production will be equal to $Y_2 = \alpha L$, and this has to be distributed between profits and wages, thus
$$\frac{\alpha-1}{\alpha}\,w_2\,\alpha L + w_2 L = \alpha L,$$
which has an equilibrium at $w_2 = 1$, as in the case without the technological investments. Therefore, in this economy the increased marginal product does not translate into higher wages. Instead, it leads to profits for firms. Nevertheless, all of these profits are redistributed to the agents, who are the owners of the firms. Thus $C_2 = \alpha L$. However, because there was investment in the new technology at date 1, $C_1 = L - F$. Again the interest rate has to adjust so that individuals are happy to consume these amounts, i.e., so that they have a steep consumption profile without wanting to borrow. The Euler equation, (12.4), now implies
$$(L - F)^{-\theta} = \beta\tilde{R}\,(\alpha L)^{-\theta},$$
which solves for
$$\tilde{R} = \beta^{-1}\left(\frac{\alpha L}{L - F}\right)^{\theta} > \hat{R}. \tag{12.7}$$

Consequently, the interest rate in this case is higher than the one in which there is no investment. This is natural, since investment implies that individuals are being asked to forgo date 1 consumption for date 2 consumption. Note also that the greater is $\theta$, the higher is $\tilde{R}$, since with a greater $\theta$, there is less intertemporal substitution. Also a higher $F$, meaning a greater consumption sacrifice at date 1, implies a higher interest rate. The question is whether firms will find it profitable to undertake the investment at date 1. The reason for the possibility of multiplicity is that the answer to this question will depend on whether other firms are undertaking the investment or not. Let us first take a situation in which no other firm is undertaking the investment, and consider the incentives of a single designated firm to undertake such an investment. In this case total output at date 2 is equal to $L$ (since the firm considering investment is infinitesimal), and the market interest rate is given by $\hat{R}$. Moreover, from (12.6) and the fact that $w_2 = 1$, profits at date 2 are
$$\pi_2^N = \frac{\alpha-1}{\alpha}\,L,$$

where the superscript $N$ denotes that no other firm is undertaking the investment. Therefore, the net discounted profits at date 1 for the firm in question are
$$\Delta\pi^N = -F + \frac{1}{\hat{R}}\,\frac{\alpha-1}{\alpha}\,L = -F + \beta\,\frac{\alpha-1}{\alpha}\,L.$$

Next consider the case in which all other firms are undertaking the investment. In this case, profits at date 2 are
$$\pi_2^I = (\alpha - 1)L,$$
where the superscript $I$ designates that all other firms are undertaking the investment. Consequently, the profit gain from investing at date 1 is
$$\Delta\pi^I = -F + \frac{1}{\tilde{R}}(\alpha-1)L = -F + \beta\left(\frac{\alpha L}{L-F}\right)^{-\theta}(\alpha-1)L.$$

As discussed above, the idea of the paper by Murphy, Shleifer and Vishny (1989), similar to the ideas of many economists writing on development before them, was to generate multiple equilibria, where one of the equilibria corresponds to backwardness, while the other one corresponds to industrialization. In this context, this means that for the same parameter values both no investment in the new technology and all firms investing in the new technology should be equilibria. This is only possible if we have
$$\Delta\pi^N < 0 \quad\text{and}\quad \Delta\pi^I > 0, \tag{12.8}$$

that is, when nobody else invests, investment is not profitable, and when all other firms invest, investment is profitable. This is clearly possible because of the aggregate demand externality, the fact that $\pi_2^I > \pi_2^N$; when other firms invest, they produce more, there is more aggregate demand, and therefore profits from having invested in the new technology are higher. Counteracting this effect is the fact that the interest rate is also higher when all firms invest. Therefore, the existence of multiple equilibria requires the interest rate effect not to be too strong. For example, in the extreme case where preferences are linear, i.e., $\theta = 0$, we have that
$$\Delta\pi^I = -F + \beta(\alpha-1)L > \Delta\pi^N = -F + \beta\,\frac{\alpha-1}{\alpha}\,L,$$

so (12.8) is certainly possible. More generally, the condition for the existence of multiple equilibria is that:
$$\beta\left(\frac{\alpha L}{L-F}\right)^{-\theta}(\alpha-1)L > F > \beta\,\frac{\alpha-1}{\alpha}\,L. \tag{12.9}$$
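Whether a given parameterization admits both equilibria can be checked with a few lines of code. The sketch below is illustrative only. It assumes the notation `alpha` for the productivity gain, `beta` for the discount factor and `theta` for the inverse intertemporal elasticity of substitution, and it assumes the gross interest rates `1 / beta` without investment and `(1 / beta) * (alpha * L / (L - F)) ** theta` with investment. Function names and numbers are hypothetical.

```python
def profit_gains(L, F, alpha, beta, theta):
    """Net discounted gains from investing when no one else invests (d_N)
    and when everyone else invests (d_I)."""
    if not 0.0 < F < L:
        raise ValueError("this sketch assumes 0 < F < L")
    R_hat = 1.0 / beta                                         # no-investment interest rate
    R_tilde = (1.0 / beta) * (alpha * L / (L - F)) ** theta    # all-invest interest rate
    d_N = -F + (1.0 / R_hat) * (alpha - 1.0) / alpha * L
    d_I = -F + (1.0 / R_tilde) * (alpha - 1.0) * L
    return d_N, d_I


def multiple_equilibria(L, F, alpha, beta, theta):
    """True when not investing is a best response to no one investing
    and investing is a best response to everyone investing."""
    d_N, d_I = profit_gains(L, F, alpha, beta, theta)
    return d_N < 0.0 < d_I


if __name__ == "__main__":
    base = dict(L=1.0, F=0.35, alpha=1.5, beta=0.9)
    for theta in (0.0, 0.2, 2.0):
        print(theta, profit_gains(theta=theta, **base), multiple_equilibria(theta=theta, **base))
```

With these illustrative numbers, both equilibria exist for low `theta`, but a high `theta` raises the all-invest interest rate enough that investing is no longer profitable even when everyone else invests, which is the interest rate effect described above.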

It is also straightforward to see that whenever both equilibria exist, the equilibrium with investment Pareto dominates the one without investment, since condition (12.9) implies that all households are better off with the upward sloping consumption profile giving them higher consumption at date 2.

This leads to the following result:
Theorem 40 Consider the above-described environment and suppose that Assumption 13 holds and condition (12.9) is satisfied. Then there exist two pure strategy SSPE, one in which all firms undertake the investment at date 1 and the other one in which no firm does. The equilibrium with investment Pareto dominates the equilibrium without investment.
Intuitively, multiple equilibria arise because investing in the new technology at date 1 is profitable only when there is substantial aggregate demand at date 2. In turn, there will be substantial aggregate demand at date 2 when all firms invest in the new technology, so that they are more productive and produce more at date 2. This intuition highlights the importance of aggregate demand linkages. In fact, as noted above, these linkages take the form of aggregate demand externalities. The reason why they take the form of externalities is that the firm does not realize the full increase in the social product created by its investment, because the monopoly markup implies that at the margin, further increases in output create a first-order gain for consumers. The presence of the markup means that the monopolist does not internalize this first-order gain, thus turning the demand linkages into aggregate demand externalities. One interpretation of this result is that societies that can somehow coordinate on the equilibrium with investment (either because private expectations are aligned or because of some type of government action) will industrialize and realize both economic growth and Pareto improvement, and this corresponds to the "big push" ideas suggested by qualitative accounts of the early development process, for example, that provided by economists such as Nurkse or Rosenstein-Rodan. Naturally, the model here is essentially a static one, so it does not allow a literal interpretation of a society being first in the no investment equilibrium

and then changing to the investment equilibrium and thus industrializing. Nevertheless, it is suggestive of such a process. Also, although the model makes it sound as if simple government action, for example, in the form of subsidies to firms, might realize such a big push, in practice government intervention is not easy, partly because it is not clear which sectors need to be subsidized, and perhaps more importantly because government interventions are often captured by interest groups, a topic that brings us to the political economy of development and economic growth.

12.2

The previous section illustrated the potential of development traps because of aggregate demand externalities. Investment by different firms may require coordination, leading to multiple equilibria. Underdevelopment may be thought to correspond to a situation in which the coordination is on the bad equilibrium, and the development process starts with the big push, changing the coordination to the high-investment equilibrium. Similar issues arise, in a more dynamic way, when the economy is subject to credit market problems. Moreover, credit market problems will illustrate how the distribution of income (and the incidence of poverty) in a society might affect economic growth and the process of economic development. I will illustrate these issues in the simplest possible way, looking at the effect of credit market problems on human capital investments.

12.2.1

When credit markets are imperfect, a major determinant of human capital investments will be the distribution of income (as well as the degree of imperfection in the credit markets). I start with a discussion of the simplest case with no borrowing (extreme credit market problems) to illustrate how the distribution of income will matter, and may also self-perpetuate. Consider an economy with a continuum of measure 1 of dynasties. Each individual lives for two periods, childhood and adulthood, and has one offspring in his adulthood. There is consumption only at the end of adulthood. Preferences are given by
$$(1-\delta)\log c_t^i + \delta\log e_{t+1}^i,$$
where $c$ is consumption at the end of the individual's life, and $e$ is the educational spending on the offspring of this individual. The budget constraint is
$$c_t^i + e_{t+1}^i \le w_t^i,$$

where $w$ is the wage income of the individual. There are a number of important features embedded in this utility function:
1. Even though it is very similar to the utility function we worked with in the overlapping generations model, the utility function now refers to the utility that an individual obtains from his consumption and the indirect utility he obtains from leaving something to his offspring. In other words, this utility function features impure altruism (sometimes referred to as warm glow preferences): parents do not care about the utility of their offspring, but simply about what they bequeath to them, here education.

2. It is logarithmic, which, as with the two-period overlapping generations model, will lead to constant savings rates.
The labor market is competitive, and wage income simply depends on human capital:
$$w_t^i = A h_t^i.$$
Human capital of the offspring of individual $i$ of generation $t$ in turn is given by
$$h_{t+1}^i = \begin{cases} (e_t^i)^{\gamma} & \text{if } e_t^i \ge 1 \\ \bar{h} & \text{if } e_t^i < 1, \end{cases}$$

where $\gamma \in (0, 1)$ and $\bar{h} \in (0, 1)$ is some minimum level of human capital that the individual will attain even without any educational spending. Once spending exceeds a certain level (here set equal to 1), the individual starts benefiting from the additional spending and accumulates further human capital (though with diminishing returns since $\gamma < 1$). This equation introduces a crucial feature necessary for models of credit market imperfections to generate multiple equilibria or multiple steady states: a nonconvexity in the technology of human capital accumulation. The budget constraint of individual $i$ of generation $t$ is:
$$c_t^i + e_t^i \le w_t^i.$$

Given this description, the equilibrium is straightforward to characterize. Each individual will choose the spending on education that maximizes its own utility. This immediately implies the following savings rate:

$$e_t^i = \delta w_t^i = \delta A h_t^i. \tag{12.10}$$

This rule has one unappealing feature (not crucial for any of the results), which is that because parents derive utility from educational spending on their children, they will spend on education even when $e_t^i < 1$, in which case educational spending is in fact wasted (it does not translate into higher human capital of the offspring). To obtain stark results let us also assume that
$$\delta A > 1 > \delta A \bar{h}. \tag{12.11}$$

Now, let us look at the dynamics of human capital for a particular dynasty $i$. If at time 0, we have $h_0^i < (\delta A)^{-1}$, then (12.10) implies that $e_0^i < 1$, so the offspring will have $h_1^i = \bar{h}$. Given (12.11), we have $h_1^i = \bar{h} < (\delta A)^{-1}$, and repeating this argument, we have $h_t^i = \bar{h} < (\delta A)^{-1}$ for all $t \ge 1$. Therefore, a dynasty that starts with $h_0^i < (\delta A)^{-1}$ will never reach a human capital level above $\bar{h}$.

Next consider a dynasty with $h_0^i > (\delta A)^{-1}$. Then from (12.10), we have $h_1^i = (\delta A h_0^i)^{\gamma} > 1$, and since $\delta A > 1$ by (12.11), $h_1^i$ again exceeds $(\delta A)^{-1}$. So this dynasty will accumulate human capital and reach the steady state given by $h^* = (\delta A h^*)^{\gamma}$, or
$$h^* = (\delta A)^{\frac{\gamma}{1-\gamma}} > 1$$
(as long as $h_0^i < h^*$; otherwise, the dynasty would have started with too much human capital and would decumulate human capital). The most important result is that this simple model features poverty traps due to the nonconvexities created by the credit market problems. It is interesting to contrast two economies subject to the credit market problems, but with different distributions of income. For example, imagine an economy with two groups starting at income levels $h_1$ and $h_2 > h_1$ such that $(\delta A)^{-1} < h_2$. Now if inequality (poverty) is high so that $h_1 < (\delta A)^{-1}$, a significant fraction of the population will never accumulate

much human capital. In contrast, if inequality is limited so that $h_1 > (\delta A)^{-1}$, all agents will accumulate human capital, eventually reaching $h^*$. An important implication of this model is that the distribution of income and how credit markets work are important for human capital accumulation and the process of economic growth. This model and the next one (with imperfect capital markets) are sometimes interpreted as implying that an unequal distribution of income will lead to lower output (and growth), and the above example with two classes seems to support this conclusion. However, this is not a general result. For example, take the same economy with two classes, now starting with $h_1 < h_2 < (\delta A)^{-1}$. In this case, neither group will accumulate human capital, and redistributing resources away from group 1 to group 2 (thus increasing inequality), so that we push group 2 to $h_2 > (\delta A)^{-1}$, would increase human capital accumulation. This is a general feature: in models with nonconvexities, there are no general results about whether greater inequality is good or bad for accumulation and economic growth; it depends on whether greater inequality pushes more people below or above the critical thresholds.
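The poverty trap can be seen by iterating the dynasty's law of motion. The following sketch is an illustration under assumed notation: `delta` for the share of income spent on education, `gamma` for the curvature of the human capital technology and `h_bar` for the minimum level of human capital. It assumes education spending `e = delta * A * h` and next-generation human capital `e ** gamma` if `e >= 1`, else `h_bar`. Parameter values are hypothetical and chosen so that `delta * A > 1 > delta * A * h_bar`.

```python
def next_human_capital(h, A, delta, gamma, h_bar):
    """One generation of the assumed law of motion for a dynasty's human capital."""
    e = delta * A * h  # education spending implied by the constant savings rate
    return e ** gamma if e >= 1.0 else h_bar


def simulate_dynasty(h0, A, delta, gamma, h_bar, periods):
    """Return the path [h_0, h_1, ..., h_periods] for one dynasty."""
    path = [h0]
    for _ in range(periods):
        path.append(next_human_capital(path[-1], A, delta, gamma, h_bar))
    return path


# Illustrative (hypothetical) parameters: delta * A = 2 and delta * A * h_bar = 0.6.
PARAMS = dict(A=4.0, delta=0.5, gamma=0.5, h_bar=0.3)

if __name__ == "__main__":
    threshold = 1.0 / (PARAMS["delta"] * PARAMS["A"])  # critical level (delta * A)^(-1)
    poor = simulate_dynasty(0.4, periods=60, **PARAMS)
    rich = simulate_dynasty(0.6, periods=60, **PARAMS)
    print("threshold:", threshold)
    print("dynasty starting below the threshold ends at", poor[-1])
    print("dynasty starting above the threshold ends at", rich[-1])
```

Two dynasties that start close together, on either side of the threshold, end up at very different levels: one is stuck at `h_bar`, the other converges to the high steady state `(delta * A) ** (gamma / (1 - gamma))`.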

12.2.2

Now let us allow borrowing in the model above. Each individual still lives for two periods. In his youth, he can either work or acquire education. The utility function of each individual is

$$(1-\delta)\log c_t^i + \delta\log b_t^i,$$

where again c denotes consumption at the end of the life of the individual. The budget constraint is

$$c_t^i + b_t^i \le m,$$

where $m$ is the individual's income. Note that utility of the parent now depends on the monetary bequest to the offspring rather than the level of education expenditure. It will now be the individuals themselves who will use the monetary bequests to invest in education. Also, the logarithmic formulation will once again ensure a constant savings rate equal to $\delta$. Education is a binary outcome, and educated (skilled) workers earn wage $w_s$ while uneducated workers earn $w_u$. The required education expenditure to become skilled is $h$, and workers acquiring education do not earn the unskilled wage, $w_u$, during the first period of their lives. Imperfect capital markets are modeled by assuming that there is some amount of monitoring required for loans to be paid back. This creates a wedge between the borrowing and the lending rates. In particular, assume that there is a linear savings technology open to all agents, which fixes the lending rate at some constant $r$. However, the borrowing rate is $i > r$, because of costs of monitoring necessary to induce agents to pay back the loans. Also assume that
$$w_s - (1+r)h > w_u(2+r), \tag{12.12}$$
which implies that investment in human capital is profitable when financed at the lending rate $r$. Let us now consider an individual with wealth $x$. If $x \ge h$, assumption (12.12) implies that the individual will invest in education. If $x < h$, then whether it is profitable to invest in education or not will depend on the wealth of the individual and the borrowing interest rate, $i$. Let us now write the utility of this agent (with $x < h$) in the two scenarios, and also the

which implies that investment in human capital is profitable when financed at the lending rate $r$. Let us now consider an individual with wealth $x$. If $x \ge h$, assumption (12.12) implies that the individual will invest in education. If $x < h$, then whether or not it is profitable to invest in education will depend on the wealth of the individual and the borrowing interest rate, $i$. Let us now write the utility of this agent (with $x < h$) in the two scenarios, and also the bequest that he will leave to his offspring. These utility levels and bequests are given by

$$U_s(x) = \log\left(w_s + (1+i)(x-h)\right) + D, \qquad b_s(x) = \delta\left(w_s + (1+i)(x-h)\right),$$

when he invests in education, and

$$U_u(x) = \log\left((1+r)(w_u + x) + w_u\right) + D, \qquad b_u(x) = \delta\left((1+r)(w_u + x) + w_u\right),$$

when he chooses not to invest. $D$ is a constant term. Comparing these expressions, we obtain that an individual prefers to invest in education if and only if

$$x \ge f \equiv \frac{(2+r)w_u + (1+i)h - w_s}{i-r}.$$

The dynamics of the system can then be obtained simply by using the bequests of unconstrained, constrained-investing and constrained-non-investing agents. More specifically, the equilibrium correspondence describing equilibrium dynamics is

$$x_{t+1} = \begin{cases} b_n(x_t) = \delta\left(w_s + (1+r)(x_t - h)\right) & \text{if } x_t \ge h \\ b_s(x_t) = \delta\left(w_s + (1+i)(x_t - h)\right) & \text{if } h > x_t \ge f \\ b_u(x_t) = \delta\left((1+r)(w_u + x_t) + w_u\right) & \text{if } x_t < f \end{cases} \tag{12.13}$$

Equilibrium dynamics can now be analyzed diagrammatically by looking at the graph of (12.13). Note an important feature here. The correspondence (12.13) describes the behavior of the wealth of each individual. However, the whole wealth distribution can also be studied from (12.13). This is because dynamics in this economy are Markovian: they are described simply by the Markov process, without any general equilibrium interactions.

Now define $g$ as the intersection of the equilibrium correspondence (12.13) with the 45 degree line, when the equilibrium correspondence is steeper than the 45 degree line. Such an intersection will exist when the borrowing interest rate, $i$, is large enough.

All individuals with $x_t < g$ converge to the wealth level $x_U$, while all those with $x_t > g$ converge to the greater wealth level $x_S$. As in the example without credit markets, there is a poverty trap which attracts agents with low initial wealth. The distribution of income again has a potentially first-order effect on the income level of the economy. If the majority of the individuals start with $x_t < g$, the economy will have low productivity, low human capital and low wealth. It is also clear that financial development should matter for human capital investments. In an economy with better financial institutions, we may expect the wedge between the borrowing rate and the lending rate to be smaller, i.e., $i$ to be smaller given $r$. With a smaller $i$, more agents will escape the poverty trap, and in fact the poverty trap may not exist (there may not be an intersection between (12.13) and the 45 degree line where (12.13) is steeper).
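The poverty trap described above can be illustrated numerically. The following is a minimal sketch that iterates the wealth map (12.13). All parameter values are illustrative assumptions, not taken from the notes, and the bequest share is written `DELTA` on the assumption that it is the weight on bequests in the utility function. The values are chosen so that assumption (12.12) holds and so that the threshold $g$ lies between $f$ and $h$.

```python
# Illustrative parameters (assumptions made for this sketch only).
DELTA = 0.5      # weight on bequests in utility, i.e. the savings rate
R_LEND = 0.1     # lending rate r
I_BORROW = 1.5   # borrowing rate i > r
W_U = 0.3        # unskilled wage
W_S = 4.2        # skilled wage
H = 2.0          # education cost h

# Assumption (12.12): education is profitable at the lending rate.
assert W_S - (1 + R_LEND) * H > W_U * (2 + R_LEND)

# Threshold f: a constrained agent (x < h) invests iff x >= f.
F = ((2 + R_LEND) * W_U + (1 + I_BORROW) * H - W_S) / (I_BORROW - R_LEND)

# Fixed points of the three branches of (12.13).
X_U = DELTA * (2 + R_LEND) * W_U / (1 - DELTA * (1 + R_LEND))        # low steady state
X_S = DELTA * (W_S - (1 + R_LEND) * H) / (1 - DELTA * (1 + R_LEND))  # high steady state
G = DELTA * ((1 + I_BORROW) * H - W_S) / (DELTA * (1 + I_BORROW) - 1)  # unstable threshold


def next_wealth(x):
    """One step of the bequest map (12.13)."""
    if x >= H:   # unconstrained: saves the surplus at the lending rate
        return DELTA * (W_S + (1 + R_LEND) * (x - H))
    if x >= F:   # constrained but investing: borrows h - x at rate i
        return DELTA * (W_S + (1 + I_BORROW) * (x - H))
    # not investing: works unskilled in both periods
    return DELTA * ((1 + R_LEND) * (W_U + x) + W_U)


def long_run_wealth(x0, periods=500):
    """Iterate the map from initial wealth x0."""
    x = x0
    for _ in range(periods):
        x = next_wealth(x)
    return x


if __name__ == "__main__":
    for x0 in (0.2, 1.5, 1.7, 3.0):
        print(f"x0 = {x0:.2f} -> long-run wealth {long_run_wealth(x0):.4f}")
```

With these assumed numbers, dynasties starting below $g$ end up at the low wealth level and those starting above it end up at the high one. Lowering `I_BORROW` toward `R_LEND` makes the middle branch flatter, which is the sense in which financial development can remove the trap.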

12.3

As mentioned above, an important element of the process of economic development, especially starting from the early stages of development, is that of structural change. Almost all societies have started as agricultural economies, and have grown together with a transformation of the economy, with the share of output of manufacturing (and services) increasing. The most standard reason for this is thought to be Engel's law, which is the name given to the feature that the budget share of food declines as individuals become richer. Here I will outline a model by Matsuyama (1992), which incorporates both this feature and the possibility of learning-by-doing as an important factor in economic growth.

12.3.1

Consider the following continuous time economy, consisting of two sectors: manufacturing and agriculture. Both sectors produce using only labor. Population is constant and equal to $L = 1$, and labor is supplied inelastically. Technologies in the two sectors are given by the following diminishing returns production functions

$$X^M(t) = M(t)F(n(t)), \quad F(0) = 0,\ F' > 0,\ F'' < 0, \tag{12.14}$$

$$X^A(t) = AG(1 - n(t)), \quad G(0) = 0,\ G' > 0,\ G'' < 0, \tag{12.15}$$

where $n(t)$ is the fraction of labor employed in manufacturing as of time $t$. This way of writing the two production functions already imposes market clearing in the labor market. Notice that agricultural productivity, $A$, is not indexed by time, hence it is constant.

Manufacturing productivity, $M(t)$, is time-varying. In particular, as in the Romer (1997) model discussed above, $M(t)$ reflects knowledge accumulation taking place as a non-internalized byproduct of production. Moreover, Matsuyama assumes that this knowledge accumulation benefits only from production in the manufacturing sector, for example, because greater production in manufacturing allows learning-by-doing in this sector, increasing future productivity. More specifically, we have:

$$\dot{M}(t) = \delta X^M(t), \tag{12.16}$$

where $\delta > 0$ measures the extent of these learning-by-doing effects. As in the Romer model, learning-by-doing effects are external to individual firms. Consequently, each firm will choose its labor demand in order to equate the value of the marginal product to the wage rate, $w(t)$. Assuming an interior solution, this implies

$$w(t) = AG'(1 - n(t)) \quad \text{and} \quad w(t) = p(t)M(t)F'(n(t)),$$

where $p(t)$ is the relative price of the manufactured good (with the price of the agricultural good normalized to 1 as the numeraire). Therefore, market clearing implies:

$$AG'(1 - n(t)) = p(t)M(t)F'(n(t)). \tag{12.17}$$

The economy admits a representative consumer with preferences given by

$$W = \int_0^\infty \left[\beta\log\left(c^A(t) - \gamma\right) + \log c^M(t)\right]\exp(-\rho t)\,dt, \tag{12.18}$$

with $\beta$, $\gamma$ and $\rho > 0$, and $c^A(t)$ denoting the consumption of the agricultural good and $c^M(t)$ denoting the consumption of the manufacturing good at time $t$. The parameter $\rho$ is the discount rate, and $\beta$ designates the importance of agricultural goods versus manufacturing goods in the utility function. The parameter $\gamma$ is the new one relative to models we have seen so far and represents the subsistence level of food consumption. In particular, imagine that if $c^A(t)$ does not exceed $\gamma$, the individual will obtain negative infinite utility (recall that the log of a negative number is undefined). The presence of $\gamma > 0$ makes preferences non-homothetic and implies that the income elasticity of demand for agricultural goods will be less than unity (while that for manufacturing goods will be greater than unity). This is the simplest way of introducing Engel's law. Let us also assume that

$$AG(1) > \gamma L > 0. \tag{12.19}$$

The first inequality states that the economy's agricultural sector is productive enough to provide the subsistence level of food to all consumers; otherwise individuals would receive negative infinite utility. The budget constraint of consumers in each period is

$$c^A(t) + p(t)c^M(t) \le w(t) + \pi(t),$$

where $\pi(t)$ is the profits per representative household.

12.3.2 Equilibrium

An equilibrium is defined in the standard way as a sequence of consumption levels in the two sectors and allocation of labor between the two sectors at all dates, such that consumers maximize their utility and firms maximize profits given prices, and goods and factor prices are such that all markets clear.

Maximization of (12.18) implies that for each household, and thus for the entire economy, we have

$$c^A(t) = \gamma + \beta p(t)c^M(t). \tag{12.20}$$

Since production has to be equal to consumption, we further have:

$$c^A(t) = X^A(t) = AG(1 - n(t)) \quad \text{and} \quad c^M(t) = X^M(t) = M(t)F(n(t)).$$

Now combining these equations with (12.17) and (12.20) yields

$$\phi(n(t)) = \gamma/A, \tag{12.21}$$

where

$$\phi(n) \equiv G(1-n) - \beta G'(1-n)F(n)/F'(n).$$

Moreover, we have $\phi(0) = G(1)$, $\phi(1) < 0$ and $\phi'(\cdot) < 0$. The function $\phi(n)$ can be interpreted as the excess demand for manufacturing over agriculture. An equilibrium has to satisfy (12.21). From Assumption (12.19) it is clear that the equilibrium condition (12.21) has a unique interior solution in which $n(t) \in (0, 1)$. Since the right-hand side of (12.21) is decreasing in $A$, this solution can be written as a function of agricultural productivity, $A$:

$$n(t) = v(A), \quad \text{with } v'(A) > 0. \tag{12.22}$$

This implies that the employment share of manufacturing is constant over time and positively related to $A$. This is not in line with the patterns we observe in the data, where the manufacturing share of employment also increases early on (and then declines while the share of services increases). However, given the learning-by-doing aspect, it generates another feature which is consistent with the empirical patterns in the data: the share of manufacturing output (and consumption of manufacturing goods) increases relative to those of agriculture. In particular, given the learning-by-doing in equation (12.16), output in manufacturing grows at a constant rate, $\delta F(v(A))$, also positively related to $A$. This is an interesting observation, and shows that the growth rate of output in manufacturing is positively related to productivity in agriculture. This observation is consistent with some historical accounts of the development process, which emphasize how economies with high agricultural productivity were those that were able to make the transition to manufacturing. In those accounts and in this model, the reasoning is simple: manufacturing requires a sufficiently large size of employment to grow rapidly (either for creating aggregate demand externalities or for learning-by-doing), and this can only be achieved if agriculture is productive enough that sufficient food can be produced by a relatively small fraction of the workforce. Since productivity and employment in agriculture are constant, aggregate food consumption and production stay constant at

$$c^A = X^A = AG(1 - v(A)) = \gamma + \beta AG'(1 - v(A))\frac{F(v(A))}{F'(v(A))},$$

which is also increasing in $A$; this implies that higher agricultural productivity also increases agricultural consumption. Therefore, this discussion leads to the following simple result:

Proposition 34 In the above described model, the combination of learning-by-doing and Engel's law generates a unique equilibrium in which the shares of employment of manufacturing and agriculture are constant, and manufacturing output and consumption grow faster than agricultural output and consumption. The growth rate of real consumption of agriculture is zero, while the growth rate of manufacturing output is $\delta F(v(A))$.

Although this proposition shows that the real consumption of agricultural goods is constant (and that of manufacturing goods is increasing), the expenditure on agricultural goods will not remain constant, because relative prices will change in favor of agricultural goods. This is a general phenomenon, in fact independent of Engel's law: sectors that experience slower growth will also experience increases in their relative prices, because their output is becoming scarcer in the economy.
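To make the comparative static in agricultural productivity concrete, here is a minimal numerical sketch of the equilibrium condition (12.21). It assumes the power-function technologies $F(n) = n^a$ and $G(x) = x^b$ with $0 < a, b < 1$, which satisfy (12.14) and (12.15) but are not specified in the notes. The symbols `beta`, `gamma` and `delta` stand for the utility weight on agriculture, the subsistence level and the learning-by-doing parameter, and all numerical values are illustrative assumptions.

```python
def phi(n, beta, a, b):
    """phi(n) = G(1-n) - beta * G'(1-n) * F(n) / F'(n), for F(n)=n**a, G(x)=x**b.

    With these functional forms F(n)/F'(n) = n/a and G'(x) = b * x**(b-1).
    """
    return (1.0 - n) ** b - beta * b * (1.0 - n) ** (b - 1.0) * n / a


def manufacturing_share(A, gamma, beta=1.0, a=0.5, b=0.5, tol=1e-12):
    """Solve phi(n) = gamma / A on (0, 1) by bisection (phi is decreasing)."""
    if not A > gamma > 0:  # assumption (12.19) with G(1) = 1 and L = 1
        raise ValueError("need A * G(1) > gamma > 0")
    target = gamma / A
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid, beta, a, b) > target:
            lo = mid   # excess demand for manufacturing: raise n
        else:
            hi = mid
    return 0.5 * (lo + hi)


def manufacturing_growth(A, gamma, delta=0.05, beta=1.0, a=0.5, b=0.5):
    """Growth rate of M(t): delta * F(v(A))."""
    n = manufacturing_share(A, gamma, beta, a, b)
    return delta * n ** a


if __name__ == "__main__":
    for A in (1.5, 2.0, 4.0):
        n = manufacturing_share(A, gamma=1.0)
        print(f"A = {A}: n = {n:.4f}, growth = {manufacturing_growth(A, 1.0):.4f}")
```

Under these assumptions a higher `A` yields both a larger manufacturing employment share and faster manufacturing growth, in line with the proposition.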


The analysis so far treated each country as a closed island, not interacting with the rest of the countries in the world. This is clearly not the correct way to view the world. In this chapter, we have a first look at some models of interdependences. First, I begin with a model of technology transfer from an exogenously advancing world technology frontier. Then, I discuss a model of technology transfer and trade. Finally, I look at how international trade influences the process of economic growth, creating interdependences across growing countries.

13.1

The Nelson-Phelps model is the simplest model of technology diffusion across countries, and has proved a useful reduced-form model for many applications. In addition to its growth applications, the Nelson-Phelps model also suggests a new role of human capital, different from those emphasized by the Mincer equations we used in order to understand the role of human capital in contributing to cross-country income differences. Imagine that there is a world technology frontier, $T(t)$, advancing at an exogenous rate $g$, i.e.,

$$T(t) = T(0)\exp(gt).$$

Countries can benefit from this world technology by incorporating it into their production processes. But this is a human capital-intensive task. For example, a country needs highly skilled engineers to adapt world technologies to its conditions, to fill key positions in the implementation of these technologies and to train workers in the use of these new techniques. Nelson and Phelps postulate

$$\frac{\dot{A}_j(t)}{A_j(t)} = \frac{\phi(h_j)\left(T(t) - A_j(t)\right)}{A_j(t)}, \tag{13.1}$$

where $h_j$ is the human capital in country $j$, which is assumed to be time invariant. This equation states that the farther a country is from the world technology frontier, the faster is its rate of progress, since there is more technology out there to be absorbed. But also $\phi'(h_j) > 0$, so that the greater the human capital of a country is, the faster this convergence will be. The first implication of (13.1) is that

$$\frac{\partial^2\left(\dot{A}_j(t)/A_j(t)\right)}{\partial T(t)\,\partial\phi(h_j)} > 0,$$

so that human capital becomes more valuable when frontier technology is more advanced. Second, note that although equation (13.1) is in terms of technological progress, it does have a unique stable stationary distribution as long as $\phi(h_j) > 0$ for all countries. In the stationary state, all $A_j(t)$'s will grow at the same rate $g$, and this stationary cross-country distribution is given by

$$A_j(t) = \frac{\phi(h_j)}{g + \phi(h_j)}T(t). \tag{13.2}$$

Suppose now that output in each country is proportional to $A_j(t)$. Equation (13.2) then implies that countries with low human capital will be poor, because they will absorb less of the frontier technology. This effect is in addition to the direct productive contribution of human capital to output, and suggests that human capital differences across countries can be more important in causing income differences than calculations based on private returns to schooling might suggest.
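The convergence to the stationary distribution can be checked numerically. Writing $a_j = A_j/T$ for a country's distance to the frontier, the Nelson-Phelps equation implies $\dot{a}_j = \phi(h_j)(1 - a_j) - g a_j$, whose rest point is the ratio in (13.2). The sketch below integrates this equation with a simple Euler scheme; the absorption rate `phi_h` (standing for the value of the human capital function at $h_j$), the frontier growth rate and the step size are illustrative assumptions.

```python
def stationary_ratio(phi_h, g):
    """Stationary A_j / T from (13.2)."""
    return phi_h / (g + phi_h)


def simulate_ratio(a0, phi_h, g, steps=40000, dt=0.01):
    """Euler integration of da/dt = phi_h * (1 - a) - g * a, with a = A_j / T."""
    a = a0
    for _ in range(steps):
        a += dt * (phi_h * (1.0 - a) - g * a)
    return a


if __name__ == "__main__":
    g = 0.02
    for phi_h in (0.01, 0.04):  # low and high human capital (hypothetical values)
        print(phi_h, simulate_ratio(0.1, phi_h, g), stationary_ratio(phi_h, g))
```

Both hypothetical countries end up growing at the frontier rate, but the one with the higher absorption rate settles at a permanently higher fraction of frontier technology.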

13.2

A more subtle and in many ways more useful model of technology transfer is that of Krugman (1979), which is also useful for our purposes because it combines interdependences due to technology transfer with those arising from international trade.

13.2.1

Consider two sets of economies, North and South. All individuals in all countries have the same Dixit-Stiglitz preferences with love for variety given by

$$C = \left(\int_0^M c(i)^{\frac{\varepsilon-1}{\varepsilon}}\,di\right)^{\frac{\varepsilon}{\varepsilon-1}},$$

where $c(i)$ is the consumption of the $i$th good, $M$ is the total number of goods that will be determined endogenously, and $\varepsilon > 1$ is the elasticity of substitution between these goods.

There is free international trade between countries. Goods fall into two categories: new goods are just invented in the North and can only be produced there; old goods have been invented in the past and their production technology has been transferred to the South, so they can be produced both in the South and in the North. One worker produces one unit of any good to which the country in which he is located has access. Workers in the North have access to all goods, but workers in the South only have access to old goods. It is important to emphasize that when producing old goods, Northern workers have no productive advantage. Their only advantage (and the only difference in technology) arises because they have access to a larger set of goods. There can be two types of equilibria. In the first equilibrium, there are sufficiently few new goods that both workers in the South and the North will produce some of the old goods, and in this case, both new goods and old goods will command the same price, and incomes in the North and South will be the same (why?). Another possibility is that the South specializes in the production of old goods, while the Northern producers specialize in the production of new goods. In this case prices and wages will satisfy

$$p_S = w_S, \qquad p_N = w_N > w_S,$$

where $p_S$ is the price of the old goods produced in the South, and $p_N$ denotes the price of new goods produced in the North. There will now be income differences arising from technology differences across countries.

When will we be in this full specialization regime? To answer this question, note that from the first-order condition of consumers the relative consumption of new and old goods has to satisfy

$$\frac{c_N}{c_S} = \left(\frac{p_N}{p_S}\right)^{-\varepsilon} = \left(\frac{w_N}{w_S}\right)^{-\varepsilon}. \tag{13.3}$$

Under full specialization, we also have

$$c_N = \frac{L_N}{M_N} \quad \text{and} \quad c_S = \frac{L_S}{M_S},$$

where $L_N$ is the total labor force in the North, and $L_S$ is the total labor force in the South. $M_N$ is the total number of new goods (produced in the North) and $M_S$ is the total number of old goods. Combining this with (13.3) we obtain

$$\frac{w_N}{w_S} = \left(\frac{L_N M_S}{L_S M_N}\right)^{-\frac{1}{\varepsilon}}.$$

For this type of equilibrium to exist, we need $w_N/w_S > 1$, and this situation is also drawn diagrammatically in the next figure as the intersection of the relative demand curve for Northern labor with the relative supply curve at $L_N/L_S$. Note that $w_N/w_S > 1$ corresponds to an intersection where the relative demand curve is downward sloping. Instead, the flat portion of the relative demand curve corresponds to the case where there is no full specialization, and hence some of the old goods are produced in the North (and $w_N/w_S = 1$).

So if there is a sufficiently large technology gap between the North and South, Northern wages and incomes will be higher. What determines the number of new and old goods? Krugman developed a model to analyze this, formalizing an idea due to Vernon on the product cycle across countries. In particular, suppose that new goods are created according to the following Poisson process

$$\dot{M} = iM,$$

and these goods are imitated by the South slowly, according to the Poisson process

$$\dot{M}_S = tM_N,$$

and recall that

$$M = M_S + M_N.$$

In steady state, we need the number of new and old goods to grow at the same rate, i.e., $\dot{M}_S/M_S = \dot{M}/M$. Then

$$\frac{M_S}{M_N} = \frac{t}{i}.$$

Relative wages can be obtained as:

$$\frac{w_N}{w_S} = \left(\frac{L_N t}{L_S i}\right)^{-\frac{1}{\varepsilon}}.$$

In this economy, relative utility and relative incomes per capita are simply proportional to relative wages. It is straightforward to check that as $i$, the rate of creation of new technologies, increases, wages (and incomes) in the North relative to the South will increase. As the rate of imitation, $t$, increases, the North becomes relatively poorer.

13.2.2

Next, consider a variation on this model without international trade. In this case, the number of goods produced and consumed in each country will differ. Standard arguments give incomes in the North and the South as

$$w_N^0 = M^{\frac{1}{\varepsilon-1}} \quad \text{and} \quad w_S^0 = M_S^{\frac{1}{\varepsilon-1}}.$$

So relative wages and incomes in steady state will now be

$$\frac{w_N^0}{w_S^0} = \left(1 + \frac{i}{t}\right)^{\frac{1}{\varepsilon-1}}.$$

The relative income differences are typically larger now. To illustrate this point, consider the case in which

$$\frac{L_N t}{L_S i} = 1.$$

In this case, the South and the North will have the same level of income when there is international trade between the North and the South. In contrast, without international trade, the North will be richer. Intuitively, trade between the North and the South enables the Southern consumers to consume goods that they did not have access to, effectively increasing their real incomes. In the absence of trade, technology differences will typically matter more! (But not always! Why?)
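The comparison between the trade and no-trade cases can be made concrete with a small calculation. The sketch below evaluates the steady-state North-South income ratio in both regimes; it writes the elasticity of substitution as `eps` (an assumed symbol), caps the trade ratio at one to represent the flat portion of the relative demand curve, and uses illustrative parameter values.

```python
def relative_wage_trade(L_N, L_S, i, t, eps):
    """Steady-state w_N / w_S with free trade.

    Under full specialization the ratio is (L_N * t / (L_S * i)) ** (-1 / eps);
    if that falls below one, the North also produces old goods and wages equalize.
    """
    ratio = (L_N * t / (L_S * i)) ** (-1.0 / eps)
    return max(1.0, ratio)


def relative_income_autarky(i, t, eps):
    """Steady-state North/South real income ratio without trade.

    Uses M / M_S = 1 + M_N / M_S = 1 + i / t and real income M ** (1 / (eps - 1)).
    """
    return (1.0 + i / t) ** (1.0 / (eps - 1.0))


if __name__ == "__main__":
    eps = 3.0
    # Case from the text: L_N * t = L_S * i, so incomes are equal under trade.
    print(relative_wage_trade(1.0, 1.0, 0.02, 0.02, eps))
    print(relative_income_autarky(0.02, 0.02, eps))
    # Faster innovation in the North raises its relative wage under trade.
    print(relative_wage_trade(1.0, 1.0, 0.04, 0.02, eps))
```

In the equal-income case the no-trade ratio is strictly above one, which is the sense in which technology differences matter more without trade in this example.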

13.3

Perhaps the most important source of interaction between countries is international trade. A number of papers investigate how international trade affects the process of economic growth and creates interdependences across countries. One example is Acemoglu and Ventura (2002), who develop a tractable framework for analyzing cross-country income differences that incorporates international trade. Here I outline a version of that model. An additional lesson from this model is that the stability of the world income distribution and findings of conditional convergence do not necessarily rule out endogenous growth (recall that these patterns were used as evidence against endogenous growth models).

13.3.1 The Model

Consider a world economy consisting of a continuum of small countries with mass 1. There is a continuum of intermediate products indexed by $z \in [0, M]$, and two final products that are used for consumption and investment. There is free trade in intermediate goods and no trade in final products or assets. Countries differ in their technology, savings and economic policies. For example, country $j$ will be defined by its characteristics $(\mu_j, \rho_j, \phi_j)$, where $\mu$ is an indicator of how advanced the technology of the country is, $\rho$ is its rate of time preference, and $\phi$ is a measure of the effect of policies and institutions on the incentives to invest. The joint distribution of these characteristics is denoted by $G(\mu, \rho, \phi)$ and is assumed to be time invariant. All countries admit a representative consumer with utility function:

$$\int_0^\infty \ln c(t)\exp(-\rho t)\,dt, \tag{13.4}$$

where $c(t)$ is consumption at date $t$ in the $(\mu, \rho, \phi)$-country (this is the same as the CRRA preferences we have used so far with $\theta \to 1$). The budget constraint of the representative consumer is

$$p_I\dot{k} + p_C c = y \equiv rk + w, \tag{13.5}$$

where $p_I$ and $p_C$ are the prices of the investment and consumption goods, $k$ is the capital stock, $r$ is the rental rate, and $w$ is the wage rate, and also total wage income, since population in each country is normalized to 1. There is no depreciation of capital. Since there is no international trade in assets, income, $y$, must equal consumption, $p_C c$, plus investment, $p_I\dot{k}$. Specialization is introduced as follows: $\mu$ is assumed to be the number of intermediates produced by the $(\mu, \rho, \phi)$-country, with

$$\int \mu(j)\,dG(j) = M,$$

where I have explicitly introduced the $j$ to emphasize that these refer to country $j$, but I will drop this notation below and talk of a representative country.

A higher level of $\mu$ corresponds to the ability to produce a larger variety of intermediates, so we interpret $\mu$ as an indicator of how advanced the technology of the country is. In all countries, intermediates are produced by competitive firms using a technology that requires one unit of capital to produce one unit of any intermediate that belongs to that country. Each country also contains many competitive firms in the consumption and investment goods sectors with unit cost functions:

$$B_C(w, r, p(z)) = w^{(1-\gamma)(1-\tau)}\,r^{\gamma(1-\tau)}\left(\int_0^M p(z)^{1-\varepsilon}\,dz\right)^{\frac{\tau}{1-\varepsilon}}, \tag{13.6}$$

$$B_I(r, p(z)) = \phi^{-1}r^{1-\tau}\left(\int_0^M p(z)^{1-\varepsilon}\,dz\right)^{\frac{\tau}{1-\varepsilon}}. \tag{13.7}$$

There are a number of noteworthy features introduced with these unit cost functions:

1. Labor is only used in the production of consumption goods. This is a convenient way of introducing endogenous growth following Rebelo (1991): the accumulation equation is linear.

2. I have written the unit cost functions for convenience. The underlying production functions are quite similar. For example, the investment good would be produced as follows

$$I = \phi BK_I^{1-\tau}\left(\int_0^M x_I(z)^{\frac{\varepsilon-1}{\varepsilon}}\,dz\right)^{\frac{\tau\varepsilon}{\varepsilon-1}},$$

where $B$ is a normalizing constant, $K_I$ is capital used in the production of the investment good, and $x_I(z)$ is the quantity of the $z$th intermediate good used in the production of the investment good.

3. The parameter $\tau$ is the share of intermediates in production and it will also turn out to be the ratio of exports to income (i.e., a measure of openness).

4. The parameter $\varepsilon$ is the elasticity of substitution among the intermediates and also the price-elasticity of foreign demand for the country's products. The inverse of this elasticity is often interpreted as a measure of the degree of specialization. Assume that $\varepsilon > 1$, ruling out immiserizing growth, that is, the country becoming poorer despite accumulating more.

5. The parameter $\phi$ corresponds to an inverse measure of the distortions affecting investment (this corresponds to the tax distortions modeled in Jones (1995) and Chari, Kehoe and McGrattan (1997)).

13.3.2 Equilibrium

Consumer maximization of (13.4) subject to (13.5) yields the following first-order condition

$$\frac{r(t) + \dot{p}_I(t)}{p_I(t)} - \frac{\dot{p}_C(t)}{p_C(t)} = \rho + \frac{\dot{c}(t)}{c(t)}, \tag{13.8}$$

and the transversality condition:

$$\lim_{t\to\infty}\frac{p_I(t)k(t)}{p_C(t)c(t)}\exp(-\rho t) = 0. \tag{13.9}$$

Equation (13.8) is the standard Euler equation and requires the rate of return to capital, $\frac{r + \dot{p}_I}{p_I} - \frac{\dot{p}_C}{p_C}$, to equal the rate of time preference plus the slope of the consumption path. The only difference from the familiar version of the Euler equation is that, as in the two-sector extended AK economy discussed above, now the rate of return to savings includes the relative change in the price of investment goods compared to consumption goods, since by investing in one unit of investment good today, an individual will receive income tomorrow which will be spent on consumption goods, whose price may have changed. Thus the term $\frac{\dot{p}_I}{p_I} - \frac{\dot{p}_C}{p_C}$ is the adjustment for this change in relative prices. Equation (13.9) is the transversality condition. Integrating the budget constraint and using the Euler and transversality conditions, the optimal rule is found to be to consume a fixed fraction of wealth:

$$p_C(t)c(t) = \rho\left[p_I(t)k(t) + \int_t^\infty w(v)\exp\left(-\int_t^v\frac{r(s) + \dot{p}_I(s)}{p_I(s)}\,ds\right)dv\right]. \tag{13.10}$$

Next consider firm maximization. The price of any variety of intermediate produced in the $(\mu, \rho, \phi)$-country is equal to:

$$p(t) = r(t). \tag{13.11}$$

Choose the ideal price index for intermediates as the numeraire, i.e.,

$$\int_0^M p(z)^{1-\varepsilon}\,dz = \int\mu p^{1-\varepsilon}\,dG = 1. \tag{13.12}$$

Since all countries export practically all of their production of intermediates and import the ideal basket of intermediates, this choice of numeraire implies that $p$ is also the terms of trade of the country, i.e. the price of exports relative to imports. The conditions for price to equal marginal cost for the consumption and investment sectors, obtained from (13.6) and (13.7) together with the numeraire (13.12), imply:

$$p_C = w^{(1-\gamma)(1-\tau)}\,r^{\gamma(1-\tau)}, \tag{13.13}$$

$$p_I = \phi^{-1}r^{1-\tau}. \tag{13.14}$$

Finally, we need to impose market clearing for capital and labor as well as trade balance. By Walras' law, one of these is redundant, and we drop market clearing for capital. Trade balance requires

$$y = \mu p^{1-\varepsilon}Y, \tag{13.15}$$

where $Y \equiv \int y\,dG$ is world income. This condition equates imports, which are a fraction $\tau$ of output, $y$, and exports, $\tau\mu p^{1-\varepsilon}Y$. Equation (13.15) implies that when the number of varieties, $\mu$, is larger, a given level of income $y$ is associated with better terms of trade, $p$, and a higher rental rate of capital, since $r = p$. Intuitively, a greater $\mu$ implies that for a given level of aggregate capital stock, there will be less capital allocated to each variety of intermediate, so each will command a higher price in the world market. Conversely, for a given $\mu$, a greater relative income $y/Y$ translates into lower terms of trade, $p$, and a lower rental rate, $r$. Market clearing for labor is also straightforward. Labor demand comes only from the consumption goods sector, and given the Cobb-Douglas assumption, this demand is $(1-\gamma)(1-\tau)$ times consumption expenditure, $p_C c$, divided by the wage rate, $w$. So the market clearing condition for labor is:

$$1 = (1-\gamma)(1-\tau)\frac{p_C c}{w}. \tag{13.16}$$

Finally, because (13.16) implies that labor income, $w$, is always proportional to consumption expenditure, the optimal consumption rule, (13.10), can be simplified to:

$$p_C c = \frac{\rho}{1 - (1-\gamma)(1-\tau)}\,p_I k. \tag{13.17}$$

The state of the world economy is described by a distribution of capital stocks. This distribution of capital stocks can be obtained from the law of motion of the capital stock of each country:

$$\frac{\dot{k}}{k} = \phi r^{\tau} - \rho \tag{13.18}$$

(this law of motion simply follows from the budget constraint of the representative consumer, (13.5), combined with the equilibrium condition (13.17)). In addition, the market clearing conditions also imply that for each country:

$$rk + w = \mu r^{1-\varepsilon}\int(rk + w)\,dG, \tag{13.19}$$

$$\frac{w}{rk + w} = \frac{(1-\gamma)(1-\tau)\rho}{\left[\gamma + (1-\gamma)\tau\right]\phi r^{\tau} + (1-\gamma)(1-\tau)\rho}. \tag{13.20}$$

For a given cross-section of rental rates, the set of equations in (13.18) determines the evolution of the distribution of capital stocks. For a given distribution of capital stocks, the set of equations in (13.19) and (13.20) determines the cross-section of rental rates. It can now be shown that the world economy has a unique and stable steady state in which all countries grow at the same rate. Define the world growth rate as $x \equiv \dot{Y}/Y$, and the relative income of a $(\mu, \rho, \phi)$-country as $y_R \equiv y/Y$. Then, setting the same growth rate for all countries, i.e., $\dot{k}/k = \dot{y}/y = x^*$, the steady-state cross-section of rental rates is:

$$r^* = \left(\frac{\rho + x^*}{\phi}\right)^{1/\tau}. \tag{13.21}$$

Moreover:

$$y_R^* = \mu\left(\frac{\phi}{\rho + x^*}\right)^{\frac{\varepsilon-1}{\tau}}, \tag{13.22}$$

$$\int\mu\left(\frac{\phi}{\rho + x^*}\right)^{\frac{\varepsilon-1}{\tau}}dG = 1. \tag{13.23}$$

Equation (13.22) describes the steady-state world income distribution and states that rich countries are those which are patient (low $\rho$), create incentives to invest (high $\phi$), and have access to better technologies (high $\mu$). Equation (13.23) implicitly defines the steady-state world growth rate. This discussion establishes:

Proposition 35 In the above-described world economy, there exists a unique steady state equilibrium in which all countries grow at the same rate $x^*$ defined by (13.23), but have unequal levels of income, terms of trade and rates of return on capital. The terms of trade and the rental return on capital for each economy are given by (13.21) and the relative position of each country in the world income distribution is given by (13.22).

The implications of this model and this proposition are described next.

13.3.3 Implications

The important implications of this analysis are:

1. There is a stable world income distribution, despite the fact that in the absence of international trade, each country would grow at a different rate (e.g., consider the limiting case where $\tau = 0$).

2. So why is there a stable world income distribution here? The reason is changes in relative prices. In the open economy, when a country accumulates more capital, it is supplying more of the goods that it produces to the world economy, experiencing a decline in its terms of trade. This reduces the return to capital and discourages further accumulation. When $\tau = 0$, this relative price effect is absent, and each country grows at a different rate determined by its technology, distortions and savings rate.

3. Differences in saving rates or distortions can have much larger effects than those implied by the standard neoclassical model. The strength of these effects depends on $\varepsilon$ and $\tau$, and they become arbitrarily large as $\varepsilon \to \infty$ or as $\tau \to 0$. The first of these is the Heckscher-Ohlin limit, in which there are no decreasing returns coming from relative price changes. The second is the closed economy case, with standard endogenous growth, where small differences will translate into infinitely large level differences (since they imply differences in growth rates).

4. In the meantime, the share of capital in GDP is independent of this, determined largely by the shares of consumption and investment goods in income.
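The steady state in Proposition 35 is easy to compute for a finite set of countries. The following sketch replaces the distribution $G$ with a discrete list of (weight, $\mu$, $\rho$, $\phi$) tuples, solves the analogue of (13.23) for the world growth rate by bisection, and then evaluates (13.21) and (13.22). The three countries and the values of `eps` and `tau` are hypothetical and chosen only for illustration.

```python
def steady_state(countries, eps, tau, tol=1e-12):
    """Solve the discrete analogue of (13.23) for the world growth rate x*.

    countries: list of (weight, mu, rho, phi) tuples whose weights sum to one.
    Returns (x_star, relative_incomes, rental_rates).
    """
    power = (eps - 1.0) / tau

    def world_sum(x):
        # Left-hand side of (13.23); decreasing in x because eps > 1.
        return sum(wt * mu * (phi / (rho + x)) ** power
                   for wt, mu, rho, phi in countries)

    lo, hi = 0.0, 1.0
    if world_sum(lo) <= 1.0:
        raise ValueError("no positive steady-state growth rate for these parameters")
    while world_sum(hi) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if world_sum(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)

    rel_income = [mu * (phi / (rho + x)) ** power for _, mu, rho, phi in countries]  # (13.22)
    rental = [((rho + x) / phi) ** (1.0 / tau) for _, mu, rho, phi in countries]      # (13.21)
    return x, rel_income, rental


# Hypothetical three-country world with equal weights.
EXAMPLE = [
    (1.0 / 3.0, 1.0, 0.02, 0.06),
    (1.0 / 3.0, 2.0, 0.03, 0.08),
    (1.0 / 3.0, 1.5, 0.04, 0.05),
]

if __name__ == "__main__":
    x_star, rel_income, rental = steady_state(EXAMPLE, eps=2.6, tau=0.3)
    print("world growth rate:", x_star)
    print("relative incomes:", rel_income)
    print("rental rates:", rental)
```

All three hypothetical countries grow at the common rate in this steady state; their characteristics show up only in relative income levels and rental rates, which is the level-versus-growth distinction emphasized in the implications above.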

13.4

The above model incorporated trade between countries together with terms of trade effects. An alternative would be to incorporate trade assuming that each country is a small open economy. This is done in Ventura (1997). If each country is within the cone of diversification, this means there is factor price equalization, and thus each country takes factor prices as given. Imagine the world rate of return to capital is equal to $r^*$; there is no trade in financial assets (only in goods, which equalizes factor prices), and each country has identical preferences given by our standard CRRA formula. This implies that consumption growth in all countries will be given by

$$\frac{\dot{c}_j}{c_j} = \frac{1}{\theta}\left(r^* - \rho\right).$$

However, now imagine countries differ according to their patience, i.e., discount rate, $\rho_j$, as we allowed in the previous model. Then the above equation becomes

$$\frac{\dot{c}_j}{c_j} = \frac{1}{\theta}\left(r^* - \rho_j\right).$$

In this case, more patient countries will have lower initial consumption but higher consumption growth, and therefore they will accumulate more capital and invest in their own country. Ultimately, the more patient countries will become much richer. This process will end either when the world moves out of the cone of diversification, or when one country produces almost all of the output of the world economy. In fact, this feature that with given prices, the more patient country will ultimately become much richer than the rest of the world is more general than the open economy model outlined here. In a closed economy with individuals that have different discount rates, those with smaller discount rates (greater patience) will ultimately become much richer than the rest. In general, we tend to assume that all individuals have the same discount rates in order to ensure a stable income distribution within a country.
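The divergence result follows directly from integrating the Euler equation with a fixed world interest rate. As a small sketch, the function below evaluates the implied consumption path, writing the CRRA coefficient as `theta` (an assumed symbol) and using illustrative values for the interest rate and the two discount rates.

```python
import math


def consumption_path(c0, r_star, rho, theta, t):
    """Consumption at time t when dc/dt / c = (r_star - rho) / theta is constant."""
    return c0 * math.exp((r_star - rho) * t / theta)


if __name__ == "__main__":
    r_star, theta = 0.05, 2.0
    for t in (0, 50, 100, 200):
        patient = consumption_path(1.0, r_star, 0.02, theta, t)
        impatient = consumption_path(1.0, r_star, 0.04, theta, t)
        print(t, patient / impatient)
```

The consumption ratio grows at the constant rate $(\rho_2 - \rho_1)/\theta$, so with given prices the gap between the two countries widens without bound, which is why the process must end by leaving the cone of diversification.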


Until now, we have investigated models of economic growth of exogenous or endogenous variety, but growth was never a result of the actual process of technological change. Either growth was exogenous, or it was sustained because of linear technology of accumulation, or growth took place as a byproduct of knowledge spillovers. Much more attractive are models in which growth is a consequence of technological change, and technological change is a consequence of purposeful investments by individuals. These models not only allow us to talk about the endogenous rates of technological progress, but they make contact with industrial organization models of technology, innovation, anti-trust, R&D policy etc., and also enable us to discuss issues of directed technical change. These models will be discussed in the next few chapters.

Before going into details of the specific models, a general principle of this class of models is useful to highlight. As originally noted by Arrow (1962), and as assumed so far in all of the models we studied, knowledge is, in essence, a non-excludable and non-rival good. Once an idea about how to produce a new good or how to improve the productivity of a certain process is out there, many individuals and firms will have access to it, unless explicitly prohibited. Moreover, the fact that I am making use of a particular idea does not preclude other people from doing so, making knowledge not only non-excludable but also non-rival.

This observation creates a problem in constructing models of purposeful innovation. In fact, as noted by Arrow, why would a competitive firm invest upfront resources to improve the production technology if other firms will also benefit from this improvement (and it will still end up making zero profits)? Romer's (1986) model we studied above tried to avoid this problem by making knowledge accumulation endogenous, but not a purposeful activity. It was a byproduct, an externality, created by production.

Endogenous technological change models are explicitly about making knowledge accumulation endogenous. They break the paradox pointed out by Arrow by introducing monopolistic competition and patent rights. In particular, we will now be looking at models of monopolistic competition, where a firm that invents a new machine, a new product or a new production process will be protected either under a patent law or because nobody else will be able to replicate this invention without the specific know-how of the inventor. Such protection will enable the inventor to become a monopolist. The monopoly profits the inventor expects will, in turn, stimulate research and induce firms to make the upfront investments to improve productivity and generate growth. This insight that monopoly rights are important for innovation, which also goes back to Schumpeter, will be central to the models that follow, but it will also imply that private and social incentives for innovation will not typically be aligned.


The simplest models of endogenous technological change are those in which the variety of inputs used by firms increases (expands) over time as a result of R&D undertaken by research firms. The key is that the R&D is purposeful, undertaken for profits, and it leads to an output that increases the productivity of existing factors. Two versions of essentially the same model could be used. In the first, research leads to the invention of new goods, and individuals have love-for-variety, so they derive greater utility when they have more goods available, so real income increases. In the second, which is the one I will use here, it is the variety of machines that expands (because of invention of new varieties), and a greater variety of machines leads to greater division of labor, increasing the productivity of final good firms. In all of these models, and also in the models of quality competition we will see below, we will use the Dixit-Stiglitz constant elasticity structure.

14.1

We start with a particular version of the growth model with expanding varieties of inputs and an R&D technology such that only output is used in order to undertake research. This is sometimes referred to as the lab equipment model, since all that is required for research is additional investment in more equipment in labs etc.

14.1.1

Imagine an infinite-horizon economy in continuous time admitting a representative household with preferences

$$\int_0^\infty \exp(-\rho t)\,\frac{C(t)^{1-\theta} - 1}{1-\theta}\,dt. \tag{14.1}$$

Throughout I suppress time dependence when this causes no confusion. There is no population growth. The unique consumption good of the economy is produced with the following aggregate production function:

$$Y = \frac{1}{1-\beta}\left[\int_0^N k(v)^{1-\beta}\,dv\right]L^{\beta}, \tag{14.2}$$

where $L$ is the aggregate labor input, $N$ denotes the number of different varieties of capital inputs, and $k(v)$ is the total amount of capital (machine) of input type $v$. The term $(1-\beta)$ in the denominator is included for notational simplicity. Notice that for given $N$, which final good producers take as given, equation (14.2) exhibits constant returns to scale. Therefore, final good producers are competitive and subject to constant returns to scale, justifying our use of the aggregate production function to represent their production possibilities set.

We simplify the analysis by assuming that the capital inputs are just like intermediate goods and they immediately depreciate after being used (thus it may be easier to think of them as intermediate goods instead of capital, though the machine interpretation may be nice for certain purposes). The budget constraint of the economy is

$$C + I + X \le Y, \tag{14.3}$$

where $I$ is investment and $X$ is expenditure on R&D, which is for now assumed to come out of the total supply of the final good. (Other models of R&D will be discussed below.) Assume that the creation of new inputs takes place as follows:

$$\dot N = \eta X, \tag{14.4}$$

and the economy starts with some initial technology stock $N(0) > 0$. This implies that greater spending on R&D leads to the invention of new inputs. There is no uncertainty in this process, at least at the aggregate level. One may want to think that there is uncertainty at the individual level, but with many different research labs undertaking such expenditure, at the aggregate level, equation (14.4) holds deterministically. The important point is that R&D expenditure expands the potential set of capital/machine varieties.

A firm that invents a new capital variety is the sole supplier of that type of machine, and sets its price $\chi(v)$ to maximize profits. The demand for capital of type $v$ is obtained from the profit maximization of final good producers. Namely, taking the aggregate production function (14.2) as given, the maximization problem for inputs is:

$$\max_{[k(v)]_{v\in[0,N]},\,L}\; \frac{1}{1-\beta}\left[\int_0^N k(v)^{1-\beta}\,dv\right]L^{\beta} - \int_0^N \chi(v)k(v)\,dv - wL. \tag{14.5}$$

Recall that machines depreciate fully after use, so $\chi(v)$ is also the user cost of machines, which is incorporated in the expression above. The first-order condition with respect to $k(v)$ for any $v \in [0,N]$ yields the demand for machines from the final good sector. These demands take the convenient isoelastic form:

$$k(v) = \frac{L}{\chi(v)^{1/\beta}}. \tag{14.6}$$

Assume also that, once the blueprint of a particular input is invented, the research firm can create one unit of that machine at marginal cost equal to $\psi$ units of the final good. Now consider the monopolist owning a machine of type $v$ invented at time $t$. This monopolist chooses an investment plan and a sequence of capital stocks so as to maximize the present discounted value of profits starting from time $t$, as given by

$$V(v,t) = \int_t^\infty \exp\left(-\int_t^s r(\tau)\,d\tau\right)\left[\chi(v,s)k(v,s) - \psi k(v,s)\right]ds, \tag{14.7}$$

where $r(t)$ is the market interest rate at time $t$. Alternatively, assuming that the value function is differentiable in time, this could be written as a dynamic programming equation of the form

$$r(t)V(v,t) - \dot V(v,t) = \chi(v,t)k(v,t) - \psi k(v,t). \tag{14.8}$$

14.1.2

To see why (14.8) follows from (14.7), you should think of the principle of optimality again (now in continuous time rather than discrete time). In particular, rewrite (14.7) at time $t$ as:

$$V(v,t) = \int_t^{t+\Delta t} \exp\left(-\int_t^s r(\tau)\,d\tau\right)\left[\chi(v,s)k(v,s) - \psi k(v,s)\right]ds + \int_{t+\Delta t}^{\infty} \exp\left(-\int_t^s r(\tau)\,d\tau\right)\left[\chi(v,s)k(v,s) - \psi k(v,s)\right]ds,$$

which is just an identity for any $\Delta t$. For sufficiently small $\Delta t$, this can be written as

$$V(v,t) \simeq \Delta t\,(\chi(v,t) - \psi)k(v,t) + \exp(-r(t)\Delta t)\,V(v,t+\Delta t),$$

$$0 \simeq \Delta t\,(\chi(v,t) - \psi)k(v,t) + \exp(-r(t)\Delta t)\,V(v,t+\Delta t) - \exp(-r(t)\cdot 0)\,V(v,t),$$

where $\exp(-r(t)\cdot 0) = 1$. Now divide both sides by $\Delta t$ and take the limit $\Delta t \to 0$, which makes the approximation exact, giving

$$(\chi(v,t) - \psi)k(v,t) + \lim_{\Delta t \to 0}\frac{\exp(-r(t)\Delta t)\,V(v,t+\Delta t) - \exp(-r(t)\cdot 0)\,V(v,t)}{\Delta t} = 0.$$

When the value function is differentiable in time, this is equivalent to

$$(\chi(v,t) - \psi)k(v,t) + \left.\frac{\partial\left(\exp(-r(t)\Delta t)\,V(v,t+\Delta t)\right)}{\partial \Delta t}\right|_{\Delta t = 0} = 0.$$

Thus, applying the chain rule,

$$(\chi(v,t) - \psi)k(v,t) - r(t)V(v,t) + \dot V(v,t) = 0,$$

which is identical to (14.8).

14.1.3

Characterization of Equilibrium

Since (14.6) defines isoelastic demands, the solution to the maximization problem of the monopolist involves setting the same price in every period,

$$\chi(v,t) = \frac{\psi}{1-\beta},$$

that is, all monopolists charge a constant rental rate, equal to a mark-up over the marginal cost. Without loss of generality, normalize the marginal cost of machine production to $\psi \equiv (1-\beta)$, so that

$$\chi(v,t) = \chi = 1.$$

Profit-maximization also implies that each monopolist rents out the same quantity of machines in every period, equal to

$$k(v,t) = L, \tag{14.9}$$

and makes profits

$$\pi(v,t) = (\chi(v,t) - \psi)k(v,t) = \beta L, \tag{14.10}$$

implying that all monopolists sell exactly the same amount, charge the same price and make the same amount of profits. Substituting (14.6) and the machine prices into (14.2), we obtain

$$Y(t) = \frac{1}{1-\beta}N(t)L. \tag{14.11}$$

This is the major equation of the expanding product or input variety models. It shows that even though the aggregate production function is constant returns to scale from the viewpoint of final good firms, which take $N$ as given, for the overall economy there are increasing returns to scale, and increases in the variety of machines, $N$, increase the productivity of output. In particular, (14.11) makes it clear that if $N$ increases at a constant rate, so will output per capita.

Similarly, the labor decision of the final good sector, from the first-order condition of maximizing (14.5) with respect to $L$, implies the following equilibrium condition

$$w(t) = \frac{\beta}{1-\beta}N(t). \tag{14.12}$$
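The pricing result can be checked numerically. The sketch below assumes the isoelastic demand $k = \chi^{-1/\beta}L$ and the normalization $\psi = 1-\beta$ from the text, searches a price grid for the profit-maximizing price, and compares it with $\psi/(1-\beta) = 1$ and with profits $\beta L$. The parameter values, the grid, and the function names are illustrative assumptions.

```python
def machine_demand(chi, L, beta):
    """Isoelastic demand for a machine priced at chi: k = chi**(-1/beta) * L."""
    return L * chi ** (-1.0 / beta)


def monopoly_profit(chi, L, beta, psi):
    """Flow profit (chi - psi) * k(chi) of a machine producer with marginal cost psi."""
    return (chi - psi) * machine_demand(chi, L, beta)


# Assumed parameter values, for illustration only.
beta, L, N = 0.4, 100.0, 10.0
psi = 1.0 - beta  # normalization of marginal cost used in the text

# Coarse grid search over prices strictly above marginal cost.
grid = [psi + 0.001 * i for i in range(1, 5000)]
best_price = max(grid, key=lambda chi: monopoly_profit(chi, L, beta, psi))

analytic_price = psi / (1.0 - beta)   # equals 1 under the normalization
profit_at_optimum = monopoly_profit(analytic_price, L, beta, psi)
output = N * L / (1.0 - beta)         # Y = N L / (1 - beta)
wage = beta * N / (1.0 - beta)        # w = beta N / (1 - beta)

print(best_price, analytic_price, profit_at_optimum, output, wage)
```

The grid search is deliberately crude; it is only meant to confirm that the closed-form markup is the maximizer of the profit function implied by the demand curve.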

Finally, there is free entry into research. This implies that at all points in time we must have

$$\eta V(v,t) = 1, \tag{14.13}$$

where $V(v,t)$ is given by (14.7). Recall that one unit of final good spent on R&D leads to the invention of $\eta$ units of new inputs, each with a value given by (14.7). Naturally, this free entry condition may be violated if research is so unprofitable that nobody wants to enter, so it should really be written as a complementary slackness condition with $\eta V(v,t) \le 1$, $X(v,t) \ge 0$ and $(\eta V(v,t) - 1)X(v,t) = 0$, but for the relevant parameter values there will be entry and economic growth (though only through technological change), so we simplify the exposition by writing it in the form of (14.13).

14.1.4

Denition of Equilibrium

An equilibrium is given by time paths $[C(t), X(t)]_{t=0}^{\infty}$ such that, given the price path $[r(t), w(t)]_{t=0}^{\infty}$, the representative household is maximizing its utility given by (14.1), capital demands by the final goods sector satisfy (14.9), the wage rate is given by (14.12), and the value of each monopolist, $V(v,t)$, satisfies (14.7) and (14.13).

14.1.5

Steady State

Let us start with the steady state. In the steady state, the value of an invention will be constant, thus $\dot V = 0$, and also the interest rate will be constant, i.e., $r(t) = r^*$ (where I again use stars to denote BGP/steady-state values). Substituting this in either (14.7) or (14.8), we obtain

$$V^* = \frac{\pi}{r^*}, \tag{14.14}$$

where $\pi$ is the (constant) flow of net profits per period, given by (14.10) above.
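As a quick check of the steady-state valuation, the sketch below integrates the discounted profit stream numerically and compares it with the closed form $\pi / r^*$. It assumes a constant interest rate and constant flow profits $\pi = \beta L$; the parameter values, the finite horizon, and the function name are illustrative assumptions.

```python
import math


def present_value(profit, r, horizon=2000.0, dt=0.01):
    """Midpoint-rule approximation of the integral of profit * exp(-r s) over [0, horizon]."""
    steps = int(horizon / dt)
    return sum(profit * math.exp(-r * (i + 0.5) * dt) * dt for i in range(steps))


# Assumed values, for illustration only.
beta, L, r_star = 0.4, 100.0, 0.08
profit = beta * L                  # constant flow profit, as in (14.10)
v_numeric = present_value(profit, r_star)
v_closed_form = profit / r_star    # steady-state value, as in (14.14)

print(v_numeric, v_closed_form)
```

The horizon is truncated at a point where the discount factor is negligible, so the two numbers agree up to discretization error.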

For there not to be further incentives to undertake R&D, we need one unit of final good spent for R&D to generate exactly the same discounted value. Therefore, the free entry condition (14.13) can be expressed as:

$$\frac{\eta\beta L}{r^*} = 1.$$

This equation pins down the steady-state interest rate, $r^*$, as:

$$r^* = \eta\beta L.$$

From consumer maximization, in particular from the standard Euler equation, we also have that the rate of growth of consumption, $g_c^*$, is given by

$$g_c^* = \frac{\dot C}{C} = \frac{1}{\theta}(r^* - \rho), \tag{14.15}$$

and in steady state, the rate of growth of the economy is the same as the rate of growth of consumption, so we have that the whole economy grows at the rate $g^* = g_c^*$.

Therefore, given the steady-state interest rate we can simply determine the long-run growth rate of the economy as:

$$g^* = \frac{1}{\theta}(\eta\beta L - \rho). \tag{14.16}$$

Since this is a growing economy, we need to ensure that the transversality condition is satisfied in equilibrium. As usual, this requires $r^* > g^*$ (since there is no population growth), i.e.,

$$(1-\theta)\eta\beta L < \rho, \tag{14.17}$$

which we assume holds.

Notice that there is a scale effect here: the larger is $L$, the greater is the growth rate. The scale effect comes from the increasing returns to scale nature of the technology of the model of endogenous technical change (this is a point related to the non-rival nature of knowledge, emphasized in Romer, 1990). I will return to the issue of the scale effect further below. This discussion establishes:

Proposition 36 In the above-described expanding input-variety model of endogenous technological change, there exists a unique steady state in which technology, output and consumption all grow at the same rate given by (14.16).
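The steady-state formulas are simple enough to evaluate directly. The sketch below assumes $r^* = \eta\beta L$, $g^* = (r^* - \rho)/\theta$ and the transversality condition $(1-\theta)\eta\beta L < \rho$, and shows the scale effect by doubling $L$. The function name and all parameter values are illustrative assumptions.

```python
def bgp(eta, beta, L, rho, theta):
    """Return (r_star, g_star, tvc_ok) for the lab-equipment model.

    r_star: steady-state interest rate eta * beta * L
    g_star: growth rate (r_star - rho) / theta
    tvc_ok: True when (1 - theta) * eta * beta * L < rho
    """
    r_star = eta * beta * L
    g_star = (r_star - rho) / theta
    tvc_ok = (1.0 - theta) * eta * beta * L < rho
    return r_star, g_star, tvc_ok


# Assumed parameter values, for illustration only.
eta, beta, rho, theta = 0.002, 0.4, 0.02, 2.0
r_small, g_small, ok_small = bgp(eta, beta, 100.0, rho, theta)
r_large, g_large, ok_large = bgp(eta, beta, 200.0, rho, theta)  # doubled labor force

print(r_small, g_small, ok_small)
print(r_large, g_large, ok_large)
```

Doubling $L$ doubles the interest rate and raises the growth rate, which is the scale effect discussed in the text.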

14.1.6

Transitional Dynamics

It is also straightforward to see that there are no transitional dynamics in this model. To see this, let us go back to the value function for each monopolist. Substituting for profits, this gives

$$r(t)V(v,t) - \dot V(v,t) = \beta L.$$

Free entry gives

$$\eta V(v,t) = 1.$$

Differentiating this with respect to time immediately implies $\dot V(v,t) = 0$, which is only consistent with $r(t) = r^*$ for all $t$, thus

$$r(t) = \eta\beta L \quad \text{for all } t.$$

This establishes:

Proposition 37 In the above-described expanding input-variety model of endogenous technological change, with initial technology stock $N(0) > 0$, there is a unique equilibrium path in which technology, output and consumption always grow at the rate $g^*$ as in (14.16).

In other words, exactly as in the AK model, the economy always grows at a constant rate. At some level this is not surprising, since the derived equation for output, (14.11), is essentially a linear AK production function.

14.1.7

The presence of monopolistic competition implies that the competitive equilibrium is no longer Pareto optimal. There is a version of the aggregate demand externalities we saw in the static context in previous lectures. It is straightforward to set up the problem of the social planner and derive the optimal growth rate. To do this, notice that the social planner will also use the same quantity of all types of machines in production, but because of the absence of a markup, this quantity will be different. The social planner will also take into account the effect of an increase in the variety of inputs on the overall productivity in the economy, which monopolists could not because they do not capture the full surplus from inventions. More explicitly, given $N$, the social planner will choose

$$\max_{[k(v)]_{v\in[0,N]},\,L}\; \frac{1}{1-\beta}\left[\int_0^N k(v)^{1-\beta}\,dv\right]L^{\beta} - \int_0^N \psi k(v)\,dv - wL,$$

which only differs from the private maximization problem because the marginal cost of machine creation, $\psi$, is used. Recalling that $\psi \equiv 1-\beta$, this implies

$$k_s(v) = \frac{L}{(1-\beta)^{1/\beta}},$$

thus

$$Y(t) = \frac{(1-\beta)^{-(1-\beta)/\beta}}{1-\beta}N(t)L = (1-\beta)^{-1/\beta}N(t)L.$$

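Recall that the aggregate budget constraint is $C(t) + I(t) + X(t) \le Y(t)$. Let $Y^n(t) \equiv Y(t) - I(t)$ be net output, after the costs of machines are subtracted (recall that it is net output that is distributed between R&D expenditure and consumption). We have that

$$\begin{aligned} Y^n(t) &= (1-\beta)^{-1/\beta}N(t)L - \int_0^{N(t)} \psi k_s(v,t)\,dv \\ &= (1-\beta)^{-1/\beta}N(t)L - (1-\beta)^{-(1-\beta)/\beta}N(t)L \\ &= (1-\beta)^{-1/\beta}\beta N(t)L. \end{aligned}$$

Now, given this and (14.4), the maximization problem of the social planner can be written as

$$\max \int_0^\infty \exp(-\rho t)\,\frac{C(t)^{1-\theta} - 1}{1-\theta}\,dt$$

subject to

$$\dot N(t) = \eta(1-\beta)^{-1/\beta}\beta N(t)L - \eta C(t).$$

In this problem, $N(t)$ is the state variable, and $C(t)$ is the control variable. Let us set up the current-value Hamiltonian

$$\hat H(N,C,\mu) = \frac{C(t)^{1-\theta} - 1}{1-\theta} + \mu(t)\left[\eta(1-\beta)^{-1/\beta}\beta N(t)L - \eta C(t)\right].$$

The necessary conditions are

$$\hat H_C(N,C,\mu) = 0 \;\Rightarrow\; C(t)^{-\theta} = \eta\mu(t),$$

$$\hat H_N(N,C,\mu) = \rho\mu(t) - \dot\mu(t) \;\Rightarrow\; \rho\mu(t) - \dot\mu(t) = \mu(t)\,\eta(1-\beta)^{-1/\beta}\beta L.$$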

Combining these conditions, we obtain the following growth rate for consumption in the social planner's allocation:

$$\frac{\dot C}{C} = \frac{1}{\theta}\left(\eta(1-\beta)^{-1/\beta}\beta L - \rho\right), \tag{14.18}$$

which can be directly compared to the growth rate in the decentralized equilibrium, (14.16). The comparison boils down to that of $(1-\beta)^{-1/\beta}\beta$ to $\beta$, and it is straightforward to see that the former is always greater since $(1-\beta)^{-1/\beta} > 1$ by virtue of the fact that $\beta \in (0,1)$. This implies that the socially-planned economy will always grow faster than the decentralized economy. Intuitively, the social planner values innovation more, because it will be able to use the machines more intensively after innovation, since the monopoly markup reducing the demand for machines is absent in the social planner's allocation. This establishes:

Proposition 38 In the above-described expanding input-variety model, the decentralized equilibrium is not Pareto optimal, and always grows more slowly than the allocation that would maximize utility of the representative household.
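To see the size of the wedge between the two allocations, the sketch below evaluates the decentralized growth rate $(\eta\beta L - \rho)/\theta$ and the planner's growth rate $(\eta(1-\beta)^{-1/\beta}\beta L - \rho)/\theta$ for a few values of $\beta$. The parameter values and function names are illustrative assumptions, and the resulting magnitudes are not calibrated to any data.

```python
def decentralized_growth(eta, beta, L, rho, theta):
    """Equilibrium growth rate: (eta * beta * L - rho) / theta."""
    return (eta * beta * L - rho) / theta


def planner_growth(eta, beta, L, rho, theta):
    """Planner's growth rate: (eta * (1 - beta)**(-1/beta) * beta * L - rho) / theta."""
    social_return = eta * (1.0 - beta) ** (-1.0 / beta) * beta * L
    return (social_return - rho) / theta


# Assumed parameter values, for illustration only.
eta, L, rho, theta = 0.002, 100.0, 0.02, 2.0
gaps = {}
for beta in (0.2, 0.4, 0.6):
    g_ce = decentralized_growth(eta, beta, L, rho, theta)
    g_sp = planner_growth(eta, beta, L, rho, theta)
    gaps[beta] = g_sp - g_ce
    print(beta, g_ce, g_sp)
```

The gap is positive for every $\beta \in (0,1)$, because the factor $(1-\beta)^{-1/\beta}$ exceeds one.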

14.1.8

The divergence between the decentralized equilibrium and the socially planned allocation introduces the possibility that there might be Pareto-improving interventions. The most natural alternatives to consider in this model are two:

1. Subsidies to Research: by subsidizing research, the government can increase the growth rate of the economy, and this can be turned into a Pareto improvement if taxation is not distortionary and there can be appropriate redistribution of resources so that all parties benefit.

2. Subsidies to Capital Inputs: the problem also arises from the fact that the decentralized economy is not using as many units of the machines/capital inputs (because of the monopoly markup); so subsidies to capital inputs given to final good producers would also be useful in increasing the growth rate.

Moreover, it is noteworthy that as in the first-generation endogenous growth models, a variety of different policy interventions, including taxes on investment income and subsidies of various forms, will have growth effects, not just level effects, in this framework. Naturally, once we start thinking of policy in order to close the gap between the decentralized equilibrium and the Pareto optimal allocation, we also have to think of the objectives of policymakers, and this brings us again to political economy issues. For that reason, rather than go into a detailed discussion of optimal policy, I simply note the gap between the decentralized equilibrium and the Pareto optimal allocation, leaving you to draw your own conclusions about what the implications of this gap will be. I will discuss some of the implications of different types of competition policies and intellectual property rights policies further below.

14.2

In the model of the previous section, growth resulted from the use of final output for R&D. This is similar, in some way, to the endogenous growth model of Rebelo (1991), since the accumulation equation is linear in accumulable factors. As a result, we saw that, in equilibrium, output took a linear form in the stock of knowledge (new machines), thus an AN form instead of Rebelo's AK form. An alternative is to have scarce factors used in R&D. In other words, instead of the lab equipment, we now have scientists as the key creators of R&D. In this case, there will not be endogenous growth, unless there are knowledge spillovers from past R&D. In other words, now current researchers need to stand on the shoulders of past giants. In fact, the original formulation by Romer (1990) was exactly of this knowledge-spillovers form, imposing the standing on the shoulders of giants as part of the technological possibilities frontier of the economy. A typical formulation in this case is

$$\dot N = \eta N L_R, \tag{14.19}$$

where $L_R$ is labor allocated to R&D. The term $N$ on the right-hand side captures spillovers from the stock of existing ideas. The greater is $N$, the more productive is an R&D worker. $L_R$ could be skilled workers as in Romer (1990), or scientists or regular workers. In the latter case, there will be competition between the production sector and the R&D sector for workers, and the marginal cost of workers in research would be given by the wage rate in the production sector. In particular, the free entry condition is now

$$\eta N(t)V(v,t) = w(t),$$

where $N$ is on the left-hand side because it parameterizes the productivity of an R&D worker from (14.19), and $V(v,t)$ is again given by (14.7) above, while the flow cost of undertaking research is hiring workers for R&D, thus the wage rate $w(t)$. In the model I outlined in the previous section, the equilibrium wage rate was derived as (recall equation (14.12)):

$$w(t) = \frac{\beta}{1-\beta}N(t).$$

So the steady-state free-entry condition, with a constant steady-state (balanced growth path) interest rate, $r^*$, becomes

$$\eta N(t)\frac{\beta L}{r^*} = \frac{\beta}{1-\beta}N(t).$$

Hence the steady-state equilibrium interest rate is

$$r^* = (1-\beta)\eta L.$$

Now using the Euler equation of the representative household, we have

$$g_c^* = \frac{\dot C}{C} = \frac{1}{\theta}\left((1-\beta)\eta L - \rho\right). \tag{14.20}$$

The rest of the analysis is unchanged. In particular, the growth rates of technology and output are also given by (14.20). Also, there are again no transitional dynamics, and we can also compare the decentralized equilibrium to the Pareto optimal allocation. It is also useful to note that there is again a scale effect here: greater $L$ increases the interest rate and the growth rate in the economy. This discussion immediately establishes:

Proposition 39 In the above-described expanding input-variety model with knowledge spillovers, there exists a unique balanced growth path equilibrium in which technology, output and consumption grow at the same rate given by (14.20), starting from any initial level of technology stock $N(0) > 0$.

14.2.1

Since we now have a model with monopolistic competition, we can also relate the results to standard issues in industrial organization, such as competition policy, anti-trust, patents etc. For example, in this model we can introduce a fringe of competitive firms which could limit the markup that each monopolist can charge. Recall that the price the monopolist charges, including its optimal markup over marginal cost, is

$$\chi = \frac{\psi}{1-\beta}.$$

Imagine, instead, that a fringe of competitive firms can copy the innovation of any monopolist, but they will not be able to produce at the same level of costs (because the inventor has more know-how). In particular, suppose that instead of a marginal cost $\psi$, they will have marginal cost of $\gamma\psi$ with $\gamma > 1$. If $\gamma > 1/(1-\beta)$, this fringe is not a threat to the monopolist, since the monopolist could set its ideal, profit maximizing, markup and the fringe would not be able to enter without making losses. However, if $\gamma < 1/(1-\beta)$, the fringe would prevent the monopolist from setting its ideal monopoly price. In particular, in this case the monopolist would be forced to set a limit price, exactly equal to

$$\chi = \gamma\psi. \tag{14.21}$$

This price formula follows immediately by noting that, if the price of the monopolist were higher than this, the fringe could undercut and make profits, since their marginal cost is equal to $\gamma\psi$. If it were below this, the monopolist could further increase its price without losing any customers to the fringe and make more profits. Thus, there is a unique equilibrium price given by (14.21). When the monopolist charges this limit price, its profits per unit would be

$$\text{profits per unit} = (\gamma - 1)\psi = (\gamma - 1)(1-\beta),$$

which is less than $\beta$, the profits per unit that the monopolist made in the absence of the competitive fringe. What is the implication of this for the rate of economic growth? It is straightforward to work out that in this case the economy would grow at a slower rate. For example, in the baseline model with the lab-equipment technology, this growth rate would be

$$\hat g = \frac{1}{\theta}\left(\eta\gamma^{-1/\beta}(\gamma - 1)(1-\beta)^{-(1-\beta)/\beta}L - \rho\right),$$

which is less than (14.16). Therefore, in this model, somewhat counter-intuitively, greater competition, which reduces markups (and thus static distortions), also reduces long-run growth. This is because profits are important in this model to encourage innovation by new research firms. If these profits are cut, incentives for research are also reduced. Of course, welfare is not the same as growth, and some degree of competition reducing prices below the unconstrained monopolistic level might be useful for welfare. Essentially, with a lower markup, households are happier in the present, but suffer slower consumption growth. The exact tradeoff between these two opposing effects depends on the discount rate of the representative household.

Another similar application is to patent policy. In practice, patents are for limited durations. In the baseline model, we assumed that patents are perpetual; once a firm invents a new good, it has a patent forever and it becomes the monopolist for that good forever. If patents are enforced strictly, then this might rule out the competitive fringe from competing, restoring the growth rate of the economy to (14.16). Also, even in the absence of the competitive fringe, we can imagine that once the patent runs out, the firm will cease to make profits on its innovation. In this case, it can easily be shown that growth is maximized by having patents that are as long as possible. Again there is a tradeoff here between the equilibrium growth rate of the economy and the static level of welfare.

But more important than these trade-offs between growth and level is the fact that these models are the most basic models, so they do not feature some of the potential benefits of competition. For example, competitive pressure from other firms might encourage faster innovation. We will see this issue in Problem Set 6.
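The effect of the competitive fringe on growth in the lab-equipment model can be sketched numerically. The code assumes the isoelastic demand $k = \chi^{-1/\beta}L$, the normalization $\psi = 1-\beta$, the limit price $\chi = \gamma\psi$ whenever $\gamma < 1/(1-\beta)$, and free entry in the form $r = \eta\pi$. Function names and parameter values are illustrative assumptions.

```python
def limit_price_profit(gamma, beta, L):
    """Flow profit when the machine price is gamma * psi, with psi = 1 - beta."""
    psi = 1.0 - beta
    chi = gamma * psi
    quantity = L * chi ** (-1.0 / beta)  # isoelastic machine demand
    return (chi - psi) * quantity


def growth_with_fringe(gamma, eta, beta, L, rho, theta):
    """Growth rate when the fringe has marginal cost gamma * psi.

    The fringe only binds if gamma is below the unconstrained markup 1 / (1 - beta).
    """
    binding_gamma = min(gamma, 1.0 / (1.0 - beta))
    interest_rate = eta * limit_price_profit(binding_gamma, beta, L)
    return (interest_rate - rho) / theta


# Assumed parameter values, for illustration only.
eta, beta, L, rho, theta = 0.002, 0.4, 100.0, 0.02, 2.0
g_monopoly = growth_with_fringe(10.0, eta, beta, L, rho, theta)  # fringe irrelevant
g_fringe = growth_with_fringe(1.2, eta, beta, L, rho, theta)     # fringe binds

# Closed form for the binding case, matching the expression in the text.
closed_form = (
    eta * 1.2 ** (-1.0 / beta) * (1.2 - 1.0) * (1.0 - beta) ** (-(1.0 - beta) / beta) * L
    - rho
) / theta

print(g_monopoly, g_fringe, closed_form)
```

When the fringe is too inefficient to matter, the function returns the baseline growth rate $(\eta\beta L - \rho)/\theta$; a tighter fringe lowers profits and hence growth.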

14.3

As we have seen, the models used so far feature a scale effect in the sense that a larger population, $L$, translates into a higher interest rate and a higher growth rate. This is problematic for three reasons, as argued in a series of papers by Chad Jones:

1. Larger countries do not necessarily grow faster (though the larger market of the United States or European economies may have been an advantage during the early phases of the industrialization process).

2. The population in general is not constant, but growing. If we have constant population growth as in the standard neoclassical growth model, e.g., $L(t) = \exp(nt)L(0)$, these models would not feature a balanced growth path. Instead, growth would become faster and faster over time, eventually leading to an infinite output in finite time, violating the transversality condition.

3. In the data, we see the total amount of resources devoted to R&D increases steadily, but this has not been associated with an increase in the growth rate.

These observations have motivated Jones (1995) to suggest the following modification of the baseline model. Population at time $t$ is $L(t)$ and grows at the constant rate $n$ (i.e., $\dot L(t) = nL(t)$). All agents have the standard CRRA preferences

$$\int_0^\infty \exp(-\rho t)\,\frac{C^{1-\theta} - 1}{1-\theta}\,dt, \tag{14.22}$$

where $C$ is consumption defined over the final good of the economy. This good is produced as before, more specifically, with the production function (14.2), and all the other assumptions are the same as before.

New goods are produced by allocating workers to the R&D process as in the knowledge-spillovers model studied in the previous section. However, now there are limited knowledge spillovers, in particular,

$$\dot N(t) = \eta N(t)^{\phi}L_R(t), \tag{14.23}$$

where $\phi < 1$ and $L_R$ is labor allocated to R&D. So labor market clearing requires

$$L_E(t) + L_R(t) = L(t), \tag{14.24}$$

where $L_E(t)$ is the level of employment in the production sector. The fact that not all workers are in the production sector implies that the aggregate output of the economy (by an argument similar to before) is given by

$$Y(t) = \frac{1}{1-\beta}N(t)L_E(t),$$

and profits of monopolists from selling their machines are

$$\pi(t) = \beta L_E(t).$$

The key assumption for the model is that $\phi < 1$. The case where $\phi = 1$ is the one analyzed in the previous section, and as commented above, with population growth this would lead to an exploding path, leading to infinite utility. However, the model is well behaved when $\phi < 1$. In particular, let us focus on the BGP (steady state), where a constant fraction of workers are allocated to R&D, and the interest rate and the growth rate are constant. In this BGP allocation, we have the following free-entry condition:

$$\eta N(t)^{\phi}\frac{\beta L_E(t)}{r^*} = w(t) = \frac{\beta}{1-\beta}N(t),$$

where the wage is again substituted from (14.12). This implies

$$\eta N(t)^{\phi-1}\frac{(1-\beta)L_E(t)}{r^*} = 1.$$

Now differentiating this condition with respect to time, we obtain

$$(\phi - 1)\frac{\dot N(t)}{N(t)} + \frac{\dot L_E(t)}{L_E(t)} = 0.$$

Since in BGP the fraction of workers allocated to research is constant, $\dot L_E(t)/L_E(t) = n$. This implies that the BGP growth rate of technology is given by

$$g_N^* \equiv \frac{\dot N(t)}{N(t)} = \frac{n}{1-\phi}. \tag{14.25}$$

From equation (14.11), this implies that total output grows at the rate $g_N^* + n$. But now there is population growth, so consumption per capita grows at the rate

$$g_c^* = g_N^* = \frac{n}{1-\phi}. \tag{14.26}$$

Consequently, this model generates sustained growth in income per capita as well, and does so in the presence of population growth. More interestingly, in order to achieve this growth rate, it allocates more and more workers to R&D (the fraction of the labor force in R&D is constant along the BGP, but the labor force itself is growing). The reason for this is that the technology for creating new ideas, (14.23), only features limited spillovers, thus to maintain sustained growth more resources need to be allocated to R&D. The result is summarized in the next proposition:

Proposition 40 In the above-described expanding input-variety model with limited knowledge spillovers as given by (14.23), starting from any initial level of technology stock $N(0) > 0$, there exists a unique balanced growth path equilibrium in which technology and consumption per capita grow at the rate (14.25), and output grows at rate $g_N^* + n$.

This type of model is sometimes referred to as semi-endogenous growth, because while there is sustained growth, the per capita growth rate of the economy given in (14.26) is determined only by population growth and technology and does not respond to taxes or other policies. Some papers in the literature have attempted to develop models of endogenous growth without scale effects, but where economic growth still responds to policies, though this normally requires a combination of restrictive assumptions.
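The convergence of technology growth to $n/(1-\phi)$ can be illustrated by simulating the idea-production equation directly. The sketch below assumes $\dot N = \eta N^{\phi}L_R$ with R&D employment growing at the population growth rate, $L_R(t) = L_R(0)\exp(nt)$, which corresponds to a constant fraction of a growing labor force working in research. It uses a simple Euler scheme; all parameter values, the step size, and the function name are illustrative assumptions.

```python
import math


def simulate_growth_rate(eta, phi, n, N0, LR0, horizon, dt=0.01):
    """Euler-integrate dN/dt = eta * N**phi * L_R(t), with L_R(t) = LR0 * exp(n t).

    Returns the instantaneous growth rate of N at the end of the horizon.
    """
    N = N0
    steps = int(horizon / dt)
    for i in range(steps):
        L_R = LR0 * math.exp(n * i * dt)
        N += dt * eta * N ** phi * L_R
    L_R_end = LR0 * math.exp(n * steps * dt)
    return eta * N ** (phi - 1.0) * L_R_end


# Assumed parameter values, for illustration only.
eta, phi, n = 0.1, 0.5, 0.02
g_bgp = n / (1.0 - phi)  # BGP growth rate of technology

# Two different initial technology stocks converge to the same growth rate.
g_from_low_N = simulate_growth_rate(eta, phi, n, N0=1.0, LR0=1.0, horizon=600.0)
g_from_high_N = simulate_growth_rate(eta, phi, n, N0=100.0, LR0=1.0, horizon=600.0)

print(g_bgp, g_from_low_N, g_from_high_N)
```

In this sketch the long-run growth rate does not depend on $\eta$ or on the initial stock $N(0)$, only on $n$ and $\phi$, which is the semi-endogenous growth property described above.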


15.1 Baseline Model

In the model of expanding machine variety, dierent machines were complements in production. However, in practice when a better computer comes to the market, it replaces previous models. This is captured in the models of vertical quality competition or quality improvement, such as the models in Aghion and Howitt, or Grossman and Helpman. Population and labor supply are again constant at L. The major dierence from the previous setup is that the production function is now 1 Y (t) = 1 Z

1

dv L

(15.1)

where q (v, t) is the quality of machine v at time t and because now the number of varieties is constant, I have normalized it to 1. Consequently, while in the previous section growth took place because the variety of inputs expanded, here it takes place because existing inputs become more productive. In many ways, this seems to describe the growth process better, 333

14.451: Introduction to Economic Growth and it also has a nice Schumpeterian avor of creative destruction. When a better vintage of a particular machine is created, it replaces (destroys) the existing vintage. The rest of the setup is the same as before. In particular, as in the baseline endogenous technological change model, there is no population growth. Instead, the population and labor supply is xed at L. The economy admits a representative household with preferences given by the standard CRRA form, (14.1). To invent a new machine, rms undertake R&D on an existing machine (of type v). If a rm spends qz units of the nal good for R&D on a machine of quality q , then it has a ow rate z of inventing a new machine, with quality q . Notice that the cost of undertaking R&D is proportional to the quality of the machine on which the rm is working. This is natural. Without this assumption, R&D would become more and more protable over time, leading to an explosive path. The new machine will take over the market for this type of capital, but unless is very large, it will have to charge a limit price in order to exclude the previous leader. This is similar to the discussion of the limit price above (which led to equation (14.21) there). I assume that is not too large, so we will observe limited prices in equilibrium. Also, assume that the marginal cost of production is q for a machine of quality q . One issue here, which was absent in the expanding input variety model, is whether the existing leader will undertake R&D and innovation. In the expanding input variety model, this was irrelevant, since machines could not be improved upon, so there was only R&D for new machines, and who undertook them was not important. Here, in contrast, existing machines can be (and are) improved, and this is the source of economic growth. But also, incentives to undertake such innovations may dier between the incumbent monopolist and entrants. 
A major insight here comes from Arrow (1962), who noted the presence of the replacement effect: the incumbent would be replacing its own machine, and thus destroying the profits that it is already making. In contrast, a new entrant does not have this replacement calculation in mind. As a result, with the same technology of innovation, it will always be the entrants (new firms) who do R&D in this model. This is an attractive implication, since it creates a real sense of creative destruction or churning. Of course, in practice we see established big firms undertake innovation. This might be because the technology of innovation differs between incumbents and new potential entrants, or because there is only a limited number of new entrants. One of the questions in Problem Set 6 will get you to work through a model along these lines.

Following the same analysis as before, the demand for machines is now
$$k(v,t) = \left[\frac{q(v,t)}{\chi(v,t)}\right]^{1/\beta} L. \tag{15.2}$$

Let us normalize $\psi = \lambda^{-1}$, so the monopolist sets the price $\chi(v,t) = q(v,t)$ and sells $k(v) = L$. This generates profits
$$\pi(v,t) = \frac{\lambda-1}{\lambda}\, L\, q(v,t). \tag{15.3}$$

Substituting (15.2) into (15.1), we obtain total output as
$$Y(t) = \frac{1}{1-\beta}\, Q(t)\, L, \qquad \text{where} \qquad Q(t) = \int_0^1 q(v,t)\,dv$$
is the average total quality of machines.

The value of being the inventor is different now, because this position will not last forever. More formally, the standard dynamic programming equation now becomes:
$$r(t)\,V(v,t) - \dot V(v,t) = \pi(v,t) - x(v,t)\,V(v,t),$$

where $x(v,t)$ is the rate at which new innovations occur in sector $v$ at time $t$. When this event occurs, the existing monopolist loses its monopoly position and is replaced by the monopolist of the higher-quality machine. From then on, it receives zero profits, and thus has zero value. In the balanced growth path, $x(v,t)$ will be constant across different types of goods and over time; let us denote it by $x^*$. Note that there is an immediate relationship between the innovation rate, $x^*$, and the BGP growth rate, $g^*$, given by
$$g^* = (\lambda-1)\, x^*.$$
This simply follows from the fact that, on average, growth occurs because there are new and better machines, and each of those increases output by a proportional amount $\lambda - 1$. Free entry into R&D implies that
$$V(v,t) = \lambda^{-1} q(v,t). \tag{15.4}$$

Otherwise, there will be entry into or exit from research, since one more unit of the final good provides a flow rate of obtaining $V$. In steady state, $\dot V(v,t) = 0$. So, dropping time and sector dependence and using stars again to denote BGP values, we have
$$V^* = \frac{\pi}{r^* + x^*} = \frac{\pi}{r^* + g^*/(\lambda-1)} = \frac{(\lambda-1)^2 L\, q}{\lambda\left[(\lambda-1)\, r^* + g^*\right]} = \lambda^{-1} q,$$
where the penultimate equality follows from substituting for profits from (15.3), and the last equality follows from the free entry condition (15.4).
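The role of the replacement rate in the value function can be checked numerically. The sketch below is only illustrative: it assumes constant profits `profit`, a constant interest rate `r`, and a constant replacement rate `x` (all hypothetical inputs, not values from the text), and compares the steady-state value $\pi/(r+x)$ with a direct numerical integration of expected discounted profits.

```python
import math


def steady_state_value(profit: float, r: float, x: float) -> float:
    """Value of a monopolist earning `profit` until replaced at Poisson rate x."""
    return profit / (r + x)


def value_by_integration(profit: float, r: float, x: float,
                         horizon: float = 2000.0, steps: int = 200_000) -> float:
    """Trapezoid-rule approximation of the integral of profit * exp(-(r + x) t) dt.

    exp(-x t) is the probability that the monopolist has not yet been replaced.
    """
    dt = horizon / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * profit * math.exp(-(r + x) * i * dt)
    return total * dt


# Hypothetical numbers: creative destruction at rate x lowers the value of a patent.
v_closed = steady_state_value(profit=1.0, r=0.05, x=0.10)
v_numeric = value_by_integration(profit=1.0, r=0.05, x=0.10)
v_no_replacement = steady_state_value(profit=1.0, r=0.05, x=0.0)
```

With these inputs the replacement risk acts exactly like an additional discount rate, which is why the incumbent's value falls as the innovation rate rises.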

Moreover, the Euler equation (14.15) still applies, so combining those, we have that in steady state $r^* = \theta g^* + \rho$, so
$$\frac{(\lambda-1)^2 L}{(\lambda-1)(\theta g^* + \rho) + g^*} = 1,$$
or rearranging,
$$g^* = \frac{(\lambda-1)^2 L - (\lambda-1)\rho}{(\lambda-1)\theta + 1}. \tag{15.5}$$

Proposition 41 In the above-described quality-improvement model, there exists a unique balanced growth path equilibrium in which output and consumption grow at the same rate given by (15.5). The rate of innovation is $g^*/(\lambda-1)$.

15.2 Pareto Optimality

This equilibrium, like that of the endogenous technology model with expanding input varieties, is generally not Pareto optimal. In fact, this can be because there is either too little or too much innovation. The reason why there may be too little innovation is the same as in the model of the previous section: a monopolist does not sell as many units of the new machines as the social planner would like, and does not fully internalize the benefits accruing to final good producers (and the economy) from further innovation. However, counteracting this there is the business stealing effect coming from the Schumpeterian nature of the model: a new innovation steals the profits of the existing monopolist. This tends to induce entrants to do too much R&D, even when R&D has small social returns, because it enables them to become the monopoly producers of a new machine, thus becoming the claimant of the natural monopoly power accruing to the leader in a particular line of machines.

The analysis of Pareto optimality is straightforward here because of the parallel between the structure of this model and that with expanding input variety. In particular, it is immediate to see that a social planner would choose demands for machines as
$$k_s(v) = \psi^{-1/\beta} L = \lambda^{1/\beta} L,$$
given the assumption that in this case $\psi = \lambda^{-1}$. This implies that total output, under the socially-planned economy, is equal to
$$Y(t) = \frac{\lambda^{(1-\beta)/\beta}}{1-\beta}\, Q(t)\, L.$$

Recall again that the aggregate budget constraint is $C(t) + I(t) + X(t) \le Y(t)$. It is once more useful to work in terms of net output, which is defined as $Y^n(t) \equiv Y(t) - I(t)$ as in the expanding variety model, and we have
$$Y^n(t) = \frac{\lambda^{(1-\beta)/\beta}}{1-\beta}\, Q(t)\, L - \int_0^1 \psi\, q(v,t)\, k_s(v,t)\, dv = \frac{\beta}{1-\beta}\, \lambda^{(1-\beta)/\beta}\, Q(t)\, L. \tag{15.6}$$

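Finally, note that given the assumptions above, the social planner faces an aggregate technology frontier of the form
$$\dot Q(t) = (\lambda-1)\, X(t),$$
since R&D spending of $X(t)$ on machines of average quality $Q(t)$ leads to discoveries of better vintages at the flow rate $X(t)/Q(t)$, and each of these vintages increases the average quality of machines by a proportional amount $\lambda - 1$. Now, given this equation, the maximization problem of the social planner can be written as
$$\max \int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta}\, e^{-\rho t}\, dt$$
subject to
$$\dot Q(t) = (\lambda-1)\frac{\beta}{1-\beta}\,\lambda^{(1-\beta)/\beta}\, Q(t)\, L - (\lambda-1)\, C(t),$$
where the constraint uses net output, (15.6), and the budget constraint. In this problem, $Q(t)$ is the state variable, and $C(t)$ is the control variable. Let us again set up the current-value Hamiltonian
$$\hat H(Q, C, \mu) = \frac{C(t)^{1-\theta} - 1}{1-\theta} + \mu(t)\left[(\lambda-1)\frac{\beta}{1-\beta}\,\lambda^{(1-\beta)/\beta}\, Q(t)\, L - (\lambda-1)\, C(t)\right].$$
The necessary conditions are
$$\hat H_C(Q, C, \mu) = 0 \;\Rightarrow\; C(t)^{-\theta} = (\lambda-1)\,\mu(t),$$
$$\hat H_Q(Q, C, \mu) = \rho\,\mu(t) - \dot\mu(t) \;\Rightarrow\; (\lambda-1)\frac{\beta}{1-\beta}\,\lambda^{(1-\beta)/\beta}\, L\,\mu(t) = \rho\,\mu(t) - \dot\mu(t),$$
$$\lim_{t\to\infty} e^{-\rho t}\,\mu(t)\, Q(t) = 0.$$
Combining these conditions, we obtain the following growth rate for consumption in the social planner's allocation:
$$\frac{\dot C}{C} = \frac{1}{\theta}\left((\lambda-1)\frac{\beta}{1-\beta}\,\lambda^{(1-\beta)/\beta}\, L - \rho\right). \tag{15.7}$$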

Comparing this to (15.5), we can see that either could be greater. For example, when $\lambda$ is small, we see that $g^* > g^S$, where $g^S$ denotes the planner's growth rate in (15.7), so that there is too much innovation in the decentralized economy relative to the social optimum. This illustrates the counteracting influences of the standard underinvestment effect and the business stealing effect discussed above. In particular:

Proposition 42 In the above-described quality-improvement model, the decentralized equilibrium is not Pareto optimal, and may grow less or more rapidly than the allocation that would maximize the utility of the representative household.

The framework analyzed so far assumed technical change to be neutral towards different factors, and in fact, in most applications, we limited ourselves to the Cobb-Douglas production function. Technical change is often not neutral towards different factors of production, and the elasticity of substitution between different factors is often found not to be equal to 1. So it is important to consider the implications of more general production functions, and to think of endogenizing technology and technological differences within this more general framework. There are, however, reasons for economists' focus on the Cobb-Douglas production function. The most important one is that a general production function, associated with arbitrary technological progress, does not generate balanced growth. Instead, with a non-Cobb-Douglas production function, balanced growth requires all technical change to be labor-augmenting. Therefore, once we abandon the Cobb-Douglas production function, we need to develop a theory of why technical change is purely labor-augmenting, and more generally think about various biases in the nature of technical change.

Do we have reason to think that biased technical change is important? The answer appears to be yes: there are many examples of systematic biases in technical change. For example, the consensus among labor economists and macroeconomists is that technical change throughout the 20th century has been skill-biased. There is also a possible acceleration in skill-biased technical change during the past 25 years. In contrast, evidence suggests that technical change during the 19th century may have been, at least in part, skill-replacing. This reasoning leads to the following major question: What explains these various biases and the direction of technical change?

Let us consider a model in which profit incentives determine what type of technologies are developed. When developing technologies complementing a particular factor (say skilled workers) is more profitable, more of these technologies will be developed. Whether the development of these technologies makes aggregate technology more skill-biased or not will depend on the elasticity of substitution between this factor and the rest. What determines the relative profitability of developing different technologies?

1. The price effect: there will be stronger incentives to develop technologies when the goods produced by these technologies command higher prices.

2. The market size effect: it is more profitable to develop technologies that have a larger market. The importance of market size in innovation was much emphasized by the famous scholar of innovation, Jacob Schmookler (1966), who, for example, argued: "invention is largely an economic activity which, like other economic activities, is pursued for gain;... expected gain varies with expected sales of goods embodying the invention."

16.1

16.1.1 Definitions

First consider what factor-augmenting and factor-biased technical change correspond to. For this purpose, take the standard constant elasticity of substitution (CES) production function
$$y = \left[\gamma\,(A_L L)^{\frac{\sigma-1}{\sigma}} + (1-\gamma)\,(A_Z Z)^{\frac{\sigma-1}{\sigma}}\right]^{\frac{\sigma}{\sigma-1}},$$

where $L$ is labor, and $Z$ denotes another factor of production, which could be capital or skilled labor. Here $\sigma \in (0, \infty)$ is the elasticity of substitution between the two factors. $A_L$ is labor-augmenting (labor-complementary) and $A_Z$ is $Z$-complementary. The relative marginal product of the two factors is:
$$\frac{MP_Z}{MP_L} = \frac{1-\gamma}{\gamma}\left(\frac{A_Z}{A_L}\right)^{\frac{\sigma-1}{\sigma}}\left(\frac{Z}{L}\right)^{-\frac{1}{\sigma}}. \tag{16.1}$$

This implies that when $\sigma > 1$, i.e., when the two factors are gross substitutes, $A_L$ is labor-biased and $A_Z$ is $Z$-biased. In contrast, when $\sigma < 1$, i.e., when the two factors are gross complements, $A_Z$ is labor-biased and $A_L$ is $Z$-biased.
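The distinction between factor-augmenting and factor-biased technical change can be illustrated numerically. The following is a minimal sketch, assuming the CES form with share parameter `gamma` and elasticity `sigma` as in (16.1); the function names and parameter values are hypothetical. It checks the closed-form relative marginal product against finite differences and shows that a higher $A_Z$ raises $MP_Z/MP_L$ only when $\sigma > 1$.

```python
def ces_output(L, Z, A_L, A_Z, gamma, sigma):
    """CES aggregate y = [gamma (A_L L)^r + (1 - gamma) (A_Z Z)^r]^(1/r), r = (sigma - 1)/sigma."""
    r = (sigma - 1.0) / sigma
    return (gamma * (A_L * L) ** r + (1.0 - gamma) * (A_Z * Z) ** r) ** (1.0 / r)


def relative_mp_closed_form(L, Z, A_L, A_Z, gamma, sigma):
    """MP_Z / MP_L as in equation (16.1)."""
    r = (sigma - 1.0) / sigma
    return ((1.0 - gamma) / gamma) * (A_Z / A_L) ** r * (Z / L) ** (-1.0 / sigma)


def relative_mp_numeric(L, Z, A_L, A_Z, gamma, sigma, h=1e-6):
    """MP_Z / MP_L from central finite differences of the CES aggregate."""
    mp_Z = (ces_output(L, Z + h, A_L, A_Z, gamma, sigma)
            - ces_output(L, Z - h, A_L, A_Z, gamma, sigma)) / (2 * h)
    mp_L = (ces_output(L + h, Z, A_L, A_Z, gamma, sigma)
            - ces_output(L - h, Z, A_L, A_Z, gamma, sigma)) / (2 * h)
    return mp_Z / mp_L


def bias_of_A_Z(sigma, gamma=0.4, L=2.0, Z=1.0):
    """Change in MP_Z / MP_L when A_Z rises from 1.0 to 1.5 (A_L held at 1)."""
    before = relative_mp_closed_form(L, Z, 1.0, 1.0, gamma, sigma)
    after = relative_mp_closed_form(L, Z, 1.0, 1.5, gamma, sigma)
    return after - before


substitutes = bias_of_A_Z(sigma=2.0)   # positive: A_Z is Z-biased
complements = bias_of_A_Z(sigma=0.5)   # negative: A_Z is labor-biased
```

The sign flip comes entirely from the exponent $(\sigma-1)/\sigma$ on $A_Z/A_L$.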

16.1.2 Basic Model

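Now we are in a position to consider a simple model of directed technical change. Assume that preferences are again given by the CRRA function
$$\int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta}\, e^{-\rho t}\, dt, \tag{16.2}$$
and that the budget constraint is
$$C + I + X \le Y \equiv \left[\gamma\, Y_L^{\frac{\varepsilon-1}{\varepsilon}} + (1-\gamma)\, Y_Z^{\frac{\varepsilon-1}{\varepsilon}}\right]^{\frac{\varepsilon}{\varepsilon-1}}. \tag{16.3}$$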

(16.3)

In words, the output aggregate is produced from two other (intermediate) goods, $Y_L$ and $Y_Z$, with elasticity of substitution $\varepsilon$. Here $Y$ can either be interpreted as the final good aggregated from the two intermediates, $Y_L$ and $Y_Z$, or $Y$ could be an index of utility defined over the two final goods, $Y_L$ and $Y_Z$. Total output is again distributed between consumption, $C$, spending on machines, $I$, and spending on R&D, $X$. The fact that there is R&D spending signifies that I will use the lab-equipment model to present the basic ideas, but exactly the same results apply with the knowledge-spillovers model. Intermediate good production functions are:
$$Y_L(t) = \frac{1}{1-\beta}\left(\int_0^{N_L} x_L(j,t)^{1-\beta}\, dj\right) L^{\beta}, \tag{16.4}$$
and
$$Y_Z(t) = \frac{1}{1-\beta}\left(\int_0^{N_Z} x_Z(j,t)^{1-\beta}\, dj\right) Z^{\beta}. \tag{16.5}$$
Note here that the ranges of machines used in the two sectors are different (there are two disjoint sets of machines, though we use the index $j$ to denote either for notational simplicity). Assume that machines to both sectors are supplied by technology monopolists. This is a straightforward generalization of the endogenous technical change model of product variety discussed above. Each monopolist sets a rental price $\chi_L(j,t)$ or $\chi_Z(j,t)$ for the machine it supplies to the market. These prices are potentially time-varying, but we will see that they will be constant in equilibrium.

The marginal cost of production is the same for all machines and normalized to $1-\beta$ in terms of the final good. Price taking implies the following maximization problem for sector $L$ firms at time $t$:
$$\max_{L,\{x_L(j,t)\}} \; p_L(t)\, Y_L(t) - w_L(t)\, L - \int_0^{N_L} \chi_L(j,t)\, x_L(j,t)\, dj. \tag{16.6}$$
This gives machine demands as
$$x_L(j,t) = \left[\frac{p_L(t)}{\chi_L(j,t)}\right]^{1/\beta} L. \tag{16.7}$$
Similarly,
$$x_Z(j,t) = \left[\frac{p_Z(t)}{\chi_Z(j,t)}\right]^{1/\beta} Z. \tag{16.8}$$

Since the demand curve for machines facing the monopolist, (16.7), is iso-elastic, the profit-maximizing price will be a constant markup over marginal cost. In particular, all machine prices will be given by $\chi_L(j,t) = \chi_Z(j,t) = 1$ for all $j$ and $t$. These imply that $x_L(j,t) = [p_L(t)]^{1/\beta} L$ for all $j$, and $x_Z(j,t) = [p_Z(t)]^{1/\beta} Z$ for all $j$. Substituting these into (16.4) and (16.5), we obtain
$$Y_L(t) = \frac{1}{1-\beta}\,[p_L(t)]^{\frac{1-\beta}{\beta}}\, N_L(t)\, L \quad \text{and} \quad Y_Z(t) = \frac{1}{1-\beta}\,[p_Z(t)]^{\frac{1-\beta}{\beta}}\, N_Z(t)\, Z.$$

Profits of technology monopolists at time $t$ are then obtained as
$$\pi_L(t) = \beta\,[p_L(t)]^{1/\beta}\, L \quad \text{and} \quad \pi_Z(t) = \beta\,[p_Z(t)]^{1/\beta}\, Z. \tag{16.9}$$
Let $V_Z$ and $V_L$ be the net present discounted values of new innovations. Then in steady state, we have that (dropping time dependence):
$$V_L = \frac{\beta\, p_L^{1/\beta}\, L}{r} \quad \text{and} \quad V_Z = \frac{\beta\, p_Z^{1/\beta}\, Z}{r}. \tag{16.10}$$

The comparison of these two values is of crucial importance. The greater is $V_Z$ relative to $V_L$, the greater are the incentives to develop $Z$-complementary machines, $N_Z$, rather than $N_L$. This highlights the two effects on the direction of technical change that I mentioned above:

1. The price effect: a greater incentive to invent technologies producing more expensive goods.

2. The market size effect: a larger market for the technology leads to more innovation. The market size effect encourages innovation for the more abundant factor.

It is straightforward from the final good production function given in (16.3) that the relative price of good $Z$ to good $L$ will be given by
$$p \equiv \frac{p_Z}{p_L} = \frac{1-\gamma}{\gamma}\left(\frac{Y_Z}{Y_L}\right)^{-\frac{1}{\varepsilon}} = \frac{1-\gamma}{\gamma}\left(p^{\frac{1-\beta}{\beta}}\,\frac{N_Z Z}{N_L L}\right)^{-\frac{1}{\varepsilon}}. \tag{16.11}$$
Substituting for relative prices into the steady state (BGP) value functions, relative profitability is obtained as:
$$\frac{V_Z}{V_L} = \left(\frac{1-\gamma}{\gamma}\right)^{\frac{\varepsilon}{\sigma}}\left(\frac{N_Z}{N_L}\right)^{-\frac{1}{\sigma}}\left(\frac{Z}{L}\right)^{\frac{\sigma-1}{\sigma}}, \tag{16.12}$$
where
$$\sigma \equiv \varepsilon - (\varepsilon-1)(1-\beta)$$
is the (derived) elasticity of substitution between the two factors. An increase in the relative factor supply, $Z/L$, will increase $V_Z/V_L$ as long as $\sigma > 1$ and it will reduce it if $\sigma < 1$. Therefore, the elasticity of substitution regulates whether the price effect dominates the market size effect. Note also that we have
$$\sigma \gtrless 1 \iff \varepsilon \gtrless 1.$$
So the two factors will be gross substitutes when the two goods in the utility function (or the two intermediates in the production of the final good) are gross substitutes.

We have so far characterized the demand for new technologies. Next we have to determine the supply of new technologies, which will be, in part, regulated by the technological possibilities for generating new machine varieties. Suppose as in the analysis above that new machines in the two sectors are produced by investing in lab equipment:
$$\dot N_L = \eta_L X_L \quad \text{and} \quad \dot N_Z = \eta_Z X_Z,$$
where $X$ denotes R&D expenditure. This gives the following steady-state technology market clearing condition:
$$\eta_L V_L = \eta_Z V_Z. \tag{16.13}$$
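The interplay of the price effect and the market size effect can be explored with a short numerical sketch. It assumes the relative profitability expression $V_Z/V_L = ((1-\gamma)/\gamma)^{\varepsilon/\sigma}(N_Z/N_L)^{-1/\sigma}(Z/L)^{(\sigma-1)/\sigma}$ with $\sigma = \varepsilon - (\varepsilon-1)(1-\beta)$; the function names and parameter values are hypothetical.

```python
def derived_sigma(eps: float, beta: float) -> float:
    """Derived elasticity of substitution between the two factors."""
    return eps - (eps - 1.0) * (1.0 - beta)


def relative_profitability(Z_over_L, NZ_over_NL, gamma, eps, beta):
    """V_Z / V_L for given relative supplies and relative technologies."""
    sigma = derived_sigma(eps, beta)
    return (((1.0 - gamma) / gamma) ** (eps / sigma)
            * NZ_over_NL ** (-1.0 / sigma)
            * Z_over_L ** ((sigma - 1.0) / sigma))


def response_to_supply(eps, beta=0.5, gamma=0.5, NZ_over_NL=1.0):
    """Change in V_Z / V_L when Z/L doubles from 1 to 2, holding technology fixed."""
    low = relative_profitability(1.0, NZ_over_NL, gamma, eps, beta)
    high = relative_profitability(2.0, NZ_over_NL, gamma, eps, beta)
    return high - low


market_size_dominates = response_to_supply(eps=2.0)   # sigma = 1.5 > 1
price_effect_dominates = response_to_supply(eps=0.5)  # sigma = 0.75 < 1
```

When the goods are gross substitutes the market size effect wins and abundance raises relative profitability; with gross complements the price effect wins.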

Then, the steady-state relative physical productivities can be solved for
$$\left(\frac{N_Z}{N_L}\right)^* = \eta^{\sigma}\left(\frac{1-\gamma}{\gamma}\right)^{\varepsilon}\left(\frac{Z}{L}\right)^{\sigma-1}, \tag{16.14}$$
where $\eta \equiv \eta_Z/\eta_L$ and the *'s denote that this expression refers to the steady-state value.

Before going further, using the same type of analysis as before, we can characterize the equilibrium in this economy. Because there are two state variables now, the economy features transitional dynamics, but still has a unique balanced growth path. These are stated in the next proposition (and left for you to prove):

Proposition 43 In the directed technical change model described here, there exists a unique balanced growth path equilibrium in which the relative technologies are given by (16.14), and consumption and output grow at the rate
$$g = \frac{1}{\theta}\left(\beta\left[(1-\gamma)^{\varepsilon}(\eta_Z Z)^{\sigma-1} + \gamma^{\varepsilon}(\eta_L L)^{\sigma-1}\right]^{\frac{1}{\sigma-1}} - \rho\right).$$
Starting from any $N_L(0) > 0$ and $N_Z(0) > 0$, the economy converges to this balanced growth path.

More interesting than the aggregate growth rate of the economy in this case is how the direction of technical change affects relative factor prices and how it responds to changes in relative supplies. To study this issue, recall that relative factor prices are given by
$$\frac{w_Z}{w_L} = p^{1/\beta}\,\frac{N_Z}{N_L} = \left(\frac{1-\gamma}{\gamma}\right)^{\frac{\varepsilon}{\sigma}}\left(\frac{N_Z}{N_L}\right)^{\frac{\sigma-1}{\sigma}}\left(\frac{Z}{L}\right)^{-\frac{1}{\sigma}}. \tag{16.15}$$
First, the relative factor reward, $w_Z/w_L$, is decreasing in the relative factor supply, $Z/L$. Second, the same combination of parameters, $(\sigma-1)/\sigma$, which determines whether innovation for more abundant factors is more profitable, also determines whether a greater $N_Z/N_L$ (i.e., a greater relative physical productivity of factor $Z$) increases $w_Z/w_L$.
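As a rough check on the steady-state expressions, the sketch below assumes the forms $(N_Z/N_L)^* = \eta^{\sigma}((1-\gamma)/\gamma)^{\varepsilon}(Z/L)^{\sigma-1}$ with $\eta = \eta_Z/\eta_L$, and the growth rate of Proposition 43 written as $g = \theta^{-1}(\beta[(1-\gamma)^{\varepsilon}(\eta_Z Z)^{\sigma-1} + \gamma^{\varepsilon}(\eta_L L)^{\sigma-1}]^{1/(\sigma-1)} - \rho)$. It verifies that the technology market clearing condition $\eta_L V_L = \eta_Z V_Z$ holds at the computed ratio. All names and numbers are hypothetical.

```python
def derived_sigma(eps, beta):
    """Derived elasticity of substitution between the two factors."""
    return eps - (eps - 1.0) * (1.0 - beta)


def relative_profitability(Z_over_L, NZ_over_NL, gamma, eps, beta):
    """V_Z / V_L for given relative supplies and relative technologies."""
    sigma = derived_sigma(eps, beta)
    return (((1.0 - gamma) / gamma) ** (eps / sigma)
            * NZ_over_NL ** (-1.0 / sigma)
            * Z_over_L ** ((sigma - 1.0) / sigma))


def steady_state_tech_ratio(Z, L, eta_Z, eta_L, gamma, eps, beta):
    """Steady-state N_Z / N_L implied by eta_L V_L = eta_Z V_Z."""
    sigma = derived_sigma(eps, beta)
    eta = eta_Z / eta_L
    return eta ** sigma * ((1.0 - gamma) / gamma) ** eps * (Z / L) ** (sigma - 1.0)


def bgp_growth(Z, L, eta_Z, eta_L, gamma, eps, beta, theta, rho):
    """Balanced growth rate under the assumed expression (requires sigma != 1)."""
    sigma = derived_sigma(eps, beta)
    inner = ((1.0 - gamma) ** eps * (eta_Z * Z) ** (sigma - 1.0)
             + gamma ** eps * (eta_L * L) ** (sigma - 1.0))
    return (beta * inner ** (1.0 / (sigma - 1.0)) - rho) / theta


# Hypothetical parameter values.
params = dict(Z=3.0, L=2.0, eta_Z=1.2, eta_L=0.8, gamma=0.4, eps=2.0, beta=0.5)
n_star = steady_state_tech_ratio(**params)
clearing = (params["eta_Z"] / params["eta_L"]) * relative_profitability(
    params["Z"] / params["L"], n_star, params["gamma"], params["eps"], params["beta"])
g_example = bgp_growth(Z=1.0, L=1.0, eta_Z=1.0, eta_L=1.0, gamma=0.5,
                       eps=2.0, beta=0.5, theta=2.0, rho=0.02)
```

Here `clearing` should equal one, since at the steady-state technology ratio the two kinds of innovation are equally profitable per unit of R&D spending.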

When $\sigma > 1$, a greater $N_Z/N_L$ increases $w_Z/w_L$, but when $\sigma < 1$, it has the opposite effect. This implies that irrespective of whether $\sigma$ is greater than or less than one, an increase in $Z/L$ will change $N_Z/N_L$ in a direction that increases the relative reward to factor $Z$, i.e., $w_Z/w_L$. To capture this notion, let us define weak endogenous (relative) bias as the phenomenon that an increase in the relative supply of a factor changes technology in a direction that benefits the factor that is becoming more abundant. This discussion, together with the definition of weak endogenous bias, establishes:

Proposition 44 In the above-described directed technical change model, there is always weak endogenous (relative) bias, meaning that an increase in $Z/L$ always causes relatively $Z$-biased technical change.

Substituting (16.14) into (16.15), relative factor rewards are
$$\frac{w_Z}{w_L} = \eta^{\sigma-1}\left(\frac{1-\gamma}{\gamma}\right)^{\varepsilon}\left(\frac{Z}{L}\right)^{\sigma-2}, \tag{16.16}$$
where $\eta \equiv \eta_Z/\eta_L$.

Comparing this equation to the relative demand for a given technology, we see that the response of relative factor rewards to changes in relative supply is always more elastic in (16.16) than in (16.15), as implied by Proposition 44. This is simply an application of the LeChatelier principle, which states that demand curves become more elastic when other factors adjust, but with a new interpretation: the relative demand curves become flatter when technology adjusts. The more important and surprising result here is that if $\sigma$ is sufficiently large, in particular if $\sigma > 2$, the relationship between relative factor supplies and relative factor rewards can be upward sloping. Let us refer to a situation in which an increase in the relative supply of a factor changes technology so much that the relative price of the factor becoming more abundant increases as strong endogenous (relative) bias. Therefore, the analysis so far has established:

Proposition 45 In the above-described directed technical change model, if $\sigma > 2$, there is strong endogenous (relative) bias in the sense that an increase in $Z/L$ raises the relative marginal product and the relative wage of the $Z$ factor compared to the $L$ factor.
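The comparison between the constant-technology and endogenous-technology relative demand curves comes down to two elasticities of $w_Z/w_L$ with respect to $Z/L$: $-1/\sigma$ when $N_Z/N_L$ is held fixed, and $\sigma - 2$ once technology adjusts. The sketch below, with hypothetical helper names, tabulates both and checks the two claims made above.

```python
def short_run_elasticity(sigma: float) -> float:
    """Elasticity of w_Z/w_L with respect to Z/L at constant N_Z/N_L."""
    return -1.0 / sigma


def long_run_elasticity(sigma: float) -> float:
    """Elasticity of w_Z/w_L with respect to Z/L once N_Z/N_L has adjusted."""
    return sigma - 2.0


def has_strong_bias(sigma: float) -> bool:
    """True when the long-run relative demand curve slopes upward."""
    return long_run_elasticity(sigma) > 0.0


# The gap equals (sigma - 1)^2 / sigma, which is never negative,
# so the long-run curve is always at least as flat as the short-run curve.
table = [(s, short_run_elasticity(s), long_run_elasticity(s), has_strong_bias(s))
         for s in (0.5, 1.0, 1.5, 2.0, 2.5)]
for sigma, short_run, long_run, strong in table:
    print(f"sigma={sigma:.1f}  short-run={short_run:+.3f}  long-run={long_run:+.3f}  strong bias={strong}")
```

The two elasticities coincide only at $\sigma = 1$, where technology does not respond to relative supplies at all.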

16.1.3 Implications

Let us now consider the implications of this simple model of directed technical change, and in particular of Propositions 44 and 45. One of the most interesting applications is to changes in the skill premium. For this application, imagine that $Z = H$ stands for skilled workers, for example, college-educated workers. In the United States labor market, the skill premium has shown no tendency to decline despite a very large increase in the supply of college-educated workers. On the contrary, following a brief period of decline during the 1970s in the face of the very large increase in the supply of college-educated workers, the skill (college) premium has increased very sharply throughout the 1980s and 1990s, to reach a level not experienced in the postwar era. The following figure shows the general patterns by plotting the college premium and the relative supply of college graduate workers in the United States since WWII.

[Figure: college wage premium and relative supply of college skills in the United States, plotted by year, 1939 to 1996.]

In the labor and macro literature, the most popular explanation for these patterns is skill-biased technological change. For example, computers or new IT technologies are argued to favor skilled workers relative to unskilled workers. But why should the economy adopt and develop more skill-biased technologies throughout the past 20 years, or more generally throughout the entire 20th century? This question becomes more relevant once we remember that during the 19th century many of the technologies that were fueling economic growth, such as the factory system and the major spinning and weaving innovations, were skill-replacing rather than skill-complementary. Thus, in summary, we have the following stylized facts:

1. Secular skill-biased technical change increasing the demand for skills throughout the 20th century.

2. Possible acceleration in skill-biased technical change over the past 25 years.

3. Many skill-replacing technologies during the 19th century.

The current model, in particular Propositions 44 and 45, gives us a way to think about these issues. Recall that if $\sigma > 2$, then the long-run relationship between the relative supply of skills and the skill premium is positive. With an upward sloping relative demand curve, or simply with the degree of skill bias endogenized, we have a natural explanation for all of the patterns mentioned above.

1. The increase in the number of skilled workers that has taken place throughout the 20th century is predicted to cause steady skill-biased technical change.

2. Acceleration in the increase in the number of skilled workers over the past 25 years is predicted to induce an acceleration in skill-biased technical change.

3. The large increase in the number of unskilled workers available to be employed in the factories during the 19th century could be expected to induce skill-replacing/labor-biased technical change.

In addition, this framework with endogenous technology also gives a nice interpretation for the dynamics of the college premium during the 1970s and 1980s. It is reasonable to presume that the equilibrium skill bias of technologies, $N_H/N_L$, is a sluggish variable determined by the slow buildup and development of new technologies. In this case, a rapid increase in the supply of skills would first reduce the skill premium as the economy would be moving along a constant technology (constant $N_H/N_L$) curve in the figure. After a while the technology would start adjusting, and the economy would move back to the upward sloping relative demand curve, with a very sharp increase in the college premium. This approach can therefore explain both the decline in the college premium during the 1970s and the subsequent large surge, and relates both to the large increase in the supply of skilled workers.

[Figure: relative wage against relative supply. Labels: initial relative wage, short-run response, long-run relative wage (above the initial level), exogenous shift in relative supply.]

If on the other hand we have $\sigma < 2$, the long-run relative demand curve will be downward sloping, though again it will be shallower than the short-run relative demand curve. Then following the increase in the relative supply of skills there will be an initial decline in the skill premium (college premium), and as technology starts adjusting the skill premium will increase. But it will end up below its initial level. To explain the larger increase in the 1980s, in this case we need some exogenous skill-biased technical change. The next figure draws this case.

[Figure: relative wage against relative supply. Labels: initial relative wage, short-run response, long-run relative wage (below the initial level), exogenous shift in relative supply.]

16.2

The above model derived the relative bias results by assuming a constant elasticity of substitution production function. In fact, the spirit of the results is much more general. The following proposition generalizes these results:

Proposition 46 Consider the above economy with two factors, $(Z, L) \in \mathbb{R}^2_+$, and two factor-augmenting technologies, $(A_Z, A_L) \in \mathbb{R}^2_+$, such that the production function is $F(A_Z Z, A_L L)$. Assume that $F$ is twice continuously differentiable, concave and homothetic in its two arguments, and that the cost of producing technologies $A_Z$ and $A_L$, $C(A_Z, A_L)$, is also twice continuously differentiable, strictly convex and homothetic in $A_Z$ and $A_L$. Denote the first derivatives of $C(A_Z, A_L)$ by $C_Z$ and $C_L$. Let $\sigma$ be the (local) elasticity of substitution between $Z$ and $L$, defined by
$$\sigma = -\left.\frac{\partial \ln (Z/L)}{\partial \ln (w_Z/w_L)}\right|_{A_Z/A_L},$$
and let
$$\delta = \frac{\partial \ln\left(C_Z(A_Z, A_L)/C_L(A_Z, A_L)\right)}{\partial \ln (A_Z/A_L)}.$$
Finally, suppose that factor supplies are given by $(\bar Z, \bar L)$, and denote equilibrium technologies by $(A_Z^*, A_L^*)$ and equilibrium factor prices by $w_Z(\bar Z, \bar L, A_Z^*, A_L^*)$ and $w_L(\bar Z, \bar L, A_Z^*, A_L^*)$. Then we have that for all $(\bar Z, \bar L)$:
$$\frac{\partial \ln (A_Z^*/A_L^*)}{\partial \ln (Z/L)} = \frac{\sigma - 1}{1 + \sigma\delta} \tag{16.17}$$
and
$$\frac{\partial \ln\left(w_Z(\bar Z, \bar L, A_Z^*, A_L^*)/w_L(\bar Z, \bar L, A_Z^*, A_L^*)\right)}{\partial \ln (A_Z/A_L)} \cdot \frac{\partial \ln (A_Z^*/A_L^*)}{\partial \ln (Z/L)} \ge 0, \tag{16.18}$$
so that there is always weak relative equilibrium bias. Moreover,
$$\frac{d \ln\left(w_Z(\bar Z, \bar L, A_Z^*, A_L^*)/w_L(\bar Z, \bar L, A_Z^*, A_L^*)\right)}{d \ln (Z/L)} = \frac{\sigma - 2 - \delta}{1 + \sigma\delta}, \tag{16.19}$$
so that there is strong relative equilibrium bias if $\sigma - 2 - \delta > 0$.

Proof. The proof is provided in Acemoglu (2005).
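To see how the convexity of technology costs shifts the threshold for strong bias, the following sketch evaluates the two elasticities in Proposition 46, assuming they take the forms $(\sigma-1)/(1+\sigma\delta)$ and $(\sigma-2-\delta)/(1+\sigma\delta)$. The function names and the numerical values of `sigma` and `delta` are hypothetical.

```python
def technology_response(sigma: float, delta: float) -> float:
    """Elasticity of equilibrium A_Z/A_L with respect to Z/L, as in (16.17)."""
    return (sigma - 1.0) / (1.0 + sigma * delta)


def relative_wage_response(sigma: float, delta: float) -> float:
    """Total elasticity of w_Z/w_L with respect to Z/L, as in (16.19)."""
    return (sigma - 2.0 - delta) / (1.0 + sigma * delta)


def strong_bias(sigma: float, delta: float) -> bool:
    """Strong relative equilibrium bias requires sigma > 2 + delta."""
    return relative_wage_response(sigma, delta) > 0.0


# With delta = 0 the expressions collapse to sigma - 1 and sigma - 2,
# the exponents found in the CES lab-equipment model.
ces_tech = technology_response(sigma=2.5, delta=0.0)
ces_wage = relative_wage_response(sigma=2.5, delta=0.0)

# A more convex cost function (larger delta) raises the required substitutability.
with_convex_costs = strong_bias(sigma=2.5, delta=1.0)
```

In this illustration $\sigma = 2.5$ is enough for strong bias when $\delta = 0$ but not when $\delta = 1$.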

In this environment, therefore, the condition for strong equilibrium bias is $\sigma > 2 + \delta$, and thus more restrictive than in the previous model. This is because the costs of creating new technologies are convex. In models with knowledge spillovers, there are typically nonconvexities in the creation of new technologies as well. For example, invention of skill-biased technologies today may make further invention of skill-biased technologies easier, as in the standard "building on the shoulders of giants" specification. In that case, Acemoglu (2002) shows that the condition for an upward-sloping relative demand curve (i.e., strong relative equilibrium bias) is in fact $\sigma > 2 - \delta'$, for some other parameter $\delta' > 0$ measuring the extent of this nonconvexity. Thus in general, strong equilibrium bias requires sufficient substitutability between factors, with the exact threshold depending on the structure of costs (or the technology possibilities frontier of the economy).

16.3

One of the advantages of the models of directed technical change is that they allow us to investigate why technological change might be purely labor-augmenting, as required for balanced growth. Here I outline a model which generates this result (though under somewhat more restrictive assumptions than the directed technical change results we have seen so far).

16.3.1

Consider an economy consisting of $L$ unskilled workers who work in the production sector, and $S$ scientists who perform R&D. The distinction between unskilled workers and scientists is adopted to ensure that the production and R&D sectors do not compete for workers. The economy again admits a representative consumer with the usual constant relative risk aversion (CRRA) preferences:
$$\int_0^\infty \frac{C(t)^{1-\theta} - 1}{1-\theta}\, e^{-\rho t}\, dt, \tag{16.20}$$
where $C(t)$ is consumption at time $t$ and $\theta \ge 0$ is the elasticity of marginal utility.

The budget constraint of the representative consumer requires that consumption and investment expenditures are less than total income:
$$C + I \le wL + rK + \omega_S S + \Pi, \tag{16.21}$$
where $I$ denotes investment, $w$ is the wage rate of labor, $r$ is the interest rate, $K$ denotes the capital stock, $\omega_S$ is the wage rate for scientists, and $\Pi$ is total profit income. The resource constraint of the economy implies that
$$wL + rK + \omega_S S + \Pi = Y = \left[\gamma\, Y_L^{\frac{\varepsilon-1}{\varepsilon}} + (1-\gamma)\, Y_K^{\frac{\varepsilon-1}{\varepsilon}}\right]^{\frac{\varepsilon}{\varepsilon-1}}, \tag{16.22}$$
where $Y$ is an output aggregate produced from a labor-intensive and a capital-intensive good, respectively $Y_L$ and $Y_K$, with elasticity of substitution $\varepsilon$, where $0 \le \varepsilon < \infty$. We will see below that $\varepsilon$ will also determine the short-run elasticity of substitution between capital and labor. A host of evidence suggests that this short-run elasticity between capital and labor is less than one, which in the context of this model implies that $\varepsilon < 1$. For simplicity, let us assume that there is no depreciation of capital, so the change in the capital stock (and in the representative consumer's asset level) is given by
$$\dot K = I. \tag{16.23}$$

Let us also use this opportunity to develop a variant of the models studied above. In particular, let us assume that the labor-intensive and capital-intensive goods are produced competitively from constant elasticity of substitution (CES) production functions of labor-intensive and capital-intensive intermediates, with elasticity $\nu \equiv 1/(1-\beta)$:
$$Y_L = \left[\int_0^n y_l(i)^{\beta}\, di\right]^{1/\beta} \quad \text{and} \quad Y_K = \left[\int_0^m y_k(i)^{\beta}\, di\right]^{1/\beta}, \tag{16.24}$$
where the $y(i)$'s denote the intermediate goods and $\beta \in (0,1)$, so that $\nu > 1$ and different intermediate goods are gross substitutes. This formulation implies that there are two different sets of intermediate goods, $n$ of those that are produced with labor, and $m$ that are produced using only capital. An increase in $n$ (an expansion in the set of labor-intensive intermediates) corresponds to labor-augmenting technical change, while an increase in $m$ corresponds to capital-augmenting technical change. Intermediate goods are supplied by monopolists who hold the relevant patent, and are produced linearly from their respective factors:
$$y_l(i) = l(i) \quad \text{and} \quad y_k(i) = k(i), \tag{16.25}$$

where $l(i)$ and $k(i)$ are labor and capital used in the production of good $i$. Market clearing for labor and capital then requires:
$$\int_0^n l(i)\, di = L \quad \text{and} \quad \int_0^m k(i)\, di = K. \tag{16.26}$$

To close the model, we need to specify the innovation possibilities frontier, that is, the technological possibilities for transforming resources into blueprints for new varieties of capital-intensive and labor-intensive intermediates. Let us assume that these blueprints are created by the R&D efforts of scientists, who are, in turn, employed by R&D firms. There is free entry into the R&D sector. Once an R&D firm invents a new intermediate, it receives a perfectly enforced patent and becomes the perpetual monopolist of that intermediate. R&D firms have access to the following technologies for invention:
$$\frac{\dot n}{n} = b_l\,\phi(S_l)\, S_l - \delta \quad \text{and} \quad \frac{\dot m}{m} = b_k\,\phi(S_k)\, S_k - \delta, \tag{16.27}$$
where $b_l$, $b_k$ and $\delta$ are strictly positive constants and $\phi(\cdot)$ is a continuously differentiable and decreasing function such that $\phi(s)\,s$ is always increasing, and $\phi(0) < \infty$. $S_l$ and $S_k$ denote, respectively, the number of scientists working to discover new labor-intensive and capital-intensive intermediates, with the market clearing condition
$$S_l + S_k = S. \tag{16.28}$$
I also assume that the economy starts at $t = 0$ with $n(0) > 0$ and $m(0) > 0$. Equation (16.27) implies a number of important features:

1. Technical change is directed, in the sense that the society (researchers) can generate faster improvements in one type of intermediates than the other. This feature will enable the analysis of whether equilibrium technical change will be labor- or capital-augmenting.

2. The fact that $\phi(\cdot)$ is decreasing means that there are intra-temporal decreasing returns to R&D effort; when more scientists are allocated to the invention of labor-intensive intermediates, the productivity of each declines. This might be, for example, because scientists crowd each other out in competing for the invention of similar intermediates. This decreasing returns assumption is adopted to simplify the analysis of transitional dynamics: when $\phi(\cdot)$ is constant, the behavior of $S_l$ and $S_k$ is discontinuous.

3. Research effort devoted to the invention of labor-intensive intermediates, $\phi(S_l)\, S_l$, leads to a proportional increase in the supply of these intermediates at the rate $b_l$, while the same effort devoted to the discovery of capital-using intermediates leads to a proportional increase at the rate $b_k$. The parameters $b_l$ and $b_k$ potentially differ since the discovery of one type of new intermediate may be technically more difficult than discovering the other type (the standard model with only labor-augmenting technical change can be thought of as the special case with $b_k = 0$). I also assume that the crowding effect captured by the function $\phi(\cdot)$ is not internalized by individual R&D firms, so each R&D firm takes the productivity of allocating one more scientist to each of the two sectors, $b_l\,\phi(S_l)$ or $b_k\,\phi(S_k)$, as given when deciding which sector to enter. The results are identical when R&D firms act non-competitively and form global research consortiums, internalizing these crowding-out effects.

4. Each intermediate disappears at the rate $\delta$, so that when there is no research effort devoted to a particular type of intermediates, its stock declines exponentially. With $\delta = 0$, the results are similar, but there will exist multiple balanced growth paths (see below).

Notice that in (16.27) scientists are "standing on the shoulders of giants" as in the model of knowledge spillovers analyzed above. In fact, equation (16.27) is a direct generalization of the accumulation equation in the one-sector knowledge spillovers model analyzed above, where we had a special form of $\dot n/n = b_l\,\phi(S)\, S$. However, when we go to an economy with two sectors, there is the issue of how innovations in one sector affect the knowledge base of the other sector. An additional assumption implicit in (16.27) is that a higher stock of knowledge accumulated in one sector benefits only that sector (i.e., a higher $n$ increases the productivity of scientists working in the $n$-sector). This, as we will see, is the crucial assumption that enables the model to generate endogenous technological change that is purely labor-augmenting.

Finally, dene Sl and Sk as the number of scientists required to keep the state of tech ) Sk = . Let us impose: nology in each sector constant, i.e., bl (Sl ) Sl = and bk (Sk

360

Assumption 14 Sl + Sk < S,

This assumption implies that there is enough scientists in the society to enable technological progress in both sectors.

16.3.2

An equilibrium in this economy is given by time paths of factor, intermediate and good prices, $w$, $r$, $\omega_S$, $[p_l(i)]_{i=0}^{n}$, $[p_k(i)]_{i=0}^{m}$, $p_L$ and $p_K$; employment, consumption and saving decisions, $[l(i)]_{i=0}^{n}$, $[k(i)]_{i=0}^{m}$, $[y_l(i)]_{i=0}^{n}$, $[y_k(i)]_{i=0}^{m}$, $C$ and $I$; and the allocation of scientists between the two sectors, $S_l$ and $S_k$, such that $[y_l(i)]_{i=0}^{n}$, $[y_k(i)]_{i=0}^{m}$, $C$ and $I$ maximize the utility of the representative consumer given factor, intermediate and good prices; $[l(i)]_{i=0}^{n}$, $[k(i)]_{i=0}^{m}$, $[p_l(i)]_{i=0}^{n}$ and $[p_k(i)]_{i=0}^{m}$ maximize profits of intermediate goods monopolists; $S_l$ and $S_k$ imply zero profits for all R&D firms; and all markets clear.

I start with the optimal consumption path of the representative consumer, which satisfies the familiar Euler equation:
\[ \frac{\dot C}{C} = \frac{1}{\theta}\,(r - \rho), \tag{16.29} \]
where recall that $r$ is the rate of interest. The consumption sequence $[C(t)]_{0}^{\infty}$ also satisfies the lifetime budget constraint of the representative agent (the no Ponzi game constraint):
\[ \lim_{t\to\infty} K(t)\exp\left(-\int_0^t r(v)\,dv\right) = 0. \tag{16.30} \]

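Consumer maximization gives the relative price of the capital-intensive good as:
\[ p \equiv \frac{p_K}{p_L} = \frac{1-\gamma}{\gamma}\left(\frac{Y_K}{Y_L}\right)^{-\frac{1}{\varepsilon}}, \tag{16.31} \]
where $p_K$ is the price of $Y_K$ and $p_L$ is the price of $Y_L$. To determine the level of prices, I choose the price of the consumption aggregate, $Y$, in each period as numeraire, i.e., $\gamma^{\varepsilon}p_L^{1-\varepsilon} + (1-\gamma)^{\varepsilon}p_K^{1-\varepsilon} = 1$, which implies that:
\[ p_K = \left[\gamma^{\varepsilon}p^{\varepsilon-1} + (1-\gamma)^{\varepsilon}\right]^{\frac{1}{\varepsilon-1}} \quad\text{and}\quad p_L = \left[\gamma^{\varepsilon} + (1-\gamma)^{\varepsilon}p^{1-\varepsilon}\right]^{\frac{1}{\varepsilon-1}}. \tag{16.32} \]
Next, consumer maximization and the CES functions in (16.24) yield the following isoelastic demand curves for intermediates:
\[ \frac{y_l(i)}{Y_L} = \left(\frac{p_l(i)}{p_L}\right)^{-\frac{1}{1-\beta}} \quad\text{and}\quad \frac{y_k(i)}{Y_K} = \left(\frac{p_k(i)}{p_K}\right)^{-\frac{1}{1-\beta}}. \tag{16.33} \]
Given these isoelastic demands, profit maximization by the monopolists implies that prices will be set as a constant markup over marginal cost (which is $w$ for the labor-intensive intermediates and $r$ for the capital-intensive intermediates):
\[ p_l(i) = \frac{w}{\beta} \quad\text{and}\quad p_k(i) = \frac{r}{\beta}. \tag{16.34} \]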

Since, from (16.34), all labor-intensive intermediates sell at the same price, equation (16.33) implies that $y_l(i) = y_l$ for all $i$, and since all capital-intensive intermediates also sell at the same price, $y_k(i) = y_k$ for all $i$ as well. Then from the market clearing equation (16.26), we obtain
\[ y_l(i) = l(i) = \frac{L}{n} \quad\text{and}\quad y_k(i) = k(i) = \frac{K}{m}. \tag{16.35} \]
Substituting (16.35) into (16.24) and integrating gives the total supply of labor- and capital-intensive goods as:
\[ Y_L = n^{\frac{1-\beta}{\beta}}L \quad\text{and}\quad Y_K = m^{\frac{1-\beta}{\beta}}K. \tag{16.36} \]

These equations reiterate that $n$ and $m$ correspond to labor- and capital-augmenting technologies. Greater $n$ enables the production of a greater level of $Y_L$ for a given quantity of labor, and similarly an increase in $m$ raises the productivity of capital.

Equations (16.33), (16.34), (16.35) and (16.36) give the wage rate and the rental rate of capital as:
\[ w = \beta\, n^{\frac{1-\beta}{\beta}}p_L \quad\text{and}\quad r = \beta\, m^{\frac{1-\beta}{\beta}}p_K. \tag{16.37} \]
Finally, using (16.31) and (16.36), the relative price of the capital-intensive good is
\[ p \equiv \frac{p_K}{p_L} = \frac{1-\gamma}{\gamma}\left[\left(\frac{m}{n}\right)^{\frac{1-\beta}{\beta}}\frac{K}{L}\right]^{-\frac{1}{\varepsilon}}. \tag{16.38} \]
The value of a monopolist who invents a new $f$-intermediate, for $f = l$ or $k$, is:
\[ V_f(t) = \int_t^{\infty}\exp\left[-\int_t^{s}\left(r(v) + \delta\right)dv\right]\pi_f(s)\,ds, \tag{16.39} \]
where $r(t)$ is the interest rate at date $t$, $\delta$ is the depreciation (obsolescence) rate of existing intermediates, and
\[ \pi_l = \frac{1-\beta}{\beta}\,\frac{wL}{n} \quad\text{and}\quad \pi_k = \frac{1-\beta}{\beta}\,\frac{rK}{m} \tag{16.40} \]
are the flow profits from the sale of labor- and capital-intensive intermediate goods.

Scientists are paid a wage $\omega_S$, and competition between the two sectors and free entry ensure that this wage is equal to the maximum of their contribution to the value of monopolists in the two sectors. Recall that R&D firms do not internalize the crowding effects, so the marginal value of allocating one more scientist to the invention of labor-intensive intermediates is $b_l\phi(S_l)nV_l$, and for capital-intensive intermediates, it is $b_k\phi(S_k)mV_k$, where $V_l$ and $V_k$ are given by (16.39). Therefore, free entry requires:
\[ \omega_S = \max\left\{b_l\phi(S_l)\,nV_l,\; b_k\phi(S_k)\,mV_k\right\}. \tag{16.41} \]
Equation (16.41) implies zero expected profits for all firms at all points in time, so $\Pi = 0$ in (16.21).

An equilibrium in this economy is therefore a set of factor prices, $w$, $r$ and $\omega_S$, that satisfy (16.37) and (16.41), good prices, $[p_l(i)]_{i=0}^{n}$ and $[p_k(i)]_{i=0}^{m}$, that satisfy (16.34), intermediate production levels given by (16.35), output levels given by (16.36), sequences of aggregate consumption and investment levels that satisfy (16.29) and (16.30), and sequences of $S_l$ and $S_k$ that satisfy (16.41).

16.3.3

Let us define an asymptotic path (AP) as an equilibrium path that the economy tends to as $t \to \infty$, and does not include limit cycles. In an AP, we can have either $\lim_{t\to\infty}\dot C(t)/C(t) = \infty$, i.e., consumption grows more than exponentially (explodes), or $\lim_{t\to\infty}\dot C(t)/C(t) = g_c$, i.e., the rate of consumption growth tends to a constant, possibly 0 (including the case where $\lim_{t\to\infty}C(t) = 0$ as a special case). A balanced growth path (BGP) is defined as an AP where output, consumption and the capital stock grow at the same finite constant rate, i.e., $\lim_{t\to\infty}\dot C(t)/C(t) = \lim_{t\to\infty}\dot Y(t)/Y(t) = \lim_{t\to\infty}\dot K(t)/K(t) = g$.

This subsection will show that with $\varepsilon < 1$, only BGPs can be an AP, so if the economy is going to tend to a non-cycling path, this has to be a BGP. In contrast, with $\varepsilon \geq 1$, there may exist asymptotic paths where consumption grows more than exponentially or grows at a different rate than capital, but these are not interesting for us given our focus on $\varepsilon < 1$.

To facilitate the analysis, let us adopt the notation
\[ N \equiv n^{\frac{1-\beta}{\beta}} \quad\text{and}\quad M \equiv m^{\frac{1-\beta}{\beta}}, \]
which, together with (16.36), allows me to write output in a more compact way:
\[ Y = \left[\gamma\,(NL)^{\frac{\varepsilon-1}{\varepsilon}} + (1-\gamma)\,(MK)^{\frac{\varepsilon-1}{\varepsilon}}\right]^{\frac{\varepsilon}{\varepsilon-1}}. \tag{16.42} \]
In addition, I define a normalized capital stock,
\[ k \equiv \frac{MK}{NL}, \tag{16.43} \]

which is a direct generalization of the normalized capital stock defined in the neoclassical growth model as capital stock divided by the effective units of labor. Here the numerator contains the effective units of capital as well, since there can be capital-augmenting technical change. Then, using (16.32), (16.37), (16.38) and (16.43), we can write the interest rate as:
\[ r = R(M, k) \equiv \beta\,(1-\gamma)\,M\left[\gamma\,k^{\frac{1-\varepsilon}{\varepsilon}} + (1-\gamma)\right]^{\frac{1}{\varepsilon-1}}, \tag{16.44} \]
and the relative share of capital as:
\[ \sigma_K \equiv \frac{rK}{wL} = pk = \frac{1-\gamma}{\gamma}\,k^{\frac{\varepsilon-1}{\varepsilon}}. \tag{16.45} \]
The relationship between the relative share of capital and the normalized capital stock depends on $\varepsilon$, which is the elasticity of substitution between capital-intensive and labor-intensive goods. Equation (16.45) shows that $\varepsilon$ is also the elasticity of substitution between capital and labor in this economy. In response to an increase in $k$, $\sigma_K$ will also increase if $\varepsilon > 1$, and will decrease if $\varepsilon < 1$. This analysis leads to the following crucial result:

Proposition 47 With $\varepsilon < 1$, all APs are BGPs and feature purely labor-augmenting technical change, i.e., they have $\lim_{t\to\infty}\dot M(t)/M(t) = 0$.

This result demonstrates that with $\varepsilon < 1$, i.e., with labor and capital as gross complements, the only asymptotic (non-cycling) paths will feature purely labor-augmenting technical change. There will be research effort devoted to the invention of capital-intensive intermediates, but this is only to keep the state of technology in that sector at a constant level.

This is essentially a generalization of the steady-state growth theorem, Theorem 7, which showed that balanced growth is only consistent with purely labor-augmenting technological change. This proposition shows the same is the case in this more general model.
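The role of the elasticity of substitution in (16.44) and (16.45) can be illustrated numerically. The following is only a sketch: the function names and the parameter values (`beta`, `gamma`, `eps`) are assumptions chosen for illustration, not values taken from the text.

```python
def interest_rate(M, k, beta=0.5, gamma=0.5, eps=0.5):
    """Interest rate R(M, k) as in (16.44); parameter values are illustrative."""
    bracket = gamma * k ** ((1 - eps) / eps) + (1 - gamma)
    return beta * (1 - gamma) * M * bracket ** (1 / (eps - 1))


def relative_capital_share(k, gamma=0.5, eps=0.5):
    """Relative share of capital rK/(wL) as in (16.45)."""
    return (1 - gamma) / gamma * k ** ((eps - 1) / eps)


# With eps < 1 the relative capital share falls as k rises; with eps > 1 it rises.
for eps in (0.5, 2.0):
    shares = [relative_capital_share(k, eps=eps) for k in (0.5, 1.0, 2.0)]
    print(f"eps={eps}: relative capital shares {shares}")
```

Under these assumed values, raising the normalized capital stock lowers the relative share of capital when capital and labor are gross complements, which is the mechanism behind Proposition 47.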

16.3.4

We saw above that with $\varepsilon < 1$, only a BGP with purely labor-augmenting technical change can be an AP. Now I show that there in fact exists a unique BGP as long as $\delta > 0$, and characterize the properties of this equilibrium path. First note that from the Euler equation, (16.29), the BGP rate of interest has to be constant. Moreover, since from Proposition 47, $\dot M/M = 0$, equation (16.44) immediately implies that the price index for capital-intensive goods, $p_K$, and therefore, the relative price of capital-intensive goods, $p$, must remain constant. In addition, in BGP, output, $Y$, the wage rate, $w$, and the capital stock, $K$, will all grow at a common rate, $g$. Furthermore, for $p$ to remain constant, (16.31) implies that $Y_L$ and $Y_K$ should grow at the same rate. Therefore, with $M$ constant, $n$ has to grow at the rate $\beta g/(1-\beta)$ (or $N$ has to grow at the rate $g$). We can then integrate equation (16.39), allowing for the depreciation of technologies at the rate $\delta$, and the growth of $w$, $K$ and $n$, to obtain the values of inventing labor- and capital-intensive goods as:
\[ V_l = \frac{1-\beta}{\beta}\,\frac{wL/n}{r + \delta - (1-2\beta)\,g/(1-\beta)} \quad\text{and}\quad V_k = \frac{1-\beta}{\beta}\,\frac{rK/m}{r + \delta - g}. \tag{16.46} \]
Notice that these values also grow at a constant rate along the BGP because $w$, $K$ and $n$ are growing. The denominator for $V_l$ is different from that of $V_k$ because its BGP growth rate is lower than that of $V_k$: $n$, which is in the denominator of $\pi_l$, grows along the balanced growth path, while $m$ remains constant.

Recall that in BGP, $p$ and $m$ are constant, so there is no net capital-augmenting technical change. This implies $\phi(S_k)S_k = \delta/b_k$, i.e., $S_k = \bar S_k$ as defined above. The remaining scientists will work on labor-augmenting technical change. The growth rate of the economy is therefore
\[ g^* = \frac{1-\beta}{\beta}\,\frac{\dot n}{n} = \frac{1-\beta}{\beta}\left[b_l\,\phi(S - \bar S_k)\,(S - \bar S_k) - \delta\right]. \tag{16.47} \]
Assumption 14 ensures that $g^* > 0$. The Euler equation (16.29) then gives the BGP interest rate as $r^* = \rho + \theta g^*$. The interest rate has to be higher when the growth rate is higher in order to convince consumers to delay consumption, and the elasticity of marginal utility, $\theta$, determines how strong this effect needs to be.

Let $k = G(M)$ such that $M$ and $k$ are consistent with BGP (i.e., $r^* = R(M, k)$). It is clear from (16.44) that $G' > 0$; that is, there is a strictly increasing relationship between $M$ and $k$. This is because a greater $k$ implies a lower price of capital-intensive goods, so capital has to become more productive, i.e., $M$ has to increase in order to keep the interest rate at $r^*$.

Next, let $k^*$ be the level of normalized capital such that at this normalized capital stock and at $\dot M/M = 0$, R&D firms are indifferent between capital- and labor-augmenting technical change, i.e., $b_l\phi(S - \bar S_k)\,nV_l = b_k\phi(\bar S_k)\,mV_k$, or from equation (16.46),
\[ \frac{b_l\,\phi(S - \bar S_k)\,wL}{r^* + \delta - (1-2\beta)\,g^*/(1-\beta)} = \frac{b_k\,\phi(\bar S_k)\,r^*K}{r^* + \delta - g^*}. \tag{16.48} \]
Substituting $r^* = \rho + \theta g^*$ and rearranging, this indifference condition pins down the relative share of capital, $rK/(wL)$, at the value
\[ \hat\sigma \equiv \frac{(1-\beta)\left(\rho + \delta + (\theta - 1)\,g^*\right)b_l\,\phi(S - \bar S_k)}{b_k\,\phi(\bar S_k)\left((1-\beta)(\rho + \delta) + \left((1-\beta)(\theta - 1) + \beta\right)g^*\right)}, \tag{16.49} \]


with $g^*$ given by (16.47). In other words, using equation (16.45), we have:
\[ k = k^* \equiv \left(\frac{\gamma\,\hat\sigma}{1-\gamma}\right)^{\frac{\varepsilon}{\varepsilon-1}} \iff \sigma_K = \hat\sigma. \tag{16.50} \]
Finally, let $M^*$ be such that $k^* \equiv G(M^*)$, i.e., $M^*$ is the level of capital-augmenting technology that is consistent with the equilibrium interest rate taking its BGP value when $k = k^*$. As a result, when $k = k^*$ and $M = M^*$, the interest rate will be equal to $r^*$ and the relative share of capital will be $\hat\sigma$.

In BGP, $\dot M/M = 0$, while $\dot N/N > 0$. Because of the depreciation of technologies, there must be research to invent both new labor-intensive and new capital-intensive intermediates; if there were no research directed at capital-intensive intermediates, we would have $\dot M/M < 0$. This implies that firms working to invent both types of goods have to make equal profits, so we need conditions (16.48) and (16.49) to hold, i.e., $k = k^*$, which in turn requires that $M = M^*$ so that $r = r^*$. We can therefore state:

Proposition 48 Suppose that $\varepsilon < 1$ and $\delta > 0$. Then there exists a unique BGP where $k = k^*$ as given by (16.50), $M = M^* = G^{-1}(k^*)$, $r = r^* = \rho + \theta g^*$, and output, consumption and wages grow at the rate $g^*$ given by (16.47).

This proposition characterizes the unique BGP, which features purely labor-augmenting technical change. In this BGP, most research is devoted to the invention of labor-intensive intermediates. There is just enough capital-augmenting technical change to keep the productivity of capital constant; that is, there is no net capital-augmenting technical change. As a result, despite growth and capital deepening, factor shares remain constant in the long run. Intuitively, when the relative share of capital is equal to $\sigma_K = \hat\sigma$, R&D firms are just indifferent between inventing capital-intensive and labor-intensive intermediates; so in equilibrium they allocate their effort between the two sectors precisely to keep the relative share of capital at $\hat\sigma$. We have already seen that when $\varepsilon < 1$, the BGP with purely labor-augmenting technical change is the only possible asymptotic equilibrium path. In addition, we will also see below that, under certain conditions, this BGP is dynamically stable, so starting from different initial conditions, the economy will tend towards this growth path.

Given the CRRA preferences, the conclusion that for a BGP with constant interest rate and growth rate we need $M = M^*$ (i.e., no net capital-augmenting technical change) is not surprising. What is important (perhaps surprising), however, is that such a BGP exists despite the possibility of capital-augmenting technical change.

The results are similar in spirit when there is no technological depreciation, i.e., $\delta = 0$, but there are now many balanced growth paths. These paths have the same growth rate, $g^*$ (given by (16.47) evaluated at $\delta = 0$), but different factor distributions of income. This reflects that the equilibrium correspondence is lower hemicontinuous, but not continuous, in $\delta$ at $\delta = 0$. Summarizing this result:

Proposition 49 Suppose that $\varepsilon < 1$ and $\delta = 0$. Then, there exists a BGP for each $M \geq M^* \equiv G^{-1}(k^*)$, where $k^*$ is given by (16.49) and (16.50) with $\delta = 0$. In all BGPs, output, consumption, wages, and the capital stock grow at the same rate $g^*$ given by (16.47) with $\delta = 0$, and the share of labor is constant. Each BGP has a different normalized capital stock, $k = G(M)$, and a different relative share of capital, $\sigma_K$.

The intuition for the multiplicity of BGPs is simple: without depreciation, all that is required for a BGP is that labor-augmenting improvements should be more profitable than capital-augmenting improvements, i.e., $V_k \leq V_l$, and this can happen for a range of capital

16.3.5

Transitional Dynamics

Finally, we would like to know whether the economy will tend to the balanced growth path with purely labor-augmenting technical change. Here, the feature that $\varepsilon < 1$ ensures this. In particular, we have the following result (which is proved in Acemoglu, 2003):

Proposition 50 Suppose $\delta > 0$ and that $\varepsilon < 1$. Then the BGP characterized above is locally saddle-path stable.

Therefore, this model provides a framework in which technological change can be capital-augmenting in the short run or in the medium run, but in the long run it will be endogenously labor-augmenting, ensuring a balanced growth path equilibrium as in the standard neoclassical growth model.
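The balanced growth path objects in (16.47), (16.49) and (16.50) can be computed numerically once a crowding function is specified. The sketch below is illustrative only: the functional form `phi(s) = (1 + s) ** (-eta)` is a hypothetical choice that is decreasing, has `phi(s) * s` increasing and `phi(0)` finite, and all parameter values are assumptions rather than calibrations from the text.

```python
def phi(s, eta=0.5):
    """Hypothetical crowding function: decreasing, phi(s) * s increasing, phi(0) = 1."""
    return (1.0 + s) ** (-eta)


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection for an increasing function f with f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def interest_rate(M, k, beta, gamma, eps):
    """Interest rate R(M, k) as in (16.44)."""
    bracket = gamma * k ** ((1 - eps) / eps) + (1 - gamma)
    return beta * (1 - gamma) * M * bracket ** (1 / (eps - 1))


def bgp(S=1.0, b_l=0.2, b_k=0.2, delta=0.05, beta=0.5, gamma=0.5,
        eps=0.5, rho=0.02, theta=2.0):
    """Compute BGP objects under the illustrative assumptions above."""
    # Scientists needed to keep m constant: b_k * phi(S_k) * S_k = delta.
    S_k = bisect(lambda s: b_k * phi(s) * s - delta, 0.0, S)
    S_l = S - S_k
    g = (1 - beta) / beta * (b_l * phi(S_l) * S_l - delta)           # (16.47)
    r = rho + theta * g                                               # Euler equation
    num = (1 - beta) * (rho + delta + (theta - 1) * g) * b_l * phi(S_l)
    den = b_k * phi(S_k) * ((1 - beta) * (rho + delta)
                            + ((1 - beta) * (theta - 1) + beta) * g)
    sigma_hat = num / den                                             # (16.49)
    k_star = (gamma * sigma_hat / (1 - gamma)) ** (eps / (eps - 1))   # (16.50)
    # R(M, k) is linear in M, so M* follows from R(M*, k*) = r*.
    M_star = r / interest_rate(1.0, k_star, beta, gamma, eps)
    return {"S_k": S_k, "g": g, "r": r, "sigma_hat": sigma_hat,
            "k_star": k_star, "M_star": M_star}


result = bgp()
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

Because the interest rate in (16.44) is proportional to $M$, the last step needs no root finding; only the number of scientists that offsets depreciation in the capital-intensive sector is found by bisection.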

16.3.6

Policy Implications

Despite the similarity of this model to the neoclassical one, the implications are actually quite different. Let us consider one example here. Suppose that there is taxation of capital income at the rate $\tau$, so that the budget constraint of the representative household becomes:
\[ C + I \leq wL + (1-\tau)\,rK + \omega_S S + \Pi + T. \]
It can be verified that in the standard neoclassical growth model with exogenously labor-augmenting technological change, an increase in $\tau$ will affect the capital to effective labor ratio and the share of capital in national income. In contrast, here we have:

Proposition 51 Suppose $\delta > 0$ and that $\varepsilon < 1$. Then the BGP capital share in national income is constant and independent of $\tau$.

The reason for this result is interesting: when taxes reduce the rate of return to capital, the composition of technology between capital-augmenting and labor-augmenting types adjusts endogenously in order to restore the rate of interest and the share of capital in national income back to its BGP level. Therefore, in this model, a variety of policies affect the composition of technological change, but may have much less effect on long-run growth properties.


Thinking of the composition of technology also opens the way for us to consider issues of appropriate technologies. Recall that in previous models technological differences were often explained by assuming that technologies did not freely flow from advanced countries to less advanced ones. Why should ideas not flow and machines not be exported to poor countries? Distortions, as in the previous model, are one possible answer, but even when ideas can flow at no cost, productivity differences may persist. Why? Many technologies used by LDCs are inappropriate because they are designed to make optimal use of the prevailing factors and conditions in DCs, where most technologies are developed. There is a mismatch between technologies developed in the North and the LDCs' weather conditions, labor force skills, etc. Most technologies are developed in the North. For example, over 90% of the world R&D

17.1

Atkinson and Stiglitz suggested the following idea: new technologies are specific to a given capital-labor ratio. When used with different capital-labor ratios, they are less productive. For example, suppose that the production technology is
\[ Y = A(k \mid k')\,K^{1-\alpha}L^{\alpha} = A(k \mid k')\,k^{1-\alpha}L, \]
where $k = K/L$ is the capital-labor ratio, and $A(k \mid k')$ is the productivity of technology designed to be used with capital-labor ratio $k'$ when used instead with capital-labor ratio $k$. For example, suppose that
\[ A(k \mid k') = A\min\left\{1, \left(\frac{k}{k'}\right)^{\gamma}\right\} \]
for $\gamma \in (0, 1)$. That is, when a technology designed for the capital-labor ratio $k'$ is used with a lower capital-labor ratio, there is a loss in efficiency. Now suppose that new technologies are developed in richer economies, which have greater capital-labor ratios. Then output in a less developed country with the capital-labor ratio $k < k'$ will be
\[ Y = A(k \mid k')\,k^{1-\alpha}L = A\,k^{1-\alpha+\gamma}\,(k')^{-\gamma}L. \]
So less developed countries will produce with worse technologies. Moreover, this technological disadvantage will be larger when the gap in the capital intensity of production between these countries and the technologically advanced economies is greater.
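The productivity penalty in the Atkinson-Stiglitz formulation is easy to evaluate numerically. The sketch below is illustrative: the function names and the parameter values for `A`, `alpha` and `gamma` are assumptions, with `alpha` standing for the exponent on labor and `gamma` for the penalty exponent.

```python
def productivity(k, k_frontier, A=1.0, gamma=0.5):
    """A(k | k'): productivity of a technology designed for k' when used at k."""
    return A * min(1.0, (k / k_frontier) ** gamma)


def output_per_worker(k, k_frontier, A=1.0, alpha=0.5, gamma=0.5):
    """Y / L = A(k | k') * k^(1 - alpha)."""
    return productivity(k, k_frontier, A, gamma) * k ** (1 - alpha)


k_frontier = 4.0  # capital-labor ratio the technology was designed for (assumed)
for k in (1.0, 2.0, 4.0):
    loss = 1.0 - productivity(k, k_frontier)
    print(f"k={k}: output per worker {output_per_worker(k, k_frontier):.3f}, "
          f"productivity loss {loss:.1%}")
```

The loss shrinks as the country's capital-labor ratio approaches the ratio the technology was designed for, and vanishes once the two coincide.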

The problem with this formulation is that it is static. A recent paper by Basu and Weil (1998) presents a dynamic formulation based on the idea that $A$ is determined not only by the current capital-labor ratio in the technologically advanced economies, but by the whole history of these capital-labor ratios.

17.2

The Atkinson-Stiglitz and Basu-Weil approach emphasizes differences in capital intensity between rich and poor economies. Another possibility is a mismatch between the skill requirements of the frontier technologies in the rich economies and the available skills in the LDCs. Here I will outline a model where the skill requirements of new technologies are determined by directed technical change in the technologically advanced economies, and this creates a mismatch between these technologies and the supply of human capital in the LDCs.

17.2.1

A Model

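Consider two groups of countries: the North and the South. The North is more skill-abundant:
\[ H^n/L^n > H^s/L^s. \]
All technological progress originates in the North. But the South can adopt technologies without any impediments or costs.

Preferences:
\[ \int_t^{\infty}\frac{C(\tau)^{1-\theta} - 1}{1-\theta}\exp\left(-\rho(\tau - t)\right)d\tau. \]
Budget constraint:
\[ C + I + X \leq Y \equiv \exp\left[\int_0^1 \ln y(i)\,di\right]. \tag{17.1} \]
Here $i$ denotes either a task that needs to be performed for production, or an industry that will contribute to final output.

Technology:
\[ y(i) = \left[\int_0^{N_L}k_L(i, v)^{1-\beta}dv\right]\left[(1-i)\,l(i)\right]^{\beta} + \left[\int_0^{N_H}k_H(i, v)^{1-\beta}dv\right]\left[i\,Z\,h(i)\right]^{\beta}. \tag{17.2} \]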

There are 3 important features embedded in this technology:

1. Each task/industry output can be produced using two alternative technologies, one using skilled workers, the other one using unskilled labor.

2. The productivities of these two technologies are parameterized by $N_L$ and $N_H$.

3. Skilled and unskilled labor have different comparative advantages across sectors. In particular, skilled workers are more productive in tasks/industries with high indices. The technological parameter, $Z$, determines how productive skilled workers are relative to unskilled workers.

There is a continuum of machines $j \in [0, N_L]$ (complementary to unskilled workers), and a continuum $j \in [0, N_H]$ (complementary to skilled workers), as in the basic directed technical change model. The final goods sector is competitive and rents capital and labor services. A (technology) monopolist owns the patent for each type of machine, and produces and rents machines at the rental rate $\chi_z(v)$, for $z = L$ or $H$.

Producers of the final good $i \in [0, 1]$ are price takers. They maximize profits,
\[ p(i)\,y(i) - w_L\,l(i) - w_H\,h(i) - \int_0^{N_L}\chi_L(v)\,k_L(i, v)\,dv - \int_0^{N_H}\chi_H(v)\,k_H(i, v)\,dv. \tag{17.3} \]
Solve for equilibrium $k_z(v)$, $\chi_z(v)$ and replace $k_z(v)$ in production functions:
\[ k_L(i, v) = \left[(1-\beta)\,p(i)\left((1-i)\,l(i)\right)^{\beta}/\chi_L(v)\right]^{1/\beta}, \qquad k_H(i, v) = \left[(1-\beta)\,p(i)\left(i\,Z\,h(i)\right)^{\beta}/\chi_H(v)\right]^{1/\beta}. \tag{17.4} \]
Given these isoelastic demands for machines, the optimal rental rates for the technology monopolists will again be a constant markup over marginal cost. Substituting machine prices into (17.4), and then using the resulting expressions with (17.2), we obtain output in sector $i$ as
\[ y(i) = p(i)^{(1-\beta)/\beta}\left[N_L\,(1-i)\,l(i) + N_H\,i\,Z\,h(i)\right]. \]
Technical progress: increases in $N_L$ and $N_H$ (as in the baseline directed technical change model discussed above). $N_L$ and $N_H$ are the only state variables in the model.

Now taking $N_L$ and $N_H$ as given, the equilibrium is straightforward to characterize. The equilibrium will take a similar form both in the North and in the South: there is a threshold $J \in [0, 1]$ such that skilled workers will be used only in sectors $i > J$. More explicitly, for $i < J$, $h(i) = 0$, and for $i > J$, $l(i) = 0$.

In equilibrium:
\[ i < J:\quad p(i) = P_L\,(1-i)^{-\beta} \ \text{ and } \ l(i) = L/J, \]
\[ i > J:\quad p(i) = P_H\,i^{-\beta} \ \text{ and } \ h(i) = H/(1-J), \]
where $P_L$ and $P_H$ are price indices for goods produced intensively using unskilled and skilled workers, respectively.

Relative price of skill-intensive goods:
\[ \frac{P_H}{P_L} = \left(\frac{N_H ZH}{N_L L}\right)^{-\beta/2}. \]
The equilibrium threshold will be given by
\[ \frac{J}{1-J} = \left(\frac{N_H ZH}{N_L L}\right)^{-1/2}. \]
Total output:
\[ Y = \exp(-\beta)\left[(N_L L)^{1/2} + (N_H ZH)^{1/2}\right]^{2}. \tag{17.5} \]
Wage premium:
\[ \frac{w_H}{w_L} = Z\left(\frac{N_H}{N_L}\right)^{1/2}\left(\frac{ZH}{L}\right)^{-1/2}. \]
Next, we have to determine $N_L$ and $N_H$. This is similar to the analysis of directed technical change we saw above. In particular, assuming no state dependence (i.e., in terms of the model of directed technical change above, the innovation possibilities frontier as given by (16.12)), steady state/balanced growth requires $\pi_H = \pi_L$. This implies that for technological equilibrium:
\[ \frac{P_H^n}{P_L^n} = \left(\frac{ZH^n}{L^n}\right)^{-\beta}, \]
which, given the relative price above, corresponds to $N_H/N_L = ZH^n/L^n$, so that the skill premium in the North is
\[ w_H^n/w_L^n = Z, \]
independent of factor endowments in the North (this is the effect of directed technical change, but also the special case corresponding to $\sigma = 2$ in terms of the directed technical change model above).

Next, assume that Southern producers take $N_L$ and $N_H$ from the North, and maximize profits. This captures the notion that the North is the technologically advanced economy, and the South is the follower. A monopolist in each Southern country copies each new machine and sells it to the producers in its country. In equilibrium:
\[ J^s > J^n. \]
In other words, certain tasks that are performed by skilled workers in the North will be performed by unskilled workers in the South; this is simply an implication of the greater skill abundance in the North. Technology levels $N_H$ and $N_L$ are determined in the North, and $Y^s$ grows at the same rate $g$ as in the North.
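The comparison between the Northern and Southern thresholds can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: it takes the threshold condition to be $J/(1-J) = (N_H ZH/(N_L L))^{-1/2}$ and the wage premium to be $Z (N_H/N_L)^{1/2}(ZH/L)^{-1/2}$, sets $N_H/N_L = ZH^n/L^n$ in the North, and uses endowments and a value of `Z` that are purely illustrative.

```python
def threshold(N_H, N_L, Z, H, L):
    """Threshold J solving J / (1 - J) = (N_H * Z * H / (N_L * L)) ** (-1/2)."""
    x = (N_H * Z * H / (N_L * L)) ** (-0.5)
    return x / (1.0 + x)


def skill_premium(N_H, N_L, Z, H, L):
    """Wage premium w_H / w_L = Z * (N_H / N_L)^(1/2) * (Z * H / L)^(-1/2)."""
    return Z * (N_H / N_L) ** 0.5 * (Z * H / L) ** (-0.5)


Z = 1.5                   # relative productivity of skilled workers (assumed)
H_n, L_n = 0.5, 0.5       # Northern endowments (assumed)
H_s, L_s = 0.1, 0.9       # Southern endowments, less skill-abundant (assumed)

# Technologies are determined in the North and adopted by the South.
N_L = 1.0
N_H = N_L * Z * H_n / L_n

J_n = threshold(N_H, N_L, Z, H_n, L_n)
J_s = threshold(N_H, N_L, Z, H_s, L_s)
print(f"J_n = {J_n:.3f}, J_s = {J_s:.3f}")
print(f"premium North = {skill_premium(N_H, N_L, Z, H_n, L_n):.3f}, "
      f"premium South = {skill_premium(N_H, N_L, Z, H_s, L_s):.3f}")
```

With the same technologies in both regions, the less skill-abundant South assigns a wider range of tasks to unskilled workers and has a higher skill premium than the North, where the premium equals `Z`.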

Let us allow for the price of capital to be larger in the South than in the North (see Jones, 1995). This will imply that capital-labor ratios may differ between countries.

17.2.2

Implications

What are the productivity implications of directed technical change in the North, and the South importing technologies developed in the North? Define
\[ A \equiv \frac{Y}{L + ZH} \quad\text{and}\quad y \equiv \frac{Y}{L + H}. \]
Both output per effective unit of labor and output per capita are greater in the North than in the South, even if both countries have the same cost of capital. This is true a fortiori if the cost of capital is higher in the South. Intuition: TFP is maximized in the North. Why? The world technologies are designed to make best use of factor abundance/scarcity in the North. For example, there are many more skilled workers in the North, so technologies developed in the North are more skill-biased than what is required in the South. Since $J^s > J^n$, these skill-biased technologies will be less useful in the South than in the North, leading to endogenous productivity differences between these countries.

17.2.3

Calibration

Can this mechanism lead to sizable effects? Can we generate output per worker differences which resemble those in the data? Can we improve on the neoclassical model?

c = A (K c )1 (Lc + ZH c ) . YNC

c = NL (K ) YAZ

c 1

(L )

c 1/2

NH ZH c NL

1/2 !2

U SA = 1. NL is chosen so as to normalize yAZ


5th y NC

Our model

5th y AZ

LDC y NC

<2 NC

LDC y AZ

<2 AZ 0.728 0.937 0.934 0.843 0.723 0.931 0.918 0.803 0.707 0.901 0.840 0.689

0.16 0.15 0.15 0.18 0.17 0.16 0.17 0.19 0.21 0.21 0.21 0.21

0.651 0.39 0.816 0.26 0.808 0.28 0.718 0.37 0.625 0.40 0.757 0.28 0.745 0.31 0.666 0.39 0.540 0.41 0.540 0.31 0.540 0.36 0.540 0.44

0.09 0.05 0.07 0.13 0.09 0.05 0.08 0.14 0.10 0.07 0.11 0.17

Sec. compl. 1.8 0.39 Higher Primary Sec. att. 1.8 0.43 1.5 0.46 1.5 0.41

Sec. compl. 1.5 0.42 Higher Primary Sec. att. 1.5 0.45 1.0 0.49 1.0 0.49

y LDC = 0.21 (avg. GDP per worker in non-OECD). y 5th = 0.03 (GDP 5th poorest country)


This course so far has been about understanding the mechanics of economic growth. The models we have seen are very useful for understanding how individuals accumulate capital, how physical and human capital affect economic growth and income levels, and how technology endogenously changes and is transferred from one country to another. However, the major question motivating much of the analysis of economic growth is to understand why some countries are rich while some others are poor, or why some countries grow faster while others stagnate. At some level, what we have focused on are the proximate causes of this process. Exactly as in the empirical analysis of decomposing cross-country income differences into differences in physical capital, human capital and technology, we have learned how to construct microfounded models which help us in thinking about the process of economic growth in a careful and rigorous way.

But after seeing all these models, can we answer the question of why Nigeria is poorer than the United States? The answer is yes and no, probably with more emphasis on the no. Inevitably, the answer comes down to preferences, policies and institutions. The models we have seen give us a way of translating differences in preferences, policies and institutions (and sometimes technology) into differences in growth rates and income levels. Therefore, the next step in the study of economic growth is to understand why different countries adopt different policies. This is the realm of the political economy of growth or political economy of development. These topics fall beyond the scope of this course. Nevertheless, as a pointer for those who are interested in thinking about these topics, I include a brief discussion of some of the issues here and also provide a very simple model showing how institutions and policies can be incorporated into a simple growth-type model to analyze how distributional conflict influences the growth prospects of a society.

18.1

As discussed above, institutions (and related policy differences originating from institutional differences) have become popular recently in thinking of fundamental causes of differences in income per capita and growth performance of countries. In this context, institutions contrast with other potential fundamental causes such as geographical differences or cultural factors. While geographic characteristics of countries and regions may lead to differences in the technology available to individuals or make their investments in physical and human capital more difficult, institutional differences, associated with differences in the organization of society, shape economic and political incentives and affect the nature of equilibria via these

18.1.1

Douglass North (1990, p. 3) offers the following definition: "Institutions are the rules of the game in a society or, more formally, are the humanly devised constraints that shape human interaction." Three important features of institutions are apparent in this definition: (1) that they are "humanly devised," which contrasts with other potential fundamental causes, like geographic factors, which are outside human control; (2) that they are "the rules of the game" setting "constraints" on human behavior; (3) that their major effect will be through incentives (see also North, 1981).

There are tremendous cross-country differences in the way that economic and political life is organized. A voluminous literature documents large cross-country differences in economic institutions, and a strong correlation between these institutions and economic performance, and we have seen some of those in the early lectures of this course. Knack and Keefer (1995), for instance, look at measures of property rights enforcement compiled by international business organizations, Mauro's (1995) study looks at measures of corruption, and work by Djankov, La Porta, Lopez-De-Silanes and Shleifer compiles measures of entry barriers across countries, while many studies look at variation in educational institutions and the corresponding differences in human capital. All of these authors find substantial differences in these measures of economic institutions, and significant correlation between these measures and various indicators of economic performance. For example, Djankov et al. find that, while the total cost of opening a medium-size business in the United States is less than 0.02 percent of GDP per capita in 1999, the same cost is 2.7 percent of GDP per capita in Nigeria, 1.16 percent in Kenya, 0.91 percent in Ecuador and 4.95 percent in the Dominican Republic. These entry barriers are highly correlated with various economic outcomes, including the rate of economic growth and the level of development.

Nevertheless, as already discussed in the earlier lectures, this type of correlation does not establish that the countries with worse institutions are poor because of their institutions. After all, the United States differs from Nigeria, Kenya and the Dominican Republic in its social, geographic, cultural and economic fundamentals, so these may be the source of their poor economic performance. In fact, these differences may be the source of institutional differences themselves. Consequently, evidence based on correlation does not establish whether institutions are important determinants of economic outcomes.

To make further progress, one needs to isolate a source of exogenous differences in institutions, so that we approximate a situation in which a number of otherwise-identical societies end up with different sets of institutions. European colonization of the rest of the world provides a potential laboratory to investigate these issues. From the late 15th century, Europeans dominated and colonized much of the rest of the globe. Together with European dominance came the imposition of very different institutions and social power structures in different parts of the world. Acemoglu, Johnson and Robinson, AJR, (2001) document that in a large number of colonies, especially those in Africa, Central America, the Caribbean and South Asia, European powers set up extractive states. These institutions (again broadly construed) did not introduce much protection for private property, nor did they provide checks and balances against the government. The explicit aim of the Europeans in these colonies was extraction of resources, in one form or another.
This colonization strategy and the associated institutions contrast with the institutions Europeans set up in other colonies, especially in colonies where they settled in large numbers, for example, the United States, Canada, Australia and

New Zealand. In these colonies the emphasis was on the enforcement of property rights for a broad cross section of the society, especially smallholders, merchants and entrepreneurs. The term "broad cross section" is emphasized here, since even in the societies with the worst institutions, the property rights of the elite are often secure, but the vast majority of the population enjoys no such rights and faces significant barriers preventing their participation in many economic activities. Although investments by the elite can generate economic growth for limited periods, for sustained growth property rights for a broad cross section seem to be crucial (AJR, 2002a; Acemoglu, 2003).

A crucial determinant of whether Europeans chose the path of extractive institutions was whether they settled in large numbers. In colonies where Europeans settled, the institutions were being developed for their own future benefits. In colonies where Europeans did not settle, their objective was to set up a highly centralized state apparatus, and other associated institutions, to oppress the native population and facilitate the extraction of resources in the short run. Based on this idea, AJR (2001) suggest that in places where the disease environment made it easy for Europeans to settle, the path of institutional development should have been different from areas where Europeans faced high mortality rates.

In practice, during the time of colonization, Europeans faced widely different mortality rates in colonies because of differences in the prevalence of malaria and yellow fever. These mortality rates provide a possible candidate for a source of exogenous variation in institutions. They should not influence output today directly, but by affecting the settlement patterns of Europeans, they may have had a first-order effect on institutional development.
Consequently, these potential settler mortality rates can be used as an instrument for broad institutional differences across countries in an instrumental-variables estimation strategy. The key requirement for an instrument is that it should have no direct effect on the

outcome of interest (other than its effect via the endogenous regressor). There are a number of channels through which potential settler mortality could influence current economic outcomes or may be correlated with other factors influencing these outcomes. Nevertheless, there are also good reasons why, as a first approximation, these mortality rates should not have a direct effect. Malaria and yellow fever were fatal to Europeans who had no immunity, thus having a major effect on settlement patterns, but they had much more limited effects on natives who, over centuries, had developed various types of immunities. The exclusion restriction is also supported by the death rates of native populations, which appear to be similar between areas with very different mortality rates for Europeans.

The data also show that there were major differences in the institutional development of the high-mortality and low-mortality colonies. Moreover, consistent with the key idea in AJR (2001), various measures of broad institutions, for example, measures of protection against expropriation, are highly correlated with the death rates Europeans faced more than 100 years ago and with early European settlement patterns. AJR (2001) also show that these institutional differences induced by mortality rates and European settlement patterns have a major (and robust) effect on income per capita. For example, the estimates imply that improving Nigeria's institutions to the level of those in Chile could, in the long run, lead to as much as a 7-fold increase in Nigeria's income. This evidence suggests that once we focus on potentially exogenous sources of variation, the data point to a large effect of broad institutional differences on economic development.

Naturally, mortality rates faced by Europeans were not the only determinant of Europeans' colonization strategies. AJR (2002) focus on another important aspect, how densely different regions were settled before colonization.
They document that in more densely settled areas, Europeans were more likely to introduce extractive institutions because it was

more profitable for them to exploit the indigenous population, either by having them work in plantations and mines, or by maintaining the existing system and collecting taxes and tributes. This suggests another source of variation in institutions that may have persisted to the present, and AJR (2002) show similar large effects from this source of variation.

Another example that illustrates the consequences of differences in institutions is the contrast between North and South Korea. The geopolitical balance between the Soviet Union and the United States following World War II led to separation along the 38th parallel. The North, under the dictatorship of Kim Il Sung, adopted a very centralized command economy with little role for private property. In the meantime, South Korea, though far from a free-market economy, relied on a capitalist organization of the economy, with private ownership of the means of production, and legal protection for a range of producers, especially those under the umbrella of the chaebols, the large family conglomerates that dominated the South Korean economy. Although not democratic during its early phases, the South Korean state was generally supportive of rapid development and is often credited with facilitating, or even encouraging, investment and rapid growth in Korea. Under these two highly contrasting regimes, the economies of North and South Korea diverged. While South Korea grew rapidly under capitalist institutions and policies, North Korea has experienced minimal growth since 1950, under communist institutions and policies.

Overall, a variety of evidence paints a picture in which broad institutional differences across countries have had a major influence on their economic development. This evidence suggests that to understand why some countries are poor we should understand why their institutions are dysfunctional. But this is only part of a first step in the journey towards an answer.
The next question is even harder: if institutions have such a large effect on economic riches, why do some societies choose, end up with and maintain these dysfunctional institutions?

18.1.2

As a first step in modeling institutions, let us consider the relationship between three institutional characteristics: (1) economic institutions; (2) political power; (3) political institutions.

As already mentioned above, economic institutions matter for economic growth because they shape the incentives of key economic actors in society; in particular, they influence investments in physical and human capital and technology, and the organization of production. Economic institutions determine not only the aggregate economic growth potential of the economy, but also the distribution of resources in the society, and herein lies part of the problem: different institutions will be associated not only with different degrees of efficiency and potential for economic growth, but also with different distributions of the gains across different individuals and social groups.

How are economic institutions determined? Although various factors play a role here, including history and chance, at the end of the day, economic institutions are collective choices of the society. And because of their influence on the distribution of economic gains, not all individuals and groups typically prefer the same set of economic institutions. This leads to a conflict of interest among various groups and individuals over the choice of economic institutions, and the political power of the different groups will be the deciding factor.

The distribution of political power in society is also endogenous. To make more progress here, let us distinguish between two components of political power: de jure (formal) and de facto political power (see Acemoglu and Robinson, 2005). De jure political power refers to power that originates from the political institutions in society. Political institutions, similar to economic institutions, determine the constraints on and the incentives of the key actors,

but this time in the political sphere. Examples of political institutions include the form of government, for example, democracy vs. dictatorship or autocracy, and the extent of constraints on politicians and political elites.

A group of individuals, even if they are not allocated power by political institutions, may possess political power; for example, they can revolt, use arms, hire mercenaries, co-opt the military, or undertake protests in order to impose their wishes on society. This type of de facto political power originates both from the ability of the group in question to solve its collective action problem and from the economic resources available to the group (which determine its capacity to use force against other groups).

This discussion highlights that we can think of political institutions and the distribution of economic resources in society as two state variables, affecting how political power will be distributed and how economic institutions will be chosen. An important notion is that of persistence; the distribution of resources and political institutions are relatively slow-changing and persistent. Since, like economic institutions, political institutions are collective choices, the distribution of political power in society is the key determinant of their evolution. This creates a central mechanism of persistence: political institutions allocate de jure political power, and those who hold political power influence the evolution of political institutions, and they will generally opt to maintain the political institutions that give them political power. A second mechanism of persistence comes from the distribution of resources: when a particular group is rich relative to others, this will increase its de facto political power and enable it to push for economic and political institutions favorable to its interests, reproducing the initial disparity.
Despite these tendencies for persistence, the framework also emphasizes the potential for change. In particular, shocks to the balance of de facto political power, including changes in technologies and the international environment, have the potential to

generate major changes in political institutions, and consequently in economic institutions and economic growth. Therefore, what we need is a framework for organizing our approach to the determination of economic institutions and policies, taking into account that political institutions themselves are endogenous, and are chosen for their dynamic influences on economic allocations.

A simple way of summarizing some of these ideas in the form of a flow diagram is as follows:

$$
\left.\begin{array}{rcl}
\text{political institutions}_t & \Longrightarrow & \text{de jure political power}_t \\
 & & \& \\
\text{distribution of resources}_t & \Longrightarrow & \text{de facto political power}_t
\end{array}\right\}
\Longrightarrow
\left\{\begin{array}{l}
\text{economic institutions}_t \Longrightarrow \left\{\begin{array}{l}\text{economic performance}_t \\ \& \\ \text{distribution of resources}_{t+1}\end{array}\right. \\[1ex]
\text{political institutions}_{t+1}
\end{array}\right.
$$

This diagram illustrates both the effect of economic institutions on economic performance and the distribution of resources in a society, and the role of the combination of de jure and de facto political power in shaping both economic and political institutions.

18.1.3 Institutions in Action

As a brief example, consider the development of property rights in Europe during the Middle Ages. Lack of property rights for landowners, merchants and proto-industrialists was detrimental to economic growth during this epoch. Since political institutions at the time placed political power in the hands of kings and various types of hereditary monarchies, such rights were largely decided by these monarchs. The monarchs often used their powers to expropriate producers, impose arbitrary taxation, renege on their debts, and allocate the

productive resources of society to their allies in return for economic benefits or political support. Consequently, economic institutions during the Middle Ages provided little incentive to invest in land, physical or human capital, or technology, and failed to foster economic growth. These economic institutions also ensured that the monarchs controlled a large fraction of the economic resources in society, solidifying their political power and ensuring the continuation of the political regime.

The seventeenth century, however, witnessed major changes in the economic and political institutions that paved the way for the development of property rights and limits on monarchs' power, especially in England after the Civil War of 1642 and the Glorious Revolution of 1688, and in the Netherlands after the Dutch Revolt against the Hapsburgs. How did these major institutional changes take place? In England until the sixteenth century the king also possessed a substantial amount of de facto political power, and leaving aside civil wars related to royal succession, no other social group could amass sufficient de facto political power to challenge the king. But changes in the English land market and the expansion of Atlantic trade in the sixteenth and seventeenth centuries gradually increased the economic fortunes, and consequently the de facto power, of landowners and merchants opposed to the absolutist tendencies of the kings.

By the seventeenth century, the growing prosperity of the merchants and the gentry, based both on internal and overseas, especially Atlantic, trade, enabled them to field military forces capable of defeating the king. This de facto power overcame the Stuart monarchs in the Civil War and Glorious Revolution, and led to a change in political institutions that stripped the king of much of his previous power over policy.
These changes in the distribution of political power led to major changes in economic institutions, strengthening the property rights of both land and capital owners and spurring a process of financial and

commercial expansion. The consequence was rapid economic growth, culminating in the Industrial Revolution, and a very different distribution of economic resources from that in the Middle Ages.

This discussion poses, and also gives clues about the answers to, two crucial questions. First, why do the groups with conflicting interests not agree on the set of economic institutions that maximize aggregate growth? Second, why do groups with political power want to change political institutions in their favor? In the context of the example above, why did the gentry and merchants use their de facto political power to change political institutions rather than simply implement the policies they wanted? The issue of commitment is at the root of the answers to both questions.

An agreement on the efficient set of institutions is often not forthcoming because of the complementarity between economic and political institutions and because groups with political power cannot commit to not using their power to change the distribution of resources in their favor. For example, economic institutions that increased the security of property rights for land and capital owners during the Middle Ages would not have been credible as long as the monarch monopolized political power. He could promise to respect property rights, but then at some point renege on his promise, as exemplified by the numerous financial defaults by medieval kings. Credible secure property rights necessitated a reduction in the political power of the monarch. Although these more secure property rights would foster economic growth, they were not appealing to the monarchs, who would lose their rents from predation and expropriation as well as various other privileges associated with their monopoly of political power. This is why the institutional changes in England as a result of the Glorious Revolution were not simply conceded by the Stuart kings.
James II had to be deposed for the changes to take place.

The reason why political power is often used to change political institutions is related. In a dynamic world, individuals care not only about economic outcomes today but also in the future. In the example above, the gentry and merchants were interested in their profits and therefore in the security of their property rights, not only in the present but also in the future. Therefore, they would have liked to use their (de facto) political power to secure benefits in the future as well as the present. However, commitment to future allocations (or economic institutions) is in general not possible because decisions in the future are made by those who hold political power at the time. If the gentry and merchants had been sure to maintain their de facto political power, this would not have been a problem. However, de facto political power is often transient, for example because the collective action problems that are solved to amass this power are likely to resurface in the future, or other groups, especially those controlling de jure power, can become stronger in the future. Therefore, any change in policies and economic institutions that relies purely on de facto political power is likely to be reversed in the future. In addition, many revolutions are followed by conflict among the revolutionaries.

Recognizing this, the English gentry and merchants strove not just to change economic institutions in their favor following their victories against the Stuart monarchy, but also to alter political institutions and the future allocation of de jure power. Using political power to change political institutions then emerges as a useful strategy to make gains more durable. Consequently, political institutions and changes in political institutions are important as ways of manipulating future political power, and thus indirectly shaping future, as well as present, economic institutions and outcomes.

18.2

Now I present a simple model of the determination of institutions in the context of investigating their impact on economic growth. The basic setup is one in which an existing elite is in control of political power, and uses its monopoly of political power for its own interests even when this is costly for the society at large. The model will highlight various sources of inefficiencies in policies, which in turn will translate into inefficient (non-growth enhancing) institutions. It should be noted at this point, however, that the concept of inefficiency here is not that of Pareto inefficiency, since when distributional issues are important, Pareto efficiency is not a strong enough concept. An economy in which all of the resources are allocated to a single individual who has no investment opportunities, so that growth is stifled, may nevertheless be Pareto efficient. Thus the concept of inefficiency here is being used in the sense of non-growth enhancing or non-surplus maximizing.

The various sources of inefficiencies in policies are:

1. Revenue extraction: the group in power, the elite, will set high taxes on middle class producers in order to extract resources from them. These taxes are distortionary. This source of inefficiency results from the absence of non-distortionary taxes, which implies that the distribution of resources cannot be decoupled from efficient production.

2. Factor price manipulation: the group in power may want to tax middle class producers in order to reduce the prices of the factors they use in production. This inefficiency arises because the elite and middle class producers compete for factors (here labor). By taxing middle class producers, the elite ensure lower factor prices and thus higher profits for themselves.

3. Political consolidation: to the extent that the political power of the middle class depends on their economic resources, greater middle class profits reduce the elite's political power and endanger their future rents. The elite will then want to tax the middle class in order to impoverish them and consolidate their political power.

Although all three inefficiencies in policies arise because of the desire of the elite to extract rents from the rest of the society, the analysis will reveal that of the three sources of inefficiency, revenue extraction is typically the least harmful, since, in order to extract revenues, the elite need to ensure that the middle class undertakes efficient investments. In contrast, the factor price manipulation and political consolidation mechanisms encourage the elite to directly impoverish the middle class. An interesting comparative static result is that greater state capacity shifts the balance towards the revenue extraction mechanism, and thus, by allowing the elite to extract resources more efficiently from other groups, may improve the allocation of resources.

Additional inefficiencies arise when there are commitment problems on the part of the elite, in the sense that they may renege on policy promises once key investments are made. Following the literature on organizational economics, I refer to this as a "holdup" problem. With holdup, taxes are typically higher and more distortionary. Holdup problems, in turn, are likely to be important, for example, when the relevant investment decisions are long-term, so that a range of policies will be decided after these investments are undertaken.

The inefficiencies in policies translate into inefficient institutions.
Institutions determine the framework for policy determination, and economic institutions determine both the limits of various redistributive policies and other rules and regulations that affect the economic transactions and productivity of producers. In the context of the simple model here, I

associate economic institutions with two features: (1) limits on taxation and redistribution, and (2) regulation of the technology used by middle class producers. The same forces that lead to inefficient policies imply that there will be reasons for the elite to choose inefficient economic institutions. In particular, they may not want to guarantee enforcement of property rights for middle class producers or they may prefer to block technology adoption by middle class producers. Holdup problems, which imply equilibrium taxes even higher than those preferred by the elite, create a possible exception, and may encourage the elite to use economic institutions to place credible limits on their own future policies (taxes). This suggests that economic institutions that restrict future policies may be more likely to arise in economies in which there are more long-term investments and thus more room for holdup.

The model also sheds light on the conditions under which economic institutions discourage or block technology adoption. If the source of inefficiencies in policies is revenue extraction, the elite always wish to encourage the adoption of the most productive technologies by the middle class. However, when the source of inefficiencies in policies is factor price manipulation or political consolidation, the elite may want to block the adoption of more efficient technologies, or at the very least, they would choose not to invest in activities that would increase the productivity of middle class producers. This again shows that when the factor price manipulation and political consolidation mechanisms are at work, significantly more inefficient outcomes can emerge.

While economic institutions regulate fiscal policies and technology choices, political institutions govern the process of collective decision-making in society. In the baseline model,

the elite have de jure political power, which means that they have the formal right to make policy choices and influence economic decisions. To understand the inefficiencies in the institutional framework, we need to investigate the induced preferences of different groups over institutions. In the context of political institutions, this means asking whether the elite wish to change the institutional structure towards a more equal distribution of political power. The same forces that make the elite choose inefficient policies also imply that the answer to this question is no. Consequently, despite the inefficiencies that follow, the institutional structure with elite control tends to persist.

The framework also enables me to discuss issues of appropriate and inappropriate institutions. Concentrating political power in the hands of the elite may have limited costs (and may even be efficient) if the elite are sufficiently productive (more productive than the middle class). However, a change in the productivity of the elite relative to the middle class could make a different distribution of political power more beneficial. In this case, existing institutions, which may have previously functioned relatively well, become inappropriate to the new economic environment. Yet there is no guarantee that there will be a change in institutions in response to the change in environment.

Finally, I extend the framework here for analyzing changes in political institutions. Political institutions regulate the allocation of de jure political power, as in the example of constitutions or elections determining the party in government. There is more to political power than this type of de jure power, however. Certain groups may be able to disrupt the existing system, for example, by solving their collective action problem and undertaking demonstrations, unrest, protests, revolutions or military action.
Each group may therefore possess de facto political power even when excluded from de jure political power. In this context, middle class producers, even though they have no formal say in a dictatorship or

an oligarchic society, may sometimes have sufficient de facto political power to change the system or at least to demand some concessions from the elite. Under these circumstances, changes in political institutions may emerge as an equilibrium outcome. They are useful as a way of committing to future allocations, because, by affecting the distribution of de jure political power in the future, they shape future policies and economic allocations. Such a commitment may be necessary when the current elite need to make concessions in response to a shift in the distribution of de facto political power and when their ability to make concessions within a given political system is limited. Consequently, changes in political institutions take place when the elite are forced to respond to temporary changes in de facto political power by changing the political system (and thus the distribution of de jure political power in the future).

The analysis also shows that changes in political institutions are less likely when political stakes are higher, because, in this case, the elite will fight and use repression to defend the existing regime. Rents from natural resources or land tend to increase political stakes and thus contribute to institutional persistence. Interestingly, state capacity, which makes redistribution more efficient, also increases political stakes and may create dynamic costs by increasing the longevity of the dictatorship of the elite.

18.2.1 Baseline Model

Consider an infinite horizon economy populated by a continuum $1 + \theta^e + \theta^m$ of risk neutral agents, each with a discount factor equal to $\beta < 1$. There is a unique non-storable final good denoted by $y$. The expected utility of agent $j$ at time 0 is given by:

$$U_0^j = E_0 \sum_{t=0}^{\infty} \beta^t c_t^j, \tag{18.1}$$

where $c_t^j \in \mathbb{R}$ denotes the consumption of agent $j$ at time $t$ and $E_t$ is the expectations operator conditional on information available at time $t$.

Agents are in three groups. The first are workers, whose only action in the model is to supply their labor inelastically. There is a total mass 1 of workers. The second is the elite, denoted by $e$, who initially hold political power in this society. There is a total of $\theta^e$ elites. Finally, there are $\theta^m$ "middle class" agents, denoted by $m$. The sets of elite and middle class producers are denoted by $S^e$ and $S^m$ respectively. With a slight abuse of notation, I will use $j$ to denote either individual or group. Each member of the elite and middle class has access to production opportunities, represented by the production function

$$y_t^j = \frac{1}{1-\alpha} (A^j)^{\alpha} (k_t^j)^{1-\alpha} (l_t^j)^{\alpha}, \tag{18.2}$$

where $k$ denotes capital and $l$ labor. Capital is assumed to depreciate fully after use. The Cobb-Douglas form is adopted for simplicity. The key difference between the two groups is in their productivity. To start with, let us assume that the productivity of each elite agent is $A^e$ in each period, and that of each middle class agent is $A^m$. Productivity of the two groups differs, for example, because they are engaged in different economic activities (e.g., agriculture versus manufacturing, old versus new industries, etc.), or because they have different human capital or talent.

On the policy side, there are activity-specific tax rates on production, $\tau^e$ and $\tau^m$, which are constrained to be nonnegative, i.e., $\tau^e \ge 0$ and $\tau^m \ge 0$. There are no other fiscal instruments (in particular, no lump-sum non-distortionary taxes). In addition there is a total income (rent) of $R$ from natural resources. The proceeds of taxes and revenues from natural resources can be redistributed as nonnegative lump-sum transfers targeted towards

each group, $T^w \ge 0$, $T^m \ge 0$ and $T^e \ge 0$.

Let us also introduce a parameter $\phi \in [0, 1]$, which measures how much of the tax revenue can be redistributed. This parameter, therefore, measures state capacity, i.e., the ability of the state to penetrate and regulate the production relations in society (though it does so in a highly reduced-form way). When $\phi = 0$, state capacity is limited and all tax revenue gets lost, whereas when $\phi = 1$ we can think of a society with substantial state capacity that is able to raise taxes and redistribute the proceeds as transfers. The government budget constraint is

$$T_t^w + \theta^m T_t^m + \theta^e T_t^e \le \phi \int_{j \in S^e \cup S^m} \tau_t^j y_t^j \, dj + R. \tag{18.3}$$
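As a rough illustration of how state capacity enters (18.3), the sketch below computes the resources available for transfers and checks whether a proposed set of transfers is feasible. It assumes that producers within each group are identical, so that the integral collapses to a sum over the two groups; the function names `transfer_budget` and `transfers_feasible` and all numerical values are illustrative assumptions, not part of the model.

```python
def transfer_budget(phi, tau_e, tau_m, y_e, y_m, theta_e, theta_m, R):
    """Right-hand side of (18.3), assuming identical producers within each group:
    phi * (theta_e * tau_e * y_e + theta_m * tau_m * y_m) + R."""
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    if tau_e < 0 or tau_m < 0:
        raise ValueError("tax rates must be nonnegative")
    tax_revenue = theta_e * tau_e * y_e + theta_m * tau_m * y_m
    # Only a fraction phi of tax revenue survives; resource rents R are not scaled by phi.
    return phi * tax_revenue + R


def transfers_feasible(T_w, T_m, T_e, theta_e, theta_m, budget):
    """Check nonnegativity of transfers and the budget constraint (18.3)."""
    nonnegative = min(T_w, T_m, T_e) >= 0
    total = T_w + theta_m * T_m + theta_e * T_e  # workers have total mass 1
    return nonnegative and total <= budget + 1e-12
```

With `phi = 0` the budget reduces to the resource rent `R`, which is the sense in which all tax revenue gets lost when state capacity is absent.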

Let us also assume that there is a maximum scale for each firm, so that $l_t^j \le \lambda$ for all $j$ and $t$. This prevents the most productive agents in the economy from employing the entire labor force. Since only workers can be employed, the labor market clearing condition is

$$\int_{j \in S^e \cup S^m} l_t^j \, dj \le 1. \tag{18.4}$$

Since $l_t^j \le \lambda$, (18.4) implies that if

$$\theta^e + \theta^m \le \frac{1}{\lambda}, \tag{ES}$$

there can never be full employment. Consequently, depending on whether Condition (ES) holds, there will be excess demand or excess supply of labor in this economy. Throughout, I assume that

Assumption 15 $\theta^e \le \dfrac{1}{\lambda}$ and $\theta^m \le \dfrac{1}{\lambda}$.

This assumption ensures that neither of the two groups will create excess demand for labor by itself. Assumption 15 is adopted only for convenience and simplifies the notation (by reducing the number of cases that need to be studied).
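The two conditions above are easy to check numerically. The following is a minimal sketch, assuming group sizes and the maximum firm scale are given as plain numbers; the function name `labor_market_regime` and the returned dictionary keys are hypothetical and introduced only for illustration.

```python
def labor_market_regime(theta_e, theta_m, lam):
    """Check Assumption 15 and Condition (ES) for group sizes theta_e, theta_m
    and maximum firm scale lam (lambda in the text)."""
    if lam <= 0 or theta_e < 0 or theta_m < 0:
        raise ValueError("lam must be positive and group sizes nonnegative")
    # Largest possible labor demand of each group: every firm hires lam workers.
    max_demand_e = theta_e * lam
    max_demand_m = theta_m * lam
    # Assumption 15: theta^e <= 1/lam and theta^m <= 1/lam,
    # i.e. neither group alone can demand more than the unit mass of workers.
    assumption_15 = max_demand_e <= 1 and max_demand_m <= 1
    # Condition (ES): theta^e + theta^m <= 1/lam, i.e. excess supply of labor.
    excess_supply = max_demand_e + max_demand_m <= 1
    return {"assumption_15": assumption_15, "excess_supply": excess_supply}
```

For instance, with `lam = 1`, group sizes of 0.3 and 0.4 satisfy (ES), whereas sizes of 0.6 and 0.7 satisfy Assumption 15 but not (ES), so the two groups together demand more labor than is available.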

18.2.2 Economic Equilibrium

I first characterize the economic equilibrium for a given sequence of taxes, $\{\tau_t^e, \tau_t^m\}_{t=0,1,\ldots,\infty}$ (the transfers do not affect the economic equilibrium). An economic equilibrium is defined as a sequence of wages $\{w_t\}_{t=0,1,\ldots,\infty}$, and investment and employment levels for all producers, $\left\{[k_t^j, l_t^j]_{j \in S^e \cup S^m}\right\}_{t=0,1,\ldots,\infty}$, such that given $\{\tau_t^e, \tau_t^m\}_{t=0,1,\ldots,\infty}$ and $\{w_t\}_{t=0,1,\ldots,\infty}$, all producers choose their investment and employment optimally and the labor market clears.

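Each producer (firm) takes wages, denoted by $w_t$, as given. Given the absence of adjustment costs and full depreciation of capital, firms simply maximize current net profits. Consequently, the optimization problem of each firm can be written as

$$\max_{k_t^j,\, l_t^j} \; \frac{1-\tau_t^j}{1-\alpha} (A^j)^{\alpha} (k_t^j)^{1-\alpha} (l_t^j)^{\alpha} - w_t l_t^j - k_t^j.$$

The solution to this problem is

$$k_t^j = (1-\tau_t^j)^{1/\alpha} A^j l_t^j, \tag{18.5}$$

and

$$l_t^j \;\begin{cases} = 0 & \text{if } w_t > \dfrac{\alpha}{1-\alpha} (1-\tau_t^j)^{1/\alpha} A^j, \\[1.5ex] \in [0, \lambda] & \text{if } w_t = \dfrac{\alpha}{1-\alpha} (1-\tau_t^j)^{1/\alpha} A^j, \\[1.5ex] = \lambda & \text{if } w_t < \dfrac{\alpha}{1-\alpha} (1-\tau_t^j)^{1/\alpha} A^j. \end{cases} \tag{18.6}$$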

A number of points are worth noting. First, in equation (18.6), the expression $\alpha (1-\tau_t^j)^{1/\alpha} A^j / (1-\alpha)$ is the net marginal product of a worker employed by a producer of group $j$. If the wage is above this amount, this producer would not employ any workers, and if it is below, he or she would prefer to hire as many workers as possible (i.e., up to the maximum, $\lambda$). Second, equation (18.5) highlights the source of potential inefficiency in this economy. Producers invest in physical capital but only receive a fraction $(1-\tau_t^j)$ of the revenues. Therefore, taxes discourage investments, creating potential inefficiencies.

Combining (18.6) with (18.4), equilibrium wages are obtained as follows: (i) If Condition (ES) holds, there is excess supply of labor and $w_t = 0$. (ii) If Condition (ES) does not hold, then there is excess demand for labor and the equilibrium wage is

$$w_t = \min\left\{ \frac{\alpha}{1-\alpha}(1-\tau_t^e)^{1/\alpha} A^e, \; \frac{\alpha}{1-\alpha}(1-\tau_t^m)^{1/\alpha} A^m \right\}. \tag{18.7}$$

The form of the equilibrium wage is intuitive. Labor demand comes from two groups, the elite and middle class producers, and when Condition (ES) does not hold, their total labor demand exceeds available labor supply, so the market clearing wage will be the minimum of their net marginal products. One interesting feature, which will be used below, is that when Condition (ES) does not hold, the equilibrium wage is equal to the net productivity of one of the two groups of producers, so either the elite or the middle class will make zero profits in equilibrium.

Finally, the equilibrium level of aggregate output is

$$Y_t = \frac{1}{1-\alpha}(1-\tau_t^e)^{(1-\alpha)/\alpha} A^e \int_{j \in S^e} l_t^j \, dj + \frac{1}{1-\alpha}(1-\tau_t^m)^{(1-\alpha)/\alpha} A^m \int_{j \in S^m} l_t^j \, dj + R. \tag{18.8}$$

The equilibrium is summarized in the following proposition:

Proposition 52. Suppose Assumption 15 holds. Then for a given sequence of taxes $\{\tau_t^e, \tau_t^m\}_{t=0,1,\dots,\infty}$, the equilibrium takes the following form: if Condition (ES) holds, then $w_t = 0$, and if Condition (ES) does not hold, then $w_t$ is given by (18.7). Given the wage sequence, factor demands are given by (18.5) and (18.6), and aggregate output is given by (18.8).
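The economic equilibrium above can be sketched numerically. The following Python snippet is only an illustration under stated assumptions: the function names and all parameter values are hypothetical, and the formulas are the factor demand (18.5), the wage rule (18.7), and the firm's net profit as written in the optimization problem.

```python
def net_marginal_product(alpha, tau, A):
    """Net marginal product of a worker: alpha/(1-alpha) * (1-tau)^(1/alpha) * A."""
    return alpha / (1.0 - alpha) * (1.0 - tau) ** (1.0 / alpha) * A


def capital_demand(alpha, tau, A, labor):
    """Capital choice from (18.5): k = (1-tau)^(1/alpha) * A * l."""
    return (1.0 - tau) ** (1.0 / alpha) * A * labor


def net_profit(alpha, tau, A, k, labor, wage):
    """Current net profit: after-tax output minus the wage bill and capital."""
    output = (A ** alpha) * (k ** (1.0 - alpha)) * (labor ** alpha) / (1.0 - alpha)
    return (1.0 - tau) * output - wage * labor - k


def equilibrium_wage(alpha, lam, theta_e, theta_m, tau_e, tau_m, A_e, A_m):
    """Wage from Proposition 52: zero under Condition (ES), otherwise (18.7)."""
    if theta_e + theta_m <= 1.0 / lam:  # Condition (ES): excess supply of labor
        return 0.0
    return min(net_marginal_product(alpha, tau_e, A_e),
               net_marginal_product(alpha, tau_m, A_m))
```

With illustrative values such as `alpha = 0.5`, `lam = 2`, and group sizes of 0.4 each, Condition (ES) fails and the wage equals the smaller of the two net marginal products; evaluating `net_profit` around `capital_demand(...)` gives a quick check that (18.5) is indeed the profit-maximizing capital choice for a given employment level.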

18.2.3 Inefficient Policies

Now I use the above economic environment to illustrate a number of distinct sources of inefficient policies. In this section, political institutions correspond to the dictatorship of the elite in the sense that they allow the elite to decide the policies, so the focus will be on the elite's desired policies. The main (potentially inefficient) policy will be a tax on middle class producers, though more generally, this could correspond to expropriation, corruption or entry barriers. As discussed in the introduction, there will be three mechanisms leading to inefficient policies: (1) Revenue Extraction; (2) Factor Price Manipulation; and (3) Political Consolidation.

To illustrate each mechanism in the simplest possible way, I will focus on a subset of the parameter space and abstract from other interactions. Throughout, I assume that there is an upper bound on taxation, so that $\tau_t^m \le \bar{\tau}$ and $\tau_t^e \le \bar{\tau}$, where $\bar{\tau} \le 1$. This limit can be institutional, or may arise because of the ability of producers to hide their output or shift into informal production.

The timing of events within each period is as follows: first, taxes are set; then, investments are made. This removes an additional source of inefficiency related to the holdup problem whereby groups in power may seize all of the output of other agents in the economy once it has been produced. Holdup will be discussed below.

To start with, I focus on Markov Perfect Equilibria (MPE) of this economy, where strategies are only dependent on payoff-relevant variables. In this context, this means that strategies are independent of past taxes and investments (since there is full depreciation). In the dictatorship of the elite, policies will be chosen to maximize the elite's utility. Hence, a political equilibrium is given by a sequence of policies $\{\tau_t^e, \tau_t^m, T_t^w, T_t^m, T_t^e\}_{t=0,1,\dots,\infty}$ (satisfying (18.3)) which maximizes the elite's utility, taking the economic equilibrium as a function of the sequence of policies as given.

More specifically, substituting (18.5) into (18.2), we obtain elite consumption as

$$c_t^e = \left[ \frac{\alpha}{1-\alpha}(1-\tau_t^e)^{1/\alpha} A^e - w_t \right] l_t^e + T_t^e, \tag{18.9}$$

with $w_t$ given by (18.7). This expression follows immediately by recalling that the first term in square brackets is the after-tax profits per worker, while the second term is the equilibrium wage. Total consumption per elite agent is given by their profits plus the lump-sum transfer they receive. Then the political equilibrium, starting at time $t = 0$, is simply given by a sequence of $\{\tau_t^e, \tau_t^m, T_t^w, T_t^m, T_t^e\}_{t=0,1,\dots,\infty}$ that satisfies (18.3) and maximizes the discounted utility of the elite, $\sum_{t=0}^{\infty} \beta^t c_t^e$.

The determination of the political equilibrium is simplified further by the fact that in the MPE with full capital depreciation, this problem is simply equivalent to maximizing (18.9). We now characterize this political equilibrium under a number of different scenarios.

18.2.4 Revenue Extraction

To highlight this mechanism, suppose that Condition (ES) holds, so wages are constant at zero. This removes any effect of taxation on factor prices. In this case, from (18.6), we also have $l_t^j = \lambda$ for all producers. Also assume that $\phi > 0$ (for example, $\phi = 1$).

It is straightforward to see that the elite will never tax themselves, so $\tau_t^e = 0$, and will redistribute all of the government revenues to themselves, so $T_t^w = T_t^m = 0$. Consequently taxes will be set in order to maximize tax revenue, given by

$$\text{Revenue}_t = \frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m \lambda \theta^m + R \tag{18.10}$$

at time $t$, where the first term is obtained by substituting $l_t^m = \lambda$ and (18.5) into (18.2), multiplying by $\tau_t^m$, and taking into account that there are $\theta^m$ middle class producers and a fraction $\phi$ of tax revenues can be redistributed. The second term is simply the revenues from natural resources.

It is clear that tax revenues are maximized by $\tau_t^m = \alpha$ (the first-order condition of (18.10) is $1 - \frac{1-\alpha}{\alpha} \frac{\tau_t^m}{1-\tau_t^m} = 0$). In other words, this is the tax rate that puts the elite at the peak of their Laffer curve. In contrast, output maximization would require $\tau_t^m = 0$. However, the output-maximizing tax rate is not an equilibrium because, despite the distortions, the elite would prefer a higher tax rate to increase their own consumption.

At the root of this inefficiency is a limit on the tax instruments available to the elite. If they could impose lump-sum taxes that would not distort investment, these would be preferable. Inefficient policies here result from the redistributive desires of the elite coupled with the absence of lump-sum taxes. It is also interesting to note that as $\alpha$ increases, the extent of distortions is reduced, since there are greater diminishing returns to capital and investment will not decline much in response to taxes.

Even though $\tau_t^m = \alpha$ is the most preferred tax for the elite, the exogenous limit on taxation may become binding, so the equilibrium tax is

$$\tau_t^m = \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\} \tag{18.11}$$

for all $t$. In this case, equilibrium taxes depend only on the production technology (in particular, how distortionary taxes are) and on the exogenous limit on taxation. For example, as $\alpha$ decreases and the production function becomes more linear in capital, equilibrium taxes decline. This discussion is summarized in the following proposition (proof in the text):

Proposition 53. Suppose Assumption 15 and Condition (ES) hold and $\phi > 0$, then the unique political equilibrium features $\tau_t^m = \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\}$ for all $t$.
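The Laffer-curve logic behind Proposition 53 can be checked with a short numerical sketch. The snippet below is an illustration only: the function names, the grid search, and the parameter values are assumptions introduced here, while the revenue formula is (18.10) and the closed form is (18.11).

```python
def tax_revenue(tau_m, alpha, phi, A_m, lam, theta_m, R):
    """Tax revenue from (18.10) when every middle class producer employs lam workers."""
    distortion = (1.0 - tau_m) ** ((1.0 - alpha) / alpha)
    return phi / (1.0 - alpha) * tau_m * distortion * A_m * lam * theta_m + R


def revenue_extraction_tax(alpha, tau_bar):
    """Equilibrium tax from (18.11): the Laffer peak alpha, capped at tau_bar."""
    return min(alpha, tau_bar)


def grid_revenue_maximizer(alpha, tau_bar, phi, A_m, lam, theta_m, R, n=10001):
    """Brute-force maximizer of (18.10) over an evenly spaced grid on [0, tau_bar]."""
    grid = [tau_bar * i / (n - 1) for i in range(n)]
    return max(grid, key=lambda tau: tax_revenue(tau, alpha, phi, A_m, lam, theta_m, R))
```

For any positive values of `phi`, `A_m`, `lam` and `theta_m`, the grid maximizer should land within one grid step of `min(alpha, tau_bar)`, which reflects the fact that the peak of the Laffer curve depends only on the technology parameter and the cap.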

18.2.5 Factor Price Manipulation

I now investigate how inefficient policies can arise in order to manipulate factor prices. To highlight this mechanism in the simplest possible way, let us first assume that $\phi = 0$ so that there are no direct benefits from taxation for the elite. There are indirect benefits, however, because of the effect of taxes on factor prices, which will be present as long as the equilibrium wage is positive. For this reason, I now suppose that Condition (ES) does not hold, so that the equilibrium wage is given by (18.7).

Inspection of (18.7) and (18.9) then immediately reveals that the elite prefer high taxes in order to reduce the labor demand from the middle class, and thus wages, as much as possible. The desired tax rate for the elite is thus $\tau_t^m = 1$. Given constraints on taxation, the equilibrium tax is $\tau_t^m = \tau^{FPM} \equiv \bar{\tau}$ for all $t$. We therefore have:

Proposition 54. Suppose Assumption 15 holds, Condition (ES) does not hold, and $\phi = 0$, then the unique political equilibrium features $\tau_t^m = \tau^{FPM} \equiv \bar{\tau}$ for all $t$.

This result suggests that the factor price manipulation mechanism generally leads to higher taxes than the pure revenue extraction mechanism. This is because, with the factor price manipulation mechanism, the objective of the elite is to reduce the profitability of the middle class as much as possible, whereas for revenue extraction, the elite would like the middle class to invest and generate revenues. It is also worth noting that, differently from the pure revenue extraction case, the tax policy of the elite is not only extracting resources from the middle class, but it is also doing so indirectly from the workers, whose wages are being reduced because of the tax policy.

The role of $\phi = 0$ also needs to be emphasized. Taxing the middle class at the highest rate is clearly inefficient. Why is there not a more efficient way of transferring resources to the elite? The answer relates to the limited fiscal instruments available to the elite. In particular, $\phi = 0$ implies that they cannot use taxes at all to extract revenues from the middle class, so they are forced to use inefficient means of increasing their consumption, by directly impoverishing the middle class. In the next subsection, I discuss how the factor price manipulation mechanism works in the presence of an instrument that can directly raise revenue from the middle class. This will illustrate that the absence of any means of transferring resources from the middle class to the elite is not essential for the factor price manipulation mechanism (though the absence of non-distortionary lump-sum taxes is naturally important).
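The factor price manipulation motive can be illustrated with a small sketch of elite consumption when $\phi = 0$. This is an assumption-laden illustration rather than part of the original analysis: the function names and parameter values are hypothetical, the elite are assumed not to tax themselves by default, and each elite producer is assumed to employ $\lambda$ workers whenever profits per worker are positive, as in (18.6).

```python
def net_marginal_product(alpha, tau, A):
    """Net marginal product of a worker: alpha/(1-alpha) * (1-tau)^(1/alpha) * A."""
    return alpha / (1.0 - alpha) * (1.0 - tau) ** (1.0 / alpha) * A


def elite_consumption_fpm(tau_m, alpha, lam, A_e, A_m, tau_e=0.0):
    """Elite consumption from (18.9) with no transfers (phi = 0) and (ES) failing."""
    elite_product = net_marginal_product(alpha, tau_e, A_e)
    wage = min(elite_product, net_marginal_product(alpha, tau_m, A_m))  # (18.7)
    profit_per_worker = elite_product - wage
    # When profits per worker are zero, the employment level does not matter.
    return profit_per_worker * lam
```

Evaluating `elite_consumption_fpm` on an increasing sequence of `tau_m` values yields a nondecreasing sequence, which is the sense in which the elite's desired tax rate in this case is as high as possible.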

18.2.6 Revenue Extraction and Factor Price Manipulation Combined

I now combine the two effects isolated in the previous two subsections. By itself the factor price manipulation effect led to the extreme result that the tax on the middle class should be as high as possible. Revenue extraction, though typically another motive for imposing taxes on the middle class, will serve to reduce the power of the factor price manipulation effect. The reason is that high taxes also reduce the revenues extracted by the elite (moving the economy beyond the peak of the Laffer curve), and are costly to the elite.

To characterize the equilibrium in this case again necessitates the maximization of (18.9). This is simply the same as maximizing transfers minus wage bill for each elite producer. As before, transfers are obtained from (18.10), while wages are given by (18.7). When Condition (ES) holds and there is excess supply of labor, wages are equal to zero, and we obtain the same results as in the case of pure revenue extraction. The interesting case is the one where (ES) does not hold, so that wages are not equal to zero, and are given by the minimum of the two expressions in (18.7). Incorporating the fact that the elite will not tax themselves and will redistribute all the revenues to themselves, the maximization problem can be written as

$$\max_{\tau_t^m} \left[ \frac{\alpha}{1-\alpha} A^e - w_t \right] l_t^e + \frac{1}{\theta^e} \left[ \frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m l_t^m \theta^m + R \right], \tag{18.12}$$

subject to (18.7) and

$$\theta^e l_t^e + \theta^m l_t^m = 1, \tag{18.13}$$

and

$$l_t^m = \lambda \quad \text{if } (1-\tau_t^m)^{1/\alpha} A^m \ge A^e. \tag{18.14}$$

The first term in (18.12) is the elite's net revenues and the second term is the transfer they receive. Equation (18.13) is the market clearing constraint, while (18.14) ensures that middle class producers employ as much labor as they wish provided that their net productivity is greater than that of elite producers.

The solution to this problem can take two different forms depending on whether (18.14) holds in the solution. If it does, then $w = \alpha A^e / (1-\alpha)$, and elite producers make zero profits and their only income is derived from transfers. Intuitively, this corresponds to the case where the elite prefer to let the middle class producers undertake all of the profitable activities and maximize tax revenues. If, on the other hand, (18.14) does not hold, then the elite generate revenues both from their own production and from taxing the middle class producers. In this case $w = \alpha (1-\tau^m)^{1/\alpha} A^m / (1-\alpha)$. Rather than provide a full taxonomy, I impose the following additional assumption:

Assumption 16. $A^e \ge \phi (1-\alpha)^{(1-\alpha)/\alpha} A^m \dfrac{\theta^m}{\theta^e}$.

This assumption ensures that the solution will always take the latter form (i.e., (18.14) does not hold). Intuitively, this condition makes sure that the productivity gap between the middle class and elite producers is not so large as to make it attractive for the elite to make zero profits themselves (recall that $\phi (1-\alpha)^{(1-\alpha)/\alpha} < 1$, so if $\theta^e = \theta^m$ and $A^e = A^m$, this condition is always satisfied).

Consequently, when Assumption 16 holds, we have $w_t = \alpha (1-\tau_t^m)^{1/\alpha} A^m / (1-\alpha)$, and the elite's problem simply boils down to choosing $\tau_t^m$ to maximize

$$\frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m l^m \theta^m + R - \theta^e \lambda \frac{\alpha}{1-\alpha} (1-\tau_t^m)^{1/\alpha} A^m, \tag{18.15}$$

where I have used the fact that all elite producers will employ $\lambda$ employees, and from (18.13), $l^m = (1 - \theta^e \lambda)/\theta^m$. The maximization of (18.15) gives

$$\frac{\tau_t^m}{1-\tau_t^m} = \kappa(\lambda, \theta^e, \alpha, \phi) \equiv \frac{\alpha}{1-\alpha} \left( 1 + \frac{\lambda \theta^e}{(1 - \lambda \theta^e)\phi} \right).$$

The first interesting feature is that $\kappa(\lambda, \theta^e, \alpha, \phi)$ is always less than $\infty$. This implies that $\tau_t^m$ is always less than 1, which is the desired tax rate in the case of pure factor price manipulation. Moreover, $\kappa(\lambda, \theta^e, \alpha, \phi)$ is strictly greater than $\alpha/(1-\alpha)$, so that $\tau_t^m$ is always greater than $\alpha$, the desired tax rate with pure revenue extraction. Therefore, the factor price manipulation motive always increases taxes above the pure revenue maximizing level (beyond the peak of the Laffer curve), while the revenue maximization motive reduces taxes relative to the pure factor price manipulation case. Naturally, if this level of tax is greater than $\bar{\tau}$, the equilibrium tax will be $\bar{\tau}$, i.e.,

$$\tau_t^m = \tau^{COM} \equiv \min\left\{ \frac{\kappa(\lambda, \theta^e, \alpha, \phi)}{1 + \kappa(\lambda, \theta^e, \alpha, \phi)}, \; \bar{\tau} \right\}. \tag{18.16}$$

It is also interesting to look at the comparative statics of this tax rate. First, as $\phi$ increases, taxation becomes more beneficial (generates greater revenues), but $\tau^{COM}$ declines. This might at first appear paradoxical, since one may have expected that as taxation becomes less costly, taxes should increase. Intuition for this result follows from the observation that an increase in $\phi$ raises the importance of revenue extraction, and as commented above, in this case, revenue extraction is a force towards lower taxes (it makes it more costly for the elite to move beyond the peak of the Laffer curve). Since the parameter $\phi$ is related, among other things, to state capacity, this comparative static result suggests that higher state capacity will translate into lower taxes, because greater state capacity enables the elite to extract revenues from the middle class through taxation, without directly impoverishing them. In other words, greater state capacity enables more efficient forms of resource extraction by the groups holding political power.

Second, as $\theta^e$ increases and the number of elite producers increases, taxes also increase. The reason for this effect is again the interplay between the revenue extraction and factor price manipulation mechanisms. When there are more elite producers, reducing factor prices becomes more important relative to gathering tax revenue. One interesting implication of this discussion is that when the factor price manipulation effect is more important, there will typically be greater inefficiencies. Finally, an increase in $\alpha$ raises taxes for exactly the same reason as above; taxes create fewer distortions and this increases the revenue-maximizing tax rate. Once again summarizing the analysis:

Proposition 55. Suppose Assumptions 15 and 16 hold, Condition (ES) does not hold, and $\phi > 0$. Then the unique political equilibrium features $\tau_t^m = \tau^{COM}$ as given by (18.16) for all $t$.
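The closed form in (18.16) and its comparative statics can be checked against a direct maximization of (18.15). The sketch below is illustrative only: function names and parameter values are assumptions, and it presumes $\phi > 0$ and $\lambda \theta^e < 1$ so that $\kappa$ is finite.

```python
def kappa(lam, theta_e, alpha, phi):
    """kappa(lambda, theta_e, alpha, phi) from the first-order condition of (18.15)."""
    crowding = lam * theta_e / ((1.0 - lam * theta_e) * phi)
    return alpha / (1.0 - alpha) * (1.0 + crowding)


def combined_tax(lam, theta_e, alpha, phi, tau_bar):
    """Equilibrium tax from (18.16): kappa/(1+kappa), capped at tau_bar."""
    k = kappa(lam, theta_e, alpha, phi)
    return min(k / (1.0 + k), tau_bar)


def elite_objective(tau_m, lam, theta_e, alpha, phi, A_m, R):
    """Objective (18.15), using theta_m * l_m = 1 - theta_e * lam from (18.13)."""
    revenue = (phi / (1.0 - alpha) * tau_m
               * (1.0 - tau_m) ** ((1.0 - alpha) / alpha)
               * A_m * (1.0 - theta_e * lam))
    wage_bill = (theta_e * lam * alpha / (1.0 - alpha)
                 * (1.0 - tau_m) ** (1.0 / alpha) * A_m)
    return revenue + R - wage_bill
```

Maximizing `elite_objective` over a fine grid of tax rates should reproduce `combined_tax` when the cap does not bind. Raising `phi` lowers the resulting tax and raising `theta_e` increases it, in line with the comparative statics discussed in the text.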

18.2.7 Political Consolidation

I now discuss another reason for inefficient taxation, the desire of the elite to preserve their political power. This mechanism has been absent so far, since the elite were assumed to always remain in power. To illustrate it, the model needs to be modified to allow for endogenous switches of power. Institutional change will be discussed in greater detail later. For now, let us assume that there is a probability $p_t$ in period $t$ that political power permanently shifts from the elite to the middle class. Once they come to power, the middle class will pursue a policy that maximizes their own utility. When this probability is exogenous, the previous analysis still applies. Interesting economic interactions arise when this probability is endogenous. Here I will use a simple (reduced-form) model to illustrate the trade-offs and assume that this probability is a function of the income level of the middle class agents, in particular

$$p_t = p(\theta^m c_t^m) \in [0, 1], \tag{18.17}$$

where I have used the fact that income is equal to consumption. Let us assume that $p$ is continuous and differentiable with $p' > 0$, which captures the fact that when the middle class producers are richer, they have greater de facto political power. This reduced-form formulation might capture a variety of mechanisms. For example, when the middle class are richer, they may be more successful in solving their collective action problems or they may increase their military power.

This modification implies that the fiscal policy that maximizes current consumption may no longer be optimal. To investigate this issue we now write the utility of elite agents recursively, and denote it by $V^e(E)$ when they are in power and by $V^e(M)$ when the middle class is in power. Naturally, we have

$$V^e(E) = \max_{\tau_t^m} \left\{ \left[ \frac{\alpha A^e}{1-\alpha} - w_t \right] l_t^e + \frac{1}{\theta^e} \left[ \frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m l_t^m \theta^m + R \right] + \beta \left[ (1-p_t) V^e(E) + p_t V^e(M) \right] \right\},$$

with

$$p_t = p\left( \theta^m \left[ \frac{\alpha}{1-\alpha} (1-\tau_t^m)^{1/\alpha} A^m l_t^m - w_t l_t^m \right] \right).$$

I wrote $V^e(E)$ and $V^e(M)$ not as functions of time, since the structure of the problem makes it clear that these values will be constant in equilibrium.

The first observation is that if the solution to the static problem involves $c_t^m = 0$, then the same fiscal policy is optimal despite the risk of losing power. This implies that, as long as Condition (ES) does not hold and Assumption 16 holds, the political consolidation mechanism does not add an additional motive for inefficient taxation.

To see the role of the political consolidation mechanism, suppose instead that Condition (ES) holds. In this case, $w_t = 0$ and the optimal static policy is $\tau_t^m = \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\}$ as discussed above and implies positive profits and consumption for middle class agents. The dynamic maximization problem then becomes

$$V^e(E) = \max_{\tau_t^m} \left\{ \frac{\alpha A^e \lambda}{1-\alpha} + \frac{1}{\theta^e} \left[ \frac{\phi}{1-\alpha} \tau_t^m (1-\tau_t^m)^{(1-\alpha)/\alpha} A^m \lambda \theta^m + R \right] + \beta V^e(E) - \beta \, p\!\left( \frac{\alpha}{1-\alpha} (1-\tau_t^m)^{1/\alpha} A^m \lambda \theta^m \right) \left( V^e(E) - V^e(M) \right) \right\}. \tag{18.18}$$

The first-order condition for an interior solution can be expressed as

$$\frac{\phi}{\theta^e} \left( 1 - \frac{1-\alpha}{\alpha} \frac{\tau_t^m}{1-\tau_t^m} \right) + \beta \, p'\!\left( \frac{\alpha}{1-\alpha} (1-\tau_t^m)^{1/\alpha} A^m \lambda \theta^m \right) \left( V^e(E) - V^e(M) \right) = 0.$$

It is clear that when $p'(\cdot) = 0$, we obtain $\tau_t^m = \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\}$ as above. However, when $p'(\cdot) > 0$, $\tau_t^m = \tau^{PC} > \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\}$ as long as $V^e(E) - V^e(M) > 0$. That $V^e(E) - V^e(M) > 0$ is the case is immediate since when the middle class are in power, they get to tax the elite and receive all of the transfers.

Intuitively, as with the factor price manipulation mechanism, the elite tax beyond the peak of the Laffer curve, yet now not to increase their revenues, but to consolidate their political power. These high taxes reduce the income of the middle class and their political power. Consequently, there is a higher probability that the elite remain in power in the future, enjoying the benefits of controlling the fiscal policy.

An interesting comparative static is that as $R$ increases, the gap between $V^e(E)$ and $V^e(M)$ increases, and the tax that the elite sets increases as well. Intuitively, the party in power receives the revenues from natural resources, $R$. When $R$ increases, the elite become more willing to sacrifice tax revenue (by overtaxing the middle class) in order to increase the probability of remaining in power, because remaining in power has now become more valuable. This contrasts with the results so far where $R$ had no effect on taxes.

More interestingly, a higher $\phi$, i.e., greater state capacity, also increases the gap between $V^e(E)$ and $V^e(M)$ (because this enables the group in power to raise more tax revenues) and thus implies a higher tax rate on the middle class. Intuitively, when there is no political competition, greater state capacity, by allowing more efficient forms of transfers, improves the allocation of resources. But in the presence of political competition, by increasing the political stakes, it leads to greater conflict and more distortionary policies.

Proposition 56. Consider the economy with political replacement. Suppose also that Assumption 15 and Condition (ES) hold and $\phi > 0$, then the political equilibrium features $\tau_t^m = \tau^{PC} > \tau^{RE}$ for all $t$. This tax rate is increasing in $R$ and $\phi$.
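To see how the first-order condition pins down $\tau^{PC}$, consider a deliberately simplified sketch. It assumes a linear replacement probability with constant slope `p_slope` (a hypothetical functional form, not taken from the text) and treats the value gap $V^e(E) - V^e(M)$ as a given number `value_gap`, although in the model this gap is endogenous and depends on $R$ and $\phi$. Under these assumptions the first-order condition can be solved in closed form for $\tau_t^m / (1 - \tau_t^m)$.

```python
def consolidation_tax(alpha, phi, theta_e, beta, p_slope, value_gap, tau_bar):
    """Tax solving the political consolidation first-order condition.

    Assumes p'(.) = p_slope is constant (linear p, hypothetical) and that
    value_gap = V^e(E) - V^e(M) is given exogenously (a simplification).
    The FOC then reads
        tau/(1-tau) = alpha/(1-alpha) * (1 + beta*p_slope*value_gap*theta_e/phi).
    """
    ratio = alpha / (1.0 - alpha) * (1.0 + beta * p_slope * value_gap * theta_e / phi)
    return min(ratio / (1.0 + ratio), tau_bar)
```

With `value_gap = 0` the function returns the revenue extraction tax $\min\{\alpha, \bar{\tau}\}$, and the tax rises with the value gap. Because the gap is held fixed here, the sketch should not be used to read off the comparative statics in $\phi$ or $R$ from Proposition 56, which operate through the gap itself.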

18.2.8 Subgame Perfect Versus Markov Perfect Equilibria

I have so far focused on Markov perfect equilibria (MPE). In general, such a focus can be restrictive. In this case, however, it can be proved that subgame perfect equilibria (SPE) coincide with the MPE. This will not be true in the next subsection, so it is useful to briefly discuss why it is the case here.

MPE are a subset of the SPE. Loosely speaking, SPEs that are not Markovian will be supported by some type of history-dependent punishment strategies. If there is no room for such history dependence, SPEs will coincide with the MPEs. In the models analyzed so far, such punishment strategies are not possible even in the SPE. Intuitively, each individual is infinitesimal and makes its economic decisions to maximize profits. Therefore, (18.5) and (18.6) determine the factor demands uniquely in any equilibrium. Given the factor demands, the payoffs from various policy sequences are also uniquely pinned down. This means that the returns to various strategies for the elite are independent of history. Consequently, there cannot be any SPEs other than the MPE characterized above. Therefore, we have:

Proposition 57. The MPEs characterized in Propositions 53-56 are the unique SPEs.

18.2.9 Lack of Commitment: Holdup

The models discussed so far featured full commitment to taxes by the elites. Using a term from organizational economics, this corresponds to the situation without any holdup. Holdup (lack of commitment to taxes or policies) changes the qualitative implications of the model; if expropriation (or taxation) happens after investments, revenues generated by investments can be ex post captured by others. These types of holdup problems are likely to arise when the key investments are long-term, so that various policies will be determined and implemented after these investments are made (and sunk).

The problem with holdup is that the elite will be unable to commit to a particular tax rate before middle class producers undertake their investments (taxes will be set after investments). This lack of commitment will generally increase the amount of taxation and inefficiency. To illustrate this possibility, I consider the same model as above, but change the timing of events such that first individual producers undertake their investments and then the elite set taxes. The economic equilibrium is unchanged, and in particular, (18.5) and (18.6) still determine factor demands, with the only difference that $\tau^m$ and $\tau^e$ now refer to expected taxes. Naturally, in equilibrium expected and actual taxes coincide.

What is different is the calculus of the elite in setting taxes. Previously, they took into account that higher taxes would discourage investment. Since, now, taxes are set after investment decisions, this effect is absent. As a result, in the MPE, the elite will always want to tax at the maximum rate, so in all cases, there is a unique MPE where $\tau_t^m = \tau^{HP} \equiv \bar{\tau}$ for all $t$.

Proposition 58. With holdup, there is a unique political equilibrium with $\tau_t^m = \tau^{HP} \equiv \bar{\tau}$ for all $t$.

It is clear that this holdup equilibrium is more inefficient than the equilibria characterized above. For example, imagine a situation in which Condition (ES) holds so that with the original timing of events (without holdup), the equilibrium tax rate is $\tau_t^m = \alpha$. Consider the extreme case where $\bar{\tau} = 1$. Now without holdup, $\tau_t^m = \alpha$ and there is positive economic activity by the middle class producers. In contrast, with holdup, the equilibrium tax is $\tau_t^m = 1$ and the middle class stop producing. This is naturally very costly for the elite as well since they lose all their tax revenues.

In this model, it is no longer true that the MPE is the only SPE, since there is room for an implicit agreement between different groups whereby the elite (credibly) promise a different tax rate than $\bar{\tau}$. To illustrate this, consider the example where Condition (ES) holds and $\bar{\tau} = 1$. Recall that the history of the game is the complete set of actions taken up to that point. In the MPE, the elite raise no tax revenue from the middle class producers. Instead, consider the following trigger-strategy combination: the elite always set $\tau^m = \alpha$ and the middle class producers invest according to (18.5) with $\tau^m = \alpha$ as long as the history consists of $\tau^m = \alpha$ and investments have been consistent with (18.5). If there is any other action in the history, the elite set $\tau^m = 1$ and the middle class producers invest zero. With this strategy profile, the elite raise a tax revenue of $\phi \alpha (1-\alpha)^{(1-\alpha)/\alpha} A^m \lambda \theta^m / (1-\alpha)$ in every period, and receive transfers worth

$$\frac{\phi \alpha (1-\alpha)^{(1-\alpha)/\alpha} A^m \lambda \theta^m}{(1-\beta)(1-\alpha)}. \tag{18.19}$$

If, in contrast, they deviate at any point, the most profitable deviation for them is to set $\tau^m = 1$, and they will raise

$$\frac{\phi (1-\alpha)^{(1-\alpha)/\alpha} A^m \lambda \theta^m}{1-\alpha}. \tag{18.20}$$

The trigger-strategy profile will be an equilibrium as long as (18.19) is greater than or equal to (18.20), which requires $\beta \ge 1 - \alpha$. Therefore we have:

Proposition 59. Consider the holdup game, and suppose that Assumption 15 and Condition (ES) hold and $\bar{\tau} = 1$. Then for $\beta \ge 1 - \alpha$, there exists a subgame perfect equilibrium where $\tau_t^m = \alpha$ for all $t$.

An important implication of this result is that in societies where there are greater holdup problems, for example, because typical investments involve longer horizons, there is room for coordinating on a subgame perfect equilibrium supported by an implicit agreement (trigger strategy profile) between the elite and the rest of the society.
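The sustainability condition in Proposition 59 amounts to comparing (18.19) with (18.20). A minimal sketch, with hypothetical function names and illustrative parameter values, is:

```python
def one_shot_tax_base(alpha, phi, A_m, lam, theta_m):
    """Redistributable output of middle class producers who invested expecting tau = alpha."""
    return phi * (1.0 - alpha) ** ((1.0 - alpha) / alpha) * A_m * lam * theta_m / (1.0 - alpha)


def cooperation_value(alpha, beta, phi, A_m, lam, theta_m):
    """Discounted transfers from taxing at alpha forever, equation (18.19)."""
    return alpha * one_shot_tax_base(alpha, phi, A_m, lam, theta_m) / (1.0 - beta)


def deviation_value(alpha, phi, A_m, lam, theta_m):
    """One-time revenue from deviating to tau = 1, equation (18.20); zero afterwards."""
    return one_shot_tax_base(alpha, phi, A_m, lam, theta_m)


def trigger_strategy_sustainable(alpha, beta, phi, A_m, lam, theta_m):
    """True when (18.19) >= (18.20), which reduces to beta >= 1 - alpha."""
    return (cooperation_value(alpha, beta, phi, A_m, lam, theta_m)
            >= deviation_value(alpha, phi, A_m, lam, theta_m))
```

Since both values share the same tax base, the comparison depends only on $\alpha / (1 - \beta)$, so the other parameters drop out of the condition even though they scale both payoffs.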

18.2.10 Technology Adoption and Holdup

Suppose now that taxes are set before investments, so the source of holdup in the previous subsection is absent. Instead, suppose that at time $t = 0$ before any economic decisions or policy choices are made, middle class agents can invest to increase their productivity. In particular, suppose that there is a cost $\Gamma(A^m)$ of investing in productivity $A^m$. The function $\Gamma$ is non-negative, continuously differentiable and convex. This investment is made once and the resulting productivity $A^m$ applies forever after.

Once investments in technology are made, the game proceeds as before. Since investments in technology are sunk after date $t = 0$, the equilibrium allocations are the same as in the results presented above. Another interesting question is whether, if they could, the elite would prefer to commit to a tax rate sequence at time $t = 0$. The analysis of this case follows closely that of the baseline model, and I simply state the results (without proofs to save space):

Proposition 60. Consider the game with technology adoption and suppose that Assumption 15 holds, Condition (ES) does not hold, and $\phi = 0$, then the unique political equilibrium features $\tau_t^m = \tau^{FPM} \equiv \bar{\tau}$ for all $t$. Moreover, if the elite could commit to a tax sequence at time $t = 0$, then they would still choose $\tau_t^m = \tau^{FPM} \equiv \bar{\tau}$.

That this is the unique MPE is quite straightforward. It is also intuitive that it is the unique SPE. In fact, the elite would choose exactly this tax rate even if they could commit at time $t = 0$. The reason is as follows: in the case of pure factor price manipulation, the only objective of the elite is to reduce the middle class labor demand, so they have no interest in increasing the productivity of middle class producers.

For contrast, let us next consider the pure revenue extraction case with Condition (ES) satisfied. Once again, the MPE is identical to before. As a result, the first-order condition for an interior solution to the middle class producers' technology choice is:

$$\Gamma'(A^m) = \frac{1}{1-\beta} \frac{\alpha}{1-\alpha} (1-\tau^m)^{1/\alpha} \lambda, \tag{18.21}$$

where $\tau^m$ is the constant tax rate that they will face in all future periods. In the pure revenue extraction case, recall that the equilibrium is $\tau^m = \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\}$. With the same arguments as before, this is also the unique SPE. Once the middle class producers have made their technology decisions, there is no history-dependent action left, and it is impossible to create history-dependent punishment strategies to support a tax rate different than the static optimum for the elite. Nevertheless, this is not necessarily the allocation that the elite prefer. If the elite could commit to a tax rate sequence at time $t = 0$, they would choose lower taxes. To illustrate this, suppose that they can commit to a constant tax rate (it is straightforward to show that they will in fact choose a constant tax rate even without this restriction, but this restriction saves on notation). Therefore, the optimization problem of the elite is to maximize tax revenues taking the relationship between taxes and technology as in (18.21) as given. In other words, they will solve:

$$\max_{\tau^m} \; \phi \tau^m (1-\tau^m)^{(1-\alpha)/\alpha} A^m \lambda \theta^m / (1-\alpha)$$

subject to (18.21). The constraint (18.21) incorporates the fact that (expected) taxes affect technology choice. The first-order condition for an interior solution can be expressed as

$$A^m - \frac{1-\alpha}{\alpha} \frac{\tau^m}{1-\tau^m} A^m + \tau^m \frac{dA^m}{d\tau^m} = 0,$$

where $dA^m/d\tau^m$ takes into account the effect of future taxes on technology choice at time $t = 0$. This expression can be obtained from (18.21) as:

$$\frac{dA^m}{d\tau^m} = -\frac{1}{1-\beta} \frac{1}{1-\alpha} \frac{(1-\tau^m)^{(1-\alpha)/\alpha} \lambda}{\Gamma''(A^m)} < 0.$$

This implies that the solution to this maximization problem satisfies $\tau^m = \tau^{TA} < \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\}$. If they could, the elite would like to commit to a lower tax rate in the future in order to encourage the middle class producers to undertake technological improvements. Their inability to commit to such a tax policy leads to greater inefficiency than in the case without technology adoption. Summarizing this discussion:

Proposition 61. Consider the game with technology adoption, and suppose that Assumption 15 and Condition (ES) hold and $\phi > 0$, then the unique political equilibrium features $\tau_t^m = \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\}$ for all $t$. If the elite could commit to a tax policy at time $t = 0$, they would prefer to commit to $\tau^{TA} < \tau^{RE}$.

An important feature is that in contrast to the pure holdup problem where SPE could prevent the additional inefficiency (when $\beta \ge 1 - \alpha$, recall Proposition 59), with the technology adoption game, the inefficiency survives the SPE. The reason is that, since middle class producers invest only once at the beginning, there is no possibility of using history-dependent punishment strategies. This illustrates the limits of implicit agreements to keep tax rates low. Such agreements not only require a high discount factor ($\beta \ge 1 - \alpha$), but also frequent investments by the middle class, so that there is a credible threat against the elite if they deviate from the promised policies. When such implicit agreements fail to prevent the most inefficient policies, there is greater need for economic institutions to play the role of placing limits on future policies.
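The gap between the commitment tax $\tau^{TA}$ and $\tau^{RE}$ can be made concrete with a specific cost function. The sketch below assumes a quadratic cost $\Gamma(A) = \gamma A^2 / 2$, which is a hypothetical choice made here for illustration (the text only requires $\Gamma$ to be non-negative, continuously differentiable and convex); function names and parameter values are likewise assumptions. With this cost, (18.21) gives $A^m$ proportional to $(1-\tau^m)^{1/\alpha}$, the elite's objective is proportional to $\tau^m (1-\tau^m)^{(2-\alpha)/\alpha}$, and the commitment tax works out to $\alpha/2$.

```python
def technology_choice(tau_m, alpha, beta, lam, gamma):
    """A^m solving (18.21) under the hypothetical cost Gamma(A) = gamma * A**2 / 2."""
    return alpha * (1.0 - tau_m) ** (1.0 / alpha) * lam / ((1.0 - beta) * (1.0 - alpha) * gamma)


def commitment_revenue(tau_m, alpha, beta, phi, lam, theta_m, gamma):
    """Per-period tax revenue when technology responds to the committed tax rate."""
    A_m = technology_choice(tau_m, alpha, beta, lam, gamma)
    return phi * tau_m * (1.0 - tau_m) ** ((1.0 - alpha) / alpha) * A_m * lam * theta_m / (1.0 - alpha)


def commitment_tax(alpha, beta, phi, lam, theta_m, gamma, n=10001):
    """Grid-search maximizer of commitment_revenue over tax rates in [0, 1]."""
    grid = [i / (n - 1) for i in range(n)]
    return max(grid, key=lambda tau: commitment_revenue(tau, alpha, beta, phi, lam, theta_m, gamma))
```

Under this functional form the grid search should return a value close to `alpha / 2`, strictly below the no-commitment tax `alpha`, which illustrates the inequality $\tau^{TA} < \tau^{RE}$ in Proposition 61 for one particular cost function.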

18.2.11

The previous analysis shows how inefficient policies emerge out of the desire of the elite, which possesses political power, to redistribute resources towards themselves. I now discuss the implications of these mechanisms for inefficient institutions. Since the elite prefer to implement inefficient policies to transfer resources from the rest of the society (the middle class and the workers) to themselves, they will also prefer inefficient economic institutions that enable and support these inefficient policies. To illustrate the main economic interactions, I consider two prototypical economic institutions: (1) Security of property rights; there may be constitutional or other limits on the extent of redistributive taxation and/or other policies that reduce the profitability of producers' investments. In terms of the model above, we can think of this as determining the level of $\bar{\tau}$. (2) Regulation of technology, which concerns direct or indirect factors affecting the productivity of producers, in particular middle class producers. As pointed out in the introduction, the main role of institutions is to provide the framework for the determination of policies, and consequently, preferences over institutions are derived from preferences over policies and economic allocations. Bearing this in mind, let

us now discuss the determination of economic institutions in the model presented here. To simplify the discussion, for the rest of the analysis, and in particular, throughout this section, I focus on MPE, and start with security of property rights. The environment is the same as in the previous section, with the only difference that at time $t = 0$, before any decisions are taken, the elite can reduce $\bar{\tau}$, say from $\bar{\tau}^H$ to some level in the interval $[0, \bar{\tau}^H]$, thus creating an upper bound on taxes and providing greater security of property rights to the middle class. The key question is whether the elite would like to do so, i.e., whether they prefer $\bar{\tau} = \bar{\tau}^H$ or $\bar{\tau} < \bar{\tau}^H$. The next three propositions answer this question:

Proposition 62 Without holdup and technology adoption, the elite prefer $\bar{\tau} = \bar{\tau}^H$.

The proof of this result is immediate, since without holdup or technology adoption, putting further restrictions on the taxes can only reduce the elite's utility. This proposition implies that if economic institutions are decided by the elite (which is the natural benchmark since they are the group with political power), they will in general choose not to provide additional security of property rights to other producers. Therefore, the underlying economic institutions will support the inefficient policies discussed above. The results are different when there are holdup concerns. To illustrate this, suppose that the timing of taxation decision is after the investment decisions (so that there is the holdup problem), and consider the case with revenue extraction and factor price manipulation combined. In this case, the elite would like to commit to a lower tax rate than $\bar{\tau}^H$ in order to encourage the middle class to undertake greater investments, and this creates a useful role for economic institutions (to limit future taxes):

Proposition 63 Consider the game with holdup and suppose Assumptions 15 and 16 hold,

Condition (ES) does not hold, and $\phi > 0$, then as long as $\tau^{COM}$ given by (18.16) is less than $\bar{\tau}^H$, the elite prefer $\bar{\tau} = \tau^{COM}$.

The proof is again immediate. While $\tau^{COM}$ maximizes the elite's utility, in the presence of holdup the MPE involves $\tau = \bar{\tau}^H$, and the elite can benefit by using economic institutions to manipulate equilibrium taxes. This result shows that the elite may provide additional property rights protection to producers in the presence of holdup problems. The reason is that because of holdup, equilibrium taxes are too high even relative to those that the elite would prefer. By manipulating economic institutions, the elite may approach their desired policy (in fact, they can exactly commit to the tax rate that maximizes their utility). Finally, for similar reasons, in the economy with technology adoption discussed above, the elite will again prefer to change economic institutions to restrict future taxes:

Proposition 64 Consider the game with holdup and technology adoption, and suppose that Assumption 15 and Condition (ES) hold and $\phi > 0$, then as long as $\tau^{TA} < \bar{\tau}^H$, the elite prefer $\bar{\tau} = \tau^{TA}$.

As before, when we look at SPE, with pure holdup, there may not be a need for changing economic institutions, since credible implicit promises might play the same role (as long as $\beta \geq 1-\alpha$, as shown in Proposition 59). However, parallel to the results above, in the technology adoption game, SPE and MPE coincide, so a change in economic institutions is necessary for a credible commitment to a low tax rate (here $\tau^{TA}$). Turning to the regulation of technology now, we see that economic institutions also have a major effect on the environment for technology adoption or more directly the technology choices of producers. For example, by providing infrastructure or protection of intellectual

property rights, a society may improve the technology available to its producers. Conversely, the elite may want to block, i.e., take active actions against, the technological improvements of the middle class. Therefore the question is: do the elite have an interest in increasing the productivity of the middle class as much as possible? Consider the baseline model. Suppose that there exists a government policy $g \in \{0, 1\}$, which influences only the productivity of middle class producers, i.e., $A^m = A^m(g)$, with $A^m(1) > A^m(0)$. Assume that the choice of $g$ is made at $t = 0$ before any other decisions, and has no other influence on payoffs (and in particular, it imposes no costs on the elite). Will the elite always choose $g = 1$, increasing the middle class producers' productivity, or will they try to block technology adoption by the middle class? When the only mechanism at work is revenue extraction, the answer is that the elite would like the middle class to have the best technology:

Proposition 65 Suppose Assumption 15 and Condition (ES) hold and $\phi > 0$, then $w = 0$ and the elite always choose $g = 1$.

The proof follows immediately since $g = 1$ increases the tax revenues and has no other effect on the elite's consumption. Consequently, in this case, the elite would like the producers to be as productive as possible, so that they generate greater tax revenues. Intuitively, there is no competition between the elite and the middle class (either in factor markets or in the political arena), and when the middle class is more productive, the elite generate greater tax revenues. The situation is different when the elite wish to manipulate factor prices:

Proposition 66 Suppose Assumption 15 holds, Condition (ES) does not hold, $\phi = 0$, and $\bar{\tau} < 1$, then the elite choose $g = 0$.

Once again the proof of this proposition is straightforward. With $\bar{\tau} < 1$, labor demand from the middle class is high enough to generate positive equilibrium wages. Since $\phi = 0$, taxes raise no revenues for the elite, and their only objective is to reduce the labor demand from the middle class and wages as much as possible. This makes $g = 0$ the preferred policy for the elite. Consequently, the factor price manipulation mechanism suggests that, when it is within their power, the elite will choose economic institutions so as to reduce the productivity of competing (middle class) producers. The next proposition shows that a similar effect is in operation when the political power of the elite is in contention.

Proposition 67 Consider the economy with political replacement. Suppose also that Assumption 15 and Condition (ES) hold and $\phi = 0$, then the elite prefer $g = 0$.

In this case, the elite cannot raise any taxes from the middle class since $\phi = 0$. But differently from the previous proposition, there are no labor market interactions, since there is excess labor supply and wages are equal to zero. Nevertheless, the elite would like the profits from middle class producers to be as low as possible so as to consolidate their political power. They achieve this by creating an environment that reduces the productivity of middle class producers. Overall, this section has demonstrated how the elite's preferences over policies, and in particular their desire to set inefficient policies, translate into preferences over inefficient (non-growth-enhancing) economic institutions. When there are no holdup problems, introducing economic institutions that limit taxation or put other constraints on policies provides no benefits to the elite. However, when the elite are unable to commit to future taxes (because of holdup problems), equilibrium taxes may be too high even from the viewpoint of the

elite, and in this case, using economic institutions to manipulate future taxes may be beneficial. Similarly, the analysis reveals that the elite may want to use economic institutions to discourage productivity improvements by the middle class. Interestingly, this never happens when the main mechanism leading to inefficient policies is revenue extraction. Instead, when factor price manipulation and political consolidation effects are present, the elite may want to discourage or block technological improvements by the middle class.

18.3

The above analysis characterized the equilibrium under the dictatorship of the elite, a set of political institutions that gave all political power to the elite producers. An alternative is to have the dictatorship of the middle class, i.e., a system in which the middle class makes the key policy decisions (this could also be a democratic regime with the middle class as the decisive voters). Finally, another possibility is democracy in which there is voting over different policy combinations. If $\theta^e + \theta^m < 1$, then the majority are the workers, and they will pursue policies to maximize their own income. I now briefly discuss the possibility of a switch from the dictatorship of the elite to one of these two alternative regimes. It is clear that whether the dictatorship of the elite or that of the middle class is more efficient depends on the relative numbers and productivities of the two groups, and whether elite control or democracy is more efficient depends on policies in democracy. Hence, this section will first characterize the equilibrium under these alternative political institutions. Moreover, for part of the analysis in this subsection, I simplify the discussion by imposing the following assumption:


Assumption 17 $\theta^m = \theta^e < \frac{1}{2}$.

This assumption ensures that the number of middle class and elite producers is the same, and they are in the minority relative to workers.

18.3.1

With the dictatorship of the middle class, the political equilibrium is identical to the dictatorship of the elite, with the roles reversed. To avoid repetition, I will not provide a full analysis. Instead, let me focus on the case combining revenue extraction and factor price manipulation. The analog of Assumption 16 in this case is:

Assumption 18

$$A^m \geq \phi (1-\alpha)^{(1-\alpha)/\alpha} A^e \frac{\theta^e}{\theta^m}.$$

Given this assumption, a similar proposition to that above immediately follows; the middle class will tax the elite and will redistribute the proceeds to themselves, i.e., $T^w_t = T^e_t = 0$, and moreover, the same analysis as above gives their most preferred tax rate as

$$\tau^e_t = \tau^{COM} \equiv \min\left\{ \frac{\kappa(\lambda, \theta^m, \alpha, \phi)}{1 + \kappa(\lambda, \theta^m, \alpha, \phi)}, \; \bar{\tau} \right\}. \tag{18.22}$$

Proposition 68 Suppose Assumptions 15 and 17 hold, Condition (ES) does not hold, and $\phi > 0$, then the unique political equilibrium with middle class control features $\tau^e_t = \tau^{COM}$ as given by (18.22) for all $t$.

Comparing this equilibrium to the equilibrium under the dictatorship of the elite, it is apparent that the elite equilibrium will be more efficient when $A^e$ and $\theta^e$ are large relative

to $A^m$ and $\theta^m$, and the middle class equilibrium will be more efficient when the opposite is the case.

Proposition 69 Suppose Assumptions 15-18 hold, then aggregate output is higher with the dictatorship of the elite than the dictatorship of the middle class if $A^e > A^m$ and it is higher under the dictatorship of the middle class if $A^m > A^e$.

Intuitively, the group in power imposes taxes on the other group (and since $\theta^m = \theta^e$, these taxes are equal) and not on themselves, so aggregate output is higher when the group with greater productivity is in power and is spared from distortionary taxation.
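The intuition behind Proposition 69 can be sketched with a small calculation. The example below makes simplifying assumptions that are not taken from the text: every producer is assumed to employ the same amount of labor $\lambda$ in both regimes (so labor reallocation is ignored), output of a producer facing tax $\tau$ is taken to be $(1-\tau)^{(1-\alpha)/\alpha} A \lambda/(1-\alpha)$, and the function name and parameter values are hypothetical.

```python
# Illustrative sketch: compares aggregate output across the two dictatorships
# when the ruling group is untaxed and the other group faces the same tax tau.
# Equal group sizes (theta) and fixed labor per producer are assumptions.

def aggregate_output(a_in_power: float, a_taxed: float, tau: float,
                     alpha: float, lam: float, theta: float) -> float:
    """Output of both producer groups when only the out-of-power group is taxed."""
    scale = lam * theta / (1.0 - alpha)
    distortion = (1.0 - tau) ** ((1.0 - alpha) / alpha)
    return scale * (a_in_power + distortion * a_taxed)


if __name__ == "__main__":
    params = dict(tau=0.3, alpha=0.4, lam=1.0, theta=0.2)
    a_e, a_m = 2.0, 1.0  # hypothetical productivities with A^e > A^m
    y_elite_rule = aggregate_output(a_in_power=a_e, a_taxed=a_m, **params)
    y_middle_rule = aggregate_output(a_in_power=a_m, a_taxed=a_e, **params)
    print("elite rule:", y_elite_rule, "middle class rule:", y_middle_rule)
```

Because the distortion factor is below one for any positive tax, the difference in output between the two regimes has the same sign as the productivity gap between the ruling group and the taxed group.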

18.3.2 Democracy

Under Assumption (A4), workers are in the majority in democracy, and have the power to tax the elite and the middle class to redistribute to themselves. More specifically, each worker's consumption is $c^w_t = w_t + T^w_t$, with $w_t$ given by (18.7), so that workers care about equilibrium wages and transfers. Workers will then choose the sequence of policies $\{\tau^e_t, \tau^m_t, T^w_t, T^m_t, T^e_t\}_{t=0,1,\ldots}$ that satisfy (18.3) to maximize $\sum_{t=0}^{\infty} \beta^t c^w_t$.

It is straightforward to see that the workers will always set $T^m_t = T^e_t = 0$. Substituting for the transfers from (18.3), we obtain that democracy will solve the following maximization problem to determine policies:

$$\max_{\tau^e_t, \tau^m_t} \; w_t + \frac{\phi}{1-\alpha}\left[ \tau^m_t (1-\tau^m_t)^{(1-\alpha)/\alpha} A^m l^m \theta^m + \tau^e_t (1-\tau^e_t)^{(1-\alpha)/\alpha} A^e l^e \theta^e \right] + R.$$

As before, when Condition (ES) holds, taxes have no effect on wages, so the workers will tax at the revenue maximizing rate, similar to the case of revenue extraction for the elite above. This result is stated in the next proposition (proof omitted):

Proposition 70 Suppose Assumption 15 and Condition (ES) hold and $\phi > 0$, then the unique political equilibrium with democracy features $\tau^m_t = \tau^e_t = \tau^{RE} \equiv \min\{\alpha, \bar{\tau}\}$.

Therefore, in this case democracy is more inefficient than both middle class and elite control, since it imposes taxes on both groups. The same is not the case, however, when Condition (ES) does not hold and wages are positive. In this case, workers realize that by taxing the marginal group they are reducing their own wages. In fact, taxes always reduce wages more than the revenue they generate because of their distortionary effects. As a result, workers will only tax the group with the higher marginal productivity. More specifically, for example, if $A^m > A^e$, we will have $\tau^e_t = 0$, and $\tau^m_t$ will be such that $(1-\tau^m_t)^{1/\alpha} A^m = A^e$, or $\tau^m_t = \bar{\tau}$ and $(1-\bar{\tau})^{1/\alpha} A^m \geq A^e$. Therefore, we have:

Proposition 71 Suppose Assumptions 15 and 18 hold and Condition (ES) does not hold. Then in the unique political equilibrium with democracy, if $A^m > A^e$, we will have $\tau^e_t = 0$ and $\tau^m_t = \tau^{Dm}$, where $\tau^{Dm}$ will be such that $(1-\tau^{Dm})^{1/\alpha} A^m = A^e$, or $\tau^{Dm} = \bar{\tau}$ and $(1-\bar{\tau})^{1/\alpha} A^m \geq A^e$. If $A^m < A^e$, we will have $\tau^m_t = 0$ and $\tau^e_t = \tau^{De}$, where $\tau^{De}$ will be such that $(1-\tau^{De})^{1/\alpha} A^e = A^m$, or $\tau^{De} = \bar{\tau}$ and $(1-\bar{\tau})^{1/\alpha} A^e \geq A^m$.

The most interesting implication of this proposition comes from the comparison of the case with and without excess supply. While in the presence of excess labor supply, democracy taxes both groups of producers and consequently generates more inefficiency than the dictatorship of the elite or the middle class, when there is no excess supply, it is in general less distortionary than the dictatorship of the middle class or the elite. The intuition is that when Condition (ES) does not hold, workers understand that high taxes will depress wages and are therefore less willing to use distortionary taxes.
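The tax rate in Proposition 71 has a simple closed form, which the following sketch computes. It solves $(1-\tau)^{1/\alpha} A^{high} = A^{low}$ for $\tau$ and then applies the upper bound $\bar{\tau}$. The function and argument names are hypothetical, and the numerical values are chosen only for illustration.

```python
# Illustrative sketch of the democracy tax on the more productive group when
# Condition (ES) does not hold. Names and numbers are assumptions.

def democracy_tax(a_high: float, a_low: float, alpha: float, tau_bar: float) -> float:
    """Tax on the more productive group: the rate that equalizes after-tax
    marginal productivities, or tau_bar if the bound binds first."""
    if a_high <= a_low:
        raise ValueError("a_high must exceed a_low")
    # (1 - tau)^(1/alpha) * a_high = a_low  =>  tau = 1 - (a_low / a_high)^alpha
    tau_equalizing = 1.0 - (a_low / a_high) ** alpha
    return min(tau_equalizing, tau_bar)


if __name__ == "__main__":
    tau = democracy_tax(a_high=2.0, a_low=1.0, alpha=0.5, tau_bar=1.0)
    print("tau on the more productive group:", tau)
    print("after-tax productivity:", (1.0 - tau) ** (1.0 / 0.5) * 2.0)
```

When the bound $\bar{\tau}$ binds, the taxed group remains the more productive one after taxes, which corresponds to the inequality case in the proposition.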

18.3.3

Consider a society where Assumption 18 is satisfied and $A^e < A^m$ so that middle class control is more productive (i.e., generates greater output). Despite this, the elite will have no incentive, without some type of compensation, to relinquish their power to the middle class. In this case, political institutions that lead to more inefficient policies will persist even though alternative political institutions leading to better outcomes exist. One possibility is a Coasian deal between the elite and the middle class. For example, perhaps the elite can relinquish political power and get compensated in return. However, such deals are in general not possible. To discuss why (and why not), let us distinguish between two alternative approaches. First, the elite may relinquish power in return for a promise of future transfers. This type of solution will run into two difficulties: (i) such promises will not be credible, and once they have political power, the middle class will have no incentive to keep on making such transfers; (ii) since there are no other, less distortionary, fiscal instruments, to compensate the elite, the middle class will have to impose similar taxes on itself, so that the alternative political institutions will not be as efficient in the first place. Second, the elite may relinquish power in return for a lump-sum transfer from the middle class. Such a solution is also not possible in general, since the net present value of the benefit of holding political power often exceeds any transfer that can be made. Consequently, the desire of the elite to implement inefficient policies also implies that they support political institutions that enable them to pursue these policies. Thus, in the same way as preferences over inefficient policies translate into preferences over inefficient economic institutions, they

also lead to preferences towards inefficient political institutions. I will discuss how political institutions can change from the ground up in Section 18.3.4 below. Another interesting question is whether a given set of economic institutions might be appropriate for a while, but then become inappropriate and costly for economic activity later. This question might be motivated, for example, by the contrast of the Northeastern United States and the Caribbean colonies between the 17th and 19th centuries. The Caribbean colonies were clear examples of societies controlled by a narrow elite, with political power in the monopoly of plantation owners, and few rights for the slaves that made up the majority of the population. In contrast, the Northeastern United States developed as a settler colony, approximating a democratic society with significant political power in the hands of smallholders and a broader set of producers. While in both the 17th and 18th centuries, the Caribbean societies were among the richest places in the world, and almost certainly richer and more productive than the Northeastern United States, starting in the late 18th century, they lagged behind the United States and many other more democratic societies, which took advantage of new investment opportunities, particularly in industry and commerce. This raises the question as to whether the same political and economic institutions that encouraged the planters to invest and generate high output in the 17th and early 18th centuries then became a barrier to further growth. The baseline model used above suggests a simple explanation along these lines. Imagine an economy in which the elite are in power, Condition (ES) does not hold, $\phi$ is small, $A^e$ is relatively high and $A^m$ is relatively small to start with. The above analysis shows that the elite will choose a high tax rate on the middle class.
Nevertheless, output will be relatively high, because the elite will undertake the right investments themselves, and the distortion on the middle class will be relatively small since $A^m$ is small.

Consequently, the dictatorship of the elite may generate greater income per capita than an alternative society under the dictatorship of the middle class. This is reminiscent of the planter elite controlling the economy in the Caribbean. However, if at some point the environment changes so that $A^m$ increases substantially relative to $A^e$, the situation changes radically. The elite, still in power, will continue to impose high taxes on the middle class, but now these policies have become very costly because they distort the investments of the more productive group. Another society where the middle class have political power will now generate significantly greater output. This simple example illustrates how institutions that were initially appropriate (i.e., that did not generate much distortion or may have even encouraged growth) later caused the society to fall substantially behind other economies.

18.3.4

To develop a better understanding of why inefficient institutions emerge and persist, we need an equilibrium model of institutional change. I now briefly discuss such a model. It is first useful to draw a distinction between de jure and de facto political power. De jure political power is determined by political institutions. In the baseline model, de jure political power is in the hands of the elite, since the political institutions give them the right to set taxes and determine the economic institutions. De facto political power, which comes from other sources, did not feature so far in the model (except in the discussion of political consolidation). The simplest example of de facto political power is when a group manages to organize itself and poses a military challenge to an existing regime or threatens it with a revolution. I will conceptualize institutional change as resulting from the interplay of de jure and de facto political power.

Imagine a society described by the baseline model above where de jure political power is initially in the hands of the elite. In each period, with probability $q$ the middle class solves the collective action problem among its members and gathers sufficient de facto political power to overthrow the existing regime and to monopolize political power (establish a dictatorship of the middle class). However, violently overthrowing the existing regime is still costly, and in particular, each middle class agent incurs a cost of $\psi$ in the process. Moreover, in the process, the elite are harmed substantially. In particular, I assume that following a violent overthrow, the elite receive zero utility. Let us assume that the dictatorship of the middle class, if established, is an absorbing state and once the middle class comes to power, there will never be any further institutional change. With probability $1-q$, the middle class has no de facto political power. Also denote the state at time $t$ by the tuple $(P_t, s_t)$, where $P_t \in \{E, M\}$ denotes whether the elite or the middle class are in control, and $s_t \in \{H, L\}$ denotes the level of threat (high or low) against the regime controlled by the elite. When the middle class amass de facto political power, the elite need to respond in some way, since letting the middle class overthrow the existing regime is excessively costly for them. The elite can respond in three different ways: (i) they can make temporary concessions, such as reducing taxes on the middle class, etc.; (ii) they can give up power; (iii) they can use repression, which is costly, but manages to prevent the regime from falling to the middle class. I assume that repression costs $\mu$ for the elite as a whole. Throughout this section, I focus on MPE. In an MPE, strategies are only a function of the state $s_t$, so when $s_t = L$, the elite will set the policies that maximize their utility, which were characterized above.
So the interesting actions take place in the state $s_t = H$. Moreover, to simplify the discussion, I assume throughout that Condition (ES) is satisfied, so that the

main motive for inefficient policy is revenue extraction. Let us first calculate the value of a middle class agent when the middle class is in power. Since Condition (ES) is satisfied, the above analysis shows that they will not tax themselves, set a tax of $\tau^e = \tau^{RE}$ on the elite in every period, and redistribute all the revenue to themselves. To write the resulting value function, let us introduce the following notation: $T^j(\tau) \equiv \phi \tau (1-\tau)^{(1-\alpha)/\alpha} A^j \lambda \theta^j / (1-\alpha)$ as the tax revenue raised from group $j$ at the tax rate $\tau$, and $\pi^j(\tau) \equiv \alpha (1-\tau)^{1/\alpha} A^j \lambda / (1-\alpha)$ as the profit of a producer in group $j$ facing the tax rate $\tau$. Then, using $M$ to indicate a value function under the dictatorship of the middle class, we have

$$V^m(M) = \frac{\pi^m(0) + \left(T^e(\tau^{RE}) + R\right)/\theta^m}{1-\beta}, \tag{18.23}$$

where $\tau^{RE}$ is given by (18.11). The first term in the numerator is their own revenues, $\alpha A^m \lambda/(1-\alpha)$, and the second is the distribution from the revenue obtained by taxing the elite and from natural resources. The term $1-\beta$ in the denominator provides the net present discounted value of this stream of revenues. Similarly, the value of an elite producer in this case is

$$V^e(M) = \frac{\pi^e(\tau^{RE})}{1-\beta}. \tag{18.24}$$

What about the dictatorship of the elite? Let us write this value recursively starting in the no threat state:

$$V^m(E, L) = \pi^m(\tau^{RE}) + \beta\left[(1-q) V^m(E, L) + q V^m(E, H)\right]. \tag{18.25}$$

This expression incorporates the fact that, in the MPE, during periods of low threat, the elite will follow their most preferred policy, $\tau^m = \tau^{RE}$ and $T^m = 0$. The low threat state recurs with probability $1-q$. What happens when $s_t = H$? As noted above, there are

three possibilities. Let us first start by investigating whether the elite can prevent a switch of political power by making concessions in the high threat state. For this purpose, let us denote the highest possible value to the middle class under the dictatorship of the elite by $\bar{V}^m(E, H)$. Then, the condition for concessions within the given political regime to prevent action by the middle class is simply

$$\bar{V}^m(E, H) \geq V^m(M) - \psi, \tag{18.26}$$

where recall that $\psi$ is the cost of regime change for the middle class. When this constraint holds, the elite could make sufficient concessions to keep the middle class happy within the existing regime. Therefore, to determine whether concessions within the dictatorship of the elite will be sufficient to satisfy the middle class, we simply need to calculate $\bar{V}^m(E, H)$. Note that the best concession that the elite can make is to adopt a policy that is most favorable for the middle class, i.e., $\tau^m = 0$, $\tau^e = \tau^{RE}$, and $T^m = \left(T^e(\tau^{RE}) + R\right)/\theta^m$. Therefore,

$$\bar{V}^m(E, H) = \pi^m(0) + \frac{T^e(\tau^{RE}) + R}{\theta^m} + \beta\left[(1-q)\bar{V}^m(E, L) + q\bar{V}^m(E, H)\right], \tag{18.27}$$

where $\bar{V}^m(E, L)$ is given by expression (18.25), with $\bar{V}^m(E, H)$ replacing $V^m(E, H)$ on the right hand side. Combining (18.27) and (18.25), we obtain:

$$\bar{V}^m(E, H) = \frac{\beta(1-q)\,\pi^m(\tau^{RE}) + \left(1 - \beta(1-q)\right)\left[\pi^m(0) + \left(T^e(\tau^{RE}) + R\right)/\theta^m\right]}{1-\beta}. \tag{18.28}$$

This is the maximum credible utility that the elite can promise the middle class within the existing regime. The reason why they cannot give them greater utility is because of commitment problems. As (18.28) makes clear, the elite transfer resources to the middle class only in the state $s_t = H$. Even if they promise to make further transfers or not tax

them in the state $s_t = L$, these promises will not be credible (they cannot commit to them), and in the MPE, when the state $s_t = L$ arrives, the elite will choose their most preferred policy of taxing the middle class and transferring the resources to themselves. If given this expression, (18.26) is satisfied, then the elite can prevent a violent overthrow by making concessions within the existing regime. Nevertheless, the elite may not necessarily prefer such concessions. To investigate this issue, we first need to determine the exact concessions that the elite will make. They will clearly not follow the most preferable policy for the middle class, since this will give more than sufficient utility to prevent an overthrow. Instead, the elite will choose a policy combination $(\hat{\tau}^m, \hat{\tau}^e, \hat{T}^m, \hat{T}^e)$ such that $V^m(E, H) = V^m(M) - \psi$, i.e., they will make the middle class just indifferent between overthrowing the regime or accepting the concessions. The value of such concessions to the elite is, by similar arguments, given by:

$$\tilde{V}^e(E, H) = \frac{\beta(1-q)\left[\pi^e(0) + \left(T^m(\tau^{RE}) + R\right)/\theta^e\right] + \left(1 - \beta(1-q)\right)\left(\pi^e(\hat{\tau}^e) + \hat{T}^e\right)}{1-\beta}. \tag{18.29}$$
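Expressions (18.28) and (18.29) share the same structure: a two-state recursion in which one flow payoff is received in the low threat state and another in the high threat state. The sketch below checks the closed form for the high-state value against direct iteration of the recursion. The flow payoffs are passed in as plain numbers; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch: value in the high threat state of a two-state recursion
#   V_L = flow_low  + beta * [(1 - q) * V_L + q * V_H]
#   V_H = flow_high + beta * [(1 - q) * V_L + q * V_H]
# Flow payoffs and parameter values below are assumptions.

def high_state_value_closed_form(flow_low: float, flow_high: float,
                                 beta: float, q: float) -> float:
    """Closed form with weights beta(1-q) and 1 - beta(1-q), as in (18.28)."""
    weight_low = beta * (1.0 - q)
    return (weight_low * flow_low + (1.0 - weight_low) * flow_high) / (1.0 - beta)


def high_state_value_iterated(flow_low: float, flow_high: float,
                              beta: float, q: float, n_iter: int = 3000) -> float:
    """Iterate the two recursions from zero until they have converged."""
    v_low, v_high = 0.0, 0.0
    for _ in range(n_iter):
        continuation = beta * ((1.0 - q) * v_low + q * v_high)
        v_low, v_high = flow_low + continuation, flow_high + continuation
    return v_high


if __name__ == "__main__":
    args = dict(flow_low=1.0, flow_high=3.0, beta=0.9, q=0.2)
    print(high_state_value_closed_form(**args))
    print(high_state_value_iterated(**args))
```

For the middle class, `flow_low` plays the role of the profit under the elite's preferred tax and `flow_high` the payoff under the most favorable concession. Whenever `flow_low` is below `flow_high`, the resulting value stays below `flow_high / (1 - beta)`, the value of receiving the favorable payoff in every period.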

Whether the elite will make these concessions or not then depends on the values of other options available to them. Another alternative is the use of repression whenever there is a threat from the middle class. Such repression is always effective, so the only cost of this strategy for the elite is the cost they incur in the use of repression, $\mu$. Denote by $V^e(O, s_t)$ the value function of the elite when they use repression and the state is $s_t$. By standard arguments, we can obtain this value by writing the following recursive formulae:

$$V^e(O, H) = \pi^e(0) + \frac{T^m(\tau^{RE}) + R}{\theta^e} - \mu + \beta\left[(1-q) V^e(O, L) + q V^e(O, H)\right]$$

and

$$V^e(O, L) = \pi^e(0) + \frac{T^m(\tau^{RE}) + R}{\theta^e} + \beta\left[(1-q) V^e(O, L) + q V^e(O, H)\right].$$

These two expressions incorporate the fact that, when using the repression strategy, the elite will always choose their most preferred policy combination, and will use repression when $s_t = H$ to defend their regime. Combining these two equations, we obtain:

$$V^e(O, H) = \frac{\pi^e(0) + \left(T^m(\tau^{RE}) + R\right)/\theta^e - \left(1 - \beta(1-q)\right)\mu}{1-\beta}. \tag{18.30}$$

Consequently, for the elite to prefer concessions, it needs to be the case that $\tilde{V}^e(E, H) \geq V^e(O, H)$. Finally, the third alternative for the elite is to allow regime change, and obtain $V^e(M)$ as given by (18.24). Evidently, $V^e(M)$ is less than $\tilde{V}^e(E, H)$, since in the latter case they only make concessions (in fact limited concessions) with probability $q$. Therefore, regime change will only happen when (18.26) does not hold. In addition, for similar reasons, for regime change to take place, we need $V^e(M) \geq V^e(O, H)$. Note that all of the values here are simple functions of parameters, so comparing these values essentially amounts to comparing nonlinear functions of the underlying parameters. Putting all these pieces together and assuming for convenience that when indifferent the elite opt against repression, we obtain the following proposition:

Proposition 72 Consider the above environment with potential regime change and suppose that Condition (ES) holds. Then there are three different types of political equilibria:

1. If (18.26) holds and $\tilde{V}^e(E, H) \geq V^e(O, H)$, in the unique equilibrium the regime always remains the dictatorship of the elite. When $s_t = L$, the elite set their most preferred policy of $\tau^m = \tau^{RE}$, $\tau^e = 0$ and $T^m = 0$, and when $s_t = H$, the elite make concessions sufficient to ensure $V^m(E, H) = V^m(M) - \psi$, i.e., they adopt the policy $(\hat{\tau}^m, \hat{\tau}^e, \hat{T}^m, \hat{T}^e)$.

2. If (18.26) holds but $\tilde{V}^e(E, H) < V^e(O, H)$, or if (18.26) does not hold and $V^e(M) < V^e(O, H)$, then the regime always remains the dictatorship of the elite. The elite always set their most preferred policy of $\tau^m = \tau^{RE}$, $\tau^e = 0$ and $T^m = 0$, and when $s_t = H$, they use repression against the middle class.

3. If (18.26) does not hold and $V^e(M) \geq V^e(O, H)$, then there is equilibrium institutional change. When $s_t = L$, the elite set their most preferred policy of $\tau^m = \tau^{RE}$, $\tau^e = 0$ and $T^m = 0$, and when $s_t = H$, the elite voluntarily pass political control to the middle class.

This proposition illustrates how various different institutional equilibria can arise. The most interesting case is 3, where there is equilibrium institutional change as a result of the elite voluntarily relinquishing political control. Why would the elite give up their dictatorship? The reason is the de facto political power of the middle class, which threatens the elite with a violent overthrow, an outcome worse than the dictatorship of the middle class. The elite then prevent such a violent overthrow by changing political institutions to transfer de jure political power to the middle class. This transfer exploits the role of political institutions as a commitment device (a commitment to a different distribution of de jure political power), and acts as a credible promise of future policies that favor the middle class (the promise is credible, since institutional change gives de jure political power and thus the right to set fiscal policy in the future to the middle class). This discussion highlights that institutional change has two requirements: (i) that concessions within the existing regime are not sufficient to appease the middle class; (ii) that repression is sufficiently costly for the elite to accept regime change. The comparative statics of regime change are also interesting. First, when repression is more costly, i.e., $\mu$ is higher, institutional change is more likely. Moreover:

$$V^e(O, H) - V^e(M) = \frac{\pi^e(0) + \left(T^m(\tau^{RE}) + R\right)/\theta^e - \left(1 - \beta(1-q)\right)\mu - \pi^e(\tau^{RE})}{1-\beta}$$

is increasing in $R$ and $\phi$. This implies that when $R$ is high, so that there are greater rents from

14.451: Introduction to Economic Growth natural resources, V e (M ) V e (O, H ) becomes less likely, and the elite now prefer to use repression rather than allowing institutional change. Similarly, greater , which corresponds to greater state capacity, has the same impact on institutional equilibrium, since greater state capacity enables greater tax revenues in the future. This implies that, as already suggested by the results in subsection 18.2.7, greater state capacity, which typically leads to less distortionary policies, also increases political stakes and makes the use of repression by the elite the more likely. Nevertheless, increases in R or do not make institutional change unambiguously less likely, since they also make (18.26) more likely to be violated, making it more dicult for the elite to use concessions to appease the middle class. Therefore, when the trade-o for the elite is between repression and institutional change, greater R and make repression more likely, while when the trade-o is between concessions and institutional change, they may encourage institutional change. This analysis also illustrates the conditions for institutional persistence. Persistence is the natural course of things and something unusual, the success of the middle class in solving their collective action problem and amassing de facto political power, creates the platform for institutional change. However, even the possibility of collective action by the middle class is not sucient, since the elite can use costly methods to defend the existing regime. Therefore, institutions will be more persistent when the elite are unwilling to give up the right to determine policies in the future, which will in turn be the case when there is signicant distributional conict between the elite and the middle class, for example because tax revenues are important (i.e., high ) or because rents from natural resources, R, are high. 
Therefore, a set of political institutions will persist when political stakes are high, i.e., when alternative institutional arrangements are costly for those who currently hold political power and have the means to use force to maintain the existing political institutions.
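The case distinction in Proposition 72 can be summarized as a small decision rule. The following Python sketch is only an illustration: it takes as given whether condition (18.26) holds and the elite's values $V^e(E,H)$, $V^e(O,H)$ and $V^e(M)$, and it does not compute these objects from the model's primitives. The function name, argument names and the `Regime` labels are hypothetical and are not part of the original text.

```python
from enum import Enum


class Regime(Enum):
    """Hypothetical labels for the three equilibrium types in Proposition 72."""
    CONCESSIONS = "elite dictatorship persists; concessions when s_t = H"
    REPRESSION = "elite dictatorship persists; repression when s_t = H"
    INSTITUTIONAL_CHANGE = "elite pass political control to the middle class when s_t = H"


def classify_equilibrium(cond_18_26_holds: bool,
                         v_e_concessions: float,
                         v_e_repression: float,
                         v_e_middle_rule: float) -> Regime:
    """Return the equilibrium type implied by Proposition 72.

    cond_18_26_holds : whether condition (18.26) holds (concessions are feasible)
    v_e_concessions  : V^e(E, H), the elite's value when making concessions
    v_e_repression   : V^e(O, H), the elite's value when using repression
    v_e_middle_rule  : V^e(M), the elite's value under middle-class rule
    """
    if cond_18_26_holds:
        # Cases 1 and 2: the elite compare concessions with repression.
        if v_e_concessions >= v_e_repression:
            return Regime.CONCESSIONS
        return Regime.REPRESSION
    # (18.26) fails: the elite compare regime change with repression.
    if v_e_middle_rule >= v_e_repression:
        return Regime.INSTITUTIONAL_CHANGE
    return Regime.REPRESSION


if __name__ == "__main__":
    # Purely illustrative numbers, not calibrated to the model.
    print(classify_equilibrium(False, 0.0, 2.0, 3.0).value)
```

In this sketch, raising `v_e_repression` relative to `v_e_middle_rule` (as happens in the text when natural resource rents or state capacity are higher) moves the outcome from institutional change to repression, provided (18.26) fails.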

The model also suggests the possibility of interesting interactions between economic forces and institutional equilibria. The first is an interaction between economic and political institutions. Suppose that economic institutions impose a low $\bar{\tau}$. This implies that control of fiscal policy will generate only limited gains, reducing political stakes, and the elite will have less reason to use repression in order to defend the existing regime. Consequently, institutional persistence might be more of an issue in societies where economic institutions enable those with political power to capture greater rents.

When the ability of the middle class to solve their collective action problem is endogenous (as in the model used above to illustrate the political consolidation effect), there will be a further interaction between economics and politics. In particular, suppose that the probability $q$ that the middle class will be able to pose an effective threat to the regime is endogenous and depends on the profits of the middle class. In this case, the elite realize that the richer the middle class are, the greater the threat from them in the future. When political power is very valuable, for example because tax revenues or rents from natural resources are high, the elite will wish to overtax the middle class to impoverish them and to reduce their political power. These higher taxes will, in turn, increase institutional persistence by making it more difficult for the middle class to solve their collective action problem and mount challenges against the dictatorship of the elite. This suggests another interesting interaction, this time between inefficient policies and institutional persistence.
